[Architecture Overview] A Deep Dive into the Backend Architecture of an AI Short-Drama Generation Platform

睡觉 · Published 2026-05-28 16:28:30

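1. Project Overview

This article examines the complete backend architecture of an AI short-drama generation platform. The platform combines AI image generation, video generation, and speech synthesis to cover the full workflow from script writing to finished video.

Core capability matrix:

Module	Function	Supported platforms
Image generation	Text-to-image, image-to-image, character consistency	Alibaba Cloud Bailian, OpenAI, Gemini, MiniMax, Volcengine
Video generation	Text-to-video, image-to-video, first/last-frame control, video continuation	Alibaba Cloud Bailian, Volcengine, MiniMax, Vidu, Kling
Speech synthesis	TTS (text-to-speech)	MiniMax
AI scriptwriting	AI continuation, plot breakdown, storyboard generation	Large language models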

2. Overall Architecture Design

2.1 System Architecture Diagram

┌─────────────────────────────────────────────────────────────────────────┐
│                        Frontend Application Layer                       │
│                         Drama Creation Platform                         │
│   [Script Editor] [Characters] [Scenes] [Storyboards] [Video Compose]   │
└──────────────────────────────┬──────────────────────────────────────────┘
                               │ HTTP API
                               ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                        API Gateway Layer                                │
│    ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐     │
│    │ dramas   │ │ scenes   │ │ images   │ │ videos   │ │ users    │     │
│    │ router   │ │ router   │ │ router   │ │ router   │ │ router   │     │
│    └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘     │
└─────────┼────────────┼────────────┼────────────┼────────────┼───────────┘
          │            │            │            │            │
          ▼            ▼            ▼            ▼            ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                        Service Layer                                    │
│   ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐   │
│   │ ImageService │ │ VideoService │ │ TTSService   │ │ Balance      │   │
│   │ (image gen)  │ │ (video gen)  │ │ (TTS)        │ │ (credits)    │   │
│   └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘   │
│          │                │                │                │           │
│          ▼                ▼                ▼                ▼           │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                        AI Adapter Layer                         │   │
│   │   ┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐   │   │
│   │   │  Ali  │ │OpenAI │ │Gemini │ │MiniMax│ │ Volc  │ │ Vidu  │   │   │
│   │   │ Image │ │ Image │ │ Image │ │ Image │ │Engine │ │ Video │   │   │
│   │   └───┬───┘ └───┬───┘ └───┬───┘ └───┬───┘ └───┬───┘ └───┬───┘   │   │
│   └───────┴─────────┴─────────┴─────────┴─────────┴─────────┴───────┘   │
└──────────────────────────────┬──────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                        Data Access Layer                                │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                       Drizzle ORM + MySQL                       │   │
│   │  dramas | scenes | characters | storyboards | generations       │   │
│   │  users  | transactions | notifications | configs                │   │
│   └─────────────────────────────────────────────────────────────────┘   │
└──────────────────────────────┬──────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                        Infrastructure Layer                             │
│   [MySQL] [Tencent Cloud COS] [FFmpeg] [Task Logging]                   │
└─────────────────────────────────────────────────────────────────────────┘

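2.2 Core Service Modules

Service module responsibilities:

Service	Responsibility	Core file
`image-generation.ts`	Image generation task management, async polling, result handling	`/backend/src/services/image-generation.ts`
`video-generation.ts`	Video generation task management, async polling, result handling	`/backend/src/services/video-generation.ts`
`tts-generation.ts`	Speech synthesis service	`/backend/src/services/tts-generation.ts`
`balance.ts`	User balance management, credit deduction, transaction records	`/backend/src/services/balance.ts`
`drama-context.ts`	Drama context construction, character reference image management	`/backend/src/services/drama-context.ts`
`batch-generation.ts`	Batch generation task coordination	`/backend/src/services/batch-generation.ts`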

3. Task Management Architecture

3.1 Image Generation Task Flow

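class Semaphore {
  private running = 0
  private queue: (() => void)[] = []
  private readonly maxConcurrent: number

  // The constructor and release() are omitted in the original excerpt; they are reconstructed here.
  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent
  }

  // Resolves with a release callback once a slot is free.
  async acquire(): Promise<() => void> {
    if (this.running < this.maxConcurrent) {
      this.running++
      return () => this.release()
    }
    return new Promise(resolve => {
      this.queue.push(() => {
        this.running++
        resolve(() => this.release())
      })
    })
  }

  // Frees the slot and wakes the next queued waiter, if any.
  private release(): void {
    this.running--
    const next = this.queue.shift()
    if (next) next()
  }
}

const imageSemaphore = new Semaphore(5)  // cap concurrency at 5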
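
A semaphore only helps if every acquired slot is eventually released, including when the task throws. The sketch below shows one way to guarantee that with `try`/`finally`. `SlotLimiter` and `withLimit` are names introduced here for illustration and are not taken from the project; any object whose `acquire()` resolves to a release callback, such as the `Semaphore` above, would fit.

```typescript
// Hypothetical minimal contract: acquire() resolves to a release callback.
interface SlotLimiter {
  acquire(): Promise<() => void>
}

// Run `task` while holding one slot; the slot is released on success and on failure.
async function withLimit<T>(limiter: SlotLimiter, task: () => Promise<T>): Promise<T> {
  const release = await limiter.acquire()
  try {
    return await task()
  } finally {
    release()
  }
}

// Possible call site (callProviderApi is a placeholder, not a real function):
//   const result = await withLimit(imageSemaphore, () => callProviderApi(request))
```

Wrapping the acquire/release pair in a helper keeps call sites short and makes it hard to leak a slot on an error path.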

Task lifecycle:

┌──────────────┐    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│ 1. Enqueue   │ -> │ 2. Build req │ -> │ 3. Call API  │ -> │ 4. Parse     │
│ (new record) │    │ (map params) │    │ (async/sync) │    │ (get result) │
└──────────────┘    └──────────────┘    └──────────────┘    └──────┬───────┘
                                                                   │
                    ┌──────────────┐                               │
                    │ 5. Poll      │ <── async mode ───────────────┤
                    │ (max 10 min) │                               │ sync mode
                    └──────┬───────┘                               │
                           │                                       ▼
                           │                                ┌──────────────┐
                           └──────────────────────────────> │ 6. Finalize  │
                                                            │ (dl, upload) │
                                                            └──────────────┘

3.2 Retry and Rate Limiting

async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxRetries: number = 3,
  baseDelayMs: number = 1000
): Promise<T> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn()
    } catch (err: any) {
      if (attempt === maxRetries) throw err
      const delay = baseDelayMs * Math.pow(2, attempt) + Math.random() * 500
      await new Promise(resolve => setTimeout(resolve, delay))
    }
  }
  throw new Error('retryWithBackoff: unreachable')
}
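
To make the backoff schedule concrete: with the defaults shown above (`maxRetries = 3`, `baseDelayMs = 1000`), the function sleeps after failed attempts 0, 1 and 2, and a fourth failure is rethrown. The sketch below isolates the deterministic part of the delay. `baseBackoffDelay` is a helper name introduced here for illustration; the real delay adds up to 500 ms of random jitter to each wait.

```typescript
// Deterministic part of the delay computed in retryWithBackoff (jitter excluded).
function baseBackoffDelay(attempt: number, baseDelayMs: number = 1000): number {
  return baseDelayMs * Math.pow(2, attempt)
}

// Waits that precede the three retries when maxRetries = 3.
const backoffSchedule = [0, 1, 2].map(attempt => baseBackoffDelay(attempt))
const totalBaseWaitMs = backoffSchedule.reduce((sum, delay) => sum + delay, 0)

console.log(backoffSchedule)   // 1000, 2000, 4000
console.log(totalBaseWaitMs)   // 7000
```

So a call that fails on every attempt spends about 7 seconds waiting, plus at most 1.5 seconds of accumulated jitter, before the last error surfaces.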

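Rate-limiting strategy:

Image generation: at most 5 concurrent tasks
Video generation: no hard limit (each video task is long-running by nature)
Polling interval: 5 seconds for images, 10 seconds for videos
Timeout: 10 minutes for images, 50 minutes for videos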
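
The interval and timeout values above imply roughly 120 status checks per image task (600 s / 5 s) and roughly 300 per video task (3000 s / 10 s) in the worst case. A polling loop with those two knobs could look like the sketch below. `pollUntilDone`, `PollResult`, and the two constant names are assumptions made for this example; the project's actual polling code is not shown in the article.

```typescript
type PollResult<T> = { done: true; value: T } | { done: false }

// Call `check` until it reports completion, or fail once the next wait would pass the deadline.
async function pollUntilDone<T>(
  check: () => Promise<PollResult<T>>,
  intervalMs: number,
  timeoutMs: number,
): Promise<T> {
  const deadline = Date.now() + timeoutMs
  while (true) {
    const result = await check()
    if (result.done) return result.value
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`polling timed out after ${timeoutMs} ms`)
    }
    await new Promise<void>(resolve => setTimeout(resolve, intervalMs))
  }
}

// Values from the list above; the constant names are illustrative.
const IMAGE_POLL = { intervalMs: 5_000, timeoutMs: 10 * 60_000 }
const VIDEO_POLL = { intervalMs: 10_000, timeoutMs: 50 * 60_000 }
```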

4. Data Architecture Design

4.1 Core Table Relationships

dramas (drama series)
├── id, title, description, status, created_at
└── scenes
    ├── id, drama_id, prompt, image_url, status
    └── characters
        └── id, scene_id, name, image_url, description

storyboards (shot-level storyboards)
├── id, scene_id, prompt, first_frame_url, last_frame_url
└── image_generations (image generation records)
    └── id, storyboard_id, prompt, image_url, status

video_generations (video generation records)
└── id, storyboard_id, prompt, video_url, status, duration
4.2 Pricing and Balance System

export async function deductBalance(
  userId: number, 
  amount: number, 
  type: string, 
  description: string
): Promise<boolean> {
  return await db.transaction(async (tx) => {
    const [user] = await tx.select({ balance: schema.users.balance })
      .from(schema.users)
      .where(eq(schema.users.id, userId))
      .for('update')  // row-level lock

    if (!user || user.balance < amount) {
      return false
    }

    await tx.update(schema.users)
      .set({ balance: user.balance - amount })
      .where(eq(schema.users.id, userId))

    await tx.insert(schema.transaction_records).values({
      user_id: userId,
      type,
      amount: -amount,
      balance_before: user.balance,
      balance_after: user.balance - amount,
      description,
      status: 'completed',
    })

    return true
  })
}

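Price configuration:

Service type	Price key	Billing method
Image generation	`generate_image`	Fixed price per call
Video generation	`video_price_configs`	Billed by resolution and duration
AI rewrite	`ai_rewrite`	Fixed price per call
Storyboard breakdown	`ai_storyboard`	Fixed price per call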
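
Unlike the fixed-price items, video pricing depends on two inputs. A minimal sketch of such a lookup follows, assuming one price row per (resolution, duration) pair. The row shape, the `lookupVideoPrice` helper, and every number in the sample data are made up for illustration; the article does not show the actual `video_price_configs` schema.

```typescript
// Hypothetical shape of one video price row.
interface VideoPriceRow {
  resolution: string
  durationSec: number
  price: number   // in credits
}

// Made-up sample data, not the platform's real prices.
const exampleVideoPrices: VideoPriceRow[] = [
  { resolution: '720p', durationSec: 5, price: 20 },
  { resolution: '720p', durationSec: 10, price: 40 },
  { resolution: '1080p', durationSec: 5, price: 35 },
]

// Returns the configured price, or null when the combination is not configured.
function lookupVideoPrice(rows: VideoPriceRow[], resolution: string, durationSec: number): number | null {
  const row = rows.find(r => r.resolution === resolution && r.durationSec === durationSec)
  return row ? row.price : null
}
```

Returning `null` for an unconfigured combination lets the caller reject the request before any balance is deducted, rather than charging a default amount.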

5. AI Capability Integration Architecture

5.1 Adapter Registry Pattern

export const imageAdapters: Record<string, ImageProviderAdapter> = {
  minimax: new MiniMaxImageAdapter(),
  openai: new OpenAIImageAdapter(),
  gemini: new GeminiImageAdapter(),
  volcengine: new VolcEngineImageAdapter(),
  ali: new AliImageAdapter(),
  chatfire: new OpenAIImageAdapter(),
}

export const videoAdapters: Record<string, VideoProviderAdapter> = {
  minimax: new MiniMaxVideoAdapter(),
  volcengine: new VolcEngineVideoAdapter(),
  vidu: new ViduVideoAdapter(),
  ali: new AliVideoAdapter(),
  kling: new KlingVideoAdapter(),
}

export function getImageAdapter(provider: string): ImageProviderAdapter {
  return imageAdapters[provider.toLowerCase()] || imageAdapters['minimax']
}
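
The article does not show what `ImageProviderAdapter` looks like, so the following is only a guess at the kind of contract that makes the registry work: each adapter translates a common request into a provider call and translates the provider's response back. All type names, method names, the endpoint URL, and `requireImageAdapter` are hypothetical. The lookup here also throws for unknown providers, a stricter alternative to the silent fallback to `minimax` used by `getImageAdapter` above.

```typescript
// Hypothetical request/response shapes; the real interface is not shown in the article.
interface ImageRequest { prompt: string; referenceImageUrls?: string[] }
interface ProviderCall { endpoint: string; body: Record<string, unknown> }
interface ParsedImage { imageUrl: string | null; taskId: string | null }

interface ImageProviderAdapter {
  buildRequest(req: ImageRequest): ProviderCall
  parseResponse(raw: unknown): ParsedImage
}

// Example adapter for an imaginary provider.
class ExampleImageAdapter implements ImageProviderAdapter {
  buildRequest(req: ImageRequest): ProviderCall {
    return {
      endpoint: 'https://api.example.com/v1/images',
      body: { prompt: req.prompt, refs: req.referenceImageUrls ?? [] },
    }
  }

  parseResponse(raw: unknown): ParsedImage {
    const data = (raw ?? {}) as { url?: string; task_id?: string }
    return { imageUrl: data.url ?? null, taskId: data.task_id ?? null }
  }
}

// A Map avoids accidental hits on inherited object keys such as "constructor".
const exampleRegistry = new Map<string, ImageProviderAdapter>([
  ['example', new ExampleImageAdapter()],
])

// Stricter lookup: an unknown provider is an error instead of a silent default.
function requireImageAdapter(provider: string): ImageProviderAdapter {
  const adapter = exampleRegistry.get(provider.toLowerCase())
  if (!adapter) throw new Error(`No image adapter registered for "${provider}"`)
  return adapter
}
```

Whether to fall back or to fail is a product decision: a fallback keeps requests flowing when a provider name is misconfigured, while failing fast makes the misconfiguration visible.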

5.2 Multi-Provider Fault Tolerance

export async function getActiveConfig(type: 'image' | 'video' | 'tts'): Promise<AIConfig | null> {
  const configs = await db.select()
    .from(schema.ai_configs)
    .where(eq(schema.ai_configs.type, type))
    .orderBy(asc(schema.ai_configs.priority))
  
  for (const config of configs) {
    if (await validateConfig(config)) {
      return config
    }
  }
  return null
}

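Configuration priority strategy:

Sort configurations by priority
Validate each configuration in turn
Return the first usable configuration
Support switching providers dynamically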
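
The same four steps can be tried out without a database. The sketch below is an in-memory analogue of `getActiveConfig`: it takes the rows as an array and the validator as a parameter. `pickActiveConfig` and the `provider` and `apiKey` fields on `AIConfig` are assumptions for the example; only `type` and `priority` appear in the original query.

```typescript
type ConfigType = 'image' | 'video' | 'tts'

// Only `type` and `priority` come from the article; the other fields are assumed.
interface AIConfig {
  id: number
  type: ConfigType
  provider: string
  priority: number
  apiKey: string
}

// Lowest priority number wins; the first config that validates is returned.
async function pickActiveConfig(
  configs: AIConfig[],
  type: ConfigType,
  validate: (config: AIConfig) => Promise<boolean>,
): Promise<AIConfig | null> {
  const candidates = configs
    .filter(c => c.type === type)
    .sort((a, b) => a.priority - b.priority)
  for (const config of candidates) {
    if (await validate(config)) return config
  }
  return null
}
```

Because validation runs on every call, a provider whose configuration becomes invalid is skipped on the next request without a redeploy, which is what makes the switch dynamic.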

6. Core Video Generation Flow

6.1 Context-Aware Prompt Construction

const { contextPrompt, reference_image_urls, style: contextStyle, dialogue } = 
  await buildVideoGenerationContext({
    drama_id: params.dramaId,
    episode_id: params.episodeId,
    storyboard_id: params.storyboardId,
  })

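// The section tags stay in Chinese because they are sent to the model verbatim:
// 【当前镜头】 = "current shot", 【配音要求】 = "voice-over requirements".
let enhancedPrompt = params.prompt
if (contextPrompt) {
  enhancedPrompt = `【当前镜头】\n${params.prompt}\n\n${contextPrompt}`
}

if (dialogue) {
  // Instruction text: "Dub strictly according to the following dialogue:"
  enhancedPrompt = `${enhancedPrompt}\n\n【配音要求】\n严格按照以下对白内容进行配音：${dialogue}`
}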

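Context components:

Character information: character descriptions, reference images
Scene information: scene setting, environment description
Plot context: links to the preceding and following shots
Voice-over requirements: dialogue text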
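
How `buildVideoGenerationContext` assembles `contextPrompt` is not shown. One plausible approach is to render each component as a labeled section and skip the ones that are empty, as sketched below. The `ShotContext` shape, the `buildContextPrompt` name, and the English section labels are all assumptions for this example.

```typescript
// Hypothetical input shape covering the text components listed above.
interface ShotContext {
  characters: { name: string; description: string }[]
  scene?: string
  previousShot?: string
  nextShot?: string
}

// Render non-empty components as labeled sections separated by blank lines.
function buildContextPrompt(ctx: ShotContext): string {
  const sections: string[] = []
  if (ctx.characters.length > 0) {
    const lines = ctx.characters.map(c => `- ${c.name}: ${c.description}`)
    sections.push(`[Characters]\n${lines.join('\n')}`)
  }
  if (ctx.scene) sections.push(`[Scene]\n${ctx.scene}`)
  const plot: string[] = []
  if (ctx.previousShot) plot.push(`Previous shot: ${ctx.previousShot}`)
  if (ctx.nextShot) plot.push(`Next shot: ${ctx.nextShot}`)
  if (plot.length > 0) sections.push(`[Plot context]\n${plot.join('\n')}`)
  return sections.join('\n\n')
}
```

An empty result is falsy, so it would pass through the `if (contextPrompt)` check in the excerpt above without adding an empty section to the final prompt.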

6.2 Result Handling and Storage

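async function handleVideoComplete(id: number, videoUrl: string, duration: number | null) {
  // Download the provider-hosted video to a local temporary file.
  const relativePath = await downloadFile(videoUrl, 'videos')
  const localPath = getAbsolutePath(relativePath)

  // Upload to Tencent Cloud COS and keep the resulting URL.
  const cosUrl = await uploadFile(localPath)

  // Best-effort cleanup: a failed unlink must not fail the task.
  try {
    await fs.unlink(localPath)
  } catch {}

  await db.update(schema.video_generations)
    .set({ video_url: cosUrl, status: 'completed', completed_at: now() })
    .where(eq(schema.video_generations.id, id))

  // userId is not defined in this excerpt; the original presumably reads it from the generation record.
  // The title string means "Video generation complete".
  await createNotification(userId, 'video_success', '视频生成完成', ...)
}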

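File flow:

Download the video from the AI platform → local temporary file
Upload to Tencent Cloud COS → obtain the CDN URL
Clean up the local temporary file
Update the database record
Send the user notification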

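7. Extensions and Integrations

7.1 FFmpeg Video Processing

// ffmpeg-merge.ts - video concatenation
// ffmpeg-compose.ts - video composition (adding subtitles, audio, etc.)
// grid-split.ts - splitting image grids

7.2 Webhook Support

// Special handling in the Vidu adapter (static method of the adapter class)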
static parseCallbackState(body: any): { status: 'completed' | 'failed'; video_url?: string; error?: string } {
  const state = body.state
  if (state === 'success') {
    return { status: 'completed', video_url: body.video_url }
  }
  if (state === 'failed') {
    return { status: 'failed', error: body.error || 'Vidu generation failed' }
  }
  return { status: 'failed', error: `Unknown state: ${state}` }
}
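
Webhook bodies come from the network, so it can be worth validating them instead of typing them as `any`. The sketch below is a stricter variant of the function above that accepts `unknown` and narrows it step by step. `parseCallbackSafe` and `CallbackResult` are names introduced here; the field names (`state`, `video_url`, `error`) are taken from the excerpt above, not from Vidu's API documentation.

```typescript
type CallbackResult =
  | { status: 'completed'; video_url: string }
  | { status: 'failed'; error: string }

// Same mapping as parseCallbackState, plus shape checks on the untrusted body.
function parseCallbackSafe(body: unknown): CallbackResult {
  if (typeof body !== 'object' || body === null) {
    return { status: 'failed', error: 'Callback body is not an object' }
  }
  const { state, video_url, error } = body as Record<string, unknown>
  if (state === 'success') {
    return typeof video_url === 'string' && video_url.length > 0
      ? { status: 'completed', video_url }
      : { status: 'failed', error: 'Success callback without video_url' }
  }
  if (state === 'failed') {
    return { status: 'failed', error: typeof error === 'string' ? error : 'Vidu generation failed' }
  }
  return { status: 'failed', error: `Unknown state: ${String(state)}` }
}
```

Like the original, this treats any state other than `success` as a failure. If the provider also sends intermediate states to the callback URL, those would need a separate branch so that a running task is not marked as failed.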

8. Monitoring and Logging

8.1 Task Logging

logTaskStart('VideoTask', 'enqueue', { id, provider, storyboardId })
logTaskProgress('VideoTask', 'build-request', { id, provider })
logTaskPayload('VideoTask', 'request payload', { id, body })
logTaskSuccess('VideoTask', 'poll-complete', { id, videoUrl })
logTaskError('VideoTask', 'poll-failed', { id, error })
logTaskWarn('VideoTask', 'poll-retry', { id, attempt })
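
The implementations behind these `logTask*` helpers are not shown. Since `logTaskPayload` records full request bodies, one reasonable design is to mask credentials before a line is written. The sketch below does that with a recursive `redact` step and emits one JSON object per line. `redact`, `formatTaskLog`, and the list of sensitive key patterns are assumptions for this example.

```typescript
// Keys whose values should never reach the log; the pattern list is illustrative.
const SENSITIVE_KEY = /key|token|secret|authorization|password/i

// Recursively copy a value, replacing values under sensitive keys with '***'.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redact)
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {}
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = SENSITIVE_KEY.test(key) ? '***' : redact(inner)
    }
    return out
  }
  return value
}

// Build one structured log line; a logTask* helper could print the result.
function formatTaskLog(level: string, scope: string, step: string, data: unknown): string {
  return JSON.stringify({
    ts: new Date().toISOString(),
    level,
    scope,
    step,
    data: redact(data),
  })
}

console.log(formatTaskLog('info', 'VideoTask', 'request payload', {
  id: 7,
  body: { prompt: 'a rainy alley', apiKey: 'sk-123' },
}))
```

Emitting one JSON object per line keeps the `scope` and `step` fields searchable, so all entries for a single task `id` can be filtered out of the log with standard tools.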

8.2 Notification System

export function createNotification(
  userId: number,
  type: string,
  title: string,
  message: string,
  link?: string
)

Notification types:

image_success / image_failed - image generation status
video_success / video_failed - video generation status
episode_progress - episode generation progress

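9. Summary of Architectural Strengths

Dimension	Design feature	Technical value
Extensibility	Adapter pattern + registry	Adding an AI platform only requires implementing the interface
Fault tolerance	Automatic multi-provider switching + transaction rollback	Improves system stability
Performance	Semaphore-based concurrency limiting + exponential-backoff retry	Prevents resource overload
Maintainability	Unified interface abstraction + shared utility modules	Lowers maintenance cost
Security	Transactional balance deduction + masking of sensitive data	Protects data
Observability	Complete task logs + notification system	Eases troubleshooting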