Compare commits
23 Commits
main...backend-wo
| Author | SHA1 | Date |
|---|---|---|
| | c60cb47ee1 | |
| | 061eb7d867 | |
| | 07384c5e19 | |
| | 4280624810 | |
| | 028718df0b | |
| | a6eddf1c14 | |
| | 9e307e307c | |
| | c2e9558f5b | |
| | c626ec51d6 | |
| | 1ac9b1bde3 | |
| | 1c451c6ab3 | |
| | 408c5fca47 | |
| | 2a1aa4c994 | |
| | ebac2e86b5 | |
| | 47653ee319 | |
| | 4d2a4a0299 | |
| | e6387cf7af | |
| | fde94f4698 | |
| | dddf410dcb | |
| | 301ec4fc3b | |
| | 2cfd7de5d5 | |
| | a2897ef2be | |
| | e6a5ea46a6 | |
@@ -1,146 +1,110 @@
|
|||||||
# SKG TK Remix Validation: Current Status (2026-05-13)
|
# SKG TK Remix Validation: Current Status (2026-05-18)
|
||||||
|
|
||||||
## In One Sentence
|
## In One Sentence
|
||||||
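Second approach for the SKG AI asset production pipeline: TK link / upload → split tracks → extract keyframes (5 automatic, plus manual additions) → Vision recognition → copy rewrite → image generation → video generation → compositing. **The MVP works through image generation; the remaining 3 nodes are placeholders.**
|
The product direction has been narrowed to "fast replication of in-feed ads". After a TK link is submitted or a video is uploaded, the source video is downloaded first, then the audio-copy route and the video-visual route run in parallel. The video-visual route automatically extracts 6 person-targeted random reference frames. Product assets form their own pool, with automatic view recognition and generation of missing angles. The storyboard workbench writes the new voiceover, person / product requirements, and first / last frame plans along a sentence-by-sentence timeline. Direct submission to the video model is currently paused; first and last frames are generated and reviewed one storyboard item at a time.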
|
||||||
|
|
||||||
## Paths / Ports
|
## Paths / Ports
|
||||||
- Path: `~/Projects/business/20260512-20260512-skg-tk-二创验证/`
|
- Current worktree: `/Users/kangwan/Projects/business/20260512-20260512-skg-tk-二创验证-backend/`
|
||||||
- web dev: `cd web && pnpm dev` (port **4290**)
|
- Main project path: `/Users/kangwan/Projects/business/20260512-20260512-skg-tk-二创验证/`
|
||||||
- api dev: `cd api && source .venv/bin/activate && uvicorn main:app --port 4291 --reload`
|
- Background start: `./scripts/start-dev-background.sh` (frontend 4290, backend 4291, managed by launchd)
|
||||||
- Test job: `?job=c6767f3a166b` (chrisorb, 71 s vertical TK video)
|
- Background stop: `./scripts/stop-dev-background.sh`
|
||||||
|
- web dev: `cd web && npm run dev`
|
||||||
|
- api dev: `cd api && uvicorn main:app --host 127.0.0.1 --port 4291`
|
||||||
|
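- Note: do not run the backend with `--reload` during long tasks such as download, frame extraction, audio processing, and image generation.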
|
||||||
|
|
||||||
## SKG Gateway Capabilities (measured; important!)
|
## Current Model Assignments
|
||||||
`base_url: https://ai.skg.com/ezlink/v1`
|
`LLM_BASE_URL` defaults to `https://ai.skg.com/ezlink/v1`; images also default to `IMAGE_BASE_URL=https://ai.skg.com/ezlink/v1`, voice defaults to `https://ai.skg.com/azure`, and production video defaults to `https://ai.skg.com/doubao`.
|
||||||
The key is stored in `LLM_API_KEY` in `api/.env`.
|
|
||||||
|
|
||||||
| Endpoint / field | Status | Purpose |
|
| Task | Current model / channel | Notes |
|
||||||
|---|---|---|
|
|---|---|---|
|
||||||
| `/v1/chat/completions` text-only | ✅ Works | translate / rewrite |
|
| TK download | `yt-dlp` + optional cookies | Public videos download without cookies; for restricted videos set `YTDLP_COOKIES_FILE` or `YTDLP_COOKIES_FROM_BROWSER`, or upload the MP4 directly. |
|
||||||
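| `/v1/chat/completions` + image_url | ✅ **Works** (earlier judged broken because the dog.jpg test image was corrupted) | Vision recognition of images (gemini-2.5-flash recommended) |
|
| Remote ASR | `ASR_MODEL=whisper-1` | On failure, falls back to local ASR, then to the multimodal fallback. |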
|
||||||
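| `/v1/chat/completions` + input_audio | ❌ Does not work | ASR cannot use this route |
|
| Local ASR | `LOCAL_ASR_MODEL=mlx-community/whisper-tiny` | Default second-level fallback; prioritizes producing a real sentence-by-sentence timeline. |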
|
||||||
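| `/v1/audio/transcriptions` (whisper) | ❌ 404 | The whole audio endpoint family is not exposed |
|
| ASR fallback / audio analysis | `ASR_FALLBACK_MODEL=gemini-2.5-flash` | Multimodal audio fallback; the backend rejects fake subtitles, repeated text, and results with too little coverage. |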
|
||||||
| `/v1/audio/speech` (tts) | ❌ 404 | |
|
| Subtitle translation | `TRANSLATE_MODEL=gemini-2.5-flash` | Stays on Gemini. |
|
||||||
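| `/v1/images/generations` (text→image) | ✅ Works | Image generation (gemini-3-pro-image-preview = nano-banana-pro) |
|
| Visual understanding | `VISION_MODEL=gpt-4o` | Keyframe Vision has moved to GPT; a `gemini-*` value left in an old environment is automatically normalized to `GPT_TEXT_MODEL`. |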
|
||||||
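| `/v1/images/generations` + image parameter | ✅ **Works** (image-to-image) | Verified that a reference image can be passed; the key finding |
|
| General rewrite / storyboard description | `REWRITE_MODEL=gpt-4o` | Moved to GPT; old Gemini overrides are automatically normalized. |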
|
||||||
| `/v1/images/edits` | ❌ 404 | |
|
| New voiceover rewrite | `AUDIO_REWRITE_MODEL=gpt-4o` | Follows `REWRITE_MODEL` by default; old Gemini overrides are automatically normalized. |
|
||||||
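| `/v1/videos/*` (sora-2) | ❌ 404 | Video generation needs IT to enable it, or an external key |
|
| Product view recognition | `PRODUCT_VIEW_MODEL=gpt-image-2` | Batch recognition of product image views, left / right, top / bottom, inner / outer sides, purpose, and risks. |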
|
||||||
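| `/v1/files` | ❌ 403 "必须指定渠道" (a channel must be specified) | |
|
| All image generation / editing | `gpt-image-2` | Hard-locked on the server with no image-model fallback; covers keyframe image generation, watermark cleanup, element extraction, subject asset packs, missing product angles, and first / last frames. |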
|
||||||
|
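| Voiceover | `VOICE_PROVIDER=azure_openai` + `AZURE_TTS_MODEL=gpt-4o-mini-tts` | Voice is fixed to Azure OpenAI TTS. The backend tries the paths in `AZURE_TTS_PATHS` in order, which helps tell a wrong path apart from the whole voice service being unavailable. |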
|
||||||
|
| Video | `VIDEO_MODEL=seedance` | Direct submission is paused in the current main flow; the production channel defaults to `ai.skg.com/doubao`, and the real Seedance ID is set via `VIDEO_MODEL_SEEDANCE`. |
|
||||||
|
|
||||||
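The normalization rule noted in the table above (an old `gemini-*` value for `VISION_MODEL` or `REWRITE_MODEL` falls back to `GPT_TEXT_MODEL`, and `AUDIO_REWRITE_MODEL` follows `REWRITE_MODEL`) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `api/main.py`; the function names `normalize_text_model` and `resolve_models` are hypothetical.

```python
from typing import Dict, Optional


def normalize_text_model(value: Optional[str], fallback: str) -> str:
    """Return `fallback` when the value is empty or names a Gemini model."""
    name = (value or "").strip()
    if not name or name.lower().startswith("gemini-"):
        return fallback
    return name


def resolve_models(env: Dict[str, str]) -> Dict[str, str]:
    """Resolve the three GPT-routed models from an environment mapping (assumed shape)."""
    gpt_text = (env.get("GPT_TEXT_MODEL") or "").strip() or "gpt-4o"
    rewrite = normalize_text_model(env.get("REWRITE_MODEL"), gpt_text)
    vision = normalize_text_model(env.get("VISION_MODEL"), gpt_text)
    # AUDIO_REWRITE_MODEL follows REWRITE_MODEL when unset or set to gemini-*.
    audio_rewrite = normalize_text_model(env.get("AUDIO_REWRITE_MODEL"), rewrite)
    return {
        "REWRITE_MODEL": rewrite,
        "VISION_MODEL": vision,
        "AUDIO_REWRITE_MODEL": audio_rewrite,
    }
```

Passing a plain mapping instead of reading `os.environ` directly keeps the rule easy to test; a real backend could call `resolve_models(dict(os.environ))` at startup.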
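**Gateway backend = one-hub multi-channel proxy.** The current key group is named 「纯OpenAI+AWSClaude+Gemini官方」 and lacks an audio channel (`gpt-4o-audio-preview` returns 503 "无可用渠道", i.e. no available channel) and a video channel.
|
## Current Main Flow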
|
||||||
|
| Step | Module | Status | Notes |
|
||||||
## Model Selection (written to api/.env)
|
|
||||||
```
|
|
||||||
ASR_MODEL=whisper-1 # ⚠️ endpoint returns 404; ASR has not actually run yet
|
|
||||||
TRANSLATE_MODEL=gemini-2.5-flash # ✅ text works
|
|
||||||
REWRITE_MODEL=gemini-2.5-pro # placeholder
|
|
||||||
VISION_MODEL=gemini-2.5-flash # ✅ recognition works
|
|
||||||
IMAGE_MODEL=gemini-3-pro-image-preview # ✅ nano-banana-pro, i2i works
|
|
||||||
```
|
|
||||||
|
|
||||||
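## Pipeline Status (merged 8-node version)
|
|
||||||
The original 10 nodes have been merged: input + download + split are one node; translate is folded into transcript; videogen and compose are placeholders.
|
|
||||||
|
|
||||||
| Step | Node | Status | Notes |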
|
|
||||||
|---|---|---|---|
|
|---|---|---|---|
|
||||||
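| 1 | **Input** (download + split merged) | ✅ | Real download via yt-dlp + wav split via ffmpeg |
|
| 1 | Input / download | Working | A TK link or uploaded video creates a job; once the download finishes it enters the analysis queue. |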
|
||||||
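| 2 | **Keyframes** | ✅ | Heuristic D: 30 candidates → pHash dedup + Laplacian variance scoring + temporal bucketing → 5 frames; manual frame adding works |
|
| 2 | Audio-copy route | Working | Splits `audio.wav`, then runs ASR, translation, and speaker / rhythm / background-sound analysis; results are collapsed by default. |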
|
||||||
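| 3 | **Transcription (ASR)** | ❌ Blocked | SKG gateway audio does not work; waiting for IT to open an audio channel or for an external key |
|
| 3 | Video-visual route | Working | Automatically extracts 6 person-targeted random reference frames; the current workspace adds frames manually by playback second of the original 9:16 video. |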
|
||||||
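| 4 | **Translate** | ❌ Blocked | Depends on ASR |
|
| 4 | Similar-subject assets | Working | Uses keyframes and optional built-in characters to generate 10 white-background views of the same subject. |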
|
||||||
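| 5 | **Rewrite** | ⏳ Placeholder | Waiting for the user to provide a product info template |
|
| 5 | Product asset pool | Working | Uploaded and built-in product images go into one pool; views, structural points, purpose, and risks are recognized automatically, and missing angles can be generated. |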
|
||||||
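| 6 | **Image Gen** | ✅ **Just finished** | nano-banana-pro i2i + positive / negative prompts |
|
| 6 | Storyboard workbench | Working | Edit the new voiceover, shot type, person / product toggles, and first / last frame plans along the sentence-by-sentence timeline. |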
|
||||||
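| 7 | **Video Gen** | ⏳ Placeholder | sora-2 endpoint does not work |
|
| 7 | First / last frame gate | Working | Each storyboard item first generates first / last frames from similar-subject views and product assets; they are saved after review. |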
|
||||||
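| 8 | **Compose** | ⏳ Placeholder | Local ffmpeg + subtitles + TTS |
|
| 8 | Video candidates | Direct submission paused | Historical candidates remain on display; one-click Seedance submission is disabled, and single-item submission opens after first / last frames are reviewed. |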
|
||||||
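The "download first, then run the audio-copy route and the video-visual route in parallel" shape of the flow above can be sketched with `asyncio.gather`. This is a hedged illustration only: the coroutine names and return shapes are hypothetical stand-ins, and the real backend may schedule these routes differently (for example with background tasks).

```python
import asyncio
from typing import Any, Dict


async def run_audio_route(job_id: str) -> Dict[str, Any]:
    # Placeholder for: split audio.wav, ASR, translation, sound analysis.
    await asyncio.sleep(0)
    return {"route": "audio", "job_id": job_id}


async def run_visual_route(job_id: str, frames: int = 6) -> Dict[str, Any]:
    # Placeholder for: extract `frames` person-targeted reference frames.
    await asyncio.sleep(0)
    return {"route": "visual", "job_id": job_id, "frames": frames}


async def analyze(job_id: str) -> Dict[str, Any]:
    # Both routes start only after the download step has finished (not shown).
    audio, visual = await asyncio.gather(
        run_audio_route(job_id),
        run_visual_route(job_id),
    )
    return {"audio": audio, "visual": visual}


result = asyncio.run(analyze("demo-job"))
```

`asyncio.gather` returns results in argument order, so `audio` and `visual` stay paired with their routes regardless of which one finishes first.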
|
|
||||||
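## UI Architecture (important)
|
|
||||||
- **Left sidebar** (very narrow, 108px): 8 stage tiles stacked vertically, with DAG branching shown
|
|
||||||
- **Main area ReactFlow**: 8-node DAG (input → keyframe/asr → ... → compose)
|
|
||||||
- **Clicking a sidebar tile**: a drawer panel slides out from the left (pink / purple / orange Kanban style)
|
|
||||||
- **Keyframe lightbox**: **embedded in the keyframe drawer** (not fullscreen) via `<FrameLightbox embedded ... />`; the drawer is 760 wide when expandedFrame is set and 400 otherwise
|
|
||||||
- **Above the Input node**: floating strip of multi-video thumbnails + a "+" button to add a new video
|
|
||||||
- **Above the keyframe node**: 5+ thumbnails at the video's original ratio (aspect-ratio: width/height)
|
|
||||||
- **Thumbnail hover**: shows a static enlarged image (keyframes are reference-image material, so no video is played)
|
|
||||||
- **Thumbnail click**: opens the lightbox inside the keyframe drawer (large image on the left + recognition panel on the right)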
|
|
||||||
|
|
||||||
## Data Model (key TypeScript / pydantic)
|
|
||||||
```typescript
|
|
||||||
KeyFrame {
|
|
||||||
index: number // stable ID (not contiguous! the frames array is sorted by timestamp)
|
|
||||||
timestamp: number
|
|
||||||
url: string
|
|
||||||
description?: {
|
|
||||||
scene, objects: [{name, position, color, extract_prompt}],
|
|
||||||
style, suggested_prompt
|
|
||||||
}
|
|
||||||
generated_images?: [{ id, prompt, model, mode, url, selected, created_at }]
|
|
||||||
}
|
|
||||||
|
|
||||||
Job { frames: KeyFrame[] ... }
|
|
||||||
```
|
|
||||||
|
|
||||||
**The frontend must fetch a frame with `frames.find(x => x.index === activeIndex)`, not by array subscript** (an earlier bug).
|
|
||||||
|
|
||||||
## Key Files
|
## Key Files
|
||||||
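- `web/app/page.tsx`: multi-job state management (jobs[] + activeJobId), 8-node LAYOUT
|
- `api/main.py`: FastAPI backend, model routing, job state, ASR / translation / audio analysis, image generation, product recognition, first / last frames, and video endpoints.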
|
||||||
- `web/components/dashboard.tsx`: sidebar + drawer + 9 Kanban sections (input/keyframe/asr/translate/rewrite/imagegen/videogen/compose), including the `ImageGenCard` subcomponent
|
- `api/database.py`: backend database layer; currently uses SQLite to store document / job / media asset metadata, while media files stay in `jobs/<jobId>/`.
|
||||||
- `web/components/lightbox.tsx`: `FrameLightbox` supports the `embedded` prop
|
- `api/.env.example`: local model and gateway template; already includes `GPT_TEXT_MODEL=gpt-4o`.
|
||||||
- `web/components/video-lightbox.tsx`: player opened by clicking a video thumbnail on the Input node
|
- `deploy/.env.production.example`: production environment template; video defaults to the SKG Doubao / Seedance gateway.
|
||||||
- `web/components/nodes/index.tsx`: definitions of the 8 ReactFlow nodes
|
- `RULES.md`: startup, deployment facts, model environment variables, and project rules.
|
||||||
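- `web/lib/api.ts`: API client
|
- `docs/source-analysis.html`: source walkthrough page; any change that affects product understanding, endpoints, model assignments, or operating paths must be synced here.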
|
||||||
- `api/main.py`: all FastAPI endpoints, KeyFrame/GeneratedImage models
|
- `web/components/ad-recreation-board.tsx`: the current main workbench for in-feed ad replication.
|
||||||
|
- `web/components/media-asset-tile.tsx`: unified media asset thumbnail component with hover zoom, delete, and status overlay.
|
||||||
|
- `web/lib/api.ts`: frontend API client and types annotating the running models.
|
||||||
|
|
||||||
## Working API Endpoints
|
## Main API
|
||||||
```
|
```
|
||||||
POST /jobs  create job (link)
|
|
||||||
POST /jobs/upload  upload video
|
|
||||||
GET /jobs/{id}  job status
|
|
||||||
POST /jobs/{id}/analyze?frames=5  split tracks + extract frames + ASR in one automatic pass
|
|
||||||
POST /jobs/{id}/frames?t=<sec>  manually add a frame by timestamp
|
|
||||||
POST /jobs/{id}/frames/{idx}/describe  ✅ Vision recognition (3 retries + reasoning_content fallback)
|
|
||||||
POST /jobs/{id}/frames/{idx}/generate  ✅ image generation (i2i / text-only, with negative_prompt)
|
|
||||||
GET /jobs/{id}/frames/{idx}/gen/{gen_id}.jpg  generated image binary
|
|
||||||
POST /jobs/{id}/frames/{idx}/gen/{gen_id}/select  select a gen for downstream use
|
|
||||||
GET /jobs/{id}/video.mp4  original video
|
|
||||||
GET /jobs/{id}/frames/{idx}.jpg  keyframe jpg
|
|
||||||
GET /health
|
GET /health
|
||||||
|
GET /documents
|
||||||
|
POST /jobs
|
||||||
|
POST /jobs/{id}/download/retry
|
||||||
|
POST /jobs/upload
|
||||||
|
GET /jobs
|
||||||
|
GET /jobs/{id}
|
||||||
|
DELETE /jobs/{id}
|
||||||
|
POST /jobs/{id}/analyze
|
||||||
|
POST /jobs/{id}/transcribe
|
||||||
|
POST /jobs/{id}/frames?t=<sec>
|
||||||
|
DELETE /jobs/{id}/frames/{idx}
|
||||||
|
POST /jobs/{id}/frames/{idx}/describe
|
||||||
|
POST /jobs/{id}/frames/{idx}/cleanup
|
||||||
|
POST /jobs/{id}/frames/{idx}/cleanup/apply
|
||||||
|
POST /jobs/{id}/frames/{idx}/generate
|
||||||
|
POST /jobs/{id}/frames/{idx}/scene-asset
|
||||||
|
POST /jobs/{id}/frames/{idx}/elements
|
||||||
|
POST /jobs/{id}/frames/{idx}/elements/{element_id}/cutout
|
||||||
|
POST /jobs/{id}/frames/{idx}/elements/{element_id}/subject-assets
|
||||||
|
POST /jobs/{id}/assets
|
||||||
|
PUT /jobs/{id}/product-refs
|
||||||
|
POST /jobs/{id}/assets/product-views/analyze
|
||||||
|
POST /jobs/{id}/assets/product-angle
|
||||||
|
POST /jobs/{id}/script/rewrite
|
||||||
|
PUT /jobs/{id}/frames/{idx}/storyboard
|
||||||
|
POST /jobs/{id}/frames/{idx}/storyboard/video
|
||||||
```
|
```
|
||||||
|
|
||||||
## Known Pitfalls / Do Not Repeat
|
## Current Constraints / Pitfalls to Avoid
|
||||||
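1. **Keyframe index is not contiguous**: after frames are added manually, the frames array is sorted by timestamp and index is a stable ID. The lightbox must use `frames.find(x => x.index === activeIndex)`, **not** `frames[activeIndex]`.
|
1. Thumbnails for images / videos / extracted frames / product images / generated images / first and last frames / video candidates reuse `web/components/media-asset-tile.tsx` by default.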
|
||||||
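2. **The earlier SKG gateway vision test result was wrong**: the `dog.jpg` 200px Wikipedia thumbnail was corrupted or had abnormal metadata, which made image input look broken. Testing with a standard PNG or a real JPEG works.
|
2. Every image generation entry point allows only `gpt-image-2` on the server; do not re-add Gemini image models or any other fallback.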
|
||||||
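3. **Gemini 2.5 Flash has thinking enabled by default**, so the `content` field is often empty (all tokens go to reasoning); as a fallback, extract the JSON from `reasoning_content` with a regex.
|
3. Visual understanding and copy rewriting default to GPT: `VISION_MODEL`, `REWRITE_MODEL`, and `AUDIO_REWRITE_MODEL` intercept old `gemini-*` overrides.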
|
||||||
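4. **Thumbnail aspect-ratio**: must adapt via `aspectRatio: ${job.width}/${job.height}`; do not force `aspect-video` 16:9 (vertical videos get cropped).
|
4. Gemini is still used in the ASR fallback / audio analysis / translation chain; do not remove it by mistake.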
|
||||||
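5. **ReactFlow `type="input"` / `"output"` are reserved**: they come with a default white background, so override it with CSS `.react-flow .react-flow__node-input { background: transparent !important; ... }`.
|
5. Voice goes only through Azure OpenAI TTS; do not add or depend on any other voiceover channel configuration.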
|
||||||
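6. **ReactFlow 12 colorMode is independent of next-themes**: link them with `<ReactFlow colorMode={resolvedTheme}>`, otherwise nodes get a white background.
|
6. The current main flow does not batch-submit videos directly; it goes through "storyboard planning → first / last frames → human review" first.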
|
||||||
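7. **FastAPI BackgroundTasks usage**: call `bg.add_task(func, arg)` with the function and its arguments; do not pass a coroutine object.
|
7. The product asset pool is assumed to hold "the same product"; no product identity discrimination is done. View recognition must describe sides as the wearer's left / right, top / bottom, and inner / outer.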
|
||||||
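8. **The ffmpeg 8 mjpeg encoder rejects yuv420p**: frame extraction must add `-pix_fmt yuvj420p`, and `-vsync` must be replaced with `-fps_mode`.
|
8. Automatic frame extraction defaults to `frames=6` + `target=random_subject` + `quality=accurate` + `mode=replace`; for a specific action or expression, add a frame manually with "extract frame at current point".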
|
||||||
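9. **Frame extraction speed**: scene-change detection (`select='gt(scene,0.4)'`) is very slow (30 s+ for a 71 s video); switched to uniform sampling with fast seek (5 frames in under 3 seconds).
|
9. Documents are the top-level business grouping: each TK link or uploaded video gets one `document` by default, and a `job` belongs to a `document_id`; the DB stores metadata and file indexes, while video / image / audio files stay out of the DB.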
|
||||||
|
10. Do not use `--reload` for long backend tasks.
|
||||||
|
11. The keyframe `index` is a stable ID, not an array subscript; the frontend fetches a frame with `frames.find(x => x.index === idx)`.
|
||||||
|
12. TikTok cookies are account login state and may only live in a private local or server environment; do not commit cookies files or account passwords.
|
||||||
|
|
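The stable-ID rule for keyframes (look a frame up by its `index` field, never by list position) applies equally to any backend code that handles the same `frames` list. A minimal Python sketch of the idea, assuming frames are plain dicts with `index` and `timestamp` keys; the helper name `find_frame` is hypothetical:

```python
from typing import Optional


def find_frame(frames: list, idx: int) -> Optional[dict]:
    """Return the frame whose stable `index` equals idx, or None if absent."""
    return next((f for f in frames if f["index"] == idx), None)


# Frame 5 was added manually at 12.5 s, so after sorting by timestamp
# its list position (1) no longer matches its stable index (5).
frames = sorted(
    [
        {"index": 0, "timestamp": 1.0},
        {"index": 1, "timestamp": 30.0},
        {"index": 5, "timestamp": 12.5},
    ],
    key=lambda f: f["timestamp"],
)

by_id = find_frame(frames, 1)   # the frame with stable ID 1
by_position = frames[1]         # a different frame: stable ID 5
```

Returning `None` for an unknown ID lets the caller decide whether a missing frame is an error (for example, after a delete) instead of raising inside the lookup.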
||||||
## To Do (by priority)
|
## Recent Changes
|
||||||
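1. **ASR blocked**: ask SKG IT to open an audio channel, or provide an external ASR key (Deepgram / iFlytek / direct OpenAI)
|
- 2026-05-18: TK link download gained `YTDLP_COOKIES_FILE` / `YTDLP_COOKIES_FROM_BROWSER` support; when a restricted video fails, the frontend prompts the user to upload an MP4 or configure backend cookies login state.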
|
||||||
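2. **Image generation test feedback**: just finished; waiting for the user to try it in the browser, then tune the negative prompt / model choice
|
- 2026-05-18: failed jobs on the asset input side support re-download / re-analysis; selecting a failed TK asset with no `video_url` calls the backend retry endpoint, while a failed job that already has a video clears the auto-trigger flag and reruns the audio / visual routes.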
|
||||||
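3. **Regional editing (inpainting)**: discussed with the user; options are A pure prompt / B rectangle / C brush mask / D SAM; shelved for now
|
- 2026-05-18: cleaned up leftovers of the personal voice channel; `/health`, frontend types, environment templates, and docs no longer expose related fields or configuration.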
|
||||||
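4. **Rewrite**: waiting for the user to provide a product info card template
|
- 2026-05-18: added the backend database layer; SQLite defaults to `APP_DB_URL` / `DATABASE_URL` or `JOBS_DIR/app.db`; `/documents` returns the document grouping list and `/health.database` returns DB status.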
|
||||||
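5. **Video generation**: sora-2 does not work through the SKG endpoint; considering external keys (Runway/Kling/Veo3)
|
- 2026-05-18: `VISION_MODEL`, `REWRITE_MODEL`, and `AUDIO_REWRITE_MODEL` switched to the GPT default model `gpt-4o`, with normalization protection for old Gemini environment variables.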
|
||||||
6. **Compose**: fully local ffmpeg + subtitles + TTS
|
- 2026-05-18: voice channel fixed to Azure OpenAI TTS, trying voice paths in `AZURE_TTS_PATHS` order.
|
||||||
|
- 2026-05-18: the current main path pauses direct video submission in favor of a per-item first / last frame gate.
|
||||||
## Workflow (development session)
|
- 2026-05-18: media asset interactions are unified in `MediaAssetTile`.
|
||||||
```bash
|
- 2026-05-18: product image view recognition and missing-angle generation converge on `gpt-image-2`.
|
||||||
# 1. Start the backend (if not running)
|
|
||||||
cd ~/Projects/business/20260512-20260512-skg-tk-二创验证/api
|
|
||||||
source .venv/bin/activate
|
|
||||||
uvicorn main:app --port 4291 --reload
|
|
||||||
|
|
||||||
# 2. Start the frontend (if not running)
|
|
||||||
cd ../web
|
|
||||||
pnpm dev
|
|
||||||
|
|
||||||
# 3. Browser
|
|
||||||
open http://localhost:4290/?job=c6767f3a166b
|
|
||||||
```
|
|
||||||
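The 2026-05-18 cookies change listed under recent changes could map onto yt-dlp command-line options roughly as below. This is a sketch under assumptions: the helper name `ytdlp_cookie_args` is hypothetical and the actual backend code may differ. `--cookies` and `--cookies-from-browser` are existing yt-dlp options, and the file-first precedence follows the description in `RULES.md`.

```python
from typing import Dict, List


def ytdlp_cookie_args(env: Dict[str, str]) -> List[str]:
    """Build optional yt-dlp cookie arguments from an environment mapping."""
    cookies_file = (env.get("YTDLP_COOKIES_FILE") or "").strip()
    browser = (env.get("YTDLP_COOKIES_FROM_BROWSER") or "").strip()
    if cookies_file:
        # A cookies file takes precedence over browser cookies.
        return ["--cookies", cookies_file]
    if browser:
        return ["--cookies-from-browser", browser]
    # Public videos are downloaded without any login state.
    return []


# Hypothetical usage: prepend the result to the rest of the yt-dlp command.
command = ["yt-dlp", *ytdlp_cookie_args({"YTDLP_COOKIES_FROM_BROWSER": "chrome"})]
```

Because the values are account login state, a caller should avoid logging the resulting argument list verbatim when a cookies file path is present.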
|
|
||||||
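## User Preference Reminders (feedback memory)
|
|
||||||
- feedback_image-gen-model: always use nano-banana-pro for image generation ✅
|
|
||||||
- feedback_keep-scope-small: keep small requests small
|
|
||||||
- feedback_flow-dont-stop: keep executing through delivery; ask only at a real fork
|
|
||||||
- feedback_demand-before-infra: before building infrastructure, ask back about who / pain point / frequency
|
|
||||||
- feedback_no-guessing-ports: verify before operating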
|
|
||||||
|
|||||||
@@ -1,105 +1,5 @@
|
|||||||
{
|
{
|
||||||
"entries": [
|
"entries": [
|
||||||
{
|
|
||||||
"files_changed": 5,
|
|
||||||
"hash": "d802701",
|
|
||||||
"message": "auto-save 2026-05-15 17:22 (~4, -1)",
|
|
||||||
"ts": "2026-05-15T17:22:54+08:00",
|
|
||||||
"type": "commit"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"files_changed": 2,
|
|
||||||
"message": "Codex 会话活跃 · 最近命令:codex · 2 项未提交变更 · 最近提交:auto-save 2026-05-15 17:22 (~4, -1)",
|
|
||||||
"ts": "2026-05-15T09:24:48Z",
|
|
||||||
"type": "session-heartbeat"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"files_changed": 3,
|
|
||||||
"hash": "dcd8560",
|
|
||||||
"message": "auto-save 2026-05-15 17:28 (~3)",
|
|
||||||
"ts": "2026-05-15T17:28:27+08:00",
|
|
||||||
"type": "commit"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"files_changed": 1,
|
|
||||||
"hash": "25c4723",
|
|
||||||
"message": "auto-save 2026-05-15 17:33 (~1)",
|
|
||||||
"ts": "2026-05-15T17:33:59+08:00",
|
|
||||||
"type": "commit"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"files_changed": 1,
|
|
||||||
"message": "Codex 会话活跃 · 最近命令:codex · 1 项未提交变更 · 最近提交:auto-save 2026-05-15 17:33 (~1)",
|
|
||||||
"ts": "2026-05-15T09:34:48Z",
|
|
||||||
"type": "session-heartbeat"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"files_changed": 1,
|
|
||||||
"hash": "1110500",
|
|
||||||
"message": "auto-save 2026-05-15 17:39 (~1)",
|
|
||||||
"ts": "2026-05-15T17:39:32+08:00",
|
|
||||||
"type": "commit"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"files_changed": 1,
|
|
||||||
"message": "Codex 会话活跃 · 最近命令:codex · 1 项未提交变更 · 最近提交:auto-save 2026-05-15 17:39 (~1)",
|
|
||||||
"ts": "2026-05-15T09:44:48Z",
|
|
||||||
"type": "session-heartbeat"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"files_changed": 1,
|
|
||||||
"hash": "0b97d03",
|
|
||||||
"message": "auto-save 2026-05-15 17:44 (~1)",
|
|
||||||
"ts": "2026-05-15T17:45:02+08:00",
|
|
||||||
"type": "commit"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"files_changed": 1,
|
|
||||||
"hash": "eeeaebd",
|
|
||||||
"message": "auto-save 2026-05-15 17:50 (~1)",
|
|
||||||
"ts": "2026-05-15T17:50:32+08:00",
|
|
||||||
"type": "commit"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"files_changed": 3,
|
|
||||||
"message": "Codex 会话活跃 · 最近命令:codex · 3 项未提交变更 · 最近提交:auto-save 2026-05-15 17:50 (~1)",
|
|
||||||
"ts": "2026-05-15T09:54:48Z",
|
|
||||||
"type": "session-heartbeat"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"files_changed": 4,
|
|
||||||
"hash": "a662130",
|
|
||||||
"message": "auto-save 2026-05-15 17:55 (+1, ~3)",
|
|
||||||
"ts": "2026-05-15T17:56:05+08:00",
|
|
||||||
"type": "commit"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"files_changed": 2,
|
|
||||||
"hash": "fae3fb3",
|
|
||||||
"message": "auto-save 2026-05-15 18:01 (~2)",
|
|
||||||
"ts": "2026-05-15T18:01:35+08:00",
|
|
||||||
"type": "commit"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"files_changed": 1,
|
|
||||||
"message": "Codex 会话活跃 · 最近命令:codex · 1 项未提交变更 · 最近提交:auto-save 2026-05-15 18:01 (~2)",
|
|
||||||
"ts": "2026-05-15T10:04:49Z",
|
|
||||||
"type": "session-heartbeat"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"files_changed": 1,
|
|
||||||
"hash": "84143bc",
|
|
||||||
"message": "auto-save 2026-05-15 18:06 (~1)",
|
|
||||||
"ts": "2026-05-15T18:07:06+08:00",
|
|
||||||
"type": "commit"
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"files_changed": 1,
|
|
||||||
"hash": "6c8bc42",
|
|
||||||
"message": "auto-save 2026-05-15 18:12 (~1)",
|
|
||||||
"ts": "2026-05-15T18:12:39+08:00",
|
|
||||||
"type": "commit"
|
|
||||||
},
|
|
||||||
{
|
{
|
||||||
"files_changed": 4,
|
"files_changed": 4,
|
||||||
"message": "Codex 会话活跃 · 最近命令:codex · 4 项未提交变更 · 最近提交:auto-save 2026-05-15 18:12 (~1)",
|
"message": "Codex 会话活跃 · 最近命令:codex · 4 项未提交变更 · 最近提交:auto-save 2026-05-15 18:12 (~1)",
|
||||||
@@ -3254,6 +3154,111 @@
|
|||||||
"message": "auto-save 2026-05-18 07:27 (~6)",
|
"message": "auto-save 2026-05-18 07:27 (~6)",
|
||||||
"hash": "9790e5b",
|
"hash": "9790e5b",
|
||||||
"files_changed": 6
|
"files_changed": 6
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"ts": "2026-05-18T14:30:08+08:00",
|
||||||
|
"type": "commit",
|
||||||
|
"message": "auto-save 2026-05-18 14:30 (~5)",
|
||||||
|
"hash": "e6a5ea4",
|
||||||
|
"files_changed": 5
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"ts": "2026-05-18T14:31:59+08:00",
|
||||||
|
"type": "commit",
|
||||||
|
"message": "chore: switch vision and rewrite models to gpt",
|
||||||
|
"hash": "a2897ef",
|
||||||
|
"files_changed": 0
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"ts": "2026-05-18T14:34:36+08:00",
|
||||||
|
"type": "commit",
|
||||||
|
"message": "chore: force gpt routing for vision and rewrite",
|
||||||
|
"hash": "2cfd7de",
|
||||||
|
"files_changed": 5
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"ts": "2026-05-18T14:38:02+08:00",
|
||||||
|
"type": "commit",
|
||||||
|
"message": "docs: refresh current project status",
|
||||||
|
"hash": "301ec4f",
|
||||||
|
"files_changed": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"ts": "2026-05-18T14:39:23+08:00",
|
||||||
|
"type": "commit",
|
||||||
|
"message": "chore: update development worklog",
|
||||||
|
"hash": "dddf410",
|
||||||
|
"files_changed": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"ts": "2026-05-18T14:46:24+08:00",
|
||||||
|
"type": "commit",
|
||||||
|
"message": "auto-save 2026-05-18 14:46 (~7)",
|
||||||
|
"hash": "e6387cf",
|
||||||
|
"files_changed": 7
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"ts": "2026-05-18T14:49:53+08:00",
|
||||||
|
"type": "commit",
|
||||||
|
"message": "fix: force azure openai tts voice path",
|
||||||
|
"hash": "4d2a4a0",
|
||||||
|
"files_changed": 4
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"ts": "2026-05-18T15:08:05+08:00",
|
||||||
|
"type": "commit",
|
||||||
|
"message": "auto-save 2026-05-18 15:07 (~5)",
|
||||||
|
"hash": "ebac2e8",
|
||||||
|
"files_changed": 5
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"ts": "2026-05-18T15:13:30+08:00",
|
||||||
|
"type": "commit",
|
||||||
|
"message": "auto-save 2026-05-18 15:13 (~8)",
|
||||||
|
"hash": "2a1aa4c",
|
||||||
|
"files_changed": 8
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"ts": "2026-05-18T15:29:47+08:00",
|
||||||
|
"type": "commit",
|
||||||
|
"message": "auto-save 2026-05-18 15:29 (+1, ~5)",
|
||||||
|
"hash": "1c451c6",
|
||||||
|
"files_changed": 6
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"ts": "2026-05-18T15:34:15+08:00",
|
||||||
|
"type": "commit",
|
||||||
|
"message": "feat: add backend document database",
|
||||||
|
"hash": "1ac9b1b",
|
||||||
|
"files_changed": 4
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"ts": "2026-05-18T15:40:58+08:00",
|
||||||
|
"type": "commit",
|
||||||
|
"message": "fix: backfill database on startup",
|
||||||
|
"hash": "c2e9558",
|
||||||
|
"files_changed": 1
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"ts": "2026-05-18T15:51:30+08:00",
|
||||||
|
"type": "commit",
|
||||||
|
"message": "chore: remove personal voice channel remnants",
|
||||||
|
"hash": "a6eddf1",
|
||||||
|
"files_changed": 7
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"ts": "2026-05-18T16:35:29+08:00",
|
||||||
|
"type": "commit",
|
||||||
|
"message": "feat: support tiktok download cookies",
|
||||||
|
"hash": "4280624",
|
||||||
|
"files_changed": 9
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"ts": "2026-05-18T16:49:51+08:00",
|
||||||
|
"type": "commit",
|
||||||
|
"message": "fix: allow retrying failed source analysis",
|
||||||
|
"hash": "061eb7d",
|
||||||
|
"files_changed": 6
|
||||||
}
|
}
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
|
|||||||
19
RULES.md
19
RULES.md
@@ -11,7 +11,7 @@
|
|||||||
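- See the project decision section of `CLAUDE.md` + the seven-step pipeline breakdown in `.memory/plan.md`
|
- See the project decision section of `CLAUDE.md` + the seven-step pipeline breakdown in `.memory/plan.md`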
|
||||||
- Style: `04-Dark-Gallery-Ambient` (path: `~/Projects/research/20260305-网页风格库/04-Dark-Gallery-Ambient.md`)
|
- Style: `04-Dark-Gallery-Ambient` (path: `~/Projects/research/20260305-网页风格库/04-Dark-Gallery-Ambient.md`)
|
||||||
- First sprint: steps 1-4 (download / track split / keyframes / ASR + translation)
|
- First sprint: steps 1-4 (download / track split / keyframes / ASR + translation)
|
||||||
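- Current product direction (reconfirmed 2026-05-18): solve the first step of fast in-feed ad replication first, and drop the old approach of "linearly completing frame extraction, storyboarding, element generation, and compositing after start". The main UI is "asset input column on the left + in-feed replication worksheet on the right". After pasting a TK link or uploading a video, the user clicks "Start analysis" and the system downloads the source video automatically. Once the download finishes, two routes start in parallel: the audio-copy route extracts the original audio copy / subtitles and analyzes the speaker, speech pace and rhythm, and background music / ambient sound / sound effects; the video-visual route automatically extracts reference frames so a human can pick usable subjects and generate white-background views of a similar subject. Uploaded product images form a separate product asset pack, with automatic recognition of view / structure / proportions and generation of missing angles. The storyboard workbench plans the new voiceover, shot type, first / last frames, person requirements, and product appearance along a sentence-by-sentence timeline. Direct calls to the video model are currently paused: first / last frames are generated and reviewed one item at a time using "similar-subject views + product asset pool + first / last frame text plan", and after the plan is saved a decision is made on which storyboard items become single video candidates.
|
- Current product direction (reconfirmed 2026-05-18): solve the first step of fast in-feed ad replication first, and drop the old approach of "linearly completing frame extraction, storyboarding, element generation, and compositing after start". The main UI is "asset input column on the left + in-feed replication worksheet on the right". After pasting a TK link or uploading a video, the user clicks "Start analysis" and the system downloads the source video automatically. Once the download finishes, two routes start in parallel: the audio-copy route extracts the original audio copy / subtitles and analyzes the speaker, speech pace and rhythm, and background music / ambient sound / sound effects; the video-visual route automatically extracts 6 person-targeted random reference frames so a human can pick usable subjects and generate white-background views of a similar subject. Uploaded product images form a separate product asset pack, with automatic recognition of view / structure / proportions and generation of missing angles. The storyboard workbench plans the new voiceover, shot type, first / last frames, person requirements, and product appearance along a sentence-by-sentence timeline. Direct calls to the video model are currently paused: first / last frames are generated and reviewed one item at a time using "similar-subject views + product asset pool + first / last frame text plan", and after the plan is saved a decision is made on which storyboard items become single video candidates.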
|
||||||
|
|
||||||
## Deployment Facts
|
## Deployment Facts
|
||||||
- Platform: VPS `76.13.31.179` (Ubuntu 24.04 / Docker Compose / Coolify Traefik)
|
- Platform: VPS `76.13.31.179` (Ubuntu 24.04 / Docker Compose / Coolify Traefik)
|
||||||
@@ -24,7 +24,7 @@
|
|||||||
- Server directory: `/opt/skg-marketing-studio`
|
- Server directory: `/opt/skg-marketing-studio`
|
||||||
- Production start: `docker compose -f docker-compose.prod.yml --env-file deploy/.env.production up -d --build`
|
- Production start: `docker compose -f docker-compose.prod.yml --env-file deploy/.env.production up -d --build`
|
||||||
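- Production architecture: the `web` container serves the Next static export via Nginx; static resources required by the login page, such as `/login/`, `/_next/`, `/assets/`, `/skg-logo-black.svg`, and `/oasis-source/`, are publicly accessible; unauthenticated access to the workbench redirects to `/login/`; `/api/` is reverse-proxied to `skg-marketing-api:4291` after Nginx `auth_request` validates the FastAPI session cookie; Traefik connects ports 80/443 through the external `coolify` network
|
- Production architecture: the `web` container serves the Next static export via Nginx; static resources required by the login page, such as `/login/`, `/_next/`, `/assets/`, `/skg-logo-black.svg`, and `/oasis-source/`, are publicly accessible; unauthenticated access to the workbench redirects to `/login/`; `/api/` is reverse-proxied to `skg-marketing-api:4291` after Nginx `auth_request` validates the FastAPI session cookie; Traefik connects ports 80/443 through the external `coolify` network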
|
||||||
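- Persistent directory: server `./data/jobs` is mounted at `/data/jobs` in the backend
|
- Persistent directory: server `./data/jobs` is mounted at `/data/jobs` in the backend; the default backend database is `APP_DB_URL=sqlite:////data/jobs/app.db`, which stores only document / job / media asset metadata and file indexes, while original videos, audio, extracted frames, generated images, and video candidates remain in `/data/jobs/<jobId>/`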
|
||||||
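- Login credentials: the username is listed under Quick Login below; the plaintext password backup lives only on the server at `/root/skg-marketing-studio-login.txt`, and the production environment variables `WEB_AUTH_PASSWORD` / `WEB_AUTH_SESSION_SECRET` live only in the server's `deploy/.env.production`
|
- Login credentials: the username is listed under Quick Login below; the plaintext password backup lives only on the server at `/root/skg-marketing-studio-login.txt`, and the production environment variables `WEB_AUTH_PASSWORD` / `WEB_AUTH_SESSION_SECRET` live only in the server's `deploy/.env.production`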
|
||||||
|
|
||||||
## Quick Login
|
## Quick Login
|
||||||
@@ -56,20 +56,21 @@
|
|||||||
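- `ASR_TIMEOUT_SECONDS`: per-request timeout for remote ASR / audio analysis, default 45 seconds, so the first step does not hang in transcription for long
|
- `ASR_TIMEOUT_SECONDS`: per-request timeout for remote ASR / audio analysis, default 45 seconds, so the first step does not hang in transcription for long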
|
||||||
- `LOCAL_ASR_BIN` / `LOCAL_ASR_MODEL` / `LOCAL_ASR_TIMEOUT_SECONDS`: local ASR fallback, defaulting to `/opt/homebrew/bin/mlx_whisper` + `mlx-community/whisper-tiny`; used to produce a real sentence-by-sentence timeline while the SKG gateway's `/audio/transcriptions` is unavailable
|
- `LOCAL_ASR_BIN` / `LOCAL_ASR_MODEL` / `LOCAL_ASR_TIMEOUT_SECONDS`: local ASR fallback, defaulting to `/opt/homebrew/bin/mlx_whisper` + `mlx-community/whisper-tiny`; used to produce a real sentence-by-sentence timeline while the SKG gateway's `/audio/transcriptions` is unavailable
|
||||||
- `TRANSLATE_MODEL`: subtitle translation model, default `gemini-2.5-flash`
|
- `TRANSLATE_MODEL`: subtitle translation model, default `gemini-2.5-flash`
|
||||||
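- `REWRITE_MODEL`: general rewrite / storyboard description model, default `gemini-2.5-pro`
|
- `GPT_TEXT_MODEL`: default GPT text / vision model, default `gpt-4o`; used as the fallback that corrects old Gemini overrides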
|
||||||
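- `AUDIO_REWRITE_MODEL`: model for later audio voiceover rewriting, follows `REWRITE_MODEL` by default; the current first step does not call voiceover rewriting by default and keeps only the original copy and sound analysis
|
- `REWRITE_MODEL`: general rewrite / storyboard description model, default `gpt-4o`; if an old environment still sets `gemini-*`, the backend automatically uses `GPT_TEXT_MODEL` instead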
|
||||||
|
- `VISION_MODEL`: keyframe visual understanding model, default `gpt-4o`; if an old environment still sets `gemini-*`, the backend automatically uses `GPT_TEXT_MODEL` instead
|
||||||
|
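- `AUDIO_REWRITE_MODEL`: model for later audio voiceover rewriting, follows `REWRITE_MODEL` by default; if an old environment still sets `gemini-*`, the backend automatically uses `REWRITE_MODEL` instead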
|
||||||
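- `AUDIO_PRODUCT_BRIEF`: SKG product selling points injected when rewriting the audio voiceover
|
- `AUDIO_PRODUCT_BRIEF`: SKG product selling points injected when rewriting the audio voiceover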
|
||||||
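- `PRODUCT_VIEW_MODEL`: view labeling / automatic recognition model for the same-product asset pool; currently forced to `gpt-image-2` per project requirements
|
- `PRODUCT_VIEW_MODEL`: view labeling / automatic recognition model for the same-product asset pool; currently forced to `gpt-image-2` per project requirements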
|
||||||
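- `IMAGE_BASE_URL` / `IMAGE_API_KEY` / `IMAGE_MODEL`: OpenAI-compatible image generation gateway; every image generation entry point is currently forced to `gpt-image-2`, with no fallback to other image models
|
- `IMAGE_BASE_URL` / `IMAGE_API_KEY` / `IMAGE_MODEL`: OpenAI-compatible image generation gateway; every image generation entry point is currently forced to `gpt-image-2`, with no fallback to other image models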
|
||||||
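- `GPT_IMAGE_MODEL` / `SUBJECT_ASSET_IMAGE_MODEL` / `SUBJECT_ASSET_IMAGE_MODELS`: kept for compatibility with old environment variable names, but the server forces the subject 6-view set and every other image generation entry point to use only `gpt-image-2`
|
- `GPT_IMAGE_MODEL` / `SUBJECT_ASSET_IMAGE_MODEL` / `SUBJECT_ASSET_IMAGE_MODELS`: kept for compatibility with old environment variable names, but the server forces the subject 6-view set and every other image generation entry point to use only `gpt-image-2`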
|
||||||
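- `AI_HTTP_PROXY` / `IMAGE_HTTP_PROXY`: optional outbound proxy for the AI gateway; a local launchd background process does not necessarily inherit the shell's `http_proxy/https_proxy`, so if image generation reports DNS / ConnectError, configure this in the local `api/.env` and restart the backend. `/health` reports only whether a proxy is configured, not the proxy address.
|
- `AI_HTTP_PROXY` / `IMAGE_HTTP_PROXY`: optional outbound proxy for the AI gateway; a local launchd background process does not necessarily inherit the shell's `http_proxy/https_proxy`, so if image generation reports DNS / ConnectError, configure this in the local `api/.env` and restart the backend. `/health` reports only whether a proxy is configured, not the proxy address.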
|
||||||
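- `VOICE_PROVIDER`: voiceover channel, currently fixed to `azure_openai`
|
- `YTDLP_COOKIES_FILE` / `YTDLP_COOKIES_FROM_BROWSER`: optional TikTok download login state; the cookies file is preferred, then cookies read from the local browser. A cookies file is sensitive login state and may only live in a private local or server path; it must never be committed.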
|
||||||
|
- `VOICE_PROVIDER`: voiceover channel, fixed to `azure_openai` on the server
|
||||||
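- `AZURE_OPENAI_BASE_URL` / `AZURE_OPENAI_API_KEY`: Microsoft Azure OpenAI protocol voiceover gateway; when no separate key is configured locally, it falls back to reusing `LLM_API_KEY`
|
- `AZURE_OPENAI_BASE_URL` / `AZURE_OPENAI_API_KEY`: Microsoft Azure OpenAI protocol voiceover gateway; when no separate key is configured locally, it falls back to reusing `LLM_API_KEY`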
|
||||||
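- `AZURE_TTS_MODEL` / `AZURE_TTS_VOICE_ID` / `AZURE_TTS_VOICE_POOL` / `AZURE_TTS_PATH`: Azure OpenAI TTS model, default voice, voice pool, and OpenAI-protocol speech path
|
- `AZURE_TTS_MODEL` / `AZURE_TTS_VOICE_ID` / `AZURE_TTS_VOICE_POOL` / `AZURE_TTS_PATH` / `AZURE_TTS_PATHS`: Azure OpenAI TTS model, default voice, voice pool, and OpenAI-protocol speech paths; the backend tries the entries in `AZURE_TTS_PATHS` in order, which helps tell a wrong path apart from the whole voice service being unavailable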
|
||||||
- `MINIMAX_API_KEY`: MiniMax T2A voiceover key; may only live in the local `api/.env` and must not be committed; not called by default in the current first step
|
|
||||||
- `MINIMAX_TTS_BASE_URL` / `MINIMAX_TTS_MODEL` / `MINIMAX_TTS_VOICE_ID`: legacy MiniMax voiceover endpoint, model, and fallback voice configuration, kept only for compatibility; not the default voice channel
|
|
||||||
- `MINIMAX_TTS_VOICE_POOL`: MiniMax random English voice pool; current defaults are male `English_magnetic_voiced_man`, female `English_Upbeat_Woman`, and mature `English_MaturePartner`, for the later new-voiceover stage
|
|
||||||
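- `POE_API_KEY` / `VIDEO_API_KEY`: video generation channel keys; may only live in local environment variables
|
- `POE_API_KEY` / `VIDEO_API_KEY`: video generation channel keys; may only live in local environment variables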
|
||||||
|
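- `APP_DB_URL` / `DATABASE_URL`: backend metadata database; the built-in implementation supports `sqlite:///`, and production defaults to `sqlite:////data/jobs/app.db`. Document grouping uses `documents` as the top level: one TK link or one upload is one document by default, and `jobs` and `media_assets` belong to a `document_id`.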
|
||||||
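- `WEB_AUTH_USERNAME` / `WEB_AUTH_PASSWORD` / `WEB_AUTH_SESSION_SECRET`: production web login and session signing configuration; the password and session secret live only in server environment variables and are never committed
|
- `WEB_AUTH_USERNAME` / `WEB_AUTH_PASSWORD` / `WEB_AUTH_SESSION_SECRET`: production web login and session signing configuration; the password and session secret live only in server environment variables and are never committed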
|
||||||
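- `FFMPEG_BIN` / `FFPROBE_BIN`: optional local media binary paths; when the local Homebrew ffmpeg dynamic libraries are broken, the backend automatically skips the unusable PATH version and tries a local static ffmpeg alternative; production should still use the system ffmpeg/ffprobe
|
- `FFMPEG_BIN` / `FFPROBE_BIN`: optional local media binary paths; when the local Homebrew ffmpeg dynamic libraries are broken, the backend automatically skips the unusable PATH version and tries a local static ffmpeg alternative; production should still use the system ffmpeg/ffprobe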
|
||||||
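- Production environment variables: the server uses only `deploy/.env.production`, with `deploy/.env.production.example` as the template; real keys are never committed
|
- Production environment variables: the server uses only `deploy/.env.production`, with `deploy/.env.production.example` as the template; real keys are never committed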
|
||||||
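The `AZURE_TTS_PATHS` behaviour described in the list above (try each speech path in order so that a wrong path can be told apart from an unavailable service) might look something like this. It is a hedged sketch, not the backend's code: the function names are hypothetical, the `probe` callable stands in for the real HTTP request, and treating "all attempts returned 404" as a path problem is an assumption.

```python
from typing import Callable, List, Optional, Tuple

Attempt = Tuple[str, int]  # (url, HTTP status code)


def tts_candidate_urls(base_url: str, paths_csv: str) -> List[str]:
    """Join the base URL with each comma-separated path, keeping order."""
    base = base_url.rstrip("/")
    urls: List[str] = []
    for raw in paths_csv.split(","):
        path = raw.strip()
        if not path:
            continue
        url = f"{base}/{path.lstrip('/')}"
        if url not in urls:
            urls.append(url)
    return urls


def try_tts_paths(
    urls: List[str], probe: Callable[[str], int]
) -> Tuple[Optional[str], List[Attempt]]:
    """Call probe(url) in order and stop at the first 2xx status."""
    attempts: List[Attempt] = []
    for url in urls:
        status = probe(url)
        attempts.append((url, status))
        if 200 <= status < 300:
            return url, attempts
    return None, attempts


def diagnose(attempts: List[Attempt]) -> str:
    """Classify a failed run: every path 404 suggests a wrong path."""
    if attempts and all(status == 404 for _, status in attempts):
        return "path_not_found"
    return "service_unavailable"
```

Keeping the HTTP call behind `probe` means the ordering and classification logic can be exercised without network access.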
|
|||||||
@@ -17,7 +17,9 @@ LOCAL_ASR_BIN=/opt/homebrew/bin/mlx_whisper
|
|||||||
LOCAL_ASR_MODEL=mlx-community/whisper-tiny
|
LOCAL_ASR_MODEL=mlx-community/whisper-tiny
|
||||||
LOCAL_ASR_TIMEOUT_SECONDS=180
|
LOCAL_ASR_TIMEOUT_SECONDS=180
|
||||||
TRANSLATE_MODEL=gemini-2.5-flash
|
TRANSLATE_MODEL=gemini-2.5-flash
|
||||||
REWRITE_MODEL=gemini-2.5-pro
|
GPT_TEXT_MODEL=gpt-4o
|
||||||
|
REWRITE_MODEL=gpt-4o
|
||||||
|
VISION_MODEL=gpt-4o
|
||||||
PRODUCT_VIEW_MODEL=gpt-image-2
|
PRODUCT_VIEW_MODEL=gpt-image-2
|
||||||
IMAGE_BASE_URL=https://ai.skg.com/ezlink/v1
|
IMAGE_BASE_URL=https://ai.skg.com/ezlink/v1
|
||||||
IMAGE_API_KEY=
|
IMAGE_API_KEY=
|
||||||
@@ -27,14 +29,17 @@ SUBJECT_ASSET_IMAGE_MODEL=gpt-image-2
|
|||||||
SUBJECT_ASSET_IMAGE_MODELS=gpt-image-2
|
SUBJECT_ASSET_IMAGE_MODELS=gpt-image-2
|
||||||
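# Optional: set when the local network needs a proxy to reach ai.skg.com; launchd does not necessarily inherit shell proxy variables.
|
# Optional: set when the local network needs a proxy to reach ai.skg.com; launchd does not necessarily inherit shell proxy variables.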
|
||||||
AI_HTTP_PROXY=
|
AI_HTTP_PROXY=
|
||||||
|
YTDLP_COOKIES_FILE=
|
||||||
|
YTDLP_COOKIES_FROM_BROWSER=
|
||||||
VIDEO_MODEL=seedance
|
VIDEO_MODEL=seedance
|
||||||
VIDEO_MODEL_SEEDANCE=seedance-2-fast
|
VIDEO_MODEL_SEEDANCE=seedance-2-fast
|
||||||
VIDEO_MODEL_KLING=kling-omni
|
VIDEO_MODEL_KLING=kling-omni
|
||||||
VIDEO_MODEL_VEO3=veo-3.1-fast
|
VIDEO_MODEL_VEO3=veo-3.1-fast
|
||||||
|
|
||||||
# Audio copy rewrite + Azure OpenAI voiceover
|
# Audio copy rewrite + Azure OpenAI voiceover
|
||||||
AUDIO_REWRITE_MODEL=gemini-2.5-pro
|
AUDIO_REWRITE_MODEL=gpt-4o
|
||||||
AUDIO_PRODUCT_BRIEF="SKG 智能按摩产品,主打日常肩颈、腰背、眼部、膝盖或足部放松;广告表达要高级、干净、可信,不做医疗疗效承诺。"
|
AUDIO_PRODUCT_BRIEF="SKG 智能按摩产品,主打日常肩颈、腰背、眼部、膝盖或足部放松;广告表达要高级、干净、可信,不做医疗疗效承诺。"
|
||||||
|
# The voice channel is fixed to Azure OpenAI on the server.
|
||||||
VOICE_PROVIDER=azure_openai
|
VOICE_PROVIDER=azure_openai
|
||||||
AZURE_OPENAI_BASE_URL=https://ai.skg.com/azure
|
AZURE_OPENAI_BASE_URL=https://ai.skg.com/azure
|
||||||
AZURE_OPENAI_API_KEY=
|
AZURE_OPENAI_API_KEY=
|
||||||
@@ -42,13 +47,7 @@ AZURE_TTS_MODEL=gpt-4o-mini-tts
|
|||||||
AZURE_TTS_VOICE_ID=alloy
|
AZURE_TTS_VOICE_ID=alloy
|
||||||
AZURE_TTS_VOICE_POOL=alloy,verse,shimmer
|
AZURE_TTS_VOICE_POOL=alloy,verse,shimmer
|
||||||
AZURE_TTS_PATH=/audio/speech
|
AZURE_TTS_PATH=/audio/speech
|
||||||
|
AZURE_TTS_PATHS=/audio/speech,/v1/audio/speech
|
||||||
# Legacy MiniMax voiceover channel, kept for compatibility; not used by default
|
|
||||||
MINIMAX_API_KEY=
|
|
||||||
MINIMAX_TTS_BASE_URL=https://api.minimax.io
|
|
||||||
MINIMAX_TTS_MODEL=speech-2.8-turbo
|
|
||||||
MINIMAX_TTS_VOICE_ID=English_expressive_narrator
|
|
||||||
MINIMAX_TTS_VOICE_POOL=English_magnetic_voiced_man,English_Upbeat_Woman,English_MaturePartner
|
|
||||||
|
|
||||||
# Poe video API (preferred for Seedance / Kling / Veo)
|
# Poe video API (preferred for Seedance / Kling / Veo)
|
||||||
POE_API_BASE_URL=https://api.poe.com/v1
|
POE_API_BASE_URL=https://api.poe.com/v1
|
||||||
@@ -80,7 +79,8 @@ VIDEO_DURATION_FIELD=seconds
|
|||||||
VIDEO_POLL_TIMEOUT_SECONDS=900
|
VIDEO_POLL_TIMEOUT_SECONDS=900
|
||||||
|
|
||||||
# Working directory
|
# Working directory
|
||||||
KEYFRAME_COUNT=12
|
APP_DB_URL=sqlite:///./jobs/app.db
|
||||||
|
KEYFRAME_COUNT=6
|
||||||
JOBS_DIR=./jobs
|
JOBS_DIR=./jobs
|
||||||
|
|
||||||
# CORS
|
# CORS
|
||||||
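How the database location could be derived from `APP_DB_URL` / `DATABASE_URL`, with `JOBS_DIR/app.db` as the default, is sketched below. The helper name `resolve_db_path` is hypothetical, and the precedence of `APP_DB_URL` over `DATABASE_URL` is an assumption; only `sqlite:///` URLs are handled, matching what the built-in implementation is described as supporting.

```python
from pathlib import Path
from typing import Dict

SQLITE_PREFIX = "sqlite:///"


def resolve_db_path(env: Dict[str, str]) -> Path:
    """Return the SQLite file path implied by the environment mapping."""
    url = (env.get("APP_DB_URL") or env.get("DATABASE_URL") or "").strip()
    if not url:
        # No URL configured: fall back to JOBS_DIR/app.db.
        return Path(env.get("JOBS_DIR") or "./jobs") / "app.db"
    if not url.startswith(SQLITE_PREFIX):
        raise ValueError(f"unsupported database URL scheme: {url!r}")
    # "sqlite:////data/jobs/app.db" -> "/data/jobs/app.db" (absolute)
    # "sqlite:///./jobs/app.db"     -> "./jobs/app.db"     (relative)
    return Path(url[len(SQLITE_PREFIX):])
```

Only metadata lives in this file; the media files themselves stay under the per-job directories, so the database path and `JOBS_DIR` can be backed up together.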
|
|||||||
@@ -1,6 +1,6 @@
 # SKG TK Re-creation API
 
-FastAPI backend that runs the yt-dlp + ffmpeg + ASR / translation / English SKG product-introduction copy + MiniMax English voice-over pipeline.
+FastAPI backend that runs the yt-dlp + ffmpeg + ASR / translation / English SKG product-introduction copy + Azure OpenAI English voice-over pipeline.
 
 ## Getting started
 
@@ -9,7 +9,7 @@ cd api
 python3 -m venv .venv
 source .venv/bin/activate
 pip install -r requirements.txt
-cp .env.example .env  # fill in LLM_API_KEY / MINIMAX_API_KEY as needed
+cp .env.example .env  # fill in LLM_API_KEY / AZURE_OPENAI_API_KEY as needed
 uvicorn main:app --host 127.0.0.1 --port 4291
```
|
```
 
@@ -18,21 +18,23 @@ uvicorn main:app --host 127.0.0.1 --port 4291
 ## Routes
 
 - `GET /health`: health check plus configuration status
+- `GET /documents`: list of documents as grouped in the backend database; by default, each TK link or uploaded video maps to one document
 - `POST /jobs` `{url}`: creates a job and downloads the source video in the background; once the video is ready, analysis or audio extraction can be triggered manually
 - `GET /jobs/{id}`: current status plus artifacts; returns `source_audio_url` if the original audio track has already been extracted
-- `POST /jobs/{id}/transcribe`: triggers audio extraction + ASR + translation + English SKG product-introduction copy; the copy length is estimated from the original audio duration, and when MiniMax is configured a voice-over is generated with a voice picked at random from the English voice pool. The front-end Audio node offers "Extract audio / Re-extract audio" buttons; this step can run in parallel with frame extraction and is not triggered automatically
+- `POST /jobs/{id}/transcribe`: triggers audio extraction + ASR + translation + English SKG product-introduction copy; the copy length is estimated from the original audio duration, and when Azure OpenAI TTS is configured a voice-over is generated from the Azure voice pool. The front-end Audio node offers "Extract audio / Re-extract audio" buttons; this step can run in parallel with frame extraction and is not triggered automatically
 - `GET /jobs/{id}/video.mp4`: original video
 - `GET /jobs/{id}/audio.wav`: original audio after track separation, used by the front-end bottom audio bar to render the waveform
-- `GET /jobs/{id}/audio-script.mp3`: MiniMax voice-over of the rewritten English copy
+- `GET /jobs/{id}/audio-script.mp3`: Azure OpenAI TTS voice-over of the rewritten English copy
 - `GET /jobs/{id}/frames/{i}.jpg`: the i-th keyframe (0-9)
 
 ## Mock mode
 
-When `LLM_API_KEY` is not set, transcription uses a local mock, which is convenient for UI integration; when `MINIMAX_API_KEY` is not set, only the rewritten copy is generated and no voice-over file is produced.
+When `LLM_API_KEY` is not set, transcription uses a local mock, which is convenient for UI integration; when `AZURE_OPENAI_API_KEY` is not set and `LLM_API_KEY` cannot be reused, only the rewritten copy is generated and no voice-over file is produced.
 
 ## Dependencies
 
 - `ffmpeg` system binary (track separation / frame extraction)
 - `yt-dlp` system binary (the Python package also works)
+- SQLite metadata database (default `APP_DB_URL=sqlite:///./jobs/app.db`); it stores only document / job / media asset metadata, while original videos, audio, extracted frames, and generated files stay under `jobs/<jobId>/`
 - OpenAI-compatible LLM gateway (ASR / translation / copy rewriting); if `/audio/transcriptions` is unavailable, `ASR_FALLBACK_MODEL` is used for Gemini multimodal audio recognition
-- MiniMax T2A HTTP (voice-over for the English product-introduction copy, using `MINIMAX_API_KEY`; default random voice pool `English_magnetic_voiced_man,English_Upbeat_Woman,English_MaturePartner`)
+- Azure OpenAI TTS (voice-over for the English product-introduction copy, using `AZURE_OPENAI_API_KEY` or falling back to `LLM_API_KEY`; default voice pool `alloy,verse,shimmer`)
|||||||
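The README says the voice-over key falls back to `LLM_API_KEY`. One detail worth checking: `os.getenv(name, default)` only uses the default when the variable is absent, so a line such as `AZURE_OPENAI_API_KEY=` in `.env` may defeat the fallback if the loader exports it as an empty string. A minimal sketch of the difference, where `env_or` is a hypothetical helper and the key values are placeholders:

```python
import os


def env_or(name: str, fallback: str) -> str:
    """Return the stripped value, or the fallback when the variable is unset or blank."""
    return (os.getenv(name) or "").strip() or fallback


# Placeholder values for illustration; assumes the .env loader exports
# `AZURE_OPENAI_API_KEY=` as an empty string.
os.environ["LLM_API_KEY"] = "llm-key"
os.environ["AZURE_OPENAI_API_KEY"] = ""

plain = os.getenv("AZURE_OPENAI_API_KEY", os.environ["LLM_API_KEY"]).strip()
robust = env_or("AZURE_OPENAI_API_KEY", os.environ["LLM_API_KEY"])
print(repr(plain), repr(robust))  # '' 'llm-key'
```

With the plain two-argument form the empty value wins, so the service would behave as if no voice-over key were configured.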
536
api/database.py
Normal file
@@ -0,0 +1,536 @@
+from __future__ import annotations
+
+import json
+import os
+import sqlite3
+import time
+from pathlib import Path
+from typing import Any
+
+
+SCHEMA_VERSION = 1
+
+
+def default_database_url(jobs_dir: Path) -> str:
+    return os.getenv("APP_DB_URL") or os.getenv("DATABASE_URL") or f"sqlite:///{jobs_dir / 'app.db'}"
+
+
+def redact_database_url(url: str) -> str:
+    if "://" not in url or "@" not in url:
+        return url
+    scheme, rest = url.split("://", 1)
+    _, host = rest.rsplit("@", 1)
+    return f"{scheme}://***@{host}"
+
+
+def infer_source_kind(url: str) -> str:
+    if url.startswith("upload://"):
+        return "upload"
+    if url.startswith("http://") or url.startswith("https://"):
+        return "tiktok_link"
+    return "unknown"
+
+
+def default_workflow_mode(source_kind: str) -> str:
+    if source_kind == "upload":
+        return "uploaded_reference"
+    return "feed_recreation"
+
+
+def document_title(url: str, source_kind: str, fallback: str) -> str:
+    if source_kind == "upload":
+        return url.replace("upload://", "", 1).strip() or fallback
+    if url:
+        return url.strip()[:120]
+    return fallback
+
+
+def storage_prefix(document_id: str, source_kind: str, workflow_mode: str) -> str:
+    source = source_kind or "unknown"
+    mode = workflow_mode or default_workflow_mode(source)
+    return f"{mode}/{source}/{document_id}"
+
+
+class AppDatabase:
+    def __init__(self, url: str, jobs_dir: Path):
+        self.url = url
+        self.jobs_dir = jobs_dir
+        self.path = self._sqlite_path(url)
+        self.enabled = True
+        self.error = ""
+
+    @staticmethod
+    def _sqlite_path(url: str) -> Path:
+        if url == ":memory:":
+            return Path(":memory:")
+        if not url.startswith("sqlite:///"):
+            raise RuntimeError("当前内置数据库层只支持 sqlite:/// URL;Postgres 迁移会复用同一张表语义。")
+        raw = url[len("sqlite:///"):]
+        return Path(raw).expanduser().resolve()
+
+    def connect(self) -> sqlite3.Connection:
+        if str(self.path) != ":memory:":
+            self.path.parent.mkdir(parents=True, exist_ok=True)
+        conn = sqlite3.connect(str(self.path))
+        conn.row_factory = sqlite3.Row
+        conn.execute("PRAGMA foreign_keys = ON")
+        return conn
+
+    def init(self) -> None:
+        with self.connect() as conn:
+            conn.executescript(
+                """
+                CREATE TABLE IF NOT EXISTS schema_meta (
+                    key TEXT PRIMARY KEY,
+                    value TEXT NOT NULL
+                );
+
+                CREATE TABLE IF NOT EXISTS documents (
+                    id TEXT PRIMARY KEY,
+                    title TEXT NOT NULL,
+                    source_kind TEXT NOT NULL,
+                    workflow_mode TEXT NOT NULL,
+                    source_url TEXT NOT NULL DEFAULT '',
+                    primary_job_id TEXT NOT NULL DEFAULT '',
+                    status TEXT NOT NULL DEFAULT 'created',
+                    storage_prefix TEXT NOT NULL,
+                    metadata_json TEXT NOT NULL DEFAULT '{}',
+                    created_at REAL NOT NULL,
+                    updated_at REAL NOT NULL
+                );
+
+                CREATE TABLE IF NOT EXISTS jobs (
+                    id TEXT PRIMARY KEY,
+                    document_id TEXT NOT NULL,
+                    source_kind TEXT NOT NULL,
+                    workflow_mode TEXT NOT NULL,
+                    source_url TEXT NOT NULL DEFAULT '',
+                    status TEXT NOT NULL,
+                    progress INTEGER NOT NULL DEFAULT 0,
+                    message TEXT NOT NULL DEFAULT '',
+                    storage_path TEXT NOT NULL,
+                    state_path TEXT NOT NULL,
+                    video_url TEXT NOT NULL DEFAULT '',
+                    duration REAL NOT NULL DEFAULT 0,
+                    width INTEGER NOT NULL DEFAULT 0,
+                    height INTEGER NOT NULL DEFAULT 0,
+                    frame_count INTEGER NOT NULL DEFAULT 0,
+                    video_count INTEGER NOT NULL DEFAULT 0,
+                    error TEXT NOT NULL DEFAULT '',
+                    metadata_json TEXT NOT NULL DEFAULT '{}',
+                    created_at REAL NOT NULL,
+                    updated_at REAL NOT NULL,
+                    FOREIGN KEY(document_id) REFERENCES documents(id) ON DELETE CASCADE
+                );
+
+                CREATE TABLE IF NOT EXISTS media_assets (
+                    id TEXT PRIMARY KEY,
+                    document_id TEXT NOT NULL,
+                    job_id TEXT NOT NULL,
+                    kind TEXT NOT NULL,
+                    role TEXT NOT NULL,
+                    path TEXT NOT NULL DEFAULT '',
+                    url TEXT NOT NULL DEFAULT '',
+                    frame_index INTEGER,
+                    timestamp REAL,
+                    width INTEGER NOT NULL DEFAULT 0,
+                    height INTEGER NOT NULL DEFAULT 0,
+                    duration REAL NOT NULL DEFAULT 0,
+                    metadata_json TEXT NOT NULL DEFAULT '{}',
+                    created_at REAL NOT NULL,
+                    updated_at REAL NOT NULL,
+                    FOREIGN KEY(document_id) REFERENCES documents(id) ON DELETE CASCADE,
+                    FOREIGN KEY(job_id) REFERENCES jobs(id) ON DELETE CASCADE
+                );
+
+                CREATE INDEX IF NOT EXISTS idx_documents_updated_at ON documents(updated_at DESC);
+                CREATE INDEX IF NOT EXISTS idx_documents_source_kind ON documents(source_kind);
+                CREATE INDEX IF NOT EXISTS idx_documents_workflow_mode ON documents(workflow_mode);
+                CREATE INDEX IF NOT EXISTS idx_jobs_document_id ON jobs(document_id);
+                CREATE INDEX IF NOT EXISTS idx_jobs_updated_at ON jobs(updated_at DESC);
+                CREATE INDEX IF NOT EXISTS idx_assets_document_id ON media_assets(document_id);
+                CREATE INDEX IF NOT EXISTS idx_assets_job_id ON media_assets(job_id);
+                CREATE INDEX IF NOT EXISTS idx_assets_role ON media_assets(role);
+                """
+            )
+            conn.execute(
+                "INSERT OR REPLACE INTO schema_meta(key, value) VALUES('schema_version', ?)",
+                (str(SCHEMA_VERSION),),
+            )
+
+    def normalize_job_document(self, job: dict[str, Any]) -> dict[str, Any]:
+        job_id = str(job.get("id") or "")
+        source_url = str(job.get("url") or "")
+        source_kind = str(job.get("source_kind") or "") or infer_source_kind(source_url)
+        workflow_mode = str(job.get("workflow_mode") or "") or default_workflow_mode(source_kind)
+        document_id = str(job.get("document_id") or "") or job_id
+        prefix = str(job.get("storage_prefix") or "") or storage_prefix(document_id, source_kind, workflow_mode)
+        return {
+            "document_id": document_id,
+            "source_kind": source_kind,
+            "workflow_mode": workflow_mode,
+            "storage_prefix": prefix,
+            "title": document_title(source_url, source_kind, document_id),
+        }
+
+    def sync_job(self, job: dict[str, Any], job_path: Path) -> None:
+        if not self.enabled:
+            return
+        now = time.time()
+        job_id = str(job.get("id") or "")
+        if not job_id:
+            return
+        doc = self.normalize_job_document(job)
+        state_path = job_path / "state.json"
+        frames = list(job.get("frames") or [])
+        generated_videos = list(job.get("generated_videos") or [])
+        metadata = {
+            "audio_segment_count": len(job.get("transcript") or []),
+            "product_ref_count": len(job.get("product_refs") or []),
+            "storyboard_image_count": len(job.get("storyboard_images") or []),
+        }
+        with self.connect() as conn:
+            existing = conn.execute(
+                "SELECT created_at FROM documents WHERE id = ?",
+                (doc["document_id"],),
+            ).fetchone()
+            created_at = float(existing["created_at"]) if existing else now
+            conn.execute(
+                """
+                INSERT INTO documents(
+                    id, title, source_kind, workflow_mode, source_url, primary_job_id,
+                    status, storage_prefix, metadata_json, created_at, updated_at
+                ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+                ON CONFLICT(id) DO UPDATE SET
+                    title = excluded.title,
+                    source_kind = excluded.source_kind,
+                    workflow_mode = excluded.workflow_mode,
+                    source_url = excluded.source_url,
+                    primary_job_id = excluded.primary_job_id,
+                    status = excluded.status,
+                    storage_prefix = excluded.storage_prefix,
+                    metadata_json = excluded.metadata_json,
+                    updated_at = excluded.updated_at
+                """,
+                (
+                    doc["document_id"],
+                    doc["title"],
+                    doc["source_kind"],
+                    doc["workflow_mode"],
+                    str(job.get("url") or ""),
+                    job_id,
+                    str(job.get("status") or "created"),
+                    doc["storage_prefix"],
+                    json.dumps(metadata, ensure_ascii=False),
+                    created_at,
+                    now,
+                ),
+            )
+            existing_job = conn.execute("SELECT created_at FROM jobs WHERE id = ?", (job_id,)).fetchone()
+            job_created_at = float(existing_job["created_at"]) if existing_job else now
+            conn.execute(
+                """
+                INSERT INTO jobs(
+                    id, document_id, source_kind, workflow_mode, source_url, status,
+                    progress, message, storage_path, state_path, video_url, duration,
+                    width, height, frame_count, video_count, error, metadata_json,
+                    created_at, updated_at
+                ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+                ON CONFLICT(id) DO UPDATE SET
+                    document_id = excluded.document_id,
+                    source_kind = excluded.source_kind,
+                    workflow_mode = excluded.workflow_mode,
+                    source_url = excluded.source_url,
+                    status = excluded.status,
+                    progress = excluded.progress,
+                    message = excluded.message,
+                    storage_path = excluded.storage_path,
+                    state_path = excluded.state_path,
+                    video_url = excluded.video_url,
+                    duration = excluded.duration,
+                    width = excluded.width,
+                    height = excluded.height,
+                    frame_count = excluded.frame_count,
+                    video_count = excluded.video_count,
+                    error = excluded.error,
+                    metadata_json = excluded.metadata_json,
+                    updated_at = excluded.updated_at
+                """,
+                (
+                    job_id,
+                    doc["document_id"],
+                    doc["source_kind"],
+                    doc["workflow_mode"],
+                    str(job.get("url") or ""),
+                    str(job.get("status") or "created"),
+                    int(job.get("progress") or 0),
+                    str(job.get("message") or ""),
+                    str(job_path),
+                    str(state_path),
+                    str(job.get("video_url") or ""),
+                    float(job.get("duration") or 0),
+                    int(job.get("width") or 0),
+                    int(job.get("height") or 0),
+                    len(frames),
+                    len(generated_videos),
+                    str(job.get("error") or ""),
+                    json.dumps(metadata, ensure_ascii=False),
+                    job_created_at,
+                    now,
+                ),
+            )
+            conn.execute("DELETE FROM media_assets WHERE job_id = ?", (job_id,))
+            for asset in self._job_assets(job, job_path, doc["document_id"]):
+                conn.execute(
+                    """
+                    INSERT INTO media_assets(
+                        id, document_id, job_id, kind, role, path, url, frame_index,
+                        timestamp, width, height, duration, metadata_json, created_at, updated_at
+                    ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+                    """,
+                    (
+                        asset["id"],
+                        asset["document_id"],
+                        asset["job_id"],
+                        asset["kind"],
+                        asset["role"],
+                        asset.get("path", ""),
+                        asset.get("url", ""),
+                        asset.get("frame_index"),
+                        asset.get("timestamp"),
+                        int(asset.get("width") or 0),
+                        int(asset.get("height") or 0),
+                        float(asset.get("duration") or 0),
+                        json.dumps(asset.get("metadata") or {}, ensure_ascii=False),
+                        now,
+                        now,
+                    ),
+                )
+
+    def _job_assets(self, job: dict[str, Any], job_path: Path, document_id: str) -> list[dict[str, Any]]:
+        job_id = str(job.get("id") or "")
+        items: list[dict[str, Any]] = []
+
+        def add(
+            asset_id: str,
+            kind: str,
+            role: str,
+            path: Path | str = "",
+            url: str = "",
+            frame_index: int | None = None,
+            timestamp: float | None = None,
+            width: int = 0,
+            height: int = 0,
+            duration: float = 0.0,
+            metadata: dict[str, Any] | None = None,
+        ) -> None:
+            items.append({
+                "id": asset_id,
+                "document_id": document_id,
+                "job_id": job_id,
+                "kind": kind,
+                "role": role,
+                "path": str(path) if path else "",
+                "url": url,
+                "frame_index": frame_index,
+                "timestamp": timestamp,
+                "width": width,
+                "height": height,
+                "duration": duration,
+                "metadata": metadata or {},
+            })
+
+        if (job_path / "source.mp4").exists() or job.get("video_url"):
+            add(
+                f"{job_id}:source_video",
+                "video",
+                "source_video",
+                job_path / "source.mp4",
+                str(job.get("video_url") or f"/jobs/{job_id}/video.mp4"),
+                duration=float(job.get("duration") or 0),
+                width=int(job.get("width") or 0),
+                height=int(job.get("height") or 0),
+            )
+        if (job_path / "audio.wav").exists() or job.get("source_audio_url"):
+            add(
+                f"{job_id}:source_audio",
+                "audio",
+                "source_audio",
+                job_path / "audio.wav",
+                str(job.get("source_audio_url") or f"/jobs/{job_id}/audio.wav"),
+                duration=float(job.get("duration") or 0),
+            )
+
+        for frame in job.get("frames") or []:
+            idx = int(frame.get("index") or 0)
+            add(
+                f"{job_id}:frame:{idx}",
+                "image",
+                "keyframe",
+                job_path / "frames" / f"{idx:03d}.jpg",
+                str(frame.get("url") or f"/jobs/{job_id}/frames/{idx}.jpg"),
+                frame_index=idx,
+                timestamp=float(frame.get("timestamp") or 0),
+                metadata={"quality_report": frame.get("quality_report")},
+            )
+            if frame.get("cleaned_url"):
+                add(
+                    f"{job_id}:frame:{idx}:cleaned",
+                    "image",
+                    "cleaned_keyframe",
+                    job_path / "cleaned" / f"{idx:03d}.jpg",
+                    str(frame.get("cleaned_url")),
+                    frame_index=idx,
+                    timestamp=float(frame.get("timestamp") or 0),
+                )
+            for generated in frame.get("generated_images") or []:
+                gen_id = str(generated.get("id") or "")
+                if gen_id:
+                    add(
+                        f"{job_id}:generated_image:{idx}:{gen_id}",
+                        "image",
+                        "generated_image",
+                        job_path / "gen" / f"{idx:03d}_{gen_id}.jpg",
+                        str(generated.get("url") or ""),
+                        frame_index=idx,
+                        metadata={"model": generated.get("model"), "mode": generated.get("mode")},
+                    )
+            for scene_asset in frame.get("scene_assets") or []:
+                asset_id = str(scene_asset.get("id") or "")
+                if asset_id:
+                    add(
+                        f"{job_id}:scene_asset:{asset_id}",
+                        "image",
+                        str(scene_asset.get("asset_role") or "scene_asset"),
+                        job_path / "assets" / f"{asset_id}.jpg",
+                        str(scene_asset.get("url") or ""),
+                        frame_index=idx,
+                        width=int(scene_asset.get("width") or 0),
+                        height=int(scene_asset.get("height") or 0),
+                        metadata={"label": scene_asset.get("label"), "scene_mode": scene_asset.get("scene_mode")},
+                    )
+            for element in frame.get("elements") or []:
+                element_id = str(element.get("id") or "")
+                cutout_ids = list(element.get("cutouts") or [])
+                legacy_cutout = element.get("cutout_id")
+                if legacy_cutout and legacy_cutout not in cutout_ids:
+                    cutout_ids.append(legacy_cutout)
+                for cutout_id in cutout_ids:
+                    add(
+                        f"{job_id}:cutout:{idx}:{element_id}:{cutout_id}",
+                        "image",
+                        "element_cutout",
+                        job_path / "elements" / f"{idx:03d}_{element_id}_{cutout_id}.jpg",
+                        f"/jobs/{job_id}/frames/{idx}/elements/{element_id}/cutouts/{cutout_id}.jpg",
+                        frame_index=idx,
+                        metadata={"element_id": element_id, "name_zh": element.get("name_zh")},
+                    )
+                for subject_asset in element.get("subject_assets") or []:
+                    asset_id = str(subject_asset.get("id") or "")
+                    if asset_id:
+                        add(
+                            f"{job_id}:subject_asset:{asset_id}",
+                            "image",
+                            "subject_asset",
+                            job_path / "assets" / f"{asset_id}.jpg",
+                            str(subject_asset.get("url") or ""),
+                            frame_index=idx,
+                            width=int(subject_asset.get("width") or 0),
+                            height=int(subject_asset.get("height") or 0),
+                            metadata={"view": subject_asset.get("view"), "label": subject_asset.get("label")},
+                        )
+
+        for ref in job.get("product_refs") or []:
+            asset_id = str(ref.get("id") or ref.get("asset_id") or ref.get("url") or "")
+            if asset_id:
+                add(
+                    f"{job_id}:product_ref:{asset_id}",
+                    "image",
+                    "product_ref",
+                    self._path_from_job_url(job_path, job_id, str(ref.get("url") or "")),
+                    str(ref.get("url") or ""),
+                    metadata=ref,
+                )
+
+        for video in job.get("generated_videos") or []:
+            video_id = str(video.get("id") or "")
+            if video_id:
+                add(
+                    f"{job_id}:generated_video:{video_id}",
+                    "video",
+                    "generated_video",
+                    job_path / "videos" / f"{video_id}.mp4",
+                    str(video.get("url") or ""),
+                    frame_index=video.get("frame_idx"),
+                    duration=float(video.get("duration") or 0),
+                    metadata={"status": video.get("status"), "model": video.get("model"), "error": video.get("error")},
+                )
+        return items
+
+    def _path_from_job_url(self, job_path: Path, job_id: str, url: str) -> str:
+        prefix = f"/jobs/{job_id}/"
+        if not url.startswith(prefix):
+            return ""
+        tail = url[len(prefix):]
+        if tail == "video.mp4":
+            return str(job_path / "source.mp4")
+        return str(job_path / tail)
+
+    def delete_job(self, job_id: str) -> None:
+        if not self.enabled:
+            return
+        with self.connect() as conn:
+            row = conn.execute("SELECT document_id FROM jobs WHERE id = ?", (job_id,)).fetchone()
+            conn.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
+            if row:
+                remaining = conn.execute(
+                    "SELECT COUNT(*) AS c FROM jobs WHERE document_id = ?",
+                    (row["document_id"],),
+                ).fetchone()
+                if int(remaining["c"] or 0) == 0:
+                    conn.execute("DELETE FROM documents WHERE id = ?", (row["document_id"],))
+
+    def list_documents(self, limit: int | None = None) -> list[dict[str, Any]]:
+        sql = """
+            SELECT
+                d.*,
+                COUNT(DISTINCT j.id) AS job_count,
+                COUNT(DISTINCT a.id) AS asset_count
+            FROM documents d
+            LEFT JOIN jobs j ON j.document_id = d.id
+            LEFT JOIN media_assets a ON a.document_id = d.id
+            GROUP BY d.id
+            ORDER BY d.updated_at DESC
+        """
+        params: tuple[Any, ...] = ()
+        if limit is not None and limit > 0:
+            sql += " LIMIT ?"
+            params = (limit,)
+        with self.connect() as conn:
+            rows = conn.execute(sql, params).fetchall()
+        return [dict(row) for row in rows]
+
+    def health(self) -> dict[str, Any]:
+        if not self.enabled:
+            return {"enabled": False, "url": redact_database_url(self.url), "error": self.error}
+        try:
+            with self.connect() as conn:
+                docs = conn.execute("SELECT COUNT(*) AS c FROM documents").fetchone()["c"]
+                jobs = conn.execute("SELECT COUNT(*) AS c FROM jobs").fetchone()["c"]
+                assets = conn.execute("SELECT COUNT(*) AS c FROM media_assets").fetchone()["c"]
+            return {
+                "enabled": True,
+                "url": redact_database_url(self.url),
+                "schema_version": SCHEMA_VERSION,
+                "documents": int(docs or 0),
+                "jobs": int(jobs or 0),
+                "assets": int(assets or 0),
+            }
+        except Exception as e:
+            return {"enabled": False, "url": redact_database_url(self.url), "error": str(e)}
+
+
+def create_database(url: str, jobs_dir: Path) -> AppDatabase:
+    db = AppDatabase(url, jobs_dir)
+    db.init()
+    return db
||||||
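Review note on `api/database.py`: `connect()` opens a new connection on every call, and `_sqlite_path` accepts `":memory:"`. In Python's `sqlite3`, each plain `":memory:"` connection gets its own private database, so tables created in `init()` would not appear to be visible to a later `sync_job()` or `health()` call. Separately, using a connection as a context manager (`with conn:`) commits or rolls back but does not close it. The sketch below is illustrative rather than taken from the PR; `table_names` is a hypothetical helper, and `contextlib.closing` is used here to close each connection explicitly.

```python
import sqlite3
from contextlib import closing


def table_names(conn: sqlite3.Connection) -> list[str]:
    """List the tables visible on this connection."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
    return [r[0] for r in rows]


# First connection: create a table, as init() would.
with closing(sqlite3.connect(":memory:")) as first:
    first.execute("CREATE TABLE documents (id TEXT PRIMARY KEY)")
    first.commit()
    seen_by_first = table_names(first)

# Second connection: a brand-new, empty in-memory database.
with closing(sqlite3.connect(":memory:")) as second:
    seen_by_second = table_names(second)

print(seen_by_first, seen_by_second)  # ['documents'] []
```

If in-memory mode is meant for tests, one option is to keep a single long-lived connection for that case; file-backed `sqlite:///` URLs are not affected by the visibility issue.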
333
api/main.py
@@ -25,10 +25,19 @@ from fastapi.middleware.cors import CORSMiddleware
 from fastapi.responses import FileResponse
 from pydantic import BaseModel, Field
 
+from database import create_database, default_database_url, default_workflow_mode, infer_source_kind, storage_prefix
+
 load_dotenv()
 
 JOBS_DIR = Path(os.getenv("JOBS_DIR", "./jobs")).resolve()
 JOBS_DIR.mkdir(parents=True, exist_ok=True)
+DATABASE_URL = default_database_url(JOBS_DIR)
+DB_INIT_ERROR = ""
+try:
+    DB = create_database(DATABASE_URL, JOBS_DIR)
+except Exception as e:
+    DB = None
+    DB_INIT_ERROR = str(e)
 CORS_ORIGINS = [o.strip() for o in os.getenv("CORS_ORIGINS", "http://localhost:4290,http://127.0.0.1:4290").split(",") if o.strip()]
 PRODUCT_LIBRARY_DIR = Path(
     os.getenv("PRODUCT_LIBRARY_DIR", Path(__file__).resolve().parent / "product_library" / "skg-products")
@@ -48,8 +57,18 @@ LOCAL_ASR_BIN = os.getenv("LOCAL_ASR_BIN", "").strip()
 LOCAL_ASR_MODEL = os.getenv("LOCAL_ASR_MODEL", "mlx-community/whisper-tiny").strip() or "mlx-community/whisper-tiny"
 LOCAL_ASR_TIMEOUT_SECONDS = max(30, int(os.getenv("LOCAL_ASR_TIMEOUT_SECONDS", "180")))
 TRANSLATE_MODEL = os.getenv("TRANSLATE_MODEL", "gemini-2.5-flash")
-REWRITE_MODEL = os.getenv("REWRITE_MODEL", "gemini-2.5-pro")
-VISION_MODEL = os.getenv("VISION_MODEL", "gemini-2.5-flash")
+DEFAULT_GPT_TEXT_MODEL = os.getenv("GPT_TEXT_MODEL", "gpt-4o").strip() or "gpt-4o"
+
+
+def gpt_model_env(name: str, default: str | None = None) -> str:
+    value = os.getenv(name, default or DEFAULT_GPT_TEXT_MODEL).strip()
+    if not value or value.lower().startswith("gemini-"):
+        return default or DEFAULT_GPT_TEXT_MODEL
+    return value
+
+
+REWRITE_MODEL = gpt_model_env("REWRITE_MODEL")
+VISION_MODEL = gpt_model_env("VISION_MODEL")
 IMAGE_BASE_URL = os.getenv("IMAGE_BASE_URL", LLM_BASE_URL).strip()
 IMAGE_API_KEY = os.getenv("IMAGE_API_KEY", LLM_API_KEY).strip()
 AI_HTTP_PROXY = (
||||||
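Review note on `gpt_model_env`: any configured value that starts with `gemini-` is discarded in favor of the GPT default, so an existing `.env` that still sets `REWRITE_MODEL=gemini-2.5-pro` would silently switch models. The sketch below mirrors that logic against a plain mapping so the behavior can be checked without touching the process environment; `pick_model` and the sample model names are illustrative, not part of the PR.

```python
from typing import Mapping

DEFAULT_GPT_TEXT_MODEL = "gpt-4o"


def pick_model(env: Mapping[str, str], name: str, default: str = DEFAULT_GPT_TEXT_MODEL) -> str:
    """Mirror of gpt_model_env that reads from a mapping instead of os.environ."""
    value = env.get(name, default).strip()
    if not value or value.lower().startswith("gemini-"):
        return default
    return value


# Illustrative settings: one stale Gemini override, one padded GPT override.
env = {"REWRITE_MODEL": "gemini-2.5-pro", "VISION_MODEL": " gpt-4.1 "}
print(pick_model(env, "REWRITE_MODEL"))        # gpt-4o (Gemini override ignored)
print(pick_model(env, "VISION_MODEL"))         # gpt-4.1
print(pick_model(env, "AUDIO_REWRITE_MODEL"))  # gpt-4o (unset)
```

Logging a warning when an override is ignored might make this easier to notice in deployments that still carry the old settings.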
@@ -73,29 +92,14 @@ PRODUCT_ASSET_MIN_LONG_SIDE = max(512, int(os.getenv("PRODUCT_ASSET_MIN_LONG_SID
 PRODUCT_ASSET_MIN_SHORT_SIDE = max(320, int(os.getenv("PRODUCT_ASSET_MIN_SHORT_SIDE", "600")))
 PRODUCT_ASSET_JPEG_QUALITY = max(80, min(95, int(os.getenv("PRODUCT_ASSET_JPEG_QUALITY", "92"))))
 VIDEO_MODEL = os.getenv("VIDEO_MODEL", "seedance").strip() or "seedance"
+YTDLP_COOKIES_FILE = os.getenv("YTDLP_COOKIES_FILE", "").strip()
+YTDLP_COOKIES_FROM_BROWSER = os.getenv("YTDLP_COOKIES_FROM_BROWSER", "").strip()
 AUDIO_PRODUCT_BRIEF = os.getenv(
     "AUDIO_PRODUCT_BRIEF",
     "SKG 智能按摩产品,主打日常肩颈、腰背、眼部、膝盖或足部放松;广告表达要高级、干净、可信,不做医疗疗效承诺。",
 ).strip()
-AUDIO_REWRITE_MODEL = os.getenv("AUDIO_REWRITE_MODEL", REWRITE_MODEL).strip() or REWRITE_MODEL
-MINIMAX_API_KEY = os.getenv("MINIMAX_API_KEY", "").strip()
-MINIMAX_TTS_BASE_URL = os.getenv("MINIMAX_TTS_BASE_URL", "https://api.minimax.io").strip().rstrip("/")
-MINIMAX_TTS_MODEL = os.getenv("MINIMAX_TTS_MODEL", "speech-2.8-turbo").strip() or "speech-2.8-turbo"
-MINIMAX_TTS_VOICE_ID = os.getenv(
-    "MINIMAX_TTS_VOICE_ID",
-    "English_expressive_narrator",
-).strip() or "English_expressive_narrator"
-DEFAULT_MINIMAX_TTS_VOICE_POOL = [
-    "English_magnetic_voiced_man",
-    "English_Upbeat_Woman",
-    "English_MaturePartner",
-]
-MINIMAX_TTS_VOICE_POOL = [
-    v.strip()
-    for v in os.getenv("MINIMAX_TTS_VOICE_POOL", ",".join(DEFAULT_MINIMAX_TTS_VOICE_POOL)).split(",")
-    if v.strip()
-]
-VOICE_PROVIDER = os.getenv("VOICE_PROVIDER", "azure_openai").strip().lower() or "azure_openai"
+AUDIO_REWRITE_MODEL = gpt_model_env("AUDIO_REWRITE_MODEL", REWRITE_MODEL)
+VOICE_PROVIDER = "azure_openai"
 AZURE_OPENAI_BASE_URL = os.getenv("AZURE_OPENAI_BASE_URL", "https://ai.skg.com/azure").strip().rstrip("/")
 AZURE_OPENAI_API_KEY = os.getenv("AZURE_OPENAI_API_KEY", LLM_API_KEY).strip()
 AZURE_TTS_MODEL = os.getenv("AZURE_TTS_MODEL", "gpt-4o-mini-tts").strip() or "gpt-4o-mini-tts"
@@ -107,6 +111,11 @@ AZURE_TTS_VOICE_POOL = [
     if v.strip()
 ]
 AZURE_TTS_PATH = os.getenv("AZURE_TTS_PATH", "/audio/speech").strip() or "/audio/speech"
+AZURE_TTS_PATHS = [
+    p.strip()
+    for p in os.getenv("AZURE_TTS_PATHS", f"{AZURE_TTS_PATH},/audio/speech,/v1/audio/speech").split(",")
+    if p.strip()
+]
 
 POE_API_BASE_URL = os.getenv("POE_API_BASE_URL", "https://api.poe.com/v1").strip() or "https://api.poe.com/v1"
 POE_API_KEY = os.getenv("POE_API_KEY", "").strip()
||||||
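Review note on `AZURE_TTS_PATHS`: when `AZURE_TTS_PATH` keeps its default, the fallback string becomes `/audio/speech,/audio/speech,/v1/audio/speech`, so the list contains the same path twice. The code that consumes the list is not part of this section; assuming it tries each path in order, the duplicate would cost one redundant request on failure. A minimal order-preserving de-duplication sketch, where `parse_paths` is a hypothetical helper:

```python
def parse_paths(raw: str) -> list[str]:
    """Split a comma-separated list, drop blanks, keep the first occurrence of each path."""
    cleaned = (p.strip() for p in raw.split(","))
    # dict.fromkeys preserves insertion order, so earlier paths keep priority.
    return list(dict.fromkeys(p for p in cleaned if p))


azure_tts_path = "/audio/speech"  # default value of AZURE_TTS_PATH
default_raw = f"{azure_tts_path},/audio/speech,/v1/audio/speech"
print(parse_paths(default_raw))  # ['/audio/speech', '/v1/audio/speech']
```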
@@ -238,8 +247,8 @@ JobStatus = Literal[
     "transcribing", "transcribed", "failed",
 ]
 
-KEYFRAME_COUNT = int(os.getenv("KEYFRAME_COUNT", "12"))
-FrameExtractTarget = Literal["transparent_human", "balanced", "subject", "transition", "expression", "motion"]
+KEYFRAME_COUNT = int(os.getenv("KEYFRAME_COUNT", "6"))
+FrameExtractTarget = Literal["random_subject", "transparent_human", "balanced", "subject", "transition", "expression", "motion"]
 FrameExtractMode = Literal["replace", "append"]
 FrameExtractQuality = Literal["auto", "fast", "accurate", "ultra"]
 AnalyzeTask = tuple[str, int, FrameExtractTarget, FrameExtractMode, FrameExtractQuality]
@@ -252,6 +261,7 @@ SceneMode = Literal["remove_subject", "similar", "style"]
|
|||||||
SceneStyle = Literal["source", "premium_product", "clean_studio", "warm_lifestyle", "cinematic"]
|
SceneStyle = Literal["source", "premium_product", "clean_studio", "warm_lifestyle", "cinematic"]
|
||||||
SceneAssetRole = Literal["scene", "first_frame", "last_frame"]
|
SceneAssetRole = Literal["scene", "first_frame", "last_frame"]
|
||||||
FRAME_TARGET_LABELS: dict[FrameExtractTarget, str] = {
|
FRAME_TARGET_LABELS: dict[FrameExtractTarget, str] = {
|
||||||
|
"random_subject": "人物随机",
|
||||||
"transparent_human": "透明骨架人",
|
"transparent_human": "透明骨架人",
|
||||||
"balanced": "综合关键帧",
|
"balanced": "综合关键帧",
|
||||||
"subject": "清晰主体",
|
"subject": "清晰主体",
|
||||||
@@ -541,6 +551,10 @@ class AudioScript(BaseModel):
|
|||||||
class Job(BaseModel):
|
class Job(BaseModel):
|
||||||
id: str
|
id: str
|
||||||
url: str
|
url: str
|
||||||
|
document_id: str = ""
|
||||||
|
source_kind: Literal["tiktok_link", "upload", "unknown"] = "unknown"
|
||||||
|
workflow_mode: Literal["feed_recreation", "uploaded_reference"] = "feed_recreation"
|
||||||
|
storage_prefix: str = ""
|
||||||
status: JobStatus = "created"
|
status: JobStatus = "created"
|
||||||
progress: int = 0
|
progress: int = 0
|
||||||
message: str = ""
|
message: str = ""
|
||||||
@@ -640,8 +654,26 @@ def job_with_artifacts(job: Job) -> Job:
|
|||||||
return job.model_copy(update=updates)
|
return job.model_copy(update=updates)
|
||||||
|
|
||||||
|
|
+def ensure_job_document_fields(job: Job) -> Job:
+    source_kind = job.source_kind if job.source_kind != "unknown" else infer_source_kind(job.url)
+    workflow_mode = job.workflow_mode or default_workflow_mode(source_kind)
+    document_id = job.document_id or job.id
+    job.source_kind = source_kind if source_kind in {"tiktok_link", "upload"} else "unknown"
+    job.workflow_mode = workflow_mode if workflow_mode in {"feed_recreation", "uploaded_reference"} else "feed_recreation"
+    job.document_id = document_id
+    job.storage_prefix = job.storage_prefix or storage_prefix(document_id, job.source_kind, job.workflow_mode)
+    return job
+
+
 def save_state(job: Job) -> None:
-    (job_dir(job.id) / "state.json").write_text(job.model_dump_json(indent=2))
+    ensure_job_document_fields(job)
+    d = job_dir(job.id)
+    (d / "state.json").write_text(job.model_dump_json(indent=2))
+    if DB:
+        try:
+            DB.sync_job(job.model_dump(mode="json"), d)
+        except Exception as e:
+            print(f"[database sync failed] job={job.id} error={e}", flush=True)
||||||
|
|
||||||
|
|
||||||
def update(job: Job, **kw) -> None:
|
def update(job: Job, **kw) -> None:
|
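The backfill performed by `ensure_job_document_fields` in this hunk can be sketched in isolation. The diff does not show `infer_source_kind` or `storage_prefix`, so the versions below are assumptions made for illustration only (the `upload://` check follows `create_job_from_upload`; the `tiktok.com` check and the prefix layout are invented), and `JobStub` is a hypothetical stand-in for the pydantic `Job` model:

```python
from dataclasses import dataclass


@dataclass
class JobStub:
    """Hypothetical stand-in for Job, carrying only the document fields."""
    id: str
    url: str
    document_id: str = ""
    source_kind: str = "unknown"
    workflow_mode: str = "feed_recreation"
    storage_prefix: str = ""


def infer_source_kind(url: str) -> str:
    # Assumption: the real helper is not shown in the diff.
    if url.startswith("upload://"):
        return "upload"
    if "tiktok.com" in url:
        return "tiktok_link"
    return "unknown"


def storage_prefix(document_id: str, source_kind: str, workflow_mode: str) -> str:
    # Assumption: this layout is invented; the real one is not shown.
    return f"{workflow_mode}/{source_kind}/{document_id}"


def ensure_fields(job: JobStub) -> JobStub:
    # Fill only what is missing, so values saved earlier are never overwritten.
    if job.source_kind == "unknown":
        job.source_kind = infer_source_kind(job.url)
    job.document_id = job.document_id or job.id
    job.storage_prefix = job.storage_prefix or storage_prefix(
        job.document_id, job.source_kind, job.workflow_mode
    )
    return job
```

One detail worth noting from the diff itself: `workflow_mode` defaults to the non-empty value `"feed_recreation"`, so the `or default_workflow_mode(source_kind)` fallback only applies when the field is explicitly empty.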
||||||
@@ -884,6 +916,12 @@ async def lifespan(_: FastAPI):
|
|||||||
message="服务重启 · 上次音频处理已中断,可重新处理",
|
message="服务重启 · 上次音频处理已中断,可重新处理",
|
||||||
)
|
)
|
||||||
JOBS[p.name] = job
|
JOBS[p.name] = job
|
||||||
|
ensure_job_document_fields(job)
|
||||||
|
if DB:
|
||||||
|
try:
|
||||||
|
DB.sync_job(job.model_dump(mode="json"), p)
|
||||||
|
except Exception as e:
|
||||||
|
print(f"[database restore sync failed] job={job.id} error={e}", flush=True)
|
||||||
except Exception:
|
except Exception:
|
||||||
pass
|
pass
|
||||||
yield
|
yield
|
||||||
@@ -995,6 +1033,35 @@ def run(cmd: list[str], cwd: Path | None = None) -> str:
|
|||||||
return res.stdout
|
return res.stdout
|
||||||
|
|
||||||
|
|
+def ytdlp_cookie_args() -> list[str]:
+    if YTDLP_COOKIES_FILE:
+        cookies = Path(YTDLP_COOKIES_FILE).expanduser()
+        if not cookies.exists():
+            raise RuntimeError("TikTok cookies 文件不可用,请检查 YTDLP_COOKIES_FILE 配置。")
+        return ["--cookies", str(cookies)]
+    if YTDLP_COOKIES_FROM_BROWSER:
+        return ["--cookies-from-browser", YTDLP_COOKIES_FROM_BROWSER]
+    return []
+
+
+def normalize_download_error(error: Exception) -> str:
+    raw = str(error)
+    lower = raw.lower()
+    auth_required = (
+        "log in for access" in lower
+        or "login" in lower and "cookies" in lower
+        or "cookies-from-browser" in lower
+        or "sign in" in lower and "tiktok" in lower
+    )
+    if auth_required:
+        return (
+            "TikTok 下载需要登录态。请上传视频文件,或在后端配置 "
+            "YTDLP_COOKIES_FILE / YTDLP_COOKIES_FROM_BROWSER 后重试。"
+            f"原始错误:{raw}"
+        )
+    return raw
+
+
||||||
# ---- Heuristic frame-selection utilities ----
|
# ---- Heuristic frame-selection utilities ----
|
||||||
import imagehash
|
import imagehash
|
||||||
import numpy as np
|
import numpy as np
|
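The cookie handling and login-error detection added in this hunk can be exercised without module-level globals. The sketch below is an illustration under stated assumptions: `cookie_args` and `needs_login` are hypothetical names, the settings are passed as parameters, and the error text is in English. The explicit parentheses spell out the grouping that Python's precedence (`and` binds tighter than `or`) already gives the original expression:

```python
from pathlib import Path


def cookie_args(cookies_file: str = "", from_browser: str = "") -> list[str]:
    """Hypothetical parameterized variant of ytdlp_cookie_args()."""
    if cookies_file:
        path = Path(cookies_file).expanduser()
        if not path.exists():
            # A configured but missing file is an error, not a silent fallback.
            raise RuntimeError(f"cookies file not found: {path}")
        return ["--cookies", str(path)]
    if from_browser:
        return ["--cookies-from-browser", from_browser]
    return []


def needs_login(message: str) -> bool:
    """Same substring checks as normalize_download_error, grouped explicitly."""
    lower = message.lower()
    return (
        "log in for access" in lower
        or ("login" in lower and "cookies" in lower)
        or "cookies-from-browser" in lower
        or ("sign in" in lower and "tiktok" in lower)
    )
```

A cookies file takes priority over a browser name when both are configured, matching the order of the checks in the diff.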
||||||
@@ -1408,7 +1475,10 @@ def _target_score(item: dict, target: FrameExtractTarget) -> float:
|
|||||||
scene = float(item.get("scene_score_n", 0.0))
|
scene = float(item.get("scene_score_n", 0.0))
|
||||||
motion = float(item.get("motion_n", 0.0))
|
motion = float(item.get("motion_n", 0.0))
|
||||||
|
|
||||||
if target == "transparent_human":
|
if target == "random_subject":
|
||||||
|
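# Subject-targeted random extraction: build a candidate pool from center-subject strength and sharpness, then sample randomly within that pool.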
|
||||||
|
score = center * 0.52 + sharp * 0.24 + contrast * 0.14 + color * 0.10
|
||||||
|
elif target == "transparent_human":
|
||||||
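# The extraction stage currently runs on local compute: prefer a sharp, centered subject, high contrast, moderate color, and temporal coverage.
|
# The extraction stage currently runs on local compute: prefer a sharp, centered subject, high contrast, moderate color, and temporal coverage.
|
||||||
# Semantic judgment of the transparent skeleton human is left to later review/recognition; Vision is not called per frame during extraction.
|
# Semantic judgment of the transparent skeleton human is left to later review/recognition; Vision is not called per frame during extraction.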
|
||||||
score = center * 0.45 + sharp * 0.30 + contrast * 0.15 + color * 0.10
|
score = center * 0.45 + sharp * 0.30 + contrast * 0.15 + color * 0.10
|
||||||
@@ -1460,6 +1530,15 @@ def _select_keyframes(candidates: list[dict], n: int, target: FrameExtractTarget
|
|||||||
elif it["score"] > dup["score"]:
|
elif it["score"] > dup["score"]:
|
||||||
deduped[deduped.index(dup)] = it
|
deduped[deduped.index(dup)] = it
|
||||||
|
|
+    if target == "random_subject":
+        # Subject-targeted random sampling: draw at random from the pool of sharper, more strongly centered candidates instead of ranking by motion peaks.
+        ranked = sorted(deduped, key=lambda x: -float(x.get("score", 0.0)))
+        pool_size = min(len(ranked), max(n * 6, n + 8))
+        pool = ranked[:pool_size] if pool_size > 0 else ranked
+        selected = random.sample(pool, k=min(n, len(pool))) if len(pool) > n else list(pool)
+        selected.sort(key=lambda x: x["idx"])
+        return selected
+
||||||
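# Temporal bucketing: split the candidate timeline into n equal segments and take the best candidate for the current target from each
|
# Temporal bucketing: split the candidate timeline into n equal segments and take the best candidate for the current target from each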
|
||||||
total = len(candidates)
|
total = len(candidates)
|
||||||
buckets: list[list[dict]] = [[] for _ in range(n)]
|
buckets: list[list[dict]] = [[] for _ in range(n)]
|
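The `random_subject` branch in this hunk ranks the deduplicated candidates by score, keeps a top pool of `max(n * 6, n + 8)` items, samples `n` of them at random, and restores timeline order. With the new default of 6 frames, the pool holds up to 36 candidates. A self-contained sketch follows; the function name and the injectable `rng` parameter are assumptions added so the behavior can be tested deterministically:

```python
import random


def sample_from_top_pool(candidates: list[dict], n: int, rng=None) -> list[dict]:
    """Hypothetical standalone version of the random_subject selection."""
    rng = rng or random
    # Highest score first; a missing score counts as 0.0, as in the diff.
    ranked = sorted(candidates, key=lambda c: -float(c.get("score", 0.0)))
    # The pool is several times larger than n so repeated runs give varied frames.
    pool_size = min(len(ranked), max(n * 6, n + 8))
    pool = ranked[:pool_size]
    # Sample only when the pool is larger than n; otherwise take everything.
    picked = rng.sample(pool, k=n) if len(pool) > n else list(pool)
    # Return frames in timeline order rather than sample order.
    picked.sort(key=lambda c: c["idx"])
    return picked
```

The trade-off is that a frame's exact score stops mattering once it makes the pool, which gives up some quality ranking in exchange for variety across reruns.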
||||||
@@ -1648,13 +1727,15 @@ def pipeline_download(job_id: str) -> None:
|
|||||||
update(job, status="downloading", message="本地上传 · 跳过下载", progress=15)
|
update(job, status="downloading", message="本地上传 · 跳过下载", progress=15)
|
||||||
else:
|
else:
|
||||||
update(job, status="downloading", message="yt-dlp 下载中…", progress=5)
|
update(job, status="downloading", message="yt-dlp 下载中…", progress=5)
|
||||||
run([
|
cmd = [
|
||||||
"yt-dlp", "-f", "best[ext=mp4]/best",
|
"yt-dlp", "-f", "best[ext=mp4]/best",
|
||||||
"-o", str(mp4),
|
"-o", str(mp4),
|
||||||
"--no-warnings", "--no-playlist",
|
"--no-warnings", "--no-playlist",
|
||||||
"--retries", "3",
|
"--retries", "3",
|
||||||
|
*ytdlp_cookie_args(),
|
||||||
job.url,
|
job.url,
|
||||||
])
|
]
|
||||||
|
run(cmd)
|
||||||
if not mp4.exists():
|
if not mp4.exists():
|
||||||
raise RuntimeError("下载完成但找不到 source.mp4")
|
raise RuntimeError("下载完成但找不到 source.mp4")
|
||||||
|
|
||||||
@@ -1677,13 +1758,13 @@ def pipeline_download(job_id: str) -> None:
|
|||||||
)
|
)
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
message = "视频元数据解析失败" if stage == "metadata" else "下载失败"
|
message = "视频元数据解析失败" if stage == "metadata" else "下载失败"
|
||||||
update(job, status="failed", error=str(e), message=message)
|
update(job, status="failed", error=normalize_download_error(e), message=message)
|
||||||
|
|
||||||
|
|
||||||
def pipeline_analyze(
|
def pipeline_analyze(
|
||||||
job_id: str,
|
job_id: str,
|
||||||
frame_count: int = KEYFRAME_COUNT,
|
frame_count: int = KEYFRAME_COUNT,
|
||||||
target: FrameExtractTarget = "transparent_human",
|
target: FrameExtractTarget = "random_subject",
|
||||||
mode: FrameExtractMode = "replace",
|
mode: FrameExtractMode = "replace",
|
||||||
quality: FrameExtractQuality = "auto",
|
quality: FrameExtractQuality = "auto",
|
||||||
) -> None:
|
) -> None:
|
||||||
@@ -1849,7 +1930,7 @@ def analyze_queue_worker() -> None:
|
|||||||
ANALYZE_WORKER_RUNNING = False
|
ANALYZE_WORKER_RUNNING = False
|
||||||
|
|
||||||
|
|
||||||
# ---------- Audio transcription + translation + SKG rewrite + MiniMax voice-over ----------
|
# ---------- Audio transcription + translation + SKG rewrite + Azure OpenAI voice-over ----------
|
||||||
|
|
||||||
class TranscriptionUnavailable(RuntimeError):
|
class TranscriptionUnavailable(RuntimeError):
|
||||||
pass
|
pass
|
||||||
@@ -2305,18 +2386,6 @@ def _rewrite_audio_script_sync(segments: list[TranscriptSegment], target_seconds
|
|||||||
return fallback, f"改写失败,使用本地模板:{e}"
|
return fallback, f"改写失败,使用本地模板:{e}"
|
||||||
|
|
||||||
|
|
||||||
def _minimax_tts_url() -> str:
|
|
||||||
if MINIMAX_TTS_BASE_URL.endswith("/v1/t2a_v2"):
|
|
||||||
return MINIMAX_TTS_BASE_URL
|
|
||||||
return f"{MINIMAX_TTS_BASE_URL}/v1/t2a_v2"
|
|
||||||
|
|
||||||
|
|
||||||
def _choose_minimax_voice_id() -> str:
|
|
||||||
if MINIMAX_TTS_VOICE_POOL:
|
|
||||||
return random.choice(MINIMAX_TTS_VOICE_POOL)
|
|
||||||
return MINIMAX_TTS_VOICE_ID
|
|
||||||
|
|
||||||
|
|
||||||
def _choose_azure_voice_id() -> str:
|
def _choose_azure_voice_id() -> str:
|
||||||
if AZURE_TTS_VOICE_POOL:
|
if AZURE_TTS_VOICE_POOL:
|
||||||
return random.choice(AZURE_TTS_VOICE_POOL)
|
return random.choice(AZURE_TTS_VOICE_POOL)
|
||||||
@@ -2324,9 +2393,7 @@ def _choose_azure_voice_id() -> str:
|
|||||||
|
|
||||||
|
|
||||||
def _choose_tts_voice_id() -> str:
|
def _choose_tts_voice_id() -> str:
|
||||||
if VOICE_PROVIDER == "azure_openai":
|
|
||||||
return _choose_azure_voice_id()
|
return _choose_azure_voice_id()
|
||||||
return _choose_minimax_voice_id()
|
|
||||||
|
|
||||||
|
|
||||||
def _voice_speed_for(voice_id: str, target_seconds: float, text: str) -> float:
|
def _voice_speed_for(voice_id: str, target_seconds: float, text: str) -> float:
|
||||||
@@ -2343,60 +2410,22 @@ def _voice_speed_for(voice_id: str, target_seconds: float, text: str) -> float:
|
|||||||
return 0.99
|
return 0.99
|
||||||
|
|
||||||
|
|
||||||
def _minimax_tts_sync(job_id: str, text: str, voice_id: str, target_seconds: float = 12.0) -> str:
|
def _azure_tts_url_for(path_value: str) -> str:
|
||||||
if not MINIMAX_API_KEY:
|
path = path_value if path_value.startswith("/") else f"/{path_value}"
|
||||||
raise RuntimeError("MINIMAX_API_KEY 未配置,未生成配音")
|
|
||||||
if not text.strip():
|
|
||||||
raise RuntimeError("改写文案为空,未生成配音")
|
|
||||||
payload = {
|
|
||||||
"model": MINIMAX_TTS_MODEL,
|
|
||||||
"text": text.strip()[:9500],
|
|
||||||
"stream": False,
|
|
||||||
"language_boost": "English",
|
|
||||||
"output_format": "hex",
|
|
||||||
"voice_setting": {
|
|
||||||
"voice_id": voice_id,
|
|
||||||
"speed": _voice_speed_for(voice_id, target_seconds, text),
|
|
||||||
"vol": 1,
|
|
||||||
"pitch": 0,
|
|
||||||
},
|
|
||||||
"audio_setting": {
|
|
||||||
"sample_rate": 32000,
|
|
||||||
"bitrate": 128000,
|
|
||||||
"format": "mp3",
|
|
||||||
"channel": 1,
|
|
||||||
},
|
|
||||||
}
|
|
||||||
resp = httpx.post(
|
|
||||||
_minimax_tts_url(),
|
|
||||||
headers={"Authorization": f"Bearer {MINIMAX_API_KEY}", "Content-Type": "application/json"},
|
|
||||||
json=payload,
|
|
||||||
timeout=90,
|
|
||||||
)
|
|
||||||
resp.raise_for_status()
|
|
||||||
data = resp.json()
|
|
||||||
base_resp = data.get("base_resp") or {}
|
|
||||||
if int(base_resp.get("status_code", 0) or 0) != 0:
|
|
||||||
raise RuntimeError(base_resp.get("status_msg") or "MiniMax TTS 返回失败")
|
|
||||||
audio_hex = ((data.get("data") or {}).get("audio") or "").strip()
|
|
||||||
if not audio_hex:
|
|
||||||
raise RuntimeError("MiniMax TTS 未返回 audio hex")
|
|
||||||
try:
|
|
||||||
audio_bytes = bytes.fromhex(audio_hex)
|
|
||||||
except ValueError as e:
|
|
||||||
raise RuntimeError(f"MiniMax TTS audio hex 无法解析:{e}") from e
|
|
||||||
out = job_dir(job_id) / "audio_script.mp3"
|
|
||||||
out.write_bytes(audio_bytes)
|
|
||||||
return f"/jobs/{job_id}/audio-script.mp3"
|
|
||||||
|
|
||||||
|
|
||||||
def _azure_tts_url() -> str:
|
|
||||||
path = AZURE_TTS_PATH if AZURE_TTS_PATH.startswith("/") else f"/{AZURE_TTS_PATH}"
|
|
||||||
if AZURE_OPENAI_BASE_URL.endswith(path):
|
if AZURE_OPENAI_BASE_URL.endswith(path):
|
||||||
return AZURE_OPENAI_BASE_URL
|
return AZURE_OPENAI_BASE_URL
|
||||||
return f"{AZURE_OPENAI_BASE_URL}{path}"
|
return f"{AZURE_OPENAI_BASE_URL}{path}"
|
||||||
|
|
||||||
|
|
||||||
|
def _azure_tts_urls() -> list[str]:
|
||||||
|
urls: list[str] = []
|
||||||
|
for path in AZURE_TTS_PATHS or [AZURE_TTS_PATH]:
|
||||||
|
url = _azure_tts_url_for(path)
|
||||||
|
if url not in urls:
|
||||||
|
urls.append(url)
|
||||||
|
return urls
|
||||||
|
|
||||||
|
|
||||||
def _azure_openai_tts_sync(job_id: str, text: str, voice_id: str, target_seconds: float = 12.0) -> str:
|
def _azure_openai_tts_sync(job_id: str, text: str, voice_id: str, target_seconds: float = 12.0) -> str:
|
||||||
if not AZURE_OPENAI_API_KEY:
|
if not AZURE_OPENAI_API_KEY:
|
||||||
raise RuntimeError("AZURE_OPENAI_API_KEY 或 LLM_API_KEY 未配置,未生成配音")
|
raise RuntimeError("AZURE_OPENAI_API_KEY 或 LLM_API_KEY 未配置,未生成配音")
|
||||||
@@ -2409,18 +2438,32 @@ def _azure_openai_tts_sync(job_id: str, text: str, voice_id: str, target_seconds
|
|||||||
"response_format": "mp3",
|
"response_format": "mp3",
|
||||||
"speed": _voice_speed_for(voice_id, target_seconds, text),
|
"speed": _voice_speed_for(voice_id, target_seconds, text),
|
||||||
}
|
}
|
||||||
resp = httpx.post(
|
|
||||||
_azure_tts_url(),
|
|
||||||
headers = {
|
headers = {
|
||||||
"Authorization": f"Bearer {AZURE_OPENAI_API_KEY}",
|
"Authorization": f"Bearer {AZURE_OPENAI_API_KEY}",
|
||||||
"api-key": AZURE_OPENAI_API_KEY,
|
"api-key": AZURE_OPENAI_API_KEY,
|
||||||
"Content-Type": "application/json",
|
"Content-Type": "application/json",
|
||||||
},
|
}
|
||||||
json=payload,
|
resp: httpx.Response | None = None
|
||||||
timeout=120,
|
errors: list[str] = []
|
||||||
)
|
with ai_http_client(timeout=120) as client:
|
||||||
|
for url in _azure_tts_urls():
|
||||||
|
try:
|
||||||
|
current = client.post(url, headers=headers, json=payload)
|
||||||
|
except Exception as e:
|
||||||
|
errors.append(f"{url}: {type(e).__name__}: {e}")
|
||||||
|
continue
|
||||||
|
if current.status_code < 400:
|
||||||
|
resp = current
|
||||||
|
break
|
||||||
|
errors.append(f"{url}: HTTP {current.status_code}: {current.text[:180]}")
|
||||||
|
if current.status_code not in {404, 405}:
|
||||||
|
resp = current
|
||||||
|
break
|
||||||
|
if resp is None:
|
||||||
|
raise RuntimeError("Azure OpenAI TTS 不可用;已尝试 " + " | ".join(errors))
|
||||||
if resp.status_code >= 400:
|
if resp.status_code >= 400:
|
||||||
raise RuntimeError(f"Azure OpenAI TTS HTTP {resp.status_code}: {resp.text[:300]}")
|
detail = " | ".join(errors) or resp.text[:300]
|
||||||
|
raise RuntimeError(f"Azure OpenAI TTS HTTP {resp.status_code}: {detail[:600]}")
|
||||||
audio_bytes = resp.content
|
audio_bytes = resp.content
|
||||||
if not audio_bytes:
|
if not audio_bytes:
|
||||||
raise RuntimeError("Azure OpenAI TTS 未返回音频内容")
|
raise RuntimeError("Azure OpenAI TTS 未返回音频内容")
|
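The rewritten `_azure_openai_tts_sync` tries each candidate URL in turn. Per the diff, a transport error or an HTTP 404/405 moves on to the next URL, while any other response ends the loop, whether it succeeded or failed. A minimal sketch of that control flow, assuming a hypothetical `post` callable that returns `(status_code, body_text)` so no network or `httpx` client is needed:

```python
from typing import Callable


def first_usable_endpoint(
    urls: list[str],
    post: Callable[[str], tuple[int, str]],
) -> tuple[str, int, list[str]]:
    """Hypothetical sketch of the endpoint-fallback loop; returns (url, status, errors)."""
    errors: list[str] = []
    for url in urls:
        try:
            status, body = post(url)
        except Exception as exc:
            # Transport failure: record it and try the next candidate path.
            errors.append(f"{url}: {type(exc).__name__}: {exc}")
            continue
        if status < 400:
            return url, status, errors
        errors.append(f"{url}: HTTP {status}: {body[:180]}")
        if status not in {404, 405}:
            # Auth or server errors will not be fixed by another path, so stop here.
            return url, status, errors
    raise RuntimeError("TTS endpoint unavailable; tried " + " | ".join(errors))
```

Treating only 404 and 405 as "wrong path" keeps a bad API key from being retried against every candidate, and the accumulated `errors` list gives the final exception the full trail of attempts.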
||||||
@@ -2437,9 +2480,7 @@ def _azure_openai_tts_sync(job_id: str, text: str, voice_id: str, target_seconds
|
|||||||
|
|
||||||
|
|
||||||
def _tts_sync(job_id: str, text: str, voice_id: str, target_seconds: float = 12.0) -> tuple[str, str, str]:
|
def _tts_sync(job_id: str, text: str, voice_id: str, target_seconds: float = 12.0) -> tuple[str, str, str]:
|
||||||
if VOICE_PROVIDER == "azure_openai":
|
|
||||||
return _azure_openai_tts_sync(job_id, text, voice_id, target_seconds), "azure_openai", AZURE_TTS_MODEL
|
return _azure_openai_tts_sync(job_id, text, voice_id, target_seconds), "azure_openai", AZURE_TTS_MODEL
|
||||||
return _minimax_tts_sync(job_id, text, voice_id, target_seconds), "minimax", MINIMAX_TTS_MODEL
|
|
||||||
|
|
||||||
|
|
||||||
def _build_audio_script_sync(job_id: str, segments: list[TranscriptSegment], target_seconds: float = 12.0) -> AudioScript:
|
def _build_audio_script_sync(job_id: str, segments: list[TranscriptSegment], target_seconds: float = 12.0) -> AudioScript:
|
||||||
@@ -2451,8 +2492,8 @@ def _build_audio_script_sync(job_id: str, segments: list[TranscriptSegment], tar
|
|||||||
speaker_profile, rhythm_profile = _audio_delivery_profile(segments, duration, selected_voice_id)
|
speaker_profile, rhythm_profile = _audio_delivery_profile(segments, duration, selected_voice_id)
|
||||||
voice_url = ""
|
voice_url = ""
|
||||||
voice_error = ""
|
voice_error = ""
|
||||||
voice_provider = "azure_openai" if VOICE_PROVIDER == "azure_openai" else "minimax"
|
voice_provider = "azure_openai"
|
||||||
voice_model = AZURE_TTS_MODEL if voice_provider == "azure_openai" else MINIMAX_TTS_MODEL
|
voice_model = AZURE_TTS_MODEL
|
||||||
try:
|
try:
|
||||||
voice_url, voice_provider, voice_model = _tts_sync(job_id, rewritten, selected_voice_id, duration)
|
voice_url, voice_provider, voice_model = _tts_sync(job_id, rewritten, selected_voice_id, duration)
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
@@ -3050,7 +3091,8 @@ def health() -> dict:
|
|||||||
"auth_configured": WEB_AUTH_CONFIGURED,
|
"auth_configured": WEB_AUTH_CONFIGURED,
|
||||||
"base_url": LLM_BASE_URL or "openai-default",
|
"base_url": LLM_BASE_URL or "openai-default",
|
||||||
"image_base_url": IMAGE_BASE_URL or LLM_BASE_URL or "openai-default",
|
"image_base_url": IMAGE_BASE_URL or LLM_BASE_URL or "openai-default",
|
||||||
"voice_base_url": AZURE_OPENAI_BASE_URL if VOICE_PROVIDER == "azure_openai" else MINIMAX_TTS_BASE_URL,
|
"voice_base_url": AZURE_OPENAI_BASE_URL,
|
||||||
|
"database": DB.health() if DB else {"enabled": False, "url": DATABASE_URL, "error": DB_INIT_ERROR},
|
||||||
"models": {
|
"models": {
|
||||||
"asr": ASR_MODEL,
|
"asr": ASR_MODEL,
|
||||||
"local_asr": LOCAL_ASR_MODEL,
|
"local_asr": LOCAL_ASR_MODEL,
|
||||||
@@ -3067,15 +3109,12 @@ def health() -> dict:
|
|||||||
"subject_image": SUBJECT_ASSET_IMAGE_MODEL,
|
"subject_image": SUBJECT_ASSET_IMAGE_MODEL,
|
||||||
"subject_image_fallbacks": SUBJECT_ASSET_IMAGE_MODELS,
|
"subject_image_fallbacks": SUBJECT_ASSET_IMAGE_MODELS,
|
||||||
"voice_provider": VOICE_PROVIDER,
|
"voice_provider": VOICE_PROVIDER,
|
||||||
"voice_base_url": AZURE_OPENAI_BASE_URL if VOICE_PROVIDER == "azure_openai" else MINIMAX_TTS_BASE_URL,
|
"voice_base_url": AZURE_OPENAI_BASE_URL,
|
||||||
"voice_tts": AZURE_TTS_MODEL if VOICE_PROVIDER == "azure_openai" else MINIMAX_TTS_MODEL,
|
"voice_tts": AZURE_TTS_MODEL,
|
||||||
"voice_id": AZURE_TTS_VOICE_ID if VOICE_PROVIDER == "azure_openai" else MINIMAX_TTS_VOICE_ID,
|
"voice_tts_paths": AZURE_TTS_PATHS,
|
||||||
"voice_pool": AZURE_TTS_VOICE_POOL if VOICE_PROVIDER == "azure_openai" else (MINIMAX_TTS_VOICE_POOL or [MINIMAX_TTS_VOICE_ID]),
|
"voice_id": AZURE_TTS_VOICE_ID,
|
||||||
"voice_configured": bool(AZURE_OPENAI_API_KEY) if VOICE_PROVIDER == "azure_openai" else bool(MINIMAX_API_KEY),
|
"voice_pool": AZURE_TTS_VOICE_POOL,
|
||||||
"minimax_tts": MINIMAX_TTS_MODEL,
|
"voice_configured": bool(AZURE_OPENAI_API_KEY),
|
||||||
"minimax_voice": MINIMAX_TTS_VOICE_ID,
|
|
||||||
"minimax_voice_pool": MINIMAX_TTS_VOICE_POOL or [MINIMAX_TTS_VOICE_ID],
|
|
||||||
"minimax_configured": bool(MINIMAX_API_KEY),
|
|
||||||
"video": VIDEO_MODEL,
|
"video": VIDEO_MODEL,
|
||||||
"video_aliases": VIDEO_MODEL_ALIASES,
|
"video_aliases": VIDEO_MODEL_ALIASES,
|
||||||
"video_provider": video_provider_name(),
|
"video_provider": video_provider_name(),
|
||||||
@@ -3088,6 +3127,9 @@ def health() -> dict:
|
|||||||
|
|
||||||
class JobSummary(BaseModel):
|
class JobSummary(BaseModel):
|
||||||
id: str
|
id: str
|
||||||
|
document_id: str = ""
|
||||||
|
source_kind: str = "unknown"
|
||||||
|
workflow_mode: str = "feed_recreation"
|
||||||
url: str
|
url: str
|
||||||
status: JobStatus
|
status: JobStatus
|
||||||
progress: int = 0
|
progress: int = 0
|
||||||
@@ -3103,6 +3145,29 @@ class JobSummary(BaseModel):
|
|||||||
mtime: float = 0.0
|
mtime: float = 0.0
|
||||||
|
|
||||||
|
|
||||||
|
class DocumentSummary(BaseModel):
|
||||||
|
id: str
|
||||||
|
title: str
|
||||||
|
source_kind: str
|
||||||
|
workflow_mode: str
|
||||||
|
source_url: str = ""
|
||||||
|
primary_job_id: str = ""
|
||||||
|
status: str = "created"
|
||||||
|
storage_prefix: str = ""
|
||||||
|
job_count: int = 0
|
||||||
|
asset_count: int = 0
|
||||||
|
created_at: float = 0.0
|
||||||
|
updated_at: float = 0.0
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/documents", response_model=list[DocumentSummary])
|
||||||
|
def list_documents(limit: int | None = None) -> list[DocumentSummary]:
|
||||||
|
if not DB:
|
||||||
|
return []
|
||||||
|
rows = DB.list_documents(limit)
|
||||||
|
return [DocumentSummary(**row) for row in rows]
|
||||||
|
|
||||||
|
|
||||||
@app.get("/jobs", response_model=list[JobSummary])
|
@app.get("/jobs", response_model=list[JobSummary])
|
||||||
def list_jobs(limit: int | None = None) -> list[JobSummary]:
|
def list_jobs(limit: int | None = None) -> list[JobSummary]:
|
||||||
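"""Compact list of all jobs, sorted by on-disk state.json mtime in descending order (newest first). The frontend uses it to backfill history when no ?job= is given."""
|
"""Compact list of all jobs, sorted by on-disk state.json mtime in descending order (newest first). The frontend uses it to backfill history when no ?job= is given."""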
|
||||||
@@ -3111,8 +3176,12 @@ def list_jobs(limit: int | None = None) -> list[JobSummary]:
|
|||||||
state_path = JOBS_DIR / job_id / "state.json"
|
state_path = JOBS_DIR / job_id / "state.json"
|
||||||
mtime = state_path.stat().st_mtime if state_path.exists() else 0.0
|
mtime = state_path.stat().st_mtime if state_path.exists() else 0.0
|
||||||
thumb = f"/jobs/{job_id}/frames/{job.frames[0].index}.jpg" if job.frames else ""
|
thumb = f"/jobs/{job_id}/frames/{job.frames[0].index}.jpg" if job.frames else ""
|
||||||
|
ensure_job_document_fields(job)
|
||||||
items.append(JobSummary(
|
items.append(JobSummary(
|
||||||
id=job.id,
|
id=job.id,
|
||||||
|
document_id=job.document_id,
|
||||||
|
source_kind=job.source_kind,
|
||||||
|
workflow_mode=job.workflow_mode,
|
||||||
url=job.url,
|
url=job.url,
|
||||||
status=job.status,
|
status=job.status,
|
||||||
progress=job.progress,
|
progress=job.progress,
|
||||||
@@ -3138,13 +3207,38 @@ async def create_job(req: CreateJobReq, bg: BackgroundTasks) -> Job:
|
|||||||
if not req.url.strip():
|
if not req.url.strip():
|
||||||
raise HTTPException(400, "url required")
|
raise HTTPException(400, "url required")
|
||||||
job_id = uuid.uuid4().hex[:12]
|
job_id = uuid.uuid4().hex[:12]
|
||||||
job = Job(id=job_id, url=req.url.strip())
|
job = Job(id=job_id, url=req.url.strip(), document_id=job_id, source_kind="tiktok_link", workflow_mode="feed_recreation")
|
||||||
JOBS[job_id] = job
|
JOBS[job_id] = job
|
||||||
save_state(job)
|
save_state(job)
|
||||||
bg.add_task(pipeline_download, job_id)
|
bg.add_task(pipeline_download, job_id)
|
||||||
return job
|
return job
|
||||||
|
|
||||||
|
|
||||||
|
@app.post("/jobs/{job_id}/download/retry", response_model=Job)
|
||||||
|
async def retry_job_download(job_id: str, bg: BackgroundTasks) -> Job:
|
||||||
|
job = JOBS.get(job_id)
|
||||||
|
if not job:
|
||||||
|
raise HTTPException(404, "job not found")
|
||||||
|
if job.source_kind == "upload" or job.url.startswith("upload://"):
|
||||||
|
raise HTTPException(409, "uploaded videos cannot be redownloaded; upload the file again")
|
||||||
|
if job.status in {"downloading", "splitting", "transcribing"}:
|
||||||
|
raise HTTPException(409, f"job is busy: {job.status}")
|
||||||
|
|
||||||
|
mp4 = job_dir(job_id) / "source.mp4"
|
||||||
|
if mp4.exists() and mp4.stat().st_size == 0:
|
||||||
|
mp4.unlink()
|
||||||
|
update(
|
||||||
|
job,
|
||||||
|
status="downloading",
|
||||||
|
progress=1,
|
||||||
|
error="",
|
||||||
|
message="重新提交下载…",
|
||||||
|
video_url="",
|
||||||
|
)
|
||||||
|
bg.add_task(pipeline_download, job_id)
|
||||||
|
return job
|
||||||
|
|
||||||
|
|
||||||
@app.post("/jobs/upload", response_model=Job)
|
@app.post("/jobs/upload", response_model=Job)
|
||||||
async def create_job_from_upload(bg: BackgroundTasks, file: UploadFile = File(...)) -> Job:
|
async def create_job_from_upload(bg: BackgroundTasks, file: UploadFile = File(...)) -> Job:
|
||||||
if not file.filename:
|
if not file.filename:
|
||||||
@@ -3162,7 +3256,7 @@ async def create_job_from_upload(bg: BackgroundTasks, file: UploadFile = File(..
|
|||||||
if not mp4.exists() or mp4.stat().st_size == 0:
|
if not mp4.exists() or mp4.stat().st_size == 0:
|
||||||
raise HTTPException(500, "upload failed")
|
raise HTTPException(500, "upload failed")
|
||||||
|
|
||||||
job = Job(id=job_id, url=f"upload://{file.filename}")
|
job = Job(id=job_id, url=f"upload://{file.filename}", document_id=job_id, source_kind="upload", workflow_mode="uploaded_reference")
|
||||||
JOBS[job_id] = job
|
JOBS[job_id] = job
|
||||||
save_state(job)
|
save_state(job)
|
||||||
bg.add_task(pipeline_download, job_id)
|
bg.add_task(pipeline_download, job_id)
|
||||||
@@ -3174,7 +3268,7 @@ async def trigger_analyze(
|
|||||||
job_id: str,
|
job_id: str,
|
||||||
bg: BackgroundTasks,
|
bg: BackgroundTasks,
|
||||||
frames: int = KEYFRAME_COUNT,
|
frames: int = KEYFRAME_COUNT,
|
||||||
target: FrameExtractTarget = "transparent_human",
|
target: FrameExtractTarget = "random_subject",
|
||||||
mode: FrameExtractMode = "replace",
|
mode: FrameExtractMode = "replace",
|
||||||
quality: FrameExtractQuality = "auto",
|
quality: FrameExtractQuality = "auto",
|
||||||
) -> Job:
|
) -> Job:
|
||||||
@@ -3252,6 +3346,11 @@ def delete_job(job_id: str) -> dict[str, bool | str]:
|
|||||||
job = JOBS.pop(job_id, None)
|
job = JOBS.pop(job_id, None)
|
||||||
if not job and not d.exists():
|
if not job and not d.exists():
|
||||||
raise HTTPException(404, "job not found")
|
raise HTTPException(404, "job not found")
|
||||||
|
if DB:
|
||||||
|
try:
|
||||||
|
DB.delete_job(job_id)
|
||||||
|
except Exception as e:
|
||||||
|
print(f"[database delete failed] job={job_id} error={e}", flush=True)
|
||||||
if d.exists():
|
if d.exists():
|
||||||
shutil.rmtree(d)
|
shutil.rmtree(d)
|
||||||
return {"ok": True, "id": job_id}
|
return {"ok": True, "id": job_id}
|
||||||
|
|||||||
@@ -3,7 +3,8 @@
|
|||||||
|
|
||||||
# Runtime
|
# Runtime
|
||||||
JOBS_DIR=/data/jobs
|
JOBS_DIR=/data/jobs
|
||||||
KEYFRAME_COUNT=12
|
APP_DB_URL=sqlite:////data/jobs/app.db
|
||||||
|
KEYFRAME_COUNT=6
|
||||||
CORS_ORIGINS=https://marketing.skg.com
|
CORS_ORIGINS=https://marketing.skg.com
|
||||||
API_PORT=4291
|
API_PORT=4291
|
||||||
|
|
||||||
@@ -22,7 +23,9 @@ LLM_API_KEY=
|
|||||||
ASR_MODEL=whisper-1
|
ASR_MODEL=whisper-1
|
||||||
ASR_FALLBACK_MODEL=gemini-2.5-flash
|
ASR_FALLBACK_MODEL=gemini-2.5-flash
|
||||||
TRANSLATE_MODEL=gemini-2.5-flash
|
TRANSLATE_MODEL=gemini-2.5-flash
|
||||||
REWRITE_MODEL=gemini-2.5-pro
|
GPT_TEXT_MODEL=gpt-4o
|
||||||
|
REWRITE_MODEL=gpt-4o
|
||||||
|
VISION_MODEL=gpt-4o
|
||||||
PRODUCT_VIEW_MODEL=gpt-image-2
|
PRODUCT_VIEW_MODEL=gpt-image-2
|
||||||
IMAGE_BASE_URL=https://ai.skg.com/ezlink/v1
|
IMAGE_BASE_URL=https://ai.skg.com/ezlink/v1
|
||||||
IMAGE_API_KEY=
|
IMAGE_API_KEY=
|
||||||
@@ -33,9 +36,14 @@ SUBJECT_ASSET_IMAGE_MODELS=gpt-image-2
|
|||||||
# Optional outbound proxy for AI gateway calls. Leave blank on normal VPS networking.
|
# Optional outbound proxy for AI gateway calls. Leave blank on normal VPS networking.
|
||||||
AI_HTTP_PROXY=
|
AI_HTTP_PROXY=
|
||||||
|
|
||||||
|
# Optional TikTok download login state for yt-dlp. Keep cookies files private.
|
||||||
|
YTDLP_COOKIES_FILE=
|
||||||
|
YTDLP_COOKIES_FROM_BROWSER=
|
||||||
|
|
||||||
# Audio rewrite and Azure OpenAI TTS
|
# Audio rewrite and Azure OpenAI TTS
|
||||||
AUDIO_REWRITE_MODEL=gemini-2.5-pro
|
AUDIO_REWRITE_MODEL=gpt-4o
|
||||||
AUDIO_PRODUCT_BRIEF="SKG smart massage products for daily neck, shoulder, back, eye, knee, and foot relaxation. Keep claims premium, clean, credible, and non-medical."
|
AUDIO_PRODUCT_BRIEF="SKG smart massage products for daily neck, shoulder, back, eye, knee, and foot relaxation. Keep claims premium, clean, credible, and non-medical."
|
||||||
|
# Voice is fixed to Azure OpenAI in the backend.
|
||||||
VOICE_PROVIDER=azure_openai
|
VOICE_PROVIDER=azure_openai
|
||||||
AZURE_OPENAI_BASE_URL=https://ai.skg.com/azure
|
AZURE_OPENAI_BASE_URL=https://ai.skg.com/azure
|
||||||
AZURE_OPENAI_API_KEY=
|
AZURE_OPENAI_API_KEY=
|
||||||
@@ -43,13 +51,7 @@ AZURE_TTS_MODEL=gpt-4o-mini-tts
|
|||||||
AZURE_TTS_VOICE_ID=alloy
|
AZURE_TTS_VOICE_ID=alloy
|
||||||
AZURE_TTS_VOICE_POOL=alloy,verse,shimmer
|
AZURE_TTS_VOICE_POOL=alloy,verse,shimmer
|
||||||
AZURE_TTS_PATH=/audio/speech
|
AZURE_TTS_PATH=/audio/speech
|
||||||
|
AZURE_TTS_PATHS=/audio/speech,/v1/audio/speech
|
||||||
# Legacy MiniMax TTS fallback; not the default voice provider.
|
|
||||||
MINIMAX_API_KEY=
|
|
||||||
MINIMAX_TTS_BASE_URL=https://api.minimax.io
|
|
||||||
MINIMAX_TTS_MODEL=speech-2.8-turbo
|
|
||||||
MINIMAX_TTS_VOICE_ID=English_expressive_narrator
|
|
||||||
MINIMAX_TTS_VOICE_POOL=English_magnetic_voiced_man,English_Upbeat_Woman,English_MaturePartner
|
|
||||||
|
|
||||||
# Video generation. Use SKG Doubao / Seedance gateway in production.
|
# Video generation. Use SKG Doubao / Seedance gateway in production.
|
||||||
POE_API_BASE_URL=https://api.poe.com/v1
|
POE_API_BASE_URL=https://api.poe.com/v1
|
||||||
|
|||||||
File diff suppressed because one or more lines are too long
@@ -19,6 +19,7 @@ import { AdRecreationBoard } from "@/components/ad-recreation-board"
|
|||||||
import {
|
import {
|
||||||
addManualFrame, analyzeJob, createJob, getJob, listJobs, uploadJob, deleteJob, deleteFrame, deleteGeneratedImage,
|
addManualFrame, analyzeJob, createJob, getJob, listJobs, uploadJob, deleteJob, deleteFrame, deleteGeneratedImage,
|
||||||
deleteGeneratedVideo, deleteCutout, generateStoryboardVideo, triggerTranscribe, describeFrame, updateStoryboard, copyProductLibraryAsset,
|
deleteGeneratedVideo, deleteCutout, generateStoryboardVideo, triggerTranscribe, describeFrame, updateStoryboard, copyProductLibraryAsset,
|
||||||
|
formatJobError, retryJobDownload,
|
||||||
type Job, type ImageRef, type KeyFrame, type ProductFusionShot, type StoryboardScene, type FrameExtractMode, type FrameExtractQuality, type FrameExtractTarget,
|
type Job, type ImageRef, type KeyFrame, type ProductFusionShot, type StoryboardScene, type FrameExtractMode, type FrameExtractQuality, type FrameExtractTarget,
|
||||||
} from "@/lib/api"
|
} from "@/lib/api"
|
||||||
import { TRANSPARENT_HUMAN_NEGATIVE_PROMPT, TRANSPARENT_HUMAN_VIDEO_PROMPT } from "@/lib/workflow-target"
|
import { TRANSPARENT_HUMAN_NEGATIVE_PROMPT, TRANSPARENT_HUMAN_VIDEO_PROMPT } from "@/lib/workflow-target"
|
||||||
@@ -40,6 +41,7 @@ const VIDEO_FRAME_PANEL_ID = "video-frame-panel"
 const FLOATING_PANEL_IDS = new Set([KEYFRAME_PANEL_ID, VIDEO_FRAME_PANEL_ID])
 const DIRECT_VIDEO_GENERATION_PAUSED = true
 const FRAME_TARGET_LABELS: Record<FrameExtractTarget, string> = {
+  random_subject: "人物随机",
   transparent_human: "透明骨架人",
   balanced: "综合关键帧",
   subject: "清晰主体",
@@ -242,8 +244,8 @@ export default function Home() {
   const handleAnalyzeJob = useCallback(async (jobId: string, options?: { mode?: FrameExtractMode }) => {
     const targetJob = jobs.find((item) => item.id === jobId)
     if (!targetJob) return
-    const frameTarget = frameTargets[jobId] ?? "transparent_human"
-    const frameCount = frameCounts[jobId] ?? 12
+    const frameTarget = frameTargets[jobId] ?? "random_subject"
+    const frameCount = frameCounts[jobId] ?? 6
     const frameQuality = frameQualities[jobId] ?? "auto"
     const mode = options?.mode ?? (targetJob.frames.length > 0 ? "append" : "replace")
     setActiveJobId(jobId)
@@ -487,8 +489,8 @@ export default function Home() {
     const visualRunning = target.status === "splitting"
     if (!hasVisualResult && !visualRunning && !autoTriggeredRef.current.has(visualKey)) {
       autoTriggeredRef.current.add(visualKey)
-      const frameTarget = frameTargets[target.id] ?? "motion"
-      const frameCount = frameCounts[target.id] ?? 12
+      const frameTarget = frameTargets[target.id] ?? "random_subject"
+      const frameCount = frameCounts[target.id] ?? 6
       const frameQuality = frameQualities[target.id] ?? "accurate"
       try {
         const updated = await analyzeJob(target.id, frameCount, frameTarget, "replace", frameQuality)
@@ -572,15 +574,30 @@ export default function Home() {
   const handleStartProduction = useCallback(async (inputUrl?: string) => {
     const trimmed = inputUrl?.trim()
     const created = trimmed ? await handleSubmit(trimmed) : undefined
-    const target = created ?? job
+    let target = created ?? job
     if (!target) {
       toast.info("先粘贴视频链接或选择一个素材任务")
       return
     }
+    if (!created && target.status === "failed") {
+      autoTriggeredRef.current.delete(`${target.id}:audio`)
+      autoTriggeredRef.current.delete(`${target.id}:visual`)
+    }
+    if (!created && target.status === "failed" && !target.video_url) {
+      try {
+        target = await retryJobDownload(target.id)
+        updateJobInList(target)
+        toast.info("已重新提交下载;下载完成后会自动跑音频文案路和视觉抽帧路")
+      } catch (e) {
+        toast.error("重新下载失败:" + (e instanceof Error ? e.message : String(e)))
+        return
+      }
+    }
     setProductionJobIds((prev) => new Set(prev).add(target.id))
-    toast.success("已进入并行素材分析:下载完成后自动跑音频文案路和视觉抽帧路")
+    if (target.video_url) toast.success("已进入并行素材分析:音频文案路和视觉抽帧路会同步推进")
+    else toast.success("已进入并行素材分析:下载完成后自动跑音频文案路和视觉抽帧路")
     void startProductionLanesForJob(target)
-  }, [handleSubmit, job, startProductionLanesForJob])
+  }, [handleSubmit, job, startProductionLanesForJob, updateJobInList])
 
   useEffect(() => {
     if (productionJobIds.size === 0) return
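Review note: the retry branching added to `handleStartProduction` reads as a small decision table over three inputs: whether the job was just created, its `status`, and whether it has a `video_url`. The sketch below restates that table as a pure function. `JobLike`, `StartPlan`, and `planStart` are hypothetical names used only for illustration; they are not part of this change set.

```typescript
// Hypothetical restatement of the branching in handleStartProduction.
// JobLike, StartPlan and planStart are illustrative names, not project code.
type JobLike = { id: string; status: string; video_url?: string | null }

type StartPlan = {
  resetAutoTriggers: boolean // clear the `${id}:audio` and `${id}:visual` keys
  retryDownload: boolean     // call retryJobDownload(id) before starting lanes
  hasSourceVideo: boolean    // selects which success toast is shown
}

function planStart(target: JobLike, justCreated: boolean): StartPlan {
  // Only an existing job that already failed gets the retry treatment.
  const failedExisting = !justCreated && target.status === "failed"
  return {
    resetAutoTriggers: failedExisting,
    // A failed job with no downloaded video must be re-downloaded first.
    retryDownload: failedExisting && !target.video_url,
    // Evaluated on the job as passed in; the real handler re-reads the job
    // returned by the retry call before choosing the toast.
    hasSourceVideo: Boolean(target.video_url),
  }
}

console.log(planStart({ id: "j1", status: "failed" }, false))
```

Read this way, a failed job that still has its source video only resets the auto-trigger keys and re-runs the two analysis lanes, while a failed job without a video goes through the download retry first.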
@@ -863,6 +880,9 @@ export default function Home() {
     if (job?.status === "downloaded" && prevStatusRef.current !== "downloaded") {
       toast.info("视频已下载,音频解析会自动开始;也可以在右侧手动重试", { duration: 6000 })
     }
+    if (job?.status === "failed" && prevStatusRef.current !== "failed") {
+      toast.error(formatJobError(job.error) || "任务失败", { duration: 10000 })
+    }
     prevStatusRef.current = job?.status ?? null
 
     const TERMINAL: Job["status"][] = ["downloaded", "frames_extracted", "transcribed", "failed"]
|
|||||||
@@ -32,6 +32,7 @@ import {
   cutoutElement,
   deleteSubjectAsset,
   effectiveFrameUrl,
+  formatJobError,
   generateSceneAsset,
   generateProductAngleAsset,
   generateSubjectAssets,
@@ -52,6 +53,7 @@ import { type NodeData } from "@/components/nodes"
 import { MediaAssetTile } from "@/components/media-asset-tile"
 
 const TARGETS: Array<{ value: FrameExtractTarget; label: string }> = [
+  { value: "random_subject", label: "人物随机" },
   { value: "balanced", label: "综合" },
   { value: "subject", label: "主体" },
   { value: "motion", label: "动作" },
@@ -1449,6 +1451,9 @@ function MaterialColumn({
   onSubmitUrl: () => void
   onStartProduction: () => void
 }) {
+  const actionLabel = !url.trim() && job?.status === "failed"
+    ? job.video_url ? "重新解析" : "重新下载"
+    : "开始分析"
   return (
     <section className="flex min-h-0 flex-col gap-3 rounded-lg border border-white/10 bg-white/[0.035] p-3 shadow-2xl">
       <header className="shrink-0 border-b border-white/10 pb-3">
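Review note: the nested ternary behind `actionLabel` is easier to check as a standalone function. The sketch below is a hypothetical extraction (`actionLabelFor` and `LabelJob` are illustrative names), assuming the label depends only on the URL input, the job's `status`, and its `video_url`, as the hunk suggests. The three label strings are copied from the diff.

```typescript
// Hypothetical extraction of the actionLabel expression in MaterialColumn.
type LabelJob = { status: string; video_url?: string | null }

function actionLabelFor(url: string, job: LabelJob | null): string {
  // A typed URL always means "start a fresh analysis", even if the
  // currently selected job has failed.
  if (!url.trim() && job?.status === "failed") {
    // With a source video on disk only the analysis needs re-running;
    // without one the download itself has to be retried.
    return job.video_url ? "重新解析" : "重新下载"
  }
  return "开始分析"
}

console.log(actionLabelFor("", { status: "failed" }))
```

The button label thereby mirrors the two retry paths in `handleStartProduction`: re-analyze when the video exists, re-download when it does not.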
@@ -1474,7 +1479,7 @@ function MaterialColumn({
             disabled={data.submitting || (!url.trim() && !job)}
             className="inline-flex h-10 items-center justify-center rounded-md bg-rose-600 px-3 text-[13px] font-semibold text-white transition hover:bg-rose-500 disabled:cursor-not-allowed disabled:opacity-45"
           >
-            开始分析
+            {actionLabel}
           </button>
           <button
             type="button"
@@ -1875,11 +1880,11 @@ function SourceReferenceBuildPanel({
       for (const frame of job.frames) {
         if (selectedFrames.has(frame.index)) onToggleFrame(frame.index)
       }
-      const updated = await analyzeJob(job.id, 12, "motion", "replace", "accurate")
+      const updated = await analyzeJob(job.id, 6, "random_subject", "replace", "accurate")
       onJobUpdate(updated)
-      toast.info("已按动作峰值逻辑重新抽取 12 张参考帧,完成后在这里人工选择主角参考。")
+      toast.info("已按人物定向随机逻辑重新抽取 6 张参考帧,完成后在这里人工选择主角参考。")
     } catch (e) {
-      toast.error("12 张关键帧抽取失败:" + (e instanceof Error ? e.message : String(e)))
+      toast.error("6 张关键帧抽取失败:" + (e instanceof Error ? e.message : String(e)))
     } finally {
       setExtracting(false)
     }
@@ -1887,7 +1892,7 @@ function SourceReferenceBuildPanel({
 
   const generateSimilarActor = async () => {
     if (!frames.length) {
-      toast.warning("请先自动抽帧 12 张,或在原版视频上手动补帧。")
+      toast.warning("请先自动抽帧 6 张,或在原版视频上手动补帧。")
       return
     }
     const baseFrame = subjectReferenceFrames[0]
@@ -2000,11 +2005,11 @@ function SourceReferenceBuildPanel({
             type="button"
             onClick={() => void extractKeyframes()}
             disabled={!job.video_url || extracting || job.status === "splitting"}
-            title="自动按动作峰值抽 12 张参考帧,更偏向手势、表情变化、节奏点和镜头变化"
+            title="自动按人物定向随机逻辑抽 6 张参考帧,保留手动当前点补帧"
             className="inline-flex h-8 items-center justify-center gap-1 rounded-md bg-white px-3 text-[11px] font-semibold text-black transition hover:bg-white/90 disabled:cursor-not-allowed disabled:opacity-40"
           >
             {extracting || job.status === "splitting" ? <Loader2 className="h-3.5 w-3.5 animate-spin" /> : <Scissors className="h-3.5 w-3.5" />}
-            自动抽帧 12 张
+            自动抽帧 6 张
           </button>
         </div>
       </div>
@@ -2039,7 +2044,7 @@ function SourceReferenceBuildPanel({
         })}
         {!frames.length && (
           <div className="col-span-full flex h-[106px] items-center justify-center rounded border border-dashed border-white/12 text-[11px] text-white/34">
-            点击“自动抽帧 12 张”,或在原版视频播放器上用“当前点抽帧”补充人物参考。
+            点击“自动抽帧 6 张”,或在原版视频播放器上用“当前点抽帧”补充人物参考。
           </div>
         )}
       </div>
@@ -3405,7 +3410,7 @@ function FrameExtractControls({
       </div>
       <div className="grid grid-cols-[1fr_1fr_72px] gap-2">
         <select
-          value={job ? data.frameTargets[job.id] ?? "transparent_human" : "balanced"}
+          value={job ? data.frameTargets[job.id] ?? "random_subject" : "random_subject"}
           onChange={(e) => job && data.onFrameTargetChange(job.id, e.target.value as FrameExtractTarget)}
           disabled={!job}
           className={controlClass}
@@ -3424,8 +3429,8 @@ function FrameExtractControls({
           type="number"
           min={1}
           max={20}
-          value={job ? data.frameCounts[job.id] ?? 12 : 12}
-          onChange={(e) => job && data.onFrameCountChange(job.id, Number(e.target.value) || 12)}
+          value={job ? data.frameCounts[job.id] ?? 6 : 6}
+          onChange={(e) => job && data.onFrameCountChange(job.id, Number(e.target.value) || 6)}
           disabled={!job}
           className={`${controlClass} text-center`}
         />
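Review note: the `onChange` handler falls back with `||`, while the `value` prop falls back with `??`. The two operators treat `0` and `NaN` differently, which matters for a number input. The sketch below isolates the handler's expression in a hypothetical helper (`parseFrameCount` is an illustrative name) so the behaviour for a few raw input strings can be inspected.

```typescript
// Hypothetical helper isolating `Number(e.target.value) || 6` from the hunk.
function parseFrameCount(raw: string, fallback = 6): number {
  // `||` replaces every falsy result: 0 (from "" or "0") and NaN (from "abc").
  return Number(raw) || fallback
}

// `??` only replaces null/undefined, so a stored 0 would survive it.
const stored: Record<string, number | undefined> = { a: 0 }
const shownForA = stored["a"] ?? 6 // stays 0
const shownForB = stored["b"] ?? 6 // falls back to 6

for (const raw of ["", "0", "abc", "8", "25"]) {
  console.log(JSON.stringify(raw), "=>", parseFrameCount(raw))
}
console.log(shownForA, shownForB)
```

Note that the expression itself does not clamp to the `min={1}` / `max={20}` attributes: a typed `25` passes through unchanged unless the backend or the setter enforces the range.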
@@ -3858,6 +3863,7 @@ function MaterialCard({
   onDelete?: () => void
 }) {
   const tone = statusTone(job)
+  const errorText = formatJobError(job.error)
   return (
     <button
       type="button"
@@ -3879,6 +3885,12 @@ function MaterialCard({
         <Metric label="文案" value={job.audio_script?.source_text || job.transcript.length ? "ready" : "-"} compact />
         <Metric label="段落" value={`${job.transcript.length}`} compact />
       </div>
+      {job.status === "failed" && errorText && (
+        <div className="mt-2 flex gap-1.5 rounded-md border border-rose-300/18 bg-rose-500/[0.08] px-2 py-1.5 text-[11px] leading-snug text-rose-100/82">
+          <AlertTriangle className="mt-0.5 h-3.5 w-3.5 shrink-0" />
+          <span className="line-clamp-3">{errorText}</span>
+        </div>
+      )}
       {onDelete && (
         <span
           role="button"
|
|||||||
@@ -641,15 +641,15 @@ export const Dashboard = forwardRef<DashboardHandle, Props>(function Dashboard({
             </div>
           </KanbanCard>
 
-          <KanbanCard tone="green" tags={["配音"]} title={job?.audio_script?.voice_model || "MiniMax T2A"}>
+          <KanbanCard tone="green" tags={["配音"]} title={job?.audio_script?.voice_model || "Azure OpenAI TTS"}>
             {job?.audio_script?.voice_url ? (
               <audio controls className="h-8 w-full" src={apiAssetUrl(job.audio_script.voice_url)} />
             ) : (
               <div className="text-[11px] text-[var(--text-soft)]">
-                {job?.audio_script?.error || "配置 MiniMax 后自动生成配音文件"}
+                {job?.audio_script?.error || "配置 Azure OpenAI TTS 后自动生成配音文件"}
               </div>
             )}
-            <div className="kanban-meta">{job?.audio_script?.voice_id || "random English voice"}</div>
+            <div className="kanban-meta">{job?.audio_script?.voice_id || "Azure voice"}</div>
           </KanbanCard>
         </>
       )}
|
|||||||
@@ -133,6 +133,7 @@ function clamp(value: number, min: number, max: number) {
 const THUMBNAIL_HEIGHT = 192
 const FLOATING_PANEL_EDGE_INSET = 8
 const FRAME_TARGET_OPTIONS: Array<{ value: FrameExtractTarget; label: string; hint: string }> = [
+  { value: "random_subject", label: "人物随机", hint: "从清晰人物候选里随机抽取" },
   { value: "transparent_human", label: "透明骨架人", hint: "本地算力筛清晰主体,不逐帧调用 Vision" },
   { value: "balanced", label: "综合关键帧", hint: "清晰、去重、变化、时间覆盖" },
   { value: "subject", label: "清晰主体", hint: "人物 / 产品主体更清楚" },
@@ -140,7 +141,7 @@ const FRAME_TARGET_OPTIONS: Array<{ value: FrameExtractTarget; label: string; hi
   { value: "expression", label: "表情瞬间", hint: "人物 / 动物表情倾向" },
   { value: "motion", label: "动作峰值", hint: "动作变化更明显" },
 ]
-const FRAME_COUNT_OPTIONS = [12, 8, 5, 3]
+const FRAME_COUNT_OPTIONS = [6, 12, 8, 5, 3]
 const FRAME_QUALITY_OPTIONS: Array<{ value: FrameExtractQuality; label: string; hint: string }> = [
   { value: "auto", label: "自动", hint: "展示友好:按电脑性能选择,最高只到精细" },
   { value: "fast", label: "快速", hint: "2fps / 360px,长视频省电" },
@@ -575,8 +576,8 @@ export function InputNode({ data, selected }: NodeProps<{ data: NodeData }> | an
   const aspectStr = ready ? `${j.width}/${j.height}` : "9/16"
   const thumbNaturalWidth = ready && j.height ? Math.max(96, Math.round(THUMBNAIL_HEIGHT * j.width / j.height)) : 96
   const toolWidth = Math.max(148, thumbNaturalWidth)
-  const target = d.frameTargets[j.id] ?? "transparent_human"
-  const count = d.frameCounts[j.id] ?? 12
+  const target = d.frameTargets[j.id] ?? "random_subject"
+  const count = d.frameCounts[j.id] ?? 6
   const quality = d.frameQualities[j.id] ?? "auto"
   const jHasFrames = j.frames.length > 0
   const jRunning = ["splitting", "transcribing"].includes(j.status)
@@ -815,8 +816,8 @@ export function VideoFramePanelNode({ data }: any) {
   const duration = panelJob.duration ?? 0
   const frames = [...panelJob.frames].sort((a, b) => a.timestamp - b.timestamp)
   const aspect = panelJob.width && panelJob.height ? `${panelJob.width}/${panelJob.height}` : "9/16"
-  const panelTarget = d.frameTargets[panelJob.id] ?? "transparent_human"
-  const panelCount = d.frameCounts[panelJob.id] ?? 12
+  const panelTarget = d.frameTargets[panelJob.id] ?? "random_subject"
+  const panelCount = d.frameCounts[panelJob.id] ?? 6
   const panelQuality = d.frameQualities[panelJob.id] ?? "auto"
   const panelRunning = ["splitting", "transcribing"].includes(panelJob.status)
   const dockText: Record<CanvasPanelDock, string> = {
@@ -2102,7 +2103,7 @@ export function RewriteNode({ data, selected }: any) {
 }
 
 /* ============================================================
-   5b. AudioNode — 合并 ASR + 翻译 + 改写 + MiniMax 配音
+   5b. AudioNode — 合并 ASR + 翻译 + 改写 + Azure OpenAI 配音
    ============================================================ */
 export function AudioNode({ data, selected }: any) {
   const d: NodeData = data
@@ -2152,9 +2153,9 @@ export function AudioNode({ data, selected }: any) {
       }}
     >
       <div>
-        音轨 → 取时长/节奏 → SKG 英文产品口播 → MiniMax 随机英文配音<br />
+        音轨 → 取时长/节奏 → SKG 英文产品口播 → Azure OpenAI 英文配音<br />
         <span className="text-[var(--text-faint)] font-mono">
-          {audioScript?.rewrite_model || "AUDIO_REWRITE_MODEL"} → {audioScript?.voice_model || "MiniMax T2A"}
+          {audioScript?.rewrite_model || "AUDIO_REWRITE_MODEL"} → {audioScript?.voice_model || "Azure OpenAI TTS"}
         </span>
       </div>
       {job && (
@@ -2195,7 +2196,7 @@ export function AudioNode({ data, selected }: any) {
           )}
         </div>
       )}
-      {voiceUrl && <div className="text-[10.5px] text-emerald-200/85">MiniMax natural English voice ready · 底部音频条播放</div>}
+      {voiceUrl && <div className="text-[10.5px] text-emerald-200/85">Azure OpenAI English voice ready · 底部音频条播放</div>}
       {isRewriting && (
         <div className="text-[10.5px] text-[var(--text-faint)]">正在按原音频时长生成英文产品口播和配音…</div>
       )}
|
|||||||
@@ -172,10 +172,7 @@ export interface RuntimeModels {
   voice_id?: string
   voice_pool?: string[]
   voice_configured?: boolean
-  minimax_tts?: string
-  minimax_voice?: string
-  minimax_voice_pool?: string[]
-  minimax_configured?: boolean
+  voice_tts_paths?: string[]
   video?: string
   video_aliases?: Record<string, string>
   video_provider?: string
@@ -189,6 +186,15 @@ export interface RuntimeHealth {
   llm_configured?: boolean
   auth_configured?: boolean
   base_url?: string
+  database?: {
+    enabled: boolean
+    url?: string
+    schema_version?: number
+    documents?: number
+    jobs?: number
+    assets?: number
+    error?: string
+  }
   models?: RuntimeModels
 }
 
@@ -419,7 +425,7 @@ export interface KeyFrame {
   generated_images?: GeneratedImage[]
 }
 
-export type FrameExtractTarget = "transparent_human" | "balanced" | "subject" | "transition" | "expression" | "motion"
+export type FrameExtractTarget = "random_subject" | "transparent_human" | "balanced" | "subject" | "transition" | "expression" | "motion"
 export type FrameExtractMode = "replace" | "append"
 export type FrameExtractQuality = "auto" | "fast" | "accurate" | "ultra"
 export type AssetBackground = "white" | "black"
@@ -574,6 +580,10 @@ export interface ProductRefStateItem {
 export interface Job {
   id: string
   url: string
+  document_id?: string
+  source_kind?: "tiktok_link" | "upload" | "unknown"
+  workflow_mode?: "feed_recreation" | "uploaded_reference"
+  storage_prefix?: string
   status: JobStatus
   progress: number
   message?: string
@@ -596,14 +606,13 @@ export interface BackendHealth {
   llm_configured: boolean
   auth_configured?: boolean
   base_url: string
+  database?: RuntimeHealth["database"]
   models?: {
     asr?: string
     translate?: string
     rewrite?: string
     audio_rewrite?: string
-    minimax_tts?: string
-    minimax_voice?: string
-    minimax_configured?: boolean
+    voice_tts_paths?: string[]
     video?: string
     video_aliases?: Record<string, string>
     video_base_url?: string
@@ -617,6 +626,25 @@ export function apiAssetUrl(path?: string | null): string {
   return `${API_BASE}${path.startsWith("/") ? "" : "/"}${path}`
 }
 
+export function isRestrictedDownloadError(error?: string | null): boolean {
+  const text = (error ?? "").toLowerCase()
+  return (
+    text.includes("tiktok 下载需要登录态") ||
+    text.includes("log in for access") ||
+    text.includes("cookies-from-browser") ||
+    text.includes("ytdlp_cookies_file") ||
+    (text.includes("tiktok") && text.includes("cookies"))
+  )
+}
+
+export function formatJobError(error?: string | null): string {
+  if (!error) return ""
+  if (isRestrictedDownloadError(error)) {
+    return "这个 TikTok 视频需要登录态。请上传 MP4,或让后端配置 YTDLP_COOKIES_FROM_BROWSER / YTDLP_COOKIES_FILE 后重试。"
+  }
+  return error
+}
+
 export async function getHealth(): Promise<BackendHealth> {
   const res = await fetch(`${API_BASE}/health`)
   if (!res.ok) throw new Error(`health ${res.status}`)
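Review note: `formatJobError` collapses several raw download failures into one user-facing hint and passes everything else through. The sketch below copies the two functions from the hunk so it runs on its own and feeds them a few sample strings. The sample error messages are invented for illustration and are not taken from real backend logs.

```typescript
// Copied from the hunk above so the block is self-contained.
function isRestrictedDownloadError(error?: string | null): boolean {
  const text = (error ?? "").toLowerCase()
  return (
    text.includes("tiktok 下载需要登录态") ||
    text.includes("log in for access") ||
    text.includes("cookies-from-browser") ||
    text.includes("ytdlp_cookies_file") ||
    (text.includes("tiktok") && text.includes("cookies"))
  )
}

function formatJobError(error?: string | null): string {
  if (!error) return ""
  if (isRestrictedDownloadError(error)) {
    return "这个 TikTok 视频需要登录态。请上传 MP4,或让后端配置 YTDLP_COOKIES_FROM_BROWSER / YTDLP_COOKIES_FILE 后重试。"
  }
  return error
}

// Hypothetical sample inputs (not real log lines).
const loginWall = "ERROR: [TikTok] Log in for access."
const unrelated = "ffmpeg exited with code 1"

const hint = formatJobError(loginWall)
console.log(hint)
console.log(formatJobError(unrelated)) // passed through unchanged
console.log(formatJobError(null) === "") // empty input yields empty string
// The hint mentions YTDLP_COOKIES_FILE, so formatting it again is a no-op.
console.log(formatJobError(hint) === hint)
```

Because matching is done on the lowercased text, the check is case-insensitive, and because the hint itself contains one of the matched markers, applying `formatJobError` twice gives the same result as applying it once.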
@@ -633,6 +661,15 @@ export async function createJob(tkUrl: string): Promise<Job> {
   return res.json()
 }
 
+export async function retryJobDownload(id: string): Promise<Job> {
+  const res = await fetch(`${API_BASE}/jobs/${id}/download/retry`, { method: "POST" })
+  if (!res.ok) {
+    const text = await res.text().catch(() => "")
+    throw apiError("retryJobDownload", res.status, text)
+  }
+  return res.json()
+}
+
 export async function uploadJob(file: File): Promise<Job> {
   const fd = new FormData()
   fd.append("file", file)
@@ -664,6 +701,9 @@ export async function deleteJob(id: string): Promise<{ ok: boolean; id: string }
 
 export interface JobSummary {
   id: string
+  document_id?: string
+  source_kind?: string
+  workflow_mode?: string
   url: string
   status: JobStatus
   progress: number
@@ -679,6 +719,28 @@ export interface JobSummary {
   mtime: number
 }
 
+export interface DocumentSummary {
+  id: string
+  title: string
+  source_kind: string
+  workflow_mode: string
+  source_url: string
+  primary_job_id: string
+  status: string
+  storage_prefix: string
+  job_count: number
+  asset_count: number
+  created_at: number
+  updated_at: number
+}
+
+export async function listDocuments(limit?: number): Promise<DocumentSummary[]> {
+  const qs = limit && limit > 0 ? `?limit=${limit}` : ""
+  const res = await fetch(`${API_BASE}/documents${qs}`)
+  if (!res.ok) throw new Error(`listDocuments ${res.status}`)
+  return res.json()
+}
+
 export async function listJobs(limit?: number): Promise<JobSummary[]> {
   const qs = limit && limit > 0 ? `?limit=${limit}` : ""
   const res = await fetch(`${API_BASE}/jobs${qs}`)
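Review note: `listDocuments` and `listJobs` build their query string with the same expression. The sketch below pulls it into a hypothetical helper (`limitQuery` is an illustrative name, not part of this change set) to show which inputs actually produce a `?limit=` parameter.

```typescript
// Hypothetical helper mirroring the `qs` expression in listDocuments/listJobs.
function limitQuery(limit?: number): string {
  // undefined, 0, NaN and negative values all mean "no limit parameter".
  return limit && limit > 0 ? `?limit=${limit}` : ""
}

for (const value of [undefined, 0, -5, NaN, 20]) {
  console.log(String(value), "=>", JSON.stringify(limitQuery(value)))
}
```

The expression does not round or cap the value, so a fractional limit such as `2.5` would be sent as-is; whether the backend accepts that is not visible from this diff.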
@@ -694,8 +756,8 @@ export async function triggerTranscribe(id: string): Promise<Job> {
 
 export async function analyzeJob(
   id: string,
-  frames = 12,
-  target: FrameExtractTarget = "balanced",
+  frames = 6,
+  target: FrameExtractTarget = "random_subject",
   mode: FrameExtractMode = "replace",
   quality: FrameExtractQuality = "auto",
 ): Promise<Job> {