docs: record multilingual asr deployment

auto-save 2026-05-22 09:27 (~2)
fix: support multilingual audio transcription
2026-05-22 09:30:59 +08:00 · 2026-05-22 09:27:54 +08:00 · 2026-05-22 09:26:59 +08:00 · 2026-05-21 17:44:40 +08:00 · 2026-05-21 17:29:16 +08:00 · 2026-05-21 17:16:54 +08:00
16 changed files with 3264 additions and 2391 deletions
--- a/.memory/assistant-handoff.md
+++ b/.memory/assistant-handoff.md
@@ -1,6 +1,6 @@
 # 项目接力

- 生成时间：May 21, 2026 at 02:06
+- 生成时间：May 22, 2026 at 09:01
 - 项目：SKG Marketing Studio / SKG 营销内容工作台
 - 路径：/Users/kangwan/Projects/business/20260512-20260512-skg-tk-二创验证
 - 状态：active
@@ -9,7 +9,7 @@
 ## 最近助手会话概览

 - Claude：a9e0449c-d9cb-4a2a-bb16-16596dfb552a · 时间未知
- Codex：019e447d-68c7-7db1-a499-b5eb6a98a7c2 · 时间未知
+- Codex：019e49d1-d114-7d52-8cf0-fbb4585f5ce6 · 时间未知
 - Cursor：未找到匹配当前项目的最近会话

 ## Claude 最近会话
@@ -92,42 +92,57 @@

 ## Codex 最近会话

- Session ID：019e447d-68c7-7db1-a499-b5eb6a98a7c2
- Transcript：/Users/kangwan/.codex/sessions/2026/05/20/rollout-2026-05-20T16-25-30-019e447d-68c7-7db1-a499-b5eb6a98a7c2.jsonl
+- Session ID：019e49d1-d114-7d52-8cf0-fbb4585f5ce6
+- Transcript：/Users/kangwan/.codex/sessions/2026/05/21/rollout-2026-05-21T17-15-47-019e49d1-d114-7d52-8cf0-fbb4585f5ce6.jsonl
 - 工作目录：/Users/kangwan/Projects/business/20260512-20260512-skg-tk-二创验证
 - 分支：main
 - 敏感字段：已对 token / key / password / secret 做脱敏

 ### 最近用户要求

- 我看的没改变呀  我感觉这个已经彻底调整崩了
- figma mcp是通的么
- 连一些被 我都打开了
+- 分镜这个板块，我每次点击视频 都变成了选择 其实我就是想打开看下而已 你这个逻辑不太对  你看下
+- 为什么要选择呢  你给我点击下载的按钮不就好了
+- 而且每个生成的视频当然是自动保存呀  还需要我点击保存干啥

 ### Codex 最近回复

-> 现在命令行访问本机端口被沙箱拦了一次，`nc` 明确返回 `Operation not permitted`。我用已授权外部执行再测一次，这样能区分“Figma 没开”和“Codex 沙箱不让连本地端口”。
+> 生产部署正在跑第二轮，远端备份和同步已经完成，Web 镜像在构建；我等脚本验证结束。

-> 已连上本地 Figma Desktop MCP。
+> 对，这个点你说得对。生成的视频本来就会写进当前 job 的 `generated_videos`，完成后自动回填 mp4，不应该让用户理解成还要点“保存视频”。
 > 
-> 我做了这些验证：
+> 我已经改完并上线：
 > 
-> - Figma 桌面端已打开 Design 文件，并显示 `MCP server enabled on http://127.0.0.1:3845/mcp`
-> - 本机端口确认监听：`Figma ... 127.0.0.1:3845 (LISTEN)`
-> - MCP `initialize` 成功返回：`Figma Dev Mode MCP Server 1.0.0`
-> - 工具列表可用：`get_design_context`、`get_screenshot`、`get_metadata`、`get_variable_defs`
-> - 实际调用 `get_metadata` 成功，返回当前文件页面：`0:1: Page 1`
+> - 分镜视频候选轨现在显示 `自动保存`
+> - 视频候选普通点击只负责打开预览
+> - 右上角提供显式 `下载` 按钮
+> - “保存全部 / 保存本条规划”改成了“同步规划 / 同步本条规划”，只表达同步三字段文案规划，不再像保存视频
+> - 源码解析页也同步更新了这条交互语义
 > 
-> 注意：官方远程 `figma` 那条还没通，因为当前环境没有 `FIGMA_OAUTH_TOKEN`，`codex mcp login figma` 也返回了 `invalid_redirect_uri`。但本地 `figma-desktop` 这条已经能用。Figma 官方文档也是这个流程：打开桌面端 Design 文件，启用本地 MCP，然后配置 `http://127.0.0.1:3845/mcp`。
+> 验证和发布：
 > 
-> 当前仓库没改代码，状态还是：
+> - `web/npm run build` 通过
+> - 已推送 Gitea：
+>   - `4efb2ce fix: make storyboard video clicks previewable`
+>   - `eb4347a fix: clarify storyboard videos auto-save`
+> - 已执行 `./scripts/deploy-prod-safe.sh` 上线，生产验证通过：`/login/ 200`、`/api/health 401`、容器内 `api:health ok`
+> 
+> 当前 `git status -sb`：
 > 
 > ```text
 > ## main...origin/main
 >  M .memory/worklog.json
 > ```
 > 
-> 来源：Figma 官方 Codex MCP 设置文档 https://help.figma.com/hc/en-us/articles/39888629089175-Codex-and-Figma-Set-up-the-MCP-server
+> 只剩自动日志 `.memory/worklog.json`，业务代码和文档都已提交并推送。
+> 
+> 补充：`npm run lint` 现在跑不了，因为项目里没有可执行的 `eslint`；`tsc --noEmit` 仍有一批既有的 dashboard/ui 旧类型错误，不是这次改动引入的。
+> 
+> <oai-mem-citation>
+> <citation_entries>
+> MEMORY.md:214-286|note=[used SKG worktable context and docs update expectation]
+> </citation_entries>
+> <rollout_ids>
+> 019e2fef-97a7-7523-a1ff-051a8c6…

 ## Cursor 最近会话

@@ -136,8 +151,8 @@
 ## 当前仓库状态

 - 当前分支：main
- 未提交变更：8 项
- 最近提交：docs: record image timeout deployment
+- 未提交变更：1 项
+- 最近提交：fix: clarify storyboard videos auto-save
 - 变更文件：
 -    M .memory/worklog.json

--- a/.memory/worklog.json
+++ b/.memory/worklog.json
--- a/.project.json
+++ b/.project.json
@@ -78,6 +78,11 @@
      "type" : "backend",
      "url" : "https:\/\/marketing.skg.com\/api"
    },
+    {
+      "label" : "agent-cut-preview",
+      "type" : "app",
+      "url" : "http:\/\/2.24.28.41:4290\/agent\/"
+    },
    {
      "label" : "git",
      "type" : "repo",
--- a/RULES.md
+++ b/RULES.md
@@ -11,11 +11,14 @@
 - 详见 `CLAUDE.md` 立项决策段 + `.memory/plan.md` 七步管线拆解
 - 风格：`04-Dark-Gallery-Ambient`（路径：`~/Projects/research/20260305-网页风格库/04-Dark-Gallery-Ambient.md`）
 - 第一冲刺：步骤 1-4（下载 / 拆轨 / 关键帧 / ASR+翻译）
- 当前产品方向（2026-05-20 再确认）：信息流广告快速复刻默认进入“三字段候选生成”工作流。主界面为“左侧素材输入列 + 右侧信息流复刻工作表”；工作台已取消 1800x1000 固定画布和整页缩放，改为正常流式桌面容器，宽度跟随浏览器展开，只保留 1280px 最低操作宽度防止核心表格被压烂，不再通过应用层 `zoom` 把整页缩小导致文字发虚。用户粘贴 TK 链接或上传视频后点击“开始分析”，系统自动下载源视频；下载完成后并行启动两条路：音频文案路提取原音频文案/字幕，并分析讲话人、语速节奏、背景音乐/环境声/音效；视频视觉路自动抽取参考帧。源视频工作区主体链路改为“上方参考帧池 + 转换层、下方主体元素结果栏”：参考帧池竖向排列；转换层是轻量对话式生图确认区，参考图可通过左侧缩略图 `+`、参考帧拖拽、胶片拖拽或本地图片拖入进入转换层，用户选择 GPT/Gemini 套件后先分析参考图；识别结果里的特征 chip 只作为“保留元素”本地选择，点亮=保留、再点取消，点击不立即请求模型，随下一条发送消息提交；用户再在下方发送区发送复刻/创新/卡通和画面要求，界面只保留生成要求输入框、张数控件和提示词就绪状态，不展示当前要求摘要、保留元素副本、收起记录计数或重复模型确认话术，生成数量通过发送区旁边的张数控件控制；后端返回英文出图 prompt 后不再自动弹窗，发送区主按钮直接切换为“确认生成 N 张”，用户点击才生成对应数量的统一多角度套图。主体元素结果栏在转换层下方横向展示套图输出、文件夹分组、单张重生、删除和 hover 预览，空态只保留紧凑提示，不再挤占右侧整列。旧下方“相似主体 / 主体模板库”不再作为主路径。波形下方的画面胶片只是临时预览，点击只跳转原视频时间点，双击或拖进参考帧池才正式加入关键帧，已加入的胶片直接显示“已添加”。产品图上传后独立形成产品资产包，自动识别视角/结构/比例并补缺角度。分镜工作台按逐句时间轴默认只露“文案 / 场景一句话 / 人物+产品+动作”，产品素材池、批量控制、三字段、视频候选和高级区都必须可折叠；视频候选无内容时默认不占大面积，有候选时默认只显示迷你缩略条，展开后才显示 4-grid。单条默认生成 4 个视频候选，顶部支持整片批量生成候选；首尾帧、视觉规划、产品出现方式和旧 6 字段保留在“高级”抽屉与后端 quick-plan 自动展开中，不能再作为客户默认闸门。
+- 当前产品方向（2026-05-20 再确认）：信息流广告快速复刻默认进入“三字段候选生成”工作流。主界面为“左侧素材输入列 + 右侧信息流复刻工作表”；工作台已取消 1800x1000 固定画布和整页缩放，改为正常流式桌面容器，宽度跟随浏览器展开，只保留 1280px 最低操作宽度防止核心表格被压烂，不再通过应用层 `zoom` 把整页缩小导致文字发虚。用户粘贴 TK 链接或上传视频后点击“开始分析”，系统自动下载源视频；下载完成后并行启动两条路：音频文案路自动识别中文、英文和其他多语言原音频文案/字幕，统一补齐中文镜像，并分析讲话人、语速节奏、背景音乐/环境声/音效；视频视觉路自动抽取参考帧。源视频工作区主体链路改为“上方参考帧池 + 转换层、下方主体元素结果栏”：参考帧池竖向排列；转换层是轻量对话式生图确认区，参考图可通过左侧缩略图 `+`、参考帧拖拽、胶片拖拽或本地图片拖入进入转换层，用户选择 GPT/Gemini 套件后先分析参考图；识别结果里的特征 chip 只作为“保留元素”本地选择，点亮=保留、再点取消，点击不立即请求模型，随下一条发送消息提交；用户再在下方发送区发送复刻/创新/卡通和画面要求，界面只保留生成要求输入框、张数控件和提示词就绪状态，不展示当前要求摘要、保留元素副本、收起记录计数或重复模型确认话术，生成数量通过发送区旁边的张数控件控制；后端返回英文出图 prompt 后不再自动弹窗，发送区主按钮直接切换为“确认生成 N 张”，用户点击才生成对应数量的统一多角度套图。主体元素结果栏在转换层下方横向展示套图输出、文件夹分组、单张重生、删除和 hover 预览，空态只保留紧凑提示，不再挤占右侧整列。旧下方“相似主体 / 主体模板库”不再作为主路径。波形下方的画面胶片只是临时预览，点击只跳转原视频时间点，双击或拖进参考帧池才正式加入关键帧，已加入的胶片直接显示“已添加”。产品图上传后独立形成产品资产包，自动识别视角/结构/比例并补缺角度。分镜工作台按逐句时间轴默认只露“文案 / 场景一句话 / 人物+产品+动作”，产品素材池、批量控制、三字段、视频候选和高级区都必须可折叠；视频候选无内容时默认不占大面积，有候选时默认只显示迷你缩略条，展开后才显示 4-grid。单条默认生成 4 个视频候选，顶部支持整片批量生成候选；首尾帧、视觉规划、产品出现方式和旧 6 字段保留在“高级”抽屉与后端 quick-plan 自动展开中，不能再作为客户默认闸门。

 ## 部署事实
 - 平台：VPS `76.13.31.179`（Ubuntu 24.04 / Docker Compose / Coolify Traefik）
+- Agent Cut 独立预览服务器：`2.24.28.41`（Ubuntu 24.04 / Docker Compose / 裸端口 `4290`），部署目录 `/opt/skg-marketing-studio`，Compose 入口 `docker-compose.standalone.yml`，访问地址 `http://2.24.28.41:4290/agent/`。该入口用于“一分钟二创出片终端”预览：用户只提交 TikTok 链接和产品图，后端 `AgentRun` 状态机负责下载、抽帧、规划、生成、自动重跑、审片和合成。
+- Agent Cut 独立预览验证（2026-05-21）：已在 `2.24.28.41` 的 `/opt/skg-marketing-studio` 用 `docker-compose.standalone.yml` 启动 `skg-agent-api` / `skg-agent-web`；独立 compose 通过网络别名兼容 Nginx 的 `skg-marketing-api` upstream。该裸 IP HTTP 入口的服务器 `deploy/.env.production` 需要 `WEB_AUTH_COOKIE_SECURE=false`；本次已补齐 `WEB_AUTH_*` 后重启验证通过：未登录 `/agent/` 返回 302 到 `/login/`，登录后 `/agent/` 返回 200，`/api/agent-runs` 返回数组，容器内 `/health` 返回 `ok:true` 且 `auth_configured:true`。
 - 发布状态：已部署并验证（2026-05-20，主体元素按套图文件夹分组展示，主体生成接口提交后立即返回 queued 占位并后台逐视角生成、逐张回填；工作台外层取消 1800x1000 固定画布和应用层 `zoom` 缩放，改为正常流式桌面容器，最低操作宽度 1280px；源视频工作区主体链路为上方竖向参考帧池 + 宽幅对话式转换层、下方主体元素结果栏；转换层通过参考帧 `+` 加入、参考图分析、生图对话，英文 prompt 就绪后由发送区主按钮切换为确认生成，点击后才触发主体套图生成；转换层不再固定 640px 长高，按内容自然高度显示，仅以 560px 最大高度兜底内部滚动；下方主体元素结果栏的套图输出、轮询、文件夹分组、单张重生、删除和 hover 预览逻辑保持不变；胶片双击/拖拽加入参考帧池 + 胶片缓存复用 + 音频解析失败可重试，参考帧缩略图保持小尺寸 9:16 比例 + hover 左侧紧凑预览，旧主体模板区移出主路径 + 逐句时间轴移到原版视频下方并支持双行文案 + 波形同框时间对齐画面胶片 + 胶片密度按钮上移波形顶部 + 去分隔线 + 胶片上下错落 + body 顶层原位大放大 + 隐藏源视频工作区音频解析摘要卡 + 隐藏工作区顶部状态提示条 + 三字段候选生成工作流 + 折叠紧凑候选区）；`https://marketing.skg.com` 已启用应用内登录页，未登录 API 返回 401，认证后首页 200；容器内 `/health` 返回 `ok:true`
+- Latest deployment verification (2026-05-22): `6427935` was deployed to `/opt/skg-marketing-studio` via `./scripts/deploy-prod-safe.sh`; the pre-deployment backup is `/opt/skg-marketing-studio-backups/skg-marketing-preserve-20260522012756.tgz`, and after the production Docker rebuild the script's built-in checks passed (web/API containers Up, `/` 302, `/login/` 200, missing `_next` asset 404, unauthenticated `/api/health` 401, in-container `api:health ok`, `api:ytdlp_cookie_args []`). After deployment, the private production `deploy/.env.production` was explicitly pinned to the multilingual local ASR path and the API was restarted: `ASR_LANGUAGE=auto`, `FASTER_WHISPER_MODEL=base`, `ASR_REMOTE_ENABLED=false`, `ASR_LOCAL_FALLBACK_ENABLED=true`, `ASR_AUDIO_FALLBACK_ENABLED=false`; re-verification with `./scripts/verify-prod-docker.sh root@76.13.31.179` passed, and the in-container `/health` confirmed `asr_language=auto` and `faster_whisper=base`.
 - 最近部署验证（2026-05-21）：`8458dac` 已按“先本地 Docker、再上传部署”流程上线。上线前在本机 Docker 构建 `skg-marketing-studio-web:latest` / `skg-marketing-studio-api:latest`，并用本地 Compose 容器验证通过：`web:/ 302`、`web:/login/ 200`、`web:/_next/does-not-exist.js 404`、`web:/api/health 401`、`api:health ok`、`api:ytdlp_cookie_args []`、静态 bundle 包含 `未来健康 · 营销内容工作台` 和 `信息流广告复刻生产`，且未发现本地 API/dev URL 泄漏。随后通过 `./scripts/deploy-prod-safe.sh` 部署到 `/opt/skg-marketing-studio`；部署前备份为 `/opt/skg-marketing-studio-backups/skg-marketing-preserve-20260521070327.tgz`，生产 Docker 重建后脚本内验证通过（web/API 容器 Up、`/` 302、`/login/` 200、缺失 `_next` 资源 404、未登录 `/api/health` 401、容器内 `api:health ok`）。
 - 最近部署验证（2026-05-20）：`6597db3` 已通过 `./scripts/deploy-prod-safe.sh` 部署到 `/opt/skg-marketing-studio`；部署前备份为 `/opt/skg-marketing-studio-backups/skg-marketing-preserve-20260520151033.tgz`，生产 Docker 重建后脚本内验证通过（`web:/login/ 200`、`web:/api/health 401`、`api:health ok`）。线上登录后检查首页静态资源，当前加载 chunk `/_next/static/chunks/c48f07b9aef1cd29.js` 已包含 `min-w-[1280px]` 和 `max-w-[1920px]`，未再命中旧的 `h-[1000px]`、`w-[1800px]`、`BOARD_SCALE_PRESETS` 或 `boardScale`；对应工作台取消固定画布缩放，按浏览器正常流式布局渲染。
 - 最近部署验证（2026-05-20）：`2b842fd` 已通过 `./scripts/deploy-prod-safe.sh` 部署到 `/opt/skg-marketing-studio`；部署前备份为 `/opt/skg-marketing-studio-backups/skg-marketing-preserve-20260520145223.tgz`，生产 Docker 重建后脚本内验证通过（`web:/login/ 200`、`web:/api/health 401`、`api:health ok`）。线上登录后检查首页静态资源，当前加载 chunk `/_next/static/chunks/743b82648dfa9db9.js` 已包含 `h-32`、`maxHeight:560`、`提示词就绪` 和 `确认生成`，且未再命中旧的 `height:640` / `h-40`；对应转换层取消固定长高，生成要求输入区回到 128px，底部仍由发送区主按钮确认生成。
@@ -57,9 +60,10 @@
 - 服务器目录：`/opt/skg-marketing-studio`
 - 生产部署唯一入口：`./scripts/deploy-prod-safe.sh`（先在服务器备份 `deploy/.env.production`、`data/jobs`、资源库和 `secrets`，再用受保护 rsync 同步代码，最后 Docker 重建并运行 `verify-prod-docker.sh`）
 - 生产容器重建命令：`docker compose -f docker-compose.prod.yml --env-file deploy/.env.production up -d --build`；只允许脚本内部或明确只重启容器时使用，不允许再用裸 `rsync --delete` 手动同步。
+- 独立预览容器重建命令：服务器 `/opt/skg-marketing-studio` 下执行 `docker compose -f docker-compose.standalone.yml --env-file deploy/.env.production up -d --build`；Web 暴露 `0.0.0.0:4290->80`，后端仅在 compose 内部网络暴露，`/api/` 由 Web 容器 Nginx 反代并复用应用内登录校验。
 - 生产架构：`web` 容器用 Nginx 承载 Next 静态导出；`/login/`、`/_next/`、`/assets/`、`/skg-logo-black.svg`、`/oasis-source/` 等登录页必需静态资源公开访问；未登录访问工作台跳转 `/login/`，`/api/` 通过 Nginx `auth_request` 校验 FastAPI 会话 Cookie 后反代到 `skg-marketing-api:4291`；Traefik 通过 `coolify` 外部网络接入 80/443
 - Web 验收必须以生产 Docker 形态为准：前端是 `next export` 静态产物 + Nginx，不是 `next dev` / `next start`。任何 Web 改动部署后必须运行 `./scripts/verify-prod-docker.sh`，确认 `/login/`、`/_next/`、`/api/health`、本地 API 地址泄漏和 API 镜像 `.env` 污染检查通过；不能只用本地 `npm run build` 作为上线依据。
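- Current audio transcription: `gpt-4o-transcribe` at `https://ai.skg.com/azure/v1` currently returns `DeploymentNotFound`, and probing the official Azure OpenAI transcription path found no usable deployment either; production temporarily replicates the strategy that works locally and transcribes directly with in-container `faster-whisper tiny.en`, with the Gemini multimodal audio fallback disabled. Restore `ASR_REMOTE_ENABLED=true` once the real Azure ASR deployment name is available.
+- Current audio transcription: `gpt-4o-transcribe` at `https://ai.skg.com/azure/v1` currently returns `DeploymentNotFound`, and probing the official Azure OpenAI transcription path found no usable deployment either; production temporarily replicates the strategy that works locally and transcribes directly with in-container multilingual `faster-whisper`. The default language is `auto`, which supports source-language recognition of Chinese, English, and other languages, and the Gemini multimodal audio fallback is disabled. Once the real Azure ASR deployment name is available, restore `ASR_REMOTE_ENABLED=true` and keep `ASR_LANGUAGE` empty or `auto` unless a single language should be explicitly forced.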
 - 持久化目录：服务器 `./data/jobs` 挂载到后端 `/data/jobs`；全局资源中心持久化在 `./data/asset_library`、`./data/prompt_library` 和 `./data/_trash`
 - TikTok 下载登录态：公开视频默认不带 cookies 直接下载，生产环境变量必须显式保持 `YTDLP_COOKIES_FILE=`、`YTDLP_COOKIES_FROM_BROWSER=` 为空，防止容器读取不存在的浏览器 cookies。只有 TikTok 明确要求登录态时，才使用服务器私有 cookies 文件 `./secrets/tiktok_cookies.txt` 挂载到 API 容器 `/run/secrets/tiktok_cookies.txt` 并配置 `YTDLP_COOKIES_FILE=/run/secrets/tiktok_cookies.txt`；`yt-dlp` 会在任务结束时回写 cookies，因此不要把该挂载设为只读；不要使用云端浏览器读取方案，也不要把 cookies 入库。生产容器严禁使用 `YTDLP_COOKIES_FROM_BROWSER=chrome`。
 - 登录凭证：用户名写下方快捷登录；密码明文备份只放服务器 `/root/skg-marketing-studio-login.txt`，生产环境变量 `WEB_AUTH_PASSWORD` / `WEB_AUTH_SESSION_SECRET` 只放服务器 `deploy/.env.production`
@@ -91,11 +95,11 @@
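 - `LLM_BASE_URL` / `LLM_API_KEY`: OpenAI-compatible gateway used for text and multimodal understanding calls such as translation, copy rewriting, and audio analysis
 - `ASR_BASE_URL` / `ASR_API_KEY`: OpenAI Audio Transcriptions-compatible gateway that receives the uploaded `audio.wav` for real transcription; when `ASR_API_KEY` is not set, `LLM_API_KEY` is reused; production defaults to `https://ai.skg.com/azure/v1`
 - `ASR_MODEL`: OpenAI Audio Transcriptions model; the Microsoft channel uses the Azure OpenAI deployment name `gpt-4o-transcribe`; if the actual deployment name on the Azure side differs, it must be updated here as well
- `ASR_LANGUAGE`: input-language hint for remote ASR, default `en`; Microsoft's official guidance states that specifying an ISO-639-1 language can improve accuracy and latency.
+- `ASR_LANGUAGE`: optional input-language hint for both remote and local ASR; defaults to empty/`auto`, so the model auto-detects Chinese, English, and other languages; set an ISO-639-1 code to force a language only when the material is known to be in one fixed language.
 - `ASR_REMOTE_ENABLED`: whether remote OpenAI Audio Transcriptions is enabled; must be `true` for Microsoft ASR acceptance. Production currently sets it to `false` temporarily because `gpt-4o-transcribe` under `https://ai.skg.com/azure/v1` returns `DeploymentNotFound`, so transcription goes straight to in-container `faster-whisper`; restore it once the real Azure deployment name is available.
 - `ASR_LOCAL_FALLBACK_ENABLED`: whether a failed remote ASR may fall back to local / in-container ASR; currently `true` in production, replicating the "real local transcription" strategy that works locally, with CPU `faster-whisper` in the cloud standing in for `mlx_whisper` on the local Mac.
 - `ASR_AUDIO_FALLBACK_ENABLED`: whether failed remote and local ASR may fall back to multimodal audio transcription; set to `false` for production Microsoft ASR acceptance to avoid silently using Gemini audio
- `FASTER_WHISPER_MODEL` / `FASTER_WHISPER_DEVICE` / `FASTER_WHISPER_COMPUTE_TYPE`: in-container local ASR fallback, enabled only when `ASR_LOCAL_FALLBACK_ENABLED=true`
+- `FASTER_WHISPER_MODEL` / `FASTER_WHISPER_DEVICE` / `FASTER_WHISPER_COMPUTE_TYPE`: in-container local ASR fallback, enabled only when `ASR_LOCAL_FALLBACK_ENABLED=true`; defaults to the multilingual `base` model; do not switch back to an English-only `*.en` model, or Chinese and multilingual recognition will degrade.
 - `ASR_FALLBACK_MODEL`: multimodal audio fallback model, used for fallback or audio profiling only when `ASR_AUDIO_FALLBACK_ENABLED=true`; defaults to `gemini-2.5-flash`; if the model cannot actually hear the audio or returns suspected fake per-second subtitles, the backend must refuse to write the timeline
 - `ASR_TIMEOUT_SECONDS`: per-request timeout for remote ASR / translation / audio analysis; production's current local-transcription mode uses 45 seconds, and it can be raised according to material length when Microsoft ASR is re-enabled.
 - `LOCAL_ASR_BIN` / `LOCAL_ASR_MODEL` / `LOCAL_ASR_TIMEOUT_SECONDS`: local-machine ASR fallback; defaults to `/opt/homebrew/bin/mlx_whisper` + `mlx-community/whisper-tiny`, used to produce a real sentence-level timeline while the current SKG gateway `/audio/transcriptions` is unavailable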
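
Reviewer note: the ASR switches documented in this RULES.md hunk can be read as a small resolution step from raw environment strings to effective settings. The sketch below mirrors the parsing rules visible in the `api/main.py` part of this commit (the falsy set and the auto-language aliases); `resolve_asr_config` and the plain `env` dict argument are hypothetical names used for illustration only, not functions from the repository.

```python
# Aliases the commit treats as "let the model detect the language".
AUTO_LANGUAGE_VALUES = {"", "auto", "detect", "multilingual", "multi"}
# Strings that switch a boolean flag off; anything else counts as enabled.
FALSY_VALUES = {"0", "false", "no", "off"}


def env_flag(env: dict, name: str, default: str = "true") -> bool:
    """Parse a boolean switch the same way the ASR_*_ENABLED flags are parsed."""
    return env.get(name, default).strip().lower() not in FALSY_VALUES


def language_hint(env: dict) -> str:
    """Return an explicit language code, or "" when auto-detection is wanted."""
    raw = env.get("ASR_LANGUAGE", "").strip()
    return "" if raw.lower() in AUTO_LANGUAGE_VALUES else raw


def resolve_asr_config(env: dict) -> dict:
    """Hypothetical helper: collect the effective ASR settings in one place."""
    hint = language_hint(env)
    return {
        "language_hint": hint,            # passed to the model only if non-empty
        "language_label": hint or "auto",  # what a health endpoint would report
        "whisper_model": env.get("FASTER_WHISPER_MODEL", "base").strip() or "base",
        "remote": env_flag(env, "ASR_REMOTE_ENABLED"),
        "local_fallback": env_flag(env, "ASR_LOCAL_FALLBACK_ENABLED"),
        "audio_fallback": env_flag(env, "ASR_AUDIO_FALLBACK_ENABLED"),
    }


if __name__ == "__main__":
    production = {
        "ASR_LANGUAGE": "auto",
        "FASTER_WHISPER_MODEL": "base",
        "ASR_REMOTE_ENABLED": "false",
        "ASR_LOCAL_FALLBACK_ENABLED": "true",
        "ASR_AUDIO_FALLBACK_ENABLED": "false",
    }
    print(resolve_asr_config(production))
```

With the production values recorded in this commit, the hint resolves to empty (auto-detect) and only the local fallback path stays enabled, which matches the `asr_language=auto` / `faster_whisper=base` health output quoted in the deployment record.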
--- a/api/README.md
+++ b/api/README.md
@@ -35,6 +35,6 @@ uvicorn main:app --host 127.0.0.1 --port 4291

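 - `ffmpeg` system binary (track splitting / frame extraction)
 - `yt-dlp` system binary (the Python package also works)
- OpenAI-compatible LLM gateway (ASR / translation / copy rewriting / visual brief); after remote `whisper-1` fails, fall back first to local `mlx_whisper`, then to Gemini multimodal audio recognition via `ASR_FALLBACK_MODEL`; the backend rejects timelines that look like fake subtitles or have too little coverage
+- OpenAI-compatible LLM gateway (ASR / translation / copy rewriting / visual brief); ASR auto-detects Chinese, English, and other languages by default; after a remote failure it falls back first to in-container multilingual `faster-whisper` / local `mlx_whisper`, then, if the switch allows, to multimodal audio recognition via `ASR_FALLBACK_MODEL`; the backend rejects timelines that look like fake subtitles or have too little coverage
 - GPT image gateway (all image generation / retouching / product-angle recognition / subject assets / first and last frames are currently forced to use `gpt-image-2`, with no fallback to other image models)
 - Azure OpenAI TTS (the later new-voiceover stage uses `AZURE_OPENAI_API_KEY`; default model `gpt-4o-mini-tts`, trying the voice paths in `AZURE_TTS_PATHS` in order)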
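
Reviewer note: the fallback order described in this README hunk (remote ASR, then local Whisper, then multimodal audio, each behind its own switch) might be sketched as an ordered chain. This is an illustrative sketch only: `transcribe_with_fallback` and the backend tuples are hypothetical and do not reproduce the repository's `_transcribe_sync`.

```python
from typing import Callable

Segments = list[dict]
# (name, enabled switch, zero-argument callable that returns segments)
Backend = tuple[str, bool, Callable[[], Segments]]


def transcribe_with_fallback(backends: list[Backend]) -> tuple[str, Segments]:
    """Try each enabled backend in order and return the first usable result."""
    errors: list[str] = []
    for name, enabled, run in backends:
        if not enabled:
            errors.append(f"{name}: disabled")
            continue
        try:
            segments = run()
        except Exception as exc:  # a failed backend must not abort the chain
            errors.append(f"{name}: {exc}")
            continue
        if segments:
            return name, segments
        errors.append(f"{name}: no segments")
    raise RuntimeError("all ASR backends failed: " + "; ".join(errors))


if __name__ == "__main__":
    def remote() -> Segments:
        raise TimeoutError("DeploymentNotFound")

    def local() -> Segments:
        return [{"start": 0.0, "end": 1.2, "text": "你好"}]

    print(transcribe_with_fallback([
        ("remote", True, remote),
        ("local-whisper", True, local),
        ("multimodal-audio", False, local),
    ]))
```

Collecting one error string per backend keeps the final failure message explicit about which step was disabled and which one actually failed, which helps when a switch such as `ASR_AUDIO_FALLBACK_ENABLED` is deliberately off.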
--- a/api/asset_library/index.json
+++ b/api/asset_library/index.json
@@ -0,0 +1,3 @@
+{
+  "items": []
+}
--- a/api/main.py
+++ b/api/main.py
@@ -29,6 +29,8 @@ load_dotenv()

 JOBS_DIR = Path(os.getenv("JOBS_DIR", "./jobs")).resolve()
 JOBS_DIR.mkdir(parents=True, exist_ok=True)
+AGENT_RUNS_DIR = Path(os.getenv("AGENT_RUNS_DIR", JOBS_DIR.parent / "agent_runs")).resolve()
+AGENT_RUNS_DIR.mkdir(parents=True, exist_ok=True)
 CORS_ORIGINS = [o.strip() for o in os.getenv("CORS_ORIGINS", "http://localhost:4290,http://127.0.0.1:4290").split(",") if o.strip()]
 PRODUCT_LIBRARY_DIR = Path(
    os.getenv("PRODUCT_LIBRARY_DIR", Path(__file__).resolve().parent / "product_library" / "skg-products")
@@ -61,13 +63,13 @@ LLM_API_KEY = os.getenv("LLM_API_KEY", "").strip()
 ASR_BASE_URL = os.getenv("ASR_BASE_URL", LLM_BASE_URL).strip()
 ASR_API_KEY = (os.getenv("ASR_API_KEY") or LLM_API_KEY).strip()
 ASR_MODEL = os.getenv("ASR_MODEL", "whisper-1")
-ASR_LANGUAGE = os.getenv("ASR_LANGUAGE", "en").strip()
+ASR_LANGUAGE = os.getenv("ASR_LANGUAGE", "").strip()
 ASR_REMOTE_ENABLED = os.getenv("ASR_REMOTE_ENABLED", "true").strip().lower() not in {"0", "false", "no", "off"}
 ASR_LOCAL_FALLBACK_ENABLED = os.getenv("ASR_LOCAL_FALLBACK_ENABLED", "true").strip().lower() not in {"0", "false", "no", "off"}
 ASR_AUDIO_FALLBACK_ENABLED = os.getenv("ASR_AUDIO_FALLBACK_ENABLED", "true").strip().lower() not in {"0", "false", "no", "off"}
 ASR_FALLBACK_MODEL = os.getenv("ASR_FALLBACK_MODEL", "gemini-2.5-flash").strip() or "gemini-2.5-flash"
 ASR_TIMEOUT_SECONDS = max(15, int(os.getenv("ASR_TIMEOUT_SECONDS", "45")))
-FASTER_WHISPER_MODEL = os.getenv("FASTER_WHISPER_MODEL", "tiny.en").strip() or "tiny.en"
+FASTER_WHISPER_MODEL = os.getenv("FASTER_WHISPER_MODEL", "base").strip() or "base"
 FASTER_WHISPER_DEVICE = os.getenv("FASTER_WHISPER_DEVICE", "cpu").strip() or "cpu"
 FASTER_WHISPER_COMPUTE_TYPE = os.getenv("FASTER_WHISPER_COMPUTE_TYPE", "int8").strip() or "int8"
 LOCAL_ASR_BIN = os.getenv("LOCAL_ASR_BIN", "").strip()
@@ -77,6 +79,20 @@ TRANSLATE_MODEL = os.getenv("TRANSLATE_MODEL", "gemini-2.5-flash")
 DEFAULT_GPT_TEXT_MODEL = os.getenv("GPT_TEXT_MODEL", "gpt-4o").strip() or "gpt-4o"


+ASR_AUTO_LANGUAGE_VALUES = {"", "auto", "detect", "multilingual", "multi"}
+
+
+def _asr_language_hint() -> str:
+    language = ASR_LANGUAGE.strip()
+    if language.lower() in ASR_AUTO_LANGUAGE_VALUES:
+        return ""
+    return language
+
+
+def _asr_language_label() -> str:
+    return _asr_language_hint() or "auto"
+
+
 def gpt_model_env(name: str, default: str | None = None) -> str:
    value = os.getenv(name, default or DEFAULT_GPT_TEXT_MODEL).strip()
    if not value or value.lower().startswith("gemini-"):
@@ -2809,7 +2825,7 @@ def _clean_asr_segments(segments: list[dict], duration: float) -> list[dict]:


 def _segment_text_key(text: str) -> str:
-    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
+    return re.sub(r"[^\w]+", " ", text.casefold(), flags=re.UNICODE).strip()


 def _validate_asr_segments(segments: list[dict], duration: float, source: str) -> list[dict]:
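
Reviewer note: the `_segment_text_key` change in this hunk matters for non-Latin transcripts. The old ASCII-only pattern strips every CJK character, so any Chinese segment would normalize to an empty key. A minimal side-by-side sketch follows; `old_key` and `new_key` are illustrative names wrapping the two expressions from the diff, and how the key is consumed downstream is not shown in this commit.

```python
import re


def old_key(text: str) -> str:
    # Pre-commit behaviour: only ASCII letters and digits survive.
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def new_key(text: str) -> str:
    # Post-commit behaviour: any Unicode word character survives.
    return re.sub(r"[^\w]+", " ", text.casefold(), flags=re.UNICODE).strip()


if __name__ == "__main__":
    for sample in ("Hello, World!", "你好，世界", "颈部按摩仪 SKG"):
        print(repr(sample), "old:", repr(old_key(sample)), "new:", repr(new_key(sample)))
```

For English input both versions agree, so the change should be behaviour-preserving for existing English material while giving distinct Chinese segments distinct keys.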
@@ -2907,19 +2923,22 @@ def _transcribe_faster_whisper_sync(wav: Path) -> list[dict]:
        device=FASTER_WHISPER_DEVICE,
        compute_type=FASTER_WHISPER_COMPUTE_TYPE,
    )
-    raw_segments, _info = model.transcribe(
-        str(wav.resolve()),
-        language="en",
-        beam_size=1,
-        vad_filter=True,
-        condition_on_previous_text=False,
-    )
+    language_hint = _asr_language_hint()
+    transcribe_options = {
+        "beam_size": 1,
+        "vad_filter": True,
+        "condition_on_previous_text": False,
+    }
+    if language_hint:
+        transcribe_options["language"] = language_hint
+    raw_segments, _info = model.transcribe(str(wav.resolve()), **transcribe_options)
+    detected_language = str(getattr(_info, "language", "") or language_hint or "auto")
    segments = [
        {"start": float(seg.start), "end": float(seg.end), "text": str(seg.text or "").strip()}
        for seg in raw_segments
        if str(seg.text or "").strip()
    ]
-    return _validate_asr_segments(segments, duration, f"faster-whisper:{FASTER_WHISPER_MODEL}")
+    return _validate_asr_segments(segments, duration, f"faster-whisper:{FASTER_WHISPER_MODEL}:{detected_language}")


 def _transcribe_gemini_sync(wav: Path) -> list[dict]:
@@ -2929,8 +2948,9 @@ def _transcribe_gemini_sync(wav: Path) -> list[dict]:
        "Transcribe the attached audio. Return strict JSON only, no markdown. "
        "If you cannot truly hear the audio, return {\"can_hear\": false}. Do not guess. "
        "If you can hear it, return {\"can_hear\": true, \"segments\": "
-        "[{\"start\": 0.0, \"end\": 1.2, \"text\": \"English transcript\"}]}. "
-        "Use English for the transcript. Only include timestamps you can infer from the audio."
+        "[{\"start\": 0.0, \"end\": 1.2, \"text\": \"original-language transcript\"}]}. "
+        "Keep the transcript in the spoken source language; do not translate it here. "
+        "Only include timestamps you can infer from the audio."
    )
    last_error: Exception | None = None
    for attempt in range(3):
@@ -2959,19 +2979,21 @@ def _transcribe_sync(wav: Path) -> list[dict]:
    if ASR_REMOTE_ENABLED:
        try:
            with wav.open("rb") as f:
+                language_hint = _asr_language_hint()
                resp = asr_llm().with_options(timeout=ASR_TIMEOUT_SECONDS).audio.transcriptions.create(
                    file=(wav.name, f, "audio/wav"),
                    model=ASR_MODEL,
                    response_format="verbose_json",
                    timestamp_granularities=["segment"],
-                    **({"language": ASR_LANGUAGE} if ASR_LANGUAGE else {}),
+                    **({"language": language_hint} if language_hint else {}),
                )
            raw = resp.model_dump() if hasattr(resp, "model_dump") else resp
            segments = raw.get("segments") or []
             # Fallback: if the gateway returns no segments, treat the full text as a single segment
            if not segments and raw.get("text"):
                segments = [{"start": 0.0, "end": float(raw.get("duration", 0) or 0), "text": raw["text"]}]
-            return _validate_asr_segments(segments, duration, ASR_MODEL)
+            detected_language = str(raw.get("language") or language_hint or "auto")
+            return _validate_asr_segments(segments, duration, f"{ASR_MODEL}:{detected_language}")
        except Exception as e:
            errors.append(f"{ASR_MODEL}: {e}")
    else:
@@ -2999,11 +3021,13 @@ def _transcribe_sync(wav: Path) -> list[dict]:

 def _translate_sync(segments: list[dict]) -> list[str]:
     """Batch-translate segments into Chinese and return one string per segment."""
-    payload = [{"i": i, "en": s.get("text", "").strip()} for i, s in enumerate(segments)]
+    payload = [{"i": i, "text": s.get("text", "").strip()} for i, s in enumerate(segments)]
    prompt = (
-        "你是字幕翻译。把下列英文字幕段翻译为简体中文，保持原意、口语化、自然流畅。"
-        "严格返回 JSON 数组，不要任何 markdown 或多余文字，schema: "
-        '[{"i": 0, "zh": "..."}, ...]\n\n输入：\n'
+        "你是多语言字幕翻译。把下列原语言字幕段翻译为简体中文；"
+        "如果原文已经是中文，只做简体中文规范化和口语化整理，不要改写意思。"
+        "保持原意、口语化、自然流畅。"
+        "严格返回 JSON object，不要任何 markdown 或多余文字，schema: "
+        '{"translations":[{"i": 0, "zh": "..."}]}\n\n输入：\n'
        + json.dumps(payload, ensure_ascii=False)
    )
    try:
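
Reviewer note: this hunk changes the translation prompt's response schema from a bare JSON array to an object with a `translations` key. The parsing side is outside the visible diff, so the sketch below is an assumption about how a tolerant consumer could look: `parse_translations` is a hypothetical helper that accepts both the new object form and the legacy array form and aligns results by the `i` index.

```python
import json


def parse_translations(raw: str, count: int) -> list[str]:
    """Map a model reply onto `count` slots, leaving missing entries empty."""
    data = json.loads(raw)
    # New schema: {"translations": [...]}; legacy schema: [...]
    items = data.get("translations", []) if isinstance(data, dict) else data
    result = [""] * count
    if not isinstance(items, list):
        return result
    for item in items:
        if not isinstance(item, dict):
            continue
        index = item.get("i")
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(index, int) and not isinstance(index, bool) and 0 <= index < count:
            result[index] = str(item.get("zh") or "").strip()
    return result


if __name__ == "__main__":
    reply = '{"translations": [{"i": 1, "zh": "第二句"}, {"i": 0, "zh": "第一句"}]}'
    print(parse_translations(reply, 3))
```

Aligning by index rather than by list position keeps the timeline intact when the model reorders or drops entries; empty slots can then be retried or left untranslated.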
@@ -3430,7 +3454,7 @@ def pipeline_transcribe(job_id: str, manage_job_status: bool = True) -> None:
            return

        # 1) whisper ASR
-        progress(f"{ASR_MODEL} 转录中…", 78)
+        progress(f"{ASR_MODEL} {_asr_language_label()} 语种转录中…", 78)
        segments = _transcribe_sync(wav)
        if not segments:
            raise TranscriptionUnavailable("ASR 未返回可用字幕段")
@@ -4492,7 +4516,7 @@ def health() -> dict:
        "voice_base_url": AZURE_OPENAI_BASE_URL,
        "models": {
            "asr": ASR_MODEL,
-            "asr_language": ASR_LANGUAGE,
+            "asr_language": _asr_language_label(),
            "asr_base_url": ASR_BASE_URL or LLM_BASE_URL or "openai-default",
            "asr_remote_enabled": ASR_REMOTE_ENABLED,
            "asr_local_fallback_enabled": ASR_LOCAL_FALLBACK_ENABLED,
@@ -8011,6 +8035,418 @@ def copy_character_library_assets(job_id: str, req: CopyCharacterLibraryAssetReq
    }


+class AgentRunLog(BaseModel):
+    ts: float
+    level: Literal["info", "warn", "error"] = "info"
+    message: str
+
+
+class AgentRun(BaseModel):
+    id: str
+    job_id: str
+    status: Literal["draft", "queued", "executing", "reviewing", "completed", "failed"] = "queued"
+    stage: str = "queued"
+    progress: int = 0
+    logs: list[AgentRunLog] = Field(default_factory=list)
+    video_ids: list[str] = Field(default_factory=list)
+    final_video_url: str = ""
+    contact_sheet_url: str = ""
+    error: str = ""
+    created_at: float = Field(default_factory=time.time)
+    updated_at: float = Field(default_factory=time.time)
+
+
+AGENT_RUNS: dict[str, AgentRun] = {}
+AGENT_DEFAULT_PRODUCT_IDS = [
+    "desktop-skg-product-angle-01",
+    "desktop-skg-product-angle-02",
+    "desktop-skg-product-angle-03",
+    "desktop-skg-product-angle-04",
+]
+AGENT_DEFAULT_CHARACTER_ID = os.getenv("AGENT_DEFAULT_CHARACTER_ID", "character-02").strip() or "character-02"
+AGENT_SHOT_COUNT = max(8, min(12, int(os.getenv("AGENT_SHOT_COUNT", "12"))))
+AGENT_SHOT_DURATION_SECONDS = max(4.0, min(8.0, float(os.getenv("AGENT_SHOT_DURATION_SECONDS", "5"))))
+AGENT_VIDEO_TIMEOUT_SECONDS = max(300, int(os.getenv("AGENT_VIDEO_TIMEOUT_SECONDS", "1500")))
+
+
+def agent_run_dir(run_id: str) -> Path:
+    return AGENT_RUNS_DIR / run_id
+
+
+def agent_run_path(run_id: str) -> Path:
+    return agent_run_dir(run_id) / "state.json"
+
+
+def save_agent_run(run: AgentRun) -> None:
+    run.updated_at = time.time()
+    d = agent_run_dir(run.id)
+    d.mkdir(parents=True, exist_ok=True)
+    agent_run_path(run.id).write_text(run.model_dump_json(indent=2), encoding="utf-8")
+    AGENT_RUNS[run.id] = run
+
+
+def agent_log(
+    run: AgentRun,
+    message: str,
+    *,
+    stage: str | None = None,
+    progress: int | None = None,
+    status: Literal["draft", "queued", "executing", "reviewing", "completed", "failed"] | None = None,
+    level: Literal["info", "warn", "error"] = "info",
+) -> None:
+    if stage is not None:
+        run.stage = stage
+    if progress is not None:
+        run.progress = max(0, min(100, int(progress)))
+    if status is not None:
+        run.status = status
+    run.logs = (run.logs + [AgentRunLog(ts=time.time(), level=level, message=message)])[-240:]
+    save_agent_run(run)
+
+
+async def save_agent_product_upload(job_id: str, upload: UploadFile, index: int) -> dict:
+    if not upload.filename:
+        raise HTTPException(400, "product image filename required")
+    content_type = (upload.content_type or "").lower()
+    suffix = Path(upload.filename).suffix.lower()
+    if content_type and not content_type.startswith("image/"):
+        raise HTTPException(400, f"product image must be image/*, got {content_type}")
+    if not content_type and suffix not in {".jpg", ".jpeg", ".png", ".webp", ".bmp"}:
+        raise HTTPException(400, f"unsupported product image: {suffix}")
+
+    out_dir = job_dir(job_id) / "assets"
+    out_dir.mkdir(parents=True, exist_ok=True)
+    asset_id = uuid.uuid4().hex[:12]
+    tmp = out_dir / f"{asset_id}.upload"
+    out = out_dir / f"{asset_id}.jpg"
+    try:
+        await _save_upload_to_path(upload, tmp)
+        meta = normalize_product_asset_image(tmp, out)
+    except Exception as e:
+        try:
+            out.unlink()
+        except OSError:
+            pass
+        raise HTTPException(400, f"product upload failed: {e}")
+    finally:
+        try:
+            tmp.unlink()
+        except OSError:
+            pass
+    return {
+        "kind": "asset",
+        "frame_idx": -1,
+        "element_id": asset_id,
+        "cutout_id": asset_id,
+        "label": f"用户产品图 {index} · {upload.filename}",
+        "asset_meta": meta,
+    }
+
+
+def agent_fallback_product_refs(job_id: str) -> list[dict]:
+    refs: list[dict] = []
+    for product_id in AGENT_DEFAULT_PRODUCT_IDS:
+        try:
+            refs.append(copy_product_library_asset(job_id, CopyProductLibraryAssetReq(product_id=product_id)))
+        except Exception:
+            continue
+    return refs
+
+
+def agent_subject_refs(job_id: str) -> list[dict]:
+    try:
+        payload = copy_character_library_assets(job_id, CopyCharacterLibraryAssetReq(character_id=AGENT_DEFAULT_CHARACTER_ID))
+    except Exception:
+        return []
+    images = payload.get("images") or []
+    preferred = []
+    for ref in images:
+        label = str(ref.get("label") or "")
+        if any(key in label for key in ("正面", "左45", "半身近景", "侧面")):
+            preferred.append(ref)
+    return (preferred or images)[:4]
+
+
+def agent_base_prompt() -> str:
+    return (
+        "Vertical 9:16 original SKG short-form ad. Do not copy the real person from the source video. "
+        "Use the provided transparent anatomy subject as the recurring character when a person is needed. "
+        "Use the provided SKG white U-shaped neck-and-shoulder massager product references as rigid product truth: "
+        "one clean U-shaped wearable device, silver contact pads, red heat/light accents, premium white shell, correct scale around the neck and shoulders. "
+        "No captions, no platform UI, no watermark, no medical treatment claims. Natural creator-demo pacing, clean premium lighting."
+    )
+
+
+def agent_shot_plan() -> list[dict]:
+    base = agent_base_prompt()
+    shots = [
+        ("hook", "Hook close-up: transparent anatomy character faces camera and raises the SKG neck-and-shoulder massager into the foreground, fast creator-ad opening energy, clean blue-white studio background."),
+        ("pain", "Pain-point scene: the character sits at a desk after long screen work, shoulders tense, then notices the SKG massager beside the laptop; show neck and shoulder area clearly."),
+        ("product_macro", "Macro product detail: slow moving close-up across the SKG U-shaped device, buttons, inner massage nodes, silver pads, premium white plastic and red heat accents."),
+        ("wear", "Wear demo: the character places the SKG U-shaped massager externally around the back of the neck and upper shoulders, hands guiding both arms into position."),
+        ("contact", "Heat/contact moment: close-up of silver massage pads aligned with side neck and upper trapezius, subtle red warmth glow, product outside the transparent body, no clipping."),
+        ("office_use", "Office use beat: the character works calmly at a desk while wearing the SKG massager, small relief gesture, device stable and visible around neck and shoulders."),
+        ("living_room", "Comfort beat: relaxed home setting, character leans back slightly, SKG device running, premium wellness mood, smooth gentle camera drift."),
+        ("angle_proof", "Product angle proof: clean tabletop shot with the SKG U-shaped massager rotating or being lifted by hand, show thickness, contact pads, seams, and control button."),
+        ("mobility", "Daily mobility scene: character walks from desk to sofa wearing the SKG massager, lightweight lifestyle demonstration, product silhouette remains accurate."),
+        ("benefit", "Benefit visualization: transparent anatomy view emphasizes neck and shoulder contact zones with tasteful red warmth accents while the device stays opaque and external."),
+        ("packaging", "Brand proof shot: SKG product and packaging on a clean surface, hand picks up the device, premium white product photography look, no extra text overlays."),
+        ("cta", "Ending CTA: character faces camera wearing the SKG massager, then the final frame lands on a clean product hero angle with confident premium ad finish."),
+    ]
+    return [{"key": key, "prompt": f"{base}\n\nShot direction: {text}"} for key, text in shots[:AGENT_SHOT_COUNT]]
+
+
+def agent_reference_for_shot(shot_key: str, product_refs: list[dict], subject_refs: list[dict]) -> tuple[dict | None, str]:
+    product_first = {"product_macro", "angle_proof", "packaging"}
+    if shot_key in product_first and product_refs:
+        return product_refs[min(2, len(product_refs) - 1)], "reference_image"
+    if subject_refs:
+        if shot_key in {"contact", "benefit"} and len(subject_refs) > 1:
+            return subject_refs[min(1, len(subject_refs) - 1)], "reference_image"
+        return subject_refs[0], "reference_image"
+    if product_refs:
+        return product_refs[0], "reference_image"
+    return None, "reference_image"
+
+
+def agent_get_video(job_id: str, video_id: str) -> GeneratedVideo | None:
+    job = JOBS.get(job_id)
+    if not job:
+        return None
+    return next((item for item in job.generated_videos if item.id == video_id), None)
+
+
+def agent_wait_videos(run: AgentRun, ids: list[str], *, target_completed: int) -> list[str]:
+    deadline = time.time() + AGENT_VIDEO_TIMEOUT_SECONDS
+    last_summary = ""
+    while time.time() < deadline:
+        completed: list[str] = []
+        active = 0
+        failed = 0
+        for video_id in ids:
+            item = agent_get_video(run.job_id, video_id)
+            if not item:
+                active += 1
+                continue
+            if item.status == "completed" and item.url:
+                completed.append(video_id)
+            elif item.status == "failed":
+                failed += 1
+            else:
+                active += 1
+        summary = f"视频生成中 · 完成 {len(completed)}/{target_completed} · 运行 {active} · 失败 {failed}"
+        if summary != last_summary:
+            agent_log(run, summary, stage="execute", progress=58 + min(24, len(completed) * 2))
+            last_summary = summary
+        if len(completed) >= target_completed or active == 0:
+            return completed
+        time.sleep(6)
+    return [video_id for video_id in ids if (item := agent_get_video(run.job_id, video_id)) and item.status == "completed" and item.url]
+
+
+def agent_submit_shot(
+    run: AgentRun,
+    frame: KeyFrame,
+    shot: dict,
+    product_refs: list[dict],
+    subject_refs: list[dict],
+    retry: int = 0,
+) -> str:
+    first_ref, primary_role = agent_reference_for_shot(str(shot["key"]), product_refs, subject_refs)
+    if not first_ref:
+        raise RuntimeError("no reference image available for video generation")
+    job = JOBS[run.job_id]
+    prompt = str(shot["prompt"])
+    if retry:
+        prompt += f"\n\nRetry pass {retry}: keep the same idea but simplify motion, keep the product shape stable, avoid strange anatomy or deformed product."
+    req = GenerateStoryboardVideoReq(
+        prompt=prompt,
+        duration=AGENT_SHOT_DURATION_SECONDS,
+        count=1,
+        storyboard_row_idx=len(run.video_ids),
+        first_image=first_ref,
+        product_images=product_refs[:6],
+        subject_images=subject_refs[:4],
+        model="seedance",
+        size="720x1280",
+    )
+    # _enqueue_storyboard_videos derives the primary role from first_image, so
+    # primary_role is unused for now; it is kept only for later provider-specific tuning.
+    _ = primary_role
+    ids = _enqueue_storyboard_videos(job, frame, req, None)
+    return ids[0]
+
+
+def agent_compose_final(agent: AgentRun, ordered_ids: list[str]) -> None:
+    d = agent_run_dir(agent.id)
+    d.mkdir(parents=True, exist_ok=True)
+    final_dir = job_dir(agent.job_id) / "final"
+    final_dir.mkdir(parents=True, exist_ok=True)
+    final = final_dir / f"agent-{agent.id}.mp4"
+    concat_file = d / "concat.txt"
+    paths: list[Path] = []
+    for video_id in ordered_ids:
+        p = job_dir(agent.job_id) / "storyboard_videos" / video_id / "video.mp4"
+        if p.exists() and p.stat().st_size > 0:
+            paths.append(p.resolve())
+    if not paths:
+        raise RuntimeError("no completed video files to compose")
+    concat_file.write_text("".join(f"file '{str(p).replace(chr(39), chr(39) + chr(92) + chr(39) + chr(39))}'\n" for p in paths), encoding="utf-8")
+    try:
+        run_cmd = ["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", str(concat_file), "-c", "copy", "-movflags", "+faststart", str(final)]
+        run(run_cmd)
+    except Exception:
+        run_cmd = [
+            "ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", str(concat_file),
+            "-vf", "scale=720:1280,setsar=1", "-r", "24", "-c:v", "mpeg4", "-q:v", "4",
+            "-c:a", "aac", "-b:a", "160k", "-movflags", "+faststart", str(final),
+        ]
+        run(run_cmd)
+    contact = d / "contact.jpg"
+    try:
+        run([
+            "ffmpeg", "-y", "-i", str(final),
+            "-vf", "select='not(mod(n,120))',scale=180:320,tile=12x1",
+            "-frames:v", "1", str(contact),
+        ])
+        agent.contact_sheet_url = f"/agent-runs/{agent.id}/contact.jpg"
+    except Exception as e:
+        agent_log(agent, f"抽帧审片图生成失败：{str(e)[:180]}", level="warn")
+    agent.final_video_url = f"/agent-runs/{agent.id}/final.mp4"
+    save_agent_run(agent)
+
+
+def agent_run_worker(run_id: str, product_refs: list[dict]) -> None:
+    run = AGENT_RUNS[run_id]
+    try:
+        agent_log(run, "接管任务：创建 1 分钟二创出片流程", status="executing", stage="download", progress=4)
+        pipeline_download(run.job_id)
+        job = JOBS[run.job_id]
+        if job.status == "failed":
+            raise RuntimeError(job.error or job.message or "source video download failed")
+        agent_log(run, f"源视频就绪 · {job.duration:.1f}s · {job.width}x{job.height}", stage="download", progress=14)
+
+        refs = product_refs[:6] or agent_fallback_product_refs(run.job_id)
+        if not refs:
+            raise RuntimeError("需要至少 1 张产品图")
+        update(job, product_refs=refs, message=f"Agent 已接入产品图 · {len(refs)} 张")
+        agent_log(run, f"产品素材就绪 · {len(refs)} 张", stage="assets", progress=20)
+
+        subject_refs = agent_subject_refs(run.job_id)
+        if subject_refs:
+            agent_log(run, f"主体参考就绪 · {len(subject_refs)} 张透明骨架角色", stage="assets", progress=24)
+        else:
+            agent_log(run, "未找到主体角色库，改用产品图和文本约束生成", stage="assets", progress=24, level="warn")
+
+        agent_log(run, "抽取源视频节奏帧 · 12 张", stage="analyze", progress=28)
+        pipeline_analyze(run.job_id, frame_count=12, target="transparent_human", mode="replace", quality="auto")
+        job = JOBS[run.job_id]
+        if not job.frames:
+            raise RuntimeError(job.error or "keyframe extraction failed")
+        agent_log(run, f"节奏帧完成 · {len(job.frames)} 张", stage="plan", progress=40)
+
+        shots = agent_shot_plan()
+        agent_log(run, f"生成二创镜头计划 · {len(shots)} 段 × {AGENT_SHOT_DURATION_SECONDS:g}s", stage="plan", progress=46)
+        submitted: list[str] = []
+        for idx, shot in enumerate(shots):
+            frame = job.frames[idx % len(job.frames)]
+            video_id = agent_submit_shot(run, frame, shot, refs, subject_refs)
+            submitted.append(video_id)
+            run.video_ids = submitted
+            save_agent_run(run)
+            agent_log(run, f"提交镜头 {idx + 1:02d}/{len(shots)} · {shot['key']} · {video_id}", stage="execute", progress=48 + idx)
+
+        completed = agent_wait_videos(run, submitted, target_completed=len(shots))
+        failed_positions = [i for i, video_id in enumerate(submitted) if video_id not in completed]
+        if failed_positions:
+            agent_log(run, f"有 {len(failed_positions)} 段未完成，自动重跑一次", stage="execute", progress=82, level="warn")
+        for pos in failed_positions:
+            frame = job.frames[pos % len(job.frames)]
+            retry_id = agent_submit_shot(run, frame, shots[pos], refs, subject_refs, retry=1)
+            submitted[pos] = retry_id
+            run.video_ids = submitted
+            save_agent_run(run)
+            agent_log(run, f"重跑镜头 {pos + 1:02d} · {retry_id}", stage="execute", progress=83)
+
+        completed = agent_wait_videos(run, submitted, target_completed=len(shots))
+        ordered_completed = [video_id for video_id in submitted if video_id in completed]
+        if len(ordered_completed) < min(len(shots), max(8, len(shots) - 2)):
+            raise RuntimeError(f"可用镜头不足：{len(ordered_completed)}/{len(shots)}")
+
+        agent_log(run, f"自动审片通过 · 可用 {len(ordered_completed)}/{len(shots)} 段", status="reviewing", stage="review", progress=88)
+        agent_log(run, "合成最终成片", stage="compose", progress=92)
+        agent_compose_final(run, ordered_completed)
+        agent_log(run, f"成片完成 · {len(ordered_completed)} 段", status="completed", stage="final", progress=100)
+    except Exception as e:
+        run.error = str(e)[:600]
+        agent_log(run, f"任务失败：{run.error}", status="failed", stage="failed", progress=100, level="error")
+
+
+@app.post("/agent-runs", response_model=AgentRun)
+async def create_agent_run(
+    tk_url: str = Form(...),
+    product_files: list[UploadFile] | None = File(None),
+) -> AgentRun:
+    if not tk_url.strip():
+        raise HTTPException(400, "tk_url required")
+    job_id = uuid.uuid4().hex[:12]
+    run_id = uuid.uuid4().hex[:12]
+    job = Job(id=job_id, url=tk_url.strip())
+    JOBS[job_id] = job
+    save_state(job)
+
+    refs: list[dict] = []
+    for index, upload in enumerate((product_files or [])[:6], start=1):
+        refs.append(await save_agent_product_upload(job_id, upload, index))
+
+    run = AgentRun(id=run_id, job_id=job_id, status="queued", stage="queued", progress=1)
+    save_agent_run(run)
+    agent_log(run, f"任务已入队 · job={job_id} · 产品图 {len(refs)} 张", status="queued", stage="queued", progress=1)
+    threading.Thread(target=agent_run_worker, args=(run_id, refs), daemon=True).start()
+    return run
+
+
+@app.get("/agent-runs", response_model=list[AgentRun])
+def list_agent_runs(limit: int = 20) -> list[AgentRun]:
+    for p in (AGENT_RUNS_DIR.iterdir() if AGENT_RUNS_DIR.is_dir() else ()):
+        if p.is_dir() and (p / "state.json").exists() and p.name not in AGENT_RUNS:
+            try:
+                AGENT_RUNS[p.name] = AgentRun.model_validate_json((p / "state.json").read_text(encoding="utf-8"))
+            except Exception:
+                pass
+    items = list(AGENT_RUNS.values())
+    items.sort(key=lambda item: item.updated_at, reverse=True)
+    return items[:max(1, min(100, limit))]
+
+
+@app.get("/agent-runs/{run_id}", response_model=AgentRun)
+def get_agent_run(run_id: str) -> AgentRun:
+    run = AGENT_RUNS.get(run_id)
+    if not run and agent_run_path(run_id).exists():
+        run = AgentRun.model_validate_json(agent_run_path(run_id).read_text(encoding="utf-8"))
+        AGENT_RUNS[run_id] = run
+    if not run:
+        raise HTTPException(404, "agent run not found")
+    return run
+
+
+@app.get("/agent-runs/{run_id}/final.mp4")
+def get_agent_run_final(run_id: str):
+    run = get_agent_run(run_id)
+    p = job_dir(run.job_id) / "final" / f"agent-{run.id}.mp4"
+    if not p.exists():
+        raise HTTPException(404, "final video not found")
+    return FileResponse(p, media_type="video/mp4")
+
+
+@app.get("/agent-runs/{run_id}/contact.jpg")
+def get_agent_run_contact(run_id: str):
+    p = agent_run_dir(run_id) / "contact.jpg"
+    if not p.exists():
+        raise HTTPException(404, "contact sheet not found")
+    return FileResponse(p, media_type="image/jpeg")
+
+
 def product_image_alpha(img: Image.Image) -> Image.Image:
    rgba = img.convert("RGBA")
    rgb = rgba.convert("RGB")
--- a/api/prompt_library/index.json
+++ b/api/prompt_library/index.json
@@ -0,0 +1,3 @@
+{
+  "items": []
+}
--- a/docker-compose.prod.yml
+++ b/docker-compose.prod.yml
@@ -10,11 +10,13 @@ services:
      - ./deploy/.env.production
    environment:
      JOBS_DIR: /data/jobs
+      AGENT_RUNS_DIR: /data/agent_runs
      ASSET_LIBRARY_DIR: /data/asset_library
      PROMPT_LIBRARY_DIR: /data/prompt_library
      CORS_ORIGINS: https://marketing.skg.com
    volumes:
      - ./data/jobs:/data/jobs
+      - ./data/agent_runs:/data/agent_runs
      - ./data/asset_library:/data/asset_library
      - ./data/prompt_library:/data/prompt_library
      - ./data/_trash:/data/_trash
--- a/docker-compose.standalone.yml
+++ b/docker-compose.standalone.yml
@@ -0,0 +1,47 @@
+name: skg-agent-cut
+
+services:
+  api:
+    build:
+      context: .
+      dockerfile: Dockerfile.api
+    container_name: skg-agent-api
+    env_file:
+      - ./deploy/.env.production
+    environment:
+      JOBS_DIR: /data/jobs
+      AGENT_RUNS_DIR: /data/agent_runs
+      ASSET_LIBRARY_DIR: /data/asset_library
+      PROMPT_LIBRARY_DIR: /data/prompt_library
+      CORS_ORIGINS: http://2.24.28.41:4290,http://localhost:4290
+    volumes:
+      - ./data/jobs:/data/jobs
+      - ./data/agent_runs:/data/agent_runs
+      - ./data/asset_library:/data/asset_library
+      - ./data/prompt_library:/data/prompt_library
+      - ./data/_trash:/data/_trash
+      - ./secrets/tiktok_cookies.txt:/run/secrets/tiktok_cookies.txt
+    restart: unless-stopped
+    networks:
+      skg-agent-internal:
+        aliases:
+          - skg-marketing-api
+
+  web:
+    build:
+      context: .
+      dockerfile: Dockerfile.web
+      args:
+        NEXT_PUBLIC_API_BASE: /api
+    container_name: skg-agent-web
+    depends_on:
+      - api
+    ports:
+      - "0.0.0.0:4290:80"
+    restart: unless-stopped
+    networks:
+      - skg-agent-internal
+
+networks:
+  skg-agent-internal:
+    name: skg-agent-internal
--- a/docs/source-analysis.html
+++ b/docs/source-analysis.html
--- a/web/app/agent/page.tsx
+++ b/web/app/agent/page.tsx
@@ -0,0 +1,332 @@
+"use client"
+
+import { useEffect, useMemo, useRef, useState } from "react"
+import {
+  ArrowDownToLine,
+  CheckCircle2,
+  CircleAlert,
+  Film,
+  ImagePlus,
+  Link2,
+  Loader2,
+  Play,
+  RotateCcw,
+  TerminalSquare,
+  Upload,
+} from "lucide-react"
+
+const API_BASE = process.env.NEXT_PUBLIC_API_BASE ?? "http://localhost:4291"
+
+type AgentRunLog = {
+  ts: number
+  level: "info" | "warn" | "error"
+  message: string
+}
+
+type AgentRun = {
+  id: string
+  job_id: string
+  status: "draft" | "queued" | "executing" | "reviewing" | "completed" | "failed"
+  stage: string
+  progress: number
+  logs: AgentRunLog[]
+  video_ids: string[]
+  final_video_url: string
+  contact_sheet_url: string
+  error: string
+  created_at: number
+  updated_at: number
+}
+
+const STAGES = [
+  { key: "download", label: "下载" },
+  { key: "assets", label: "素材" },
+  { key: "analyze", label: "拆解" },
+  { key: "plan", label: "规划" },
+  { key: "execute", label: "生成" },
+  { key: "review", label: "审片" },
+  { key: "compose", label: "合成" },
+  { key: "final", label: "成片" },
+]
+
+function formatClock(ts: number) {
+  if (!ts) return "--:--:--"
+  return new Date(ts * 1000).toLocaleTimeString("zh-CN", { hour12: false })
+}
+
+function runVideoUrl(run: AgentRun | null) {
+  if (!run?.final_video_url) return ""
+  return `${API_BASE}${run.final_video_url}`
+}
+
+function runContactUrl(run: AgentRun | null) {
+  if (!run?.contact_sheet_url) return ""
+  return `${API_BASE}${run.contact_sheet_url}`
+}
+
+export default function AgentPage() {
+  const [url, setUrl] = useState("")
+  const [files, setFiles] = useState<File[]>([])
+  const [run, setRun] = useState<AgentRun | null>(null)
+  const [recent, setRecent] = useState<AgentRun[]>([])
+  const [submitting, setSubmitting] = useState(false)
+  const [error, setError] = useState("")
+  const terminalRef = useRef<HTMLDivElement>(null)
+
+  const previews = useMemo(() => files.map((file) => ({ file, url: URL.createObjectURL(file) })), [files])
+  useEffect(() => () => previews.forEach((item) => URL.revokeObjectURL(item.url)), [previews])
+
+  useEffect(() => {
+    fetch(`${API_BASE}/agent-runs?limit=8`, { cache: "no-store" })
+      .then((res) => (res.ok ? res.json() : []))
+      .then((items: AgentRun[]) => {
+        setRecent(items)
+        const latest = items.find((item) => item.status === "executing" || item.status === "reviewing" || item.status === "completed")
+        if (latest) setRun(latest)
+      })
+      .catch(() => undefined)
+  }, [])
+
+  useEffect(() => {
+    if (!run || run.status === "completed" || run.status === "failed") return
+    const timer = window.setInterval(async () => {
+      try {
+        const res = await fetch(`${API_BASE}/agent-runs/${run.id}`, { cache: "no-store" })
+        if (!res.ok) return
+        const next = await res.json()
+        setRun(next)
+      } catch {
+        /* keep current state */
+      }
+    }, 2000)
+    return () => window.clearInterval(timer)
+  }, [run?.id, run?.status])
+
+  useEffect(() => {
+    const el = terminalRef.current
+    if (el) el.scrollTop = el.scrollHeight
+  }, [run?.logs.length])
+
+  async function submit() {
+    setError("")
+    if (!url.trim()) {
+      setError("需要 TikTok 链接")
+      return
+    }
+    setSubmitting(true)
+    try {
+      const form = new FormData()
+      form.append("tk_url", url.trim())
+      files.slice(0, 6).forEach((file) => form.append("product_files", file))
+      const res = await fetch(`${API_BASE}/agent-runs`, { method: "POST", body: form })
+      if (!res.ok) {
+        const text = await res.text().catch(() => "")
+        throw new Error(text.slice(0, 260) || `HTTP ${res.status}`)
+      }
+      const created = await res.json()
+      setRun(created)
+      setRecent((prev) => [created, ...prev.filter((item) => item.id !== created.id)].slice(0, 8))
+    } catch (e) {
+      setError(e instanceof Error ? e.message : String(e))
+    } finally {
+      setSubmitting(false)
+    }
+  }
+
+  const activeStageIndex = run ? Math.max(0, STAGES.findIndex((item) => item.key === run.stage)) : -1
+  const canStart = !!url.trim() && !submitting
+  const videoSrc = runVideoUrl(run)
+  const contactSrc = runContactUrl(run)
+
+  return (
+    <main className="min-h-screen bg-[#f3f4f7] text-[#111318]">
+      <div className="mx-auto flex min-h-screen w-full max-w-[1720px] flex-col gap-5 px-5 py-5">
+        <header className="flex items-center justify-between rounded-[28px] border border-black/5 bg-white/80 px-5 py-4 shadow-[0_24px_80px_rgba(20,25,38,0.08)] backdrop-blur-xl">
+          <div>
+            <div className="text-[12px] font-semibold uppercase tracking-[0.18em] text-[#7b8190]">SKG Agent Cut</div>
+            <h1 className="mt-1 text-[26px] font-semibold tracking-normal text-[#111318]">一分钟二创出片终端</h1>
+          </div>
+          <div className="hidden items-center gap-2 rounded-full bg-[#111318] px-3 py-2 text-[12px] font-medium text-white md:flex">
+            <TerminalSquare className="h-4 w-4 text-[#81d4ff]" />
+            {run ? `${run.status} · ${run.progress}%` : "standby"}
+          </div>
+        </header>
+
+        <section className="grid min-h-[calc(100vh-128px)] grid-cols-1 gap-5 xl:grid-cols-[390px_minmax(520px,1fr)_420px]">
+          <aside className="flex flex-col gap-4 rounded-[30px] border border-black/5 bg-white/85 p-4 shadow-[0_24px_80px_rgba(20,25,38,0.08)] backdrop-blur-xl">
+            <div className="rounded-[24px] border border-[#dfe3ea] bg-[#f8f9fb] p-4">
+              <label className="mb-2 flex items-center gap-2 text-[13px] font-semibold text-[#2b3038]">
+                <Link2 className="h-4 w-4 text-[#0a84ff]" />
+                TikTok 链接
+              </label>
+              <textarea
+                value={url}
+                onChange={(e) => setUrl(e.target.value)}
+                placeholder="https://www.tiktok.com/@..."
+                className="h-28 w-full resize-none rounded-[18px] border border-[#d9dee8] bg-white px-4 py-3 text-[14px] leading-relaxed text-[#111318] outline-none transition focus:border-[#0a84ff] focus:ring-4 focus:ring-[#0a84ff]/10"
+              />
+            </div>
+
+            <div className="rounded-[24px] border border-[#dfe3ea] bg-[#f8f9fb] p-4">
+              <label className="mb-3 flex items-center gap-2 text-[13px] font-semibold text-[#2b3038]">
+                <ImagePlus className="h-4 w-4 text-[#34c759]" />
+                产品图
+              </label>
+              <label className="flex h-32 cursor-pointer flex-col items-center justify-center rounded-[20px] border border-dashed border-[#c7ceda] bg-white text-center transition hover:border-[#0a84ff] hover:bg-[#f7fbff]">
+                <Upload className="mb-2 h-6 w-6 text-[#7b8190]" />
+                <span className="text-[13px] font-medium text-[#2b3038]">上传产品图</span>
+                <span className="mt-1 text-[12px] text-[#7b8190]">最多 6 张</span>
+                <input
+                  type="file"
+                  accept="image/*"
+                  multiple
+                  className="hidden"
+                  onChange={(e) => {
+                    const next = Array.from(e.target.files ?? []).slice(0, 6)
+                    setFiles(next)
+                  }}
+                />
+              </label>
+              {previews.length > 0 && (
+                <div className="mt-3 grid grid-cols-3 gap-2">
+                  {previews.map((item) => (
+                    <div key={`${item.file.name}-${item.file.size}`} className="aspect-square overflow-hidden rounded-[14px] border border-black/5 bg-white">
+                      <img src={item.url} alt={item.file.name} className="h-full w-full object-contain" />
+                    </div>
+                  ))}
+                </div>
+              )}
+            </div>
+
+            {error && (
+              <div className="rounded-[18px] border border-[#ff453a]/20 bg-[#ff453a]/10 px-4 py-3 text-[13px] text-[#9f1d17]">
+                {error}
+              </div>
+            )}
+
+            <button
+              type="button"
+              disabled={!canStart}
+              onClick={submit}
+              className="flex h-14 items-center justify-center gap-2 rounded-[20px] bg-[#111318] text-[15px] font-semibold text-white shadow-[0_16px_40px_rgba(17,19,24,0.18)] transition hover:bg-black disabled:cursor-not-allowed disabled:bg-[#b8bec8]"
+            >
+              {submitting ? <Loader2 className="h-5 w-5 animate-spin" /> : <Play className="h-5 w-5" />}
+              开始出片
+            </button>
+
+            <div className="mt-auto rounded-[24px] border border-[#dfe3ea] bg-[#f8f9fb] p-3">
+              <div className="mb-2 text-[12px] font-semibold text-[#7b8190]">最近任务</div>
+              <div className="space-y-2">
+                {recent.slice(0, 4).map((item) => (
+                  <button
+                    key={item.id}
+                    type="button"
+                    onClick={() => setRun(item)}
+                    className="flex w-full items-center justify-between rounded-[16px] bg-white px-3 py-2 text-left text-[12px] text-[#2b3038] transition hover:bg-[#f1f5fb]"
+                  >
+                    <span className="font-medium">{item.id}</span>
+                    <span className="text-[#7b8190]">{item.status}</span>
+                  </button>
+                ))}
+              </div>
+            </div>
+          </aside>
+
+          <section className="flex min-h-[680px] flex-col rounded-[30px] border border-black/5 bg-[#111318] p-4 shadow-[0_24px_80px_rgba(20,25,38,0.16)]">
+            <div className="mb-4 flex items-center justify-between">
+              <div className="flex items-center gap-3">
+                <div className="flex h-10 w-10 items-center justify-center rounded-[16px] bg-white/8">
+                  <TerminalSquare className="h-5 w-5 text-[#81d4ff]" />
+                </div>
+                <div>
+                  <h2 className="text-[16px] font-semibold text-white">Agent Terminal</h2>
+                  <p className="text-[12px] text-white/45">{run ? `run ${run.id} · job ${run.job_id}` : "waiting for input"}</p>
+                </div>
+              </div>
+              {run?.status === "failed" ? (
+                <CircleAlert className="h-5 w-5 text-[#ff453a]" />
+              ) : run?.status === "completed" ? (
+                <CheckCircle2 className="h-5 w-5 text-[#34c759]" />
+              ) : (
+                <Loader2 className={`h-5 w-5 text-[#81d4ff] ${run ? "animate-spin" : ""}`} />
+              )}
+            </div>
+
+            <div className="mb-4 grid grid-cols-4 gap-2 lg:grid-cols-8">
+              {STAGES.map((stage, index) => {
+                const active = index <= activeStageIndex || run?.status === "completed"
+                return (
+                  <div key={stage.key} className={`rounded-[14px] px-3 py-2 text-[12px] ${active ? "bg-white text-[#111318]" : "bg-white/6 text-white/40"}`}>
+                    {stage.label}
+                  </div>
+                )
+              })}
+            </div>
+
+            <div className="mb-4 h-2 overflow-hidden rounded-full bg-white/8">
+              <div className="h-full rounded-full bg-[#34c759] transition-all duration-700" style={{ width: `${run?.progress ?? 0}%` }} />
+            </div>
+
+            <div ref={terminalRef} className="min-h-0 flex-1 overflow-auto rounded-[22px] border border-white/8 bg-black px-4 py-4 font-mono text-[12px] leading-relaxed text-[#d8f3dc]">
+              {!run && <div className="text-white/35">$ idle</div>}
+              {run?.logs.map((log, index) => (
+                <div key={`${log.ts}-${index}`} className={log.level === "error" ? "text-[#ff8a80]" : log.level === "warn" ? "text-[#ffd166]" : "text-[#d8f3dc]"}>
+                  <span className="text-white/30">[{formatClock(log.ts)}]</span> {log.message}
+                </div>
+              ))}
+            </div>
+          </section>
+
+          <aside className="flex flex-col gap-4 rounded-[30px] border border-black/5 bg-white/85 p-4 shadow-[0_24px_80px_rgba(20,25,38,0.08)] backdrop-blur-xl">
+            <div className="flex items-center justify-between">
+              <div>
+                <div className="text-[12px] font-semibold uppercase tracking-[0.16em] text-[#7b8190]">Final</div>
+                <h2 className="mt-1 text-[18px] font-semibold text-[#111318]">成片播放器</h2>
+              </div>
+              <Film className="h-5 w-5 text-[#ff9f0a]" />
+            </div>
+
+            <div className="aspect-[9/16] overflow-hidden rounded-[26px] border border-black/8 bg-[#111318]">
+              {videoSrc ? (
+                <video key={videoSrc} src={videoSrc} controls playsInline className="h-full w-full bg-black object-contain" />
+              ) : (
+                <div className="flex h-full flex-col items-center justify-center gap-3 text-[#7b8190]">
+                  <Film className="h-8 w-8" />
+                  <span className="text-[13px]">等待成片</span>
+                </div>
+              )}
+            </div>
+
+            {contactSrc && (
+              <div className="overflow-hidden rounded-[18px] border border-black/8 bg-white">
+                <img src={contactSrc} alt="final contact sheet" className="w-full object-cover" />
+              </div>
+            )}
+
+            <div className="grid grid-cols-2 gap-2">
+              <a
+                href={videoSrc || undefined}
+                download
+                className={`flex h-11 items-center justify-center gap-2 rounded-[16px] text-[13px] font-semibold ${videoSrc ? "bg-[#0a84ff] text-white" : "pointer-events-none bg-[#dfe3ea] text-[#8d94a1]"}`}
+              >
+                <ArrowDownToLine className="h-4 w-4" />
+                下载
+              </a>
+              <button
+                type="button"
+                onClick={() => {
+                  setRun(null)
+                  setError("")
+                }}
+                className="flex h-11 items-center justify-center gap-2 rounded-[16px] bg-[#eef1f6] text-[13px] font-semibold text-[#2b3038] transition hover:bg-[#e3e7ef]"
+              >
+                <RotateCcw className="h-4 w-4" />
+                重来
+              </button>
+            </div>
+          </aside>
+        </section>
+      </div>
+    </main>
+  )
+}
--- a/web/components/ad-recreation-board.tsx
+++ b/web/components/ad-recreation-board.tsx
@@ -3,7 +3,7 @@
 import { type DragEvent as ReactDragEvent, type MouseEvent as ReactMouseEvent, type ReactNode, type RefObject, useEffect, useMemo, useRef, useState } from "react"
 import { createPortal } from "react-dom"
 import {
-  AlertTriangle, BookOpen, Check, ChevronDown, Circle, Film, FileText, Image as ImageIcon, Info, Link2, Loader2, Minus,
+  AlertTriangle, BookOpen, Check, ChevronDown, Circle, Download, Film, FileText, Image as ImageIcon, Info, Link2, Loader2, Minus,
  MessageSquare, Mic, Moon, Package, PanelRight, Play, Plus, RefreshCw, Scissors, Send, Sparkles, Sun, Trash2, Upload, Wand2,
 } from "lucide-react"
 import { toast } from "sonner"
@@ -863,6 +863,17 @@ function videoSrc(video: GeneratedVideo) {
  return apiAssetUrl(video.url)
 }

+function downloadMedia(url: string, filename: string) {
+  if (!url || typeof document === "undefined") return
+  const link = document.createElement("a")
+  link.href = url
+  link.download = filename
+  link.rel = "noreferrer"
+  document.body.appendChild(link)
+  link.click()
+  link.remove()
+}
+
 function audioPreview(job: Job | null) {
  if (!job) return "粘贴 TK 链接或上传视频后，系统会先下载视频；下载完成后自动提取音频文案。"
  const source = job.audio_script?.source_text?.trim() || job.audio_script?.source_zh?.trim()
@@ -1324,8 +1335,8 @@ function audioModelTrace(models?: RuntimeModels): ModelTraceSpec {
    title: "音频解析",
    model: modelList([models?.asr, models?.translate, models?.asr_fallback]),
    chain: [
-      `ASR 转写：远端 ${remoteState}，模型 ${modelValue(models?.asr)}${models?.asr_language ? `，语言 ${models.asr_language}` : ""}；本机转写 ${localState}，使用 ${localModel}；多模态兜底${models?.asr_audio_fallback_enabled === false ? "关闭" : `为 ${modelValue(models?.asr_fallback)}`}，并拒绝假字幕/重复时间轴`,
-      `字幕翻译：${modelValue(models?.translate)} 按 ASR 段落输出中文；失败时保留原文时间轴，中文可为空`,
+      `ASR 转写：远端 ${remoteState}，模型 ${modelValue(models?.asr)}，语言 ${models?.asr_language || "auto"}；本机转写 ${localState}，使用 ${localModel} 自动识别中文/多语言；多模态兜底${models?.asr_audio_fallback_enabled === false ? "关闭" : `为 ${modelValue(models?.asr_fallback)}`}，并拒绝假字幕/重复时间轴`,
+      `字幕翻译：${modelValue(models?.translate)} 按原语言 ASR 段落输出中文；原文已是中文时保留为中文镜像，失败时保留原文时间轴`,
      `讲话人 / 节奏 / 背景音：${modelValue(models?.asr_fallback)} 读取 audio.wav + 转写时间轴做多模态分析；失败时用本地时长/段落估算兜底`,
    ],
    note: "点击“解析音频”后触发；开始任务下载完成后也会自动走这条链路。",
@@ -1373,7 +1384,7 @@ function scriptRewriteModelTrace(models?: RuntimeModels): ModelTraceSpec {
    chain: [
      `主改写：${modelValue(models?.audio_rewrite)} 根据原文案、当前分镜、作者想法生成新口播`,
      `模型回退：依次尝试 ${modelValue(models?.asr_fallback)} 和 ${modelValue(models?.translate)}；全部失败时用本地模板保留分镜可编辑`,
-      "返回结果只写入当前分镜文案编辑框；点击保存规划后才写入 frame.storyboard.action",
+      "返回结果会先写入当前分镜文案编辑框；生成候选前会自动同步到分镜数据，无需单独保存视频",
    ],
  }
 }
@@ -1388,7 +1399,7 @@ function videoModelTrace(models: RuntimeModels | undefined, model: string): Mode
      `服务商：${modelValue(models?.video_provider)} · ${modelValue(models?.video_base_url)}`,
      "当前主工作台暂停直接提交视频；旧入口误触也会被页面层保护",
      "开放后输入会包含已确认首尾帧、当前分镜文案、产品素材、相似主体资产和画面规划",
-      "输出为异步候选视频，完成后回填到对应分镜行；Sora 已停用",
+      "输出为异步候选视频，提交后立即写入当前任务，完成后自动回填 mp4 到对应分镜行；Sora 已停用",
    ],
  }
 }
@@ -5584,23 +5595,6 @@ function AudioStoryboardPlanPanel({
    }
  }

-  const selectVideoForRow = async (row: AudioStoryboardRow, frame: KeyFrame | null, videoId: string) => {
-    if (!job || !frame) return
-    const plannedRow = { ...planForRow(row, frame), skgCopy: copyForRow(row), skgCopyZh: copyZhForRow(row) }
-    try {
-      const legacyRowIndex = legacyRowIndexForFrame(frame.index)
-      const savedSceneForRow = storyboardSceneBelongsToRow(frame.storyboard, row.index, legacyRowIndex)
-        ? frame.storyboard
-        : null
-      const scene = buildSceneForPlannedRow(plannedRow, frame, savedSceneForRow, videoId)
-      const updated = await updateStoryboard(job.id, frame.index, scene)
-      onJobUpdate?.(updated)
-      toast.success(`分镜 ${row.index + 1} 已选用该视频`)
-    } catch (e) {
-      toast.error("选用视频失败：" + (e instanceof Error ? e.message : String(e)))
-    }
-  }
-
  const clearVideosForRow = (videos: GeneratedVideo[]) => {
    if (!videos.length) return
    for (const video of videos) onDeleteVideo?.(video.id)
@@ -5904,9 +5898,9 @@ function AudioStoryboardPlanPanel({
    setStoryboardSaveBusyRow(row.index)
    try {
      await saveRowStoryboardDraft(row, frame)
-      toast.success("已保存本条三字段规划")
+      toast.success("本条三字段规划已同步")
    } catch (e) {
-      toast.error("保存本条规划失败：" + (e instanceof Error ? e.message : String(e)))
+      toast.error("同步本条规划失败：" + (e instanceof Error ? e.message : String(e)))
    } finally {
      setStoryboardSaveBusyRow(null)
    }
@@ -5918,7 +5912,7 @@ function AudioStoryboardPlanPanel({
      .map((row) => ({ row: planForRow(row, referenceFrameForRow(row)), frame: referenceFrameForRow(row) }))
      .filter((item): item is { row: AudioStoryboardRow; frame: KeyFrame } => !!item.frame)
    if (!jobsToSubmit.length) {
-      if (!quiet) toast.warning("先完成前置抽帧，让每条分镜有可保存的承载位置")
+      if (!quiet) toast.warning("先完成前置抽帧，让每条分镜有可同步的承载位置")
      return { ok: 0, failed: rows.length }
    }
    setBatchStoryboardSaveBusy(true)
@@ -5932,12 +5926,12 @@ function AudioStoryboardPlanPanel({
          ok += 1
        } catch (e) {
          failed += 1
-          console.warn("批量保存分镜规划失败", item.row.index, e)
+          console.warn("批量同步分镜规划失败", item.row.index, e)
        }
      }
      if (!quiet) {
-        if (failed) toast.warning(`已保存 ${ok} 条规划，${failed} 条失败`)
-        else toast.success(`已保存全部 ${ok} 条分镜规划`)
+        if (failed) toast.warning(`已同步 ${ok} 条规划，${failed} 条失败`)
+        else toast.success(`已同步全部 ${ok} 条分镜规划`)
      }
    } finally {
      setStoryboardSaveBusyRow(null)
@@ -6307,10 +6301,11 @@ function AudioStoryboardPlanPanel({
                  type="button"
                  onClick={() => void saveAllStoryboardDrafts()}
                  disabled={batchStoryboardSaveBusy || !rows.length || !orderedFrames.length}
-                  className="skg-primary-action inline-flex h-9 items-center justify-center gap-1 px-2.5 text-[11px] font-semibold transition disabled:cursor-not-allowed disabled:opacity-40"
+                  title="只同步三字段规划；生成的视频会自动保存到当前任务"
+                  className="skg-secondary-action inline-flex h-9 items-center justify-center gap-1 px-2.5 text-[11px] font-semibold transition disabled:cursor-not-allowed disabled:opacity-40"
                >
                  {batchStoryboardSaveBusy ? <Loader2 className="h-3.5 w-3.5 animate-spin" /> : <Check className="h-3.5 w-3.5" />}
-                  保存全部
+                  同步规划
                </button>
              </div>
            </div>
@@ -6468,7 +6463,6 @@ function AudioStoryboardPlanPanel({
                      job={job}
                      videos={rowVideos}
                      enabled={!!referenceFrame}
-                      selectedVideoId={selectedVideoIdForRow(row, referenceFrame)}
                      busy={quickVideoBusyRow === row.index}
                      count={rowVideoCount}
                      onCountChange={(count) => patchRowVideoCount(row.index, count)}
@@ -6476,7 +6470,6 @@ function AudioStoryboardPlanPanel({
                      onReroll={() => void drawVideosForRow(plannedRow, referenceFrame, rowVideoCount)}
                      onRegenerate={() => void drawVideosForRow(plannedRow, referenceFrame, 1)}
                      onClear={() => clearVideosForRow(rowVideos)}
-                      onSelect={(videoId) => void selectVideoForRow(plannedRow, referenceFrame, videoId)}
                      onDeleteVideo={onDeleteVideo}
                    />
                  </div>
@@ -6697,7 +6690,6 @@ function AudioStoryboardPlanPanel({
                    videos={rowVideos}
                    enabled={!!referenceFrame}
                    expanded={videosOpen}
-                    selectedVideoId={selectedVideoIdForRow(row, referenceFrame)}
                    busy={quickVideoBusyRow === row.index}
                    count={rowVideoCount}
                    onCountChange={(count) => patchRowVideoCount(row.index, count)}
@@ -6706,7 +6698,6 @@ function AudioStoryboardPlanPanel({
                    onReroll={() => void drawVideosForRow(plannedRow, referenceFrame, rowVideoCount)}
                    onRegenerate={() => void drawVideosForRow(plannedRow, referenceFrame, 1)}
                    onClear={() => clearVideosForRow(rowVideos)}
-                    onSelect={(videoId) => void selectVideoForRow(plannedRow, referenceFrame, videoId)}
                    onDeleteVideo={onDeleteVideo}
                  />
                  <div className="mt-1 flex items-center justify-between gap-2">
@@ -6720,10 +6711,11 @@ function AudioStoryboardPlanPanel({
                    type="button"
                    onClick={() => void saveSingleRowStoryboardDraft(plannedRow, referenceFrame)}
                    disabled={!referenceFrame || savingStoryboard}
-                    className="skg-primary-action mt-1.5 inline-flex h-8 w-full items-center justify-center gap-1 px-2 text-[11px] font-semibold transition disabled:cursor-not-allowed disabled:opacity-40"
+                    title="只同步本条三字段规划；生成的视频会自动保存到当前任务"
+                    className="mt-1.5 inline-flex h-8 w-full items-center justify-center gap-1 rounded-md border border-white/10 bg-white/[0.04] px-2 text-[11px] font-semibold text-white/58 transition hover:border-cyan-300/35 hover:text-cyan-100 disabled:cursor-not-allowed disabled:opacity-40"
                  >
                    {savingStoryboard ? <Loader2 className="h-3.5 w-3.5 animate-spin" /> : <Check className="h-3.5 w-3.5" />}
-                    保存本条规划
+                    同步本条规划
                  </button>
                </StoryboardPlanCell>
                </div>
@@ -7062,7 +7054,6 @@ function StoryboardVideoSlots({
  job,
  videos,
  enabled,
-  selectedVideoId = "",
  busy = false,
  count = 4,
  onCountChange,
@@ -7070,14 +7061,12 @@ function StoryboardVideoSlots({
  onReroll,
  onRegenerate,
  onClear,
-  onSelect,
  onDeleteVideo,
 }: {
  job: Job
  videos: GeneratedVideo[]
  enabled: boolean
  expanded?: boolean
-  selectedVideoId?: string
  busy?: boolean
  count?: number
  onCountChange?: (count: number) => void
@@ -7086,12 +7075,10 @@ function StoryboardVideoSlots({
  onReroll?: () => void
  onRegenerate?: () => void
  onClear?: () => void
-  onSelect?: (videoId: string) => void
  onDeleteVideo?: (videoId: string) => void
 }) {
  const visible = videos
  const runningCount = videos.filter((video) => video.status === "queued" || video.status === "in_progress").length
-  const selectedVideo = selectedVideoId ? videos.find((video) => video.id === selectedVideoId) : null
  const targetCount = clampVideoCount(count)
  const emptyCount = visible.length ? 0 : Math.max(1, targetCount)
  return (
@@ -7103,7 +7090,9 @@ function StoryboardVideoSlots({
          <span className="shrink-0 text-[10px] text-white/34">
            {videos.length ? `${videos.length} 条${runningCount ? ` · ${runningCount} 生成中` : ""}` : enabled ? "待生成" : "待抽帧"}
          </span>
-          {selectedVideo ? <span className="rounded border border-emerald-300/20 bg-emerald-300/[0.08] px-1.5 py-0.5 text-[10px] text-emerald-100/72">已选 {shortId(selectedVideo.id)}</span> : null}
+          {videos.length ? (
+            <span className="rounded border border-emerald-300/18 bg-emerald-300/[0.07] px-1.5 py-0.5 text-[10px] text-emerald-100/70">自动保存</span>
+          ) : null}
        </div>
        <div className="flex flex-wrap items-center gap-1.5">
          <label className="inline-flex h-7 items-center gap-1 rounded-md border border-white/10 bg-black/36 px-1.5 text-[10px] font-semibold text-white/48">
@@ -7146,9 +7135,7 @@ function StoryboardVideoSlots({
            key={video.id}
            job={job}
            video={video}
-            selected={selectedVideoId === video.id}
            className="h-[168px] w-[94px]"
-            onSelect={onSelect ? () => onSelect(video.id) : undefined}
            onRegenerate={onRegenerate}
            onDelete={onDeleteVideo ? () => onDeleteVideo(video.id) : undefined}
          />
@@ -7265,40 +7252,50 @@ function StoryboardVideoPreview({
  job,
  video,
  className = "h-20 w-12",
-  selected = false,
-  onSelect,
  onRegenerate,
  onDelete,
 }: {
  job: Job
  video: GeneratedVideo
  className?: string
-  selected?: boolean
-  onSelect?: () => void
  onRegenerate?: () => void
  onDelete?: () => void
 }) {
  const src = videoSrc(video)
+  const playableSrc = src && video.status === "completed" ? src : ""
  const poster = videoPoster(job, video)
  const running = video.status === "queued" || video.status === "in_progress"
  return (
    <MediaAssetTile
      kind="video"
-      src={src && video.status === "completed" ? src : undefined}
+      src={playableSrc || undefined}
      poster={poster}
-      href={onSelect ? undefined : src || undefined}
+      href={playableSrc || undefined}
      alt={`片段 ${shortId(video.id)}`}
      label={`${shortId(video.id)} · ${video.model}`}
      meta={video.status}
      className={`shrink-0 bg-black/45 ${className}`}
      objectFit="cover"
-      selected={selected}
-      onClick={onSelect}
-      title={`${video.model} · ${video.status}`}
+      title={playableSrc ? "点击打开视频预览" : `${video.model} · ${video.status}`}
      bottom={<span className="block truncate rounded bg-black/70 px-1 py-0.5 text-center font-mono text-[9px] text-white/62">{running ? "生成中" : video.status === "failed" ? "失败" : shortId(video.id)}</span>}
-      topLeft={selected ? <span className="inline-flex h-5 w-5 items-center justify-center rounded-full bg-emerald-400 text-black"><Check className="h-3 w-3" /></span> : undefined}
      topRight={running ? <Loader2 className="h-3 w-3 animate-spin text-cyan-100" /> : undefined}
-      actions={onRegenerate ? [{ key: "regen", label: "重生一个候选", icon: <RefreshCw className="h-3 w-3" />, onClick: onRegenerate, tone: "cyan" }] : []}
+      actions={[
+        ...(playableSrc ? [{
+          key: "download",
+          label: "下载视频",
+          icon: <Download className="h-3 w-3" />,
+          onClick: () => downloadMedia(playableSrc, `skg-storyboard-${shortId(video.id)}.mp4`),
+          tone: "cyan" as const,
+        }] : []),
+        ...(onRegenerate ? [{
+          key: "regen",
+          label: "重生一个候选",
+          icon: <RefreshCw className="h-3 w-3" />,
+          onClick: onRegenerate,
+          tone: "neutral" as const,
+        }] : []),
+      ]}
+      actionsAlwaysVisible={!!playableSrc}
      onDelete={onDelete}
      deleteLabel="删除这个视频候选"
    />
@@ -7921,13 +7918,13 @@ function ComposeSummary({
        <PanelRight className="h-4 w-4 text-rose-200" />
        <div>
          <div className="text-[13px] font-semibold text-white">完整视频合成</div>
-          <div className="text-[11px] text-white/40">音频和已选分镜视频将合成完整广告，接口待接入。</div>
+          <div className="text-[11px] text-white/40">音频和已生成候选视频将合成完整广告，接口待接入。</div>
        </div>
      </div>
      <div className="flex items-center gap-2 text-[11px] text-white/52">
        <Requirement label="音频" ready={audioReady} detail={audioReady ? "已生成" : "待解析"} />
        <Requirement label="候选" ready={generatedVideoCount > 0} detail={`${generatedVideoCount}`} />
-        <Requirement label="已选" ready={selectedVideoCount > 0} detail={`${selectedVideoCount}`} />
+        <Requirement label="自动存" ready={generatedVideoCount > 0} detail={generatedVideoCount > 0 ? "已保存" : "待候选"} />
        <button type="button" disabled className="inline-flex h-10 cursor-not-allowed items-center justify-center gap-2 rounded-md border border-white/10 bg-white/[0.04] px-3 text-[12px] font-semibold text-white/34">
          <Film className="h-4 w-4" />
          合成完整视频
@@ -8177,19 +8174,31 @@ function VideoCandidate({
  const src = videoSrc(video)
  const poster = videoPoster(job, video)
  const running = video.status === "queued" || video.status === "in_progress"
+  const playableSrc = src && video.status === "completed" ? src : ""
+  const thumb = (
+    <>
+      {playableSrc ? (
+        <video src={playableSrc} poster={poster} muted playsInline className="h-full w-full object-cover" />
+      ) : poster ? (
+        <img src={poster} alt={`片段 ${shortId(video.id)}`} className="h-full w-full object-cover opacity-80" />
+      ) : (
+        <div className="flex h-full w-full items-center justify-center text-white/30"><Film className="h-4 w-4" /></div>
+      )}
+      <div className="absolute right-1 top-1 rounded-full bg-black/70 p-0.5">{selected ? <Check className="h-3 w-3 text-rose-200" /> : <Circle className="h-3 w-3 text-white/55" />}</div>
+    </>
+  )
  return (
    <div className={`rounded-lg border p-2 transition ${selected ? "border-rose-400/70 bg-rose-500/10" : "border-white/10 bg-black/30"}`}>
      <div className="flex gap-2">
-        <button type="button" onClick={onToggle} className="relative h-24 w-14 shrink-0 overflow-hidden rounded-md border border-white/10 bg-black">
-          {src && video.status === "completed" ? (
-            <video src={src} poster={poster} muted playsInline className="h-full w-full object-cover" />
-          ) : poster ? (
-            <img src={poster} alt={`片段 ${shortId(video.id)}`} className="h-full w-full object-cover opacity-80" />
-          ) : (
-            <div className="flex h-full w-full items-center justify-center text-white/30"><Film className="h-4 w-4" /></div>
-          )}
-          <div className="absolute right-1 top-1 rounded-full bg-black/70 p-0.5">{selected ? <Check className="h-3 w-3 text-rose-200" /> : <Circle className="h-3 w-3 text-white/55" />}</div>
-        </button>
+        {playableSrc ? (
+          <a href={playableSrc} target="_blank" rel="noreferrer" className="relative h-24 w-14 shrink-0 overflow-hidden rounded-md border border-white/10 bg-black" title="打开视频预览">
+            {thumb}
+          </a>
+        ) : (
+          <div className="relative h-24 w-14 shrink-0 overflow-hidden rounded-md border border-white/10 bg-black">
+            {thumb}
+          </div>
+        )}
        <div className="min-w-0 flex-1">
          <div className="flex items-center justify-between gap-2">
            <div className="truncate font-mono text-[12px] text-white/80">{shortId(video.id)} · {video.model}</div>
@@ -8204,11 +8213,17 @@ function VideoCandidate({
            <span>{video.progress}%</span>
          </div>
          {video.error && <div className="mt-1 line-clamp-2 text-[11px] text-rose-200/80">{video.error}</div>}
-          {src && video.status === "completed" && (
-            <a href={src} target="_blank" rel="noreferrer" className="mt-2 inline-flex items-center gap-1 text-[11px] font-medium text-cyan-200 hover:text-cyan-100">
-              <Play className="h-3 w-3" />
-              预览片段
-            </a>
+          {playableSrc && (
+            <div className="mt-2 flex flex-wrap items-center gap-2">
+              <a href={playableSrc} target="_blank" rel="noreferrer" className="inline-flex items-center gap-1 text-[11px] font-medium text-cyan-200 hover:text-cyan-100">
+                <Play className="h-3 w-3" />
+                预览片段
+              </a>
+              <a href={playableSrc} download={`skg-storyboard-${shortId(video.id)}.mp4`} className="inline-flex items-center gap-1 text-[11px] font-medium text-emerald-200 hover:text-emerald-100">
+                <Download className="h-3 w-3" />
+                下载
+              </a>
+            </div>
          )}
        </div>
      </div>
--- a/web/components/audio-strip.tsx
+++ b/web/components/audio-strip.tsx
@@ -319,7 +319,7 @@ export function AudioStrip({ job, open, onClose }: { job: Job | null; open: bool
                </div>
            ) : (
              <div className="flex h-full items-center justify-center rounded-lg border border-dashed border-white/12 text-[12px] text-white/45">
-                点击音频卡片后开始解析；完成后这里会按时间显示英文、中文翻译和对应波形。
+                点击音频卡片后开始解析；完成后这里会按时间显示原语言文案、中文镜像和对应波形。
              </div>
            )}
            </div>
--- a/web/components/media-asset-tile.tsx
+++ b/web/components/media-asset-tile.tsx
@@ -46,6 +46,7 @@ type MediaAssetTileProps = {
  deleting?: boolean
  deleteDisabled?: boolean
  actions?: MediaAssetAction[]
+  actionsAlwaysVisible?: boolean
  disablePreview?: boolean
 }

@@ -106,6 +107,7 @@ export function MediaAssetTile({
  deleting = false,
  deleteDisabled = false,
  actions = [],
+  actionsAlwaysVisible = false,
  disablePreview = false,
 }: MediaAssetTileProps) {
  const [position, setPosition] = useState<{ left: number; top: number; width: number } | null>(null)
@@ -200,7 +202,7 @@ export function MediaAssetTile({
      {topRight ? <div className="pointer-events-none absolute right-1 top-1 z-10">{topRight}</div> : null}
      {bottom ? <div className="pointer-events-none absolute bottom-1 left-1 right-1 z-10">{bottom}</div> : null}
      {(actions.length || onDelete) ? (
-        <div className="absolute right-1 top-1 z-20 flex flex-col gap-0.5 opacity-0 transition group-hover:opacity-100 group-focus-within:opacity-100">
+        <div className={`absolute right-1 top-1 z-20 flex flex-col gap-0.5 transition ${actionsAlwaysVisible ? "opacity-100" : "opacity-0 group-hover:opacity-100 group-focus-within:opacity-100"}`}>
          {actions.map((action) => (
            <button
              key={action.key}
--- a/web/components/nodes/index.tsx
+++ b/web/components/nodes/index.tsx
@@ -2017,7 +2017,7 @@ export function ASRNode({ data, selected }: any) {
      onTogglePin={() => d.onToggleNodePin?.("asr")}
    >
      <div className="text-[11.5px] text-[var(--text-soft)]">
-        OpenAI-compatible ASR · 英文带时间戳分段
+        OpenAI-compatible ASR · 原语言带时间戳分段
      </div>
      {d.job && d.job.transcript.length > 0 && (
        <div className="mt-2 max-h-24 overflow-y-auto text-[11px] space-y-1 text-[var(--text-strong)]">
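Note on the storyboard preview change above: the diff gates both the preview link and the download action on a single `playableSrc` value, and assembles the tile actions conditionally. The following is a minimal standalone sketch of that pattern, not the project's actual code. `VideoLike`, `TileAction`, `playableSrcFor`, and `buildTileActions` are hypothetical names, the labels are in English for readability, and the sketch uses the raw `video.id` where the diff uses `shortId(video.id)`.

```typescript
// Hypothetical, simplified shapes; the real GeneratedVideo type has more fields.
type VideoStatus = "queued" | "in_progress" | "completed" | "failed"

interface VideoLike {
  id: string
  status: VideoStatus
  url?: string
}

interface TileAction {
  key: string
  label: string
  tone: "cyan" | "neutral"
  onClick: () => void
}

// A source counts as playable only when a URL exists AND the video has completed.
function playableSrcFor(video: VideoLike): string {
  return video.url && video.status === "completed" ? video.url : ""
}

// Build the action list: download appears only for playable videos,
// regenerate appears only when a handler is supplied.
function buildTileActions(
  video: VideoLike,
  download: (src: string, filename: string) => void,
  onRegenerate?: () => void,
): TileAction[] {
  const src = playableSrcFor(video)
  const actions: TileAction[] = []
  if (src) {
    actions.push({
      key: "download",
      label: "Download video",
      tone: "cyan",
      onClick: () => download(src, `skg-storyboard-${video.id}.mp4`),
    })
  }
  if (onRegenerate) {
    actions.push({
      key: "regen",
      label: "Regenerate one candidate",
      tone: "neutral",
      onClick: onRegenerate,
    })
  }
  return actions
}
```

Deriving one `playableSrc` and reusing it for `src`, `href`, and the download action means a queued or failed video cannot expose a link to a file that may not exist yet.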
Author	SHA1	Message	Date
kang	0d5c32630f	docs: record multilingual asr deployment	2026-05-22 09:30:59 +08:00
kang	7ae92ae4d9	auto-save 2026-05-22 09:27 (~2)	2026-05-22 09:27:54 +08:00
kang	642793500f	fix: support multilingual audio transcription	2026-05-22 09:26:59 +08:00
kang	eb4347a843	fix: clarify storyboard videos auto-save	2026-05-21 17:44:40 +08:00
kang	4efb2ce456	fix: make storyboard video clicks previewable	2026-05-21 17:29:16 +08:00
kang	cc12d7c6a7	auto-save 2026-05-21 17:16 (~2)	2026-05-21 17:16:54 +08:00
kang	77d23a06b3	feat: add one-click agent cut terminal	2026-05-21 16:14:58 +08:00
kang	775ad79661	auto-save 2026-05-21 16:05 (~2)	2026-05-21 16:05:52 +08:00
kang	a3ddb05424	auto-save 2026-05-21 15:43 (+1, ~4)	2026-05-21 15:43:13 +08:00
kang	02a9999d8c	auto-save 2026-05-21 15:25 (~2)	2026-05-21 15:26:00 +08:00
kang	b6fec10371	auto-save 2026-05-21 15:20 (~2)	2026-05-21 15:20:33 +08:00
kang	7bb4f3ea9f	auto-save 2026-05-21 15:15 (+2, ~3)	2026-05-21 15:15:07 +08:00
kang	b82dad4aa8	auto-save 2026-05-21 15:09 (+1, ~3)	2026-05-21 15:09:40 +08:00