auto-save 2026-05-14 10:14 (~7)

2026-05-14 10:14:43 +08:00
parent 96784f9df1
commit ee32d83b6c
7 changed files with 2398 additions and 2330 deletions

File diff suppressed because it is too large.


@@ -1037,6 +1037,10 @@ def _score_transparent_human_frame(img_path: Path) -> TransparentHumanFrameScore
             max_tokens=1200,
         )
         raw = (resp.choices[0].message.content or "").strip()
+        if raw.startswith("```"):
+            import re as _re
+            match = _re.search(r"\{[\s\S]*\}", raw)
+            raw = match.group(0) if match else raw
         data = json.loads(raw)
     except Exception as e:
         return TransparentHumanFrameScore(qualified=False, reject_reason=f"AI 评分失败:{e}")
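
The fence-stripping step added in this hunk can be tried in isolation. What follows is a minimal sketch, assuming the model reply is either bare JSON or a single JSON object wrapped in a Markdown code fence. `parse_model_json` is a hypothetical helper name used for illustration, not a function from `api/main.py`.

```python
import json
import re

# Build the fence marker programmatically so this example can itself sit inside a fence.
FENCE = "`" * 3


def parse_model_json(raw: str) -> dict:
    """Parse a JSON object from a model reply that may be wrapped in a Markdown fence."""
    text = (raw or "").strip()
    if text.startswith(FENCE):
        # Greedy match from the first "{" to the last "}" so nested objects stay intact.
        match = re.search(r"\{[\s\S]*\}", text)
        if match:
            text = match.group(0)
    # json.loads raises ValueError on malformed input; the hunk above catches that
    # and returns an unqualified score instead.
    return json.loads(text)
```

Because the pattern is greedy, a reply that contains two separate JSON objects would be captured as one span and fail to parse, which the surrounding `except` branch would then report as a scoring failure.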


@@ -552,7 +552,7 @@
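<p>The product does not "copy someone else's video"; it breaks a reference video down, extracts the shot elements worth borrowing, and reworks them into video material for the SKG product context.</p>
<div class="pipeline">
<div class="step"><div class="num">1</div><h3>Input</h3><p>A TK link or a local upload; the backend downloads or saves the source video.</p></div>
-<div class="step"><div class="num">2</div><h3>Shot breakdown</h3><p>Split tracks, extract keyframes, and add frames manually to build a pool of reference storyboard frames.</p></div>
+<div class="step"><div class="num">2</div><h3>Shot breakdown</h3><p>Split tracks, extract keyframes, and add frames manually to build a pool of reference storyboard frames. The current theme defaults to the "transparent skeleton human" extraction target: candidates are scanned locally first, then Vision scores them for acceptance on transparent body, white skeleton, human prominence, clarity, commercial feel, and product usefulness; a candidate that fails is automatically replaced by the next frame.</p></div>
<div class="step"><div class="num">3</div><h3>Watermark cleanup</h3><p>Clean keyframes over the full image or a region. A cleaned version first enters a pending-review state; once confirmed, it can replace a single frame, or all pending cleaned versions can be applied with one click.</p></div>
<div class="step"><div class="num">4</div><h3>Subject recognition</h3><p>Identify scene and subject candidates; they are only candidates and should not be locked in.</p></div>
<div class="step"><div class="num">5</div><h3>Asset preparation</h3><p>Clean the keyframes, use several keyframes as references for the same subject, first redraw six standard standing subject asset images, then generate several subject-removed, similar, or restyled scene images per keyframe.</p></div>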
@@ -625,7 +625,7 @@ api/main.py
</div>
<div class="flow-row">
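<div><strong>The area you see</strong><span>Keyframe asset review panel</span></div>
-<div><strong>Main source</strong><span><code>FrameLightbox</code>; organized into five tabs: Original/Cleanup, Subject Assets, Scene Images, Product Fusion, and Review. The left side holds only the main image / box-selection canvas, except that on the Subject Assets tab it becomes a grid of all cleaned / selected reference frames, and on the Scene Images tab it shows all keyframes with checkboxes for scene references. The right side of the Cleanup tab supports one-click cleanup of unprocessed frames, replacing a single frame with its cleaned version, and one-click replacement of all pending cleaned versions; batch replacement calls <code>applyCleanedFrame</code> sequentially and adds no backend endpoint. The left side of the Product Fusion tab becomes a vertical six-row shot worksheet: each row directly shows the product image, the white-background person image, the product region on the person image, the scene image, the description text, the duration in seconds, and a per-shot generate button, so all six videos can be reviewed at once. "Paste" in a Product Fusion slot prefers the in-app <code>clipboard</code> and also supports pasting a system image with Cmd+V after selecting the slot. The right side keeps only the fixed GPT Image 2 / Seedance models, the current shot status, the AI description draft, batch queueing, and product library selection. The Subject Assets tab confirms a single unified subject, and the backend redraws six standard standing subject images from the references, each on a plain background and filling the frame; scene images depend on the subject assets, and the right side assembles an editable prompt from location, generation mode, style, and reference elements, then generates a subject-removed original scene, a similar new scene, or a same-structure restyle from the current keyframe. Related APIs include <code>cleanupFrame</code>, <code>applyCleanedFrame</code>, <code>addElement</code>, <code>generateSubjectAssets</code>, <code>generateSceneAsset</code>, <code>listProductLibrary</code>, <code>copyProductLibraryAsset</code>, <code>createProductFusionGuide</code>, <code>generateProductFusionDescriptions</code>.</span></div>
+<div><strong>Main source</strong><span><code>FrameLightbox</code>; organized into five tabs: Original/Cleanup, Subject Assets, Scene Images, Product Fusion, and Review. The left side holds only the main image / box-selection canvas, except that on the Subject Assets tab it becomes a grid of all cleaned / selected reference frames, and on the Scene Images tab it shows all keyframes with checkboxes for scene references. The Subject Recognition tab shows the transparent skeleton human target and the Vision acceptance score. The right side of the Cleanup tab supports one-click cleanup of unprocessed frames, replacing a single frame with its cleaned version, and one-click replacement of all pending cleaned versions; batch replacement calls <code>applyCleanedFrame</code> sequentially and adds no backend endpoint. The left side of the Product Fusion tab becomes a vertical six-row shot worksheet: each row directly shows the product image, the white-background person image, the product region on the person image, the scene image, the description text, the duration in seconds, and a per-shot generate button, so all six videos can be reviewed at once. "Paste" in a Product Fusion slot prefers the in-app <code>clipboard</code> and also supports pasting a system image with Cmd+V after selecting the slot. The right side keeps only the fixed GPT Image 2 / Seedance models, the current shot status, the AI description draft, batch queueing, and product library selection. The Subject Assets tab confirms a single unified subject, and the backend redraws six standard standing transparent skeleton human asset images from the references, each on a plain background and filling the frame; scene images depend on the subject assets, and the right side assembles an editable prompt from location, generation mode, style, and reference elements, then generates a subject-removed original scene, a similar new scene, or a same-structure restyle from the current keyframe. Related APIs include <code>cleanupFrame</code>, <code>applyCleanedFrame</code>, <code>addElement</code>, <code>generateSubjectAssets</code>, <code>generateSceneAsset</code>, <code>listProductLibrary</code>, <code>copyProductLibraryAsset</code>, <code>createProductFusionGuide</code>, <code>generateProductFusionDescriptions</code>.</span></div>
<div><strong>How to describe it</strong><span>"How this group of keyframes jointly produces one unified subject package; how a given keyframe's watermark, subject-removed scene image, product fusion shot group, and quality risks should be reviewed."</span></div>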
</div>
<div class="flow-row">
@@ -653,15 +653,31 @@ api/main.py
<div class="card">
<h3>KeyFrame</h3>
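<p>The keyframe is the core unit of the whole product. <code>index</code> is a stable ID; it becomes non-contiguous after frames are added manually, so an array subscript cannot stand in for it.</p>
-<pre>KeyFrame {
+<pre>KeyFrame {
index, timestamp, url,
description,
+transparent_human_score,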
cleaned_url, cleaned_applied,
quality_report,
scene_assets: SceneAsset[],
elements: KeyElement[],
storyboard: StoryboardScene,
generated_images: GeneratedImage[]
}</pre>
</div>
+<div class="card">
+<h3>TransparentHumanFrameScore</h3>
+<p>Frame-extraction acceptance result for the transparent skeleton human theme. It is written during extraction only when <code>target=transparent_human</code>; ordinary extraction targets do not require this field.</p>
+<pre>TransparentHumanFrameScore {
+transparent_body_score: 0-25,
+skeleton_visible_score: 0-25,
+human_prominence_score: 0-15,
+clarity_score: 0-15,
+commercial_style_score: 0-10,
+product_usefulness_score: 0-10,
+total_score,
+qualified,
+reject_reason
+}</pre>
+</div>
<div class="card">
@@ -753,7 +769,7 @@ SubjectAsset {
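<tr><td>Create job</td><td><code>POST /jobs</code></td><td><code>createJob</code></td><td>Submit a TK link; the backend starts downloading in the background and stops at downloaded until the user clicks analyze.</td></tr>
<tr><td>Upload video</td><td><code>POST /jobs/upload</code></td><td><code>uploadJob</code></td><td>Save source.mp4, then enter the same download-complete state.</td></tr>
<tr><td>Delete input video</td><td><code>DELETE /jobs/{id}</code></td><td><code>deleteJob</code></td><td>Remove the whole job from the task queue, the URL, and the on-disk <code>jobs/&lt;id&gt;</code> directory, including the source video, keyframes, extracted element images, and generated videos.</td></tr>
-<tr><td>Analyze video</td><td><code>POST /jobs/{id}/analyze?frames=&amp;target=&amp;mode=&amp;quality=</code></td><td><code>analyzeJob</code></td><td>Track splitting plus targeted keyframe extraction. <code>target</code> supports balanced, clear subject, transition, expression moment, and motion peak; <code>mode=append</code> appends new keyframes; <code>quality=auto</code> picks fast, accurate, or ultra based on local compute and video duration. Multiple extraction requests enter a backend queue and are processed in order.</td></tr>
+<tr><td>Analyze video</td><td><code>POST /jobs/{id}/analyze?frames=&amp;target=&amp;mode=&amp;quality=</code></td><td><code>analyzeJob</code></td><td>Track splitting plus targeted keyframe extraction. <code>target</code> supports transparent skeleton human, balanced, clear subject, transition, expression moment, and motion peak; the current UI defaults to <code>transparent_human</code>. The transparent skeleton human target first enlarges the local candidate pool, then calls Vision to accept candidates on six scores; failing candidates are discarded automatically and the next candidate is extracted. <code>mode=append</code> appends new keyframes; <code>quality=auto</code> picks fast, accurate, or ultra based on local compute and video duration. Multiple extraction requests enter a backend queue and are processed in order.</td></tr>
<tr><td>Add frame manually</td><td><code>POST /jobs/{id}/frames?t=</code></td><td><code>addManualFrame</code></td><td>Extract one frame at a video timestamp; index increments, but frames are sorted by timestamp.</td></tr>
<tr><td>Vision recognition</td><td><code>POST /frames/{idx}/describe</code></td><td><code>describeFrame</code></td><td>Writes frame.description; candidate elements can later be added from objects.</td></tr>
<tr><td>Watermark cleanup</td><td><code>POST /frames/{idx}/cleanup</code></td><td><code>cleanupFrame</code></td><td>Supports full-image and region cleanup and produces a cleaned version pending application; frontend batch cleanup calls this endpoint sequentially and does not overwrite the original automatically. Per-frame cleanup state is isolated by frame.index, so cleaning one frame does not disable the cleanup button of other keyframes.</td></tr>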
@@ -876,6 +892,18 @@ SubjectAsset {
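<h2>Change log</h2>
<p>This record is not a substitute for git log. It records what changed in the product understanding, which source files were affected, and how to phrase requirements in the future. Add an entry after every feature change.</p>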
<div class="changelog">
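+<article class="change">
+<header>
+<h3>2026-05-14 · Frame extraction adds a transparent skeleton human AI acceptance target</h3>
+<span class="tag violet">InputNode</span>
+<span class="tag blue">Vision</span>
+</header>
+<div class="body">
+<p><strong>Problem:</strong> Transparent-human derivative videos cannot pick frames on clarity alone, nor count a frame as qualified just because "bones" appear; the same humanoid character must be confirmed to have, all at once, a transparent / semi-transparent shell, a clean white skeleton, sufficient size and clarity, and a non-horror commercial feel.</p>
+<p><strong>Change:</strong> <code>FrameExtractTarget</code> adds <code>transparent_human</code> and makes it the current UI default target. Backend extraction first enlarges the candidate pool using local clarity, centered subject, contrast, and deduplication, then extracts a high-resolution frame from the source video for each candidate and hands it to Vision for scoring; the scoring dimensions are transparent body, visible skeleton, human prominence, clarity, commercial style, and product usefulness. Failing frames are deleted and replaced by the next candidate automatically until the target count is reached or the candidates run out.</p>
+<p><strong>Impact:</strong> <code>api/main.py</code>, <code>web/lib/api.ts</code>, <code>web/app/page.tsx</code>, <code>web/components/nodes/index.tsx</code>, <code>web/components/lightbox.tsx</code>, <code>web/lib/workflow-target.ts</code>, <code>docs/source-analysis.html</code></p>
+</div>
+</article>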
<article class="change">
<header>
<h3>2026-05-14 · Cleanup tab adds one-click replacement of pending cleaned versions</h3>
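
The acceptance loop described in the change-log entry for the transparent skeleton human target (score each candidate, drop failures, move to the next candidate until the target count is reached or the pool is exhausted) can be sketched as follows. This is an illustration under stated assumptions, not the code in `api/main.py`: `FrameScore`, `select_frames`, and `fake_scorer` are hypothetical names, candidates are represented as plain timestamps, and the real Vision call is replaced by a stub.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass
class FrameScore:
    """Hypothetical stand-in for TransparentHumanFrameScore (only two fields kept)."""
    qualified: bool
    reject_reason: Optional[str] = None


def select_frames(
    candidates: Iterable[float],
    score_frame: Callable[[float], FrameScore],
    wanted: int,
) -> Tuple[List[float], List[Tuple[float, Optional[str]]]]:
    """Walk the candidate pool in order, keeping frames whose score is qualified."""
    accepted: List[float] = []
    rejected: List[Tuple[float, Optional[str]]] = []
    for timestamp in candidates:
        if len(accepted) >= wanted:
            break  # target count reached; remaining candidates are never scored
        score = score_frame(timestamp)
        if score.qualified:
            accepted.append(timestamp)
        else:
            rejected.append((timestamp, score.reject_reason))
    # May return fewer than `wanted` frames if the candidates run out.
    return accepted, rejected


def fake_scorer(timestamp: float) -> FrameScore:
    """Stub scorer (assumption): rejects anything in the first second."""
    if timestamp < 1.0:
        return FrameScore(False, "no visible skeleton")
    return FrameScore(True)
```

Stopping as soon as enough frames are accepted keeps the number of scoring calls proportional to the rejection rate rather than to the size of the enlarged candidate pool.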


@@ -21,6 +21,7 @@ import {
deleteGeneratedVideo, deleteCutout, generateStoryboardVideo, createProductFusionGuide,
type Job, type ImageRef, type ProductFusionShot, type StoryboardScene, type FrameExtractMode, type FrameExtractQuality, type FrameExtractTarget,
} from "@/lib/api"
+import { TRANSPARENT_HUMAN_NEGATIVE_PROMPT, TRANSPARENT_HUMAN_VIDEO_PROMPT } from "@/lib/workflow-target"
const NODE_TYPES = {
input: InputNode,
@@ -35,6 +36,7 @@ const KEYFRAME_PANEL_ID = "keyframe-detail-panel"
const VIDEO_FRAME_PANEL_ID = "video-frame-panel"
const FLOATING_PANEL_IDS = new Set([KEYFRAME_PANEL_ID, VIDEO_FRAME_PANEL_ID])
const FRAME_TARGET_LABELS: Record<FrameExtractTarget, string> = {
+transparent_human: "透明骨架人",
balanced: "综合关键帧",
subject: "清晰主体",
transition: "转场变化",
@@ -177,7 +179,7 @@ export default function Home() {
const handleAnalyzeJob = useCallback(async (jobId: string, options?: { mode?: FrameExtractMode }) => {
const targetJob = jobs.find((item) => item.id === jobId)
if (!targetJob) return
-const frameTarget = frameTargets[jobId] ?? "balanced"
+const frameTarget = frameTargets[jobId] ?? "transparent_human"
const frameCount = frameCounts[jobId] ?? 5
const frameQuality = frameQualities[jobId] ?? "auto"
const mode = options?.mode ?? (targetJob.frames.length > 0 ? "append" : "replace")
@@ -400,6 +402,7 @@ export default function Home() {
"生成一段单镜头连续视频,一镜到底,从首帧平滑过渡到尾帧;不要跳切,不要突然换场景,不要突然换主体,不要蒙太奇,不要多镜头拼接。",
"如果提供了原视频链接,把它只作为节奏、镜头运动、动作顺序和画面调度参考;不要照搬原视频里的品牌、文字、水印、竞品产品或具体人物。",
"时间线0%-15% 锁住首帧构图并轻微启动15%-85% 做平滑连续运动85%-100% 缓慢贴近尾帧并稳定收住。",
+TRANSPARENT_HUMAN_VIDEO_PROMPT,
`主体改造:${subjectDirection}`,
`产品替换:${productDirection} 产品必须作为颈部/肩颈按摩仪被正确佩戴或展示,不要放在脸上、手臂上、桌面当摆件,也不要变成瓶子、面霜、医疗设备或食品。`,
`场景改造:${sceneDirection}`,
@@ -416,6 +419,7 @@ export default function Home() {
"运动要求:动作幅度小而连续,速度均匀,手部和产品位置前后一致,产品外形不变形,人物表情和姿态不漂移,背景只允许轻微景深和光影变化。",
"商业质感:真实拍摄感,干净高级,柔和稳定打光,产品边缘清晰,材质真实,画面无抖动、无拉伸、无闪烁。",
"禁止:字幕、文字、平台 UI、TikTok 水印、logo 水印、免责声明、竞品包装、随机新物体、非 SKG 产品、医学骨架、夸张病症画面、恐怖元素、画面撕裂、人物或产品突然变形。",
+TRANSPARENT_HUMAN_NEGATIVE_PROMPT,
].join("\n")
try {
@@ -470,10 +474,12 @@ export default function Home() {
`白底人物图:${labelOf(shot.person_image, "人物姿态参考")}。人物姿态、手部接触点和产品佩戴关系以这张图为准。`,
`场景图:${labelOf(shot.scene_image, "场景参考")}。背景、空间、光线和气氛以这张图为准,但不要改变产品框内位置。`,
`动作描述:${shot.action_text.trim()}`,
+TRANSPARENT_HUMAN_VIDEO_PROMPT,
"融合要求:产品必须按引导图位置自然贴合人物或手部,尺寸可信,透视一致,边缘清晰,不能悬浮、穿帮、融化、扭曲或变成其他物体。",
"场景要求:把白底人物姿态自然放入场景图的环境中,光线方向和阴影要统一,背景不要出现水印、平台 UI、字幕或竞品包装。",
"商业质感:真实拍摄感、干净高级、产品清楚可辨、人物动作自然、镜头稳定。",
"禁止:文字、水印、随机品牌、非 SKG 产品、医学治疗承诺、夸张病症、恐怖元素、产品位置漂移、产品超过指定融合区域。",
+TRANSPARENT_HUMAN_NEGATIVE_PROMPT,
].join("\n")
const updated = await generateStoryboardVideo(job.id, frameIdx, {
prompt,


@@ -9,6 +9,7 @@ import {
type AssetBackground, type AssetSize, type KeyFrame, type Job, type ImageRef, type ProductFusionShot, type SceneMode, type SceneStyle, type SubjectKind,
} from "@/lib/api"
import { ProductLibraryPicker } from "@/components/product-library-picker"
+import { TRANSPARENT_HUMAN_FRAME_STANDARD, TRANSPARENT_HUMAN_UI_SUMMARY } from "@/lib/workflow-target"
import { toast } from "sonner"
interface Props {
@@ -225,6 +226,7 @@ export function FrameLightbox({ jobId, frames, activeIndex, selected, onClose, o
if (activeIndex === null || !f || !mounted) return null
const desc = f.description
+const transparentScore = f.transparent_human_score ?? desc?.transparent_human_assessment
const elements = f.elements ?? []
const hasCleaned = !!f.cleaned_url
const latestSceneAsset = f.scene_assets?.[f.scene_assets.length - 1] ?? null
@@ -1716,6 +1718,33 @@ export function FrameLightbox({ jobId, frames, activeIndex, selected, onClose, o
</button>
</div>
+<div className="mb-2 rounded-md border border-cyan-300/18 bg-cyan-500/[0.06] px-2.5 py-2 text-[10.5px] leading-relaxed text-white/55">
+<div className="mb-1 flex items-center justify-between gap-2">
+<span className="font-semibold text-cyan-100"></span>
+{transparentScore && (
+<span className={`rounded px-1.5 py-0.5 text-[9px] font-mono ${
+transparentScore.qualified ? "bg-emerald-400/80 text-black" : "bg-amber-400/18 text-amber-100"
+}`}>
+{transparentScore.qualified ? "合格" : "待复核"} · {transparentScore.total_score ?? (
+(transparentScore.transparent_body_score || 0)
++ (transparentScore.skeleton_visible_score || 0)
++ (transparentScore.human_prominence_score || 0)
++ (transparentScore.clarity_score || 0)
++ (transparentScore.commercial_style_score || 0)
++ (transparentScore.product_usefulness_score || 0)
+)}/100
+</span>
+)}
+</div>
+<div>{TRANSPARENT_HUMAN_UI_SUMMARY}</div>
+<div className="mt-1 text-white/38">{TRANSPARENT_HUMAN_FRAME_STANDARD}</div>
+{transparentScore?.reject_reason && !transparentScore.qualified && (
+<div className="mt-1 rounded border border-amber-300/20 bg-amber-500/10 px-1.5 py-1 text-amber-100/80">
+{transparentScore.reject_reason}
+</div>
+)}
+</div>
{!desc ? (
<div className="rounded-lg border border-dashed border-white/15 bg-white/[0.03] p-3 text-[11.5px] text-white/50 leading-relaxed">
{describing ? (
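
The badge in this hunk shows `total_score` when it is present and otherwise falls back to the sum of the six sub-scores. The same rule is sketched below in Python for readers following the backend side; `display_total` and `SCORE_FIELDS` are hypothetical names, and the score is modeled as a plain dict, which is an assumption rather than the project's actual data type.

```python
from typing import Mapping, Optional

# The six sub-score field names, as listed in the TransparentHumanFrameScore card.
SCORE_FIELDS = (
    "transparent_body_score",
    "skeleton_visible_score",
    "human_prominence_score",
    "clarity_score",
    "commercial_style_score",
    "product_usefulness_score",
)


def display_total(score: Mapping[str, Optional[int]]) -> int:
    """Return total_score if set, else the sum of the sub-scores (missing ones count as 0)."""
    total = score.get("total_score")
    if total is not None:
        # Mirrors `??` in the TSX: only a missing/None total triggers the fallback,
        # so an explicit total of 0 is shown as 0.
        return total
    # Mirrors `|| 0` in the TSX: None or absent sub-scores contribute nothing.
    return sum(score.get(name) or 0 for name in SCORE_FIELDS)
```

Keeping an explicit zero total distinct from a missing one matters here, because a frame that legitimately scored 0 should not be shown with a recomputed sum.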


@@ -130,6 +130,7 @@ function clamp(value: number, min: number, max: number) {
const THUMBNAIL_HEIGHT = 192
const FLOATING_PANEL_EDGE_INSET = 8
const FRAME_TARGET_OPTIONS: Array<{ value: FrameExtractTarget; label: string; hint: string }> = [
+{ value: "transparent_human", label: "透明骨架人", hint: "AI 验收透明身体 + 白色骨架" },
{ value: "balanced", label: "综合关键帧", hint: "清晰、去重、变化、时间覆盖" },
{ value: "subject", label: "清晰主体", hint: "人物 / 产品主体更清楚" },
{ value: "transition", label: "转场变化", hint: "切镜和画面变化优先" },
@@ -571,7 +572,7 @@ export function InputNode({ data, selected }: NodeProps<{ data: NodeData }> | an
const aspectStr = ready ? `${j.width}/${j.height}` : "9/16"
const thumbNaturalWidth = ready && j.height ? Math.max(96, Math.round(THUMBNAIL_HEIGHT * j.width / j.height)) : 96
const toolWidth = Math.max(148, thumbNaturalWidth)
-const target = d.frameTargets[j.id] ?? "balanced"
+const target = d.frameTargets[j.id] ?? "transparent_human"
const count = d.frameCounts[j.id] ?? 5
const quality = d.frameQualities[j.id] ?? "auto"
const jHasFrames = j.frames.length > 0
@@ -811,7 +812,7 @@ export function VideoFramePanelNode({ data }: any) {
const duration = panelJob.duration ?? 0
const frames = [...panelJob.frames].sort((a, b) => a.timestamp - b.timestamp)
const aspect = panelJob.width && panelJob.height ? `${panelJob.width}/${panelJob.height}` : "9/16"
-const panelTarget = d.frameTargets[panelJob.id] ?? "balanced"
+const panelTarget = d.frameTargets[panelJob.id] ?? "transparent_human"
const panelCount = d.frameCounts[panelJob.id] ?? 5
const panelQuality = d.frameQualities[panelJob.id] ?? "auto"
const panelRunning = ["splitting", "transcribing"].includes(panelJob.status)


@@ -199,6 +199,7 @@ export interface KeyFrame {
timestamp: number
url: string
description?: FrameDescription | null
+transparent_human_score?: TransparentHumanFrameScore | null
cleaned_url?: string | null
cleaned_applied?: boolean
quality_report?: QualityReport | null
@@ -208,7 +209,7 @@ export interface KeyFrame {
generated_images?: GeneratedImage[]
}
-export type FrameExtractTarget = "balanced" | "subject" | "transition" | "expression" | "motion"
+export type FrameExtractTarget = "transparent_human" | "balanced" | "subject" | "transition" | "expression" | "motion"
export type FrameExtractMode = "replace" | "append"
export type FrameExtractQuality = "auto" | "fast" | "accurate" | "ultra"
export type AssetBackground = "white" | "black"
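
To see how the widened `FrameExtractTarget` union and the new default combine with the analyze endpoint documented in this commit (`POST /jobs/{id}/analyze?frames=&target=&mode=&quality=`), here is a small sketch that builds the request URL. `analyze_url` and the base URL are assumptions made for illustration. The defaults (5 frames, `auto` quality, `transparent_human` target, `append` when the job already has frames) follow the fallbacks visible in the `page.tsx` hunks.

```python
from typing import Optional
from urllib.parse import urlencode

# Values of the FrameExtractTarget union after this commit.
FRAME_TARGETS = ("transparent_human", "balanced", "subject", "transition", "expression", "motion")
DEFAULT_TARGET = "transparent_human"


def analyze_url(
    base: str,
    job_id: str,
    target: Optional[str] = None,
    frames: int = 5,
    quality: str = "auto",
    has_frames: bool = False,
) -> str:
    """Build the analyze URL, applying the same fallbacks the UI uses."""
    chosen = target or DEFAULT_TARGET
    if chosen not in FRAME_TARGETS:
        raise ValueError(f"unknown frame target: {chosen}")
    # The UI appends when the job already has frames and replaces otherwise.
    mode = "append" if has_frames else "replace"
    query = urlencode({"frames": frames, "target": chosen, "mode": mode, "quality": quality})
    return f"{base}/jobs/{job_id}/analyze?{query}"
```

For example, `analyze_url("http://localhost:8000", "abc")` uses a hypothetical local base URL and resolves the target to `transparent_human` without the caller naming it.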