feat: add subject image model controls
4
RULES.md
@@ -11,7 +11,7 @@
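- See the project-initiation decision section of `CLAUDE.md` and the seven-step pipeline breakdown in `.memory/plan.md`
- Style: `04-Dark-Gallery-Ambient` (path: `~/Projects/research/20260305-网页风格库/04-Dark-Gallery-Ambient.md`)
- First sprint: steps 1-4 (download / track separation / keyframes / ASR + translation)
- Current product direction (reconfirmed 2026-05-19): quick replication of feed ads enters the "three-field candidate generation" workflow by default. The main screen is a "left-hand material input column + right-hand feed-replication worksheet". After the user pastes a TK link or uploads a video and clicks "Start analysis", the system downloads the source video automatically. Once the download completes, two paths start in parallel: the audio-copy path extracts the original audio copy/subtitles and analyzes the speaker, speech pace and rhythm, and background music / ambient sound / sound effects; the video-visual path extracts reference frames automatically. The main chain on the right of the source-video workspace is "reference frame pool → conversion layer → subject elements": the reference frame pool is laid out vertically; the conversion layer keeps only four entry points (realistic reconstruction, cartoon reconstruction, element reconstruction, custom description), each accepting at most 3 dragged-in reference frames; dragging only adds a frame to the reference queue and never triggers generation. After placing references and text, the user clicks generate, and the subject-elements area on the right shows a brand-new 6-view subject per generated pack folder: the current pack is expanded at the top, other packs follow in order in a scrollable list below, and one reconstruction direction may keep multiple packs. All four types are reference reconstruction: no cutouts, no copying the original person, no replicating the original frame. The old lower "similar subject / subject template library" is no longer a main path. The film strip under the waveform is only a temporary preview: a click only seeks to that time in the source video; only a double-click or a drag into the reference frame pool formally adds a keyframe, and strips already added show "Added" directly. An uploaded product image independently forms a product asset pack, with automatic recognition of viewpoint / structure / proportions and filling of missing angles. The storyboard workbench follows a sentence-by-sentence timeline and by default exposes only "copy / one-sentence scene / person + product + action"; the product asset pool, batch controls, three fields, video candidates, and advanced area must all be collapsible. With no content, video candidates take up no large area by default; with candidates, only a mini thumbnail strip shows by default, and the 4-grid appears only when expanded. Each item generates 4 video candidates by default, and the top bar supports batch candidate generation for the whole video; first/last frames, visual planning, product appearance mode, and the old 6 fields stay in the "Advanced" drawer and in the backend quick-plan auto-expansion, and must no longer act as the default gate for customers.
- Current product direction (reconfirmed 2026-05-20): quick replication of feed ads enters the "three-field candidate generation" workflow by default. The main screen is a "left-hand material input column + right-hand feed-replication worksheet". After the user pastes a TK link or uploads a video and clicks "Start analysis", the system downloads the source video automatically. Once the download completes, two paths start in parallel: the audio-copy path extracts the original audio copy/subtitles and analyzes the speaker, speech pace and rhythm, and background music / ambient sound / sound effects; the video-visual path extracts reference frames automatically. The main chain on the right of the source-video workspace is "reference frame pool → conversion layer → subject elements": the reference frame pool is laid out vertically; the conversion layer keeps only four entry points (realistic reconstruction, cartoon reconstruction, element reconstruction, custom description), each accepting at most 3 dragged-in reference frames; dragging only adds a frame to the reference queue and never triggers generation. After placing references and text, the user clicks generate, and the subject-elements area on the right shows a brand-new 6-view subject per generated pack folder: the current pack is expanded at the top, other packs follow in order in a scrollable list below, and one reconstruction direction may keep multiple packs. The conversion layer lets the user choose the Auto / GPT / Gemini image model directly, and the preference affects only subject pack generation; the prompt input has local memory and turns recently used terms into clickable chips. Subject reconstruction inherits by default the broad traits of the reference images (gender, ethnicity / skin tone, age and build, role temperament) while generating one brand-new subject; a single 6-view pack must keep face design, hairstyle, build, garment type, color palette, material, cut, and accessories uniform, so that outfits do not differ from image to image within a pack. All four types are reference reconstruction: no cutouts, no copying the original person, no replicating the original frame. The old lower "similar subject / subject template library" is no longer a main path. The film strip under the waveform is only a temporary preview: a click only seeks to that time in the source video; only a double-click or a drag into the reference frame pool formally adds a keyframe, and strips already added show "Added" directly. An uploaded product image independently forms a product asset pack, with automatic recognition of viewpoint / structure / proportions and filling of missing angles. The storyboard workbench follows a sentence-by-sentence timeline and by default exposes only "copy / one-sentence scene / person + product + action"; the product asset pool, batch controls, three fields, video candidates, and advanced area must all be collapsible. With no content, video candidates take up no large area by default; with candidates, only a mini thumbnail strip shows by default, and the 4-grid appears only when expanded. Each item generates 4 video candidates by default, and the top bar supports batch candidate generation for the whole video; first/last frames, visual planning, product appearance mode, and the old 6 fields stay in the "Advanced" drawer and in the backend quick-plan auto-expansion, and must no longer act as the default gate for customers.

## Deployment facts
- Platform: VPS `76.13.31.179` (Ubuntu 24.04 / Docker Compose / Coolify Traefik)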
@@ -78,7 +78,7 @@
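- `IMAGE_REQUEST_TIMEOUT_SECONDS`: timeout for a single image-gateway request, 60 seconds by default; on timeout that view is marked failed immediately and generation moves on to the next one, so the whole 6-view subject pack does not sit without feedback for a long time
- `IMAGE_FALLBACK_ENABLED` / `IMAGE_FALLBACK_MODEL`: outage fallback for the primary image model; currently `gemini-3-pro-image-preview` may be used temporarily when `gpt-image-2` times out or returns 429, 5xx, or a network error; 400/401/403/404 and parameter errors do not trigger the fallback
- `IMAGE_CIRCUIT_FAILURE_THRESHOLD` / `IMAGE_CIRCUIT_COOLDOWN_SECONDS`: short-term circuit-breaker settings; by default, after 2 consecutive upstream-type failures of `gpt-image-2`, requests go straight to the Gemini fallback for 600 seconds; the failure count is cleared automatically after a successful recovery
- `GPT_IMAGE_MODEL` / `SUBJECT_ASSET_IMAGE_MODEL` / `SUBJECT_ASSET_IMAGE_MODELS`: kept for compatibility with old environment variable names; the 6-view subject pack uses `gpt-image-2` first, and once the Gemini fallback is triggered within a pack, the remaining views stay on Gemini so that each view does not wait for the primary model to time out
- `GPT_IMAGE_MODEL` / `SUBJECT_ASSET_IMAGE_MODEL` / `SUBJECT_ASSET_IMAGE_MODELS`: kept for compatibility with old environment variable names; in the conversion layer the 6-view subject pack defaults to automatic use of `gpt-image-2`, and once the Gemini fallback is triggered within a pack, the remaining views stay on Gemini so that each view does not wait for the primary model to time out; when the user explicitly selects GPT or Gemini, `image_model_preference` makes the subject pack use only the selected model
- `AI_HTTP_PROXY` / `IMAGE_HTTP_PROXY`: optional outbound proxy for the AI gateway; a local launchd background process does not necessarily inherit the shell's `http_proxy/https_proxy`, so if image generation reports DNS / ConnectError, configure these in the local `api/.env` and restart the backend. `/health` reports only whether a proxy is configured, not the proxy address.
- `YTDLP_COOKIES_FILE` / `YTDLP_COOKIES_FROM_BROWSER`: optional TikTok download login state; production in the cloud always uses the cookies file `/run/secrets/tiktok_cookies.txt` (host `./secrets/tiktok_cookies.txt` mounted into the container), while local development may temporarily use browser cookies. The cookies file is sensitive login state: it may live only on the local machine or a private server path and must never be committed to the repository.
- `VOICE_PROVIDER`: voice-over channel; the server always uses `azure_openai`; a legacy environment that sets `minimax` is ignored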
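The fallback and circuit-breaker rules in the RULES.md environment-variable notes can be sketched in Python roughly as follows. This is a minimal illustration, not the code in `api/main.py`: `should_fallback`, `PrimaryCircuit`, and the injectable `clock` are hypothetical names, and clearing the failure count when the cooldown expires is an assumption that the notes do not confirm.

```python
from __future__ import annotations

import time

# Per the notes: client-side errors never trigger the fallback.
NON_FALLBACK_STATUSES = {400, 401, 403, 404}


def should_fallback(status: int | None, timed_out: bool = False, network_error: bool = False) -> bool:
    """Return True when a primary-model failure is eligible for the fallback model."""
    if timed_out or network_error:
        return True
    if status is None or status in NON_FALLBACK_STATUSES:
        return False
    return status == 429 or 500 <= status <= 599


class PrimaryCircuit:
    """Hypothetical breaker: opens after N consecutive upstream failures for a cooldown window."""

    def __init__(self, failure_threshold: int = 2, cooldown_seconds: float = 600.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: float | None = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()

    def record_success(self) -> None:
        # A successful call clears the consecutive-failure count.
        self._failures = 0
        self._opened_at = None

    def is_open(self) -> bool:
        if self._opened_at is None:
            return False
        if self._clock() - self._opened_at >= self.cooldown_seconds:
            # Assumption: once the cooldown has elapsed, the primary is tried again from a clean count.
            self._opened_at = None
            self._failures = 0
            return False
        return True
```

Injecting the clock keeps the cooldown testable without sleeping; a caller would check `is_open()` before each request and route to the fallback model while it returns `True`.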
47
api/main.py
@@ -3547,8 +3547,24 @@ def _image_primary_circuit_open() -> bool:
|
||||
return _image_circuit_snapshot()["primary_open"]
|
||||
|
||||
|
||||
def _image_model_candidates(force_fallback: bool = False) -> list[str]:
|
||||
def _normalize_image_model_preference(value: str | None) -> str:
|
||||
raw = (value or "auto").strip().lower()
|
||||
if raw in {"", "auto", "default"}:
|
||||
return "auto"
|
||||
if raw in {"gpt", "gpt-image", GPT_IMAGE_MODEL.lower()}:
|
||||
return GPT_IMAGE_MODEL
|
||||
if IMAGE_FALLBACK_MODEL and raw in {"gemini", IMAGE_FALLBACK_MODEL.lower()}:
|
||||
return IMAGE_FALLBACK_MODEL
|
||||
return "auto"
|
||||
|
||||
|
||||
def _image_model_candidates(force_fallback: bool = False, preference: str | None = "auto") -> list[str]:
|
||||
normalized = _normalize_image_model_preference(preference)
|
||||
fallbacks = _image_fallback_models()
|
||||
if normalized == GPT_IMAGE_MODEL:
|
||||
return [GPT_IMAGE_MODEL]
|
||||
if normalized == IMAGE_FALLBACK_MODEL and fallbacks:
|
||||
return [IMAGE_FALLBACK_MODEL]
|
||||
if not fallbacks:
|
||||
return [GPT_IMAGE_MODEL]
|
||||
if force_fallback or _image_primary_circuit_open():
|
||||
@@ -3692,6 +3708,7 @@ def _image_edit_call(
|
||||
max_attempts: int = 3,
|
||||
max_side: int = 1024,
|
||||
force_fallback_model: bool = False,
|
||||
image_model_preference: str | None = "auto",
|
||||
) -> tuple[bytes, str]:
|
||||
"""Generic image edit call: retry on failure + optional text fallback.
|
||||
Returns (image_bytes, effective_mode) where effective_mode in {"edit","text"}.
|
||||
@@ -3709,7 +3726,7 @@ def _image_edit_call(
|
||||
if not image_paths:
|
||||
raise RuntimeError("image edit reference image missing")
|
||||
img_bytes_list = [_prepare_image_edit_bytes(path, max_side) for path in image_paths]
|
||||
model_candidates = _image_model_candidates(force_fallback=force_fallback_model)
|
||||
model_candidates = _image_model_candidates(force_fallback=force_fallback_model, preference=image_model_preference)
|
||||
mode_plan: list[str] = ["edit"] if model_candidates != [GPT_IMAGE_MODEL] else ["edit"] * max_attempts
|
||||
if fallback_text:
|
||||
mode_plan.append("text")
|
||||
@@ -3803,6 +3820,7 @@ def _image_text_call(
|
||||
models: list[str] | None = None,
|
||||
max_attempts: int = 3,
|
||||
force_fallback_model: bool = False,
|
||||
image_model_preference: str | None = "auto",
|
||||
) -> tuple[bytes, str]:
|
||||
"""Text-only image generation. gpt-image-2 primary, Gemini only as outage fallback."""
|
||||
import base64 as b64lib
|
||||
@@ -3810,7 +3828,7 @@ def _image_text_call(
|
||||
import httpx
|
||||
if not IMAGE_API_KEY:
|
||||
raise RuntimeError("IMAGE_API_KEY 或 LLM_API_KEY 未配置")
|
||||
candidates = _image_model_candidates(force_fallback=force_fallback_model)
|
||||
candidates = _image_model_candidates(force_fallback=force_fallback_model, preference=image_model_preference)
|
||||
attempt_models = candidates if candidates != [GPT_IMAGE_MODEL] else [GPT_IMAGE_MODEL] * max_attempts
|
||||
last_err = ""
|
||||
capacity_seen = False
|
||||
@@ -5004,6 +5022,7 @@ class GenerateSubjectAssetsReq(BaseModel):
|
||||
reconstruction_mode: Literal["same", "similar"] = "same"
|
||||
subject_profile: SubjectProfilePreference | None = None
|
||||
prompt: str = ""
|
||||
image_model_preference: str = "auto"
|
||||
replace_views: bool = False
|
||||
source_subject_brief: str = ""
|
||||
pack_id: str = ""
|
||||
@@ -5787,9 +5806,17 @@ def _generate_subject_assets_sync(job_id: str, idx: int, element_id: str, req: G
|
||||
"Identity lock: these API calls generate one high-definition multi-view pack for ONE single subject, but each individual output file must show only its one requested view. "
|
||||
"Before rendering, infer one consistent character bible from the supplied text brief and generation instructions: gender presentation, age range, body proportions, head shape, face direction cues, material, silhouette, wardrobe/material style, and commercial mood. "
|
||||
"Keep that same character bible unchanged across every generated view in separate files. "
|
||||
"By default, inherit the reference frames' broad gender presentation, regional/ethnic appearance category, skin-tone family, body-proportion category, and ad-role energy unless the user explicitly overrides them. "
|
||||
"The pack must depict the same newly designed person or character in every view: same face design, same hair design, same body proportions, same skin tone, same age range, and same commercial styling. "
|
||||
"If user direction requests a gender, age, or style change, apply that one change uniformly to all views; never mix male/female, young/old, or multiple style identities inside the same pack. "
|
||||
"For transparent humanoids, keep the same transparent skin shell, skeleton proportions, visible spine/rib cage/pelvis/limb bones, and non-horror wellness character style in every view. "
|
||||
)
|
||||
wardrobe_lock_clause = (
|
||||
"Wardrobe lock: choose one outfit bible before rendering and keep it identical across all views. "
|
||||
"The same garment type, color palette, neckline, sleeve shape, straps, fabric/material, fit, seam logic, and visible accessories must remain consistent from front, side, three-quarter, and back views. "
|
||||
"Do not change clothing between views; do not switch from sportswear to casualwear, dress, coat, hoodie, uniform, or underwear unless the user explicitly requests that single outfit for the whole pack. "
|
||||
"If the reference outfit is useful, inherit its broad wardrobe category and color family, but redraw it as a new non-identical clean commercial outfit. "
|
||||
)
|
||||
neck_product_clause = (
|
||||
"This subject pack is for SKG neck-and-shoulder wearable massage device videos. "
|
||||
"Make the neck, collarbone, shoulder line, upper back, side neck, and shoulder slope clear and product-ready. "
|
||||
@@ -5797,10 +5824,11 @@ def _generate_subject_assets_sync(job_id: str, idx: int, element_id: str, req: G
|
||||
"For back and close-up views, prioritize the cervical spine, shoulder blades, upper trapezius, and clean wearable-device contact area. "
|
||||
)
|
||||
models = SUBJECT_ASSET_IMAGE_MODELS
|
||||
model_preference = _normalize_image_model_preference(req.image_model_preference)
|
||||
generated: list[SubjectAsset] = []
|
||||
generation_errors: list[str] = []
|
||||
first_generation_error: RuntimeError | None = None
|
||||
pack_force_fallback_model = _image_primary_circuit_open()
|
||||
pack_force_fallback_model = model_preference == "auto" and _image_primary_circuit_open()
|
||||
try:
|
||||
for view, view_label in _subject_view_labels(req.subject_kind, req.views):
|
||||
closeup_view = view in {"bust", "back_detail", "bust_front", "bust_left_45", "bust_right_45", "back_neck_detail"} or "detail" in view
|
||||
@@ -5845,6 +5873,7 @@ def _generate_subject_assets_sync(job_id: str, idx: int, element_id: str, req: G
|
||||
+ single_view_clause
|
||||
+ identity_clause
|
||||
+ identity_lock_clause
|
||||
+ wardrobe_lock_clause
|
||||
+ neck_product_clause
|
||||
+ canvas_clause
|
||||
+ prompt_extra_clause
|
||||
@@ -5861,17 +5890,17 @@ def _generate_subject_assets_sync(job_id: str, idx: int, element_id: str, req: G
|
||||
try:
|
||||
if similar_mode:
|
||||
print(
|
||||
f"[subject assets] reconstruction_mode=similar endpoint=/images/generations view={view} image_refs=0 model={'fallback' if pack_force_fallback_model else GPT_IMAGE_MODEL}",
|
||||
f"[subject assets] reconstruction_mode=similar endpoint=/images/generations view={view} image_refs=0 model_preference={model_preference}",
|
||||
flush=True,
|
||||
)
|
||||
img_bytes, _mode = _image_text_call(prompt, models=models, max_attempts=3, force_fallback_model=pack_force_fallback_model)
|
||||
if _mode.endswith(f":{IMAGE_FALLBACK_MODEL}"):
|
||||
img_bytes, _mode = _image_text_call(prompt, models=models, max_attempts=3, force_fallback_model=pack_force_fallback_model, image_model_preference=model_preference)
|
||||
if model_preference == "auto" and _mode.endswith(f":{IMAGE_FALLBACK_MODEL}"):
|
||||
pack_force_fallback_model = True
|
||||
else:
|
||||
if model_src is None:
|
||||
raise RuntimeError("subject asset edit reference image missing")
|
||||
img_bytes, _mode = _image_edit_call(model_src, prompt, models=models, fallback_text=False, max_attempts=3, max_side=1280, force_fallback_model=pack_force_fallback_model)
|
||||
if _mode.endswith(f":{IMAGE_FALLBACK_MODEL}"):
|
||||
img_bytes, _mode = _image_edit_call(model_src, prompt, models=models, fallback_text=False, max_attempts=3, max_side=1280, force_fallback_model=pack_force_fallback_model, image_model_preference=model_preference)
|
||||
if model_preference == "auto" and _mode.endswith(f":{IMAGE_FALLBACK_MODEL}"):
|
||||
pack_force_fallback_model = True
|
||||
except RuntimeError as e:
|
||||
if first_generation_error is None:
|
||||
|
||||
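Read together, the `api/main.py` hunks pin an explicit preference to a single model and let only `auto` switch to the fallback for the rest of a pack. The sketch below restates that behavior in self-contained Python under stated assumptions: the model names are taken from the RULES.md notes, `plan_pack` and `run_view` are hypothetical stand-ins for the per-view loop, the fallback model is assumed to be configured, and the tail of the `auto` branch (truncated in the diff) is assumed to return the primary followed by the fallback.

```python
from __future__ import annotations

from typing import Callable

GPT_IMAGE_MODEL = "gpt-image-2"
IMAGE_FALLBACK_MODEL = "gemini-3-pro-image-preview"


def normalize_preference(value: str | None) -> str:
    """Map user input to "auto" or one concrete model name; unknown values become "auto"."""
    raw = (value or "auto").strip().lower()
    if raw in {"", "auto", "default"}:
        return "auto"
    if raw in {"gpt", "gpt-image", GPT_IMAGE_MODEL.lower()}:
        return GPT_IMAGE_MODEL
    if raw in {"gemini", IMAGE_FALLBACK_MODEL.lower()}:
        return IMAGE_FALLBACK_MODEL
    return "auto"


def candidates(preference: str | None = "auto", force_fallback: bool = False, circuit_open: bool = False) -> list[str]:
    """Return the ordered list of models to try for one image request."""
    normalized = normalize_preference(preference)
    if normalized != "auto":
        return [normalized]  # an explicit choice never drifts to another model
    if force_fallback or circuit_open:
        return [IMAGE_FALLBACK_MODEL]
    # Assumption: the diff cuts off here; primary first, then the fallback.
    return [GPT_IMAGE_MODEL, IMAGE_FALLBACK_MODEL]


def plan_pack(views: list[str], preference: str | None, run_view: Callable[[str, list[str]], str]) -> list[str]:
    """Generate each view in turn; in auto mode the fallback becomes sticky once it is used."""
    normalized = normalize_preference(preference)
    sticky_fallback = False
    used: list[str] = []
    for view in views:
        model = run_view(view, candidates(normalized, force_fallback=sticky_fallback))
        used.append(model)
        if normalized == "auto" and model == IMAGE_FALLBACK_MODEL:
            sticky_fallback = True
    return used
```

Here `run_view` receives the candidate list and returns the model that actually produced the image, which mirrors how the diff inspects the `_mode` suffix to detect that the fallback was used.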
File diff suppressed because one or more lines are too long
@@ -28,6 +28,7 @@ import {
|
||||
type StoryboardScriptRewriteSegment,
|
||||
type StoryboardScene,
|
||||
type SubjectAsset,
|
||||
type SubjectImageModelPreference,
|
||||
type SubjectProfilePreference,
|
||||
type SubjectKind,
|
||||
addElement,
|
||||
@@ -317,6 +318,15 @@ const RECONSTRUCTION_MODES: Array<{ value: SubjectReconstructionMode; label: str
|
||||
},
|
||||
]
|
||||
|
||||
const SUBJECT_IMAGE_MODEL_OPTIONS: Array<{ value: SubjectImageModelPreference; label: string; detail: string }> = [
|
||||
{ value: "auto", label: "自动", detail: "GPT 失败才兜底" },
|
||||
{ value: "gpt-image-2", label: "GPT", detail: "只用 gpt-image-2" },
|
||||
{ value: "gemini-3-pro-image-preview", label: "Gemini", detail: "直接用 Gemini" },
|
||||
]
|
||||
const SUBJECT_MODEL_MEMORY_KEY = "skg:subject-image-model:v1"
|
||||
const SUBJECT_PROMPT_MEMORY_KEY = "skg:subject-prompt-memory:v1"
|
||||
const SUBJECT_PROMPT_MEMORY_LIMIT = 28
|
||||
|
||||
const SUBJECT_ASSET_SIZE = "2048" as const
|
||||
|
||||
const SUBJECT_PROFILE_CATEGORIES: SubjectProfileCategory[] = [
|
||||
@@ -639,6 +649,77 @@ function resolveSubjectProfile(
|
||||
}
|
||||
}
|
||||
|
||||
function emptySubjectPromptMemory(): Record<SubjectReconstructionMode, string[]> {
|
||||
return { realistic: [], cartoon: [], elements: [], custom: [] }
|
||||
}
|
||||
|
||||
function loadSubjectPromptMemory(): Record<SubjectReconstructionMode, string[]> {
|
||||
if (typeof window === "undefined") return emptySubjectPromptMemory()
|
||||
try {
|
||||
const parsed = JSON.parse(window.localStorage.getItem(SUBJECT_PROMPT_MEMORY_KEY) || "{}") as Partial<Record<SubjectReconstructionMode, string[]>>
|
||||
const next = emptySubjectPromptMemory()
|
||||
for (const mode of Object.keys(next) as SubjectReconstructionMode[]) {
|
||||
next[mode] = Array.isArray(parsed[mode]) ? parsed[mode]!.filter(Boolean).slice(0, SUBJECT_PROMPT_MEMORY_LIMIT) : []
|
||||
}
|
||||
return next
|
||||
} catch {
|
||||
return emptySubjectPromptMemory()
|
||||
}
|
||||
}
|
||||
|
||||
function saveSubjectPromptMemory(memory: Record<SubjectReconstructionMode, string[]>) {
|
||||
if (typeof window === "undefined") return
|
||||
try {
|
||||
window.localStorage.setItem(SUBJECT_PROMPT_MEMORY_KEY, JSON.stringify(memory))
|
||||
} catch {
|
||||
/* localStorage may be unavailable */
|
||||
}
|
||||
}
|
||||
|
||||
function loadSubjectImageModelPreference(): SubjectImageModelPreference {
|
||||
if (typeof window === "undefined") return "auto"
|
||||
const raw = window.localStorage.getItem(SUBJECT_MODEL_MEMORY_KEY)
|
||||
return SUBJECT_IMAGE_MODEL_OPTIONS.some((item) => item.value === raw) ? raw as SubjectImageModelPreference : "auto"
|
||||
}
|
||||
|
||||
function saveSubjectImageModelPreference(value: SubjectImageModelPreference) {
|
||||
if (typeof window === "undefined") return
|
||||
try {
|
||||
window.localStorage.setItem(SUBJECT_MODEL_MEMORY_KEY, value)
|
||||
} catch {
|
||||
/* localStorage may be unavailable */
|
||||
}
|
||||
}
|
||||
|
||||
function subjectPromptChipsFromText(text: string): string[] {
|
||||
const normalized = text.replace(/[,。;;、\n]/g, ",").replace(/\s+/g, " ").trim()
|
||||
const rawParts = normalized.split(",").map((item) => item.trim()).filter(Boolean)
|
||||
const chips: string[] = []
|
||||
const add = (value: string) => {
|
||||
const clean = value.replace(/^需要|^保持|^统一|^加上|^加入|^改成|^不要变/g, "").trim()
|
||||
if (clean.length < 2 || clean.length > 22) return
|
||||
if (!chips.includes(clean)) chips.push(clean)
|
||||
}
|
||||
for (const part of rawParts) {
|
||||
add(part)
|
||||
const matches = part.match(/(不要[^,,。;;、]{1,12}|同一套?[^,,。;;、]{1,10}|统一[^,,。;;、]{1,10}|白色[^,,。;;、]{1,10}|黑色[^,,。;;、]{1,10}|运动[^,,。;;、]{1,10}|亚洲|欧美|女性|男性|年轻|中年|短发|长发|马尾|背心|T恤|瑜伽服|运动装|商业广告感|高级感|科技感|可爱|极简)/g)
|
||||
matches?.forEach(add)
|
||||
}
|
||||
return chips.slice(0, 14)
|
||||
}
|
||||
|
||||
function mergeSubjectPromptMemory(current: string[], text: string) {
|
||||
const chips = subjectPromptChipsFromText(text)
|
||||
return [...chips, ...current.filter((item) => !chips.includes(item))].slice(0, SUBJECT_PROMPT_MEMORY_LIMIT)
|
||||
}
|
||||
|
||||
function appendSubjectPromptChip(text: string, chip: string) {
|
||||
const trimmed = text.trim()
|
||||
if (!trimmed) return chip
|
||||
if (trimmed.includes(chip)) return trimmed
|
||||
return `${trimmed},${chip}`
|
||||
}
|
||||
|
||||
function formatSeconds(raw?: number) {
|
||||
if (!raw || Number.isNaN(raw)) return "0.0s"
|
||||
return `${raw.toFixed(1)}s`
|
||||
@@ -1091,8 +1172,11 @@ function buildSimilarSubjectPrompt(
|
||||
const base = [
|
||||
"Create a new similar but non-identical information-feed ad subject from the selected reference frames.",
|
||||
"Treat all selected frames as evidence for ONE same subject, not multiple different subjects.",
|
||||
"Lock one consistent character bible before generating: same gender presentation, age range, body proportions, head shape, material, silhouette, commercial style, and visual identity across the full multi-view set.",
|
||||
"Default casting rule: inherit the reference frames' broad gender presentation, regional/ethnic appearance category, skin-tone family, body-proportion category, and role energy unless the user explicitly overrides them.",
|
||||
"Lock one consistent character bible before generating: same newly designed person or character, same gender presentation, age range, body proportions, face design, hair design, skin tone, material, silhouette, commercial style, and visual identity across the full multi-view set.",
|
||||
"Lock one wardrobe bible before generating: same garment type, same color palette, same neckline, same sleeve or strap structure, same fabric/material, same fit, and same visible accessories across every view.",
|
||||
"If the user direction asks to change gender, age, or style, apply that single change uniformly to every view; never mix male/female, young/old, or multiple style identities inside one set.",
|
||||
"Never change outfit between views. Do not switch clothing category from front to side to back.",
|
||||
"Keep the pose vocabulary, camera-readability, creator-ad energy, and commercial clarity, but do not copy the exact source identity, face, watermark, captions, platform UI, or pixels.",
|
||||
"This is for SKG neck-and-shoulder wearable massage device videos: keep neck, collarbone, shoulders, side neck, upper back, shoulder blades, and product placement area clean and visible.",
|
||||
"Output high-definition assets suitable for downstream video generation.",
|
||||
@@ -3203,9 +3287,11 @@ function SourceSubjectPipeline({
|
||||
const [activeDropMode, setActiveDropMode] = useState<SubjectReconstructionMode | null>(null)
|
||||
const [conversionFrameIndicesByMode, setConversionFrameIndicesByMode] = useState<Record<SubjectReconstructionMode, number[]>>(() => ({ ...EMPTY_RECONSTRUCTION_FRAME_MAP }))
|
||||
const [reconstructionDirections, setReconstructionDirections] = useState<Record<SubjectReconstructionMode, string>>(() => ({ ...DEFAULT_RECONSTRUCTION_DIRECTIONS }))
|
||||
const [subjectImageModelPreference, setSubjectImageModelPreference] = useState<SubjectImageModelPreference>(() => loadSubjectImageModelPreference())
|
||||
const [promptMemoryByMode, setPromptMemoryByMode] = useState<Record<SubjectReconstructionMode, string[]>>(() => loadSubjectPromptMemory())
|
||||
const [cartoonStyle, setCartoonStyle] = useState<CartoonReconstructionStyle>("3d_animation")
|
||||
const [cartoonStyleOpen, setCartoonStyleOpen] = useState(false)
|
||||
const [subjectBusyFor, setSubjectBusyFor] = useState<{ jobId: string; jobLabel: string; mode: SubjectReconstructionMode; viewCount: number; sourceCount: number; profileLabel: string } | null>(null)
|
||||
const [subjectBusyFor, setSubjectBusyFor] = useState<{ jobId: string; jobLabel: string; mode: SubjectReconstructionMode; viewCount: number; sourceCount: number; profileLabel: string; modelLabel: string } | null>(null)
|
||||
const [subjectAssetBusy, setSubjectAssetBusy] = useState<string | null>(null)
|
||||
const [expandedSubjectPackKey, setExpandedSubjectPackKey] = useState<string | null>(null)
|
||||
const [lastSubjectProfile, setLastSubjectProfile] = useState<ResolvedSubjectProfile | null>(null)
|
||||
@@ -3305,6 +3391,14 @@ function SourceSubjectPipeline({
|
||||
setExpandedSubjectPackKey(null)
|
||||
}, [job.id])
|
||||
|
||||
useEffect(() => {
|
||||
saveSubjectImageModelPreference(subjectImageModelPreference)
|
||||
}, [subjectImageModelPreference])
|
||||
|
||||
useEffect(() => {
|
||||
saveSubjectPromptMemory(promptMemoryByMode)
|
||||
}, [promptMemoryByMode])
|
||||
|
||||
useEffect(() => {
|
||||
if (expandedSubjectPackKey && !subjectAssetPacks.some((pack) => pack.key === expandedSubjectPackKey)) {
|
||||
setExpandedSubjectPackKey(null)
|
||||
@@ -3327,6 +3421,28 @@ function SourceSubjectPipeline({
|
||||
return resolved
|
||||
}
|
||||
|
||||
const rememberPromptForMode = (mode: SubjectReconstructionMode, text = reconstructionDirections[mode]) => {
|
||||
setPromptMemoryByMode((current) => ({
|
||||
...current,
|
||||
[mode]: mergeSubjectPromptMemory(current[mode] || [], text),
|
||||
}))
|
||||
}
|
||||
|
||||
const applyPromptChip = (mode: SubjectReconstructionMode, chip: string) => {
|
||||
setReconstructionDirections((current) => ({
|
||||
...current,
|
||||
[mode]: appendSubjectPromptChip(current[mode], chip),
|
||||
}))
|
||||
setPromptMemoryByMode((current) => ({
|
||||
...current,
|
||||
[mode]: [chip, ...(current[mode] || []).filter((item) => item !== chip)].slice(0, SUBJECT_PROMPT_MEMORY_LIMIT),
|
||||
}))
|
||||
}
|
||||
|
||||
const subjectModelLabel = (value: SubjectImageModelPreference) => {
|
||||
return SUBJECT_IMAGE_MODEL_OPTIONS.find((item) => item.value === value)?.label ?? "自动"
|
||||
}
|
||||
|
||||
const generateSubjectPack = async (mode: SubjectReconstructionMode, sourceIndices = conversionFrameIndicesByMode[mode]) => {
|
||||
if (subjectBusyFor) {
|
||||
toast.warning("主体套图正在生成中,完成后再重生。")
|
||||
@@ -3354,6 +3470,7 @@ function SourceSubjectPipeline({
|
||||
: buildSubjectProfileForRequest()
|
||||
const subjectStyle = reconstructionSubjectStyle(mode)
|
||||
const userDirection = buildReconstructionDirection(mode, reconstructionDirections[mode], cartoonStyle)
|
||||
rememberPromptForMode(mode, reconstructionDirections[mode])
|
||||
const modeName = reconstructionElementName(mode)
|
||||
setSubjectBusyFor({
|
||||
jobId: requestJobId,
|
||||
@@ -3362,6 +3479,7 @@ function SourceSubjectPipeline({
|
||||
viewCount: selectedSubjectViews.length,
|
||||
sourceCount: sourceFrames.length,
|
||||
profileLabel: requestProfile?.summary ?? "按自主描述",
|
||||
modelLabel: subjectModelLabel(subjectImageModelPreference),
|
||||
})
|
||||
try {
|
||||
let workingJob = job
|
||||
@@ -3391,6 +3509,7 @@ function SourceSubjectPipeline({
|
||||
views: selectedSubjectViews,
|
||||
subject_profile: requestProfile?.payload ?? null,
|
||||
prompt: buildSimilarSubjectPrompt(subjectStyle, userDirection, null, requestProfile),
|
||||
image_model_preference: subjectImageModelPreference,
|
||||
replace_views: false,
|
||||
pack_label: `${reconstructionModeConfig(mode).label} ${new Date().toLocaleTimeString("zh-CN", { hour: "2-digit", minute: "2-digit", hour12: false })}`,
|
||||
pack_mode: mode,
|
||||
@@ -3454,6 +3573,7 @@ function SourceSubjectPipeline({
|
||||
? null
|
||||
: lastSubjectProfile ?? buildSubjectProfileForRequest()
|
||||
const subjectStyle = reconstructionSubjectStyle(mode)
|
||||
rememberPromptForMode(mode, reconstructionDirections[mode])
|
||||
const updated = await generateSubjectAssets(job.id, frame.index, element.id, {
|
||||
subject_kind: "living",
|
||||
subject_style: subjectStyle,
|
||||
@@ -3469,6 +3589,7 @@ function SourceSubjectPipeline({
|
||||
null,
|
||||
requestProfile,
|
||||
),
|
||||
image_model_preference: subjectImageModelPreference,
|
||||
replace_views: true,
|
||||
pack_id: asset.pack_id ?? "",
|
||||
pack_label: asset.pack_label ?? "",
|
||||
@@ -3601,6 +3722,30 @@ function SourceSubjectPipeline({
|
||||
<ModelTrace trace={similarSubjectModelTrace(runtimeModels, subjectBusyFor?.mode === "cartoon" ? "cartoon_subject" : "source_actor")} compact />
|
||||
</div>
|
||||
<div className="max-h-[520px] min-h-[410px] overflow-y-auto rounded-md border border-white/10 bg-black/32 p-2 2xl:max-h-[600px] 2xl:min-h-[500px]">
|
||||
<div className="mb-2 rounded-md border border-white/10 bg-black/24 p-1.5">
|
||||
<div className="mb-1.5 flex items-center justify-between gap-2">
|
||||
<span className="text-[10px] font-semibold text-white/62">生图模型</span>
|
||||
<span className="text-[9px] text-white/34">只影响转换层主体套图</span>
|
||||
</div>
|
||||
<div className="grid grid-cols-3 gap-1">
|
||||
{SUBJECT_IMAGE_MODEL_OPTIONS.map((option) => (
|
||||
<button
|
||||
key={option.value}
|
||||
type="button"
|
||||
onClick={() => setSubjectImageModelPreference(option.value)}
|
||||
className={`min-h-9 rounded-md border px-1.5 py-1 text-left transition ${
|
||||
subjectImageModelPreference === option.value
|
||||
? "border-[#d6b36a]/70 bg-[#d6b36a]/16 text-white"
|
||||
: "border-white/10 bg-black/28 text-white/48 hover:border-white/22 hover:text-white"
|
||||
}`}
|
||||
title={option.detail}
|
||||
>
|
||||
<span className="block text-[10px] font-semibold">{option.label}</span>
|
||||
<span className="mt-0.5 block truncate text-[8.5px] text-white/36">{option.detail}</span>
|
||||
</button>
|
||||
))}
|
||||
</div>
|
||||
</div>
|
||||
<div className="mb-2 rounded-md border border-[#d6b36a]/18 bg-[#d6b36a]/[0.06] px-2.5 py-2 text-[10px] leading-snug text-white/62">
|
||||
先拖入 1-3 张参考帧到对应方向,放好后再点击生成;系统只做参考重构,不复制原人、原脸或原画面。
|
||||
</div>
|
||||
@@ -3608,6 +3753,9 @@ function SourceSubjectPipeline({
|
||||
{RECONSTRUCTION_MODES.map((modeConfig) => {
|
||||
const mode = modeConfig.value
|
||||
const modeFrames = conversionFramesByMode[mode]
|
||||
const promptChips = [...subjectPromptChipsFromText(reconstructionDirections[mode]), ...(promptMemoryByMode[mode] || [])]
|
||||
.filter((chip, index, list) => chip && list.indexOf(chip) === index)
|
||||
.slice(0, 10)
|
||||
const dropActive = activeDropMode === mode
|
||||
const canGenerate = mode === "custom"
|
||||
? Boolean(reconstructionDirections.custom.trim() || modeFrames.length)
|
||||
@@ -3715,10 +3863,26 @@ function SourceSubjectPipeline({
|
||||
<textarea
|
||||
value={reconstructionDirections[mode]}
|
||||
onChange={(event) => setReconstructionDirections((current) => ({ ...current, [mode]: event.target.value }))}
|
||||
onBlur={(event) => rememberPromptForMode(mode, event.target.value)}
|
||||
placeholder={modeConfig.placeholder}
|
||||
rows={2}
|
||||
className="mt-2 min-h-[48px] w-full resize-none rounded-md border border-white/10 bg-black/35 px-2.5 py-2 text-[10.5px] leading-snug text-white outline-none placeholder:text-white/28 focus:border-cyan-300/50"
|
||||
/>
|
||||
{promptChips.length ? (
|
||||
<div className="mt-1.5 flex flex-wrap gap-1">
|
||||
{promptChips.map((chip) => (
|
||||
<button
|
||||
key={chip}
|
||||
type="button"
|
||||
onClick={() => applyPromptChip(mode, chip)}
|
||||
className="h-6 rounded-full border border-white/10 bg-black/28 px-2 text-[9.5px] text-white/52 transition hover:border-[#d6b36a]/50 hover:bg-[#d6b36a]/12 hover:text-white"
|
||||
title="点击加入提示词"
|
||||
>
|
||||
{chip}
|
||||
</button>
|
||||
))}
|
||||
</div>
|
||||
) : null}
|
||||
<button
|
||||
type="button"
|
||||
onClick={() => void generateSubjectPack(mode)}
|
||||
@@ -3752,6 +3916,7 @@ function SourceSubjectPipeline({
|
||||
<div className="mb-2 rounded-md border border-cyan-200/20 bg-cyan-300/[0.07] px-2.5 py-2 text-[10px] leading-snug text-cyan-50/70">
|
||||
正在生成{reconstructionModeConfig(subjectBusyFor.mode).label} {subjectBusyFor.viewCount} 张;参考 {subjectBusyFor.sourceCount || "自主描述"}。
|
||||
<span className="mt-1 block text-cyan-50/58">主体设定:{subjectBusyFor.profileLabel}</span>
|
||||
<span className="mt-1 block text-cyan-50/50">生图模型:{subjectBusyFor.modelLabel}</span>
|
||||
</div>
|
||||
) : null}
|
||||
{subjectAssetPacks.length ? (
|
||||
|
||||
@@ -712,6 +712,7 @@ export type AssetSize = "source" | "1024" | "1536" | "2048"
|
||||
export type SubjectKind = "object" | "living"
|
||||
export type SubjectView = string
|
||||
export type SubjectAssetStatus = "queued" | "in_progress" | "completed" | "failed"
|
||||
export type SubjectImageModelPreference = "auto" | "gpt-image-2" | "gemini-3-pro-image-preview"
|
||||
export type SceneMode = "remove_subject" | "similar" | "style"
|
||||
export type SceneStyle = "source" | "premium_product" | "clean_studio" | "warm_lifestyle" | "cinematic"
|
||||
export type SceneAssetRole = "scene" | "first_frame" | "last_frame"
|
||||
@@ -1517,6 +1518,7 @@ export async function generateSubjectAssets(
|
||||
reconstruction_mode?: "same" | "similar"
|
||||
subject_profile?: SubjectProfilePreference | null
|
||||
prompt?: string
|
||||
image_model_preference?: SubjectImageModelPreference
|
||||
replace_views?: boolean
|
||||
pack_id?: string
|
||||
pack_label?: string
|
||||
@@ -1540,6 +1542,7 @@ export async function generateSubjectAssets(
|
||||
reconstruction_mode: body.reconstruction_mode ?? "same",
|
||||
subject_profile: body.subject_profile ?? null,
|
||||
prompt: body.prompt ?? "",
|
||||
image_model_preference: body.image_model_preference ?? "auto",
|
||||
replace_views: body.replace_views ?? false,
|
||||
pack_id: body.pack_id ?? "",
|
||||
pack_label: body.pack_label ?? "",
|
||||
|
||||