<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>TrackOnR: Real-World Point Tracking</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif;
background: #0a0a0a; color: #e0e0e0;
min-height: 100vh; padding: 2rem;
}
.container { max-width: 1200px; margin: 0 auto; }
h1 {
font-size: 2.5rem; font-weight: 700;
background: linear-gradient(135deg, #60a5fa, #a78bfa);
-webkit-background-clip: text; background-clip: text; -webkit-text-fill-color: transparent;
margin-bottom: 0.5rem;
}
.subtitle { color: #888; font-size: 1.1rem; margin-bottom: 0.5rem; }
.meta { color: #666; font-size: 0.9rem; margin-bottom: 2rem; }
.meta a { color: #60a5fa; text-decoration: none; }
.meta a:hover { text-decoration: underline; }
.card {
background: #141414; border: 1px solid #222; border-radius: 12px;
padding: 2rem; margin-bottom: 1.5rem;
}
.card h2 { color: #60a5fa; margin-bottom: 1rem; font-size: 1.3rem; }
.card p, .card li { line-height: 1.8; color: #aaa; }
.card ul { padding-left: 1.5rem; }
.card li { margin-bottom: 0.5rem; }
.highlight { color: #a78bfa; font-weight: 600; }
.tag {
display: inline-block; background: #1e293b; color: #60a5fa;
padding: 0.25rem 0.75rem; border-radius: 6px; font-size: 0.85rem;
margin: 0.25rem 0.25rem 0.25rem 0;
}
table { width: 100%; border-collapse: collapse; margin: 1rem 0; }
th, td {
padding: 0.75rem 1rem; text-align: left;
border-bottom: 1px solid #222;
}
th { color: #60a5fa; font-weight: 600; }
td { color: #aaa; }
.grid { display: grid; grid-template-columns: 1fr 1fr; gap: 1.5rem; }
@media (max-width: 768px) { .grid { grid-template-columns: 1fr; } }
code {
background: #1e1e1e; padding: 0.2rem 0.5rem; border-radius: 4px;
font-family: "SF Mono", Monaco, monospace; font-size: 0.9rem; color: #7dd3fc;
}
.pipeline {
display: flex; align-items: center; gap: 0; flex-wrap: wrap;
margin: 1rem 0;
}
.pipeline-step {
background: #1e293b; padding: 0.75rem 1.25rem; border-radius: 8px;
text-align: center; font-size: 0.9rem; color: #e0e0e0;
}
.pipeline-arrow { color: #60a5fa; font-size: 1.5rem; padding: 0 0.5rem; }
.status-badge {
display: inline-block; background: #164e63; color: #22d3ee;
padding: 0.3rem 0.8rem; border-radius: 20px; font-size: 0.8rem;
font-weight: 600;
}
</style>
</head>
<body>
<div class="container">
<h1>Track-On-R</h1>
<p class="subtitle">Real-World Point Tracking with Verifier-Guided Pseudo-Labeling</p>
<p class="meta">
CVPR 2026 &nbsp;|&nbsp;
Gorkay Aydemir, Fatma Guney, Weidi Xie &nbsp;|&nbsp;
<a href="https://kuis-ai.github.io/track_on_r/" target="_blank">Project Page</a> &nbsp;|&nbsp;
<a href="https://arxiv.org/abs/2603.12217" target="_blank">Paper</a> &nbsp;|&nbsp;
<a href="https://github.com/gorkaydemir/track_on" target="_blank">GitHub</a>
&nbsp;&nbsp;<span class="status-badge">Source cloned</span>
</p>
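<!-- Core concept -->
<div class="card">
<h2>What Is Point Tracking?</h2>
<p>Given any pixel selected in the first frame of a video, the algorithm locates that point precisely in every subsequent frame, even when the target is occluded, the lighting changes, or the object deforms. Point tracking is a foundational computer-vision capability that underpins video editing, robot vision, autonomous driving, AR/VR, and similar applications.</p>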
</div>
<!-- Track-On family -->
<div class="card">
<h2>The Track-On Model Family</h2>
<table>
<tr><th>Model</th><th>Venue</th><th>Key contribution</th></tr>
<tr>
<td>Track-On</td>
<td>ICLR 2025</td>
<td>First to propose online, frame-by-frame point tracking with a compact Transformer memory mechanism</td>
</tr>
<tr>
<td>Track-On2</td>
<td>TPAMI 2026</td>
<td>Improved architecture with better accuracy and efficiency</td>
</tr>
<tr>
<td><span class="highlight">Track-On-R</span></td>
<td>CVPR 2026</td>
<td>Fine-tuning on real videos with verifier-guided pseudo-labels; state-of-the-art (SOTA) results</td>
</tr>
</table>
</div>
<!-- Technical architecture -->
<div class="card">
<h2>Track-On-R Technical Architecture</h2>
<p style="margin-bottom: 1rem;">Three-stage training pipeline:</p>
<div class="pipeline">
<div class="pipeline-step">
<strong>Stage 1</strong><br>
Track-On2<br>
<small style="color:#888">Pretraining on synthetic data<br>(Kubric MOVi-F)</small>
</div>
<span class="pipeline-arrow">&rarr;</span>
<div class="pipeline-step">
<strong>Stage 2</strong><br>
Verifier training<br>
<small style="color:#888">K-Epic dataset<br>Learns to judge tracking quality</small>
</div>
<span class="pipeline-arrow">&rarr;</span>
<div class="pipeline-step">
<strong>Stage 3</strong><br>
Track-On-R<br>
<small style="color:#888">Fine-tuning on real videos<br>Verifier filters pseudo-labels</small>
</div>
</div>
<ul style="margin-top: 1rem;">
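<li><span class="highlight">Online processing</span>: handles the video frame by frame, without needing to see the whole video and backtrack, which suits real-time and streaming scenarios</li>
<li><span class="highlight">Transformer memory</span>: a compact memory module stores information from past frames, balancing accuracy and efficiency</li>
<li><span class="highlight">Verifier guidance</span>: a trained "quality inspector" scores the predictions of 6 teacher models, so that only high-quality pseudo-labels are used for fine-tuning</li>
<li><span class="highlight">DINOv3 backbone</span>: feature extraction based on Meta's DINOv3 ViT-S/16+</li>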
</ul>
</div>
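<div class="card">
<h2>Sketch: Verifier-Guided Pseudo-Label Selection</h2>
<p>The verifier-guided step described above can be illustrated in a few lines of Python. This is a minimal sketch, not the authors' implementation: the <code>select_pseudo_label</code> function, the <code>score_fn</code> callable standing in for the trained verifier, the teacher names used as dictionary keys, and the <code>min_score</code> threshold are all hypothetical. It assumes the verifier returns one scalar score per teacher prediction and keeps the highest-scoring one; the actual method may score at a finer granularity, such as per frame or per point.</p>
<pre>
```python
from typing import Callable, Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (x, y) pixel location in one frame
Track = List[Point]          # one point's location across frames


def select_pseudo_label(
    candidates: Dict[str, Track],
    score_fn: Callable[[Track], float],
    min_score: float = 0.5,
) -> Optional[Tuple[str, Track, float]]:
    """Return (teacher_name, track, score) for the best-scoring candidate.

    `score_fn` is a hypothetical stand-in for the trained verifier.
    Returns None when no candidate reaches `min_score`, so that
    low-quality tracks are dropped instead of used for fine-tuning.
    """
    if not candidates:
        return None
    scored = {name: score_fn(track) for name, track in candidates.items()}
    best = max(scored, key=scored.get)
    if scored[best] >= min_score:
        return best, candidates[best], scored[best]
    return None


if __name__ == "__main__":
    # Toy data: two teachers propose a track for the same query point.
    reference = [(10.0, 10.0), (12.0, 11.0), (14.0, 12.0)]
    candidates = {
        "teacher_a": [(10.0, 10.0), (12.5, 11.0), (14.0, 12.5)],
        "teacher_b": [(10.0, 10.0), (30.0, 40.0), (55.0, 70.0)],
    }

    def toy_score(track: Track) -> float:
        # Toy verifier: a smaller distance to the reference gives a higher score.
        err = sum(abs(x - rx) + abs(y - ry)
                  for (x, y), (rx, ry) in zip(track, reference))
        return 1.0 / (1.0 + err)

    print(select_pseudo_label(candidates, toy_score))
```
</pre>
<p style="margin-top: 0.75rem; font-size: 0.85rem;">Returning <code>None</code> below the threshold reflects the idea that a sample with no trustworthy teacher is better skipped than trained on.</p>
</div>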
<div class="grid">
<!-- Performance -->
<div class="card">
<h2>Performance (δ<sub>avg</sub>)</h2>
<table>
<tr><th>Dataset</th><th>Track-On2</th><th>Track-On-R</th></tr>
<tr><td>DAVIS</td><td>79.9</td><td><span class="highlight">80.3</span></td></tr>
<tr><td>Kinetics</td><td>69.3</td><td><span class="highlight">71.0</span></td></tr>
<tr><td>RoboTAP</td><td>80.5</td><td><span class="highlight">82.6</span></td></tr>
<tr><td>EgoPoints</td><td>61.7</td><td><span class="highlight">67.3</span></td></tr>
<tr><td>Dynamic Replica</td><td>74.5</td><td><span class="highlight">75.1</span></td></tr>
<tr><td>PointOdyssey</td><td>45.1</td><td><span class="highlight">53.4</span></td></tr>
</table>
<p style="margin-top: 0.5rem; font-size: 0.85rem;">After real-world fine-tuning, EgoPoints improves by +5.6 and PointOdyssey by +8.3.</p>
</div>
<!-- Teacher model ensemble -->
<div class="card">
<h2>Teacher Model Ensemble (6 models)</h2>
<ul>
<li>Track-On2 (the model itself)</li>
<li>BootsTAPNext (Google DeepMind)</li>
<li>BootsTAPIR (Google DeepMind)</li>
<li>CoTracker3 window (Meta)</li>
<li>Anthro-LocoTrack (KAIST)</li>
<li>AllTracker</li>
</ul>
<p style="margin-top: 0.75rem; font-size: 0.85rem;">The verifier scores each teacher's prediction and selects the best one as the pseudo-label for training Track-On-R.</p>
</div>
</div>
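<div class="card">
<h2>Sketch: Computing δ<sub>avg</sub></h2>
<p>For readers unfamiliar with the metric in the table above, the following sketch computes δ<sub>avg</sub> under the common TAP-Vid convention. That convention is an assumption here, since this page does not define the metric: the percentage of visible ground-truth points whose prediction falls within 1, 2, 4, 8, and 16 pixels, averaged over the five thresholds. The function and variable names are illustrative only, and benchmark implementations typically also rescale frames to a fixed resolution before measuring.</p>
<pre>
```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) in pixels

# Pixel thresholds assumed from the TAP-Vid convention.
THRESHOLDS = (1.0, 2.0, 4.0, 8.0, 16.0)


def delta_avg(
    pred: Sequence[Point],
    gt: Sequence[Point],
    visible: Sequence[bool],
    thresholds: Sequence[float] = THRESHOLDS,
) -> float:
    """Mean over thresholds of the percentage of visible points whose
    predicted location is closer to the ground truth than the threshold."""
    # Occluded points are excluded from the position-accuracy metric.
    errors: List[float] = [
        math.dist(p, g) for p, g, v in zip(pred, gt, visible) if v
    ]
    if not errors:
        raise ValueError("no visible points to evaluate")
    per_threshold = [
        sum(1 for e in errors if t > e) / len(errors) for t in thresholds
    ]
    return 100.0 * sum(per_threshold) / len(per_threshold)


if __name__ == "__main__":
    gt = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0), (40.0, 40.0)]
    pred = [(10.5, 10.0), (23.0, 20.0), (50.0, 30.0), (0.0, 0.0)]
    visible = [True, True, True, False]  # the last point is occluded
    print(f"delta_avg = {delta_avg(pred, gt, visible):.1f}")
```
</pre>
<p style="margin-top: 0.75rem; font-size: 0.85rem;">Because the thresholds double at each step, the metric rewards both fine localization and coarse robustness, which is why gains of a few points on the harder datasets are meaningful.</p>
</div>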
<!-- Pretrained models -->
<div class="card">
<h2>Pretrained Weights</h2>
<table>
<tr><th>Model</th><th>Training data</th><th>Download</th></tr>
<tr>
<td>Track-On-R</td>
<td>Kubric + real videos</td>
<td><a href="https://huggingface.co/gorkaydemir/track_on_r/resolve/main/track_on_r.pt" style="color:#60a5fa">HuggingFace</a></td>
</tr>
<tr>
<td>Track-On2</td>
<td>Kubric</td>
<td><a href="https://huggingface.co/gorkaydemir/track_on2/resolve/main/trackon2_dinov3_checkpoint.pt" style="color:#60a5fa">HuggingFace</a></td>
</tr>
<tr>
<td>Verifier</td>
<td>K-Epic</td>
<td><a href="https://huggingface.co/gorkaydemir/track_on_r/resolve/main/verifier.pt" style="color:#60a5fa">HuggingFace</a></td>
</tr>
</table>
<p style="margin-top: 0.5rem; font-size: 0.85rem; color: #f59e0b;">
⚠ The DINOv3 backbone weights must be requested separately (Meta license restrictions); they are downloaded automatically on first run.
</p>
</div>
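<div class="card">
<h2>Sketch: Fetching the Checkpoints</h2>
<p>The checkpoints in the table above can also be fetched from a script. The sketch below uses only the Python standard library and the URLs listed on this page. The <code>CHECKPOINTS</code> mapping, the <code>weights/</code> target directory, and the helper names are hypothetical, and the repository may expect the files in a different location. It does not cover the DINOv3 backbone, which is licensed separately.</p>
<pre>
```python
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

# URLs copied from the download table; the dictionary keys are illustrative.
CHECKPOINTS = {
    "track_on_r": "https://huggingface.co/gorkaydemir/track_on_r/resolve/main/track_on_r.pt",
    "track_on2": "https://huggingface.co/gorkaydemir/track_on2/resolve/main/trackon2_dinov3_checkpoint.pt",
    "verifier": "https://huggingface.co/gorkaydemir/track_on_r/resolve/main/verifier.pt",
}


def local_path(url: str, target_dir: Path) -> Path:
    """Map a checkpoint URL to a file in target_dir, named after the URL's last segment."""
    return target_dir / Path(urlparse(url).path).name


def download(name: str, target_dir: Path = Path("weights")) -> Path:
    """Download one checkpoint unless it already exists, and return its local path."""
    url = CHECKPOINTS[name]
    dest = local_path(url, target_dir)
    if not dest.exists():
        target_dir.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(dest))
    return dest
```
</pre>
<p style="margin-top: 0.75rem; font-size: 0.85rem;">Calling <code>download("verifier")</code> would place the file at <code>weights/verifier.pt</code>; skipping files that already exist keeps repeated runs cheap.</p>
</div>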
<!-- Runtime environment -->
<div class="card">
<h2>Runtime Requirements</h2>
<div style="margin-bottom: 1rem;">
<span class="tag">Python 3.12</span>
<span class="tag">PyTorch 2.4.1</span>
<span class="tag">CUDA 12.1</span>
<span class="tag">mmcv 2.2.0</span>
<span class="tag">DINOv3</span>
</div>
<ul>
<li><span class="highlight">NVIDIA GPU required</span>: Macs do not support CUDA and cannot run the model</li>
<li>Recommended GPUs: A100 / RTX 3090 / RTX 4090 / H100</li>
<li>Environment management: <code>mamba</code> or <code>conda</code></li>
</ul>
</div>
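<div class="card">
<h2>Sketch: Preflight Check</h2>
<p>Before installing anything, a quick preflight check can confirm that a machine roughly meets the requirements above. The sketch below is a hypothetical helper using only the standard library: it compares the interpreter against the listed Python 3.12 and looks for the <code>nvidia-smi</code> tool on <code>PATH</code> as a rough proxy for an installed NVIDIA driver. It does not verify the PyTorch, CUDA, or mmcv versions.</p>
<pre>
```python
import platform
import shutil
import sys
from typing import Dict


def preflight() -> Dict[str, bool]:
    """Return coarse pass/fail flags for the requirements listed above."""
    return {
        # The page lists Python 3.12; other versions are untested here.
        "python_3_12": sys.version_info[:2] == (3, 12),
        # macOS has no CUDA support, so it cannot run the model.
        "not_macos": platform.system() != "Darwin",
        # nvidia-smi ships with the NVIDIA driver; its presence is only a hint.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for check, ok in preflight().items():
        print(f"{check}: {'ok' if ok else 'MISSING'}")
```
</pre>
</div>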
<!-- Applications -->
<div class="card">
<h2>Applications</h2>
<div class="grid" style="margin-top: 0.5rem;">
<ul>
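<li><strong>Video editing</strong>: track objects for visual effects, matting, and replacement</li>
<li><strong>Robot vision</strong>: track keypoints on grasp targets</li>
<li><strong>Autonomous driving</strong>: track keypoints on pedestrians and vehicles</li>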
</ul>
<ul>
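<li><strong>Motion analysis</strong>: track the trajectories of athletes' joints</li>
<li><strong>AR/VR</strong>: track spatial anchors in real time</li>
<li><strong>Sign-language recognition</strong>: track finger and gesture keypoints</li>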
</ul>
</div>
</div>
<!-- Local files -->
<div class="card">
<h2>Local Project Structure</h2>
<p style="font-family: monospace; font-size: 0.9rem; line-height: 2;">
<code>source/</code>: Track-On source code (GitHub clone)<br>
<code>source/demo.py</code>: demo script that can be run directly<br>
<code>source/model/</code>: model definitions (<code>Predictor</code> class)<br>
<code>source/config/</code>: training and inference configuration (YAML)<br>
<code>source/evaluation/</code>: evaluation scripts for 6 benchmarks<br>
<code>source/ensemble/</code>: teacher model ensemble<br>
<code>source/verifier/</code>: verifier model<br>
</p>
</div>
<p style="text-align: center; color: #444; margin-top: 2rem; font-size: 0.85rem;">
TrackOnR research page · port 4130 · to be run locally once an NVIDIA GPU is available
</p>
</div>
</body>
</html>