HyperAI (超神经)

Video-to-Sound Generation on VGGSound

Evaluation Metrics

FAD (Fréchet Audio Distance)
FD (Fréchet Distance)
Lower is better for both metrics.
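
Both metrics compare the statistics of real and generated audio embeddings by fitting a Gaussian to each set and measuring the Fréchet distance between the two Gaussians. The sketch below illustrates that idea under a simplifying assumption: it treats every embedding dimension independently (diagonal covariance) so that it needs only the Python standard library. Published FAD and FD scores use full covariance matrices over embeddings from a pretrained audio model, so this function will not reproduce the leaderboard numbers. The function name `frechet_distance_diag` and the toy inputs are hypothetical.

```python
import math
from statistics import fmean, pvariance


def frechet_distance_diag(real, generated):
    """Fréchet distance between two Gaussians with diagonal covariance.

    `real` and `generated` are lists of equal-length embedding vectors.
    Assumption: dimensions are treated as independent, which simplifies
    the full-covariance formula used by actual FAD/FD implementations.
    """
    dims = len(real[0])
    total = 0.0
    for d in range(dims):
        r = [row[d] for row in real]
        g = [row[d] for row in generated]
        mu_r, mu_g = fmean(r), fmean(g)
        var_r, var_g = pvariance(r, mu_r), pvariance(g, mu_g)
        # Squared mean difference plus the per-dimension covariance term.
        total += (mu_r - mu_g) ** 2 + var_r + var_g - 2.0 * math.sqrt(var_r * var_g)
    return total


# Toy embeddings (made up for illustration only).
reference = [[0.0, 1.0], [2.0, 3.0]]
shifted = [[1.0, 1.0], [3.0, 3.0]]
score_same = frechet_distance_diag(reference, reference)
score_shifted = frechet_distance_diag(reference, shifted)
```

Identical sets score zero, and the score grows as the generated distribution drifts from the reference, which is why lower values indicate closer agreement with real audio.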

Results

Performance of each model on this benchmark.

Model | FAD | FD | Paper Title | Repository
ReWas | 2.16 | 15.24 | Read, Watch and Scream! Sound Generation from Text and Video | -
Frieren | 1.32 | 12.26 | Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching | -
MMAudio-S-16kHz | 0.79 | 5.22 | Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis | -
MaskVAT_Hybrid | 2.04 | - | Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity | -
MMAudio-L-44.1kHz | 0.97 | 4.72 | Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis | -
V-AURA | 1.92 | - | Temporally Aligned Audio for Video with Autoregression | -
V2A-Mapper | 0.841 | 24.168 | V2A-Mapper: A Lightweight Solution for Vision-to-Audio Generation by Connecting Foundation Models | -
VATT-LLama | 2.38 | - | Tell What You Hear From What You See -- Video to Audio Generation Through Text | -