HyperAI (超神经)

Zero Shot Video Question Answer On Video Mme 1

Evaluation Metric

Accuracy (%)

Evaluation Results

Performance of each model on this benchmark.

| Model Name | Accuracy (%) | Paper Title | Repository |
|---|---|---|---|
| GPT-4o mini | 68.9 | GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding | - |
| VideoLLaMA2 (72B) | 63.1 | VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs | - |
| BIMBA-LLaVA-Qwen2-7B | 64.67 | BIMBA: Selective-Scan Compression for Long-Range Video Question Answering | - |
| Video-RAG (Based on LLaVA-Video) | 77.4 | Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension | - |
| VILA-1.5 (34B) | 64.1 | VILA: On Pre-training for Visual Language Models | - |
| Gemini 1.5 Pro | 81.3 | Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context | - |
| LongVU (7B) | 60.6 | LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding | - |
| MiniCPM-V 2.6 (8B) | 63.7 | MiniCPM-V: A GPT-4V Level MLLM on Your Phone | - |
| Gemini 1.5 Flash | 75.0 | Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context | - |
| GPT-4o | 77.2 | GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding | - |