Papers
Cutting-edge AI research papers, updated daily, to help you keep up with the latest AI trends

OpenThoughts-Agent: Data Recipes for Agentic Models

LingxiDiagBench: A Multi-Agent Framework for Benchmarking LLMs in Chinese Psychiatric Consultation and Diagnosis

AOHP: An Open-Source OS-Level Agent Harness for Personalized, Efficient and Secure Interaction

MemGUI-Agent: An End-to-End Long-Horizon Mobile GUI Agent with Proactive Context Management

MobileForge: Annotation-Free Adaptation for Mobile GUI Agents with Hierarchical Feedback-Guided Policy Optimization

NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers?

Qwen-AgentWorld: Language World Models for General Agents

Rethinking Training Targets, Architectures and Data Quality for Universal Speech Enhancement

Generative 3D Gaussians with Learned Density Control

TADA: A Generative Framework for Speech Modeling via Text-Acoustic Dual Alignment

Beyond Isolated Words: Diffusion Brush for Handwritten Text-Line Generation

gsplat: An Open-Source Library for Gaussian Splatting

OmniVideo-100K: A Dataset for Audio-Visual Reasoning through Structured Scripts and Evidence Chains

OPEN-SWE-TRACES: Advancing Dual-Mode Multilingual Distillation for Software Engineering Agents

Credit Assignment with Resets in Language Model Reasoning

Unlimited OCR Works: Welcome the Era of One-shot Long-horizon Parsing

PlanBench-XL: Evaluating Long-Horizon Planning of LLM Tool-Use Agents in Large-Scale Tool Ecosystems

OpenRath: Session-Centered Runtime State for Agent Systems

EvoEmbedding: Evolvable Representations for Long-Context Retrieval and Agentic Memory

Learning from Your Own Mistakes: Constructing Learnable Micro-Reflective Trajectories for Self-Distillation

World Action Models: A Survey

KaLM-Reranker-V1: Fast but Not Late Interaction for Compressed Document Reranking

Rethinking Shrinkage Bias in LLM FP4 Pretraining: Geometric Origin, Systemic Impact, and UFP4 Recipe

HydraHead: From Head-Level Functional Heterogeneity to Specialized Attention Hybridization

3DCodeBench: Benchmarking Agentic Procedural 3D Modeling Via Code

RadImageNet-VQA: A Large-Scale CT and MRI Dataset for Radiologic Visual Question Answering

Training Software Engineering Agents and Verifiers with SWE-Gym

MAKIEVAL: A Multilingual Automatic WiKIdata-based Framework for Cultural Awareness Evaluation for LLMs

GeneralVLA-2: Geometry-Aware Reconstruction and Governed Memory for Robot Planning

Multi-Turn Reflective Masking Elicits Reasoning in Mask Diffusion Models

BrainG3N: A Dual-Purpose Tokenizer for Controllable 3D Brain MRI Generation

GateMem: Benchmarking Memory Governance in Multi-Principal Shared-Memory Agents
