Action Recognition in Videos on NTU RGB+D

Evaluation Metrics

Accuracy (CS)

Accuracy (CV)
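As an illustration of how a score like these is typically obtained, the sketch below computes top-1 classification accuracy as a percentage. It assumes both metrics are plain top-1 accuracy and that CS (cross-subject) and CV (cross-view) differ only in how clips are split into training and test sets, by performer or by camera view. The function name `top1_accuracy` and the sample labels are hypothetical and are not taken from any model on this page.

```python
def top1_accuracy(predicted, actual):
    """Return the percentage of samples whose predicted label matches the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Hypothetical action-class labels for five test clips (illustrative only).
predicted = [3, 7, 7, 1, 4]
actual = [3, 7, 2, 1, 4]

score = top1_accuracy(predicted, actual)
print(f"{score:.1f}")  # 4 of 5 correct -> 80.0
```

Under this assumption the same function would be applied once to the cross-subject test set and once to the cross-view test set, giving the two numbers reported per model.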

Results

Performance of each model on this benchmark

Model	Accuracy (CS)	Accuracy (CV)	Paper Title	Repository
MMNet (RGB + Pose)	96.0	98.8	MMNet: A Model-Based Multimodal Network for Human Action Recognition in RGB-D Videos
FUSION (IR+Pose)	91.8	94.9	Infrared and 3D skeleton feature fusion for RGB-D action recognition	-
VPN (RGB + Pose)	95.5	98.0	VPN: Learning Video-Pose Embedding for Activities of Daily Living	-
STAR-Transformer (RGB + Pose)	92.0	96.5	STAR-Transformer: A Spatio-temporal Cross Attention Transformer for Human Action Recognition	-
PoseC3D (RGB + Pose)	97.0	99.6	Revisiting Skeleton-based Action Recognition	-
ViewCon (RGB + Pose)	93.7	98.9	Multi-View Action Recognition Using Contrastive Learning
DVANet (RGB only)	93.4	98.1	DVANet: Disentangling View and Action Features for Multi-View Action Recognition	-
3DA (RGB + Pose)	94.3	97.9	Cross-Modal Learning with 3D Deformable Attention for Action Recognition	-
PoseMap (RGB+Pose)	91.7	95.2	Recognizing Human Actions as the Evolution of Pose Estimation Maps	-
PB-GCN (Skeleton only)	87.5	93.2	Part-based Graph Convolutional Network for Action Recognition	-
DSSCA-SSLM (RGB only)	74.9	-	Deep Multimodal Feature Analysis for Action Recognition in RGB+D Videos	-
UMDR (RGB-D)	96.2	98.0	A Unified Multimodal De- and Re-coupling Framework for RGB-D Motion Recognition	-
DSCNet (RGB + Pose)	97.4	99.4	A Dense-Sparse Complementary Network for Human Action Recognition based on RGB and Skeleton Modalities
TSMF (RGB + Pose)	92.5	97.4	Multimodal Fusion via Teacher-Student Network for Indoor Action Recognition
π-ViT (RGB + Pose)	96.3	99.0	Just Add $\pi$! Pose Induced Video Transformers for Understanding Activities of Daily Living	-
Glimpse Clouds (RGB only)	86.6	93.2	Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points	-
MMTM (RGB+Pose)	91.99	-	MMTM: Multimodal Transfer Module for CNN Fusion	-
B2C-AFM (RGB+Pose)	91.7	-	B2C-AFM: Bi-Directional Co-Temporal and Cross-Spatial Attention Fusion Model for Human Action Recognition
EPP-Net (Parsing + Pose)	94.7	97.7	Explore Human Parsing Modality for Action Recognition	-
Hierarchical Action Classification (RGB + Pose)	95.66	98.79	Hierarchical Action Classification with Network Pruning	-
