UTKinect-Action3D Dataset


Introduction

This dataset was collected as part of research on action recognition from depth sequences. The research is described in detail in the CVPRW 2012 paper "View Invariant Human Action Recognition Using Histograms of 3D Joints".

Dataset

The videos were captured using a single stationary Kinect with the Kinect for Windows SDK Beta Version. There are 10 action types: walk, sit down, stand up, pick up, carry, throw, push, pull, wave hands, clap hands. There are 10 subjects, and each subject performs each action twice. Three channels were recorded: RGB, depth, and skeleton joint locations. The three channels are synchronized. The frame rate is 30 f/s. Note that we only recorded frames in which the skeleton was tracked, so the frame numbers of the files have jumps. The final frame rate is about 15 f/s. (In around 2% of the frames, multiple skeleton records with slightly different joint locations were saved. This is not caused by a second person. You can choose either one.)

In each video, the subject performs the 10 actions in a concatenated fashion; the label of each action segment is given in actionLabel.txt.

The dataset contains 4 parts:

(a) RGB images (.jpg). The resolution is 480x640. download (1.79G)

(b) Depth images (.xml). The resolution is 320x240. They are saved using OpenCV. download (367M)

(c) Skeletal joint locations (.txt). Each row contains the data of one frame: the first number is the frame number, and the following numbers are the (x, y, z) locations of joints 1-20. The x, y, and z values are coordinates relative to the sensor array, in meters. A detailed description of the coordinates can be found here. The indices of the joints are described here. download (3.3M)

(d) Labels of action sequence (4KB)
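
As an illustration of the skeletal joint file layout described in (c), the following is a minimal Python sketch of a reader, not an official loader. It assumes the fields in each row are separated by whitespace, which the description above does not state explicitly; the function names parse_skeleton_line and load_skeleton are hypothetical. For the roughly 2% of frames that have more than one skeleton record, it keeps the first record, which is one of the two choices the dataset description allows.

```python
from typing import Dict, Iterable, List, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) in meters, relative to the sensor
NUM_JOINTS = 20


def parse_skeleton_line(line: str) -> Tuple[int, List[Joint]]:
    """Parse one row: a frame number followed by 20 (x, y, z) triples.

    Assumption: fields are whitespace-separated.
    """
    fields = line.split()
    expected = 1 + 3 * NUM_JOINTS
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    frame = int(float(fields[0]))  # accepts "12" as well as "12.0"
    values = [float(v) for v in fields[1:]]
    joints = [
        (values[i], values[i + 1], values[i + 2])
        for i in range(0, len(values), 3)
    ]
    return frame, joints


def load_skeleton(lines: Iterable[str]) -> Dict[int, List[Joint]]:
    """Map frame number -> joints, keeping the first record per frame."""
    frames: Dict[int, List[Joint]] = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        frame, joints = parse_skeleton_line(line)
        # Duplicate frame numbers: keep the first record seen.
        frames.setdefault(frame, joints)
    return frames
```

Because load_skeleton accepts any iterable of lines, an open file object for one of the skeleton .txt files can be passed to it directly. Joint i (1-20) of a frame is then frames[frame_number][i - 1]. Since only tracked frames were recorded, the frame numbers in the result are not guaranteed to be consecutive.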

Citation

If you make use of the UTKinect-Action3D dataset in any form, please cite the following reference.

@inproceedings{xia2012view,
      title={View invariant human action recognition using histograms of 3D joints},
      author={Xia, L. and Chen, C.C. and Aggarwal, JK},
      booktitle={Computer Vision and Pattern Recognition Workshops (CVPRW), 2012 IEEE Computer Society Conference on},
      pages={20--27},
      year={2012},
      organization={IEEE}
}

If you have any problems, questions, or suggestions regarding the dataset, please contact Lu Xia.


by Lu Xia 2012