6. Conclusions
In this paper, we studied vision-based worker action recognition using the Bag-of-Features framework. A state-of-the-art video representation, dense trajectories, was adopted. Three types of descriptors, namely HoG, HoF, and MBH, as well as their combination, were evaluated. A multi-class SVM with a non-linear RBF kernel was applied for training and classification. A new real-world dataset of 1176 video clips covering 11 categories of common worker actions was established for system validation. The dataset involves several challenging situations, such as viewpoint changes, illumination changes, interrupted workflows, and interactions between multiple workers. Experimental results showed that the system achieved its best performance, an average accuracy of 59%, with the MBH descriptor and a codebook size of 500, outperforming the method of Gong et al. [43] by 24%. The system therefore holds promising potential for future real-world applications.
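To make the summarized pipeline concrete, the sketch below illustrates the Bag-of-Features encoding and multi-class RBF-SVM classification stages described above. It is a minimal illustration, not the authors' implementation: dense-trajectory extraction is replaced by randomly generated descriptors, and the clip counts, descriptor dimension, and SVM parameters shown here are assumptions chosen only so the example runs end to end; only the codebook size (500) and the number of action categories (11) come from the paper.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for MBH descriptors extracted along dense trajectories.
# In the real pipeline each clip yields a variable number of descriptor
# vectors; here we fabricate random data purely to illustrate the stages.
n_classes, clips_per_class, desc_dim, codebook_size = 11, 6, 192, 500
n_clips = n_classes * clips_per_class
clip_descriptors = [rng.normal(size=(rng.integers(100, 300), desc_dim))
                    for _ in range(n_clips)]
labels = np.repeat(np.arange(n_classes), clips_per_class)

# 1. Build the visual codebook by clustering a pool of descriptors.
pool = np.vstack(clip_descriptors)
codebook = MiniBatchKMeans(n_clusters=codebook_size, n_init=3,
                           random_state=0).fit(pool)

# 2. Encode each clip as a normalized histogram of codeword assignments
#    (the Bag-of-Features representation).
def bof_histogram(descriptors):
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook_size).astype(float)
    return hist / hist.sum()

X = np.array([bof_histogram(d) for d in clip_descriptors])

# 3. Multi-class SVM with a non-linear RBF kernel (one-vs-one in scikit-learn).
clf = SVC(kernel="rbf", C=10.0, gamma="scale")
scores = cross_val_score(clf, X, labels, cv=3)
print("mean cross-validation accuracy: %.2f" % scores.mean())
```

Because the descriptors here are random noise, the printed accuracy is near chance; the snippet is meant only to show how the codebook, histogram encoding, and SVM classifier fit together.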