Hands play an important role in our daily life. We use our hands for manipulation when working, for emphasis when speaking, for communication in non-verbal environments, and more. Hand gestures are used not only for simple commands, as in traffic control, but also extend into a full language: sign language. In the areas of VR/AR and HCI, understanding the hand and its actions can greatly improve the user experience. This covers a broad range of topics related to hands, including hand detection, tracking, hand pose estimation, gesture recognition, and sign language translation. Four papers are collected in this issue, covering different topics related to hands and gestures.
As a fundamental and important task in virtual-world manipulation and low-attention operation, hand shape and pose estimation attracts many researchers from both academia and industry. Lin Huang et al. present a survey on depth and RGB image-based 3D hand shape and pose estimation, reviewing methods that work from RGB-D inputs. As deep learning methods dominate the area, datasets have become one of the key issues, and the authors also provide an overview of the available datasets. These datasets are widely used in hand and gesture recognition and are a major driving force moving the area forward.
Dynamics are a major feature of gestures. Compared to other human actions, hand gestures are more challenging, as the action can be as small as a single finger movement or as large as the whole body. Several methods have been proposed in recent years to encode dynamics for gesture recognition. Yuanyuan Shi et al. review the progress in dynamic gesture recognition. Three kinds of methods are presented: two-stream CNNs, 3D CNNs, and RNNs (including Long Short-Term Memory (LSTM) networks). In gesture recognition, most researchers focus on gestures from the third-person view. Driven by the requirements of AR, first-person (ego-view) gestures have attracted more attention in recent years, and in this survey the authors also discuss ego-view gesture recognition. Several factors in continuous gesture recognition are discussed as well.
RGB-D cameras have been widely used in gesture recognition since the Xbox Kinect. With the help of depth images, somatosensory games have become popular. Unlike the case in games, however, gesture recognition needs to deal with the hand in detail. Usually the RGB image has a higher resolution than the depth image, which implies that a combination of RGB and depth images can provide a balance between high resolution and depth information. Benjia Zhou et al. propose a method for multi-modal gesture recognition with adaptive cross-fusion learning. The method mines the relationship between the RGB and depth streams and merges the multiple streams through an adaptive one-dimensional convolution. Experiments show its advantage on several datasets, such as IsoGD and NVGesture.
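To give a rough intuition for stream merging of this kind, the sketch below fuses per-frame RGB and depth feature sequences with a learned channel-mixing kernel, which is what a one-dimensional convolution with kernel size 1 over the stacked modalities amounts to. This is a hypothetical simplification for illustration only, not the authors' actual adaptive cross-fusion method; in particular, the kernel here is fixed rather than learned, and `fuse_streams` is an invented name.

```python
import numpy as np

def fuse_streams(rgb_feat, depth_feat, kernel):
    """Merge two modality streams with a 1x1 one-dimensional convolution.

    rgb_feat, depth_feat: shape (T,) feature sequences, one value per frame.
    kernel: shape (1, 2) mixing weights over the two modalities
            (fixed here; a learned, input-dependent kernel would make
            the fusion 'adaptive' in the spirit of the paper).
    """
    # Stack the two modality streams into channels: shape (2, T).
    stacked = np.stack([rgb_feat, depth_feat])
    # A kernel-size-1 conv over channels is a weighted sum per time step.
    fused = kernel @ stacked  # shape (1, T)
    return fused.ravel()

rgb = np.array([0.2, 0.8, 0.5])
depth = np.array([0.6, 0.4, 0.9])
w = np.array([[0.7, 0.3]])  # illustrative weights, not learned
print(fuse_streams(rgb, depth, w))  # [0.32 0.68 0.62]
```

In the full method the mixing weights would be produced by the network itself, so the relative contribution of RGB and depth can vary per sample.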
Sign language is a kind of meaningful and complex gesture. Although there are books on sign language teaching, it is still not easy for a student to learn sign language without a professional tutor. Zhang et al. implement an interactive tutor for Chinese sign language learning on a smartphone. The system provides several learning modes and helps users learn sign language by assessing their actions; 1000 frequently used Chinese sign words are included in the system. The system provides not only the function of learning sign language but also that of an interactive dictionary: playing a sign action retrieves its meaning, and vice versa.
Hands are very important for interacting with both the outside world and other people. As a major part of the dream of natural HCI, understanding the hand and its meaning attracts more and more researchers. In this special issue, four papers cover hand- and gesture-related topics, including hand pose estimation, dynamic gesture recognition, multi-modal gesture recognition, and interactive sign language teaching. We hope this special issue provides an overview of the area and attracts more researchers to work on it.