The broad application domain of the work presented in this thesis is pattern classification, with a focus on gesture recognition and 3D hand pose estimation. One of the main contributions of this thesis is that it investigates and proposes new similarity measures for similarity search in multimedia databases, applied to the task of 3D hand pose estimation from RGB-D data. At the same time, towards making 3D hand pose estimation methods more automatic, a novel hand segmentation method is introduced, which also relies on depth data. Experimental results demonstrate that the use of depth data increases the discrimination power of the proposed method. On the topic of gesture recognition, a novel method is proposed that combines a well-known similarity measure, namely Dynamic Time Warping (DTW), with a new hand tracking method based on depth frames captured by Microsoft's Kinect™ RGB-Depth sensor. When DTW is combined with the near-perfect hand tracker, gesture recognition accuracy remains high even on very challenging datasets, as demonstrated by experimental results. Another main contribution of this thesis is an extension of the proposed gesture recognition system to handle cases where the user is not standing fronto-parallel to the camera. Our method can recognize gestures captured under various camera viewpoints. In addition, our depth hand tracker is evaluated against a popular open-source user skeleton tracker by examining its performance on random signs from a dataset of American Sign Language (ASL) signs. This evaluation can serve as a benchmark for the assessment of more advanced detection and tracking methods that utilize RGB-D data. The proposed structured motion dataset of ASL signs has been captured in both RGB and depth format using a Microsoft Kinect sensor, and it will enable researchers to explore body part (i.e., hand) detection and tracking methods, as well as gesture recognition algorithms.