TSINGHUA SCIENCE AND TECHNOLOGY
ISSN 1007-0214 15/18 pp329-336
Volume 17, Number 3, June 2012

A Hand Gesture Based Interactive Presentation System Utilizing Heterogeneous Cameras

Bobo Zeng, Guijin Wang**, Xinggang Lin
Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
Abstract: In this paper, a real-time system that utilizes hand gestures to interactively control the presentation is proposed. The system employs a thermal camera for robust human body segmentation to handle the complex background and varying illumination posed by the projector. A fast and robust hand localization algorithm is proposed, with which the head, torso, and arm are sequentially localized. Hand trajectories are segmented and recognized as gestures for interactions. A dual-step calibration algorithm is utilized to map the interaction regions between the thermal camera and the projected contents by integrating a Web camera. Experiments show that the system has a high recognition rate for hand gestures, and corresponding interactions can be performed correctly.
Key words: hand gesture detection; gesture recognition; thermal camera; interactive presentation
Introduction
Interactive presentation systems use advanced Human Computer Interaction (HCI) techniques to provide a more convenient and user-friendly interface for controlling presentation displays, such as page up/down controls in a slideshow. Compared with traditional mouse and keyboard control, the presentation experience is significantly improved with these techniques. Hand gestures have wide-ranging applications[1]. In this study, we apply them to an interactive presentation system to create an easy-to-understand interaction interface.

The key to this system is robust localization of the presenter's interaction hand. Traditionally, visible light cameras are used for this task. However, in the presentation scenario, the complicated and varying illumination cast by the projector changes the appearance of both the presenter and the hand in the visible light camera, invalidating methods such as detecting hand skin color[2] or training hand detectors[3,4]. To address this shortcoming, we introduce a thermal camera whose imaging is invariant to the projector's illumination.
During the presentation, the interaction hand is small and has no salient features, so direct localization is not feasible. Hu et al.[5] built a sophisticated human model, but it carries a heavy computational cost. Iwasawa et al.[6] located the head and hand by analyzing the skeleton heuristically, but only a limited number of postures can be estimated. Juang et al.[7] found significant body points by analyzing convex points on a silhouette contour, but the convex points are not stable when parts of the silhouette's contour are missing. Head localization is important as the first step in many pose estimation methods. Zou et al.[8] extracted quadrant arcs to fit an elliptical head contour, but the arcs must be integrated combinatorially, so the method is still not fast enough.
Calibration of the camera and the projected contents is needed for corresponding interaction regions. Special patterns need to be projected to find the point correspondences in Refs. [9,10], which is inconvenient and not fully automatic. Wang et al.[11] found correspondence between shots and slides by utilizing the positions of recognized texts. However, this approach requires the texts to be presented and recognized robustly, which is not easy to do.

Received: 2011-12-30; revised: 2012-05-07
** To whom correspondence should be addressed. E-mail: wangguijin@tsinghua.edu.cn; Tel: 86-10-62781430
1 System Overview
The system setup is illustrated in Fig. 1. It consists of a projector with a projection screen, a thermal camera with 324×256 resolution, and a Web camera. The two cameras are placed facing the screen and close to each other for better correspondence. The presenter stands in front of the projection screen so the cameras can see the presenter and the screen simultaneously. The presenter uses his/her hand for interaction.

Fig. 1 System hardware setup

The system flowchart is illustrated in Fig. 2. The thermal camera detects and tracks the presenter's hand (gesture detection), and the hand trajectory is segmented and recognized (gesture recognition). Once predefined gestures are recognized, the corresponding interactions are performed and projected to the screen with the projector-camera calibration. The calibration is computed with the thermal camera and the Web camera together to map the interaction regions.

Fig. 2 System flowchart
Robust and fast gesture detection by locating the interaction hand largely determines the system's performance and is our major emphasis. Aware of the difficulties in directly locating hands, we first segment the human body by background modeling with the thermal camera. Based on the segmented human silhouette, we develop a fast hand localization algorithm for our real-time application. The human body is roughly divided into three parts: the head, the torso, and the arm, which are localized sequentially. The hand is finally localized by skeletonizing the arm. With the hand trajectory, we segment and recognize three types of gestures (lining, circling, and pointing) and apply the corresponding interactions given the pre-computed calibration. Direct calibration from the thermal camera to the projected contents is impossible, as no projected contents can be viewed by the thermal camera. Therefore, we develop a dual-step automatic calibration method with the aid of a common Web camera. The first step is thermal camera to Web camera calibration based on line detection, and the second step is Web camera to projected contents (e.g., PPT slide) calibration based on Scale Invariant Feature Transform (SIFT) feature points[12].
2 Gesture Detection
We propose a part-based algorithm for localizing the hand point, comprising a contour-based head detection and shape matching based head tracking module, a histogram-based torso and arm localization module, and a skeleton-based hand localization module. The localization algorithm is based on the observations that the human body is upright and that, during interactions, the hand is located away from the torso, either beside the torso (horizontally) or above the head (vertically). The algorithm flowchart is illustrated in Fig. 3.
2.1 Human body segmentation
Human body segmentation is carried out using the thermal camera, as its images are invariant to the projector's light, which makes them ideal for segmentation. Figure 4c shows a typical image captured by the thermal camera; no content on the projection screen can be observed. We use a Gaussian background model[13] to extract the foreground as the segmented human body. Since the background is simple and static in the indoor environment, a unimodal Gaussian distribution is employed.

For post-processing, erosion and dilation operations are performed to remove noise and connect possible gaps caused by inaccurate segmentation. Then, we use a connected component analysis algorithm[14] to select the largest connected component as the human body and extract its contour. Figure 4 illustrates that the human body is well segmented in the thermal camera. The inferior results of the Web camera are also shown for comparison.
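To make the pipeline concrete, the following sketch shows how the steps above (unimodal Gaussian background subtraction, erosion/dilation, and largest-connected-component selection) could be realized with OpenCV. The per-pixel deviation threshold k and the 3×3 kernel size are illustrative assumptions, not values taken from the paper.

```python
import cv2
import numpy as np

def segment_body(frame, bg_mean, bg_var, k=2.5):
    """Foreground extraction with a per-pixel unimodal Gaussian background.

    frame, bg_mean, bg_var: float32 thermal image, background mean, and
    background variance. A pixel is foreground when it deviates from the
    mean by more than k standard deviations (k is an assumption here).
    """
    fg = (np.abs(frame - bg_mean) > k * np.sqrt(bg_var)).astype(np.uint8) * 255

    # Erosion and dilation: remove noise, then close small gaps.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    fg = cv2.morphologyEx(fg, cv2.MORPH_OPEN, kernel)
    fg = cv2.morphologyEx(fg, cv2.MORPH_CLOSE, kernel)

    # Connected component analysis: keep the largest component as the body.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(fg)
    if n <= 1:
        return None, None
    body_id = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
    body = np.where(labels == body_id, 255, 0).astype(np.uint8)
    contours, _ = cv2.findContours(body, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    return body, max(contours, key=cv2.contourArea)
```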
2.2 Head localization
Head localization is performed for three reasons. First, directly localizing the hand is not feasible due to the lack of salient hand features; in contrast, the head has a salient elliptical shape, which is stable and invariant to head poses (e.g., frontal or profile view). Second, the localized head can guide the localization of the torso and arms by its position and size. Third, since the head and arm are quite different in size, localizing the head first prevents it from being falsely identified as the arm. We propose to detect the head by contour-based ellipse fitting and to track it by robust shape matching.
2.2.1 Contour-based elliptical head detection
The head contour is segmented from the human body contour by using the gradient angle distribution of its elliptical shape. Then, an ellipse is detected by least squares fitting with the head contour's points. The head contour's gradient angle distribution is illustrated in Fig. 5. Searching from the start point (the highest point on the head contour), the gradient angles of the contour points vary from the first quadrant [0, π/2] to the fourth quadrant [−π/2, 0] in the counterclockwise direction, and from the second quadrant [π/2, π] to the third quadrant [−π, −π/2] in the clockwise direction. The gradient angle of each contour point is computed by
$$
\theta = \begin{cases}
\arctan(d_y/d_x), & d_x > 0; \\
\arctan(d_y/d_x) + \pi, & d_x < 0 \text{ and } d_y \geqslant 0; \\
\arctan(d_y/d_x) - \pi, & d_x < 0 \text{ and } d_y < 0
\end{cases} \quad (1)
$$

where $d_x$ and $d_y$ are the x and y derivatives obtained with a [−1 0 1] mask.

Fig. 3 Hand localization flowchart

Fig. 4 Comparison of human body segmentation results of the two cameras: (a) image captured by the Web camera; (b) the inferior segmentation result of the Web camera; (c) image captured by the FLIR thermal camera; and (d) the superior segmentation result of the thermal camera

Fig. 5 Illustration of gradient angle distribution of an elliptical head when searching from the start point in clockwise and counterclockwise directions. The arrow line shows the dx and dy involved in the gradient angle computation.
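Equation (1) is the standard four-quadrant arctangent. As a minimal sketch, assuming the contour is given as an array of (x, y) pixel coordinates and the image has already been smoothed (the paper uses a 5×5 Gaussian kernel), the derivatives and angles could be computed as follows; np.arctan2 implements exactly the three-branch case analysis of Eq. (1):

```python
import numpy as np

def gradient_angles(img, contour):
    """Gradient angle of Eq. (1) at each contour point.

    img: float32 grayscale image (already smoothed);
    contour: (N, 2) integer array of (x, y) contour points.
    """
    # x and y derivatives with the [-1 0 1] mask.
    dx = np.zeros_like(img)
    dy = np.zeros_like(img)
    dx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    dy[1:-1, :] = img[2:, :] - img[:-2, :]

    xs, ys = contour[:, 0], contour[:, 1]
    # np.arctan2 realizes the three branches of Eq. (1),
    # returning angles in (-pi, pi].
    return np.arctan2(dy[ys, xs], dx[ys, xs])
```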
The head detection process is illustrated in Fig. 6. First, the search start point is identified as the intersection of the human body contour and the principal axis of the human body. The principal axis is the vertical line
$$x_p = \arg\max_x h(x) \quad (2)$$
where h(x) is the horizontal histogram of human body pixels. From the start point, the gradient angles are computed along the contour in the counterclockwise and clockwise directions. The image is smoothed by a 5×5 Gaussian kernel before computing the angles to remove noise. The head contour is segmented after traversing the four quadrants in order. The symmetry of the contour segments in the clockwise and counterclockwise directions is employed to reduce false segmentations: if the segment on one side is exceptionally longer than that on the other side, it is truncated to maintain symmetry.
An ellipse is detected with the segmented head contour points by least squares fitting[15] using the ellipse equation. Then, the ellipse's long axis a, short axis b, and the orientation θ to the y-axis are obtained. The ellipse is verified as a valid head if the following conditions are satisfied:

(a) The fitting error is less than a specified threshold.
(b) According to the head's size, the ellipse's aspect ratio a/b should fall into the range [1.0, 2.5], and the orientation should fall into the range [−π/4, π/4].

If the head is unverified, further head tracking is required. An unverified head is illustrated in Fig. 7, which is caused by a wrongly segmented head contour due to distraction by the arm.
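A sketch of this detection step, assuming the head contour segment has already been extracted by the quadrant traversal above. cv2.fitEllipse stands in for the least squares fit of Ref. [15] (it does not expose a fitting residual, so condition (a) is omitted here); the aspect-ratio and orientation checks follow condition (b):

```python
import cv2
import numpy as np

def principal_axis(body_mask):
    """Eq. (2): the column holding the most body pixels."""
    return int(np.argmax(np.count_nonzero(body_mask, axis=0)))

def fit_and_verify_head(head_contour_pts):
    """Ellipse fit and verification (condition (b)) for a head contour.

    head_contour_pts: (N, 2) points of the head contour segment obtained
    by the quadrant traversal. Returns (center, semi-axes, tilt) or None.
    """
    pts = np.asarray(head_contour_pts, dtype=np.float32)
    if len(pts) < 5:                       # cv2.fitEllipse needs >= 5 points
        return None
    (cx, cy), (d1, d2), angle = cv2.fitEllipse(pts)
    a, b = max(d1, d2) / 2.0, min(d1, d2) / 2.0   # semi-axes

    # Reduce OpenCV's box angle (degrees) to a signed tilt from the
    # vertical y-axis; this handling is approximate.
    tilt = angle if angle <= 90.0 else angle - 180.0

    # Condition (b): aspect ratio in [1.0, 2.5], tilt within [-45, 45] deg.
    if not (1.0 <= a / max(b, 1e-6) <= 2.5 and abs(tilt) <= 45.0):
        return None                        # unverified: fall back to tracking
    return (cx, cy), (a, b), tilt
```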
2.2.2 Shape matching based head tracking
When head detection fails, a head template tracking method based on robust shape matching is used to recover the head. We assume that the head has the same size as the last detected head, but is now located in a different position. The task is to find the best state hypothesis $\hat{s}_t = (x_t, y_t)$, which denotes the position of the head at time t. Its probability is measured by
$$\hat{s}_t = \arg\max_{s_t} p(s_t \mid o_t) \quad (3)$$
$$p(s_t \mid o_t) \propto p(o_t \mid s_t)\, p(s_t \mid \hat{s}_{t-1}) \quad (4)$$
where $p(o_t \mid s_t)$ is the observation probability and $p(s_t \mid \hat{s}_{t-1})$ is the state transition probability.
Fig. 6 A detected head showing the segmented principal axis of the human body and head contour with the gradient angle curve. The arrow lines indicate the start point, the left endpoint, the right endpoint of the head contour, and their corresponding positions in the gradient angle curve.

Fig. 7 Head tracking illustration in which the head contour is wrongly segmented due to distraction by the arm; hence, the fitted head is unverified. The last detected head is used as a template, and shape matching is employed to track the head. The head template and its distance transform image are also shown.

For the observation probability, we develop a shape matching measure inspired by chamfer matching[16], a popular fast shape matching technique. In our matching method, the last detected head ellipse T is the "feature template", and the current body silhouette contour I is the "feature image". The original chamfer matching approach does not consider shape incompleteness. However, in our problem, the head contour may be missing at the joint position with the neck or at the possible intersection with the arm when it is lifted above the head. We therefore introduce a robust matching measure that treats matched points with a distance greater than a threshold as outliers and rejects them. We also apply the distance transform to the template image (see Fig. 7 for the head template and distance image) rather than to the feature image, for convenience. The proposed confidence measure is
$$
p(o_t \mid s_t) = \omega \frac{1}{N_T} \sum_{i \in I(s_t),\; d_T(i) \leqslant d_{\rm th}} \frac{d_{\rm th} - d_T(i)}{d_{\rm th}} + (1 - \omega)\frac{N_T}{P_T} \quad (5)
$$
$$
d_{\rm th} = \beta (a + b) \quad (6)
$$
where the first term of Eq. (5) measures the matching confidence, with $d_T(i)$ denoting the chamfer distance between contour point i in the hypothesis image patch $I(s_t)$ and the closest edge point in T, $d_{\rm th}$ denoting the distance threshold, and $N_T$ denoting the number of inlying matches. The second term measures the matching completeness, with $P_T$ denoting the perimeter of the head ellipse in the template and ω denoting the weighting factor between the two terms. $d_{\rm th}$ is set in proportion to the head template's semi-axes a and b, and the coefficient β is set to 0.1.
For the motion model, since the head stays near the principal axis $x_p$ horizontally and obeys motion continuity vertically, the state transition probability is defined as
$$
p(s_t \mid \hat{s}_{t-1}) = \mathcal{N}(x_t - x_p \mid 0, \sigma_x^2)\; \mathcal{N}(y_t - \hat{y}_{t-1} \mid 0, \sigma_y^2) \quad (7)
$$
where $\mathcal{N}$ is the normal distribution, and $\sigma_x^2$ and $\sigma_y^2$ are the variances in the x and y directions, respectively.

The hypothesis space is generated in a neighborhood of $(x_p, \hat{y}_{t-1})$, and the hypothesis with the highest probability is obtained by exhaustive search. Figure 7 gives a head tracking example.
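The tracking step combines the robust truncated chamfer confidence of Eqs. (5) and (6) with the Gaussian motion prior of Eq. (7) in an exhaustive neighborhood search. The sketch below assumes binary masks for the template ellipse and the current silhouette contour; the weighting factor omega, the search radius, and the prior variances are illustrative assumptions:

```python
import cv2
import numpy as np

def track_head(contour_mask, template_mask, x_p, y_prev, a, b,
               beta=0.1, omega=0.7, search=20, sigma_x=8.0, sigma_y=8.0):
    """Exhaustive head search combining Eqs. (5)-(7), sketched.

    contour_mask: binary image of the current body silhouette contour;
    template_mask: binary image of the last detected head ellipse.
    omega, search, sigma_x, sigma_y are illustrative assumptions.
    """
    # Distance transform of the template: distance to the nearest
    # template edge point, so d_T(i) becomes a simple lookup.
    dist_T = cv2.distanceTransform(255 - template_mask, cv2.DIST_L2, 3)
    d_th = beta * (a + b)                                   # Eq. (6)
    P_T = max(np.count_nonzero(template_mask), 1)
    th, tw = template_mask.shape
    H, W = contour_mask.shape

    best, best_score = None, -np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # (x, y) indexes the patch top-left; centering is omitted.
            x, y = int(x_p) + dx, int(y_prev) + dy
            if x < 0 or y < 0 or x + tw > W or y + th > H:
                continue
            patch = contour_mask[y:y + th, x:x + tw]
            d = dist_T[patch > 0]                           # d_T(i)
            inliers = d[d <= d_th]                          # reject outliers
            N_T = len(inliers)
            if N_T == 0:
                continue
            conf = (omega * np.mean((d_th - inliers) / d_th)
                    + (1 - omega) * N_T / P_T)              # Eq. (5)
            prior = np.exp(-dx**2 / (2 * sigma_x**2)
                           - dy**2 / (2 * sigma_y**2))      # Eq. (7)
            score = conf * prior                            # Eq. (4)
            if score > best_score:
                best_score, best = score, (x, y)
    return best
```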
2.3 Hand localization
Compared to the head, localization of the torso and the arm region is relatively easy; Fig. 8 illustrates the results for two different poses. The torso is modeled as a rectangle whose height extends from the bottom of the head to the bottom of the body bounding box. The horizontal histogram of human body pixels below the head is calculated and binarized with a threshold. Then, the longest run of 1s is taken as the torso rectangle's width. Apart from the head and torso, only the arm remains, which is localized as the largest component with connected component analysis.

The identified arm region is skeletonized with the thinning algorithm in Ref. [17], and the two endpoints of the skeleton are extracted. Based on the layout of the head, torso, and arm, the endpoint far away from the body is identified as the initial hand point. Since the skeleton is easily affected by local noise, we further extract the initial hand point's neighborhood, with the hand size estimated from the head size. The centroid of body pixels in this neighborhood is computed as the final hand point.

Fig. 8 The localized head, torso, arm, arm skeleton, and the hand point for two different poses. The horizontal histogram without the head and the threshold for locating the torso are also shown.
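A sketch of the torso/arm/hand chain described above, assuming a binary body mask and the head bounding box from Section 2.2. cv2.ximgproc.thinning (from opencv-contrib) stands in for the thinning algorithm of Ref. [17]; the histogram binarization factor and the endpoint heuristic are illustrative assumptions:

```python
import cv2
import numpy as np

def localize_hand(body_mask, head_box, hand_radius):
    """Torso, arm, and hand-point localization (Section 2.3), sketched.

    head_box: (x0, y0, x1, y1) of the localized head; hand_radius is the
    hand size estimated from the head size.
    """
    h, w = body_mask.shape
    below = body_mask.copy()
    below[:head_box[3], :] = 0                   # body pixels below the head

    # Torso width: longest run of 1s in the binarized horizontal histogram
    # (the 0.5 binarization factor is an assumption).
    hist = np.count_nonzero(below, axis=0)
    on = hist > 0.5 * hist.max()
    runs, start = [], None
    for x, v in enumerate(np.append(on, False)):
        if v and start is None:
            start = x
        elif not v and start is not None:
            runs.append((start, x))
            start = None
    tx0, tx1 = max(runs, key=lambda r: r[1] - r[0])

    # Arm: the largest component left after removing head and torso.
    rest = body_mask.copy()
    rest[head_box[1]:head_box[3], head_box[0]:head_box[2]] = 0
    rest[head_box[3]:, tx0:tx1] = 0
    n, labels, stats, _ = cv2.connectedComponentsWithStats(rest)
    if n <= 1:
        return None
    arm_id = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
    arm = np.where(labels == arm_id, 255, 0).astype(np.uint8)

    # Skeletonize the arm; endpoints are skeleton pixels with one neighbor.
    skel = cv2.ximgproc.thinning(arm)
    ys, xs = np.nonzero(skel)
    nb = cv2.filter2D((skel > 0).astype(np.uint8), -1, np.ones((3, 3), np.uint8))
    ends = [(x, y) for x, y in zip(xs, ys) if nb[y, x] == 2]
    if not ends:
        ends = list(zip(xs, ys))
    cx_t, cy_t = 0.5 * (tx0 + tx1), 0.5 * (head_box[3] + h)
    hx, hy = max(ends, key=lambda p: (p[0] - cx_t)**2 + (p[1] - cy_t)**2)

    # Refine: centroid of body pixels in a hand-sized neighborhood.
    x0, y0 = max(hx - hand_radius, 0), max(hy - hand_radius, 0)
    ys2, xs2 = np.nonzero(body_mask[y0:hy + hand_radius, x0:hx + hand_radius])
    if len(xs2):
        hx, hy = int(xs2.mean()) + x0, int(ys2.mean()) + y0
    return hx, hy
```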
3 Gesture Recognition
The moving trajectory $\{\boldsymbol{p}_i = (x_i, y_i)\}_{i=1}^{N}$ of the hand consists of the localized hand point in each frame. A gesture trajectory is segmented with static start and end positions by keeping the hand stationary for several frames[18]. Features are extracted from the gesture trajectory for recognition. The center of gravity $\hat{\boldsymbol{p}} = (\hat{x}, \hat{y})$ of the segmented trajectory is calculated by
$$\hat{\boldsymbol{p}} = \frac{1}{N}\sum_{i=1}^{N} \boldsymbol{p}_i \quad (8)$$
The distance feature $l_i$ is calculated by
$$L_i = \|\boldsymbol{p}_i - \hat{\boldsymbol{p}}\|_2 = \sqrt{(x_i - \hat{x})^2 + (y_i - \hat{y})^2} \quad (9)$$
$$l_i = \frac{L_i}{L_{\max}}, \quad L_{\max} = \max_{1 \leqslant i \leqslant N} L_i.$$
The orientation feature $\theta_i$ is calculated by
$$\theta_i = \tan^{-1}\frac{y_i - \hat{y}}{x_i - \hat{x}} \quad (10)$$
In the recognition, we use a rule-based approach rather than a learning-based approach (e.g., a Hidden Markov Model (HMM)[19] or a Finite-State Machine (FSM)[20]). Rule-based approaches are simple, fast, and require no offline training. The rules are defined as follows for the three predefined gestures (see the sketch after this list):

Lining: The distance features decrease and then increase linearly, and the orientation features have two opposite values.
Circling: The distance features are constant, and the orientation features have a 2π value range and increase or decrease monotonically.
Pointing: The distance features are near zero.
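A sketch of the feature computation (Eqs. (8)-(10)) and the three rules, for a segmented trajectory given as an (N, 2) array. All numeric thresholds are illustrative assumptions; the paper does not specify them:

```python
import numpy as np

def classify_gesture(traj):
    """Rule-based gesture classification from Eqs. (8)-(10).

    traj: (N, 2) array of hand points for one segmented trajectory.
    """
    p = np.asarray(traj, dtype=float)
    center = p.mean(axis=0)                                       # Eq. (8)
    L = np.linalg.norm(p - center, axis=1)                        # Eq. (9)
    l = L / max(L.max(), 1e-9)                                    # normalized
    theta = np.arctan2(p[:, 1] - center[1], p[:, 0] - center[0])  # Eq. (10)

    # Pointing: distances from the trajectory center stay near zero.
    if L.max() < 5.0:
        return "pointing"

    # Circling: nearly constant radius and an orientation sweep of about
    # 2*pi that is monotonic after unwrapping.
    u = np.unwrap(theta)
    d = np.diff(u)
    monotone = np.all(d > -0.1) or np.all(d < 0.1)
    if l.std() < 0.15 and abs(u[-1] - u[0]) > 1.8 * np.pi and monotone:
        return "circling"

    # Lining: the distance feature decreases linearly and then increases
    # (the two-opposite-orientations check is folded into this profile).
    k = len(l) // 2
    if k >= 2:
        dec = np.polyfit(np.arange(k), l[:k], 1)[0] < 0
        inc = np.polyfit(np.arange(len(l) - k), l[k:], 1)[0] > 0
        if dec and inc:
            return "lining"
    return "unknown"
```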
4 Gesture-Based Interaction
After a gesture is recognized, the interaction region needs to be mapped from the camera to the original contents. Camera-projector calibration is employed to accomplish the mapping.

4.1 Camera-projector calibration
The planar correspondence of pixels on the projection screen between the camera and the original projected contents is determined by a 3×3 homography matrix H as follows:
$$[sx_{\rm p}\ \ sy_{\rm p}\ \ s]^{\rm T} = \boldsymbol{H}\,[x_{\rm c}\ \ y_{\rm c}\ \ 1]^{\rm T} \quad (11)$$
where $(x_{\rm c}, y_{\rm c})$ is the projection screen's point observed in the image, and $(x_{\rm p}, y_{\rm p})$ is the corresponding point in the original contents. At least four such correspondence pairs are required in order to solve for H. Directly finding such pairs with the thermal camera is impossible because of the lack of texture in its imaging. We therefore develop a calibration method in two steps, aided by the Web camera: $\boldsymbol{H} = \boldsymbol{H}_{\rm 2cam} \times \boldsymbol{H}_{\rm 2cont}$, where $\boldsymbol{H}_{\rm 2cam}$ is the homography matrix from the thermal camera to the Web camera, and $\boldsymbol{H}_{\rm 2cont}$ is the homography matrix from the Web camera to the projected contents.
$\boldsymbol{H}_{\rm 2cam}$ is calculated in the first step. The four vertices of the projection screen in both cameras are detected by finding lines using the Hough transform[21] and obtaining the quadrilateral. The vertices in both cameras are used to calculate $\boldsymbol{H}_{\rm 2cam}$. The results are illustrated in Fig. 9.

Fig. 9 Vertices detection with the Hough transform: (a) contrast-adjusted thermal image for viewing the screen better; (b) corner detection in thermal camera images; and (c) corner detection in Web camera images
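A sketch of this first step. Intersecting the strongest near-horizontal and near-vertical Hough lines to obtain the four screen vertices is scene-specific and omitted here; once the four corresponding vertices are available in both cameras, H_2cam follows from a single 4-point estimate:

```python
import cv2
import numpy as np

def screen_border_lines(camera_img):
    """Detect candidate screen border lines with the Hough transform[21].

    camera_img: 8-bit grayscale image; returns (rho, theta) line pairs.
    The Canny and Hough thresholds are illustrative assumptions.
    """
    edges = cv2.Canny(camera_img, 50, 150)
    return cv2.HoughLines(edges, 1, np.pi / 180, 120)

def homography_from_quads(quad_thermal, quad_web):
    """H_2cam from the four screen vertices, given in the same order."""
    src = np.asarray(quad_thermal, dtype=np.float32)   # 4 x 2
    dst = np.asarray(quad_web, dtype=np.float32)       # 4 x 2
    return cv2.getPerspectiveTransform(src, dst)       # 3 x 3
```

With more than four correspondences, cv2.findHomography could be used in place of the exact 4-point solve.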
To get the homography matrix from the Web camera to the projected contents, we employ SIFT keypoints[12] to detect candidate correspondence points. SIFT points are scale-space extrema detected in a set of difference-of-Gaussian images in the scale space. Each keypoint is assigned a location, scale, orientation, and a descriptor; it is thereby invariant to scale and rotation and robust to illumination and viewpoint changes. Given the detected SIFT points in two images, the nearest neighbor matching algorithm from Ref. [12] is used. After this matching step, the corresponding point pairs are only putative. We use RANdom SAmple Consensus (RANSAC)[17] to impose the homography constraint on the previously matched point pairs and to find the correct correspondences, which are finally used to calculate $\boldsymbol{H}_{\rm 2cont}$. The whole process is illustrated in Fig. 10.
Finally, we obtain the final homography by $\boldsymbol{H} = \boldsymbol{H}_{\rm 2cam} \times \boldsymbol{H}_{\rm 2cont}$. Though the image quality of the Web camera was not very good and no special pattern was projected for calibration, the experiment demonstrated the effectiveness of our method: it had a maximum pixel error of less than 8, which is sufficiently precise for the interaction.
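A sketch of the second step and the final composition, using OpenCV's SIFT and RANSAC implementations in place of the original algorithms of Ref. [12] and RANSAC. Note that when the homographies act on column vectors, mapping a thermal-camera point to slide coordinates multiplies the matrices right-to-left:

```python
import cv2
import numpy as np

def homography_to_contents(web_img, slide_img, H_2cam):
    """Second calibration step plus the final composition, sketched."""
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(web_img, None)
    k2, d2 = sift.detectAndCompute(slide_img, None)

    # Nearest-neighbor matching with Lowe's ratio test (Ref. [12]);
    # these matches are only putative.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = [m for m, n in matcher.knnMatch(d1, d2, k=2)
            if m.distance < 0.75 * n.distance]

    # RANSAC imposes the homography constraint on the putative pairs.
    src = np.float32([k1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H_2cont, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    # Applied to column vectors, a thermal point goes through H_2cam
    # first and H_2cont second, i.e. the matrices compose right-to-left.
    return H_2cont @ H_2cam

# Mapping a detected hand point (hx, hy) from the thermal image into
# slide coordinates:
#   pt = cv2.perspectiveTransform(np.float32([[[hx, hy]]]), H)[0, 0]
```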
4.2 Interaction
Upon identifying the interaction region of the recognized gesture with the calibration, the semantic interaction is carried out. In a PPT interactive presentation scenario, lining upward or downward controls the forward or backward slide transitions, respectively; lining leftward or rightward horizontally makes annotations on the slide; circling controls the zoom in/out operations of the interaction regions in the slide; and pointing controls the activation of buttons in the slide. Once a gesture is detected and recognized, the corresponding actions can be performed in the PPT (such as annotation) and projected to the screen. Figures 11a and 11b illustrate a circling gesture and a lining gesture, respectively, in the presentation.
5 Performance Evaluation
We quantitatively evaluate the proposed system in terms of head and hand localization accuracy, gesture recognition rate, and computation time. The system runs on a Windows PC with an Intel Dual Core 3 GHz CPU at 20 frames per second on a single CPU core, without any specific programming optimization. We conduct the experiments in online real-time scenarios with complex slide contents (e.g., colored figures and backgrounds).
The head and hand localization accuracy is key to the system performance and is defined as
$$\text{Localization accuracy} = \frac{\text{Number of correctly localized heads (or hands)}}{\text{Total number of heads (or hands)}}.$$
For the head, correct localization means that the overlap of the localized head ellipse $D_{\rm l}$ and the ground truth $D_{\rm gt}$ must exceed 50%:
$$\mathrm{OverlapRatio} = \frac{\mathrm{area}(D_{\rm l} \cap D_{\rm gt})}{\mathrm{area}(D_{\rm l} \cup D_{\rm gt})} > 0.5.$$
For the hand point, correct localization means that the point is localized in the actual hand region. The evaluation results are shown in Table 1, and Fig. 12 shows some hand point localization results.
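The overlap ratio is the intersection-over-union of the two ellipses. One simple way to evaluate it, sketched below, is to rasterize both ellipses into binary masks:

```python
import cv2
import numpy as np

def head_overlap_ratio(ellipse_l, ellipse_gt, shape):
    """Overlap ratio of localized and ground-truth head ellipses.

    ellipse_l / ellipse_gt: ((cx, cy), (width, height), angle) in the
    rotated-box format used by cv2.ellipse; shape: (rows, cols).
    """
    m_l = np.zeros(shape, np.uint8)
    m_gt = np.zeros(shape, np.uint8)
    cv2.ellipse(m_l, ellipse_l, 255, -1)     # filled ellipse masks
    cv2.ellipse(m_gt, ellipse_gt, 255, -1)
    inter = np.count_nonzero(m_l & m_gt)
    union = np.count_nonzero(m_l | m_gt)
    return inter / union if union else 0.0
```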
Table 2 shows the recognition rates of the three types of gestures. These results show that our system achieves high recognition performance. The main recognition errors are caused by incorrect gesture segmentation.
Table 1 Localization accuracy of the head and hand point

              Number of total    Number correctly localized    Localization accuracy (%)
Head          6895               6839                          99.2
Hand point    5092               5044                          99.1
Fig. 12 Hand point localization results
Table 2 Gesture recognition results

Gesture     Number of gestures    Number recognized    Recognition rate (%)    Overall rate (%)
Lining      84                    81                   96.4
Circling    52                    50                   96.2                    96.7
Pointing    43                    42                   97.7
Fig. 10 Illustration of finding the homography from the Web camera to the projected contents: (a) detected SIFT points in the visible camera and the original contents; (b) the putative matched point pairs; and (c) the final matched pairs after imposing homography constraints with RANSAC
Fig. 11 (a) Circling gesture for zooming into the graph in the presentation; and (b) lining gesture for annotating a PPT slide. The red annotation line is drawn on the original PPT slide and projected to the screen after the lining gesture is recognized.
6 Conclusions
In this study, we design a hand gesture based presentation system by integrating a thermal camera with a Web camera; the system provides a natural and convenient interaction interface for presentations. Quantitative experimental results show that the system performs efficiently. Currently, our gesture interaction is limited to one hand, and we plan to extend the work to handle two hands to enrich the system's interaction capability. The Web camera is only used for calibration, but additional information could be extracted from it for interaction, such as the shape of the hand (e.g., palm or fist). Other HCI applications such as gaming will also be explored in future studies.
Acknowledgments
The authors would like to thank Nokia China Research Center for their support of this research.
References
[1] Erol A, Bebis G, Nicolescu M, et al. Vision-based hand pose estimation: A review. Computer Vision and Image Understanding, 2007, 108(1-2): 52-73.
[2] Wang F, Ngo C W, Pong T C. Simulating a smartboard by real-time gesture detection in lecture videos. IEEE Transactions on Multimedia, 2008, 10(5): 926-935.
[3] Francke H, Ruiz-Del-Solar J, Verschae R. Real-time hand gesture detection and recognition using boosted classifiers and active learning. In: Proceedings of the 2nd Pacific Rim Conference on Advances in Image and Video Technology. Santiago, Chile, 2007: 533-547.
[4] Fang Y K, Wang K Q, Cheng J, et al. A real-time hand gesture recognition method. In: Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, Vols 1-5. Beijing, China, 2007: 995-998.
[5] Hu Z, Wang G, Lin X, et al. Recovery of upper body poses in static images based on joints detection. Pattern Recognition Letters, 2009, 30(5): 503-512.
[6] Iwasawa S, Ebihara K, Ohya J, et al. Real-time estimation of human body posture from monocular thermal images. In: Proceedings of the 1997 IEEE Conference on Computer Vision and Pattern Recognition. San Juan, Puerto Rico, 1997: 15-20.
[7] Juang C F, Chang C M, Wu J R, et al. Computer vision-based human body segmentation and posture estimation. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, 2009, 39(1): 119-133.
[8] Zou W, Li Y, Yuan K, et al. Real-time elliptical head contour detection under arbitrary pose and wide distance range. Journal of Visual Communication and Image Representation, 2009, 20(3): 217-228.
[9] Sukthankar R, Stockton R G, Mullin M D. Smarter presentations: Exploiting homography in camera-projector systems. In: Proceedings of the Eighth IEEE International Conference on Computer Vision. Kauai, HI, USA, 2001: 247-253.
[10] Okatani T, Deguchi K. Autocalibration of a projector-camera system. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(12): 1845-1855.
[11] Wang F, Ngo C W, Pong T C. Lecture video enhancement and editing by integrating posture, gesture, and text. IEEE Transactions on Multimedia, 2007, 9(2): 397-409.
[12] Lowe D G. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 2004, 60(2): 91-110.
[13] Stauffer C, Grimson W E L. Adaptive background mixture models for real-time tracking. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Ft. Collins, CO, USA, 1999: 246-252.
[14] Chang F, Chen C J, Lu C J. A linear-time component-labeling algorithm using contour tracing technique. Computer Vision and Image Understanding, 2004, 93(2): 206-220.
[15] Gander W, Golub G H, Strebel R. Least-squares fitting of circles and ellipses. BIT, 1994, 34(4): 558-578.
[16] Barrow H G, Tenenbaum J M, Bolles R C, et al. Parametric correspondence and chamfer matching: Two new techniques for image matching. In: Proceedings of the 5th International Joint Conference on Artificial Intelligence. San Francisco, CA, USA, 1977: 659-663.
[17] Lam L, Lee S W, Suen C Y. Thinning methodologies: A comprehensive survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1992, 14(9): 869-885.
[18] Davis J, Shah M. Visual gesture recognition. IEE Proceedings - Vision, Image and Signal Processing, 1994, 141(2): 101-106.
[19] Lee H K, Kim J H. An HMM-based threshold model approach for gesture recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1999, 21(10): 961-973.
[20] Bobick A F, Wilson A D. A state-based approach to the representation and recognition of gesture. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, 19(12): 1325-1337.
[21] Duda R O, Hart P E. Use of the Hough transformation to detect lines and curves in pictures. Communications of the ACM, 1972, 15(1): 11-15.
Weitere ähnliche Inhalte

Ähnlich wie A Hand Gesture Based Interactive Presentation System Utilizing Heterogeneous Cameras

A Study on Sparse Representation and Optimal Algorithms in Intelligent Comput...
A Study on Sparse Representation and Optimal Algorithms in Intelligent Comput...A Study on Sparse Representation and Optimal Algorithms in Intelligent Comput...
A Study on Sparse Representation and Optimal Algorithms in Intelligent Comput...MangaiK4
 
A Review Paper on Real-Time Hand Motion Capture
A Review Paper on Real-Time Hand Motion CaptureA Review Paper on Real-Time Hand Motion Capture
A Review Paper on Real-Time Hand Motion CaptureIRJET Journal
 
CHARACTERIZING HUMAN BEHAVIOURS USING STATISTICAL MOTION DESCRIPTOR
CHARACTERIZING HUMAN BEHAVIOURS USING STATISTICAL MOTION DESCRIPTORCHARACTERIZING HUMAN BEHAVIOURS USING STATISTICAL MOTION DESCRIPTOR
CHARACTERIZING HUMAN BEHAVIOURS USING STATISTICAL MOTION DESCRIPTORsipij
 
CROWD ANALYSIS WITH FISH EYE CAMERA
CROWD ANALYSIS WITH FISH EYE CAMERA CROWD ANALYSIS WITH FISH EYE CAMERA
CROWD ANALYSIS WITH FISH EYE CAMERA ijaceeejournal
 
CROWD ANALYSIS WITH FISH EYE CAMERA
CROWD ANALYSIS WITH FISH EYE CAMERACROWD ANALYSIS WITH FISH EYE CAMERA
CROWD ANALYSIS WITH FISH EYE CAMERAijaceeejournal
 
HUMAN BODY DETECTION AND SAFETY CARE SYSTEM FOR A FLYING ROBOT
HUMAN BODY DETECTION AND SAFETY CARE SYSTEM FOR A FLYING ROBOTHUMAN BODY DETECTION AND SAFETY CARE SYSTEM FOR A FLYING ROBOT
HUMAN BODY DETECTION AND SAFETY CARE SYSTEM FOR A FLYING ROBOTcscpconf
 
Obstacle detection for autonomous systems using stereoscopic images and bacte...
Obstacle detection for autonomous systems using stereoscopic images and bacte...Obstacle detection for autonomous systems using stereoscopic images and bacte...
Obstacle detection for autonomous systems using stereoscopic images and bacte...IJECEIAES
 
AN ALGORITHM TO DETECT DRIVER’S DROWSINESS BASED ON NODDING BEHAVIOUR
AN ALGORITHM TO DETECT DRIVER’S DROWSINESS BASED ON NODDING BEHAVIOURAN ALGORITHM TO DETECT DRIVER’S DROWSINESS BASED ON NODDING BEHAVIOUR
AN ALGORITHM TO DETECT DRIVER’S DROWSINESS BASED ON NODDING BEHAVIOURijscmcj
 
5116ijscmc01
5116ijscmc015116ijscmc01
5116ijscmc01ijscmcj
 
An algorithm to detect drivers
An algorithm to detect driversAn algorithm to detect drivers
An algorithm to detect driversijscmcj
 
ROBUST HUMAN TRACKING METHOD BASED ON APPEARANCE AND GEOMETRICAL FEATURES IN ...
ROBUST HUMAN TRACKING METHOD BASED ON APPEARANCE AND GEOMETRICAL FEATURES IN ...ROBUST HUMAN TRACKING METHOD BASED ON APPEARANCE AND GEOMETRICAL FEATURES IN ...
ROBUST HUMAN TRACKING METHOD BASED ON APPEARANCE AND GEOMETRICAL FEATURES IN ...cscpconf
 
Robust Human Tracking Method Based on Apperance and Geometrical Features in N...
Robust Human Tracking Method Based on Apperance and Geometrical Features in N...Robust Human Tracking Method Based on Apperance and Geometrical Features in N...
Robust Human Tracking Method Based on Apperance and Geometrical Features in N...csandit
 
IRJET- Virtual Changing Room using Image Processing
IRJET- Virtual Changing Room using Image ProcessingIRJET- Virtual Changing Room using Image Processing
IRJET- Virtual Changing Room using Image ProcessingIRJET Journal
 
Development of a 3D Interactive Virtual Market System with Adaptive Treadmill...
Development of a 3D Interactive Virtual Market System with Adaptive Treadmill...Development of a 3D Interactive Virtual Market System with Adaptive Treadmill...
Development of a 3D Interactive Virtual Market System with Adaptive Treadmill...toukaigi
 
C015 apwcs10-positioning
C015 apwcs10-positioningC015 apwcs10-positioning
C015 apwcs10-positioningevegod
 
Interactive full body motion capture using infrared sensor network
Interactive full body motion capture using infrared sensor networkInteractive full body motion capture using infrared sensor network
Interactive full body motion capture using infrared sensor networkijcga
 
HUMAN BODY DETECTION AND SAFETY CARE SYSTEM FOR A FLYING ROBOT
HUMAN BODY DETECTION AND SAFETY CARE SYSTEM FOR A FLYING ROBOTHUMAN BODY DETECTION AND SAFETY CARE SYSTEM FOR A FLYING ROBOT
HUMAN BODY DETECTION AND SAFETY CARE SYSTEM FOR A FLYING ROBOTcsandit
 
Pose estimation algorithm for mobile augmented reality based on inertial sen...
Pose estimation algorithm for mobile augmented reality based  on inertial sen...Pose estimation algorithm for mobile augmented reality based  on inertial sen...
Pose estimation algorithm for mobile augmented reality based on inertial sen...IJECEIAES
 
Interactive Full-Body Motion Capture Using Infrared Sensor Network
Interactive Full-Body Motion Capture Using Infrared Sensor Network  Interactive Full-Body Motion Capture Using Infrared Sensor Network
Interactive Full-Body Motion Capture Using Infrared Sensor Network ijcga
 
Trajectory reconstruction for robot programming by demonstration
Trajectory reconstruction for robot programming  by demonstration  Trajectory reconstruction for robot programming  by demonstration
Trajectory reconstruction for robot programming by demonstration IJECEIAES
 

Ähnlich wie A Hand Gesture Based Interactive Presentation System Utilizing Heterogeneous Cameras (20)

A Study on Sparse Representation and Optimal Algorithms in Intelligent Comput...
A Study on Sparse Representation and Optimal Algorithms in Intelligent Comput...A Study on Sparse Representation and Optimal Algorithms in Intelligent Comput...
A Study on Sparse Representation and Optimal Algorithms in Intelligent Comput...
 
A Review Paper on Real-Time Hand Motion Capture
A Review Paper on Real-Time Hand Motion CaptureA Review Paper on Real-Time Hand Motion Capture
A Review Paper on Real-Time Hand Motion Capture
 
CHARACTERIZING HUMAN BEHAVIOURS USING STATISTICAL MOTION DESCRIPTOR
CHARACTERIZING HUMAN BEHAVIOURS USING STATISTICAL MOTION DESCRIPTORCHARACTERIZING HUMAN BEHAVIOURS USING STATISTICAL MOTION DESCRIPTOR
CHARACTERIZING HUMAN BEHAVIOURS USING STATISTICAL MOTION DESCRIPTOR
 
CROWD ANALYSIS WITH FISH EYE CAMERA
CROWD ANALYSIS WITH FISH EYE CAMERA CROWD ANALYSIS WITH FISH EYE CAMERA
CROWD ANALYSIS WITH FISH EYE CAMERA
 
CROWD ANALYSIS WITH FISH EYE CAMERA
CROWD ANALYSIS WITH FISH EYE CAMERACROWD ANALYSIS WITH FISH EYE CAMERA
CROWD ANALYSIS WITH FISH EYE CAMERA
 
HUMAN BODY DETECTION AND SAFETY CARE SYSTEM FOR A FLYING ROBOT
HUMAN BODY DETECTION AND SAFETY CARE SYSTEM FOR A FLYING ROBOTHUMAN BODY DETECTION AND SAFETY CARE SYSTEM FOR A FLYING ROBOT
HUMAN BODY DETECTION AND SAFETY CARE SYSTEM FOR A FLYING ROBOT
 
Obstacle detection for autonomous systems using stereoscopic images and bacte...
Obstacle detection for autonomous systems using stereoscopic images and bacte...Obstacle detection for autonomous systems using stereoscopic images and bacte...
Obstacle detection for autonomous systems using stereoscopic images and bacte...
 
AN ALGORITHM TO DETECT DRIVER’S DROWSINESS BASED ON NODDING BEHAVIOUR
AN ALGORITHM TO DETECT DRIVER’S DROWSINESS BASED ON NODDING BEHAVIOURAN ALGORITHM TO DETECT DRIVER’S DROWSINESS BASED ON NODDING BEHAVIOUR
AN ALGORITHM TO DETECT DRIVER’S DROWSINESS BASED ON NODDING BEHAVIOUR
 
5116ijscmc01
5116ijscmc015116ijscmc01
5116ijscmc01
 
An algorithm to detect drivers
An algorithm to detect driversAn algorithm to detect drivers
An algorithm to detect drivers
 
ROBUST HUMAN TRACKING METHOD BASED ON APPEARANCE AND GEOMETRICAL FEATURES IN ...
ROBUST HUMAN TRACKING METHOD BASED ON APPEARANCE AND GEOMETRICAL FEATURES IN ...ROBUST HUMAN TRACKING METHOD BASED ON APPEARANCE AND GEOMETRICAL FEATURES IN ...
ROBUST HUMAN TRACKING METHOD BASED ON APPEARANCE AND GEOMETRICAL FEATURES IN ...
 
Robust Human Tracking Method Based on Apperance and Geometrical Features in N...
Robust Human Tracking Method Based on Apperance and Geometrical Features in N...Robust Human Tracking Method Based on Apperance and Geometrical Features in N...
Robust Human Tracking Method Based on Apperance and Geometrical Features in N...
 
IRJET- Virtual Changing Room using Image Processing
IRJET- Virtual Changing Room using Image ProcessingIRJET- Virtual Changing Room using Image Processing
IRJET- Virtual Changing Room using Image Processing
 
Development of a 3D Interactive Virtual Market System with Adaptive Treadmill...
Development of a 3D Interactive Virtual Market System with Adaptive Treadmill...Development of a 3D Interactive Virtual Market System with Adaptive Treadmill...
Development of a 3D Interactive Virtual Market System with Adaptive Treadmill...
 
C015 apwcs10-positioning
C015 apwcs10-positioningC015 apwcs10-positioning
C015 apwcs10-positioning
 
Interactive full body motion capture using infrared sensor network
Interactive full body motion capture using infrared sensor networkInteractive full body motion capture using infrared sensor network
Interactive full body motion capture using infrared sensor network
 
HUMAN BODY DETECTION AND SAFETY CARE SYSTEM FOR A FLYING ROBOT
HUMAN BODY DETECTION AND SAFETY CARE SYSTEM FOR A FLYING ROBOTHUMAN BODY DETECTION AND SAFETY CARE SYSTEM FOR A FLYING ROBOT
HUMAN BODY DETECTION AND SAFETY CARE SYSTEM FOR A FLYING ROBOT
 
Pose estimation algorithm for mobile augmented reality based on inertial sen...
Pose estimation algorithm for mobile augmented reality based  on inertial sen...Pose estimation algorithm for mobile augmented reality based  on inertial sen...
Pose estimation algorithm for mobile augmented reality based on inertial sen...
 
Interactive Full-Body Motion Capture Using Infrared Sensor Network
Interactive Full-Body Motion Capture Using Infrared Sensor Network  Interactive Full-Body Motion Capture Using Infrared Sensor Network
Interactive Full-Body Motion Capture Using Infrared Sensor Network
 
Trajectory reconstruction for robot programming by demonstration
Trajectory reconstruction for robot programming  by demonstration  Trajectory reconstruction for robot programming  by demonstration
Trajectory reconstruction for robot programming by demonstration
 

Mehr von Lisa Cain

005 Essay Examples How To Start An With ~ Thatsnotus
005 Essay Examples How To Start An With ~ Thatsnotus005 Essay Examples How To Start An With ~ Thatsnotus
005 Essay Examples How To Start An With ~ ThatsnotusLisa Cain
 
Best Tips For Writing And Editing Admissions Essays
Best Tips For Writing And Editing Admissions EssaysBest Tips For Writing And Editing Admissions Essays
Best Tips For Writing And Editing Admissions EssaysLisa Cain
 
Example Abstract Scientific Article - Eagnlrqfzqwvl
Example Abstract Scientific Article - EagnlrqfzqwvlExample Abstract Scientific Article - Eagnlrqfzqwvl
Example Abstract Scientific Article - EagnlrqfzqwvlLisa Cain
 
Why Are Cause And Effect Essays Written
Why Are Cause And Effect Essays WrittenWhy Are Cause And Effect Essays Written
Why Are Cause And Effect Essays WrittenLisa Cain
 
Writing Paper Printable Now, LetS Make Thi
Writing Paper Printable Now, LetS Make ThiWriting Paper Printable Now, LetS Make Thi
Writing Paper Printable Now, LetS Make ThiLisa Cain
 
Example Of Apa Citation In Paper APA Citation Hando
Example Of Apa Citation In Paper APA Citation HandoExample Of Apa Citation In Paper APA Citation Hando
Example Of Apa Citation In Paper APA Citation HandoLisa Cain
 
How To Write An Introduction For Academic Essay Ske
How To Write An Introduction For Academic Essay SkeHow To Write An Introduction For Academic Essay Ske
How To Write An Introduction For Academic Essay SkeLisa Cain
 
Essays Custom - The Writing Center.
Essays Custom - The Writing Center.Essays Custom - The Writing Center.
Essays Custom - The Writing Center.Lisa Cain
 
Success What Is Success, Essay On Education, Success
Success What Is Success, Essay On Education, SuccessSuccess What Is Success, Essay On Education, Success
Success What Is Success, Essay On Education, SuccessLisa Cain
 
Pencil On The Paper, Close Up Saint Norbert
Pencil On The Paper, Close Up Saint NorbertPencil On The Paper, Close Up Saint Norbert
Pencil On The Paper, Close Up Saint NorbertLisa Cain
 
Citing Research Paper - Reasearch Essay
Citing Research Paper - Reasearch EssayCiting Research Paper - Reasearch Essay
Citing Research Paper - Reasearch EssayLisa Cain
 
Example Of An Introduction For A R
Example Of An Introduction For A RExample Of An Introduction For A R
Example Of An Introduction For A RLisa Cain
 
Palimpsest The Six Sharpened Pencils Of Roald Dahl
Palimpsest The Six Sharpened Pencils Of Roald DahlPalimpsest The Six Sharpened Pencils Of Roald Dahl
Palimpsest The Six Sharpened Pencils Of Roald DahlLisa Cain
 
Fundations Paper - ELEMENTARY LITERACY
Fundations Paper - ELEMENTARY LITERACYFundations Paper - ELEMENTARY LITERACY
Fundations Paper - ELEMENTARY LITERACYLisa Cain
 
How To Write A Formal Letter Learn English
How To Write A Formal Letter Learn EnglishHow To Write A Formal Letter Learn English
How To Write A Formal Letter Learn EnglishLisa Cain
 
Writing Good Hooks Worksheet Unique Make It
Writing Good Hooks Worksheet Unique Make ItWriting Good Hooks Worksheet Unique Make It
Writing Good Hooks Worksheet Unique Make ItLisa Cain
 
Find Best Research Paper Writing Service Reviews Here Discover
Find Best Research Paper Writing Service Reviews Here DiscoverFind Best Research Paper Writing Service Reviews Here Discover
Find Best Research Paper Writing Service Reviews Here DiscoverLisa Cain
 
Kitten Writing Santa A Letter Funny Animal Memes, Fun
Kitten Writing Santa A Letter Funny Animal Memes, FunKitten Writing Santa A Letter Funny Animal Memes, Fun
Kitten Writing Santa A Letter Funny Animal Memes, FunLisa Cain
 
How To Write A Term Paper Properly Guide - WatchMeTech
How To Write A Term Paper Properly Guide - WatchMeTechHow To Write A Term Paper Properly Guide - WatchMeTech
How To Write A Term Paper Properly Guide - WatchMeTechLisa Cain
 
Fish Writing Paper Have Fun Teaching, Writing Paper, W
Fish Writing Paper Have Fun Teaching, Writing Paper, WFish Writing Paper Have Fun Teaching, Writing Paper, W
Fish Writing Paper Have Fun Teaching, Writing Paper, WLisa Cain
 

Mehr von Lisa Cain (20)

005 Essay Examples How To Start An With ~ Thatsnotus
005 Essay Examples How To Start An With ~ Thatsnotus005 Essay Examples How To Start An With ~ Thatsnotus
005 Essay Examples How To Start An With ~ Thatsnotus
 
Best Tips For Writing And Editing Admissions Essays
Best Tips For Writing And Editing Admissions EssaysBest Tips For Writing And Editing Admissions Essays
Best Tips For Writing And Editing Admissions Essays
 
Example Abstract Scientific Article - Eagnlrqfzqwvl
Example Abstract Scientific Article - EagnlrqfzqwvlExample Abstract Scientific Article - Eagnlrqfzqwvl
Example Abstract Scientific Article - Eagnlrqfzqwvl
 
Why Are Cause And Effect Essays Written
Why Are Cause And Effect Essays WrittenWhy Are Cause And Effect Essays Written
Why Are Cause And Effect Essays Written
 
Writing Paper Printable Now, LetS Make Thi
Writing Paper Printable Now, LetS Make ThiWriting Paper Printable Now, LetS Make Thi
Writing Paper Printable Now, LetS Make Thi
 
Example Of Apa Citation In Paper APA Citation Hando
Example Of Apa Citation In Paper APA Citation HandoExample Of Apa Citation In Paper APA Citation Hando
Example Of Apa Citation In Paper APA Citation Hando
 
How To Write An Introduction For Academic Essay Ske
How To Write An Introduction For Academic Essay SkeHow To Write An Introduction For Academic Essay Ske
How To Write An Introduction For Academic Essay Ske
 
Essays Custom - The Writing Center.
Essays Custom - The Writing Center.Essays Custom - The Writing Center.
Essays Custom - The Writing Center.
 
Success What Is Success, Essay On Education, Success
Success What Is Success, Essay On Education, SuccessSuccess What Is Success, Essay On Education, Success
Success What Is Success, Essay On Education, Success
 
Pencil On The Paper, Close Up Saint Norbert
Pencil On The Paper, Close Up Saint NorbertPencil On The Paper, Close Up Saint Norbert
Pencil On The Paper, Close Up Saint Norbert
 
Citing Research Paper - Reasearch Essay
Citing Research Paper - Reasearch EssayCiting Research Paper - Reasearch Essay
Citing Research Paper - Reasearch Essay
 
Example Of An Introduction For A R
Example Of An Introduction For A RExample Of An Introduction For A R
Example Of An Introduction For A R
 
Palimpsest The Six Sharpened Pencils Of Roald Dahl
Palimpsest The Six Sharpened Pencils Of Roald DahlPalimpsest The Six Sharpened Pencils Of Roald Dahl
Palimpsest The Six Sharpened Pencils Of Roald Dahl
 
Fundations Paper - ELEMENTARY LITERACY
Fundations Paper - ELEMENTARY LITERACYFundations Paper - ELEMENTARY LITERACY
Fundations Paper - ELEMENTARY LITERACY
 
How To Write A Formal Letter Learn English
How To Write A Formal Letter Learn EnglishHow To Write A Formal Letter Learn English
How To Write A Formal Letter Learn English
 
Writing Good Hooks Worksheet Unique Make It
Writing Good Hooks Worksheet Unique Make ItWriting Good Hooks Worksheet Unique Make It
Writing Good Hooks Worksheet Unique Make It
 
Find Best Research Paper Writing Service Reviews Here Discover
Find Best Research Paper Writing Service Reviews Here DiscoverFind Best Research Paper Writing Service Reviews Here Discover
Find Best Research Paper Writing Service Reviews Here Discover
 
Kitten Writing Santa A Letter Funny Animal Memes, Fun
Kitten Writing Santa A Letter Funny Animal Memes, FunKitten Writing Santa A Letter Funny Animal Memes, Fun
Kitten Writing Santa A Letter Funny Animal Memes, Fun
 
How To Write A Term Paper Properly Guide - WatchMeTech
How To Write A Term Paper Properly Guide - WatchMeTechHow To Write A Term Paper Properly Guide - WatchMeTech
How To Write A Term Paper Properly Guide - WatchMeTech
 
Fish Writing Paper Have Fun Teaching, Writing Paper, W
Fish Writing Paper Have Fun Teaching, Writing Paper, WFish Writing Paper Have Fun Teaching, Writing Paper, W
Fish Writing Paper Have Fun Teaching, Writing Paper, W
 

Kürzlich hochgeladen

Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsKarinaGenton
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting DataJhengPantaleon
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Celine George
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppCeline George
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application ) Sakshi Ghasle
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 

Kürzlich hochgeladen (20)

Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its Characteristics
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 

A Hand Gesture Based Interactive Presentation System Utilizing Heterogeneous Cameras

  • 1. TSINGHUA SCIENCE AND TECHNOLOGY ISSNll1007-0214ll15/18llpp329-336 Volume 17, Number 3, June 2012 A Hand Gesture Based Interactive Presentation System Utilizing Heterogeneous Cameras Bobo Zeng, Guijin Wang** , Xinggang Lin Department of Electronic Engineering, Tsinghua University, Beijing 100084, China Abstract: In this paper, a real-time system that utilizes hand gestures to interactively control the presentation is proposed. The system employs a thermal camera for robust human body segmentation to handle the complex background and varying illumination posed by the projector. A fast and robust hand localization al- gorithm is proposed, with which the head, torso, and arm are sequentially localized. Hand trajectories are segmented and recognized as gestures for interactions. A dual-step calibration algorithm is utilized to map the interaction regions between the thermal camera and the projected contents by integrating a Web cam- era. Experiments show that the system has a high recognition rate for hand gestures, and corresponding interactions can be performed correctly. Key words: hand gesture detection; gesture recognition; thermal camera; interactive presentation Introduction Interactive presentation systems use advanced Human Computer Interaction (HCI) techniques to provide a more convenient and user-friendly interface for con- trolling presentation displays, such as page up/down controls in a slideshow. Compared with traditional mouse and keyboard control, the presentation experi- ence is significantly improved with these techniques. Hand gesture has wide-ranging applications[1] . In this study, we apply it to an interactive presentation system to create an easy-to-understand interaction interface. The key to this system is robust localization of the presenter’s interaction hand. Traditionally, visible light cameras are used for this task. However, in the presen- tation scenario, complicated and varying illumination by the projector changes the appearance of both the presenter and the hand in the visible light camera, in- validating methods such as detecting hand skin color[2] or training hand detectors[3,4] . To address this short- coming, we introduce a thermal camera whose imaging is invariant to the projector’s illumination. During the presentation, the interaction hand is small and no salient features exist, so direct localiza- tion is not feasible. Hu et al.[5] built a sophisticated human model, but it is burdened by a heavy computa- tional cost. Iwasawa et al.[6] located the head and hand by analyzing the skeleton heuristically. However, only a limited number of postures can be estimated. Juang et al.[7] found significant body points by analyzing convex points on a silhouette contour, but the convex points are not stable when the silhouette’s contour is missing. Head localization is important as the first step in many pose estimation methods. Zou et al.[8] ex- tracted quadrant arcs to fit an elliptical head contour, which needs to integrate the arcs combinatorially, so it is still not fast enough. Calibration of the camera and the projected contents is needed for corresponding interaction regions. Spe- cial patterns need to be projected to find the point correspondences in Refs. [9,10], which are inconven- ient and not fully automatic. Wang et al.[11] found Received: 2011-12-30; revised: 2012-05-07 ** To whom correspondence should be addressed. E-mail: wangguijin@tsinghua.edu.cn; Tel: 86-10-62781430
correspondence between shots and slides by utilizing the positions of recognized texts. However, this approach requires the texts to be present and robustly recognized, which is not easy to guarantee.

1 System Overview

The system setup is illustrated in Fig. 1. It consists of a projector with a projection screen, a thermal camera with 324×256 resolution, and a Web camera. The two cameras are placed facing the screen and close to each other for better correspondence. The presenter stands in front of the projection screen so that the cameras can see the presenter and the screen simultaneously. The presenter uses his/her hand for interaction.

Fig. 1 System hardware setup

The system flowchart is illustrated in Fig. 2. The thermal camera detects and tracks the presenter's hand (gesture detection), and the hand trajectory is segmented and recognized (gesture recognition). Once predefined gestures are recognized, the corresponding interactions are performed and projected to the screen using the projector-camera calibration. The calibration is computed with the thermal camera and the Web camera together to map the interaction regions.

Fig. 2 System flowchart

Robust and fast gesture detection by locating the interaction hand largely determines the system's performance and is therefore our major emphasis. Since directly locating the hand is difficult, we first segment the human body by background modeling with the thermal camera. Based on the segmented human silhouette, we develop a fast hand localization algorithm for our real-time application. The human body is roughly divided into three parts (the head, the torso, and the arm), which are localized sequentially; the hand is finally localized by skeletonizing the arm. With the hand trajectory, we segment and recognize three types of gestures (lining, circling, and pointing) and apply the corresponding interactions given the pre-computed calibration. Direct calibration from the thermal camera to the projected contents is impossible, as no projected contents can be seen by the thermal camera. Therefore, we develop a dual-step automatic calibration method with the aid of a common Web camera: the first step calibrates the thermal camera to the Web camera based on line detection, and the second step calibrates the Web camera to the projected contents (e.g., a PPT slide) based on Scale Invariant Feature Transform (SIFT) feature points[12].

2 Gesture Detection

We propose a part-based algorithm for localizing the hand point, comprising a contour-based head detection and shape matching based head tracking module, a histogram-based torso and arm localization module, and a skeleton-based hand localization module. The algorithm rests on two observations: the human body is upright, and during interaction the hand is located away from the torso, either beside the torso (horizontally) or above the head (vertically). The algorithm flowchart is illustrated in Fig. 3.

2.1 Human body segmentation

Human body segmentation is carried out with the thermal camera, as its images are invariant to the projector's light, which makes it ideal for segmentation.
Figure 4c shows a typical image captured by the thermal camera; no content on the projection screen can be observed. We use a Gaussian background model[13] to extract the foreground as the segmented human body. Since the background is simple and static in the indoor environment, a unimodal Gaussian distribution suffices. For post-processing, erosion and dilation operations remove noise and connect the possible gaps caused by inaccurate segmentation. Then, a connected component analysis algorithm[14] selects the largest connected component as the human body and extracts its contour. Figure 4 illustrates that the human body is well segmented in the thermal camera; the inferior results of the Web camera are shown for comparison.

Fig. 3 Hand localization flowchart

Fig. 4 Comparison of human body segmentation results of the two cameras: (a) image captured by the Web camera; (b) the inferior segmentation result of the Web camera; (c) image captured by the FLIR thermal camera; and (d) the superior segmentation result of the thermal camera
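This segmentation step can be sketched as a per-pixel unimodal Gaussian model followed by the morphological cleanup and largest-component selection described above. The following Python/OpenCV code is an illustrative approximation, not the authors' implementation; the initial variance, learning rate, and threshold factor k are assumed values.

```python
import cv2
import numpy as np

class GaussianBackground:
    """Per-pixel unimodal Gaussian background model for a static grayscale scene."""

    def __init__(self, first_frame, alpha=0.01, k=3.0):
        f = first_frame.astype(np.float32)
        self.mean = f.copy()
        self.var = np.full_like(f, 25.0)   # initial variance (assumed)
        self.alpha, self.k = alpha, k      # learning rate and threshold factor

    def segment(self, frame):
        f = frame.astype(np.float32)
        d2 = (f - self.mean) ** 2
        fg = d2 > (self.k ** 2) * self.var           # foreground test
        bg = ~fg
        # Update the model only where the pixel matches the background.
        self.mean[bg] += self.alpha * (f - self.mean)[bg]
        self.var[bg] += self.alpha * (d2 - self.var)[bg]
        mask = fg.astype(np.uint8) * 255
        # Erosion/dilation: remove noise, then close small gaps.
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
        mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
        # Keep only the largest connected component as the human body.
        n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
        if n <= 1:
            return np.zeros_like(mask)
        largest = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
        return (labels == largest).astype(np.uint8) * 255
```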
2.2 Head localization

Head localization is performed for three reasons. First, directly localizing the hand is not feasible because it lacks salient features, whereas the head has a salient elliptical shape that is stable and invariant to head pose (e.g., frontal or profile view). Second, the localized head can guide the localization of the torso and arm through its position and size. Third, since the head and arm differ considerably in size, localizing the head first prevents it from being falsely identified as the arm. We propose to detect the head by contour-based ellipse fitting and to track it by robust shape matching.

2.2.1 Contour-based elliptical head detection

The head contour is segmented from the human body contour by using the gradient angle distribution of its elliptical shape; an ellipse is then detected by least squares fitting on the head contour points. The gradient angle distribution is illustrated in Fig. 5. Searching from the start point (the highest point on the head contour), the gradient angles of the contour points vary from the first quadrant [0, π/2] to the fourth quadrant [−π/2, 0] in the counterclockwise direction, and from the second quadrant [π/2, π] to the third quadrant [−π, −π/2] in the clockwise direction. The gradient angle of each contour point is computed by

$\theta = \begin{cases} \arctan(d_y / d_x), & d_x > 0; \\ \arctan(d_y / d_x) + \pi, & d_x < 0 \text{ and } d_y \geqslant 0; \\ \arctan(d_y / d_x) - \pi, & d_x < 0 \text{ and } d_y < 0 \end{cases}$   (1)

where d_x and d_y are the x and y derivatives obtained with a [−1 0 1] mask.

Fig. 5 Illustration of gradient angle distribution of an elliptical head when searching from the start point in clockwise and counterclockwise directions. The arrow line shows the d_x and d_y involved in the gradient angle computation.

The head detection process is illustrated in Fig. 6. First, the search start point is identified as the intersection of the human body contour and the principal axis of the body. The principal axis is the vertical line

$x_\mathrm{p} = \arg\max_x h(x)$   (2)

where h(x) is the horizontal histogram of human body pixels. From the start point, the gradient angles are computed along the contour in the counterclockwise and clockwise directions; the image is smoothed by a 5×5 Gaussian kernel beforehand to suppress noise. The head contour is segmented once the four quadrants have been traversed in order. The symmetry of the contour segments in the two directions is exploited to reduce false segmentations: if the segment on one side is exceptionally longer than that on the other side, it is truncated to maintain symmetry.

An ellipse is fitted to the segmented head contour points by least squares[15] using the ellipse equation, yielding the long axis a, the short axis b, and the orientation θ relative to the y-axis. The ellipse is verified as a valid head if two conditions are satisfied: (a) the fitting error is less than a specified threshold; and (b) consistent with typical head proportions, the aspect ratio a/b falls into [1.0, 2.5] and the orientation falls into [−π/4, π/4]. If the head is not verified, head tracking is required. An unverified head is illustrated in Fig. 7; it is caused by a wrongly segmented head contour resulting from distraction by the arm.

Fig. 6 A detected head showing the segmented principal axis of the human body and head contour with the gradient angle curve. The arrow lines indicate the start point, the left endpoint, and the right endpoint of the head contour, and their corresponding positions in the gradient angle curve.
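The detection step can be condensed into the following sketch. It is only a rough approximation under stated assumptions: the head contour is taken as the silhouette contour near the principal axis and start point rather than by the quadrant traversal above, cv2.fitEllipse stands in for the authors' least squares fit, and the fitting-error check is omitted.

```python
import cv2
import numpy as np

def detect_head(body_mask):
    """Contour-based elliptical head detection (illustrative sketch).

    body_mask: uint8 binary human silhouette. Returns the ellipse
    ((cx, cy), (d1, d2), angle) from cv2.fitEllipse, or None if the
    verification conditions fail.
    """
    contours, _ = cv2.findContours(body_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    if not contours:
        return None
    body = max(contours, key=cv2.contourArea).reshape(-1, 2)
    h = body_mask.shape[0]
    # Principal axis: the column maximizing the body-pixel histogram (Eq. (2)).
    x_p = int(np.argmax((body_mask > 0).sum(axis=0)))
    on_axis = body[np.abs(body[:, 0] - x_p) <= 1]
    if len(on_axis) == 0:
        return None
    y_start = int(on_axis[:, 1].min())     # start point: highest point on the axis
    # Assumed head band: contour points close to the axis and the start point.
    near = body[(body[:, 1] <= y_start + h // 6) &
                (np.abs(body[:, 0] - x_p) <= h // 8)]
    if len(near) < 5:                      # cv2.fitEllipse needs >= 5 points
        return None
    ellipse = cv2.fitEllipse(near.astype(np.float32))
    (_, (d1, d2), angle) = ellipse
    a, b = max(d1, d2) / 2.0, min(d1, d2) / 2.0
    if b < 1e-3:
        return None
    tilt = angle if angle <= 90 else angle - 180   # rough deviation from vertical
    # Verification: aspect ratio in [1.0, 2.5], orientation within +/- pi/4.
    if not (1.0 <= a / b <= 2.5) or abs(tilt) > 45:
        return None
    return ellipse
```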
Fig. 7 Head tracking illustration in which the head contour is wrongly segmented due to distraction by the arm; hence, the fitted head is unverified. The last detected head is used as a template, and shape matching is employed to track the head. The head template and its distance transform image are also shown.

2.2.2 Shape matching based head tracking

When head detection fails, a head template tracking method based on robust shape matching is used to recover the head. We assume that the head has the same size as the last detected head but is located at a different position. The task is to find the best state hypothesis ŝ_t = (x_t, y_t), the position of the head at time t:

$\hat{s}_t = \arg\max_{s_t} p(s_t \mid o_t)$   (3)

$p(s_t \mid o_t) \propto p(o_t \mid s_t)\, p(s_t \mid \hat{s}_{t-1})$   (4)

where p(o_t | s_t) is the observation probability and p(s_t | ŝ_{t−1}) is the state transition probability.

For the observation probability, we develop a shape matching measure inspired by chamfer matching[16], a popular fast shape matching technique. In our method, the last detected head ellipse T is the "feature template", and the current body silhouette contour I is the "feature image". The original chamfer matching approach does not consider shape incompleteness; in our problem, however, the head contour may be missing at the joint with the neck or at the intersection with the arm when the arm is lifted above the head. We therefore introduce a robust matching measure that treats matched points whose distance exceeds a threshold as outliers and rejects them. For convenience, we also apply the distance transform to the template image (see Fig. 7 for the head template and its distance image) rather than to the feature image. The proposed confidence measure is

$p(o_t \mid s_t) = \dfrac{\omega}{N_T} \sum_{i \in I(s_t),\, d_T(i) \leqslant d_{\mathrm{th}}} \left(1 - \dfrac{d_T(i)}{d_{\mathrm{th}}}\right) + (1 - \omega)\dfrac{N_T}{P_T}$   (5)

$d_{\mathrm{th}} = \beta (a + b)$   (6)

where the first term of Eq. (5) measures the matching confidence, with d_T(i) denoting the chamfer distance between contour point i in the hypothesis image patch I(s_t) and the closest edge point in T, d_th the distance threshold, and N_T the number of inlying matches. The second term measures the matching completeness, with P_T denoting the perimeter of the template head ellipse and ω the weighting factor between the two terms. d_th is set in proportion to the head template's semi-axes a and b, with the coefficient β set to 0.1.

For the motion model, since the head stays near the principal axis x_p horizontally and obeys motion continuity vertically, the state transition probability is defined as

$p(s_t \mid \hat{s}_{t-1}) = \mathcal{N}(x_t - x_\mathrm{p} \mid 0, \sigma_x^2)\, \mathcal{N}(y_t - \hat{y}_{t-1} \mid 0, \sigma_y^2)$   (7)

where N denotes the normal distribution, and σ_x² and σ_y² are the variances in the x and y directions, respectively. The hypothesis space is generated in a neighborhood of (x_p, ŷ_{t−1}), and the hypothesis with the highest probability is found by exhaustive search. Figure 7 gives a head tracking example.
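Under this formulation, the observation probability of Eqs. (5) and (6) reduces to a few lines of NumPy. In the sketch below, template_dt is assumed to be the distance transform of the template ellipse's edge image, contour points are given in template-patch coordinates, and ω = 0.5 is an assumed weighting.

```python
import numpy as np

def observation_prob(template_dt, contour_pts, perimeter, a, b,
                     beta=0.1, omega=0.5):
    """Robust chamfer-style confidence, mirroring Eqs. (5) and (6).

    template_dt: distance transform of the template ellipse's edge image.
    contour_pts: (N, 2) integer (x, y) silhouette contour points of the
    hypothesis patch, in template coordinates.
    perimeter:   perimeter P_T of the template head ellipse.
    beta = 0.1 follows the text; omega = 0.5 is assumed.
    """
    d_th = beta * (a + b)                            # Eq. (6): distance threshold
    d = template_dt[contour_pts[:, 1], contour_pts[:, 0]]
    inlier = d <= d_th                               # reject distant points as outliers
    n_t = int(inlier.sum())
    if n_t == 0:
        return 0.0
    confidence = float(np.mean(1.0 - d[inlier] / d_th))   # first term of Eq. (5)
    completeness = n_t / perimeter                        # second term of Eq. (5)
    return omega * confidence + (1.0 - omega) * completeness
```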
2.3 Hand localization

Compared with the head, localization of the torso and arm is relatively easy; Fig. 8 illustrates the results for two different poses. The torso is modeled as a rectangle whose height extends from the bottom of the head to the bottom of the body bounding box. The horizontal histogram of human body pixels below the head is calculated and binarized with a threshold, and the longest run of 1s gives the width of the torso rectangle. Apart from the head and torso, only the arm remains; it is localized as the largest remaining component by connected component analysis.

The identified arm region is skeletonized with the thinning algorithm in Ref. [17], and the two endpoints of the skeleton are extracted. Based on the layout of the head, torso, and arm, the endpoint farther from the body is identified as the initial hand point. Since the skeleton is easily affected by local noise, we further extract the initial hand point's neighborhood, with the hand size estimated from the head size, and compute the centroid of the body pixels in this neighborhood as the final hand point.

Fig. 8 The localized head, torso, arm, arm skeleton, and the hand point for two different poses. The horizontal histogram without the head and the threshold for locating the torso are also shown.
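A rough sketch of this step follows, under several assumptions: it uses cv2.ximgproc.thinning (from opencv-contrib) in place of the thinning algorithm of Ref. [17], binarizes the histogram at an assumed factor of 0.5 of its maximum, treats the skeleton point farthest from the torso center as the hand (a proxy for the far endpoint), assumes the hand is beside the torso rather than above the head, and omits the centroid refinement.

```python
import cv2
import numpy as np

def locate_hand(body_mask, head_bottom_y):
    """Torso-then-arm hand localization on a binary silhouette (rough sketch)."""
    below = body_mask[head_bottom_y:, :]
    hist = (below > 0).sum(axis=0)
    if hist.max() == 0:
        return None
    # Binarize the horizontal histogram; 0.5 * max is an assumed threshold.
    runs = hist > 0.5 * hist.max()
    # The longest run of 1s gives the torso rectangle's width.
    best_len, best_start, start = 0, 0, None
    for x, v in enumerate(np.append(runs, False)):
        if v and start is None:
            start = x
        elif not v and start is not None:
            if x - start > best_len:
                best_len, best_start = x - start, start
            start = None
    torso_x0, torso_x1 = best_start, best_start + best_len
    # Remove head rows and torso columns; the largest remaining blob is the arm.
    arm_mask = body_mask.copy()
    arm_mask[:head_bottom_y, :] = 0
    arm_mask[:, torso_x0:torso_x1] = 0
    n, labels, stats, _ = cv2.connectedComponentsWithStats(arm_mask)
    if n <= 1:
        return None
    big = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
    arm = (labels == big).astype(np.uint8) * 255
    # Skeletonize the arm and take the point farthest from the torso center.
    skeleton = cv2.ximgproc.thinning(arm)
    ys, xs = np.nonzero(skeleton)
    if len(xs) == 0:
        return None
    cx = 0.5 * (torso_x0 + torso_x1)
    k = int(np.argmax(np.abs(xs - cx)))
    return int(xs[k]), int(ys[k])
```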
3 Gesture Recognition

The moving hand trajectory $\{\mathbf{p}_i = (x_i, y_i)\}_{i=1}^{N}$ consists of the localized hand point in each frame. A gesture trajectory is segmented with static start and end positions by keeping the hand stationary for several frames[18], and features are extracted from the segmented trajectory for recognition. The center of gravity p̂ = (x̂, ŷ) of the segmented trajectory is calculated by

$\hat{\mathbf{p}} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{p}_i$   (8)

The distance feature l_i is calculated by

$L_i = \| \mathbf{p}_i - \hat{\mathbf{p}} \|_2 = \sqrt{(x_i - \hat{x})^2 + (y_i - \hat{y})^2}$   (9)

$l_i = \frac{L_i}{L_{\max}}, \qquad L_{\max} = \max_{1 \leqslant i \leqslant N} L_i$

The orientation feature θ_i is calculated by

$\theta_i = \tan^{-1} \dfrac{y_i - \hat{y}}{x_i - \hat{x}}$   (10)
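These trajectory features (Eqs. (8)-(10)) map directly to a few lines of NumPy; the sketch below uses atan2 so that θ_i covers the full (−π, π] range.

```python
import numpy as np

def trajectory_features(points):
    """Normalized distance l_i and orientation theta_i features, Eqs. (8)-(10).

    points: (N, 2) array of hand positions (x_i, y_i) for one segmented gesture.
    Returns (l, theta) with l in [0, 1] and theta in (-pi, pi].
    """
    p = np.asarray(points, dtype=np.float64)
    center = p.mean(axis=0)                 # Eq. (8): center of gravity
    d = p - center
    L = np.hypot(d[:, 0], d[:, 1])          # Eq. (9): distance to the center
    l = L / L.max() if L.max() > 0 else L   # normalize by L_max
    theta = np.arctan2(d[:, 1], d[:, 0])    # Eq. (10), via atan2 for the full range
    return l, theta
```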
In the recognition step, we use a rule-based approach rather than a learning-based one (e.g., a Hidden Markov Model (HMM)[19] or a Finite-State Machine (FSM)[20]). Rule-based approaches are simple, fast, and require no offline training. The rules for the three predefined gestures are as follows:

Lining  The distance features decrease and then increase linearly, and the orientation features take two opposite values.

Circling  The distance features are nearly constant, and the orientation features cover a 2π range and increase or decrease monotonically.

Pointing  The distance features are close to zero.

4 Gesture-Based Interaction

After a gesture is recognized, the interaction region needs to be mapped from the camera to the original contents. Camera-projector calibration accomplishes this mapping.

4.1 Camera projector calibration

The planar correspondence between projection screen pixels in the camera image and in the original projected contents is determined by a 3×3 homography matrix H:

$[s x_\mathrm{p} \ \ s y_\mathrm{p} \ \ s]^\mathrm{T} = \mathbf{H} [x_\mathrm{c} \ \ y_\mathrm{c} \ \ 1]^\mathrm{T}$   (11)

where (x_c, y_c) is a point of the projection screen observed in the image and (x_p, y_p) is the corresponding point in the original contents. At least four such correspondence pairs are required to solve for H. Directly finding such pairs with the thermal camera is impossible because its imaging lacks texture, so we calibrate in two steps with the aid of the Web camera:

$\mathbf{H} = \mathbf{H}_{\mathrm{2cam}} \times \mathbf{H}_{\mathrm{2cont}}$

where H_2cam is the homography from the thermal camera to the Web camera, and H_2cont is the homography from the Web camera to the projected contents.

H_2cam is calculated in the first step. The four vertices of the projection screen in both cameras are detected by finding lines with the Hough transform[21] and intersecting them to obtain the screen quadrilateral; the vertices in the two cameras are then used to calculate H_2cam. The results are illustrated in Fig. 9.

Fig. 9 Vertices detection with the Hough transform: (a) contrast-adjusted thermal image for viewing the screen better; (b) corner detection in thermal camera images; and (c) corner detection in Web camera images

To obtain the homography from the Web camera to the projected contents, we employ SIFT keypoints[12] as candidate correspondence points. SIFT points are scale-space extrema detected in a set of difference-of-Gaussian images; each keypoint is assigned a location, scale, orientation, and descriptor, making it invariant to scale and rotation and robust to illumination and viewpoint changes. Given the detected SIFT points in the two images, the nearest neighbor matching algorithm from Ref. [12] is applied. The resulting point pairs are only putative, so we use RANdom SAmple Consensus (RANSAC)[17] to impose the homography constraint on them and retain the correct correspondences, which are finally used to calculate H_2cont. The whole process is illustrated in Fig. 10. The overall homography is then H = H_2cam × H_2cont. Although the image quality of the Web camera was not very good and no special pattern was projected for calibration, the experiment demonstrated the effectiveness of our method: the maximum pixel error was less than 8, which is sufficiently precise for the interaction.
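The second calibration step is standard SIFT-plus-RANSAC homography estimation and can be sketched with OpenCV as follows. The ratio-test value 0.75 and the 3-pixel RANSAC threshold are assumed parameters; note that with column-vector coordinates, applying H_2cam first and H_2cont second corresponds to the matrix product H_2cont · H_2cam.

```python
import cv2
import numpy as np

def calibrate_to_contents(web_frame, slide_img, H_2cam):
    """Estimate the overall homography from the thermal camera to the slide.

    web_frame: grayscale image from the Web camera; slide_img: the slide
    image; H_2cam: 3x3 thermal-to-Web homography (assumed given, from the
    screen vertices found by the Hough transform).
    """
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(web_frame, None)
    kp2, des2 = sift.detectAndCompute(slide_img, None)
    # Nearest-neighbor matching with Lowe's ratio test (0.75 assumed).
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
            good.append(pair[0])
    if len(good) < 4:
        raise RuntimeError("not enough putative matches for a homography")
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    # RANSAC rejects putative pairs that violate the homography constraint.
    H_2cont, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    if H_2cont is None:
        raise RuntimeError("homography estimation failed")
    # Thermal -> Web (H_2cam), then Web -> slide (H_2cont); on column
    # vectors the second map multiplies on the left.
    return H_2cont @ H_2cam
```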
4.2 Interaction

Upon identifying the interaction region of the recognized gesture through the calibration, the corresponding semantic interaction is carried out. In a PPT presentation scenario, lining upward or downward triggers the forward or backward slide transition, respectively; lining leftward or rightward makes annotations on the slide; circling zooms the interaction region of the slide in or out; and pointing activates buttons in the slide. Once a gesture is detected and recognized, the corresponding action is performed in the PPT (such as annotation) and projected to the screen. Figures 11a and 11b illustrate a circling gesture and a lining gesture, respectively.

Fig. 10 Illustration of finding the homography from the Web camera to the projected contents: (a) detected SIFT points in the visible camera and the original contents; (b) the putative matched point pairs; and (c) the final matched pairs obtained by imposing homography constraints with RANSAC

Fig. 11 (a) Circling gesture for zooming into the graph in the presentation; and (b) lining gesture for annotating a PPT slide. The red annotation line is drawn on the original PPT slide and projected to the screen after the lining gesture is recognized.

5 Performance Evaluation

We quantitatively evaluate the proposed system in terms of head and hand localization accuracy, gesture recognition rate, and computation time. The system runs on a Windows PC with an Intel dual-core 3 GHz CPU at 20 frames per second on a single CPU core, without any specific programming optimization. The experiments were conducted in online real-time scenarios with complex slide contents (e.g., colored figures and backgrounds).

Head and hand localization accuracy is key to the system performance and is defined as

Localization accuracy = (number of correctly localized heads (or hands)) / (total number of heads (or hands))

For the head, correct localization means that the overlap between the localized head ellipse D_l and the ground truth D_gt exceeds 50%:

$\mathrm{OverlapRatio} = \dfrac{\mathrm{area}(D_\mathrm{l} \cap D_\mathrm{gt})}{\mathrm{area}(D_\mathrm{l} \cup D_\mathrm{gt})} > 0.5$

For the hand point, correct localization means that the point lies inside the actual hand region. The evaluation results are shown in Table 1, and Fig. 12 shows some hand point localization results.

Table 1 Localization accuracy of the head and hand point

             Number of total    Number correctly localized    Localization accuracy (%)
Head         6895               6839                          99.2
Hand point   5092               5044                          99.1

Fig. 12 Hand point localization results

Table 2 shows the recognition rates of the three types of gestures. The system achieves high recognition performance; the main recognition errors are caused by wrong gesture segmentation.

Table 2 Gesture recognition results

          Number of gestures    Number recognized    Recognition rate (%)
Lining    84                    81                   96.4
Circling  52                    50                   96.2
Pointing  43                    42                   97.7
Overall recognition rate: 96.7%
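The overlap criterion is the usual intersection-over-union test; for rasterized regions it can be evaluated directly on binary masks, as in this small sketch.

```python
import numpy as np

def is_correct_localization(mask_localized, mask_ground_truth, threshold=0.5):
    """Overlap-ratio (intersection over union) test on binary head masks."""
    inter = np.logical_and(mask_localized, mask_ground_truth).sum()
    union = np.logical_or(mask_localized, mask_ground_truth).sum()
    return union > 0 and inter / union > threshold
```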
6 Conclusions

In this study, we designed a hand gesture based presentation system by integrating a thermal camera with a Web camera; the system provides a natural and convenient interaction interface for presentations. Quantitative experimental results show that the system performs efficiently. Currently, gesture interaction is limited to one hand; we plan to extend the work to handle two hands to enrich the system's interaction capability. The Web camera is currently used only for calibration, but additional information, such as the shape of the hand (e.g., palm or fist), could be extracted from it for interaction. Other HCI applications, such as gaming, will also be explored in future studies.

Acknowledgments

The authors would like to thank Nokia China Research Center for their support of this research.

References

[1] Erol A, Bebis G, Nicolescu M, et al. Vision-based hand pose estimation: A review. Computer Vision and Image Understanding, 2007, 108(1-2): 52-73.
[2] Wang F, Ngo C W, Pong T C. Simulating a smartboard by real-time gesture detection in lecture videos. IEEE Transactions on Multimedia, 2008, 10(5): 926-935.
[3] Francke H, Ruiz-Del-Solar J, Verschae R. Real-time hand gesture detection and recognition using boosted classifiers and active learning. In: Proceedings of the 2nd Pacific Rim Conference on Advances in Image and Video Technology. Santiago, Chile, 2007: 533-547.
[4] Fang Y K, Wang K Q, Cheng J, et al. A real-time hand gesture recognition method. In: Proceedings of the 2007 IEEE International Conference on Multimedia and Expo. Beijing, China, 2007: 995-998.
[5] Hu Z, Wang G, Lin X, et al. Recovery of upper body poses in static images based on joints detection. Pattern Recognition Letters, 2009, 30(5): 503-512.
[6] Iwasawa S, Ebihara K, Ohya J, et al. Real-time estimation of human body posture from monocular thermal images. In: Proceedings of the 1997 IEEE Conference on Computer Vision and Pattern Recognition. San Juan, Puerto Rico, 1997: 15-20.
[7] Juang C F, Chang C M, Wu J R, et al. Computer vision-based human body segmentation and posture estimation. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 2009, 39(1): 119-133.
[8] Zou W, Li Y, Yuan K, et al. Real-time elliptical head contour detection under arbitrary pose and wide distance range. Journal of Visual Communication and Image Representation, 2009, 20(3): 217-228.
[9] Sukthankar R, Stockton R G, Mullin M D. Smarter presentations: Exploiting homography in camera-projector systems. In: Proceedings of the Eighth IEEE International Conference on Computer Vision. Kauai, HI, USA, 2001: 247-253.
[10] Okatani T, Deguchi K. Autocalibration of a projector-camera system. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(12): 1845-1855.
[11] Wang F, Ngo C W, Pong T C. Lecture video enhancement and editing by integrating posture, gesture, and text. IEEE Transactions on Multimedia, 2007, 9(2): 397-409.
[12] Lowe D G. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 2004, 60(2): 91-110.
[13] Stauffer C, Grimson W E L. Adaptive background mixture models for real-time tracking. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Fort Collins, CO, USA, 1999: 246-252.
[14] Chang F, Chen C J, Lu C J. A linear-time component-labeling algorithm using contour tracing technique. Computer Vision and Image Understanding, 2004, 93(2): 206-220.
[15] Gander W, Golub G H, Strebel R. Least-squares fitting of circles and ellipses. BIT, 1994, 34(4): 558-578.
[16] Barrow H G, Tenenbaum J M, Bolles R C, et al. Parametric correspondence and chamfer matching: Two new techniques for image matching. In: Proceedings of the 5th International Joint Conference on Artificial Intelligence. San Francisco, CA, USA, 1977: 659-663.
[17] Lam L, Lee S W, Suen C Y. Thinning methodologies - A comprehensive survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1992, 14(9): 869-885.
[18] Davis J, Shah M. Visual gesture recognition. IEE Proceedings - Vision, Image and Signal Processing, 1994, 141(2): 101-106.
[19] Lee H K, Kim J H. An HMM-based threshold model approach for gesture recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1999, 21(10): 961-973.
[20] Bobick A F, Wilson A D. A state-based approach to the representation and recognition of gesture. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, 19(12): 1325-1337.
[21] Duda R O, Hart P E. Use of the Hough transformation to detect lines and curves in pictures. Communications of the ACM, 1972, 15(1): 11-15.