Copyright © 2010 by the Association for Computing Machinery, Inc.
Permission to make digital or hard copies of part or all of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for commercial advantage and that copies bear this notice and the full citation on the
first page. Copyrights for components of this work owned by others than ACM must be
honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on
servers, or to redistribute to lists, requires prior specific permission and/or a fee.
Request permissions from Permissions Dept, ACM Inc., fax +1 (212) 869-0481 or e-mail
permissions@acm.org.
ETRA 2010, Austin, TX, March 22 – 24, 2010.
© 2010 ACM 978-1-60558-994-7/10/0003 $10.00
Inferring Object Relevance from Gaze in Dynamic Scenes
Melih Kandemir∗
Helsinki University of Technology
Department of Information
and Computer Science
Veli-Matti Saarinen†
Helsinki University of Technology
Low Temperature Laboratory
Samuel Kaski‡
Helsinki University of Technology
Department of Information
and Computer Science
Abstract
As prototypes of data glasses having both data augmentation and
gaze tracking capabilities are becoming available, it is now possi-
ble to develop proactive gaze-controlled user interfaces to display
information about objects, people, and other entities in real-world
setups. In order to decide which objects the augmented information
should be about, and how saliently to augment, the system needs
an estimate of the importance or relevance of the objects of the
scene for the user at a given time. The estimates will be used to
minimize distraction of the user and to provide efficient spa-
tial management of the augmented items. This work is a feasibility
study on inferring the relevance of objects in dynamic scenes from
gaze. We collected gaze data from subjects watching a video for
a pre-defined task. The results show that a simple ordinal logistic
regression model gives relevance rankings of scene objects with
promising accuracy.
CR Categories: H.5.r [Information Interfaces and Presentation
(HCI)]: User interfaces—User interface management systems
Keywords: augmented reality, gaze tracking, information re-
trieval, intelligent user interfaces, machine learning, ordinal logistic
regression
1 Introduction
In this paper, we develop a method needed for doing information
retrieval in dynamic real-world scenes where the queries are for-
mulated implicitly by gaze. In our setup the user wears a ubiqui-
tous information access device, “data glasses” having eye-tracking
and information augmentation capabilities. The device is assumed
to be capable of recognising and tracking certain types of objects
from the first-person video data of the user. Figure 1 illustrates the
idea. Some objects, three faces and the whiteboard in this image,
are augmented with attached boxes that include textual information
obtained from other sources. In such a setup, each visible object
in a scene can be considered as a channel through which additional
relevant information can be obtained as augmented on the screen.
As in traditional information retrieval setups such as text search en-
gines, potential abundance of available information brings up the
need for a mechanism to rank the channels with respect to their rel-
evance. This is particularly important in proactive mobile setups
where the augmented items are also potential distractors.
∗e-mail: melihk@cis.hut.fi
†e-mail: veli-m@neuro.hut.fi
‡e-mail: samuel.kaski@tkk.fi
Figure 1: A screenshot from the eyesight of hypothetical data
glasses with augmented reality capability during a short presen-
tation in a meeting room (Scene 1).
Our goal is to infer the degree of interest of the user for the objects
in the scene. This problem has a connection to modelling of vi-
sual attention [Henderson 2003; Itti et al. 1998; Zhang et al. 2008];
whereas visual attention models typically try to predict the gaze pat-
tern given the scene, our target is the inverse problem of inferring
the user’s state (interests) given the scene and the gaze trajectory.
A good solution for the former problem would obviously help in
our task too, but current visual attention models mainly consider
only physical pointwise saliency, which does not yet capture the
largely top-down effect of the user's interests on the gaze pat-
tern. Although there exist some initial attempts at two-way
saliency modeling [Torralba et al. 2006], these are evaluated only
for rather trivial visual tasks such as counting a certain type of ob-
jects in static images. Unlike top-down models where the model
is optimised given a well-defined search task, the cognitive task of
the subject in our setup is hidden and may even be unclear to the
subject herself. Hence, we start by data-driven statistical machine
learning techniques for the inverse modeling task.
Gaze data has been used in user interfaces in three ways. Our goal is
the furthest from the most frequent approach, off-line analysis, for
instance studying effectiveness of advertisements in attracting peo-
ple’s attention, or analysis of social interaction. In the second ap-
proach the user selects actions by explicitly looking at the choices,
for instance eye typing [Hyrskykari et al. 2000; Ward and MacKay
2002]. Although such explicit selection mechanisms are easy to im-
plement, they require full user attention and are strenuous because
of the Midas touch effect: each glance activates an action whether
it is intended or not. The third way of using gaze data in user in-
terfaces is implicit feedback. The user uses her gaze normally, and
information needed by the interface is inferred from the gaze data.
An emerging example is proactive information retrieval where sta-
tistical machine learning methods are used for inferring relevance
from gaze patterns. The inferred relevance judgements are then
used as implicit relevance feedback for information retrieval. This
has been done for text retrieval by generating implicit queries from
gaze patterns [Hardoon et al. 2007]. The same principle has been
used for image retrieval as well [Klami et al. 2008], recently also
coupled dynamically to a retrieval engine in an interactive zooming
interface [Kozma et al. 2009]. Gaze has additionally been used as
a means of proactive interaction, but not information retrieval, in a
desktop application by assigning a relevance function to the entities
on a synthetic 2D map [Qvarfordt and Zhai 2005].
To test the feasibility of the idea of relevance ranking from gaze in
dynamical real-world setups, we prepared a stimulus video and col-
lected gaze data from subjects watching that video. True relevance
rankings were then elicited from the subjects for several frames. We
trained an ordinal logistic regression model and measured its accu-
racy in the relevance prediction task on the left-out data.
2 Measurement Setup
We shot a video from the first-person view of a subject visiting three
indoor scenes. Then we postprocessed this video by augmenting
some of the objects with additional textual information in an at-
tached box. This video was shown to 4 subjects and gaze data was
collected. Right after the viewing session the subjects ranked the
scene objects in relevance order for a subset of the video frames.
The ranking was considered as the ground truth for learning the
models and evaluating them. The modelling task is to predict the
user-given ranking for an object given the gaze-tracking data from
a window immediately preceding the ranked frame.
3 Model for Inferring Relevance
Let us index the stimulus slices preceding each relevance judgement
from 1 to $N$. We extract a feature vector (details in the Experiments
section) for each scene object $i$ at time slice $t$ to obtain a single un-
labelled data point $\mathbf{f}_i^{(t)} = \{f_{i1}^{(t)}, f_{i2}^{(t)}, \dots, f_{id}^{(t)}\}$, where $d$ is the
number of features. If we also attach the ground-truth relevance
ranking $r_i^{(t)}$, we get a labelled data point $(\mathbf{f}_i^{(t)}, r_i^{(t)})$. Let us de-
note the set of data points, one for each object, related to time slice
$t$ as a data subset $\Lambda^{(t)} = \{(\mathbf{f}_1^{(t)}, r_1^{(t)}), \dots, (\mathbf{f}_{m_t}^{(t)}, r_{m_t}^{(t)})\}$,
where $m_t$ is the number of visible objects at time slice $t$. Let
us denote the data subset without labels by $\bar{\Lambda}^{(t)}$, and the maxi-
mum number of visible objects by $L = \max(\{m_1, \dots, m_N\})$.
For notational convenience, we define the most relevant object to
have rank $L$, and the rank decreases as relevance decreases. The
whole labelled data set consists of the union of all data subsets,
$\Delta = \{\Lambda^{(1)}, \Lambda^{(2)}, \dots, \Lambda^{(N)}\}$.
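The data layout above can be made concrete with a small sketch. This is illustrative only; the variable names and the choice of NumPy arrays are assumptions, not taken from the original implementation.

```python
import numpy as np

# One labelled data subset Lambda^(t): feature vectors and ground-truth ranks
# of all m_t objects visible at time slice t.
# features_t has shape (m_t, d); ranks_t has shape (m_t,), with rank L = most relevant.
features_t = np.array([[0.12, 3.4, 0.0],      # object 1
                       [0.80, 0.1, 2.5]])     # object 2
ranks_t = np.array([2, 1])

# The whole labelled data set Delta is the collection of such subsets, t = 1..N.
dataset = [(features_t, ranks_t)]
```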
We search for a mapping from the feature space to the space of
relevances, which is conventionally [0, 1]. Such a mapping can di-
rectly be achieved using ordinal logistic regression [McCullagh and
Nelder 1989] if we assume that the relevance of an object depends
only on its features, and it is independent of the relevance of the
other visible objects. We use the standard approach as described
briefly below.
Let us denote the probability that object $i$ has rank $k$ at time $t$ by
$P(r_i^{(t)} = k \mid \mathbf{f}_i^{(t)}) = \phi_k(\mathbf{f}_i^{(t)})$. Then we can define the log odds such that the
problem reduces to a batch of $L-1$ binary regression problems,
one for each $k = 1, 2, \dots, L-1$:
\[
M_k = \log \frac{P(r_i^{(t)} \le k \mid \mathbf{f}_i^{(t)})}{1 - P(r_i^{(t)} \le k \mid \mathbf{f}_i^{(t)})}
    = \log \frac{\phi_0(\mathbf{f}_i^{(t)}) + \phi_1(\mathbf{f}_i^{(t)}) + \cdots + \phi_k(\mathbf{f}_i^{(t)})}
                {\phi_{k+1}(\mathbf{f}_i^{(t)}) + \phi_{k+2}(\mathbf{f}_i^{(t)}) + \cdots + \phi_L(\mathbf{f}_i^{(t)})}
    = w_0^{(k)} + \mathbf{w}\,\mathbf{f}_i^{(t)},
\]
where a linear model is assumed. By taking the exponent of both
sides we get the CDF of the rank distribution for object $i$ at time $t$:
\[
P(r_i^{(t)} \le k \mid \mathbf{f}_i^{(t)}) = \frac{\exp(w_0^{(k)} + \mathbf{w}\,\mathbf{f}_i^{(t)})}{1 + \exp(w_0^{(k)} + \mathbf{w}\,\mathbf{f}_i^{(t)})}.
\]
Notice that we adopted the standard approach and used common
slope coefficients $\mathbf{w} = [w_1, \dots, w_d]$ for all logit models but differ-
ent intercepts $w_0^{(k)}$. In the training phase, we calculate the maxi-
mum likelihood estimates of the parameters $\theta = \{w_0^{(1)}, \dots, w_0^{(L-1)}, w_1, \dots, w_d\}$
of this model using the Newton-Raphson tech-
nique. Given an unlabelled data subset $\bar{\Lambda}^{(t)}$ at time $t$, the object
with relevance rank $k$ is predicted to be the one that has the highest
probability for that rank, $\arg\max_i \phi_k(\mathbf{f}_i^{(t)})$.
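A minimal sketch of how a fitted proportional-odds model of this form can be used for prediction, assuming the intercepts w0 and the common slope vector w have already been estimated (e.g., by maximum likelihood with Newton-Raphson, as in the paper, or with an off-the-shelf ordinal regression package). For simplicity the sketch assumes ranks 1..L; all names and the example numbers are illustrative.

```python
import numpy as np

def rank_probabilities(f, w0, w):
    """Return phi_k(f) for k = 1..L under the proportional-odds model
    P(r <= k | f) = sigmoid(w0[k] + w . f), with common slopes w and
    per-threshold intercepts w0 (length L-1)."""
    z = w0 + f @ w                               # shape (L-1,)
    cdf = 1.0 / (1.0 + np.exp(-z))               # P(r <= k), k = 1..L-1
    cdf = np.concatenate([[0.0], cdf, [1.0]])    # pad with P(r <= 0)=0, P(r <= L)=1
    return np.diff(cdf)                          # phi_k = P(r <= k) - P(r <= k-1)

def predict_rank_assignment(features, w0, w):
    """For each rank k (from the most relevant, k = L, downward), pick the
    visible object with the highest probability of that rank: argmax_i phi_k(f_i)."""
    probs = np.array([rank_probabilities(f, w0, w) for f in features])  # (m, L)
    L = probs.shape[1]
    return {k: int(np.argmax(probs[:, k - 1])) for k in range(L, 0, -1)}

# Illustrative usage with made-up numbers (d = 3 features, L = 4 ranks):
rng = np.random.default_rng(0)
w = rng.normal(size=3)                   # common slope coefficients
w0 = np.sort(rng.normal(size=3))         # L-1 = 3 intercepts, ordered for valid CDFs
features = rng.normal(size=(2, 3))       # feature vectors of 2 visible objects
print(predict_rank_assignment(features, w0, w))
```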
4 Experiments
4.1 Stimulus Preparation
We shot a video clip, 4 minutes and 17 seconds long, from the first-
person view of a subject, using a see-through head mounted display
device. In the scenario of the clip, a visitor coming to our laboratory
is informed of our research project. The scenario consists of three
consecutive scenes:
1. A short presentation in a meeting room: A researcher in-
troduces the project with a block diagram drawn on the white-
board (Figure 1) in a meeting room. People present are asking
questions. The visitor follows the presentation.
2. A walk in the lab corridor: The visitor walks through the
laboratory taking a look at posters on the wall, and zooms
into some of the name tags on office doors.
3. Demo of data collection devices: The host introduces how
eye tracking experiments are made. He demonstrates a mon-
itor with eye tracking capabilities and the head-mounted dis-
play device.
Next, we augmented the video by attaching information boxes to
objects such as faces, the whiteboard, name tags, posters, and de-
vices related to the project. These were considered to be the objects
potentially most interesting to the visitor. Short snippets of tex-
tual information relevant to the objects were displayed inside the
boxes. At most one information box was attached to any one object
at a time. We displayed boxes for all visible objects. There were
from 0 to 4 objects in the scene at a time; the average number of scene
objects was 2.017 with a standard deviation of 1.36. The frame rate of
the postprocessed video was 12 fps.
4.2 Data Collection
We collected gaze data from 4 subjects while they watched the
stimulus video with the task of getting as much information as they
could about the research project. After the viewing session, the subjects were
shown 154 screenshots from the video in temporal order, each of
which represents a 1.66-second slot (20 frames). The users were
asked to select the objects that were relevant to them at that mo-
ment, and also to rank the selected subset of objects according to
their relevance. We defined relevance as the interest in seeing aug-
mented information about an object in the scene at that particular
time. All subjects confirmed after ranking that they had been able to
remember the correct ranks for almost all the frames. The sub-
jects were graduate and postgraduate researchers not working on
the project related to the study we present in this paper.
4.3 The Eye Tracker
We collected the gaze data with a Tobii 1750 eye tracker at a 50 Hz
sampling rate. The tracker has an infra-red stereo camera mounted on a stan-
dard flat-screen monitor. The device performs tracking by detecting
the pupil centers and measuring the reflection from the cornea.
Successive gaze points located within an area of 30 pixels were
considered a single fixation. This corresponds to approximately
0.6 degrees of visual angle at a normal viewing distance from a 17-inch
monitor with 1280 × 1024 pixel resolution. Test subjects
sat 60 cm away from the monitor.
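The paper does not specify the fixation-detection algorithm beyond the 30-pixel criterion; the sketch below is one plausible dispersion-based grouping under that assumption, with illustrative function and parameter names.

```python
import numpy as np

def group_fixations(samples, dispersion_px=30.0, sample_dt=1.0 / 50.0):
    """Group successive gaze samples into fixations: a sample joins the current
    fixation while it stays within `dispersion_px` pixels of the fixation's
    running centroid; otherwise a new fixation is started.

    samples: array of shape (n, 2) with (x, y) gaze positions in pixels,
             recorded at a fixed rate (50 Hz for the Tobii 1750 -> sample_dt = 20 ms).
    Returns a list of (centroid_x, centroid_y, duration_seconds).
    """
    fixations, current = [], []
    for p in np.asarray(samples, dtype=float):
        if current:
            centroid = np.mean(current, axis=0)
            if np.linalg.norm(p - centroid) > dispersion_px:
                fixations.append((*centroid, len(current) * sample_dt))
                current = []
        current.append(p)
    if current:
        fixations.append((*np.mean(current, axis=0), len(current) * sample_dt))
    return fixations
```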
4.4 Feature Extraction
We extracted from the gaze and video data a set of features cor-
responding to each visible object. This was done at every time
slice for which the labelled object ranks were available (i.e., for
one frame in every 20 consecutive frames). Each of these features
summarises a particular aspect in the temporal context (recent past).
We define the context at time t to be a slot from time point t − W
to t − 1 where W is a predetermined window size. We used the
following 11 features:
1. mean area of the bounding box of the object
2. mean area of the information box attached to the object
3. mean distance between the centers of the object bounding box and the attached
information box
4. total duration of fixations inside the bounding box of the object
5. total duration of fixations inside the information box attached to the object
6. mean duration of fixations inside the bounding box of the object
7. mean duration of fixations inside the information box attached to the object
8. mean distance of all fixations to the center of the object bounding box
9. mean distance of all fixations to the center of the information box
10. mean length of saccades that ended up with fixations inside the bounding box
of the object
11. mean length of saccades that ended up with fixations inside the information box
attached to the object
We marked the bounding boxes of the objects manually frame by
frame. A sketch of how a subset of these features could be computed is given below.
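To make the feature list concrete, here is a rough sketch of how a few of these features (numbers 1, 4, 6 and 8) might be computed for one object over the context window. The paper does not give implementation details, so pairing every fixation with the object's most recent bounding box is a simplification, and all names are illustrative.

```python
import numpy as np

def inside(box, x, y):
    """box = (x_min, y_min, x_max, y_max) in pixels."""
    x0, y0, x1, y1 = box
    return x0 <= x <= x1 and y0 <= y <= y1

def center(box):
    x0, y0, x1, y1 = box
    return np.array([(x0 + x1) / 2.0, (y0 + y1) / 2.0])

def object_features(fixations, boxes):
    """A few of the 11 context features for one object.

    fixations: list of (x, y, duration_s) falling inside the context window.
    boxes:     the object's bounding boxes, one per frame in the window.
    Returns: [mean box area, total fixation duration inside box,
              mean fixation duration inside box, mean fixation distance to box center].
    """
    areas = [(x1 - x0) * (y1 - y0) for x0, y0, x1, y1 in boxes]
    box = boxes[-1]                          # simplification: box at the most recent frame
    inside_durs = [d for x, y, d in fixations if inside(box, x, y)]
    dists = [np.linalg.norm(np.array([x, y]) - center(box)) for x, y, d in fixations]
    return [
        float(np.mean(areas)),
        float(np.sum(inside_durs)) if inside_durs else 0.0,
        float(np.mean(inside_durs)) if inside_durs else 0.0,
        float(np.mean(dists)) if dists else 0.0,
    ]
```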
4.5 Evaluation
We evaluated the accuracy of the model as the propor-
tion of time slices in which the most relevant object was predicted correctly. We
compared the model performance with five baseline methods. The
first one is random guessing, in which at each time slice the scene
objects are ranked uniformly at random. The second one is an
attention-based method that assigns a relevance proportional to the
total fixation duration on the object and on the augmented content.
This estimate of object relevance is referred to as gaze intensity
[Qvarfordt and Zhai 2005]. This baseline is used to reveal the effect on
relevance prediction of more intricate gaze patterns, beyond the mere
visual attention measured by gaze intensity. In the third baseline model
we used the ordinal logistic regression model with only the features that
are not related to gaze (the first three features). Thus we investi-
gated the effect of gaze-based features on prediction accuracy. We
defined two more baseline models that depend on Itti et al.’s bottom-
up visual attention model [Itti et al. 1998] in order to observe how
useful such plain attention modelling is in our problem setup, and
to test if our model provides better accuracy. We computed the Itti-
Koch saliency map of the labelled frames. Then we calculated the
relevance of an object as the maximum saliency inside its bounding
box for one baseline model, and as the average saliency inside the
bounding box for the other one.
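The Itti-Koch saliency map itself is not reimplemented here; the short sketch below only shows how the two saliency baselines could read per-object relevance scores off a precomputed map. Names and the coordinate convention are assumptions.

```python
import numpy as np

def box_saliency(saliency_map, box):
    """Saliency-map baselines: maximum and average saliency inside an object's
    bounding box (box = (x_min, y_min, x_max, y_max) in pixel coordinates;
    saliency_map is a 2-D array indexed as [y, x])."""
    x0, y0, x1, y1 = (int(round(v)) for v in box)
    patch = saliency_map[y0:y1 + 1, x0:x1 + 1]
    return float(patch.max()), float(patch.mean())
```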
We trained separate models for the user-specific and user-independent
cases. In the user-specific case, we trained and tested the model on
the data of the same subject. We split the dataset into training
and validation sets by random selection without replacement. We
randomly selected 2/3 of the dataset for training and left out the
remainder for testing. We repeated this process 50 times and mea-
sured the mean prediction accuracy. We computed the accuracy for
several window sizes, from 50 frames up to 750 frames in 25-frame
steps. Our model outperformed all the
other baseline methods for all subjects and all window sizes (Figure
2). The significance of the difference was tested for each subject
separately using the Wilcoxon signed-rank test with α = 0.05. We
ran the test between our model and the three best-performing base-
lines: the logit model without gaze features and the two saliency-
based models. We selected the window sizes for our model and the
logit model without gaze features with respect to average prediction
accuracy on the training data.
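A sketch of the repeated random-split evaluation described above. The `fit_predict_top` callable is a hypothetical interface standing in for training the ordinal regression model and returning its predicted most relevant object per test slice; the rest mirrors the 2/3-train, 50-repetition protocol.

```python
import numpy as np

def mean_top_object_accuracy(slices, fit_predict_top, n_repeats=50,
                             train_frac=2 / 3, seed=0):
    """Repeated random-split evaluation: train on 2/3 of the labelled time
    slices, test on the rest, repeated n_repeats times.

    slices: list of (features, ranks) pairs, one per labelled time slice;
            features has shape (m_t, d), ranks is an array of shape (m_t,).
    fit_predict_top(train, test_features) -> predicted index of the most
            relevant object for each test slice (hypothetical interface).
    """
    rng = np.random.default_rng(seed)
    accuracies = []
    for _ in range(n_repeats):
        order = rng.permutation(len(slices))
        n_train = int(train_frac * len(slices))
        train = [slices[i] for i in order[:n_train]]
        test = [slices[i] for i in order[n_train:]]
        preds = fit_predict_top(train, [f for f, _ in test])
        hits = [int(r[p] == r.max()) for (_, r), p in zip(test, preds)]
        accuracies.append(np.mean(hits))
    return float(np.mean(accuracies))
```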
Figure 2: User-specific model accuracy for one user. Sub-images
show the accuracy (proportion of correct predictions) as a func-
tion of the context window size (in frames, x-axis). Red diamond:
our proposed model, blue circles: baseline model using only the
video features (not gaze), green reversed triangles: attention-only
model, cyan squares: random guessing, black triangles: maximum
saliency inside object, pink crosses: average saliency inside object.
In the user-independent case, we left out one user and trained the
model with the whole datasets of the other users. Then we evalu-
ated the accuracy on the data of the left out user. This procedure
was repeated for all users. The results gave the same conclusions
as in the user-specific case, although with some decrease in ac-
curacy for all the metrics, and the outperformance was not statistically
significant for some test subjects. This is probably due to the increased
uncertainty originating from the subjectivity of top-down cognitive
processes; a single common model may then be inadequate to han-
dle the variability of gaze patterns across the subjects. This issue
needs to be investigated further.
The box plot in Figure 3 (a) shows the learned regressor weights
for a subject in the user-specific case. Small variance of weights
indicates that the model is stable across different splits. Both the
magnitude and the ordering of weights in the user-independent case
were very similar to those in the user-specific case.
The best accuracy is achieved at relatively long window sizes (525
frames in the user-specific case, and 300 frames in the user-
independent case, for test subject 1). This supports the claim that
the context does contain information related to object relevances.
The decrease in accuracy as the window size further increases is
not very significant, and in particular the proposed model seems to
be insensitive to window size.
The feature with the strongest positive influence on relevance is
the mean distance between the object center and the fixations within
the context (w8). Intuitively, the relevance of an object increases
as the fixations within the context get closer to the center of that
object. The feature with the strongest negative influence is the
mean distance between the object and the box. This means that
when the information box is placed closer to the object, the object attracts
more interest. Some of the weights are harder to interpret and we will
study them further in subsequent research.
Figure 3: Variance of the regressor weights for each of the features
among different bootstrap trials in the user-specific model. The
features are numbered as in Section 4.4.
5 Discussion
In this work, we assessed the feasibility of a gaze-based object
relevance predictor in real-world scenes where the scene objects
were augmented with additional information. For this, we applied
a rather simple ordinal logistic regression model over a set of gaze
pattern and visual content features. The prominent increase in ac-
curacy when the gaze pattern features are added to the feature set
reveals that gaze statistics and visual features make a mutually com-
plementary contribution to relevance inference. The optimal way of
combining these two sources of information should be further stud-
ied. The outperformance of our model over the bottom-up attention
model in predicting the most relevant object can be attributed to the fact that
bottom-up models are incapable of reflecting the task-dependent
control of attention.
Better performance can probably be achieved by enriching the
feature set and using a more complex model that better fits the
data. Generalisation of the model to other real-world scenes also
needs to be investigated further. This can be done by plugging the
model into a wearable information access device and assessing its
performance during online use. Such an assessment of our model is
currently in progress.
6 Acknowledgements
Melih Kandemir and Samuel Kaski belong to the Finnish Center
of Excellence in Adaptive Informatics and Helsinki Institute for In-
formation Technology (HIIT). Samuel Kaski also belongs to PAS-
CAL2 EU network of excellence. This study is funded by TKK
MIDE project UI-ART.
References
HARDOON, D., SHAWE-TAYLOR, J., AJANKI, A., PUOLAMÄKI,
K., AND KASKI, S. 2007. Information retrieval by inferring im-
plicit queries from eye movements. In International Conference
on Artificial Intelligence and Statistics (AISTATS ’07).
HENDERSON, J. M. 2003. Human gaze control during real-world
scene perception. Trends in Cognitive Sciences 7, 11, 498 – 504.
HYRSKYKARI, A., MAJARANTA, P., AALTONEN, A., AND
RÄIHÄ, K.-J. 2000. Design issues of iDict: A gaze-assisted
translation aid. In Proceedings of ETRA 2000, Eye Tracking Re-
search and Applications Symposium, ACM Press,
9–14.
ITTI, L., KOCH, C., AND NIEBUR, E. 1998. A model of saliency-
based visual attention for rapid scene analysis. IEEE Trans-
actions on Pattern Analysis and Machine Intelligence 20, 11,
1254–1259.
KANDEMIR, M., SAARINEN, V.-M., AND KASKI, S. 2010. In-
ferring object relevance from gaze in dynamic scenes. In To Ap-
pear in Short Paper Proceedings of ETRA 2010, Eye Tracking
Research and Applications Symposium.
KLAMI, A., SAUNDERS, C., DE CAMPOS, T. E., AND KASKI, S.
2008. Can relevance of images be inferred from eye movements?
In MIR ’08: Proceeding of the 1st ACM international confer-
ence on Multimedia information retrieval, ACM, New York, NY,
USA, 134–140.
KOZMA, L., KLAMI, A., AND KASKI, S. 2009. GaZIR: Gaze-
based zooming interface for image retrieval. In Proc. ICMI-
MLMI 2009, The Eleventh International Conference on Multi-
modal Interfaces and The Sixth Workshop on Machine Learning
for Multimodal Interaction, ACM, New York, NY, USA, 305–
312.
MCCULLAGH, P., AND NELDER, J. 1989. Generalized Linear
Models. Chapman & Hall/CRC.
QVARFORDT, P., AND ZHAI, S. 2005. Conversing with the user
based on eye-gaze patterns. In CHI ’05: Proceedings of the
SIGCHI conference on Human factors in computing systems,
ACM, New York, NY, USA, 221–230.
TORRALBA, A., OLIVA, A., CASTELHANO, M. S., AND HEN-
DERSON, J. M. 2006. Contextual guidance of eye movements
and attention in real-world scenes: the role of global features in
object search. Psychological Review 113, 4, 766–786.
WARD, D. J., AND MACKAY, D. J. C. 2002. Fast hands-free
writing by gaze direction. Nature 418, 6900, 838.
ZHANG, L., TONG, M. H., MARKS, T. K., SHAN, H., AND COT-
TRELL, G. W. 2008. SUN: A Bayesian framework for saliency
using natural statistics. Journal of Vision 8, 7 (12), 1–20.

Weitere ähnliche Inhalte

Was ist angesagt?

Automatic Determination Number of Cluster for NMKFC-Means Algorithms on Image...
Automatic Determination Number of Cluster for NMKFC-Means Algorithms on Image...Automatic Determination Number of Cluster for NMKFC-Means Algorithms on Image...
Automatic Determination Number of Cluster for NMKFC-Means Algorithms on Image...IOSR Journals
 
A comprehensive survey of contemporary
A comprehensive survey of contemporaryA comprehensive survey of contemporary
A comprehensive survey of contemporaryprjpublications
 
Finding Relationships between the Our-NIR Cluster Results
Finding Relationships between the Our-NIR Cluster ResultsFinding Relationships between the Our-NIR Cluster Results
Finding Relationships between the Our-NIR Cluster ResultsCSCJournals
 
Geometric Correction for Braille Document Images
Geometric Correction for Braille Document Images  Geometric Correction for Braille Document Images
Geometric Correction for Braille Document Images csandit
 
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...IJERD Editor
 
Comparative analysis and implementation of structured edge active contour
Comparative analysis and implementation of structured edge active contour Comparative analysis and implementation of structured edge active contour
Comparative analysis and implementation of structured edge active contour IJECEIAES
 
Review and comparison of tasks scheduling in cloud computing
Review and comparison of tasks scheduling in cloud computingReview and comparison of tasks scheduling in cloud computing
Review and comparison of tasks scheduling in cloud computingijfcstjournal
 
4 image segmentation through clustering
4 image segmentation through clustering4 image segmentation through clustering
4 image segmentation through clusteringIAEME Publication
 
Graph Theory Based Approach For Image Segmentation Using Wavelet Transform
Graph Theory Based Approach For Image Segmentation Using Wavelet TransformGraph Theory Based Approach For Image Segmentation Using Wavelet Transform
Graph Theory Based Approach For Image Segmentation Using Wavelet TransformCSCJournals
 
MINIMIZING DISTORTION IN STEGANOG-RAPHY BASED ON IMAGE FEATURE
MINIMIZING DISTORTION IN STEGANOG-RAPHY BASED ON IMAGE FEATUREMINIMIZING DISTORTION IN STEGANOG-RAPHY BASED ON IMAGE FEATURE
MINIMIZING DISTORTION IN STEGANOG-RAPHY BASED ON IMAGE FEATUREijcsit
 
A Study on Youth Violence and Aggression using DEMATEL with FCM Methods
A Study on Youth Violence and Aggression using DEMATEL with FCM MethodsA Study on Youth Violence and Aggression using DEMATEL with FCM Methods
A Study on Youth Violence and Aggression using DEMATEL with FCM Methodsijdmtaiir
 
FUZZY SET THEORETIC APPROACH TO IMAGE THRESHOLDING
FUZZY SET THEORETIC APPROACH TO IMAGE THRESHOLDINGFUZZY SET THEORETIC APPROACH TO IMAGE THRESHOLDING
FUZZY SET THEORETIC APPROACH TO IMAGE THRESHOLDINGIJCSEA Journal
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentIJERD Editor
 
Analytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion miningAnalytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion miningcsandit
 

Was ist angesagt? (14)

Automatic Determination Number of Cluster for NMKFC-Means Algorithms on Image...
Automatic Determination Number of Cluster for NMKFC-Means Algorithms on Image...Automatic Determination Number of Cluster for NMKFC-Means Algorithms on Image...
Automatic Determination Number of Cluster for NMKFC-Means Algorithms on Image...
 
A comprehensive survey of contemporary
A comprehensive survey of contemporaryA comprehensive survey of contemporary
A comprehensive survey of contemporary
 
Finding Relationships between the Our-NIR Cluster Results
Finding Relationships between the Our-NIR Cluster ResultsFinding Relationships between the Our-NIR Cluster Results
Finding Relationships between the Our-NIR Cluster Results
 
Geometric Correction for Braille Document Images
Geometric Correction for Braille Document Images  Geometric Correction for Braille Document Images
Geometric Correction for Braille Document Images
 
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
 
Comparative analysis and implementation of structured edge active contour
Comparative analysis and implementation of structured edge active contour Comparative analysis and implementation of structured edge active contour
Comparative analysis and implementation of structured edge active contour
 
Review and comparison of tasks scheduling in cloud computing
Review and comparison of tasks scheduling in cloud computingReview and comparison of tasks scheduling in cloud computing
Review and comparison of tasks scheduling in cloud computing
 
4 image segmentation through clustering
4 image segmentation through clustering4 image segmentation through clustering
4 image segmentation through clustering
 
Graph Theory Based Approach For Image Segmentation Using Wavelet Transform
Graph Theory Based Approach For Image Segmentation Using Wavelet TransformGraph Theory Based Approach For Image Segmentation Using Wavelet Transform
Graph Theory Based Approach For Image Segmentation Using Wavelet Transform
 
MINIMIZING DISTORTION IN STEGANOG-RAPHY BASED ON IMAGE FEATURE
MINIMIZING DISTORTION IN STEGANOG-RAPHY BASED ON IMAGE FEATUREMINIMIZING DISTORTION IN STEGANOG-RAPHY BASED ON IMAGE FEATURE
MINIMIZING DISTORTION IN STEGANOG-RAPHY BASED ON IMAGE FEATURE
 
A Study on Youth Violence and Aggression using DEMATEL with FCM Methods
A Study on Youth Violence and Aggression using DEMATEL with FCM MethodsA Study on Youth Violence and Aggression using DEMATEL with FCM Methods
A Study on Youth Violence and Aggression using DEMATEL with FCM Methods
 
FUZZY SET THEORETIC APPROACH TO IMAGE THRESHOLDING
FUZZY SET THEORETIC APPROACH TO IMAGE THRESHOLDINGFUZZY SET THEORETIC APPROACH TO IMAGE THRESHOLDING
FUZZY SET THEORETIC APPROACH TO IMAGE THRESHOLDING
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
Analytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion miningAnalytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion mining
 

Andere mochten auch

Tdr prezentacija rezultata bi h 03.06.2104-no-products
Tdr prezentacija rezultata bi h 03.06.2104-no-productsTdr prezentacija rezultata bi h 03.06.2104-no-products
Tdr prezentacija rezultata bi h 03.06.2104-no-productsTDR d.o.o Rovinj
 
Homophones Lesson
Homophones LessonHomophones Lesson
Homophones Lessonjgd7971
 
A go vicencio-top 10 qs on chapter 2 developing marketing strategies & plans
A go vicencio-top 10 qs on chapter 2 developing marketing strategies & plansA go vicencio-top 10 qs on chapter 2 developing marketing strategies & plans
A go vicencio-top 10 qs on chapter 2 developing marketing strategies & plansAmabelleGoVicencio
 
Galerija Magicus Dnevnik Esencija Do 21 3 2010 Ciklus Cernik I Madonin Sv...
Galerija Magicus   Dnevnik Esencija Do 21 3 2010   Ciklus Cernik I Madonin Sv...Galerija Magicus   Dnevnik Esencija Do 21 3 2010   Ciklus Cernik I Madonin Sv...
Galerija Magicus Dnevnik Esencija Do 21 3 2010 Ciklus Cernik I Madonin Sv...guestbe4094
 
dotNET Foundation FGSL 2015
dotNET Foundation FGSL 2015dotNET Foundation FGSL 2015
dotNET Foundation FGSL 2015Marcelo Paiva
 
Homophones Lesson
Homophones LessonHomophones Lesson
Homophones Lessonjgd7971
 
Windows profile how do i
Windows profile how do iWindows profile how do i
Windows profile how do iproser tech
 
Doc110339 normas do_x_congreso_do_sindicato_nacional_de_ccoo
Doc110339 normas do_x_congreso_do_sindicato_nacional_de_ccooDoc110339 normas do_x_congreso_do_sindicato_nacional_de_ccoo
Doc110339 normas do_x_congreso_do_sindicato_nacional_de_ccoooscargaliza
 
Kathryn Fiedler\'s Portfolio
Kathryn Fiedler\'s PortfolioKathryn Fiedler\'s Portfolio
Kathryn Fiedler\'s PortfolioKathryn Fiedler
 
Using Student Research Center
Using Student Research CenterUsing Student Research Center
Using Student Research CenterDavid Smolen
 
Acta asamblea eroski vigo
Acta asamblea eroski vigoActa asamblea eroski vigo
Acta asamblea eroski vigooscargaliza
 

Andere mochten auch (20)

Tdr prezentacija rezultata bi h 03.06.2104-no-products
Tdr prezentacija rezultata bi h 03.06.2104-no-productsTdr prezentacija rezultata bi h 03.06.2104-no-products
Tdr prezentacija rezultata bi h 03.06.2104-no-products
 
Homophones Lesson
Homophones LessonHomophones Lesson
Homophones Lesson
 
Nossa Rotina 2
Nossa Rotina 2Nossa Rotina 2
Nossa Rotina 2
 
αστροφυσική
αστροφυσικήαστροφυσική
αστροφυσική
 
A go vicencio-top 10 qs on chapter 2 developing marketing strategies & plans
A go vicencio-top 10 qs on chapter 2 developing marketing strategies & plansA go vicencio-top 10 qs on chapter 2 developing marketing strategies & plans
A go vicencio-top 10 qs on chapter 2 developing marketing strategies & plans
 
เศรษฐกิจ
เศรษฐกิจเศรษฐกิจ
เศรษฐกิจ
 
TEMA 5A Vocabulario
TEMA 5A VocabularioTEMA 5A Vocabulario
TEMA 5A Vocabulario
 
Web api
Web apiWeb api
Web api
 
Galerija Magicus Dnevnik Esencija Do 21 3 2010 Ciklus Cernik I Madonin Sv...
Galerija Magicus   Dnevnik Esencija Do 21 3 2010   Ciklus Cernik I Madonin Sv...Galerija Magicus   Dnevnik Esencija Do 21 3 2010   Ciklus Cernik I Madonin Sv...
Galerija Magicus Dnevnik Esencija Do 21 3 2010 Ciklus Cernik I Madonin Sv...
 
ลักษณะภูมิภาคของแอฟริกา
ลักษณะภูมิภาคของแอฟริกาลักษณะภูมิภาคของแอฟริกา
ลักษณะภูมิภาคของแอฟริกา
 
การเปลี่ยนแปลงเศรษฐกิจไทย
การเปลี่ยนแปลงเศรษฐกิจไทยการเปลี่ยนแปลงเศรษฐกิจไทย
การเปลี่ยนแปลงเศรษฐกิจไทย
 
dotNET Foundation FGSL 2015
dotNET Foundation FGSL 2015dotNET Foundation FGSL 2015
dotNET Foundation FGSL 2015
 
Homophones Lesson
Homophones LessonHomophones Lesson
Homophones Lesson
 
Windows profile how do i
Windows profile how do iWindows profile how do i
Windows profile how do i
 
ลักษณะภูมิประเทศ 2.4
ลักษณะภูมิประเทศ 2.4ลักษณะภูมิประเทศ 2.4
ลักษณะภูมิประเทศ 2.4
 
Doc110339 normas do_x_congreso_do_sindicato_nacional_de_ccoo
Doc110339 normas do_x_congreso_do_sindicato_nacional_de_ccooDoc110339 normas do_x_congreso_do_sindicato_nacional_de_ccoo
Doc110339 normas do_x_congreso_do_sindicato_nacional_de_ccoo
 
Kathryn Fiedler\'s Portfolio
Kathryn Fiedler\'s PortfolioKathryn Fiedler\'s Portfolio
Kathryn Fiedler\'s Portfolio
 
Using Student Research Center
Using Student Research CenterUsing Student Research Center
Using Student Research Center
 
Acta asamblea eroski vigo
Acta asamblea eroski vigoActa asamblea eroski vigo
Acta asamblea eroski vigo
 
Resume
ResumeResume
Resume
 

Ähnlich wie Kandemir Inferring Object Relevance From Gaze In Dynamic Scenes

Partial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather ConditionsPartial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather ConditionsIRJET Journal
 
proposal_pura
proposal_puraproposal_pura
proposal_puraErick Lin
 
Scene Description From Images To Sentences
Scene Description From Images To SentencesScene Description From Images To Sentences
Scene Description From Images To SentencesIRJET Journal
 
Enhancing the Design pattern Framework of Robots Object Selection Mechanism -...
Enhancing the Design pattern Framework of Robots Object Selection Mechanism -...Enhancing the Design pattern Framework of Robots Object Selection Mechanism -...
Enhancing the Design pattern Framework of Robots Object Selection Mechanism -...INFOGAIN PUBLICATION
 
Development of Human Tracking System For Video Surveillance
Development of Human Tracking System For Video SurveillanceDevelopment of Human Tracking System For Video Surveillance
Development of Human Tracking System For Video Surveillancecscpconf
 
IRJET - A Survey Paper on Efficient Object Detection and Matching using F...
IRJET -  	  A Survey Paper on Efficient Object Detection and Matching using F...IRJET -  	  A Survey Paper on Efficient Object Detection and Matching using F...
IRJET - A Survey Paper on Efficient Object Detection and Matching using F...IRJET Journal
 
Implementation and performance evaluation of
Implementation and performance evaluation ofImplementation and performance evaluation of
Implementation and performance evaluation ofijcsa
 
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATIONMULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATIONijaia
 
Algorithmic Analysis to Video Object Tracking and Background Segmentation and...
Algorithmic Analysis to Video Object Tracking and Background Segmentation and...Algorithmic Analysis to Video Object Tracking and Background Segmentation and...
Algorithmic Analysis to Video Object Tracking and Background Segmentation and...Editor IJCATR
 
Long-Term Robust Tracking Whith on Failure Recovery
Long-Term Robust Tracking Whith on Failure RecoveryLong-Term Robust Tracking Whith on Failure Recovery
Long-Term Robust Tracking Whith on Failure RecoveryTELKOMNIKA JOURNAL
 
Automated traffic sign board
Automated traffic sign boardAutomated traffic sign board
Automated traffic sign boardijcsa
 
IRJET- Comparative Study of Different Techniques for Text as Well as Object D...
IRJET- Comparative Study of Different Techniques for Text as Well as Object D...IRJET- Comparative Study of Different Techniques for Text as Well as Object D...
IRJET- Comparative Study of Different Techniques for Text as Well as Object D...IRJET Journal
 
Visual Saliency Model Using Sift and Comparison of Learning Approaches
Visual Saliency Model Using Sift and Comparison of Learning ApproachesVisual Saliency Model Using Sift and Comparison of Learning Approaches
Visual Saliency Model Using Sift and Comparison of Learning Approachescsandit
 
MULTIPLE HUMAN TRACKING USING RETINANET FEATURES, SIAMESE NEURAL NETWORK, AND...
MULTIPLE HUMAN TRACKING USING RETINANET FEATURES, SIAMESE NEURAL NETWORK, AND...MULTIPLE HUMAN TRACKING USING RETINANET FEATURES, SIAMESE NEURAL NETWORK, AND...
MULTIPLE HUMAN TRACKING USING RETINANET FEATURES, SIAMESE NEURAL NETWORK, AND...IAEME Publication
 
A Novel Approach for Moving Object Detection from Dynamic Background
A Novel Approach for Moving Object Detection from Dynamic BackgroundA Novel Approach for Moving Object Detection from Dynamic Background
A Novel Approach for Moving Object Detection from Dynamic BackgroundIJERA Editor
 
GROUPING OBJECTS BASED ON THEIR APPEARANCE
GROUPING OBJECTS BASED ON THEIR APPEARANCEGROUPING OBJECTS BASED ON THEIR APPEARANCE
GROUPING OBJECTS BASED ON THEIR APPEARANCEijaia
 
Schematic model for analyzing mobility and detection of multiple
Schematic model for analyzing mobility and detection of multipleSchematic model for analyzing mobility and detection of multiple
Schematic model for analyzing mobility and detection of multipleIAEME Publication
 

Ähnlich wie Kandemir Inferring Object Relevance From Gaze In Dynamic Scenes (20)

Partial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather ConditionsPartial Object Detection in Inclined Weather Conditions
Partial Object Detection in Inclined Weather Conditions
 
FEATURE EXTRACTION USING SURF ALGORITHM FOR OBJECT RECOGNITION
FEATURE EXTRACTION USING SURF ALGORITHM FOR OBJECT RECOGNITIONFEATURE EXTRACTION USING SURF ALGORITHM FOR OBJECT RECOGNITION
FEATURE EXTRACTION USING SURF ALGORITHM FOR OBJECT RECOGNITION
 
proposal_pura
proposal_puraproposal_pura
proposal_pura
 
Scene Description From Images To Sentences
Scene Description From Images To SentencesScene Description From Images To Sentences
Scene Description From Images To Sentences
 
Enhancing the Design pattern Framework of Robots Object Selection Mechanism -...
Enhancing the Design pattern Framework of Robots Object Selection Mechanism -...Enhancing the Design pattern Framework of Robots Object Selection Mechanism -...
Enhancing the Design pattern Framework of Robots Object Selection Mechanism -...
 
Development of Human Tracking System For Video Surveillance
Development of Human Tracking System For Video SurveillanceDevelopment of Human Tracking System For Video Surveillance
Development of Human Tracking System For Video Surveillance
 
IRJET - A Survey Paper on Efficient Object Detection and Matching using F...
IRJET -  	  A Survey Paper on Efficient Object Detection and Matching using F...IRJET -  	  A Survey Paper on Efficient Object Detection and Matching using F...
IRJET - A Survey Paper on Efficient Object Detection and Matching using F...
 
Implementation and performance evaluation of
Implementation and performance evaluation ofImplementation and performance evaluation of
Implementation and performance evaluation of
 
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATIONMULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION
 
Algorithmic Analysis to Video Object Tracking and Background Segmentation and...
Algorithmic Analysis to Video Object Tracking and Background Segmentation and...Algorithmic Analysis to Video Object Tracking and Background Segmentation and...
Algorithmic Analysis to Video Object Tracking and Background Segmentation and...
 
Long-Term Robust Tracking Whith on Failure Recovery
Long-Term Robust Tracking Whith on Failure RecoveryLong-Term Robust Tracking Whith on Failure Recovery
Long-Term Robust Tracking Whith on Failure Recovery
 
Automated traffic sign board
Automated traffic sign boardAutomated traffic sign board
Automated traffic sign board
 
IRJET- Comparative Study of Different Techniques for Text as Well as Object D...
IRJET- Comparative Study of Different Techniques for Text as Well as Object D...IRJET- Comparative Study of Different Techniques for Text as Well as Object D...
IRJET- Comparative Study of Different Techniques for Text as Well as Object D...
 
Visual Saliency Model Using Sift and Comparison of Learning Approaches
Visual Saliency Model Using Sift and Comparison of Learning ApproachesVisual Saliency Model Using Sift and Comparison of Learning Approaches
Visual Saliency Model Using Sift and Comparison of Learning Approaches
 
MULTIPLE HUMAN TRACKING USING RETINANET FEATURES, SIAMESE NEURAL NETWORK, AND...
MULTIPLE HUMAN TRACKING USING RETINANET FEATURES, SIAMESE NEURAL NETWORK, AND...MULTIPLE HUMAN TRACKING USING RETINANET FEATURES, SIAMESE NEURAL NETWORK, AND...
MULTIPLE HUMAN TRACKING USING RETINANET FEATURES, SIAMESE NEURAL NETWORK, AND...
 
Data Science Machine
Data Science Machine Data Science Machine
Data Science Machine
 
A Novel Approach for Moving Object Detection from Dynamic Background
A Novel Approach for Moving Object Detection from Dynamic BackgroundA Novel Approach for Moving Object Detection from Dynamic Background
A Novel Approach for Moving Object Detection from Dynamic Background
 
D018112429
D018112429D018112429
D018112429
 
GROUPING OBJECTS BASED ON THEIR APPEARANCE
GROUPING OBJECTS BASED ON THEIR APPEARANCEGROUPING OBJECTS BASED ON THEIR APPEARANCE
GROUPING OBJECTS BASED ON THEIR APPEARANCE
 
Schematic model for analyzing mobility and detection of multiple
Schematic model for analyzing mobility and detection of multipleSchematic model for analyzing mobility and detection of multiple
Schematic model for analyzing mobility and detection of multiple
 

Mehr von Kalle

Blignaut Visual Span And Other Parameters For The Generation Of Heatmaps
Blignaut Visual Span And Other Parameters For The Generation Of HeatmapsBlignaut Visual Span And Other Parameters For The Generation Of Heatmaps
Blignaut Visual Span And Other Parameters For The Generation Of HeatmapsKalle
 
Zhang Eye Movement As An Interaction Mechanism For Relevance Feedback In A Co...
Zhang Eye Movement As An Interaction Mechanism For Relevance Feedback In A Co...Zhang Eye Movement As An Interaction Mechanism For Relevance Feedback In A Co...
Zhang Eye Movement As An Interaction Mechanism For Relevance Feedback In A Co...Kalle
 
Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...
Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...
Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...Kalle
 
Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...
Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...
Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...Kalle
 
Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...
Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...
Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...Kalle
 
Urbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze Control
Urbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze ControlUrbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze Control
Urbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze ControlKalle
 
Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...
Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...
Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...Kalle
 
Tien Measuring Situation Awareness Of Surgeons In Laparoscopic Training
Tien Measuring Situation Awareness Of Surgeons In Laparoscopic TrainingTien Measuring Situation Awareness Of Surgeons In Laparoscopic Training
Tien Measuring Situation Awareness Of Surgeons In Laparoscopic TrainingKalle
 
Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...
Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...
Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...Kalle
 
Stevenson Eye Tracking With The Adaptive Optics Scanning Laser Ophthalmoscope
Stevenson Eye Tracking With The Adaptive Optics Scanning Laser OphthalmoscopeStevenson Eye Tracking With The Adaptive Optics Scanning Laser Ophthalmoscope
Stevenson Eye Tracking With The Adaptive Optics Scanning Laser OphthalmoscopeKalle
 
Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...
Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...
Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...Kalle
 
Skovsgaard Small Target Selection With Gaze Alone
Skovsgaard Small Target Selection With Gaze AloneSkovsgaard Small Target Selection With Gaze Alone
Skovsgaard Small Target Selection With Gaze AloneKalle
 
San Agustin Evaluation Of A Low Cost Open Source Gaze Tracker
San Agustin Evaluation Of A Low Cost Open Source Gaze TrackerSan Agustin Evaluation Of A Low Cost Open Source Gaze Tracker
San Agustin Evaluation Of A Low Cost Open Source Gaze TrackerKalle
 
Ryan Match Moving For Area Based Analysis Of Eye Movements In Natural Tasks
Ryan Match Moving For Area Based Analysis Of Eye Movements In Natural TasksRyan Match Moving For Area Based Analysis Of Eye Movements In Natural Tasks
Ryan Match Moving For Area Based Analysis Of Eye Movements In Natural TasksKalle
 
Rosengrant Gaze Scribing In Physics Problem Solving
Rosengrant Gaze Scribing In Physics Problem SolvingRosengrant Gaze Scribing In Physics Problem Solving
Rosengrant Gaze Scribing In Physics Problem SolvingKalle
 
Qvarfordt Understanding The Benefits Of Gaze Enhanced Visual Search
Qvarfordt Understanding The Benefits Of Gaze Enhanced Visual SearchQvarfordt Understanding The Benefits Of Gaze Enhanced Visual Search
Qvarfordt Understanding The Benefits Of Gaze Enhanced Visual SearchKalle
 
Prats Interpretation Of Geometric Shapes An Eye Movement Study
Prats Interpretation Of Geometric Shapes An Eye Movement StudyPrats Interpretation Of Geometric Shapes An Eye Movement Study
Prats Interpretation Of Geometric Shapes An Eye Movement StudyKalle
 
Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...
Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...
Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...Kalle
 
Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...
Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...
Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...Kalle
 
Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...
Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...
Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...Kalle
 

Mehr von Kalle (20)

Blignaut Visual Span And Other Parameters For The Generation Of Heatmaps
Blignaut Visual Span And Other Parameters For The Generation Of HeatmapsBlignaut Visual Span And Other Parameters For The Generation Of Heatmaps
Blignaut Visual Span And Other Parameters For The Generation Of Heatmaps
 
Zhang Eye Movement As An Interaction Mechanism For Relevance Feedback In A Co...
Zhang Eye Movement As An Interaction Mechanism For Relevance Feedback In A Co...Zhang Eye Movement As An Interaction Mechanism For Relevance Feedback In A Co...
Zhang Eye Movement As An Interaction Mechanism For Relevance Feedback In A Co...
 
Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...
Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...
Yamamoto Development Of Eye Tracking Pen Display Based On Stereo Bright Pupil...
 
Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...
Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...
Wastlund What You See Is Where You Go Testing A Gaze Driven Power Wheelchair ...
 
Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...
Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...
Vinnikov Contingency Evaluation Of Gaze Contingent Displays For Real Time Vis...
 
Urbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze Control
Urbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze ControlUrbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze Control
Urbina Pies With Ey Es The Limits Of Hierarchical Pie Menus In Gaze Control
 
Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...
Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...
Urbina Alternatives To Single Character Entry And Dwell Time Selection On Eye...
 
Tien Measuring Situation Awareness Of Surgeons In Laparoscopic Training
Tien Measuring Situation Awareness Of Surgeons In Laparoscopic TrainingTien Measuring Situation Awareness Of Surgeons In Laparoscopic Training
Tien Measuring Situation Awareness Of Surgeons In Laparoscopic Training
 
Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...
Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...
Takemura Estimating 3 D Point Of Regard And Visualizing Gaze Trajectories Und...
 
Stevenson Eye Tracking With The Adaptive Optics Scanning Laser Ophthalmoscope
Stevenson Eye Tracking With The Adaptive Optics Scanning Laser OphthalmoscopeStevenson Eye Tracking With The Adaptive Optics Scanning Laser Ophthalmoscope
Stevenson Eye Tracking With The Adaptive Optics Scanning Laser Ophthalmoscope
 
Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...
Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...
Stellmach Advanced Gaze Visualizations For Three Dimensional Virtual Environm...
 
Skovsgaard Small Target Selection With Gaze Alone
Skovsgaard Small Target Selection With Gaze AloneSkovsgaard Small Target Selection With Gaze Alone
Skovsgaard Small Target Selection With Gaze Alone
 
San Agustin Evaluation Of A Low Cost Open Source Gaze Tracker
San Agustin Evaluation Of A Low Cost Open Source Gaze TrackerSan Agustin Evaluation Of A Low Cost Open Source Gaze Tracker
San Agustin Evaluation Of A Low Cost Open Source Gaze Tracker
 
Ryan Match Moving For Area Based Analysis Of Eye Movements In Natural Tasks
Ryan Match Moving For Area Based Analysis Of Eye Movements In Natural TasksRyan Match Moving For Area Based Analysis Of Eye Movements In Natural Tasks
Ryan Match Moving For Area Based Analysis Of Eye Movements In Natural Tasks
 
Rosengrant Gaze Scribing In Physics Problem Solving
Rosengrant Gaze Scribing In Physics Problem SolvingRosengrant Gaze Scribing In Physics Problem Solving
Rosengrant Gaze Scribing In Physics Problem Solving
 
Qvarfordt Understanding The Benefits Of Gaze Enhanced Visual Search
Qvarfordt Understanding The Benefits Of Gaze Enhanced Visual SearchQvarfordt Understanding The Benefits Of Gaze Enhanced Visual Search
Qvarfordt Understanding The Benefits Of Gaze Enhanced Visual Search
 
Prats Interpretation Of Geometric Shapes An Eye Movement Study
Prats Interpretation Of Geometric Shapes An Eye Movement StudyPrats Interpretation Of Geometric Shapes An Eye Movement Study
Prats Interpretation Of Geometric Shapes An Eye Movement Study
 
Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...
Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...
Porta Ce Cursor A Contextual Eye Cursor For General Pointing In Windows Envir...
 
Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...
Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...
Pontillo Semanti Code Using Content Similarity And Database Driven Matching T...
 
Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...
Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...
Park Quantification Of Aesthetic Viewing Using Eye Tracking Technology The In...
 

Kandemir Inferring Object Relevance From Gaze In Dynamic Scenes

  • 1. Copyright © 2010 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions Dept, ACM Inc., fax +1 (212) 869-0481 or e-mail permissions@acm.org. ETRA 2010, Austin, TX, March 22 – 24, 2010. © 2010 ACM 978-1-60558-994-7/10/0003 $10.00 Inferring Object Relevance from Gaze in Dynamic Scenes Melih Kandemir∗ Helsinki University of Technology Department of Information and Computer Science Veli-Matti Saarinen† Helsinki University of Technology Low Temperature Laboratory Samuel Kaski‡ Helsinki University of Technology Department of Information and Computer Science Abstract As prototypes of data glasses having both data augmentation and gaze tracking capabilities are becoming available, it is now possi- ble to develop proactive gaze-controlled user interfaces to display information about objects, people, and other entities in real-world setups. In order to decide which objects the augmented information should be about, and how saliently to augment, the system needs an estimate of the importance or relevance of the objects of the scene for the user at a given time. The estimates will be used to minimize distraction of the user, and for providing efficient spa- tial management of the augmented items. This work is a feasibility study on inferring the relevance of objects in dynamic scenes from gaze. We collected gaze data from subjects watching a video for a pre-defined task. The results show that a simple ordinal logistic regression model gives relevance rankings of scene objects with a promising accuracy. CR Categories: H.5.r [Information Interfaces and Representation (HCI)]: User interfaces—User interface management systems Keywords: augmented reality, gaze tracking, information re- trieval, intelligent user interfaces, machine learning, ordinal logistic regression 1 Introduction In this paper, we develop a method needed for doing information retrieval in dynamic real-world scenes where the queries are for- mulated implicitly by gaze. In our setup the user wears a ubiqui- tous information access device, “data glasses” having eye-tracking and information augmentation capabilities. The device is assumed to be capable of recognising and tracking certain types of objects from the first-person video data of the user Figure 1 illustrates the idea. Some objects, three faces and the whiteboard in this image, are augmented with attached boxes that include textual information obtained from other sources. In such a setup, each visible object in a scene can be considered as a channel through which additional relevant information can be obtained as augmented on the screen. As in traditional information retrieval setups such as text search en- gines, potential abundance of available information brings up the need for a mechanism to rank the channels with respect to their rel- evance. This is particularly important in proactive mobile setups where the augmented items are also potential distractors. 
Our goal is to infer the degree of interest of the user for the objects in the scene.

Figure 1: A screenshot from the viewpoint of hypothetical data glasses with augmented-reality capability during a short presentation in a meeting room (Scene 1).

This problem has a connection to the modelling of visual attention [Henderson 2003; Itti et al. 1998; Zhang et al. 2008]; whereas visual attention models typically try to predict the gaze pattern given the scene, our target is the inverse problem of inferring the user's state (interests) given the scene and the gaze trajectory. A good solution to the former problem would obviously help in our task too, but current visual attention models mainly consider physical pointwise saliency, which does not yet capture the largely top-down effect of the user's interests on the gaze pattern. Although there exist some initial attempts at two-way saliency modelling [Torralba et al. 2006], these have been evaluated only for rather simple visual tasks, such as counting objects of a certain type in static images. Unlike top-down models, where the model is optimised for a well-defined search task, the cognitive task of the subject in our setup is hidden and may even be unclear to the subject herself. Hence, we start with data-driven statistical machine learning techniques for the inverse modelling task.

Gaze data has been used in user interfaces in three ways. Our goal is furthest from the most frequent approach, off-line analysis, for instance studying the effectiveness of advertisements in attracting people's attention, or analysing social interaction. In the second approach the user selects actions by explicitly looking at the choices, for instance in eye typing [Hyrskykari et al. 2000; Ward and MacKay 2002]. Although such explicit selection mechanisms are easy to implement, they require full user attention and are strenuous because of the Midas touch effect: each glance activates an action whether it is intended or not. The third way of using gaze data in user interfaces is implicit feedback. The user uses her gaze normally, and the information needed by the interface is inferred from the gaze data. An emerging example is proactive information retrieval, where statistical machine learning methods are used to infer relevance from gaze patterns. The inferred relevance judgements are then used as implicit relevance feedback for information retrieval.
This has been done for text retrieval by generating implicit queries from gaze patterns [Hardoon et al. 2007]. The same principle has been used for image retrieval as well [Klami et al. 2008], recently also coupled dynamically to a retrieval engine in an interactive zooming interface [Kozma et al. 2009]. Gaze has additionally been used as a means of proactive interaction, but not information retrieval, in a desktop application by assigning a relevance function to the entities on a synthetic 2D map [Qvarfordt and Zhai 2005].

To test the feasibility of relevance ranking from gaze in dynamic real-world setups, we prepared a stimulus video and collected gaze data from subjects watching that video. True relevance rankings were then elicited from the subjects for several frames. We trained an ordinal logistic regression model and measured its accuracy in the relevance prediction task on left-out data.

2 Measurement Setup

We shot a video from the first-person view of a subject visiting three indoor scenes. We then postprocessed this video by augmenting some of the objects with additional textual information in an attached box. The video was shown to 4 subjects while gaze data was collected. Right after the viewing session, the subjects ranked the scene objects in relevance order for a subset of the video frames. The ranking was taken as the ground truth for learning the models and evaluating them. The modelling task is to predict the user-given rank of an object, given the gaze-tracking data from a window immediately preceding the ranked frame.

3 Model for Inferring Relevance

Let us index the stimulus slices preceding each relevance judgement from 1 to N. We extract a feature vector (details in the Experiments section) for each scene object i at time slice t to obtain a single unlabelled data point $f_i^{(t)} = (f_{i1}^{(t)}, f_{i2}^{(t)}, \dots, f_{id}^{(t)})$, where d is the number of features. If we also attach the ground-truth relevance rank $r_i^{(t)}$, we get a labelled data point $(f_i^{(t)}, r_i^{(t)})$. Let us denote the set of data points, one per object, related to time slice t as a data subset $\Lambda^{(t)} = \{(f_1^{(t)}, r_1^{(t)}), \dots, (f_{m_t}^{(t)}, r_{m_t}^{(t)})\}$, where $m_t$ is the number of visible objects at time slice t. Let us denote the data subset without labels by $\bar{\Lambda}^{(t)}$, and the maximum number of visible objects by $L = \max\{m_1, \dots, m_N\}$. For notational convenience, we define the most relevant object to have rank L, with the rank decreasing as relevance decreases. The whole labelled data set is the union of all data subsets, $\Delta = \{\Lambda^{(1)}, \Lambda^{(2)}, \dots, \Lambda^{(N)}\}$.

We search for a mapping from the feature space to the space of relevances, which is conventionally [0, 1]. Such a mapping can be obtained directly with ordinal logistic regression [McCullagh and Nelder 1989], if we assume that the relevance of an object depends only on its own features and is independent of the relevance of the other visible objects. We use the standard approach, described briefly below. Let us denote the probability that object i has rank k by $P(r_i^{(t)} = k \mid f_i^{(t)}) = \phi_k(f_i^{(t)})$. We can then define the log odds so that the problem reduces to a batch of L − 1 binary regression problems, one for each k = 1, 2, ..., L − 1:

$$M_k = \log \frac{P(r_i^{(t)} \le k \mid f_i^{(t)})}{1 - P(r_i^{(t)} \le k \mid f_i^{(t)})} = \log \frac{\phi_0(f_i^{(t)}) + \phi_1(f_i^{(t)}) + \cdots + \phi_k(f_i^{(t)})}{\phi_{k+1}(f_i^{(t)}) + \phi_{k+2}(f_i^{(t)}) + \cdots + \phi_L(f_i^{(t)})} = w_0^{(k)} + \mathbf{w} f_i^{(t)},$$

where a linear model is assumed. By exponentiating both sides and solving for the probability, we obtain the cumulative distribution of the rank of object i at time t:

$$P(r_i^{(t)} \le k \mid f_i^{(t)}) = \frac{\exp(w_0^{(k)} + \mathbf{w} f_i^{(t)})}{1 + \exp(w_0^{(k)} + \mathbf{w} f_i^{(t)})}.$$

Notice that we adopt the standard approach of using common slope coefficients $\mathbf{w} = [w_1, \dots, w_d]$ for all logit models, but a different intercept $w_0^{(k)}$ for each. In the training phase, we compute the maximum likelihood estimates of the model parameters $\theta = \{w_0^{(1)}, \dots, w_0^{(L-1)}, w_1, \dots, w_d\}$ with the Newton–Raphson technique. Given an unlabelled data subset $\bar{\Lambda}^{(t)}$ at time t, the object predicted to have relevance rank k is the one with the highest probability for that rank, $\arg\max_i \phi_k(f_i^{(t)})$.
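As a concrete illustration, the following Python sketch implements a proportional-odds model of this form. It is not the authors' code: it uses a generic quasi-Newton optimizer (BFGS) in place of Newton–Raphson, assumes ranks take values 1, ..., L, and all function names are our own.

```python
# Minimal proportional-odds (ordinal logistic regression) sketch.
# Illustrative reimplementation under simplifying assumptions, not the authors' code.
import numpy as np
from scipy.optimize import minimize

def _cutpoints(raw):
    # Map unconstrained parameters to ordered intercepts w0^(1) < ... < w0^(L-1).
    return raw[0] + np.concatenate(([0.0], np.cumsum(np.exp(raw[1:]))))

def _rank_probs(F, w, cuts):
    """Return an (n_objects, L) matrix of P(r = k | f) for k = 1..L."""
    eta = F @ w                                                   # shared slopes w
    cdf = 1.0 / (1.0 + np.exp(-(cuts[None, :] + eta[:, None])))   # P(r <= k | f)
    cdf = np.hstack([np.zeros((F.shape[0], 1)), cdf, np.ones((F.shape[0], 1))])
    return np.diff(cdf, axis=1)                                   # P(r = k) = P(r<=k) - P(r<=k-1)

def fit_ordinal_logit(F, ranks, L):
    """Maximum-likelihood fit; F is (n, d) features, ranks are integers in 1..L."""
    ranks = np.asarray(ranks, dtype=int)
    n, d = F.shape
    def nll(theta):
        probs = _rank_probs(F, theta[:d], _cutpoints(theta[d:]))
        return -np.sum(np.log(probs[np.arange(n), ranks - 1] + 1e-12))
    theta0 = np.concatenate([np.zeros(d), [-1.0], np.zeros(L - 2)])
    theta = minimize(nll, theta0, method="BFGS").x
    return theta[:d], _cutpoints(theta[d:])

def predict_rank_k(F_slice, w, cuts, k):
    """Within one time slice, the object predicted to hold rank k: argmax_i phi_k(f_i)."""
    return int(np.argmax(_rank_probs(F_slice, w, cuts)[:, k - 1]))
```

The reparametrisation of the intercepts (a base value plus positive increments) simply keeps the cumulative probabilities monotone in k, so that all rank probabilities stay non-negative during optimisation.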
4 Experiments

4.1 Stimulus Preparation

We shot a video clip 4 minutes and 17 seconds long from the first-person view of a subject, using a see-through head-mounted display device. In the scenario of the clip, a visitor to our laboratory is told about our research project. The scenario consists of three consecutive scenes:

1. A short presentation in a meeting room: A researcher introduces the project with a block diagram drawn on the whiteboard (Figure 1) in a meeting room. The people present ask questions, and the visitor follows the presentation.

2. A walk in the lab corridor: The visitor walks through the laboratory, taking a look at posters on the wall and zooming in on some of the name tags on office doors.

3. Demo of data collection devices: The host explains how eye-tracking experiments are made. He demonstrates a monitor with eye-tracking capabilities and the head-mounted display device.

Next, we augmented the video by attaching information boxes to objects such as faces, the whiteboard, name tags, posters, and devices related to the project. These were considered the objects potentially most interesting to the visitor. Short snippets of textual information relevant to the objects were displayed inside the boxes. At most one information box was attached to any one object at a time, and boxes were displayed for all visible objects. There were from 0 to 4 objects in the scene at a time; the average number of scene objects was 2.017, with a standard deviation of 1.36. The frame rate of the postprocessed video was 12 fps.

4.2 Data Collection

We collected gaze data from 4 subjects while they watched the stimulus video, with the task of getting as much information as they could about the research project. After the viewing session, the subjects were shown 154 screenshots from the video in temporal order, each representing a 1.66-second slot (20 frames). The subjects were asked to select the objects that were relevant to them at that moment, and to rank the selected objects according to their relevance. We defined relevance as the interest in seeing augmented information about an object in the scene at that particular time. After the ranking, all subjects stated that they had been able to remember the correct ranks for almost all of the frames. The subjects were graduate and postgraduate researchers not working on the project related to the study presented in this paper.
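For concreteness, the sketch below shows one way the ranked screenshots could be assembled into the labelled time slices $\Lambda^{(t)}$ of Section 3. The data layout (LabelledSlice, rankings, extract_features) is hypothetical and only meant to fix notation; it is not part of the paper's pipeline.

```python
# Hypothetical assembly of the labelled data set Delta from the ranking screenshots
# (one labelled frame per 20 video frames). Names and data layout are assumptions.
from dataclasses import dataclass

@dataclass
class LabelledSlice:
    t: int            # index of the labelled time slice (1..N)
    features: list    # one numeric feature vector per visible object
    ranks: list       # user-given ranks, position-aligned with features; rank L = most relevant

def build_dataset(labelled_frames, rankings, extract_features, W=300):
    """labelled_frames: frame indices of the ranked screenshots (every 20th frame);
    rankings: one dict {object_id: rank} per screenshot, aligned with labelled_frames;
    extract_features(obj, frame, W): context features from the W frames before 'frame'."""
    dataset = []
    for t, (frame, slice_ranks) in enumerate(zip(labelled_frames, rankings), start=1):
        objects = sorted(slice_ranks)
        feats = [extract_features(obj, frame, W) for obj in objects]
        ranks = [slice_ranks[obj] for obj in objects]
        dataset.append(LabelledSlice(t=t, features=feats, ranks=ranks))
    return dataset
```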
4.3 The Eye Tracker

We collected the gaze data with a Tobii 1750 eye tracker with a 50 Hz sampling rate. The tracker has an infrared stereo camera built into a standard flat-screen monitor, and performs tracking by detecting the pupil centers and measuring the reflection from the cornea. Successive gaze points located within an area of 30 pixels were grouped into a single fixation; this corresponds to approximately 0.6 degrees of deflection at a normal viewing distance to the 17-inch monitor with 1280 × 1024 pixel resolution. Test subjects sat 60 cm away from the monitor.

4.4 Feature Extraction

We extracted from the gaze and video data a set of features for each visible object. This was done at every time slice for which the labelled object ranks were available (i.e., for one frame in every 20 consecutive frames). Each feature summarises a particular aspect of the temporal context (the recent past). We define the context at time t as the slot from time point t − W to t − 1, where W is a predetermined window size. We used the following 11 features (a sketch of computing a few of them is given below):

1. mean area of the bounding box of the object
2. mean area of the information box attached to the object
3. mean distance between the centers of the object bounding box and the attached information box
4. total duration of fixations inside the bounding box of the object
5. total duration of fixations inside the information box attached to the object
6. mean duration of fixations inside the bounding box of the object
7. mean duration of fixations inside the information box attached to the object
8. mean distance of all fixations to the center of the object bounding box
9. mean distance of all fixations to the center of the information box
10. mean length of saccades that ended in fixations inside the bounding box of the object
11. mean length of saccades that ended in fixations inside the information box attached to the object

We marked the bounding boxes of the objects manually, frame by frame.
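The sketch below illustrates how a few of these features (4, 6, 8, and 11) might be computed from detected fixations within the context window. The data structures (fixations as dictionaries, bounding boxes keyed by frame index) are our own assumptions for illustration, not the authors' implementation.

```python
# Hypothetical computation of a subset of the 11 context features for one object.
import numpy as np

def object_features(fixations, obj_boxes, info_boxes, t, W):
    """fixations: dicts with 'frame', 'x', 'y', 'duration', 'saccade_len';
    obj_boxes / info_boxes: frame index -> (x0, y0, x1, y1), the manually marked boxes;
    the context is the window [t - W, t - 1] in frames."""
    ctx = [f for f in fixations if t - W <= f["frame"] < t]

    def inside(f, boxes):
        b = boxes.get(f["frame"])
        return b is not None and b[0] <= f["x"] <= b[2] and b[1] <= f["y"] <= b[3]

    def center(b):
        return np.array([(b[0] + b[2]) / 2.0, (b[1] + b[3]) / 2.0])

    on_obj = [f for f in ctx if inside(f, obj_boxes)]    # fixations on the object
    on_box = [f for f in ctx if inside(f, info_boxes)]   # fixations on the info box
    dists = [np.linalg.norm(np.array([f["x"], f["y"]]) - center(obj_boxes[f["frame"]]))
             for f in ctx if f["frame"] in obj_boxes]

    return np.array([
        sum(f["duration"] for f in on_obj),                              # feature 4: total fixation duration on object
        np.mean([f["duration"] for f in on_obj]) if on_obj else 0.0,     # feature 6: mean fixation duration on object
        np.mean(dists) if dists else 0.0,                                # feature 8: mean fixation distance to object center
        np.mean([f["saccade_len"] for f in on_box]) if on_box else 0.0,  # feature 11: mean saccade length into info box
    ])
```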
4.5 Evaluation

We evaluated the accuracy of the model as the proportion of time slices in which the most relevant object was predicted correctly, and compared the model against five baseline methods. The first is random guessing, in which the scene objects at each time slice are ranked uniformly at random. The second is an attention-based method that assigns a relevance proportional to the total fixation duration on the object and on its augmented content; this estimate of object relevance is referred to as gaze intensity [Qvarfordt and Zhai 2005], and the comparison reveals how much the more intricate gaze-pattern features add beyond mere visual attention. In the third baseline we used the ordinal logistic regression model with only the features not related to gaze (the first three features), in order to isolate the contribution of the gaze-based features to prediction accuracy. The last two baselines build on Itti et al.'s bottom-up visual attention model [Itti et al. 1998], to observe how useful plain attention modelling is in our problem setup and to test whether our model provides better accuracy: we computed the Itti–Koch saliency map of the labelled frames and took the relevance of an object to be either the maximum or the average saliency inside its bounding box.

We trained separate models for the user-specific and user-independent cases. In the user-specific case, we trained and tested the model on the data of the same subject. We split the data set into training and validation sets by random selection without replacement: 2/3 of the data set was selected at random for training and the remainder was left out for testing. We repeated this process 50 times and measured the mean prediction accuracy. The accuracy was computed for several window sizes, from 50 frames up to 750 frames in steps of 25 frames. Our model outperformed all baseline methods for all subjects and all window sizes (Figure 2). The significance of the difference was tested for each subject separately with the Wilcoxon signed-rank test at α = 0.05, between our model and the three best-performing baselines: the logit model without gaze features and the two saliency-based models. The window sizes for our model and for the logit model without gaze features were selected according to the average prediction accuracy on the training data.

Figure 2: User-specific model accuracy for one user. Sub-images show the accuracy (proportion of correct predictions) as a function of the context window size (in frames, x-axis). Red diamonds: our proposed model; blue circles: baseline model using only the video features (not gaze); green reversed triangles: attention-only model; cyan squares: random guessing; black triangles: maximum saliency inside the object; pink crosses: average saliency inside the object.

In the user-independent case, we left out one user and trained the model on the whole data sets of the other users. We then evaluated the accuracy on the data of the left-out user, and repeated this procedure for all users. The results led to the same conclusions as in the user-specific case, although with some decrease in accuracy for all metrics, and the outperformance was not significant for some test subjects. This is probably due to the increased uncertainty originating from the subjectivity of top-down cognitive processes; a single common model may then be inadequate to handle the variability of gaze patterns across subjects. This issue needs to be investigated further.
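Read concretely, the user-specific protocol described above could be sketched as follows. The helpers fit_ordinal_logit and predict_rank_k refer to the earlier model sketch, and the code assumes each labelled slice stores position-aligned feature vectors and ranks with the most relevant object at rank L; this is an illustrative reading, not the authors' actual pipeline.

```python
# Sketch of the user-specific evaluation: repeated random 2/3-1/3 splits,
# accuracy = proportion of slices whose most relevant object is predicted correctly.
import numpy as np

def top1_accuracy(slices, w, cuts, L):
    hits = 0
    for s in slices:
        F = np.asarray(s.features, dtype=float)
        best = predict_rank_k(F, w, cuts, L)      # predicted most relevant object
        hits += int(s.ranks[best] == L)
    return hits / len(slices)

def evaluate_user_specific(slices, L, n_repeats=50, train_frac=2 / 3, seed=0):
    rng = np.random.default_rng(seed)
    accuracies = []
    for _ in range(n_repeats):
        order = rng.permutation(len(slices))
        n_train = int(train_frac * len(slices))
        train = [slices[i] for i in order[:n_train]]
        test = [slices[i] for i in order[n_train:]]
        F = np.vstack([f for s in train for f in s.features])
        r = np.concatenate([s.ranks for s in train])
        w, cuts = fit_ordinal_logit(F, r, L)      # refit on each random training split
        accuracies.append(top1_accuracy(test, w, cuts, L))
    return float(np.mean(accuracies))
```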
The box plot in Figure 3(a) shows the learned regressor weights for one subject in the user-specific case. The small variance of the weights indicates that the model is stable across different splits. Both the magnitude and the ordering of the weights in the user-independent case were very similar to those in the user-specific case.

The best accuracy was achieved at rather long window sizes (525 frames in the user-specific case and 300 frames in the user-independent case for test subject 1). This supports the claim that the context does contain information related to object relevances. The decrease in accuracy as the window size grows further is not substantial, and in particular the proposed model seems to be insensitive to the window size. The feature with the strongest positive influence on relevance is the mean distance between the object center and the fixations within the context (w8): intuitively, the relevance of an object increases as the fixations within the context get closer to its center. The feature with the strongest negative influence is the mean distance between the object and its information box, meaning that the closer the information box is placed to the object, the more interest the object attracts. Some of the weights are harder to interpret, and we will study them further in subsequent research.

Figure 3: Variance of the regressor weights for each of the features across different bootstrap trials in the user-specific model. The features are numbered as in Section 4.4.

5 Discussion

In this work, we assessed the feasibility of a gaze-based object relevance predictor in real-world scenes where the scene objects were augmented with additional information. For this, we applied a rather simple ordinal logistic regression model over a set of gaze-pattern and visual-content features. The prominent increase in accuracy when the gaze-pattern features are added to the feature set shows that gaze statistics and visual features make mutually complementary contributions to relevance inference; the optimal way of combining these two sources of information should be studied further. The outperformance of our model over the bottom-up attention model in predicting the most relevant object can be attributed to the inability of bottom-up models to reflect the task-dependent control of attention.

Better performance can probably be achieved by enriching the feature set and using a more complex model that fits the data better. Generalisation of the model to other real-world scenes also needs to be investigated further. This can be done by plugging the model into a wearable information access device and assessing its performance during online use; such an assessment of our model is currently in progress.

6 Acknowledgements

Melih Kandemir and Samuel Kaski belong to the Finnish Center of Excellence in Adaptive Informatics and to the Helsinki Institute for Information Technology (HIIT). Samuel Kaski also belongs to the PASCAL2 EU network of excellence. This study is funded by the TKK MIDE project UI-ART.

References

HARDOON, D., SHAWE-TAYLOR, J., AJANKI, A., PUOLAMÄKI, K., AND KASKI, S. 2007. Information retrieval by inferring implicit queries from eye movements. In International Conference on Artificial Intelligence and Statistics (AISTATS '07).

HENDERSON, J. M. 2003. Human gaze control during real-world scene perception. Trends in Cognitive Sciences 7, 11, 498–504.

HYRSKYKARI, A., MAJARANTA, P., AALTONEN, A., AND RÄIHÄ, K.-J. 2000. Design issues of 'iDict': A gaze-assisted translation aid. In Proceedings of ETRA 2000, Eye Tracking Research and Applications Symposium, ACM Press, 9–14.

ITTI, L., KOCH, C., AND NIEBUR, E. 1998. A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 20, 11, 1254–1259.
KANDEMIR, M., SAARINEN, V.-M., AND KASKI, S. 2010. Inferring object relevance from gaze in dynamic scenes. In Short Paper Proceedings of ETRA 2010, Eye Tracking Research and Applications Symposium.

KLAMI, A., SAUNDERS, C., DE CAMPOS, T. E., AND KASKI, S. 2008. Can relevance of images be inferred from eye movements? In MIR '08: Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval, ACM, New York, NY, USA, 134–140.

KOZMA, L., KLAMI, A., AND KASKI, S. 2009. GaZIR: Gaze-based zooming interface for image retrieval. In Proc. ICMI-MLMI 2009, The Eleventh International Conference on Multimodal Interfaces and The Sixth Workshop on Machine Learning for Multimodal Interaction, ACM, New York, NY, USA, 305–312.

MCCULLAGH, P., AND NELDER, J. 1989. Generalized Linear Models. Chapman & Hall/CRC.

QVARFORDT, P., AND ZHAI, S. 2005. Conversing with the user based on eye-gaze patterns. In CHI '05: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, New York, NY, USA, 221–230.

TORRALBA, A., OLIVA, A., CASTELHANO, M. S., AND HENDERSON, J. M. 2006. Contextual guidance of eye movements and attention in real-world scenes: the role of global features in object search. Psychological Review 113, 4, 766–786.

WARD, D. J., AND MACKAY, D. J. C. 2002. Fast hands-free writing by gaze direction. Nature 418, 6900, 838.

ZHANG, L., TONG, M. H., MARKS, T. K., SHAN, H., AND COTTRELL, G. W. 2008. SUN: A Bayesian framework for saliency using natural statistics. Journal of Vision 8, 7 (12), 1–20.