Abstract
This paper describes a computer vision system based on active IR illumination for real-time gaze tracking in interactive graphic display. Unlike most existing gaze tracking techniques, which often assume a static head and require a cumbersome calibration process for each person, our gaze tracker performs robust and accurate gaze estimation without calibration and under rather significant head movement. This is made possible by a new gaze calibration procedure that identifies the mapping from pupil parameters to screen coordinates using generalized regression neural networks (GRNNs). With GRNNs, the mapping does not have to be an analytical function, and head movement is explicitly accounted for by the gaze mapping function. Furthermore, the mapping function can generalize to individuals not used in the training. To further improve the gaze estimation accuracy, we employ a hierarchical classification scheme that deals with the classes that tend to be misclassified. This leads to a 10% improvement in classification error. The angular gaze accuracy is about 5° horizontally and 8° vertically. The effectiveness of our gaze tracker is demonstrated by experiments that involve gaze-contingent interactive graphic display.
1 Introduction
Gaze determines a person’s current line of sight or point of fixation. The fixation point is defined as the intersection of the line of sight with the surface of the object being viewed (such as the screen). Gaze may be used to interpret the user’s intention for noncommand interactions and to enable (fixation-dependent) accommodation and dynamic depth of focus. The potential benefits of incorporating eye movements into the interaction between humans and computers are numerous. For example, knowing the location of a user’s gaze may help a computer to interpret the user’s request and possibly enable a computer to ascertain some cognitive states of the user, such as confusion or fatigue.
6 Conclusions
In this paper, we present a new approach for gaze tracking. Compared with the existing gaze tracking methods, our method, though at a lower spatial gaze resolution (about 5°), has the following benefits: no calibration is necessary, it allows natural head movement, and it is completely nonintrusive and unobtrusive while still producing relatively robust and accurate gaze tracking. The improvement is a result of using a new gaze calibration procedure based on GRNNs. With GRNNs, we do not need to assume an analytical gaze mapping function; therefore, we can account for head movement in the mapping. The use of hierarchical classification schemes further improves the gaze classification accuracy. While our gaze tracker may not be as accurate as some commercial gaze trackers, it achieves sufficient accuracy even under large head movements and, more importantly, is calibration-free. It has significantly relaxed the constraints imposed by most existing commercial eye trackers. We believe that, after further improvement, our system will find many applications, including smart graphics, human-computer interaction, nonverbal communication via gaze, and assistance for people with disabilities.
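To illustrate why a GRNN needs no analytical mapping function, the following minimal sketch implements the standard GRNN estimate, which is a Gaussian-kernel-weighted average of the stored training targets. It is not the implementation used in our system: the function name grnn_predict, the two-dimensional feature vectors, the toy screen coordinates, and the smoothing parameter sigma are all assumptions chosen for illustration.

```python
import math


def grnn_predict(train_x, train_y, query, sigma=0.5):
    """Return the GRNN estimate for `query`.

    The estimate is the average of the training targets, each weighted by a
    Gaussian kernel of its input's squared distance to the query.
    """
    weights = []
    for x in train_x:
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))

    total = sum(weights)
    if total == 0.0:
        # All kernel weights underflowed; sigma is too small for this query.
        raise ValueError("query is too far from every training sample")

    dims = len(train_y[0])
    return tuple(
        sum(w * y[k] for w, y in zip(weights, train_y)) / total
        for k in range(dims)
    )


# Hypothetical toy data: 2-D feature vectors mapped to screen (x, y) in pixels.
train_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
train_y = [(0.0, 0.0), (640.0, 0.0), (0.0, 480.0), (640.0, 480.0)]

# A query equidistant from all four samples receives equal weights.
estimate = grnn_predict(train_x, train_y, (0.5, 0.5))
```

Because the estimate is built directly from stored samples, any input that varies with head position can simply be appended to the feature vector, with no change to the estimator. The parameter sigma controls how smoothly the estimate interpolates between training samples.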