
Eye tracking refers to monitoring eye movements. There are a number of techniques for measuring eye movements; the most common and widely used is video-based eye tracking, which uses video cameras to record an image of the eye and extracts the relevant information from that image.

Gaze Tracking

A gaze tracker is a device that measures eye movements and additionally estimates the user’s gaze from the information obtained from the eyes. Depending on the gaze estimation technique employed, the output of a gaze tracker may be the Point of Regard (PoR) or Line of Sight (LoS) in 3D space, or a point in a 2-dimensional image (e.g., the user’s field of view (scene image) or a computer display). A detailed review of recent eye models and techniques for eye detection and tracking can be found in this paper: In the Eye of the Beholder: A Survey of Models for Eyes and Gaze.

Remote vs Head-Mounted Gaze Trackers

Video-based gaze trackers can be categorized into two types: head-mounted gaze trackers (HMGT) and table-mounted trackers (a.k.a. remote gaze trackers, RGT).

In a table-mounted tracker the system components (camera and infrared light sources) are placed away (remote) from the user. Some RGTs have more than one camera to track the eyes and the face. In a head-mounted system, the eye and front-view cameras are mounted on the head. Some HMGTs do not have a front-view camera and instead estimate the gaze in 3D space. Binocular HMGTs have two eye cameras for tracking both eyes.
In terms of gaze estimation space, table-mounted systems usually only allow estimating the Point of Regard (PoR) on a fixed planar surface (fixation plane), e.g. a computer display. In contrast, HMGT systems are commonly used for estimating the Line of Sight (LoS) and the gaze point of the user in their field of view. Table-mounted trackers allow only a very limited range of head movements and have a limited field of view, whereas HMGTs are mounted on the head and have a wide FoV.

Here is the problem we want to solve: the user is sitting in front of a computer display and is looking at a point on the display. There is a camera looking at the user’s eye, and we want to use the eye image to find out where the user is looking. Several eye features can be detected and tracked in the eye image, such as the pupil center, the limbus (border of the iris) and the eye corners. Many gaze trackers use the pupil center together with the reflection of a light source (from the anterior surface of the cornea) for estimating the gaze point (PoR).

One way of using these image features and relating them to the gaze point in space is to follow a geometrical method and find the person’s gaze vector in space. Once we have the gaze vector relative to our world coordinate system, we can find the intersection of this vector with the planar screen in front of the user, and this intersection is the gaze point. This method is basically a direct way of finding the gaze point in space, and it requires a calibrated setup and knowledge of the geometry of the eye model and the system components. If you are interested in studying the mathematical details of this method, read this paper: General Theory of Remote Gaze Estimation Using the Pupil Center and Corneal Reflections.
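As a toy illustration of that last step (not the full method from the paper), the sketch below intersects a known gaze ray with the display plane. The eye position, gaze direction and screen plane are made-up example values in an assumed world coordinate system:

import numpy as np

def intersect_ray_plane(origin, direction, plane_point, plane_normal):
    """Return the point where a ray from `origin` along `direction` hits a plane."""
    direction = direction / np.linalg.norm(direction)
    denom = np.dot(plane_normal, direction)
    if abs(denom) < 1e-9:
        return None            # ray is (nearly) parallel to the screen
    t = np.dot(plane_normal, plane_point - origin) / denom
    if t < 0:
        return None            # the screen is behind the eye
    return origin + t * direction

# Hypothetical example values (millimetres, screen plane at z = 0):
eye_center = np.array([0.0, 0.0, 600.0])    # eye roughly 60 cm from the screen
gaze_dir   = np.array([0.05, -0.02, -1.0])  # gaze direction towards the screen
por = intersect_ray_plane(eye_center, gaze_dir,
                          plane_point=np.array([0.0, 0.0, 0.0]),
                          plane_normal=np.array([0.0, 0.0, 1.0]))
print("Gaze point (PoR) on the screen plane:", por)

The hard part of the geometric approach is, of course, obtaining that gaze ray in the first place, which is what the calibrated eye model and system geometry are needed for.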

Although there are many different methods for gaze estimation, low-precision gaze tracking can be achieved by simple interpolation. The interactive figures below show the basics of interpolation-based methods that map the pupil center to a point on a 2-dimensional plane in front of the eye. Let’s assume that the gaze point always lies on a plane (e.g. a computer display) in front of the eye; we call this plane the fixation plane. Figure 1 shows a schematic illustration of the main elements of a remote gaze tracker setup. The camera is shown as a triangle indicating the camera image and the projection center. A simplified model of the eye is shown with its optical/visual axes. The visual axis intersects the fixation plane at the gaze point (PoR). You can also see how light emitted from a light source is reflected from the surface of the cornea and projected onto the camera image. You can interact with this figure and move the eye and the camera by dragging the red circles. Play around with this figure and see how the projections of the pupil center and the light source in the image change when you change the gaze direction.

Figure 1: Main elements of a remote gaze tracker setup.

Now let’s see how we can map the pupil position to the gaze point in the fixation plane. The graph here shows a 2-dimensional view of the problem, but the results can be generalized to 3D. In Figure 1 you can see a point called Pupil_y. There is also a vector connecting the PoR and the point Pupil_y. This vector indicates the location of the pupil center inside the eye image for each fixation point. This vector is in fact the input of the gaze tracking system, which is obtained from the eye image. Rotate the eye in the figure and see how this vector changes for different gaze points.

In this figure you can also see another vector indicating the height of the PoR. This is the output of the system: the position of the gaze point in the fixation plane (y), which in reality would be a point with two coordinates (x, y). A simple gaze estimation can be done by mapping Pupil_y (the input) to y (the PoR, i.e. the output). When you rotate the eye in Figure 1, you can see the trace of the point Pupil_y. This trace is actually the mapping function we need for gaze mapping. We can consider this function f(Pupil_y) to be either a line or a polynomial curve. Let’s assume for now that it is a linear function. Then it can be obtained from two sample points. This is done via a calibration procedure: we ask the subject to fixate on two different target points in the fixation plane, which gives us two points along the line, enough to determine the mapping function.
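As a rough sketch of this two-point calibration, assuming we already have the Pupil_y image coordinate recorded for two known targets (all numbers below are invented for illustration):

def calibrate_two_points(pupil_ys, target_ys):
    """Fit the linear mapping y = a * pupil_y + b from two calibration samples."""
    (p1, p2), (t1, t2) = pupil_ys, target_ys
    a = (t2 - t1) / (p2 - p1)   # slope of the mapping line
    b = t1 - a * p1             # intercept
    return a, b

def estimate_gaze_y(pupil_y, a, b):
    """Map a measured pupil y-coordinate (image) to a gaze y-coordinate (screen)."""
    return a * pupil_y + b

# Hypothetical calibration: pupil y-positions (image pixels) recorded while the
# subject fixated two targets with known y-positions on the display (screen pixels).
a, b = calibrate_two_points(pupil_ys=(210.0, 260.0), target_ys=(100.0, 900.0))
print(estimate_gaze_y(235.0, a, b))  # gaze estimate for a new pupil measurement

In a real 2D tracker the same idea is applied to both coordinates, and in practice a low-order polynomial fitted to more calibration targets (often a 3x3 grid of points) is used instead of a straight line.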

Use the buttons in the figure to take the first sample point, then rotate the eye and take the second sample. After the calibration you will see a yellow line. This is the mapping function that the gaze tracker uses for gaze estimation. The function works fine as long as the position of the eyeball (i.e. the head position) stays fixed relative to the camera. If you move the eyeball slightly, you will see that the mapping function no longer aligns with the actual trace of the Pupil_y point.

Figure 2: Main elements of the gaze estimation process using only the pupil center.

Although remote gaze tracker setups are less invasive and more comfortable for the user, they have the undeniable disadvantage of a low tolerance to the user’s head movements. Compensating for the error introduced by natural head movements in the gaze estimation process is still the greatest limitation of most remote gaze trackers. For that reason, researchers have sought techniques to improve gaze estimation so that users can move their heads during an eye tracking session.

It is possible to use infrared light sources to generate reflection points (i.e., glints) on the user's cornea. These glints are virtual images of the light sources, formed by the frontal corneal surface, which acts as a convex mirror. Glints are generally used as reference points with respect to eye rotations: to a first approximation, changes in the pupil-center position are due mainly to eye rotations, whereas changes in the glint positions are due mainly to the user’s head movements.

Let's see how we can compensate for the user’s head movements by analyzing the relation between a glint and the pupil center. In Figure 3, the pupil-glint vector is used instead of the pupil center. Follow the same steps described above and calibrate the system, then move the center of the eyeball and see how much using the glint increases the robustness of the gaze estimation against head movements.
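Continuing the earlier sketch, the only change is that the calibration and the estimation are fed the pupil-minus-glint difference rather than the raw pupil position. The image coordinates below are invented, and the helper function is hypothetical:

import numpy as np

def pupil_glint_vector(pupil_center, glint_center):
    """Input feature: vector from the corneal reflection (glint) to the pupil center."""
    return np.asarray(pupil_center, dtype=float) - np.asarray(glint_center, dtype=float)

# Hypothetical image measurements (pixels). When the head translates, the pupil and
# the glint shift together, so their difference barely changes; when the eye rotates,
# the pupil moves relative to the glint, so the difference changes.
v_before_head_move = pupil_glint_vector((320.0, 215.0), (305.0, 240.0))
v_after_head_move  = pupil_glint_vector((340.0, 230.0), (325.0, 255.0))
print(v_before_head_move, v_after_head_move)  # identical here; nearly identical in practice

This pupil-glint vector then plays the role of Pupil_y in the calibration described above.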

Figure 3: Gaze estimation using the pupil-glint vector.