Here we provide a large data set of eye movements recorded on high-resolution natural movies, Hollywood trailers, and static images. Our analysis of these data and details on data collection can be found in
Please cite the above paper if you use our data set in your own publications. We would also appreciate hearing about your research; please send an e-mail to email@example.com.
Please note that we recently made available an even larger database of eye movements from subjects who watched several hours of Hollywood movies while performing an action recognition task.
The original videos were stored as .m2t MPEG transport streams by a JVC JY-HD10 HDTV camcorder. Because our clips were cut out of longer recordings without re-encoding, the embedded time stamps do not start at zero. Some video players (including versions of the Windows MediaPlayer) have problems with this format. Please download a sample movie first to check that your software can handle these videos before spending bandwidth on the whole set (for guaranteed playback, we recommend mplayer or vlc). Alternatively, we provide the stimuli re-encoded as MPEG-1, which should play on most platforms.
Please contact us (firstname.lastname@example.org) directly if you are interested in the Hollywood trailers.
The data file format is as follows:
    # Comments start with a '#'
    gaze <horizontal resolution> <vertical resolution>
    geometry distance <screen-subject> <width of stimulus on screen> <height of stimulus on screen>
    <sample 1>
    <sample 2>
    ...
where each <sample n> has the form
<timestamp of gaze sample in microseconds since start of movie> <x position> <y position> <confidence value>
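A minimal sketch of a reader for this format may help when getting started. It assumes that fields are separated by whitespace and that the header lines begin with the keywords `gaze` and `geometry` as shown above; the names `GazeRecording` and `parse_coord` are hypothetical, and the example values are made up, not taken from the data set. Because the exact layout of the `geometry` line may contain further keywords, the sketch simply keeps every numeric token on that line.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class GazeRecording:
    hres: int = 0
    vres: int = 0
    # Numeric values of the geometry line: distance, width, height.
    geometry: List[float] = field(default_factory=list)
    # Each sample: (timestamp in microseconds, x, y, confidence).
    samples: List[Tuple[int, float, float, float]] = field(default_factory=list)


def _is_number(token: str) -> bool:
    try:
        float(token)
        return True
    except ValueError:
        return False


def parse_coord(text: str) -> GazeRecording:
    """Parse the contents of one .coord file (assumed whitespace-separated)."""
    rec = GazeRecording()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        tokens = line.split()
        if tokens[0] == "gaze":
            rec.hres, rec.vres = int(tokens[1]), int(tokens[2])
        elif tokens[0] == "geometry":
            # Keep numeric tokens only, so extra keywords are tolerated.
            rec.geometry = [float(t) for t in tokens[1:] if _is_number(t)]
        else:
            t, x, y, conf = tokens[:4]
            rec.samples.append((int(float(t)), float(x), float(y), float(conf)))
    return rec


# Made-up example content, for illustration only.
example = """# made-up example
gaze 1280 720
geometry distance 0.45 0.40 0.225
0 640.0 360.0 1.0
4012 641.5 359.0 1.0
"""
rec = parse_coord(example)
print(rec.hres, rec.vres, len(rec.samples))
```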
In theory, we could read timestamps directly from the gaze samples sent over the network by the eye tracker, and these should be multiples of 4 ms; in practice, however, this would require that the system clocks of the eye tracker PC and the display PC be perfectly synchronized (which they are not). Furthermore, we cannot guarantee a 4 ms sample interval because UDP network packets may be dropped silently (over a dedicated ethernet link such as the one we used in our experiment, this should not happen, but the UDP protocol gives no such guarantee). We therefore assigned timestamps from the display PC's clock at the time when gaze samples were read from the network socket. Because the thread responsible for receiving samples was sporadically not scheduled immediately after a sample became available, some of the timestamps are far from multiples of 4 ms. For analysis purposes, it is therefore relatively safe to assume a sample interval of 4 ms locally and to use the assigned timestamps globally (i.e. over longer temporal distances).
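To see where the assigned timestamps deviate from the nominal 4 ms grid, one could list the inter-sample intervals that fall outside a tolerance. The following is only a sketch: the function name `find_irregular_intervals` and the tolerance of 2000 microseconds are illustrative assumptions, and the example timestamps are made up.

```python
from typing import List, Sequence, Tuple

NOMINAL_INTERVAL_US = 4000  # 4 ms sample interval, in microseconds


def find_irregular_intervals(
    timestamps_us: Sequence[int], tolerance_us: int = 2000
) -> List[Tuple[int, int]]:
    """Return (index, interval) pairs whose interval deviates from 4 ms.

    Index i refers to the interval between sample i and sample i + 1.
    tolerance_us is an arbitrary illustrative threshold.
    """
    irregular = []
    for i in range(len(timestamps_us) - 1):
        interval = timestamps_us[i + 1] - timestamps_us[i]
        if abs(interval - NOMINAL_INTERVAL_US) > tolerance_us:
            irregular.append((i, interval))
    return irregular


# Made-up timestamps: the third sample was read late, the fourth right after.
print(find_irregular_intervals([0, 4010, 11500, 12020, 16000]))
```

A late read typically shows up as one long interval followed by a short one, whereas a dropped packet would leave a long interval without such compensation.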
x and y are screen coordinates with (0,0) at the top left and (hres-1,vres-1) at the bottom right corner; horizontal and vertical resolution are specified in the header of the gaze file (in all these recordings, 1280 and 720, respectively).
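For analyses in degrees of visual angle, pixel coordinates can be combined with the geometry values from the file header. The sketch below rests on several assumptions that are not stated in the format description: the eye is centered in front of the stimulus, the stimulus spans the full pixel range, and distance, width and height share one unit. The function name `pixel_to_degrees` is hypothetical.

```python
import math


def pixel_to_degrees(x, y, hres, vres, distance, width, height):
    """Convert pixel coordinates to visual angle relative to the screen center.

    Positive angles point right and down, matching the screen coordinates.
    """
    # Physical offset from the center; pixel indices run from 0 to hres-1.
    dx = (x - (hres - 1) / 2.0) * width / hres
    dy = (y - (vres - 1) / 2.0) * height / vres
    x_deg = math.degrees(math.atan2(dx, distance))
    y_deg = math.degrees(math.atan2(dy, distance))
    return x_deg, y_deg


# Made-up geometry values, for illustration only.
print(pixel_to_degrees(1279, 0, 1280, 720, 0.45, 0.40, 0.225))
```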
The confidence value can be either 1.0 (eye tracked) or 0.0 (eye not tracked, for example during a blink). Blinks are typically bracketed by vertical "eye movements" because of partial lid occlusion; gaze samples around blinks where velocity was greater than 37.5 deg/s were removed.
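Since untracked samples carry no usable position, a common first step is to split a recording into runs of tracked samples. A minimal sketch, assuming samples are held as (timestamp, x, y, confidence) tuples; the function name `tracked_segments` is hypothetical and the example samples are made up.

```python
from typing import List, Tuple

Sample = Tuple[int, float, float, float]  # timestamp_us, x, y, confidence


def tracked_segments(samples: List[Sample]) -> List[List[Sample]]:
    """Split a recording into runs of consecutive samples with confidence 1.0."""
    segments: List[List[Sample]] = []
    current: List[Sample] = []
    for sample in samples:
        if sample[3] >= 0.5:  # confidence is either 1.0 or 0.0
            current.append(sample)
        elif current:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


# Made-up samples with one untracked stretch in the middle.
demo = [
    (0, 640.0, 360.0, 1.0),
    (4000, 641.0, 360.0, 1.0),
    (8000, 0.0, 0.0, 0.0),
    (12000, 650.0, 365.0, 1.0),
]
print([len(seg) for seg in tracked_segments(demo)])
```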
The .zip file contains the following subdirectories:
This directory contains data from Experiment 1.
The naming convention is straightforward:
<three letter subject id>_<movie>.coord
Male subjects have subject IDs C1K, CCE, CCU, FTD, GGM, KKD, M1K, MBS; all other subjects were female.
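The naming convention and the list of male subject IDs can be combined in a small helper. This is a sketch only: the function name `parse_experiment1_name` and the movie name in the example are hypothetical. It splits at the first underscore, on the assumption that movie names may themselves contain underscores.

```python
MALE_SUBJECTS = {"C1K", "CCE", "CCU", "FTD", "GGM", "KKD", "M1K", "MBS"}


def parse_experiment1_name(filename: str) -> dict:
    """Split '<subject>_<movie>.coord' and look up the subject's sex."""
    if not filename.endswith(".coord"):
        raise ValueError(f"not a .coord file: {filename}")
    stem = filename[: -len(".coord")]
    # Split at the first underscore only; the movie name keeps any others.
    subject, sep, movie = stem.partition("_")
    if len(subject) != 3 or not sep or not movie:
        raise ValueError(f"unexpected file name: {filename}")
    sex = "male" if subject in MALE_SUBJECTS else "female"
    return {"subject": subject, "movie": movie, "sex": sex}


# 'some_movie' is a made-up movie name.
print(parse_experiment1_name("CCE_some_movie.coord"))
```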
This directory contains the "Hollywood trailers" part of Experiment 2. Two trailers were shown 10 times each (over two consecutive days). The naming convention is
<four letter subject id>_rep_<repetition number, 1-10>_<movie>.coord
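File names of this form can be taken apart with a regular expression. A minimal sketch; the pattern, the function name `parse_repetition_name`, and the example file name are assumptions for illustration, and the sketch assumes that subject IDs contain no underscore.

```python
import re

REP_NAME = re.compile(
    r"^(?P<subject>[^_]{4})_rep_(?P<rep>\d+)_(?P<movie>.+)\.coord$"
)


def parse_repetition_name(filename: str):
    """Return (subject, repetition, movie) for a repetition file name."""
    match = REP_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename}")
    rep = int(match.group("rep"))
    if not 1 <= rep <= 10:  # each trailer was shown 10 times
        raise ValueError(f"repetition out of range: {rep}")
    return match.group("subject"), rep, match.group("movie")


# 'ABCD' and 'some_trailer' are made-up names.
print(parse_repetition_name("ABCD_rep_3_some_trailer.coord"))
```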
This directory contains the remaining data from Experiment 2, repetitive presentation of natural movies (same movies as in Experiment 1). Naming convention is as in hollywood_movies_gaze.
This directory contains the "stop-motion" data from Experiment 3. Naming convention is as above, with a 3-letter subject identifier. The corresponding movies have the same names as the natural movies, but should be taken from the stop-motion .zip above.
This directory contains data for the static image presentations from Experiment 3. In this part, every 90th frame was taken from the natural movies and presented in a randomized (over all movies) sequence. The naming convention is
<three letter subject id>_<movie>.<frame number>.coord
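To recover the source movie and frame number from such a file name, a regular expression is sufficient. The sketch below is illustrative only: the pattern, the function name `parse_static_name`, and the example file name are assumptions, and the sketch assumes the frame number is written in decimal digits.

```python
import re

STATIC_NAME = re.compile(
    r"^(?P<subject>[^_]{3})_(?P<movie>.+)\.(?P<frame>\d+)\.coord$"
)


def parse_static_name(filename: str):
    """Return (subject, movie, frame number) for a static image file name."""
    match = STATIC_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename}")
    return match.group("subject"), match.group("movie"), int(match.group("frame"))


# 'ABC' and 'some_movie' are made-up names.
print(parse_static_name("ABC_some_movie.90.coord"))
```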
Subjects in Experiment 3 saw both stop-motion versions and static images from all 18 natural movies (blocked in movies 1-9 as stop-motion, 10-18 as static, ...). Because subjects were therefore already familiar with the second set of stimuli, we calculated variability only on the first half of the data set (directories stop_motion_gaze and static_images_gaze above). The remaining gaze recordings may still be of interest and are therefore provided here as well.