Machine Perception Laboratory

The MPLab GENKI Database


The MPLab GENKI Database is an expanding database of face images spanning a wide range of illumination conditions, geographical locations, personal identities, and ethnicities. Each release contains all images from the previous release, and so is guaranteed to be backward compatible. The database is divided into overlapping subsets, each with its own labels and descriptions. For example, the GENKI-4K subset contains 4000 face images labeled as either “smiling” or “non-smiling” by human coders. The pose of the faces is approximately frontal, as determined by our automatic face detector. The GENKI-SZSL subset contains 3500 face images labeled with face location and size. The images are available for public use.

The current release of the GENKI database is GENKI-R2009a. It contains 7172 unique image files, which
combine to form these subsets:

  • GENKI-4K: 4000 images with expression and head-pose labels.
  • GENKI-SZSL: 3500 images with face position and size labels.

Each image file has a unique name. Each subset consists of an “Images” text file and a “Labels” text file. The Images file has one image name per line. The corresponding line in the Labels file contains the data label for that file.

The specific type and meaning of each label are documented in each subset’s “README” file.
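
The line-by-line correspondence between the Images and Labels files can be sketched in Python as follows. This is only an illustration: the function names and the file paths passed to them are hypothetical, and the label lines are kept as raw strings because their fields differ per subset (see each subset’s README).

```python
from pathlib import Path
from typing import List, Tuple


def pair_images_with_labels(images_text: str, labels_text: str) -> List[Tuple[str, str]]:
    """Pair line i of an Images file with line i of the matching Labels file.

    Both arguments are the full text of the respective files. Label lines
    are returned unparsed, since their meaning depends on the subset.
    """
    image_names = [line.strip() for line in images_text.strip().splitlines()]
    label_lines = [line.strip() for line in labels_text.strip().splitlines()]
    if len(image_names) != len(label_lines):
        raise ValueError(
            f"{len(image_names)} image names but {len(label_lines)} label lines"
        )
    return list(zip(image_names, label_lines))


def load_subset(images_path: str, labels_path: str) -> List[Tuple[str, str]]:
    """Read both files from disk (paths are supplied by the caller) and pair them."""
    return pair_images_with_labels(
        Path(images_path).read_text(), Path(labels_path).read_text()
    )
```

The length check guards against pairing an image with the wrong label when one of the two files is truncated or edited by hand.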

Please give appropriate acknowledgment when you use this database and its subsets. Cite the author as “” and the title as “The MPLab GENKI Database,” plus relevant subsets as in the following examples [Bibtex]:

  •, The MPLab GENKI Database.
  •, The MPLab GENKI Database, GENKI-4K Subset.
  •, The MPLab GENKI Database, GENKI-SZSL Subset.


The current public release of The MPLab GENKI Database is available for download here.

Release History

GENKI-R2009a: Adds GENKI-SZSL subset to GENKI-4K.
GENKI-4K: Contains GENKI-4K subset.

GENKI-4K Face, Expression, and Pose Dataset


This section describes the legacy GENKI-4K dataset, which has been superseded by The MPLab GENKI Database.

The GENKI-4K dataset contains 4,000 face images spanning a wide range of subjects, facial appearance, illumination, geographical
locations, imaging conditions, and camera models. All images are labeled for both Smile content (1=smile, 0=non-smile)
and Head Pose (yaw, pitch, and roll parameters, in radians).

Please give appropriate acknowledgment when you use this dataset by including the following citation
in any publication that uses GENKI-4K:

  •, The MPLab GENKI Database, GENKI-4K Subset.


The public GENKI-4K dataset is available for download here.

The images are in JPEG format. The included README file describes the format of the file labels.txt,
also included, which contains the expression (smile/non-smile) and 3-D head pose labels.
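
As a rough illustration of working with these labels, the Python sketch below parses a single label line and converts the head-pose angles from radians to degrees. It assumes each line holds four whitespace-separated fields in the order smile flag, yaw, pitch, roll; that layout is an assumption made for this example, so check the README for the authoritative format. The names `Genki4kLabel` and `parse_label_line` are hypothetical.

```python
import math
from typing import NamedTuple


class Genki4kLabel(NamedTuple):
    smiling: bool      # True for 1 (smile), False for 0 (non-smile)
    yaw_deg: float     # head pose, converted from radians to degrees
    pitch_deg: float
    roll_deg: float


def parse_label_line(line: str) -> Genki4kLabel:
    """Parse one label line.

    ASSUMPTION: the line is "<smile> <yaw> <pitch> <roll>", separated by
    whitespace, with angles in radians. Verify against the README.
    """
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    smile_flag = int(fields[0])
    if smile_flag not in (0, 1):
        raise ValueError(f"smile flag must be 0 or 1, got {smile_flag}")
    yaw, pitch, roll = (float(value) for value in fields[1:])
    return Genki4kLabel(
        smiling=(smile_flag == 1),
        yaw_deg=math.degrees(yaw),
        pitch_deg=math.degrees(pitch),
        roll_deg=math.degrees(roll),
    )
```

Rejecting lines with an unexpected field count or smile flag makes a mismatch between the assumed and actual layout fail loudly rather than produce silently wrong labels.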

Problems, Questions and Comments should go to