Unconstrained Face Detection and Open Set Recognition Challenge


Alert: Test set released. Submit results by mail to opensetface@vast.uccs.edu by May 03 2017.

This webpage details Parts 2 and 3 of the IJCB 2017 Face Recognition Challenge. The complete details of Part 1 may be found here. Participants are welcome to take part in one or more of the three challenges.


What's Different?

In most face detection/recognition datasets, the majority of images are “posed”, i.e., the subjects know they are being photographed, and/or the images are selected for publication in public media. Hence, blurry, occluded and badly illuminated images are generally uncommon in these datasets. In addition, most of these challenges are closed-set, i.e., the list of subjects in the gallery is the same as the one used for testing.

This challenge explores more unconstrained data, by introducing the new UnConstrained College Students (UCCS) dataset, where subjects are photographed using a long-range high-resolution surveillance camera without their knowledge. Faces inside these images are of various poses, and varied levels of blurriness and occlusion. The challenge also creates an open set recognition problem, where unknown people will be seen during testing and must be rejected.

With this challenge, we hope to foster face detection and recognition research for surveillance applications, which are becoming more common and more in demand, and for which no automatic recognition algorithm has yet proven useful.


UnConstrained College Students (UCCS) Dataset

The UCCS dataset was collected over several months using a Canon 7D camera fitted with a Sigma 800mm F5.6 EX APO DG HSM lens, taking images at one frame per second during times when many students were walking on the sidewalk.



Example images of the UCCS dataset. Note that not a single face in these two images is frontal and without occlusion – some have small occlusion, others large; some have significant yaw and pitch angles; and many are blurred.


Images were captured on 20 different days between February 2012 and September 2013, covering various weather conditions such as sunny versus snowy days. The images also contain various occlusions caused by accessories and clothing such as sunglasses, winter caps and fur jackets, as well as by tree branches, poles, etc. To remove the potential bias of using automated face detection (which selects only easy faces), more than 70,000 face regions were hand-cropped. From these, we have labeled a total of 1732 identities. Each labeled sequence contains around 10 images. For approximately 20% of the identities, we have sequences from two or more days. Dataset images are in JPG format with an average size of 5184 × 3456 pixels.


Different poses and blurriness


We split the UCCS database into a training, a validation and a test set. In the training and validation sets, which are made accessible to the participants at the beginning of the competition, each image is annotated with a list of bounding boxes. Each bounding box is labeled either with an integral identity label or with the “unknown” label −1. In total, we will provide labels for 1000 different known identities, and around half of the faces in the dataset will be “unknown”. We provide two scripts to run the evaluation on the validation set, one for Part 2 and one for Part 3, so that participants can tune the meta-parameters of their algorithms on the validation data. We also provide open source baseline algorithms for both parts, based on Bob, that participants can compare against.


Sample Bounding Boxes




Download Sample DataSet
Note: The samples do not have the same image resolution as the images in the dataset.



Timeline


Challenge Announcement                          01/16/2017
Training & Validation Data, Baseline Release    01/23/2017
Registration Closing                            04/10/2017
Test Data Release                               04/18/2017
Submission Deadline                             05/03/2017
Submit Summary Paper                            05/20/2017



Dataset Download


File Formats

All files are given and expected in CSV format, possibly with comment lines starting with '#'.

Protocol File Formats

For the training and validation sets, the protocol files contain complete information about the faces in each image. Specifically, each line contains a unique number (FACE_ID), the image file name (FILE), an integral SUBJECT_ID (which is -1 for unknown identities) and the hand-labeled face bounding box (FACE_X, FACE_Y, FACE_WIDTH, FACE_HEIGHT).

Training/Validation Protocol

FACE_ID FILE SUBJECT_ID FACE_X FACE_Y FACE_WIDTH FACE_HEIGHT
6000000 IMG_5485_30.JPG 7000000 75.6 567.4 75.9 74.4
6000001 IMG_5485_30.JPG 7000001 294.4 531.6 74.4 81.8
6000002 IMG_5485_30.JPG 7000002 544.4 525.7 71.4 83.3
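
As a rough illustration of how such a protocol file might be read, the following sketch groups the annotated faces by image. It assumes comma-separated fields and a commented header line; neither detail is confirmed here, so check the released files. The function name read_protocol and the third sample row (an unknown face) are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample: the first two rows reuse the example values above,
# the third row (an unknown face with SUBJECT_ID -1) is made up.
# Comma delimiters and a commented header line are assumptions.
SAMPLE = """\
# FACE_ID,FILE,SUBJECT_ID,FACE_X,FACE_Y,FACE_WIDTH,FACE_HEIGHT
6000000,IMG_5485_30.JPG,7000000,75.6,567.4,75.9,74.4
6000001,IMG_5485_30.JPG,7000001,294.4,531.6,74.4,81.8
6000003,IMG_5485_30.JPG,-1,10.0,20.0,30.0,40.0
"""


def read_protocol(stream):
    """Group annotated faces by image file name, skipping '#' comments and blank lines."""
    faces = defaultdict(list)
    data_lines = (
        line for line in stream
        if line.strip() and not line.startswith("#")
    )
    for face_id, file_name, subject_id, x, y, w, h in csv.reader(data_lines):
        faces[file_name].append({
            "face_id": int(face_id),
            "subject_id": int(subject_id),  # -1 marks an unknown identity
            "bbox": (float(x), float(y), float(w), float(h)),
        })
    return dict(faces)


faces = read_protocol(io.StringIO(SAMPLE))
known = [f for f in faces["IMG_5485_30.JPG"] if f["subject_id"] != -1]
print(len(faces["IMG_5485_30.JPG"]), "faces,", len(known), "with a known identity")
```

For a real file, pass an opened file object (for example open(path, newline="")) instead of the in-memory sample.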

The test set protocol contains only a list of file names, i.e., no information about the faces in each image. In particular, some images do not contain any faces.

Test Protocol

FILE
000214cdae7b0687beab37b1fc102958.jpg
000438fc89e6a536919222dab0bf99b9.jpg
0006be8811958bd396baee0960fbe5b3.jpg


Score File Formats

Face detection score files must contain one detected bounding box per line. Each line should contain the FILE (same as in the protocol file), a bounding box (BB_X, BB_Y, BB_WIDTH, BB_HEIGHT) and a confidence score (DETECTION_SCORE). The confidence score can have any range, but a higher score must mean higher confidence. Since an image generally contains more than one face, there will usually be several lines per image.


Face Detection Score File

FILE BB_X BB_Y BB_WIDTH BB_HEIGHT DETECTION_SCORE
IMG_5485_30.JPG 549.51 550.47 54.89 65.86 26.25
IMG_5485_30.JPG 10.01 543.44 58.44 70.13 20.69
IMG_5485_30.JPG 1447.01 98.52 70.27 84.32 20.16
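
A detection score file in this layout could be produced along the following lines. This is only a sketch: the comma delimiter, the commented header line and the two-decimal formatting are assumptions, and write_detections is a hypothetical helper, not part of the baseline package.

```python
import csv
import io


def write_detections(stream, detections):
    """Write one line per detected bounding box.

    `detections` maps a file name to a list of ((x, y, width, height), score).
    """
    # Comment line documenting the columns (assumed to be allowed, since
    # comment lines starting with '#' are permitted).
    stream.write("# FILE,BB_X,BB_Y,BB_WIDTH,BB_HEIGHT,DETECTION_SCORE\n")
    writer = csv.writer(stream, lineterminator="\n")
    for file_name, boxes in detections.items():
        for (x, y, w, h), score in boxes:
            writer.writerow(
                [file_name] + [f"{value:.2f}" for value in (x, y, w, h, score)]
            )


# Values taken from the example rows above.
detections = {
    "IMG_5485_30.JPG": [
        ((549.51, 550.47, 54.89, 65.86), 26.25),
        ((10.01, 543.44, 58.44, 70.13), 20.69),
    ],
}
buffer = io.StringIO()
write_detections(buffer, detections)
lines = buffer.getvalue().splitlines()
print("\n".join(lines))
```

An image without any detected face simply contributes no lines.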

The face recognition score file is an extension of the face detection score file. After the bounding box and detection score described above, each line carries a list of (SUBJECT_ID, RECOGNITION_SCORE) pairs. We accept up to 10 pairs, which allows us to compute detection and identification rate curves up to rank 10. Only faces labeled with a known SUBJECT_ID in the validation set protocol file are of interest. Unknown faces (i.e., faces that have SUBJECT_ID -1 in the protocol file) can either be labeled with -1 or be left without any SUBJECT_ID (i.e., no (SUBJECT_ID, RECOGNITION_SCORE) pair follows the DETECTION_SCORE). A mis-detection (i.e., a background region) that is labeled with -1, or not labeled at all, does not count as an error. Any background region or unknown face that is labeled with a SUBJECT_ID other than -1 increases the number of false alarms (see Evaluation below).

If you plan to participate in both challenges, the face recognition score file can be used to evaluate both the detection and the recognition experiment, so only one score file needs to be submitted in this case.

Face Recognition Score File

FILE BB_X BB_Y BB_WIDTH BB_HEIGHT DETECTION_SCORE SUBJECT_ID_1 RECOGNITION_SCORE_1 SUBJECT_ID_2 RECOGNITION_SCORE_2 SUBJECT_ID_3 RECOGNITION_SCORE_3 ... ...
003b27f9e65f4da9847186cc041ba0ca.jpg 2511.55 3154.04 202.72 243.26 23.86 1120 -61.81 1217 -62.19 199 -61.73 41 -62.41
003b27f9e65f4da9847186cc041ba0ca.jpg 960.43 1990.14 270.26 324.32 17.69 -1 -55.98
003b27f9e65f4da9847186cc041ba0ca.jpg 3811.96 1486.11 227.54 273.05 13.05 937 -57.91 -1 -66.51
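
One way to assemble a single recognition score line is sketched below. The comma delimiter and the number formatting are assumptions, and recognition_row is a hypothetical helper. The sketch keeps only the first 10 pairs it is given, so the caller decides which pairs matter.

```python
MAX_PAIRS = 10  # up to 10 (SUBJECT_ID, RECOGNITION_SCORE) pairs are accepted


def recognition_row(file_name, bbox, detection_score, pairs=()):
    """Build one score line: FILE, bounding box, DETECTION_SCORE, then ID/score pairs.

    `pairs` is a sequence of (subject_id, recognition_score); leave it empty
    for a face without any assigned SUBJECT_ID.
    """
    fields = [file_name]
    fields += [f"{value:.2f}" for value in bbox]
    fields.append(f"{detection_score:.2f}")
    for subject_id, score in list(pairs)[:MAX_PAIRS]:
        fields += [str(subject_id), f"{score:.2f}"]
    return ",".join(fields)  # comma delimiter is an assumption


# Values taken from the second example row above: a face labeled as unknown.
unknown_row = recognition_row(
    "003b27f9e65f4da9847186cc041ba0ca.jpg",
    (960.43, 1990.14, 270.26, 324.32),
    17.69,
    [(-1, -55.98)],
)
# The same detection without any (SUBJECT_ID, RECOGNITION_SCORE) pair.
bare_row = recognition_row(
    "003b27f9e65f4da9847186cc041ba0ca.jpg",
    (960.43, 1990.14, 270.26, 324.32),
    17.69,
)
print(unknown_row)
print(bare_row)
```

Both rows treat the face as unknown; the second form doubles as a plain detection line.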

Baseline

The baseline face detection and face recognition experiments are published as an open-source package written in Python, using the signal processing and machine learning toolbox Bob. You can download the Baseline package from PyPI.

Face Detection Baseline

The baseline face detector simply uses Bob's built-in face detector, which is neither optimized for blurry faces nor for profiles.
If you do not wish to run the baseline face detector, you can download the resulting Baseline face detection score file.

Face Recognition Baseline

For face recognition, we simply adopt a PCA+LDA pipeline on top of local binary pattern histogram sequence (LBPHS) features. The PCA+LDA projection matrix is estimated from the faces in the training set. For each person, the images of the training set form one class. Open-set recognition is performed by placing all training faces of unknown identities in a separate class.
First, the faces in the training images are re-detected to ensure that the bounding boxes of training and test images have similar content. Then, the faces are rescaled and cropped to a resolution of 64x80 pixels. Afterwards, LBPHS features are extracted from these images, and a PCA+LDA projection matrix is computed. All training features are projected into the PCA+LDA subspace. For each identity (including the unknown identity -1), the average of the projected features is stored as a template.
During testing, all faces in each image are detected and cropped, and LBPHS features are extracted. These probe features are projected into the same PCA+LDA subspace and compared to all templates using Euclidean distance. For each detected face, the 10 identities with the smallest distances are obtained; if identity -1 is among them, all identities that are less similar than -1 are discarded.
If you do not wish to run the baseline face recognition system, you can download the resulting Baseline face recognition score file.
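
The template and ranking steps described above can be sketched in plain Python as follows. This is not the baseline code: feature extraction and the PCA+LDA projection are left out, the inputs are assumed to be already-projected feature vectors, and the function names and toy data are hypothetical. Using the negative distance as the score is also an assumption, chosen so that higher means more similar.

```python
import math


def build_templates(features_by_identity):
    """Average the projected training features of each identity (including -1)."""
    templates = {}
    for identity, features in features_by_identity.items():
        dim = len(features[0])
        templates[identity] = [
            sum(f[i] for f in features) / len(features) for i in range(dim)
        ]
    return templates


def rank_identities(probe, templates, max_rank=10):
    """Return up to `max_rank` (identity, score) pairs, cut off after identity -1."""
    by_distance = sorted(
        (math.dist(probe, template), identity)
        for identity, template in templates.items()
    )
    ranking = []
    for distance, identity in by_distance[:max_rank]:
        ranking.append((identity, -distance))  # negative distance: higher is better
        if identity == -1:
            break  # identities less similar than "unknown" are discarded
    return ranking


# Toy 2-D "projected" features; identity -1 collects all unknown training faces.
training = {
    1: [[0.0, 0.0], [2.0, 0.0]],
    2: [[10.0, 0.0]],
    -1: [[4.0, 0.0]],
}
templates = build_templates(training)
ranking = rank_identities([2.0, 0.0], templates)
print(ranking)
```

In the toy data, identity 2 is farther from the probe than the unknown template, so it does not appear in the ranking.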

Evaluation

The evaluation will use the Free-response Receiver Operating Characteristic (FROC) curve to evaluate the face detection experiments, and the Detection and Identification Rate (DIR) curve at rank 1 to evaluate open set face recognition. Implementations of the two evaluation scripts for the validation set are provided in the Baseline package. Please refer to this package for more details about the evaluation.
For comparison, the FROC and DIR plots of the baseline are:

Evaluation results on validation set
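
To give a feel for what a single point on a rank-1 DIR curve measures, here is a simplified sketch. It is not the evaluation script from the Baseline package, and it assumes that detections have already been matched to ground-truth faces (the matching criterion is not covered here). The function dir_point, its arguments and the toy results are all hypothetical.

```python
def dir_point(results, num_known_faces, threshold):
    """Return (false_alarms, detection_and_identification_rate) at one threshold.

    `results` holds one (true_id, predicted_id, score) tuple per detected box:
    true_id is the ground-truth SUBJECT_ID, -1 for an unknown face and None
    for a background region; predicted_id is the rank-1 SUBJECT_ID (or -1).
    `num_known_faces` counts all known faces in the ground truth, including
    those that were never detected.
    """
    # Known faces whose rank-1 identity is correct and scored above the threshold.
    correct = sum(
        1 for true_id, predicted_id, score in results
        if true_id is not None and true_id != -1
        and predicted_id == true_id and score >= threshold
    )
    # Unknown faces or background regions assigned a SUBJECT_ID other than -1.
    false_alarms = sum(
        1 for true_id, predicted_id, score in results
        if (true_id is None or true_id == -1)
        and predicted_id != -1 and score >= threshold
    )
    rate = correct / num_known_faces if num_known_faces else 0.0
    return false_alarms, rate


# Toy results: one correct match, one wrong identity, two unknown faces and
# one background detection. A third known face was missed by the detector.
results = [
    (5, 5, -10.0),
    (7, 3, -12.0),
    (-1, 5, -20.0),
    (None, 2, -11.0),
    (-1, -1, -9.0),
]
print(dir_point(results, num_known_faces=3, threshold=-15.0))
print(dir_point(results, num_known_faces=3, threshold=-25.0))
```

Sweeping the threshold over all observed scores would trace out the curve. Lowering it can only add false alarms and correct identifications, never remove them.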

Contact Us

opensetface@vast.uccs.edu

Organization Team

Dr. Terrance E. Boult
Dr. Manuel Günther
Akshay Raj Dhamija