2nd Unconstrained Face Detection and Open Set Recognition Challenge


Held in conjunction with the workshop on Interactive and Adaptive Learning in an Open World at ECCV 2018

What's Different?

In most face detection/recognition datasets, the majority of images are “posed”, i.e. the subjects know they are being photographed, and/or the images are selected for publication in public media. Hence, blurry, occluded and badly illuminated images are generally uncommon in these datasets. In addition, most of these challenges are closed-set, i.e. the list of subjects in the gallery is the same as the one used for testing.

This challenge explores more unconstrained data by introducing the new UnConstrained College Students (UCCS) dataset, in which subjects are photographed without their knowledge using a long-range, high-resolution surveillance camera. Faces in these images show various poses and varying levels of blurriness and occlusion. The challenge also poses an open set recognition problem, in which unknown people appear during testing and must be rejected.

With this challenge, we hope to foster face detection and recognition research towards surveillance applications, which are becoming more popular and more in demand, and for which no automatic recognition algorithm has yet proven useful.


UnConstrained College Students (UCCS) Dataset

The UCCS dataset was collected over several months using a Canon 7D camera fitted with a Sigma 800mm F5.6 EX APO DG HSM lens, taking images at one frame per second during times when many students were walking on the sidewalk.



Example images of the UCCS dataset. Note that not a single face in these two images is frontal and without occlusion – some have small occlusion, others large; some have significant yaw and pitch angles; and many are blurred.


Images were captured on 20 different days between February 2012 and September 2013, covering various weather conditions such as sunny and snowy days. The faces show various occlusions caused by accessories such as sunglasses, winter caps and fur jackets, as well as by objects such as tree branches and poles. To remove the potential bias of using automated face detection (which selects only easy faces), more than 70,000 face regions were hand-cropped. From these, we have labeled a total of 1732 identities. Each labeled sequence contains around 10 images. For approximately 20% of the identities, we have sequences from two or more days. Dataset images are in JPG format with an average size of 5184 × 3456 pixels.


Different poses and blurriness


We split the UCCS database into a training, a validation and a test set. In the training and validation sets, which are made accessible to the participants at the beginning of the competition, each image is annotated with a list of bounding boxes. Each bounding box is labeled either with an integer identity label or with the “unknown” label −1. In total, we will provide labels for 1000 different known identities, and around half of the faces in the dataset will be “unknown”. We provide two scripts to run the evaluation on the validation set for part 2 and part 3, respectively, so that participants can tune the meta-parameters of their algorithms on the validation data. We also provide open-source baseline algorithms for both parts, based on Bob, that participants can compare against.


Sample Bounding Boxes




Download Sample DataSet
Note: The sample images do not have the same resolution as the images in the full dataset.



Timeline


Pre-Release Dataset: Available
Challenge Data Release: 08/01/2018
Submission deadline: 08/19/2018



Dataset Download


File Formats

All files are given / expected in CSV format, possibly with comment lines starting with '#'.

Protocol File Formats

For the training and validation sets, protocol files contain the complete information about the faces contained in each image. Specifically, they contain a unique number (FACE_ID), the image file name, an integer SUBJECT_ID (which is -1 for unknown identities) and the hand-labeled face bounding box (FACE_X, FACE_Y, FACE_WIDTH, FACE_HEIGHT).

Training/Validation Protocol

FACE_ID  FILE             SUBJECT_ID  FACE_X  FACE_Y  FACE_WIDTH  FACE_HEIGHT
6000000  IMG_5485_30.JPG  7000000     75.6    567.4   75.9        74.4
6000001  IMG_5485_30.JPG  7000001     294.4   531.6   74.4        81.8
6000002  IMG_5485_30.JPG  7000002     544.4   525.7   71.4        83.3
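
As a rough illustration of how a protocol file in this format might be read, the following sketch groups the labeled faces by image file. It assumes comma-separated fields and tolerates a header row whether or not it is commented out; the function name `read_protocol` and the sample rows are hypothetical and not part of the Baseline package.

```python
import csv
import io
from collections import defaultdict


def read_protocol(lines):
    """Group labeled faces by image file name.

    `lines` is any iterable of text lines, e.g. an open file.
    """
    # Skip blank lines and '#' comment lines before CSV parsing.
    data = (ln for ln in lines if ln.strip() and not ln.lstrip().startswith("#"))
    faces = defaultdict(list)
    for row in csv.reader(data):
        if row[0].strip() == "FACE_ID":
            continue  # tolerate an uncommented header row
        face_id, file_name, subject_id = int(row[0]), row[1].strip(), int(row[2])
        x, y, width, height = (float(v) for v in row[3:7])
        faces[file_name].append(
            {"face_id": face_id, "subject_id": subject_id, "box": (x, y, width, height)}
        )
    return faces


# Illustrative rows only; the values are made up for this example.
sample = io.StringIO(
    "# FACE_ID,FILE,SUBJECT_ID,FACE_X,FACE_Y,FACE_WIDTH,FACE_HEIGHT\n"
    "1,example.jpg,42,75.6,567.4,75.9,74.4\n"
    "2,example.jpg,-1,294.4,531.6,74.4,81.8\n"
)
faces = read_protocol(sample)
known = [f for f in faces["example.jpg"] if f["subject_id"] != -1]
print(len(faces["example.jpg"]), "faces,", len(known), "with a known identity")
```

Grouping by file name is convenient because both the detection and the recognition evaluation work on all faces of one image at a time.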

The test set protocol contains only a list of file names, i.e., without any information about the faces contained in the images. In particular, some images will not contain any faces.

Test Protocol

FILE
000214cdae7b0687beab37b1fc102958.jpg
000438fc89e6a536919222dab0bf99b9.jpg
0006be8811958bd396baee0960fbe5b3.jpg


Score File Formats

Face detection score files need to contain one detected bounding box per line. Specifically, each line should contain the FILE (same as in the protocol file), a bounding box (BB_X, BB_Y, BB_WIDTH, BB_HEIGHT) and a confidence score (DETECTION_SCORE). The confidence score can have any range, but higher scores must indicate higher confidence. Since an image generally contains more than one face, there will usually be several lines per image.


Face Detection Score File

FILE             BB_X     BB_Y    BB_WIDTH  BB_HEIGHT  DETECTION_SCORE
IMG_5485_30.JPG  549.51   550.47  54.89     65.86      26.25
IMG_5485_30.JPG  10.01    543.44  58.44     70.13      20.69
IMG_5485_30.JPG  1447.01  98.52   70.27     84.32      20.16
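
A detection score file in this layout could be produced along the following lines. This is a minimal sketch that assumes comma-separated output and writes the column names as a '#' comment line; the function name `write_detection_scores` and the two-decimal formatting are choices made for this example, not requirements of the challenge.

```python
import csv
import io


def write_detection_scores(detections, stream):
    """Write one line per detected bounding box.

    `detections` yields (file_name, (x, y, width, height), score) tuples.
    """
    # Comment lines starting with '#' are allowed, so the header is written as one.
    stream.write("# FILE,BB_X,BB_Y,BB_WIDTH,BB_HEIGHT,DETECTION_SCORE\n")
    writer = csv.writer(stream, lineterminator="\n")
    for file_name, (x, y, width, height), score in detections:
        values = [f"{v:.2f}" for v in (x, y, width, height, score)]
        writer.writerow([file_name, *values])


# Several detections for the same image simply become several lines.
buffer = io.StringIO()
write_detection_scores(
    [
        ("example.jpg", (549.51, 550.47, 54.89, 65.86), 26.25),
        ("example.jpg", (10.01, 543.44, 58.44, 70.13), 20.69),
    ],
    buffer,
)
print(buffer.getvalue())
```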

The face recognition score file is an extension of the face detection score file. In addition to the bounding box and detection score described above, each line may contain a list of (SUBJECT_ID, RECOGNITION_SCORE) pairs. We accept up to 10 pairs per line, which allows us to compute detection and identification rate curves up to rank 10. Please note that only the faces that are labeled with a known SUBJECT_ID in the validation set protocol file are of interest. Unknown faces (i.e., faces that have SUBJECT_ID -1 in the protocol file) should either be labeled with -1 or be given no SUBJECT_ID at all (i.e., no (SUBJECT_ID, RECOGNITION_SCORE) pair after the DETECTION_SCORE). A misdetection (i.e., a background region) that is labeled with -1 or not labeled at all does not count as an error. Any background region or unknown face that is labeled with a SUBJECT_ID other than -1 increases the number of false alarms (see Evaluation below). If you plan to participate in both challenges, the face recognition score file can be used to evaluate both the detection and the recognition experiment, so only one score file needs to be submitted.

Face Recognition Score File

FILE BB_X BB_Y BB_WIDTH BB_HEIGHT DETECTION_SCORE SUBJECT_ID_1 RECOGNITION_SCORE_1 SUBJECT_ID_2 RECOGNITION_SCORE_2 SUBJECT_ID_3 RECOGNITION_SCORE_3 ... ...
003b27f9e65f4da9847186cc041ba0ca.jpg 2511.55 3154.04 202.72 243.26 23.86 1120 -61.81 1217 -62.19 199 -61.73 41 -62.41
003b27f9e65f4da9847186cc041ba0ca.jpg 960.43 1990.14 270.26 324.32 17.69 -1 -55.98
003b27f9e65f4da9847186cc041ba0ca.jpg 3811.96 1486.11 227.54 273.05 13.05 937 -57.91 -1 -66.51
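
The variable-length rows of the recognition score file can be assembled as in the sketch below. The helper name `recognition_row` is hypothetical. Sorting the pairs by descending score is an assumption made here for readability; the text above only requires that at most 10 pairs are given.

```python
MAX_PAIRS = 10  # at most 10 (SUBJECT_ID, RECOGNITION_SCORE) pairs are accepted


def recognition_row(file_name, box, detection_score, candidates):
    """Build one score-file row for a detected face.

    `candidates` is a list of (subject_id, recognition_score) pairs; an empty
    list leaves the face unlabeled, which is allowed for unknown faces.
    """
    # Keep the best-scoring pairs only (ordering is an assumption of this sketch).
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)[:MAX_PAIRS]
    row = [file_name, *box, detection_score]
    for subject_id, score in ranked:
        row += [subject_id, score]
    return row


# A face with two candidate identities, and a face rejected as unknown.
print(recognition_row("example.jpg", (10.0, 20.0, 50.0, 60.0), 23.86, [(41, -62.41), (1120, -61.81)]))
print(recognition_row("example.jpg", (90.0, 20.0, 50.0, 60.0), 17.69, []))
```

Each returned list can be passed to `csv.writer.writerow` to produce the final file.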

Baseline

The baseline face detection and face recognition experiments use the MTCNN-v2 and VGG-v2 detection and recognition pipeline, as implemented in an open-source Python package. Parts of this package use the signal processing and machine learning toolbox Bob. You can download the Baseline package from PyPI.

Face Detection Baseline

The baseline face detector simply uses the pre-trained MTCNN-v2 detector models, with the Caffe/Python implementation adapted from http://github.com/walkoncross/mtcnn-caffe-zyf. Since the detector is not optimized for blurry, occluded, or full profile faces, we had to lower the three detection thresholds to (0.1, 0.2, 0.2). If you do not wish to run the baseline face detector, you can download the resulting Baseline face detection score file.

Face Recognition Baseline

For face recognition, we use the VGG v2 face recognition pipeline. We use the pre-trained Squeeze and Excite VGG v2 network and extract the features from the 'pool5/7x7_s1' layer. For each person, the features of the training set are averaged to build a template of that person. Open-set recognition is handled by averaging all training features of unknown identities into a separate template, and by building another template from features extracted from background detections of the MTCNN detector.
First, the faces in the training images are re-detected to ensure that the bounding boxes of training and test images have similar content. Then, the faces are rescaled and cropped to a resolution of 224x224 pixels. Afterwards, features are extracted using the VGG v2 network. For each identity (including the unknown identity -1 and background detections -2), the average of the features is stored as a template.
During testing, all faces in each image are detected and cropped, and their features are extracted. These probe features are compared to all templates using cosine similarity. For each detected face, the 10 most similar identities are obtained. If the unknown identity -1 or the background identity is among them, all less similar identities are discarded.
If you do not wish to run the baseline face recognition system, you can download the resulting Baseline face recognition score file.
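
The enrollment and scoring steps described above can be sketched in a few lines of plain Python. This is not the Baseline package's code: the function names are hypothetical, the two-dimensional feature vectors stand in for the real network features, and the background label value is an assumption taken from the enrollment description.

```python
import math

UNKNOWN = -1
BACKGROUND = -2  # assumed label for the background template


def enroll(features_by_id):
    """Average all feature vectors of each identity into one template."""
    return {
        subject_id: [sum(column) / len(features) for column in zip(*features)]
        for subject_id, features in features_by_id.items()
    }


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def rank_identities(probe, templates, max_rank=10):
    """Return up to `max_rank` (subject_id, similarity) pairs, most similar first."""
    scored = sorted(
        ((cosine_similarity(probe, t), sid) for sid, t in templates.items()),
        reverse=True,
    )
    result = []
    for similarity, subject_id in scored[:max_rank]:
        result.append((subject_id, similarity))
        if subject_id in (UNKNOWN, BACKGROUND):
            break  # identities less similar than a reject template are dropped
    return result


# Toy features: two known identities plus the unknown template.
templates = enroll({1: [[1.0, 0.0], [1.0, 0.2]], 2: [[0.0, 1.0]], UNKNOWN: [[1.0, 1.0]]})
print(rank_identities([1.0, 0.1], templates))
```

In this toy example identity 2 is less similar to the probe than the unknown template, so it does not appear in the ranked list.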

Evaluation

The evaluation will use the Free-response Receiver Operating Characteristic (FROC) curve to evaluate the face detection experiments, and the Detection and Identification Rate (DIR) curve at rank 1 to evaluate open set face recognition. Based on lessons from our first challenge, both plots show the total number of False Alarms or False Identifications, respectively, on a logarithmic x-axis, and the Detection Rate or Detection and Identification Rate, respectively, on the y-axis. The dotted gray line represents an equal number of false and correct detections or identifications, respectively. An implementation of the two evaluation scripts for the validation set is provided in the Baseline package. Please refer to this package for more details about the evaluation.
For comparison, the FROC and DIR plots of the baseline are:

Evaluation results on validation set
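
To make the axes of the FROC plot concrete, the following sketch computes its points from a list of scored detections. It assumes that each detection has already been matched against the hand-labeled bounding boxes and marked as correct or not; the matching criterion is not described here and is left to the evaluation scripts of the Baseline package. The function name `froc_points` is hypothetical.

```python
def froc_points(detections, num_labeled_faces):
    """Compute (false_alarms, detection_rate) points for decreasing thresholds.

    `detections` is a list of (score, is_correct) pairs, where `is_correct`
    states whether the detection matches a labeled face.
    """
    ordered = sorted(detections, key=lambda d: d[0], reverse=True)
    points = []
    correct = false_alarms = 0
    for index, (score, is_correct) in enumerate(ordered):
        if is_correct:
            correct += 1
        else:
            false_alarms += 1
        # Emit a point only once all detections sharing this score are counted.
        is_last = index + 1 == len(ordered)
        if is_last or ordered[index + 1][0] != score:
            points.append((false_alarms, correct / num_labeled_faces))
    return points


# Three detections in images that contain four labeled faces in total.
print(froc_points([(0.9, True), (0.8, False), (0.7, True)], num_labeled_faces=4))
```

Plotting the first element of each point on a logarithmic x-axis and the second on the y-axis gives a curve of the kind described above.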

Contact Us

opensetface@vast.uccs.edu

Organization Team

Dr. Terrance E. Boult
Dr. Manuel Günther
Akshay Raj Dhamija