Caffe models for Networks developed at the VAST lab

Here is a list of models, along with the publications that describe them. All models were developed at the Vision and Security Technology (VAST) lab at the University of Colorado Colorado Springs (UCCS). The models are provided under a BSD-3 license; please see the LICENSE file for details.

All of our models require Caffe. They should work with the stock Caffe version unless they require special layers, which are available on request (mgunther@vast.uccs.edu). We store our models as caffemodel.h5 files, as we find the HDF5 interface to be more stable across versions of Caffe.
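As a minimal sketch, such a model can be loaded with pycaffe just like a binary .caffemodel file, provided Caffe was compiled with HDF5 support. The file names below are placeholders, not actual files from this repository:

import caffe

# HDF5 weight files (*.caffemodel.h5) are passed to caffe.Net in the same way
# as binary *.caffemodel files; both file names here are placeholders.
net = caffe.Net('deploy.prototxt', 'model.caffemodel.h5', caffe.TEST)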

MOON

The Mixed-Objective Optimization Network (MOON) was published in the paper: MOON: A Mixed Objective Optimization Network for the Recognition of Facial Attributes. If you use the MOON network in your research, please cite:

@inproceedings{rudd2016moon,
  title = {{MOON}: A Mixed Objective Optimization Network for the Recognition of Facial Attributes},
  author = {Rudd, Ethan M. and G\"unther, Manuel and Boult, Terrance E.},
  booktitle = {European Conference on Computer Vision (ECCV)},
  editor = {Leibe, Bastian and Matas, Jiri and Sebe, Nicu and Welling, Max},
  pages = {19--35},
  publisher = {Springer},
  series = {Lecture Notes in Computer Science},
  year = 2016
}

The MOON network is built on top of the VGG-16 topology and takes a 178 × 218 pixel color image in RGB format, where pixel values are normalized to the [0, 1] range. The image should be aligned using (hand-labeled) eye locations, such that the eyes are placed at locations xr = (69, 112) and xl = (109, 112).
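A minimal alignment sketch is shown below, assuming the eye coordinates in the source image are already known and that scikit-image is available; it only illustrates the idea and is not the exact alignment procedure used in the paper:

import numpy as np
from skimage import io, transform

def align_for_moon(image_path, right_eye, left_eye):
    # right_eye and left_eye are (x, y) coordinates in the source image
    src = np.array([right_eye, left_eye], dtype=float)
    dst = np.array([[69, 112], [109, 112]], dtype=float)  # target eye locations
    tform = transform.SimilarityTransform()
    tform.estimate(src, dst)
    image = io.imread(image_path)  # RGB image
    # warp maps the image into the 218 (rows) x 178 (columns) target frame and
    # also converts the pixel values to floats in the [0, 1] range.
    return transform.warp(image, tform.inverse, output_shape=(218, 178))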

We provide the Network Topology and the pre-trained Balanced Model. If you like, you can also get the pre-trained Unbalanced Model.

The output of either of the networks is a vector of 40 floating point values (called fc8), one for each attribute. The order of the attributes is the same as in Figure 2 of the MOON paper, i.e., alphabetically sorted. Specifically, the attributes are:

5_o_Clock_Shadow, Arched_Eyebrows, Attractive, Bags_Under_Eyes, Bald, Bangs, Big_Lips, Big_Nose, Black_Hair, Blond_Hair, Blurry, Brown_Hair, Bushy_Eyebrows, Chubby, Double_Chin, Eyeglasses, Goatee, Gray_Hair, Heavy_Makeup, High_Cheekbones, Male, Mouth_Slightly_Open, Mustache, Narrow_Eyes, No_Beard, Oval_Face, Pale_Skin, Pointy_Nose, Receding_Hairline, Rosy_Cheeks, Sideburns, Smiling, Straight_Hair, Wavy_Hair, Wearing_Earrings, Wearing_Hat, Wearing_Lipstick, Wearing_Necklace, Wearing_Necktie, Young

Generally, negative values stand for the absence of an attribute, while positive values predict its presence. Scores with a higher absolute value usually indicate a higher certainty of the prediction.
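The following sketch shows how the attribute scores could be read out and interpreted with pycaffe. The file names and the 'data' input blob name are assumptions; the fc8 output name and the attribute order are taken from above, and the aligned image comes from the alignment sketch above:

import numpy as np
import caffe

ATTRIBUTES = [
    '5_o_Clock_Shadow', 'Arched_Eyebrows', 'Attractive', 'Bags_Under_Eyes', 'Bald',
    'Bangs', 'Big_Lips', 'Big_Nose', 'Black_Hair', 'Blond_Hair', 'Blurry',
    'Brown_Hair', 'Bushy_Eyebrows', 'Chubby', 'Double_Chin', 'Eyeglasses',
    'Goatee', 'Gray_Hair', 'Heavy_Makeup', 'High_Cheekbones', 'Male',
    'Mouth_Slightly_Open', 'Mustache', 'Narrow_Eyes', 'No_Beard', 'Oval_Face',
    'Pale_Skin', 'Pointy_Nose', 'Receding_Hairline', 'Rosy_Cheeks', 'Sideburns',
    'Smiling', 'Straight_Hair', 'Wavy_Hair', 'Wearing_Earrings', 'Wearing_Hat',
    'Wearing_Lipstick', 'Wearing_Necklace', 'Wearing_Necktie', 'Young'
]

# Placeholder file names; the aligned image is produced by align_for_moon above.
net = caffe.Net('MOON_deploy.prototxt', 'MOON_balanced.caffemodel.h5', caffe.TEST)
aligned = align_for_moon('face.jpg', right_eye=(120, 150), left_eye=(180, 148))

# Caffe expects a (1, 3, height, width) blob, so move the channel axis to the front.
blob = aligned.transpose(2, 0, 1)[np.newaxis, ...].astype(np.float32)
net.blobs['data'].reshape(*blob.shape)
net.blobs['data'].data[...] = blob

scores = net.forward()['fc8'][0]  # 40 floating point attribute scores

# Positive score -> attribute predicted as present, negative -> absent.
for name, score in zip(ATTRIBUTES, scores):
    print('%-20s %7.2f  %s' % (name, score, 'present' if score > 0 else 'absent'))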

AFFACT

The Alignment-Free Facial Attribute Classification Technique (AFFACT) was introduced in the paper: AFFACT - Alignment-Free Facial Attribute Classification Technique. If you apply the AFFACT network in your research, please cite:

@inproceedings{guenther2017affact,
  title = {{AFFACT} - Alignment-Free Facial Attribute Classification Technique},
  author = {G\"unther, Manuel and Rozsa, Andras and Boult, Terrance E.},
  booktitle = {International Joint Conference on Biometrics (IJCB)},
  year = 2017
}

The AFFACT network is built on top of the ResNet-50 topology and takes a 224 × 224 pixel color image in RGB format. The face should be roughly centered in the image and should not be cropped too tightly, i.e., the crop should include some hair and other background.
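A rough preprocessing sketch for such a loose crop is given below, assuming a face bounding box from some detector and scikit-image; the expansion factor is an assumption, not the exact procedure from the paper:

import numpy as np
from skimage import io, transform

def crop_for_affact(image_path, bbox, scale=2.0):
    # bbox is (x, y, width, height) of a detected face; scale enlarges the
    # crop so that hair and some background are included.
    x, y, w, h = bbox
    cx, cy = x + w / 2.0, y + h / 2.0      # center of the face box
    half = max(w, h) * scale / 2.0         # half side length of the square crop
    image = io.imread(image_path)
    top, left = int(round(cy - half)), int(round(cx - half))
    bottom, right = int(round(cy + half)), int(round(cx + half))
    # Clip to the image borders; padding the crop would be cleaner but is omitted here.
    crop = image[max(top, 0):bottom, max(left, 0):right]
    # resize returns a float image in [0, 1]; check the deploy prototxt for the
    # exact pixel scaling and mean subtraction the model expects.
    return transform.resize(crop, (224, 224))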

We provide the Network Topology and the pre-trained Sigmoid Cross-Entropy Loss Model. If you like, you can also get the pre-trained Euclidean Loss Model.

The output of either of the networks is a vector of 40 floating point values (called attributes), one for each attribute. The order of the attributes is the same as in Figure 2 of the MOON paper, i.e., alphabetically sorted. Generally, negative values stand for the absence of an attribute, while positive values predict its presence. Scores with a higher absolute value usually indicate a higher certainty of the prediction.