5 Conclusion
We have presented a systematic study of covariate effects on the face verification performance of four recent deep CNN models. The studied models are affected by image quality to different degrees, but all of them degrade quickly and significantly when evaluated on images of lower quality than those they were trained on. However, given proper architecture choices and training procedures, a deep learning model can be made relatively robust to common sources of image-quality degradation. Image blurring degraded the performance of the models most easily and most consistently; this degradation is similar in nature to the real-life scenario of attempting face recognition from low-resolution imagery. Other covariates with a considerable effect on verification performance were noise, image brightness, and missing data, while image contrast and JPEG compression affected the models only marginally. Most of the models considered were least affected by changes in the input colour space: despite being trained on full-colour images, their performance drops only negligibly when they are evaluated on grey-scale images. This finding is also corroborated by the results of the contrast experiments. No specific architecture was found to be significantly more robust than the others to all covariates. The VGG-Face model, for example, was the most robust to noise but the least robust to changes in image brightness. GoogLeNet, on the other hand, performed worst on noise and image blur, but had a slight advantage over the remaining models on images of reduced contrast.