Multi-View Facial Data Collection

This task is devoted to collecting facial data from several individuals, each performing several facial expressions and reciting a short passage of speech. To accomplish it, three main sub-tasks will be carried out:

1- Facial Expression Acquisition Platform

2- 4D facial dynamics data acquisition

3- Public release of the database

T2.1 – Facial Expression Acquisition Platform

To provide a suitable environment for facial expression data acquisition, a dedicated room will be set up with a trinocular video camera system and a semi-professional controlled illumination system. This room will allow individuals to perform their facial actions in a private environment, and will also allow data to be collected under different, controlled illumination conditions. The trinocular vision system will be composed of a cyclopean camera (frontal view) and two further static cameras for side views of the face. The vergence of the cameras can be adjusted manually. The central camera will also be equipped with a special-purpose IR illumination system to enhance detection of the eye pupils. This camera setup is currently available at the ISR (FCT-POSC/EEA-SRI/61150/2004). Each individual will be asked to perform the same set of actions on different days, following a pre-defined sequence of facial actions. A specially designed program will be made available to help individuals understand the type of action they are supposed to perform.

T2.2 - 4D facial dynamics data acquisition

At present there is no publicly available collection of 4D facial data that fulfils the requirements of this project. This motivates our effort to develop a database that meets those requirements. The database we expect to collect will comprise several individuals, each performing a number of facial expressions and a short passage of speech. Both facial expression and speech data will be collected, since each is a source of dynamic identity information and understanding their respective discriminatory strengths remains an open area of research. The number of subjects must strike a trade-off: small enough to keep the human labour of data collection manageable, yet large enough to evaluate our system adequately. Each subject will perform the six basic expressions three times (twice for the enrolment collection and once for the query collection), each over the complete neutral-apex-neutral cycle. We will also record each subject's facial motion as they recite a short passage of text three times. Short phrases are preferable when user convenience or processing time is a consideration.

The raw multi-view video is fed to the methods of task 1, which return, for each captured frame, a sparse 3D facial reconstruction using the 2D+3D AAM. Owing to the nature of the AAM shape model (see task 1), registering each 3D video sequence both spatially and temporally becomes straightforward (i.e. all reconstruction is done w.r.t. the parametrized 3D shape model). Since faces deform smoothly over time, noise can be reduced by spatiotemporal smoothing, for example with 3D Gaussian filters or radial basis functions. Other solutions will also be analysed, such as examining the path that each recovered point describes over time, fitting these data with a cubic spline, and tracking with Kalman or particle filters.
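As an illustration of the smoothing idea, the sketch below applies a Gaussian filter along the time axis of the reconstructed point trajectories. It is a minimal example, not the project's implementation: it assumes the reconstruction is stored as an array of shape (frames, points, 3), it smooths only in time rather than in space and time, and the names `gaussian_kernel` and `smooth_trajectories` and the value of `sigma` are hypothetical.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Return a normalised 1D Gaussian kernel truncated at about 3 sigma."""
    radius = int(3 * sigma + 0.5)
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (t / sigma) ** 2)
    return kernel / kernel.sum()


def smooth_trajectories(points, sigma=1.5):
    """Smooth 3D point trajectories along the time axis.

    points: array of shape (T, N, 3), i.e. T frames of N reconstructed
            3D points (assumed layout, not specified by the project).
    sigma:  temporal standard deviation of the Gaussian, in frames.
    """
    points = np.asarray(points, dtype=float)
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    # Repeat the first and last frames so the output keeps T frames.
    padded = np.pad(points, ((radius, radius), (0, 0), (0, 0)), mode="edge")
    smoothed = np.zeros_like(points)
    n_frames = points.shape[0]
    # Weighted sum of time-shifted copies of the sequence.
    for offset, weight in enumerate(kernel):
        smoothed += weight * padded[offset:offset + n_frames]
    return smoothed
```

A larger `sigma` suppresses more noise but also blurs fast facial motion such as the onset of an expression, so in practice its value would need to be tuned against the frame rate of the cameras.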

T2.3 - Public release of the 4D facial dynamics database

We plan to release the acquired database, which consists of three synchronized video views (raw data) together with the corresponding 3D facial reconstruction at each time instant. This database will be a valuable tool for the facial research community, particularly for tasks that aim to analyse and recognize the dynamics of faces.
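To make the pairing of raw video and reconstruction concrete, one possible in-memory layout for a single recording is sketched below. The release format has not been specified, so every class and field name here (`Frame`, `Recording`, `image_paths`, `shape_3d`, `session`) is a hypothetical choice for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point3D = Tuple[float, float, float]


@dataclass
class Frame:
    timestamp: float                   # seconds from the start of the recording
    image_paths: Tuple[str, str, str]  # the three synchronized views
    shape_3d: List[Point3D]            # sparse 3D reconstruction for this instant


@dataclass
class Recording:
    subject_id: str
    label: str     # e.g. an expression name or "speech"
    session: str   # e.g. "enrolment" or "query"
    frames: List[Frame] = field(default_factory=list)

    def is_consistent(self) -> bool:
        """Check equal point counts per frame and strictly increasing timestamps."""
        if not self.frames:
            return True
        n_points = len(self.frames[0].shape_3d)
        same_size = all(len(f.shape_3d) == n_points for f in self.frames)
        times = [f.timestamp for f in self.frames]
        increasing = all(a < b for a, b in zip(times, times[1:]))
        return same_size and increasing
```

Keeping the three image paths and the 3D shape in the same `Frame` record mirrors the requirement that each time instant has both its raw views and its reconstruction.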

In summary, task 2 consists of acquiring 4D facial information using a trinocular camera system. Task 1 provides the methods for recovering 3D shape from each set of three synchronized frames, thereby yielding a sparse, time-varying three-dimensional model. Several filtering methods will be analysed. A 4D facial dynamics database will then be built, capturing the facial movements of several individuals over the course of different experiments. The deliverables of task 2 will be 4D facial motion sequences (quasi-dense points) for each experiment performed by each individual.
