Proceedings: GI 2016

Reading Between the Dots: Combining 3D Markers and FACS Classification for High-Quality Blendshape Facial Animation

Shridhar Ravikumar (University of Bath), Colin Davidson (The Imaginarium Studios), Dmitry Kit (University of Bath), Neill Campbell (University of Bath), Luca Benedetti (University of Bath), Darren Cosker (University of Bath)

Proceedings of Graphics Interface 2016: Victoria, British Columbia, Canada, 1-3 June 2016, 143-151

DOI 10.20380/GI2016.18

  • BibTeX

    author = {Ravikumar, Shridhar and Davidson, Colin and Kit, Dmitry and Campbell, Neill and Benedetti, Luca and Cosker, Darren},
    title = {Reading Between the Dots: Combining 3D Markers and FACS Classification for High-Quality Blendshape Facial Animation},
    booktitle = {Proceedings of Graphics Interface 2016},
    series = {GI 2016},
    year = {2016},
    issn = {0713-5424},
    isbn = {978-0-9947868-1-4},
    location = {Victoria, British Columbia, Canada},
    pages = {143--151},
    numpages = {9},
    doi = {10.20380/GI2016.18},
    publisher = {Canadian Human-Computer Communications Society / Soci{\'e}t{\'e} canadienne du dialogue humain-machine},
    keywords = {Facial performance capture, Face animation, Blendshapes, Motion capture},


Marker-based performance capture is one of the most widely used approaches for facial tracking owing to its robustness. In practice, marker-based systems do not capture the performance with complete fidelity and often require subsequent manual adjustment to incorporate missing visual details. This problem persists even when a larger number of markers is used. Tracking a large number of markers can also quickly become intractable due to issues such as occlusion, swapping and merging of markers. We present a new approach for fitting blendshape models to motion-capture data that improves quality while using fewer markers, by exploiting information from sparse make-up patches in the video between the markers. Our method uses a classification-based approach that detects FACS Action Units and their intensities to assist the solver in predicting optimal blendshape weights while taking perceptual quality into consideration. Our classifier is independent of the performer; once trained, it can be applied to multiple performers. Given performances captured using a Head Mounted Camera (HMC), which provides 3D facial marker-based tracking and corresponding video, we fit accurate, production-quality blendshape models to this data, resulting in high-quality animations.
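
The abstract does not state the solver's objective, so the following is only an illustrative sketch of the general idea of fitting blendshape weights to sparse marker data with help from a classifier. It assumes a linear blendshape model, weights bounded to [0, 1], and a quadratic prior that pulls the weights toward intensities predicted by an Action Unit classifier. The function name `fit_blendshape_weights`, the parameter `lam`, and the projected-gradient solver are all hypothetical choices, not the authors' method, and the perceptual-quality term mentioned above is not modelled.

```python
import numpy as np

def fit_blendshape_weights(B, d, prior, lam=0.1, iters=500):
    """Minimise ||B w - d||^2 + lam * ||w - prior||^2 subject to 0 <= w <= 1.

    B     : (3M, K) marker displacements produced by each of K blendshapes
    d     : (3M,)   observed marker displacements from the neutral face
    prior : (K,)    weights suggested by an AU classifier (assumed input)
    lam   : strength of the classifier prior (hypothetical parameter)
    """
    B = np.asarray(B, dtype=float)
    d = np.asarray(d, dtype=float)
    prior = np.asarray(prior, dtype=float)

    # Lipschitz constant of the gradient; 1/lip is a safe step size.
    lip = 2.0 * (np.linalg.norm(B, 2) ** 2 + lam)

    # Start from the classifier's suggestion, clipped to the valid range.
    w = np.clip(prior, 0.0, 1.0)
    for _ in range(iters):
        grad = 2.0 * (B.T @ (B @ w - d)) + 2.0 * lam * (w - prior)
        # Gradient step followed by projection onto the box [0, 1]^K.
        w = np.clip(w - grad / lip, 0.0, 1.0)
    return w

# Synthetic usage example: 10 markers (30 coordinates), 4 blendshapes.
rng = np.random.default_rng(0)
B_demo = rng.normal(size=(30, 4))
w_demo = np.array([0.8, 0.0, 0.3, 0.0])
d_demo = B_demo @ w_demo                       # noiseless marker data
prior_demo = np.array([0.7, 0.1, 0.4, 0.0])    # imperfect classifier guess
print(fit_blendshape_weights(B_demo, d_demo, prior_demo, lam=0.01))
```

In this toy setup the prior matters little because the markers already determine the weights. With few markers, several blendshapes can produce near-identical marker motion, and a classifier-derived term of this kind is one plausible way to break the tie.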