Proceedings: GI 2019

VideoWhiz: Non-Linear Interactive Overviews for Recipe Videos

Megha Nawhal (Simon Fraser University), Jacqueline B. Lang (Simon Fraser University), Greg Mori (Simon Fraser University), Parmit K. Chilana (Simon Fraser University)

Proceedings of Graphics Interface 2019: Kingston, Ontario, 28 - 31 May 2019

DOI 10.20380/GI2019.15

  • BibTeX

    author = {Nawhal, Megha and Lang, Jacqueline B. and Mori, Greg and Chilana, Parmit K.},
    title = {VideoWhiz: Non-Linear Interactive Overviews for Recipe Videos},
    booktitle = {Proceedings of Graphics Interface 2019},
    series = {GI 2019},
    year = {2019},
    issn = {0713-5424},
    isbn = {978-0-9947868-4-5},
    location = {Kingston, Ontario},
    numpages = {8},
    doi = {10.20380/GI2019.15},
    publisher = {Canadian Information Processing Society},
    keywords = {Video navigation, summarization, visualization},

With millions of recipe videos now available online, viewers often face the challenge of browsing through these videos and deciding among different styles of recipe demonstrations and instructions. Although state-of-the-art video summarization techniques using linear presentation formats have been shown to be effective in domains such as surveillance, sports, or lecture videos, recipe videos are often more complex and may require a different summarization approach. We first investigated how viewers navigate recipe videos and what information they look for when seeking quick overviews of such videos. Based on our findings, we designed VideoWhiz, a novel interactive video summarization tool that provides a non-linear overview design allowing easy access to the key stages, or milestones, within a recipe and to the relationships between milestones. VideoWhiz uses a combination of computer vision techniques and an annotation workflow to generate these interactive overviews. Our evaluation showed that viewers found VideoWhiz to be effective and useful in providing quick overviews of recipe videos. We discuss the potential for future work to investigate non-linear overviews for other types of instructional videos and to explore more powerful representations for video summarization.