Ryohei Suzuki, Daisuke Sakamoto, Takeo Igarashi
The University of Tokyo
(Honorable Mention Award)
We present a video annotation system called ``AnnoTone'', which can embed various kinds of contextual information describing a scene, such as geographical location, into a video at recording time. The system then allows the user to edit the video using this contextual information, for example by overlaying the video with maps or graphical annotations. AnnoTone converts annotation data into high-frequency audio signals (inaudible to the human ear) and transmits them from a smartphone speaker placed near the video camera. This scheme makes it possible to add annotations using standard video cameras, with no equipment required other than a smartphone. We designed the audio watermarking protocol using dual-tone multi-frequency (DTMF) signaling, and developed a general-purpose annotation framework including an annotation generator and extractor. We conducted a series of performance tests to assess the reliability and quality of the watermarking method. We then created several example video-editing applications that use annotations, including an After Effects plug-in, to demonstrate the usefulness of AnnoTone.
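To illustrate the general idea of DTMF-style signaling in an inaudible band, here is a minimal sketch of a tone encoder. The frequency tables, symbol size, and tone duration below are hypothetical illustrations; the paper's actual protocol parameters and framing may differ.

```python
import math

# Hypothetical frequency tables for a DTMF-style encoder in the
# near-ultrasonic band (illustrative values, not the paper's protocol).
ROW_FREQS = [18000, 18400, 18800, 19200]  # Hz
COL_FREQS = [19600, 20000, 20400, 20800]  # Hz
SAMPLE_RATE = 44100                        # Hz, a standard camera audio rate

def encode_symbol(symbol, duration=0.05):
    """Encode a 4-bit symbol (0-15) as the sum of one row and one column tone."""
    row = ROW_FREQS[symbol // 4]
    col = COL_FREQS[symbol % 4]
    n = int(SAMPLE_RATE * duration)
    return [0.5 * (math.sin(2 * math.pi * row * t / SAMPLE_RATE)
                   + math.sin(2 * math.pi * col * t / SAMPLE_RATE))
            for t in range(n)]

samples = encode_symbol(0xA)  # one 50 ms dual-tone burst, 2205 samples
```

As in telephone DTMF, each symbol is the sum of exactly two simultaneous tones, one from each frequency group, so a decoder can recover it by finding the strongest peak in each band.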
Ryohei Suzuki, Daisuke Sakamoto, Takeo Igarashi, AnnoTone: Record-time Audio Watermarking for Context-aware Video Editing, in Proceedings of the 33rd ACM SIGCHI Conference on Human Factors in Computing Systems (CHI 2015)