Interactive video search tools: a detailed analysis of the video browser showdown 2015
Multimed Tools Appl
DOI 10.1007/s11042-016-3661-2
Claudiu Cobârzan1 · Klaus Schoeffmann1 · Werner Bailer2 · Wolfgang Hürst3 ·
Adam Blažek4 · Jakub Lokoč4 · Stefanos Vrochidis5 · Kai Uwe Barthel6 ·
Luca Rossetto7
Received: 23 December 2015 / Revised: 15 March 2016 / Accepted: 1 June 2016
© The Author(s) 2016. This article is published with open access at Springerlink.com
1 Klagenfurt University, Universitätstraße 65-67, 9020 Klagenfurt, Austria
2 DIGITAL - Institute of Information and Communication Technologies, Joanneum Research
Forschungsgesellschaft mbH, Steyrergasse 17, A-8010 Graz, Austria
3 Information and Computing Sciences, Utrecht University, Princetonplein 5, 3584 CC Utrecht,
Netherlands
Abstract Interactive video retrieval tools developed over the past few years are emerging
as powerful alternatives to automatic retrieval approaches by giving the user more control
as well as more responsibilities. Current research tries to identify the best combinations of
image, audio and text features that, combined with innovative UI design, maximize the tools'
performance. We present the latest installment of the Video Browser Showdown (VBS 2015), which
was held in conjunction with the International Conference on MultiMedia Modeling 2015
(MMM 2015) and has the stated aim of pushing for a better integration of the user into the
search process. The setup of the competition, including the dataset used and the presented
tasks, as well as the participating tools, is introduced. The performance of those tools
is presented and analyzed in detail. Interesting highlights are marked, and some
predictions regarding the research focus within the field for the near future are made.
Keywords Exploratory search · Video browsing · Video retrieval
1 Introduction
The Video Browser Showdown (VBS), also known as Video Search Showcase, is an interactive video search competition where participating teams try to answer ad-hoc queries in
a shared video data set as fast as possible. Typical efforts in video retrieval focus mainly
on indexing and machine-based search performance, for example, by measuring precision
and recall with a test data set. With video becoming omnipresent in regular consumers' lives,
however, it is increasingly important to also include the user in the search process.
The VBS is an annual workshop at the International Conference on MultiMedia Modeling
(MMM) with that goal in mind.
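As a point of reference for the machine-based evaluation mentioned above, precision and recall for a single query can be computed as in the following minimal sketch. The function name and the shot identifiers are hypothetical and are not taken from the VBS setup; the sketch only illustrates the two standard measures.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: iterable of item ids returned by the system
    relevant:  iterable of item ids judged relevant (ground truth)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant items that were actually returned
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 shots returned, 5 relevant shots exist, 2 overlap.
p, r = precision_recall(["s1", "s2", "s3", "s4"], ["s2", "s4", "s7", "s8", "s9"])
print(p, r)
```

Both measures describe only the returned result list; neither captures how quickly or how easily a user reaches the target, which is the aspect an interactive evaluation adds.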
Researchers in the multimedia community agree that content-based image and video
retrieval approaches should have a stronger focus on the user behind the retrieval application
[13, 45, 50]. Instead of pursuing rather small improvements in the field of content-based
indexing and retrieval, video search tools should aim at better integration of the human into
the search process, focusing on interactive video retrieval [8, 9, 18, 19] rather than automatic
querying.
Therefore, the main goal of the Video Browser Showdown is to push research on interactive video search tools. Interactive video search follows the idea of strong user integration
with sophisticated content interaction [47] and aims at providing a powerful alternative to
the common video retrieval approach [46]. It is known as the interactive process of video
content exploration with browsing means, such as content navigation [21], summarization
[1], on-demand querying [48], and interactive inspection of query results or filtered content [17]. Contrary to typical video retrieval, such interactive video browsing tools give
more control to the user and provide flexible search features, instead of focusing on the
query-and-browse-results approach. Hence, even if the performance of content analysis is
not optimal, there is a chance that the user could compensate for shortcomings through ingenious use of available features. This is important since it has been shown that users can
perform well even with very simple tools, e.g. a simple HTML5 video player [10, 12,
42, 44].
4 SIRET research group, Department of Software Engineering, Faculty of Mathematics and Physics,
Charles University in Prague, Malostranské nám. 25, 118 00 Prague, Czech Republic
5 Centre for Research and Technology Hellas, Information Technologies Institute, 6th Klm
Charilaou-Thermi Road, 57001 Thessaloniki, Greece
6 Internationaler Studiengang Medieninformatik, Hochschule für Technik und Wirtschaft,
Wilhelminenhofstr. 75a, D-12459 Berlin, Germany
7 Department of Mathematics and Computer Science, University of Basel, Spiegelgasse 1, CH-4051
Basel, Switzerland
Other interesting approaches include using additional capturing devices such as the
Kinect sensor in conjunction with human action video search [32], exercise learning in the
field of healthcare [20] or interactive systems for video search [7]. In [7], for example, an
interactive system for human action video search based on dynamic shape volumes is
developed, in which the user can create video queries by posing any number of actions in front of
a Kinect sensor. Of course, there are many other relevant and related tools in the fields of
interactive video search, video interaction, and multimedia search, which are, however, beyond
the scope of this paper. The interested reader is referred to other surveys in this field,
such as [34, 46, 47].
In this paper we provide an overview of the participating tools along with a detailed
analysis of the results. Our observations highlight different aspects of the performance and
provide insight into better interface development for interactive video search. Details of the
data set and the participating tools are presented, as well as their achieved performance
in terms of score and search time. Further, we reflect on the achieved results so far, give
detailed insights on the reasons why specific tools and methods worked better or worse,
and summarize the experience and observations from the perspective of the organisers. Based
on this, we make several proposals for highly promising approaches to be used with future
iterations of this interactive video retrieval competition.
The remainder of the paper is organized as follows. Section 2 gives a short description of
the competition. Section 3 gives an overview of both the presented tasks and the obtained
results. Section 4 provides short descriptions of the participating tools. A detailed analysis
of the results for the visual expert rounds is presented in Section 5. The results for the textual
expert round are presented in Section 6 and the ones for the novice round in Section 7. A
short historical overview over the last rounds of the Video Browser Showdown together with
some advice on developing interactive video search t (...truncated)