VQone MATLAB toolbox: A graphical experiment builder for image and video quality evaluations
Behav Res (2016) 48:138–150
DOI 10.3758/s13428-014-0555-y
Mikko Nuutinen · Toni Virtanen · Olli Rummukainen ·
Jukka Häkkinen
Published online: 17 January 2015
© Psychonomic Society, Inc. 2015
Abstract This article presents VQone, a graphical experiment builder, written as a MATLAB toolbox, developed for
image and video quality ratings. VQone contains the main
elements needed for the subjective image and video quality
rating process. This includes building and conducting experiments and data analysis. All functions can be controlled
through graphical user interfaces. The experiment builder
includes many standardized image and video quality rating
methods. Moreover, it enables the creation of new methods
or modified versions of standard methods. VQone is distributed free of charge under the terms of the GNU General
Public License and allows code modifications to be made so
that the program’s functions can be adjusted according to a
user’s requirements. VQone is available for download from
the project page (http://www.helsinki.fi/psychology/groups/visualcognition/).
Keywords Image rating · Image quality · MATLAB ·
Computer software
This project has been supported by Academy of Finland project
“Mind, Image, Picture” (MIPI), (project no. 267061).
M. Nuutinen · T. Virtanen · J. Häkkinen
Institute of Behavioural Sciences, University of Helsinki, PO Box
9 (Siltavuorenpenger 1 A), Helsinki, Finland
O. Rummukainen
Department of Signal Processing and Acoustics, Aalto University,
Espoo, Finland
Introduction
Image and video quality assessment plays an important
role in development and optimization of image acquisition,
encoding, and transmission schemes (Bovik, 2013;
Chandler, 2013). In a typical experiment, an observer
views a sequence of images and evaluates
some property of the images, such as overall quality,
sharpness, graininess, or saturation, or rates the magnitude of the difference between the images. Many image-rating
methods, such as Forced Choice Paired Comparison (PC),
Triplet, Absolute Category Rating (ACR), Double Stimulus
Impairment Scale (DSIS), Double Stimulus Categorical
Rating (DSCR) and Single Stimulus Continuous Quality Evaluation (SSCQE) have been standardized (ISO
20462-1, 2005; ISO 20462-2, 2005; ITU-T P.910, 2008;
ITU-R BT.500, 2012). The standards of image rating
describe how to display test images and video and possible
reference stimuli to observers as well as how to collect rating scores. The reference stimuli can be images with known
properties or quality, helping the observers anchor their
ratings to something concrete. Moreover, images are often
used in human behavioral research (e.g., Leisti, Radun,
Virtanen, Nyman, & Häkkinen, 2014; Coco & Keller, 2014;
To, Gilchrist, Troscianko, & Tolhurst, 2011).
Different applications and experimental settings require
different methods for rating images and videos. For example, there is a trade-off between the number of test
images and observer fatigue caused by prolonged test durations. The PC method displays two images side by side
or sequentially, and the observer’s task is to select the
image with more of the attribute in question, for example
image quality or sharpness. The method excels at finding
small, near-threshold differences between test
stimuli that differ mostly along a single dimension. However,
the PC method is only suited for experiments with a relatively small number of test images, because the number
of image pairs increases quadratically with the number of
images (Mantiuk, Tomaszewska, & Mantiuk, 2012).
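To make this growth concrete: a full paired-comparison design over n images requires n(n − 1)/2 unordered pairs. The Python sketch below is illustrative only and is not part of VQone; the function name `count_pairs` and the placeholder image names are assumptions introduced here.

```python
from itertools import combinations


def count_pairs(n_images: int) -> int:
    """Number of unordered pairs in a full paired-comparison design."""
    return n_images * (n_images - 1) // 2


# Hypothetical stimulus names, used only for illustration.
images = ["img_a", "img_b", "img_c", "img_d"]
pairs = list(combinations(images, 2))

# Enumerating the pairs agrees with the closed-form count.
assert len(pairs) == count_pairs(len(images))

# Print the pair count for a few stimulus-set sizes.
for n in (5, 10, 20, 40):
    print(n, count_pairs(n))
```

Doubling the number of images roughly quadruples the number of pairs each observer has to judge, which is why a full PC design quickly becomes impractical for large stimulus sets.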
The ACR method displays one image at a time and the
image is rated without a reference image. The ACR method
is the fastest method for assessing many test stimuli. However, the ACR method can be inaccurate because, when
providing rating values for test images, observers
compare the test images with their own internal references,
which leads to individual differences in the use of the given
rating scale.
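One common way to reduce such observer-specific scale usage before pooling scores is to standardize each observer's ratings (z-scores). The source does not state that VQone does this; the Python sketch below, with made-up ratings and a hypothetical `zscore` helper, only illustrates the idea.

```python
from statistics import mean, pstdev


def zscore(ratings):
    """Standardize one observer's ratings to zero mean and unit variance."""
    m = mean(ratings)
    s = pstdev(ratings)
    if s == 0:
        # An observer who gave identical ratings carries no ranking information.
        return [0.0 for _ in ratings]
    return [(r - m) / s for r in ratings]


# Made-up ACR ratings of the same three images by two observers.
# Both rank the images identically but use different parts of the scale.
observer_a = [1, 2, 3]
observer_b = [3, 4, 5]

z_a = zscore(observer_a)
z_b = zscore(observer_b)
```

After standardization the two observers' scores coincide, although their raw ratings differ by a constant offset of two scale points.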
If the reference image is available and the number of test
images is high, the use of rating methods such as DSIS or
DSCR can be justified. The DSIS method displays the reference and test images, and the observer’s task is to assign the
test image to a preference category. In the DSCR method,
the observer assigns categories to both the reference and test images.
This paper introduces the VQone toolbox for MATLAB, an experiment builder for still-image and video quality rating.
VQone contains the main elements needed for building and conducting experiments and for data analysis. That is,
VQone is a tool for showing previously manipulated stimuli
and for recording responses. The VQone toolbox is free of
charge and offers an intuitive and comprehensive graphical
user interface (GUI). Currently, there are many free software packages that are viable tools for creating image rating
experiments (see Table 2). However, these packages are limited compared with VQone as far as graphical interface,
available rating methods, and flexibility are concerned.
VQone enables the creation of a new Dynamic Reference
Absolute Category Rating method (DR-ACR) (Virtanen,
Nuutinen, Vaahteranoksa, Oittinen, & Häkkinen, 2014) and
a wide range of experiments according to image quality
standards (PC, triplet, ACR, DSIS, DSCR, SSCQE). All
of the setups can also be augmented with the possibility
of gathering qualitative free answers from the observers.
That is, observers type into a text input field, in one or two words,
the most important aspects that influenced
their quality rating. This allows the researchers to gain
descriptive data about the reasons behind the observers’
judgments (Nyman et al., 2006; Radun et al., 2008). The
descriptive data complements the standard rating methods
by offering a description of what was seen in the test stimuli when a quality decision was made or a preference was
expressed.
VQone is not limited to standardized experiments; it can
also be used to construct entirely new experimental setups.
The user can create and name rating scales, radio buttons,
and check boxes. Furthermore, the user can modify the sizes
and locations of stimulus windows and add different reference stimuli. Because all the settings and experiments are
built through the GUIs, VQone provides tools for building complex image and video rating experiments without the need for
programming.
In the first section of the present article, we provide a
nontechnical description of the basic functionality offered
by the experiment builder unit of VQone. All settings and
possibilities offered by VQone are presented in the VQone
user manual located within the files needed to run VQone
on MATLAB. In the second section, we present and analyze
samples (distributed with the VQone package) of a typical
image rating study. In the third section of this article, we
describe ho (...truncated)