VQone MATLAB toolbox: A graphical experiment builder for image and video quality evaluations

Behav Res (2016) 48:138–150
DOI 10.3758/s13428-014-0555-y

Mikko Nuutinen · Toni Virtanen · Olli Rummukainen · Jukka Häkkinen

Published online: 17 January 2015
© Psychonomic Society, Inc. 2015

This project has been supported by the Academy of Finland project “Mind, Image, Picture” (MIPI) (project no. 267061).

M. Nuutinen · T. Virtanen · J. Häkkinen
Institute of Behavioural Sciences, University of Helsinki, PO Box 9 (Siltavuorenpenger 1 A), Helsinki, Finland

O. Rummukainen
Department of Signal Processing and Acoustics, Aalto University, Espoo, Finland

Abstract This article presents VQone, a graphical experiment builder, written as a MATLAB toolbox, developed for image and video quality ratings. VQone contains the main elements needed for the subjective image and video quality rating process: building experiments, conducting them, and analyzing the data. All functions can be controlled through graphical user interfaces. The experiment builder includes many standardized image and video quality rating methods. Moreover, it enables the creation of new methods or modified versions of standard methods. VQone is distributed free of charge under the terms of the GNU General Public License and allows code modifications, so that the program’s functions can be adjusted to a user’s requirements. VQone is available for download from the project page (http://www.helsinki.fi/psychology/groups/visualcognition/).

Keywords Image rating · Image quality · MATLAB · Computer software

Introduction

Image and video quality assessment plays an important role in the development and optimization of image acquisition, encoding, and transmission schemes (Bovik, 2013; Chandler, 2013).
In a typical experiment, an observer views a sequence of images and evaluates some property of them, such as overall quality, sharpness, graininess, or saturation, or rates the magnitude of the difference between images. Many image-rating methods, such as Forced Choice Paired Comparison (PC), Triplet, Absolute Category Rating (ACR), Double Stimulus Impairment Scale (DSIS), Double Stimulus Categorical Rating (DSCR), and Single Stimulus Continuous Quality Evaluation (SSCQE), have been standardized (ISO 20462-1, 2005; ISO 20462-2, 2005; ITU-T P.910, 2008; ITU-R BT.500, 2012). The image-rating standards describe how to display test images and videos, and possible reference stimuli, to observers and how to collect rating scores. The reference stimuli can be images with known properties or quality, which help observers anchor their ratings to something concrete. Moreover, images are often used in human behavioral research (e.g., Leisti, Radun, Virtanen, Nyman, & Häkkinen, 2014; Coco & Keller, 2014; To, Gilchrist, Troscianko, & Tolhurst, 2011).

Different applications and experimental settings require different methods for rating images and videos. For example, there is a trade-off between the number of test images and observer fatigue caused by prolonged test durations. The PC method displays two images side by side or sequentially, and the observer’s task is to select the image with more of the attribute in question, for example image quality or sharpness. The method excels at finding small, near-detection-threshold differences between test stimuli that differ mostly along a single dimension. However, the PC method is only suited to experiments with a relatively small number of test images, because the number of image pairs grows quadratically with the number of images (Mantiuk, Tomaszewska, & Mantiuk, 2012): n images yield n(n − 1)/2 pairs.

The ACR method displays one image at a time, and the image is rated without a reference image.
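To make the difference in workload between the PC and ACR methods described above concrete, the following minimal sketch counts the trials each method would need for a given number of test images. It assumes that every unordered image pair is shown exactly once in PC and that every image is rated exactly once in ACR, with no repetitions and a single observer. The function names are hypothetical and are not part of VQone.

```python
from itertools import combinations


def paired_comparison_trials(n_images: int) -> int:
    """Number of unordered image pairs, n * (n - 1) / 2 (assumes each pair is shown once)."""
    return n_images * (n_images - 1) // 2


def acr_trials(n_images: int) -> int:
    """ACR shows each image once, so the trial count equals the image count."""
    return n_images


def list_pairs(image_names):
    """Enumerate the unordered pairs a full paired-comparison design would present."""
    return list(combinations(image_names, 2))


if __name__ == "__main__":
    # Compare the trial counts of the two methods for a few test-set sizes.
    for n in (5, 10, 20, 50):
        print(f"{n} images: PC = {paired_comparison_trials(n)} trials, "
              f"ACR = {acr_trials(n)} trials")
```

Under these assumptions, 10 test images require 45 paired comparisons but only 10 ACR trials, and the gap widens as the test set grows.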
The ACR method is the fastest method for assessing many test stimuli. However, it can be inaccurate because observers rate test images against their own internal references, which leads to individual differences in the use of the given rating scale. If a reference image is available and the number of test images is high, the use of rating methods such as DSIS or DSCR can be justified. The DSIS method displays the reference and test images, and the observer’s task is to assign a preference category to the test image. In the DSCR method, the observer assigns categories to both the reference and test images.

This paper introduces the VQone toolbox for MATLAB, an experiment builder for still-image and video quality rating. VQone contains the main elements needed for building and conducting experiments and for data analysis. That is, VQone is a tool for showing previously manipulated stimuli and for recording responses. The VQone toolbox is free of charge and offers an intuitive and comprehensive graphical user interface (GUI). Currently, many free software packages are viable tools for creating image rating experiments (see Table 2). However, these packages are limited compared with VQone in terms of graphical interface, available rating methods, and flexibility. VQone supports the new Dynamic Reference Absolute Category Rating (DR-ACR) method (Virtanen, Nuutinen, Vaahteranoksa, Oittinen, & Häkkinen, 2014) as well as a wide range of experiments based on image quality standards (PC, triplet, ACR, DSIS, DSCR, SSCQE). All of the setups can also be augmented with the possibility of gathering qualitative free answers from the observers. That is, observers write down, in one or two words in a text input field, the most important aspects that influenced their quality rating.
This allows researchers to gain descriptive data about the reasons behind the observers’ judgments (Nyman et al., 2006; Radun et al., 2008). The descriptive data complement the standard rating methods by describing what was seen in the test stimuli when a quality decision was made or a preference was expressed.

VQone is not limited to standardized experiments; it can also be used to construct entirely new experimental setups. The user can create and name rating scales, radio buttons, and check boxes. Furthermore, the user can modify the sizes and locations of stimulus windows and add different reference stimuli. Because all the settings and experiments are built through the GUIs, VQone provides tools for forming complex image and video rating experiments without the need for programming.

In the first section of the present article, we provide a nontechnical description of the basic functionality offered by the experiment builder unit of VQone. All settings and possibilities offered by VQone are presented in the VQone user manual, which is included among the files needed to run VQone on MATLAB. In the second section, we present and analyze samples (distributed with the VQone package) of a typical image rating study. In the third section of this article, we describe ho (...truncated)