VISION: a video and image dataset for source identification

EURASIP Journal on Information Security, Oct 2017

The forensic research community keeps proposing new techniques to analyze digital images and videos. However, the performance of the proposed tools is usually tested on data that are far from reality in terms of resolution, source device, and processing history. Notably, in recent years, portable devices have become the preferred means of capturing images and videos, and content is commonly shared through social media platforms (SMPs, e.g., Facebook, YouTube). These facts pose new challenges to the forensic community: for example, most modern cameras feature digital stabilization, which has been proven to severely hinder the performance of video source identification techniques; moreover, the strong re-compression enforced by SMPs during upload threatens the reliability of multimedia forensic tools. On the other hand, portable devices capture both images and videos with the same sensor, opening new forensic opportunities. The goal of this paper is to propose the VISION dataset as a contribution to the development of multimedia forensics. The VISION dataset is currently composed of 34,427 images and 1914 videos, both in their native format and in their social version (Facebook, YouTube, and WhatsApp are considered), from 35 portable devices of 11 major brands. VISION can be exploited as a benchmark for the exhaustive evaluation of several image and video forensic tools.



Dasara Shullani (1), Marco Fontani (0,1), Massimo Iuliani (0,1), Omar Al Shaya (1,2), Alessandro Piva (0,1)

(0) FORLAB, Multimedia Forensics Laboratory, PIN Scrl, Piazza G. Ciardi 25, 59100 Prato, Italy
(1) Department of Information Engineering, University of Florence, Via di S. Marta 3, 50139 Florence, Italy
(2) Department of Electronic Media, Saudi Electronic University, Abi Bakr As Sadiq Rd, Riyadh 11673, Saudi Arabia
Keywords: Dataset; Multimedia forensics; Image forensics; Video forensics; Source identification

1 Introduction

In the last decades, visual data have gained a key role in providing information. Images and videos are used to convey persuasive messages in many different contexts, from propaganda to child pornography. The web also allows users to easily share visual content through social media platforms. Statistics [1] show that a relevant portion of the world's population owns a digital camera and can capture pictures, and that one third of people can go online and upload their pictures to websites and social networks. Given their digital nature, these data also convey information related to their life cycle (e.g., the source device and the processing they have been subjected to). Such information may become relevant when visual data are involved in a crime. In this scenario, multimedia forensics (MF) has been proposed as a means of investigating images and videos to determine information about their life cycle [2]. Over the years, the research community has developed several tools to analyze a digital image, focusing on issues related to the identification of the source device and the assessment of content authenticity [3]. Generally, the effectiveness of a forensic technique should be verified on image and video datasets that are freely available and shared among the community. Unfortunately, the existing datasets, especially in the case of videos, are outdated and not representative of real-case scenarios. Indeed, most multimedia content is currently acquired by portable devices that are updated year by year. These devices are also capable of acquiring both videos and images with the same sensor, thus opening new investigation opportunities in linking different kinds of content [4].
This motivates the need for a new dataset containing a heterogeneous and sufficiently large set of visual data (both images and videos) as a benchmark to test and compare forensic tools. In this paper, we present a new dataset of native images and videos captured with 35 modern smartphones/tablets belonging to 11 different brands: Apple, Asus, Huawei, Lenovo, LG Electronics, Microsoft, OnePlus, Samsung, Sony, Wiko, and Xiaomi. Overall, we collected 11,732 native images; 7565 of them were shared through Facebook, in both high and low quality, and through WhatsApp, resulting in a total of 34,427 images. Furthermore, we acquired 648 native videos, 622 of which were shared through YouTube at the maximum available resolution, and 644 through WhatsApp, resulting in a total of 1914 videos. To exemplify the usefulness of the VISION dataset, we test the performance of a well-known forensic tool, namely the detection of the sensor pattern noise (SPN) left by the acquisition device [5], for the source identification of native/social media contents; moreover, we describe some new opportunities deriving from the availability of images and videos captured with the same sensor to find a solution to current (...truncated)
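To make the SPN idea concrete, the sketch below illustrates the classic PRNU-style pipeline from the literature cited above [5]: a noise residual is extracted from each image, residuals from known images of a camera are combined into a fingerprint estimate, and a query image is matched by correlating its residual against each candidate fingerprint. This is a minimal illustration, not the authors' exact implementation; in particular, the Gaussian denoiser stands in for the wavelet-based filter commonly used in PRNU work, and all function names are our own.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def noise_residual(img):
    """W = I - denoise(I). A Gaussian filter is used here as a simple
    stand-in for the wavelet denoiser of the PRNU literature."""
    return img - gaussian_filter(img, sigma=1.0)

def estimate_fingerprint(images):
    """Maximum-likelihood-style PRNU estimate over a set of images
    from the same camera: K = sum(W_i * I_i) / sum(I_i^2)."""
    num = np.zeros_like(images[0])
    den = np.zeros_like(images[0])
    for img in images:
        num += noise_residual(img) * img
        den += img * img
    return num / (den + 1e-8)

def _ncc(a, b):
    """Normalized correlation between two zero-meaned arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() /
                 (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def match_score(query_img, fingerprint):
    """Correlate the query's residual with fingerprint * image;
    a high score suggests the query came from that camera."""
    return _ncc(noise_residual(query_img), fingerprint * query_img)

if __name__ == "__main__":
    # Synthetic check: two simulated cameras with distinct PRNU patterns.
    rng = np.random.default_rng(0)
    shape = (64, 64)
    k_a = 0.02 * rng.standard_normal(shape)   # camera A's PRNU
    k_b = 0.02 * rng.standard_normal(shape)   # camera B's PRNU

    def shoot(k):
        # Flat-field image: sensor gain (1 + k) plus shot noise.
        return 128.0 * (1.0 + k) + rng.normal(0.0, 2.0, shape)

    fp_a = estimate_fingerprint([shoot(k_a) for _ in range(30)])
    fp_b = estimate_fingerprint([shoot(k_b) for _ in range(30)])
    query = shoot(k_a)
    print(match_score(query, fp_a), match_score(query, fp_b))
```

On the synthetic flat-field images above, the query from camera A scores markedly higher against camera A's fingerprint than against camera B's; real images require many more exposures and careful preprocessing, and the strong re-compression applied by SMPs (one of the challenges VISION is designed to study) degrades exactly this correlation.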


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1186%2Fs13635-017-0067-2.pdf

Dasara Shullani, Marco Fontani, Massimo Iuliani, Omar Al Shaya, Alessandro Piva. VISION: a video and image dataset for source identification. EURASIP Journal on Information Security, Volume 2017, Issue 1, Article 15, 2017. DOI: 10.1186/s13635-017-0067-2