Guest Editorial: Image Analysis and Processing Leveraging Additional Information
Multimed Tools Appl (2016) 75:3933–3936
DOI 10.1007/s11042-016-3412-4
Luis Herranz¹ · Jian Cheng² · Yue Gao³ · Shuqiang Jiang¹
Published online: 16 March 2016
© Springer Science+Business Media New York 2016
Many multimedia analysis and processing applications can benefit from better image understanding. However, accurately understanding what the underlying visual content represents
remains a very challenging problem. Systems need to look beyond the raw matrix of
intensity values and exploit other types of information. Indeed, humans do not solve visual
problems based only on the captured pixel data. They exploit many non-visual clues and diverse types of
external information, including prior knowledge, prior experience, and contextual
and collaborative information. Intelligent systems can often incorporate them to simplify
the problem, and can also integrate other types of signals, such as depth and infrared images.
This additional information is often critical to addressing the problem better or increasing
the quality of the result. This special issue contains 10 papers, selected from the initial 25
submissions after two rounds of blind review.
¹ Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
² National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
³ School of Software, Tsinghua University, Beijing 100086, China
Privileged or expert information can help to guide learning by exploiting indirect cues. In
“Facial Expression Recognition through Modeling Age-related Spatial Patterns” [6], Wang
et al. describe a system that leverages age information, available only during training, to
improve facial expression recognition. Age-related spatial expression patterns carry crucial
information that has not been exploited previously. In particular, the authors implement this
system using Bayesian networks that model facial geometric features, outperforming existing approaches. The paper “Maximum Margin Hashing with Supervised Information” [8],
by Yang et al., describes a large-margin approach to learning hashing functions in a supervised
way. Leveraging label information helps to obtain hashing functions that better represent
the semantic similarity between images. The paper “Discriminative Sparse Representation
for Face Recognition” [10], by Zhang et al., proposes using facial structure information
to increase face recognition performance. Weights are learned for different face locations based on information entropy, providing an effective prior and structure information.
A sparse coding formulation combined with this weighting scheme results in more discriminative representations. Experiments on face recognition benchmarks show good results. In
“High-Order Graph Matching Kernel for Early Carcinoma EUS Image Classification” [9],
Zhang et al. address the problem of early carcinoma detection in endoscopic ultrasonography (EUS)
images. The method is based on graph matching, and the authors propose a new kernel that
better preserves the underlying topological structure of the graph. The paper “Discriminative
Sparse Neighbor Coding” [1], by Bai et al., proposes a new coding method that addresses some
shortcomings of traditional sparse coding strategies. It includes a module that selects discriminative features for each class, and another module that discards non-informative
visual words. Finally, the authors obtain class-specific low-dimensional subspaces and report
state-of-the-art performance.
Combining multiple heterogeneous signals or multiple features extracted from the same
signal can also outperform methods using a single one. In “Features combination for art
authentication studies: brushstroke and materials analysis of Amadeo de Souza-Cardoso”
[4], Montagner et al. tackle the interesting problem of art authentication, in the particular case of the modernist painter Amadeo de Souza-Cardoso. In addition to conventional
RGB analysis, which is used to analyze brushstroke characteristics, their system incorporates hyperspectral imaging and elemental analysis to determine the pigments present in the
painting and compare them with those used by the painter. “Camouflage Performance Analysis and Evaluation Framework Based on Features Fusion” [7], by Xue et al., proposes a
model to evaluate the quality of camouflage features. Their model uses nonlinear fusion to
combine multiple image features that measure the degree to which the target and surrounding background differ with respect to background-related and internal features. Subjective
experiments show that scores predicted by the system correlate with the difficulty human
assessors had in detecting camouflaged targets. In “Pointwise and Pairwise Clothing Annotation: Combining Features from Social Media” [5], Nogueira et al. exploit social media
data to automatically annotate clothes, including the different garment items. The authors
formulate the task as a multi-label and multi-modal classification problem, proposing two
approaches (pointwise and pairwise). Their experiments show a significant improvement
compared with related methods.
External information can be used in image enhancement and other image processing
operations. The paper “Image Super-Resolution based on Multi-kernel Regression” [2], by Li
et al., describes a data-driven method to obtain high-resolution images from blurred ones
using multi-kernel regression, and an efficient algorithm to implement it. The key information needed to recover the details that enhance the images is learned from an external dataset.
In “LWT-QR decomposition-based robust and efficient image watermarking scheme using
Lagrangian SVR” [3], Mehta et al. address the problem of watermarking images, that is,
embedding an external signal in an image in such a way that the signal cannot be
perceived but can be recovered if necessary. Their method uses the coefficients from a lifting
wavelet transform (LWT) followed by the QR decomposition to train a support vector regression model.
The model is used to embed the watermark and then to extract it when necessary.
These ten papers cover a wide range of image analysis and processing applications and
scenarios in which additional information is beneficial. We hope readers will find interesting ideas in them. Finally, the guest editors would like to express their
gratitude to the authors who submitted manuscripts and to the reviewers who generously
contributed their time to reviewing them.
References
1. Bai X, Yan C, Ren P, Bai L, Zhou J (2015) Discriminative sparse neighbor coding. Multimed Tools
Appl. doi:10.1007/s11042-015-2951-4
2. Li J, Qu Y, Li C, Xie Y (2015) Image sup (...truncated)