Preface
Ann. Data. Sci. (2014) 1(2):149–150
DOI 10.1007/s40745-014-0019-3
Theory, Methods and Applications in Data Science
Yingjie Tian
Published online: 28 October 2014
© Springer-Verlag Berlin Heidelberg 2014
This 2014 issue of Annals of Data Science (Volume 1, No. 2) presents 7 papers from
several areas of Data Science. They are contributed by 20 authors from 5 countries and regions: Chile, China, Iran, Serbia and the USA. These
papers address data science problems from three aspects: the first is the
theoretical base of data science, the second is the methods of data mining, and
the third is applications.
For the theoretical base of Data Science, the paper “Factor space, the theoretical base
of data science”, by Pei-Zhuang Wang, Zeng-Liang Liu, Yong Shi and Si-Cong Guo,
introduced factor space theory, which provides a general coordinate system to describe
the real world and a theoretical base for data science. Factor space theory was published,
coincidentally, in the same year as formal concept analysis and rough sets. The
three branches were pioneers in intelligence mathematics, but factor space had
focused for several years on genetic analysis for uncertainty. Factor space is a bridge connecting
randomness and certainty, and also a bridge connecting fuzziness and certainty. Based
on the theory, factorial databases are presented, which carry a new kind of statistics
for intelligent analysis of the coming tide of Big Data.
For the methods of Data Science, the paper “Review on: Twin Support Vector
Machine”, by Yingjie Tian and Zhiquan Qi, closely reviewed twin support vector
machines (TWSVMs) and provided an insightful understanding of current developments. As a useful extension of the traditional SVM, TWSVM has lower computational complexity and better generalization ability; in the last few years it has therefore
been studied extensively and developed rapidly, becoming a current research
Y. Tian (B)
Research Center on Fictitious Economy & Data Science, Chinese Academy of Sciences and Key
Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences,
Beijing 100190, China
hot spot in machine learning and data mining. This paper also pointed out the limitations of TWSVMs and highlighted the major opportunities and challenges, as well as potentially
important research directions. The paper “Extended Exponential Geometric proportional hazard model”, by Sadegh Rezaei, Sina Hashami and Lotfollah Najjar, proposed
the Extended Exponential Geometric (EEG) proportional hazard model. Researchers
often use ordinary least squares (OLS) and generalized linear models even for censored data,
while some researchers have presented a useful method for cases which include censored
data and used it without considering baseline hazard models. These methods are all described and compared with the EEG proportional hazard model. The
simulation results show that the EEG model provides more accurate predictions of mean,
median, and high-cost cases, and that it can replace exponential hazard models,
OLS with and without log transformation, and semi-parametric proportional hazard models.
Milan Stanojevic, Bogdana Stanojevic, and Nina Turajli proposed a discrete multiple-objective linear fractional programming (MOLFP) model for the web service selection
problem in the paper “Optimization of multiple-objective web service selection using
fractional programming”. Because a large number of available services offer
similar functionality, the non-functional properties of candidate services must also be taken
into account when choosing the actual services to include in a composition. In addition,
certain constraints on the required performance may be given. Therefore,
web service selection is a multiple-objective, multiple-constraint problem and
can be modeled as an MOLFP. The authors presented a complete methodology for solving
this problem and reported experimental results.
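As a rough illustration of what a discrete multiple-objective linear fractional problem looks like, the following minimal sketch enumerates every way of picking one service per task, discards selections that violate a performance constraint, and keeps the non-dominated ones. It is not the authors' methodology: the attribute names, the numbers, the two ratio objectives and the brute-force search are all assumptions made for this example.

```python
from itertools import product

# Hypothetical candidate services per task: (throughput, cost, response_ms)
candidates = [
    [(50.0, 4.0, 120.0), (80.0, 7.0, 90.0)],   # task 0
    [(40.0, 5.0, 200.0), (60.0, 6.0, 110.0)],  # task 1
]
MAX_RESPONSE_MS = 300.0  # hypothetical performance constraint


def totals(choice):
    """Sum each attribute over the services picked by `choice` (one index per task)."""
    picked = [candidates[task][idx] for task, idx in enumerate(choice)]
    return tuple(sum(attr) for attr in zip(*picked))


def objectives(choice):
    """Two linear fractional objectives (ratios of linear sums), both maximized."""
    throughput, cost, response = totals(choice)
    return (throughput / cost, throughput / response)


def dominates(a, b):
    """True if vector a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


all_choices = product(*(range(len(options)) for options in candidates))
feasible = [c for c in all_choices if totals(c)[2] <= MAX_RESPONSE_MS]
pareto = [
    c for c in feasible
    if not any(dominates(objectives(other), objectives(c)) for other in feasible)
]
print(pareto)
```

Exhaustive enumeration is only practical for very small instances; it is used here purely to make the problem structure concrete.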
Applications of Data Science include three papers. The paper “Ranking Countries
by Medal Priorities Won in the 2014 Sochi Winter Olympics”, by Thomas L. Saaty,
Xiaoyue Liu and Michael Sanserino, used the Analytic Hierarchy Process (AHP) to quantify the priorities of different games according to environmental and people factors,
and also to quantify the priorities of gold, silver and bronze medals. It then used these priorities to compute the total scores of all three types of medals won by each country, in
order to determine the ranking of the countries that won medals in the 22nd Winter
Olympics held in Russia. The paper “A Business Model Design for the Strategic and
Operational Knowledge Management of a Port Community”, by Felisa Córdova and
Claudia Durán, proposed a business model design for managing a sea port community
and transmitting the knowledge that has been created and acquired, so that all the agents
participating in the community can share this business knowledge. The paper “Commercial Banks with A Hybrid Prediction Model”, by Yinhua Li, Yong Shi, Anqiang
Huang and Haizhen Yang, proposed a new prediction model combining trait recognition and SVM, which used accounting data measured one year prior to identify
the features of problem banks. The new method outperformed nine popular prediction
models in overall accuracy. It was also shown that ROA, liquidity assets and short-term
gaps are sound predictors of bank failure.
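The AHP-based medal ranking described above can be sketched in a few lines: derive priority weights from a pairwise comparison matrix, then score each country by its weighted medal counts. This is only an illustration under stated assumptions. The pairwise judgments and medal counts below are hypothetical, and the sketch omits the per-game priorities that the paper also derives.

```python
import numpy as np


def ahp_priorities(pairwise):
    """Return the normalized principal eigenvector of a pairwise comparison matrix."""
    matrix = np.asarray(pairwise, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eig(matrix)
    principal = np.argmax(eigenvalues.real)
    weights = np.abs(eigenvectors[:, principal].real)
    return weights / weights.sum()


# Hypothetical judgments: gold is 2x as important as silver and 4x as
# important as bronze; silver is 2x as important as bronze.
medal_matrix = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
medal_weights = ahp_priorities(medal_matrix)

# Hypothetical (gold, silver, bronze) counts, not actual Olympic results.
medal_counts = {"Country A": (3, 1, 2), "Country B": (2, 4, 1)}
scores = {name: float(np.dot(medal_weights, counts))
          for name, counts in medal_counts.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(medal_weights, ranking)
```

With these made-up numbers, Country B ranks ahead of Country A despite winning fewer gold medals, which shows how priority-weighted scoring can differ from a gold-first ordering.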
As Volume 1, No. 1 of ADS has attracted many readers, we believe that this issue
will sustain that interest. ADS encourages contributors around the world to address
different challenging problems in Data Science, including theories, methods and applications, especially the corresponding problems arising in the Big Data environment.
(...truncated)