Annals of Data Science

Annals of Data Science (AODS) is an academic journal focusing on Big Data analytics and applications. It not only promotes how to use interdisciplinary ...

List of Papers (Total 139)

Modeling Determinants of Time-To-Death in Premature Infants Admitted to Neonatal Intensive Care Unit in Jimma University Specialized Hospital

Mar 2017 | Million Wesenu, Sudhir Kulkarni, Tafere Tilahun

Preterm birth is the term used to define births that occur before 37 completed weeks or 259 days of gestation. The aim of this study is to model survival probability of premature infants who were under follow-up and identify significant risk factors for mortality. Recorded hospital data were obtained for a cohort of 490 infants at Jimma University Specialized Hospital, Ethiopia...

Joint Modeling of Longitudinal CD4 Count and Weight Measurements of HIV/Tuberculosis Co-infected Patients at Jimma University Specialized Hospital

Aug 2016 | Aboma Temesgen, Teshome Kebede

Once HIV/TB co-infected patients begin follow-up visits, it is common to measure weight and CD4 count repeatedly over time to determine their health status. Separate linear mixed models of weight and CD4 count usually cannot handle the association between the two outcomes, whereas a joint multivariate linear mixed model can. Thus, this study was an attempt to...

Computational Stochastic Modelling to Handle the Crisis Occurred During Community Epidemic

Mar 2016 | Ruchi Verma, Vivek Kumar Sehgal, Nitin

A crisis can strike anyone, anywhere. Its unpredictability and inevitability make it imperative that a crisis receives immediate and critical attention so that it is managed and contained at the right time. Any crisis is a red-alert situation, so there is a widely felt need to handle it with the utmost priority and efficiency. A crisis, it may be a...

RETRACTED ARTICLE: Measuring Routes Efficiency of Kolkata Bus Transport: A Modified DEA Approach

Mar 2016 | Asit Bandyopadhayay, Ashim Banerjee, Naseem Abidi

Feature Selection for Multi-Class Imbalanced Data Sets Based on Genetic Algorithm

Dec 2015 | Li-min Du, Yang Xu, Hua Zhu

This paper presents an improved genetic-algorithm-based feature selection method for multi-class imbalanced data. The method improves the fitness function by using the evaluation criterion EG-mean instead of the global classification accuracy, in order to choose features that favor recognition of the minority classes. The method is evaluated using several benchmark...

A Neighborhood-based Matrix Factorization Technique for Recommendation

Dec 2015 | Meng-jiao Guo, Jin-guang Sun, Xiang-fu Meng

Data sparsity and prediction quality are recognized as the key challenges in existing recommender systems. Most existing recommender systems depend on the collaborative filtering (CF) method, which mainly leverages the user-item rating matrix representing the relationship between users and items. However, the CF-based method sometimes fails to provide accurate...

The Information Content of OVX for Crude Oil Returns Analysis and Risk Measurement: Evidence from the Kalman Filter Model

Dec 2015 | Yanhui Chen, Kaijian He, Lean Yu

The crude oil volatility index (OVX) is a new index published by the Chicago Board Options Exchange since 2007. In recent years it has emerged as an important alternative measure for tracking and analyzing the volatility of future oil prices. In this paper we first model and analyze the dynamic relationship between OVX changes and future crude oil price returns with time-varying coefficients...

Exploring Big Data Analysis: Fundamental Scientific Problems

Dec 2015 | Zongben Xu, Yong Shi

Although Big Data has been one of the most popular topics in the last several years, how to conduct Big Data analysis effectively remains a big challenge for every field. This paper tries to address some fundamental scientific problems in Big Data analysis, such as opportunities, challenges, and difficulties encountered in the analysis. The challenges arise from multiple domains that...

An Efficient Variable Selection Method for Predictive Discriminant Analysis

Dec 2015 | A. Iduseri, J. E. Osemwenkhae

Seeking a subset of relevant predictor variables for use in predictive model construction in order to simplify the model, obtain shorter training time, as well as enhance generalization by reducing overfitting is a common preprocessing step prior to training a predictive model. In predictive discriminant analysis, the use of classic variable selection methods as a preprocessing...

Segmentation of Chinese Urban Real Estate Market: A Demand-Supply Distribution Perspective

Dec 2015 | Jichang Dong, Xiuting Li, Wencong Li, et al.

This study proposed a new perspective on the analysis of the regional features of real estate market and explored a more reliable segmentation method for Chinese urban real estate market based on the optimization of supply-demand resource distribution. A two-stage clustering procedure is proposed based on supply and demand elements and market performance respectively. And six...

Informational Energy and Its Application in Testing Normality

Dec 2015 | Hadi Alizadeh Noughabi, Majid Chahkandi

In this article, we propose a test of fit for normality based on the estimated Informational Energy and using m-step spacings. Consistency of the test statistic is established. Critical values and power values of the test against various alternatives are calculated. Finally, the power values of the proposed test are compared with the power values of some prominent normality tests.
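
As background for the quantity named in this abstract: Onicescu's informational energy of a density f is the integral of f(x)^2, and for a normal density with standard deviation sigma it equals 1/(2*sigma*sqrt(pi)). The sketch below only checks that closed form numerically. It is not the m-step spacings estimator proposed in the paper, and all function names are illustrative assumptions.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def informational_energy(pdf, lo, hi, n=20000):
    """Approximate the integral of pdf(x)**2 over [lo, hi] (trapezoidal rule)."""
    h = (hi - lo) / n
    total = 0.5 * (pdf(lo) ** 2 + pdf(hi) ** 2)
    for i in range(1, n):
        total += pdf(lo + i * h) ** 2
    return total * h


def normal_informational_energy(sigma):
    """Closed form for a normal density: 1 / (2 * sigma * sqrt(pi))."""
    return 1.0 / (2.0 * sigma * math.sqrt(math.pi))


if __name__ == "__main__":
    sigma = 2.0
    numeric = informational_energy(lambda x: normal_pdf(x, 0.0, sigma), -40.0, 40.0)
    print(numeric, normal_informational_energy(sigma))
```

The closed form does not depend on the mean, which is one reason the quantity is convenient as the basis of a location-free test statistic.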

Goal-Programming-Based Procedure for Calculating Positive Multipliers Under a Multiple Criteria Data Envelopment Analysis Framework: An Application to UEFA EURO 2012

Dec 2015 | Ana Paula dos Santos Rubem, Luana Carneiro Brandão, João Carlos Correia Baptista Soares de Mello

One of the motivations for the emergence of the multiple criteria data envelopment analysis (MCDEA) model was the need to yield more reasonable input-output multipliers than those derived from standard data envelopment analysis (DEA), without using a priori information. The problem of unreasonable multipliers occurs when some production units are efficient in standard DEA simply...

How to Measure Rhetorical Impact of Teaching and their Levels of Persuasion: A Neuro-rhetoric Approach

Dec 2015 | Lucio Cañete, Hernán Diaz, Felisa Córdova, et al.

This paper explores the question of how persuasive a person is (here, a professor), depending on his/her rhetoric. Since persuasion is an act for amending the mind, a model to describe this intellectual entity in students consists of seven categories of elements in it: Quality, Quantity, Space, Time, Causality, Purpose and Law. According to the emphasis that the...

Individual Differences in the Order/Chaos Balance of the Brain Self-Organization

Dec 2015 | Hernán Díaz, Fernando Maureira, Elías Cohen, et al.

We used introductory arguments from fractal geometry and fractal dimension as a framework for understanding dynamical and complex biological systems, and then introduced Hurst exponent estimation of the chaos/no-chaos balance trend to explore the phenomenology and the information content of EEG data through time. We searched for measure proxy dynamical variables as potential...

On the Estimation for the Weibull Distribution

Nov 2015 | M. Alizadeh, S. Rezaei, S. F. Bagheri

Here, we consider estimation of the pdf and the CDF of the Weibull distribution. The following estimators are considered: uniformly minimum variance unbiased, maximum likelihood (ML), percentile, least squares and weighted least squares. Analytical expressions are derived for the bias and the mean squared error. Simulation studies and real data applications show that the ML...
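
For orientation, the two functions being estimated have standard closed forms. The sketch below writes out the pdf and CDF of the two-parameter Weibull distribution. The parameter names `shape` and `scale` and the function names are illustrative assumptions, and none of the paper's estimators are reproduced here.

```python
import math


def weibull_pdf(x, shape, scale):
    """pdf: (k/lam) * (x/lam)**(k-1) * exp(-(x/lam)**k) for x > 0."""
    if x <= 0.0:
        # Sketch convention: the density is reported as 0 outside x > 0.
        return 0.0
    z = x / scale
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))


def weibull_cdf(x, shape, scale):
    """CDF: 1 - exp(-(x/lam)**k) for x > 0, and 0 otherwise."""
    if x <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


if __name__ == "__main__":
    k, lam = 1.5, 2.0
    # At x equal to the scale parameter the CDF is 1 - exp(-1) for any shape.
    print(weibull_cdf(lam, k, lam), 1.0 - math.exp(-1.0))
    print(weibull_pdf(1.0, k, lam))
```

A plug-in estimator of either function would substitute parameter estimates for `shape` and `scale` in these expressions.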

Entropy Estimation Using Numerical Methods

Oct 2015 | Hadi Alizadeh Noughabi

Direct integration of the Riemann–Stieltjes integral has been used to compute convolution integrals. This approach has been established to be simple and accurate, with good convergence properties. In this paper, we use some numerical methods to estimate the entropy of a continuous random variable, and some estimators are then introduced. Bounds on the error terms are derived for...

A Fuzzy Trustworthiness System with Probability Presentation Based on Center-of-gravity Method

Sep 2015 | Yu-Bin Zhong, Zeng-Liang Liu, Xue-Hai Yuan

Fuzzy methods are widely used in the study of trustworthiness. Based on this fact, the paper researches the fuzzy trustworthiness system and probability presentation theory based on bounded product implication and Larsen square implication. Firstly, we convert a group of single-input and single-output data into fuzzy inference rules and generate fuzzy relation by selecting the...

Oriental Thinking and Fuzzy Logic, Celebration of the 50th Anniversary of Fuzzy Sets

Sep 2015 | Peizhuang Wang

Mining Fuzzy Association Rules in the Framework of AFS Theory

Sep 2015 | Bo Wang, Xiao-dong Liu, Li-dong Wang

In this paper, we first study the representations and fuzzy logic operations for the fuzzy concepts in real data systems. Second, we propose a new fuzzy association rule mining algorithm in the framework of AFS (Axiomatic Fuzzy Sets) theory. Compared with the current algorithms, the proposed algorithm has two advantages. One is that the membership functions of...

Modular Real-Time Face Detection System

Sep 2015 | Kaiyu Wang, Zhiming Song, Menglin Sheng, et al.

In this paper, a novel modular system architecture for face detection is proposed, and a corresponding face detection method matched to the architecture is described. First of all, the proposed architecture of face detection consists of two modules, namely, the coprocessor module of face detection based on FPGA and target system...

Transform Group of Monotonic Functions with the Same Monotonicity on \([-1, 1]\) and Operations of Fuzzy Numbers

Sep 2015 | Si-cong Guo, Ying Zhao

Operations of fuzzy numbers are the main content of fuzzy mathematical analysis. This paper defines the transformation of monotonic bounded functions with the same monotonicity on the symmetric interval \([-1, 1]\), and the four fundamental operations of fuzzy numbers based on the fuzzy structured element. It not only makes operations of fuzzy numbers easier, but also starts a new...

The Research on the Application of Qualitative Mapping in MapReduce

Sep 2015 | Jiali Feng, Guanglin Xu, Xiaolin Xu

MapReduce is a mathematical tool for handling large-scale data sets through parallel and distributed computation. Currently the operations of MapReduce mainly include sorting, grouping and joining, etc. This paper undertakes a research on qualitative mapping and MapReduce, and finds that the solution procedure of qualitative mapping can be a new way of transforming data for...

A Triple Structure of Rough Sets Based on Selection Function

Sep 2015 | Bo Wang, Yong Shi, Yingjie Tian

In contrast to rough sets in Pawlak’s sense, which are structures based on binary approximation operations, we propose in this paper a new rough equivalence relation based on triple approximation operations induced by a selection function. As in traditional rough sets research, we consider the algebraic issues of the new rough sets system and construct a lattice structure in an...

A Comprehensive Survey of Clustering Algorithms

Aug 2015 | Dongkuan Xu, Yingjie Tian

Data analysis is a common method in modern scientific research, spanning communication science, computer science and biology. Clustering, as a basic component of data analysis, plays a significant role. On one hand, many tools for cluster analysis have been created as information has grown and disciplines have intersected. On the other hand, each...

Identifying High-Number-Cluster Structures in RFID Ski Lift Gates Entrance Data

Jul 2015 | Boris Delibašić, Zoran Obradović

In this paper we identify skier groups in data from RFID ski lift gate entrances. The ski lift gate entrances are real-life data covering a 5-year period from the largest Serbian skiing resort, which has a ski lift capacity of 32,000 skiers per hour. We utilize three representative algorithms from the three most widely used clustering algorithm families (representative-based, hierarchical...
