# Annals of Data Science

## List of Papers (Total 64)

#### Modelling Under-Five Mortality among Hospitalized Pneumonia Patients in Hawassa City, Ethiopia: A Cross-Classified Multilevel Analysis

Community-acquired pneumonia refers to pneumonia acquired outside of hospitals or extended health facilities, and it is a leading infectious disease. This study aims to model mortality among hospitalized pneumonia patients under five years of age and to investigate potential risk factors associated with child mortality due to pneumonia. The study was a retrospective study on 305 sampled ...

#### Modeling Determinants of Time-To-Death in Premature Infants Admitted to Neonatal Intensive Care Unit in Jimma University Specialized Hospital

Preterm birth is the term used to define births that occur before 37 completed weeks or 259 days of gestation. The aim of this study is to model the survival probability of premature infants who were under follow-up and to identify significant risk factors for mortality. Recorded hospital data were obtained for a cohort of 490 infants at Jimma University Specialized Hospital, Ethiopia. ...

#### Joint Modeling of Longitudinal CD4 Count and Weight Measurements of HIV/Tuberculosis Co-infected Patients at Jimma University Specialized Hospital

Once HIV/TB co-infected patients begin follow-up visits, it is common to measure weight and CD4 count repeatedly over time to determine their health status. In most cases, separate linear mixed models of weight and CD4 count cannot capture the association between the two outcomes, whereas joint modeling with a multivariate linear mixed model can. Thus, this study was an attempt to model ...

#### Exploring Big Data Analysis: Fundamental Scientific Problems

Although Big Data has been one of the most popular topics over the last several years, how to conduct Big Data analysis effectively remains a major challenge for every field. This paper tries to address some fundamental scientific problems in Big Data analysis, such as opportunities, challenges, and difficulties encountered in the analysis. The challenges arise from multiple domains that include ...

#### An Efficient Variable Selection Method for Predictive Discriminant Analysis

Selecting a subset of relevant predictor variables is a common preprocessing step before training a predictive model: it simplifies the model, shortens training time, and improves generalization by reducing overfitting. In predictive discriminant analysis, the use of classic variable selection methods as a preprocessing ...

#### The Information Content of OVX for Crude Oil Returns Analysis and Risk Measurement: Evidence from the Kalman Filter Model

The crude oil volatility index (OVX) is a new index that the Chicago Board Options Exchange has published since 2007. In recent years it has emerged as an important alternative measure for tracking and analyzing the volatility of future oil prices. In this paper we first model and analyze the dynamic relationship between OVX changes and future crude oil price returns with time-varying coefficients, ...

#### Segmentation of Chinese Urban Real Estate Market: A Demand-Supply Distribution Perspective

This study proposes a new perspective on the analysis of the regional features of the real estate market and explores a more reliable segmentation method for the Chinese urban real estate market, based on the optimization of supply-demand resource distribution. A two-stage clustering procedure is proposed, based on supply and demand elements and on market performance, respectively. And six ...

#### Informational Energy and Its Application in Testing Normality

In this article, we propose a test of fit for normality based on the estimated Informational Energy and using m-step spacings. Consistency of the test statistic is established. Critical values and power values of the test against various alternatives are calculated. Finally, the power values of the proposed test are compared with the power values of some prominent normality tests.
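As a rough illustration of the quantity involved, the sketch below estimates the informational energy, taken here as the integral of the squared density, from m-step spacings of the ordered sample and compares it with the closed-form value for a normal density, 1/(2σ√π). The spacing estimator, the choice m ≈ √n, and all function names are assumptions made for this example; the paper's actual test statistic and critical values are not reproduced here.

```python
import math
import random


def informational_energy_spacing(sample, m):
    """Spacing-based estimate of the integral of f(x)^2 (assumed estimator).

    The density at the i-th order statistic is approximated by
    2m / (n * (x_(i+m) - x_(i-m))), with indices clamped to the sample range,
    and the informational energy is the average of these density estimates.
    """
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for i in range(n):
        lower = xs[max(i - m, 0)]
        upper = xs[min(i + m, n - 1)]
        total += 2.0 * m / (n * (upper - lower))
    return total / n


def normal_informational_energy(sigma):
    """Closed-form integral of the squared N(mu, sigma^2) density."""
    return 1.0 / (2.0 * sigma * math.sqrt(math.pi))


rng = random.Random(0)  # fixed seed so the run is reproducible
sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]
m = round(math.sqrt(len(sample)))  # window size: a common heuristic, assumed here

mean = sum(sample) / len(sample)
sigma_hat = math.sqrt(sum((x - mean) ** 2 for x in sample) / (len(sample) - 1))

estimate = informational_energy_spacing(sample, m)
reference = normal_informational_energy(sigma_hat)
print(f"spacing estimate: {estimate:.4f}, normal reference: {reference:.4f}")
```

For data that really are normal, the two numbers should be close; a test of fit would turn their discrepancy into a statistic and compare it with simulated critical values.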

#### Goal-Programming-Based Procedure for Calculating Positive Multipliers Under a Multiple Criteria Data Envelopment Analysis Framework: An Application to UEFA EURO 2012

One of the motivations for the rise of the multiple criteria data envelopment analysis (MCDEA) model was the need to yield more reasonable input-output multipliers than those derived from standard data envelopment analysis (DEA), without using a priori information. The problem of unreasonable multipliers occurs when some production units are efficient in standard DEA simply because ...

#### How to Measure Rhetorical Impact of Teaching and their Levels of Persuasion: A Neuro-rhetoric Approach

This paper explores the question of how persuasive a person (in our case, a professor) is, depending on his/her rhetoric. Since persuasion is an act of amending the mind, a model describing this intellectual entity in students consists of seven categories of elements: Quality, Quantity, Space, Time, Causality, Purpose and Law. According to the emphasis that the ...

#### Individual Differences in the Order/Chaos Balance of the Brain Self-Organization

We used an introductory discussion of fractal geometry and fractal dimension as a framework for understanding dynamical and complex biological systems, and then introduced Hurst exponent estimation of the chaos/no-chaos balance trend to explore the phenomenology and the information content of EEG data through time. We searched for measure proxy dynamical variables as potential ...

#### On the Estimation for the Weibull Distribution

Here, we consider estimation of the PDF and the CDF of the Weibull distribution. The following estimators are considered: uniformly minimum variance unbiased, maximum likelihood (ML), percentile, least squares and weighted least squares. Analytical expressions are derived for the bias and the mean squared error. Simulation studies and real data applications show that the ML estimator ...

#### A Fuzzy Trustworthiness System with Probability Presentation Based on Center-of-gravity Method

Fuzzy methods are widely used in the study of trustworthiness. Based on this fact, this paper studies the fuzzy trustworthiness system and probability presentation theory based on bounded product implication and Larsen square implication. First, we convert a group of single-input and single-output data into fuzzy inference rules and generate a fuzzy relation by selecting the ...

#### Mining Fuzzy Association Rules in the Framework of AFS Theory

In this paper, we first study the representations and fuzzy logic operations for the fuzzy concepts in real data systems. Second, we propose a new fuzzy association rule mining algorithm in the framework of AFS (Axiomatic Fuzzy Sets) theory. Compared with current algorithms, the proposed algorithm has two advantages. One is that the membership functions of the ...

#### Modular Real-Time Face Detection System

In this paper, a novel modular system architecture for face detection is proposed, and a corresponding face detection method matched to the architecture is described. First, the proposed architecture consists of two modules, namely the FPGA-based face detection coprocessor module and the target system module, ...

#### Feature Selection for Multi-Class Imbalanced Data Sets Based on Genetic Algorithm

This paper presents an improved genetic-algorithm-based feature selection method for multi-class imbalanced data. The method improves the fitness function by using the evaluation criterion EG-mean instead of the global classification accuracy, in order to choose features that are favorable for recognizing the minority classes. The method is evaluated using several benchmark ...

#### A Neighborhood-based Matrix Factorization Technique for Recommendation

Data sparsity and prediction quality are recognized as the key challenges in existing recommender systems. Most existing recommender systems depend on the collaborative filtering (CF) method, which mainly leverages the user-item rating matrix representing the relationship between users and items. However, the CF-based method sometimes fails to provide accurate information ...

#### Transform Group of Monotonic Functions with the Same Monotonicity on $[-1, 1]$ and Operations of Fuzzy Numbers

Operations on fuzzy numbers are the main content of fuzzy mathematical analysis. This paper defines the transformation of monotonic bounded functions with the same monotonicity on the symmetric interval $[-1, 1]$, and the four fundamental operations of fuzzy numbers based on the fuzzy structured element. This not only makes operations on fuzzy numbers easier, but also starts a new ...

#### The Research on the Application of Qualitative Mapping in MapReduce

MapReduce is a mathematical tool for handling large-scale data sets through parallel and distributed computation. Currently, the operations of MapReduce mainly include sorting, grouping, joining, etc. This paper investigates qualitative mapping and MapReduce, and finds that the solution procedure of qualitative mapping can be a new way of transforming data for ...
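For readers unfamiliar with the map, group and reduce stages mentioned above, a minimal single-process sketch is given below. It only illustrates the general pattern on a word-count example; the function names and the toy data are assumptions for this illustration, and it does not implement the paper's qualitative mapping or any real distributed framework.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and stream out (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle_phase(pairs):
    """Group values by key and sort the keys, mimicking the sort/group step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(values) for key, values in groups.items()}


def word_count_mapper(line):
    """Hypothetical mapper: emit (word, 1) for every word in a line."""
    for word in line.lower().split():
        yield word, 1


lines = ["big data analysis", "big data tools", "data"]  # toy input
counts = reduce_phase(shuffle_phase(map_phase(lines, word_count_mapper)), sum)
print(counts)
```

In a real deployment the map and reduce calls run on many machines and the shuffle moves data between them; here everything runs in one process so the data flow is easy to follow.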

#### A Triple Structure of Rough Sets Based on Selection Function

Unlike rough sets in Pawlak’s sense, which form a structure based on binary approximation operations, this paper proposes a new rough equivalence relation based on triple approximation operations induced by a selection function. As in traditional rough set research, we consider the algebraic properties of the new rough set system and construct a lattice structure in an algebraic ...

#### Multilevel Modeling of the Progression of HIV/AIDS Disease Among Patients Under HAART Treatment

Human immunodeficiency virus (HIV) causes an incurable disease, acquired immunodeficiency syndrome (AIDS). After a person is infected with the virus, it gradually destroys the infection-fighting cells called CD4 cells and makes the individual susceptible to opportunistic infections, which cause severe or fatal health problems. The most effective treatment for the disease is the ...

#### Entropy Estimation Using Numerical Methods

Direct integration of the Riemann–Stieltjes integral has been used to compute convolution integrals. This approach has been shown to be simple and accurate, with good convergence properties. In this paper, we use numerical methods to estimate the entropy of a continuous random variable and then introduce some estimators. Bounds on the error terms are derived for ...

#### ELES-Model Based Housing Affordability Comparative Research of Urban Households in Beijing Between 2004 and 2014

In this paper, we apply the extended linear expenditure system model to measure the housing affordability of urban households in Beijing, and calculate the affordable housing areas of urban households at different income levels in Beijing in 2004 and 2014. We find that the disposable incomes of urban households in Beijing increase along with ...