Factor Space, the Theoretical Base of Data Science
Ann. Data. Sci. (2014) 1(2):233–251
DOI 10.1007/s40745-014-0017-5
Pei-Zhuang Wang · Zeng-Liang Liu · Yong Shi ·
Si-Cong Guo
Received: 1 July 2014 / Revised: 15 August 2014 / Accepted: 10 September 2014 /
Published online: 28 October 2014
© Springer-Verlag Berlin Heidelberg 2014
Abstract This paper introduces factor space theory, which provides a general coordinate system to describe the real world and a theoretical base for data science. Based
on this theory, factorial databases are presented, which carry a new kind of statistics
for the intelligent analysis of the coming tide of Big Data.
Keywords Factor space · Factorial databases · Background relation ·
Factorial neural networks · Factor vane · Sample cultivation · Information fusion
Mathematics Subject Classification
90C05
1 Introduction
Big Data is leading the current tide, and a variety of buzzwords dazzle people with
both delight and confusion. However, we are acutely aware that the core task in the
P.-Z. Wang (B) · S.-C. Guo
College of Intelligence Engineering and Mathematics, Liaoning Technical University, Fuxin 123000,
Liaoning, China
e-mail:
Z.-L. Liu
National Defense University PLA China, Beijing 100091, China
e-mail:
Y. Shi
Research Center of Fictitious Economy and Data Science, Chinese Academy of Science,
Beijing 100080, China
e-mail:
S.-C. Guo
e-mail:
tide is to promote intelligence in Big Data. As the journal’s preface emphasized
[1], data science ‘should have its own scientific contents such as axioms, laws and
rules, which are fundamentally important for experts in different fields to explore their
own interests from Big Data’. Even though there are remarkable achievements in
this area, data science still lacks a theoretical base for intelligence. As Tsien Hsueshen
emphasized [2], ‘To develop intelligent engineering, the most important task is building
the mathematical theory towards intelligence!’ This paper introduces factor
space, which provides a general coordinate system for describing things in the
world and thereby serves as a mathematical base for data science.
Factor space [3] was, coincidentally, published in the same year as formal concept analysis [4] and rough sets [5]. The three branches were pioneers in the mathematics of intelligence, but the first of them focused for several years on the genetic analysis of
uncertainty.
Factor space is a bridge connecting randomness and certainty. Each end can be
transformed into the other as the dimension of the factors varies [6].
Based on this idea, intentionally or unintentionally, Kolmogorov presented the fundamental space Ω, a factor space, in the axiomatic definition of a probability field. He
placed randomness within a framework of certainty and so advanced mathematics toward
random phenomena. Without the idea of factor space, probability theory could not have
been modernized in the 1930s.
Factor space is also a bridge connecting fuzziness and certainty. This bridge and the
bridge mentioned above show a duality: fuzziness on the ground, the universe
U, can be viewed as randomness in the sky, the power set P(U) of U. Based on the
idea of factor space, Wang presented the theory of Fuzzy Shadows [7], which treats a fuzzy
set as the covering function of a random set. This provides a firm base for fuzzy set
theory and has been applied in fuzzy controllers [8] and several other areas. As a summary,
the book “Fuzzy System Theory and Fuzzy Computer” [9] was published in 1997.
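The covering-function idea can be illustrated with a minimal sketch. It rests on stated assumptions and is not the construction given in [7]: if a random set is observed through a finite sample of crisp subsets of U, the membership degree of a point u may be estimated as the fraction of sampled subsets that cover u. The universe, the sampled subsets, and the name covering_function are hypothetical.

```python
def covering_function(sample_sets, universe):
    """Estimate mu(u) as the fraction of sampled crisp sets that contain u."""
    n = len(sample_sets)
    return {u: sum(u in s for s in sample_sets) / n for u in universe}


# Hypothetical universe U: body heights in cm.
universe = [150, 160, 170, 180, 190]

# Hypothetical realizations of a random set: the heights that each of
# four respondents would call "tall".
sample_sets = [
    {180, 190},
    {170, 180, 190},
    {190},
    {180, 190},
]

mu_tall = covering_function(sample_sets, universe)
for u in universe:
    print(u, mu_tall[u])
```

Each sampled subset is crisp, yet the resulting membership function is graded, which is the sense in which fuzziness on U is viewed as randomness on P(U).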
The development of factor space in intelligence research can be traced through the books
‘Mathematical Theory of Knowledge Representation’ [10], ‘Theory and Applications of
Factorial Neural Networks’ [11], ‘Attribute Method in Thinking and Intelligence Science’ [12],
etc. Some representative papers can be found in [13–27].
Factor space shares common goals with formal concept analysis and rough sets. The authors
of this paper emphasize the importance of the background relation (called the formal
context by R. Wille) and study this relation in depth. Factor space provides a
population theory for the information systems of rough sets. All branches will cooperate
to establish firm mathematical bases for data science.
The paper is organized as follows: Sect. 2 introduces factors and factor space; Sect. 3
treats knowledge representation; Sect. 4 presents factorial databases; Sect. 5 gives a
discussion, including a brief conclusion and the main tasks ahead.
Owing to time constraints, all proofs of propositions are omitted.
2 What Are Factors and Factor Space?
The gene is the key to biology: it forms, generates, and identifies all living things.
Likewise, there exists a key that opens the door to recognizing all things in the universe, which is
Fig. 1 Factor state space
the generalized gene; we call it the factor. The gene was originally called the Mendelian
factor. A factor is a fact-or, where ‘fact’ stands for any thing and ‘-or’ is
that which describes, determines, and identifies all things. The Chinese translation
of factor, YINSU, fits this meaning closely, so factor is the best name for
the generalized gene.
A gene is like a switch with two or more states, plugged into a node on a chromosome;
each state determines a biological property or quality. The gene is the quality-root of living
beings. A factor likewise switches among a series of states. For example, Color is a factor that
switches among three basic states: Red, Yellow and Blue. A factor is the quality-root of things.
From a mathematical point of view, a factor f is essentially a mapping defined
on a domain O; every object is mapped into X_f, the range of f. A state in X_f can be
an attribute, a feature, a characteristic, a degree, etc.; together the states form a dimension with
respect to f. A factor is the name of the dimension, which hooks a series of attributes.
A factor is the attribute of attributes.
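The view of a factor as a mapping can be made concrete with a small sketch. The class name Factor, the example objects, and their colors are illustrative assumptions, not part of the original text. The sketch shows a factor as a named mapping from a domain O into its set of states X_f, and how each state "hooks" the objects that share it.

```python
from collections import defaultdict


class Factor:
    """A named mapping f: O -> X_f (hypothetical helper for illustration)."""

    def __init__(self, name, mapping):
        self.name = name
        self.mapping = dict(mapping)

    def __call__(self, obj):
        # f(obj): the state of the object under this factor.
        return self.mapping[obj]

    @property
    def domain(self):
        # O: the objects on which the factor is defined.
        return set(self.mapping)

    @property
    def states(self):
        # X_f: the states actually taken by objects in the domain.
        return set(self.mapping.values())

    def classes(self):
        # Group the objects by state: each attribute collects its objects.
        groups = defaultdict(set)
        for obj, state in self.mapping.items():
            groups[state].add(obj)
        return dict(groups)


# Hypothetical objects and colors.
color = Factor("Color", {
    "apple": "Red",
    "cherry": "Red",
    "banana": "Yellow",
    "blueberry": "Blue",
})

print(color("apple"))
print(sorted(color.states))
print(color.classes())
```

Here "Red" is an attribute, while Color is the name of the dimension on which Red, Yellow and Blue lie.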
Any concrete object has complex qualities, which cannot be recognized except by taking
a ‘photo’ from a specified angle or aspect. A factor is the angle of analysis. Without factors,
no analysis can be made.
An object can be analyzed by many factors to yield a record, for example, Height
(John) = 1.75 m, Weight (John) = 70 kg, Age (John) = 25, Sex (John) = Male, etc. Synthesis follows analysis: the Cartesian product of those dimensions forms the factor state space, a coordinate system whose dimensions
are named by factors. John is mapped to a point in this coordinate system. Factor
space provides a general coordinate system to describe all things in the universe (see
Fig. 1).
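The synthesis step can be sketched as follows. John's record is taken from the text; the function name to_point, the factor AgeGroup, and the finite ranges used for the enumeration are hypothetical. The sketch maps a record to a point whose coordinates are ordered by factor, and enumerates a small state space as a Cartesian product.

```python
from itertools import product

# Factors name the dimensions of the coordinate system.
factors = ("Height", "Weight", "Age", "Sex")

# John's record, as given in the text (Height in m, Weight in kg).
john = {"Height": 1.75, "Weight": 70, "Age": 25, "Sex": "Male"}


def to_point(record, factors):
    """Map a record to a point: one coordinate per factor, in a fixed order."""
    return tuple(record[f] for f in factors)


point = to_point(john, factors)
print(point)

# With finite ranges (hypothetical), the state space can be enumerated
# explicitly as the Cartesian product of the ranges.
finite_ranges = {
    "Sex": ["Male", "Female"],
    "AgeGroup": ["young", "middle", "old"],
}
state_space = list(product(*finite_ranges.values()))
print(len(state_space))
```

Fixing the order of the factors is what turns a bag of measurements into coordinates; the same record read with a different tuple of factors gives a point in a different factor space.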
Not only does factor space extend the field of vision, it also brings flexibility
to the coordinate system: for a given task, factor space reduces its dimension as
far as possible. To do so, we need to introduce some operations on factors. For example,
Color, Aroma, Taste are three simple factors in foods, and Color–aroma, Color–
taste, Aroma–taste, Color–aroma–ta (...truncated)