PROGRAM PROCEDURES FOR TRAINING A RECOGNITION SYSTEM FOR THE DIFFERENTIAL DIAGNOSIS OF PATIENTS BASED ON HETEROGENEOUS SYMPTOM COMPLEXES
ISSN (p) 0321-2211, ISSN (e) 2663-3450
Automation and Intellectualization of Instrument Engineering
DOI: 10.20535/1970.67(1).2024.306735
UDC 616.6:004.67
Shulyak O. P., Druzhynin V. V.
National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Kyiv, Ukraine
The paper considers a machine-learning recognition system for the differential diagnosis of patients based on
heterogeneous complexes of nephrological parameters obtained by instrumental means of examination. Training
uses empirical statistics of clinical cases from a database with reliable diagnoses. The purpose is to expand the
capabilities for extracting information from such databases for training recognition procedures by enriching the
toolkit with new features that capture characteristic aspects of the extracted information.
The object of the research is the mathematical and software toolkit for training procedures that recognize
differential diagnoses of patients from statistics of reliably diagnosed clinical cases. The subject of the study is the
software procedures that, during training, form models of the incidence of parameter complexes along the scales of
their values, and the procedures that use these models in diagnostics. Obtaining these models is regarded as the main
content of the training process that ensures the differentiation of diagnoses. A criterion for making preferential
diagnostic decisions with such models is proposed. To simplify the development of the mathematical and software
procedures, heterogeneous symptom complexes are normalized and transformed to the [0; 1] scale.
The introduction notes the wide prevalence, in medicine and related fields, of databases containing statistics of
medical and biomedical data on the parameters and characteristics of human organs and systems in different
conditions, together with their medical interpretation, and the use of such databases for various purposes, often
related to patient diagnostics. The problems of forming and using them are outlined with reference to real databases.
One factor that complicates the development of diagnostic hardware and software is the substantial heterogeneity of
the parameters determined by patient examination instruments.
Keywords: patient diagnosis; heterogeneous symptom complexes; parameter normalization; parameter
distribution models; decision accumulation criterion.
Introduction
In both the theory and practice of medicine and related
fields, open- and closed-access databases of medical and
biomedical data [1 – 3], diverse in their medical
specialization and purpose, have become firmly
established and continue to spread. Extracting and using
the information [1 – 3, 4] accumulated in such databases,
for purposes ranging from the professional training of
specialists [1, 3, 5, 6] to solving practical tasks in
medicine and related areas [1 – 4, 6 – 15], is gaining
relevance and importance. There remains a demand for
software and hardware tools that obtain the necessary
information from such databases in different sectors of
the subject area [1, 3, 5, 16 – 18], including simple
specialized modules in software and hardware
implementations [1, 3, 19 – 21].
Each toolkit for extracting information from such
databases, and each tool that uses it to solve its own
tasks, has its own characteristics, emphases, and
effectiveness in extracting and using the information
[1 – 3, 9, 15, 21 – 26, 28] contained in the existing data.
Also of interest are the peculiarities of how such tools
implement components of the accumulated experience of
empirical observation of objects, processes, and
phenomena [1 – 4, 7, 22, 23, 26, 28]. There is probably
no universal toolkit for these purposes. Each new
development can therefore be seen as providing data
processing tools that complement the existing toolkit and
may prove sufficiently effective, which needs to be
verified [1, 2, 5, 6, 21, 24, 26, 28]. In this sense, such
research and development remain relevant.
One of the obstacles to developing the simple
specialized software and hardware data processing
toolkit mentioned above is the heterogeneity of the
parameter complexes [1 – 6, 9, 10, 18, 22, 29, 30]
collected in databases with descriptions of clinical cases.
This complication can be overcome by the simple
uniform linear data transformations [41, 42] considered
in this work.
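A uniform linear transformation of this kind can be illustrated with a minimal sketch. It assumes min-max scaling, x' = (x - min) / (max - min), which is one common way to bring parameters with different units onto the [0; 1] scale named in the abstract; the exact transformations of [41, 42] are not reproduced here. The function names `fit_ranges` and `normalize`, the parameter names, and the sample values are hypothetical and serve only as an illustration.

```python
from typing import Dict, List, Tuple

Case = Dict[str, float]
Ranges = Dict[str, Tuple[float, float]]


def fit_ranges(cases: List[Case]) -> Ranges:
    """Collect the (min, max) of every parameter over the training cases."""
    ranges: Ranges = {}
    for case in cases:
        for name, value in case.items():
            lo, hi = ranges.get(name, (value, value))
            ranges[name] = (min(lo, value), max(hi, value))
    return ranges


def normalize(case: Case, ranges: Ranges) -> Case:
    """Map each parameter linearly onto [0, 1] using the training ranges."""
    scaled: Case = {}
    for name, value in case.items():
        lo, hi = ranges[name]
        if hi == lo:
            # Assumption: a parameter that never varies is mapped to 0.
            scaled[name] = 0.0
        else:
            # Clip so that new cases outside the training range stay on the scale.
            scaled[name] = min(1.0, max(0.0, (value - lo) / (hi - lo)))
    return scaled


# Hypothetical parameters in different units (not taken from the paper's database).
training_cases = [
    {"creatinine_umol_l": 60.0, "age_years": 20.0},
    {"creatinine_umol_l": 110.0, "age_years": 60.0},
    {"creatinine_umol_l": 160.0, "age_years": 80.0},
]
ranges = fit_ranges(training_cases)
normalized_cases = [normalize(case, ranges) for case in training_cases]
```

After this step every parameter, whatever its physical nature, lies on the same scale, so later procedures can treat all parameters uniformly. Clipping values outside the training range is a design choice of this sketch, not something stated in the source.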
One of the main reasons for the heterogeneity of
these databases is that they are often collections of
clinical case descriptions from medical practice
containing the results of instrumental patient
Вісник КПІ. Серія ПРИЛАДОБУДУВАННЯ, Вип. 67(1), 2024.
examinations [1 – 3, 5 – 7, 12, 15] or the results of
purposeful statistical studies [1 – 3, 10]. Such studies
address the impact and consequences of professional,
climatic, and other conditions on human life processes
[1, 2]; the dynamics of processes and phenomena in the
body; the relationships between the past, present, and
future states of organs and systems at different levels in
the body [1, 2, 4, 9, 10, 12 – 14, 19, 22]; the
identification of influencing factors [1, 7], risk factors
[1, 22], and chances of favorable outcomes [1, 18]; and
the determination of characteristic regional features
[1, 9] of population health provision. This breadth of
purposes explains the heterogeneity of the parameter
complexes in databases.
Such databases contain real factual material of
various physical nature and of different levels of accuracy
and reliability [1, 5, 6, 15, 28]. The material is obtained
empirically, in particular with software and hardware
complexes of various purposes (including medical) and
complexity, with unique and widely used means of
patient examination, with means of studying metabolic
processes, the products of human life activity, and
reactions to various influences, and with tools for
studying food products and water and for determining
environmental parameters and the properties of
biomedical materials [31 – 38]. Data may be collected
during patient observation in the course of dispensary
examination, prevention and treatment, medical
examinations, professional selection, and surveys, and
when the examined population is categorized by gender,
age, working conditions, lifestyle, risk group, health
level group, and other characteristics as part of a
comprehensive characterization [31 – 38]. This
increases the diversity and heterogeneity of information
in the obtained similar numerous parameter complexes in
dat (...truncated)