The Distribution of Online Healthcare Information:
A Case Study on Melanoma
Suresh K. Bhavnani
School of Information, University of Michigan, Ann Arbor MI, 48109-1092
To understand the difficulties users face when
retrieving comprehensive healthcare information, this
paper analyzes how facts related to a widely available
healthcare topic are distributed across high-quality
webpages. An inter-rater experiment with two skin cancer physicians helped identify 14 facts necessary
for a comprehensive understanding of melanoma risk
and prevention. A second inter-rater experiment
analyzed how those facts were distributed across 189
relevant webpages from high-quality sites. The
analysis revealed that the distribution of facts is
highly skewed: a few pages have many facts,
many pages have a few facts, and no single page or
site provides all the facts. A more detailed analysis
suggests that the distribution is being caused by a
trade-off between depth and breadth, leading to the
existence of general, specialized, and sparse pages.
Furthermore, the analyses reveal patterns and
complexities in the relationships between facts, pages,
and websites. These distribution results pinpoint the
difficulties faced by searchers, and provide insights
for the design of future systems that guide users in
retrieving comprehensive healthcare information.
INTRODUCTION
A synergistic relationship between healthcare
organizations and the rapidly growing number of
healthcare information seekers [1] has resulted in the
development of huge repositories of healthcare
information. For example, the National Cancer
Institute’s (NCI) website currently provides
information related to 118 different cancers,
distributed across hundreds of pages. Given such vast
resources, one might expect that users could obtain
comprehensive information about a healthcare topic
by visiting one webpage, or even one large website
like NCI. However, this expectation runs counter to the conclusions
reached by many information scientists. These
scientists have argued that as the number of
information sources about a specific topic increases,
the information across the sources follows a power-law distribution [e.g. 2], where a few sources have a
lot of information about the topic, and a large number
of sources have very little information. Such a
distribution can make the retrieval of complete
information about a topic a difficult, if not
impossible, task [3].
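As a hedged illustration of the shape described above, a power law over sources can be written as follows. The symbols are assumptions introduced here for illustration and are not notation taken from the cited work.

```latex
% r        : rank of a source by how much it says about the topic (1 = richest); assumed
% f(r)     : amount of topic information in the source of rank r; assumed
% C, alpha : positive constants that would be fitted to observed data; assumed
f(r) = C \, r^{-\alpha}, \qquad C > 0, \ \alpha > 0
```

With $\alpha = 1$, for example, the source at rank 10 holds one tenth of what the top-ranked source holds, so most sources contribute very little.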
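Because the incomplete retrieval of healthcare
information can have dangerous consequences, we
believe the analysis of how such information is
distributed across sources deserves closer inspection.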
Previous distribution studies of information include
how articles are distributed across journals [4], how
words are distributed within a book [5], and more
recently how incoming web links are distributed
across webpages [6]. However, much less is known
about how facts related to a search topic are
distributed across relevant webpages.
This paper presents two experiments to understand
how facts related to a common healthcare topic are
distributed across relevant webpages in high-quality
sites. In Experiment-1, two skin cancer physicians
independently rated the importance of facts related to
melanoma risk and prevention. The high inter-rater
agreement enabled our research team to identify a set
of facts necessary for a comprehensive understanding
of melanoma risk and prevention at different levels of
importance. In Experiment-2, a different judge rated
the degree of detail with which each fact occurred within 189
relevant pages from high-quality sites. These ratings
were subsequently verified through the ratings of
another independent judge. The analysis of the ratings
revealed the relationships between facts of the same
healthcare topic, between facts across different types
of pages, and between facts, webpages, and websites.
The analysis also helped to pinpoint the complexities
involved in finding accurate and comprehensive
information related to a healthcare topic, and
suggested a distribution-conscious approach to the
development of future search systems.
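The kind of question asked in Experiment-2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis used in the study: the page names, fact IDs, and fact assignments below are invented, and the sketch records only whether a fact is present, not the degree of detail.

```python
from collections import Counter

# Hypothetical data: each page maps to the set of fact IDs found on it.
# Page names and fact assignments are invented for illustration only.
page_facts = {
    "page_a": {1, 2, 3, 4},
    "page_b": {1, 2},
    "page_c": {2},
    "page_d": {3},
    "page_e": {1},
}
all_facts = {1, 2, 3, 4, 5}  # assumed full set of facts for the topic


def facts_per_page(pages):
    """Return {page: number of distinct facts found on that page}."""
    return {page: len(facts) for page, facts in pages.items()}


def pages_by_fact_count(pages):
    """Return a Counter mapping a fact count to the number of pages with that count."""
    return Counter(facts_per_page(pages).values())


def pages_with_all_facts(pages, facts):
    """Return the pages that contain every fact in `facts`."""
    return [page for page, found in pages.items() if facts <= found]


def uncovered_facts(pages, facts):
    """Return the facts that appear on no page at all."""
    return facts - set().union(*pages.values())
```

In this toy data, three of the five pages carry a single fact, one page carries four, and no page carries all five, which mirrors the skew described in the abstract on a very small scale.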
EXPERIMENT-1: IDENTIFICATION OF FACTS
The goal of Experiment-1 was to identify a set of facts
that skin cancer physicians agreed was necessary for a
user to have a comprehensive understanding of
descriptive information related to melanoma risk and
prevention¹ (which will henceforth be referred to as
melanoma risk/prevention).
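The rating step can be sketched as follows. This is a minimal illustration under several assumptions: the ratings are invented, the 1 to 5 scale is assumed, averaging the two judges is one plausible way to derive a final rating, and exact agreement is a simple stand-in for whatever agreement statistic the study used.

```python
from statistics import mean

# Hypothetical importance ratings: fact ID -> (judge 1, judge 2), on an assumed 1-5 scale.
ratings = {
    "fact_a": (5, 5),
    "fact_b": (1, 3),
    "fact_c": (4, 5),
    "fact_d": (5, 5),
}


def final_ratings(pairs):
    """Average the two judges' ratings for each fact (an assumed combination rule)."""
    return {fact: mean(scores) for fact, scores in pairs.items()}


def exact_agreement(pairs):
    """Return the fraction of facts on which both judges gave the same rating."""
    same = sum(1 for first, second in pairs.values() if first == second)
    return same / len(pairs)
```

A final rating per fact then makes it possible to group facts by level of importance, as the experiment's goal requires.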
Our research team chose to focus on the distribution
of melanoma risk/prevention for two reasons: (1)
questions related to this topic were the most frequent
in an empirical study [7] of user questions related to
skin cancer, and (2) research related to this topic is
well known, and guidelines for the general public are
widely available on the Web [8].
¹ In an earlier study [7], skin cancer physicians developed a
hierarchical taxonomy of real-world user questions, where one of
the high-level nodes was risk/prevention, and whose sub-nodes
included descriptive information and statistical information.
AMIA 2003 Symposium Proceedings − Page 81
Facts related to descriptive information for melanoma risk and prevention
Fact   Judge-1 rating   Judge-2 rating   Final rating
1      5                5                5
2      5                5                5
3      5                5                5
4      5                5                5
5      1                3                2
6      5                5                5
7      5                5                5
8      3                1                2
9      4                2                3
10     4                4                4
11     5                5                5
12     1                1                1
13     4                5                4.5
14     5                5                5
15     5                5                5
1. Having fair skin [or type I or II skin; or white skin; or tendency to burn, not tan; or green or
blue eyes, or red or blond hair] increases your risk of getting melanoma [or skin cancer]
2. High UV exposure [or sunburn] increases your risk of getting melanoma [or skin cancer]
3. Having many moles [or more than 50 moles] increases your risk of getting melanoma
4. Having dysplastic nevi [or atypical moles] increases your risk of getting melanoma [or skin
cancer]
5. Having a giant [or >20 cm] congenital mole [or mole present at birth] increases your risk of
getting melanoma [or skin cancer] [must mention "giant" and "congenital" or "mole present at
birth"]
6. Having a family history of melanoma [or members of your family who have had melanoma]
increases your risk of getting melanoma [or skin cancer]
7. Having a personal history of melanoma increases your risk of getting melanoma [or skin
cancer]
8. Having a weakened immune system [or immune deficiencies] increases your risk of getting
melanoma [or skin cancer]
9. Having Xeroderma Pigmentosum increases your risk of getting melanoma [or skin cancer]
10. Calculate your personal risk of getting melanoma (source of calculator is provided)
11. Wearing protective clothing can help to prevent melanoma
12. Wearing UV-protective sunglasses can help to prevent melanoma
13. Wearing sunscreen can help to prevent melanoma
14. Avoiding UV rays [or avoiding peak sunlight hours; or seeking shade] can help to prevent
melanoma
15. Examining your body for suspicious moles [or changing moles, or itching moles, or moles that
match the ABCDs] can help to prevent melanoma from spreading
Figure 1. Fifteen fa (...truncated)