Open Access to Scientific Data: Promoting Science and Innovation
Data Science Journal, Volume 6, Open Data Issue, 17 June 2007
OPEN ACCESS TO SCIENTIFIC DATA: PROMOTING SCIENCE AND
INNOVATION
Guan-Hua Xu*
* Minister, Ministry of Science and Technology, Beijing, the People’s Republic of China.
ABSTRACT
The Ministry of Science and Technology launched the Scientific Data Sharing Program in 2002 as an important part of
China’s science and technology infrastructure platform. Twenty-four government agencies now
participate in the Program. After five years of hard work, great progress has been achieved in the policy and
legal framework, data standards, pilot projects, and international cooperation.
By the end of 2005, one-third
of the existing public-interest and basic scientific databases in China had been integrated and upgraded. By
2020, China is expected to build a more user-friendly scientific data management and sharing system, with 80
percent of scientific data available to the general public. To realize this objective, the Program will focus on
refining the policy and legislation system, improving the quality of data resources, establishing and expanding
national scientific data centers, and strengthening international cooperation. It is believed that with the
opening up of access to scientific data in China, the Program will play a bigger role in promoting science and
national innovation.
KEYWORDS: Scientific data, Data sharing, Science policy, Research infrastructure, National innovation
1 INTRODUCTION
As economic globalization accelerates, the global flow and allocation of essential factors of production, especially
capital, information, technology, and talent, are more extensive than ever before. Technological advancement and
innovation are becoming the main ways of improving the overall strength and core competitiveness of each nation.
Reliance on science and technology to realize the sustainable use of
resources and to promote the harmonious development between humans and nature is already a universal
strategic goal.
In the face of many challenges and opportunities, in early 2006 the Chinese State Council released the National
Guidelines for Medium- and Long-term Plans for Science and Technology Development (2006-2020). The
ultimate goal is to transform China into one of the world’s innovative countries. The Guidelines set a target of
raising China’s research and development expenditure to 2.5 percent of GDP or above, with the contribution rate of
science and technology advancement reaching 60 percent. One of the strategic elements of the
Guidelines is to construct the science and technology infrastructure platform, which is fundamental to building
China’s capacity in scientific and technological research.
Scientific data sharing is one of the core elements in these plans. The Ministry of Science and Technology
(MoST) gives high priority to scientific data sharing and therefore launched the Scientific Data Sharing Program
(SDSP) in 2002.
2 THE OVERALL CONCEPT OF PROMOTING SCIENTIFIC DATA SHARING IN CHINA
Based on the principles, objectives and tasks defined by the Construction Outline of the National Science and
Technology Infrastructure Platform, MoST has established the overall approach for promoting scientific data sharing
in China. The 2006 State Council Guidelines promote the integration of scientific data resources generated and
accumulated by national research projects, with a focus on public welfare and basic science, to make them more
open and accessible based on the requirements of scientific and technological innovation. The defining
principles and priorities include overall planning and resource sharing, cooperative development of unified
standards, demand-oriented activities and security guarantees. Several pilot projects are being initiated in
these areas.
By 2020, China is expected to achieve several key objectives through the implementation of SDSP. It will
establish a networked scientific data management mechanism and a data sharing service system with an
effective structure and broad coverage of most basic science and public-welfare domains. It will establish data
policies, regulations and standards, and implement an operational sharing mechanism. The Program will
develop a technology-oriented service team with appropriate professional representation and the ability to adapt
to social needs in the application of information. It will open up access to over 80 percent of the public-welfare
and basic science data resources. Overall, the Program will ensure that the accumulation and sharing of scientific
data resources support the basic requirements of innovation and ultimately promote economic and social
development.
As early as 2010, China is expected to build a data management and sharing service system with a three-tier
structure consisting of 40 scientific data centers or networks, 300 master databases and one portal. This system
will cover six major fields: natural resources and environment, agriculture, population and health, basic and
frontier sciences, engineering and technology, and regional scientific and technical research.
The SDSP was initiated to function as a catalyst, to integrate publicly funded data resources with a view to
leveraging all possible data resources from the government to the private sector, and to make them available to the
general public. Figure 1 illustrates the Program’s three-tiered structure as outlined above.
Figure 1. Structure of the Scientific Data Sharing Program
China’s progress in scientific data sharing can be tracked through the development of its policy and legal
framework, data standards, pilot projects and international cooperation. Open access to scientific data cannot be
achieved without an implementing policy and legal framework. MoST initiated the policy-making process at the
beginning of the SDSP. By the end of 2006, there were four laws and regulations being drafted at the national
level, and 39 rules and regulations that were either being drafted or already released by the relevant departments
and agencies.
Because the SDSP involves massive scientific data resources, it is difficult to reach the goal of effective sharing
without unified data and technology standards. Of the 32 principal standards identified under the Program, 23
have been completed, drawing mainly on an analysis of standards at the national and
international levels. Training courses on the implementation of these standards have been conducted. Based on
these principal standards, the relevant government agencies have established more than 120 data management
standards for different sectors.
These standard-setting activities have facilitated the implementation of the SDSP goals.
Twenty-four government agencies have been engaged in the Program so far. Despite the differences in (...truncated)