Reliability of plasma polar metabolite concentrations in a large-scale cohort study using capillary electrophoresis-mass spectrometry
Sei Harada 0 1 2
Akiyoshi Hirayama 0 2
Queenie Chan 2
Ayako Kurihara 0 1 2
Kota Fukai 1 2
Miho Iida 1 2
Suzuka Kato 1 2
Daisuke Sugiyama 1 2
Kazuyo Kuwabara 1 2
Ayano Takeuchi 1 2
Miki Akiyama 0 2
Tomonori Okamura 1 2
Timothy M. D. Ebbels 2
Paul Elliott 2
Masaru Tomita 0 2
Asako Sato 0 2
Chizuru Suzuki 0 2
Masahiro Sugimoto 0 2
Tomoyoshi Soga 0 2
Toru Takebayashi 0 1 2
0 Institute for Advanced Biosciences, Keio University, Tsuruoka, Yamagata, Japan
3 Department of Epidemiology and Biostatistics, School of Public Health, Faculty of Medicine, Imperial College London, London, United Kingdom
4 MRC-PHE Centre for Environment and Health, Imperial College London, London, United Kingdom
5 Department of Obstetrics and Gynecology, Keio University School of Medicine, Tokyo, Japan
6 Faculty of Environment and Information Studies, Keio University, Fujisawa, Kanagawa, Japan
7 Computational and Systems Medicine, Department of Surgery and Cancer, Imperial College London, South Kensington, London, United Kingdom
1 Department of Preventive Medicine and Public Health, Keio University School of Medicine, Tokyo, Japan
2 Editor: Andrea Motta, National Research Council of Italy, ITALY
Data Availability Statement: Most relevant data are within the paper and its Supporting Information files. Raw data cannot be made publicly available, as study participants did not consent to have their information freely accessible. Based on these consents, the Ethics Committee for Tsuruoka Metabolomics Cohort Study (which includes representatives of Tsuruoka citizens, administration of Tsuruoka City, a lawyer, and expert advisers) strictly prohibits any public data sharing because data contain potentially identifying or sensitive disease information. Data accession requests may be sent to the administration of the Ethics Committee for Tsuruoka Metabolomics Cohort Study. The data will be shared after review of the purpose and permission by the ethics committee. Contact information for the Ethics Committee for Tsuruoka Metabolomics Cohort Study is the administrator of the committee, Shoji Eiju, who may be contacted at the following email address: . Address: 9-25 Babacho, Tsuruoka City, 997-8601, Japan.
Funding: This study was supported in part by research funds from the Yamagata Prefectural Government (http://www.pref.yamagata.jp/) and the city of Tsuruoka (https://www.city.tsuruoka.lg.jp/), Medical Research Council and Public Health England (https://www.mrc.ac.uk/, grant number MR/L01341X/1) and NIHR Health Protection Research Unit in Health Impact of Environmental Hazards (http://hieh.hpru.nihr.ac.uk/, grant number HPRU-2012-10141), and by the Grant-in-Aid for Scientific Research (B) (grant numbers JP24390168, JP15H04778), Grant-in-Aid for Challenging Exploratory Research (grant number 25670303), and Grant-in-Aid for Young Scientists (B) (grant number JP15K19231) from the Japan Society for the Promotion of Science (http://www.jsps.go.jp/). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Cohort studies with metabolomics data are becoming more widespread; however, large-scale studies involving tens of thousands of participants are still limited, especially in Asian populations. Therefore, we started the Tsuruoka Metabolomics Cohort Study, enrolling 11,002 community-dwelling adults in Japan and using capillary electrophoresis-mass spectrometry (CE-MS) and liquid chromatography-mass spectrometry. The CE-MS method is highly amenable to absolute quantification of polar metabolites; however, its reliability for large-scale measurement is unclear. The aim of this study is to examine the reproducibility and validity of large-scale CE-MS measurements. In addition, the study presents absolute concentrations of polar metabolites in human plasma, which can be used in future as reference ranges in a Japanese population.
Metabolomic profiling of 8,413 fasting plasma samples was completed using CE-MS, and 94 polar metabolites were structurally identified and quantified. Quality control (QC) samples were injected every ten samples and assessed throughout the analysis. Inter- and intra-batch coefficients of variation of QC and participant samples, and technical intraclass correlation coefficients were estimated. Passing-Bablok regression of plasma concentrations by CE-MS on serum concentrations by standard clinical chemistry assays was conducted for creatinine and uric acid.
Results and conclusions
In QC samples, coefficient of variation was less than 20% for 64 metabolites, and less than
30% for 80 metabolites out of the 94 metabolites. Inter-batch coefficient of variation was
less than 20% for 81 metabolites. Estimated technical intraclass correlation coefficient was
above 0.75 for 67 metabolites. The slope of Passing-Bablok regression was estimated as
0.97 (95% confidence interval: 0.95, 0.98) for creatinine and 0.95 (0.92, 0.96) for uric acid.
Compared to published data from other large cohort measurement platforms, reproducibility
of metabolites common to the platforms was similar to or better than in the other studies.
These results show that our CE-MS platform is suitable for conducting large-scale epidemiological studies.
Large-scale metabolomics in prospective epidemiological studies is a promising approach to
identify biomarkers for prevention, diagnosis, and prognosis of chronic diseases including
cardiovascular diseases [1–3] and cancer [
]. Since the metabolomic profile is indicative of
biological alterations associated with a wide range of possible genetic or environmental factors,
this is expected to give new insights to understand complex etiology of diseases, related to
genes, external and internal environment, and their interactions [
Metabolomic profiling using liquid chromatography-mass spectrometry (LC-MS) and gas
chromatography-mass spectrometry (GC-MS) has been conducted for over 1,000 blood
samples collected in European cohorts including Cooperative Health Research in the Region of
Augsburg (KORA) [
] and TwinsUK registry [6,8–10], and American cohorts such as
Framingham Heart Study (FHS) Offspring cohort [
] and Atherosclerosis Risk in Communities
Study (ARIC). Nuclear magnetic resonance (NMR) has also been used in population
studies such as Estonian Biobank [
], Finnish cohorts [
], COMBI-BIO [
] and the
INTERnational study of MAcro/micronutrients and blood Pressure (INTERMAP) Study [14–16].
However, large-scale cohorts with metabolomics data involving tens of thousands of individuals are still
limited. In addition, Asian cohorts with metabolomics are limited in scope or small in
size. It is essential to conduct large-scale metabolomics studies in different populations, as it
has been reported that metabolomic profiles vary by ethnic group and lifestyle [ ]. We
therefore initiated the Tsuruoka Metabolomics Cohort Study (TMCS) [19–21] in Japan,
enrolling 11,002 participants since April 2012. This is among the first Japanese population-based
cohort studies with metabolomics, using capillary electrophoresis-mass spectrometry
(CE-MS) for polar metabolites and LC-MS for lipid metabolites.
Compared to other methods of metabolomic profiling, the CE-MS method has high
separation efficiency and compound identification capability, and allows the absolute quantification
of polar metabolites, including carbohydrates and amino acids [22–24]. Also, CE-MS has
unique advantages that suit large-scale epidemiological studies. Firstly, the ability
to perform multiplexed separations with serial injections of seven or more samples in a single run allows
higher sample throughput at lower cost with high quality assurance, since a QC is included
in every run [
]. Secondly, CE-MS is well suited to analysing volume-restricted biospecimens,
which is critical for retrospective analysis [
Some epidemiological or long-term studies using CE-MS were recently reported [27,28];
however, its reliability in measuring thousands of blood samples over long periods of time
is still unclear, especially with regard to inter-batch variation. In order to estimate disease risk precisely and with
high statistical power in epidemiological studies, it is critically important to limit measurement
error and bias. Thus, in this study, we aimed to examine the reproducibility and validity
of large-scale CE-MS measurements and identify the compounds with reliable measurements.
It is also important to establish the absolute concentrations of metabolomics biomarkers in
epidemiological studies because this helps us to compare and combine the results among
different studies and to determine the levels to use practically for prevention. However, little
population-based information has been available even for values according to sex and age [
Our CE-MS platform can yield these values for a wide range of polar metabolites.
In this study, we examined the reliability of large-scale metabolomics profiling via our
CE-MS platform, using 883 quality control (QC) samples for cation metabolites and 946 for
anions as well as 8,413 participant plasma samples, with data acquired over a period of 52
months. The measurement values for creatinine and uric acid were validated by comparison
with independent clinical laboratory assays of these variables. In addition, we present
summary data of absolute plasma concentration for each metabolite according to sex and age for
community-dwelling adults in Japan, which may be used in future as reference values.
Materials and methods
Study population and sample collection
TMCS is a Japanese cohort, initiated in April 2012 (Tsuruoka City, Yamagata Prefecture,
Japan), involving 11,002 participants aged 35 to 74 years. They were recruited among attendees
of annual municipal or workplace health check-up programs held in four sites of the city at
baseline (April 2012 to March 2015). All participants gave written informed consent for this
study; its protocol was approved by the Medical Ethics Committee of the School of Medicine,
Keio University, Tokyo, Japan (approval no. 20110264).
All participants completed a comprehensive questionnaire on lifestyle, dietary habit and
medical history. In addition, biological samples including serum, plasma, urine and DNA, and
medical examination data by health check-up programs were collected at recruitment. These
data and samples continue to be collected prospectively, when the participants undergo annual
municipal or workplace health check-up programs. The follow-up survey to collect
information for death, change of address and medical information including incidence of
cardiovascular diseases and cancer is also conducted every year, using national records and hospital
TMCS is particularly designed to discover metabolomics biomarkers for common diseases
and disorders, related to environmental and genetic factors. In order to have the optimal
samples for our CE-MS metabolomics platform, we followed suitable protocols reported
]. In brief, blood samples were collected in the morning after 12h-overnight
fasting. Plasma samples were collected with ethylenediaminetetraacetic acid-2Na (EDTA-2Na)
as an anticoagulant and kept at 4 °C immediately after collection. The samples were centrifuged
for 15 minutes (1,500 × g at 4 °C) within 3 hours of collection, divided into aliquots, and
preserved at 4 °C until extraction of metabolites. Metabolite extraction from plasma was finished
within 6 hours after collection to reduce the metabolic reactions in plasma, then the extract
was stored at -80 °C. Fifty μL of plasma was used for sample extraction. The extraction method
has been detailed previously [
Metabolomics measurement and quality control samples
Metabolomic profiling was conducted for fasting plasma samples via capillary electrophoresis
time-of-flight mass spectrometry (CE-TOFMS). CE-TOFMS analysis of cationic and anionic
metabolites was performed as described previously [
]. Raw data were processed using
our proprietary software (MasterHands) [
]. We used two CE-MS instruments exclusively for cations and two exclusively for anions. These four instruments were used solely for this study
during the study period. Mass calibration using tuning solution and MS entrance cleaning
were performed at the beginning of every sequence to ensure robust performance. In addition,
in order to avoid unexpected changes in sensitivity or variance in measurement of mass in a
continuous run, the number of samples per run was limited to 100. The average
number of sample runs was 30 per day, and about 0.5% of runs failed due to current drop or
capillary fracture in this study. As a preliminary study, we identified 154 polar metabolites with
standard compounds in plasma. For all the participant samples, we measured absolute
concentrations of 94 metabolites (54 cations and 40 anions, listed in S1 Table), which were expected
to be detected in more than 20% of plasma samples.
Metabolomic profiles of participant samples were analysed from June 2012 in order of
collection, and were completed for 8,413 samples by August 2016. These data consisted of 105
running batches of cations and 99 batches of anions. One batch contained an average of 80.1
samples (maximum 164) for cations, and 83.2 samples (maximum 168) for anions. To monitor the
stability of metabolomics analysis, QC samples were injected every 10 samples and assessed at
the start of the analytical run and at intervals throughout the analysis. In total, 883 QC samples
for cations and 946 for anions were used for this study. For QC samples, 150 mL of serum
collected in advance from 20 people from the same population was extracted for metabolomics
analysis as soon as it was collected, then divided into 50 μL aliquots and stored at -80 °C. QC aliquots
were thawed and used for daily monitoring during the study. We calculated
the mean concentration of each metabolite in QC samples previously analysed in
70 sequences. When the QC concentrations repeatedly (more than twice in succession)
fell outside the mean concentration ± two standard deviations for more than
half of the metabolites, we re-analysed the subsequent samples of the sequence.
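The re-analysis rule described above can be sketched in Python. This is only one possible reading of the criterion, not the authors' implementation: it assumes that a QC injection counts as out of range when more than half of its metabolites fall outside the reference mean ± two standard deviations, and that re-analysis is triggered once more than two such injections occur in succession. The function names and the toy reference values are hypothetical.

```python
from typing import Dict, List


def out_of_range(qc: Dict[str, float], ref_mean: Dict[str, float],
                 ref_sd: Dict[str, float], k: float = 2.0) -> bool:
    """True if more than half of the metabolites lie outside mean +/- k*SD."""
    outside = sum(
        1 for name, value in qc.items()
        if abs(value - ref_mean[name]) > k * ref_sd[name]
    )
    return outside > len(qc) / 2


def needs_reanalysis(qc_series: List[Dict[str, float]],
                     ref_mean: Dict[str, float], ref_sd: Dict[str, float],
                     max_consecutive: int = 2) -> bool:
    """True once more than `max_consecutive` QC injections in a row are out of range."""
    run = 0
    for qc in qc_series:
        # Extend the current streak, or reset it when a QC injection is acceptable.
        run = run + 1 if out_of_range(qc, ref_mean, ref_sd) else 0
        if run > max_consecutive:
            return True
    return False


# Toy reference values (assumed, not from the study).
ref_mean = {"a": 10.0, "b": 20.0, "c": 30.0}
ref_sd = {"a": 1.0, "b": 2.0, "c": 3.0}
good = {"a": 10.5, "b": 21.0, "c": 29.0}
bad = {"a": 13.0, "b": 25.0, "c": 30.0}  # two of three metabolites beyond 2 SD
```

With these toy values, a sequence with three consecutive out-of-range QC injections triggers re-analysis, while an interrupted streak does not.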
Clinical laboratory assay
For the purpose of validating the absolute measurement values, we used standard clinical
laboratory assay data for serum creatinine and uric acid. These data were collected from the health
check-up programs that 2,325 of participants underwent in Tsuruoka Kyoritsu Hospital, at the
same time as recruitment and metabolomics sample collection. Creatinine was measured by an
enzymatic method, which is widely used in medical examinations in Japan [
]. Uric acid
was also measured by an enzymatic method [
]. Both of these methods were authorised by the
Japan Society of Clinical Chemistry as national standardized methods. Tsuruoka Kyoritsu
Hospital acquired certification in quality control of these methods by the Japanese Association
of Medical Technologists [
For samples where metabolites were not detected, half of the lowest detected values were
substituted [ ]. Inter- and intra-batch variances of each metabolite concentration in QC samples
were computed to evaluate reproducibility, using a linear mixed model with the observed
metabolite level, Y, and a random effect common to each batch, B:

$$Y_i = \mu + B_i + \varepsilon_i$$

where $\mu$ is the overall mean and $\varepsilon_i$ is the residual error.
Then, we calculated the coefficient of variation (CV) by dividing the square root of each variance
estimated from this model by the mean. Pearson correlation coefficients between inter- and intra-batch
CV were also calculated. These analyses were also conducted with participant samples to assess
inter- and intra-batch variance.
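The separation of inter- and intra-batch variation can be illustrated with a short Python sketch. It is not the authors' code: instead of fitting a linear mixed model, it uses the classical one-way random-effects ANOVA (method-of-moments) estimator of the two variance components, which targets the same quantities in this simple one-factor design. CV is taken here as the square root of each variance component divided by the grand mean, which is the usual definition and an assumption about how the study computed it. The function name `batch_cv` and the example data are hypothetical.

```python
from math import sqrt
from typing import Dict, List


def batch_cv(batches: Dict[str, List[float]]) -> Dict[str, float]:
    """Inter- and intra-batch CV from one-way random-effects ANOVA estimators."""
    groups = [g for g in batches.values() if g]
    k = len(groups)                          # number of batches
    n_total = sum(len(g) for g in groups)    # number of measurements
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Effective batch size; equals the common batch size when batches are balanced.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)

    var_inter = max(0.0, (ms_between - ms_within) / n0)  # truncated at zero
    var_intra = ms_within
    return {
        "inter_batch_cv": sqrt(var_inter) / grand_mean,
        "intra_batch_cv": sqrt(var_intra) / grand_mean,
    }


# Three toy batches of QC measurements for one metabolite (assumed values).
result = batch_cv({"b1": [9.0, 11.0], "b2": [19.0, 21.0], "b3": [29.0, 31.0]})
```

A mixed-model fit (for example by restricted maximum likelihood) would usually give similar variance components when batches are of comparable size, but the two estimators are not identical.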
The intraclass correlation coefficient (ICC) was calculated to compare the reliability of the
metabolomics biomarkers with previous research [
]; it was calculated from the variance of
measurement errors, $\sigma_E^2$, and the total variance, $\sigma_T^2$:

$$\mathrm{ICC} = 1 - \frac{\sigma_E^2}{\sigma_T^2}$$
Although we could not compute ICC for participant samples as there were no replicates, we
computed technical errors from the large number of replicates for QC samples considered to
be representative of the population samples. We made approximate calculation of ICC,
substituting CV of QC samples for error variance and CV of participant samples for total variance.
$$\text{Approximate ICC} = 1 - \frac{(\text{CV of QC samples})^2}{(\text{CV of participant samples})^2}$$
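As a small worked illustration of this approximation, 1 − (CV of QC samples)^2 / (CV of participant samples)^2, the following Python sketch evaluates it for one pair of CV values. The function name is hypothetical, and the inputs are the median total CVs for cations reported in this paper (7.9% in QC samples, 32.3% in participant samples), used only as plausible numbers rather than as values for any single metabolite.

```python
def approximate_icc(cv_qc: float, cv_participants: float) -> float:
    """Approximate technical ICC from QC CV (error) and participant CV (total)."""
    if cv_participants <= 0:
        raise ValueError("participant CV must be positive")
    # Both CVs must be on the same scale (both percentages or both fractions).
    return 1.0 - (cv_qc / cv_participants) ** 2


icc_example = approximate_icc(7.9, 32.3)  # roughly 0.94
```

Because the ratio is squared, a QC CV of about half the participant CV still gives an approximate ICC of 0.75.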
We conducted Passing-Bablok regression of our CE-MS measurements in plasma on
standard clinical laboratory measurements in serum for creatinine and uric acid concentrations.
We also drew Bland-Altman plots of the percentage difference between these two methods against their mean.
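The method comparison described here can be sketched in Python. This is a minimal illustration, not the analysis code of the study (which used R): it assumes the standard Passing-Bablok point estimates, that is, the shifted median of all pairwise slopes and the median of the resulting intercepts, and it omits confidence intervals. The function names are hypothetical; x stands for the clinical assay and y for CE-MS.

```python
from itertools import combinations
from statistics import median
from typing import List, Tuple


def passing_bablok(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Point estimates (slope, intercept) of Passing-Bablok regression of y on x."""
    slopes = []
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x2 - x1, y2 - y1
        if dx == 0 and dy == 0:
            continue                      # identical points carry no information
        if dx == 0:
            slopes.append(float("inf") if dy > 0 else float("-inf"))
            continue
        s = dy / dx
        if s != -1:                       # slopes of exactly -1 are discarded
            slopes.append(s)
    slopes.sort()
    n = len(slopes)
    k = sum(1 for s in slopes if s < -1)  # offset that shifts the median
    if n % 2:
        slope = slopes[(n - 1) // 2 + k]
    else:
        slope = 0.5 * (slopes[n // 2 - 1 + k] + slopes[n // 2 + k])
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


def bland_altman_percent(x: List[float], y: List[float]) -> List[Tuple[float, float]]:
    """(mean of the two methods, percentage difference of y relative to that mean)."""
    return [((xi + yi) / 2, 100 * (yi - xi) / ((xi + yi) / 2)) for xi, yi in zip(x, y)]


# Toy data lying exactly on y = 2x + 1 (assumed values, for checking only).
slope, intercept = passing_bablok([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
```

A slope near one and an intercept near zero indicate agreement between the methods; a constant offset appears in the intercept and in the Bland-Altman percentage differences.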
We summarized the metabolomics data stratified by sex and age. Linear regression analysis
was performed to investigate differences by sex and age, with adjustment for possible
confounders: smoking and alcohol drinking habit, history of any ischemic heart disease, stroke
and cancer, and current disease status including hypertension, diabetes, dyslipidaemia and
impaired kidney function. Bonferroni correction was used to account for multiple testing
(α = 0.05/94).
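For concreteness, the Bonferroni-corrected significance threshold for 94 metabolites can be computed as follows. The helper name is hypothetical and the snippet only restates the arithmetic α = 0.05/94.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


threshold = bonferroni_threshold(0.05, 94)  # about 5.3e-4


def is_significant(p_value: float, alpha: float = 0.05, n_tests: int = 94) -> bool:
    """True if a single p-value passes the corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)
```

A metabolite therefore needs a p-value below roughly 0.00053 to be reported as differing by sex or age.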
All statistical analyses were performed using R 3.3.1 (R Core Team 2016,
R Foundation for Statistical Computing, Vienna, Austria.).
Coefficient of variation for quality control samples
CV values for QC samples are shown in S1 Table. Of the 94 metabolites, CV was less than 20%
for 64 metabolites (68%), 20–30% for 16 metabolites (17%), and more than 30% for 14 (15%)
(Fig 1A). Median CV was 7.9% for cation compounds and 18.9% for anions. Boxplots of
metabolite concentrations by batch are shown in S1 and S2 Files.
The comparison of reproducibility with other major MS platforms used in large
epidemiologic studies is shown in Table 1 and S2 Table. CV values of overlapping polar metabolites
were similar to or better than those in other platforms.
Inter-batch CV estimated via a linear mixed model was less than 20% for 81 compounds
(86%), but more than 30% for four of them. Intra-batch CV was less than 20% for 74
compounds (78%) (Fig 1B). Inter- and intra-batch CV had similar values (medians, respectively
5.8% and 5.0% for cations; 11.9% and 13.8% for anions). They were also highly correlated
(Pearson's r = 0.85) (S1 Fig).
Fig 1. Histogram of CV for 94 metabolites in QC samples. (A) Coefficients of variation (CV) for the 94 detected metabolites in quality control
(QC) samples. (B) Inter- and intra-batch CV for each metabolite in QC samples. Inter- and intra-batch CV were computed using linear mixed models.
Variation for participant samples
Statistical summary of metabolites measured in participant samples is shown in S1 Table.
Total, intra- and inter-batch CV among participants are shown in Fig 2. Medians of total,
inter- and intra-batch CV were 32.3%, 9.9% and 30.4% for cation metabolites, respectively,
and 44.9%, 20.2%, and 38.4% for anions. As expected, CV in participant samples was larger
than in QCs. Participant samples had larger intra-batch CV than inter-batch CV, in contrast
to QC samples.
Table 1. (Only the row labels survive; cell values are not recoverable.) Rows: Study showing CV in QC samples for polar metabolites; Cohorts of the above study; Separation method for polar; N overlapping metabolites with; Median of CV for metabolites in (for overlapping with Platform 2; for overlapping with Platform 3; for overlapping with Platform 4); How to measure QC samples.
CE: capillary electrophoresis, CV: coefficient of variation, FHS: Framingham Heart Study, GAC: Genome Analysis Centre, GC: Gas chromatography, IAB: Institute for
Advanced Biosciences, KORA: Cooperative Health Research in the Region of Augsburg, LC: liquid chromatography, MS: mass spectrometry, QC: quality control.
PLOS ONE | https://doi.org/10.1371/journal.pone.0191230
Fig 2. Histogram of CV for each metabolite in participant samples. (A) Coefficient of variation (CV) for each detected metabolite in
participant plasma samples. (B) Inter- and intra-batch CV for each metabolite in participant samples. Inter- and intra-batch CV were computed
using linear mixed models.
Results of calculation of estimated ICC are shown in Fig 3 and S1 Table. This was > 0.75
for 67 metabolites (71%), 0.40–0.75 for 25 metabolites (27%), and < 0.40 for two (uridine and malonate).
Passing-Bablok regression of CE-MS measurements on standard clinical laboratory
measurements estimated slope 0.97 (95% confidence interval: 0.95, 0.98) and intercept -4.52 (-5.69,
-3.37) for creatinine, and slope 0.95 (0.92, 0.96) and intercept -21.03 (-27.15, -15.36) for uric
acid. This result for creatinine was consistent even after excluding three samples more than
five standard deviations from the mean as outliers (slope = 0.97, intercept = -21.19). As shown
in intercept values, absolute concentrations of these were relatively lower for plasma by
CE-MS than for serum by the independent clinical assay (Creatinine mean ± standard
deviation: 63.2±20.3 μmol/L by CE-MS vs 70.4±22.0 μmol/L by clinical assay, p < 0.001 for paired
t-test; Uric acid: 266.5±72.9 μmol/L vs 307.6±75.6 μmol/L, p < 0.001). Bland-Altman plots
also show that CE-MS provided lower values than the clinical assay, whereas no other bias was
apparent (Figs 4 and 5).
Table 2 and S3 Table show results by sex and age. 73 metabolites differed by sex after age
adjustment and Bonferroni correction. 49 compounds for males and 64 for females were
significantly related to age. Analysis of creatinine and uric acid measured by clinical assay showed
similar results to CE-MS (Table 2).
In our large-scale epidemiological study using the CE-MS metabolomics platform we report
concentrations for 94 polar compounds in blood with good to high reproducibility: CV for 80
compounds (85% of all) was less than 30% despite a measurement period of 52 months.
Inter-batch CV was less than 20% for 81 compounds (86%) among 105 batches for cations and 99
batches for anions. The measured values by our CE-MS method for creatinine and uric acid
were similar to established clinical laboratory assays for these compounds widely used in
Fig 3. Histogram of estimated ICC. Estimated intraclass correlation coefficients (ICC) calculated using the formula 1 − (Total CV
of QC samples)^2 / (Total CV of subject samples)^2.
In metabolomics studies, QC sample methods are widely used to evaluate reproducibility
]. Features with QC CV < 20% are often considered to have good reproducibility, as
recommended by the US FDA [
]. Features with QC CV < 30% are also considered acceptable
]. Compounds with lower reproducibility in our analyses had small peaks and low
signal-to-noise ratios; it was therefore difficult to determine their peak areas precisely and to
differentiate them from noise.
] and the Broad institute [
] have conducted large-scale targeted
metabolomics measurement for cohort studies including KORA, Twins UK, ARIC and FHS
Offspring Cohort. Absolute IDQ™ kits (BIOCRATES Life Sciences, Innsbruck, Austria) have
been used by studies like KORA and Twins UK. Compared to published data from these
platforms, reproducibility in this study which has unique broad coverage of polar metabolites
was similar to or better than in other studies.
Fig 4. Bland-Altman plots for creatinine. X-axis indicates the mean creatinine concentrations (μmol/L) of capillary
electrophoresis-mass spectrometry (CE-MS) and clinical assay, and Y-axis indicates percentage of differences between
these two methods.
In order to reduce measurement errors, we strictly limited the instruments used for this
study and checked the sensitivity of the instruments regularly. Also, we reanalysed samples when
monitored QC sample concentrations did not meet the criteria, in order to maintain
measurement quality. These careful procedures may have contributed to reliability comparable to that of other platforms.
However, this comparison should be treated with caution as the method for calculating CV
was different between platforms especially for measurement duration and number of batches
and replicates. Nonetheless, this result shows that our platform is at least comparable to others
and suitable for conducting large-scale epidemiological studies. It should be noted that our
values are reported as absolute concentrations of compounds, instead of the relative intensity of
Compared to other established MS platforms, our CE-MS platform has limited coverage for
overall metabolites because CE-MS is not able to detect most of non-polar metabolites. Most
of polar metabolites were detected consistently in participants, however, some metabolites
(especially anions) were hard to be quantified with adequate precision due to their lower
concentrations than limit of detection. Despite this limitation, our platform still has broader
coverage of polar metabolites than other MS platforms. In addition, environmental load and cost
is relatively lower because little organic solvent is used in CE-MS. Compared to NMR, CE-MS
has lower sample throughput, but coverage and sensitivity for polar metabolites are higher.
Fig 5. Bland-Altman plots for uric acid. X-axis indicates the mean uric acid concentrations (μmol/L) of capillary
electrophoresis-mass spectrometry (CE-MS) and clinical assay, and Y-axis indicates percentage of differences between
these two methods.
CE-MS: capillary electrophoresis-mass spectrometry, ref: reference
Therefore, CE-MS is also suitable for use in combination with other methods to cover broader
Variations between samples can be classified as inter- or intra-batch variations. Reducing
inter-batch variations is an important issue in large-scale metabolomics [
]. The results
show that our metabolomics method controlled inter-batch effects well for most of the
measured compounds without any statistical adjustments, despite the large scale of 883
QC samples among 105 batches for cations and 946 QC samples among 99 batches for anions.
Inter- and intra-batch variations were comparable. Even for metabolites with poor
reproducibility, this reflected intra-batch effects, except for triethanolamine, indole-3-acetate, and
malonate, for which inter-batch CV was larger than intra-batch CV.
The larger CV for participant plasma samples than for serum QCs can be interpreted as due
to variation between subjects including biological differences. The larger intra-batch CV than
inter-batch CV also indicates that increasing CV for participant samples was mainly due to the
variation between subjects, rather than measurement errors. Although the difference between
serum and plasma should be taken into account, our preliminary study showed that the
metabolites detected in serum samples included all the metabolites detected in plasma, and
metabolite concentrations in EDTA plasma and serum correlate well with the exception of some
metabolites such as hypoxanthine and lactic acid [
Technical ICC can add another aspect to evaluate technical variation, focusing on
participant variation including true biological differences. In epidemiological studies, statistical
power is influenced by the magnitude of the effect of interest in the population. When the
biological variation is larger than technical variation, biological differences can be detected
despite measurement errors. Our estimate of ICC, 1 − (Total CV of QC samples)^2 / (Total
CV of participant samples)^2, was above 0.40 for most measured compounds, except for
malonate and uridine. Even for compounds with poor reproducibility, those with ICC above 0.40
may be worth examining as potential biomarkers, with careful evaluation of their
The reliability of CE-MS measurements for creatinine and uric acid, measured by cation
mode and anion mode, respectively, was evaluated with respect to clinical laboratory data.
Clinical measurements of serum creatinine and uric acid are standardized and certified
nationwide, and commonly adopted in medical laboratories and hospitals. The slope of the
Passing-Bablok regression was nearly one, with a narrow 95% confidence interval, for both
creatinine and uric acid. These results indicate that CE-MS and clinical laboratory data were similar,
although the slope was slightly less than one and the intercept was less than zero due to the
10–13% lower mean concentrations by CE-MS. It is possible that this was due to the difference
between plasma and serum, considering previously reported findings that most metabolite
concentrations are higher in serum than in EDTA plasma [
]. Nonetheless both methods
gave similar results with respect to differences by sex and age. This further indicates that our
metabolomics data are fit for purpose for epidemiological studies.
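The size of the offset between the two methods can be checked from the mean concentrations reported in the Results (creatinine 63.2 vs 70.4 μmol/L; uric acid 266.5 vs 307.6 μmol/L). The short Python sketch below only restates that arithmetic; the function name is hypothetical.

```python
def percent_lower(ce_ms_mean: float, clinical_mean: float) -> float:
    """Percentage by which the CE-MS mean falls below the clinical assay mean."""
    return 100.0 * (clinical_mean - ce_ms_mean) / clinical_mean


creatinine_gap = percent_lower(63.2, 70.4)    # about 10.2%
uric_acid_gap = percent_lower(266.5, 307.6)   # about 13.4%
```

Both values fall in the range of roughly 10% to 13% quoted in the discussion of the regression intercepts.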
Our metabolomics summary data by sex and age provide for the first time absolute
concentrations for a large sample of community-dwelling adults in Japan. These data can be used as a
reference for other population studies, especially Asians, though careful interpretation is
required for some of the compounds with low reproducibility. Our data show sex differences
in concentrations for most metabolites even after adjusting for potential confounders
including unique metabolites in our CE-MS platform such as guanidinoacetate,
gamma-butyrobetaine, and mucate that may contribute to explaining the biological sex differences. Many
metabolites were also associated with age after adjustment for confounders, although we are
unable to evaluate whether this is cause or consequence because of the cross-sectional design.
This study is characterized by a large sample size from one population of community-dwelling
adults in Tsuruoka City, Japan. This can result in good internal validity in this population.
This population can be considered representative of the Japanese, whose genetic background is
homogeneous, although the variety of environmental factors should be taken into account. In
order to generalize our findings, further studies are needed in other populations in Japan and
other countries. In addition, external validation and cross-platform and inter-laboratory round-robin
studies are expected to establish the methodology of large-scale studies using the CE-MS platform.
International collaborative studies should be pursued to address these issues.
In conclusion, this study shows the CE-MS platform yields reliable concentrations for
plasma metabolites in a large-scale study. It provides high-quality metabolomics data which
will aid in the understanding of links between disease risk and metabolism.
S1 Fig. Correlation between inter- and intra- batch CV in QC samples. The plots between
inter- and intra- batch coefficients of variation in quality control (QC) samples are shown.
S1 File. Boxplots of cationic metabolite concentrations by batches in QC samples.
S2 File. Boxplots of anionic metabolite concentrations by batches in QC samples.
S1 Table. Statistical summary of measured metabolites. CV: coefficient of variation, ICC:
intraclass correlation coefficient, ND: not detectable values, QC: quality control, SD: standard
deviation. ICC was estimated as follows: 1 − (Total CV of QC samples)^2 / (Total CV of participant samples)^2.
S2 Table. Comparison of CV for polar metabolites in QC samples between major mass
spectrometry metabolomics platforms used in large cohort studies. CE: capillary
electrophoresis, CV: coefficient of variation, FHS: Framingham Heart Study, GAC: Genome Analysis
Centre, GC: Gas chromatography, IAB: Institute for Advanced Biosciences, KORA:
Cooperative Health Research in the Region of Augsburg, LC: liquid chromatography, QC: quality
control. 1 CV of this detected metabolite was not reported. 2 CV of asymmetric
dimethylarginine and symmetric dimethylarginine was not reported separately. 3 CV of leucine and
isoleucine was not reported separately. 4 CV of 2-oxoglutarate and adipate was not reported
separately. 5 CV of citrate and isocitrate was not reported separately. 6 CV of fumarate and
malate was not reported separately.
S3 Table. Concentrations of all metabolites stratified by sex and age. SD: standard
deviation, SE: standard error. Confounders were as follows: smoking and alcohol drinking habits,
history of any ischemic heart disease, stroke and cancer, and current disease status including
hypertension, diabetes, dyslipidaemia and impaired kidney function. Hypertension was
defined as systolic blood pressure ≥140 mmHg and/or diastolic blood pressure ≥90 mmHg
and/or currently on antihypertensive therapy; impaired glucose tolerance as fasting plasma
glucose ≥126 mg/dL and/or hemoglobin A1c (NGSP) ≥6.5% and/or current use of
We thank the residents of Tsuruoka City for their interest in our study and the members of the
Tsuruoka Metabolomic Cohort Study team for their commitment to the project.
Conceptualization: Sei Harada, Akiyoshi Hirayama, Toru Takebayashi.
Data curation: Sei Harada, Akiyoshi Hirayama, Masahiro Sugimoto, Toru Takebayashi.
Formal analysis: Sei Harada, Timothy M. D. Ebbels.
Funding acquisition: Sei Harada, Paul Elliott, Masaru Tomita, Tomoyoshi Soga, Toru
Takebayashi.
Investigation: Sei Harada, Akiyoshi Hirayama, Ayako Kurihara, Kota Fukai, Miho Iida,
Suzuka Kato, Daisuke Sugiyama, Kazuyo Kuwabara, Miki Akiyama, Asako Sato, Chizuru
Suzuki, Toru Takebayashi.
Methodology: Sei Harada, Akiyoshi Hirayama, Ayano Takeuchi, Masahiro Sugimoto,
Tomoyoshi Soga, Toru Takebayashi.
Project administration: Sei Harada, Toru Takebayashi.
Resources: Akiyoshi Hirayama, Masaru Tomita, Asako Sato, Chizuru Suzuki, Masahiro
Sugimoto, Tomoyoshi Soga.
Software: Sei Harada, Ayano Takeuchi, Masahiro Sugimoto.
Supervision: Queenie Chan, Tomonori Okamura, Paul Elliott, Masaru Tomita, Tomoyoshi
Soga, Toru Takebayashi.
Validation: Sei Harada, Akiyoshi Hirayama.
Visualization: Sei Harada.
Writing – original draft: Sei Harada, Queenie Chan, Timothy M. D. Ebbels.
Writing – review & editing: Sei Harada, Akiyoshi Hirayama, Queenie Chan, Kota Fukai,
Daisuke Sugiyama, Tomonori Okamura, Timothy M. D. Ebbels, Paul Elliott, Toru
Takebayashi.