Toxicogenomic module associations with pathogenesis: a network-based approach to understanding drug toxicity
OPEN
The Pharmacogenomics Journal (2018) 18, 377–390
www.nature.com/tpj
ORIGINAL ARTICLE
JJ Sutherland, YW Webster, JA Willy, GH Searfoss, KM Goldstein, AR Irizarry, DG Hall and JL Stevens
Despite investment in toxicogenomics, nonclinical safety studies are still used to predict clinical liabilities for new drug candidates.
Network-based approaches for genomic analysis help overcome challenges with whole-genome transcriptional profiling using
limited numbers of treatments for phenotypes of interest. Herein, we apply co-expression network analysis to safety assessment
using rat liver gene expression data to define 415 modules, exhibiting unique transcriptional control, organized in a visual
representation of the transcriptome (the ‘TXG-MAP’). Accounting for the overall transcriptional activity resulting from treatment, we
explain mechanisms of toxicity and predict distinct toxicity phenotypes using module associations. We demonstrate that early
network responses complement traditional histology-based assessment in predicting outcomes for longer studies and identify a
novel mechanism of hepatotoxicity involving endoplasmic reticulum stress and Nrf2 activation. Module-based molecular subtypes
of cholestatic injury derived using rat translate to human. Moreover, compared to gene-level analysis alone, combining module and
gene-level analysis performed in sequence identifies significantly more phenotype-gene associations, including established and
novel biomarkers of liver injury.
The Pharmacogenomics Journal (2018) 18, 377–390; doi:10.1038/tpj.2017.17; published online 25 April 2017
INTRODUCTION
Safety remains a major cause of attrition during clinical trials.1–5
Prior to clinical testing, all clinical candidates are evaluated in
animals to define the spectrum of toxicities that might occur in
human subjects and safe doses for clinical testing.6 However,
continued occurrences of clinical safety terminations call into
question the value of nonclinical testing in predicting human
risk.7,8 Nonetheless, when confidence in nonclinical safety data is
high, compounds are more likely to be safe in humans.9
Uncertainty regarding safety predictions occurs at three major
transition points in biopharmaceutical testing: (1) the transition
inherent in using simple in vitro models to predict in vivo
nonclinical (animal) results early in discovery; (2) the transition
from nonclinical testing to human clinical trials; and (3) the
transition from testing in well-controlled clinical trials to the larger
diverse patient population post approval. In other work, we
addressed the first transition by associating chemical properties
with toxicity early10 and by developing a systems-level framework
using co-expression networks to evaluate how well mechanisms
extrapolate from primary cell cultures to the same organ in vivo.11
Here we address the second transition by investigating the utility
of network-based toxicogenomic approaches for predicting
mechanisms of drug-induced liver injury and the translation from
rodent to human.
Considerable effort has been invested in applying transcript
profiling to risk assessment using methodologies such as gene
signatures,12 pathway-based enrichment analysis,13 co-expression
networks,14,15 and adverse outcome pathways.16 However, toxicogenomic approaches to safety testing remain challenging and
have achieved only modest utility in addressing uncertainty in
safety predictions, largely as an investigative tool. Nonclinical
safety testing remains largely dependent on traditional clinical
chemistry and histologic evaluation. Gene signatures are effective
as classifiers but their development requires large and costly
compendia of transcript profiles and may not translate to other
models and mechanisms. Limitations in measurement technologies and the inherent stochastic nature of biological systems pose
additional analytical challenges to establishing the relationship
between thousands of variables (genes) and toxicity properties
using small sets of training compounds. Pathway or Gene
Ontology (GO) enrichment analysis can reduce noise but is
biased toward known biology captured in existing
repositories.13,17
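As an illustration of the enrichment analysis mentioned above, the following minimal sketch computes a one-sided hypergeometric (over-representation) p-value for a single gene set. The hypergeometric test is a common choice for this kind of analysis, but it is an assumption here; the text does not state which test the cited methods use. The function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_annotated, n_selected, n_overlap):
    """One-sided over-representation p-value, P(X >= n_overlap).

    X is the number of annotated genes found when n_selected genes are drawn
    without replacement from a universe of n_universe genes, of which
    n_annotated belong to the pathway or GO term of interest.
    """
    upper = min(n_annotated, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 measured genes, 5 annotated to the term,
# 5 genes selected as responsive, 4 of them annotated.
p_value = hypergeom_enrichment_p(20, 5, 5, 4)
```

Because only annotated gene sets can be tested this way, the sketch also makes the stated limitation concrete: a gene absent from the repository never contributes to `n_annotated`, whatever its expression behavior.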
Unsupervised methods that organize high-dimensional data
into networks based on biologically relevant coalescent properties
reduce noise and boost signal detection.18,19 This seems intuitive
since organisms demonstrate modularity and conservation of
biology across evolution.20,21 One such approach, weighted gene
co-expression network analysis (WGCNA), uses the property of co-expression to organize genes into gene networks or modules.22
Here we develop a co-expression framework called the ‘toxicogenomic module associations with pathogenesis’ (the TXG-MAP) and
integrate it with standard pathology evaluation to characterize
mechanisms of drug-induced liver injury. We demonstrate the
utility of the TXG-MAP for common applications. First, we illustrate
how co-expression modules reveal mechanisms of pathogenesis
concurrent with or preceding toxicity phenotypes. Second, we
illustrate the utility of modules for identifying marker genes in
small data sets, while controlling for false discovery. Third, we use
case studies to illustrate utility in elucidating specific mechanisms
of liver injury. Fourth, we identify transcription factors that couple
upstream signals to co-expression changes. Finally, we demonstrate that module-based molecular phenotypes for rodent liver
injury translate to human liver disease.
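The idea of organizing genes into modules by co-expression can be sketched in a few lines. The example below is a simplified illustration, not the procedure used to build the TXG-MAP: it computes a WGCNA-style unsigned soft-threshold adjacency, |cor|^beta, and then groups genes by connected components above a cutoff, whereas WGCNA itself clusters genes hierarchically on topological overlap. The gene names, expression values, `beta` and `cutoff` are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency: a_ij = |cor(i, j)| ** beta."""
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
    }


def modules_from_adjacency(genes, adj, cutoff=0.5):
    """Group genes into connected components of edges with adjacency >= cutoff.

    Simplification (assumption): stands in for hierarchical clustering on
    topological overlap, which is what WGCNA actually uses.
    """
    parent = {g: g for g in genes}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (g, h), a in adj.items():
        if a >= cutoff:
            parent[find(g)] = find(h)

    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical log-ratio profiles across five treatments (illustrative only).
expr = {
    "geneA": [0.1, 1.0, 2.1, 2.9, 4.2],
    "geneB": [0.0, 1.1, 1.9, 3.1, 4.0],
    "geneC": [4.0, 3.1, 2.0, 0.9, 0.2],
    "geneD": [1.0, -1.0, 1.2, -0.8, 1.1],
}
adj = soft_adjacency(expr)
modules = modules_from_adjacency(list(expr), adj)
```

In this toy data set, geneA, geneB and geneC fall into one module and geneD stands alone. geneC joins the module even though it is anti-correlated with the other two, because the unsigned adjacency uses the absolute correlation.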
Lilly Research Laboratories, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, USA. Correspondence: Dr JL Stevens, Lilly Research Laboratories, Eli Lilly and Company,
Lilly Corporate Center, Indianapolis, IN 46285, USA.
Received 2 November 2016; revised 19 February 2017; accepted 28 February 2017; published online 25 April 2017
MATERIALS AND METHODS
Microarray data processing from Drug Matrix, TG-GATEs and GEO
The Drug Matrix (DM)23 and open TG-GATEs (TG)24 databases constitute
two large publicly available resources describing the effects of drugs and
other compounds in rat liver. They contain gene expression data from
Affymetrix microarrays, linked to traditional histology and clinical
chemistry results for 3528 treatment groups from TG and 654 from DM.
A treatment group denotes three or more animals receiving a given dose
of drug or vehicle, usually administered daily by oral gavage, and killed
following drug exposures lasting from 3 h to 29 days. The treatment
groups analyzed in this work are given in Supplementary Table S1.
Methods for obtaining, processing and analyzing rat liver microarray data
from DM and TG are described in detail elsewhere;11 details for Gene
Expression Omnibus (GEO) sets are provided in Supplementary Methods.
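The definition of a treatment group given above (three or more animals receiving a given dose of drug or vehicle for a given exposure time) can be made concrete with a small sketch. The record layout, field order, compound names and function name below are hypothetical and do not reflect the actual DM or TG file formats.

```python
from collections import defaultdict

# Hypothetical per-animal records:
# (animal_id, compound, dose_mg_per_kg, exposure_hours)
records = [
    ("r1", "compoundX", 100, 24),
    ("r2", "compoundX", 100, 24),
    ("r3", "compoundX", 100, 24),
    ("r4", "vehicle", 0, 24),
    ("r5", "vehicle", 0, 24),
]


def treatment_groups(animal_records, min_animals=3):
    """Collect animals sharing compound, dose and exposure time, and keep
    only groups with at least min_animals members."""
    groups = defaultdict(list)
    for animal_id, compound, dose, hours in animal_records:
        groups[(compound, dose, hours)].append(animal_id)
    return {key: ids for key, ids in groups.items() if len(ids) >= min_animals}


groups = treatment_groups(records)
```

Here the two vehicle animals do not form a treatment group under the three-animal minimum, so only the compoundX group is retained.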
phenotype label, means ‘any other (...truncated)