Lifetime Data Analysis

Lifetime Data Analysis is the only journal dedicated to statistical methods and applications for lifetime data. The journal advances and promotes statistical ...

List of Papers (Total 78)

Gradient boosting-based discrete failure time model for selecting time-varying effects and interactions

Analyzing survival data from the National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) Program offers invaluable insight to guide cancer management. In these data the event times are recorded in units of months. In such datasets, it can be important to consider the selection of covariates, distinguish between time-varying and time-independent effects, and...

A dependent and censored first hitting-time model with compound Poisson processes

We consider a bivariate first hitting-time model in which durations are the crossing times of dependent compound Poisson processes with fixed thresholds. The identifiability of the model is discussed, and likelihood estimators of the model parameters are proposed. We obtain the asymptotic properties of the estimators and underline their finite sample performance with a simulation...

Wild bootstrap for counting process-based statistics: a martingale theory-based approach

The wild bootstrap is a popular resampling method in the context of time-to-event data analysis. Previous work established its large sample properties for applications to different estimators and test statistics. It can be used to justify the accuracy of inference procedures such as hypothesis tests or time-simultaneous confidence bands. This paper provides a general...

Integrative analysis of high-dimensional RCT and RWD subject to censoring and hidden confounding

In this study, we focus on estimating the heterogeneous treatment effect (HTE) for a survival outcome. The outcome is subject to censoring and the covariates are high-dimensional. We utilize data from both the randomized controlled trial (RCT), considered as the gold standard, and real-world data (RWD), possibly affected by hidden confounding factors. To achieve a more...

Robust inverse probability weighted estimators for doubly truncated Cox regression with closed-form standard errors

Survival data are doubly truncated when only participants who experience an event during a random interval are included in the sample. Existing methods typically correct for double truncation bias in Cox regression through inverse probability weighting via the nonparametric maximum likelihood estimate (NPMLE) of the selection probabilities. This approach relies on two key...

Lifetime analysis with monotonic degradation: a boosted first hitting time model based on a homogeneous gamma process

In the context of time-to-event analysis, first hitting time methods consider the event occurrence as the ending point of some evolving process. The characteristics of the process are of great relevance for the analysis, which makes this class of models interesting and particularly suitable for applications where something about the degradation path is known. In cases where the...

Spatiotemporal multilevel joint modeling of longitudinal and survival outcomes in end-stage kidney disease

Individuals with end-stage kidney disease (ESKD) on dialysis experience high mortality and excessive burden of hospitalizations over time relative to comparable Medicare patient cohorts without kidney failure. A key interest in this population is to understand the time-dynamic effects of multilevel risk factors that contribute to the correlated outcomes of longitudinal...

Unifying mortality forecasting model: an investigation of the COM–Poisson distribution in the GAS model for improved projections

Forecasting mortality rates is crucial for evaluating life insurance company solvency, especially amid disruptions caused by phenomena like COVID-19. The Lee–Carter model is commonly employed in mortality modelling; however, extensions that can encompass count data with diverse distributions, such as the Generalized Autoregressive Score (GAS) model utilizing the COM–Poisson...

Nested case–control sampling without replacement

The nested case–control (NCC) design is a cost-effective outcome-dependent design in epidemiology that collects all cases and a fixed number of controls at the time of case diagnosis from a large cohort. Due to its inefficiency relative to full cohort studies, previous research developed various estimation methodologies but changing designs in the formulation of risk sets was considered...

A constrained maximum likelihood approach to developing well-calibrated models for predicting binary outcomes

The added value of candidate predictors for risk modeling is routinely evaluated by comparing the performance of models with and without the candidate predictors. Such a comparison is most meaningful when the risks estimated by the two models are both unbiased in the target population. Very often data for candidate predictors are sourced from nonrepresentative convenience...

Cox model inference for relative hazard and pure risk from stratified weight-calibrated case-cohort data

The case-cohort design obtains complete covariate data only on cases and on a random sample (the subcohort) of the entire cohort. Subsequent publications described the use of stratification and weight calibration to increase efficiency of estimates of Cox model log-relative hazards, and there has been some work estimating pure risk. Yet there are few examples of these options in...

Bias of the additive hazard model in the presence of causal effect heterogeneity

Hazard ratios are prone to selection bias, compromising their use as causal estimands. On the other hand, if Aalen’s additive hazard model applies, the hazard difference has been shown to remain unaffected by the selection of frailty factors over time. Then, in the absence of confounding, observed hazard differences are equal in expectation to the causal hazard differences...

Pseudo-value regression trees

This paper presents a semi-parametric modeling technique for estimating the survival function from a set of right-censored time-to-event data. Our method, named pseudo-value regression trees (PRT), is based on the pseudo-value regression framework, modeling individual-specific survival probabilities by computing pseudo-values and relating them to a set of covariates. The standard...

The built-in selection bias of hazard ratios formalized using structural causal models

It is known that the hazard ratio lacks a useful causal interpretation. Even for data from a randomized controlled trial, the hazard ratio suffers from so-called built-in selection bias as, over time, the individuals at risk among the exposed and unexposed are no longer exchangeable. In this paper, we formalize how the expectation of the observed hazard ratio evolves and deviates...

Dynamic Treatment Regimes Using Bayesian Additive Regression Trees for Censored Outcomes

To achieve the goal of providing the best possible care to each individual under their care, physicians need to customize treatments for individuals with the same health state, especially when treating diseases that can progress further and require additional treatments, such as cancer. Making decisions at multiple stages as a disease progresses can be formalized as a dynamic...

Estimation of separable direct and indirect effects in a continuous-time illness-death model

In this article we study the effect of a baseline exposure on a terminal time-to-event outcome either directly or mediated by the illness state of a continuous-time illness-death process with baseline covariates. We propose a definition of the corresponding direct and indirect effects using the concept of separable (interventionist) effects (Robins and Richardson in Causality and...

Evaluation of the natural history of disease by combining incident and prevalent cohorts: application to the Nun Study

The Nun Study is a well-known longitudinal epidemiology study of aging and dementia that recruited elderly nuns who were not yet diagnosed with dementia (i.e., incident cohort) and nuns who had dementia prior to entry (i.e., prevalent cohort). In such a natural history of disease study, multistate modeling of the combined data from both incident and prevalent cohorts is desirable to...

Causal inference with recurrent and competing events

Many research questions concern treatment effects on outcomes that can recur several times in the same individual. For example, medical researchers are interested in treatment effects on hospitalizations in heart failure patients and sports injuries in athletes. Competing events, such as death, complicate causal inference in studies of recurrent events because once a competing...

Regression models for censored time-to-event data using infinitesimal jack-knife pseudo-observations, with applications to left-truncation

Jack-knife pseudo-observations have in recent decades gained popularity in regression analysis for various aspects of time-to-event data. A limitation of the jack-knife pseudo-observations is that their computation is time-consuming, as the base estimate needs to be recalculated when leaving out each observation. We show that jack-knife pseudo-observations can be closely...

Latency function estimation under the mixture cure model when the cure status is available

This paper addresses the problem of estimating the conditional survival function of the lifetime of the subjects experiencing the event (latency) in the mixture cure model when the cure status information is partially available. The approach of past work relies on the assumption that long-term survivors are unidentifiable because of right censoring. However, in some cases this...

Estimating distribution of length of stay in a multi-state model conditional on the pathway, with an application to patients hospitalised with Covid-19

Multi-state models are used to describe how individuals transition through different states over time. The distribution of the time spent in different states, referred to as ‘length of stay’, is often of interest. Methods for estimating expected length of stay in a given state are well established. The focus of this paper is on the distribution of the time spent in different...

Investigating non-inferiority or equivalence in time-to-event data under non-proportional hazards

The classical approach to analyzing time-to-event data, e.g. in clinical trials, is to fit Kaplan–Meier curves yielding the treatment effect as the hazard ratio between treatment groups. Afterwards, a log-rank test is commonly performed to investigate whether there is a difference in survival or, depending on additional covariates, a Cox proportional hazards model is used. However...