Analyzing survival data from the National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) Program offers invaluable insight to guide cancer management. In these data the event times are recorded in units of months. In such datasets, it can be important to consider the selection of covariates, distinguish between time-varying and time-independent effects, and...
We consider a bivariate first hitting-time model in which durations are the crossing times of dependent compound Poisson processes with fixed thresholds. The identifiability of the model is discussed, and likelihood estimators of the model parameters are proposed. We obtain the asymptotic properties of the estimators and underline their finite sample performance with a simulation...
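As a rough illustration of the first hitting-time idea in the abstract above, the sketch below simulates the crossing time of a single compound Poisson process over a fixed threshold. It is a univariate simplification and not the bivariate dependent model of the paper; the exponential jump-size distribution, the function name `simulate_hitting_time`, and all parameter values are assumptions made for the example.

```python
import numpy as np

def simulate_hitting_time(rate, jump_mean, threshold, rng):
    """Simulate the first time a compound Poisson process reaches `threshold`.

    Assumptions (not from the source): jumps arrive at Poisson rate `rate`
    and jump sizes are exponential with mean `jump_mean`.
    """
    t = 0.0      # current time
    level = 0.0  # current value of the process
    while level < threshold:
        t += rng.exponential(1.0 / rate)   # waiting time until the next jump
        level += rng.exponential(jump_mean)  # size of that jump
    return t

rng = np.random.default_rng(0)
# Hypothetical parameter values chosen only for illustration.
durations = np.array([simulate_hitting_time(2.0, 1.0, 5.0, rng) for _ in range(2000)])
print(durations.mean())
```

Under these assumptions the number of jumps needed to cross the threshold is one plus a Poisson count with mean `threshold / jump_mean`, so the simulated mean duration should be close to `(1 + threshold / jump_mean) / rate`.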
The wild bootstrap is a popular resampling method in the context of time-to-event data analysis. Previous work established its large-sample properties for applications to different estimators and test statistics. It can be used to justify the accuracy of inference procedures such as hypothesis tests or time-simultaneous confidence bands. This paper provides a general...
In this study, we focus on estimating the heterogeneous treatment effect (HTE) for a survival outcome. The outcome is subject to censoring and the covariates are high-dimensional. We utilize data from both a randomized controlled trial (RCT), considered the gold standard, and real-world data (RWD), possibly affected by hidden confounding factors. To achieve a more...
Survival data are doubly truncated when only participants who experience an event during a random interval are included in the sample. Existing methods typically correct for double truncation bias in Cox regression through inverse probability weighting via the nonparametric maximum likelihood estimate (NPMLE) of the selection probabilities. This approach relies on two key...
In the context of time-to-event analysis, first hitting time methods consider the event occurrence as the ending point of some evolving process. The characteristics of the process are of great relevance for the analysis, which makes this class of models interesting and particularly suitable for applications where something about the degradation path is known. In cases where the...
Individuals with end-stage kidney disease (ESKD) on dialysis experience high mortality and an excessive burden of hospitalizations over time relative to comparable Medicare patient cohorts without kidney failure. A key interest in this population is to understand the time-dynamic effects of multilevel risk factors that contribute to the correlated outcomes of longitudinal...
Forecasting mortality rates is crucial for evaluating life insurance company solvency, especially amid disruptions caused by phenomena like COVID-19. The Lee–Carter model is commonly employed in mortality modelling; however, extensions that can encompass count data with diverse distributions, such as the Generalized Autoregressive Score (GAS) model utilizing the COM–Poisson...
The nested case–control (NCC) design is a cost-effective outcome-dependent design in epidemiology that collects all cases and a fixed number of controls at the time of case diagnosis from a large cohort. Because the design is inefficient relative to full cohort studies, previous research developed various estimation methodologies, but changing designs in the formulation of risk sets was considered...
The added value of candidate predictors for risk modeling is routinely evaluated by comparing the performance of models with and without the candidate predictors. Such a comparison is most meaningful when the risks estimated by the two models are both unbiased in the target population. Very often data for candidate predictors are sourced from nonrepresentative convenience...
The case-cohort design obtains complete covariate data only on cases and on a random sample (the subcohort) of the entire cohort. Subsequent publications described the use of stratification and weight calibration to increase efficiency of estimates of Cox model log-relative hazards, and there has been some work estimating pure risk. Yet there are few examples of these options in...
Hazard ratios are prone to selection bias, compromising their use as causal estimands. On the other hand, if Aalen’s additive hazard model applies, the hazard difference has been shown to remain unaffected by the selection of frailty factors over time. Then, in the absence of confounding, observed hazard differences are equal in expectation to the causal hazard differences...
This paper presents a semi-parametric modeling technique for estimating the survival function from a set of right-censored time-to-event data. Our method, named pseudo-value regression trees (PRT), is based on the pseudo-value regression framework, modeling individual-specific survival probabilities by computing pseudo-values and relating them to a set of covariates. The standard...
It is known that the hazard ratio lacks a useful causal interpretation. Even for data from a randomized controlled trial, the hazard ratio suffers from so-called built-in selection bias as, over time, the individuals at risk among the exposed and unexposed are no longer exchangeable. In this paper, we formalize how the expectation of the observed hazard ratio evolves and deviates...
To provide the best possible care to each individual under their care, physicians need to customize treatments for individuals with the same health state, especially when treating diseases that can progress further and require additional treatments, such as cancer. Making decisions at multiple stages as a disease progresses can be formalized as a dynamic...
In this article we study the effect of a baseline exposure on a terminal time-to-event outcome either directly or mediated by the illness state of a continuous-time illness-death process with baseline covariates. We propose a definition of the corresponding direct and indirect effects using the concept of separable (interventionist) effects (Robins and Richardson in Causality and...
The Nun study is a well-known longitudinal epidemiology study of aging and dementia that recruited elderly nuns who were not yet diagnosed with dementia (i.e., the incident cohort) and nuns who had dementia prior to entry (i.e., the prevalent cohort). In such a natural history of disease study, multistate modeling of the combined data from both incident and prevalent cohorts is desirable to...
Many research questions concern treatment effects on outcomes that can recur several times in the same individual. For example, medical researchers are interested in treatment effects on hospitalizations in heart failure patients and sports injuries in athletes. Competing events, such as death, complicate causal inference in studies of recurrent events because once a competing...
Jack-knife pseudo-observations have in recent decades gained popularity in regression analysis for various aspects of time-to-event data. A limitation of the jack-knife pseudo-observations is that their computation is time-consuming, as the base estimate needs to be recalculated when leaving out each observation. We show that jack-knife pseudo-observations can be closely...
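To make the computational cost mentioned in the abstract above concrete, here is a minimal sketch of the standard leave-one-out construction, with the Kaplan–Meier estimate of the survival probability at a fixed time as the base estimate. The function names `km_survival` and `jackknife_pseudo_obs` and the toy data are hypothetical, and this shows the exact jack-knife recomputation, not the approximation proposed in the paper.

```python
import numpy as np

def km_survival(time, event, t0):
    """Kaplan-Meier estimate of S(t0) from right-censored data."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv = 1.0
    for t in np.unique(time[event]):  # distinct event times, ascending
        if t > t0:
            break
        at_risk = np.sum(time >= t)
        deaths = np.sum((time == t) & event)
        surv *= 1.0 - deaths / at_risk
    return surv

def jackknife_pseudo_obs(time, event, t0):
    """Pseudo-observation i is n * S_hat(t0) - (n - 1) * S_hat^(-i)(t0)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    n = len(time)
    full = km_survival(time, event, t0)
    keep = np.ones(n, dtype=bool)
    pseudo = np.empty(n)
    for i in range(n):
        keep[i] = False  # leave observation i out
        loo = km_survival(time[keep], event[keep], t0)  # base estimate refitted
        pseudo[i] = n * full - (n - 1) * loo
        keep[i] = True
    return pseudo

# Toy data without censoring (illustrative values only).
pseudo = jackknife_pseudo_obs([1, 2, 3, 4, 5], [1, 1, 1, 1, 1], t0=2.5)
print(pseudo)
```

The base estimate is refitted once per observation, which is the cost the abstract refers to. Without censoring, the pseudo-observations reduce to the indicators of surviving beyond `t0`, which gives a simple check of the implementation.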
This paper addresses the problem of estimating the conditional survival function of the lifetime of the subjects experiencing the event (latency) in the mixture cure model when the cure status information is partially available. The approach of past work relies on the assumption that long-term survivors are unidentifiable because of right censoring. However, in some cases this...
Multi-state models are used to describe how individuals transition through different states over time. The distribution of the time spent in different states, referred to as ‘length of stay’, is often of interest. Methods for estimating expected length of stay in a given state are well established. The focus of this paper is on the distribution of the time spent in different...
The classical approach to analyzing time-to-event data, e.g. in clinical trials, is to fit Kaplan–Meier curves yielding the treatment effect as the hazard ratio between treatment groups. Afterwards, a log-rank test is commonly performed to investigate whether there is a difference in survival or, depending on additional covariates, a Cox proportional hazards model is used. However...
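For readers unfamiliar with the log-rank test named in the abstract above, the following is a minimal sketch of the standard two-group statistic, assuming right-censored data and the usual hypergeometric variance. The function name `logrank_test` and the toy data are hypothetical, and the sketch does not represent the alternative approach the paper goes on to describe.

```python
import math
import numpy as np

def logrank_test(time, event, group):
    """Two-group log-rank chi-square statistic (1 df) and its p-value.

    `group` is a boolean indicator for the treatment arm. The variance must
    be positive, i.e. both arms must be at risk at some event time.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group, dtype=bool)
    obs_minus_exp = 0.0
    var = 0.0
    for t in np.unique(time[event]):
        at_risk = time >= t
        n = at_risk.sum()                         # at risk overall
        n1 = (at_risk & group).sum()              # at risk in treatment arm
        d = ((time == t) & event).sum()           # events overall at t
        d1 = ((time == t) & event & group).sum()  # events in treatment arm
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    stat = obs_minus_exp ** 2 / var
    # Upper tail of the chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Toy data (illustrative only): the treated arm fails earlier.
stat, p_value = logrank_test([1, 2, 3, 4, 5, 6], [1, 1, 1, 1, 1, 1], [1, 1, 1, 0, 0, 0])
print(stat, p_value)
```

The statistic compares observed and expected event counts in one arm across all event times, so it tests for a difference in survival without producing an effect size such as a hazard ratio.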