Deterministic linkage for improving follow-up time in a Brazilian population-based cancer registry

Scientific Reports, Apr 2023

Population-based cancer registries (PBCRs) are the primary source of cancer incidence and survival statistics. Loss to follow-up of registered patients is concerning because it reduces the reliability of any statistical analysis. Linkage techniques have been increasingly used to improve data quality in various information systems. In this study, the PBCR-Barretos database was linked with the mortality database of the state of São Paulo. To evaluate the improvement in patient follow-up time, the pre- and post-linkage databases were compared. Three analyses were performed: a comparison of the absolute number of deaths, a comparison of patient follow-up time, and a survival analysis. After linkage, the number of recorded deaths increased by 813. Follow-up time was extended for most tumour types. Comparison of the survival analyses at the two time points also showed a decrease in survival probabilities for all tumour types. Deterministic linkage is effective in updating the vital status of registered patients, improving patient follow-up time, and maintaining good-quality PBCR data, consequently producing more reliable rates, as seen in the survival analyses.

Talita Fernanda Pereira 1,2*, Valmir José Aranha 3, Bernadette Cunha Waldvogel 3, Allini Mafra da Costa 1,2,4,6 & José Humberto Tavares Guerreiro Fregnani 1,5,6
1 Post Graduate Program of the Education and Research Institute, Barretos Cancer Hospital, Pio XII Foundation, Barretos, São Paulo 14784-400, Brazil.
2 Based-Population Cancer Registry of Barretos Region, Barretos Cancer Hospital, Pio XII Foundation, Barretos, São Paulo 14784-400, Brazil.
3 State System of Data Analysis Foundation, São Paulo 05508-000, Brazil.
4 Department of Precision Health, Luxembourg Institute of Health, 1445 Strassen, Luxembourg.
5 A.C. Camargo Cancer Center, São Paulo 01525-001, Brazil.
6 These authors contributed equally: Allini Mafra da Costa and José Humberto Tavares Guerreiro Fregnani.
* Corresponding author.

Scientific Reports (2023) 13:4816, https://doi.org/10.1038/s41598-023-31303-6

Abbreviations
PBCR: Population-based cancer registries
LFU: Loss to follow-up
IARC: International Agency for Research on Cancer
FSEADE: State Data Analysis System Foundation
DCO: Death certificate only
CPF: Brazilian personal identification number (in Portuguese: Cadastro de Pessoa Física)
ICD-10: International Statistical Classification of Diseases, 10th Revision
APAC: High-complexity information system
SAI: Outpatient information system
SINAN: Notifiable diseases information system
SIM: Mortality information systems

Population-based cancer registries (PBCRs) are the primary source of cancer incidence and survival statistics and are considered the gold standard [1]. These statistics are critical tools for cancer prevention initiatives and are widely regarded as essential for health services and cancer control programs worldwide [2,3]. Survival estimates also contribute to the clinical treatment of patients because, based on the data, doctors can adopt more effective treatments and medications [4]. Cancer registries should therefore always ensure data quality to avoid inaccurate information about the disease state [2].

According to the standards recommended by the International Agency for Research on Cancer (IARC), the quality of a PBCR is evaluated along five dimensions: comparability, validity, timeliness, completeness, and data quality indices for population-based cancer survival. The latest IARC technical report, "Planning and developing population-based cancer registries in low- and middle-income settings", was published in 2014 and described certain quality indicators, such as the quality provided by survival rates, as measured by the follow-up time of cancer cases. This feature depends mostly on the PBCR's passive follow-up, which involves retrieving reported deaths from vital registry databases [5].

The loss to follow-up (LFU) of these patients is concerning, since it reduces the reliability of statistical analysis and is a potential source of bias [6]. To ensure that these statistics are of satisfactory quality, follow-up must be complete, from the time of diagnosis until death or the last contact update. PBCRs therefore cross-reference their databases with those of civil registries and vital statistics to identify deaths and causes of death among their patients. However, the linkage process is quite challenging because access to civil registry databases is often restricted. This restriction can have numerous causes, the main one being the region's public policies, such as data protection laws, which prevent access to sensitive information about individuals [1].

Given the importance of quality control of the information in PBCRs, the difficulties these registries face, and the availability of linkage as a technique for improving data quality, the current study aimed to evaluate the effects on the quality of information from the Population-Based Cancer Registry of Barretos, São Paulo, after deterministic linkage with the mortality database of the State of São Paulo government.
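The effect of a missed death on follow-up time, as discussed above, can be illustrated with a small sketch. It assumes that follow-up runs from diagnosis to the date of death when a death is known, and otherwise to the last contact; the function name `follow_up` and all dates are hypothetical and are not taken from the study.

```python
from datetime import date


def follow_up(diagnosis, last_contact, death=None):
    """Return (days of follow-up, died flag) for one patient.

    Assumption for illustration: follow-up ends at death when a death
    date is known, and at the last contact otherwise.
    """
    end = death if death is not None else last_contact
    return (end - diagnosis).days, death is not None


# Hypothetical patient who was lost to follow-up in the registry.
diagnosis = date(2010, 1, 1)
last_contact = date(2012, 1, 1)
linked_death = date(2015, 1, 1)  # death found only through linkage

before = follow_up(diagnosis, last_contact)                # pre-linkage
after = follow_up(diagnosis, last_contact, linked_death)   # post-linkage

print("pre-linkage :", before)
print("post-linkage:", after)
```

In this example the pre-linkage record contributes a shorter follow-up time and counts the patient as alive at last contact, while the post-linkage record contributes a longer follow-up time and a death.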
Material and methods

This observational study included a cohort of 11,346 incident cancer cases registered in the PBCR of Barretos (São Paulo state, Brazil) between 2002 and 2018, with patients of both sexes ranging in age from 0 to 99 years. Through technical cooperation, the mortality database of the State Data Analysis System Foundation (FSEADE) was used for deterministic linkage with the incident cancer cases in the PBCR of Barretos. This database holds over 3.6 million deaths (excluding fetal deaths) in the State of São Paulo (Brazil). FSEADE used the deterministic technique to detect deaths, to improve follow-up time and to identify new cases (death certificate only, DCO).

To perform the linkage, the database was encrypted and sent to FSEADE through its institutional platform, to which the data were uploaded after the researcher's username and password had been registered. After the dataset was sent to the platform, only the professionals involved in the linkage had access to download it. It was deleted after the affiliated group was allowed.

The name, mother's name, date of birth and Brazilian personal identification number (in Portuguese: Cadastro de Pessoa Física, CPF) were the criteria used to consider records in the two databases as belonging to the same individual (called pairs). Despite all attempts, it was not possible to obtain the mother's name in 91 cases and the date of birth in one case in the PBCR of Barretos database. The CPF and the individual's name were recorded for all cases in the PBCR of Barretos database. After deterministic linkage, there was a (...truncated)
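The matching criteria described above (name, mother's name, date of birth and CPF) can be illustrated with a minimal sketch of deterministic linkage. This is not FSEADE's procedure, which the excerpt does not detail. The sketch assumes that all four identifiers must agree exactly after simple normalization, and that records with a missing identifier are left unlinked. The field names, helper functions and sample records are hypothetical.

```python
import unicodedata
from datetime import date


def normalize(text):
    """Uppercase, strip accents and collapse spaces ('José' -> 'JOSE')."""
    if not text:
        return ""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.upper().split())


def linkage_key(record):
    """Exact-match key; None when any identifier is missing (assumption)."""
    fields = (
        normalize(record.get("name")),
        normalize(record.get("mother_name")),
        record.get("birth_date"),
        "".join(ch for ch in (record.get("cpf") or "") if ch.isdigit()),
    )
    return fields if all(fields) else None


def link_deaths(registry, deaths):
    """Return {case_id: death record} for cases whose key matches a death."""
    index = {}
    for death in deaths:
        key = linkage_key(death)
        if key is not None:
            index[key] = death
    pairs = {}
    for case in registry:
        key = linkage_key(case)
        if key in index:
            pairs[case["case_id"]] = index[key]
    return pairs


# Fictitious records for illustration only.
registry = [
    {"case_id": 1, "name": "Maria José da Silva", "mother_name": "Ana da Silva",
     "birth_date": date(1950, 3, 14), "cpf": "123.456.789-09"},
    {"case_id": 2, "name": "João Pereira", "mother_name": None,
     "birth_date": date(1962, 7, 1), "cpf": "987.654.321-00"},
]
deaths = [
    {"name": "MARIA JOSE DA SILVA", "mother_name": "ANA DA SILVA",
     "birth_date": date(1950, 3, 14), "cpf": "12345678909",
     "death_date": date(2016, 5, 2)},
    {"name": "JOAO PEREIRA", "mother_name": "RITA PEREIRA",
     "birth_date": date(1962, 7, 1), "cpf": "98765432100",
     "death_date": date(2017, 1, 9)},
]

pairs = link_deaths(registry, deaths)
print(sorted(pairs))  # case ids that were paired with a death record
```

Under these assumptions, case 1 is paired despite differences in accents and CPF punctuation, while case 2 stays unlinked because the mother's name is missing. A real linkage would need an explicit rule for such incomplete records.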


This is a preview of a remote PDF: https://www.nature.com/articles/s41598-023-31303-6.pdf
Article home page: https://www.nature.com/articles/s41598-023-31303-6

Pereira, Talita Fernanda, Aranha, Valmir José, Waldvogel, Bernadette Cunha, da Costa, Allini Mafra, Tavares Guerreiro Fregnani, José Humberto. Deterministic linkage for improving follow-up time in a Brazilian population-based cancer registry, Scientific Reports, DOI: 10.1038/s41598-023-31303-6