Event

Data Fusion for Studying Cancer Surveillance Databases

Wednesday, September 3, 2025 15:30 to 16:30

Fatema Shafie Khorassani, PhD

Assistant Professor of Biostatistics, Boston University School of Public Health

Note: Meet & Greet with Prof. Shafie Khorassani from 3:00 to 3:30 p.m. in Room 1140, prior to the seminar from 3:30 to 4:30 p.m.

WHEN: Wednesday, September 3, 2025, from 3:30 to 4:30 p.m.
WHERE: Hybrid | 2001 McGill College Avenue, Rm 1140
NOTE: Fatema Shafie Khorassani will be presenting in person

Abstract

Studying factors associated with racial disparities in cancer mortality requires data on many variables, including healthcare access, socioeconomic status, and comorbidities. Existing national cancer surveillance resources each collect parts of the necessary information. Using the Surveillance, Epidemiology, and End Results (SEER) registry means excluding information like hospital type, insurance status, and comorbidities. On the other hand, the National Cancer Database (NCDB), which does provide that information, excludes cause of death, making it impossible to study cancer-specific mortality. Integrating data from multiple sources allows us to study associations between race and cancer-specific mortality adjusted for important confounders. Our goal is to make inference about a model regressing an outcome from one dataset on a set of important confounders from another, where both datasets collect a set of common variables. We propose a method for data fusion with time-to-event outcomes and present semiparametric estimating equations for it. Our proposed estimating equation is doubly robust, providing consistent parameter estimates if either the source process or the distribution of partially observed covariates is correctly specified. We apply the estimating equations to studying racial disparities in cancer mortality using data from SEER adjusted for confounders collected in NCDB.

Speaker Bio

Fatema Shafie Khorassani is an Assistant Professor of Biostatistics at the Boston University School of Public Health. Her statistical methodology research focuses on data integration methods for time-to-event outcomes, causal inference for observational data, and statistical methods for the evaluation of surrogate outcomes. Her work is motivated by studying health disparities using complex observational data sources. She has applied this work in studying cancer, stroke, and tobacco use. She earned her PhD in Biostatistics from the University of Michigan.
