Date of Award

Fall 12-11-2023

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Biology

First Advisor

Ritu Aneja

Second Advisor

Jun Kong

Third Advisor

Emilius Adrianus Maria Janssen

Abstract

Background: Triple Negative Breast Cancer (TNBC) is an aggressive breast cancer subtype that lacks expression of the estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2). Neoadjuvant chemotherapy (NAC), i.e., chemotherapy given before surgery to downstage the tumor, is part of the standard treatment approach for patients with TNBC. However, only 30-40% of patients with TNBC respond well to NAC and achieve a pathological complete response (pCR), i.e., absence of residual disease (RD). The remaining ~60-70% of patients, who do not respond well to NAC, are at a higher risk of disease recurrence and distant metastasis (Met).

Methods: In this study, we developed supervised machine learning models to distinguish between various histological components in hematoxylin and eosin (H&E)-stained whole slide images (WSIs) of annotated TNBC tissue and to identify features that can predict NAC response and metastasis. The NAC study used H&E-stained WSIs of treatment-naïve biopsies from 85 patients (model development cohort) and 79 patients (validation cohort). Tile-level model inputs were preprocessed WSI tiles, each represented by 55 texture features and split using a stratified 8-fold cross-validation strategy (TNBC H&E histology pipeline). Patient-level models used the top eight graph-based features of paired histology classification maps and followed a leave-one-out cross-validation strategy. The metastasis study incorporated H&E-stained WSIs of adjuvant-treated resections from 115 patients into the TNBC H&E histology pipeline. Patient-level models used graph-based features, normalized clinical features, and metastasis outcome data, combined with the synthetic minority oversampling technique (SMOTE) and a nested 4-fold cross-validation strategy.
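The combination of oversampling and fold-based validation described in the Methods can be illustrated with a minimal sketch. It is not the dissertation's code: the function names (`stratified_folds`, `smote_like`, `oversampled_training_sets`), the toy data, and the neighbor count are hypothetical, and the choice to oversample only the training portion of each fold is an assumption based on common practice, since the abstract does not state where SMOTE was applied. The sketch shows a single 4-fold loop; a nested design would repeat the same split inside each training portion for model selection.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds while keeping class ratios similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def smote_like(rows, n_new, n_neighbors=3, seed=0):
    """Create n_new synthetic rows by interpolating between a minority row
    and one of its nearest minority neighbors (the core idea behind SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(rows)
        others = [r for r in rows if r is not base]
        others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
        neighbor = rng.choice(others[:n_neighbors])
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + t * (b - a) for a, b in zip(base, neighbor)])
    return synthetic


def oversampled_training_sets(features, labels, k=4, seed=0):
    """Yield (train_X, train_y, test_idx) per fold.

    Assumption: only the training portion is balanced, so no synthetic
    sample is derived from a held-out patient.
    """
    folds = stratified_folds(labels, k, seed)
    for fold_no, test_idx in enumerate(folds):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        train_X = [features[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        counts = defaultdict(int)
        for y in train_y:
            counts[y] += 1
        majority = max(counts.values())
        for cls, count in list(counts.items()):
            if 2 <= count < majority:
                minority_rows = [x for x, y in zip(train_X, train_y) if y == cls]
                new_rows = smote_like(minority_rows, majority - count,
                                      seed=seed + fold_no)
                train_X.extend(new_rows)
                train_y.extend([cls] * len(new_rows))
        yield train_X, train_y, test_idx


# Toy data (hypothetical): 20 samples, 4 of them in the minority class.
features = [[float(i), float(i % 3)] for i in range(20)]
labels = [1 if i % 5 == 0 else 0 for i in range(20)]
results = list(oversampled_training_sets(features, labels, k=4))
```

Keeping the held-out fold untouched means the reported accuracy is measured only on real samples, which is the usual motivation for oversampling inside the cross-validation loop rather than before it.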

Results: The NAC machine learning (ML) pipeline achieved 84.1% accuracy, and the Met ML pipeline achieved 80.1% accuracy. The histological class pairs with the strongest NAC response predictive ability were tumor & tumor-infiltrating lymphocytes (TILs) for pCR and microvessel density & polyploid giant cancer cells (PGCCs) for RD. Similarly, the histological class pairs with the strongest Met predictive ability were stroma & PGCCs and stroma & tumor. Adding clinical variables to the Met pipelines produced a statistically significant increase in the overall performance metrics of the Met ML models.

Conclusion: Our machine learning pipelines can robustly identify clinically relevant histological classes to predict NAC response and metastatic outcomes in TNBC patients.

DOI

https://doi.org/10.57709/36393233
