<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title><!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Fusing Imaging and Metabolic Modelling in Ovarian Cancer</title>
<link rel="stylesheet" href="style.css">
</head>
<body>
<header>
<h1>Fusing Imaging and Metabolic Modelling in Ovarian Cancer</h1>
<p>Noushin Eftekhari, Aninda Saha, Suraj Verma, Guido Zampieri, Saladin Sawan, Annalisa Occhipinti, Claudio Angione</p>
</header>
<!-- Section: Figure -->
<section class="section">
<div class="container">
<img src="Images/Fig1-pipeline.png" alt="Patient-specific fluxomic and radiomics machine learning pipeline" style="max-width:100%; height:auto;">
<p><strong>Figure 1:</strong> Feature generation and feature augmentation consisted of two steps. (1.A) 3D tumour patches were extracted from unlabelled CT images using a U-Net architecture trained on labelled CT images from 50 cancer patients. (1.B) Transcriptomics data were integrated into genome-scale metabolic modelling (GSMM), an essential step in estimating patient-specific models. Specifically, gene expression (GE) data were used as constraints to create patient-specific models that simulate each patient's metabolism.
(2) Three deep learning models generated a latent data representation. This step included feature selection on the GE and flux data (FD) to reduce data dimensionality, and the design of three deep neural networks (DNNs) to integrate the image (IMG), GE and FD multi-modal datasets. (3) Finally, the patient-specific lower-dimensional features extracted from each modality were combined and fed into ML-based survival prediction models (RSF, CGBSurv, GBSurv, CoxPH, Coxnet, SurvSVM), which were evaluated using the C-index and Kaplan-Meier (KM) curves.</p>
</div>
</section>
<section id="highlights">
<h2>Highlights</h2>
<ul>
<li>Integrated Multi-Modal Pipeline for Personalized Cancer Models</li>
<li>Metabolic Biomarker Identification for Ovarian Cancer</li>
<li>Explainable AI Enhances Biological Understanding</li>
</ul>
</section>
<section id="abstract">
<h2>Abstract</h2>
<p>
Integrating heterogeneous data is crucial to elucidate the molecular bases of ovarian cancer. Measurements charting the relationship between genotype (e.g., transcriptomics), phenotype (e.g., imaging), and tumour microenvironment (e.g., metabolomics) are required for a full picture of tumour development. However, robust multimodal integration methods are lacking when only a limited number of common samples is available. In this study, we generate patient-specific metabolic models starting from transcriptomics data and integrate the resulting flux rates (fluxomics) with imaging data. We show that this multi-modal integration (never attempted before) improves ovarian cancer survival estimation, while enabling a mechanistic interpretation of the predictions. To assess the robustness of our approach, we predict patient survival and evaluate accuracy with different combinations of transcriptomics, fluxomics, and computerised tomography (CT) imaging data. Generating flux rates and fusing them with imaging and transcriptomics significantly improves model accuracy compared to widely used transcriptomics-imaging approaches, while also producing insights into critical metabolic reactions in ovarian cancer. Our approach is general and can be applied to other cancer types where coupled imaging-transcriptomics data are available.
</p>
</section>
<section id="results">
<h2>Results</h2>
<div class="figure">
<img src="Images/Fig4-TCA.png" alt="Metabolic Modelling Identifies MDH, G6PDH2c, PFK as Potential Ovarian Cancer Biomarkers" style="max-width:100%; height:auto;">
<p>(A) Difference between the fluxomic data of high-risk (HR) and low-risk (LR) patients, obtained from maxFVA, for critical metabolic pathways in ovarian cancer, as predicted by our model. Red indicates the largest positive difference between LR and HR patients (505.1 mmol/gDWh), while green indicates the largest negative difference (-421 mmol/gDWh). (B) Heatmap of the maximum flux values of reactions in critical pathways in ovarian cancer.
(C) Heatmap of the median of all reaction maximum flux values in each of the 142 subsystems of the Human-GEM atlas for LR and HR patients. In both (B) and (C), nine uncensored patients were randomly selected from the test set.</p>
</div>
<div class="figure">
<img src="Images/Fig3-Cindex.png" alt="Metabolic Feature Augmentation Improves Omics-Based Survival Models" style="max-width:100%; height:auto;">
<p>(A) C-index values calculated with different combinations of IMG, FD and GE data and six ML methods (RSF, CGBSurv, GBSurv, CoxPH, Coxnet, and SurvSVM). The statistical differences between pairs of methods are shown via <em>p</em>-values (each t-test compares one pair drawn from the six ML methods). Integrating all three modalities gives better results than integrating two (statistical significance: ns: <em>p</em> &gt; 0.05; *: <em>p</em> &le; 0.05; **: <em>p</em> &le; 0.01; ***: <em>p</em> &le; 0.001). (B) The KM curves show the separation of HR and LR patients for each of the six ML methods. The corresponding <em>p</em>-values show that CGBSurv, GBSurv, and CoxPH were the best-performing models. (C) Fluxomic data (maxFVA values) of LR and HR patients for five critical pathways in ovarian cancer: galactose metabolism (GM), glycolysis/gluconeogenesis (Gly/Glu), the pentose phosphate pathway (PPP), transport reactions (TR), and the tricarboxylic acid cycle and glyoxylate/dicarboxylate metabolism (TCA/DM).</p>
</div>
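<p>The C-index reported in panel (A) measures how often a model ranks pairs of patients in the correct order of survival. The page does not state which implementation was used, so the following is only a minimal sketch of Harrell's pairwise definition in plain Python. The function name <code>concordance_index</code> and its arguments are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplifying assumption.</p>
<pre><code>
```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times:  observed follow-up times, one per patient
    events: 1 if the event was observed, 0 if the patient was censored
    risks:  predicted risk scores (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter follow-up.
        a, b = (i, j) if times[j] > times[i] else (j, i)
        # A pair is comparable only if the times differ and the patient
        # with the shorter follow-up actually experienced the event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # the model ranks this pair correctly
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```
</code></pre>
<p>For example, <code>concordance_index([1, 2, 3], [1, 1, 1], [3, 1, 2])</code> evaluates to 2/3, because two of the three comparable pairs are ranked correctly. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.</p>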
<div class="figure">
<img src="Images/Fig4-5_AO.png" alt="Biological Interpretation of Selected Gene Features shows MUC16, CLDN3 and the KLK Family as Biomarkers for Ovarian Cancer" style="max-width:100%; height:auto;">
<p>(A) For both U-Net model versions, the loss decreases as the number of steps increases. The best value is 0.39 with 20 scans per patient, which indicates the more accurate model for segmenting the region of interest (ROI). (B) C-index on the test set of the three DNN models trained to extract the last hidden layer features for IMG, FD, and GE. (C) and (D) The distribution of SHAP values for each data type for the best model (Coxnet) demonstrates that FD has a substantial impact on ovarian cancer survival prediction, showing that adding this layer to radiomics data can improve the outcomes. (E) Number of genes associated with significant pathways (Benjamini-corrected <em>p</em>-value &lt; 0.05). (F) Normal distribution of the HR and LR groups based on the SHAP values of the integrated feature vector (FD, GE, IMG). (G) and (H) Empirical cumulative distribution function (ECDF) of the SHAP values, plotted from the lowest contributing feature to the highest contributing one and showing how they are spread across data types. (I) The SHAP explainer was applied to the DNN model to extract the most important features from the GE data type. The PGM5 gene had a significant impact on the DNN prediction, and the PGM reaction also showed a significant flux rate in the glycolysis pathway.
(J) The 30 genes with the largest variation in GE values between cancerous and healthy tissue. A pairwise comparison of each gene between the two groups also yields a significant <em>p</em>-value. Keratin 7 (KRT7) shows a difference of about ten units between the median GE of healthy and cancer samples, and is a known prognostic marker for ovarian cancer. Lipocalin 2 (LCN2), the CD24 molecule (CD24), and epithelial splicing regulatory protein 1 (ESRP1) are prognostic markers in breast cancer that, according to our analysis, differ significantly between the healthy and cancer groups. (K) Reactome pathway enrichment showing the top 30 pathways significantly linked to ovarian cancer (<em>p</em>-value &lt; 0.05) that also contain a high proportion of genes among the selected GE features.</p>
</div>
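<p>Panel (E) filters pathways by a corrected <em>p</em>-value. Assuming that "Benjamini" refers to the Benjamini-Hochberg procedure (the page does not say so explicitly), the correction can be sketched as follows. The function name <code>benjamini_hochberg</code> is hypothetical and the code is an illustration only, not the analysis used to produce the figure.</p>
<pre><code>
```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from the largest to the smallest.
    order = sorted(range(m), key=lambda k: p_values[k], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # rank m for the largest p-value, 1 for the smallest
        candidate = p_values[idx] * m / rank
        # Taking the running minimum keeps the adjusted values monotone
        # in the rank of the raw p-values and caps them at 1.
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted
```
</code></pre>
<p>A pathway would then be retained when its adjusted value falls below the chosen threshold, for instance 0.05 as in the caption above.</p>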
</section>
<footer>
<p>© 2024 Noushin Eftekhari. All rights reserved.</p>
</footer>
</body>
</html>
</title>
</head>
<body>
</body>
</html>