diff --git a/README.md b/README.md
index f37167c..a8ae932 100644
--- a/README.md
+++ b/README.md
@@ -8,13 +8,21 @@ The repository provides a collection of vision language models, benchmarks, and
 ## VILA-M3
 
 **VILA-M3** is a *vision language model* designed specifically for medical applications. 
-It focuses on addressing the unique challenges faced by general-purpose vision-language models when applied to the medical domain.
+It addresses the unique challenges that general-purpose vision-language models face when applied to the medical domain and integrated with existing expert segmentation and classification models.
+
+<p align="center">
+  <img src="m3/docs/images/MONAI-VLM_Overview.svg" width="95%"/>
+</p>
 
 For details, see [here](m3/README.md).
 
 
 ### Local Demo
 
+<p align="center">
+  <img src="m3/docs/images/gradio_app_ct.png" width="60%"/>
+</p>
+
 #### Prerequisites
 
 1. **Linux Operating System**
diff --git a/m3/README.md b/m3/README.md
index a109dcc..1165929 100644
--- a/m3/README.md
+++ b/m3/README.md
@@ -43,29 +43,38 @@ The resulting expert model output will be fed back to the VLM for generating the
 ## Performance
 
 ### VQA Benchmarks
-|                   | Average |
-|-------------------|---------|
-| VILA-M3-3B        |         |
-| Llama3-VILA-M3-8B |         |
-| VILA-M3-13B       |         |
+|     Model                 |     Type             | VQA-RAD*  | SLAKE-VQA | Path-VQA | Average  |
+|---------------------------|----------------------|-----------|-----------|----------|----------|
+|     Llava-Med             |     Task-specific    | *84.2*    | *86.8*    | *91.7*   | *87.6*   |
+|     Med-Gemini-1.5T       |     Generalist       | 78.8      | **84.8**  | 83.3     | 82.3     |
+|     Llama3-VILA-M3-3B     |     Generalist       | 78.2      | 79.8      | 87.9     | 82.0     |
+|     Llama3-VILA-M3-8B     |     Generalist       | **84.5**  | 84.5      | 90.0     | **86.3** |
+|     Llama3-VILA-M3-13B    |     Generalist       | 80.5      | 83.2      | **91.0** | 84.9     |
+*Comparisons to Llava-Med & Med-Gemini are not direct as data splits are not available.
 
 ### Report Generation Benchmarks
-|                   | Average |
-|-------------------|---------|
-| VILA-M3-3B        |         |
-| Llama3-VILA-M3-8B |         |
-| VILA-M3-13B       |         |
+|     Model                 |     Type             | BLEU-4*  | ROUGE*   | GREEN*   |
+|---------------------------|----------------------|----------|----------|----------|
+|     Llava-Med             |     Task-specific    | *1.0*    | *13.3*   | -        |
+|     Med-Gemini-1.5T       |     Generalist       | 20.5     | 28.3     | -        |
+|     Llama3-VILA-M3-3B     |     Generalist       | 20.2     | 31.7     | 39.4     |
+|     Llama3-VILA-M3-8B     |     Generalist       | 21.5     | **32.3** | 40.0     |
+|     Llama3-VILA-M3-13B    |     Generalist       | **21.6** | 32.1     | 39.3     |
+*Comparisons to Llava-Med & Med-Gemini are not direct as data splits are not available.
 
 ### Classification Benchmarks
-|                   | Average |
-|-------------------|---------|
-| VILA-M3-3B        |         |
-| Llama3-VILA-M3-8B |         |
-| VILA-M3-13B       |         |
-
+| Expert info               | w/o          | w/o        | with         | with       |
+|---------------------------|--------------|------------|--------------|------------|
+|     Model                 | ChestX-ray14 | CheXpert   | ChestX-ray14 | CheXpert   |
+|     Med-Gemini-1.5T       | 46.7         | 48.3       | -            | -          |
+|     TorchXRayVision       | -            | -          | 50           | 51.5       |
+|     Llama3-VILA-M3-3B     | 48.4         | 57.4       | **51.3**     | 60.8       |
+|     Llama3-VILA-M3-8B     | 45.9         | **61.4**   | 50.7         | 60.4       |
+|     Llama3-VILA-M3-13B    | **49.9**     | 55.8       | 51.2         | **61.5**   |
 
 ## Demo
-An interactive demo is provided in ...
+For an interactive demo, please access here.
+The code to run the demo locally is described [here](../README.md#local-demo).
 
 ## Data preparation
 To prepare the datasets for training and evaluation, follow the instructions in [data_prepare](./data_prepare).
@@ -73,6 +82,21 @@ To prepare the datasets for training and evaluation, follow the instructions in
 ## Training
 To replicate our fine-tuning procedure, utilize the provided scripts.
 
+For our released checkpoints, we use a Slurm cluster environment with the following setup:
+- VILA training code with Torch distributed
+- 4 nodes with 8xA100 GPUs (80 GB each)
+- Cosine learning rate decay with warmup
+
+<p align="left">
+  <img src="docs/images/training.png" width="50%"/>
+</p>
+
+|     # Parameters    |     Training time    |
+|---------------------|----------------------|
+|     3 billion       |     5.5 hours        |
+|     8 billion       |     11.0 hours       |
+|     13 billion      |     19.5 hours       |
+
 ## Evaluation
 To evaluate a model on the above benchmarks, follow the instructions in [eval](./eval/README.md)
 
diff --git a/m3/docs/images/gradio_app_ct.png b/m3/docs/images/gradio_app_ct.png
new file mode 100644
index 0000000..3a61281
Binary files /dev/null and b/m3/docs/images/gradio_app_ct.png differ
diff --git a/m3/docs/images/training.png b/m3/docs/images/training.png
new file mode 100644
index 0000000..68f86eb
Binary files /dev/null and b/m3/docs/images/training.png differ