
Fix gradio bug and update readme (#19)
Signed-off-by: Mingxin Zheng <[email protected]>
Co-authored-by: Holger Roth <[email protected]>
mingxin-zheng and holgerroth authored Oct 16, 2024
1 parent c4f677a commit bc31eec
Showing 2 changed files with 47 additions and 13 deletions.
57 changes: 45 additions & 12 deletions README.md
@@ -15,32 +15,65 @@ For details, see [here](./monai_vila2d/README.md).

### Local Demo

#### Prerequisites

1. **Linux Operating System**

1. **CUDA Toolkit 12.2** (with `nvcc`) for [VILA](https://github.com/NVlabs/VILA).

To verify CUDA installation, run:
```bash
nvcc --version
```
If CUDA is not installed, use one of the following methods:
- **Recommended**: Use the Docker image: `nvidia/cuda:12.2.2-devel-ubuntu22.04`
```bash
docker run -it --rm --ipc host --gpus all --net host \
-v <ckpts_dir>:/data/checkpoints \
nvidia/cuda:12.2.2-devel-ubuntu22.04 bash
```
- **Manual Installation (not recommended)**: Download the appropriate package from the [NVIDIA official page](https://developer.nvidia.com/cuda-12-2-2-download-archive)

1. **Python 3.10** and **Git**

To install these, run:
```bash
sudo apt-get update
sudo apt-get install -y python3.10 python3.10-venv python3.10-dev git
```
NOTE: The commands above are tailored to the Docker image `nvidia/cuda:12.2.2-devel-ubuntu22.04`. If you are using a different setup, adjust them accordingly.
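
Before moving on, it can help to confirm that the required tools are actually on `PATH`; a minimal sketch (the `check_cmds` helper is hypothetical, not part of this repository):

```shell
# Print a line for each required command that is missing from PATH.
check_cmds() {
  status=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd"
      status=1
    fi
  done
  return $status
}
```

For example, `check_cmds python3.10 git` should print nothing once both packages are installed.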


#### Setup Environment

1. Clone the repository and set up the environment:
```bash
git clone https://github.com/Project-MONAI/VLM --recursive
cd VLM
python3.10 -m venv .venv
source .venv/bin/activate
make demo_monai_vila2d
```
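
If `make` fails, first check that the virtual environment really is active; a quick probe (this only inspects `sys.prefix`, so it is safe to run anywhere):

```shell
# Prints True when the interpreter is running inside a virtual environment.
python3 -c 'import sys; print(sys.prefix != sys.base_prefix)'
```

It should print `True` after `source .venv/bin/activate`.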

#### Running the Gradio Demo

1. Navigate to the demo directory:
```bash
cd demo
```

1. Set the API keys for calling the expert models:
```bash
export api_key=<your nvcf key>
export NIM_API_KEY=<your NIM key>
```
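
A missing key usually only shows up later as an opaque API error, so a quick guard before launching can save time (a sketch assuming the two variable names above; `printenv` is used so it also works in plain `sh`):

```shell
# Warn about any expert-model key that has not been exported.
for v in api_key NIM_API_KEY; do
  [ -n "$(printenv "$v")" ] || echo "warning: $v is not set"
done
```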

1. Start the Gradio demo:
```bash
python demo/gradio_monai_vila2d.py \
--modelpath /data/checkpoints/<checkpoint-name> \
--convmode <llama_3 or vicuna_1> \
--port 7860
```
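
Gradio takes a moment to bind its port, so a small polling helper can confirm the UI is reachable (hypothetical helper, assuming the default `--port 7860` used above):

```shell
# Poll the demo URL until it responds, for up to ~30 seconds.
wait_for_demo() {
  url=${1:-http://localhost:7860/}
  for _ in $(seq 1 30); do
    curl -fsS "$url" >/dev/null 2>&1 && return 0
    sleep 1
  done
  echo "demo did not come up at $url" >&2
  return 1
}
```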

## Contributing
3 changes: 2 additions & 1 deletion demo/gradio_monai_vila2d.py
@@ -236,7 +236,7 @@ def __init__(self):
self.sys_prompt = SYS_PROMPT
self.sys_msg = SYS_MSG
self.slice_index = None # Slice index for 3D images
self.image_url = None # Image URL to the image on the web
self.axis = 2
self.top_p = 0.9
self.temperature = 0.0
@@ -415,6 +415,7 @@ def process_prompt(self, prompt, sv, chat_history):
break

if expert:
logger.debug(f"Expert model {expert.__class__.__name__} is being called.")
text_output, seg_file, instruction, download_pkg = expert.run(
image_url=sv.image_url,
input=outputs,
