doc: update README
Signed-off-by: Xin Liu <[email protected]>
apepkuss committed Apr 8, 2024
1 parent 69f7f04 commit 76c8523
1 changed file: README.md (37 additions, 35 deletions)
@@ -347,43 +347,45 @@ To check the CLI options of the `rag-api-server` wasm app, you can run the following command:
````diff
 ```bash
 $ wasmedge rag-api-server.wasm -h
 
-Usage: rag-api-server.wasm [OPTIONS] --model-name <MODEL-NAME> --prompt-template <TEMPLATE>
+Usage: rag-api-server.wasm [OPTIONS] --model-name <MODEL_NAME> --prompt-template <PROMPT_TEMPLATE>
 
 Options:
-  -m, --model-name <MODEL-NAME>
-          Sets names for chat and embedding models. The names are separated by comma without space, for example, '--model-name Llama-2-7b,all-minilm'.
-  -a, --model-alias <MODEL-ALIAS>
-          Sets model aliases [default: default,embedding]
-  -c, --ctx-size <CTX_SIZE>
-          Sets context sizes for chat and embedding models. The sizes are separated by comma without space, for example, '--ctx-size 4096,384'. The first value is for the chat model, and the second value is for the embedding model. [default: 4096,384]
-  -r, --reverse-prompt <REVERSE_PROMPT>
-          Halt generation at PROMPT, return control.
-  -p, --prompt-template <TEMPLATE>
-          Sets the prompt template. [possible values: llama-2-chat, codellama-instruct, codellama-super-instruct, mistral-instruct, mistrallite, openchat, human-assistant, vicuna-1.0-chat, vicuna-1.1-chat, vicuna-llava, chatml, baichuan-2, wizard-coder, zephyr, stablelm-zephyr, intel-neural, deepseek-chat, deepseek-coder, solar-instruct, gemma-instruct]
-      --system-prompt <system_prompt>
-          Sets global system prompt. [default: ]
-      --qdrant-url <qdrant_url>
-          Sets the url of Qdrant REST Service. [default: http://localhost:6333]
-      --qdrant-collection-name <qdrant_collection_name>
-          Sets the collection name of Qdrant. [default: default]
-      --qdrant-limit <qdrant_limit>
-          Max number of retrieved result. [default: 3]
-      --qdrant-score-threshold <qdrant_score_threshold>
-          Minimal score threshold for the search result [default: 0.4]
-      --log-prompts
-          Print prompt strings to stdout
-      --log-stat
-          Print statistics to stdout
-      --log-all
-          Print all log information to stdout
-      --web-ui <WEB_UI>
-          Root path for the Web UI files [default: chatbot-ui]
-  -s, --socket-addr <IP:PORT>
-          Sets the socket address [default: 0.0.0.0:8080]
-  -h, --help
-          Print help
-  -V, --version
-          Print version
+  -m, --model-name <MODEL_NAME>
+          Sets names for chat and embedding models. The names are separated by comma without space, for example, '--model-name Llama-2-7b,all-minilm'
+  -a, --model-alias <MODEL_ALIAS>
+          Model aliases for chat and embedding models [default: default,embedding]
+  -c, --ctx-size <CTX_SIZE>
+          Sets context sizes for chat and embedding models. The sizes are separated by comma without space, for example, '--ctx-size 4096,384'. The first value is for the chat model, and the second is for the embedding model [default: 4096,384]
+  -p, --prompt-template <PROMPT_TEMPLATE>
+          Prompt template [possible values: llama-2-chat, mistral-instruct, mistrallite, openchat, codellama-instruct, codellama-super-instruct, human-assistant, vicuna-1.0-chat, vicuna-1.1-chat, vicuna-llava, chatml, baichuan-2, wizard-coder, zephyr, stablelm-zephyr, intel-neural, deepseek-chat, deepseek-coder, solar-instruct, phi-2-chat, phi-2-instruct, gemma-instruct]
+  -r, --reverse-prompt <REVERSE_PROMPT>
+          Halt generation at PROMPT, return control
+  -b, --batch-size <BATCH_SIZE>
+          Batch size for prompt processing [default: 512]
+      --system-prompt <SYSTEM_PROMPT>
+          Global system prompt
+      --qdrant-url <QDRANT_URL>
+          URL of Qdrant REST Service [default: http://localhost:6333]
+      --qdrant-collection-name <QDRANT_COLLECTION_NAME>
+          Name of Qdrant collection [default: default]
+      --qdrant-limit <QDRANT_LIMIT>
+          Max number of retrieved result [default: 3]
+      --qdrant-score-threshold <QDRANT_SCORE_THRESHOLD>
+          Minimal score threshold for the search result [default: 0.4]
+      --log-prompts
+          Print prompt strings to stdout
+      --log-stat
+          Print statistics to stdout
+      --log-all
+          Print all log information to stdout
+      --socket-addr <SOCKET_ADDR>
+          Socket address of LlamaEdge API Server instance [default: 0.0.0.0:8080]
+      --web-ui <WEB_UI>
+          Root path for the Web UI files [default: chatbot-ui]
+  -h, --help
+          Print help
+  -V, --version
+          Print version
 ```
````

</details>
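For context, a launch command using the flag names from the updated help text might look like the following sketch. The model file names, preload aliases, and model names here are illustrative assumptions, not part of this commit; adjust them to the models you actually downloaded.

```bash
# Hypothetical example: preload a chat model and an embedding model under the
# default aliases ('default' and 'embedding'), then start rag-api-server with
# the options shown in the help text above. File and model names are placeholders.
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf \
  --nn-preload embedding:GGML:AUTO:all-MiniLM-L6-v2-ggml-model-f16.gguf \
  rag-api-server.wasm \
  --model-name Llama-2-7b,all-minilm \
  --ctx-size 4096,384 \
  --prompt-template llama-2-chat \
  --qdrant-url http://localhost:6333 \
  --qdrant-collection-name default \
  --socket-addr 0.0.0.0:8080
```

Note that the two comma-separated values of `--model-name` and `--ctx-size` pair up positionally: the first value applies to the chat model, the second to the embedding model.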
