feat: support i18n
zhanshuyou committed Aug 26, 2024
1 parent b1d3982 commit 46499c3
Showing 19 changed files with 156 additions and 99 deletions.
57 changes: 57 additions & 0 deletions .github/workflows/ci-i18n.yml
@@ -0,0 +1,57 @@
name: ci-i18n

on:
  push:
    branches:
      - feat/v2.4.x-i18n
    # paths:
    #   - "site/zh/**"
  release:
    types: [released]
  workflow_dispatch:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  # This workflow contains a single job called "build"
  build:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest

    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      - name: Check out Git repository
        uses: actions/checkout@v1
        with:
          ref: feat/v2.4.x-i18n

      - name: Extract branch name
        shell: bash
        run: echo "##[set-output name=branch;]$(echo ${GITHUB_REF#refs/heads/})"
        id: extract_branch

      - name: Md2md
        run: |
          cp site/en/Variables.json ./
          mv site doc_from
          sudo npm install @zilliz/mdtomd -g
          goover
          rm -rf doc_from/*
          rm check-link.js
          mv doc_to site
      - name: Delete And Push
        run: |
          sudo apt-get update
          sudo apt-get install jq
          cd ../
          git clone -b feat/localization https://.:${{ secrets.P_GITHUB_TOKEN }}@github.com/milvus-io/web-content.git target
          git config --global user.email "[email protected]"
          git config --global user.name "Milvus-doc-bot"
          cp ./milvus-docs/version.json ./target
          cd target
          rm -rf `cat version.json | jq -r .version`
          mkdir `cat version.json | jq -r .version`
          cp -avr ../milvus-docs/** ./`cat version.json | jq -r .version`
          git add .
          git commit -m "Release new docs"
          git push -f origin feat/localization
2 changes: 1 addition & 1 deletion site/en/about/comparison.md
@@ -52,7 +52,7 @@ Although both serve similar functions as vector databases, the domain-specific t
| Deployment Modes | SaaS-only | Milvus Lite, On-prem Standalone & Cluster, Zilliz Cloud Saas & BYOC |
| Embedding Functions | Not available | Support with <a href="https://github.com/milvus-io/milvus-model">pymilvus[model]</a> |
| Data Types | String, Number, Bool, List of String | String, VarChar, Number (Int, Float, Double), Bool, Array, JSON, Float Vector, Binary Vector, BFloat16, Float16, Sparse Vector |
| Metric and Index Types | Cos, Dot, Euclidean<br>P-family, S-family | Cosine, IP (Dot), L2 (Euclidean), Hamming, Jaccard<br>FLAT, IVF_FLAT, IVF_SQ8, IVF_PQ, HNSW, SCANN, GPU Indexes |
| Metric and Index Types | Cos, Dot, Euclidean<br/>P-family, S-family | Cosine, IP (Dot), L2 (Euclidean), Hamming, Jaccard<br/>FLAT, IVF_FLAT, IVF_SQ8, IVF_PQ, HNSW, SCANN, GPU Indexes |
| Schema Design | Flexible mode | Flexible mode, Strict mode |
| Multiple Vector Fields | N/A | Multi-vector and hybrid search |
| Tools | Datasets, text utilities, spark connector | Attu, Birdwatcher, Backup, CLI, CDC, Spark and Kafka connectors |
32 changes: 16 additions & 16 deletions site/en/about/roadmap.md
@@ -22,28 +22,28 @@ Welcome to the Milvus Roadmap! Join us on our continuous journey to enhance and
</thead>
<tbody>
<tr>
<td><strong>AI-developer Friendly</strong><br><i>A developer-friendly technology stack, enhanced with the latest AI innovations</i></td>
<td><strong>Multi-Vectors & Hybrid Search</strong><br><i>Framework for multiplex recall and fusion</i><br><br><strong>GPU Index Acceleration</strong><br><i>Support for higher QPS and faster index creation</i><br><br><strong>Model Library in PyMilvus</strong><br><i>Integrated embedding models for Milvus</i></td>
<td><strong>Sparse Vector (GA)</strong><br><i>Local feature extraction and keyword search</i><br><br><strong>Milvus Lite (GA)</strong><br><i>A lightweight, in-memory version of Milvus</i><br><br><strong>Embedding Models Gallery</strong><br><i>Support for image and multi-modal embeddings and reranker models in model libraries</i></td>
<td><strong>Original Data-In and Data-Out</strong><br><i>Support for Blob data types</i><br><br><strong>Data Clustering</strong><br><i>Data co-locality</i><br><br><strong>Scenario-oriented Vector Search</strong><br><i>e.g. Multi-target search & NN filtering</i><br><br><strong>Support Embedding & Reranker Endpoint</strong></td>
<td><strong>AI-developer Friendly</strong><br/><i>A developer-friendly technology stack, enhanced with the latest AI innovations</i></td>
<td><strong>Multi-Vectors & Hybrid Search</strong><br/><i>Framework for multiplex recall and fusion</i><br/><br/><strong>GPU Index Acceleration</strong><br/><i>Support for higher QPS and faster index creation</i><br/><br/><strong>Model Library in PyMilvus</strong><br/><i>Integrated embedding models for Milvus</i></td>
<td><strong>Sparse Vector (GA)</strong><br/><i>Local feature extraction and keyword search</i><br/><br/><strong>Milvus Lite (GA)</strong><br/><i>A lightweight, in-memory version of Milvus</i><br/><br/><strong>Embedding Models Gallery</strong><br/><i>Support for image and multi-modal embeddings and reranker models in model libraries</i></td>
<td><strong>Original Data-In and Data-Out</strong><br/><i>Support for Blob data types</i><br/><br/><strong>Data Clustering</strong><br/><i>Data co-locality</i><br/><br/><strong>Scenario-oriented Vector Search</strong><br/><i>e.g. Multi-target search & NN filtering</i><br/><br/><strong>Support Embedding & Reranker Endpoint</strong></td>
</tr>
<tr>
<td><strong>Rich Functionality</strong><br><i>Enhanced retrieval and data management features</i></td>
<td><strong>Support for FP16, BF16 Datatypes</strong><br><i>These ML datatypes can help reduce memory usage</i><br><br><strong>Grouping Search</strong><br><i>Aggregate split embeddings</i><br><br><strong>Fuzzy Match and Inverted Index</strong><br><i>Support for fuzzy matching and inverted indexing for scalar types like varchar and int</i></td>
<td><strong>Inverted Index for Array & JSON</strong><br><i>Indexing for array and partial support JSON</i><br><br><strong>Bitset Index</strong><br><i>Improved execution speed and future data aggregation</i><br><br><strong>Truncate Collection</strong><br><i>Allows data clearance while preserving metadata</i><br><br><strong>Support for NULL and Default Values</strong></td>
<td><strong>Support for More Datatypes</strong><br><i>e.g. Datetime, GIS</i><br><br><strong>Advanced Text Filtering</strong><br><i>e.g. Match Phrase</i><br><br><strong>Primary Key Deduplication</strong></td>
<td><strong>Rich Functionality</strong><br/><i>Enhanced retrieval and data management features</i></td>
<td><strong>Support for FP16, BF16 Datatypes</strong><br/><i>These ML datatypes can help reduce memory usage</i><br/><br/><strong>Grouping Search</strong><br/><i>Aggregate split embeddings</i><br/><br/><strong>Fuzzy Match and Inverted Index</strong><br/><i>Support for fuzzy matching and inverted indexing for scalar types like varchar and int</i></td>
<td><strong>Inverted Index for Array & JSON</strong><br/><i>Indexing for array and partial support JSON</i><br/><br/><strong>Bitset Index</strong><br/><i>Improved execution speed and future data aggregation</i><br/><br/><strong>Truncate Collection</strong><br/><i>Allows data clearance while preserving metadata</i><br/><br/><strong>Support for NULL and Default Values</strong></td>
<td><strong>Support for More Datatypes</strong><br/><i>e.g. Datetime, GIS</i><br/><br/><strong>Advanced Text Filtering</strong><br/><i>e.g. Match Phrase</i><br/><br/><strong>Primary Key Deduplication</strong></td>
</tr>
<tr>
<td><strong>Cost Efficiency & Architecture</strong><br><i>Advanced systems emphasizing stability, cost efficiency, scalability, and performance</i></td>
<td><strong>Support for More Collections/Partitions</strong><br><i>Handles over 10,000 collections in smaller clusters</i><br><br><strong>Mmap Optimization</strong><br><i>Balances reduced memory consumption with latency</i><br><br><strong>Bulk Insert Optimization</strong><br><i>Simplifies importing large datasets</i></td>
<td><strong>Lazy Load</strong><br><i>Data is loaded on-demand through read operations</i><br><br><strong>Major Compaction</strong><br><i>Re-distributes data based on configuration to enhance read performance</i><br><br><strong>Mmap for Growing Data</strong><br><i>Mmap files for expanding data segments</i></td>
<td><strong>Memory Control</strong><br><i>Reduces out-of-memory issues and provides global memory management</i><br><br><strong>LogNode Introduction</strong><br><i>Ensures global consistency and addresses the single-point bottleneck in root coordination</i><br><br><strong>Storage Format V2</strong><br><i>Universal format design lays the groundwork for disk-based data access</i></td>
<td><strong>Cost Efficiency & Architecture</strong><br/><i>Advanced systems emphasizing stability, cost efficiency, scalability, and performance</i></td>
<td><strong>Support for More Collections/Partitions</strong><br/><i>Handles over 10,000 collections in smaller clusters</i><br/><br/><strong>Mmap Optimization</strong><br/><i>Balances reduced memory consumption with latency</i><br/><br/><strong>Bulk Insert Optimization</strong><br/><i>Simplifies importing large datasets</i></td>
<td><strong>Lazy Load</strong><br/><i>Data is loaded on-demand through read operations</i><br/><br/><strong>Major Compaction</strong><br/><i>Re-distributes data based on configuration to enhance read performance</i><br/><br/><strong>Mmap for Growing Data</strong><br/><i>Mmap files for expanding data segments</i></td>
<td><strong>Memory Control</strong><br/><i>Reduces out-of-memory issues and provides global memory management</i><br/><br/><strong>LogNode Introduction</strong><br/><i>Ensures global consistency and addresses the single-point bottleneck in root coordination</i><br/><br/><strong>Storage Format V2</strong><br/><i>Universal format design lays the groundwork for disk-based data access</i></td>
</tr>
<tr>
<td><strong>Enterprise Ready</strong><br><i>Designed to meet the needs of enterprise production environments</i></td>
<td><strong>Milvus CDC</strong><br><i>Capability for data replication</i><br><br><strong>Accesslog Enhancement</strong><br><i>Detailed recording for audit and tracing</i></td>
<td><strong>New Resource Group</strong><br><i>Enhanced resource management</i><br><br><strong>Storage Hook</strong><br><i>Support for Bring Your Own Key (BYOK) encryption</i></td>
<td><strong>Dynamic Replica Number Adjustment</strong><br><i>Facilitates dynamic changes to the number of replicas</i><br><br><strong>Dynamic Schema Modification</strong><br><i>e.g., Add/delete fields, modify varchar lengths</i><br><br><strong>Rust and C# SDKs</strong></td>
<td><strong>Enterprise Ready</strong><br/><i>Designed to meet the needs of enterprise production environments</i></td>
<td><strong>Milvus CDC</strong><br/><i>Capability for data replication</i><br/><br/><strong>Accesslog Enhancement</strong><br/><i>Detailed recording for audit and tracing</i></td>
<td><strong>New Resource Group</strong><br/><i>Enhanced resource management</i><br/><br/><strong>Storage Hook</strong><br/><i>Support for Bring Your Own Key (BYOK) encryption</i></td>
<td><strong>Dynamic Replica Number Adjustment</strong><br/><i>Facilitates dynamic changes to the number of replicas</i><br/><br/><strong>Dynamic Schema Modification</strong><br/><i>e.g., Add/delete fields, modify varchar lengths</i><br/><br/><strong>Rust and C# SDKs</strong></td>
</tr>
</tbody>
</table>
2 changes: 1 addition & 1 deletion site/en/adminGuide/deploy_etcd.md
@@ -50,7 +50,7 @@ Run the following command to start Milvus that uses the etcd configurations.
docker compose up
```

<div class="alert note">Configurations only take effect after Milvus starts. See <a href=https://milvus.io/docs/install_standalone-docker.md#Start-Milvus>Start Milvus</a> for more information.</div>
<div class="alert note">Configurations only take effect after Milvus starts. See <a href="https://milvus.io/docs/install_standalone-docker.md#Start-Milvus">Start Milvus</a> for more information.</div>

## Configure etcd on K8s

2 changes: 1 addition & 1 deletion site/en/adminGuide/deploy_pulsar.md
@@ -34,7 +34,7 @@ Run the following command to start Milvus that uses the Pulsar configurations.
docker compose up
```

<div class="alert note">Configurations only take effect after Milvus starts. See <a href=https://milvus.io/docs/install_standalone-docker.md#Start-Milvus>Start Milvus</a> for more information.</div>
<div class="alert note">Configurations only take effect after Milvus starts. See <a href="https://milvus.io/docs/install_standalone-docker.md#Start-Milvus">Start Milvus</a> for more information.</div>


## Configure Pulsar with Helm
2 changes: 1 addition & 1 deletion site/en/adminGuide/deploy_s3.md
@@ -35,7 +35,7 @@ Run the following command to start Milvus that uses the S3 configurations.
```shell
docker compose up
```
<div class="alert note">Configurations only take effect after Milvus starts. See <a href=https://milvus.io/docs/install_standalone-docker.md#Start-Milvus>Start Milvus</a> for more information.</div>
<div class="alert note">Configurations only take effect after Milvus starts. See <a href="https://milvus.io/docs/install_standalone-docker.md#Start-Milvus">Start Milvus</a> for more information.</div>

## Configure S3 on K8s

12 changes: 6 additions & 6 deletions site/en/adminGuide/rbac.md
@@ -58,7 +58,7 @@ client.update_password(
```python
client.list_users()

# output:
# output
# ['root', 'user_1']
```

@@ -67,7 +67,7 @@ client.list_users()
```python
client.describe_user(user_name='user_1')

# output:
# output
# {'user_name': 'user_1', 'roles': ()}
```

@@ -88,7 +88,7 @@ After creating a role, you can:
```python
client.list_roles()

# output:
# output
# ['admin', 'public', 'roleA']
```
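
The grant step that produces the privilege entry shown in the next hunk is collapsed in this diff. A minimal sketch of what it typically looks like with the pymilvus 2.4 `MilvusClient` (the privilege name `SelectUser` and the object values are illustrative):

```python
# `client` is the MilvusClient connected earlier as an admin user (e.g. root).
client.grant_privilege(
    role_name="roleA",
    object_type="User",      # type of object the privilege applies to
    object_name="user_1",    # the specific object instance
    privilege="SelectUser",  # illustrative privilege name
)
```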

@@ -120,7 +120,7 @@ client.describe_role(
role_name='roleA'
)

# output:
# output
# {'role': 'roleA',
# 'privileges': [{'object_type': 'User',
# 'object_name': 'user_1',
@@ -150,8 +150,8 @@ client.describe_user(
user_name='user_1'
)

# output:
# {'user_name': 'user_1', 'roles': ('roleA',)}
# output
# {'user_name': 'user_1', 'roles': ('roleA')}
```

## 6. Revoke privileges
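
The body of this step is collapsed in the diff. A minimal sketch of the typical teardown with the pymilvus 2.4 `MilvusClient` (connection parameters and the privilege values are illustrative and mirror the grant above):

```python
from pymilvus import MilvusClient

# Connect as an admin user; the URI and token below are the local defaults.
client = MilvusClient(uri="http://localhost:19530", token="root:Milvus")

# Revoke any privileges previously granted to the role.
client.revoke_privilege(
    role_name="roleA",
    object_type="User",
    object_name="user_1",
    privilege="SelectUser",
)

# Detach the role from the user, then remove the role and the user.
client.revoke_role(user_name="user_1", role_name="roleA")
client.drop_role(role_name="roleA")
client.drop_user(user_name="user_1")
```
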
44 changes: 22 additions & 22 deletions site/en/getstarted/quickstart.md
@@ -101,11 +101,11 @@ data = [
print("Data has", len(data), "entities, each with fields: ", data[0].keys())
print("Vector dim:", len(data[0]["vector"]))
```

Dim: 768 (768,)
Data has 3 entities, each with fields: dict_keys(['id', 'vector', 'text', 'subject'])
Vector dim: 768

```
Dim: 768 (768,)
Data has 3 entities, each with fields: dict_keys(['id', 'vector', 'text', 'subject'])
Vector dim: 768
```

## [Alternatively] Use fake representation with random vectors
If you cannot download the model due to network issues, you can, as a workaround, use random vectors to represent the text and still finish the example. Just note that the search results won't reflect semantic similarity, as the vectors are fake ones.
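
A minimal sketch of that fallback (the full block is collapsed in this diff; the sample sentences are the ones used elsewhere in this guide):

```python
import random

docs = [
    "Alan Turing was the first person to conduct substantial research in AI.",
    "Born in Maida Vale, London, Turing was raised in southern England.",
]
# Random values in [-1, 1] stand in for real 768-dimensional embeddings;
# similarity scores computed from them are meaningless.
vectors = [[random.uniform(-1, 1) for _ in range(768)] for _ in docs]
```
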
@@ -130,10 +130,10 @@ data = [
print("Data has", len(data), "entities, each with fields: ", data[0].keys())
print("Vector dim:", len(data[0]["vector"]))
```

Data has 3 entities, each with fields: dict_keys(['id', 'vector', 'text', 'subject'])
Vector dim: 768

```
Data has 3 entities, each with fields: dict_keys(['id', 'vector', 'text', 'subject'])
Vector dim: 768
```

## Insert Data
Let's insert the data into the collection:
@@ -144,9 +144,9 @@ res = client.insert(collection_name="demo_collection", data=data)

print(res)
```

{'insert_count': 3, 'ids': [0, 1, 2], 'cost': 0}

```
{'insert_count': 3, 'ids': [0, 1, 2], 'cost': 0}
```

## Semantic Search
Now we can do semantic search by representing the query text as a vector and conducting a vector similarity search on Milvus.
@@ -169,9 +169,9 @@ res = client.search(

print(res)
```

data: ["[{'id': 2, 'distance': 0.5859944820404053, 'entity': {'text': 'Born in Maida Vale, London, Turing was raised in southern England.', 'subject': 'history'}}, {'id': 1, 'distance': 0.5118255615234375, 'entity': {'text': 'Alan Turing was the first person to conduct substantial research in AI.', 'subject': 'history'}}]"] , extra_info: {'cost': 0}

```
data: ["[{'id': 2, 'distance': 0.5859944820404053, 'entity': {'text': 'Born in Maida Vale, London, Turing was raised in southern England.', 'subject': 'history'}}, {'id': 1, 'distance': 0.5118255615234375, 'entity': {'text': 'Alan Turing was the first person to conduct substantial research in AI.', 'subject': 'history'}}]"] , extra_info: {'cost': 0}
```

The output is a list of results, one per vector search query. Each of these contains a list of hits, where each hit includes the entity's primary key, its distance to the query vector, and the entity details for the specified `output_fields`.
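
For instance, a short loop over `res` (whose structure matches the output shown above) pulls out those fields:

```python
# res holds one list of hits per query vector; each hit is a plain dict.
for query_hits in res:
    for hit in query_hits:
        print(hit["id"], hit["distance"], hit["entity"]["text"])
```
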

Expand Down Expand Up @@ -205,9 +205,9 @@ res = client.search(

print(res)
```

data: ["[{'id': 4, 'distance': 0.27030569314956665, 'entity': {'text': 'Computational synthesis with AI algorithms predicts molecular properties.', 'subject': 'biology'}}, {'id': 3, 'distance': 0.16425910592079163, 'entity': {'text': 'Machine learning has been used for drug design.', 'subject': 'biology'}}]"] , extra_info: {'cost': 0}

```
data: ["[{'id': 4, 'distance': 0.27030569314956665, 'entity': {'text': 'Computational synthesis with AI algorithms predicts molecular properties.', 'subject': 'biology'}}, {'id': 3, 'distance': 0.16425910592079163, 'entity': {'text': 'Machine learning has been used for drug design.', 'subject': 'biology'}}]"] , extra_info: {'cost': 0}
```

By default, scalar fields are not indexed. If you need to perform metadata-filtered search over a large dataset, consider using a fixed schema and turning on the [index](https://milvus.io/docs/scalar_index.md) to improve search performance.
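
A minimal sketch of such a fixed-schema setup with a scalar index on `subject` (method names follow pymilvus 2.4's `MilvusClient`; the collection name and field sizes are illustrative):

```python
from pymilvus import MilvusClient, DataType

client = MilvusClient("milvus_demo.db")

# Fixed schema: declare the scalar field explicitly instead of relying on dynamic fields.
schema = client.create_schema(auto_id=False, enable_dynamic_field=False)
schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True)
schema.add_field(field_name="vector", datatype=DataType.FLOAT_VECTOR, dim=768)
schema.add_field(field_name="text", datatype=DataType.VARCHAR, max_length=512)
schema.add_field(field_name="subject", datatype=DataType.VARCHAR, max_length=64)

# Index the vector field as usual and add an inverted index on the scalar field.
index_params = client.prepare_index_params()
index_params.add_index(field_name="vector", index_type="AUTOINDEX", metric_type="COSINE")
index_params.add_index(field_name="subject", index_type="INVERTED")

client.create_collection(
    collection_name="demo_indexed_collection",
    schema=schema,
    index_params=index_params,
)
```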

@@ -256,10 +256,10 @@ res = client.delete(

print(res)
```

[0, 2]
[3, 4, 5]

```
[0, 2]
[3, 4, 5]
```
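
The two ID lists above presumably come from two delete calls in the collapsed block, one by primary key and one by filter expression. A sketch under that assumption (parameter names follow pymilvus 2.4's `MilvusClient`; the filter value mirrors the earlier filtered-search example):

```python
# Delete by primary keys, then by a filter expression on a scalar field.
res = client.delete(collection_name="demo_collection", ids=[0, 2])
print(res)

res = client.delete(collection_name="demo_collection", filter="subject == 'biology'")
print(res)
```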

## Load Existing Data
Since Milvus Lite stores all data in a local file, you can load it back even after the program terminates by creating a `MilvusClient` with the existing file. For example, this will recover the collections from the "milvus_demo.db" file and continue to write data into it.
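
A minimal sketch of that recovery step (the actual block is collapsed in this diff):

```python
from pymilvus import MilvusClient

# Re-opening the same local file restores the collections written earlier.
client = MilvusClient("milvus_demo.db")
print(client.list_collections())
```
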
@@ -174,7 +174,7 @@ In addition to a single GPU device, you can also assign multiple GPU devices to
<ul>
<li>The release name should only contain letters, numbers and dashes. Dots are not allowed in the release name.</li>
<li>The default command line installs the cluster version of Milvus when installing Milvus with Helm. Further settings are needed to install Milvus standalone.</li>
<li>According to the <a href="https://kubernetes.io/docs/reference/using-api/deprecation-guide/#v1-25">deprecated API migration guide of Kubernetes</a>, the <b>policy/v1beta1</b> API version of PodDisruptionBudget is no longer served as of v1.25. You are advised to migrate manifests and API clients to use the <b>policy/v1</b> API version instead. <br>As a workaround for users who still use the <b>policy/v1beta1</b> API version of PodDisruptionBudget on Kubernetes v1.25 and later, you can instead run the following command to install Milvus:<br>
<li>According to the <a href="https://kubernetes.io/docs/reference/using-api/deprecation-guide/#v1-25">deprecated API migration guide of Kubernetes</a>, the <b>policy/v1beta1</b> API version of PodDisruptionBudget is no longer served as of v1.25. You are advised to migrate manifests and API clients to use the <b>policy/v1</b> API version instead. <br/>As a workaround for users who still use the <b>policy/v1beta1</b> API version of PodDisruptionBudget on Kubernetes v1.25 and later, you can instead run the following command to install Milvus:<br/>
<code>helm install my-release milvus/milvus --set pulsar.bookkeeper.pdb.usePolicy=false,pulsar.broker.pdb.usePolicy=false,pulsar.proxy.pdb.usePolicy=false,pulsar.zookeeper.pdb.usePolicy=false</code></li>
<li>See <a href="https://artifacthub.io/packages/helm/milvus/milvus">Milvus Helm Chart</a> and <a href="https://helm.sh/docs/">Helm</a> for more information.</li>
</ul>
