updates dvc and adds convenience script for testing
matthewcoole committed Dec 13, 2024
1 parent 7791b19 commit a003fa6
Showing 4 changed files with 83 additions and 54 deletions.
7 changes: 6 additions & 1 deletion README.md
@@ -36,10 +36,15 @@ Pull the data from the object store using DVC:
dvc pull
```
### Working with the pipeline
-You should now be ready to re-run the pipeline:
+You should now be ready to run the pipeline:
```shell
dvc repro
```
This should reproduce the pipeline, but only stages that have been modified will actually be re-run (see the output whilst running). If you want to check that all stages of the pipeline are running correctly, you can either use the `-f` flag with the above command to force DVC to re-run all stages of the pipeline or (as re-running with all the data can take several hours) run the convenience script `test-pipeline.sh`. This script will run the pipeline with a tiny subset of data as an experiment, which should only take a couple of minutes:
```shell
./test-pipeline.sh
```
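
As a rough sketch of the first option (a forced full re-run), the two commands could be wrapped as below. The function name `full_rerun` is hypothetical and not part of this repository; `dvc status` and `dvc repro -f` are standard DVC commands, and the function is assumed to be called from the repository root:

```shell
# Hypothetical helper (not part of the repository): forced full re-run.
full_rerun() {
    # Report which stages DVC considers changed; this runs nothing.
    dvc status
    # -f (--force) re-runs every stage even if nothing has changed.
    # With the full dataset this can take several hours.
    dvc repro -f
}
```

Calling `full_rerun` after sourcing the snippet would first print the stage status and then start the forced run.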

This pipeline is defined in [`dvc.yaml`](dvc.yaml) and can be viewed with the command:
```shell
dvc dag
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -9,7 +9,7 @@ dependencies = [
"bitsandbytes==0.44.1",
"chroma-haystack==0.18.0",
"chromadb==0.5.3",
-"dvc[s3]==3.2.0",
+"dvc[s3]==3.58.0",
"haystack-ai==2.2.3",
"kaleido==0.2.1",
"langchain==0.2.7",
17 changes: 0 additions & 17 deletions setup-venv.sh

This file was deleted.

111 changes: 76 additions & 35 deletions uv.lock

