
Error Model Loading / extract-and-run #1756

Closed
Samoed opened this issue Jan 10, 2025 · 15 comments · Fixed by #1775 or #1818

Comments

Samoed (Collaborator) commented Jan 10, 2025

When pushing commits, the following error appears in the CI run:

System.IO.IOException: No space left on device : '/home/runner/runners/2.321.0/_diag/Worker_20250110-152956-utc.log' ...

Example run: https://github.com/embeddings-benchmark/mteb/actions/runs/12712643686/job/35438686654?pr=1749
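
One way to keep the runner's disk from filling up between model loads (a sketch using huggingface_hub's cache utilities, not necessarily the fix that was eventually merged) is to delete each model's cached files after it has been tested:

```python
# Sketch only: free a model's Hugging Face cache entry after it has been checked.
from huggingface_hub import scan_cache_dir


def free_model_cache(repo_id: str) -> None:
    cache_info = scan_cache_dir()
    revisions = [
        rev.commit_hash
        for repo in cache_info.repos
        if repo.repo_id == repo_id
        for rev in repo.revisions
    ]
    if revisions:
        strategy = cache_info.delete_revisions(*revisions)
        print(f"Freeing {strategy.expected_freed_size_str} used by {repo_id}")
        strategy.execute()
```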

isaac-chung (Collaborator) commented:

I have a rough idea of what's going on: the base branch of the PR is not main, and the actual base branch was not passed into extract_model_names.py.
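
A minimal sketch of that fix idea (the argument handling and git invocation are assumptions, not the actual contents of extract_model_names.py): let the workflow pass the PR's real base ref into the script, so the diff is computed against it instead of against main.

```python
# Sketch: compute changed files against an explicit base ref passed in by CI,
# e.g. ${{ github.event.pull_request.base.ref }} in the workflow.
import subprocess
import sys


def get_changed_files(base_ref: str) -> list[str]:
    # Assumes the base ref has been fetched, e.g. `git fetch origin <base_ref>`.
    diff = subprocess.run(
        ["git", "diff", "--name-only", f"origin/{base_ref}...HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in diff.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    base = sys.argv[1] if len(sys.argv) > 1 else "main"
    print("\n".join(get_changed_files(base)))
```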

isaac-chung (Collaborator) commented:

Feel free to skip it while I work on a fix.

isaac-chung (Collaborator) commented Jan 12, 2025

There's an error like this one as well:

```
python scripts/extract_model_names.py
Traceback (most recent call last):
  File "/home/runner/work/mteb/mteb/scripts/extract_model_names.py", line 60, in <module>
    model_names = extract_model_names(changed_files)
  File "/home/runner/work/mteb/mteb/scripts/extract_model_names.py", line 40, in extract_model_names
    model_name = next(
  File "/home/runner/work/mteb/mteb/scripts/extract_model_names.py", line 42, in <genexpr>
    kw.value.value
AttributeError: 'Name' object has no attribute 'value'
make: *** [Makefile:44: model-load-test] Error 1
```
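
For context, the traceback points at the AST-based extraction reading kw.value.value, which only exists when the keyword value is a literal. A minimal reproduction (an editor's sketch; the attribute access matches the traceback, but the surrounding script code is assumed):

```python
import ast

# A ModelMeta call whose `name` is a variable (ast.Name) rather than a string
# literal (ast.Constant) reproduces "'Name' object has no attribute 'value'".
source = '''
MODEL_NAME = "my-org/my-model"
meta = ModelMeta(name=MODEL_NAME)
'''

for node in ast.walk(ast.parse(source)):
    if isinstance(node, ast.Call):
        for kw in node.keywords:
            if kw.arg == "name":
                if isinstance(kw.value, ast.Constant):
                    print("literal name:", kw.value.value)
                else:
                    # The guard the extraction needs: kw.value is an ast.Name here.
                    print("non-literal name of type", type(kw.value).__name__)
```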

Samoed (Collaborator, Author) commented Jan 12, 2025

What can cause this error?

@isaac-chung isaac-chung added the bug Something isn't working label Jan 13, 2025
@isaac-chung isaac-chung reopened this Jan 15, 2025
@isaac-chung isaac-chung self-assigned this Jan 15, 2025
isaac-chung (Collaborator) commented:

Problem seems to persist in #1805

Samoed (Collaborator, Author) commented Jan 15, 2025

Maybe we should check in the tests whether the JSON file is correct, without downloading the models?

isaac-chung (Collaborator) commented Jan 15, 2025

> Problem seems to persist in #1805

The branch in #1805 indeed has the previous fix merged in. I will take a look at why the files on disk aren't being removed.

> Maybe we should check in the tests whether the JSON file is correct, without downloading the models?

@Samoed How would that work?

Samoed (Collaborator, Author) commented Jan 15, 2025

The script will scan all model metas and check if each model is present in the JSON. If a model is not found, the test will fail. This way, the test won't download any models.
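
A sketch of that check (helper and file names are assumptions, not the actual mteb test): iterate over the registered model metas and assert that each appears in the tracking JSON, with no model downloads involved.

```python
import json
from pathlib import Path

import mteb  # assumes mteb exposes get_model_metas(); adjust to the real helper


def test_all_models_present_in_json():
    recorded = json.loads(Path("model_load_failures.json").read_text())  # assumed path
    missing = [
        meta.name
        for meta in mteb.get_model_metas()
        if meta.name not in recorded
    ]
    assert not missing, f"Models missing from model_load_failures.json: {missing}"
```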

isaac-chung (Collaborator) commented Jan 15, 2025

I get the benefit of not downloading any model files, but which JSON file is being checked against? How does this test actually validate that the model loads?

[edit] Oh, by json file, do you mean model_load_failures.json? Hmm that could work. This way, any added model meta will need to be run by the user. In that case, I can add better instructions to help with that.

The core issue of scan_cache_dir failing remains unresolved though, and still might affect the user running the test locally.

Samoed (Collaborator, Author) commented Jan 15, 2025

The main issue is figuring out how to add new models to model_load_failures.json. The only approach I can think of is to run the test in two modes: one that downloads the model and another that just performs a json check. I don’t think scan_cache_dir is a big problem since users typically add only 1-2 models at a time.
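
A rough sketch of the two modes (the environment variable and file path are assumptions): a cheap JSON-only check by default, and an actual download-and-load run when explicitly enabled.

```python
import json
import os
from pathlib import Path

import pytest

DOWNLOAD_MODE = os.getenv("MTEB_TEST_DOWNLOAD") == "1"  # assumed toggle
FAILURES_FILE = Path("model_load_failures.json")  # assumed location


@pytest.mark.parametrize("model_name", ["intfloat/multilingual-e5-small"])  # placeholder model
def test_model_loading(model_name):
    if not DOWNLOAD_MODE:
        # Cheap mode: only check that the bookkeeping JSON mentions the model.
        recorded = json.loads(FAILURES_FILE.read_text())
        assert model_name in recorded
        return
    # Expensive mode: actually download and load the model.
    import mteb

    model = mteb.get_model(model_name)
    assert model is not None
```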

isaac-chung (Collaborator) commented:

Good call, let's go with your suggestion. I feel @x-tabdeveloping 's case (adding many models) is rather rare. I'll work on that today and will update when ready 💪

Samoed (Collaborator, Author) commented Jan 15, 2025

Maybe this could be handled similarly to tasks' descriptive statistics. The wrapper could write all the information about model downloading to a file, and ModelMeta would only read the data from that file.
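
A small sketch of that split (file name and function names are hypothetical): the loading wrapper records the outcome once, and everything else only reads the file.

```python
import json
from pathlib import Path

RESULTS_FILE = Path("model_load_failures.json")  # assumed location of the tracking file


def record_load_result(model_name: str, error: str | None) -> None:
    """Written by the loading wrapper: None means the model loaded without issues."""
    results = json.loads(RESULTS_FILE.read_text()) if RESULTS_FILE.exists() else {}
    results[model_name] = error
    RESULTS_FILE.write_text(json.dumps(results, indent=2, sort_keys=True))


def loaded_successfully(model_name: str) -> bool:
    """Read-only check, e.g. from ModelMeta or a test, with no download involved."""
    results = json.loads(RESULTS_FILE.read_text())
    return model_name in results and results[model_name] is None
```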

isaac-chung (Collaborator) commented:

The suggestion above works well for adding models, but not for updating models. The JSON check alone does not influence the test outcome. What we missed was whether the current target key has the value "None" (which means the model loaded without issues, if the script was run).

Here are the cases:

| Case | Changes in File Detected | Model Names Exist in JSON | JSON Target Key Gives "None" | Test Outcome |
| --- | --- | --- | --- | --- |
| Add New Models | Yes | No | No | |
| Add New Models | Yes | Yes | No | |
| Add New Models | Yes | Yes | Yes | |
| Update Models | Yes | Yes | No | |
| Update Models | Yes | Yes | Yes | |

In the update models case, if a model passed before and passes again after file changes, git will not detect a change for the json file.

Note

That said, this version is not able to check whether the model load has actually been run. This means it does not stop someone from hardcoding MODEL_NAME: "None" into the JSON file.

Samoed (Collaborator, Author) commented Jan 15, 2025

Maybe the model could be run on some mock tasks (or all mock tasks), with the main score and MTEB version added to the JSON. This could be used to check whether anything has been updated, and the comparison could be done using a git diff by tag. But this doesn't align with the previous idea.
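
A hedged sketch of that idea (the task choice and JSON layout are assumptions; mteb's mock tasks live in the test suite, so a small real task stands in here): run the model, record its main score and the mteb version, and let git diffs of the file surface unexpected changes.

```python
import json
from pathlib import Path

import mteb


def record_mock_score(model_name: str, out_file: Path = Path("model_mock_scores.json")) -> None:
    model = mteb.get_model(model_name)
    tasks = mteb.get_tasks(tasks=["BornholmBitextMining"])  # small stand-in for a mock task
    results = mteb.MTEB(tasks=tasks).run(model)
    entry = {
        "main_score": results[0].get_score(),  # assumes TaskResult exposes its main score this way
        "mteb_version": mteb.__version__,
    }
    data = json.loads(out_file.read_text()) if out_file.exists() else {}
    data[model_name] = entry
    out_file.write_text(json.dumps(data, indent=2, sort_keys=True))
```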

isaac-chung (Collaborator) commented:

That could work, if it could stop someone from hardcoding in the json file. I don't think it does :(

Another approach could be to only run one model from a changed model file. Right now it runs all models from a changed model file. This might help side-step our issue with scan_cache_dir, with minimal updates to the current test setup.
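
A sketch of that narrowing (an assumption about how scripts/extract_model_names.py could be adapted, not its actual code): take only the first literal ModelMeta name found in each changed file.

```python
import ast
from pathlib import Path


def first_model_name(file_path: str) -> str | None:
    """Return the first literal ModelMeta(name=...) value in the file, if any."""
    tree = ast.parse(Path(file_path).read_text())
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and getattr(node.func, "id", None) == "ModelMeta":
            for kw in node.keywords:
                if kw.arg == "name" and isinstance(kw.value, ast.Constant):
                    return kw.value.value
    return None  # no literal model name found in this file
```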
