
Error Model Loading / extract-and-run #1756

Closed
Samoed opened this issue Jan 10, 2025 · 15 comments · Fixed by #1775 or #1818

Comments

Samoed (Collaborator) commented Jan 10, 2025

When pushing commits, the following error appears in the CI run:

System.IO.IOException: No space left on device : '/home/runner/runners/2.321.0/_diag/Worker_20250110-152956-utc.log' ...

Example run: https://github.com/embeddings-benchmark/mteb/actions/runs/12712643686/job/35438686654?pr=1749
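
One way to keep the runner's disk from filling up between model loads (a sketch using huggingface_hub's cache utilities, not necessarily the fix that was eventually merged) is to delete each model's cached files after it has been tested:

```python
# Sketch only: free a model's Hugging Face cache entry after it has been checked.
from huggingface_hub import scan_cache_dir


def free_model_cache(repo_id: str) -> None:
    cache_info = scan_cache_dir()
    revisions = [
        rev.commit_hash
        for repo in cache_info.repos
        if repo.repo_id == repo_id
        for rev in repo.revisions
    ]
    if revisions:
        strategy = cache_info.delete_revisions(*revisions)
        print(f"Freeing {strategy.expected_freed_size_str} used by {repo_id}")
        strategy.execute()
```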

isaac-chung (Collaborator) commented:

I have a rough idea of what's going on: the base branch of the PR is not main, and the actual base branch was not passed into extract_model_names.py.
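
A minimal sketch of that fix idea (the argument handling and git invocation are assumptions, not the actual contents of extract_model_names.py): let the workflow pass the PR's real base ref into the script, so the diff is computed against it instead of against main.

```python
# Sketch: compute changed files against an explicit base ref passed in by CI,
# e.g. ${{ github.event.pull_request.base.ref }} in the workflow.
import subprocess
import sys


def get_changed_files(base_ref: str) -> list[str]:
    # Assumes the base ref has been fetched, e.g. `git fetch origin <base_ref>`.
    diff = subprocess.run(
        ["git", "diff", "--name-only", f"origin/{base_ref}...HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in diff.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    base = sys.argv[1] if len(sys.argv) > 1 else "main"
    print("\n".join(get_changed_files(base)))
```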

isaac-chung (Collaborator) commented:

Feel free to skip it while I work on a fix.

isaac-chung (Collaborator) commented Jan 12, 2025

There's an error like this one as well:

```
python scripts/extract_model_names.py
Traceback (most recent call last):
  File "/home/runner/work/mteb/mteb/scripts/extract_model_names.py", line 60, in <module>
    model_names = extract_model_names(changed_files)
  File "/home/runner/work/mteb/mteb/scripts/extract_model_names.py", line 40, in extract_model_names
    model_name = next(
  File "/home/runner/work/mteb/mteb/scripts/extract_model_names.py", line 42, in <genexpr>
    kw.value.value
AttributeError: 'Name' object has no attribute 'value'
make: *** [Makefile:44: model-load-test] Error 1
```
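
For context, the traceback points at the AST-based extraction reading kw.value.value, which only exists when the keyword value is a literal. A minimal reproduction (an editor's sketch; the attribute access matches the traceback, but the surrounding script code is assumed):

```python
import ast

# A ModelMeta call whose `name` is a variable (ast.Name) rather than a string
# literal (ast.Constant) reproduces "'Name' object has no attribute 'value'".
source = '''
MODEL_NAME = "my-org/my-model"
meta = ModelMeta(name=MODEL_NAME)
'''

for node in ast.walk(ast.parse(source)):
    if isinstance(node, ast.Call):
        for kw in node.keywords:
            if kw.arg == "name":
                if isinstance(kw.value, ast.Constant):
                    print("literal name:", kw.value.value)
                else:
                    # The guard the extraction needs: kw.value is an ast.Name here.
                    print("non-literal name of type", type(kw.value).__name__)
```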

Samoed (Collaborator, Author) commented Jan 12, 2025

What can cause this error?

@isaac-chung isaac-chung added the bug Something isn't working label Jan 13, 2025
@isaac-chung isaac-chung reopened this Jan 15, 2025
@isaac-chung isaac-chung self-assigned this Jan 15, 2025
isaac-chung (Collaborator) commented:

Problem seems to persist in #1805

Samoed (Collaborator, Author) commented Jan 15, 2025

Maybe we should check in the tests whether the JSON file is correct, without downloading the models?

isaac-chung (Collaborator) commented Jan 15, 2025

> Problem seems to persist in #1805

The branch in #1805 indeed has the previous fix merged in. I will take a look at why the files on disk aren't being removed.

> Maybe we should check in the tests whether the JSON file is correct, without downloading the models?

@Samoed How would that work?

Samoed (Collaborator, Author) commented Jan 15, 2025

The script will scan all model metas and check if each model is present in the JSON. If a model is not found, the test will fail. This way, the test won't download any models.
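
A sketch of that check (helper and file names are assumptions, not the actual mteb test): iterate over the registered model metas and assert that each appears in the tracking JSON, with no model downloads involved.

```python
import json
from pathlib import Path

import mteb  # assumes mteb exposes get_model_metas(); adjust to the real helper


def test_all_models_present_in_json():
    recorded = json.loads(Path("model_load_failures.json").read_text())  # assumed path
    missing = [
        meta.name
        for meta in mteb.get_model_metas()
        if meta.name not in recorded
    ]
    assert not missing, f"Models missing from model_load_failures.json: {missing}"
```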

isaac-chung (Collaborator) commented Jan 15, 2025

I get the benefit of not downloading any model files, but which JSON file is being checked against? How does this test actually validate that the model loads?

[edit] Oh, by json file, do you mean model_load_failures.json? Hmm that could work. This way, any added model meta will need to be run by the user. In that case, I can add better instructions to help with that.

The core issue of scan_cache_dir failing remains unresolved though, and still might affect the user running the test locally.

Samoed (Collaborator, Author) commented Jan 15, 2025

The main issue is figuring out how to add new models to model_load_failures.json. The only approach I can think of is to run the test in two modes: one that downloads the model and another that just performs a json check. I don’t think scan_cache_dir is a big problem since users typically add only 1-2 models at a time.
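
A rough sketch of the two modes (the environment variable and file path are assumptions): a cheap JSON-only check by default, and an actual download-and-load run when explicitly enabled.

```python
import json
import os
from pathlib import Path

import pytest

DOWNLOAD_MODE = os.getenv("MTEB_TEST_DOWNLOAD") == "1"  # assumed toggle
FAILURES_FILE = Path("model_load_failures.json")  # assumed location


@pytest.mark.parametrize("model_name", ["intfloat/multilingual-e5-small"])  # placeholder model
def test_model_loading(model_name):
    if not DOWNLOAD_MODE:
        # Cheap mode: only check that the bookkeeping JSON mentions the model.
        recorded = json.loads(FAILURES_FILE.read_text())
        assert model_name in recorded
        return
    # Expensive mode: actually download and load the model.
    import mteb

    model = mteb.get_model(model_name)
    assert model is not None
```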

isaac-chung (Collaborator) commented:

Good call, let's go with your suggestion. I feel @x-tabdeveloping 's case (adding many models) is rather rare. I'll work on that today and will update when ready 💪

Samoed (Collaborator, Author) commented Jan 15, 2025

Maybe this could be handled similarly to tasks' descriptive statistics. The wrapper could write all the information about model downloading to a file, and ModelMeta would only read the data from that file.
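
A small sketch of that split (file name and function names are hypothetical): the loading wrapper records the outcome once, and everything else only reads the file.

```python
import json
from pathlib import Path

RESULTS_FILE = Path("model_load_failures.json")  # assumed location of the tracking file


def record_load_result(model_name: str, error: str | None) -> None:
    """Written by the loading wrapper: None means the model loaded without issues."""
    results = json.loads(RESULTS_FILE.read_text()) if RESULTS_FILE.exists() else {}
    results[model_name] = error
    RESULTS_FILE.write_text(json.dumps(results, indent=2, sort_keys=True))


def loaded_successfully(model_name: str) -> bool:
    """Read-only check, e.g. from ModelMeta or a test, with no download involved."""
    results = json.loads(RESULTS_FILE.read_text())
    return model_name in results and results[model_name] is None
```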

isaac-chung (Collaborator) commented:

The suggestion above works well for adding models, but not for updating models. The JSON check alone does not influence the test outcome. What we missed was whether the current target key has the value "None" (which means the model loaded without issues, if the script was run).

Here are the cases:

| Case | Changes in File Detected | Model Names Exist in JSON | JSON Target Key Gives "None" | Test Outcome |
| --- | --- | --- | --- | --- |
| Add New Models | Yes | No | No | |
| Add New Models | Yes | Yes | No | |
| Add New Models | Yes | Yes | Yes | |
| Update Models | Yes | Yes | No | |
| Update Models | Yes | Yes | Yes | |

In the update models case, if a model passed before and passes again after file changes, git will not detect a change for the json file.

Note

That said, this version is not able to check whether the model load has actually been run. This means it does not stop someone from hardcoding MODEL_NAME: "None" into the JSON file.

Samoed (Collaborator, Author) commented Jan 15, 2025

Maybe the model could be run on some mock tasks (or all mock tasks), with the main score and MTEB version added to the JSON. This could be used to check whether anything has been updated, and the comparison could be done using a git diff by tag. But this doesn't align with the previous idea.
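
A hedged sketch of that idea (the task choice and JSON layout are assumptions; mteb's mock tasks live in the test suite, so a small real task stands in here): run the model, record its main score and the mteb version, and let git diffs of the file surface unexpected changes.

```python
import json
from pathlib import Path

import mteb


def record_mock_score(model_name: str, out_file: Path = Path("model_mock_scores.json")) -> None:
    model = mteb.get_model(model_name)
    tasks = mteb.get_tasks(tasks=["BornholmBitextMining"])  # small stand-in for a mock task
    results = mteb.MTEB(tasks=tasks).run(model)
    entry = {
        "main_score": results[0].get_score(),  # assumes TaskResult exposes its main score this way
        "mteb_version": mteb.__version__,
    }
    data = json.loads(out_file.read_text()) if out_file.exists() else {}
    data[model_name] = entry
    out_file.write_text(json.dumps(data, indent=2, sort_keys=True))
```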

isaac-chung (Collaborator) commented:

That could work, if it could stop someone from hardcoding in the json file. I don't think it does :(

Another approach could be to only run one model from a changed model file. Right now it runs all models from a changed model file. This might help side-step our issue with scan_cache_dir, with minimal updates to the current test setup.
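
A sketch of that narrowing (an assumption about how scripts/extract_model_names.py could be adapted, not its actual code): take only the first literal ModelMeta name found in each changed file.

```python
import ast
from pathlib import Path


def first_model_name(file_path: str) -> str | None:
    """Return the first literal ModelMeta(name=...) value in the file, if any."""
    tree = ast.parse(Path(file_path).read_text())
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and getattr(node.func, "id", None) == "ModelMeta":
            for kw in node.keywords:
                if kw.arg == "name" and isinstance(kw.value, ast.Constant):
                    return kw.value.value
    return None  # no literal model name found in this file
```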
