1.29.8 (2025-01-17)

Fix

fix: Added Misc Chinese models (#1819)
Added moka and piccolo models to overview file
Added Text2Vec models
Added various Chinese embedding models

Co-authored-by: Isaac Chung <[email protected]> (9823529)

fix: Added way more training dataset annotations (#1765)
fix: Leaderboard: K instead of M
Fixes #1752
format
fixed existing annotations to refer to task name instead of hf dataset
added annotation to nvidia
added voyage
added uae annotations
Added stella annotations
sentence trf models
added salesforce and e5
jina
bge + model2vec
added llm2vec annotations
add jasper
format
format
Updated annotations and moved jina models
fix: add even more training dataset annotations (#1793)
fix: update max tokens for OpenAI (#1772)

update max tokens

ci: skip AfriSentiLID for now (#1785)
skip AfriSentiLID for now
skip relevant test case instead

Co-authored-by: Isaac Chung <[email protected]>

1.28.7

Automatically generated by python-semantic-release

ci: fix model loading test (#1775)
pass base branch into the make command as an arg
test a file that has custom wrapper
what about overview
just dont check overview
revert instance check
explicitly omit overview and init
remove test change
try on a lot of models
revert test model file

Co-authored-by: Isaac Chung <[email protected]>

feat: Update task filtering, fixing bug which included cross-lingual tasks in overly many benchmarks (#1787)
feat: Update task filtering, fixing bug on MTEB

Updated task filtering adding exclusive_language_filter and hf_subset
fix bug in MTEB where cross-lingual splits were included
added missing language filtering to MTEB(europe, beta) and MTEB(indic, beta)

The following code outlines the problems:

import mteb
from mteb.benchmarks import MTEB_ENG_CLASSIC

task = [t for t in MTEB_ENG_CLASSIC.tasks if t.metadata.name == "STS22"][0]
# was equivalent to:
task = mteb.get_task("STS22", languages=["eng"])
task.hf_subsets
# which filtered to every subset containing English:
# ['en', 'de-en', 'es-en', 'pl-en', 'zh-en']
# However, it should be:
# ['en']

# with the changes it is:
task = [t for t in MTEB_ENG_CLASSIC.tasks if t.metadata.name == "STS22"][0]
task.hf_subsets
# ['en']
# equivalent to:
task = mteb.get_task("STS22", hf_subsets=["en"])
# which you can also obtain using the exclusive_language_filter (though not if there were multiple English splits):
task = mteb.get_task("STS22", languages=["eng"], exclusive_language_filter=True)
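
The difference between the two filtering modes can be sketched without mteb. This is a minimal illustration, not the library's implementation: the helper filter_subsets and the subset-to-language mapping below are hypothetical, and the mapping only covers the subsets named above plus one invented non-English entry.

```python
# Hypothetical mapping from subset name to the languages it covers.
# Illustrative only; not read from the real STS22 metadata.
SUBSET_LANGUAGES = {
    "en": ["eng"],
    "de-en": ["deu", "eng"],
    "es-en": ["spa", "eng"],
    "pl-en": ["pol", "eng"],
    "zh-en": ["cmn", "eng"],
    "de": ["deu"],
}


def filter_subsets(subset_languages, languages, exclusive=False):
    """Return subset names matching the requested languages.

    Inclusive mode keeps a subset if it contains ANY requested language.
    Exclusive mode keeps a subset only if ALL of its languages are requested.
    """
    wanted = set(languages)
    selected = []
    for name, langs in subset_languages.items():
        langs = set(langs)
        if exclusive:
            keep = bool(langs) and langs <= wanted
        else:
            keep = bool(langs & wanted)
        if keep:
            selected.append(name)
    return selected


inclusive = filter_subsets(SUBSET_LANGUAGES, ["eng"])
exclusive = filter_subsets(SUBSET_LANGUAGES, ["eng"], exclusive=True)
print(inclusive)  # ['en', 'de-en', 'es-en', 'pl-en', 'zh-en']
print(exclusive)  # ['en']
```

Under these assumptions, the inclusive mode reproduces the old behaviour (cross-lingual subsets leak in), while the exclusive mode keeps only the monolingual English subset.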

format
remove "en-ext" from AmazonCounterfactualClassification
fixed mteb(deu)
fix: simplify in a few areas
fix: Add gritlm

1.29.0

Automatically generated by python-semantic-release

fix: Added more annotations!
fix: Added C-MTEB (#1786)

Added C-MTEB

1.29.1

Automatically generated by python-semantic-release

docs: Add contact to MMTEB benchmarks (#1796)
Add myself to MMTEB benchmarks
lint
fix: loading pre 11 (#1798)
fix loading pre 11
add similarity
lint
run all task types

1.29.2

Automatically generated by python-semantic-release

fix: allow to load no revision available (#1801)
fix allow to load no revision available
lint
add require_model_meta to leaderboard
lint

1.29.3

Automatically generated by python-semantic-release

Co-authored-by: Roman Solomatin <[email protected]>
Co-authored-by: Isaac Chung <[email protected]>
Co-authored-by: github-actions <[email protected]>
Co-authored-by: Márton Kardos <[email protected]>

fix: bm25s (#1827)

Co-authored-by: sam021313 <[email protected]> (96420a2)

fix: Added Chinese Stella models (#1824)

Added Chinese Stella models (74b495c)
