I would like to know what ruCLIP was trained on.
We, LAION, have around 6B as-yet-unreleased image-text pairs, filtered with CLIP and mCLIP. Many of them are also in Russian. :)
@christophschuhmann Hello! Your LAION dataset is incredible. As a researcher, I would be interested in working with the Russian-language part of it.
ruCLIP was trained on datasets from open sources, datasets from the Sberbank ecosystem, and sample datasets translated using neural networks. We collected about 240M pairs, of which only 100M are in "native" Russian. The data turned out to be quite noisy, but it definitely contains useful signal for ruCLIP.
My colleague Andrey Kuznetsov sent you an e-mail [email protected]. Could you discuss the terms and conditions for using your dataset with him? We would be very grateful for your help.
Nice to hear from you. I have not yet received an email at [email protected]
Maybe it got caught in a spam filter. Could he send it again to [email protected]
If you'd like access, let me know.
Christoph Schuhmann
www.laion.ai