merge MLSea and LinkedPapersWithCode #2

Open
VladimirAlexiev opened this issue Dec 3, 2024 · 1 comment

VladimirAlexiev commented Dec 3, 2024

https://linkedpaperswithcode.com/ (LPWC) is an RDFization of https://paperswithcode.com/ by @davidlamprecht and friends.
The source is at https://github.com/metaphacts/linkedpaperswithcode.
It has 10,648,824 triples, covering 486k publications, 205k GitHub repos, 69k evaluations, 11k datasets, 5k tasks, and 2.2k methods.
https://paperswithcode.com/ is considerably smaller than arXiv (imho the preeminent resource for ML innovations), but it is 99% ML/AI and has a lot more detail than arXiv.
A major feature of LPWC is that it's connected to SemOpenAlex, i.e. authors and institutions are disambiguated and connected to LOD.
It's stored in GraphDB 10 and connected by internal federation to SemOpenAlex.
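
To illustrate what a federated lookup against SemOpenAlex could look like, here is a minimal sketch that only composes a SPARQL 1.1 query string with a `SERVICE` clause. Everything concrete in it is an assumption for illustration: the helper name `build_federated_query`, the repository IRI `repository:semopenalex` (written in the `repository:<id>` form that GraphDB internal federation uses), and the predicate IRI are hypothetical and are not taken from the LPWC ontology or its actual setup.

```python
def build_federated_query(service_iri: str, link_predicate: str, limit: int = 10) -> str:
    """Compose a SPARQL query that follows a local link into a federated service.

    service_iri and link_predicate are placeholders supplied by the caller;
    neither is taken from the real LPWC or SemOpenAlex schemas.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT ?paper ?author ?p ?o WHERE {\n"
        # Local pattern: evaluated in the repository the query is sent to.
        f"  ?paper <{link_predicate}> ?author .\n"
        # Remote pattern: evaluated by the federated service.
        f"  SERVICE <{service_iri}> {{\n"
        "    ?author ?p ?o .\n"
        "  }\n"
        f"}} LIMIT {limit}"
    )


# Hypothetical repository id and predicate, for illustration only.
query = build_federated_query(
    service_iri="repository:semopenalex",
    link_predicate="http://example.org/hasAuthor",
    limit=5,
)
print(query)
```

The same function would accept a full remote endpoint URL as `service_iri` if the two graphs lived on different servers rather than in one GraphDB instance.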

https://dtai-kg.github.io/MLSea-KGC/ is an RDFization of Kaggle, OpenML (and some additional data).
The source is at https://github.com/dtai-kg/MLSea-KGC.
It has this breakdown per class: 1.1M software, 940k source code repos, 294k datasets, 408k publications (ScientificWork).
It's stored in Virtuoso 08.03.3329 with 1,445,226,769 triples.
However, https://dtai-kg.github.io/MLSeascape/ refers to GraphDB and points to https://193.190.127.194:7200 (7200 is the GraphDB-specific port), which reports 13,393,828 triples.
An open version of the paper is https://2024.eswc-conferences.org/wp-content/uploads/2024/04/146640512.pdf.
It mentions LPWC briefly: "Linked Papers with Code [18] is a KG that provides information about ML publications from Papers with Code [50] with related metadata such as their datasets, tasks, and evaluations."
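
The two MLSea triple counts quoted above (Virtuoso vs. GraphDB) differ by about two orders of magnitude, so it may help to re-count directly at each endpoint. Below is a minimal sketch, assuming each store exposes a standard SPARQL 1.1 Protocol endpoint that accepts `GET ?query=...` and returns SPARQL JSON results. The helper names and the example endpoint URL are hypothetical; the real repository paths are not given in this issue.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Standard SPARQL query that counts all triples in the default graph.
COUNT_QUERY = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"


def build_count_request(endpoint: str) -> Request:
    """Build (but do not send) a SPARQL Protocol GET request for the count."""
    url = endpoint + "?" + urlencode({"query": COUNT_QUERY})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def parse_count(body: str) -> int:
    """Extract the integer bound to ?n from a SPARQL JSON results document."""
    data = json.loads(body)
    return int(data["results"]["bindings"][0]["n"]["value"])


# Hypothetical endpoint, for illustration only.
request = build_count_request("http://example.org/repositories/demo")
print(request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` and passing the decoded response body to `parse_count` would give one number per endpoint. On a store with over a billion triples the count may be slow or hit a server-side timeout.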

Merging these two excellent resources (including ontologies and actual KGs) will be a major boon for the KG and ML communities!

@VladimirAlexiev
Author

See a major contribution by the LPWC team in the linked issue
