Commit

contributors
harisont committed Dec 19, 2024
1 parent 3859e37 commit 8607527
Showing 3 changed files with 63 additions and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -1,5 +1,5 @@
# ![MultiGEC](multigec.png)
MultiGEC is a dataset for Multilingual Grammatical Error Correction in 12 European languages (Czech, English, Estonian, German, Greek, Icelandic, Italian, Latvian, Russian, Slovene, Swedish and Ukrainian) that was originally compiled in the context of [MultiGEC-2025](https://spraakbanken.github.io/multigec-2025/shared_task.html), the first text-level GEC shared task.
MultiGEC is a dataset for Multilingual Grammatical Error Correction in 12 European languages (Czech, English, Estonian, German, Greek, Icelandic, Italian, Latvian, Russian, Slovene, Swedish and Ukrainian) compiled by [the CompSLA working group and over 20 external data providers](https://spraakbanken.github.io/multigec-2025/contributors.html) in the context of [MultiGEC-2025](https://spraakbanken.github.io/multigec-2025/shared_task.html), the first text-level GEC shared task.

## Access
The MultiGEC dataset is subject to the terms of use listed [here](https://spraakbanken.github.io/multigec-2025/terms_of_use.html).
1 change: 1 addition & 0 deletions _config.yaml
@@ -8,6 +8,7 @@ description: >
header_pages:
- data_format.md
- terms_of_use.md
- contributors.md
- shared_task.md
- publications.md

61 changes: 61 additions & 0 deletions contributors.md
@@ -0,0 +1,61 @@
---
title: Contributors
---

The MultiGEC dataset is the result of a collaboration between [the CompSLA working group](https://spraakbanken.gu.se/compsla), responsible for the __design, validation and distribution of the MultiGEC dataset__, and over 20 external data providers.

The __individual MultiGEC subcorpora__ were curated by the following people:

#### Czech
- Alexandr Rosen, Charles University, Prague

#### English
- Diane Nicholls, ELiT, Cambridge University Press & Assessment
- Andrew Caines, University of Cambridge
- Paula Buttery, University of Cambridge

#### Estonian
- Mark Fishel, University of Tartu, Estonia
- Kais Allkivi, Tallinn University, Estonia
- Kristjan Suluste, Eesti Keele Instituut, Estonia

#### German
- Andrea Horbach, IPN / CAU Kiel, Germany
- Josef Ruppenhofer, FernUniversität in Hagen, Germany
- Katrin Wisniewski, Universität Leipzig
- Torsten Zesch, FernUniversität in Hagen, Germany

#### Greek
- Alexandros Tantos, Aristotle University of Thessaloniki
- Konstantinos Tsiotskas, Aristotle University of Thessaloniki
- Vassilis Varsamopoulos, Aristotle University of Thessaloniki
- Pinelopi Kikilintza, Aristotle University of Thessaloniki
- Elena Drakonaki, Aristotle University of Thessaloniki
- Eleni Tsourilla, Aristotle University of Thessaloniki
- Despoina-Ourania Touriki, Aristotle University of Thessaloniki

#### Icelandic
- Isidora Glišić, University of Iceland

#### Italian
- Jennifer-Carmen Frey, Eurac Research Bolzano, Italy
- Lionel Nicolas, Eurac Research Bolzano, Italy

#### Latvian
- Roberts Darģis, University of Latvia
- Ilze Auzina, University of Latvia

#### Russian
- Alla Rozovskaya, City University of New York (CUNY), USA

#### Slovene
- Špela Arhar Holdt, University of Ljubljana, Slovenia
- Aleš Žagar, University of Ljubljana, Slovenia

#### Swedish
- Arianna Masciolini, University of Gothenburg, Sweden
- Elena Volodina, University of Gothenburg, Sweden

#### Ukrainian
- Oleksiy Syvokon, Microsoft
- Mariana Romanyshyn, Grammarly
