cleanup and rearrange task page

spraakbanken · Dec 18, 2024 · commit 3022ad9
1 parent c5a0fe4
Showing 1 changed file with 14 additions and 53 deletions.
diff --git a/multigec-2025.md b/multigec-2025.md
@@ -1,7 +1,9 @@
 # ![MultiGEC-2025 logo](multigec-2025-horizontal.png)
 
-### For the shared task __results__, click [here (__minimal edits track__)](https://spraakbanken.github.io/multigec-2025/results/test_results_minimal.html) and [here (__fluency edits track__)](https://spraakbanken.github.io/multigec-2025/results/test_results_fluency.html).
+###### Quick links: [call for participation](#call-for-participation) | [task description](#task-description) | [data](#data) | [evaluation](#evaluation) | [timeline](#timeline) | [publication](#publication) | [results](#results) | [organizers](#organizers) | [data providers](#data-providers)
 
+
+## Call for participation
 The [Computational SLA](https://spraakbanken.gu.se/en/compsla) working group invites you to participate in the shared task on text-level Multilingual Grammatical Error Correction, **MultiGEC**, covering 12 languages: Czech, English, Estonian, German, Greek, Icelandic, Italian, Latvian, Russian, Slovene, Swedish and Ukrainian (see also the [call for participation on the ACL portal](https://www.aclweb.org/portal/content/shared-task-multilingual-grammatical-error-correction-2025)).
 
 Automatic system evaluation will be carried out [on CodaLab](https://codalab.lisn.upsaclay.fr/competitions/20500), but the official leaderboard will be hosted on this website.
@@ -10,7 +12,7 @@ The results will be presented on March 5, 2025, at the [NLP4CALL workshop](https
 The publication venue for system descriptions will be the proceedings of the NLP4CALL workshop, co-published in the ACL Anthology.
 
 To register for/express interest in the shared task, please fill in [this form](https://forms.gle/nTPfARVqy1XmqT4t6).   
-To get important information and updates about the shared task, please join the [MultiGEC-2025 Google Group](https://groups.google.com/g/multigec-2025).
+To get important information and updates about the shared task, ask questions and hold discussions, please join the [MultiGEC-2025 Google Group](https://groups.google.com/g/multigec-2025).
 
 ## Task description
 In this shared task, your goal is to rewrite learner-written texts to make them grammatically correct or both grammatically correct and idiomatic, that is, either adhering to the "minimal correction" principle or applying fluency edits.
@@ -32,56 +34,16 @@ For fair evaluation of both approaches to the correction task, we will provide t
 We particularly encourage development of multilingual systems that can process all (or several) languages using a single model, but this is not a mandatory requirement to participate in the task. 
 
 ## Data
-
 We provide training, development and test data for each of the languages.
-The training and development splits will be made available through GitHub. 
-Evaluation will be performed on a separate test set. 
-
-### Data access
 
-Training and validation data is available at [github.com/spraakbanken/multigec-2025-participants](https://github.com/spraakbanken/multigec-2025-participants).
+Training and validation data is available [on GitHub](https://github.com/spraakbanken/multigec-2025-participants).
 To get access to this repository, you need to agree to the [Terms of Use](https://forms.gle/VLJ18WbwsxitEBYi7). 
+Evaluation will be performed on a separate test set. 
 
-### Data Format
-The dataset, divided into folders based on language, consists of essay-aligned files, one containing the original learner essays, and one or more containing reference (corrected/normalized) texts.
-
-Internally, each file follows this simple markdown-based format:
-
-```
-### essay_id = 1
-Full text of the first essay/reference.
-
-Whitespace, including newline characters, is preserved, but for the sake of readability TWO consecutive newline characters separate subsequent essays.
-
-### essay_id = 2
-Full text of the second essay/reference.
-
-...
-```
+A description of the data format is available [here](https://spraakbanken.github.io/multigec-2025/data_format.html).
 
-### External Data
 Participants may use additional resources to build their systems __provided that the resource is publicly available for research purposes__. This includes monolingual data, artificial data, pretrained models, syntactic parsers, etc. After the shared task, we encourage participants to share any newly created resources with the community.
 
-<!--
-
-### Data Licenses
-
-| Language  |  Corpus name | Corpus license | MultiGEC license | 
-|:----------|:-------------|:---------------|:-----------------|
-| Czech     | 
-| English   | 
-| Estonian  |
-| German    |
-| Greek     |
-| Icelandic | 
-| Italian   | 
-| Latvian   | 
-| Russian   | 
-| Slovene   |
-| Swedish   | SweLL-gold | --CLARIN-ID, -PRIV, -NORED, -BY | 
-| Ukrainian |
--->
-
 ## Evaluation 
 During the shared task, evaluation will be based on the following cross-lingually applicable __automatic metrics__:
 
@@ -99,7 +61,7 @@ After the shared task, we also plan on carrying out a __human evaluation__ exper
 * October 20, 2024 - third call for participation. Training and validation data released ✓
 * October 31, 2024 - reminder. CodaLab opens for team registrations, validation phase starts ✓
 * November 13, 2024 - test phase starts ✓
-* November 29, 2024 - system submission deadline (system output) (__extended__) ✓
+* November 29, 2024 - system submission deadline (system output) (__extended__); open phase starts ✓
 * December 2, 2024 - results announced ✓
 * January 9, 2025 - paper submission deadline with system descriptions (__extended__)
 * January 20, 2025 - paper reviews sent to the authors
@@ -108,15 +70,18 @@ After the shared task, we also plan on carrying out a __human evaluation__ exper
 
 __All deadlines above are AoE__.
 
-
 ## Publication
 We encourage you to submit a paper with your system description to the NLP4CALL workshop special track. 
 We follow the same requirements for paper submissions as the NLP4CALL workshop, i.e. we use the same template and apply the same page limit. 
 All papers will be reviewed by the organizing committee. 
 Upon paper publication, we encourage you to share models, code, fact sheets, extra data, etc. with the community through GitHub or other repositories.
 
-## Organizers
+## Results
+Official results for the competitive phase of the task (which ended on November 29, 2024) are available at the following links:
+- [__minimal edits__ track](https://spraakbanken.github.io/multigec-2025/results/test_results_minimal.html)
+- [__fluency edits__ track](https://spraakbanken.github.io/multigec-2025/results/test_results_fluency.html)
 
+## Organizers
 * [Arianna Masciolini](https://harisont.github.io/research.html), University of Gothenburg, Sweden
 * [Andrew Caines](https://www.cl.cam.ac.uk/~apc38/), University of Cambridge, UK
 * [Orphée De Clercq](https://research.flw.ugent.be/en/orphee.declercq), Ghent University, Belgium
@@ -167,8 +132,4 @@ Upon paper publication, we encourage you to share models, code, fact sheets, ext
   - Arianna Masciolini, University of Gothenburg, Sweden
 - Ukrainian:
   - Oleksiy Syvokon, Microsoft
-  - Mariana Romanyshyn, Grammarly
-
-## Contact information and forum for discussions
-
-Please join the [MultiGEC-2025 Google group](https://groups.google.com/g/multigec-2025) in order to ask questions, hold discussions and browse for already answered questions.
+  - Mariana Romanyshyn, Grammarly
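
Note on the "Data Format" section that this commit moves off the task page: the removed text describes files in which each essay starts with a `### essay_id = N` header line and essays are separated by two consecutive newline characters. A minimal Python sketch of reading such a file might look like the following. The helper name `parse_essays` and the sample text are hypothetical and not part of the shared task tooling. The sketch also assumes that newlines at the end of an essay are separators only, which the description suggests but does not spell out.

```python
import re

# Header line as shown in the removed "Data Format" section, e.g. "### essay_id = 1".
HEADER = re.compile(r"^### essay_id = (.+)$", re.MULTILINE)


def parse_essays(text: str) -> dict[str, str]:
    """Map each essay_id to its text (hypothetical helper, for illustration only)."""
    essays = {}
    matches = list(HEADER.finditer(text))
    for i, m in enumerate(matches):
        start = m.end() + 1  # skip the newline that ends the header line
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        # Assumption: trailing newlines only separate essays, so they are dropped.
        # Blank lines inside an essay are kept as they are.
        essays[m.group(1).strip()] = text[start:end].rstrip("\n")
    return essays


# Made-up sample in the documented layout (not real task data).
sample = (
    "### essay_id = 1\n"
    "First paragraph.\n"
    "\n"
    "Second paragraph.\n"
    "\n"
    "### essay_id = 2\n"
    "Another essay.\n"
)
essays = parse_essays(sample)
```

Splitting on the header lines rather than on blank lines keeps essays with several paragraphs intact, since an essay can itself contain two consecutive newlines. Essay IDs are kept as strings because the description does not say they are always numeric.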