finalise readme

digitalepidemiologylab · Feb 6, 2024 · 11c2419
1 parent 59cbe94
commit 11c2419

Showing 3 changed files with 443 additions and 8 deletions.
diff --git a/CITATION.cff b/CITATION.cff
@@ -3,9 +3,7 @@
 
 cff-version: 1.2.0
 title: >-
-  Using experts, crowdsourcing and artificial intelligence
-  to classify the public stance towards vaccination
-  leveraging social media data
+  Use of large language models as a scalable approach to understanding public health discourse
 message: >-
   If you use this software, please cite it using the
   metadata from this file.
@@ -20,5 +18,5 @@ authors:
 repository-code: >-
   https://github.com/digitalepidemiologylab/llm_crowd_experts_annotation
 license: EUPL-1.2
-version: '1.0'
-date-released: '2024-02-05'
+version: '1.1'
+date-released: '2024-02-06'
diff --git a/README.html b/README.html
diff --git a/README.md b/README.md
@@ -1,6 +1,6 @@
-# Assessment of Large Language Models (LLMs) and Amazon Mturk workers performance in comparison to experts in tweets' annotation for public perception of vaccines
+# Use of large language models as a scalable approach to understanding public health discourse
 
-This repository assesses the performance of LLMs (GPT versions 3.5 and 4, and Mistral) and Amazon Mturk workers in comparison with experts annotators when annotating tweets for public perception on vaccines.
+This repository assesses the performance of LLMs (GPT versions 3.5 and 4, Mistral and Mixtral),  and Amazon Mturk workers in comparison with experts when annotating tweets for public perception on vaccines.
 
 Since Twitter/X data cannot be freely accessible, only certain data is available under the folder 'data', including the tweets id with at least partial agreement among experts.
 
@@ -12,4 +12,8 @@ For visualising the main results of the analysis, including a Shiny application,
 4. Source the code of the scripts with all data publicly available, indicated by "(public)" 
 
 ## Structure of this repository
-
+**R project**: enables to have this repository as a portable, self-contained folder.
+**Shiny app**: web application to visualise some of the results of the study.
+**data**: folder with the publicly available data or aggregated data used in this study.
+**scripts**: folder with the R and python scripts used in the study to produce the results. Some of the scripts cannot be run since those are linked to restricted data that is not available in the repository.
+**outputs**: folder with the outputs produced by the scripts and included in the study.