Commit

finalise readme
lauespinosa committed Feb 6, 2024
1 parent 59cbe94 commit 11c2419
Showing 3 changed files with 443 additions and 8 deletions.
8 changes: 3 additions & 5 deletions CITATION.cff
@@ -3,9 +3,7 @@

cff-version: 1.2.0
title: >-
-Using experts, crowdsourcing and artificial intelligence
-to classify the public stance towards vaccination
-leveraging social media data
+Use of large language models as a scalable approach to understanding public health discourse
message: >-
If you use this software, please cite it using the
metadata from this file.
@@ -20,5 +18,5 @@ authors:
repository-code: >-
https://github.com/digitalepidemiologylab/llm_crowd_experts_annotation
license: EUPL-1.2
-version: '1.0'
-date-released: '2024-02-05'
+version: '1.1'
+date-released: '2024-02-06'
433 changes: 433 additions & 0 deletions README.html

Large diffs are not rendered by default.

10 changes: 7 additions & 3 deletions README.md
@@ -1,6 +1,6 @@
-# Assessment of Large Language Models (LLMs) and Amazon Mturk workers performance in comparison to experts in tweets' annotation for public perception of vaccines
+# Use of large language models as a scalable approach to understanding public health discourse

-This repository assesses the performance of LLMs (GPT versions 3.5 and 4, and Mistral) and Amazon Mturk workers in comparison with experts annotators when annotating tweets for public perception on vaccines.
+This repository assesses the performance of LLMs (GPT versions 3.5 and 4, Mistral and Mixtral), and Amazon Mturk workers in comparison with experts when annotating tweets for public perception on vaccines.

Since Twitter/X data cannot be freely accessible, only certain data is available under the folder 'data', including the tweets id with at least partial agreement among experts.

@@ -12,4 +12,8 @@ For visualising the main results of the analysis, including a Shiny application,
4. Source the code of the scripts with all data publicly available, indicated by "(public)"

## Structure of this repository

+**R project**: enables to have this repository as a portable, self-contained folder.
+**Shiny app**: web application to visualise some of the results of the study.
+**data**: folder with the publicly available data or aggregated data used in this study.
+**scripts**: folder with the R and python scripts used in the study to produce the results. Some of the scripts cannot be run since those are linked to restricted data that is not available in the repository.
+**outputs**: folder with the outputs produced by the scripts and included in the study.
