Why is user score included in dictionary keys in summarization_evaluation? #44

Open
AndreaSottana opened this issue Apr 5, 2023 · 1 comment

Comments

@AndreaSottana
Contributor

Hello, I have noticed that as a result of this line
https://github.com/davidjurgens/potato/blob/master/potato/server_utils/schemas/likert.py#L39
on summarization_evaluation the generated annotation_output looks something like this

{"label_annotations": {"relevance": {"scale_5": "5"}, "fluency": {"scale_2": "2"}, "coherence": {"scale_4": "4"}, "consistency": {"not consistent": "2"}

or

{"label_annotations": {"relevance": {"scale_4": "4"}, "fluency": {"scale_1": "1"}, "coherence": {"scale_3": "3"}, "consistency": {"consistent": "1"}}

whereas ideally we should have something like

{"label_annotations": {"relevance": {"scale": "5"}, "fluency": {"scale": "2"}, "coherence": {"scale": "4"}, "consistency": "not consistent"}

Not only is the rating duplicated in both the key and the value, but this also has negative implications for the annotation_output/annotated_instances.tsv file: each rating gets its own column, so relevance scale 4 ends up in a different column than relevance scale 3. The output looks something like this, which is really not ideal:
[Image: Screenshot 2023-04-05 at 16 45 40]
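To make the column problem concrete, here is a minimal sketch. It assumes the TSV columns are derived by joining the schema name and the label key; the `flatten` helper and the `schema:::label` naming are hypothetical and not taken from the potato code.

```python
def flatten(label_annotations):
    """Flatten {schema: {label: value}} into {"schema:::label": value}."""
    # Hypothetical helper: the column naming scheme is an assumption.
    return {
        f"{schema}:::{label}": value
        for schema, labels in label_annotations.items()
        for label, value in labels.items()
    }


# Two annotated instances, using the key style from the output above.
rows = [
    {"relevance": {"scale_5": "5"}, "fluency": {"scale_2": "2"}},
    {"relevance": {"scale_4": "4"}, "fluency": {"scale_1": "1"}},
]

# Every distinct rating becomes its own column.
columns = sorted({col for row in rows for col in flatten(row)})
print(columns)
```

With only two schemas, the two rows already produce four columns, because each rating value is baked into the key. With a single `scale` key there would be one column per schema.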

Would it not be better to change this line of code from

label = "scale_" + str(i)

to

label = "scale"

to prevent this issue? I have not yet explored the project in enough depth to say with confidence whether this would break other parts of the code, but to me it seems like this should be changed. Let me know what you think.

@Jiaxin-Pei
Collaborator

Hi @AndreaSottana, thanks a lot for raising this issue.
I agree with you that we need a more elegant way to save the annotations.
The code you pointed to generates the HTML for the annotation schema, and the label variable is also used for the shortcut keybindings. Therefore, we probably cannot change that part at this time.

What we can do instead is edit the code that saves the annotations. I will try to fix this later this week!
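As a rough sketch of what a fix on the saving side might look like: collapse each likert answer to a single `scale` key just before it is written out, leaving the HTML labels and keybindings untouched. The function name `normalize_likert` and the assumption that likert labels always match `scale_<n>` are hypothetical and not taken from the potato code base.

```python
import re

# Assumption: likert labels always have the form "scale_<n>".
SCALE_KEY = re.compile(r"^scale_\d+$")


def normalize_likert(label_annotations):
    """Return a copy where {"scale_5": "5"} becomes {"scale": "5"}."""
    normalized = {}
    for schema, labels in label_annotations.items():
        if len(labels) == 1:
            label, value = next(iter(labels.items()))
            if SCALE_KEY.match(label):
                # Drop the rating from the key; keep it only as the value.
                normalized[schema] = {"scale": value}
                continue
        # Anything that is not a single likert answer is copied unchanged.
        normalized[schema] = dict(labels)
    return normalized


raw = {"relevance": {"scale_5": "5"}, "consistency": {"not consistent": "2"}}
cleaned = normalize_likert(raw)
print(cleaned)
```

This sketch only handles the `scale_<n>` case. Other schemas, such as the `consistency` answer, pass through unchanged and would need their own handling.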
