Thanks for this neat repo, very convenient for evaluating LLMs!

As a feature request, I would like to suggest adding an option to save the results of an evaluation for the implemented tasks, to allow for easier analytics. My understanding is that the current main.py only prints results.

It would also be useful to store scores per sub-task for tasks like MMLU or BBH.
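For reference, here is a minimal sketch of what such an option could look like: dumping a results dictionary to a JSON file, with per-sub-task scores nested under each task. The `save_results` helper, the output path, the results layout, and the score values are all hypothetical placeholders for illustration, not part of the repo's existing code.

```python
import json
from pathlib import Path


def save_results(results: dict, output_path: str = "results.json") -> None:
    """Write evaluation results (including per-sub-task scores) to a JSON file."""
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)  # create output dir if needed
    with path.open("w") as f:
        json.dump(results, f, indent=2)
    print(f"Saved results to {path}")


# Hypothetical example: overall and per-sub-task scores for a task like MMLU.
# The numbers below are dummy placeholder values.
results = {
    "mmlu": {
        "overall": 0.62,
        "subtasks": {
            "abstract_algebra": 0.41,
            "world_religions": 0.78,
        },
    },
}
save_results(results, "outputs/mmlu_results.json")
```

Something like a `--output_path` flag on main.py that triggers this kind of dump would cover the use case.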