
bug: mmlu evaluation scores are constant #1172

Open
jalling97 opened this issue Oct 1, 2024 · 0 comments
Labels
possible-bug 🐛 Something may not be working

Steps to reproduce

  1. Deploy LFAI
  2. Run evaluations against deployed instance
  3. View MMLU results

Expected result

  • Scores should vary across each topic category in MMLU

Actual Result

  • All categories report an identical score (approximately 0.697, which is also suspiciously high)

Visual Proof (screenshots, videos, text, etc)

```
INFO:root:MMLU task scores:
                            Task    Score
0  high_school_european_history  0.69697
1               business_ethics  0.69697
```

Additional Context

We may need to investigate the underlying implementation of MMLU in deepeval; a diagnostic sketch follows below.
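
A minimal sketch for isolating whether the constant scores come from deepeval itself or from how our pipeline logs the results. It assumes deepeval's documented benchmark API (`MMLU`, `MMLUTask`, `n_shots`, `evaluate`, and the `overall_score` / `task_scores` / `predictions` attributes); the `LFAIModel` wrapper is hypothetical and stands in for however we wrap the deployed LFAI instance.

```python
from deepeval.benchmarks import MMLU
from deepeval.benchmarks.tasks import MMLUTask

# Hypothetical wrapper around the deployed LFAI instance (a DeepEvalBaseLLM
# subclass); substitute whatever model wrapper the eval pipeline actually uses.
from our_evals.models import LFAIModel

model = LFAIModel()

benchmark = MMLU(
    tasks=[
        MMLUTask.HIGH_SCHOOL_EUROPEAN_HISTORY,
        MMLUTask.BUSINESS_ETHICS,
    ],
    n_shots=5,
)
benchmark.evaluate(model=model)

# If every row of task_scores equals overall_score, the per-task breakdown is
# being collapsed into (or logged as) the aggregate score. If the rows differ
# here but not in our pipeline output, the bug is on our side.
print("overall:", benchmark.overall_score)
print(benchmark.task_scores)

# Spot-check raw predictions to confirm the model is actually being asked
# different questions per task and is not returning a constant/cached answer.
print(benchmark.predictions.head())
```

Running this directly against the deployed instance, outside the full eval pipeline, should tell us whether to file the fix here or upstream in deepeval.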
