[Feature] Added Predicted Win Probabilities to CompassArenaBradleyTerrySummarizer #1815

Merged
merged 1 commit into from
Jan 10, 2025

Conversation

acylam
Collaborator

@acylam acylam commented Jan 10, 2025

Motivation

Adds an option to CompassArenaBradleyTerrySummarizer to report predicted win rates instead of Elo-scale ratings.

Modification

  • Added an option, report_pred_win_rates, to report the predicted win rates (win probabilities against the baseline model) instead of the Elo-scale ratings.
  • report_pred_win_rates defaults to True, so the summarize method returns the predicted win rates to downstream processes.
  • Both the predicted win rates and the Elo-scale ratings are saved in the summary files regardless of the report_pred_win_rates setting. The parameter only affects what the summarize method returns.
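
For readers unfamiliar with how an Elo-scale rating relates to a win probability against a baseline, here is a minimal sketch. It assumes the conventional Elo logistic formula with base 10 and scale 400; the exact constants and code used inside CompassArenaBradleyTerrySummarizer are not shown in this PR description, so the function name, defaults, and example ratings below are hypothetical.

```python
def elo_to_win_prob(rating, baseline_rating, scale=400.0, base=10.0):
    """Predicted probability that a model beats the baseline.

    Assumption: the standard Elo logistic curve (base 10, scale 400).
    The summarizer's actual constants may differ.
    """
    return 1.0 / (1.0 + base ** ((baseline_rating - rating) / scale))


# Hypothetical Elo-scale ratings, for illustration only.
ratings = {"baseline": 1000.0, "model_a": 1100.0, "model_b": 900.0}

# Win probability of every model against the baseline model.
pred_win_rates = {
    name: elo_to_win_prob(rating, ratings["baseline"])
    for name, rating in ratings.items()
}

for name, prob in pred_win_rates.items():
    print(f"{name}: {prob:.3f}")
```

Under this formula the baseline scores exactly 0.5 against itself, and models rated equally far above and below the baseline have win probabilities that sum to 1.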

BC-breaking (Optional)

CompassArenaBradleyTerrySummarizer now returns the predicted win rates by default instead of the Elo-scale ratings it returned previously. To restore the previous behavior and return Elo ratings, set report_pred_win_rates to False.

Use cases (Optional)

Perform subjective evaluation with the updated evaluation config:

opencompass configs/eval_subjective_bradleyterry.py --mode=all

…s with an option to switch between win rates and elo ratings
@acylam acylam merged commit 7f2aeef into open-compass:main Jan 10, 2025
8 checks passed