[Feature] Added Predicted Win Probabilities to `CompassArenaBradleyTerrySummarizer` #1815

acylam · 2025-01-10T03:14:45Z

Motivation

Adds an option to report predicted win rates instead of ELO scale ratings in `CompassArenaBradleyTerrySummarizer`.

Modification

Added an option to report the predicted win rates (win probabilities against the baseline model) instead of the ELO scale ratings.
`report_pred_win_rates` defaults to `True`, so the `summarize` method returns the predicted win rates to downstream processes.
Both the predicted win rates and the ELO scale ratings are saved in the summary files regardless of the `report_pred_win_rates` setting. The parameter only affects what the `summarize` method returns.
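To make the relationship between the two reported quantities concrete, here is a minimal sketch. It is not the summarizer's actual code: it assumes the conventional ELO logistic curve (base 10, scale 400), and the helper names `predicted_win_rate` and `select_reported_scores` are hypothetical, introduced only for illustration.

```python
from typing import Dict


def predicted_win_rate(rating: float,
                       baseline_rating: float,
                       scale: float = 400.0,
                       base: float = 10.0) -> float:
    """Win probability against the baseline under the standard ELO curve.

    Assumption: base-10 logistic with a 400-point scale; the PR does not
    state which constants the summarizer uses.
    """
    return 1.0 / (1.0 + base ** ((baseline_rating - rating) / scale))


def select_reported_scores(elo_ratings: Dict[str, float],
                           baseline: str,
                           report_pred_win_rates: bool = True) -> Dict[str, float]:
    """Hypothetical helper mirroring the described flag behaviour.

    Both quantities are always computable; the flag only decides which
    one is handed back to the caller.
    """
    baseline_rating = elo_ratings[baseline]
    win_rates = {
        model: predicted_win_rate(rating, baseline_rating)
        for model, rating in elo_ratings.items()
    }
    return win_rates if report_pred_win_rates else dict(elo_ratings)


# Made-up ratings, for illustration only.
ratings = {"baseline": 1000.0, "model_a": 1400.0}
print(select_reported_scores(ratings, "baseline"))
print(select_reported_scores(ratings, "baseline", report_pred_win_rates=False))
```

Under these assumptions the baseline always maps to a win rate of 0.5 against itself, and a model rated 400 points higher maps to roughly 0.91.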

BC-breaking (Optional)

`CompassArenaBradleyTerrySummarizer` now defaults to returning the predicted win rates instead of the ELO scale ratings that were previously reported. To restore the previous behavior and return ELO ratings, set `report_pred_win_rates` to `False`.
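As a rough sketch of how a config might opt back into ELO ratings: the fragment below assumes the summarizer is importable from `opencompass.summarizers` and is configured through the usual `summarizer = dict(type=...)` pattern. Only the class name and the `report_pred_win_rates` parameter come from this PR; any other arguments the summarizer takes are omitted here.

```python
# Config fragment (assumed import path; adjust to your OpenCompass version).
from opencompass.summarizers import CompassArenaBradleyTerrySummarizer

summarizer = dict(
    type=CompassArenaBradleyTerrySummarizer,
    # False: summarize() returns ELO scale ratings, as before this PR.
    # True (the new default): summarize() returns predicted win rates.
    report_pred_win_rates=False,
)
```

Either way, both sets of numbers are still written to the summary files.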

Use cases (Optional)

Perform subjective evaluation with the updated evaluation config:

opencompass configs/eval_subjective_bradleyterry.py --mode=all

added predicted win rates reporting to bradley terry subj eval method…

48f685b

…s with an option to switch between win rates and elo ratings

mm-assistant bot assigned tonysy Jan 10, 2025

acylam temporarily deployed to prod January 10, 2025 03:15 — with GitHub Actions Inactive

acylam requested a review from bittersweet1999 January 10, 2025 03:42

bittersweet1999 approved these changes Jan 10, 2025

acylam merged commit 7f2aeef into open-compass:main Jan 10, 2025
8 checks passed
