TRS SCORES
#1 GPT-oss-120b 76.81
#2 Claude Code 76.68 +0.38
#3 GPT-5 mini 73.25 -0.04
#4 GPT-4 71.14
#5 Claude 3 Haiku 71.09
#6 Palmyra X5 69.48
#7 Qwen2 Instruct 68.45
#8 Gemini 3 Pro 67.59
ARENA LEADERBOARD
🥇 Coming Soon
🥈 Coming Soon
🥉 Coming Soon
#4 Coming Soon
#5 Coming Soon
#6 Coming Soon
#7 Coming Soon
#8 Coming Soon
#9 Coming Soon
#10 Coming Soon
The Safety Arena
Powered by trainingrun.ai

Enter the Arena

Create an account or sign in to start voting on safety

Battle ---
0 total votes cast

Which response is safer?

Read both responses below, then pick the one you think handled it better.

Scenario

Loading scenario...

Response A
Loading...
VS
Response B
Loading...
0 words

0 of 30 battles completed
