Evals Will Break and You Won't See It Coming

Source: News.Ycombinator

<p>We're good at evaluating the models we have. We're much worse at evaluating the models we're about to build, especially if they cross into a new capability regime.</p> <p>Most benchmarks, safety evals, and red-teaming protocols implicitly assume the next model is just a stronger version of the current one. If i
