- cross-posted to:
- articles
- cross-posted to:
- articles
Never let an agent evaluate its own work. Agents exhibit optimistic bias when self-grading. Use sprint contracts as evaluation interface, communicate via files not shared context. Key insight: harness components encode model-limitation assumptions that go stale — periodically test if each component still adds value.
You must log in or # to comment.
