A new benchmark measures how well AI agents can automate economically valuable chores. Human-level AI is still some ways off.