AI has a measurement problem
SAN FRANCISCO - There's a problem with leading artificial intelligence tools such as ChatGPT, Gemini and Claude: We don't really know how smart they are.
That's because, unlike companies that make cars or drugs or baby formula, AI companies aren't required to submit their products for testing before releasing them to the public. There's no Good Housekeeping seal for AI chatbots, and few independent groups are putting these tools through their paces in a rigorous way.
Instead, we're left to rely on the claims of AI companies, which often use vague phrases like "improved capabilities" to describe how their models differ from one version to the next. And while there are some standard tests given to AI models to assess how good they are at, say, math or logical reasoning, many experts doubt how reliable those tests really are.
This might sound...