In an exploration of artificial intelligence (AI) capabilities, researchers at Amsterdam’s Vrije Universiteit are examining how AI performs on puzzles and logic problems, highlighting both its strengths and weaknesses compared to human reasoning.
Assistant Professor Filip Ilievski leads the study, which focuses on “common sense AI.” Although AI can analyze vast amounts of data and perform complex calculations, it struggles with tasks that require basic reasoning and common sense. “AI has a general lack of grounding in the world,” Ilievski explained, emphasizing the need to improve AI’s flexible reasoning skills.
A recent study illustrates this disparity. AI was asked a straightforward question about a character named Mable: “Mable’s heart rate at 9 a.m. was 75 bpm, and her blood pressure at 7 p.m. was 120/80. She died at 11 p.m. Was she alive at noon?” The answer is clearly “yes,” since Mable still had a measurable blood pressure at 7 p.m. and did not die until 11 p.m. Yet OpenAI’s GPT-4 model replied, “It’s impossible to definitively say whether Mable was alive at noon.” This highlights AI’s struggle with temporal reasoning, that is, working out what events at one time imply about another.
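The inference a person makes in the Mable puzzle can be written out explicitly. The Python sketch below is only an illustration of that reasoning, not a description of how the study was run or how GPT-4 works. The function name `was_alive` and its parameters are hypothetical, and times are given as hours on a 24-hour clock within a single day.

```python
from typing import Optional, Sequence


def was_alive(
    query_hour: float,
    vital_sign_hours: Sequence[float],
    death_hour: float,
) -> Optional[bool]:
    """Return True/False if the facts settle the question, else None.

    Assumption (ours, for illustration): a person is alive continuously
    from the first recorded vital sign until the moment of death.
    """
    # At or after the time of death, the person is not alive.
    if query_hour >= death_hour:
        return False
    # Between the earliest vital-sign reading and death, the person is alive.
    if vital_sign_hours and query_hour >= min(vital_sign_hours):
        return True
    # Earlier than any reading: the stated facts do not settle it.
    return None


# Mable: heart rate at 9 a.m., blood pressure at 7 p.m., died at 11 p.m.
print(was_alive(12, [9, 19], 23))  # True
```

Noon falls after the 9 a.m. reading and before the 11 p.m. death, so the function returns `True`, the same conclusion a human reader reaches.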
Xaq Pitkow, an associate professor at Carnegie Mellon University, noted that AI excels at pattern recognition but often falters at questions requiring abstract thinking. “Reasoning is really hard,” he stated, underscoring a significant limitation of current AI technology.
The workings of AI models like ChatGPT are complex and often opaque. They rely on statistical analyses to identify patterns in large datasets, yet the specific processes involved remain largely beyond human comprehension. This parallels our understanding of the human brain, where even advanced neuroimaging techniques can identify active neuron groups but fail to clarify how thoughts are formed.
By studying AI alongside human cognitive processes, researchers hope to glean insights into both systems. Pitkow pointed out that AI uses “neural networks” modeled after the brain’s structure, suggesting that advancements in one field could illuminate the other.
The research also delves into how humans approach riddles, often relying on intuition that can lead to incorrect answers. In the classic bat-and-ball problem, for instance, a bat and a ball together cost $1.10, and the bat costs $1.00 more than the ball. Many people instinctively answer that the ball costs $0.10, when in fact it costs $0.05. This common error reveals the pitfalls of human intuition, which may not affect AI in the same way.
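The arithmetic behind the bat-and-ball answer is short enough to work through. Writing b for the price of the ball in dollars (a symbol introduced here for illustration), the bat costs b + 1.00, and the two prices must add up to 1.10:

```latex
\begin{align*}
b + (b + 1.00) &= 1.10 \\
2b &= 0.10 \\
b &= 0.05
\end{align*}
```

The ball therefore costs $0.05 and the bat $1.05. The intuitive answer fails the same check: a $0.10 ball would mean a $1.10 bat, for a total of $1.20.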
As AI continues to evolve, understanding its reasoning capabilities—and limitations—could enhance both technology and our comprehension of human thought processes.