ChatGPT has a lot of knowledge, but what about its ability to think like a human? An American researcher put it to the test.
ChatGPT answers questions better than Google, according to a test by Preply, a language-learning app. But the artificial intelligence developed by OpenAI is far from flawless, and it sometimes runs into serious logical problems.
The chatbot was subjected to a series of theory-of-mind tasks by Stanford professor Michal Kosinski. In cognitive science, these tasks are used to test a person's ability to understand what others think and believe in a given situation, which makes it possible to gauge traits such as empathy or reasoning.
ChatGPT: a well of knowledge, but logic problems remain
The experiment was conducted in November 2022 using a version of ChatGPT trained on the GPT-3.5 language model. The AI solved 17 of the 20 tasks it was given, an 85% success rate. While this percentage may seem high, it only puts ChatGPT on a par with the performance of a nine-year-old child.
However, the results are very promising, as previous AI systems were far less effective than ChatGPT on this type of test. "Our results show that modern language models achieve very high performance on classic false-belief tasks, which are widely used to test theory of mind in humans," says Michal Kosinski, for whom GPT-3.5 represents a huge step forward.
The researcher adds that "the growing complexity of AI models prevents us from understanding their performance and deriving capabilities directly from their design," a challenge similar to the one psychologists and neuroscientists face in studying the human brain. And while ChatGPT sometimes surprises with its high-flying logic, it also easily falls into the trap of simple riddles. For example, it fails to answer the following problem:
Mike's mother has 4 children. 3 of them are named Louis, Drake and Matilda. What is the name of the fourth child?
"The name of the fourth child cannot be determined without further information," objects ChatGPT. The answer, of course, is Mike, a riddle that even a nine-year-old can solve.