AI Machine Takes IQ Test: Scores Similar to a 4-year-old


Peter Crump
Staff Writer

Advances in computer technology inevitably raise questions about the intelligence of machines compared to humans. In an attempt to answer some of these inquiries, a recent study conducted by a team of scientists at the University of Illinois at Chicago and a Hungarian research group showed that when an AI machine was administered an IQ test, it matched the intelligence of a 4-year-old.

The device, known as ConceptNet 4, was developed by researchers at MIT and is one of the most powerful AI machines in the world. To measure the machine’s intelligence with methods designed for humans (an area of psychology known as psychometrics), the researchers administered the verbal portion of an IQ test known as the Wechsler Preschool and Primary Scale of Intelligence, Third Edition (WPPSI-III). They chose this test both to distinguish their study from the extensive AI testing already done on multiple-choice exams, with questions like those found on the SAT or GRE, and to highlight the limitations of current AI technology, according to the researchers’ report.

The WPPSI-III is designed to measure the intelligence of children ages four to seven, and asks questions in five categories: information, vocabulary, word reasoning, comprehension, and similarities, according to the MIT Technology Review.

The Information category asks general questions about everyday objects, like “Where can you find a penguin?” Vocabulary questions typically ask the child to define a word. In word reasoning, children are given clues and must identify what is being described. Similarity questions require finding likenesses between different things, such as a pen and a pencil. Finally, comprehension questions require constructing an explanation for something, rather than simply describing it, according to the BBC.

The researchers administered the test by reformulating the questions with natural language processing tools and Python programs so that the machine could understand what was being asked.

“ConceptNet does well on Vocabulary and Similarities, middling on Information, and poorly on Word Reasoning and Comprehension,” said Stellan Ohlsson, the lead researcher for the project. Based on ConceptNet’s performance, the researchers concluded that its intelligence matched that of an average 4-year-old.

Ohlsson and his colleagues noted that ConceptNet had particular difficulty with the “why” questions in the comprehension category. For example, when the researchers asked “Why do people shake hands?”, ConceptNet broke the question down into three concepts, “shake”, “hand” and “shake hand”, and somehow came up with the response “epileptic fit” as a reason.
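The researchers’ code is not published, but the decomposition described above can be sketched in a few lines of Python. The stopword list and the naive singularization below are illustrative assumptions, not the actual pipeline; the sketch simply shows how a question such as “Why do people shake hands?” might reduce to the unigram and bigram concepts “shake”, “hand” and “shake hand”.

```python
# Illustrative sketch (not the researchers' actual code): decompose a
# question into candidate concepts -- unigrams and bigrams of content
# words -- mirroring how "Why do people shake hands?" yields
# "shake", "hand" and "shake hand".

import re

# Small hypothetical stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"why", "do", "does", "people", "a", "an", "the",
             "is", "are", "of", "to", "what", "where", "can", "you"}

def extract_concepts(question):
    """Return unigram and bigram concepts extracted from a question."""
    words = re.findall(r"[a-z]+", question.lower())
    # Keep content words only, with a naive singularization
    # ("hands" -> "hand"); real NLP tools lemmatize properly.
    content = [w[:-1] if w.endswith("s") and len(w) > 3 else w
               for w in words if w not in STOPWORDS]
    bigrams = [" ".join(pair) for pair in zip(content, content[1:])]
    return content + bigrams

print(extract_concepts("Why do people shake hands?"))
# -> ['shake', 'hand', 'shake hand']
```

In practice the researchers relied on more capable natural language processing tools than this toy tokenizer, but the principle of mapping a question onto a small set of concept nodes is the same.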

In addition, ConceptNet struggled with word reasoning. In one question, the scientists gave ConceptNet three clues to identify the item “lion”: “This animal has a mane if it is male”, “This is an animal that lives in Africa” and “This is a big yellowish-brown cat.” ConceptNet answered, in order, “dog”, “farm”, “creature”, “home” and “cat”.

“Common sense should at the very least confine the answer to animals, and should also make the simple inference that ‘if the clues say it is a cat, then types of cats are the only alternatives to be considered’,” Ohlsson remarked.

Despite these limitations, the researchers pointed to specific avenues for improvement.

“In general, more powerful natural language processing tools would likely improve system performance,” Ohlsson continued. “That would reduce its reliance on the programming necessary to enter the questions, and is something that is already becoming possible with online assistants such as Siri, Cortana and Google Now.”

The researchers concluded by noting the two paradigms of AI development: the learning-driven paradigm and the knowledge-driven paradigm. The learning-driven paradigm is based upon statistics, large quantities of data and machine learning, while the knowledge-driven paradigm relies on logic, reasoning and knowledge. They predict that “perhaps knowledge bases that are a hybrid of the two paradigms will play a role in the next round of AI progress.”