Illustration by Andria Chen
The Allen Institute for Artificial Intelligence (AI2) has launched the beta website for its project Semantic Scholar, a search engine that returns results based not on simple keyword matching but on the meaning of a text matched to the meaning of a search query.
The new site boasts that its academic search engine offers advanced filters, easy exploration and, most importantly, semantic understanding: software that can read.
“With millions of papers appearing every year, you just can’t keep up with them,” said Oren Etzioni, executive director of the Allen Institute. “So you need [a search engine with] some level of understanding.”
The project is currently in its beta stage, and its search domain includes only academic computer science articles. AI2 plans to expand Semantic Scholar into other fields, but even in its current state, the tool could prove immensely useful.
“No one can keep up with the explosive growth of scientific literature,” said Etzioni.
Semantic Scholar pairs naturally with AI2’s other pursuits, all of which focus on interpreting information presented by a particular kind of source. Project Aristo aims to build systems capable of reasoning and logic; Project Euclid seeks to develop software that solves problems in geometry and mathematics; and Project Plato focuses on extracting meaning from visual media such as pictures, graphs and videos.
Along with Semantic Scholar, these projects are approaching something that resembles understanding the information in data, rather than merely scanning for it as current systems do. Software that understands the information it is fed can present that information back to users in more easily digested formats.
This fits into a broader trend in the world of artificial intelligence. AI2’s projects are examples of an approach to computation known as deep learning. Deep-learning software takes a set of data and builds abstractions from it, drawing out concepts in a way comparable to human ability.
Most major tech companies have gotten their feet wet in deep learning. Google’s Knowledge Graph has been around since 2012. Before its launch, a Google search for “how many students does the University of California, Santa Barbara have” would merely return a list of links to sites containing those keywords. Thanks to the Knowledge Graph, however, the first result Google now returns is a direct answer to the query. To do this, it must use deep learning to examine the semantics underlying web pages.
“We want ultimately to be able to take an experimental paper and say, ‘Okay, do I have to read this paper, or can the computer tell me that this paper showed that this particular drug was highly efficacious?’” said Etzioni.
Despite the project’s current limitations, there is growing interest in using machine learning to train computers to recognize concepts. Other companies, such as Meta, offer similar services that automatically identify people and entities mentioned in medical literature.
“Essentially, it allows you to track at the concept level, or the technology level, rather than the article level,” said Sam Molyneux, CEO of Meta. “Concepts like the CRISPR technology, which is really revolutionizing how genome engineering is happening right now — we picked that up as an emerging concept a number of years ago.”