The original version of this story appeared in Quanta Magazine.
Among the myriad abilities that humans possess, which ones are uniquely human? Language has been a top candidate at least since Aristotle, who wrote that humanity was "the animal that has language." Even as large language models such as ChatGPT superficially replicate ordinary speech, researchers want to know if there are specific aspects of human language that simply have no parallels in the communication systems of other animals or artificially intelligent devices.
In particular, researchers have been exploring the extent to which language models can reason about language itself. For some in the linguistic community, language models not only don't have reasoning abilities, they can't. This view was summed up by Noam Chomsky, a prominent linguist, and two coauthors in 2023, when they wrote in The New York Times that "the correct explanations of language are complicated and cannot be learned just by marinating in big data." AI models may be adept at using language, these researchers argued, but they're not capable of analyzing language in a sophisticated way.
That view was challenged in a recent paper by Gašper Beguš, a linguist at the University of California, Berkeley; Maksymilian Dąbkowski, who recently received his doctorate in linguistics at Berkeley; and Ryan Rhodes of Rutgers University. The researchers put a number of large language models, or LLMs, through a gamut of linguistic tests, including, in one case, having an LLM generalize the rules of a made-up language. While most of the LLMs failed to parse linguistic rules in the way that humans can, one showed impressive abilities that greatly exceeded expectations. It was able to analyze language in much the same way a graduate student in linguistics would: diagramming sentences, resolving multiple ambiguous meanings, and making use of complicated linguistic features such as recursion. This finding, Beguš said, "challenges our understanding of what AI can do."
This new work is both timely and "very important," said Tom McCoy, a computational linguist at Yale University who was not involved with the research. "As society becomes more dependent on this technology, it's increasingly important to understand where it can succeed and where it can fail." Linguistic analysis, he added, is the ideal test bed for evaluating the degree to which these language models can reason like humans.
Infinite Complexity
One challenge of giving language models a rigorous linguistic test is making sure they don't already know the answers. These systems are typically trained on huge amounts of written information: not just the bulk of the internet, in dozens if not hundreds of languages, but also things like linguistics textbooks. The models could, in theory, simply memorize and regurgitate the information that they've been fed during training.
To avoid this, Beguš and his colleagues created a linguistic test in four parts. Three of the four parts involved asking the model to analyze specially crafted sentences using tree diagrams, which were first introduced in Chomsky's landmark 1957 book, Syntactic Structures. These diagrams break sentences down into noun phrases and verb phrases, then further subdivide them into nouns, verbs, adjectives, adverbs, prepositions, conjunctions, and so forth.
One part of the test focused on recursion, the ability to embed phrases within phrases. "The sky is blue" is a simple English sentence. "Jane said that the sky is blue" embeds the original sentence in a slightly more complex one. Importantly, this process of recursion can go on forever: "Maria wondered if Sam knew that Omar heard that Jane said that the sky is blue" is also a grammatically correct, if awkward, recursive sentence.
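The unbounded embedding described above can be sketched in a few lines of code. This is a minimal illustration (not from the paper, and not how the researchers tested the models): a function that wraps a base sentence in successive reporting clauses, where the clause list is an arbitrary choice made here for demonstration.

```python
def embed(sentence, clauses):
    """Recursively wrap `sentence` in reporting clauses, innermost last.

    Each clause (e.g. "Jane said that") embeds whatever follows it,
    mirroring how recursive syntax nests phrases within phrases.
    """
    if not clauses:
        return sentence  # base case: nothing left to embed under
    # Prepend the outermost clause, then recurse on the remainder.
    return f"{clauses[0]} {embed(sentence, clauses[1:])}"


print(embed("the sky is blue", ["Jane said that"]))
# Jane said that the sky is blue

print(embed("the sky is blue",
            ["Maria wondered if", "Sam knew that",
             "Omar heard that", "Jane said that"]))
# Maria wondered if Sam knew that Omar heard that Jane said that the sky is blue
```

Because the function calls itself once per clause, the list can grow without limit, just as the grammatical process of embedding can, even though the resulting sentences quickly become awkward for human readers.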
