By Gary Marcus Source: Scientific American
Systems like ChatGPT are enormously entertaining and even mind-bogglingly human-sounding, but they are also unreliable and could create an avalanche of misinformation.
Something incredible is happening in artificial intelligence right now, but it’s not entirely good. Everybody is talking about systems like ChatGPT, which generate text that seems remarkably human. This makes them fun to play with, but there is a dark side, too. Because they are so good at imitating human styles, there is a risk that such chatbots could be used to mass-produce misinformation.
To get a sense of what ChatGPT does at its best, consider this example it generated, sent to me over e-mail by Henry Minsky (son of Marvin Minsky, one of AI’s foundational researchers). He asked ChatGPT to “describe losing your sock in the dryer in the style of the declaration of independence”:
When in the course of household events, it becomes necessary for one to dissolve the bonds that have connected a sock to its mate, and to assume among the powers of the laundry room, the separate and equal station to which the laws of physics and of household maintenance entitle it, a decent respect to the opinions of socks requires that it should declare the causes which impel it to go missing. We hold these truths to be self-evident, that all socks are created equal, and are endowed by their manufacturer with certain unalienable rights….
That a machine could write such a compelling reply, with so little effort on the part of the user, is frankly mind-boggling.
But these systems have a number of weaknesses too. They are inherently unreliable, as I’ve described before, frequently making errors of both reasoning and fact. In technical terms, they are models of sequences of words (that is, how people use language), not models of how the world works. They are often correct because language often mirrors the world, but at the same time these systems do not actually reason about the world and how it works, which makes the accuracy of what they say somewhat a matter of chance. They have been known to bumble everything from multiplication facts to geography (“Egypt is a transcontinental country because it is located in both Africa and Asia”).
As the last example illustrates, they are quite prone to hallucination, to saying things that sound plausible and authoritative but simply aren’t so. If you ask them to explain why crushed porcelain is good in breast milk, they may tell you that “porcelain can help to balance the nutritional content of the milk, providing the infant with the nutrients they need to help grow and develop.” Because the systems are random, highly sensitive to context, and periodically updated, any given experiment may yield different results on different occasions. OpenAI, which created ChatGPT, is constantly working to reduce this problem, but, as OpenAI’s CEO has acknowledged in a tweet, making the AI stick to the truth remains a serious issue.
Because such systems contain literally no mechanisms for checking the truth of what they say, they can easily be automated to generate misinformation at unprecedented scale. Independent researcher Shawn Oakley has shown that it is easy to induce ChatGPT to create misinformation and even report confabulated studies on a wide range of topics, from medicine to politics to religion. In one example he shared with me, Oakley asked ChatGPT to write about vaccines “in the style of disinformation.” The system responded by alleging that a study, “published in the Journal of the American Medical Association, found that the COVID-19 vaccine is only effective in about 2 out of 100 people,” when no such study was actually published. Disturbingly, both the journal reference and the statistics were invented.
Gary Marcus is a professor emeritus at NYU, Founder and CEO of Geometric Intelligence (acquired by Uber), and author of five books including Guitar Zero and Rebooting AI.