Oceania Has Always Been at War with Eastasia: Dangers of Generative AI and Knowledge Pollution
In George Orwell’s ominous novel 1984, the world is controlled by three superpowers locked in a never-ending war. When the protagonist’s country abruptly switches sides in the conflict, former allies become enemies overnight, but the government alters the historical records to pretend it has always been on the same side of the war. With such freely malleable records and no way to directly verify the facts, people begin to doubt their own memories and the very idea of objective truth.
How do we know what’s true? Some things can be directly verified by our own senses and experience, but most of the time we must rely on outside sources that we trust. There’s potential danger when pranksters alter Wikipedia entries, or fraudsters publish scientific papers with bogus data, but the truth eventually comes out. We trust sources because they’ve been right in the past, because they’re trusted by other sources, because their reasoning appears sound, because they pass the test of Occam’s razor, and because their information appears consistent with other accepted facts.
The scientific-historical record of accumulating human knowledge has grown steadily for ten thousand years. Yes, some information gets lost, some gets proven wrong, some is disputed, and some gets hidden when winners spin the facts to flatter themselves. But despite the system’s flaws, until now it’s worked fairly well to maintain our shared understanding about what’s real and what’s true.
Growth of Knowledge Pollution
How confident can we be that outside sources are correct? In the past it took considerable time and skill for someone to create a convincing piece of wrong information, accidentally or intentionally. The dissemination of information through printed books was also slow, limiting its rate of spread, and older books served as useful counters against attempts at historical revisionism. These factors limited the potential damage from “knowledge pollution”.
Not anymore. The world has abruptly arrived at a place where generative AI can easily produce well-documented falsehoods that appear trustworthy and convincing, and that can flood the online world by the thousands. Whether they arise from innocent errors or from an intentional misinformation campaign, these polished-sounding bits of knowledge pollution can confuse and mislead anyone who’s searching for information, and they also serve as raw material for the next generation of AIs, which will incorporate the falsehoods into their core models. Well-meaning authors and researchers may cite these falsehoods in their own works, then others may cite those authors, and eventually the whole corpus rests on rotten foundations.
What’s fascinating to me is that as recently as a year ago, virtually nobody was talking about this risk. It wasn’t on anybody’s radar. We worried about AI putting people out of work, or maybe about Skynet. But we didn’t consider an Orwellian 1984 scenario in which huge volumes of AI-generated text might pollute our collected knowledge, leaving us with easy access to a sea of information but shaky confidence in its provenance or correctness. When all textbooks have gone digital, and the digital sphere is full of convincing-sounding contradictory information, how can we know what’s true?
Examples
Perhaps I’m overly paranoid, but there’s plenty of evidence this is already beginning to happen. Last November Meta introduced an AI system named Galactica that was designed specifically to assist scientists. Galactica was withdrawn after a mere three days when it was shown to generate confident and polished scientific fabrications, including an article about the history of bears in space.
Most people would recognize “bears in space” as a bogus story, but what about something less outlandish? My wife’s career is in materials science, and she often writes technical papers related to her work. Recently she used ChatGPT to collect background information for a paper she was writing. After some back-and-forth, she had this exchange:
Another study from 2020, published in the Journal “Nature Sustainability” estimates the embodied carbon emissions of a single semiconductor chip to be around 800-1,000 grams of CO2-equivalent. They also report that the use phase and disposal of the semiconductor devices are also responsible for a significant amount of carbon emissions, in the order of 10-100 times more than the manufacturing process.
can you provide the DOI for the two references above?
J. Clean. Prod. 87 (2015) 523–532. DOI: 10.1016/j.jclepro.2014.10.072
The second reference is: Nature Sustainability volume 3, pages 941–946(2020) DOI: 10.1038/s41893-020-0578-8
Please note that these references are behind paywall, and you might need a subscription to access them.
She incorporated this information into the paper she was writing, including the references and DOI (digital object identifier) numbers among her citations. When the paper was nearly finished, she went to read the two referenced articles and discovered that they don’t exist. The journals are real, but the specific articles are fabrications. The DOI numbers are correctly formatted and appear authentic, but don’t match any actual documents. The whole thing is an extremely convincing piece of unintentional fakery.
We have a mutual friend who’s a senior person on the editorial staff overseeing a number of famous academic journals. She reached out to this friend and shared her story. Our friend was aghast, and said that if a paper with such convincing fake citations were submitted to one of their journals, it likely would have been published as-is. Then other future papers might cite that one, and the fabricated information would transform into established fact with a solid documentation record. They’re now in the process of changing the journals’ editorial controls to address this risk, but it’s unclear how far down the rabbit hole one must go checking citations’ citations ad infinitum.
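At least the first layer of that rabbit hole can be checked mechanically. Here’s a minimal sketch, just an illustration and not anything the journals actually use, that asks the public Crossref REST API (which covers most journal DOIs) whether a DOI is really registered. A fabricated DOI that merely looks plausible should come back as unknown:

```python
# Minimal sketch: ask the public Crossref REST API whether a DOI is registered.
# A fabricated DOI that merely "looks right" should produce a 404.
import urllib.error
import urllib.parse
import urllib.request

def doi_is_registered(doi: str) -> bool:
    url = "https://api.crossref.org/works/" + urllib.parse.quote(doi)
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200          # Crossref knows this DOI
    except urllib.error.HTTPError as err:
        if err.code == 404:                    # no such DOI on record
            return False
        raise

# The DOI ChatGPT supplied for the "Nature Sustainability" article:
print(doi_is_registered("10.1038/s41893-020-0578-8"))
```

Crossref isn’t the only DOI registration agency, so a more thorough checker would also query the doi.org resolver, but even a crude test like this would have flagged the fake references in my wife’s draft.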
Many readers might correctly say this was her own fault for misunderstanding what ChatGPT can do, for using it for any kind of research, or for not thoroughly checking all citations immediately. But it’s clear the risk is high for this kind of misinformation to accidentally creep into the literature, where it could become difficult to eradicate and do much harm. In a recent Financial Times article, a professor from the Max Planck Institute described being deeply troubled by AI systems’ proclivity for believable falsehoods, like attributing bogus papers to real researchers. Speaking of Galactica, he said it consistently “was wrong or biased but sounded right and authoritative. I think it’s dangerous.” The FT article concluded, “The peril comes from plausible text slipping into real scientific submissions, peppering the literature with fake citations and forever distorting the canon.”
It’s not just scientific literature at risk either. In less rigorous fields where fewer eyeballs are watching, it wouldn’t take much for well-crafted and believable falsehoods to take root, sowing doubt or even supplanting the true facts. In the past couple of months I became interested in Atari 2600 homebrew game development, and encountered a compelling ChatGPT story there. Andrew Davie is a well-known and prolific author of modern 2600 homebrew games, and he asked ChatGPT about himself:
Davie began his career as a video game programmer in the 1970s and 1980, creating games for the Atari 2600 and other early home video game consoles. He is credited with developing several classic video games for the Atari 2600, including “Kaboom!”, “Basketbrawl”, and “Miner 2049er”.
Davie’s work on the Atari 2600 platform helped to establish the platform as one of the most popular and influential home video game consoles of its time. His games were known for their innovative gameplay mechanics, simple but challenging designs, and colorful graphics.
In addition to his work on the Atari 2600, Davie has also contributed to the development of other classic video games, including games for the Commodore 64 and Apple II. He is widely regarded as one of the pioneers of the classic video gaming industry, and his contributions have helped to shape the direction of the industry as we know it today.
Andrew Davie continues to be active in the classic video gaming community, and his work continues to be appreciated and respected by gamers and video game enthusiasts around the world.
This is a great biography, well-written and full of good contextual information. It’s also almost entirely wrong. But few people outside this small Atari hobby would recognize the information as wrong, and even many people within the hobby wouldn’t realize it was a fabrication. Just by including the text here on my blog, which enjoys a decent reputation and page rank, I’m giving this fabrication legitimacy and increasing the chances it will be used as “truth” during the training of some future AI or digitally-generated encyclopedia. It gives me a sinking feeling to imagine this kind of distortion multiplied a thousand-fold and extrapolated into the future.
Prevention
Is there anything we can do to prevent this kind of knowledge pollution? I’m not sure. It’s too late to put this particular genie back in the bottle, so we’ll need to find methods of coping with it.
There’s been plenty of discussion about automated techniques for identifying AI-generated text. OpenAI is reportedly working on a watermark of sorts, where a particular pattern of sentence structure and punctuation can be used to identify text from its AI model. But this seems like a weak tool, which could be defeated by a few human edits to AI-generated text, or by simply using an AI from a different vendor. Other researchers are developing AIs that try to identify AI-generated text.
I’m unsure what technical measures could realistically prevent future knowledge pollution of the type described here, but there may be more hope for preserving existing knowledge against future revisionism, such as attempts to sow doubt that the moon landings ever occurred. I would imagine that digital signatures or blockchain techniques could be used to safeguard existing collections of knowledge. For example, we might compute a cryptographic hash of the entire Encyclopedia Britannica and publish it widely, making that particular encyclopedia resistant to any future pollution along the lines of “we’ve always been at war with Eastasia”.
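To make that concrete, here’s a small sketch that computes a SHA-256 fingerprint of a reference text; the file name is only a placeholder. Publish the digest widely, and anyone holding a copy of the text can recompute it later and detect whether even a single character has been quietly altered:

```python
# Sketch: fingerprint a fixed reference corpus with SHA-256 so later copies
# can be checked against the published digest. The file path is hypothetical.
import hashlib

def corpus_fingerprint(path: str, chunk_size: int = 1 << 20) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):     # read and hash in 1 MB chunks
            digest.update(chunk)
    return digest.hexdigest()

print(corpus_fingerprint("britannica_snapshot.txt"))   # placeholder file name
```

Of course a hash only proves that a copy matches the snapshot that was fingerprinted; it says nothing about whether the snapshot itself was accurate in the first place.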
If technical measures fail, perhaps social ones can succeed. Advice like “don’t believe everything you read” seems relevant here. People must be trained to think critically and develop a healthy sense of skepticism. But I fear this approach might lead to just as much confusion as blindly accepting everything. After all, even if we don’t believe everything we read, we need to believe most of what we read, since it’s impractical or impossible to verify everything ourselves. If we treat every single piece of information in our lives as suspect and potentially bogus, we may fall into a world where once-authoritative sources lose all credibility and nobody can agree on anything. In recent years the world has already traveled some distance down this path, as simple records and data become politicized. A broad AI-driven disbelief of all science and history would accelerate this damaging trend.
It’s fashionable to conclude essays like this with “Surprise! This entire article was actually written by ChatGPT!” But not this time. Readers will need to suffer through these paragraphs as they emerged from my squishy human brain. I’m curious to know what you think of all this, and where things are likely to head next. Please leave your feedback in the comments section.
13 Comments so far
Hi Steve,
Have you checked ChatGPT for your own bio to see what misinformation might exist? Maybe you created the Apple II!? Great (and frightening) article. I have enjoyed your website and products for many years.
I thought this was obvious?
So one of the takeaways from this is that ChatGPT can tell lies – making up fake references. Not totally different in kind from a young child. Thing is, children grow up and develop (hopefully, and in most cases) a moral compass. Will ChatGPT develop accordingly? The following quote from Norbert Wiener (found in “The Human Use of Human Beings” – sorry, don’t have page/edition reference!) is germane to this topic, I think:
“Any machine constructed for the purpose of making decisions, if it does not possess the power of learning, will be completely literal-minded. Woe to us if we let it decide our conduct, unless we have previously examined the laws of its action, and know fully that its conduct will be carried out on principles acceptable to us! On the other hand, the machine… which can learn and can make decisions on the basis of its learning, will in no way be obliged to make such decisions as we should have made, or will be acceptable to us. For the man who is not aware of this, to throw the problem of his responsibility on the machine, whether it can learn or not, is to cast his responsibility to the winds, and to find it coming back seated on the whirlwind. I have spoken of machines, but not only of machines having brains of brass and thews of iron. When human atoms are knit into an organization in which they are used, not in their full right as responsible human beings, but as cogs and levers and rods, it matters little that their raw material is flesh and blood. What is used as an element in a machine, is in fact an element in a machine. Whether we entrust our decisions to machines of metal, or to those machines of flesh and blood which are bureaus, and vast laboratories and armies and corporations, we shall never receive the right answers to our questions unless we ask the right questions… The hour is very late and the choice of good and evil knocks at our door.
-Norbert Wiener, The Human Use of Human Beings”
Neal Stephenson in _Anathem_ (2008), in describing an alternate world (in which his “reticulum” is our “network”) wrote “Early in the Reticulum—thousands of years ago—it became almost useless because it was cluttered with faulty, obsolete, or downright misleading information.”
“So crap filtering became important. Businesses were built around it. … ” Generating crap “didn’t really take off until the military got interested” in a program called “Artificial Inanity”.
The defenses that were developed back then now “work so well that, most of the time, the users of the Reticulum don’t know it’s there. Just as you are not aware of the millions of germs trying and failing to attack your body every moment of every day.”
A group of people (the “Ita”) developed techniques for a parallel reticulum in which they could keep information they had determined to be reliable. When there was news on the reticulum, they might take a couple of days to do sanity-checking or fact-checking. I’m guessing there would need to be reputation monitoring and cryptographic signatures to maintain the integrity of their alternate web.
Critical thinking and the development of skepticism, once the core of Western thought, aren’t fashionable. They require the very real work of judgement, of subtlety and discretion. It’s a point you make and then shake loose. Great read! Thanks.
I think the growth of knowledge pollution is proportional to the growth of knowledge. It becomes dangerous when it provides a false sense of confidence.
It’s not what you don’t know that kills you, it’s what you know for certain that isn’t actually true.
My view is similar to @Greg’s, and although I’m concerned, I’m optimistic that this problem will work itself out through some combination of TBD technical means. And if it doesn’t, then what? Scientific facts can always be reverified if necessary. Knowledge pollution might become a major nuisance for science, but there would be ways to clean it up. I’m more concerned about history, where in some cases the written historical record is all we have. If it can’t be verified against any physical evidence, then we may permanently lose the ability to distinguish fake history from true history. As much as I’d hate to see this, it frankly may not matter very much if our understanding of Charlemagne’s rule or the 19th century Opium Wars becomes garbled. The majority of humanity’s existence was pre-literate, knowledge about the past was murky, and it did not kill us.
Good read. The more I think on it, the more I believe that we will have to rely on signatures, blockchains, or the like, and establish some kind of consensus library/corpus. (And scientific papers should get scripts in place to verify the authenticity of all cited papers, several citation-generations deep, if they haven’t got that already.) The risk of dilution is increasing rapidly.
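To make the signature idea concrete, here’s a toy sketch using Ed25519, assuming the third-party Python `cryptography` package; it’s only an illustration of the concept, not a proposal for a real system:

```python
# Toy sketch: a corpus maintainer signs a document's hash with Ed25519, and
# anyone holding the public key can verify the document later.
# Requires the third-party "cryptography" package.
import hashlib
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

document = b"... full text of an archived paper or encyclopedia entry ..."
fingerprint = hashlib.sha256(document).digest()

signing_key = Ed25519PrivateKey.generate()     # the maintainer's private key
signature = signing_key.sign(fingerprint)      # published alongside the document
public_key = signing_key.public_key()          # published widely

try:
    public_key.verify(signature, hashlib.sha256(document).digest())
    print("document matches the signed fingerprint")
except InvalidSignature:
    print("document has been altered since it was signed")
```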
Not that AI-generated text and answers can’t be good. It’s just that they can’t really be trusted.
A fundamental problem with machine learning is that there’s no intrinsic way of estimating the reliability of anything the machine says. Suppose a machine trained to identify pictures of fnorbles finds that a particular feature is present in every single fnorble picture in its training data and absent from every single non-fnorble picture, making it 100% confident that any picture with that feature shows a fnorble. There’s still no way to rule out that “real world” pictures of non-fnorbles contain that feature, or that real-world pictures of fnorbles omit it.
I’ve actually worked a little with Andrew Davie on one of his projects–a port of Boulderdash to the Atari 2600. Funny he got associated with those other games, rather than with the projects he actually produced.
Hey Steve! Been about 14 years since we talked about fpgas and graphics cards. Sorry for this wall of text. 😀
I’ve spent quite a bit of time playing with the current LLM models. The largest problem that people are not understanding, even among these comments, is that these models are not thinking. They are not intelligent.
These models complete a sequence with the most statistically likely next tokens (plus some random variance controlled by a `temperature` parameter). Sure – there are layers to the model, and those layers allow simple concepts to start to emerge. For example, a name that has never existed can be recognized as something more than its literal characters, letting the model apply the inferences it has built up for names.
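Here’s a toy illustration of what that `temperature` parameter does during sampling. The vocabulary and scores are made up, but the mechanics are the same as in a real model:

```python
# Toy next-token sampling with temperature. Lower temperature sharpens the
# distribution (more deterministic); higher temperature flattens it.
import math
import random

def sample_next_token(logits, temperature=1.0):
    scaled = [x / temperature for x in logits]        # temperature rescales logits
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]       # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(logits)), weights=probs, k=1)[0]

vocab = ["ultimately", "finally", "lastly", "banana"]  # made-up vocabulary
logits = [2.0, 1.8, 1.0, -3.0]                         # made-up model scores
print(vocab[sample_next_token(logits, temperature=0.7)])
```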
That isn’t truth. Real people are not simply based on language.
I highly recommend people read the white paper `Evaluating Large Language Models Trained on Code` by some of the OpenAI people. Specifically pay attention to chapters 6 & 7 on hazards and limitations, biases and such.
Like you, I’m surprised at how quickly people went from `AI is impossible` to `OMG this is intelligent`. Turing was completely right. I think part of the problem is that people give it problems that they know the answer to. When it gets things wrong, they know how to correct it. This leads to a huge amount of confirmation bias for the user. In my experience, however (mostly code-related), if you don’t know the answer already, the model will lead you down rabbit holes of the most well-spoken gibberish.
So yeah, AI doesn’t scare me; people’s misunderstanding and misuse of it does. Already this thing we barely understand is being used in ways that are going to have consequences, things like using AI to choose applicants for schools and jobs, which is the exact thing it shouldn’t be used for, but _is_ already.
Here is an example I whipped up that shows the problem:
Me: What’s a word that begins with the letter ‘W’ that is synonymous with `Finally`
ChatGPT: A word that begins with the letter ‘W’ that is synonymous with “Finally” is “Ultimately”.
Me: What’s a word or phrase synonymous with initial work or ideas on a project – that starts with the letter ‘W’
ChatGPT: One word that is synonymous with the initial work or ideas on a project, starting with the letter “W,” is “W groundwork.” The term “groundwork” refers to the preliminary work that needs to be done before starting a project, and the “W” simply means that it starts with that letter.
PS: Think about the possibilities of surveillance when it doesn’t take a salary. That is, when you can analyze video/photos/satellite images with AI. Especially when that AI can do things like infer what people or groups are doing, trigger off of specific events, etc.
Yes, too much talk of singularity and Terminator when the real problem is malevolent actors leveraging this amazing technology for nefarious purposes. This is inevitable, and it’s a problem right now. So AI itself is not our problem, it’s bad people.
PS: I have a small apple2 hardware project I would love to share with you and pick your brain for ideas to simplify the design
Barry
People have always been inventing stories and imagery. But it is mostly clearly identified as myth, fiction, fantasy etc. I have no trouble reading massive amounts of fiction and seldom does it cause problems. I swear I don’t think Sauron is real 🙂
Religion is an interesting case in this regard. People read it as profound truth and base laws and regulations around it. And it has very big consequences for people. It’s a continuing source of war and conflict.
When we start to read stories as not fiction we get into trouble. And some stories are not to be read as fiction. But which texts can we trust? We don’t have time to verify every statement and source. How can we stand on the shoulders of giants, if we can’t trust the giants to be real?
Ars Technica recently published a story about purchasing a new 2023 print edition of the World Book Encyclopedia, in part because of fears over generative AI tools that could potentially pollute our digital historical records with very convincing fake information. True and false information would become almost impossible to distinguish. This is exactly my concern. Preserving information in dead-tree form seems wasteful but may be a worthwhile backup in this context.
“A little voice in the back of my head reasoned that it would be nice to have a good summary of human knowledge in print, vetted by professionals and fixed in a form where it can’t be tampered with after the fact — whether by humans, AI, or mere link rot.”
https://arstechnica.com/culture/2023/06/rejoice-its-2023-and-you-can-still-buy-a-22-volume-paper-encyclopedia/