Thursday, October 24, 2024

Google releases open-source tool to detect AI-generated text

Google has unveiled SynthID Text, a new open-source tool designed to help developers identify AI-generated text. The tool, which is part of Google’s Responsible Generative AI Toolkit, was made available through the AI platform Hugging Face. SynthID Text introduces an innovative way to watermark text created by large language models (LLMs), offering a critical solution to growing concerns around misinformation, academic dishonesty, and other malicious uses of AI-generated content.

How SynthID Text Works

At its core, SynthID Text relies on a process called token modulation to watermark text. Large language models generate text one token at a time, where tokens can represent a single character, word, or part of a phrase. The model assigns a probability score to each possible token, indicating the likelihood of it being the next in the sequence. SynthID Text adjusts these probability scores during the generation process to create a watermark that is imperceptible to humans but detectable by software.

For example, given the prompt "What's your favorite fruit?", an LLM might predict that the next token will be "apple" or "banana", each with its own probability score. SynthID tweaks these scores slightly, embedding the watermark into the final text without affecting the quality of the output.
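The article does not spell out SynthID's exact adjustment rule, so the sketch below uses a keyed "green list" bias, a common technique from the watermarking research literature, purely to illustrate the idea. All names here (green_tokens, watermark, the secret key) are hypothetical, not Google's API:

```python
import hashlib
import random

def green_tokens(prev_token: str, vocab: list[str], key: str) -> set[str]:
    """Pseudorandomly partition the vocabulary, seeded by a secret key
    and the previous token, and return the 'favored' half."""
    seed = int.from_bytes(
        hashlib.sha256((key + prev_token).encode()).digest()[:8], "big"
    )
    rng = random.Random(seed)
    shuffled = vocab[:]
    rng.shuffle(shuffled)
    return set(shuffled[: len(shuffled) // 2])

def watermark(probs: dict[str, float], prev_token: str,
              key: str, bias: float = 1.3) -> dict[str, float]:
    """Nudge the model's next-token probabilities toward the favored set,
    then renormalize. The shift is small, so output quality is preserved."""
    favored = green_tokens(prev_token, list(probs), key)
    adjusted = {tok: p * (bias if tok in favored else 1.0)
                for tok, p in probs.items()}
    total = sum(adjusted.values())
    return {tok: p / total for tok, p in adjusted.items()}

# Toy distribution for the prompt "What's your favorite fruit?"
probs = {"apple": 0.45, "banana": 0.40, "mango": 0.15}
print(watermark(probs, prev_token="fruit?", key="secret-watermark-key"))
```

In a real system the bias would be applied to the model's logits at every decoding step, not to a static dictionary, but the principle is the same: slightly favor a secret, key-dependent subset of tokens.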

Google claims that SynthID Text does not degrade the quality, accuracy, or creativity of the generated text. The watermark can still be detected even if the text has been cropped, paraphrased, or lightly modified.
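Detection runs the same keyed computation in reverse: with the secret key, a detector can recompute which tokens were favored at each step and test whether they appear more often than chance. A minimal sketch, reusing the hypothetical green_tokens helper from the example above:

```python
def watermark_score(tokens: list[str], vocab: list[str], key: str) -> float:
    """Fraction of tokens drawn from the favored half of the vocabulary.
    Unwatermarked text should hover near 0.5; watermarked text scores higher.
    Because the score aggregates over many tokens, cropping or light
    paraphrasing weakens the signal gradually rather than erasing it."""
    hits = 0
    for prev, tok in zip(tokens, tokens[1:]):
        if tok in green_tokens(prev, vocab, key):
            hits += 1
    return hits / max(len(tokens) - 1, 1)
```

This aggregate structure is also why, as Google notes, the watermark survives moderate edits: each modified token removes only one vote from the tally.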

Advantages of SynthID Text

SynthID Text has been integrated into Google's Gemini models since early 2024. To validate its performance, Google conducted a large-scale experiment, deploying the watermark in over 20 million responses served by its Gemini chatbot. Users rated the watermarked texts as equal in quality to unwatermarked versions, demonstrating that the watermarking process did not compromise the text's readability or usefulness.

The open-sourcing of SynthID Text allows other AI developers to incorporate the watermarking tool into their own models. This broad adoption could standardize the practice of watermarking across the AI community, helping developers build responsible AI systems.

Pushmeet Kohli, vice president of research at Google DeepMind, emphasized the importance of this step, stating, “Other generative AI developers will be able to use this technology to detect whether text outputs have come from their own LLMs, making it easier to build AI responsibly.”

Limitations and Challenges

While SynthID Text offers promising benefits, Google admits that its technology has certain limitations. The watermarking tool is less effective when applied to short text or content that has been rewritten or translated into another language. In particular, SynthID struggles with factual prompts, where little variation is possible. For instance, responses to a question like “What is the capital of France?” provide fewer opportunities for adjusting token probabilities without altering factual accuracy.
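The difficulty is easy to quantify: watermarking needs entropy, that is, room to nudge probabilities among plausible alternatives, and factual answers leave almost none. A toy comparison (illustrative numbers only, not SynthID internals):

```python
import math

def entropy(probs: dict[str, float]) -> float:
    """Shannon entropy in bits; higher means more room to shift probabilities."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

creative = {"apple": 0.45, "banana": 0.40, "mango": 0.15}  # open-ended prompt
factual = {"Paris": 0.99, "Lyon": 0.01}  # "What is the capital of France?"

print(f"creative prompt: {entropy(creative):.2f} bits")  # ~1.46 bits
print(f"factual prompt:  {entropy(factual):.2f} bits")   # ~0.08 bits
```

With almost all probability mass on a single correct token, any bias strong enough to be detectable would risk changing the answer itself.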

Experts, including Soheil Feizi from the University of Maryland, acknowledge these challenges, noting that watermarking AI-generated text is especially difficult in near-deterministic tasks like code generation or factual queries.

Additionally, like other watermarking techniques, SynthID is vulnerable to attacks. Researchers from ETH Zurich (the Swiss Federal Institute of Technology) have shown that watermarks can be scrubbed from text or spoofed, making human-written content appear to have been generated by an AI.

Future of AI Watermarking

SynthID is not the first attempt at watermarking AI-generated text, but it is the first large-scale deployment of such a tool in the real world. Governments are taking notice of these advancements, with China already mandating watermarking for AI content and California considering similar regulations.

As AI-generated content becomes more pervasive, the urgency to develop reliable watermarking technologies is increasing. According to Europol, the European Union's law enforcement agency, as much as 90% of online content could be synthetically generated by 2026, posing new challenges for disinformation and fraud detection.