Turnitin’s AI detection system analyzes the statistical patterns in your writing to determine whether sentences were likely generated by a large language model like ChatGPT, Claude, or Gemini. It does not simply compare your submission against a database of known AI outputs. Instead, it examines how your words, sentences, and paragraphs behave statistically, looking for the telltale signatures that distinguish machine-generated prose from human writing.
How the Detection Model Works
At its core, Turnitin’s AI checker evaluates two foundational traits of text: perplexity and burstiness. Perplexity measures how predictable a sequence of words is to a language model: the easier each next word is to guess, the lower the perplexity. AI-generated text tends to be highly predictable, choosing the most likely next word at each step, which produces low perplexity. Human writing is messier and more surprising, jumping between ideas or using unusual word choices that a language model would rarely select.
Burstiness measures variation across sentences, most visibly in sentence length. Humans naturally write in bursts: a long, winding sentence followed by a short, punchy one, then a medium-length thought. AI text tends to be more uniform, settling into consistent rhythms and structures that feel polished but oddly even.
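To make those two signals concrete, here is a minimal Python sketch. The log-probabilities and sentence lengths are invented for illustration, and the formulas are the standard textbook versions; Turnitin has not published its exact calculations.

```python
# Minimal sketch of perplexity and burstiness. A real detector would get
# log-probabilities from a neural language model; these are made up.
import math
import statistics

def perplexity(token_log_probs: list[float]) -> float:
    """Perplexity = exp of the average negative log-probability per token.
    Predictable text (high per-token probability) -> low perplexity."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def burstiness(sentence_lengths: list[int]) -> float:
    """One common proxy: spread in sentence length. Human writing swings
    between long and short sentences; AI text stays more uniform."""
    return statistics.stdev(sentence_lengths)

# Hypothetical log-probabilities a model might assign to each next word.
ai_like    = [-0.3, -0.2, -0.4, -0.3, -0.2]   # very predictable
human_like = [-0.3, -2.1, -0.5, -3.0, -0.9]   # occasional surprises

print(perplexity(ai_like))     # ~1.3: low perplexity
print(perplexity(human_like))  # ~3.9: higher perplexity

print(burstiness([18, 19, 17, 18]))  # ~0.8: uniform, "AI-like" rhythm
print(burstiness([31, 6, 22, 11]))   # ~11.2: bursty, "human-like" rhythm
```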
Turnitin doesn’t stop at those two measures, though. The system uses a transformer-based model, the same type of neural network architecture that powers the AI writing tools themselves. This allows it to detect what Turnitin calls “higher order deviations,” meaning subtle, long-range statistical dependencies that simpler perplexity checks would miss. Think of it as pattern recognition operating across entire paragraphs and documents rather than just individual sentences.
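Turnitin has not released its model, but an open-source analogue can give you a feel for the general shape of the approach. The sketch below runs OpenAI’s older RoBERTa-based GPT-2 output detector through the Hugging Face transformers library; it is a stand-in for the technique, not Turnitin’s classifier, and its scores will not match a Turnitin report.

```python
# Illustrative only: a public transformer-based AI-text classifier, not
# Turnitin's proprietary model. Requires: pip install transformers torch
from transformers import pipeline

detector = pipeline(
    "text-classification",
    model="openai-community/roberta-base-openai-detector",
)

sample = ("In conclusion, the findings underscore the importance of a "
          "comprehensive and multifaceted approach to this complex issue.")

print(detector(sample))
# Something like: [{'label': 'Fake', 'score': 0.98}]
# This model labels generated-looking text 'Fake' and human-looking text
# 'Real'; treat the score as a rough signal, not a verdict.
```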
What Gets Flagged in the Report
When your instructor views the AI writing report, Turnitin provides an overall percentage indicating how much of the submission’s qualifying text (prose sentences in long-form writing) was likely AI-generated. That percentage is broken into two categories, each color-coded in the report:
- AI-generated only (cyan highlight): Text that was likely produced by a large language model, possibly with minor modifications from an AI bypasser tool.
- AI-generated and AI-paraphrased (purple highlight): Text that was likely AI-generated and then run through a paraphrasing tool or word spinner like Quillbot to disguise its origin.
The report includes an interactive bar that maps these highlights across individual pages, so an instructor can click on a colored section and jump directly to the flagged text. This means your instructor sees not just a single number but a visual breakdown of exactly which sentences or passages triggered the detection.
The 20% Threshold and What Scores Mean
Turnitin treats low AI scores with extra caution. If the detector returns a score between 1% and 19%, the report displays an asterisk instead of a specific percentage, and no highlighted text is shown. This is a deliberate design choice to reduce the chance of false positives when only a small portion of the document triggers the model. A few sentences that happen to read like AI output can push the score into that range without meaning you actually used a tool.
At 0%, the system found nothing in the qualifying text that resembled AI-generated writing. At 20% or above, the report displays the full percentage along with sentence-level highlights in the document. The higher the score, the more of your submission the model flagged.
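Put as pseudocode, the reporting rule behaves something like this sketch (the thresholds are Turnitin’s documented behavior; the function itself is just an illustration):

```python
# Sketch of the score-display rule described above.
def format_ai_score(score: int) -> str:
    if score == 0:
        return "0%"        # nothing in the qualifying text resembled AI
    if score < 20:
        return "*"         # below threshold: asterisk, no highlights shown
    return f"{score}%"     # full percentage plus sentence-level highlights

for s in (0, 7, 19, 20, 45):
    print(s, "->", format_ai_score(s))
# 0 -> 0%, 7 -> *, 19 -> *, 20 -> 20%, 45 -> 45%
```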
Tools It Can Detect
Turnitin’s system is trained to identify content from large language models broadly, not just one specific tool. It flags text from ChatGPT, and the same detection model extends to other LLM-based writing tools. Beyond straight AI generation, the system also targets text that has been processed through AI paraphrasing tools, word spinners, and so-called “humanizer” or “bypasser” tools designed specifically to evade detection. The purple highlighting in the report is dedicated to catching this kind of post-processed AI content.
This means that running ChatGPT output through a paraphrasing tool does not necessarily hide it. Turnitin’s model was specifically trained to recognize the statistical fingerprints that remain even after AI text has been reworded by another AI tool.
How Accurate the Detection Is
Turnitin reports a document-level false positive rate of less than 1% for submissions where 20% or more of the text is flagged as AI-generated. That means when the system confidently flags a significant portion of your paper, it is rarely wrong about the document overall. At the sentence level, the false positive rate is higher, around 4%. Individual sentences can be misidentified even when the overall document score is accurate.
This distinction matters. If your paper receives a 45% AI score, the overall conclusion that a large chunk was AI-generated is very likely correct. But any single highlighted sentence could be a false positive. That is one reason Turnitin built in the asterisk threshold for scores below 20%: small amounts of flagged text are more likely to be noise than signal.
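A back-of-envelope calculation shows why. Suppose, for simplicity, that each of 30 human-written sentences is misflagged independently 4% of the time (real flags are correlated, so this overstates the risk). One or two stray flags turn out to be common, while crossing the 20% threshold is vanishingly rare:

```python
# Back-of-envelope sketch. Assumes each sentence is misflagged independently,
# a simplification (real flags cluster), so this overstates the risk.
from math import comb

n, p = 30, 0.04                      # 30 human-written sentences, 4% FPR each

at_least_one = 1 - (1 - p) ** n      # chance of one or more stray flags
at_least_20pct = sum(comb(n, k) * (p ** k) * ((1 - p) ** (n - k))
                     for k in range(6, n + 1))   # 6 of 30 sentences = 20%

print(f"{at_least_one:.2f}")     # ~0.71: a stray flag or two is common
print(f"{at_least_20pct:.4f}")   # ~0.0009: clearing the 20% bar is rare
```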
How It Handles Mixed Content
Many students wonder what happens if they write most of a paper themselves but use AI for a paragraph or two, or use AI to generate a rough draft and then heavily edit it. Turnitin analyzes text at the sentence level and then aggregates those results into the overall score, so a paper that is partly human and partly AI should, in theory, show a percentage reflecting only the AI-written portions.
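The document score, in other words, is an aggregate of per-sentence decisions. Turnitin has not published its exact weighting, but a simple word-count-weighted rollup, sketched here with hypothetical sentences and labels, captures the idea:

```python
# Illustrative aggregation, assuming per-sentence labels are already known.
# Turnitin has not published its weighting; this uses simple word counts.
def document_ai_percentage(sentences: list[tuple[str, bool]]) -> int:
    """sentences: (text, flagged_as_ai) pairs from the qualifying prose."""
    total = sum(len(text.split()) for text, _ in sentences)
    flagged = sum(len(text.split()) for text, is_ai in sentences if is_ai)
    return round(100 * flagged / total) if total else 0

doc = [
    ("I wrote this introduction myself over several drafts.", False),
    ("The framework offers a comprehensive and robust solution.", True),
    ("Moreover, it leverages cutting-edge techniques to deliver value.", True),
    ("My conclusion returns to the original argument.", False),
]
print(document_ai_percentage(doc))  # 52: only the AI-labeled words count
```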
In practice, hybrid documents are where the technology gets less reliable. Research from Temple University found that submissions containing a mix of human and AI writing exhibited inconsistent detection results. The model can struggle to draw clean lines when you have edited AI text heavily or woven AI-generated sentences into otherwise original paragraphs. The boundaries between “your words” and “AI words” blur in ways that the sentence-level analysis does not always capture cleanly.
What Turnitin Does Not Check For
Turnitin’s AI detection only evaluates prose sentences in long-form writing. It does not analyze short-answer responses, bullet-point lists, tables, code blocks, or mathematical equations. If your submission is not primarily composed of standard paragraphs, the AI indicator may not have enough qualifying text to produce a meaningful score.
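As a rough illustration, a qualifying-text filter might look like the sketch below. The exclusions mirror the ones listed above, but the specific heuristics are invented here, not taken from Turnitin:

```python
# Invented heuristics that mirror the exclusions described above; Turnitin's
# actual qualifying-text rules are not public.
import re

FENCE = "`" * 3  # marker for fenced code blocks

def qualifying_sentences(text: str) -> list[str]:
    kept = []
    in_code = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(FENCE):                # skip fenced code blocks
            in_code = not in_code
            continue
        if in_code or not stripped:
            continue
        if re.match(r"([-*•]|\d+[.)])\s", stripped):  # skip bullet/numbered items
            continue
        if "|" in stripped:                           # crude table-row skip
            continue
        kept.append(stripped)
    # Split the surviving prose into sentences; very short ones do not qualify.
    sentences = re.split(r"(?<=[.!?])\s+", " ".join(kept))
    return [s for s in sentences if len(s.split()) >= 5]
```

A submission made up mostly of bullet points, tables, or equations would leave a filter like this with too few sentences to score, which is exactly when the AI indicator declines to produce a meaningful result.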
It is also worth understanding that the AI detection feature is separate from Turnitin’s traditional plagiarism checker. The plagiarism tool compares your text against a database of published works, student papers, and web content to find matching passages. The AI writing indicator does not check for copied text at all. It only evaluates whether the writing patterns in your submission are statistically consistent with AI generation. A paper can score 0% on plagiarism and still receive a high AI detection score, or vice versa. Your instructor may see both reports, but they measure entirely different things.