A 10-method ensemble analyzer that detects AI-generated content through perplexity analysis, burstiness scoring, linguistic patterns, hallucination detection, and even adversarial evasion attempts, with detailed explanations for every verdict.
AI writing tools got good fast. Students discovered that ChatGPT could produce passable essays, and overnight, faculty lost the ability to distinguish student work from AI output. Commercial detectors emerged, but they’re black boxes: they give a percentage and no explanation. Faculty can’t tell a student “this was flagged as AI-generated” if they can’t explain why.
False positives ruin trust. False negatives enable cheating. And students quickly learned evasion techniques: paraphrasing tools, strategic typos, thesaurus substitution, even mixing Cyrillic characters into Latin text. A single detection method is easy to fool.
The arms race between generation and detection needed a different approach: not one better detector, but many independent detectors working together.
Commercial detectors say ‘87% AI’ with no explanation. Faculty can’t justify academic integrity decisions with an opaque percentage.
Any one detection technique can be fooled. Paraphrase tools, humanizers, and strategic editing beat simple detectors.
Students use homoglyph substitution, thesaurus abuse, strategic typos, and filler injection to dodge detection tools.
Wrongly accusing a student of AI use is devastating. Faculty need confidence levels and explanations, not just scores.
Built 10 independent detection methods, each analyzing text from a completely different angle. No single method is reliable alone, but their combined signal is strong. Weighted ensemble scoring produces a final classification.
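The ensemble step itself is simple to sketch. The Python below is illustrative, not the production scoring code: the detector names and weights are hypothetical, and each detector is assumed to emit a human-likelihood score in [0, 1].

```python
def ensemble_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-detector scores, normalized by total weight."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight

# Hypothetical detectors and weights for illustration:
weights = {"perplexity": 2.0, "burstiness": 1.5, "adversarial": 1.0}
scores = {"perplexity": 0.2, "burstiness": 0.4, "adversarial": 0.9}
# ensemble_score(scores, weights) ≈ 0.42 — the disagreeing detectors
# pull the combined score into a middle range no single method chose.
```

Because the weights are external to the detectors, re-weighting (or zeroing out a detector) changes the ensemble without touching any individual method.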
Two methods use GPT-2 as a reference model: Perplexity Detection measures how predictable the text is (AI text is more predictable), and Zero-Shot Detection analyzes log-likelihood distributions without any training data.
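Once a reference model has assigned a log-probability to each token, perplexity itself is a one-line computation. The sketch below assumes those per-token log-probabilities have already been obtained (from GPT-2, in the system described here) and only turns them into a score.

```python
import math

def perplexity(token_log_probs: list[float]) -> float:
    """Perplexity is exp of the negative mean per-token log-probability.
    Lower perplexity = more predictable text, which leans toward AI."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# If a model assigns every token probability 0.5, perplexity ≈ 2.0:
# perplexity([math.log(0.5)] * 4)
```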
Four methods examine writing patterns: N-gram frequency, POS distribution, function word usage, sentence complexity, syntactic structure, voice patterns, and readability metrics across 5 scales (Flesch, Gunning Fog, Coleman-Liau, SMOG, ARI).
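Of the five readability scales, ARI is the easiest to illustrate because it needs only character, word, and sentence counts. The coefficients below are the standard published ARI formula; the counting helpers are omitted for brevity.

```python
def automated_readability_index(chars: int, words: int, sentences: int) -> float:
    """ARI estimates the U.S. grade level needed to read the text:
    4.71 * (chars/words) + 0.5 * (words/sentences) - 21.43."""
    return 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43

# 100 five-character words across 5 sentences scores roughly grade 12:
# automated_readability_index(500, 100, 5) ≈ 12.12
```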
Two specialized methods detect evasion attempts (homoglyphs, strategic typos, thesaurus abuse, filler injection, copy-paste boundaries) and AI hallucination patterns (fabricated citations, anachronisms, AI self-revelation phrases, unsourced statistics).
Let’s talk about how a multi-method AI detection system could work for your institution.
Perplexity, Burstiness, N-gram, Linguistic, Stylometric, Readability, Pangram, Hallucination, Zero-Shot, and Adversarial, each analyzing text from a unique angle.
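Burstiness, for example, can be approximated as variation in sentence length. This is a minimal sketch, assuming sentences have already been split and measured in words; the real detector may use a richer definition.

```python
import statistics

def burstiness(sentence_lengths: list[int]) -> float:
    """Coefficient of variation (stdev / mean) of sentence lengths.
    Human prose tends to alternate short and long sentences; AI text
    is often more uniform, so lower values lean toward 'AI'."""
    mean = statistics.fmean(sentence_lengths)
    return statistics.pstdev(sentence_lengths) / mean

# Perfectly uniform sentences score 0; varied lengths score higher.
```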
Every verdict comes with a multi-section report: overall assessment, per-detector findings, key insights, and specific recommendations. Faculty can understand and explain every flag.
Catches fabricated citations (future dates, generic authors, mixed styles), anachronisms, AI self-revelation phrases, and unsourced precise statistics.
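One of the cheaper cues here, future publication dates, reduces to a date check. A hypothetical sketch of that single cue; the actual detector combines many more signals, and the regex only covers 1900s/2000s years.

```python
import re

def future_years(citation: str, current_year: int) -> list[int]:
    """Return any four-digit years in a citation that lie in the future —
    a common tell of a fabricated reference."""
    years = [int(y) for y in re.findall(r"\b(?:19|20)\d{2}\b", citation)]
    return [y for y in years if y > current_year]

# future_years("Smith, J. (2031). Neural essays.", 2024) -> [2031]
```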
Identifies homoglyph substitution (Cyrillic in Latin text), known ‘humanizing’ typos, thesaurus abuse density, filler injection, and copy-paste boundaries.
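Homoglyph detection can lean directly on Unicode metadata: a character that renders like a Latin letter but belongs to the Cyrillic or Greek block gets flagged. A minimal sketch; legitimate multilingual text will also trigger it, which is why such signals lower confidence rather than drive the verdict.

```python
import unicodedata

def find_homoglyphs(text: str) -> list[tuple[str, str]]:
    """Return (char, unicode_name) pairs for Cyrillic/Greek letters
    embedded in the text — classic Latin look-alike substitutions."""
    hits = []
    for ch in text:
        if ch.isalpha() and not ch.isascii():
            name = unicodedata.name(ch, "")
            if name.startswith(("CYRILLIC", "GREEK")):
                hits.append((ch, name))
    return hits

# "\u0410nalysis" looks like "Analysis" but starts with a Cyrillic A:
# find_homoglyphs("\u0410nalysis") -> [('А', 'CYRILLIC CAPITAL LETTER A')]
```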
Enable or disable individual detectors, adjust weights, and create custom detection profiles tuned to specific use cases or document types.
AI (0-35%), Uncertain (35-65%), Human (65-100%). The ‘Uncertain’ zone is deliberate: it prevents false positives and tells faculty when they need to investigate further.
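Mapped onto a 0–1 human-likelihood score, the three bands reduce to a threshold function. A sketch; exactly which band the 35% and 65% boundaries fall into is an assumption here.

```python
def classify(human_likelihood: float) -> str:
    """Map an ensemble human-likelihood score in [0, 1] to a verdict band."""
    if human_likelihood < 0.35:
        return "AI"
    if human_likelihood < 0.65:
        return "Uncertain"
    return "Human"

# classify(0.2) -> "AI"; classify(0.5) -> "Uncertain"; classify(0.9) -> "Human"
```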
The comprehensive analyzer running all 10 methods on a text sample, showing individual detector scores and the weighted ensemble verdict.
A multi-section explanation showing per-detector findings, key insights, and the reasoning behind the final classification.
The adversarial detector flagging homoglyph substitution and thesaurus abuse patterns in a tampered text sample.
Other tools just give us a percentage. This one tells us why it thinks something is AI-generated, which specific patterns it found, which methods agree, and where the text is suspicious. That’s the difference between ‘I think you cheated’ and ‘here’s what the evidence shows.’
Binary AI/Human classification is irresponsible. The 35-65% ‘Uncertain’ band explicitly tells faculty ‘we don’t know, investigate further.’ This prevents the false positives that destroy student trust and protects institutions from wrongful accusations.
When evasion techniques are detected, the text might still be human-written (some people genuinely use thesaurus words). Instead of affecting the AI/Human score, adversarial detection lowers the confidence, telling faculty ‘something unusual is happening, but we’re less sure about the overall verdict.’
A 95% accurate black box is less useful than an 85% accurate system that explains its reasoning. Faculty need to understand why text was flagged so they can make informed decisions and have defensible conversations with students.
Tell us about your academic integrity challenges and let’s explore what a multi-method, explainable detection system could look like for your institution.
No pitch. No pressure. Just a conversation about what might work.