How to detect AI-generated emails
AI agents now send emails from real Gmail accounts — fluent, personalized, and impossible to spot by eye. Here's how to detect them automatically.
Why email detection is hard
Traditional spam filters look for phishing links, malformed headers, and blacklisted IPs. AI-generated emails have none of these. They arrive from legitimate email providers, pass SPF/DKIM checks, and read like a real person wrote them.
The tell isn't in the words — it's in the infrastructure.
The 4 categories of signals
1. Zero false-positive signals (any one = definite agent)
- ESP infrastructure: platforms like Instantly.ai, Smartlead, Lemlist, and GMass leave fingerprints in Received: and X-Mailer headers even when they mask the sending account.
- Prompt leakage: AI agents sometimes fail to strip prompt or template text from the email body. Unfilled template brackets such as "[INSERT RECIPIENT NAME]" or a stray "As an AI language model" in the message body are dead giveaways.
- Agent framework metadata: programmatic email clients produce a distinct MIME structure, such as no quoted-printable encoding, line wraps at exactly 76 characters, and unusual Content-Transfer-Encoding values.
- Honeypot responses: invisible challenge instructions embedded in outgoing emails, which LLMs act on but humans never notice.
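The first two checks in this list lend themselves to a short sketch. The marker strings and regular expressions below are illustrative assumptions, not AgentProof's actual rules; real ESP fingerprints would have to be collected from observed traffic.

```python
import re
from email import message_from_string

# Hypothetical marker substrings, derived only from the platform names above.
ESP_MARKERS = ("instantly", "smartlead", "lemlist", "gmass")

# Assumed patterns for leaked template placeholders and model boilerplate.
LEAK_PATTERNS = [
    re.compile(r"\[INSERT [A-Z ]+\]"),        # e.g. [INSERT RECIPIENT NAME]
    re.compile(r"\{\{\s*\w+\s*\}\}"),         # e.g. {{first_name}}
    re.compile(r"as an ai language model", re.IGNORECASE),
]

def esp_fingerprints(raw_email: str) -> list[str]:
    """Return the ESP markers found in the Received and X-Mailer headers."""
    msg = message_from_string(raw_email)
    values = msg.get_all("Received", []) + msg.get_all("X-Mailer", [])
    haystack = " ".join(str(v) for v in values).lower()
    return [marker for marker in ESP_MARKERS if marker in haystack]

def has_prompt_leak(body: str) -> bool:
    """True if the body contains an unfilled placeholder or model boilerplate."""
    return any(pattern.search(body) for pattern in LEAK_PATTERNS)

# Made-up message using reserved example domains and addresses.
SAMPLE = (
    "Received: from mail.smartlead.example (mail.smartlead.example [192.0.2.10])\n"
    "X-Mailer: ExampleMailer 1.0\n"
    "Subject: Quick question\n"
    "\n"
    "Hi [INSERT RECIPIENT NAME], I wanted to reach out.\n"
)

if __name__ == "__main__":
    print(esp_fingerprints(SAMPLE))
    print(has_prompt_leak(message_from_string(SAMPLE).get_payload()))
```

Substring matching on header values is deliberately crude here; a production check would likely match specific hostnames or header formats rather than bare platform names.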
2. Low false-positive signals (combine 2+ to flag)
- Superhuman response speed: a 14-second reply to a 500-word email. Humans take minutes at minimum.
- Heartbeat cadence: messages arriving at perfect 30-minute intervals, matching a typical agent polling loop.
- Ghost sender: an email address created within the past few days, with no LinkedIn profile, no Google footprint, and a generated-looking local part.
- Follow-up cadence: a cold outreach sequence whose send intervals have a coefficient of variation under 15%.
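The timing signals above can be sketched in a few lines. The 15% coefficient-of-variation threshold and the "combine 2+" rule come from the text; the 250 words-per-minute reading speed and all function names are assumptions made for illustration.

```python
from statistics import mean, pstdev

READING_WPM = 250  # assumed human reading speed, not taken from the article

def interval_cv(timestamps):
    """Coefficient of variation (std dev / mean) of gaps between send times.

    Returns None when there are fewer than two gaps or the mean gap is not positive.
    """
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        return None
    average = mean(gaps)
    if average <= 0:
        return None
    return pstdev(gaps) / average

def regular_cadence(timestamps, max_cv=0.15):
    """True if the send intervals vary by less than max_cv (15% by default)."""
    cv = interval_cv(timestamps)
    return cv is not None and cv < max_cv

def superhuman_reply(delay_s, word_count):
    """True if the reply arrived faster than the original could plausibly be read."""
    reading_time_s = word_count / READING_WPM * 60
    return delay_s < reading_time_s

def should_flag(signals):
    """Low false-positive signals only flag when at least two fire together."""
    return sum(bool(s) for s in signals) >= 2

if __name__ == "__main__":
    sends = [0, 1800, 3600, 5400]  # seconds; a perfect 30-minute heartbeat
    print(should_flag([superhuman_reply(14, 500), regular_cadence(sends)]))
```

Using the population standard deviation keeps the result defined for short sequences; with only three or four follow-ups, any threshold should be treated as a rough guide.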
3. Medium false-positive signals (score boosters only)
These are real signals, but too weak to flag on their own, since they would produce false positives on humans who just happen to write clearly:
- Low perplexity — AI text has unnaturally uniform predictability. GPT-family models score 20–30 perplexity; human email averages 60–100.
- LLM vocabulary patterns — "I hope this email finds you well", "leverage", "streamline", "circle back", "reach out" at above-baseline rates.
- Perfect grammar with no artifacts — no typos, no "Sent from iPhone", no copy-paste artifacts, no autocorrect errors.
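Of these three, the vocabulary signal is the easiest to illustrate without a language model. The sketch below counts the phrases listed above per 100 words and turns any excess over a baseline into a capped score boost. The baseline rate and the cap are placeholder assumptions; a real baseline would have to be measured on human-written email.

```python
import re

# Phrases quoted in the list above.
LLM_PHRASES = (
    "i hope this email finds you well",
    "leverage",
    "streamline",
    "circle back",
    "reach out",
)

def phrase_rate(body: str) -> float:
    """Occurrences of the listed phrases per 100 words of body text."""
    text = body.lower()
    words = re.findall(r"[a-z']+", text)
    if not words:
        return 0.0
    hits = sum(
        len(re.findall(r"\b" + re.escape(phrase) + r"\b", text))
        for phrase in LLM_PHRASES
    )
    return 100.0 * hits / len(words)

def vocabulary_boost(body: str, baseline_rate: float = 1.0, max_boost: float = 10.0) -> float:
    """Points to add to an existing score; never enough to flag by itself.

    baseline_rate and max_boost are assumed values for illustration.
    """
    excess = max(0.0, phrase_rate(body) - baseline_rate)
    return min(max_boost, excess)

if __name__ == "__main__":
    sample = "I hope this email finds you well. I wanted to reach out and circle back."
    print(phrase_rate(sample), vocabulary_boost(sample))
```

Capping the boost reflects the "score boosters only" rule: even a message saturated with these phrases cannot push a sender into a flagged tier without stronger signals.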
4. Whitelist protections
The most important part of any detection system is knowing when NOT to flag. AgentProof applies multipliers that scale down the combined score, plus one rule that overrides it entirely:
- Your Google Contacts → score × 0.3
- Prior conversation history → score × 0.5
- Calendar correlation → score × 0.4
- User-confirmed human senders → score = 0
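A minimal sketch of how these protections might be applied to a raw 0–100 score. The multiplier values are the ones listed above; the assumption that multiple protections stack multiplicatively, along with the `SenderContext` type and its field names, is ours and not confirmed by the source.

```python
from dataclasses import dataclass

@dataclass
class SenderContext:
    """Hypothetical container for what is known about a sender."""
    in_contacts: bool = False
    prior_conversation: bool = False
    calendar_match: bool = False
    confirmed_human: bool = False

def apply_whitelist(raw_score: float, ctx: SenderContext) -> float:
    """Scale a raw agent score down according to whitelist protections."""
    if ctx.confirmed_human:
        return 0.0  # user confirmation overrides every signal
    score = raw_score
    if ctx.in_contacts:
        score *= 0.3
    if ctx.prior_conversation:
        score *= 0.5
    if ctx.calendar_match:
        score *= 0.4
    # Clamp to the 0-100 range used by the badges.
    return max(0.0, min(100.0, score))

if __name__ == "__main__":
    print(apply_whitelist(80, SenderContext(in_contacts=True)))
    print(apply_whitelist(95, SenderContext(confirmed_human=True)))
```

Under this stacking assumption, a sender who is both in your contacts and has prior conversation history would see a raw score of 80 reduced to about 12.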
The key insight: false positives destroy trust. A system that occasionally misses an agent is annoying. A system that badges a real person as a bot gets uninstalled. AgentProof is asymmetrically cautious — high confidence to flag, aggressive whitelisting to protect.
Badge tiers in Gmail
AgentProof injects colored badges next to sender names in your Gmail inbox:
- Green, score 0–30 (e.g. a badge reading 12): likely human
- Yellow, score 31–60 (e.g. 47): uncertain, click for detail
- Red, score 61–100 (e.g. 91): likely AI agent
- Gray, labeled "Auto": transactional (Coinbase, GitHub, etc.)
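The tier boundaries above map directly onto a small lookup. This is a sketch under assumptions: the function name and the `transactional` flag are hypothetical, and fractional scores between tiers (for example 30.5) are assigned to the higher tier here because the source only specifies integer ranges.

```python
def badge_tier(score: float, transactional: bool = False) -> str:
    """Map a 0-100 agent score to a badge color."""
    if transactional:
        return "gray"  # transactional senders are badged without a score
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score <= 30:
        return "green"   # likely human
    if score <= 60:
        return "yellow"  # uncertain
    return "red"         # likely AI agent

if __name__ == "__main__":
    for example in (12, 47, 91):
        print(example, badge_tier(example))
```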
Try it yourself
AgentProof is a free Chrome extension that runs entirely in your browser. No email data leaves your machine on the free tier. Install it and every new email in your Gmail inbox gets scored automatically.