Technical Deep-Dive

How Guardlyze Works

A technical explanation of Level 4 content moderation: how we combine evasion normalization with intent analysis to detect threats that bypass moderation services from AWS, Google, and Azure.

Last updated: January 2026

  • 97%+ Evasion Detection
  • 40+ Specialized AI Models
  • 119 Languages Supported
  • 45+ Threat Categories

Source: Internal benchmarks, 2026.

Understanding Protection Levels

What Are the 4 Levels of Content Moderation?

Content moderation technology has evolved through four distinct levels. Most APIs stop at Level 2. Guardlyze operates at Level 4.

Level 1: Keyword Filtering (~40% detection)

Blocks explicit words like "kill", "nazi". The most basic approach.

Limitation: Easily bypassed with "k1ll", "n4zi", or Unicode substitutions.

Used by: Basic filters, regex-based systems
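
To make the Level 1 limitation concrete, here is a minimal sketch of a whole-word keyword filter. The `BLOCKLIST` contents and the `keyword_filter` name are illustrative assumptions, not part of any real product; the point is only that a one-character substitution produces a different token and slips through.

```python
import re

# Illustrative blocklist using the two example words from the text above.
BLOCKLIST = {"kill", "nazi"}

def keyword_filter(text: str) -> bool:
    """Return True if any blocklisted word appears as a whole token."""
    tokens = re.findall(r"\w+", text.lower())
    return any(token in BLOCKLIST for token in tokens)

print(keyword_filter("I will kill you"))  # flagged
print(keyword_filter("I will k1ll you"))  # missed: "k1ll" is a different token
```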

Level 2: Contextual ML (60-70% detection)

Machine learning analyzes word patterns and context around flagged terms.

Limitation: Trained on what people say, not what they mean. Misses manipulation that contains no toxic words at all.

Used by: AWS Rekognition, Google Cloud Vision, Azure Content Moderator

Level 3: Intent Detection (85%+ detection)

Analyzes psychological intent, power dynamics, and manipulation patterns.

Limitation: Can still be bypassed with sophisticated evasion techniques.

Used by: Guardlyze (included)

Level 4: Evasion-Proof (97%+ detection)

Normalizes known evasion vectors before classification, combining intent analysis with evasion detection.

Limitation: Depends on continuous updates to keep pace with new evasion techniques.

Used by: Guardlyze only

Technical Architecture

How Does Guardlyze Process Content?

Every piece of content goes through our 6-stage detection pipeline. The key innovation is Stage 2: evasion normalization happens before classification.

1. Content Ingestion

Your content (text, image, or audio) is sent to our unified API endpoint. We accept JSON payloads with automatic format detection.

POST /text/moderate, /image/moderate, or /audio/moderate

2. Evasion Normalization

Before any classification, we decode all known evasion techniques. This is what makes Level 4 detection possible.

Unicode confusables → Latin, BiDi → stripped, Zalgo → cleaned, Leetspeak → normalized
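
The normalization step above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Guardlyze's implementation: the `CONFUSABLES` and `LEET` tables are tiny hand-picked samples, and the function names are hypothetical. It also shows the trade-offs of aggressive normalization, since stripping combining marks removes legitimate accents too.

```python
import unicodedata

# Tiny illustrative maps (assumptions); a production table would be far larger.
CONFUSABLES = {
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u043a": "k",  # Cyrillic small ka
    "\u0456": "i",  # Cyrillic small Byelorussian-Ukrainian i
}
LEET = {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "8": "a"}

def strip_invisible_and_marks(text: str) -> str:
    """Drop format characters (BiDi controls, zero-width) and combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Cf = format characters, Mn/Me = combining marks (Zalgo, but also accents).
    return "".join(
        ch for ch in decomposed
        if unicodedata.category(ch) not in ("Cf", "Mn", "Me")
    )

def deleet(token: str) -> str:
    """Map digits to letters only in tokens that mix letters and digits."""
    if any(c.isalpha() for c in token) and any(c.isdigit() for c in token):
        return "".join(LEET.get(c, c) for c in token)
    return token

def normalize(text: str) -> str:
    text = strip_invisible_and_marks(text).lower()
    text = "".join(CONFUSABLES.get(ch, ch) for ch in text)
    # Splitting on whitespace also collapses repeated spaces.
    return " ".join(deleet(tok) for tok in text.split())

print(normalize("h\u0430te"))      # Cyrillic 'a' mapped to Latin
print(normalize("ha\u200bte"))     # zero-width space removed
print(normalize("k1ll"))           # leetspeak digit mapped
```

Running the classifier on `normalize(text)` rather than on the raw input is what lets a single model see "hate" regardless of which disguise was used.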

3. Language Detection

Our language detector identifies the primary language and routes content to appropriate specialized classifiers.

119 languages supported, mixed-script handling, dialect detection
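
As a rough illustration of the routing half of this stage, the sketch below sends a detected language code either to a native classifier or to a standard one. The ISO 639-1 codes correspond to the 14 Tier 1 languages named later in this document; the `route` function and the `"native"` / `"standard"` labels are assumptions made for the example, and the language detector itself is out of scope here.

```python
# ISO 639-1 codes for the 14 Tier 1 languages listed in this document.
TIER1 = frozenset(
    {"en", "fr", "es", "it", "de", "pt", "tr", "ru", "ar", "zh", "pl", "nl", "ja", "ko"}
)

def route(lang_code: str) -> str:
    """Pick a classifier family from a language tag such as 'fr' or 'fr-CA'."""
    # Keep only the primary language subtag; region is ignored in this sketch.
    base = lang_code.lower().replace("_", "-").split("-")[0]
    return "native" if base in TIER1 else "standard"

print(route("fr-CA"))  # native
print(route("hi"))     # standard
```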

4. Multi-Model Classification

Content is analyzed by 40+ specialized AI models, each trained on specific threat categories with dedicated datasets.

Parallel processing, category-specific thresholds, confidence scoring
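
The combination of parallel processing and category-specific thresholds can be sketched as follows. The two stub models, the category names, and the threshold values are all placeholders invented for this example; real classifiers would be trained models, not substring checks.

```python
from concurrent.futures import ThreadPoolExecutor

# Stub "models" (assumptions): each returns a confidence score in [0, 1].
def hate_model(text: str) -> float:
    return 0.9 if "hate" in text else 0.05

def spam_model(text: str) -> float:
    return 0.8 if "free money" in text else 0.1

MODELS = {"hate_speech": hate_model, "spam": spam_model}
# Each category has its own cut-off instead of one global threshold.
THRESHOLDS = {"hate_speech": 0.5, "spam": 0.7}

def classify(text: str) -> dict:
    """Run every model concurrently and keep categories above their threshold."""
    with ThreadPoolExecutor(max_workers=len(MODELS)) as pool:
        futures = {name: pool.submit(model, text) for name, model in MODELS.items()}
        scores = {name: future.result() for name, future in futures.items()}
    return {name: s for name, s in scores.items() if s >= THRESHOLDS[name]}

print(classify("i hate you"))
print(classify("hello"))
```

Per-category thresholds let a high-stakes category trigger at a lower score than a noisy one without retraining either model.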

5. Intent Analysis

Level 3-4 detection analyzes psychological patterns, power dynamics, and manipulation tactics beyond vocabulary.

7 psychological abuse patterns, grooming sequences, DARVO detection

6. Response Generation

You receive categorized results with confidence scores, risk levels, matched categories, and word-level highlighting.

JSON response, confidence scores, actionable flags
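
A minimal sketch of how such a response might be assembled from classifier output. The field names mirror the example response shown in the integration section of this document; the `risk_level` cut-offs and the `"none"` value for safe content are assumptions made for illustration and are not documented thresholds.

```python
import json

def risk_level(confidence: float) -> str:
    """Map a confidence score to a label. Cut-offs are hypothetical."""
    if confidence >= 0.9:
        return "critical"
    if confidence >= 0.7:
        return "high"
    if confidence >= 0.4:
        return "medium"
    return "low"

def build_response(matches: dict, flags: list) -> dict:
    """matches: {category: confidence}; flags: [{"type": str, "span": [start, end]}]."""
    confidence = max(matches.values(), default=0.0)
    return {
        "safe": not matches,
        "risk_level": risk_level(confidence) if matches else "none",
        "categories": sorted(matches),
        "confidence": confidence,
        "flags": flags,
    }

response = build_response(
    {"grooming": 0.94, "off_platforming": 0.81},
    [{"type": "age_probing", "span": [0, 28]}],
)
print(json.dumps(response, indent=2))
```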
Level 4 Capability

How Does Evasion Detection Work?

Bad actors use algorithmic evasion to bypass content filters. Guardlyze normalizes these techniques before classification, reaching 97%+ detection overall in our internal benchmarks, where other APIs score below 20% on most techniques and around 40% on leetspeak. Learn more about Unicode confusables from the Unicode Consortium.

Unicode Confusables
Visually identical characters from different Unicode blocks bypass character-based filters.
Example: hаte (Cyrillic 'а') | Others: <10% | Guardlyze: 98%+

BiDi Control Attacks
Right-to-left override characters hide malicious file extensions or reverse the apparent meaning of text.
Example: malware.txt‮exe.txt | Others: <5% | Guardlyze: 99%+

Zalgo Text
Stacked combining diacritical marks create 'glitchy' text that breaks tokenization.
Example: h̷̢a̵͝t̶̛e̸͠ | Others: <5% | Guardlyze: 99%+

Zero-Width Characters
Invisible Unicode characters inserted between letters break word matching.
Example: ha​te (invisible) | Others: <15% | Guardlyze: 98%+

Leetspeak
Number-letter substitutions are the oldest trick but still effective against basic filters.
Example: h8te, k1ll, n4z1 | Others: ~40% | Guardlyze: 96%+

Homoglyphs
Look-alike characters mixed in from Cyrillic, Greek, and mathematical symbol blocks.
Example: кіⅼⅼ (mixed scripts) | Others: <20% | Guardlyze: 97%+
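
Beyond normalizing the text, it can be useful to record which evasion techniques were present, since the attempt itself is a signal. The sketch below is a heuristic illustration only: the `detect_evasion` function, its labels, and its cut-offs are assumptions, and it will not catch every case (for example, a word written entirely in a look-alike script is not "mixed").

```python
import unicodedata

BIDI_CONTROLS = set("\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069")
ZERO_WIDTH = set("\u200b\u200c\u200d\u2060\ufeff")

def script_of(ch: str):
    """Very rough script guess based on the Unicode character name."""
    name = unicodedata.name(ch, "")
    for script in ("LATIN", "CYRILLIC", "GREEK"):
        if name.startswith(script):
            return script
    return None

def detect_evasion(text: str) -> set:
    found = set()
    if any(ch in BIDI_CONTROLS for ch in text):
        found.add("bidi_control")
    if any(ch in ZERO_WIDTH for ch in text):
        found.add("zero_width")
    # Zalgo heuristic: more combining marks than letters.
    marks = sum(1 for ch in text if unicodedata.category(ch) == "Mn")
    letters = sum(1 for ch in text if ch.isalpha())
    if marks > letters > 0:
        found.add("zalgo")
    for word in text.split():
        scripts = {script_of(ch) for ch in word if ch.isalpha()} - {None}
        if len(scripts) > 1:
            found.add("mixed_script")
        # Cheap hint only: also fires on benign tokens such as "covid19".
        if any(c.isdigit() for c in word) and any(c.isalpha() for c in word):
            found.add("leetspeak")
    return found

print(detect_evasion("h\u0430te"))  # Latin and Cyrillic letters in one word
print(detect_evasion("k1ll"))
```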
Level 3 Capability

How Does Intent Detection Work?

Level 3 detection analyzes psychological intent, not vocabulary. We detect manipulation even when every individual word appears "safe."

7 Psychological Abuse Patterns

  • Reality Distortion — Gaslighting, denying events
  • Conditional Affection — Love-bombing, withdrawal
  • Emotional Invalidation — "You're too sensitive"
  • Social Isolation — Cutting off support networks
  • Blame Inversion (DARVO) — Deny, Attack, Reverse Victim and Offender
  • Dehumanization — Reducing to objects/labels
  • Manipulation Cycles — Tension-explosion-honeymoon

Grooming Detection Signals

  • Predator Acronyms — ASL, GNOC, S2R, NP4NP (50+)
  • Age Probing — "How old are you?" in context
  • Trust Escalation — Rapid relationship building
  • Off-Platforming — "Do you have Snap/Discord?"
  • Secrecy Induction — "Don't tell anyone"
  • Context Multipliers — Platform-aware risk scoring
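
The signals above can be illustrated with a deliberately simple rule-based scorer. Real intent detection would rely on trained models rather than regular expressions; the patterns, the `score` function, and the multiplier values here are assumptions chosen only to show how individually harmless phrases add up and how platform context can scale the result.

```python
import re

# Hypothetical patterns for three of the signals listed above.
SIGNALS = {
    "age_probing": re.compile(r"\bhow old are you\b|\bmature for your age\b", re.I),
    "off_platforming": re.compile(r"\bdo you have (snap|snapchat|discord)\b", re.I),
    "secrecy_induction": re.compile(r"\b(don't|do not|won't) tell anyone\b", re.I),
}
# Hypothetical context multipliers: riskier platforms scale the score up.
PLATFORM_MULTIPLIER = {"kids_game": 1.5, "dating_app": 1.2}

def score(text: str, platform=None):
    """Return (matched signal names, risk in [0, 1])."""
    hits = sorted(name for name, pattern in SIGNALS.items() if pattern.search(text))
    base = len(hits) / len(SIGNALS)
    risk = min(1.0, base * PLATFORM_MULTIPLIER.get(platform, 1.0))
    return hits, risk

message = (
    "You're mature for your age. This chat is slow, do you have Discord? "
    "I won't tell anyone if you don't."
)
print(score(message, platform="kids_game"))
```

No single phrase in that message would trip a keyword filter; the risk comes from several signals appearing together.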

Example: Zero Toxic Words, Maximum Risk

"You're mature for your age. This chat is slow, do you have Discord? I won't tell anyone if you don't."

Traditional APIs: SAFE | Guardlyze: CRITICAL (Grooming Pattern)
Integration

How Do I Integrate Guardlyze?

Standard REST API. Send content, receive categorized results with confidence scores and risk levels. Average integration time: less than 1 day.

curl -X POST https://api.guardlyze.com/text/moderate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your content here",
    "context": {
      "platform": "dating_app",
      "user_age": 16
    }
  }'

Response:

{
  "safe": false,
  "risk_level": "critical",
  "categories": ["grooming", "off_platforming"],
  "confidence": 0.94,
  "flags": [
    {"type": "age_probing", "span": [0, 28]},
    {"type": "off_platform", "span": [45, 67]}
  ]
}
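
For teams calling the API from Python rather than curl, the same request can be built with the standard library. This sketch reuses the endpoint, headers, and field names from the example above; the helper names are hypothetical, and treating each `span` as a half-open `[start, end)` character range is an assumption that should be checked against the API documentation.

```python
import json
import urllib.request

API_URL = "https://api.guardlyze.com/text/moderate"

def build_request(text: str, api_key: str, context=None) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    payload = {"text": text}
    if context:
        payload["context"] = context
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def flagged_spans(text: str, response: dict) -> list:
    """Return (flag type, flagged substring) pairs.

    Assumes spans are [start, end) character offsets into the submitted text.
    """
    return [
        (flag["type"], text[flag["span"][0]:flag["span"][1]])
        for flag in response.get("flags", [])
    ]

# Sending is left to the caller, for example:
#   with urllib.request.urlopen(build_request(text, key)) as resp:
#       result = json.load(resp)
```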
Global Coverage

What Languages Does Guardlyze Support?

Tier 1 — 14 Precision Languages

95%+ accuracy, native classifiers, regional slang, word-level highlighting

English, French, Spanish, Italian, German, Portuguese, Turkish, Russian, Arabic, Chinese, Polish, Dutch, Japanese, Korean

Tier 2 — 105 Standard Languages

85%+ accuracy, all 45+ categories, contextual analysis

All major world languages including Hindi, Bengali, Vietnamese, Thai, Hebrew, Greek, Czech, Romanian, Hungarian, and 95+ more.

Regional Dialect Support

  • Arabic: MSA + Gulf, Levantine, Egyptian, Maghrebi, Arabizi
  • Spanish: Spain, Mexico, Argentina, Colombia variants
  • French: France, Quebec, Belgium, Switzerland, African
  • Chinese: Simplified, Traditional, Cantonese romanization

All processing complies with our Privacy Policy and Terms of Service.
