What is Level 4 content moderation?

Level 4 is the highest level of content moderation. It builds on keyword filtering (Level 1), contextual ML (Level 2), and intent detection (Level 3) by adding evasion-proof detection that catches Unicode confusables, BiDi attacks, Zalgo text, and other sophisticated bypass techniques, with a 97%+ detection rate on evasion attacks.

How does Guardlyze compare to AWS, Google, and Azure moderation?

Traditional cloud providers like AWS Rekognition, Google Cloud Natural Language, and Azure Content Moderator typically stop at Level 2 (contextual ML). Across standard and coded hate speech combined, they score 51-59% while Guardlyze achieves ~90%. The gap is largest on coded language where competitors drop to 15-30%.

Is Guardlyze GDPR compliant?

Yes, Guardlyze is fully compliant with GDPR, DSA (Digital Services Act), and KOSA (Kids Online Safety Act). We offer zero data retention options for maximum privacy.

How does Guardlyze detect content that bypasses other moderation APIs?

Guardlyze uses a multi-stage pipeline that normalizes evasion techniques before classification. Unicode confusables (for example, Cyrillic look-alike characters), BiDi control attacks, Zalgo text, zero-width characters, and leetspeak are decoded first, and the normalized text is then analyzed by specialized models. This achieves 97%+ detection on evasion attacks where competitors score below 30%.

What content formats does Guardlyze analyze?

Guardlyze analyzes text (messages, comments, bios, usernames), images (photos, memes, screenshots), and audio (voice messages, recordings). All formats use a single unified API—no need to integrate multiple vendors.

How does Guardlyze detect grooming with zero explicit words?

Guardlyze's grooming detection uses multi-layered behavioral analysis: predator acronyms (ASL, GNOC, S2R), age probing patterns, trust escalation sequences, off-platforming attempts, and context multipliers based on platform type. A message like 'You're mature for your age. Do you have Discord?' contains zero toxic words but is flagged as critical risk.

How many threat categories does Guardlyze detect?

Guardlyze detects 45+ content categories organized into 6 families: Hate & Discrimination, Child Safety & Grooming, Psychological & Self-Harm, Violence & Extremism, Illegal Trades, and Privacy & AI Security.

Does Guardlyze store my content?

No. Guardlyze operates with zero data retention by default. Content is processed in RAM and immediately discarded. Optional 30-day retention for audit purposes is available but opt-in only.

What languages does Guardlyze support?

Guardlyze supports 119 languages for content moderation, compared to 12 languages for most traditional providers.

Technical Deep-Dive

How Guardlyze Works

A technical explanation of Level 4 content moderation: how we detect threats that bypass AWS, Google, and Azure using evasion normalization and intent analysis.

Last updated: January 2026

97%+ Evasion Detection
40+ Specialized AI Models
119 Languages Supported
45+ Threat Categories

Source: Internal benchmarks, 2026. See full comparison.

Understanding Protection Levels

What Are the 4 Levels of Content Moderation?

Content moderation technology has evolved through four distinct levels. Most APIs stop at Level 2. Guardlyze operates at Level 4.

Level 1: Keyword Filtering

~40%

Blocks explicit words like "kill", "nazi". The most basic approach.

Limitation: Easily bypassed with "k1ll", "n4zi", or Unicode substitutions.

Used by: Basic filters, regex-based systems

Level 2: Contextual ML

60-70%

Machine learning analyzes word patterns and context around flagged terms.

Limitation: Trained on what people say, not what they mean. Misses zero-toxic-word manipulation.

Used by: AWS Rekognition, Google Cloud Vision, Azure Content Moderator

Level 3: Intent Detection

85%+

Analyzes psychological intent, power dynamics, and manipulation patterns.

Limitation: Can still be bypassed with sophisticated evasion techniques.

Used by: Guardlyze (included)

Level 4: Evasion-Proof

97%+

Normalizes ALL evasion vectors before classification. Intent + Evasion combined.

Limitation: State-of-the-art. Continuously updated against new evasion techniques.

Used by: Guardlyze only

Technical Architecture

How Does Guardlyze Process Content?

Every piece of content goes through our 6-stage detection pipeline. The key innovation is Stage 2: evasion normalization happens before classification.

Content Ingestion

Your content (text, image, or audio) is sent to our unified API endpoint. We accept JSON payloads with automatic format detection.

POST /text/moderate, /image/moderate, or /audio/moderate

Evasion Normalization

Before any classification, we decode all known evasion techniques. This is what makes Level 4 detection possible.

Unicode confusables → Latin, BiDi → stripped, Zalgo → cleaned, Leetspeak → normalized
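
The four mappings above can be illustrated with a minimal Python sketch. This is not Guardlyze's implementation: the `normalize` function, the tiny `CONFUSABLES` table, and the `LEET` map are assumptions made up for illustration, and a production system would use a far larger look-alike table. The sketch relies only on the standard `unicodedata` module.

```python
import unicodedata

# Tiny illustrative look-alike table (assumption); real tables are much larger.
CONFUSABLES = str.maketrans({
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0456": "i",  # Cyrillic Byelorussian-Ukrainian i
    "\u043a": "k",  # Cyrillic ka
})
# Illustrative leetspeak map (assumption).
LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a",
                      "5": "s", "7": "t", "@": "a", "$": "s"})
# Mn/Me: combining marks (Zalgo); Cf: format characters, which include
# BiDi controls and zero-width characters.
DROP = {"Mn", "Me", "Cf"}


def normalize(text: str) -> str:
    # NFKD folds compatibility forms and splits letters from combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    visible = "".join(ch for ch in decomposed
                      if unicodedata.category(ch) not in DROP)
    latin = visible.translate(CONFUSABLES)
    tokens = []
    for token in latin.split(" "):
        # Only rewrite tokens that mix letters with leet characters,
        # so plain numbers such as "2026" are left alone.
        has_letter = any(ch.isalpha() for ch in token)
        has_leet = any(ord(ch) in LEET for ch in token)
        tokens.append(token.translate(LEET) if has_letter and has_leet else token)
    return " ".join(tokens)


print(normalize("h\u0430te"))   # hate
print(normalize("k1ll n4z1"))   # kill nazi
```

Two trade-offs are worth noting. Dropping every combining mark also strips legitimate accents, and dropping every format character removes joiners that some scripts and emoji sequences need, so a sketch like this is best run on a copy used for classification while the original text is kept for display.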

Language Detection

Our language detector identifies the primary language and routes content to appropriate specialized classifiers.

119 languages supported, mixed-script handling, dialect detection

Multi-Model Classification

Content is analyzed by 40+ specialized AI models, each trained on specific threat categories with dedicated datasets.

Parallel processing, category-specific thresholds, confidence scoring
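
A rough sketch of parallel processing with category-specific thresholds follows. The two stub "models", their scores, and the threshold values are placeholders invented for this example; they stand in for real classifiers and say nothing about how Guardlyze's models are built or tuned.

```python
from concurrent.futures import ThreadPoolExecutor


# Stub classifiers (assumptions): each returns a confidence score in [0, 1].
def hate_model(text: str) -> float:
    return 0.91 if "hate" in text else 0.02


def spam_model(text: str) -> float:
    return 0.40 if "http" in text else 0.01


MODELS = {"hate": hate_model, "spam": spam_model}
# Each category gets its own cut-off instead of one global threshold.
THRESHOLDS = {"hate": 0.80, "spam": 0.60}


def classify(text: str) -> dict:
    # Run every category model concurrently on the same input.
    with ThreadPoolExecutor() as pool:
        futures = {cat: pool.submit(fn, text) for cat, fn in MODELS.items()}
        scores = {cat: fut.result() for cat, fut in futures.items()}
    flagged = sorted(cat for cat, s in scores.items() if s >= THRESHOLDS[cat])
    return {"scores": scores, "flagged": flagged}


print(classify("i hate http links")["flagged"])  # ['hate']
```

Per-category thresholds let a platform be strict on one category and lenient on another without retraining anything: in the example the spam score of 0.40 is reported but stays below its 0.60 cut-off.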

Intent Analysis

Level 3-4 detection analyzes psychological patterns, power dynamics, and manipulation tactics beyond vocabulary.

7 psychological abuse patterns, grooming sequences, DARVO detection

Response Generation

You receive categorized results with confidence scores, risk levels, matched categories, and word-level highlighting.

JSON response, confidence scores, actionable flags
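
As an illustration of how word-level highlighting could be consumed on the client side, the sketch below wraps flagged spans in brackets. It assumes each flag carries a `type` and a `span` of `[start, end)` character offsets, as the sample response on this page suggests; whether the end offset is exclusive is an assumption, and `highlight` is a hypothetical helper, not part of any Guardlyze SDK.

```python
def highlight(text: str, flags: list) -> str:
    """Wrap each flagged span as [type: matched text]."""
    out, cursor = [], 0
    for flag in sorted(flags, key=lambda f: f["span"][0]):
        start, end = flag["span"]
        start = max(start, cursor)  # clip spans that overlap an earlier one
        if start >= end:
            continue
        out.append(text[cursor:start])
        out.append(f"[{flag['type']}: {text[start:end]}]")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


print(highlight("abc def ghi", [{"type": "x", "span": [4, 7]}]))
# abc [x: def] ghi
```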

Level 4 Capability

How Does Evasion Detection Work?

Bad actors use algorithmic evasion to bypass content filters. Guardlyze normalizes these techniques before classification, achieving 97%+ detection where competitors score below 30%. Learn more about Unicode confusables from the Unicode Consortium.

Technique	Description	Example	Others	Guardlyze
Unicode Confusables	Visually identical characters from different Unicode blocks bypass character-based filters.	hаte (Cyrillic 'а')	<10%	98%+
BiDi Control Attacks	Right-to-left override characters hide malicious file extensions or reverse text meaning.	malware.txt‮exe.txt	<5%	99%+
Zalgo Text	Combining diacritical marks create 'glitchy' text that breaks tokenization.	h̷̢a̵͝t̶̛e̸͠	<5%	99%+
Zero-Width Characters	Invisible Unicode characters inserted between letters break word matching.	hate (invisible)	<15%	98%+
Leetspeak	Number-letter substitutions are the oldest trick but still effective against basic filters.	h8te, k1ll, n4z1	~40%	96%+
Homoglyphs	Mixing characters from Cyrillic, Greek, and mathematical symbols.	кіⅼⅼ (mixed scripts)	<20%	97%+
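
Several rows in this table involve a single word that mixes scripts. One simple way to spot such words, sketched here in Python, is to flag any token whose letters come from more than one script. The standard library exposes no script property, so this sketch approximates the script with the first word of each character's Unicode name; that shortcut and the function names are assumptions for illustration only.

```python
import unicodedata


def scripts_in(token: str) -> set:
    """Approximate the set of scripts used by the letters in a token."""
    scripts = set()
    for ch in token:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                # e.g. "CYRILLIC SMALL LETTER A" -> "CYRILLIC"
                scripts.add(name.split(" ")[0])
    return scripts


def mixed_script_tokens(text: str) -> list:
    """Return whitespace-separated tokens whose letters span several scripts."""
    return [tok for tok in text.split() if len(scripts_in(tok)) > 1]


print(mixed_script_tokens("I h\u0430te this"))  # the Cyrillic/Latin token is returned
```

A heuristic like this produces false positives on languages that legitimately mix scripts within a word (Japanese, for instance), so it is better treated as one signal among several than as a verdict.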

Level 3 Capability

How Does Intent Detection Work?

Level 3 detection analyzes psychological intent, not vocabulary. We detect manipulation even when every individual word appears "safe."

7 Psychological Abuse Patterns

Reality Distortion — Gaslighting, denying events
Conditional Affection — Love-bombing, withdrawal
Emotional Invalidation — "You're too sensitive"
Social Isolation — Cutting off support networks
Blame Inversion (DARVO) — Deny, Attack, Reverse
Dehumanization — Reducing to objects/labels
Manipulation Cycles — Tension-explosion-honeymoon

Grooming Detection Signals

Predator Acronyms — ASL, GNOC, S2R, NP4NP (50+)
Age Probing — "How old are you?" in context
Trust Escalation — Rapid relationship building
Off-Platforming — "Do you have Snap/Discord?"
Secrecy Induction — "Don't tell anyone"
Context Multipliers — Platform-aware risk scoring

Example: Zero Toxic Words, Maximum Risk

"You're mature for your age. This chat is slow, do you have Discord? I won't tell anyone if you don't."

Traditional APIs: SAFE | Guardlyze: CRITICAL (Grooming Pattern)
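
To make the idea of combining signals with a context multiplier concrete, here is a deliberately crude Python sketch. It uses three regular expressions and made-up weights and multipliers; the behavioral analysis described above is far richer than keyword patterns, and none of the names or numbers below come from Guardlyze.

```python
import re

# Hypothetical signal patterns and weights (assumptions for illustration).
SIGNALS = {
    "age_probing": (re.compile(r"\b(how old are you|for your age)\b", re.I), 0.3),
    "off_platforming": (re.compile(r"\bdo you have (snapchat|snap|discord)\b", re.I), 0.3),
    "secrecy_induction": (re.compile(r"\b(don't|do not|won't) tell anyone\b", re.I), 0.3),
}
# Hypothetical platform-aware multipliers; unknown platforms default to 1.0.
PLATFORM_MULTIPLIER = {"dating_app": 1.5, "general_chat": 1.0}


def score(message: str, platform: str):
    """Return the matched signal names and a risk score capped at 1.0."""
    hits = [name for name, (pattern, _) in SIGNALS.items() if pattern.search(message)]
    base = sum(SIGNALS[name][1] for name in hits)
    total = min(1.0, base * PLATFORM_MULTIPLIER.get(platform, 1.0))
    return hits, total


msg = ("You're mature for your age. This chat is slow, do you have Discord? "
       "I won't tell anyone if you don't.")
print(score(msg, "dating_app"))
```

No single phrase in the message is toxic on its own; the score rises because several weak signals co-occur, and the platform context pushes it further.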

Integration

How Do I Integrate Guardlyze?

Standard REST API. Send content, receive categorized results with confidence scores and risk levels. Average integration time: less than 1 day.

curl -X POST https://api.guardlyze.com/text/moderate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your content here",
    "context": {
      "platform": "dating_app",
      "user_age": 16
    }
  }'

Response:

{
  "safe": false,
  "risk_level": "critical",
  "categories": ["grooming", "off_platforming"],
  "confidence": 0.94,
  "flags": [
    {"type": "age_probing", "span": [0, 28]},
    {"type": "off_platform", "span": [45, 67]}
  ]
}
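
The same call can be made from Python with only the standard library. The sketch below reuses the endpoint, headers, and field names from the curl example and sample response above; the `decide` policy (allow, block, or review) and the function names are assumptions added for illustration, and risk levels other than "critical" are not documented here.

```python
import json
import urllib.request

API_URL = "https://api.guardlyze.com/text/moderate"  # taken from the curl example


def build_request(text, api_key, context=None):
    """Build the POST request without sending it."""
    payload = {"text": text}
    if context:
        payload["context"] = context
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def moderate(text, api_key, context=None, timeout=10):
    """Send the request and return the parsed JSON response."""
    request = build_request(text, api_key, context)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def decide(result):
    """Hypothetical policy: map a moderation result to an action."""
    if result.get("safe", False):
        return "allow"
    if result.get("risk_level") == "critical":
        return "block"
    return "review"
```

Keeping request construction separate from the network call makes the payload easy to unit-test, and `decide` can be swapped for whatever escalation policy a platform needs.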

Global Coverage

What Languages Does Guardlyze Support?

Tier 1 — 14 Precision Languages

95%+ accuracy, native classifiers, regional slang, word-level highlighting

English, French, Spanish, Italian, German, Portuguese, Turkish, Russian, Arabic, Chinese, Polish, Dutch, Japanese, Korean

Tier 2 — 105 Standard Languages

85%+ accuracy, all 45+ categories, contextual analysis

All major world languages including Hindi, Bengali, Vietnamese, Thai, Hebrew, Greek, Czech, Romanian, Hungarian, and 95+ more.

Regional Dialect Support

Arabic: MSA + Gulf, Levantine, Egyptian, Maghrebi, Arabizi
Spanish: Spain, Mexico, Argentina, Colombia variants
French: France, Quebec, Belgium, Switzerland, African
Chinese: Simplified, Traditional, Cantonese romanization

All processing complies with our Privacy Policy and Terms of Service.

