How Guardlyze Works
A technical explanation of Level 4 content moderation: how we detect threats that bypass AWS, Google, and Azure using evasion normalization and intent analysis.
Last updated: January 2026
- 97%+ Evasion Detection
- 40+ Specialized AI Models
- 119 Languages Supported
- 45+ Threat Categories
Source: Internal benchmarks, 2026.
What Are the 4 Levels of Content Moderation?
Content moderation technology has evolved through four distinct levels. Most APIs stop at Level 2. Guardlyze operates at Level 4.
Level 1
Blocks explicit words like "kill", "nazi". The most basic approach.
Limitation: Easily bypassed with "k1ll", "n4zi", or Unicode substitutions.
Used by: Basic filters, regex-based systems
Level 2
Machine learning analyzes word patterns and context around flagged terms.
Limitation: Trained on what people say, not what they mean. Misses zero-toxic-word manipulation.
Used by: AWS Rekognition, Google Cloud Vision, Azure Content Moderator
Level 3
Analyzes psychological intent, power dynamics, and manipulation patterns.
Limitation: Can still be bypassed with sophisticated evasion techniques.
Used by: Guardlyze (included)
Level 4
Normalizes ALL evasion vectors before classification. Intent + Evasion combined.
Limitation: Must be continuously updated as new evasion techniques appear.
Used by: Guardlyze only
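To make the Level 1 limitation concrete, here is a minimal Python sketch of a whole-word blocklist filter. The function name `level1_filter` and the two-word blocklist are illustrative assumptions, not part of any Guardlyze API; the point is only that exact matching fails as soon as one character is swapped.

```python
import re

# The two example words used in the Level 1 description.
BLOCKLIST = {"kill", "nazi"}

def level1_filter(text: str) -> bool:
    """Return True if any blocklisted word appears as a whole word."""
    words = re.findall(r"\w+", text.lower())
    return any(word in BLOCKLIST for word in words)

plain = "I will kill you"
leet = "I will k1ll you"            # digit 1 replaces the letter i
cyrillic = "I will k\u0456ll you"   # Cyrillic 'і' (U+0456) replaces Latin i

print(level1_filter(plain))     # the exact word is caught
print(level1_filter(leet))      # the leetspeak variant is not
print(level1_filter(cyrillic))  # the confusable variant is not
```

All three strings read the same to a person, but only the first matches the blocklist.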
How Does Guardlyze Process Content?
Every piece of content goes through our 6-stage detection pipeline. The key innovation is Stage 2: evasion normalization happens before classification.
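The ordering matters more than any single stage, so a rough sketch may help. The Python below is an illustration under stated assumptions, not Guardlyze's implementation: the classifiers are toy lambdas standing in for real models, the threshold values are invented, and `normalize` handles only one leetspeak substitution. It shows normalization running before classification, classifiers running in parallel, and each category applying its own threshold.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical stand-ins for real models: text in, confidence in [0, 1] out.
Classifier = Callable[[str], float]

def normalize(text: str) -> str:
    """Placeholder for Stage 2: lowercases and undoes one leetspeak swap."""
    return text.lower().replace("1", "i")

CLASSIFIERS: Dict[str, Classifier] = {
    "violence": lambda t: 0.9 if "kill" in t else 0.05,
    "spam": lambda t: 0.8 if "free money" in t else 0.1,
}

# Category-specific thresholds (illustrative values only).
THRESHOLDS = {"violence": 0.5, "spam": 0.7}

def moderate(text: str) -> dict:
    cleaned = normalize(text)  # normalization happens BEFORE any classifier runs
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, cleaned) for name, fn in CLASSIFIERS.items()}
        scores = {name: future.result() for name, future in futures.items()}
    matched = sorted(c for c, score in scores.items() if score >= THRESHOLDS[c])
    return {"safe": not matched, "categories": matched, "scores": scores}

print(moderate("I will k1ll you"))
```

If `normalize` were skipped, the toy violence classifier would see "k1ll" and return its low score, which is the failure mode the pipeline ordering is meant to avoid.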
Stage 1: Content Ingestion
Your content (text, image, or audio) is sent to our unified API endpoint. We accept JSON payloads with automatic format detection.
POST /text/moderate, /image/moderate, or /audio/moderate
Stage 2: Evasion Normalization
Before any classification, we decode all known evasion techniques. This is what makes Level 4 detection possible.
Unicode confusables → Latin, BiDi → stripped, Zalgo → cleaned, Leetspeak → normalized
Stage 3: Language Detection
Our language detector identifies the primary language and routes content to appropriate specialized classifiers.
119 languages supported, mixed-script handling, dialect detection
Stage 4: Multi-Model Classification
Content is analyzed by 40+ specialized AI models, each trained on specific threat categories with dedicated datasets.
Parallel processing, category-specific thresholds, confidence scoring
Stage 5: Intent Analysis
Level 3-4 detection analyzes psychological patterns, power dynamics, and manipulation tactics beyond vocabulary.
7 psychological abuse patterns, grooming sequences, DARVO detection
Stage 6: Response Generation
You receive categorized results with confidence scores, risk levels, matched categories, and word-level highlighting.
JSON response, confidence scores, actionable flags
How Does Evasion Detection Work?
Bad actors use algorithmic evasion to bypass content filters. Guardlyze normalizes these techniques before classification, reaching 97%+ detection overall in our internal benchmarks, while other services score below 20% on most techniques and roughly 40% on leetspeak. Learn more about Unicode confusables from the Unicode Consortium.
| Technique | How it works | Example | Others (detection rate) | Guardlyze (detection rate) |
|---|---|---|---|---|
| Unicode Confusables | Visually identical characters from different Unicode blocks bypass character-based filters. | hаte (Cyrillic 'а') | <10% | 98%+ |
| BiDi Control Attacks | Right-to-left override characters hide malicious file extensions or reverse text meaning. | malware.txtexe.txt | <5% | 99%+ |
| Zalgo Text | Combining diacritical marks create 'glitchy' text that breaks tokenization. | h̷̢a̵͝t̶̛e̸͠ | <5% | 99%+ |
| Zero-Width Characters | Invisible Unicode characters inserted between letters break word matching. | hate (invisible) | <15% | 98%+ |
| Leetspeak | Number-letter substitutions are the oldest trick but still effective against basic filters. | h8te, k1ll, n4z1 | ~40% | 96%+ |
| Homoglyphs | Mixing characters from Cyrillic, Greek, and mathematical symbols. | кіⅼⅼ (mixed scripts) | <20% | 97%+ |
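The techniques in the table can each be undone with standard Unicode tooling. The following Python sketch uses only the standard `unicodedata` module and is an assumption-laden illustration rather than Guardlyze's normalizer: the `CONFUSABLES` and `LEET` maps are tiny hand-picked samples, and a production system would need far larger mappings, such as the Unicode confusables data.

```python
import unicodedata

# Tiny illustrative maps (assumptions for this sketch, not a complete dataset).
CONFUSABLES = {
    "\u0430": "a",  # Cyrillic а
    "\u0435": "e",  # Cyrillic е
    "\u043a": "k",  # Cyrillic к
    "\u0456": "i",  # Cyrillic і
    "\u217c": "l",  # small roman numeral fifty
}
LEET = {"1": "i", "4": "a", "8": "a"}  # covers h8te, k1ll, n4z1

def normalize(text: str) -> str:
    """Return a matching-friendly copy of text with common evasions undone."""
    # Compatibility decomposition separates base letters from combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    kept = []
    for ch in decomposed:
        category = unicodedata.category(ch)
        if category == "Cf":          # format chars: BiDi controls, zero-width
            continue
        if category.startswith("M"):  # combining marks: Zalgo (and real accents)
            continue
        kept.append(ch)
    result = []
    for ch in "".join(kept).lower():
        ch = CONFUSABLES.get(ch, ch)     # fold look-alike letters to Latin
        result.append(LEET.get(ch, ch))  # then fold digit substitutions
    return "".join(result)

print(normalize("h\u0430te"))             # Cyrillic confusable
print(normalize("h\u200bate"))            # zero-width space
print(normalize("malware\u202etxt.exe"))  # right-to-left override removed
```

This is deliberately lossy: it strips legitimate accents and rewrites real digits, so the normalized string is best used only as input to classification while the original text is kept for display.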
How Does Intent Detection Work?
Level 3 detection analyzes psychological intent, not vocabulary. We detect manipulation even when every individual word appears "safe."
7 Psychological Abuse Patterns
- Reality Distortion — Gaslighting, denying events
- Conditional Affection — Love-bombing, withdrawal
- Emotional Invalidation — "You're too sensitive"
- Social Isolation — Cutting off support networks
- Blame Inversion (DARVO) — Deny, Attack, Reverse
- Dehumanization — Reducing to objects/labels
- Manipulation Cycles — Tension-explosion-honeymoon
Grooming Detection Signals
- Predator Acronyms — ASL, GNOC, S2R, NP4NP (50+)
- Age Probing — "How old are you?" in context
- Trust Escalation — Rapid relationship building
- Off-Platforming — "Do you have Snap/Discord?"
- Secrecy Induction — "Don't tell anyone"
- Context Multipliers — Platform-aware risk scoring
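To show how several individually harmless cues can add up, here is a toy Python sketch covering three of the signals listed above plus a platform multiplier. Everything in it is hypothetical: the regular expressions, the multiplier values, and the scoring formula are stand-ins chosen for readability, whereas the actual detection described on this page relies on trained models rather than keyword patterns.

```python
import re

# Hypothetical regex cues; keys loosely follow the signal names listed above.
SIGNALS = {
    "age_probing": re.compile(r"mature for your age|how old are you", re.I),
    "off_platforming": re.compile(r"do you have (snap|snapchat|discord)", re.I),
    "secrecy_induction": re.compile(r"(don|won)['\u2019]?t tell anyone", re.I),
}

# Illustrative context multipliers keyed by platform (invented values).
PLATFORM_MULTIPLIER = {"kids_game": 1.5, "dating_app": 1.2}

def grooming_score(text, platform=None):
    """Return (matched signal names, score in [0, 1])."""
    hits = [name for name, pattern in SIGNALS.items() if pattern.search(text)]
    base = len(hits) / len(SIGNALS)
    score = min(1.0, base * PLATFORM_MULTIPLIER.get(platform, 1.0))
    return hits, score

message = ("You're mature for your age. This chat is slow, do you have Discord? "
           "I won't tell anyone if you don't.")
print(grooming_score(message, platform="kids_game"))
```

The sample message contains no word a blocklist would flag, yet it matches all three cues, which is the pattern the intent analysis is described as targeting.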
Example: Zero Toxic Words, Maximum Risk
"You're mature for your age. This chat is slow, do you have Discord? I won't tell anyone if you don't."
How Do I Integrate Guardlyze?
Standard REST API. Send content, receive categorized results with confidence scores and risk levels. Average integration time: less than 1 day.
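For teams integrating from Python rather than the command line, the sketch below builds the same request as the curl example in this section using only the standard library, and maps a response to an action. The helper names `build_request`, `moderate`, and `decide`, the `block_at` cutoff, and the allow/review/block policy are assumptions made for illustration; only the URL, headers, and field names come from this page.

```python
import json
import urllib.request

API_URL = "https://api.guardlyze.com/text/moderate"  # endpoint from the curl example

def build_request(text, api_key, context=None):
    """Build (but do not send) a POST request with a JSON body."""
    payload = {"text": text}
    if context:
        payload["context"] = context
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def moderate(text, api_key, context=None):
    """Send the request and return the parsed JSON body (performs network I/O)."""
    with urllib.request.urlopen(build_request(text, api_key, context)) as resp:
        return json.load(resp)

def decide(response, block_at=0.9):
    """Hypothetical policy: allow safe content, block confident hits, review the rest."""
    if response.get("safe", False):
        return "allow"
    if response.get("confidence", 0.0) >= block_at:
        return "block"
    return "review"
```

Keeping request construction separate from sending makes the payload easy to unit test without network access, and keeps the moderation policy in `decide` independent of transport details.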
curl -X POST https://api.guardlyze.com/text/moderate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your content here",
    "context": {
      "platform": "dating_app",
      "user_age": 16
    }
  }'
Response:
{
  "safe": false,
  "risk_level": "critical",
  "categories": ["grooming", "off_platforming"],
  "confidence": 0.94,
  "flags": [
    {"type": "age_probing", "span": [0, 28]},
    {"type": "off_platform", "span": [45, 67]}
  ]
}
What Languages Does Guardlyze Support?
Tier 1 — 14 Precision Languages
95%+ accuracy, native classifiers, regional slang, word-level highlighting
English, French, Spanish, Italian, German, Portuguese, Turkish, Russian, Arabic, Chinese, Polish, Dutch, Japanese, Korean
Tier 2 — 105 Standard Languages
85%+ accuracy, all 45 categories, contextual analysis
All major world languages including Hindi, Bengali, Vietnamese, Thai, Hebrew, Greek, Czech, Romanian, Hungarian, and 95+ more.
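Client code that wants to treat the two tiers differently, for example by sending lower-accuracy languages to human review more often, could use a lookup like the Python sketch below. The ISO 639-1 codes are this sketch's own mapping of the 14 named Tier 1 languages, and `tier_for` is a hypothetical helper, not a Guardlyze function.

```python
# ISO 639-1 codes for the 14 Tier 1 languages named above. The codes are an
# assumption of this sketch; the page lists the languages by name only.
TIER1 = {
    "en", "fr", "es", "it", "de", "pt", "tr",
    "ru", "ar", "zh", "pl", "nl", "ja", "ko",
}

def tier_for(lang_code: str) -> dict:
    """Map a language tag such as 'pt-BR' to its advertised tier and accuracy floor."""
    base = lang_code.lower().split("-")[0]  # drop any region subtag
    if base in TIER1:
        return {"tier": 1, "min_accuracy": 0.95}
    # Everything else is treated as Tier 2 here; this sketch cannot tell
    # whether a given code is among the 105 supported standard languages.
    return {"tier": 2, "min_accuracy": 0.85}

print(tier_for("pt-BR"))
print(tier_for("hi"))
```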
Regional Dialect Support
- Arabic: MSA + Gulf, Levantine, Egyptian, Maghrebi, Arabizi
- Spanish: Spain, Mexico, Argentina, Colombia variants
- French: France, Quebec, Belgium, Switzerland, African
- Chinese: Simplified, Traditional, Cantonese romanization
All processing complies with our Privacy Policy and Terms of Service.