sarih (صريح)

Scan Arabic text for toxicity, hate speech, and spam. Dialect-aware. Fully offline.

13 filters · 3 severity levels · fully offline — Arabic-first content moderation for social apps, chatbots, and UGC platforms.

$ pip install sarih

Chapter I

The Problem

Arabic platforms need Arabic moderation.

English-First Tools
Existing moderation APIs are built for English. They miss Arabic slang, dialect-specific insults, and culturally specific toxicity.
Dialect Blindness
An insult in Egyptian Arabic looks nothing like the same insult in Gulf Arabic. MSA-only tools miss dialect-specific content entirely.
Cloud Dependency
Sending user content to third-party APIs raises privacy concerns and adds latency, cost, and a single point of failure.
Data Privacy
Sensitive user content leaving your infrastructure. Compliance nightmares. sarih runs entirely on your machine.

Chapter II

13 Filters

Every type of toxic content. Caught.

Chapter III

Severity Levels

Three levels. Clear action for each.

Level	Meaning
BLOCK	Must remove. Clearly toxic, dangerous, or illegal content.
FLAG	Needs human review. Likely problematic but context-dependent.
REVIEW	Soft signal. May be fine, but worth a second look.
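
One way to wire these levels into an application is a small lookup from severity to action. The sketch below is illustrative only: the action names (`remove`, `queue_for_review`, `log_only`, `allow`) are hypothetical and not part of sarih, and it assumes the severity is available as one of the names `BLOCK`, `FLAG`, or `REVIEW`.

```python
# Hypothetical mapping from a severity level to an application-side action.
# The action names are illustrative and are not part of sarih.
ACTIONS = {
    "BLOCK": "remove",            # must remove
    "FLAG": "queue_for_review",   # needs human review
    "REVIEW": "log_only",         # soft signal, worth a second look
}


def action_for(severity):
    """Return the action for a severity name; anything else is allowed through."""
    if severity is None:
        return "allow"
    return ACTIONS.get(str(severity).upper(), "allow")


if __name__ == "__main__":
    for level in ("BLOCK", "FLAG", "REVIEW", None):
        print(level, "->", action_for(level))
```

Keeping the mapping in one table means the policy for each level can change without touching the code that calls the moderator.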

Chapter IV

Dialect Aware

Five dialects. One tool.

Dialect	Arabic
MSA	فصحى
Egyptian	مصري
Gulf	خليجي
Levantine	شامي
Moroccan	مغربي

Chapter V

Commands

6 commands. Zero config.

scan
Scan a JSONL file for toxic content
check
Check a single text string
pipe
Read from stdin for pipelines
stats
Moderation statistics by filter and severity
clean
Remove or redact flagged content
explain
Describe all filters and severity levels

Chapter VI

Get Started

# Install
$ pip install sarih

# Check a single text
$ sarih check "text to moderate"

# Scan a dataset
$ sarih scan data.jsonl

# Clean a dataset
$ sarih clean data.jsonl --output clean.jsonl

# View statistics
$ sarih stats data.jsonl

# Pipe from stdin
$ cat texts.jsonl | sarih pipe

# Learn about filters
$ sarih explain
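
The `scan`, `clean`, and `stats` commands above take a JSONL file. A minimal sketch of producing one from Python follows. It assumes each line is a JSON object with a `text` field; that field name is an assumption, since the schema is not specified here, and `write_jsonl` is a hypothetical helper rather than part of sarih.

```python
import json


def write_jsonl(texts, path, field="text"):
    """Write one JSON object per line; `field` is an assumed key name."""
    with open(path, "w", encoding="utf-8") as handle:
        for text in texts:
            # ensure_ascii=False keeps Arabic characters readable in the file
            # instead of escaping them as \uXXXX sequences.
            handle.write(json.dumps({field: text}, ensure_ascii=False) + "\n")


if __name__ == "__main__":
    write_jsonl(["مرحبا", "text to moderate"], "data.jsonl")
```

Writing the file as UTF-8 with `ensure_ascii=False` keeps the Arabic text human-readable when the dataset is inspected by hand.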

Chapter VII

As a Library

Import and moderate in two lines.

from sarih import moderate

result = moderate("text to check")
print(result.severity)  # BLOCK, FLAG, or REVIEW
print(result.filters)   # list of triggered filters
print(result.dialect)   # detected dialect
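
For more than one text, the same call can be used to sort a batch into buckets by severity. The sketch below is a rough illustration under stated assumptions: `triage` and `fake_moderate` are hypothetical names, the moderator is passed in as a parameter so the example runs without the package installed, and it assumes that clean text yields a missing or `None` severity, which is not confirmed here.

```python
from collections import defaultdict
from types import SimpleNamespace


def triage(texts, moderate):
    """Group texts by the severity reported by `moderate`.

    `moderate` is any callable returning an object with a `severity`
    attribute; in real use this would be `sarih.moderate`.
    """
    buckets = defaultdict(list)
    for text in texts:
        result = moderate(text)
        # Assumption: severity may be missing or None for clean text.
        severity = getattr(result, "severity", None)
        # Assumption: str(severity) gives the level name; adjust if it is an enum.
        key = str(severity) if severity is not None else "CLEAN"
        buckets[key].append(text)
    return dict(buckets)


def fake_moderate(text):
    """Stand-in for sarih.moderate so the sketch runs without the package."""
    severity = "BLOCK" if "bad" in text else None
    return SimpleNamespace(severity=severity, filters=[], dialect=None)


if __name__ == "__main__":
    print(triage(["hello", "bad word"], fake_moderate))
```

Passing the moderator in as an argument also makes the routing logic easy to unit test with a stub like `fake_moderate`.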