artok عرتوك

See how much more Arabic costs across 18 LLM tokenizers. Benchmark scores, diacritics analysis, dialect comparison.

On Claude, the same sentence takes 5x as many tokens in Arabic as in English. Arabic speakers are paying a hidden tax on every API call.
$ pip install artok

Chapter I
The Problem
Same meaning. Different cost.
Arabic
الذكاء الاصطناعي يغير العالم
25
tokens (Claude)
vs
English
AI is changing the world
5
tokens (Claude)
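The gap above can be expressed as a single ratio. The following is a minimal sketch, not artok's own code: the helper name `token_tax` is hypothetical, and the token counts are simply the two figures quoted above for Claude.

```python
def token_tax(arabic_tokens: int, english_tokens: int) -> float:
    """Return how many times as many tokens the Arabic text needs.

    Hypothetical helper for illustration; not part of artok's API.
    """
    if english_tokens <= 0:
        raise ValueError("english_tokens must be positive")
    return arabic_tokens / english_tokens


# Token counts quoted above for the same sentence on Claude.
ratio = token_tax(arabic_tokens=25, english_tokens=5)
print(f"{ratio:.1f}x")  # prints "5.0x"
```

Because API pricing is per token, this ratio carries over directly to the input cost of the request.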

Chapter II
See It
One command. Full picture.
[Figure: artok terminal demo]

Chapter III
The Benchmark
Arabic friendliness score across 8 text categories: news, poetry, Quran, technical, conversational, Egyptian dialect, Gulf dialect, social media.
Tokenizer           Score /100
Mistral Large 3     92.1
Qwen 3.5            91.7
GPT-4.1             91.1
Gemini 2.5 Pro      90.2
Llama 4             83.6
DeepSeek V3.2       83.8
Jamba 1.5           75.3
Grok 2              50.3
GPT-4 (legacy)      44.5
Claude Sonnet 4.6   25.6

Chapter IV
Features
22 flags. Everything you need.

Chapter V
18 Tokenizers
10 providers. Pricing auto-updates via artok --update.
Tokenizer           Provider    Input $/1M   Output $/1M
GPT-4.1             OpenAI      $2.00        $8.00
GPT-4.1 mini        OpenAI      $0.40        $1.60
GPT-4.1 nano        OpenAI      $0.10        $0.40
GPT-4o              OpenAI      $2.50        $10.00
GPT-4o mini         OpenAI      $0.15        $0.60
Claude Opus 4.6     Anthropic   $5.00        $25.00
Claude Sonnet 4.6   Anthropic   $3.00        $15.00
Claude Haiku 4.5    Anthropic   $1.00        $5.00
Llama 4             Meta        $0.18        $0.18
Qwen 3.5            Alibaba     $0.10        $0.40
Mistral Large 3     Mistral     $0.50        $1.50
Mistral Small       Mistral     $0.10        $0.30
Gemini 2.5 Pro      Google      $1.25        $10.00
Gemini 3 Flash      Google      $0.50        $3.00
DeepSeek V3.2       DeepSeek    $0.27        $1.10
Grok 2              xAI         $2.00        $10.00
Command R+          Cohere      $2.50        $10.00
Jamba 1.5           AI21        $0.20        $0.40
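To turn these prices into a concrete bill, here is a rough sketch of the arithmetic. It is an illustration only: the function `input_cost_usd` and the one-million-request volume are assumptions, while the $3.00 per 1M input price for Claude Sonnet 4.6 and the 25 vs 5 token counts come from this page.

```python
def input_cost_usd(tokens: int, price_per_million: float) -> float:
    """Input cost in USD for `tokens` tokens at a given $/1M price.

    Hypothetical helper for illustration; not part of artok's API.
    """
    return tokens * price_per_million / 1_000_000


SONNET_INPUT_PRICE = 3.00   # $/1M input tokens, from the table above
REQUESTS = 1_000_000        # assumed volume, chosen only for the example

# Same sentence, sent one million times (token counts from Chapter I).
arabic_cost = input_cost_usd(25 * REQUESTS, SONNET_INPUT_PRICE)
english_cost = input_cost_usd(5 * REQUESTS, SONNET_INPUT_PRICE)

print(f"Arabic:  ${arabic_cost:.2f}")   # prints "Arabic:  $75.00"
print(f"English: ${english_cost:.2f}")  # prints "English: $15.00"
```

The sketch covers input tokens only. Output tokens are billed at the separate output rate and would need their own counts.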

Chapter VI
Get Started
# Install
$ git clone https://github.com/Moshe-ship/artok.git
$ cd artok && pip install -e ".[all]"

# See the tax
$ artok "الذكاء الاصطناعي يغير العالم" -e "AI is changing the world"

# Benchmark
$ artok --benchmark

# Dialects
$ artok --dialects

# Tashkeel
$ artok "بِسْمِ اللَّهِ الرَّحْمَنِ الرَّحِيمِ" --tashkeel

# Stay current
$ artok --update
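
For a sense of what a tashkeel (diacritics) analysis has to work with, the sketch below separates diacritic marks from base letters using only Python's standard library. This is not artok's implementation, and the page does not specify what `--tashkeel` reports. The function name `split_tashkeel` is hypothetical, and the approach assumes that treating every Unicode combining mark as a diacritic is good enough for illustration.

```python
import unicodedata


def split_tashkeel(text: str) -> tuple[str, int]:
    """Return (text without combining marks, number of marks removed).

    Arabic harakat such as fatha, kasra, shadda, and sukun have a
    non-zero Unicode combining class, so unicodedata.combining()
    flags them. Hypothetical helper; not part of artok's API.
    """
    base = "".join(ch for ch in text if not unicodedata.combining(ch))
    return base, len(text) - len(base)


vocalized = "بِسْمِ اللَّهِ الرَّحْمَنِ الرَّحِيمِ"
plain, mark_count = split_tashkeel(vocalized)

print(plain)
print(f"{mark_count} diacritic code points out of {len(vocalized)} total")
```

Each mark is its own code point, so fully vocalized text gives a tokenizer noticeably more input than the same words written without diacritics.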