
RAG Text Chunker (split for Embeddings)

Split long text into chunks for embedding and RAG pipelines. Four strategies (characters / tokens / paragraphs / Markdown headings), overlap control, per-chunk visualization, and JSON / JSONL / Markdown export.

100% Free · No signup · Browser-only · Instant download · 5 languages · Dark mode
Related: 💰 Embeddings Pricing · 🪙 LLM Token Counter

❓ FAQ

Recommended chunk size?
For embedding models: 256-512 tokens (e.g. OpenAI text-embedding-3-small) or 512-1024 tokens (Voyage, Cohere). A chunk should be big enough to preserve meaning.
How much overlap?
10-20% of the chunk size is standard. Overlap prevents context loss at chunk boundaries.
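
To make the overlap rule concrete, here is a minimal Python sketch of fixed-size character chunking with overlap. It is not this tool's actual implementation; the function name `chunk_text` and its default values (500 characters, 75 characters of overlap, i.e. 15%) are assumptions chosen for illustration.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 75) -> list[str]:
    """Split text into fixed-size character chunks that overlap.

    Hypothetical helper for illustration; sizes are in characters, not tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text,
        # so no trailing chunk is emitted that only repeats the overlap.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for i, chunk in enumerate(chunk_text("abcdefghij", chunk_size=4, overlap=1)):
        print(i, repr(chunk))
```

With `chunk_size=4` and `overlap=1`, each chunk starts with the last character of the previous one. For token-based chunking the same windowing applies to a list of token IDs instead of characters.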
Why heading strategy?
Splitting on headings produces semantically coherent chunks, which often gives higher retrieval precision than fixed-size splitting (especially for docs, code, and FAQs).
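
As a rough illustration of the heading strategy, the sketch below splits a Markdown document at ATX headings (`#` through `######`). It is an assumption about how such a splitter could work, not this tool's code; `split_by_headings` is a hypothetical name, and the sketch does not handle `#` lines inside fenced code blocks or Setext-style headings.

```python
import re

# An ATX heading: 1-6 '#' characters at line start, then spaces/tabs, then text.
HEADING = re.compile(r"^#{1,6}[ \t]+.*$", re.MULTILINE)


def split_by_headings(markdown: str) -> list[dict]:
    """Return one section per heading, each with its heading and full text.

    Hypothetical helper for illustration only.
    """
    matches = list(HEADING.finditer(markdown))
    sections = []

    # Text before the first heading becomes a section with no heading.
    first = matches[0].start() if matches else len(markdown)
    preamble = markdown[:first].strip()
    if preamble:
        sections.append({"heading": None, "text": preamble})

    for i, match in enumerate(matches):
        # A section runs from its heading to the start of the next heading.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown)
        sections.append({
            "heading": match.group(0).lstrip("#").strip(),
            "text": markdown[match.start():end].strip(),
        })
    return sections


if __name__ == "__main__":
    doc = "intro\n\n# A\nbody a\n\n## B\nbody b\n"
    for section in split_by_headings(doc):
        print(section)
```

Sections that come out longer than the target chunk size can then be split further with a fixed-size strategy, so headings set the semantic boundaries and the size limit still holds.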