
RAG Text Chunker (split for Embeddings)

Divide long-form text into chunks optimized for Embedding / RAG ingestion. Supports 4 strategies: fixed character count, fixed token count, paragraph-based, and Markdown heading-based chunking; includes overlap adjustment, chunk visualization, and JSON / JSONL / Markdown export options.
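As a rough illustration of the paragraph-based strategy and the JSONL export, here is a minimal Python sketch. It is not the tool's actual implementation: the function names `chunk_by_paragraph` and `to_jsonl`, the `max_chars` parameter, and the `id` / `text` record fields are all assumptions made for this example.

```python
import json
import re


def chunk_by_paragraph(text: str, max_chars: int = 1000) -> list[str]:
    """Split on blank lines, then pack consecutive paragraphs into chunks.

    Hypothetical helper: a single paragraph longer than max_chars is kept
    whole rather than being cut in the middle.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would exceed the budget: close the chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def to_jsonl(chunks: list[str]) -> str:
    """Serialize chunks as JSON Lines, one object per line (assumed schema)."""
    return "\n".join(
        json.dumps({"id": i, "text": chunk}, ensure_ascii=False)
        for i, chunk in enumerate(chunks)
    )
```

Calling `to_jsonl(chunk_by_paragraph(document))` would yield one JSON object per chunk, which most embedding pipelines can read line by line.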

100% free · No signup · Browser-only · Instant download · 5 languages · Dark mode

❓ Frequently Asked Questions

Recommended chunk size?
For embedding models, 256-512 tokens is a common range for OpenAI text-embedding-3-small, and 512-1024 tokens for Voyage and Cohere models. A chunk should be big enough to preserve meaning.
How much overlap?
10-20% of the chunk size is standard. Overlap prevents context loss at chunk boundaries.
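To make the overlap rule concrete, the following Python sketch applies a sliding window whose step is the chunk size minus the overlap. It is only an illustration under stated assumptions: `chunk_fixed`, `chunk_size`, and `overlap_ratio` are hypothetical names, and sizes are counted in characters here, whereas a token-based variant would need a tokenizer.

```python
def chunk_fixed(text: str, chunk_size: int = 500, overlap_ratio: float = 0.15) -> list[str]:
    """Fixed-size chunking with overlap, measured in characters (assumption)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap_ratio < 1:
        raise ValueError("overlap_ratio must be in [0, 1)")

    overlap = int(chunk_size * overlap_ratio)  # e.g. 15% of 500 -> 75 characters
    step = chunk_size - overlap                # always >= 1 because ratio < 1

    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            # The window already reached the end; avoid a redundant tail chunk.
            break
    return chunks
```

With `chunk_size=10` and `overlap_ratio=0.2`, each chunk repeats the last 2 characters of the previous one, so a sentence cut at a boundary still appears intact in at least one chunk if it is short enough.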
Why heading strategy?
Heading-based chunks follow the document's semantic structure, which often gives higher retrieval precision than fixed-size chunks, especially for documentation, code, and FAQs.
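A heading-based split can be sketched in a few lines of Python. This is a simplified illustration, not the tool's own logic: `chunk_by_heading` and the `heading` / `text` keys are assumed names, only ATX-style headings (`#` to `######`) are recognized, and `#` lines inside fenced code blocks are not treated specially.

```python
import re

# ATX heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_by_heading(markdown: str) -> list[dict]:
    """Start a new chunk at every Markdown heading (simplified sketch)."""
    sections: list[dict] = []
    current = {"heading": None, "lines": []}

    def flush() -> None:
        # Keep a section if it has a heading or any non-blank content.
        if current["heading"] is not None or any(l.strip() for l in current["lines"]):
            sections.append(current)

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            current = {"heading": match.group(2).strip(), "lines": [line]}
        else:
            current["lines"].append(line)
    flush()

    return [
        {"heading": s["heading"], "text": "\n".join(s["lines"]).strip()}
        for s in sections
    ]
```

Text that appears before the first heading is returned as a chunk whose `heading` is `None`, so no content is dropped.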