Cerebrate File Documentation

Break large files into manageable pieces, preserve context, and process them with Cerebras AI.


Overview

Cerebrate File is a command-line tool for processing large documents through the Cerebras AI API. It splits each file into chunks that fit within the model’s context window and carries context from earlier chunks forward, so each request sees what came before.

Key Features

  • Smart chunking: Automatically break large documents into smaller parts
  • Context overlap: Keep snippets from previous chunks to maintain continuity
  • Directory support: Recursively process folders using glob patterns
  • Parallel execution: Handle multiple files at once with threading
  • Terminal UI: Clean progress output that updates in real time
  • Retry logic: Handle rate limits and temporary errors without manual intervention
  • Format flexibility: Works with text, markdown, code, and semantic content
  • Configurable behavior: Extensive CLI options for tuning how files are processed
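
To make the chunking and context-overlap features concrete, here is a minimal sketch of overlapping chunks. It is not the tool's actual implementation: the function name `chunk_with_overlap` is hypothetical, and whitespace-separated words stand in for model tokens, which a real tokenizer would count differently.

```python
def chunk_with_overlap(text, chunk_size, overlap):
    """Split text into word chunks; each chunk repeats the last
    `overlap` words of the previous chunk to preserve continuity.

    Assumption: words approximate tokens. A real tool would count
    tokens with the model's tokenizer instead.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for piece in chunk_with_overlap("a b c d e f g h i j", chunk_size=4, overlap=1):
        print(piece)
```

With a chunk size of 4 and an overlap of 1, every chunk after the first begins with the final word of the chunk before it, which is the continuity the overlap feature is meant to provide.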

Getting Started

Installation

Install with pip or uv:

# Using pip
pip install cerebrate-file

# Using uv (faster)
uv pip install cerebrate-file

Quick Start

  1. Set your Cerebras API key:
    export CEREBRAS_API_KEY="csk-..."
    
  2. Process a single file:
    cerebrate-file document.md --output processed.md
    
  3. Process all markdown files in a directory tree:
    cerebrate-file . --output ./output --recurse "**/*.md"
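
The `**/*.md` pattern in step 3 is a recursive glob. As a rough illustration of which files such a pattern selects, the sketch below uses Python's standard `pathlib`. The helper name `find_matching_files` is hypothetical, and the tool's own matching may differ in details such as hidden files or symlinks.

```python
from pathlib import Path


def find_matching_files(root, pattern):
    """Return the files under `root` that match a glob pattern, sorted.

    With pathlib, `**` matches the root directory itself and any depth
    of subdirectories, so `**/*.md` finds markdown files at every level.
    """
    return sorted(p for p in Path(root).glob(pattern) if p.is_file())


if __name__ == "__main__":
    for path in find_matching_files(".", "**/*.md"):
        print(path)
```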
    

Use Cases

Use Cerebrate File when you need to:

  • Rewrite, summarize, or translate large documents
  • Refactor code across an entire project
  • Generate new versions or expansions of existing content
  • Apply consistent transformations to many files at once
  • Clean, format, or analyze large text datasets

Model Details

The tool uses the Qwen-3 Coder 480B model, hosted by Cerebras:

  • Context window: 131,072 tokens
  • Speed: ~570 tokens/second
  • Specialty: Good at both code and natural language
  • Rate limits:
    • 30 requests per minute
    • 1,000 requests per day
    • 10 million tokens per minute
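
The request limits above translate into simple pacing arithmetic: 30 requests per minute means at least 2 seconds between requests on average. The sketch below works that out and adds a capped exponential backoff schedule, a common way to retry after a rate-limit error. The function names and the backoff parameters are assumptions for illustration; the source does not describe the tool's actual retry strategy.

```python
REQUESTS_PER_MINUTE = 30   # from the rate limits listed above
REQUESTS_PER_DAY = 1_000   # from the rate limits listed above


def min_interval_seconds(requests_per_minute=REQUESTS_PER_MINUTE):
    """Average spacing between requests that stays under the per-minute limit."""
    return 60.0 / requests_per_minute


def fits_daily_quota(num_chunks, requests_per_day=REQUESTS_PER_DAY):
    """True if sending one request per chunk stays within the daily limit."""
    return num_chunks <= requests_per_day


def backoff_delays(retries, base=2.0, cap=60.0):
    """Capped exponential backoff: base, 2*base, 4*base, ... up to `cap` seconds.

    Assumption: `base` and `cap` are illustrative values, not the tool's settings.
    """
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


if __name__ == "__main__":
    print(min_interval_seconds())
    print(fits_daily_quota(1_200))
    print(backoff_delays(6))
```

Since each chunk costs one request in this model, a job that splits into more than 1,000 chunks would need to span more than one day under the daily limit.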

System Requirements

  • Python 3.9+
  • Minimum 4 GB RAM (8 GB recommended for large files)
  • Internet connection
  • Valid Cerebras API key

License

Licensed under Apache 2.0. See LICENSE for details.


Copyright © 2024-2025 Adam Twardoch. Distributed under the Apache 2.0 license.