Level: Beginner | Tags: ai, llm, cli, typescript, nodejs, code-analysis, developer-tools, git

Repomix - Pack repositories into AI-friendly files

A powerful tool that packs entire repositories into single, AI-optimized files for feeding to Large Language Models.

  1. Step 1

    Overview

    Repomix packs your entire repository into a single, AI-friendly file. It is designed for feeding codebases to Large Language Models (LLMs) such as Claude, ChatGPT, DeepSeek, Perplexity, Gemini, Llama, and Grok for analysis, refactoring, documentation, or code review. It formats code for LLM comprehension, counts tokens, respects .gitignore files, and runs a security scan to flag sensitive information before you share the output.

  2. Step 2

    Technology Stack

    Repomix is built with modern TypeScript and Node.js technologies:

    Language: TypeScript (93.1%), Vue (4.7%), JavaScript (1.1%)
    License: MIT
    Stars: ~25,600
    Owner: yamadashy
    Repo: https://github.com/yamadashy/repomix
    Version: 1.14.0
    Node.js: >=22.0.0
    
    Core Dependencies:
    - commander - CLI framework
    - @clack/prompts - Interactive CLI prompts
    - gpt-tokenizer - Token counting for LLMs
    - @modelcontextprotocol/sdk - MCP server integration
    - globby - File pattern matching
    - iconv-lite - Character encoding conversion
    - jschardet - Character encoding detection
    - valibot & zod - Schema validation
    - secretlint - Security scanning
    
    Dev Tools:
    - vitest - Testing framework
    - biome - Fast linter and formatter
    - typescript - Type checking
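
    The Node.js requirement listed above (>=22.0.0) can be checked up front, for example in a setup script. The following is a minimal sketch; meetsMinimumNode is a hypothetical helper and not part of Repomix.

```typescript
// Hypothetical helper (not part of Repomix): checks a Node.js version string
// against the ">=22.0.0" requirement listed in this guide.
function meetsMinimumNode(version: string, minimumMajor: number = 22): boolean {
  // Accept both "v22.3.0" (as printed by `node --version`) and "22.3.0".
  const match = /^v?(\d+)\.\d+\.\d+/.exec(version.trim());
  if (match === null) {
    throw new Error(`Unrecognized Node.js version: ${version}`);
  }
  // The minimum is x.0.0, so comparing the major version is sufficient.
  return Number(match[1]) >= minimumMajor;
}

console.log(meetsMinimumNode("v22.3.0")); // true
console.log(meetsMinimumNode("20.11.1")); // false
```

    In a real script the version string would typically come from process.versions.node.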
  3. Step 3

    Installation - NPX (Recommended)

    The easiest way to use Repomix without installing globally:

    # Run without installation
    npx repomix@latest
    
    # Pack a specific directory
    npx repomix@latest path/to/directory
    
    # Process a remote repository
    npx repomix@latest --remote yamadashy/repomix
  4. Step 4

    Installation - Global Package

    Install Repomix globally for permanent CLI access:

    # NPM
    npm install -g repomix
    
    # Yarn
    yarn global add repomix
    
    # Bun
    bun add -g repomix
    
    # Verify installation
    repomix --version
  5. Step 5

    Installation - Homebrew

    For macOS and Linux users, install via Homebrew:

    # Install
    brew install repomix
    
    # Verify
    repomix --version
  6. Step 6

    Installation - Docker

    Run Repomix in a Docker container without local installation:

    # Run in current directory (--rm removes the container after each run)
    docker run -v .:/app -it --rm ghcr.io/yamadashy/repomix
    
    # Create an alias for convenience
    alias repomix='docker run -v .:/app -it --rm ghcr.io/yamadashy/repomix'
    
    # Use the alias
    repomix --version
  7. Step 7

    Basic Usage

    Pack your repository with a single command. By default, Repomix creates an XML-formatted output file optimized for AI consumption:

    # Pack current directory (creates repomix-output.xml)
    repomix
    
    # Pack specific directory
    repomix path/to/directory
    
    # Pack and copy to clipboard
    repomix --copy
    
    # Pack to specific output file
    repomix --output my-codebase.xml
  8. Step 8

    Output Formats

    Repomix supports multiple output formats optimized for different AI tools and workflows:

    # XML format (default, best for Claude)
    repomix --style xml
    
    # Markdown format (human-readable)
    repomix --style markdown
    
    # Plain text format
    repomix --style plain
    
    # JSON format (programmatic use)
    repomix --style json
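
    When a script needs to know which file a run will produce, the style-to-filename mapping can be written down explicitly. This is a small sketch based on the default names listed under File Locations in this guide; expectedOutputFile is a hypothetical helper, not a Repomix API.

```typescript
// Style names are the values accepted by --style in this guide.
type OutputStyle = "xml" | "markdown" | "plain" | "json";

// Default file names as listed under "File Locations" in this guide.
const DEFAULT_OUTPUT_FILES: Record<OutputStyle, string> = {
  xml: "repomix-output.xml",
  markdown: "repomix-output.md",
  plain: "repomix-output.txt",
  json: "repomix-output.json",
};

// Hypothetical helper: an explicit --output path wins over the default name.
function expectedOutputFile(style: OutputStyle = "xml", explicitPath?: string): string {
  return explicitPath ?? DEFAULT_OUTPUT_FILES[style];
}

console.log(expectedOutputFile("markdown")); // repomix-output.md
```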
  9. Step 9

    File Filtering

    Control which files to include or exclude using glob patterns. Repomix automatically respects .gitignore files:

    # Include specific patterns
    repomix --include "src/**/*.ts,**/*.md"
    
    # Exclude specific patterns
    repomix --ignore "**/*.log,tmp/,*.test.ts"
    
    # Combine include and exclude
    repomix --include "src/**/*.ts" --ignore "**/*.test.ts"
    
    # Create .repomixignore file for project-specific exclusions
    echo "dist/" >> .repomixignore
    echo "node_modules/" >> .repomixignore
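
    To make the include/ignore interaction concrete, here is a simplified sketch of how comma-separated patterns could be combined: a file is kept when it matches at least one include pattern and no ignore pattern. The matcher supports only a small glob subset and is not Repomix's implementation (the dependency list above names globby for that); globToRegExp and selectFiles are illustrative names.

```typescript
// Convert a simple glob into a RegExp. Supports "**", "*", "?" and a trailing "/"
// (everything under a directory). Illustrative subset only.
function globToRegExp(glob: string): RegExp {
  const pattern = glob.endsWith("/") ? `${glob}**` : glob;
  let source = "";
  let i = 0;
  while (i < pattern.length) {
    if (pattern.startsWith("**/", i)) {
      source += "(?:.*/)?"; // zero or more leading directories
      i += 3;
    } else if (pattern.startsWith("**", i)) {
      source += ".*"; // anything, including "/"
      i += 2;
    } else if (pattern.charAt(i) === "*") {
      source += "[^/]*"; // anything within one path segment
      i += 1;
    } else if (pattern.charAt(i) === "?") {
      source += "[^/]"; // exactly one character within a segment
      i += 1;
    } else {
      // Escape regex metacharacters so they match literally.
      source += pattern.charAt(i).replace(/[.+^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp(`^${source}$`);
}

// Split a comma-separated pattern list, as passed to --include / --ignore.
function splitPatterns(list: string): string[] {
  return list.split(",").map((p) => p.trim()).filter((p) => p.length > 0);
}

// Keep files that match some include pattern (or all files if none given)
// and match no ignore pattern.
function selectFiles(files: string[], include: string, ignore: string): string[] {
  const includes = splitPatterns(include).map(globToRegExp);
  const ignores = splitPatterns(ignore).map(globToRegExp);
  return files.filter((file) => {
    const included = includes.length === 0 || includes.some((re) => re.test(file));
    const ignored = ignores.some((re) => re.test(file));
    return included && !ignored;
  });
}
```

    Note that in this sketch a bare pattern such as *.test.ts only matches at the top level; real glob libraries and gitignore-style matching may treat it differently.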
  10. Step 10

    Remote Repository Processing

    Process GitHub repositories without cloning them locally:

    # Process a public GitHub repository
    repomix --remote yamadashy/repomix
    
    # Process specific branch
    repomix --remote yamadashy/repomix --remote-branch main
    
    # Process with remote config (use cautiously)
    repomix --remote yamadashy/repomix --remote-trust-config
    ⚠ Heads up: Remote repository configs are not loaded by default for security reasons. Only use --remote-trust-config with repositories you trust.
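
    If you wrap --remote in your own tooling, it helps to validate the repository reference before invoking the CLI. The sketch below assumes two input shapes, the owner/repo shorthand used above and a full https://github.com/ URL; parseGitHubRemote is a hypothetical helper and does not reflect how Repomix parses its argument.

```typescript
interface RemoteRepo {
  owner: string;
  repo: string;
}

// Hypothetical helper: accepts "owner/repo" or "https://github.com/owner/repo".
function parseGitHubRemote(input: string): RemoteRepo {
  // Normalize: drop trailing slashes first, then an optional ".git" suffix.
  const trimmed = input.trim().replace(/\/+$/, "").replace(/\.git$/, "");
  const match =
    /^https:\/\/github\.com\/([^/\s]+)\/([^/\s]+)$/.exec(trimmed) ??
    /^([^/\s:]+)\/([^/\s]+)$/.exec(trimmed);
  if (match === null) {
    throw new Error(`Not a recognized GitHub repository reference: ${input}`);
  }
  return { owner: match[1], repo: match[2] };
}

console.log(parseGitHubRemote("yamadashy/repomix"));
// { owner: 'yamadashy', repo: 'repomix' }
```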
  11. Step 11

    Code Compression (Experimental)

    Enable Tree-sitter-based compression to reduce token usage by ~70% while preserving code structure and semantic meaning. This extracts essential signatures and removes implementation details:

    # Enable compression
    repomix --compress
    
    # Combine with other options
    repomix --compress --style markdown --output compressed-output.md
    ⚠ Heads up: Compression is lossy - it removes implementation details. Use only when code structure matters more than exact implementation.
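
    To illustrate what "keeping signatures and dropping implementation" means, here is a deliberately naive, regex-based stand-in. Repomix is described above as using Tree-sitter, which parses the code properly; this sketch only shows the lossy idea and will miss multi-line signatures, methods, and arrow functions.

```typescript
// Matches the first line of common top-level declarations.
const SIGNATURE_LINE =
  /^\s*(?:export\s+)?(?:default\s+)?(?:async\s+)?(?:function|class|interface|type|enum)\s+\w+/;

// Keep only declaration lines and strip the opening brace of the body.
function extractSignatures(source: string): string[] {
  return source
    .split("\n")
    .filter((line) => SIGNATURE_LINE.test(line))
    .map((line) => line.trim().replace(/\s*\{\s*$/, ""));
}

const sampleSource = [
  "export function add(a: number, b: number): number {",
  "  return a + b;",
  "}",
  "class Stack {",
  "  items: number[] = [];",
  "}",
].join("\n");

console.log(extractSignatures(sampleSource));
// [ 'export function add(a: number, b: number): number', 'class Stack' ]
```

    The function body (return a + b) is gone from the result, which is exactly why compressed output is unsuitable for reviewing implementation details.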
  12. Step 12

    Token Counting

    Repomix provides token counts for each file and the entire repository, useful for staying within LLM context limits:

    # Token counts are included in output by default
    repomix
    
    # The output file includes:
    # - Total token count
    # - Per-file token counts
    # - File and character statistics
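
    For a quick sanity check before pasting output into a model, a character-based estimate is often enough. The sketch below uses the common rule of thumb of roughly four characters per token; this is a heuristic of my choosing, not the tokenizer Repomix uses, so treat the result as approximate. The function names are illustrative.

```typescript
// Rough heuristic: about 4 characters per token. Approximation only.
function estimateTokens(text: string, charsPerToken: number = 4): number {
  return Math.ceil(text.length / charsPerToken);
}

// Check whether the text, plus room reserved for the model's reply,
// fits within a given context limit (all values in tokens).
function fitsContext(text: string, contextLimit: number, reserveForReply: number = 0): boolean {
  return estimateTokens(text) + reserveForReply <= contextLimit;
}

console.log(estimateTokens("abcdefgh")); // 2
console.log(fitsContext("a".repeat(400), 100)); // true
```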
  13. Step 13

    Security Scanning

    Repomix uses Secretlint to detect sensitive information like API keys, tokens, and passwords before packaging:

    # Security check is enabled by default
    repomix
    
    # Disable security check (not recommended)
    repomix --no-security-check
    
    # Security warnings appear in the output
    # Review and remove sensitive data before sharing with AI
    ⚠ Heads up: Always review the output file for sensitive information before sharing with external AI services.
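
    As an extra manual check on the packed output, a few regular expressions can flag obvious credentials. The sketch below is not Secretlint and does not reproduce its rules; the two patterns (an AWS access key ID shape and a PEM private key header) are examples chosen for illustration, and scanForSecrets is a hypothetical helper.

```typescript
interface SecretFinding {
  rule: string;
  line: number; // 1-based line number in the scanned text
}

// Two illustrative patterns; real scanners ship many more rules.
const SECRET_RULES: Array<{ rule: string; pattern: RegExp }> = [
  { rule: "aws-access-key-id", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  { rule: "private-key-block", pattern: /-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----/ },
];

// Report every line that matches one of the rules above.
function scanForSecrets(text: string): SecretFinding[] {
  const findings: SecretFinding[] = [];
  text.split("\n").forEach((content, index) => {
    for (const { rule, pattern } of SECRET_RULES) {
      if (pattern.test(content)) {
        findings.push({ rule, line: index + 1 });
      }
    }
  });
  return findings;
}
```

    A clean result from a check like this does not prove the output is safe to share; it only catches the most recognizable formats.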
  14. Step 14

    Configuration File

    Create a repomix.config.json file for project-specific settings. TypeScript (.ts) and JavaScript (.js) config files are also supported:

    {
      "input": {
        "maxFileSize": 52428800
      },
      "output": {
        "filePath": "repomix-output.xml",
        "style": "xml",
        "compress": false,
        "splitOutput": null,
        "removeComments": false,
        "removeEmptyLines": false,
        "topFilesLength": 5,
        "showLineNumbers": false,
        "copyToClipboard": false,
        "git": {
          "includeLogs": false,
          "includeLogsCount": 50,
          "includeDiffs": false
        }
      },
      "ignore": {
        "useGitignore": true,
        "useDefaultPatterns": true,
        "customPatterns": [
          "*.log",
          "tmp/**",
          "**/*.test.ts"
        ]
      },
      "security": {
        "enableSecurityCheck": true
      },
      "include": ["src/**/*", "**/*.md"]
    }
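
    A project config usually sets only a few keys and relies on defaults for the rest. The sketch below shows one way such layering can work, using a small subset of the fields from the example above. It is an illustration of the idea, not Repomix's actual config loader; PackSettings, PackOverrides, and mergeSettings are hypothetical names.

```typescript
// Subset of the settings shown above; field names mirror the JSON example.
interface PackSettings {
  output: { filePath: string; style: "xml" | "markdown" | "plain" | "json"; compress: boolean };
  ignore: { useGitignore: boolean; customPatterns: string[] };
}

// Every section, and every key within a section, is optional in overrides.
type PackOverrides = { [K in keyof PackSettings]?: Partial<PackSettings[K]> };

const DEFAULT_SETTINGS: PackSettings = {
  output: { filePath: "repomix-output.xml", style: "xml", compress: false },
  ignore: { useGitignore: true, customPatterns: [] },
};

// Overlay each section key by key, so a partial "output" block does not
// wipe out the other output defaults.
function mergeSettings(overrides: PackOverrides): PackSettings {
  return {
    output: { ...DEFAULT_SETTINGS.output, ...overrides.output },
    ignore: { ...DEFAULT_SETTINGS.ignore, ...overrides.ignore },
  };
}

console.log(mergeSettings({ output: { style: "markdown" } }).output);
// { filePath: 'repomix-output.xml', style: 'markdown', compress: false }
```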
  15. Step 15

    Split Large Outputs

    For repositories that produce very large outputs, split them into multiple files to stay within LLM context limits:

    # Split by size (e.g., 1mb, 500kb, 2mb)
    repomix --split-output 1mb
    
    # Creates multiple files:
    # - repomix-output.1.xml
    # - repomix-output.2.xml
    # - etc.
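
    The size argument and the resulting grouping can be reasoned about with a short sketch. It parses values such as 500kb or 1mb into bytes and greedily groups files into parts. Both the 1024-based units and the greedy strategy are assumptions made for illustration; this guide does not state how Repomix computes either. parseSize and planParts are hypothetical helpers.

```typescript
// Parse sizes such as "500kb", "1mb", "1.5mb" into bytes.
// Assumption: 1 kb = 1024 bytes.
function parseSize(size: string): number {
  const match = /^(\d+(?:\.\d+)?)(kb|mb)$/i.exec(size.trim());
  if (match === null) {
    throw new Error(`Invalid size: ${size}`);
  }
  const unit = match[2].toLowerCase() === "kb" ? 1024 : 1024 * 1024;
  return Math.floor(Number(match[1]) * unit);
}

// Greedily group files into parts so that no part exceeds maxBytes,
// unless a single file is itself larger than the limit.
function planParts(files: Array<{ path: string; bytes: number }>, maxBytes: number): string[][] {
  const parts: string[][] = [];
  let current: string[] = [];
  let used = 0;
  for (const file of files) {
    if (current.length > 0 && used + file.bytes > maxBytes) {
      parts.push(current);
      current = [];
      used = 0;
    }
    current.push(file.path);
    used += file.bytes;
  }
  if (current.length > 0) {
    parts.push(current);
  }
  return parts;
}

console.log(parseSize("1mb")); // 1048576
```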
  16. Step 16

    Git History Integration

    Include commit logs and diffs in the output for additional context:

    # Include commit logs (last 50 commits by default)
    repomix --include-logs
    
    # Include a specific number of commits
    repomix --include-logs --include-logs-count 100
    
    # Include diffs (uncommitted work tree and staged changes)
    repomix --include-diffs
    
    # Combine logs and diffs
    repomix --include-logs --include-diffs
  17. Step 17

    MCP Server Integration

    Repomix provides an MCP server for AI assistants to interact directly with repositories. This enables Claude and other MCP-compatible AI tools to pack and analyze codebases:

    # Start the MCP server (normally launched by the AI client, not by hand)
    npx -y repomix --mcp
    
    # Configure in Claude Desktop config (macOS: ~/Library/Application Support/Claude/claude_desktop_config.json):
    # {
    #   "mcpServers": {
    #     "repomix": {
    #       "command": "npx",
    #       "args": ["-y", "repomix", "--mcp"]
    #     }
    #   }
    # }
    
    # Available MCP tools include:
    # - pack_codebase: Pack local repositories
    # - pack_remote_repository: Pack GitHub repos
    # - grep_repomix_output: Search through packed outputs
    # - read_repomix_output: Read a packed output file
    # - file_system_read_file: Read files with security scanning
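
    If your Claude Desktop config already lists other MCP servers, the Repomix entry should be merged in rather than pasted over the file. The following sketch shows a non-destructive merge; withMcpServer and the two interfaces are hypothetical names, and the shape is taken from the mcpServers snippet above.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
}

// Only the "mcpServers" key is modeled; any other keys are passed through.
interface DesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Return a copy of the config with one server added or replaced,
// leaving all other servers and settings untouched.
function withMcpServer(config: DesktopConfig, name: string, entry: McpServerEntry): DesktopConfig {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), [name]: entry },
  };
}
```

    Pass the command and args from the config snippet above as the entry, then write the result back with JSON.stringify(config, null, 2).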
  18. Step 18

    GitHub Actions Integration

    Automate repository packing in CI/CD pipelines:

    name: Pack Repository
    on: [push]
    
    jobs:
      pack:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          
          - name: Setup Node.js
            uses: actions/setup-node@v4
            with:
              node-version: '22'
          
          - name: Pack repository
            run: |
              npx repomix@latest --output packed-repo.xml
          
          - name: Upload artifact
            uses: actions/upload-artifact@v4
            with:
              name: packed-repository
              path: packed-repo.xml
  19. Step 19

    Library Usage

    Use Repomix programmatically in Node.js applications:

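    import { runCli, type CliOptions } from 'repomix';
    
    // Options mirror the CLI flags (e.g. output = --output, ignore = --ignore)
    const options = {
      output: 'output.xml',
      style: 'xml',
      compress: false,
      ignore: '*.log,tmp/**', // comma-separated, as on the command line
      quiet: true,
    } as CliOptions;
    
    // Pack a repository: directories to pack, working directory, options
    const result = await runCli(['./my-project'], process.cwd(), options);
    
    // packResult holds the summary statistics for the run
    console.log(`Packed ${result.packResult.totalFiles} files`);
    console.log(`Total tokens: ${result.packResult.totalTokens}`);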
  20. Step 20

    Browser Extensions

    Repomix offers Chrome and Firefox extensions for packing repositories directly from GitHub:

    Chrome Extension:
    https://chrome.google.com/webstore (search for "Repomix")
    
    Firefox Add-on:
    https://addons.mozilla.org (search for "Repomix")
    
    Usage:
    1. Navigate to any GitHub repository
    2. Click the Repomix extension icon
    3. Choose format and options
    4. Download or copy the packed output
  21. Step 21

    VS Code Extension

    The community-maintained "Repomix Runner" extension integrates Repomix into VS Code:

    Installation:
    1. Open VS Code
    2. Go to Extensions (Cmd+Shift+X / Ctrl+Shift+X)
    3. Search for "Repomix Runner"
    4. Click Install
    
    Usage:
    1. Right-click on a folder in Explorer
    2. Select "Pack with Repomix"
    3. Choose options in the command palette
    4. Output appears in workspace
  22. Step 22

    Web Interface

    Use the online tool at repomix.com for quick packing without installation:

    Website: https://repomix.com
    
    Features:
    - Paste repository URL
    - Choose output format
    - Configure options via UI
    - Download or copy result
    - No installation required
    
    Limitations:
    - Public repositories only
    - May have size limits
    - No custom configuration files
  23. Step 23

    Command Line Reference

    Commonly used repomix flags and options. Run repomix --help to see the complete list at any time.

    Usage: repomix [options] [directory]
    
    Options:
      -v, --version                Display version number
      -o, --output <file>          Output file path
      --style <type>               Output style: xml, markdown, plain, json
      --compress                   Enable Tree-sitter compression (~70% reduction)
      --copy                       Copy output to clipboard
      --split-output <size>        Split output by size (e.g., 1mb, 500kb)
      --include <patterns>         Include file patterns (comma-separated)
      --ignore <patterns>          Ignore file patterns (comma-separated)
      --no-security-check          Disable security scanning
      --remote <repo>              Process remote GitHub repository
      --remote-branch <branch>     Specify remote branch
      --remote-trust-config        Trust remote repository config
      --include-logs               Include commit logs
      --include-logs-count <n>     Number of commits to include (default: 50)
      --include-diffs              Include git diffs
      --remove-comments            Remove code comments
      --remove-empty-lines         Remove empty lines
      --output-show-line-numbers   Show line numbers in output
      -c, --config <file>          Config file path
      --init                       Initialize config file
      -h, --help                   Display help
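
    When calling the CLI from a script, building the argument list from a typed options object avoids shell-quoting problems with glob patterns. The sketch below covers only a handful of the flags listed above; PackOptions and buildRepomixArgs are hypothetical names, not part of Repomix.

```typescript
interface PackOptions {
  directory?: string;
  output?: string;
  style?: "xml" | "markdown" | "plain" | "json";
  compress?: boolean;
  include?: string[];
  ignore?: string[];
}

// Translate options into an argv array, following
// "Usage: repomix [options] [directory]" from the reference above.
function buildRepomixArgs(options: PackOptions): string[] {
  const args: string[] = [];
  if (options.output !== undefined) args.push("--output", options.output);
  if (options.style !== undefined) args.push("--style", options.style);
  if (options.compress === true) args.push("--compress");
  if (options.include !== undefined && options.include.length > 0) {
    args.push("--include", options.include.join(","));
  }
  if (options.ignore !== undefined && options.ignore.length > 0) {
    args.push("--ignore", options.ignore.join(","));
  }
  if (options.directory !== undefined) args.push(options.directory);
  return args;
}

console.log(buildRepomixArgs({ style: "markdown", compress: true, directory: "." }));
// [ '--style', 'markdown', '--compress', '.' ]
```

    The resulting array can be handed to a process launcher that takes arguments as a list (for example Node's child_process.spawn), so patterns like src/**/*.ts are never expanded by a shell.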
  24. Step 24

    Common Use Cases

    Repomix excels at several AI-assisted development workflows:

    1. Code Review: Feed codebase to Claude/ChatGPT for comprehensive review
       repomix --compress --style markdown
    
    2. Documentation Generation: Generate docs from entire codebase
       repomix --include "src/**/*.ts" --style markdown
    
    3. Refactoring Planning: Analyze architecture before major changes
       repomix --compress --include-logs --include-logs-count 100
    
    4. Bug Analysis: Share relevant code with AI for debugging
       repomix --include "src/**/*.ts,**/*.test.ts"
    
    5. Onboarding: Create codebase summaries for new team members
       repomix --style markdown --remove-comments
    
    6. Security Audit: Scan entire repository for issues
       repomix --include-logs --include-diffs
    
    7. Migration Planning: Analyze legacy code before migration
       repomix --compress --include "src/**/*"
    
    8. API Documentation: Generate API docs from source
       repomix --include "**/*.ts" --ignore "**/*.test.ts"
    
    9. Code Search: MCP integration for AI-powered search
       (Use MCP server with Claude Desktop)
    
    10. CI/CD Analysis: Automated repository packing in pipelines
        (Use GitHub Actions integration)
  25. Step 25

    Best Practices

    Tips for getting the most out of Repomix:

    1. Start with compression for large repositories
       - Reduces tokens by ~70%
       - Preserves structure for AI understanding
    
    2. Use .repomixignore for consistent exclusions
       - Version control your ignore patterns
       - Keep output focused on relevant code
    
    3. Choose the right format for your LLM
       - XML: Best for Claude (default)
       - Markdown: Best for human readability
       - JSON: Best for programmatic processing
    
    4. Monitor token counts
       - Stay within LLM context limits
       - Use --split-output for very large repos
    
    5. Enable security scanning
       - Always review output before sharing
       - Remove sensitive data from repositories
    
    6. Use specific include patterns
       - Focus on relevant code
       - Reduce noise in AI analysis
    
    7. Leverage git history for context
       - Include recent commits for change context
       - Add diffs for specific analysis
    
    8. Automate with GitHub Actions
       - Keep packed outputs up to date
       - Integrate with documentation workflows
    
    9. Use MCP for interactive AI sessions
       - Enable Claude to pack and search dynamically
       - Better for iterative analysis
    
    10. Test with different formats
        - Some LLMs work better with specific formats
        - Markdown is more token-efficient for some models
  26. Step 26

    Troubleshooting

    Common issues and solutions:

    Issue: Output too large for LLM context
    Solution: Use --compress or --split-output 1mb
    
    Issue: Sensitive data in output
    Solution: Review the security warnings, then remove the secret or add the affected file to .repomixignore
    
    Issue: Node.js version error
    Solution: Upgrade to Node.js 22 or later
    
    Issue: Remote repository fails
    Solution: Check that the repository is public and that the owner/repo name or URL is spelled correctly
    
    Issue: Unexpected files included
    Solution: Check .gitignore, add patterns to .repomixignore
    
    Issue: Compression removes needed code
    Solution: Don't use --compress for implementation review
    
    Issue: Character encoding errors
    Solution: Repomix auto-detects, but verify source file encoding
    
    Issue: Token count inaccurate
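    Solution: Counts come from a single tokenizer (gpt-tokenizer, per the dependency list) and are approximate for models that tokenize differently; verify with your LLM's own tokenizer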
    
    Issue: Config file not loaded
    Solution: Ensure repomix.config.json is in the project root, or pass its path with --config <file>
    
    Issue: Performance slow on large repos
    Solution: Use --include to limit scope, exclude node_modules
  27. Step 27

    File Locations

    Default file locations and configuration:

    Configuration:
      repomix.config.json          - Project configuration (JSON)
      repomix.config.ts            - Project configuration (TypeScript)
      repomix.config.js            - Project configuration (JavaScript)
      .repomixignore               - Project-specific ignore patterns
    
    Output:
      repomix-output.xml           - Default XML output
      repomix-output.md            - Markdown output
      repomix-output.txt           - Plain text output
      repomix-output.json          - JSON output
      repomix-output.*.xml         - Split output files (numbered: .1.xml, .2.xml, ...)
    
    Git Integration:
      .gitignore                   - Automatically respected
      .ignore                      - Automatically respected
