Repo Radar
Useful open-source repos worth knowing — what each one does in plain language, with stars and the project's own headline claims. Filter below, or ask the blog to search posts and repos together.
Stop feeding your LLM garbage HTML.
★ 69k unclecode/crawl4ai
Crawl4AI is a web crawler and scraper that converts web content into LLM-ready Markdown, replacing traditional scraping methods.
- drastically more cost-effective than any of the existing solutions
Stop digging through codebases. Query your entire repo like a database.
★ 68k safishamsi/graphify
Graphify converts codebases, documentation, and other files into a queryable knowledge graph, replacing traditional code search.
LiteLLM — 100+ LLMs, one line
★ 51k BerriAI/litellm
LiteLLM is a Python SDK and proxy server that unifies over 100 LLM APIs into an OpenAI-compatible format.
- 100+ LLMs
- Self-hosted
- Call any LLM in OpenAI format
Your LLM context window is burning money.
★ 30k chopratejas/headroom
Headroom is a context compression layer that reduces the token count of inputs before they reach a large language model.
- 60-95% fewer tokens
- 6 algorithms
- reversible
Stop rewriting existing browser code.
★ 28k DietrichGebert/ponytail
Ponytail is a prompt engineering technique that makes AI agents generate simpler, more concise code by leveraging existing browser capabilities.
- 80-94% less code
- 3-6× faster
- 42-75% cheaper
AirLLM — 70B on a 4GB GPU
★ 20k lyogavin/airllm
AirLLM is a tool that optimizes memory for large language model inference, replacing the need for quantization or distillation.
- Runs inference for 70B large language models on a single 4GB GPU card without quantization, distillation, or pruning
- Runs 405B Llama 3.1 on 8GB of VRAM
- Supports CPU inference
Optimize Your AI Agent's Context Window
★ 18k mksglu/context-mode
Context Mode optimizes AI coding agent context windows by replacing raw tool output with sandboxed summaries.
- 98% reduction in context window size from tool output
Stop paying for cloud TTS. Your device can do it faster.
★ 12k supertone-inc/supertonic
Supertonic is a text-to-speech system designed for local inference with minimal overhead. It solves the problem of relying on cloud APIs for TTS, enabling on-device processing without privacy concerns.
- 31-Language Multilingual
- 99M-Parameter Open-Weight Model
- fast enough to turn an entire webpage into audio in under a second
Fit your entire company's docs into RAM.
★ 12k RyanCodrai/turbovec
turbovec is a Rust vector index with Python bindings that replaces FAISS for vector search.
- A 10 million document corpus takes 31 GB of RAM as float32. turbovec fits it in 4 GB
- searches it faster than FAISS
- Hand-written NEON (ARM) and AVX-512BW (x86) kernels beat FAISS IndexPQFastScan by 10–19% on ARM
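The 31 GB and 4 GB figures above are easy to sanity-check with a little arithmetic. The sketch below is only a back-of-envelope estimate: the 768-dimensional embeddings and the roughly 4 bits per dimension after compression are assumptions chosen for illustration, not numbers stated in the claims above, and index overhead is ignored.

```python
# Back-of-envelope check of the memory figures quoted above.
# ASSUMPTIONS (not stated in the source): 768-dimensional embeddings,
# and about 4 bits per dimension after compression.

NUM_VECTORS = 10_000_000
DIM = 768                      # assumed embedding dimension
FLOAT32_BITS = 32
COMPRESSED_BITS_PER_DIM = 4    # assumed compression level


def corpus_gb(num_vectors: int, dim: int, bits_per_dim: float) -> float:
    """Raw vector storage in decimal gigabytes, ignoring index overhead."""
    total_bytes = num_vectors * dim * bits_per_dim / 8
    return total_bytes / 1e9


float32_gb = corpus_gb(NUM_VECTORS, DIM, FLOAT32_BITS)
compressed_gb = corpus_gb(NUM_VECTORS, DIM, COMPRESSED_BITS_PER_DIM)

print(f"float32:    {float32_gb:.2f} GB")
print(f"compressed: {compressed_gb:.2f} GB")
print(f"ratio:      {float32_gb / compressed_gb:.0f}x")
```

Under these assumptions the float32 corpus comes to 30.72 GB and the compressed one to 3.84 GB, which lines up with the quoted "31 GB" and "4 GB" after rounding. A different embedding size or bit width would shift both numbers.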
Your IDE is a context-switching graveyard.
★ 9.5k XiaomiMiMo/MiMo-Code
MiMoCode is a terminal-based AI coding assistant that replaces manual coding and command-line tasks.
- terminal-native AI coding assistant
- persistent memory system to keep a deep understanding of your project across sessions
- continuously improving itself
Semble — code search for agents, 98% fewer tokens
★ 5.2k MinishLab/semble
Semble is a code search library for agents, replacing traditional grep+read methods for code retrieval.
- Uses ~98% fewer tokens than grep+read
- Indexing and searching a full codebase end-to-end takes under a second
- ~200x faster indexing and ~10x faster queries than a code-specialized transformer, at 99% of its retrieval quality
Forget HeyGen for avatar video creation.
★ 4.3k meituan-longcat/LongCat-Video
LongCat-Video is a foundational video generation model, replacing traditional methods for creating video content.
Tired of inefficient vision-language models?
★ 2.6k NVlabs/Eagle
Eagle is a vision-language model that serves as a backbone for other models, replacing prior VLM backbones.
- LocateAnything now supports batch inference with a pure FlashAttention runtime
- efficient inference on A100, RTX 4090, and other non-Hopper/Blackwell GPUs
Cut 70% of your coding tokens now.
★ 2.0k cocoindex-io/cocoindex-code
This project offers an AST-based semantic code search engine, replacing traditional keyword or regex-based code search.
- Instant token saving by 70%.
- 1 min setup — install and go, zero config needed!