Tired of bloated RAG frameworks? Discover Shraga — a minimal, production-ready open-source alternative built by BigData Boutique to simplify and scale GenAI applications without the overhead.
When you’re building a RAG application, one of the first decisions you face is whether to use an existing framework or build your own. Frameworks like LangChain, LangGraph, LlamaIndex and CrewAI are great for getting started — they abstract away a lot of the complexity and let you move fast. But as soon as you want something custom, performant, or production-grade, those abstractions often get in the way.
At BigData Boutique, we’ve worked on quite a few RAG systems — across industries, use cases, and architectures — and we kept running into the same issues. So we built our own open-source framework. Something minimal, composable, and easy to debug.
Better yet, Shraga is designed to get you up and running in no time, and it lets you move almost immediately from a working prototype to something internal users can try, then external users, and finally a real production deployment with analytics over usage, history, token consumption and cost. Today, this is how we run quick and efficient GenAI POCs with customers, and how those POCs often end up running as the final product in production.
In this post, I’ll walk through how we approached it, what worked for us, and why we think it’s worth considering if you’re serious about putting GenAI into production.
What We Actually Needed From a Framework
When we started looking for a framework to commit to, most of the popular RAG frameworks were non-starters for us. They abstracted too much, were too opinionated about how things should be structured, and came with a long list of heavy dependencies. That might be fine for a demo, but not for teams who need to understand what’s going on under the hood — or ship something reliable to production.
We wanted something that could serve us in both prototyping and production—where we could start “quick and dirty” when needed, but evolve into clean, modular, well-typed code as the project matured.
We also wanted to:
- Spin up a project quickly, with minimal boilerplate and few dependencies.
- Support multiple LLM providers, embedders, document retrievers, and external services.
- Have reusable utilities for the parts that always take time: EDA, data cleanup, chunking, embedding, ingestion.
- Support both headless usage via API and a drop-in UI so customers could interact with the system directly.
In short: we didn’t need a framework that does everything — we needed a toolkit that does the right things well.
The Evolution of a RAG Application with Shraga
A typical lifecycle, or evolution, of a RAG application built with Shraga would look something like this:
- Every RAG project starts as a POC: sketches and experiments in Jupyter notebooks, standalone Python scripts, or Shraga itself. Shraga is a Python framework, so you can leverage the rich Python ecosystem, which is also the de-facto standard for data science and GenAI work.
- When your RAG application matures and you are happy with the results and ready to start collecting feedback, Shraga can be used as a testbed to expose it to internal users or design partners. The built-in UI is great at serving your RAG to customers quickly, providing debug and trace information and gathering feedback.
- As your Shraga implementation matures and you get it ready for production, you can build your own UI on top of the API Shraga provides, or simply keep using Shraga’s UI, which has security built in.
- Shraga makes it easy to learn from experience and iterate quickly to improve as your RAG matures. Collect feedback from users, internal or external, sort through issues, improve, run evaluations, and iterate.
- Once your RAG gets attention and usage starts to accumulate, so might the associated costs. Leverage Shraga’s analytics to review good and bad queries, estimate the cost per query, and spot opportunities to optimize cost and performance.
Flows: The Core Building Block
The core abstraction of the Shraga framework is the BaseFlow. It represents a single, modular unit of logic — something that receives input, performs an action (which may or may not involve an LLM), and returns a response. Flows are easy to register, compose, and expose as tools in agent-based systems.
This design allows us to treat everything — from answering a user question, to fetching data, to chaining other flows — as a consistent, testable component.
class BaseFlow(ABC):
    @abstractmethod
    async def execute(self, request: FlowRunRequest) -> FlowResponse:
        """Main entry point for the flow."""
        raise NotImplementedError()

    @staticmethod
    def id() -> str:
        """Unique identifier for this flow."""
        raise NotImplementedError()

    @staticmethod
    def description() -> str:
        """
        Optional. A human-readable description. This can be used by ReAct or
        planning flows as a description of what the flow does.
        """
        return ""

    @staticmethod
    def get_tool_desc():
        """
        Optional. A pydantic schema of the flow's input. Used by ReAct or
        planning flows.
        """
        return None

    async def execute_another_flow_by_id(
        self, flow_id: str, request: FlowRunRequest
    ) -> FlowResponse:
        """Helper method to delegate execution to another flow."""
Here’s a basic example of a flow that takes a user question and calls OpenAI’s chat model to answer it:
from typing import Optional

from shraga_common.core.models import BaseFlow, FlowRunRequest, FlowResponse
from shraga_common.services import LLMService, OpenAIService


class SimpleQAFlow(BaseFlow):
    llmservice: LLMService = None

    def __init__(self, config: ShragaConfig, flows: Optional[dict] = None):
        super().__init__(config, flows)
        self.llmservice = OpenAIService(self.config)

    @staticmethod
    def id() -> str:
        return "simple_qa"

    async def execute(self, request: FlowRunRequest) -> FlowResponse:
        question = request.input.get("question")
        if not question:
            return FlowResponse(error="Missing 'question' field.")

        prompt = f"Answer the user's question: {question}"
        answer = await self.llmservice.invoke_model(
            prompt, {"model_id": "gpt-4o"}
        )

        return FlowResponse(
            response_text=answer,
            payload={"answer": answer},
            trace=self.trace_log,
        )
This flow is minimal by design — but it’s easy to extend with retrieval, memory, prompt templates, or anything else. And since all flows follow the same pattern, they can be composed, reused, or registered in agents and routers without special handling.
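For instance, a second flow can delegate to the one above by its id. Here is a minimal sketch of that pattern, assuming both flows are registered with the app; QAWithGreetingFlow and its wording are made up for illustration:

from shraga_common.core.models import BaseFlow, FlowRunRequest, FlowResponse


class QAWithGreetingFlow(BaseFlow):
    @staticmethod
    def id() -> str:
        return "qa_with_greeting"

    async def execute(self, request: FlowRunRequest) -> FlowResponse:
        # Delegate the actual answering to the flow registered as "simple_qa",
        # using the helper provided by the base class shown above.
        qa = await self.execute_another_flow_by_id("simple_qa", request)

        return FlowResponse(
            response_text=f"Here's what I found:\n{qa.response_text}",
            payload=qa.payload,
        )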
A FastAPI Layer That Does the Boring Stuff
Since all flows follow the same interface, exposing them through an API is straightforward. But real applications need more than just routing — they need configuration, authentication, logging, and tracking baked in from day one.
That’s why we built a thin FastAPI layer that wraps all flows and handles the infrastructure concerns you’d expect in a production system. It lets us go from idea to deploy endpoint in minutes, without rewriting boilerplate every time.
What’s included:
- Environment-aware configuration (app/config)
- User, JWT and OAuth authentication, for simple user access control (app/auth)
- Logging, error handling, analytics and performance stats (app/services)
This layer is intentionally boring — and that’s the point. It gives us a stable foundation for running LLM-powered services while keeping the core logic inside flows clean and focused.
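Because the wrapper is just FastAPI under the hood, you can treat the app it returns like any other FastAPI application. Here is a minimal sketch, assuming setup_app returns the FastAPI instance it builds (the main.py example later in this post runs it with uvicorn as "main:app"); the /healthz route is our own addition for illustration:

from shraga_common.app import setup_app
from shraga_common.flows.demo import flows

# Build the app exactly as in main.py below; the demo flows stand in for real ones.
app = setup_app("config.demo.yaml", flows)

# Assuming the returned object is a plain FastAPI app, custom routes and
# middleware can live alongside whatever Shraga registers for you.
@app.get("/healthz")
async def healthz():
    return {"status": "ok"}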
Ingestion and Embedding: Getting Quality Data In
The API and flow layers make it easy to expose functionality — but what actually makes a RAG system good is the data. Garbage in, garbage out still applies. And for most projects, the bulk of the real work happens before the LLM is ever called: preprocessing, chunking, embedding, and indexing.
That’s why we built two key components to handle ingestion:
DocHandler: Preprocess and Structure Content
DocHandler is responsible for taking raw documents — usually HTML, Markdown, or plain text — and converting them into clean, structured chunks. It handles:
- Stripping unnecessary HTML tags (but keeping structure like lists and tables)
- Normalizing text
- Attaching metadata
- Chunking documents based on semantic or structural cues
Each handler can be customized per document source, and supports multi-language inputs, too.
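To make the chunking step concrete, here is a standalone sketch of the kind of structural chunking a handler might do for Markdown. The Chunk dataclass and chunk_markdown function are our own illustration, not the actual DocHandler API:

import re
from dataclasses import dataclass, field

@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)

def chunk_markdown(doc: str, source: str) -> list[Chunk]:
    # Naive structural chunking: split on Markdown headings and attach
    # source/section metadata to every chunk. Illustrative only.
    chunks, current, section = [], [], "intro"
    for line in doc.splitlines():
        if re.match(r"^#{1,3} ", line):
            if current:
                chunks.append(Chunk("\n".join(current).strip(),
                                    {"source": source, "section": section}))
            section = line.lstrip("# ").strip()
            current = []
        else:
            current.append(line)
    if current:
        chunks.append(Chunk("\n".join(current).strip(),
                            {"source": source, "section": section}))
    return chunks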
BaseEmbedder: Plug-and-Play Embedding
Once content is chunked, we pass it through a BaseEmbedder — an interface that wraps any embedding provider. We currently have implementations for Bedrock, Google and OpenAI.
Key features:
- Embeds text chunks in batches
- Preserves metadata alongside vectors
- Supports caching and optional local storage
- Easy to switch providers or models with a single config change
Together, these components give us a clean ingestion pipeline that can go from raw HTML to fully indexed documents very quickly — and since both are modular, they can be reused across projects or extended for specific formats and verticals.
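As a rough illustration of what such an interface looks like (the names and signatures below are ours, not the real BaseEmbedder), a batched, metadata-preserving wrapper can be as small as this:

from abc import ABC, abstractmethod

class EmbedderSketch(ABC):
    # Illustrative shape of a pluggable embedder -- not the actual BaseEmbedder API.
    def __init__(self, batch_size: int = 32):
        self.batch_size = batch_size

    @abstractmethod
    def embed_batch(self, texts: list[str]) -> list[list[float]]:
        """Call the provider (Bedrock, Google, OpenAI, ...) for one batch of texts."""

    def embed_chunks(self, chunks: list[dict]) -> list[dict]:
        """Embed {"text", "metadata"} chunks in batches, keeping metadata beside each vector."""
        records = []
        for i in range(0, len(chunks), self.batch_size):
            batch = chunks[i:i + self.batch_size]
            vectors = self.embed_batch([c["text"] for c in batch])
            records.extend(
                {"vector": v, "text": c["text"], "metadata": c["metadata"]}
                for c, v in zip(batch, vectors)
            )
        return records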
A Simple but Powerful UI: shraga-ui
Once you’ve got clean data indexed and flows exposed through an API, the next step is making the system usable — ideally without rebuilding a frontend from scratch every time.
That’s where shraga-ui comes in.
It’s a lightweight React component library we built to make it easy to add a full-featured GenAI chat interface to any project. You can drop it into your app with just a few files, point it at your API, and you’re done.
Out of the box, it supports capabilities such as:
- A responsive chat interface
- Source citation display
- Chat history
- Analytics
- Authentication
If your use case needs a UI, it’s ready to go. If not, your API is still fully usable headless — for integrations, mobile apps, or external platforms.
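For headless use, reaching the backend is just an HTTP call. The route and payload below are placeholders we made up for illustration — the actual endpoint names, request schema, and auth flow come from shraga_common’s FastAPI layer, so check your deployment for the real ones:

import requests

resp = requests.post(
    "http://localhost:8000/api/flows/simple_qa/run",  # placeholder route, not the real path
    json={"question": "What does Shraga do?"},        # placeholder payload shape
    headers={"Authorization": "Bearer <token>"},      # JWT/OAuth auth is built in
)
print(resp.json())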
Next, we’ll show how we tie this all together into a complete RAG flow with a working example.
Project Setup Overview
We’ve created a full working example of this framework in shraga-tutorial, which you can clone and run locally.
Backend 🐍
The main entry point for the backend is main.py. This file initializes the FastAPI application, loads the environment configuration, and registers all available flows with the app. While shraga-common includes a couple of default flow implementations, we typically implement our own flows by subclassing BaseFlow, following the standard structure we outlined earlier. This makes it easy to add new functionality or change behavior without needing to touch the rest of the infrastructure.
import os

import uvicorn

from shraga_common.app import get_config, setup_app
from shraga_common.flows.demo import flows

config_path = os.getenv("CONFIG_PATH", "config.demo.yaml")
app = setup_app(config_path, flows)

if __name__ == "__main__":
    uvicorn.run(
        "main:app",
        host="0.0.0.0",
        port=int(os.environ.get("PORT") or get_config("server.port") or 8000),
        reload=bool(os.environ.get("ENABLE_RELOAD", False)),
        log_config=None,
        server_header=False,
    )
For better organization, we usually place custom flows in a dedicated flows/ directory. Each flow is self-contained and can integrate with whatever tools or services it needs: AWS Knowledge Bases, Pinecone, OpenSearch, Elasticsearch, or any LLM provider like Bedrock or OpenAI.
We often write separate flows for different tasks — for example, one flow to handle document retrieval, and another to handle generation. This separation keeps things modular and lets us compose flows together in more complex pipelines or agents when needed.
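Wiring those project-specific flows into the app mirrors the main.py example above. In this sketch, RetrievalFlow and GenerationFlow are hypothetical BaseFlow subclasses living in the project’s flows/ package, and we assume setup_app accepts the same kind of flow collection that shraga_common.flows.demo exports:

import os

from shraga_common.app import setup_app

from flows.retrieval import RetrievalFlow    # hypothetical custom flows
from flows.generation import GenerationFlow

flows = [RetrievalFlow, GenerationFlow]      # assumed shape of the flow collection

config_path = os.getenv("CONFIG_PATH", "config.yaml")
app = setup_app(config_path, flows)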
Frontend ⚛️
The frontend lives in a separate frontend/ directory and is built with React and Vite. We use pnpm as the package manager, so you’ll want to run pnpm install followed by pnpm dev to get it running locally. The frontend uses our published UI package, @shragaai/shraga-ui, which provides a ready-to-use chat component that connects to your backend API.
Bootstrapping the UI is as simple as creating a minimal HTML file and calling:
createRoot(document.getElementById("root")!)
This mounts the default chat interface into your application. If you’d like to customize the chat experience, you can swap in your own component by passing it as a second argument:
createRoot(document.getElementById("root")!, AlternativeChatComponent)
With this setup, you get a full-stack RAG system that’s cleanly structured, easy to debug, and ready to extend — whether you’re working on a one-off demo or building something meant to last.
Summary
With this setup in place, we’re able to move quickly from raw data to a working, production-ready RAG system — without relying on heavyweight abstractions or external orchestration layers. Everything from ingestion to flow logic to UI is modular and easy to extend.
That said, the big frameworks out there — like LangChain or CrewAI — are doing great work, and we do use them when they make sense for a specific project. But as a team that values fast prototyping, strong delivery, and staying close to the infrastructure, Shraga has proven to be the right tool for us.
We’d love to hear what’s working for you. Are you all-in on one of the popular frameworks? Are you building things from scratch? Or maybe you’ve rolled your own internal system like we did? Let us know — we’re always up for a good architecture chat.