An agentic system for extracting text from PDF and JPEG documents, with RAG-based reinforcement learning capabilities. It features OCR, vision models, LLM-based structured data extraction, validation, and self-learning from user corrections.
Idea
Create an intelligent document extraction system that learns from user corrections. It uses Langgraph for agentic workflows, ChromaDB for RAG-based learning, OpenRouter for flexible LLM selection, and Langfuse for observability. It supports schema management, re-extraction, and real-time accuracy validation.
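The idea of learning from corrections through retrieval can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `CorrectionMemory` and `build_prompt` are hypothetical names, and a plain bag-of-words cosine similarity stands in for the embedding search that a vector store such as ChromaDB would provide. Past corrections are stored alongside the document text they came from, and the most similar ones are retrieved and prepended to the extraction prompt as hints.

```python
import math
import re
from collections import Counter


def _vectorize(text):
    """Turn text into a bag-of-words term-frequency vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class CorrectionMemory:
    """Hypothetical in-memory stand-in for a vector store of user corrections."""

    def __init__(self):
        self._entries = []

    def add(self, document_text, field, wrong, corrected):
        """Record one user correction together with its source document text."""
        self._entries.append({
            "field": field,
            "wrong": wrong,
            "corrected": corrected,
            "vector": _vectorize(document_text),
        })

    def retrieve(self, document_text, k=2):
        """Return up to k corrections from the most similar past documents."""
        query = _vectorize(document_text)
        scored = [(_cosine(query, e["vector"]), e) for e in self._entries]
        # Drop entries that share no vocabulary with the query at all.
        scored = [(score, e) for score, e in scored if score > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [e for _, e in scored[:k]]


def build_prompt(document_text, memory, k=2):
    """Assemble an extraction prompt that includes retrieved corrections as hints."""
    hints = [
        f"- field '{e['field']}': '{e['wrong']}' was corrected to '{e['corrected']}'"
        for e in memory.retrieve(document_text, k)
    ]
    hint_block = "\n".join(hints) if hints else "- (none)"
    return (
        "Past corrections on similar documents:\n"
        f"{hint_block}\n\n"
        "Extract the structured fields from this document:\n"
        f"{document_text}"
    )
```

In this sketch, a correction would be recorded with `memory.add(...)` whenever a user fixes an extracted field, and `build_prompt(...)` would be called before each new extraction. A real deployment would likely replace the bag-of-words similarity with embedding-based queries and persist the corrections rather than keep them in memory.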
Tech Stacks
ChromaDB, FastAPI, Langfuse, Langgraph, OpenRouter, Python, React, SQLite, Tailwind CSS, TypeScript, Vite