Kareem Almasri

I build production-grade LLM applications — retrieval pipelines, agentic workflows, and fine-tuned models that ship real value. Currently exploring the edges of context engineering and multi-modal RAG.

Tools that ship.

A pragmatic toolkit for taking LLM ideas from notebook to production — model orchestration, retrieval, evals, and the infrastructure to glue it all together.

# core
Python, FastAPI, Docker
# llm
LangChain, RAG Systems, Vector Databases, Prompt Engineering, Fine-tuning (LoRA/QLoRA)
# providers
OpenAI API, Anthropic API, HuggingFace

Selected projects.

A few things I've built. Each one solved a real problem and taught me something I now bring to the next one.

/01

RAG Knowledge Assistant

Production-ready retrieval pipeline with hybrid search, reranking, and cited responses. Handles multi-format ingestion and streams answers with a time to first token (TTFT) under 200 ms.

Python, LangChain, Qdrant, FastAPI, OpenAI

/02

Agentic Workflow Engine

Multi-step agent runtime with tool calling, memory, and human-in-the-loop checkpoints. Built for reliability with full trace logging and replay.

Python, Anthropic, Pydantic, Redis, Docker

/03

Fine-tuned Domain Model

LoRA-adapted instruction model for specialized domain Q&A. Includes eval harness, synthetic data generation, and a serving layer with token streaming.

HuggingFace, PyTorch, PEFT, QLoRA, vLLM

Let's build something.

Working on an LLM problem? Want to collaborate? Drop a message and I'll get back to you within a day.