DSPyWeekly
Your Weekly Dose Of All Things DSPy
".@DSPyOSS is so good that i'm kind of sad how many hours i spent struggling without it last year"
"Both DSPy and (especially) GEPA are currently severely under hyped in the AI context engineering world."
"We've reached that stage where every single day of the week (and every weekend), there are *several* really cool @DSPyOSS research papers, open-source applications, production use cases, or deep-dive tutorials, etc."
DSPyWeekly Issue #23
Published on February 20, 2026
📚 Articles
Automatically Learning Skills for Coding Agents - GEPA
Today, we are introducing gskill, a fully automated pipeline to learn skills for any repository. Given any GitHub repository, gskill creates important agent skill files for coding agents. These skills help the coding agent understand the repository and complete the tasks more efficiently.
optimize_anything: A Universal API for Optimizing any Text Parameter - GEPA
Today we are introducing optimize_anything, a declarative API that optimizes any artifact representable as text (e.g., code, prompts, agent architectures, vector graphics, configurations). It extends GEPA (Genetic-Pareto, our state-of-the-art LLM prompt optimizer) far beyond prompts. You declare what to optimize and how to measure it; the system handles the search. Testing it across several domains, we find optimize_anything consistently matches or outperforms domain-specific tools, including some purpose-built for each task.
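The "declare what to optimize and how to measure it" idea can be illustrated with a toy sketch. This is not the optimize_anything or GEPA API: `optimize_text`, `score`, and `propose` are hypothetical names invented here, the task (evolving a string toward a target) is made up, and the search is a plain hill climb standing in for GEPA's Genetic-Pareto search.

```python
import random
from typing import Callable


def optimize_text(
    seed: str,
    score: Callable[[str], float],
    propose: Callable[[str, random.Random], str],
    steps: int = 500,
    rng_seed: int = 0,
) -> str:
    """Toy hill climb: accept a candidate only if it scores strictly higher."""
    rng = random.Random(rng_seed)
    best, best_score = seed, score(seed)
    for _ in range(steps):
        candidate = propose(best, rng)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best


# Hypothetical task: the "artifact" is a short string, the metric counts
# characters that match a target. A real metric would run evals instead.
TARGET = "dspy"
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def score(text: str) -> float:
    """Metric declared by the user: number of positions matching TARGET."""
    return float(sum(a == b for a, b in zip(text, TARGET)))


def propose(text: str, rng: random.Random) -> str:
    """Mutation step: replace one random character. An LLM would go here."""
    i = rng.randrange(len(text))
    return text[:i] + rng.choice(ALPHABET) + text[i + 1:]


result = optimize_text("aaaa", score, propose)
print(result, score(result))
```

The caller supplies only the seed artifact and the metric; everything about how candidates are explored stays inside the optimizer, which is the separation the announcement describes.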
PAPER: EmbeWebAgent: Embedding Web Agents into Any Customized UI
EmbeWebAgent is a framework that embeds intelligent agents directly into enterprise web applications, overcoming the limitations of screenshot- or DOM-based web agents. Instead of relying on surface-level observations, it uses lightweight frontend hooks, including curated ARIA and URL-based signals, plus a per-page function registry exposed via WebSocket. A reusable backend workflow performs reasoning and executes actions. The system is stack-agnostic, supporting frameworks like React or Angular, and enables mixed-granularity actions from basic GUI interactions to higher-level composites. By leveraging MCP tools for navigation and analytics, EmbeWebAgent enables robust, multi-step behaviors with minimal retrofitting in live UI environments.
PAPER: Generative Ontology: When Structured Knowledge Learns to Create
Traditional ontologies describe domain structure but cannot generate novel artifacts. Large language models generate fluently but produce outputs lacking structural validity, hallucinating mechanisms without components, goals without end conditions. We introduce Generative Ontology, a framework synthesizing these complementary strengths: ontology provides the grammar; the LLM provides the creativity. Generative Ontology encodes domain knowledge as executable Pydantic schemas constraining LLM generation via DSPy signatures.
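A minimal sketch of the "ontology as grammar" idea follows. The paper uses Pydantic schemas and DSPy signatures; this stand-in uses only standard-library dataclasses so it runs without dependencies, and every class and field name (`Artifact`, `Mechanism`, `Goal`, `uses`, `end_condition`) is an assumption made for illustration, not the paper's schema.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Mechanism:
    name: str
    uses: tuple[str, ...]  # names of the components this mechanism acts on


@dataclass(frozen=True)
class Goal:
    description: str
    end_condition: str


@dataclass(frozen=True)
class Artifact:
    components: tuple[str, ...]
    mechanisms: tuple[Mechanism, ...]
    goal: Goal

    def __post_init__(self) -> None:
        # Reject "mechanisms without components".
        known = set(self.components)
        for mech in self.mechanisms:
            missing = [c for c in mech.uses if c not in known]
            if not mech.uses or missing:
                raise ValueError(
                    f"mechanism {mech.name!r} needs known components, "
                    f"missing: {missing}"
                )
        # Reject "goals without end conditions".
        if not self.goal.end_condition.strip():
            raise ValueError("goal must state an end condition")


# A structurally valid candidate, as a generator might emit it.
valid = Artifact(
    components=("deck", "token"),
    mechanisms=(Mechanism("draw", uses=("deck",)),),
    goal=Goal("collect tokens", end_condition="a player holds 10 tokens"),
)
```

In a generate-and-validate loop, a `ValueError` raised here would be the signal to reject or regenerate the candidate rather than accept a fluent but structurally invalid output.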
PAPER: WER is Unaware: Assessing How ASR Errors Distort Clinical Understanding in Patient Facing Dialogue
As Automatic Speech Recognition (ASR) is increasingly deployed in clinical dialogue, standard evaluations still rely heavily on Word Error Rate (WER), which, as the title suggests, is unaware of how transcription errors distort clinical understanding. To bridge this evaluation gap, we introduce an LLM-as-a-Judge, programmatically optimized using GEPA through DSPy to replicate expert clinical assessment.
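For readers who have not computed WER by hand, here is a small sketch of the standard definition (word-level edit distance divided by reference length). The example sentences are hypothetical and not taken from the paper; they are chosen only to show two transcripts that receive the same WER even though one of them changes the meaning.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word dropped
            insertion = curr[j - 1] + 1  # extra hypothesis word
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one harmless error, one meaning-changing error.
reference = "patient denies chest pain"
harmless = "patient denies chest pains"
harmful = "patient reports chest pain"

print(wer(reference, harmless))  # 0.25
print(wer(reference, harmful))   # 0.25
```

Both hypotheses contain exactly one substituted word out of four, so the metric cannot tell them apart.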
🎥 Video
Building AI agents for 127 million customers: Practical lessons from Nubank
Aman Gupta, Principal Machine Learning Engineer at Nubank, discusses how the company designs, evaluates, and scales production-grade AI agents for a user base of over 127 million customers. Gupta highlights the significant technical and operational challenges of deploying autonomous agents at this scale and advocates for a highly pragmatic, data-driven approach to production AI. DSPy is used.
From Prompt Engineering to Prompt Optimization in Production LLM Systems - YouTube
In this AI Tech Experts Webinar, Julia May (ML Engineer) explains why prompts should be treated as hyperparameters and how automated prompt optimization can improve LLM performance in production systems. The talk compares manual and automatic prompt engineering, then walks through three popular data-driven optimization approaches inspired by recent research. DSPy is discussed.
RLMs: Recursive Language Models - YouTube
Discover why Recursive Language Models (RLMs) are the defining paradigm of 2026 for solving context rot through "context folding." We break down the Zhang et al. paper to show how RLMs handle inputs 100x beyond standard context windows while actually reducing inference costs. Watch this deep dive to master the core constructs of next-gen context engineering.
🚀 Projects
RLM vs ReAct for Compositional Tool Calling
Repository: https://github.com/RamXX/react2rlm
GitHub - JamesHWade/dspy-explorer: Interactive DSPy RLM trace explorer
Interactive visualizer for DSPy RLM (Recursive Language Model) execution traces. Watch how an LLM iteratively reasons, writes Python code, executes it in a sandboxed REPL, and self-corrects to answer questions about large contexts.
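The write-execute-correct loop that such a trace records can be sketched in a few lines. This is an illustration only, not DSPy's RLM implementation: `run_repl_loop` and `scripted_model` are hypothetical names, the "model" is a scripted stub rather than an LLM, and plain `exec` is used, which is NOT a sandbox.

```python
from typing import Callable


def run_repl_loop(
    context: str,
    model: Callable[[str], str],
    max_turns: int = 5,
) -> tuple[object, list[tuple[str, str]]]:
    """Ask the model for code, run it, and feed errors back until `answer` is set."""
    namespace: dict[str, object] = {"context": context}
    trace: list[tuple[str, str]] = []  # (code, outcome) per turn
    feedback = ""
    for _ in range(max_turns):
        code = model(feedback)
        try:
            exec(code, namespace)  # illustration only: exec is not sandboxed
        except Exception as exc:
            feedback = f"{type(exc).__name__}: {exc}"
            trace.append((code, feedback))
            continue
        trace.append((code, "ok"))
        if "answer" in namespace:
            return namespace["answer"], trace
        feedback = "ok"
    raise RuntimeError("model did not produce an answer")


def scripted_model(feedback: str) -> str:
    """Stub standing in for an LLM: first attempt is buggy, second is fixed."""
    if "AttributeError" in feedback:
        return "answer = len(context.splitlines())"
    return "answer = context.line_count()"  # str has no such method


answer, trace = run_repl_loop("alpha\nbeta\ngamma", scripted_model)
for code, outcome in trace:
    print(f"{code!r} -> {outcome}")
```

The returned `trace` list is the kind of per-turn record (code attempted, result or error) that a trace explorer would visualize.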
GitHub - Archelunch/dspy-repl
dspy-repl is a modular package for non-Python REPL-based RLM engines compatible with DSPy, inspired by the Recursive Language Models paper.