Pondhouse Data AI - Tips & Tutorials for Data & AI 51
Composer 2.5: SOTA-Level Coding for Less | GitLab’s Agentic AI Event | agentmemory: Cross-Session Agent Memory | RAG Techniques Guide

Hey there,
In this week’s edition we’re diving into Retrieval-Augmented Generation (RAG) with a hands-on guide to mastering advanced retrieval techniques, and spotlighting agentmemory, the open-source engine bringing persistent memory to your favorite coding agents. On the news front, Cursor’s Composer 2.5 is redefining coding model intelligence, while Microsoft’s TRELLIS.2 makes high-fidelity 3D asset generation from images faster than ever. Plus, don’t miss our tip on installing Hermes Agent in seconds for instant AI coding productivity and automation.
Enjoy the read!
Cheers, Andreas & Sascha
In today's edition:
📚 Tutorial of the Week: Mastering Retrieval-Augmented Generation
🛠️ Tool Spotlight: agentmemory: Persistent memory for coding agents
📰 Top News: Cursor Composer 2.5 boosts high-quality coding at low cost
💡 Tip: Install Hermes Agent instantly via pip
Let's get started!
Tutorial of the week
Mastering Retrieval-Augmented Generation: The Ultimate Guide

Retrieval-Augmented Generation (RAG) is transforming how AI systems combine information retrieval with generative models. If you’re building or scaling AI applications, understanding RAG’s evolving techniques is essential. The RAG Techniques GitHub repository is a comprehensive, community-driven resource packed with practical tutorials, runnable notebooks, and visual guides.
Covers 42+ advanced RAG techniques, from foundational methods to memory-augmented retrieval, multi-modal pipelines, and explainability strategies.
Features step-by-step notebooks for each technique, including query enhancement, semantic chunking, fusion retrieval, reranking, and evaluation frameworks.
Includes links to companion resources like RAG Made Simple (a visual book) and Prompt Engineering Techniques for mastering AI interaction.
Community-driven: Join the Educational AI subreddit or RAG Discord to collaborate, get feedback, and stay updated.
Practical for researchers, engineers, and anyone deploying production-grade GenAI systems—each tutorial is designed for real-world implementation.
Whether you’re new to RAG or aiming to optimize state-of-the-art pipelines, this repository is your go-to hub for actionable knowledge and hands-on learning.
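To make one of the listed techniques concrete, here is a minimal sketch of fusion retrieval using reciprocal rank fusion (RRF), a common way to merge a keyword ranking with a vector ranking. This is an illustration and not code from the repository, whose notebooks may combine results differently. The document ids and the two input rankings are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a keyword retriever and a vector retriever.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why fusion tends to be more robust than either retriever alone.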
Tool of the week
agentmemory — Persistent, cross-session memory for AI coding agents

AI coding agents are powerful, but they forget everything between sessions—forcing you to re-explain your stack, bugs, and preferences every time. agentmemory is an open-source, self-hosted memory engine that solves this by capturing, compressing, and injecting relevant context into future sessions for agents like Claude Code, Copilot CLI, Codex, Cursor, Gemini CLI, and many more.
Universal agent support: Works out-of-the-box with 50+ popular AI coding agents via hooks, MCP, or REST API. One memory server can serve all your agents, sharing knowledge seamlessly.
High-accuracy, cost-efficient recall: Achieves 95.2% retrieval accuracy (R@5 on LongMemEval-S), with up to 92% fewer tokens used compared to built-in agent memory, saving both context window and API costs.
Zero manual effort: Automatically records tool usage, session data, and project context—no more manual note-taking or copy-pasting between sessions.
Advanced search and memory lifecycle: Combines BM25, vector, and knowledge graph search with auto-consolidation, decay, and contradiction detection for robust, evolving memory.
Real-time observability: Ships with a live viewer and full OpenTelemetry traces via the iii engine, so you can inspect, replay, and debug what your agent remembers.
With over 11,000 GitHub stars and 950+ passing tests, agentmemory is rapidly becoming the standard for persistent, cross-agent memory in AI-powered development workflows.
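For readers unfamiliar with the R@5 figure quoted above, the following sketch shows one common way to compute recall at k. It assumes a query counts as a hit when at least one relevant memory appears in the top k results; the benchmark's exact definition may differ, and the query and memory ids are made up for illustration.

```python
def recall_at_k(retrieved, relevant, k=5):
    """Fraction of queries with at least one relevant id in the top k results."""
    if not retrieved:
        return 0.0
    hits = 0
    for query_id, ranked_ids in retrieved.items():
        top_k = ranked_ids[:k]
        if any(doc_id in relevant.get(query_id, set()) for doc_id in top_k):
            hits += 1
    return hits / len(retrieved)


# Hypothetical toy data: ranked memory ids per query, and the ids judged relevant.
retrieved = {
    "q1": ["m3", "m7", "m1", "m9", "m2", "m4"],
    "q2": ["m5", "m6", "m8", "m0", "m2", "m4"],
}
relevant = {"q1": {"m1"}, "q2": {"m4"}}
score = recall_at_k(retrieved, relevant, k=5)
print(score)
```

Here the relevant memory for q2 sits at rank 6, so it counts as a miss at k=5 and as a hit at k=6.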
Top News of the week
Cursor Unveils Composer 2.5: A Major Leap in Coding Model Intelligence

Cursor has announced the release of Composer 2.5, its latest AI coding model, delivering a significant boost in efficiency and intelligence for developers. Composer 2.5 is engineered for complex, long-horizon coding tasks, offering a 10x efficiency improvement over previous models and rivals. The model’s enhanced capabilities are the result of training on 25x more synthetic tasks and the introduction of targeted reinforcement learning with textual feedback, enabling more precise and reliable code generation.
Key technical advancements include improved handling of multi-file and long-running tasks, smarter feedback mechanisms for localized error correction, and a more refined communication style. Composer 2.5 leverages advanced distributed training techniques such as sharded Muon and dual mesh HSDP for efficient scaling. The model is now available via the Cursor IDE, CLI, and web, with flexible pricing and a faster variant for high-throughput needs. Notably, Cursor is collaborating with SpaceXAI to train an even larger model, utilizing a massive GPU cluster for future breakthroughs.
This release marks a substantial step forward for AI-assisted software development, with early community feedback highlighting improved usability and real-world effectiveness.
Also in the news
GitLab Transcend - the Event to Showcase Real-World Agentic AI for Software Development
GitLab’s Transcend event, scheduled for June 10, will spotlight the adoption of agentic AI in software engineering. The virtual conference will feature leaders from Mercedes-Benz, Google Cloud, AWS, and others, with sessions on intelligent orchestration, productivity research, and hands-on workshops using the GitLab Duo Agent Platform. Attendees can expect demonstrations of automated code review, pipeline fixes, and security remediation, reflecting the growing role of AI agents in the software development lifecycle.
Study Finds Grep Outperforms Vector Search in Agentic Retrieval Tasks
A new study challenges the dominance of vector search in AI retrieval, showing that traditional grep-based lexical search often yields higher accuracy than vector search in agentic workflows. Evaluated across multiple agent harnesses—including custom and provider-native CLIs—on the LongMemEval benchmark, grep consistently outperformed vector search, especially when tool outputs were delivered inline. However, the effectiveness varied with agent architecture and delivery method, highlighting the importance of retrieval mechanics and orchestration in end-to-end system performance.
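The two retrieval styles compared in the study can be sketched side by side. The snippet below is only an illustration of the mechanics, not the study's setup: the grep-style function matches a regular expression line by line, while the similarity function uses word-count vectors as a toy stand-in for real embeddings. The file names and session notes are hypothetical.

```python
import math
import re
from collections import Counter


def grep_search(pattern, documents):
    """Return (doc_name, line_number, line) for every line matching the regex."""
    regex = re.compile(pattern)
    matches = []
    for name, text in documents.items():
        for line_number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                matches.append((name, line_number, line))
    return matches


def cosine_similarity(text_a, text_b):
    """Cosine similarity of word-count vectors (a toy stand-in for embeddings)."""
    a = Counter(re.findall(r"\w+", text_a.lower()))
    b = Counter(re.findall(r"\w+", text_b.lower()))
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical session notes an agent might search.
documents = {
    "session1.txt": "User prefers pytest.\nThe build failed with error E1234.",
    "session2.txt": "Deployment uses Docker.\nNo errors reported.",
}
lexical = grep_search(r"E1234", documents)
ranked = sorted(
    documents,
    key=lambda name: cosine_similarity("error E1234", documents[name]),
    reverse=True,
)
print(lexical)
print(ranked)
```

The lexical search returns the exact line and its position, which an agent can act on directly, whereas the similarity search returns a ranking of whole documents.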
Microsoft Releases TRELLIS.2: Fast Open-Source Image-to-3D Model Generator
Microsoft has open-sourced TRELLIS.2, a 4-billion-parameter model that converts images into high-fidelity, production-ready 3D assets in as little as 3 seconds. TRELLIS.2 introduces the O-Voxel format, which efficiently encodes both shape and surface materials, enabling robust handling of complex geometries and photorealistic textures. The model supports GLB export with full PBR textures and can be run locally on Linux with a 24GB+ NVIDIA GPU. Released under the MIT license, TRELLIS.2 is available for commercial use, with weights and code accessible on Hugging Face and GitHub.
New MoE Study Shows What Really Matters When Scaling Expert Models
A new arXiv paper, “Slicing and Dicing: Configuring Optimal Mixtures of Experts,” offers a practical guide to designing Mixture-of-Experts models. The authors ran 2,000+ pretraining experiments across models up to 6.6B total parameters, testing expert count, expert size, shared experts, heterogeneous experts, load balancing, and token dropping. Their key finding: model quality is driven mainly by expert count and expert granularity, while shared experts, heterogeneous expert sizes, and load-balancing settings have smaller effects. One important exception is dropless routing, which gave a steady gain. For teams working with MoE models, the message is simple: focus first on how many experts you use and how large each active expert should be.
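The difference between token dropping and dropless routing can be shown with a small sketch. It assumes top-1 routing and a fixed per-expert capacity filled in token order, which is a simplification and not necessarily the configuration used in the paper. The router scores are invented for illustration.

```python
def route_top1(router_scores, capacity=None):
    """Assign each token to its highest-scoring expert.

    router_scores: one list of per-expert scores per token.
    capacity: max tokens per expert; None means dropless routing.
    Returns (assignments, dropped), where assignments maps expert index to
    token indices and dropped lists tokens that exceeded an expert's capacity.
    """
    num_experts = len(router_scores[0])
    assignments = {e: [] for e in range(num_experts)}
    dropped = []
    for token, scores in enumerate(router_scores):
        expert = max(range(num_experts), key=lambda e: scores[e])
        if capacity is not None and len(assignments[expert]) >= capacity:
            dropped.append(token)  # token skips the expert layer
        else:
            assignments[expert].append(token)
    return assignments, dropped


# Hypothetical router scores for 4 tokens over 2 experts.
scores = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.4, 0.6]]
with_dropping = route_top1(scores, capacity=2)
dropless = route_top1(scores)
print(with_dropping)
print(dropless)
```

With a capacity of 2, the third token loses its slot on the popular expert and is dropped; the dropless variant processes every token at the cost of uneven expert load.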
Tip of the week
Speed Up AI Coding and Automation: Install Hermes Agent in Seconds
Tired of lengthy installs and bloated dependencies when setting up AI agents for coding or workflow automation? Hermes Agent v0.14.0 now ships as a real PyPI package, making setup fast and lightweight.
Quick install: Run
pip install hermes-agent && hermes
to get started instantly, with no repo cloning or shell scripts. Hermes bundles its terminal UI and launcher right out of the box.
Lighter footprint: Heavy backends (like messaging adapters, image-gen, voice/TTS) now lazy-install only when you use them, reducing disk usage and vulnerability surface.
Flexible deployment: Hermes runs on Linux, macOS, Windows (native beta), Docker, and serverless platforms. Works with any LLM provider—OpenAI, Claude, Grok, Hugging Face, and more.
Instant productivity: Start coding, automate tasks, or chat with your agent from the CLI or 22 messaging platforms. Switch models with
hermes model
and no code changes are needed.
Use this when you need a robust, cross-platform AI agent without the hassle of traditional installs.
We hope you liked our newsletter and that you'll stay tuned for the next edition. If you need help with your AI tasks and implementations, let us know. We are happy to help.

