Pondhouse Data AI - Tips & Tutorials for Data & AI 22

Manus AI: Is This What AGI Looks Like? | Build Low-Code AI Agents with n8n | Privacy-Friendly Document Extraction | Add Memory to Your Agents

Hey there,

This week, we bring you practical tools for AI development: from building agents with n8n's low-code platform to examining China's impressive Manus AI system. We look at SmolDocling for document processing, investigate why multi-agent systems sometimes underperform, and share a straightforward technique for adding memory to your conversational agents.

Enjoy the read!

Cheers, Andreas & Sascha

In today's edition:

📚 Tutorial of the Week: Building AI Agents with n8n - A Low-Code Approach to AI Workflows

🛠️ Tool Spotlight: SmolDocling: A tiny OCR model for high-quality and privacy-friendly document data extraction

📰 Top News: Manus AI - The first glimpse at AGI?

💡 Tips: Add Memory to Your AI Agents for Better Conversations

Let's get started!

Find this Newsletter helpful?
Please forward it to your colleagues and friends - it helps us tremendously.

Tutorial of the week

Building AI Agents with n8n - A Low-Code Approach to AI Workflows

If you've worked with custom-built AI agents, you know the challenge of managing API integrations and maintaining complex code. This week, we explore a pragmatic alternative: using n8n as a low-code platform for AI agent development.

Our tutorial demonstrates how to leverage n8n's visual workflow system to build an agent that can:

  • Process user input through a chat interface

  • Make decisions on which tools to use for specific queries

  • Execute Wikipedia searches when factual information is needed

  • Perform dynamic SQL queries against a PostgreSQL database

  • Maintain conversation context through configurable memory systems

The implementation is quite simple: connect a Chat Trigger node to an AI Agent node (using OpenAI's GPT-4o), then extend its functionality through tool nodes. The workflow execution is transparent, showing exactly how the agent routes requests and processes responses at each step.

Particularly welcome is n8n's ability to handle the API complexity while still allowing for custom code when needed. For example, you can use expressions like {{ $fromAI('placeholder_name') }} to dynamically inject AI-generated SQL queries into database connections.
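As an illustration, such an expression might be entered directly in the query field of a PostgreSQL tool node. This is a hypothetical query - the table, column, and placeholder names are our own, and the description/type arguments follow n8n's documented `$fromAI()` signature:

```sql
SELECT name, email
FROM customers
WHERE customer_id = {{ $fromAI('customer_id', 'ID of the customer the user asked about', 'number') }};
```

At run time, the agent fills in `customer_id` from the conversation, so a single node can serve arbitrary lookups.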

We also address practical considerations like Docker deployment options, data persistence, and the limitations of using this approach in high-volume production environments versus prototyping scenarios.

This method won't replace custom-built agents for production-critical systems, but it offers a significant advantage for rapid development and testing of complex agent behaviors without writing hundreds of lines of API integration code.

Tool of the week

SmolDocling: High-quality document data extraction with an easy, privacy-preserving self-hosting option

Earlier this month, Mistral released their hosted OCR API. First impressions suggest it is one of the best OCR solutions currently available. However, our tests revealed two main shortcomings:

  • It's cloud-only, which can be problematic for privacy reasons

  • For table extraction, it sometimes opts to create images instead of extracting the tables, making it less than ideal for technical documentation

That’s where our tool of the week comes into play: SmolDocling-256M-preview. Developed by the Docling Team at IBM Research, this image-text-to-text model brings high-quality document processing capabilities in a significantly smaller package.

Top News of the week

Manus AI: The first glimpse at what “AGI” might look like?

A new AI agent named Manus has captured significant attention in the global AI community over the past week. Developed by Wuhan-based startup Butterfly Effect, Manus represents a fundamental shift in AI assistant architecture compared to systems like ChatGPT and Claude.

Manus AI sets new records on virtually all general AI assistant benchmarks

What Makes Manus Different?

Unlike traditional chatbots that rely on a single large language model, Manus employs a multi-agent approach, utilizing several AI models working in coordination - including Anthropic's Claude 3.5 Sonnet and fine-tuned versions of Alibaba's open-source Qwen. This architecture enables Manus to function as what the company calls "the world's first general AI agent" - designed to work autonomously across a wide range of tasks.

The system's defining feature is "Manus's Computer" - a window that allows users to observe the agent's work in real time and intervene when necessary. This creates a significantly more transparent and collaborative experience compared to black-box AI assistants.

MIT Technology Review's Assessment

In hands-on testing with three complex tasks - compiling journalist lists, apartment hunting, and candidate research - Manus demonstrated impressive capabilities but also revealed limitations:

Strengths:

  • Highly intuitive interface accessible to non-technical users

  • Effective at breaking down complex tasks into logical steps

  • Transparent workflow allowing real-time observation and intervention

  • Superior results to ChatGPT DeepResearch on certain tasks

  • Cost-effective at approximately $2 per task (one-tenth of DeepResearch)

  • Ability to learn from feedback and refine outputs

Limitations:

  • System instability and crashes during high service loads

  • Difficulty processing large text chunks

  • Occasional "cutting corners" to expedite tasks

  • Challenges with paywalled content and CAPTCHA systems

  • Incomplete research when faced with broad, open-ended tasks

The review suggests Manus is particularly well-suited for analytical tasks requiring extensive open web research but with defined parameters - essentially serving as a "highly intelligent and efficient intern."

A Glimpse of AGI Capabilities?

Some observers have suggested that Manus's architecture - with multiple specialized AI models working together to solve diverse problems - might represent an early conceptual step toward what Artificial General Intelligence could eventually look like.

The system's ability to decompose complex tasks, navigate information resources independently, and adapt to user feedback shows a more flexible form of problem-solving than single-model approaches. This coordination between specialized systems mirrors certain theoretical frameworks for how AGI might eventually be constructed.

That said, we very much caution against overstating the comparison. We still don't know what pathway to AGI might prove successful, or even what specific capabilities would define true AGI. Manus faces significant limitations in reasoning depth, robustness, and breadth of capabilities that highlight the substantial gap between today's systems and the hypothetical capabilities of true general intelligence. Still, Manus is a remarkable tool and one you might want to try: despite its limitations, it can help tremendously.

Also in the news

Microsoft Explores Paying Contributors to AI Training Data

Microsoft has launched research into technology that could identify and compensate individuals whose work significantly influences AI outputs. The project, described as "training-time provenance," would trace which human-created content was most essential to specific AI generations, potentially establishing a payment framework for creators. This approach contrasts with the industry's current direction, where major AI labs have typically avoided individual contributor payments while advocating for broader fair use protections.

AI Leaders Challenge AGI Hype with More Grounded Perspectives

A growing number of prominent AI researchers are publicly questioning optimistic predictions about Artificial General Intelligence (AGI). While Anthropic's Dario Amodei predicts human-level AI by 2026 and OpenAI's Sam Altman claims his company knows how to build "superintelligent" systems, others are bringing more measured viewpoints. Thomas Wolf (Hugging Face) calls such predictions "wishful thinking," arguing that breakthrough discoveries require asking entirely new questions—something current LLMs struggle with. Similarly, Yann LeCun (Meta) dismissed the idea that LLMs could achieve AGI as "nonsense," while Kenneth Stanley (Lila Sciences) highlights creativity as a fundamental capability missing from today's AI. These "AI realists" aren't opposing progress but focusing attention on concrete barriers between current technology and truly general intelligence.

Highly recommended watch: Yann LeCun (Head of AI at Meta) talking about realistic AI outcomes:

New Paper: Why Do AI Multi-Agent Systems Still Fail?

A highly interesting new study has identified why Multi-Agent Systems (MAS), where multiple LLM agents collaborate on tasks, aren't significantly outperforming single-agent approaches despite industry excitement. The researchers analyzed five popular MAS frameworks across 150 diverse tasks, identifying 14 distinct failure modes organized into three categories: specification failures, inter-agent misalignment, and verification problems.

Common issues include agents ignoring task specifications, failing to verify outputs, losing conversation history, and poor communication between agents. The research challenges assumptions that simply connecting multiple LLMs automatically leads to better performance. The findings suggest that effective multi-agent systems require more sophisticated orchestration, clearer role definitions, and structured verification mechanisms.
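As a toy illustration of what "structured verification" between agents might look like - our own sketch, not code from the paper - one agent's output can be routed through an explicit spec check before being handed to the next agent, with concrete failures fed back for a retry:

```python
# Toy sketch of structured inter-agent verification (our own illustration,
# not the paper's code): validate an agent's output against the task
# specification before forwarding it to the next agent.

def spec_check(output: str, required_keywords: list[str]) -> list[str]:
    """Return the spec requirements the output failed to address."""
    return [kw for kw in required_keywords if kw.lower() not in output.lower()]

def handoff(output: str, required_keywords: list[str], retry_agent):
    """Forward output only if it meets the spec; otherwise retry once."""
    missing = spec_check(output, required_keywords)
    if missing:
        # Feed the concrete failures back instead of silently forwarding
        output = retry_agent(f"Revise your answer; it is missing: {', '.join(missing)}")
    return output
```

The point is not the keyword check itself (a real system would use an LLM judge or schema validation) but making the verification step explicit rather than trusting the handoff.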

Highly recommended read:

Tip of the week

Add Memory to Your AI Agents for Better Conversations

This week's tip focuses on enhancing AI agents with memory capabilities - a key ingredient for creating more natural, context-aware conversational experiences.

Why Memory Matters for AI Agents

When building custom AI agents, one of the biggest limitations is their inability to remember previous interactions. Without memory, your agent starts each conversation fresh, forcing users to constantly repeat information and context. This creates a disjointed experience that feels unnatural and inefficient.

Implementing a Simple Memory System

The article below details a practical approach to adding memory to AI agents with these key components:

  1. Memory Configuration - Define parameters like maximum message count before summarization and connection details for persistent storage

  2. Conversation Storage - Save user inputs, agent responses, and tool calls in a database for future reference

  3. Context Summarization - Automatically create summaries of past exchanges to maintain manageable context size

  4. Context Retrieval - Fetch relevant conversation history and inject it into the agent's prompt before processing new requests
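The four steps above can be sketched in a few dozen lines of Python. This is a minimal illustration, not the article's implementation: it uses SQLite for storage, and the summarize() stub stands in for the LLM call a real agent would make:

```python
import sqlite3

MAX_MESSAGES = 6  # 1. Memory configuration: summarize once history exceeds this


def summarize(rows):
    # Stand-in summarizer: a real system would ask the LLM to condense these
    return f"({len(rows)} earlier messages condensed)"


class ConversationMemory:
    def __init__(self, db_path=":memory:", max_messages=MAX_MESSAGES):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages (role TEXT, content TEXT)"
        )
        self.summary = ""  # running summary of older exchanges
        self.max_messages = max_messages

    def add(self, role, content):
        # 2. Conversation storage: persist every exchange
        self.conn.execute("INSERT INTO messages VALUES (?, ?)", (role, content))
        self.conn.commit()
        self._maybe_summarize()

    def _maybe_summarize(self):
        # 3. Context summarization: compress older messages once the
        # history grows past the configured limit
        rows = self.conn.execute(
            "SELECT rowid, role, content FROM messages"
        ).fetchall()
        if len(rows) > self.max_messages:
            old = rows[:-self.max_messages]
            self.summary += " " + summarize(old)  # placeholder for an LLM call
            self.conn.executemany(
                "DELETE FROM messages WHERE rowid = ?", [(r[0],) for r in old]
            )
            self.conn.commit()

    def context(self):
        # 4. Context retrieval: summary plus recent turns, ready to
        # prepend to the agent's prompt
        rows = self.conn.execute("SELECT role, content FROM messages").fetchall()
        lines = [f"{role}: {content}" for role, content in rows]
        prefix = f"Summary of earlier conversation:{self.summary}\n" if self.summary else ""
        return prefix + "\n".join(lines)
```

Before each new request, context() returns the running summary plus the recent turns; swapping ":memory:" for a file path makes the history survive restarts.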

When to Use Memory

Memory is most valuable for:

  • Multi-turn conversations where context builds over time

  • Agents that need to recall user preferences or previous decisions

  • Applications where users might continue conversations after interruptions

However, for simple one-off queries or when privacy is paramount, you might choose a stateless design instead.

The technique we demonstrate can be implemented with any LLM and requires only basic database capabilities, making it accessible for most development scenarios.

Read the full implementation guide in the article below.

We hope you liked our newsletter and stay tuned for the next edition. If you need help with your AI tasks and implementations - let us know. We are happy to help!