Pondhouse Data AI - Tips & Tutorials for Data & AI 26
Auto-generate MCP servers from existing APIs | AlphaEvolve Revolutionizes Algorithm Design | Real-time Video Generation with LTX-Video | GPT-image-1 Takes AI Image Gen to New Heights

Hey there,
Google's AlphaEvolve is discovering algorithms that boost efficiency across data centers and AI training, while their new Gemini 2.5 Pro I/O Edition is setting new benchmarks in code generation.
We also dive into OpenAI's GPT-image-1, which brings unprecedented text rendering and editing capabilities to AI image creation.
Plus, learn how to automate MCP server creation and generate impressive videos in real-time with LTX-Video.
As always, tools, news, and tips to help you build smarter with AI—let’s get into it.
Cheers, Andreas & Sascha
In today's edition:
📚 Tutorial of the Week: Automate MCP Server Creation from OpenAPI and FastAPI
🛠️ Tool Spotlight: OpenAI GPT-image-1 - The Next Evolution in AI Image Generation
📰 Top News: AlphaEvolve - Google's Gemini-Powered Agent Discovers and Optimizes Algorithms
💡 Tips: Create High-Quality AI Videos with LTX-Video
Let's get started!
Find this newsletter helpful?
Please forward it to your colleagues and friends - it helps us tremendously.
Tutorial of the week
Automatically wrap existing APIs in MCP servers - using FastMCP
MCP servers provide a standardized API optimized for LLMs, but creating a wrapper around your existing APIs can feel redundant. Our tutorial shows you how to automatically generate an MCP server from your existing OpenAPI specifications or FastAPI applications, eliminating duplicate work.
Generate an MCP Server from OpenAPI in Minutes
With FastMCP (version 2.0.0+), you can transform any API with an OpenAPI specification into an MCP server with just a few lines of code:
import httpx
from fastmcp import FastMCP
# Connect to your existing API
api_client = httpx.AsyncClient(base_url="https://api.my-api-url.com")
# Load your OpenAPI spec
spec = {...} # Your OpenAPI specification as a Python dict
# Create an MCP server from your OpenAPI spec
mcp = FastMCP.from_openapi(openapi_spec=spec, client=api_client)
if __name__ == "__main__":
    mcp.run()
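Side note: if your API already serves its specification over HTTP (FastAPI exposes it at /openapi.json by default), one way to fill in the spec placeholder is to fetch it from the running service - a small sketch, with the URL standing in for your own API:
import httpx
# Fetch the OpenAPI document from the running API; adjust the path to
# wherever your API publishes its spec (/openapi.json is FastAPI's default)
spec = httpx.get("https://api.my-api-url.com/openapi.json").json()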
FastMCP intelligently maps your API routes to appropriate MCP components:
GET endpoints without path parameters become resources
GET endpoints with path parameters become resource templates
POST, PUT, DELETE endpoints become tools
FastAPI Integration: Even Simpler
If you're using FastAPI, the process is even more straightforward:
from fastmcp import FastMCP
from fastapi import FastAPI
# Your existing FastAPI app
app = FastAPI()
@app.get("/items")
def list_items():
    return [{"id": 1, "name": "Item 1"}, {"id": 2, "name": "Item 2"}]
# Create and run your MCP server
mcp = FastMCP.from_fastapi(app=app)
mcp.run()
This integration runs directly on the ASGI transport with no additional overhead and supports all FastAPI features, including authentication.
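To sanity-check the generated server without starting a separate process, FastMCP also ships an in-memory client that can connect straight to the server object. A minimal sketch, assuming the Client API of FastMCP 2.x - verify the method names against the FastMCP docs:
import asyncio
from fastmcp import Client
async def inspect_server():
    # Connect to the MCP server in-process, no network transport required
    async with Client(mcp) as client:
        tools = await client.list_tools()
        resources = await client.list_resources()
        print("Tools:", [tool.name for tool in tools])
        print("Resources:", [resource.uri for resource in resources])
asyncio.run(inspect_server())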
Almost unbelievable, but that is all you need to create your own MCP server from existing APIs. For more information, as well as some background on why you'd need an MCP server in the first place, visit our latest article:
Tool of the week
Tool Spotlight: OpenAI GPT-image-1 - The Next Evolution in AI Image Generation
OpenAI has raised the bar for AI image generation with their latest model, GPT-image-1. Unlike its DALL-E predecessors, this is a natively multimodal language model that brings superior instruction following, text rendering, and real-world knowledge to image creation.
Key Capabilities
The Image API provides three distinct endpoints:
Generations: Create images from text prompts
Edits: Modify existing images or generate new ones using reference images
Inpainting: Replace specific parts of an image using transparent masks
What Sets GPT-image-1 Apart
Superior instruction following: More precisely follows detailed prompts
Better text rendering: Significantly improved text clarity in generated images
Advanced editing capabilities: More accurate and detailed image modifications
Real-world knowledge integration: Leverages world knowledge for more contextually relevant images
Implementation Example
Here's how to generate an image with GPT-image-1:
from openai import OpenAI
import base64
client = OpenAI()
prompt = "A children's book drawing of a veterinarian using a stethoscope to listen to the heartbeat of a baby otter."
result = client.images.generate(
    model="gpt-image-1",
    prompt=prompt
)
image_base64 = result.data[0].b64_json
image_bytes = base64.b64decode(image_base64)
with open("otter.png", "wb") as f:
    f.write(image_bytes)
Advanced Features
The model offers substantial customization options, combined in the sketch after this list:
Size control: Square (1024×1024), portrait (1024×1536), or landscape (1536×1024)
Quality settings: Low, medium, or high (affecting token usage and cost)
Transparent backgrounds: Create images with transparency (PNG/WebP only)
Moderation control: Adjust content filtering strictness with "auto" or "low" settings
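Putting several of these options together in a single call - a brief sketch; the parameter names (size, quality, background, moderation) follow the current Images API documentation, so double-check them against OpenAI's docs before relying on them:
from openai import OpenAI
client = OpenAI()
# Portrait format, high quality, transparent background, relaxed moderation
result = client.images.generate(
    model="gpt-image-1",
    prompt="A flat sticker-style illustration of a smiling otter",
    size="1024x1536",
    quality="high",
    background="transparent",
    moderation="low",
)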
Practical Applications
GPT-image-1 shines for use cases requiring:
Complex visual compositions following specific instructions
Images with accurate text elements
Product mockups and marketing visuals
Custom illustrations with detailed requirements
Image editing with reference materials
Considerations
Be aware of potential limitations including:
Latency: Complex prompts may take up to 2 minutes
Cost structure: Based on token usage (input text + output image tokens)
API access: Requires organization verification in some cases
For developers looking to implement state-of-the-art image generation capabilities with more precision and real-world knowledge, GPT-image-1 offers a significant upgrade over previous image generation models.
Example: Combine multiple products into one image
In this example, we'll use 4 input images to generate a new image of a gift basket containing the items in the reference images.
import base64
from openai import OpenAI
client = OpenAI()
prompt = """
Generate a photorealistic image of a gift basket on a white background
labeled 'Relax & Unwind' with a ribbon and handwriting-like font,
containing all the items in the reference pictures.
"""
result = client.images.edit(
    model="gpt-image-1",
    image=[
        open("body-lotion.png", "rb"),
        open("bath-bomb.png", "rb"),
        open("incense-kit.png", "rb"),
        open("soap.png", "rb"),
    ],
    prompt=prompt
)
image_base64 = result.data[0].b64_json
image_bytes = base64.b64decode(image_base64)
# Save the image to a file
with open("gift-basket.png", "wb") as f:
    f.write(image_bytes)

The 4 base images and the result
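The third endpoint from the list above, inpainting, follows the same pattern as the edit call. A hedged sketch, assuming the mask parameter of images.edit; the filenames are placeholders, and the mask must be the same size as the image with transparent pixels marking the region to replace:
import base64
from openai import OpenAI
client = OpenAI()
result = client.images.edit(
    model="gpt-image-1",
    image=open("living-room.png", "rb"),
    mask=open("mask.png", "rb"),  # transparent where the new content should go
    prompt="Add a potted monstera plant in the empty corner",
)
image_bytes = base64.b64decode(result.data[0].b64_json)
with open("living-room-edited.png", "wb") as f:
    f.write(image_bytes)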
Top News of the week
Top News: AlphaEvolve - Google's Gemini-Powered Agent Discovers Groundbreaking Algorithms
Google has unveiled AlphaEvolve, a sophisticated AI system that combines their Gemini large language models with an evolutionary framework to discover and optimize algorithms across mathematics and computing.
How It Works
AlphaEvolve uses an ensemble approach with complementary LLMs:
Gemini Flash generates a broad range of ideas quickly
Gemini Pro provides deeper, more insightful suggestions
The system proposes solutions as code, then uses automated evaluators to verify and score them. The highest-performing solutions are evolved through further iterations, creating increasingly refined algorithms.
In simpler terms: AlphaEvolve is a “self-learning” system that improves through experimentation.
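To make that concrete, here is a deliberately simplified toy loop - not Google's implementation - illustrating the propose-evaluate-keep-the-best pattern, with the LLM proposal step stubbed out by random mutations:
import random
def propose_variants(parent: str, n: int = 4) -> list[str]:
    # Stand-in for the LLM step: in AlphaEvolve, Gemini models rewrite the
    # parent program; here we only tag the parent with a random mutation id
    return [f"{parent}-mut{random.randint(0, 999)}" for _ in range(n)]
def evaluate(candidate: str) -> float:
    # Stand-in for the automated evaluator that scores a candidate solution
    return random.random()
best, best_score = "initial-program", 0.0
for generation in range(10):
    for candidate in propose_variants(best):
        score = evaluate(candidate)
        if score > best_score:  # keep only improvements
            best, best_score = candidate, score
    print(f"generation {generation}: best score {best_score:.3f}")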

Real-World Impact
AlphaEvolve has already been deployed across Google's computing ecosystem with measurable results:
Data Center Optimization: Recovered 0.7% of Google's worldwide compute resources through improved scheduling in their Borg system
Hardware Acceleration: Improved circuit design for upcoming Tensor Processing Units
AI Training Efficiency: Sped up matrix multiplication in Gemini's architecture by 23%, reducing training time by 1%
GPU Instruction Optimization: Achieved a 32.5% speedup for the FlashAttention kernel in Transformer models

Diagram showing how AlphaEvolve helps Google deliver a more efficient digital ecosystem, from data center scheduling and hardware design to AI model training.
Mathematical Breakthroughs
Beyond practical applications, AlphaEvolve has tackled fundamental mathematical problems:
Discovered an improved algorithm for multiplying 4×4 complex-valued matrices using 48 scalar multiplications, advancing beyond Strassen's 1969 algorithm
Made progress on approximately 20% of the 50+ open mathematical problems it was tested on
Established a new lower bound for the kissing number problem in 11 dimensions with 593 spheres
Future Availability
Google plans to make AlphaEvolve available through an Early Access Program for academic users initially, with potential broader availability later. The company expects the system's capabilities to continue improving alongside advancements in large language models.
Disadvantages
While the results are close to groundbreaking - just consider that we got a new algorithm for a mathematical problem that has been open for decades - there is one notable limitation: the system works well for tightly defined problem spaces with clear evaluation criteria (so it can tell good solutions from bad ones). Whether such an approach is useful in more ambiguous areas - say, unclear customer requirements - remains to be seen.
Read Google's full announcement here:
Also in the news
Google's next W: Gemini 2.5 Pro I/O Edition Claims AI Coding Crown
Google DeepMind has released a new version of Gemini 2.5 Pro dubbed the "I/O Edition," which has dethroned Anthropic's Claude 3.7 Sonnet as the top-performing AI coding model. The updated model scored 1499.95 on the WebDev Arena Leaderboard, significantly outperforming Claude's 1377.10 score.
Available immediately to developers through Google AI Studio and Vertex AI without pricing changes ($1.25/$10 per million tokens in/out), this version shows remarkable improvements in generating functional web applications and interactive interfaces from single prompts.
Developers are already praising its capabilities, with Cognition's Silas Alberti noting it was "the first model to successfully complete a complex refactoring of a backend routing system," while Cursor CEO Michael Truell reported "a marked decrease in tool call failures." The upgrade comes strategically ahead of Google's annual I/O developer conference scheduled for May 20-21.
Microsoft's ADeLe: A New Framework for Predicting AI Model Performance
Microsoft researchers have developed ADeLe (annotated-demand-levels), a novel approach to AI model evaluation that goes beyond measuring accuracy to predict performance on unfamiliar tasks and explain why models succeed or fail.
The framework assesses AI models across 18 different cognitive and knowledge-based abilities, rating tasks from 0-5 based on how much they demand each capability. By comparing what a task requires with what a model can deliver, ADeLe creates comprehensive "ability profiles" that reveal each AI's specific strengths and weaknesses.
When tested across 16,000 examples from 63 tasks, the system achieved approximately 88% accuracy in predicting the performance of models like GPT-4o and LLaMA-3.1-405B. The research also uncovered significant limitations in current benchmarking approaches, finding that many popular AI tests either don't measure what they claim or only cover a limited range of difficulty levels.
This breakthrough could transform how AI systems are evaluated before deployment, enabling researchers and developers to anticipate potential failures and understand model capabilities with much better clarity.
Read their announcement:
Tip of the week
LTX-Video: Stunning hyper-realistic video generation model
Want to generate impressive AI videos without waiting hours for processing? This week, we're exploring LTX-Video, the first DiT-based video generation model capable of creating high-quality videos in real-time.
It produces 30 FPS videos at 1216×704 resolution faster than they take to watch, and it supports both text-to-video and image-to-video generation.
Make sure to check out their amazing example videos over at their GitHub page:

Quick Start Options
You can start generating videos immediately through these online interfaces:
LTX-Studio for image-to-video (two model options)
Fal.ai for both text-to-video and image-to-video
Replicate for both generation methods
Prompt Engineering for Better Results
The secret to getting great results lies in how you structure your prompts:
Start with the main action in a single, clear sentence
Add specific movement details - be explicit about gestures and motion
Describe appearances precisely rather than abstractly
Include environmental details and background elements
Specify camera angles and movements for more cinematic control
Keep descriptions literal and chronological - think like a cinematographer
For example, instead of "A beautiful sunset at the beach," try: "A wide-angle shot of golden sunlight reflecting on gentle waves at a sandy beach. The camera slowly pans right as the orange sun descends toward the horizon, casting long shadows across the rippled sand."
Local Installation
If you prefer running it locally:
git clone https://github.com/Lightricks/LTX-Video.git
cd LTX-Video
python -m venv env
source env/bin/activate
python -m pip install -e ".[inference-script]"
# Basic text-to-video generation
python inference.py --prompt "Your detailed prompt here" --height 704 --width 1216 --num_frames 121 --seed 42 --pipeline_config configs/ltxv-13b-0.9.7-distilled.yaml
Pro Tip
For the absolute best results, use the ComfyUI integration by following the setup at ComfyUI-LTXVideo. This provides greater control over generation parameters and supports advanced workflows like video extension and multi-condition generation.
We hope you liked our newsletter - stay tuned for the next edition. If you need help with your AI tasks and implementations, let us know. We are happy to help!