Pondhouse Data AI - Edition 5

Chat interface for your database | LLMs for research & writing | New tool to optimize prompts

Hey there,

We’re excited to bring you the 5th edition of our Pondhouse AI newsletter - your go-to for all things artificial intelligence. Whether you want to learn about AI concepts, use AI tools effectively, or see inspiring examples, we’ve got you covered.

Let’s get started!

Cheers, Andreas & Sascha

In today’s edition:

  • News: Microsoft releases GraphRAG - a tool for complex, AI-based knowledge graphing

  • Tutorial: Create a chat-interface for your SQL database

  • Tip of the Week: Systematic prompt creation using Anthropic Workbench

  • Tool of the Week: Automatically research and draft long-form articles using Stanford University’s “STORM”

Find this Newsletter helpful?
Please forward it to your colleagues and friends - it would mean a lot to us.

Top News

Microsoft releases GraphRAG: A tool for complex data discovery

Microsoft Research has released GraphRAG, an innovative graph-based approach to retrieval-augmented generation (RAG), on GitHub. This tool offers improved structured information retrieval and comprehensive response generation compared to traditional RAG methods.

Key Features:

  • Uses large language models to create rich knowledge graphs from text documents

  • Identifies "communities" within data to provide hierarchical summaries

  • Excels at answering "global questions" about entire datasets

  • Outperforms naive RAG in comprehensiveness and diversity of responses

Compared to “normal” RAG, GraphRAG demonstrates clear advantages in handling complex queries and providing insights into large datasets. It’s particularly useful for understanding data at a global level without needing to formulate specific questions in advance.

In our judgement, this tool represents a notable step forward in AI-assisted data utilization and question-answering systems.

Microsoft encourages community feedback as they continue to refine and expand GraphRAG's capabilities - so, click the link below and try it out.

For more details, read the full announcement here.

Tutorials & Use Cases

Chat with your Database: Use natural language to query data

Imagine asking complex questions about your data in plain English and getting instant answers. No wrestling with SQL syntax, no searching for the right tables and schemas in your database.

Modern AI SQL agents promise just that: ask a question in natural language, and the AI generates the query and returns the requested data.

How do text2sql agents work?

First, text2sql agents take a user's question in plain language. Using a large language model like GPT-4, they translate it into a SQL statement compatible with the target database, such as BigQuery. To inform query generation, the agent reads the database schema, including table structures and column descriptions. It may also employ Retrieval Augmented Generation (RAG) to identify the most relevant tables and columns for the query, improving accuracy.

Once a SQL query is generated, the agent executes it against the database. If errors occur, the agent can automatically attempt to correct the query. This loop continues until a satisfactory result is obtained or a maximum number of attempts is reached.
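To make this loop concrete, here is a minimal sketch in Python using the OpenAI and BigQuery client libraries. The model name, schema description, and prompt wording are illustrative assumptions, not the exact code from our tutorial:

# Minimal text2sql agent loop (sketch) - not the full tutorial code.
# Assumes OPENAI_API_KEY and Google Cloud credentials are configured.
from openai import OpenAI
from google.cloud import bigquery

llm = OpenAI()
bq = bigquery.Client()

SCHEMA = """
Table `shop.orders`: order_id INT64, customer_id INT64, amount NUMERIC, created_at TIMESTAMP
Table `shop.customers`: customer_id INT64, name STRING, country STRING
"""  # illustrative schema description handed to the model

def ask_database(question: str, max_attempts: int = 3) -> list[dict]:
    error = None
    for _ in range(max_attempts):
        prompt = (
            f"Database schema:\n{SCHEMA}\n"
            f"Write a single BigQuery SQL query answering: {question}\n"
            "Return only the SQL, no explanations."
        )
        if error:  # "self-healing": feed the last error back to the model
            prompt += f"\nYour previous query failed with: {error}\nPlease fix it."
        response = llm.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": prompt}],
        )
        sql = response.choices[0].message.content.strip()
        sql = sql.removeprefix("```sql").removeprefix("```").removesuffix("```").strip()
        try:
            return [dict(row) for row in bq.query(sql).result()]
        except Exception as exc:
            error = str(exc)  # retry, handing the error message to the next prompt
    raise RuntimeError(f"No valid query after {max_attempts} attempts: {error}")

print(ask_database("What was the total order amount per country last month?"))

The important part is the retry loop: the failed query's error message is fed back into the next prompt, which is the "self-healing" behaviour the tutorial describes.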

In our article below, learn how to implement such an agent:

  • Set up a development environment for BigQuery and OpenAI

  • Implement an AI agent loop for query generation and error handling

  • Add “self-healing” to the agent, so that it recovers from query errors

  • Improve accuracy of the text2sql procedure by adding a small Retrieval Augmented Generation (RAG) component (sketched below)
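As hinted at in the last point, a small RAG component can be as simple as embedding a one-line description of each table once and, per question, only putting the most similar tables into the prompt. A rough sketch - table names, descriptions, and the embedding model are illustrative assumptions:

# Sketch: pick the most relevant tables for a question via embeddings,
# so only those schemas end up in the text2sql prompt.
import numpy as np
from openai import OpenAI

llm = OpenAI()

TABLE_DOCS = {  # illustrative one-line descriptions per table
    "shop.orders": "Orders with amount, customer_id and created_at timestamp",
    "shop.customers": "Customers with name and country",
    "shop.web_events": "Raw click-stream events from the web shop",
}

def embed(texts: list[str]) -> np.ndarray:
    resp = llm.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

table_names = list(TABLE_DOCS)
table_vecs = embed(list(TABLE_DOCS.values()))  # embed once, reuse per question

def relevant_tables(question: str, k: int = 2) -> list[str]:
    q = embed([question])[0]
    sims = table_vecs @ q / (np.linalg.norm(table_vecs, axis=1) * np.linalg.norm(q))
    return [table_names[i] for i in np.argsort(sims)[::-1][:k]]

print(relevant_tables("Total order amount per country last month?"))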

Read the full tutorial here.

Also in the news

Tech executives confident in AI skills, but AI adoption remains hard

A Zartis study reveals a paradox in UK tech companies' AI adoption: while 85% of executives rate their workforce's AI skills highly, significant barriers like budget constraints and talent shortages persist. Despite these challenges, 93% of companies are investing at least £100,000 in AI in 2024, primarily in software development. The research highlights the pressure to keep pace with AI advancements while dealing with ROI uncertainties and integration challenges.

For more details, read the full article here.

Microsoft: Skeleton Key, a new type of generative AI jailbreak technique

Microsoft has uncovered "Skeleton Key," a new AI jailbreak technique that bypasses safety guardrails in various AI models, potentially causing them to generate harmful content. The company has implemented countermeasures in its AI offerings and shared findings with other providers. Microsoft recommends a multi-layered defense approach and has updated its Azure AI tools to help customers protect against such attacks in their AI applications.

For more details, see Microsoft’s detailed blog post.

OpenAI’s ChatGPT app for macOS is now available to all users

Earlier this year, OpenAI announced its macOS ChatGPT application - and it has finally landed. What’s the buzz about? The app integrates ChatGPT much more closely into your day-to-day desktop experience:

  • Option + Space opens ChatGPT from any screen on your desktop.

  • Take screenshots of your current desktop and ask questions about them.

  • Use microphone input to “talk” with ChatGPT

While the application is certainly not a revolution, we find that GPT-4o’s remarkable vision capability, combined with the ease of taking desktop screenshots, makes working with ChatGPT much more convenient.

Workflow tip: Take a screenshot of the error message you just got and ask the ChatGPT app about it. It’s much faster than explaining what happened.

For a list of features, continue here.

Tip of the week

Use the Anthropic Prompt Workbench to Optimize your LLM prompts

Anthropic unveils new prompt generator tools to enhance developer productivity with its Claude 3.5 Sonnet AI model.

Key features include:

  • Built-in prompt generator accessible through Anthropic Console

  • Ability to test, evaluate, and create prompts

  • Option to upload real-world examples or generate AI test cases

  • Side-by-side comparison of prompt effectiveness

  • Rating system for prompt efficiency

  • “Get code” feature, which provides copy/paste-ready API integration code for your optimized prompts (see the example at the end of this tip)

  • Aimed at assisting both new users and experienced prompt engineers

The prompt generation tool in particular is superb: it takes your plain user question and turns it into a well-written, highly detailed LLM prompt.

How does it work?

  1. Navigate to the Anthropic Console Workbench.

  2. Add a system prompt and a user prompt - or, better, use the automatic prompt generator

  3. Optional: Add variables to your prompt

  4. Run the prompt and see the results

  5. Click on “Evaluate” to get a list of results created for your prompt. Rate the results.

  6. Optional: Click on “Generate Test Case” to use Claude to create synthetic test cases. Rate them as well to get systematic insight into your prompt quality.

Using the Claude Workbench allows us to systematically craft prompts and even monitor their effectiveness over time - instead of relying on trial and error, as many of us did in the past.
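For reference, the snippet exported by the “Get code” feature mentioned above looks roughly like the following - a minimal sketch using the Anthropic Python SDK, where the model name, system prompt, and {{DOCUMENT}} variable are placeholders:

# Rough shape of a Workbench "Get code" export (Anthropic Python SDK).
# System prompt, user template and the {{DOCUMENT}} variable are placeholders.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    system="You are a meticulous analyst. Summarize documents in three bullet points.",
    messages=[
        {
            "role": "user",
            "content": "Summarize the following document:\n\n{{DOCUMENT}}",
        }
    ],
)
print(message.content[0].text)

The {{DOCUMENT}} placeholder corresponds to a Workbench variable; in your own application you would substitute real content before sending the request.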

Tool of the week

STORM - LLM-powered research and writing tool

STORM is a new open-source tool that uses artificial intelligence to generate Wikipedia-style articles from scratch, drawing on internet search.

Developed by researchers at Stanford University, STORM aims to assist writers and researchers in the pre-writing stage of article creation.

STORM automatically creates well-researched articles from a simple question

Why STORM?

By automating the research process and outline generation, it significantly reduces the time and effort required in the initial stages of writing.

While not producing publication-ready content, STORM provides a solid foundation that experienced writers can build upon, making it an interesting tool for knowledge exploration and content development.

How it works

STORM's approach is divided into two main stages:

  1. Pre-writing: The system conducts internet-based research to collect references and generates a comprehensive outline.

  2. Writing: Using the outline and gathered references, STORM produces a full-length article with citations.

The key innovation lies in STORM's ability to generate insightful questions automatically. It achieves this through:

  • Perspective-Guided Question Asking: STORM analyzes similar topics to discover diverse perspectives, which guide its question-generation process.

  • Simulated Conversation: The system simulates a dialogue between a writer and a topic expert, grounded in internet sources, allowing for dynamic understanding and follow-up questions.
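To make the two stages above more tangible, here is a conceptual Python sketch of the pipeline. The helper functions are hypothetical stand-ins stubbed with placeholder data - they are not STORM’s actual API:

# Conceptual sketch of STORM's two-stage pipeline (hypothetical helpers,
# stubbed with placeholders - not the real knowledge-storm API).

def discover_perspectives(topic: str) -> list[str]:
    # STORM derives perspectives from articles on similar topics; stubbed here.
    return ["historian", "engineer", "skeptical critic"]

def simulated_conversation(topic: str, perspective: str, turns: int = 3) -> list[dict]:
    # STORM lets a "writer" with this perspective interview a search-grounded
    # "expert"; each grounded answer becomes a reference. Stubbed here.
    return [
        {
            "question": f"As a {perspective}, what should an article on {topic} cover? (turn {t + 1})",
            "source": "https://example.org/placeholder",
            "answer": "...",
        }
        for t in range(turns)
    ]

def pre_writing(topic: str) -> tuple[list[str], list[dict]]:
    """Stage 1: collect references via perspective-guided questions, then outline."""
    references: list[dict] = []
    for perspective in discover_perspectives(topic):
        references.extend(simulated_conversation(topic, perspective))
    outline = [f"Background of {topic}", "Key developments", "Impact and open questions"]
    return outline, references

def writing(topic: str, outline: list[str], references: list[dict]) -> str:
    """Stage 2: draft each outline section, citing the gathered references."""
    return "\n\n".join(
        f"## {heading}\n(draft section grounded in {len(references)} collected references)"
        for heading in outline
    )

outline, refs = pre_writing("History of the transistor")
print(writing("History of the transistor", outline, refs))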

STORM focuses heavily on the research aspect of article creation

With its modular design and support for various language models and search engines, STORM offers flexibility for researchers and developers to customize and extend its capabilities.

We hope you liked our newsletter - stay tuned for the next edition. If you need help with your AI tasks and implementations, let us know. We are happy to help!