Pondhouse Data AI - Edition 10

Microsoft wants to restart a nuclear power plant for its AI data centers | Tips on how to make LLM outputs more consistent | How to use AI directly with Google BigQuery | OpenAI releases new "thinking" LLM

Hey there,

We’re excited to bring you the 10th edition of our Pondhouse AI newsletter — your source for tips and tricks around AI and LLMs. Whether you want to learn about AI concepts, use AI tools effectively, or see inspiring examples, we’ve got you covered.

Let’s get started!

Cheers, Andreas & Sascha

In today's edition:

  • News: OpenAI releases new LLM which can “think” before answering. And Microsoft wants to restart an old nuclear power plant for their AI data centers.

  • Tutorial: How to use AI directly from within Google BigQuery.

  • Tip of the week: Making AI outputs more consistent.

  • Repository of the Week: Great course from Microsoft: Generative AI for beginners - from getting started to building chatbots.

Find this Newsletter helpful?
Please forward it to your colleagues and friends - it helps us tremendously.

Top News

OpenAI releases new “thinking” LLM - with best in class performance on complex tasks

OpenAI has released the O1 model, a major enhancement designed to excel in scenarios requiring advanced reasoning and structured thinking. Aimed at improving decision-making, research capabilities, and more, O1 is available to users via the API in preview mode.

From our first tests, it indeed seems to be the biggest step in LLM output quality since GPT-4. The big drawback: it's very slow.

What is OpenAI O1?

O1 is a new iteration of OpenAI’s language models, designed with a focus on handling tasks that involve detailed reasoning, logical structure, and step-by-step problem solving. Unlike previous models, O1 offers more robust performance in areas where logical deduction and structured analysis are required.

Where Does O1 Excel?

O1 shines in tasks that demand:

  • Complex problem-solving: It excels at handling layered problems, where answers require multiple steps and nuanced understanding.

  • Decision-making: The model supports complex decision trees and helps streamline difficult choices by evaluating multiple variables.

  • Research and data synthesis: O1 can analyze dense information, summarize findings, and support users with intelligent insights for research projects.

  • Coding and logical tasks: It’s also highly effective for technical tasks that involve code generation, debugging, and providing step-by-step guidance.

How Does a 'Reasoning LLM' Work?

A reasoning LLM like O1 operates by breaking down complex tasks into smaller, logical steps, ensuring each component is systematically addressed. It leverages structured patterns in language to simulate human-like problem-solving. By analyzing data and identifying relationships between elements, O1 can guide users through decision trees, provide reasoning paths, and offer explanations that mirror how a human might tackle the problem. This enables more robust data interpretation and a deeper understanding of cause-effect relationships, making it a much stronger tool than ordinary LLMs for such tasks.

In summary, this new category of LLMs works similarly to how previous LLMs behaved when used with “chain-of-thought” prompting - a prompting technique where the user feeds the response of an LLM back in as input for a follow-up prompt. Having this built into the model's internals, however, is quite revolutionary.
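The manual chain-of-thought pattern described above can be sketched in a few lines of Python. Here, `ask` is a stand-in for whatever LLM API call you use - the point is simply that the first response is fed back in as input for the second prompt:

```python
def chain_of_thought(question, ask):
    """Two-step chain-of-thought prompting with any LLM call `ask`."""
    # First call: ask the model to produce intermediate reasoning.
    reasoning = ask(f"Think step by step about: {question}")
    # Second call: feed that reasoning back in as input for the final answer.
    return ask(f"Question: {question}\nReasoning: {reasoning}\nFinal answer:")

# Usage with a dummy stand-in for a real LLM call:
def ask(prompt):
    return f"<model response to: {prompt[:30]}...>"

answer = chain_of_thought("Is 17 prime?", ask)
```

Models like O1 effectively run this kind of loop internally, so the intermediate reasoning no longer has to be orchestrated by the user.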

What about costs?

Well, with each new model, there are new costs. This new reasoning model - as shown in the illustration above - uses input tokens and output tokens as usual. But it also consumes “reasoning tokens” - tokens used only internally for the “thinking process”. These reasoning tokens are currently priced the same as output tokens.

When using the model via OpenAI’s API, one can monitor the required reasoning tokens in the API response:

usage: {
  total_tokens: 1000,
  prompt_tokens: 400,
  completion_tokens: 600,
  completion_tokens_details: {
    reasoning_tokens: 500
  }
}
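Given such a usage payload, a back-of-the-envelope cost estimate could look like the sketch below. The prices per 1,000 tokens are placeholders, not O1's actual rates - and note that `reasoning_tokens` are already included in `completion_tokens`, so they must not be added twice:

```python
def estimate_cost(usage, input_price_per_1k, output_price_per_1k):
    # Reasoning tokens are billed at the output rate and are already
    # counted inside completion_tokens - don't add them again.
    return (usage["prompt_tokens"] / 1000 * input_price_per_1k
            + usage["completion_tokens"] / 1000 * output_price_per_1k)

usage = {
    "total_tokens": 1000,
    "prompt_tokens": 400,
    "completion_tokens": 600,  # includes the 500 reasoning tokens
    "completion_tokens_details": {"reasoning_tokens": 500},
}
cost = estimate_cost(usage, input_price_per_1k=0.01, output_price_per_1k=0.03)
```

Because the reasoning tokens are invisible in the answer itself but billed like output, monitoring this field is the only way to understand why an O1 call costs what it does.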

For more information, read the full model details here.

Tutorials & Use Cases

Retrieval Augmented Generation, Vector Search, Embeddings and LLMs in Google BigQuery

Google BigQuery is a favorite among data professionals for storing their company's data. Recently, Google added the capability to store vector embeddings in BigQuery - opening BigQuery up for any Retrieval Augmented Generation use case.

Additionally, BigQuery is tightly integrated with Google's quite powerful Vertex AI ecosystem - a collection of tools and models that makes AI available directly from within the database.

For example, calling an embedding generation LLM and updating existing texts with embeddings is as simple as calling this query:

-- Update our table. Change dataset.table according to your configuration
UPDATE `blog_bg_vector.blog_bq_embeddings` t
-- We just want to change the embeddings column
SET t.embeddings = e.ml_generate_embedding_result
FROM (

  -- Call the BigQuery ML embedding generation
  SELECT
    -- ml_generate_embedding_result is the actual array of floats as result
    -- of the embedding process
    ml_generate_embedding_result,
    -- we also get the actual text content back, so we can use it for
    -- updating our table.
    content
  FROM ML.GENERATE_EMBEDDING(
    -- select the previously created model, by dataset.model_name
    MODEL `blog_bg_vector.embedding_model`,
    -- We need to pass the text column from our original table
    (
      SELECT
         -- GENERATE_EMBEDDING requires the actual text column to be called 'content'
         text as content
      FROM `blog_bg_vector.blog_bq_embeddings`
      -- we only want to update embeddings for rows which don't have embeddings already
      WHERE LENGTH(text) > 0 AND ARRAY_LENGTH(embeddings) = 0
    )
  )
  WHERE LENGTH(ml_generate_embedding_status) = 0
) e
WHERE t.text = e.content;

In the tutorial below, you’ll find hands-on examples - including sample datasets - for:

  • How to call LLMs from a BigQuery SQL statement

  • How to create embeddings for texts stored in BigQuery

  • How to build a full RAG pipeline using only BigQuery - no additional infrastructure required
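As an illustration of the retrieval step of such a pipeline, here is a sketch that renders a BigQuery `VECTOR_SEARCH` query for a user question. The table, column, and model names follow the embedding example above; the exact `VECTOR_SEARCH` options should be verified against Google's documentation before use, and real code should use query parameters instead of string interpolation:

```python
def build_vector_search_sql(question: str, top_k: int = 5) -> str:
    # Embed the question with the same model used for the stored texts,
    # then fetch the top_k most similar rows by cosine distance.
    # NOTE: {question!r} is Python repr quoting, good enough for a sketch
    # only - use BigQuery query parameters in production.
    return f"""
    SELECT base.text, distance
    FROM VECTOR_SEARCH(
      TABLE `blog_bg_vector.blog_bq_embeddings`, 'embeddings',
      (
        SELECT ml_generate_embedding_result
        FROM ML.GENERATE_EMBEDDING(
          MODEL `blog_bg_vector.embedding_model`,
          (SELECT {question!r} AS content)
        )
      ),
      top_k => {top_k}, distance_type => 'COSINE'
    )
    """

sql = build_vector_search_sql("What is our refund policy?")
# Run it with e.g. google-cloud-bigquery:
# from google.cloud import bigquery
# rows = bigquery.Client().query(sql).result()
```

The retrieved rows can then be pasted into an LLM prompt - completing the "retrieval" half of RAG without leaving BigQuery.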

Read the full tutorial here.

Also in the news

Microsoft wants to restart the Three Mile Island nuclear power plant for its AI datacenters

We all know by now that AI needs energy. A lot of it. Microsoft has teamed up with Constellation Energy to power its AI systems using energy from - wait for it - a nuclear power plant. The Three Mile Island nuclear plant, more specifically - the site of a partial meltdown in 1979, whose remaining unit has been inactive since 2019. This move is part of a broader trend among tech giants seeking alternative energy sources to meet the rising energy demands driven by AI while minimizing carbon footprints. The plant is expected to generate 835 megawatts, enough to power around 700,000 homes.

Read the full article here.

Alibaba Cloud released Qwen 2.5 - and it's the best open source model family to date

We are very excited about any good, new open source model. While we like managed model providers like OpenAI, open source models ensure that not too much power is concentrated with a few big players. And luckily, we got another open source model, which even beats GPT-4 in many disciplines.

Alibaba has unveiled Qwen2.5, a new suite of open-source AI models with sizes up to 72 billion parameters. These models excel in tasks like coding, mathematical reasoning, and language comprehension, offering significant improvements over previous versions. Qwen2.5 also introduces advanced API services, including Qwen-Plus and Qwen-Turbo, catering to enterprise needs for scalable AI solutions.

Read the announcement post here.

Google will start flagging AI generated images in their search results

Fake news - and especially fake images generated by AI - are a problem. Models like Flux can generate photorealistic images without noticeable artifacts. To tackle this issue, Google will introduce a new feature in its search results to flag images that have been generated or edited by AI. This feature, called “About this image,” will display metadata revealing if an image was created using AI tools. It will be available through Google Search, Google Lens, and Android's Circle to Search.

We are happy to see it. While we like generative AI for what it can do, users should be aware what’s real and what’s generated by AI.

For more details, read Google's detailed plans here.

Tip of the week

How to make OpenAI model responses more consistent

Don't you hate it? You craft a wonderful prompt which produces a superb LLM answer - only to find that when you run the same prompt a second time, it produces pure garbage.

This happens quite often, due to the randomness inherent in how LLMs generate their answers.

However, there are some things we can do about it:

  1. The obvious: Make sure your prompts are clear and concise. Assume your LLM is an intern who just started at your company: explain as much as possible and don't leave any room for interpretation.

  2. The hidden parameter: seed: A more practical solution: OpenAI allows you to set a seed parameter when calling their LLM API. The seed - in short - makes it very likely that asking exactly the same prompt twice yields exactly the same answer (OpenAI documents this as best-effort, so occasional differences remain). It's highly recommended to use the seed parameter, as it helps tremendously!

  3. Using temperature: In general, the lower an LLM's temperature parameter is set, the less variation its responses show. However, there is a drawback: it has been shown that a temperature of 0 makes LLMs a little “dumber” - they are less able to solve complex problems than with a temperature of 0.5 or 0.7. Our suggestion: use a temperature of 0.7 and run your prompt multiple times. If you see too much variety in the responses, lower the temperature by 0.1 - and repeat.
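To illustrate points 2 and 3: `seed` and `temperature` are real parameters of OpenAI's chat completions API, while the model name and message below are placeholders. The toy sampler shows why a seed pins down otherwise random output:

```python
import random

# Toy illustration of what a seed does: sampling with the same seed
# always reproduces the same "random" choices.
def sample_tokens(vocabulary, seed, n=5):
    rng = random.Random(seed)  # the seed pins down the randomness
    return [rng.choice(vocabulary) for _ in range(n)]

vocab = ["the", "cat", "sat", "on", "mat"]
assert sample_tokens(vocab, seed=42) == sample_tokens(vocab, seed=42)

# Sketch of the corresponding OpenAI chat-completions request parameters
# (model name and message are placeholders):
request = {
    "model": "gpt-4o",
    "seed": 42,          # best-effort reproducibility across calls
    "temperature": 0.7,  # lower = less variation; 0 can hurt quality
    "messages": [{"role": "user", "content": "Summarize our Q3 report."}],
}
```

An LLM's sampling isn't literally Python's `random`, but the principle is the same: fixing the seed removes one source of run-to-run variation, and temperature controls how spread out the sampling is.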

Find more details in OpenAI’s advanced usage guide.

Repository of the week

Generative AI for beginners: A complete course by Microsoft

"Generative AI for Beginners" by Microsoft offers an 18-lesson course designed to teach the fundamentals of building generative AI applications. The course covers topics such as prompt engineering, text and image generation, responsible AI usage, and AI application security. Each lesson includes video tutorials, written content, and code examples in Python and TypeScript.

Key topics covered:

  • Introduction to Generative AI

  • Understanding Prompt Engineering

  • Working with Text Generation

  • Image Generation Techniques

  • Responsible AI Practices

  • Security in AI Applications

  • Hands-on Projects with Code Examples (Python, TypeScript)

  • Integration with OpenAI and Azure OpenAI APIs

  • Building AI-driven Applications

Target group:

It's ideal for beginners with basic programming knowledge, offering hands-on projects and resources to deepen learning. Access to Azure OpenAI or OpenAI API is recommended for coding exercises.

However, even experienced AI application developers can benefit from these courses as reference for looking up specific topics.

We hope you liked our newsletter and that you stay tuned for the next edition. If you need help with your AI tasks and implementations - let us know. We are happy to help.