Agents are everywhere these days, and yes, it’s the new buzzword! Everyone is trying to build one, either to explore what’s possible or to solve real business problems. And with so many models and frameworks out there, it can get overwhelming.
The OpenAI Agent SDK makes it easier to build agents, but not many developers know how to use Azure AI Foundry-hosted reasoning models, like o1, to make their agents better and smarter.
In this post, I’ll walk you through how to connect Azure OpenAI’s o1 model with the OpenAI Agent SDK. We’ll look at a simple example to show how it works – but first, let’s start with the basics.
Understanding Reasoning Models vs. General-Purpose Models
General-Purpose Models (like GPT-4 or GPT-3.5) are designed to handle a wide range of tasks, including content generation, summarization, and basic Q&A. They excel at versatility but may not provide the depth of analysis required for complex reasoning tasks.
Reasoning Models (like o1) are specifically optimized for tasks requiring step-by-step logical thinking, complex problem-solving, and structured analysis. These models are particularly valuable when your agent needs to:
- Process and analyze data in depth
- Make logical inferences from complex information
- Provide detailed, well-reasoned responses to domain-specific queries
Setting up the environment
To get started with Azure OpenAI’s o1 reasoning model in your Agent SDK implementation, you’ll need:
- An Azure subscription with access to Azure OpenAI Service
- The OpenAI Agent SDK installed in your Python environment
- Appropriate API credentials and endpoint information
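As a rough sketch, the setup boils down to installing the SDK and exporting your Azure settings. The package names below assume the open-source `openai-agents` package, and the placeholder values (endpoint, deployment name, API version) are assumptions you should replace with your own:

```shell
# Install the OpenAI Agent SDK and the OpenAI Python client
pip install openai-agents openai

# Azure OpenAI settings consumed by the code that follows (values are placeholders)
export AZURE_OPENAI_KEY="<your-api-key>"
export AZURE_OPENAI_ENDPOINT="https://<your-resource>.openai.azure.com/"
export AZURE_OPENAI_REASONING_MODEL_DEPLOYMENT="o1"
export AZURE_OPENAI_REASONING_MODEL_API_VERSION="2024-12-01-preview"
```

Check the Azure OpenAI documentation for the API version that matches your deployed reasoning model.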
Here’s an example showing how to implement an agent using Azure OpenAI’s o1 reasoning model:
import os

from openai import AsyncAzureOpenAI
from agents import Agent, OpenAIChatCompletionsModel, Runner

# Environment variables (typically loaded from a .env file)
AZURE_OPENAI_API_KEY = os.environ["AZURE_OPENAI_KEY"]
AZURE_OPENAI_ENDPOINT = os.environ["AZURE_OPENAI_ENDPOINT"]
# Set this to the name of the reasoning model you have deployed, e.g. "o1"
AZURE_OPENAI_REASONING_MODEL_DEPLOYMENT = os.environ["AZURE_OPENAI_REASONING_MODEL_DEPLOYMENT"]
AZURE_OPENAI_REASONING_MODEL_API_VERSION = os.environ["AZURE_OPENAI_REASONING_MODEL_API_VERSION"]

# Create an Azure OpenAI client specifically for reasoning models
def get_azure_openai_async_reasoning_client():
    return AsyncAzureOpenAI(
        api_key=AZURE_OPENAI_API_KEY,
        azure_endpoint=AZURE_OPENAI_ENDPOINT,
        api_version=AZURE_OPENAI_REASONING_MODEL_API_VERSION,
        azure_deployment=AZURE_OPENAI_REASONING_MODEL_DEPLOYMENT,
    )

# Wrap the Azure deployment in a model object the Agent SDK understands
def get_reasoning_model(client):
    return OpenAIChatCompletionsModel(
        model=AZURE_OPENAI_REASONING_MODEL_DEPLOYMENT,
        openai_client=client,
    )

# Create the client instance - ensure it points at the reasoning model
reasoning_client = get_azure_openai_async_reasoning_client()

# Define our analysis agent
analysis_agent = Agent(
    name="Data Analysis Agent",
    instructions="""You are an AI assistant specializing in data analysis.
Your primary goal is to analyze data and extract key insights.

Follow these steps in your analysis:
1. Identify key metrics and patterns in the data
2. Compare current metrics to historical trends
3. Highlight significant changes or anomalies
4. Provide reasoned insights about the data

Your analysis should be thorough, logical, and supported by the data provided.
For each insight, explain your reasoning process and cite specific figures from the input.
""",
    model=get_reasoning_model(reasoning_client),
    # output_type=list[AnalysisResult],  # optional structured output
)

# Helper function to get the agent
def get_analysis_agent():
    return analysis_agent

# Example usage: run the agent via the SDK's Runner
async def analyze_data(data):
    agent = get_analysis_agent()
    result = await Runner.run(agent, data)
    return result.final_output
So, how do you decide which model to use?
Selecting the right model for your business use case requires careful consideration of multiple factors. Here’s a practical framework I usually use for deciding between reasoning models like o1 and general-purpose models:
Choose a reasoning model when:
- The task needs deep reasoning, like complex data extraction, legal review, or risk assessment.
- The domain requires high precision (e.g. healthcare) or transparent, explainable AI decisions.
- Logic chains or step-by-step thought processes are critical.
- Regulatory compliance demands clear justification of AI outputs.
Choose a general-purpose model when:
- The task is creative work, summarization, or general Q&A with less complexity.
- Speed and scale matter more than perfect accuracy.
- You are working within a limited budget or building quick prototypes.
- You are handling diverse, unpredictable inputs in flexible use cases.
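As a rough illustration, the framework above can be encoded as a simple routing helper that picks a model deployment per task. The task categories and deployment names here are illustrative assumptions, not part of the SDK:

```python
# Hypothetical routing helper: choose a deployment based on task traits.
# Deployment names ("o1", "gpt-4o", "gpt-4o-mini") are placeholders for
# whatever you have actually deployed in Azure OpenAI.

REASONING_TASKS = {"data_extraction", "legal_review", "risk_assessment"}

def choose_deployment(task_type: str,
                      needs_explanation: bool = False,
                      budget_sensitive: bool = False) -> str:
    # Deep reasoning or explainability requirements -> reasoning model
    if task_type in REASONING_TASKS or needs_explanation:
        return "o1"
    # Tight budget or quick prototype -> smallest general-purpose model
    if budget_sensitive:
        return "gpt-4o-mini"
    # Everything else -> general-purpose default
    return "gpt-4o"
```

A router like this can sit in front of agent construction, so only the critical paths of your system pay for reasoning-model calls.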
Implementation Cautions and General Recommendations
When implementing AI agents with reasoning models in production environments, consider these best practices:
- Focus on cost: Keep in mind that accuracy comes at a cost: reasoning models can be expensive to run, so it’s important to use them wisely. Before using one, check the pricing and consider applying it only to the parts of your agent where high precision really matters. Also, keep an eye on token usage to manage your costs. One way to do this is by breaking your agent into smaller, focused agents—some parts of your system might not need a reasoning model at all. You can read more about when to use a single-agent vs. multi-agent setup in this post. Reference: https://www.linkedin.com/pulse/single-agent-vs-multi-agent-architectures-ai-when-choose-gawale-av65c/
- Know your success metrics and compare models: Benchmark reasoning models against general-purpose alternatives using A/B testing on your specific use cases, with hybrid approaches where appropriate (reasoning models for critical paths, general-purpose models elsewhere). Reference: https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/benchmark-model-in-catalog
- Optimize your prompts for reliability: Structure prompts with explicit step-by-step instructions, use low temperature settings for deterministic output where the model supports them (note that o1-series models do not accept a temperature parameter), and implement robust output validation with appropriate error handling, e.g. by implementing guardrails with smaller models. Reference: https://openai.github.io/openai-agents-python/guardrails/
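To make the cost point concrete, a back-of-the-envelope per-call estimate can show whether a reasoning model is justified for a given path. The prices below are made-up placeholders (USD per 1M tokens), not real rates; check the current Azure OpenAI pricing page before relying on any numbers:

```python
# Rough per-call cost estimator. Prices are illustrative placeholders
# (USD per 1M tokens) - substitute the current Azure OpenAI rates.
PRICES = {
    "o1":     {"input": 15.00, "output": 60.00},
    "gpt-4o": {"input": 2.50,  "output": 10.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one call, given token counts from the usage info."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Same workload on both models - the gap is why routing matters:
reasoning_cost = estimate_cost("o1", 2_000, 1_000)
general_cost = estimate_cost("gpt-4o", 2_000, 1_000)
```

Feeding the token counts from each response’s usage data into a helper like this makes it easy to spot which agents in a multi-agent setup are driving your bill.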
Conclusion
As AI continues to evolve, the ability to select the right model for specific tasks will become increasingly important. The combination of the OpenAI Agent SDK’s framework with Azure OpenAI’s specialized models provides a powerful way to address complex business problems across different domains.
Have you implemented reasoning models in your AI agents or applications? I’d love to hear about your experiences and insights.