
Building an AI Agent with MCP: The ChatManager Deep Dive (Part 3)


December 21, 2025 · 18 min read

The Missing Piece: From Client to Agent

In Part 1, we learned that MCP = Function Calling + Standardization. We saw how MCP servers expose tools and how clients communicate with them through JSON-RPC messages.

In Part 2, we built a UniversalMCPClient that connects to any MCP server (stdio, SSE, or Streamable HTTP), discovers available tools, and executes them.

But here’s what we haven’t solved yet: How do you make an AI that automatically decides which tools to use (through the MCP client)?

You could manually tell your client “call the weather tool with these arguments,” but that’s not an AI agent. A real agent:

  • Understands natural language requests

  • Decides which tools to use

  • Calls multiple tools in sequence

  • Synthesizes results into coherent responses

This is where ChatManager comes in. It’s the bridge between your MCP client and Large Language Models (LLMs), creating a true AI agent with automatic tool calling.

The Architecture: Three Layers Working Together

The ChatManager sits in the middle, translating between:

  1. Human language (user requests)

  2. LLM decisions (which tools to call)

  3. MCP protocol (actual tool execution)

The Core Challenge: Format Translation

Remember from Part 1 that MCP tools use JSON Schema format:

{
  "name": "get_weather",
  "description": "Get weather for a location",
  "inputSchema": {
    "type": "object",
    "properties": {
      "city": {"type": "string"}
    },
    "required": ["city"]
  }
}

But the OpenAI SDK’s function calling format is different:

{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get weather for a location",
    "parameters": {
      "type": "object",
      "properties": {
        "city": {"type": "string"}
      },
      "required": ["city"]
    }
  }
}

Notice the differences:

  • OpenAI wraps everything in a function object

  • OpenAI uses parameters instead of inputSchema

  • OpenAI requires a type: "function" field

ChatManager’s first job: Convert MCP tools to OpenAI format.

Why OpenAI Format? Understanding SDK Choice

You might wonder: “We’re using Anthropic’s Claude via OpenRouter, so why are we using OpenAI’s format?”

Here’s the important distinction: The format we use depends on which SDK we choose, not on which model we’re running.

When we write:

from openai import AsyncOpenAI
self.openai_client = AsyncOpenAI(
    api_key=os.getenv("OPENROUTER_API_KEY"),
    base_url="https://openrouter.ai/api/v1"
)

We made a choice: Use the OpenAI SDK. This SDK has its own format for function calling, and that’s what we must use.

The key insight:

  • If we used the OpenAI SDK → We use OpenAI’s function calling format

  • If we used the Anthropic SDK → We’d use Anthropic’s function calling format

  • If we used the Google SDK → We’d use Google’s function calling format

Why we chose OpenAI SDK:

  1. Works with OpenRouter: By pointing the OpenAI SDK to OpenRouter’s API (base_url="https://openrouter.ai/api/v1"), we can access any model (OpenAI, Anthropic, Google, Meta, etc.)

  2. One SDK, Many Models: We don’t need to install and learn multiple SDKs — the OpenAI SDK gives us access to dozens of models through OpenRouter

  3. Consistent Interface: Our code stays the same whether we use GPT-4, Claude, or Gemini — we just change the model parameter

In our implementation:

  • We use the OpenAI SDK (our choice of client library)

  • We connect to OpenRouter (the API gateway that supports multiple models)

  • We can run any model (Anthropic’s Claude, OpenAI’s GPT-4, Google’s Gemini, etc.)

  • We must use OpenAI’s function calling format (because that’s what the OpenAI SDK expects)

This is why every tool definition, tool call, and tool result in our code uses OpenAI’s format — it’s a consequence of choosing the OpenAI SDK, not because it’s a universal standard.

If you use a different SDK: You’d need to adapt the format conversion. For example, with Anthropic’s SDK, you’d convert MCP tools to Anthropic’s tool format instead. The MCP client stays the same, but the ChatManager’s format conversion layer would change.
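
To make that concrete, here is a minimal sketch of what an Anthropic-flavored converter could look like. This function is not part of our ChatManager; it assumes Anthropic’s Messages API tool format, which puts name, description, and input_schema at the top level:

from typing import Any, Dict

def mcp_tool_to_anthropic(server_name: str, tool_name: str,
                          description: str, input_schema: Dict[str, Any]) -> Dict[str, Any]:
    """Sketch: convert an MCP tool definition to Anthropic's tool format."""
    return {
        # Same "server__tool" naming trick we use for OpenAI below
        "name": f"{server_name}__{tool_name}",
        "description": description,
        # Anthropic calls this field input_schema; OpenAI calls it parameters
        "input_schema": input_schema,
    }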

The Data Classes: Building Blocks

Before diving into the main logic, let’s understand the data structures. These are the “vocabulary” ChatManager uses to communicate.

ToolDefinition: The Translator

@dataclass
class ToolDefinition:
    """Represents a tool in OpenAI format"""
    server_name: str
    tool_name: str
    description: str
    parameters: Dict[str, Any]
    
    @classmethod
    def from_mcp_tool(cls, server_name: str, mcp_tool: MCPTool) -> 'ToolDefinition':
        """Convert MCP tool to OpenAI tool definition"""
        # MCP tools already use JSON Schema format
        parameters = mcp_tool.inputSchema if hasattr(mcp_tool, 'inputSchema') else {
            "type": "object",
            "properties": {},
            "required": []
        }
        
        return cls(
            server_name=server_name,
            tool_name=mcp_tool.name,
            description=mcp_tool.description or f"Tool {mcp_tool.name} from {server_name}",
            parameters=parameters
        )
    
    @property
    def full_name(self) -> str:
        """Unique "server__tool" name; used as the dict key in ChatManager and shown to the LLM."""
        return f"{self.server_name}__{self.tool_name}"

    def to_openai_function(self) -> Dict[str, Any]:
        """Convert to OpenAI function calling format"""
        return {
            "type": "function",
            "function": {
                "name": f"{self.server_name}__{self.tool_name}",
                "description": self.description,
                "parameters": self.parameters
            }
        }

Key insight: We use server__tool naming (e.g., weather__get_forecast) to avoid conflicts. If you have a Gmail server and a Slack server, both might have a "send" tool. The double underscore keeps them distinct.
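
For example, using the ToolDefinition class above with two hypothetical servers, a Gmail "send" and a Slack "send" end up as two distinct function names:

gmail_send = ToolDefinition(
    server_name="gmail", tool_name="send",
    description="Send an email",
    parameters={"type": "object", "properties": {}, "required": []}
)
slack_send = ToolDefinition(
    server_name="slack", tool_name="send",
    description="Send a Slack message",
    parameters={"type": "object", "properties": {}, "required": []}
)

print(gmail_send.to_openai_function()["function"]["name"])  # gmail__send
print(slack_send.to_openai_function()["function"]["name"])  # slack__send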

ToolCall: What the LLM Wants

When the LLM decides to use a tool, it returns something like:

{
  "id": "call_abc123",
  "type": "function",
  "function": {
    "name": "weather__get_forecast",
    "arguments": "{\"latitude\": 40.7, \"longitude\": -74.0}"
  }
}

We parse this into a ToolCall object:

@dataclass
class ToolCall:
    """Represents a tool call requested by the LLM"""
    id: str
    server_name: str
    tool_name: str
    arguments: Dict[str, Any]
    
    @classmethod
    def from_openai_tool_call(
        cls,
        tool_call: Any,
        tool_definitions: Dict[str, 'ToolDefinition']
    ) -> 'ToolCall':
        """Parse from OpenAI tool call format"""
        function_name = tool_call.function.name
        
        # Parse server__tool format
        if "__" in function_name:
            server_name, tool_name = function_name.split("__", 1)
        else:
            # Fallback: try to find in tool_definitions
            tool_def = tool_definitions.get(function_name)
            if tool_def:
                server_name = tool_def.server_name
                tool_name = tool_def.tool_name
            else:
                raise ValueError(f"Cannot parse tool name: {function_name}")
        
        # Parse arguments (they come as a JSON string)
        try:
            arguments = json.loads(tool_call.function.arguments)
        except json.JSONDecodeError:
            arguments = {}
        
        return cls(
            id=tool_call.id,
            server_name=server_name,
            tool_name=tool_name,
            arguments=arguments
        )

Why parse the name? The LLM returns "weather__get_forecast", but our MCP client needs to know: "Call the get_forecast tool on the weather server."

ToolResult: What Actually Happened

After executing a tool via the MCP client, we get a result. This needs to go back to the LLM in OpenAI’s format:

@dataclass
class ToolResult:
    """Represents the result of a tool execution"""
    tool_call_id: str
    server_name: str
    tool_name: str
    result: Any
    success: bool
    error: Optional[str] = None
    
    def to_openai_tool_message(self) -> Dict[str, Any]:
        """Convert to OpenAI tool message format"""
        if self.success:
            content = self._serialize_result(self.result)
        else:
            content = json.dumps({
                "error": self.error,
                "message": f"Tool {self.tool_name} failed"
            }, ensure_ascii=False)
        
        return {
            "role": "tool",
            "tool_call_id": self.tool_call_id,
            "name": f"{self.server_name}__{self.tool_name}",
            "content": content
        }

The _serialize_result method is crucial. MCP results can be complex objects with text, images, or resources. We need to convert them to JSON strings that the LLM can understand:

def _serialize_result(self, result: Any) -> str:
    """Serialize MCP result to JSON string"""
    try:
        if isinstance(result, str):
            return result
        
        # Handle MCP CallToolResult objects
        if hasattr(result, 'content'):
            content_items = []
            for item in result.content:
                if hasattr(item, 'type'):
                    # TextContent
                    if item.type == 'text':
                        content_items.append({
                            'type': 'text',
                            'text': item.text
                        })
                    # ImageContent
                    elif item.type == 'image':
                        content_items.append({
                            'type': 'image',
                            'data': item.data,
                            'mimeType': item.mimeType
                        })
                    # ResourceContent
                    elif item.type == 'resource':
                        content_items.append({
                            'type': 'resource',
                            'resource': {
                                'uri': item.resource.uri,
                                'mimeType': getattr(item.resource, 'mimeType', None),
                                'text': getattr(item.resource, 'text', None)
                            }
                        })
            
            result_dict = {
                'content': content_items,
                'isError': getattr(result, 'isError', False)
            }
            
            return json.dumps(result_dict, ensure_ascii=False)
        
        # Fallback for simple types
        if isinstance(result, (dict, list, int, float, bool, type(None))):
            return json.dumps(result, ensure_ascii=False)
        
        return json.dumps({'result': str(result)}, ensure_ascii=False)
        
    except Exception as e:
        logger.error(f"Failed to serialize result: {e}")
        return json.dumps({
            'error': 'Serialization failed',
            'message': str(e),
            'result_type': type(result).__name__
        }, ensure_ascii=False)

Why so complex? MCP supports rich content types (text, images, resources). We need to preserve this structure while converting to JSON for the LLM.
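
For instance, a single text result from a weather tool would serialize to something like this (illustrative output only):

{
  "content": [
    {
      "type": "text",
      "text": "Forecast: 72°F and sunny"
    }
  ],
  "isError": false
}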

The ChatManager: Orchestrating the Agent

Now for the main class. Let’s break it down step by step.

Initialization: Setting Up the Agent

class ChatManager:
    """Manages LLM conversations with automatic MCP tool integration"""
    
    def __init__(
        self,
        mcp_client: UniversalMCPClient,
        model: str = "anthropic/claude-3.5-sonnet",
        base_url: str = "https://openrouter.ai/api/v1",
        system_prompt: Optional[str] = None,
        max_iterations: int = 10,
        temperature: float = 0.7,
        history: Optional[List[Message]] = None
    ):
        self.mcp_client = mcp_client
        self.model = model
        self.max_iterations = max_iterations
        self.temperature = temperature
        
        # Initialize OpenAI client for OpenRouter
        self.openai_client = AsyncOpenAI(
            api_key=os.getenv("OPENROUTER_API_KEY"),
            base_url=base_url
        )
        
        # Conversation state
        self.conversation_history: List[Message] = history if history is not None else []
        self.system_prompt = system_prompt or (
            "You are a helpful assistant with access to various tools. "
            "Use the available tools when needed to answer user questions accurately."
        )
        
        # Add system message if history is empty
        if not self.conversation_history and self.system_prompt:
            self.conversation_history.append(
                Message(role="system", content=self.system_prompt)
            )
        
        # Build tool definitions
        self.tool_definitions: Dict[str, ToolDefinition] = {}
        self._build_tools_schema()

Key decisions:

  • We use OpenRouter as the LLM provider (supports OpenAI, Anthropic, and many others with one API)

  • We maintain conversation history (essential for context)

  • We build tool definitions immediately (converting all MCP tools to OpenAI format)

  • We set a max_iterations limit (prevents infinite loops if the LLM keeps calling tools)

Building the Tool Schema

def _build_tools_schema(self):
    """Build tool definitions from MCP client"""
    self.tool_definitions.clear()
    
    # Get all tools from all servers (using our UniversalMCPClient from Part 2)
    all_tools = self.mcp_client.list_tools()
    
    for server_name, tools in all_tools.items():
        for mcp_tool in tools:
            tool_def = ToolDefinition.from_mcp_tool(server_name, mcp_tool)
            self.tool_definitions[tool_def.full_name] = tool_def
    
    logger.info(f"Built {len(self.tool_definitions)} tool definitions")

This is where Part 2 connects to Part 3. We use the UniversalMCPClient.list_tools() method we built in Part 2 to discover all available tools, then convert each one to OpenAI format using ToolDefinition.from_mcp_tool().

The Agentic Loop: Where the Magic Happens

This is the heart of ChatManager. When a user sends a message, we enter a loop where the LLM can call tools multiple times until it has enough information to answer:

async def send_message(self, user_message: str) -> str:
    """Send a message and get response (with automatic tool calling)"""
    # Add user message to history
    self.conversation_history.append(
        Message(role="user", content=user_message)
    )
    
    logger.info(f"User message: {user_message}")
    
    # Process conversation with tool calling loop
    final_response = await self._process_conversation_loop()
    
    logger.info(f"Assistant response: {final_response}")
    
    return final_response

The real work happens in _process_conversation_loop():

async def _process_conversation_loop(self) -> str:
    """Main conversation loop with automatic tool calling"""
    iteration = 0
    
    while iteration < self.max_iterations:
        iteration += 1
        logger.info(f"Conversation iteration {iteration}/{self.max_iterations}")
        
        # Step 1: Call LLM with current conversation history
        response_message = await self._call_llm()
        
        # Step 2: Add assistant message to history
        assistant_message = Message.from_openai_response(
            response_message,
            self.tool_definitions
        )
        self.conversation_history.append(assistant_message)
        
        # Step 3: Check if LLM wants to call tools
        if assistant_message.tool_calls:
            logger.info(f"LLM requested {len(assistant_message.tool_calls)} tool calls")
            
            # Step 4: Execute all tool calls (in parallel!)
            tool_results = await self._execute_tool_calls(assistant_message.tool_calls)
            
            # Step 5: Add tool results to history
            for result in tool_results:
                tool_message_dict = result.to_openai_tool_message()
                tool_message = Message(
                    role="tool",
                    content=tool_message_dict["content"],
                    tool_call_id=tool_message_dict["tool_call_id"],
                    name=tool_message_dict["name"]
                )
                self.conversation_history.append(tool_message)
            
            # Step 6: Continue loop to let LLM process tool results
            continue
        
        # No tool calls - we have the final response
        if assistant_message.content:
            return assistant_message.content
        else:
            logger.warning("LLM returned no content and no tool calls")
            return "I apologize, but I couldn't generate a response."
    
    # Max iterations reached
    logger.warning(f"Max iterations ({self.max_iterations}) reached")
    return "I apologize, but I couldn't complete the task within the allowed steps."

Let’s trace through an example:

User: “What’s the weather in New York and send me an email about it?”

Iteration 1:

  1. Call LLM with conversation history (just the user message)

  2. LLM returns: "I need to call weather__get_forecast with NY coordinates"

  3. Execute the tool via MCP client

  4. Add result to history

  5. Continue loop

Iteration 2:

  1. Call LLM with updated history (now includes weather data)

  2. LLM returns: "I need to call gmail__send_email with weather info"

  3. Execute the tool via MCP client

  4. Add result to history

  5. Continue loop

Iteration 3:

  1. Call LLM with full history

  2. LLM returns: “I’ve sent you an email with the weather forecast for New York. It’s currently 72°F and sunny.”

  3. No tool calls — return final response

Calling the LLM

async def _call_llm(self) -> Any:
    """Call the LLM with current conversation history"""
    # Prepare messages (convert our Message objects to OpenAI format)
    messages = [
        msg.to_openai_format()
        for msg in self.conversation_history
    ]
    
    # Prepare tools (convert our ToolDefinitions to OpenAI format)
    tools = [
        tool_def.to_openai_function()
        for tool_def in self.tool_definitions.values()
    ]
    
    # Call OpenAI API (via OpenRouter)
    try:
        response = await self.openai_client.chat.completions.create(
            model=self.model,
            messages=messages,
            tools=tools if tools else None,
            temperature=self.temperature
        )
        
        return response.choices[0].message
        
    except Exception as e:
        logger.error(f"LLM API call failed: {e}")
        raise

This is where all three parts connect:

  • Part 1: We understand that tools are just JSON Schema definitions

  • Part 2: We have an MCP client that can execute those tools

  • Part 3: We convert MCP tools to OpenAI format and let the LLM decide which to use
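
One building block we have been using without showing is the Message class that holds each turn of the conversation. Its implementation isn’t covered in this article, but based on how it’s used above (from_openai_response, to_openai_format, and the role/content/tool_call_id/name fields), a minimal sketch might look like this:

from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

@dataclass
class Message:
    """Minimal sketch of a conversation message (fields inferred from usage)."""
    role: str                           # "system" | "user" | "assistant" | "tool"
    content: Optional[str] = None
    tool_calls: List["ToolCall"] = field(default_factory=list)
    tool_call_id: Optional[str] = None  # set on role="tool" messages
    name: Optional[str] = None          # "server__tool" name for tool messages
    raw_tool_calls: Any = None          # original OpenAI tool_call objects, kept for replay

    @classmethod
    def from_openai_response(cls, response_message: Any,
                             tool_definitions: Dict[str, "ToolDefinition"]) -> "Message":
        """Wrap an OpenAI chat completion message, parsing any requested tool calls."""
        raw_calls = getattr(response_message, "tool_calls", None) or []
        parsed = [ToolCall.from_openai_tool_call(tc, tool_definitions) for tc in raw_calls]
        return cls(role="assistant", content=response_message.content,
                   tool_calls=parsed, raw_tool_calls=raw_calls or None)

    def to_openai_format(self) -> Dict[str, Any]:
        """Convert back to the dict shape the OpenAI SDK expects."""
        msg: Dict[str, Any] = {"role": self.role, "content": self.content}
        if self.role == "assistant" and self.raw_tool_calls:
            msg["tool_calls"] = [
                {"id": tc.id, "type": "function",
                 "function": {"name": tc.function.name, "arguments": tc.function.arguments}}
                for tc in self.raw_tool_calls
            ]
        if self.role == "tool":
            msg["tool_call_id"] = self.tool_call_id
            msg["name"] = self.name
        return msg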

Executing Tools in Parallel

When the LLM requests multiple tools, we execute them concurrently:

async def _execute_tool_calls(
    self,
    tool_calls: List[ToolCall]
) -> List[ToolResult]:
    """Execute multiple tool calls (in parallel if possible)"""
    logger.info(f"Executing {len(tool_calls)} tool calls")
    
    # Execute all tool calls in parallel using asyncio.gather
    tasks = [
        tool_call.execute(self.mcp_client)
        for tool_call in tool_calls
    ]
    
    results = await asyncio.gather(*tasks, return_exceptions=True)
    
    # Convert exceptions to failed ToolResults
    final_results = []
    for i, result in enumerate(results):
        if isinstance(result, Exception):
            tool_call = tool_calls[i]
            final_results.append(
                ToolResult(
                    tool_call_id=tool_call.id,
                    server_name=tool_call.server_name,
                    tool_name=tool_call.tool_name,
                    result=None,
                    success=False,
                    error=str(result)
                )
            )
        else:
            final_results.append(result)
    
    # Log results
    for result in final_results:
        if result.success:
            logger.info(f"{result.server_name}.{result.tool_name} succeeded")
        else:
            logger.error(f"{result.server_name}.{result.tool_name} failed: {result.error}")
    
    return final_results

Why parallel execution? If the LLM wants to check weather in 5 cities, we can call all 5 weather APIs simultaneously instead of waiting for each one sequentially. This dramatically speeds up responses.

The ToolCall.execute() method is simple:

async def execute(self, mcp_client: UniversalMCPClient) -> 'ToolResult':
    """Execute this tool call via MCP client"""
    try:
        # This uses the UniversalMCPClient.call_tool() method from Part 2!
        result = await mcp_client.call_tool(
            self.server_name,
            self.tool_name,
            self.arguments
        )
        
        return ToolResult(
            tool_call_id=self.id,
            server_name=self.server_name,
            tool_name=self.tool_name,
            result=result,
            success=True,
            error=None
        )
    except Exception as e:
        logger.error(f"Tool execution failed: {self.server_name}.{self.tool_name}: {e}")
        return ToolResult(
            tool_call_id=self.id,
            server_name=self.server_name,
            tool_name=self.tool_name,
            result=None,
            success=False,
            error=str(e)
        )

This is the bridge to Part 2: We use the UniversalMCPClient.call_tool() method we built in Part 2 to actually execute the tool via MCP protocol.

Complete Example: Putting It All Together

Here’s a full example showing how all three parts work together:

import asyncio
from MCP_Client import UniversalMCPClient, ServerConfig
from ChatManager import ChatManager

async def main():
    # Step 1: Create MCP client (from Part 2)
    mcp_client = UniversalMCPClient()
    
    # Step 2: Connect to MCP servers (using concepts from Part 1)
    await mcp_client.add_server(ServerConfig(
        name="weather",
        transport="stdio",
        command="python",
        args=["weather.py"]
    ))
    
    await mcp_client.add_server(ServerConfig(
        name="gmail",
        transport="streamable_http",
        url="http://localhost:8080/mcp"
    ))
    
    # Step 3: Create ChatManager (Part 3 - this article!)
    chat_manager = ChatManager(
        mcp_client=mcp_client,
        model="anthropic/claude-3.5-sonnet",
        system_prompt="You are a helpful assistant with access to weather and email tools."
    )
    
    # Step 4: Have a conversation
    print("🤖 AI Agent Ready! Available tools:")
    for tool_name in chat_manager.tool_definitions.keys():
        print(f"   - {tool_name}")
    print()
    
    # Example 1: Simple tool usage
    response = await chat_manager.send_message(
        "What's the weather in New York?"
    )
    print(f"Assistant: {response}\n")
    
    # Example 2: Multi-tool orchestration
    response = await chat_manager.send_message(
        "Check the weather in San Francisco and send me an email with the forecast"
    )
    print(f"Assistant: {response}\n")
    
    # Example 3: Complex reasoning
    response = await chat_manager.send_message(
        "Compare the weather in New York, London, and Tokyo, "
        "then email me which city has the best weather today"
    )
    print(f"Assistant: {response}\n")
    
    # Cleanup
    await mcp_client.disconnect_all()

if __name__ == "__main__":
    asyncio.run(main())

What happens behind the scenes:

  1. MCP Client (Part 2) connects to weather and Gmail servers

  2. ChatManager (Part 3) discovers all available tools and converts them to OpenAI format

  3. User asks a question

  4. ChatManager sends the question to the LLM along with available tools

  5. LLM decides which tools to call (e.g., weather__get_forecast)

  6. ChatManager parses the tool calls and executes them via MCP Client

  7. MCP Client (Part 2) sends the request to the appropriate server using MCP protocol (Part 1)

  8. Server returns results

  9. ChatManager adds results to conversation history

  10. Loop continues until LLM has enough information

  11. LLM generates final response

  12. User gets answer

The Power of This Architecture

This three-layer architecture is incredibly powerful:

Layer 1 (MCP Protocol — Part 1):

  • Standardized tool definitions

  • Works with any server

  • Transport-agnostic

Layer 2 (MCP Client — Part 2):

  • Connects to any MCP server

  • Handles all three transports

  • Manages sessions and cleanup

Layer 3 (ChatManager — Part 3):

  • Automatic tool selection

  • Multi-step reasoning

  • Natural language interface

The result? An AI agent that can:

  • Understand complex requests

  • Break them into steps

  • Execute multiple tools

  • Synthesize results

  • Provide coherent answers

All while being completely modular. Want to add a new tool? Just connect a new MCP server. Want to use a different LLM? Just change the model parameter. Want to customize behavior? Adjust the system prompt.
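
For instance, switching models is a one-line change (the model ID below is an OpenRouter-style example; check OpenRouter’s catalog for the exact identifier you want):

# Same agent wiring, different model: only the model string changes
chat_manager = ChatManager(
    mcp_client=mcp_client,
    model="openai/gpt-4o"  # example OpenRouter model ID
)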

Advanced Features

Conversation History Management

def get_history(self) -> List[Message]:
    """Get conversation history"""
    return self.conversation_history.copy()

def clear_history(self):
    """Clear conversation history (keeps system prompt)"""
    self.conversation_history = []
    if self.system_prompt:
        self.conversation_history.append(
            Message(role="system", content=self.system_prompt)
        )
    logger.info("Conversation history cleared")

def add_system_message(self, content: str):
    """Add a system message to the conversation"""
    self.conversation_history.append(
        Message(role="system", content=content)
    )
    logger.info(f"System message added: {content}")

Why this matters: You can inject context mid-conversation. For example, if the user uploads a document, you can add a system message: “The user has uploaded a document containing…”
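
A quick sketch of that pattern (the document-upload scenario and summary text here are hypothetical):

# Inject context mid-conversation, e.g. after the user uploads a document
chat_manager.add_system_message(
    "The user has uploaded a document containing Q3 sales figures."
)

response = await chat_manager.send_message(
    "Summarize the key numbers from the document I uploaded."
)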

Dynamic Tool Refresh

def refresh_tools(self):
    """Refresh tool definitions from MCP client"""
    logger.info("Refreshing tool definitions")
    self._build_tools_schema()

Use case: If you connect a new MCP server mid-conversation, call refresh_tools() to make its tools available to the LLM.
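
For example (the calendar server below is hypothetical):

# Connect a new MCP server mid-conversation (ServerConfig from Part 2)
await mcp_client.add_server(ServerConfig(
    name="calendar",
    transport="streamable_http",
    url="http://localhost:9000/mcp"
))

# Make the new server's tools visible to the LLM on the next turn
chat_manager.refresh_tools()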

Error Handling and Edge Cases

The ChatManager handles several edge cases:

  1. Tool execution failures:

# If a tool fails, we return an error message to the LLM
# The LLM can then decide to retry, use a different tool, or inform the user
if not result.success:
    content = json.dumps({
        "error": result.error,
        "message": f"Tool {result.tool_name} failed"
    })

  2. Max iterations:

# Prevents infinite loops if the LLM keeps calling tools
if iteration >= self.max_iterations:
    return "I apologize, but I couldn't complete the task within the allowed steps."

  3. Empty responses:

# If the LLM returns neither content nor tool calls
if not assistant_message.content and not assistant_message.tool_calls:
    return "I apologize, but I couldn't generate a response."

Performance Considerations

Parallel Tool Execution: The biggest performance win comes from executing multiple tools simultaneously:

# Instead of this (sequential):
results = []
for tool_call in tool_calls:
    result = await tool_call.execute(mcp_client)
    results.append(result)

# We do this (parallel):
tasks = [tool_call.execute(mcp_client) for tool_call in tool_calls]
results = await asyncio.gather(*tasks)

The Complete Picture: How All Three Parts Connect

Let’s trace a complete request through all three layers:

User: “What’s the weather in Paris and email me about it?”

Part 3 (ChatManager):

  1. Receives user message

  2. Converts MCP tools to OpenAI format

  3. Calls LLM with tools

LLM Response:

{
  "tool_calls": [
    {
      "id": "call_1",
      "function": {
        "name": "weather__get_forecast",
        "arguments": "{\"latitude\": 48.8566, \"longitude\": 2.3522}"
      }
    }
  ]
}

Part 3 (ChatManager):

  4. Parses the tool call

  5. Extracts: server="weather", tool="get_forecast", args={…}

Part 2 (MCP Client):

  6. Looks up the "weather" server

  7. Gets the session for that server

  8. Calls session.call_tool("get_forecast", {...})

Part 1 (MCP Protocol):

  9. Formats the JSON-RPC message:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "get_forecast",
    "arguments": {"latitude": 48.8566, "longitude": 2.3522}
  }
}

  10. Sends via stdio/HTTP/SSE transport

  11. Weather server executes

  12. Returns result

Part 2 (MCP Client):

  13. Receives result

  14. Returns to ChatManager

Part 3 (ChatManager):

  15. Converts result to OpenAI format

  16. Adds to conversation history

  17. Calls LLM again with weather data

LLM Response:

{
  "tool_calls": [
    {
      "id": "call_2",
      "function": {
        "name": "gmail__send_email",
        "arguments": "{\"to\": \"[email protected]\", \"subject\": \"Paris Weather\", ...}"
      }
    }
  ]
}

Parts 3 → 2 → 1: Same flow for Gmail tool

Final LLM Response:

{
  "content": "I've checked the weather in Paris (currently 18°C and partly cloudy) and sent you an email with the detailed forecast."
}

Part 3 (ChatManager):

  18. Returns final response to user

What’s Next: Building the Complete Chat Interface

This is Version 1 of ChatManager — the core engine. But a chat engine without an interface is like a car without a steering wheel.

Version 2: The Complete Chat Application

We’re building a full-featured chat interface similar to Claude Desktop:

Frontend Features:

  • Chat Interface: Beautiful message display with streaming responses

  • Session Management: Create, switch, and manage multiple conversation sessions

  • MCP Server Dashboard: Visual interface to add/remove MCP servers (extending the current SettingsModal)

  • Tool Visualization: See tool calls and results in real-time as they execute

  • Conversation History: Persistent storage of all your conversations

Backend Enhancements:

  • WebSocket Streaming: Real-time token-by-token response streaming

  • Session Persistence: SQLite database for conversation history

  • Session API: Full CRUD operations for chat sessions

  • Tool Call Tracking: Detailed logging of all tool executions

The Vision: A production-ready MCP chat application where you can:

  1. Connect to any MCP server (weather, Gmail, databases, etc.)

  2. Have natural conversations with AI

  3. Watch as the AI automatically calls tools to answer your questions

  4. Manage multiple conversation sessions

  5. See your complete conversation history

This isn’t just a demo — it’s a complete, deployable application that rivals Claude Desktop, but with the flexibility to connect to ANY MCP server you want.

Conclusion: The AI Agent Revolution

We’ve now completed the full journey:

Part 1: Understanding the MCP protocol — the foundation

Part 2: Building an MCP client — the connection layer

Part 3: Creating ChatManager — the intelligence layer

Together, these three parts create a complete AI agent system:

  • Standardized (works with any MCP server)

  • Intelligent (LLM decides which tools to use)

  • Automatic (no manual tool selection needed)

  • Extensible (add new tools by connecting new servers)

The beauty of this architecture is its simplicity. Each layer has a clear responsibility:

  • MCP protocol: Define and execute tools

  • MCP client: Connect and communicate

  • ChatManager: Think and orchestrate

This is the future of AI applications. Not chatbots that just talk, but agents that actually DO things. Agents that can check your email, update your calendar, analyze data, generate reports, and synthesize information from dozens of sources — all through natural conversation.

The code is production-ready. The architecture is proven. The ecosystem is growing.

Start building. The AI agent revolution is here.

Full code available: Contact me for the complete implementation
