Rethinking MCP Architecture for AI Agents
The Model Context Protocol (MCP) ecosystem is exploding with specialized tools—weather lookups, file managers, database connectors, and dozens more. But what if we’re approaching this all wrong? What if instead of creating 30 different tools, we should give AI agents access to something much more powerful: the ability to write and execute code directly?
This isn’t just a theoretical idea. Developers are already building MCP servers that expose programming languages as tools, and the results are surprisingly effective. Here’s why this approach might be the future of AI tool integration.
The Problem with Tool Proliferation
Traditional MCP implementations face several fundamental challenges:
Platform Dependencies
Command-line tools often break across different operating systems or versions. An AI agent that works perfectly on macOS might fail completely on Windows because of subtle differences in how tools handle arguments or file paths.
Context Limitations
Each tool requires its own documentation, parameters, and examples. With 30 tools, you’re consuming massive amounts of context just describing what’s available—leaving less room for actual problem-solving.
State Management Issues
CLI tools are stateless by design. If an AI agent needs to run a multi-step debugging session (like using LLDB to debug a crash), it has to remember session state across multiple tool calls. This leads to agents losing track of their work and starting over.
Composability Problems
Real-world tasks require combining multiple tools in creative ways. But when each tool is isolated, agents struggle to chain operations effectively. They can’t easily pipe output from one tool to another or maintain shared state across operations.
The Code-First Alternative
Instead of exposing dozens of specialized tools, what if we exposed a single powerful tool: a programming language interpreter? This is exactly what projects like pexpect-mcp are doing.
Here’s the core idea:
{
  "name": "python_executor",
  "description": "Execute Python code in a persistent session. The session maintains state between calls, and you have access to all standard libraries plus specialized tools like pexpect for process interaction.",
  "parameters": {
    "code": {"type": "string", "description": "Python code to execute"}
  }
}
That’s it. One tool. But this single tool is incredibly powerful because:
- Persistent State: Variables and objects persist between calls (see the sketch after this list)
- Full Language Access: All the programming patterns the AI learned during training
- Library Ecosystem: Access to thousands of Python packages
- Composability: Natural ability to combine operations in code
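To make the persistent-state point concrete, here is roughly what two consecutive calls to the tool defined above could send as their code argument:

# First call
data = [2, 3, 5, 7]
total = sum(data)
print(total)

# Second call (a separate tool invocation)
print(total * 2)  # 'total' still exists because the session persisted

Because the interpreter session lives on the server, the second call builds directly on what the first one defined, with no extra state-management machinery.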
Real-World Example: Debugging with Code
Consider debugging a crashing C program. With traditional MCP tools, you’d need:
- An LLDB tool
- A file reading tool
- A process spawning tool
- A session management tool
- Error handling across all of them
With a code-first approach, the AI agent simply writes Python:
import pexpect
# Start LLDB with the crashing program
child = pexpect.spawn('lldb ./demo-buggy')
child.expect('(lldb)')
# Set a breakpoint and run
child.sendline('breakpoint set -n main')
child.expect('(lldb)')
child.sendline('run')
child.expect('(lldb)')
# Examine the crash
child.sendline('bt')
child.expect('(lldb)')
backtrace = child.before.decode()
print(f"Backtrace: {backtrace}")
This approach has several advantages:
Natural Patterns
The AI doesn’t need to learn new tool APIs—it’s writing standard Python code using libraries it already understands from training.
Error Handling
Python’s exception handling works naturally. If something goes wrong, the AI can catch exceptions and adapt its approach.
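For example, in the LLDB session above the agent could wrap the interaction in ordinary exception handling and react when the debugger is slower than expected. This is a sketch; the timeout values are illustrative:

import pexpect

child = pexpect.spawn('lldb ./demo-buggy')
try:
    child.expect('(lldb)', timeout=10)
except pexpect.TIMEOUT:
    # No prompt appeared in time; adapt instead of giving up
    print("LLDB is slow to start; retrying with a longer timeout")
    child.expect('(lldb)', timeout=60)
except pexpect.EOF:
    print("LLDB exited unexpectedly:", child.before.decode())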
Reusability
After solving a problem once, the AI can easily convert its solution into a standalone script that humans can run independently.
Debugging Capability
The AI can inspect its own state using standard Python debugging techniques like print(), dir(), and repr().
Beyond Python: JavaScript for Web Automation
The same principle applies to other domains. Instead of exposing 30 Playwright tools for web automation, expose a JavaScript interpreter with Playwright available:
// Navigate to a page and extract data
const page = await browser.newPage();
await page.goto('https://example.com');
// Collect data from multiple pages
const results = [];
for (let i = 1; i <= 10; i++) {
  await page.goto(`https://example.com/page/${i}`);
  const data = await page.evaluate(() => {
    return document.querySelector('.content').textContent;
  });
  results.push(data);
}
// Results automatically available for next operations
console.log(`Collected ${results.length} items`);
This eliminates the need for separate tools for navigation, data extraction, iteration, and state management.
The Security Question
Critics often raise security concerns about giving AI agents the ability to execute arbitrary code. But this misses a key point: we’re already doing this.
When we give an AI agent access to:
- File system tools
- Network tools
- Process spawning tools
- Database tools
we’ve essentially given it the ability to do anything code execution could do, just in a more cumbersome way.
The real security comes from proper sandboxing and monitoring, not from restricting the interface. Whether an AI agent deletes your files through a “file deletion tool” or through Python’s os.remove() doesn’t matter—the key is preventing unauthorized file deletion entirely.
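As a sketch of what that can look like, the snippet below runs each piece of code in a throwaway container with no network access and capped resources. The image name and limits are illustrative, not a recommendation:

import subprocess

def run_sandboxed(code: str, timeout: int = 30) -> str:
    # Execute the snippet in a disposable container: no network, capped memory/CPU
    result = subprocess.run(
        ["docker", "run", "--rm", "--network=none", "--memory=256m", "--cpus=1",
         "python:3.12-slim", "python", "-c", code],
        capture_output=True, text=True, timeout=timeout,
    )
    return result.stdout if result.returncode == 0 else result.stderr

The trade-off is that a fresh container per call gives up the persistent session, so a production setup would more likely keep one long-lived, sandboxed interpreter process per session.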
Implementation Patterns
Building code-first MCP servers is surprisingly straightforward:
Basic Python Executor
class PythonMCP:
    def __init__(self):
        self.globals = {}  # Persistent state shared across calls

    def execute_code(self, code, timeout=30):
        # NOTE: timeout is accepted but not enforced in this minimal sketch
        try:
            # Evaluate expressions so their value can be returned; fall back
            # to exec() for statements (assignments, imports, loops, ...)
            try:
                result = eval(code, self.globals)
            except SyntaxError:
                exec(code, self.globals)
                result = None
            return {"success": True, "result": str(result)}
        except Exception as e:
            return {"success": False, "error": str(e)}
Enhanced with Libraries
import json

import pexpect
import requests

class EnhancedPythonMCP:
    def __init__(self):
        self.globals = {
            'pexpect': pexpect,
            'requests': requests,
            'json': json,
            # Pre-load commonly used libraries
        }

    def execute_code(self, code, timeout=30):
        # Add timeout handling, output capture, etc. (see the sketch below)
        pass
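One way to fill in that placeholder is to capture anything the executed snippet prints; genuine timeout enforcement usually means running the snippet in a separate worker process, which is left out of this sketch:

import contextlib
import io

def execute_capturing_output(code, namespace):
    # Capture print() output from the executed snippet
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
        return {"success": True, "output": buffer.getvalue()}
    except Exception as e:
        return {"success": False, "error": str(e), "output": buffer.getvalue()}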
When Code-First Makes Sense
This approach works particularly well for:
Development Tasks
- Debugging applications
- Running tests
- Code analysis
- Build automation
Data Processing
- ETL operations
- Analysis workflows
- Report generation
System Administration
- Log analysis
- Monitoring setup
- Configuration management
Research and Exploration
- API discovery
- Data investigation
- Prototype development
Limitations and Considerations
The code-first approach isn’t perfect:
Learning Curve
AI agents need to understand the available libraries and their APIs. For well-known libraries like requests or pexpect, this works great. For custom or obscure libraries, more guidance is needed.
Verbosity
Code can be more verbose than specialized tool calls. A simple weather lookup might be longer in Python than calling a dedicated weather tool.
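For example, where a dedicated tool call might need nothing more than a city name, the code-first version has to spell out the HTTP request itself. The endpoint and response field below are hypothetical, shown only to illustrate the extra verbosity:

import requests

# Hypothetical weather endpoint and response field
response = requests.get("https://api.example-weather.test/v1/current",
                        params={"city": "Berlin"}, timeout=10)
print(response.json()["temperature_c"])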
Context Usage
While one tool description is simpler than thirty, the actual code calls can be longer and use more context.
The Future of AI Tool Integration
The success of code-first MCP implementations suggests a broader principle: instead of creating ever-more-specialized tools, we should focus on giving AI agents access to powerful, general-purpose capabilities they already understand.
This doesn’t mean abandoning specialized tools entirely. But it does mean being thoughtful about when to create a new tool versus when to leverage programming languages that AI agents already know well.
The most powerful AI systems of the future won’t be those with the most tools—they’ll be those that can write and execute code to solve problems we haven’t even thought of yet.
Getting Started
If you want to experiment with code-first MCP:
- Try existing implementations: Projects like pexpect-mcp provide working examples
- Start simple: Begin with basic Python execution and add libraries as needed (see the sketch after this list)
- Focus on state: The persistent session is what makes this approach powerful
- Think about sandboxing: Use containers or VMs to limit blast radius
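To see the whole shape end to end before adopting a real MCP server library, a toy stdin/stdout loop around the PythonMCP class sketched earlier is enough to experiment with. This is not the actual MCP wire protocol, just a stand-in transport:

import json
import sys

executor = PythonMCP()  # the basic executor class sketched earlier

# Toy transport: one JSON object per line in, one JSON result per line out
for line in sys.stdin:
    request = json.loads(line)
    result = executor.execute_code(request["code"])
    print(json.dumps(result), flush=True)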
The paradigm shift from “many specialized tools” to “one powerful programming interface” represents a fundamental change in how we think about AI tool integration. It’s worth exploring, especially as AI agents become more sophisticated and capable of handling complex, multi-step tasks.
This approach to MCP design prioritizes the programming capabilities that AI agents already possess, rather than forcing them to learn dozens of new tool interfaces. The results suggest that sometimes the best tool is the ability to write tools.