MCP servers that mutate your platform/infrastructure
An AI Industry `hold my beer` kind of idea
Just when you think we have finally learned some hard-won lessons... the AI industrial complex is here, tripping over piles of cash and shouting “hold my beer!”
Of all the half-thought-out things I have encountered, the idea of having MCP servers make changes to your infrastructure or cloud platform has got to be up there with the daftest of them all.
Remember the days of engineers saying “it works on my machine”? Those days when deployment to production was a white-knuckle ride, when you deployed over weekends to limit the blast radius of something going wrong. When you could never be sure that nothing had been missed until either everything worked fine, or you stared in horror at the chaos unfolding while desperately trying to roll back to the previous stable state.
I don’t think there is a single engineer who misses those days. So cue my utter amazement when I started seeing MCP servers appear that allow your favourite coding agent or AI tooling to literally make changes to the foundational pieces of the systems we build. Worst of all, this is being pushed by the very platforms we build on.
For some bizarre reason, these vendors think engineers want AI tooling to directly mutate their platforms. Worse yet, they think that AI and Agentic AI “changes everything”.
I am here to tell you that AI absolutely bloody does not change anything when it comes to this topic! For more than a decade we fought the good fight to make the source repo the thing that drives everything. We adopted Infrastructure as Code (IaC) because it is the only reliable way to evolve a system and platform. The code became the source of truth. We could trust what we had in our repos because what we had in our repos was what got deployed. Gone were the days of “ClickOps” and the horrors it inflicted.
Yet here we are with vendors trying to get us to hook non-deterministic tools to software that will directly make changes to platforms. The amnesia of the tech industry is baffling.
Why is this a bad idea?
Firstly, if changes are made directly to a platform and never recorded in code, you have no reliable record of what changed. That means you cannot reliably deploy between different environments. How do you make the same changes between environments reliably? Sure, you could save your instructions in a “plan” file in your repo for the agent to use next time, but isn’t it easier to just save the Terraform, Pulumi, Ansible, etc. code instead?
Secondly, AI tooling is susceptible to prompt injection, just like every other LLM-based tool. You do not want a rogue agent skill, or even just rogue instructions in source code, to cause modifications to your infrastructure. On top of all of this, LLMs are not deterministic, so you have no way of knowing what your AI tooling will do each time it is given the same instructions.
Thirdly, even if your coding agent is telling you it is doing X via an MCP server, you cannot be sure what it is actually doing unless you inspect every single call it makes. If you are not able to see every tool call, and see every single set of arguments passed to the tools, then you cannot have absolute confidence about what is happening.
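To make "inspect every single call" concrete, here is a minimal sketch of what recording each tool call and its arguments could look like. It is plain Python and does not use any real MCP SDK; the `audited` wrapper, the `describe_bucket` tool, and the log format are all hypothetical names invented for illustration.

```python
import json
from typing import Any, Callable


def audited(tool_name: str, tool: Callable[..., Any], log: list[str]) -> Callable[..., Any]:
    """Wrap a tool so every call and its arguments are recorded before it runs."""

    def wrapper(**arguments: Any) -> Any:
        # Record the tool name and the exact arguments as one JSON line.
        log.append(json.dumps({"tool": tool_name, "arguments": arguments}, sort_keys=True))
        return tool(**arguments)

    return wrapper


# Hypothetical read-only tool standing in for a real platform query.
def describe_bucket(name: str) -> dict[str, Any]:
    return {"name": name, "versioning": False}


audit_log: list[str] = []
describe = audited("describe_bucket", describe_bucket, audit_log)
describe(name="assets")

for entry in audit_log:
    print(entry)
```

Even with a log like this, you only find out what happened after the fact. That is fine for read-only calls, and far too late for anything that mutates a platform.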
MCP servers are not the problem though. The problem is the recklessness that is being presented as progress. The idea that we should abandon hard-won best practices in the “AI Age” is infuriating and insulting.
How should this work?
Always start from the code. What is in your git repo is the gold standard. I cannot stress this enough: ALWAYS start with the code.
Want to make changes to your platform? You change your IaC code and deploy. Want to mutate some other part of your solution, say database schemas? Same thing: change the code in your repo and deploy from there. Your coding agent should always generate the code to achieve your goal. If you do this, you are in full control: you can review the proposed changes and be certain that what you see is what you get.
I am not against MCP servers. What I am against are MCP servers that mutate your infrastructure or cloud platform.
Personally, I find MCP servers for cloud platforms very useful when they offer the ability to investigate and get information from the platforms. If I want to make changes to my platform, I can ask my coding agent to investigate the current state and propose how it would evolve my code to achieve the results I want. MCP shines here because it gets context-relevant information without any side effects. I control the side effects by making changes to my code, either with or without my coding agent.
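One way to picture "information without side effects" is a deny-by-default gate that only lets read-only tools through. The sketch below is an illustration under assumptions, not how any particular MCP client or server works: the tool names, the `READ_ONLY_TOOLS` allowlist, and the `call_tool` helper are all hypothetical.

```python
from typing import Any, Callable


class ToolNotAllowed(Exception):
    """Raised when the agent asks for a tool that is not on the allowlist."""


# Hypothetical tool names; anything not listed here is refused.
READ_ONLY_TOOLS = frozenset({"list_resources", "describe_resource", "get_metrics"})


def call_tool(name: str, tools: dict[str, Callable[..., Any]], /, **arguments: Any) -> Any:
    """Run a tool only if it is explicitly allowlisted as read-only."""
    if name not in READ_ONLY_TOOLS:
        raise ToolNotAllowed(f"{name!r} is not on the read-only allowlist")
    return tools[name](**arguments)


# Stand-in implementations; a real server would query the platform.
tools: dict[str, Callable[..., Any]] = {
    "describe_resource": lambda resource_id: {"id": resource_id, "state": "running"},
    "delete_resource": lambda resource_id: {"id": resource_id, "state": "deleted"},
}

print(call_tool("describe_resource", tools, resource_id="vm-1"))

try:
    call_tool("delete_resource", tools, resource_id="vm-1")
except ToolNotAllowed as error:
    print(f"refused: {error}")
```

The allowlist is deliberately explicit. A mutating tool that is added later stays blocked until someone consciously decides otherwise, and the actual change still has to go through the repo.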
This is the way MCP servers for cloud platforms and infrastructure should be used. They should serve as a source of context-relevant information, not as the mechanism through which you effect change. If you follow this approach, you will accelerate your delivery while keeping long-term stability. If you decide to make changes to your platforms via MCP, it will end with outages and chaos. Which path you go down is your choice, so choose wisely.
