2023-11-07
by Airwalk Reply Head of Engineering Jim Lamb
Way back in time, there were software developers and systems administrators. And sadness. Sadness that the developers’ code didn’t like the sysadmins’ infrastructure and sadness that the sysadmins’ infrastructure didn’t like the developers’ code.
Then came DevOps, and everyone got happy. Happy that they were all pulling in the same direction with common goals. Developers were able to treat infrastructure like code and sysadmins were able to automate everything. Tooling improved. Processes were refined. Developers built things and ran them, in many cases 24x7. Everyone was called a ‘DevOps Engineer’; we started creating teams that were fully responsible for developing and maintaining products, and productivity boomed.
Or did it?
The above story was true in many organisations, mostly the very well-disciplined ones. However, what happened elsewhere, particularly in larger organisations, was that the burden of creating and supporting infrastructure components was simply passed to the individual product-based DevOps teams. Even with the [supposedly] easy-to-manage public cloud, developers were spending much of their time on tasks that weren’t directly concerned with writing, testing or deploying code, and so weren’t improving the product being delivered.
Who was maintaining the source control system? The artefact repository? The CI/CD system? Who was upgrading the Kubernetes clusters? The container images? The virtual machine images? Often, the semi-autonomous DevOps teams were doing all of this themselves, each in isolation. Maybe some of this work was left with a central team, run by the rebranded sysadmins, but new dividing lines appeared. Maybe a centralised source control or CI/CD system was provided, but it may not have met the needs of all the DevOps teams, so they might have created their own. Maybe there was a centrally provided PaaS, but with easy access to the numerous services offered by public cloud providers, teams may have chosen something that better suited the needs of the product they were developing.
Sprawl happened.
Now every DevOps team was using a different set of technologies and managing them in a different way. Everyone was doing DevOps but with little or no common ground. Through some lenses this may have been fine, but the amount of time being spent across the organisation on maintaining these toolsets was getting far too high. It was also inhibiting the movement of staff across teams because the learning curves were getting steeper. The ‘you build it, you run it’ ideal was proving too costly and too difficult to achieve.
We need to take a step back and look at where the talents of individual DevOps teams can make a real difference and let them control those areas of the technology. Then we should provide the rest – the parts that can be commoditised – centrally.
But we must not throw away the learnings of effective product teams; instead, we must apply them to this centralised provision, creating a product team dedicated to providing an ‘Internal Developer Platform’, or IDP. This provision is what has become known as Platform Operations, or PlatformOps.
This is more than just adding users to GitHub or patching the Jenkins server. Platform Operations is about providing a set of technologies to be consumed by all the individual DevOps teams in an organisation who are developing their individual products. The approach must be to treat the platform as a product; a product which is based on properly researched user requirements, is improved through genuine user testing and evolves by listening to its users. There must be a roadmap, a backlog, a prioritisation process and clearly identified ownership. The platform has to be treated as a long-term initiative, not just something that a few people are thrown at for a short time to plug a hole in functionality or quell some disquiet about a particular tool or process.
Let’s also not forget all the good stuff we learned from DevOps: the streamlining of processes, the upskilling and uniting of people, and the use of the best technologies for the job.
Ideally, the IDP should be self-service, allowing DevOps teams to go to a web portal or use an API to provision whatever environments and tools they might need to build and deploy their software products. This could include IaaS and PaaS services, containers, serverless functions, networking, databases, data stores, language runtimes, CDNs, deployment toolchains, software repositories and CI/CD.
If your organisation uses multiple cloud providers, then there should still be a single portal to manage any of the services just mentioned on any of the supported clouds.
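To make that concrete, here is a minimal sketch, in Python, of what a self-service request to such a portal’s API might look like. The endpoint, payload fields and token handling are illustrative assumptions, not the interface of any particular IDP product.

```python
import requests  # assumes the 'requests' library is available

# Hypothetical example only: the endpoint and payload fields are placeholders,
# not part of any specific IDP product.
PLATFORM_API = "https://platform.internal.example.com/api/v1"


def provision_environment(team: str, template: str, cloud: str, parameters: dict) -> dict:
    """Ask the internal developer platform to provision an environment
    from a pre-approved template, whichever cloud it lands on."""
    response = requests.post(
        f"{PLATFORM_API}/environments",
        json={
            "team": team,              # owning DevOps team, used for cost allocation
            "template": template,      # name of a catalogue entry, e.g. "web-service"
            "cloud": cloud,            # "aws", "azure" or "gcp", behind the same portal
            "parameters": parameters,  # team-specific overrides within allowed bounds
        },
        headers={"Authorization": "Bearer <token>"},  # placeholder credential
        timeout=30,
    )
    response.raise_for_status()
    return response.json()  # e.g. environment ID, endpoints, dashboards


# Example: a product team requests a small containerised environment on AWS.
# provision_environment("payments", "container-service", "aws", {"size": "small"})
```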
Through researching and understanding what all of your internal customer teams need, you can come up with a set of patterns which can be turned into parameterised templates and then into a service catalogue. These can then be modified and deployed as required through automation to create the services the teams need. Each of these templates must have a baseline set of best-practice standards baked in, including observability, security, resilience, performance, cost efficiency and scalability.
These templates still need to give your individual DevOps teams maximum flexibility; forcing them into rigid configuration or tool choices will cause frustration – they will choose their own tooling elsewhere and the mission will fail. For the same reason, the technologies used must be modern and kept up-to-date. This should be part of the ongoing development of the platform itself.
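As a rough illustration of that balance between baked-in standards and team-level flexibility, the sketch below models a catalogue entry as a parameterised template whose baseline settings (resilience, observability, security) are fixed, while a small set of parameters is exposed for teams to override. The field names and defaults are assumptions made purely for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceTemplate:
    """A parameterised pattern in the internal service catalogue (illustrative only)."""
    name: str
    runtime: str                      # e.g. "python3.12", "nodejs20"
    min_replicas: int = 2             # resilience baked in by default
    autoscaling: bool = True          # scalability baked in by default
    metrics_enabled: bool = True      # observability baked in by default
    encryption_at_rest: bool = True   # security baked in by default
    allowed_overrides: set = field(default_factory=lambda: {"runtime", "min_replicas"})

    def render(self, **overrides) -> dict:
        """Apply team-supplied overrides, but only for parameters the platform
        team has chosen to expose; the baseline standards stay in place."""
        invalid = set(overrides) - self.allowed_overrides
        if invalid:
            raise ValueError(f"Not configurable in this template: {sorted(invalid)}")
        config = {**self.__dict__, **overrides}
        config.pop("allowed_overrides")
        return config


# Example: a team tunes what it is allowed to tune, and nothing more.
web_service = ServiceTemplate(name="web-service", runtime="python3.12")
print(web_service.render(min_replicas=3))
```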
There are tools that organisations are using to help them build IDPs, for example the open-source project https://backstage.io. Some organisations are using commercial products, or rolling their own wrappers around existing tooling such as AWS Service Catalog, their favourite CI/CD tool and Kubernetes.
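For the ‘roll your own wrapper’ route, a thin layer over AWS Service Catalog via boto3 might look something like the sketch below. The product ID, artifact ID and parameter names are placeholders for whatever your platform team actually publishes in its catalogue.

```python
import uuid

import boto3  # assumes AWS credentials and region are already configured

client = boto3.client("servicecatalog")


def launch_from_catalogue(product_id: str, artifact_id: str, name: str, params: dict) -> str:
    """Provision a pre-approved Service Catalog product on a team's behalf."""
    response = client.provision_product(
        ProductId=product_id,
        ProvisioningArtifactId=artifact_id,  # the version of the template to launch
        ProvisionedProductName=name,
        ProvisioningParameters=[{"Key": k, "Value": v} for k, v in params.items()],
        ProvisionToken=str(uuid.uuid4()),    # idempotency token
    )
    return response["RecordDetail"]["RecordId"]


# Example call with placeholder identifiers:
# launch_from_catalogue("prod-xxxx", "pa-xxxx", "payments-dev", {"InstanceType": "t3.small"})
```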
PlatformOps isn’t just a new name for an existing process or team. Don’t fall into the same trap some companies did when DevOps arrived: platforms need to be treated like real products.
Is rolling out an automation tool, such as Jenkins, creating a platform? No!
Is giving teams free rein over AWS/Azure/GCP the same as a platform? No!
Is Kubernetes a platform? No! It is just one potential component.
By allowing DevOps teams to self-serve, you remove bottlenecks and achieve scale with speed and consistency; you have governance and compliance built into your standard patterns; and you keep visibility and control over costs, not just through providing efficient templates and consistent observability, but also through standardising the resources used and taking advantage of lower unit costs for licences.
But the biggest win is in making your internal DevOps teams more productive by letting them concentrate on what they’re good at and letting them consume a well-designed and well-maintained platform on which to build their products.