The term "release management" might bring to mind images of meetings that run from 4 p.m. to 8 a.m. and involve a build, sign-offs, and lots of coffee, soda, and vending machine snacks.
Or you might think of a programmer clicking a deploy link.
Not all companies can trigger a deployment whenever they're ready (relying on the version control safety net, of course), and not all should, but most can benefit from moving away from the first example and toward the last.
I’ve written before about how many organizations can reduce or eliminate regression testing, making rollouts more predictable and less painful.
While that might sound great in theory, getting there is a different story. One way to achieve what may sound like deployment nirvana is the Toyota Kata, an improvement method that Mike Rother learned from Toyota.
What is the Toyota Kata method?
Take a meeting with six people discussing the release process, and you'll likely leave with 12–20 ideas for improvement. Calculating the return on investment for those improvements won't be easy. Even if you can rank the improvements in some way, implementing the five or six best ideas will not have a systematic effect.
Before you can talk about improvement, you need to define a goal.
The Toyota Kata refers to this as establishing “strategic direction.” Stephen Bungay calls it strategic intent, and Hoshin Kanri calls it strategy deployment.
Once you have a strategic direction, you can find out where the gaps are, then establish an immediate step, called a target condition, and move toward that goal.
Step 1 – desired state
Having test tooling, a rotating release manager, an incident manager, or ITIL certification are certainly goals, but exactly what they mean for software development is not defined. Test tooling and automation could be a desired state if the goal included releasing more often, decreasing the time for regression test/fix/test to under one day, and so on. Here are a few specific examples:
- Move from last commit of software to deployed within one hour.
- Be able to fix issues in production by rollback within one hour of notification.
- Develop software in pieces small enough to build and deploy in the same day.
- Reduce the risk to a production deploy so much that any team member can make the deploy without a release manager.
- Shrink the production support team by half.
- Bring production support into the shipping team.
Depending on where you work, some of these might be the existing state, or they might sound unbelievable. Some also read like arbitrary measures, and they are. To make these strategic, each goal needs a "so that..." attached: what the final benefit will be to the organization. For example, shrinking production support allows those team members to switch to new feature development; pushing to production in an hour changes the economics of deploys so that deploying more often makes sense; and so on. Those benefits will be grounded in the organization, and can't be predicted here.
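To make one of the example goals concrete, here is a minimal Python sketch that checks "last commit to deployed within one hour" against recorded timestamps. The timestamps, the `RELEASES` list, and the function names are all hypothetical; in practice the data would come from your version control and deployment logs.

```python
from datetime import datetime, timedelta

# Hypothetical (commit_time, deploy_time) pairs, invented for illustration.
RELEASES = [
    (datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 9, 45)),
    (datetime(2024, 1, 9, 14, 0), datetime(2024, 1, 9, 17, 30)),
    (datetime(2024, 1, 10, 11, 0), datetime(2024, 1, 12, 22, 0)),
]

GOAL = timedelta(hours=1)  # "last commit to deployed within one hour"


def lead_times(releases):
    """Return the commit-to-deploy duration for each release."""
    return [deployed - committed for committed, deployed in releases]


def share_within_goal(releases, goal=GOAL):
    """Return the fraction of releases whose lead time met the goal."""
    times = lead_times(releases)
    if not times:
        return 0.0
    return sum(t <= goal for t in times) / len(times)


if __name__ == "__main__":
    for lead_time in lead_times(RELEASES):
        print(lead_time, "ok" if lead_time <= GOAL else "over goal")
    print(f"{share_within_goal(RELEASES):.0%} of releases met the goal")
```

A number like this only shows where you stand today; it is the "so that..." attached to the goal that makes it strategic.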
The idea here is not to pick an achievable goal, or even a stretch goal, but instead a goal that seems unachievable. Instead of an end-state, the goal defines a journey - a direction for long-term improvement.
Let's take one more look at that goal.
For example, one of the companies we work with wanted to "automate testing" by having a computer run predefined scripts through the user interface in order to deploy more often. They had no interest in automating the deploys, which consisted of turning the servers off, pushing a new build and turning the servers back on. Rolling out the deploys took over an hour, which meant the software could only be deployed late nights during a weekend.
If we automated the checking, but not the deploys, the process would take close to a year and the fastest deploys possible would be once a week.
Instead of starting with a target (like "automate testing"), we start with an objective, then analyze the entire test/fix/deploy process to find where the impediments are.
Step 2 – gap analysis and identify the next target condition
Once we know the desired state, we study the current system to identify gaps that matter. In this example, we are talking about release management, so everything from the last commit of code to version control ("code complete") to running on the production servers. Some organizations consider immediate implementation support ("stabilizing") part of release management. Others run an entire mini-project around the end game of a deploy, including days to months of regression testing and a large series of bug fixes that need to be triaged, done or marked WILL_NOT_FIX, then re-tested. Another common mini-project is the deploy itself, which can include days of coordination and a final, multi-hour rollout process.
One way to conduct a gap analysis is to look at the entire process (Build, Test, Fix, Deploy-Coordinate, Deploy/Do), looking at how long each step takes if done ideally and the various ways that step breaks down. Eventually, you'll find a bottleneck: the step that’s holding back improvement the most. Sometimes the thing to fix first isn't the bottleneck, but whatever is easiest to improve right now.
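The step-by-step comparison might be sketched like this in Python. The durations are made-up numbers, and treating "largest gap between ideal and observed time" as the bottleneck is a simplifying assumption of this sketch, not a rule from the Toyota Kata.

```python
# Hypothetical durations in minutes for each step named in the text:
# (ideal time if everything goes right, time actually observed).
STEPS = {
    "Build": (10, 60),
    "Test": (240, 4800),
    "Fix": (120, 2400),
    "Deploy-Coordinate": (30, 960),
    "Deploy/Do": (15, 90),
}


def gaps(steps):
    """Return {step: observed - ideal} in minutes."""
    return {name: observed - ideal for name, (ideal, observed) in steps.items()}


def bottleneck(steps):
    """Return the step with the largest gap between observed and ideal time."""
    step_gaps = gaps(steps)
    return max(step_gaps, key=step_gaps.get)


if __name__ == "__main__":
    # List steps from largest gap to smallest.
    for name, gap in sorted(gaps(STEPS).items(), key=lambda item: -item[1]):
        print(f"{name:18} gap: {gap / 60:.1f} hours")
    print("Largest gap:", bottleneck(STEPS))
```

The table does not capture the ways each step breaks down, so it is a starting point for discussion rather than the analysis itself.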
Once what to improve starts to come into view, the next question is how to improve it – what the next small step should look like.
The Toyota Kata term for this is the target condition. That's different from a numeric quota, like "12 deploys per year." Numeric quotas are dangerous because they are so easy to game. As Goodhart's law, named for economist Charles Goodhart, is commonly stated: "when a measure becomes a target, it ceases to be a good measure."
A target condition doesn't just give a number; it also describes the operating attributes: how the game will be played to obtain the desired results. "Shrink the build to 10 minutes" is a quota; "adopt a Continuous Integration (CI) system or build server like Jenkins to automate builds after every commit" looks more like a target condition. Target conditions also include a delivery date, which can be established in concert with the team.
Then management stops. Or, at least, the directions stop. This is very different from traditional management, which might break down the process of adopting Jenkins into a plan, including requirements, design, various tasks for various roles and so on. Instead, management admits that the path to the next target condition is unclear; the team starts working toward the goal on a daily basis as a part of their work.
Step 3 – coaching and improvement
At this point a classically managed project could go two ways. The company might assign a project manager, who assembles a team to work on improving releases part-time. If the company is large enough, the team might be full-time, and have to try to improve releases as they are happening; perhaps the company creates a center of excellence. If the project is smaller, perhaps not funded, a director might simply ask his direct reports what they are doing at weekly meetings to accomplish this objective, in addition to all the other work they have to do.
Those one-on-ones, status meetings, and planning sessions can all be replaced with the coaching kata. The coaching kata asks five questions that walk the student through the goal, obstacles, next steps, and where to go and see if those next steps will work. On a full-time project, a manager might run through these questions two or three times a day. If the manager is serious about the improvement, even a part-time project would include several sessions a week.
Coaching does several things. First it demonstrates that the student understands the problem, both the strategic direction and the current target. Then it forces the student to list the obstacles, what they have tried, how that turned out and what they will try next – and they need to explain it to another person, couched as small experiments.
The opposite of explicit coaching is internal R&D, where the manager asks "how is it going?" and the technical staff member says "oh, it's going." That non-answer probably means that the programmer is stuck, getting compile errors, run-time errors, having library problems or otherwise fighting through the morass that is trying to implement a new software package to do something new.
The experience of implementing a build tool – Jenkins, CircleCI or TFS – can be a bit like that. The coaching kata can help team members think more scientifically about their work, identify earlier when they are stuck, and make it easier to ask for help or decide to try a different approach.
Step 4 – establish and iterate toward the next target condition
Once the team accomplishes the target condition, it is time to set up the next one, then repeat. This improvement is tough, because the team is delivering software the whole time. If the company is using something like Scrum, a retrospective might be a good time to reflect on whether the target condition is satisfied, and what the next one should be.
Eventually the release process will get tight enough that it is time to improve something else. Wouldn't that be nice?
Getting serious – what you'll do tomorrow
While there are no standards for release management, there are some things that are common on many teams. For example, time spent on releasing the software is time that could have been spent developing it. Also, if the release process is expensive, it will make economic sense to perform it less often, which creates a delay in getting new features out to customers and allows the amount of change to increase. This makes every release riskier, which calls for more safeguards, which creates a longer deploy process, so it makes economic sense to deploy less often. The pattern is predictable and negative.
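That loop can be sketched as a toy Python model. Every constant below is an assumption invented for illustration, as are the function names and the simple linear relationships; the point is only to show the direction each round of the loop pushes.

```python
# Toy model of the loop described above. Every number here is a made-up
# assumption chosen for illustration, not a measurement.
RELEASE_BUDGET_HOURS_PER_WEEK = 20   # time the team will spend on releasing
CHANGES_PER_WEEK = 30                # changes that pile up between releases
BASE_RELEASE_HOURS = 10              # cost of a release with no safeguards
SAFEGUARD_HOURS_PER_CHANGE = 0.5     # extra checking added per bundled change


def next_release_cost(release_hours):
    """Run one trip around the loop; return (weeks_between, batch, new_cost)."""
    # An expensive release "makes economic sense" less often.
    weeks_between = release_hours / RELEASE_BUDGET_HOURS_PER_WEEK
    # A longer gap lets more change accumulate in each release.
    batch = CHANGES_PER_WEEK * weeks_between
    # A bigger, riskier batch calls for more safeguards, raising the cost.
    new_cost = BASE_RELEASE_HOURS + SAFEGUARD_HOURS_PER_CHANGE * batch
    return weeks_between, batch, new_cost


def simulate(rounds, release_hours=BASE_RELEASE_HOURS):
    """Repeat the loop and record each round's gap, batch size, and cost."""
    history = []
    for _ in range(rounds):
        weeks_between, batch, release_hours = next_release_cost(release_hours)
        history.append((weeks_between, batch, release_hours))
    return history


if __name__ == "__main__":
    for weeks_between, batch, cost in simulate(6):
        print(f"every {weeks_between:.2f} weeks, "
              f"{batch:.1f} changes per release, {cost:.1f} hours per release")
```

Under these assumed numbers, each round makes releases rarer, larger, and more expensive. Lowering a constant such as `SAFEGUARD_HOURS_PER_CHANGE`, for instance by automating checks, is one way to represent changing the system rather than reacting to it.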
Instead of doing what makes economic sense in the current conditions, ask how to change the system, and then move toward that change.
This story, "Test, deploy, release ... repeat" was originally published by CIO.