Background
We recently did some work on a number of services where all we were doing was removing old functionality that is no longer used (obsolete code): messages, handlers, classes, tests. I like deleting code. It makes things simpler, with less logic to break and fewer places for bugs to hide. Once we finished, we wanted to get all of this released to prod ASAP. We tested all of the affected areas in a large system-level test in the preprod environment, with all of the changed services (and others that had not changed) working together.
But we had push back.
Push back
The question was: Why do you need to release this clean-up in advance of any further work?
By doing so you are making this an active rather than passive deployment with associated extra risk and double the cost.
If you are removing unused code you can just deploy with the next addition/change to code, because by testing that you are implicitly testing the absence of the removed code. Even if the new code isn’t affected, our deployment checklists cover that situation too: we have already double checked that this removal/clean-up of code won’t have an impact on production.
Rebuttal
I broke this down into a number of sections, one for each question that I thought was being asked in the statement.
Question: Why do many releases instead of one, isn’t that more risky?
This is a question from the old school of thought: releases are big bad things, so we should do as few of them as possible.
Answer: I would say many small releases are inherently less risky.
If something goes wrong (very unlikely, but possible) in a combined release, it will be less clear what caused the issue. Was the new functionality the culprit, or the clean-up work?
If we release now, we know what to monitor over the coming days, and have less to monitor.
If we don’t deploy all 8 things, someone else will (at some point, and in some cases many months in the future). This poor soul will need to decide whether the changes we made need testing, work out what the consequences of the changes are, and worry about whether there are other dependent services to deploy.
Each service is push button, so deploy time is small.
Rollback is not hard if we need it, and at the moment it has no business consequences. If in future we release alongside other functionality and our clean-up breaks something, we will need to roll back the new functionality too.
Question: If we are removing code why the need to release anything? There is no new functionality to release.
I guess this is a question about business value: no new business value, no need to release.
Answer: That is true, it's mostly removing old code and cleaning up. But it's just as critical, if not more so, to get this out in a small release sooner, as we may have removed something we should not have (again unlikely, but possible; we are human).
Question: Won't there be more to test if we do it twice?
This assumes that we manually test everything on every release, whereas we actually only test manually what has changed, and rely on automated testing for the rest.
Answer: Last week we did thorough end-to-end testing of the 8 things affected. If we wait until next week or the week after, the versions of the things to be released will all be different by then, so we will need to test the 8 deployable things again, full stack. That is an extra day of work.
Other reasons for deployment.
The changes, what we did and why we did it, are still fresh in our minds. The longer we leave it, the less sure we will be that we are doing the right things.
It's not critical, but I'd prefer not to do a partial deployment (service-X is going to get released soon). I'd like the rest of the clean-up to be deployed too.
Ideally prod and preprod are as similar as possible (for environmental consistency and testing reasons). Any difference between preprod, our test system, and prod invalidates other testing efforts, because services in prod will be integrating with different versions of other services than those in preprod, making like-for-like testing impossible.
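To make the point about environment consistency concrete, here is a minimal sketch of how the differences between preprod and prod could be listed. It assumes each environment can give you a simple mapping of service name to deployed version; the function name, the service names other than service-x, and all version numbers are hypothetical and only there for illustration.

```python
from typing import Dict, Optional, Tuple

Drift = Dict[str, Tuple[Optional[str], Optional[str]]]


def version_drift(preprod: Dict[str, str], prod: Dict[str, str]) -> Drift:
    """Return every service whose deployed version differs between environments.

    Each entry maps service name -> (preprod version, prod version).
    A version of None means the service is missing from that environment.
    """
    drift: Drift = {}
    for service in sorted(set(preprod) | set(prod)):
        pre, live = preprod.get(service), prod.get(service)
        if pre != live:
            drift[service] = (pre, live)
    return drift


# Hypothetical manifests: the clean-up has reached preprod but not prod.
preprod_versions = {"service-x": "2.4.0", "service-y": "1.9.1", "service-z": "3.0.2"}
prod_versions = {"service-x": "2.3.0", "service-y": "1.9.1", "service-z": "3.0.1"}

for service, (pre, live) in version_drift(preprod_versions, prod_versions).items():
    print(f"{service}: preprod={pre} prod={live}")
```

Every line this prints is a pairing of versions that was tested in preprod but does not exist in prod. The longer the clean-up waits, the longer that list tends to get.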
Conclusion
I maintain there is business benefit in doing the deployment now, and in deploying all 8 services at that. To be fair, businesses, managers, stakeholders and even developers (especially senior ones) have all seen their fair share of long deployments, failures and difficult rollbacks, which ultimately leads to a fear of deployment. So it's natural to want to avoid the perceived risks. But perversely, by restricting the number of deployments you are actually increasing the likelihood of future failure.
A core philosophy of the devops culture is to release early and often (continuous delivery). By doing the things you find painful more often, you master them and make them trivial, thereby improving your mean time to recovery.
The business benefit is ultimately one of developer productivity, testability and system uptime.
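For anyone who wants to track the mean time to recovery mentioned above, it is simply the average time between an incident starting and service being restored. A minimal sketch, assuming you record a start and a recovery timestamp per incident (the function name and the sample incidents are made up for illustration):

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def mean_time_to_recovery(incidents: List[Tuple[datetime, datetime]]) -> timedelta:
    """Average of (recovered - started) over all recorded incidents."""
    if not incidents:
        raise ValueError("need at least one incident to compute a mean")
    total = sum((recovered - started for started, recovered in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incidents: one took 30 minutes to recover from, one took 90.
sample_incidents = [
    (datetime(2024, 1, 10, 9, 0), datetime(2024, 1, 10, 9, 30)),
    (datetime(2024, 2, 3, 14, 0), datetime(2024, 2, 3, 15, 30)),
]

print(mean_time_to_recovery(sample_incidents))  # prints 1:00:00
```

If small, frequent releases are doing their job, this number should shrink over time, because each rollback involves fewer changes and has been practised more often.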