Everything was prepared. The features were in place, the critical bugs were fixed, the environments were up and running, and the tests were green. We were finally about to go into production with the system we had been working so hard on for the past year.
We had deployed to Azure with redundancy, failovers and slots, using continuous delivery. The customers had been doing acceptance testing for weeks. We were ready!
Then it happened.
Three days before going into production, the pipeline started halting. The applications crashed one by one. First the development environment, followed by the test environment. Then the staging environment and, last, the production environment.
We had three days to pin down and fix the problem.
This is a war story about how we turned our Azure environments upside down, logged, debugged, measured and, at last, found the nasty bug and fixed it.
Jon is a passionate developer, team leader and systems architect. He is VP of Engineering at Chinsay, where he helps his team of skilled engineers build innovative and reliable systems to revolutionize the concept of contract management.