Nowadays, most software is complex, both in its intent and in its existence. How many times have we, in our own jobs, written "quick fixes" in the interest of time and business? We would have argued that once the patch was ready, we could devote quality time to understanding the issue and fixing it by design. However, it is usually deja vu when, two years down the line, we still hear from the operations team that Jimmy's process isn't running properly. Jimmy's process? Well, that is how the nomenclature goes for the little quirks and fixes we create: the creator's name, then the fixed component :). The fix would have been intended for a subset of the business processes at that time, and could eventually have been expanded to impact many more business processes. The point being, the quick fix we apply to keep our business process running could come down hard on us later.
Let's try an impartial analysis of why a Jimmy's process (please bear with me for picking on Jimmy :)) would come into existence in the first place:
1. There was an issue in the processing of some software component.
2. If the issue was not fixed, the business would end up spending a lot of time trying to fix it manually, or it meant a revenue loss, or it just meant loss of face for the IT team.
3. The business and/or the project manager wanted a quick way to fix it so that the issue was resolved for now, as analysis of the root cause could mean more time spent while the issue remained unfixed in production.
4. One of the most important factors: QA never tested the scenario because production-like infrastructure or data volumes were unavailable.
5. It could have been because we had churn in the dev team and the newcomer decided it was easier to write a quick fix instead of spending quality time (which is always in short supply!) to get to the right fix.
6. Well, if it isn't any of the above, it was because the software development team implemented a solution without involving the operations team in any of the meetings, and they could probably have pointed out the gap.
It goes without saying that it is usually the operations team who face the heat on a day-to-day basis, governing the use of the software processes, and in that role they end up being the most constructive critics of the system. It is also a fact that they are usually the first line of defense for all exception scenarios, while the dev team is probably the last. It seems right to say they have probably seen an issue like this one before, and so they may be able to suggest scenarios where a solution could fail.
I was reading about the adoption of DevOps principles, and the idea appealed to me for companies where processes are usually written and rewritten on the fly. I do accept there are critics of the argument, but hell, tell me one thing that doesn't have any? So if you think you are in a position to address issues like the ones written above, please do read up on http://en.wikipedia.org/wiki/DevOps
I was thinking I would summarize it myself and had written up my piece, but somewhere in that process I found a blog post by Damon Edwards which illustrates it pretty succinctly. So I decided to link to his blog rather than reinvent the wheel: http://dev2ops.org/blog/2010/2/22/what-is-devops.html