Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

While articles like this are very interesting for explaining the technical side of things, I am always left wondering about the organizational/managerial side of things. Had anyone at Knight Capital Group argued for the need of an automated and verifiable deployment process? If so, why were their concerns ignored? Was it seen as a worthless expenditure of resources? Given how common automated deployment is, I think it would be unlikely that none of the engineers involved ever recommended moving to a more automated system.

I encountered something like this about a year ago at work. We were deploying an extremely large new system to replace a legacy one. The portion of the system which I work on required a great deal of DBA involvement for deployment. We, of course, practiced the deployment. We ran it more than 20 times against multiple different non-production environments. Not once in any of those attempts was the DBA portion of the deployment completed without error. There were around 130 steps involved and some of them would always get skipped. We also had the issue that the production environment contained some significant differences from the non-production environments (over the past decade we had, for example, delivered software fixes/enhancements which required database columns to be dropped... this was done on the non-production systems, but was not done on the production environment because dropping the columns would take a great deal of time). Myself and others tried to raise concerns about this, but in the end we were left to simply expect to do cleanup after problems were encountered. Luckily we were able to do the cleanup and the errors (of which there were a few) were able to be fixed in a timely manner. We also benefitted from other portions of the system having more severe issues, giving us some cover while we fixed up the new system. The result, however, could have been very bad. And since it wasn't, management is growing increasingly enamored with the idea of by-the-seat-of-your-pants development, hotfixes, etc. When it eventually bites us as I expect it will, I fear that no one will realize it was these practices that put us in danger.



You should read Charles Perrow's "Normal Accidents" and all will be revealed. This is hardly a new problem.

http://www.amazon.com/Normal-Accidents-Living-High-Risk-Tech...




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: