Monday 5 July 2010

Continuous Build and Test - Achieving automated deployment

A few days ago, I posted about how I developed a CI system for an ambitious financial software developer. We had gained real experience in creating a CI environment, but by this point we had outgrown our build/integration software. CruiseControl.NET had performed admirably and offered us a level of control we had never seen before. Through it, we were able to ensure that every build took place the same way, based on the requirements of the team working on the code. No one was manually running builds and forgetting steps halfway through the process. Development teams could see the status of the build they were all working on and received reports and notifications about their code. Management could now look at high-level reports to see how far along any given project was. Because unit testing ran during the build process, the testing carried out by analysts uncovered fewer and fewer issues and bugs. When it came to deployment, we were also able to run a suite of UI tests post-deployment, all based on the spec provided by the development team that owned the code.

However, there was still a significant amount of work that had to be done by hand by the test and deployment team at this point. We had a completely hands-off build and test system: we could build code and test its logic in one process, and we could also test post-build using UI test tools such as WatiN and Selenium. By this point we were also on the cusp of bringing Quick Test Pro test scripts into the process. There was really only one manual piece of work left to complete in order to have a completely hands-off build/test/deployment process. A rough idea of the kind of UI check we ran post-build is sketched below.
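
For illustration, here is roughly what one of those post-build UI checks could look like as a WatiN test driven by NUnit. This is a minimal sketch, not our actual tests: the URL, field names, button name and expected text are placeholders, and it assumes the test runner is configured to use an STA thread, which WatiN needs in order to drive Internet Explorer.

    using NUnit.Framework;
    using WatiN.Core;

    [TestFixture]
    public class LoginSmokeTests
    {
        [Test]
        public void Login_page_accepts_a_known_user()
        {
            // WatiN drives a real Internet Explorer instance against the freshly built site.
            using (IE browser = new IE("http://buildbox/ourapp/login.aspx"))
            {
                browser.TextField(Find.ByName("username")).TypeText("testuser");
                browser.TextField(Find.ByName("password")).TypeText("secret");
                browser.Button(Find.ByName("btnLogin")).Click();

                Assert.IsTrue(browser.ContainsText("Welcome"),
                    "Expected the welcome banner after a successful login.");
            }
        }
    }

A handful of checks like this, run against the build box straight after a successful compile and unit test pass, was enough to catch most of the obvious regressions before anything went near an environment.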

Now, we had a great many NAnt scripts that could do pretty much anything we wanted. We were building and running multiple tests; the only thing we could not do at the time was automatically release on a successful build. We didn't want this to happen in production, but it would be acceptable for any other environment (including DR). We had a shiny new SCM in place that was fully supported by the build scripts we had created. Right about then, we were also churning out about the same amount of code as the main development teams. We worked in C# as well as some other languages like Python and Ruby, whereas the development teams worked exclusively in VB.Net. The last hurdle for us to cross would be automated deployment after a good build.

It wasn't just the code that the backroom teams were developing that was growing. The company had expanded its customer base and we were now providing solutions to other global players in the finance sector. The test and deployment teams had already demonstrated that we could work with the development teams to create a build process exclusively for them. In reality, we just had a bunch of NAnt scripts and tasks that we used dynamically. The only thing we needed to do next was to appraise our CI software.

CruiseControl.NET is a fantastic tool, but if you work in a company that is providing different solutions to different clients it can be a bit tricky. As the business at large was my team's customer, we quickly found that putting together a CI system tailored to each team's spec would mean implementing a CruiseControl.NET server per development team. We also found that configuring automated deployments with CruiseControl.NET would be a bit of a nightmare. Even though CruiseControl.NET could make use of .NET Remoting, the head of infrastructure pointed out (and rightly so) that achieving an automated deployment that way would be a security disaster. At the time, CruiseControl.NET was a complete CI tool in a box, so implementing automated deployment with it would have meant deploying a CruiseControl.NET server to each environment and its selection of application servers. The obvious problem is that if we had done that, any malicious user could trigger a build out in the environment itself and possibly substitute their code for ours. We looked for a way around this, but we could not come up with a solution that infrastructure could support due to security issues - we argued that we could get away with NAnt being out in the environment to handle deployment. Again, and rightly so, infrastructure pointed out that the same problem regarding malicious builds would still exist - basically, we were not to have any build tools out in the environments at all. At this juncture in our brainstorming, I pointed out that MSBuild was already out there, is comparable to NAnt, and really couldn't be removed as it was needed by .NET! Infrastructure was thankful for that update, but still forbade us from using it for deployments.

Right about now we had two problems on our hands. First, we needed a new CI build tool that could cater for the direction the company was moving in; second, we needed to come up with an automated deployment tool.

We dealt with the CI element first.

We took a look around to see what was on offer. Team Foundation Server could do what we wanted build-wise, but was simply not up to the task as far as our development was concerned. CruiseControl.NET was getting left behind with regard to features - the way in which it worked was also starting to become obsolete for us. We looked at some of the other CI tools available; both the head of development and I thought we should go for a best-of-breed solution. This had already been highlighted in our existing setup - none of the tools or frameworks we were using was entirely dependent on another. From the outset, I wanted each element of the overall system to be as agnostic as possible, so that when the time came to change something there would be little impact on our ability to carry out testing and deployments. We looked at other tools out there, things like AnthillPro, and we ended up going for a tool called TeamCity. It pretty much ticked all the boxes we needed: it was free (up to a certain point) and worked on a server/agent architecture - and it was this that decided it for us.

On adopting TeamCity as our main CI tool, we asked for one extra server in the build environment (infrastructure did ask why we needed a fifth machine when we were only using one of our existing four...). The new addition was pretty important. CruiseControl.NET is a complete build server in a box, which means you need to have it installed on each machine you want to build from (or at least that is how it worked when we were using it). TeamCity is a little more subtle. It makes use of what are termed 'build agents'. With a central server constantly listening for build agents, you can add as many build machines as you want to cater for each of your projects. For instance, I could have (and have) run a Linux box to build Mono assemblies. That box would be configured for this type of build only. My TeamCity server runs on a Windows 2008 machine on the network and has been configured to only allow Mono builds to take place on the box running Linux. With CruiseControl.NET, you would have had to configure an instance of CruiseControl.NET on each build machine you wanted to build apps from. The TeamCity agent/server design is much, much more favourable. It means you can have a relatively cheap box sitting somewhere that is configured to do only one type of build - or in fact configured to represent a machine out in an environment. You begin to ensure that the code you are building and testing conforms to the spec laid out for that piece of work, with one server controlling it all and providing valuable MI. When we got to see this in action, we were very pleased. Now we could build for any spec and deliver information on these builds to whoever needed feedback.

This again made people happy. It made me happy because it meant that all my guys could simply get on with what they needed to do. The guys writing and maintaining the code that made this possible were no longer bogged down manually checking that everything had not only built correctly but also matched the spec for that piece of work. It enabled us to simply agree a build spec with a development team and implement it. In essence, this was a win all round.

As TeamCity uses the server/agent architecture, infrastructure was not averse to us placing a build agent out in the environments to start working on a delivery system if we needed to. There was a lot of work that needed to take place before that, though; the workstream that would enable a build to be deployed to an environment needed to be thrashed out first. For instance, one of our requirements was that we wouldn't release a build to an environment until it had:
  • Been built successfully.
  • Passed its unit tests.
  • Passed its UI tests on the build box.
These were pretty simple requirements for us. The majority of the work left was to develop the NAnt tasks needed to carry out environmental testing and to create a deployment platform; a rough sketch of one such task follows.
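
To give a flavour of what an environmental testing task could look like, here is a minimal, illustrative custom NAnt task written in C#. The task name, the attribute and the idea of a simple HTTP health check are assumptions for the sake of the example; the real tasks were more involved.

    using System;
    using System.Net;
    using NAnt.Core;
    using NAnt.Core.Attributes;

    [TaskName("envcheck")]
    public class EnvironmentCheckTask : Task
    {
        private string _url;

        [TaskAttribute("url", Required = true)]
        public string Url
        {
            get { return _url; }
            set { _url = value; }
        }

        protected override void ExecuteTask()
        {
            Log(Level.Info, "Checking environment endpoint {0}...", _url);
            try
            {
                HttpWebRequest request = (HttpWebRequest)WebRequest.Create(_url);
                using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
                {
                    if (response.StatusCode != HttpStatusCode.OK)
                    {
                        throw new BuildException(string.Format(
                            "Environment check failed: {0} returned {1}.", _url, response.StatusCode));
                    }
                }
                Log(Level.Info, "Environment looks healthy.");
            }
            catch (WebException ex)
            {
                // Failing the task fails the build, which stops the deployment chain.
                throw new BuildException("Environment check failed: " + ex.Message, ex);
            }
        }
    }

Once compiled, a task along these lines can be pulled into a build file with NAnt's <loadtasks> and called as something like <envcheck url="..."/>, failing the build (and therefore anything downstream of it) if the environment doesn't respond.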

There was a pseudo deployment platform in place already; it worked as part of the existing application and was primarily used to handle some functions out in an environment - things like pushing assemblies out to application servers and controlling web servers etc. I took the decision to move away from this system - I didn't want the build/test/deployment process to be dependent on the actual product we were making.

So, we came up with a prototype platform that implemented .NET Remoting and MSBuild. There were a few other staging areas as well, like TFTP servers and small services to control things like web-server availability and application servers. As the automated deployment project went on, we encountered new tools like PowerShell. To me, this was a very useful tool: it meant that I could create C# code that could be accessed directly from within a PowerShell script (we already knew we could script .NET code in NAnt, but we didn't like doing this). So MSBuild gave way to PowerShell; we liked it and so did infrastructure. With these tools in place, we needed to define the automated release process. It went a little like this:
  • A build machine is selected from the server farm.
  • The build machine clears a workspace.
  • The latest code is pulled down to the build machine.
  • The latest code is built.
  • If the build was successful, unit testing takes place.
  • On successful unit testing, the assemblies are packaged for deployment.
  • The selected environment begins a countdown to unavailability.
  • Once the build is packaged and the environment is down for deployment, the existing code is uninstalled on the target machines.
  • The latest build is deployed and activated on the target environment.
  • Tests are then carried out in the environment, mainly UI testing.
  • On successful test completion, all relevant schema and SQL is deployed to that environment's DB servers.
  • If the new code fails at any point, everything is rolled back and the DB is not touched.
And that was pretty much it. All we needed to do was come up with some simple .NET classes to handle some of the deployment steps via .NET Remoting; our PowerShell scripts would then handle the process out in the environment. If anything failed, everything was rolled back before anything was released to the DB and the team responsible for the build was notified. A bare-bones sketch of what those remoting classes might look like follows.
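
This sketch is purely illustrative: the class names, the port and the operations are placeholders rather than our actual implementation. It uses the standard .NET Remoting hosting pattern of a TCP channel and a well-known singleton object running on the target application server.

    using System;
    using System.Runtime.Remoting;
    using System.Runtime.Remoting.Channels;
    using System.Runtime.Remoting.Channels.Tcp;

    // Runs on the target application server; exposes a handful of deployment operations.
    public class DeploymentAgent : MarshalByRefObject
    {
        public void UninstallExistingCode(string applicationName)
        {
            Console.WriteLine("Uninstalling " + applicationName);
            // Stop services/app pools and remove the old assemblies here.
        }

        public void DeployPackage(string packagePath)
        {
            Console.WriteLine("Deploying package " + packagePath);
            // Unpack the build output and activate it here.
        }
    }

    public static class AgentHost
    {
        public static void Main()
        {
            // Register a TCP channel and publish the agent as a well-known singleton.
            ChannelServices.RegisterChannel(new TcpChannel(8085), false);
            RemotingConfiguration.RegisterWellKnownServiceType(
                typeof(DeploymentAgent), "DeploymentAgent", WellKnownObjectMode.Singleton);

            Console.WriteLine("Deployment agent listening on tcp://localhost:8085/DeploymentAgent");
            Console.ReadLine();
        }
    }

The controlling side can then obtain a proxy with Activator.GetObject(typeof(DeploymentAgent), "tcp://appserver01:8085/DeploymentAgent") and drive the uninstall/deploy steps remotely; the same object is just as callable from a PowerShell script once the assembly is loaded, which is what made the PowerShell-plus-C# combination so convenient for us.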

We could have run a test prior to commit on each developer's machine. However, the head of development and I thought this would be a little tricky to carry out, as there were so many developers working on so many work-streams in so many environments.

When it came to deploying to production, the same scripts and classes were used, but with someone from the release team guiding it all: the same process, just with the steps followed manually.

Just prior to my leaving that company, we had created a complete build-to-release system in just under a year. The system I designed and created with the other guys in the release team (and ultimately the test team, once I was promoted) was a world away from the VB scripts we used when I first joined. Now, each of the company's customers has its very own build and test specification as well as an automated deployment specification.

None of this would have been possible without the tools at our disposal. There are earlier posts on my blog on how NAnt can be used to create new tasks. Systems like TeamCity allow you to concisely manage all aspects of a development work-stream from end to end. The only thing I never got around to doing for that company was implementing FitNesse.

I continue to praise CI and Agile to my colleagues, and whenever I start a new contract I ask what build process is in place. Directly after working for that company, I joined a small team within a huge organisation that needed to create a mini version of my last CI project. I now get asked by developers and employers to explain just how easy it is to implement for their own projects. It still astonishes me sometimes to see very complex pieces of software that don't get even the simplest unit test - and that still have to go through a manual build phase against a set specification.

Saturday 3 July 2010

Hyper Geo 64

As a lot of people who know me will admit, I am a fan of video games. Pretty much wherever there is a TV in my house, there will be a video game console attached to it.

If you are a guy round about the same age as me, you may well remember some of the great video game consoles of the 1990s. One that may stick out a bit is the Neo Geo made by SNK. I remember this console having a near-mythological impact on video gaming: the games cost almost as much as the console itself, and chances are you never met anyone who owned a Neo Geo.

The console itself was pretty much a stripped-down version of an arcade machine, which is why the games cost so much money. The AES, the home version of the console, provided arcade-perfect graphics at a time when people were arguing about whether or not the SNES was superior to the Mega Drive. I think there were a number of attempts by SNK to create a newer, cheaper console for home use - they came up with the Neo Geo CD as well as two handheld consoles.

I think one of the most interesting consoles they came up with was the Hyper Geo 64. From what I can make out, this was SNK's attempt to flirt with 3D graphics. It never made it to the home market though and failed as an arcade machine with only a few games made for it.

Given my interest in consoles, I have embarked on a search to pick up a Hyper Geo 64 and see if I can get it firing. I love old tech like this, so I am hoping that it won't be too expensive to get one.

Continuous Integration and SCM

Last month, I went on a bit about Continuous Integration and how I have developed my skills in relation to it.

I left off by saying that the needs of the company had expanded somewhat, as had the needs of the individual development teams. My team was slightly different to the main production and new-build teams. The work and code we produced was there to facilitate the development of the applications we offered to our customers, so for us there was very little in the way of UI development. The code my team developed was primarily to support the rest of the development teams, enabling them to get a bird's-eye view of development at any one time and across all projects. We didn't just restrict this to management; it would be a bit daft to do that. Pre-CI, we often heard the complaint "we just do not know what is happening". In the build team we were responsible for maintaining all of the build and test environments: infrastructure took care of the guts of them, and we maintained each environment based on a development team's specification.

So, to start developing version 2.0 of the CI system, we needed to appraise a few new things. First up was our SCM tool. Our current tool was just not geared up for what we wanted it to do. We had been using the same tool for quite some time before I joined the company; it was fine in the pre-.NET days when there was only one customer, but as the company grew we needed to bring in something a little more robust, extensible and modern. So, one of the first things we did was move away from Surround as our SCM. There were a number of options available to us at this point. Team Foundation Server could have catered for us at version 1.0: it had an SCM system, it could carry out builds on more than one server, and so on. The only trouble was that it was v1.0 software - even though it was a production release, it was a pain to get working. Thankfully, the head of development had looked into this issue quite some time ago with a view to migrating from Surround to a newer tool. His initial efforts at the time were met with apathy - there was no need for a new tool as the one they had was fine. So, when the time came, the head of development pointed me in the direction of a package called AccuRev. By this point I had been promoted to head of deployment and testing, and my main aim was a complete CI system: from build, to testing logic, to deployment to an environment, and then further testing (UI stuff, as this was a web app). I took a look at AccuRev and liked it pretty much straight away. It was a very modern tool: it allowed you to work under the Agile umbrella of software development and allowed my team to quickly configure code and builds, while at the same time giving the releasers time to make sure the code we were about to release was the correct code (there was a time prior to AccuRev when code hadn't been checked in or re-based properly).

To fit into our CI system, all we needed to do was create a number of NAnt tasks to work with AccuRev; these were nice little tasks to make sure that the code on the build server was being kept up to date. Integrating AccuRev, CruiseControl.NET and NAnt was pretty simple in the long run, and it let management relax when we were able to demonstrate how a fairly significant migration could take place without a lot of drama. A sketch of what one of those tasks might have looked like is below.
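
For illustration only, here is one plausible shape for such a task: a custom NAnt task that shells out to the AccuRev command line to bring the build server's workspace up to date. The task name, attribute and working-directory handling are assumptions; only the "accurev update" command itself is the real CLI.

    using System.Diagnostics;
    using NAnt.Core;
    using NAnt.Core.Attributes;

    [TaskName("accurev-update")]
    public class AccurevUpdateTask : Task
    {
        private string _workspaceDir;

        [TaskAttribute("workspacedir", Required = true)]
        public string WorkspaceDir
        {
            get { return _workspaceDir; }
            set { _workspaceDir = value; }
        }

        protected override void ExecuteTask()
        {
            Log(Level.Info, "Updating AccuRev workspace in {0}", _workspaceDir);

            ProcessStartInfo startInfo = new ProcessStartInfo("accurev", "update");
            startInfo.WorkingDirectory = _workspaceDir;
            startInfo.UseShellExecute = false;

            using (Process process = Process.Start(startInfo))
            {
                process.WaitForExit();
                if (process.ExitCode != 0)
                {
                    // A non-zero exit code fails the build rather than letting a
                    // stale workspace slip through to the compile step.
                    throw new BuildException("accurev update failed with exit code " + process.ExitCode);
                }
            }
        }
    }

In practice NAnt's built-in <exec> task could do the same job; wrapping it in a dedicated task just kept the build files shorter and the logging consistent across projects.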

With our new SCM in place, and all the developers and other people who needed access to the code trained up on how to use it, we were ready for take-off. It took a weekend to export the code we needed from our old SCM into the new one. It didn't take long to script the whole event, but it did take a while to get the code from A to B. Once it was done, we had a nice new shiny SCM that everyone knew how to work, and we were ready to re-task the deployment/release team onto a new method of CI.

Of course, there were some hiccups along the way. I think the scariest one came just after migration, when one of my colleagues accidentally deleted some very important code a few days ahead of a major release. There was a bit of panic at the beginning, but we headed off a major catastrophe fairly quickly. One of the good things about AccuRev is that it never, ever forgets code - it stores everything ad infinitum - so after a quick look through the manual we were able to bring back all the code we needed with a few commands on the command line.

With this migration complete, we were only a few steps away from providing a major update to our CI system.