Wednesday 15 October 2014

Blue green web deployment with powershell and IIS

I wanted to follow up my earlier post (about our current CD process) with a more technically focused one that describes the nuts and bolts of the actual BlueGreenDeployment.

Technology 

Powershell, powershell and powershell, oh and Windows, IIS and Go (the build server)

Process

As I described in my earlier post, the blue green web deploy consists of these steps:


1. Deploy
1.1 Fetch artifact
1.2 Select the config for this deployment
1.3 Delete the other configs
1.4 Deploy to staging (delete then copy)
1.5 Backup live

2. Switch blue green
2.1 Point live to new code
2.2 Point staging to old code

Blue Green Deployment

Before diving into the details I should first convey what blue green deployments are, and what they are not.
There are a few different ways to implement blue green deployments but they all have the same goals:
1. Allow testing on live without actually being live.
2. Enable deployments to have the smallest possible impact on the live service.
3. Give you an easy roll-back path.

This can be accomplished in many ways. Techniques include DNS switching, directory moving, or virtual path redirecting. 
We have chosen to do IIS physical path redirecting. This lets us use the same technique on all our environments from test to live (same scripts, same code) and doesn't cost as much as DNS switching, which would require multiple servers.

Commands used for this demo are:

PS> .\Create-Websites.ps1 -topLevelDomain co.uk
PS> Deploy-Staging -source c:\tmp -websiteName foobarapi -domainName foobar.co.uk
PS> Backup-Live -WebsiteName foobarapi -DomainName foobar.co.uk
PS> Switch-BlueGreen -WebsiteName foobarapi -DomainName foobar.co.uk

The code I'm going to talk through is all located here: https://github.com/DamianStanger/Powershell

Conventions used:

All websites are named name.domain and name-staging.domain
All backing folders are in c:\virtual and are named name.domain.green and name.domain.blue
You don't know if blue or green is currently serving live traffic.
Backups are taken to c:\virtual-backups\name.domain
Log files always live in c:\logs\name.domain
There is always a version.txt and bluegreen.txt in the root of every website/api

In this example I'm using name=foobarapi and domain=foobar.co.uk

The technical detail

This is the meaty stuff. It consists mainly of powershell and should work no matter what CI software you are using. I can heartily recommend Go by Thoughtworks; it has a built-in artifact repository and brilliant dependency tracking through its value stream map functionality.

Setup IIS and backing folders

To test my deployment scripts you will first need to set up the dummy/test folders and IIS websites. For this you can use this script: Create-Websites.ps1. I'm not going to go into the detail of the script as it's not the focus of this post, but it creates your app pool and website.

The code is exercised with the following:
setupWebsite "foobarui" "foobarui-test" $true "green"
applyCert("*.foobar.*") <# optional, if you want the sites to have an SSL cert applied #>

This will create 2 websites in IIS pointing to the green and blue folders as per the conventions outlined above. Finally, apply an SSL certificate using powershell; this command will apply the SSL cert to all the websites in this instance of IIS.
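In essence the setup does something like the following. This is a minimal sketch using the WebAdministration module and the conventions above, not the real script (which, as the call above shows, takes extra parameters such as the starting colour):

Import-Module WebAdministration

function SetupWebsiteSketch([string]$name, [string]$domain) {
  $live    = "$name.$domain"
  $staging = "$name-staging.$domain"

  # Backing folders per the conventions; live starts on green, staging on blue
  New-Item -ItemType Directory -Force -Path "c:\virtual\$live.green", "c:\virtual\$live.blue" | Out-Null

  # One app pool shared between live and staging here for brevity
  New-WebAppPool -Name $live | Out-Null
  New-Website -Name $live    -Port 80 -HostHeader $live    -PhysicalPath "c:\virtual\$live.green" -ApplicationPool $live | Out-Null
  New-Website -Name $staging -Port 80 -HostHeader $staging -PhysicalPath "c:\virtual\$live.blue"  -ApplicationPool $live | Out-Null
}

SetupWebsiteSketch "foobarapi" "foobar.co.uk"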
To remove the created items from IIS issue commands similar to this:
PS> dir IIS:\AppPools | where-object{$_.Name -like "*.foobar.co*"} | Remove-Item
PS> dir IIS:\Sites | where-object{$_.Name -like "*.foobar.co*"} | remove-item
PS> dir IIS:\SslBindings | remove-item

Once you have the websites correctly set up you can then utilise the deploy blue green scripts :-)

Deployment

The Blue Green deployment module is located here: BlueGreenDeployment.psm1 and will need importing into your powershell session with the following command:
PS> Import-module BlueGreenDeployment.psm1
Once you have the module imported you can issue the following commands:
PS> Deploy-Staging -source c:\tmp -websiteName foobarapi -domainName foobar.co.uk
PS> Backup-Live -WebsiteName foobarapi -DomainName foobar.co.uk
PS> Switch-BlueGreen -WebsiteName foobarapi -DomainName foobar.co.uk

Let's dig into these one by one.

1. Deploy-Staging
This is quite straightforward. Find the folder that is currently serving staging and copy the new version there. The interesting bit of code is the method of determining which folder to replace with the new version. IsLiveOnBlue and GetPhysicalPath work together to determine the folder in use on staging. Notice the retries inside GetPhysicalPath: I found that sometimes IIS just doesn't want to play, but if you ask it a second time it will. Don't ask...
The code that actually determines the physical path is:
$website = "IIS:\Sites\$WebsiteName.$domainName"
...
$websiteProperties = Get-ItemProperty $website
$physicalPath = $websiteProperties.PhysicalPath
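The retry looks something like this (a simplified sketch, not the exact module code):

function GetPhysicalPath([string]$website) {
  foreach ($attempt in 1..3) {
    $physicalPath = (Get-ItemProperty $website).PhysicalPath
    if ($physicalPath) { return $physicalPath }
    Start-Sleep -Seconds 1   # give IIS a moment and ask again
  }
  throw "Could not determine the physical path of $website"
}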

The rest of the powershell is relatively straightforward.

2. Backup-Live
Backing up live is again pretty standard powershell: determine the folder that is serving live, then do a copy. Done.
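A minimal sketch of the backup, assuming the folder conventions above:

$liveSite   = "IIS:\Sites\$WebsiteName.$DomainName"
$livePath   = (Get-ItemProperty $liveSite).PhysicalPath   # e.g. c:\virtual\foobarapi.foobar.co.uk.green
$backupPath = "c:\virtual-backups\$WebsiteName.$DomainName"

if (Test-Path $backupPath) { Remove-Item $backupPath -Recurse -Force }   # keep only the latest backup
Copy-Item $livePath $backupPath -Recurse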

3. Switch-BlueGreen
Performing the switch is actually really easy when it comes to it. First determine which folder (blue or green) is serving live (same code as the deploy step) and then switch it with the staging website:
Set-ItemProperty $liveSite -Name physicalPath -Value $greenWebsitePath -ErrorAction Stop
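The whole swap is just that lookup crossed over; a hedged sketch (rollback is simply running it again):

$liveSite    = "IIS:\Sites\$WebsiteName.$DomainName"
$stagingSite = "IIS:\Sites\$WebsiteName-staging.$DomainName"

$livePath    = (Get-ItemProperty $liveSite).PhysicalPath
$stagingPath = (Get-ItemProperty $stagingSite).PhysicalPath

# Point live at the folder staging was serving, and vice versa
Set-ItemProperty $liveSite    -Name physicalPath -Value $stagingPath -ErrorAction Stop
Set-ItemProperty $stagingSite -Name physicalPath -Value $livePath    -ErrorAction Stop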
The only added complication is the rewriting of the log file location in the web.config. Log4net only really works well if one process (web site) uses one log file. Again, you can look this up yourself as it's an aside to the main purpose of this post.
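For completeness, the rewrite amounts to something like this hedged sketch, assuming a log4net file appender in the web.config (the XPath and file naming will depend on your config layout):

$configFile = Join-Path $stagingPath "web.config"
[xml]$config = Get-Content $configFile
$fileNode = $config.SelectSingleNode("//log4net/appender/file")
$fileNode.SetAttribute("value", "c:\logs\$WebsiteName.$DomainName\green.log.txt")   # one file per colour; naming illustrative
$config.Save($configFile)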

Conclusion

The interwebs in general are full of articles/opinions/tales of how bad Windows is to automate, and it actually winds me up. Maybe it used to be true, but I've been finding that with powershell and Go I can automate anything I need. It's so powerful. Don't let the Microsoft haters stop you from doing what needs to be done.

The blue green deployment technique outlined here is working really well for us at the moment and has helped us take our projects live sooner, quicker and with more confidence.

Automation for the win.

Sunday 12 October 2014

Adventures in continuous delivery, our build/deployment pipeline

Overview

We have been undergoing a bit of a dev-ops revolution at work. We have been on a mission to automate everything, or at least as much as possible. Exciting times, but we are still only just setting out on this adventure, and I wanted to document where we are currently at.

First a brief overview of what we have. We have many, many small Windows services, websites and APIs, each belonging to a service and performing a specific role. I must quickly add that we are a Microsoft shop. More and more, our services are moving towards a proper service orientated architecture. I hesitate to use the term microservices as it's so hard to pin a definition on the term, but let's just say they are quite small and focused on a single responsibility.

We have 5 or 6 SPA apps, mainly written with Durandal and Angular; 7 or 8 different APIs serving data to these apps and to external parties; and 10 to 15 Windows services which mostly publish and subscribe to NServiceBus queues.

We currently have 8 environments that we need to deploy to (going to be difficult to do this by hand, methinks), including CI, QA*, Test*, Pre-prod* and live* (* the last 4 are doubled as we deploy into 2 different regions which both operate slightly differently and have different config and testing). This list is growing with every month that passes. We really, really needed some automation; when it was just 3 environments in the UK region we just about got by with manual deployments.

I'm going to outline how the build pipeline integrates with the deployment pipelines and the steps that we take in each stage. But I'm not really going to concentrate on the actual technical details; this is more of a process document.

1.0 The build pipeline

We operate on a trunk based development model (most of the time) and every time you check in we run a build that will produce a build artifact, push that in to an artifact repository and then run unit and integration tests on the artifact. 

Fig 1. The build pipeline

Build

1. Run a transform on the assembly info so that the resultant dll has build information inside the details. This helps us determine what version of a service is running on any environment: just look at the dll's properties. (There is a sketch of this after the list.)
2. Create a version.txt file that lives in the root of the service. This is easily looked at on an API or website, as well as in the folder containing a service.
3. We check in all the versions of the config files for all the environments that we will be deploying to, and use a transform to replace the specific parts of a common config file with environment specific details (e.g. connection strings). Every environment's config is now part of the built artifact.
4. Build the solution, usually with msbuild, or, for the SPA apps, gulp.
5. If all this is successful, upload the built artifact to the artifact repo (the Go server).
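The assembly info transform in step 1 can be as simple as a regex replace before compilation. A hedged sketch, not our exact build task (the path and version scheme here are illustrative):

# Stamp the Go build number into AssemblyInfo.cs before msbuild runs.
# GO_PIPELINE_LABEL is supplied by the Go agent.
$assemblyInfo = "src\FooBarApi\Properties\AssemblyInfo.cs"
$version = "1.0.$($env:GO_PIPELINE_LABEL).0"

(Get-Content $assemblyInfo) |
  ForEach-Object { $_ -replace 'AssemblyVersion\("[^"]*"\)', "AssemblyVersion(""$version"")" } |
  Set-Content $assemblyInfo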

Test

6. Fetch the built artifact
7. Run unit tests
8. Run integration tests

The test stage is separate so that we can run tests on a different machine if necessary. It also allows us to parallelise the tests, running them on many machines at once if required.

Not shown on this diagram are the acceptance tests; these are run in another pipeline. First we need to do a web deploy (as below), then set up some data in different databases, and finally run the tests.

2.0 The web deploy pipeline

So far so good; everything is automated on the success of the previous stage. We then have the deployment pipelines, of which only the one to CI is fully automated, so that acceptance tests can be run on the fully deployed code. All the other environments are push-button deploys using Go.
The deployment of all our websites/APIs/SPAs is very similar from one to the next and the same across all the environments, so we have confidence that it will work when finally run against live.

Fig 2. The web deploy pipeline

Deploy

1. Fetch the build artifact
2. Select the desired config for this environment and discard the rest so there is no confusion later (see the sketch after this list)
3. Deploy to staging (I've written a separate article detailing how this works with IIS, powershell and Windows)
a. Delete the contents of the staging website's physical path
b. Copy the new code and config into the staging path
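Step 2 boils down to promoting one config and binning the rest; a minimal sketch with illustrative names:

# $environment would be e.g. "CI" or "QA"; paths are hypothetical
Copy-Item "$artifactPath\configs\Web.$environment.config" "$artifactPath\Web.config" -Force
Remove-Item "$artifactPath\configs" -Recurse -Force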

Switch blue green

We are using the BlueGreenDeployment model for our deployments. Basically you deploy to a staging environment, then when you are happy with any manual testing you switch it over to live, using powershell to swap the physical folders of staging and live in IIS. This gives a quick and easy rollback (just switch again) and minimises any downtime for the website in question.

3.0 The service deployment pipeline

Much the same as the deployment of websites, except that there is no blue green. The services mainly read from queues, which makes it difficult to run a staging version at the same time as a live version (not impossible, but a bit advanced for us at the moment).

Fig 3. The service deploy pipeline

Deploy

The install step again utilises powershell heavily: first to stop the services, then to back things up and deploy the new code before starting the service up again.
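In outline, the install step is something like this sketch (service and path names are illustrative):

Stop-Service $serviceName -ErrorAction Stop

Copy-Item $installPath "c:\virtual-backups\$serviceName" -Recurse -Force   # back up the current version
Remove-Item "$installPath\*" -Recurse -Force                               # clear the install folder
Copy-Item "$artifactPath\*" $installPath -Recurse                          # deploy the new code

Start-Service $serviceName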

There is no blue green style of rollback here as there are complications to doing this with Windows services and with reading off the production queues. There is probably room for improvement here, but we should be confident that things work by the time we deploy to live as we have proved it out in 2 or 3 environments before live.

Summary

I'm really impressed with Go as our CI/CD platform; it gives some great tooling around the value stream map, promotion of builds to the other environments, pipeline templates and flexibility. We haven't just arrived at this setup of course; it's been an evolution which we are still undergoing. But we are in a great position moving forward as we need to stand up more and more environments, both on prem and in the cloud.

Fig 4. The whole deployment pipeline

Room for improvement

There is plenty of room for improvement in all of this though:

* Config checked into source control and built into the artifact
Checking the config into the code base is great for our current team: we all know where the config is, and it's easy to change or add new things to it. But for a larger team, or one where we didn't want everyone to know the secret connection strings to live DBs, it wouldn't work. Thank goodness we don't have any paranoid DBAs here. There is also a problem if we want to tweak some config in an environment: we need to produce an entire new build artifact from source code, which might now have other changes in it that we don't want to go live. We can handle this using feature toggles and a branch-by-abstraction mode of working, but it requires good discipline which we as a team are only just getting our heads around. Basically, if the code is always in a releasable state this is not an issue.

* Staging and live both have the same config
When you do blue green deployments as we are doing, both staging and live always point to the live resources and databases, so it's hard to test that the new UI in staging works with the new API, as both the staging and current live UI will be pointing to the live API. Likewise the live and staging API will both be pointing to the live DB or other resources. Blue green deployments are not designed for integration testing like this; that's what the lower environments are for.
In a very similar vein, logging will go to the same log files, which can be a problem if your logging framework takes out locks on files; we use log4net a lot, which does. There are options to work in a lock-when-required mode with log4net, but it can really hit performance. We have solved this by rewriting the path to the log file on the blue green switch.

* No blue green style deployments of windows services
The lack of blue green deployment of services means that we have a longer period of disruption when deploying and a slower rollback strategy. Added to this, you can't test the service on the production server before you actually put it live. There are options here, but it gets quite complicated to do, and by the time the service is going live you should have finished all your testing anyway.

* Database upgrades are not part of deployment
At the time of writing we are still doing database deployments by hand. This is slowly changing and some of our DBs do now have automated deployments, mainly using the Redgate SQL tool set, but we are still getting better at this. It's my hope that we will get to fully automated deployments of data schemas at some point, but we are still concentrating on the deployment of the code base.

* Snowflake servers
All our servers, both on prem and in the cloud, are built, installed and configured manually. I've started to use Chocolatey and powershell to automate what I can around set-up and configuration, but the fact remains that it's a manual process to get a new server up and running. The consequence of this is that each server has small differences from other servers that "should" be the same. This means that we could introduce bugs in different environments due to accidental differences in the server itself.

* Ability to spin up environments as needed for further growth
Related to the above point, as a way to move away from the problem of snowflake servers we need to look at technologies like Puppet, Chef, Desired State Configuration etc. If we had this automation we could spin up test servers, deploy to other regions/markets, or scale up the architecture by creating more machines.

Relevant Technology Stack (for this article)

• Windows
• Powershell
• IIS
• SVN and Git
• Msbuild and gulp

Next >>

I've written a follow-up article to this which details the nuts and bolts of the blue green deployment techniques we are currently using: blue-green-web-deployment-with-IIS-and-powershell.
The code for which can be found on my git hub here: https://github.com/DamianStanger/Powershell/

Sunday 31 August 2014

Using Powershell to create local users on windows

We are setting up a server farm for a new environment consisting of many servers, and we want to create many users with admin rights on each one, including membership of the remote desktop users group.

We could have spent an hour or so and used the GUI on each server but we thought that a script would be quicker, not to mention more fun to write.

The latest version is here: https://github.com/DamianStanger/Powershell/blob/master/Add-LocalAdminUserAccount

The version at time of writing is below:

Function Add-LocalUserAdminAccount{
  param (
  [parameter(Mandatory=$true)]
    [string[]]$ComputerNames=$env:computername,
  [parameter(Mandatory=$true)]
    [string[]]$UserNames,
  [parameter(Mandatory=$true)]
    [string]$Password
  )

  foreach ($computer in $ComputerNames){
    foreach ($userName in $UserNames){
      Write-Host "setting up user $userName on $computer"

      [ADSI]$server="WinNT://$computer"
      $user=$server.Create("User",$userName)
      $user.SetPassword($Password)
      $user.Put("FullName","$userName-admin")
      $user.Put("Description","Scripted admin user for $userName")

      #PasswordNeverExpires
      $flag=$User.UserFlags.value -bor 0x10000
      $user.put("userflags",$flag)

      $user.SetInfo()

      [ADSI]$group = "WinNT://$computer/Administrators,group"
      write-host "Adding" $user.path "to " $group.path
      $group.add($user.path)

      [ADSI]$group = "WinNT://$computer/Remote Desktop Users,group"
      write-host "Adding" $user.path "to " $group.path
      $group.add($user.path)
    }
  }
}

[string[]]$computerNames = "computer1", "computer2"
[string[]]$accountNames = "ops", "buildagent"

Add-LocalUserAdminAccount -ComputerNames $computerNames -UserNames $accountNames -Password mysecurepassword


The lines that do the damage are the ADSI block that creates the user, sets its password and flags and then calls SetInfo() to save it, followed by the two group lookups that add the new user to the Administrators and Remote Desktop Users groups on each machine.

It would be trivial to change this script so it was a powershell module, but the script as it stands serves my current needs. Just add more computer names and account names to suit your needs; we have around 10 of each in the version of the script I'm running.

Friday 20 June 2014

Story points vs numbers of stories. What's the best way to predict your project completion date? Or are estimates worth it?

Do story points actually help in estimation and the prediction of project milestones and end dates? Given the inherent inaccuracy of points, can we save time by skipping the estimation process?
We estimate our stories in story points (2, 4 or 8 points representing small, medium or large). We try and do this when the stories get added to the backlog so that we can better plan the upcoming work.

But personally I don't really think the effort required to estimate the stories is worth it; maybe we should plan just with the numbers of stories instead?

I'd like to highlight my thoughts with actual data rather than hearsay and conjecture. Opinion pieces are all very well, but sometimes you just need to 'show me the data'.

Comparison of the two different metrics

We have been going with our current project for 16 iterations/weeks now; we had one major release at iteration 6 and we are now approaching the next major release for this project. I wanted to take this opportunity to reflect on the no-estimates debate (is there any point in estimates?) and I will let the numbers do the speaking for me.

Here is our velocity, first in terms of story points and then in terms of stories completed:

Velocity in story points

Velocity in numbers of stories

The following 2 graphs track the amount of work done vs remaining. The first is an agile burn-up chart, the second a lean cumulative flow chart:

Burn up (story points)

Cumulative flow (numbers of stories)

And here are the statistics around the numbers of stories that we have:
13 stories/spikes with no points, 39 with 2 points, 29 with 4 points and 5 with 8 points.

Conclusions

Apart from the fact that there has been a lot of scope creep and our velocity has fluctuated wildly, what do the graphs tell us?

Well, at the beginning of the project, like almost every project I've ever worked on, we thought we had most (but not all) of the requirements captured, and like most of the projects I've ever worked on we were wrong, very wrong. The scope doubled from 120 points (40 stories) to 240 points (80 stories) from the start of the second phase up to now. So using either points or numbers of stories gave a very misleading picture of the scope, and hence of the estimated completion date. No difference between the metrics here then.

Our velocity trend is either 15 points or 6 stories; whichever measurement of velocity you use, the estimated end date (as of this writing) is the 20th of July. Both the burn-up and the cumulative flow diagrams show how scope has been added with each iteration as we discover new requirements. As with everything we do, we have to ask: is this thing that we do adding value?

So I think we could ditch estimates. But this is the thing: we MUST still analyse stories properly, and must break them down into small independent stories (always remember to INVEST in your stories). And we as a team must still discuss the stories before we pick them up to work on.

Appendix

You might ask what happened in early May when the graphs flat-line (iteration 20)? Well, half the team was needed for an urgent fix/deploy to another system at the same time as holidays for 2 other team members, and then we had to ramp up the team again (with different team members).
And why the fluctuations in velocity? Well, the team changed quite regularly; the time when we were most productive was (unsurprisingly) the time when the team was most stable and we weren't getting distracted by other projects and maintenance work.

All graphs are produced courtesy of the Mingle agile project management software from ThoughtWorks; it's a great tool for managing your agile projects. We moved from Trello to Mingle around about the new year, but that is a different story for a different blog post.

Incidentally, I'm not even going to mention burn-down charts; I've long since abandoned them as they are so limited for trying to visualise the actual picture of what is going on in a given release. If you want to see a good explanation of burn-ups vs burn-downs just Google it, or look at this example here: http://brodzinski.com/2012/10/burn-up-better-burn-down.html
I realise not everyone sees it my way, and I guess if you have stable/full requirements and want to track work in detail in a given iteration then burn-down might work for you, but in our experience it just causes confusion for management and stakeholders. They may end up asking 'why is your burn down going up?' or 'why is it flat?'. They can't tell if you did nothing or were adding features at the same rate as knocking them off.
Ok, ok, sorry I did mention burn down charts, sorry....

Saturday 24 May 2014

Unit testing with Moq - Returning different values from multiple calls to the same method in a loop, and then verifying multiple calls to another method.

I was doing some TDD the other day in a C# .net service and found myself wanting to write a loop that called a couple of methods and acted on them in different ways depending on the return values. I needed to mock the calls so they returned different data depending on how many times the methods were invoked.

This is the method under test that I ended up with after doing the TDD cycle (I've stripped out all the exception handling and logging to keep this example clear):

using JourneyHeader.Domain.Entities;
namespace journeyMigration
{
  public class JourneyMigrator
  {
    private readonly JourneySource _journeySource;
    private readonly JourneyDestination _journeyDestination;

    public JourneyMigrator(JourneySource journeySource, JourneyDestination journeyDestination)
    {
      _journeySource = journeySource;
      _journeyDestination = journeyDestination;
    }

    public int JourneysProcessed { get; private set; }
    public int JourneysFailedProcessing { get; private set; }
    public JourneyHeader LastJourneyProcessed { get; private set; }

    public void Start()
    {
      JourneyHeader journeyHeader = _journeySource.GetNextJourney();
      while (journeyHeader != null)
      {
        _journeyDestination.Upload(journeyHeader);
        JourneysProcessed++;               
        LastJourneyProcessed = journeyHeader;   
        journeyHeader = _journeySource.GetNextJourney();
      }
    }
  }
}


The code we really care about is the while loop in Start(). Notice that GetNextJourney() is called repeatedly, and whether we carry on depends on the value returned last time. Also see that the Upload() method is called inside the loop; I want to verify I was passing the correct values through to it.

This is one of the tests I came up with whilst writing this code; it's a good example of the 2 things I wanted to demonstrate here.

using System;
using System.Collections.Generic;
using FluentAssertions;
using JourneyHeader.Domain.Entities;
using journeyMigration;
using Moq;
using NUnit.Framework;

namespace journeyMigrationTests
{
  [TestFixture]
  public class JourneyMigratorTests
  {
    // The fixture set-up was elided in the original; this sketch assumes
    // GetNextJourney and Upload are mockable (virtual, or declared on interfaces)
    private Mock<JourneySource> _journeyHeaderSource;
    private Mock<JourneyDestination> _journeyDestination;
    private JourneyMigrator _journeyMigrator;

    [SetUp]
    public void SetUp()
    {
      _journeyHeaderSource = new Mock<JourneySource>();
      _journeyDestination = new Mock<JourneyDestination>();
      _journeyMigrator = new JourneyMigrator(_journeyHeaderSource.Object, _journeyDestination.Object);
    }
    [Test]
    public void ShouldProcessTwoJourneys()
    {
      var journeyHeader1 = new JourneyHeader();
      var journeyHeader2 = new JourneyHeader();
      var queue = new Queue<JourneyHeader>(new [] {journeyHeader1, journeyHeader2, null});
      _journeyHeaderSource.Setup(x => x.GetNextJourney()).Returns(queue.Dequeue);
 
      _journeyMigrator.Start();
 
      _journeyMigrator.JourneysProcessed.Should().Be(2);
      _journeyMigrator.LastJourneyProcessed.Should().Be(journeyHeader2);
      _journeyDestination.Verify(x => x.Upload(journeyHeader1), Times.Exactly(1));
      _journeyDestination.Verify(x => x.Upload(journeyHeader2), Times.Exactly(1));
    }
  }
}


The interesting part here is the set-up of a queue that is used to return the values in the prescribed order. You have to do it this way because in Moq you can't do multiple Setups on a given class's method; the last one to be defined will win. In the following example journeyHeader2 is always returned:
_journeyHeaderSource.Setup(x => x.GetNextJourney()).Returns(journeyHeader1);
_journeyHeaderSource.Setup(x => x.GetNextJourney()).Returns(journeyHeader2);


The Returns method takes a value, as above, or a function that is run every time a return value is required, so you could also write it like this:
_journeyHeaderSource.Setup(x => x.GetNextJourney()).Returns(() => queue.Dequeue());
But the method group form used in the test above is a lot clearer.

Finally, the two Verify calls at the end of the test confirm the method Upload was called correctly: once with the first journeyHeader and once with the second.

Appendix
Moq - A popular and friendly mocking framework for .NET

Monday 7 April 2014

Thoughtworks Go, adding a version text file as a build artifact

Firstly let me be clear that we use Go from Thoughtworks but I'm sure you can use the same technique outlined below for other CI systems such as Teamcity or TFS.

When we deploy our built code to the live servers it's good to be able to see what version of the dlls/exes has been deployed. To do this we put a file called version.txt in the same directory as the built files, which contains the details of the build that has been deployed: the build number and the revision of SVN that formed the source for the build.

If you look in the console tab of the build job that you have set up you will see something similar to the following:
[go] setting environment variable 'GO_ENVIRONMENT_NAME' to value 'CI'
[go] setting environment variable 'GO_SERVER_URL' to value 'https://buildAgent01:8154/go/'
[go] setting environment variable 'GO_TRIGGER_USER' to value 'changes'
[go] setting environment variable 'GO_PIPELINE_NAME' to value 'Scoring'
[go] setting environment variable 'GO_PIPELINE_COUNTER' to value '81'
[go] setting environment variable 'GO_PIPELINE_LABEL' to value '81'
[go] setting environment variable 'GO_STAGE_NAME' to value 'Build'
[go] setting environment variable 'GO_STAGE_COUNTER' to value '1'
[go] setting environment variable 'GO_JOB_NAME' to value 'BuildSolution'
[go] setting environment variable 'GO_REVISION' to value '6343'
[go] setting environment variable 'GO_TO_REVISION' to value '6343'
[go] setting environment variable 'GO_FROM_REVISION' to value '6343'

With this data you can create a task which uses powershell to create the file:
Command: powershell
Arguments: sc .\version.txt "GO_ENVIRONMENT_NAME:%GO_ENVIRONMENT_NAME%, GO_SERVER_URL:%GO_SERVER_URL%, GO_TRIGGER_USER:%GO_TRIGGER_USER%, GO_PIPELINE_NAME:%GO_PIPELINE_NAME%, GO_PIPELINE_COUNTER:%GO_PIPELINE_COUNTER%, GO_PIPELINE_LABEL:%GO_PIPELINE_LABEL%, GO_STAGE_NAME:%GO_STAGE_NAME%, GO_STAGE_COUNTER:%GO_STAGE_COUNTER%, GO_JOB_NAME:%GO_JOB_NAME%, GO_REVISION:%GO_REVISION%"
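Whether the %VAR% tokens above expand depends on the arguments passing through cmd; an alternative that avoids the issue is to build the file from within powershell itself using the $env: variables, which also conveniently gives one value per line, matching the version.txt shown below:

@"
GO_ENVIRONMENT_NAME:$env:GO_ENVIRONMENT_NAME
GO_SERVER_URL:$env:GO_SERVER_URL
GO_TRIGGER_USER:$env:GO_TRIGGER_USER
GO_PIPELINE_NAME:$env:GO_PIPELINE_NAME
GO_PIPELINE_COUNTER:$env:GO_PIPELINE_COUNTER
GO_PIPELINE_LABEL:$env:GO_PIPELINE_LABEL
GO_STAGE_NAME:$env:GO_STAGE_NAME
GO_STAGE_COUNTER:$env:GO_STAGE_COUNTER
GO_JOB_NAME:$env:GO_JOB_NAME
GO_REVISION:$env:GO_REVISION
"@ | Set-Content .\version.txt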

This produces a text file like this:

version.txt

GO_ENVIRONMENT_NAME:CI
GO_SERVER_URL:https://buildAgent01:8154/go/
GO_TRIGGER_USER:changes
GO_PIPELINE_NAME:Scoring
GO_PIPELINE_COUNTER:81
GO_PIPELINE_LABEL:81
GO_STAGE_NAME:Build
GO_STAGE_COUNTER:1
GO_JOB_NAME:BuildSolution
GO_REVISION:6343

Make sure this file is included in the build output folder along with the build artifacts. Good times.

Monday 24 March 2014

Visualising the Thoughtworks Go pipeline using Cradiator, a build information radiator/monitor


You know the old adage, 'out of sight, out of mind'? Like it or not, sometimes the state of the build on CI is forgotten about, and if you can't see the current state without going looking for it, it can stay red for a few days before it's noticed by someone.

I've always been a big fan of information radiators, build monitors, graphs and stats that are in people's faces, and I just wanted to share our current solution to the whole 'who cares if the CI server is not Green' problem.

We use Go (the CI server from Thoughtworks) for our build and deployment pipeline, which is great. Although it doesn't ship with a build monitor that can be installed on a machine to show the state of the build, Thoughtworks do expose an API that allows you to build your own, for example on http://Server:Port/go/cctray.xml

We found an old-ish project called Cradiator that works with CruiseControl (remember that old build server? It turns out that Go exposes the same API, albeit on a slightly different URL). The problem was that we have secured our instance of Go so that you need to be logged in to access it. This caused problems with Cradiator, so we forked it and added the ability to set your own credentials in the config. The fork can be found here.

Below are a couple of screen shots showing the Go build server and also the corresponding Cradiator screen.
Our Go pipeline for this small part of the overall system

The Cradiator build monitor screen

As you can see every stage within a Go pipeline has a corresponding line in Cradiator. To achieve this, we use the following filter in the Cradiator config file: project-regex="^.*::.*::.*"

There is a robotic voice that announces who broke what, with some good catastrophic sound effects to accompany it. It's doing a great job of focusing people on fixing things if/when they go red.

Ever since we started using this, the average time from a red build to a check-in fixing it has been less than an hour. Visibility for the win.

References:

https://github.com/DamianStanger/Cradiator

Thursday 13 March 2014

Thoughtworks Go, asynchronously trigger a manual stage from a long running test

We are using Go from Thoughtworks Studios to manage our build pipeline; we have the builds generating artifacts that are then deployed and installed on to UAT servers. Automation rocks. We then have a long running test that runs out of process using lots of NServiceBus queues.
We have a stage that starts a process manager that fires fake messages into the start of the system, then 3 different services pass messages along doing various things, all connected together via message buses.

So at the start of the UAT test we kick off some powershell, passing in the current pipeline counter (this is a vital detail: it tells the future API call which pipeline to kick):
. .\start_fullpipelinetest.ps1 %GO_PIPELINE_COUNTER%;
In this example the %GO_PIPELINE_COUNTER% variable is 11.

Given the nature of the system, once the message is sent to kick off the process the Go pipeline goes green and the next stage goes into an awaiting-approval manual stage.



The last thing the tests do is send a message out to inform other downstream systems that things are ready to go. We hook into this and fire the following POST into Go to run the next stage. If you were to do this manually you would click the icon circled above; to do it programmatically you send the following POST command:
curl --data "" http://user:password@server:8153/go/run/uat_start_FullPipelineTest/11/TestingComplete
Which kicks off the final stage of the pipeline. For us this does some verification as to the expected state of the system and passes or fails accordingly.
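If you'd rather stay in powershell than shell out to curl, the same trigger can be sent with Invoke-RestMethod; a sketch assuming basic auth against the Go server (user, password and server are placeholders):

$bytes   = [Text.Encoding]::ASCII.GetBytes("user:password")
$headers = @{ Authorization = "Basic " + [Convert]::ToBase64String($bytes) }

Invoke-RestMethod -Method Post -Headers $headers -Uri "http://server:8153/go/run/uat_start_FullPipelineTest/11/TestingComplete"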

I really like it; we get an asynchronous test that does not hog the Go agent resources and will instantly tell you about failures once a test run has finished.

Go is very flexible and the API lets you do all sorts of cool things, like uploading artefacts and triggering pipelines:
curl -u user:password -F file=@abc.txt http://goserver.com:8153/go/files/foo/1243/UATest/1/UAT/def.txt
curl -u user:password -d "" http://goserver.com:8153/go/api/pipelines/foo/schedule