Monitoring with NewRelic

Over the years I’ve come to rely on information radiators during testing to get immediate (or as quick as possible) feedback from the systems I’m testing.

Firebug, log files, event logs and many other sources of information are all very useful to a tester. They can give you insights into what is happening in the system under test.

We’ve just taken this a step further by rolling out NewRelic on our test servers.

NewRelic is what’s termed an “Application Performance Management” (APM) solution.

I’ve been talking about this internally as a system that can give us three distinct insights:

  • User Experience Information
  • Server Information
  • Product Performance Information

I’ve probably oversimplified the tool and am doing it an injustice, but this framing allows me to clearly explain the value we’re seeing from it.

User Experience Information

NewRelic gives us all sorts of data around how the experience is for end users when they use our product.

We can use this to ascertain how our product is being experienced by our customers, but we can also use it to understand how the experience is stacking up for our testers.

If we are testing and we observe a slowdown, we can use NewRelic to check whether it really was a product slowdown and, more importantly, to see what’s actually happening on the stack.

We can use NewRelic to work out which browsers are being used across all of our environments. We can see the test coverage we have across browsers, and we can also see which browsers our own business uses from our pre-production test environments (where we test all kits before live deploy).

We can also then see which browsers are faster than others, which versions are used and which browser is most heavily used. Interesting stuff to help guide and tune our testing.
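NewRelic derives this browser breakdown for you, but the idea itself is simple. As a rough sketch of what such a breakdown involves (the user-agent strings and the crude classification rules below are my own illustrative assumptions, not NewRelic’s mechanism):

```python
from collections import Counter

# Hypothetical user-agent strings, as might be pulled from access logs.
user_agents = [
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 Chrome/23.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20100101 Firefox/17.0",
    "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)",
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 Chrome/23.0 Safari/537.36",
]

def browser_family(ua):
    """Very rough user-agent classification, for illustration only."""
    if "Chrome" in ua:
        return "Chrome"
    if "Firefox" in ua:
        return "Firefox"
    if "MSIE" in ua or "Trident" in ua:
        return "Internet Explorer"
    return "Other"

# Count visits per browser family; the top entry is the most used browser.
coverage = Counter(browser_family(ua) for ua in user_agents)
print(coverage.most_common())
```

The real value of the tool is that it does this classification (plus versions and per-browser timings) continuously, with no log wrangling on our part.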

Server Information

NewRelic monitors the actual servers, giving all sorts of information such as memory, CPU and process usage. This is great information to have on our test servers, especially during perceived slowdowns or during a load test.
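To give a flavour of the kind of host metrics an agent like this samples, here is a minimal stdlib-only sketch (this is not how NewRelic collects its data, and `resource`/`os.getloadavg` are Unix-only):

```python
import os
import resource

# Sample a few host/process metrics, the sort of thing an APM
# server monitor reports continuously.
usage = resource.getrusage(resource.RUSAGE_SELF)

metrics = {
    "cpu_user_seconds": usage.ru_utime,    # CPU time spent in user mode
    "cpu_system_seconds": usage.ru_stime,  # CPU time spent in kernel mode
    "max_rss_kb": usage.ru_maxrss,         # peak memory (KB on Linux)
    "load_average_1m": os.getloadavg()[0], # 1-minute system load average
    "cpu_count": os.cpu_count(),
}

for name, value in metrics.items():
    print(f"{name}: {value}")
```

A monitoring agent essentially gathers figures like these on a schedule, ships them off-box and charts them, which is what makes spotting a slowdown during a load test so quick.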

We have other mechanisms for measuring this too, so it’s the least used NewRelic function when we’re testing.

Product Performance Information

For me, this is the most valuable information tools like NewRelic offer: they show you what the product is actually doing.

It includes which pages are being dished up, how fast they are being dished up, where they may be slow (in the DOM? On the network?), what queries are being run, what part of the code is running them and how often they are being called.

When we dig around in the data we can find traces NewRelic has stored, which give an amazing level of detail about what the product was doing when the trace was captured.

It’s going to become a tester’s best friend.

In a nutshell, it allows us to build an accurate picture of what the product is doing while we are testing. This means we can now log supremely accurate defect reports, including traces and metrics about the product at the moment any bugs were found.

The programmers can also dig straight into any errors and see the exact code that is generating them.

We can see which queries are running, meaning that if we encounter an error, a slowdown or something worth digging into, we have the details to hand.

It’s still early days with the tool, but already we’ve had deep insight into how the product runs in our environments, insight I’ve never been able to get from just one place.

It’s immediate also. Test – check NewRelic – move on.

Imagine how powerful this could be on your live systems too.

Imagine the richness of information you could retrieve and imagine how fast you could get to the root cause of any problems. It’s powerful stuff. Expect to hear further posts on how tools like this can inform tests, provide a depth of supporting information and provide help to performance testing.

Some notes:

  • There are alternatives to NewRelic.
  • It’s still early days but tools like this are proving invaluable for accurate and timely troubleshooting and information gathering.
  • I’m not affiliated with NewRelic in any way – I’m just a fan.

Failure Demand

One of the highlight talks from EuroSTAR 2012 was the keynote by John Seddon.

It wasn’t even a testing talk. It was a talk about value, waste and failure demand, centred on the work Vanguard (John’s company) did with Aviva Insurance to improve their system and provide value to the customer. It was particularly interesting from my perspective because it revolved around Aviva’s call centres, and as I work on call centre products I had more interest than some of those around me.

I saw good parallels to testing and software development, but I don’t believe everyone did. I think that’s a shame, because had more people seen the connections, they might have been as inspired as I was after the talk.


In a nutshell, John told the story of how Aviva was being run based on cost. Management were looking at the system (the business process) as a cost centre and working to reduce costs rather than looking at the root causes of why costs were high.

Aviva had started to receive large numbers of calls to their call centres, so they built more call centres to cater for the demand. The call centres were moved to areas in Britain and abroad where the cost per centre was lower. They were looking at the costs of the call centres and optimising and reducing cost where they could.

The problem, though, was that the call centre costs were an effect of customers not getting value at the beginning of the cycle. When a customer interacted with Aviva, their problem or request would not be dealt with 100%, so they would call the call centre. And again, not get it resolved. So they would call back. The managers took this to mean that people liked to speak to Aviva, hence more call centres. The real reason was that problems were not being solved correctly first time, hence Aviva was spending more trying to solve them later.


John coined the term “Failure Demand” to explain this. Failures in the system were creating demand elsewhere; in this instance, calls to a call centre.

He worked with Aviva to increase the chances of satisfying the customer 100% on their first interaction, thereby reducing the need for further call centres. Customer satisfaction went through the roof and savings were made.

The problem Aviva had was that they were managing based on cost rather than the value they provide to their customers. Switching this focus means a significant mindset change, but the results are incredible.


What’s this got to do with testing?

A lot. When we manage our development process by cost, we start to ignore the value that we are adding. We use metrics to make decisions, we look for cheaper ways of doing things and we optimise the wrong parts of the system.

I immediately saw lots of parallels with software development. Rework, bug fixes, refactoring, enhancements and any other work that could have been avoided are, I believe, failure demand. We spend more money correcting things than we would have spent getting them right in the first place.

In software development, though, there will always be times when we need to refactor, change something or fix bugs. The question for me is at what point natural change crosses over into failure demand.


Did we not define the requirements well enough and are now having to change the product because it’s not right for the customer?

Did we not include the right people at the start and some tough questions get asked too late in the process?

Did we not have sufficient testing in place early enough to catch obvious bugs which now require rework?

Did we not have the right skills in place to make sound technical decisions which now mean we have to re-write bits of the product?

Did we not spend enough time understanding the problem domain before jumping in and building something?


Agile helps to reduce this somewhat by making the feedback loop smaller, but as John mentioned in his talk, “failing fast is still failing”.

It was a really good talk. It made me think hard about which elements of a tester’s work could be failure demand. It reinforced my view that optimising parts of the system doesn’t often address the root cause, and it gave me renewed energy to look at value rather than cost.

If you’re interested in the talk, here is a similar one (without the Aviva part) from Oredev, and here is the Aviva video that John showed during the presentation.

Re-thinking IT – John Seddon from Øredev on Vimeo.

Interesting stuff. His company, Vanguard, has more information on its website.

Geeks hiring geeks

For those of you who are hiring managers, there is a book I would most definitely recommend you read: Hiring Geeks That Fit by Johanna Rothman.

We’re not doing too badly at all at recruiting (we are recruiting again, by the way!) but there are always lessons to be learned and advice to be sought.

Johanna’s book is a great read packed full of useful insights, experience and nuggets of gold that may just change the way you recruit. It’s great to read a book that is pragmatic about recruitment and open to the scary reality that hiring geeks that fit can be challenging and demanding and may require managers to step outside of their comfort zone.

It’s a book designed for those that want the right candidate, not just the best candidate they can find within 30 miles of the office.

It’s also full of practical advice, like how to make an offer that will be tempting, how to be sure you are “right on the money” at offer stage and how to make a great first-day impression. I liked the chapters about sourcing and seeking out candidates.

I imagine it’s not comfortable reading for those who expect generic adverts to attract top talent or for a consultant to do all of the work for them, but that’s why the book is so good. Johanna spends a nice amount of time talking about personal networks, and of course, social networks as a way to recruit. I’ve had major success from both personal and social networks so can testify to how powerful they are becoming.

“One thing you cannot do is avoid Twitter. Not if you want the best technical candidates. Not if you want people who use social media. But you can keep your Twitter use to 15 minutes a day while you are sourcing candidates. That, you can do.”

Most tasks that are worth doing involve an investment of time. This is a theme I believe runs throughout the whole book. Johanna makes it clear that the process is time consuming, but it’s an investment. To get good candidates takes a great deal of time and effort.

“You don’t have to spend gobs of money to find great candidates, but if you don’t, you probably will need to spend time. Remember, potential candidates may not all look in one place to learn about the great job you have open, so you need to use a variety of sourcing techniques to reach them.”

Or you could just throw money at recruiting:

 “If you have a substantial budget but not a lot of time, consider using a combination of the more costly sourcing techniques—such as print and media ads, external contingency recruiters, external retained-search recruiters, headhunters, and numerous nontraditional approaches—along with the time-intensive techniques.”

It’s a really balanced book to read. I took loads from the book and would definitely recommend it to anyone recruiting.

In fact, it’s a good book for those seeking a new position too – it certainly gives insights into how managers may be recruiting.

The templates included in the book are very useful indeed, especially for refining your requirements further and understanding the value your company could offer a candidate.

It’s an easy book to read too, with clear language and stories illustrating key points by way of example.

During our recruiting I am digging into the book and putting many of its ideas into practice. A good book indeed.