Test-driven Development in Analytics
In the 2017 world of IT and systems engineering, test-driven development (TDD) is quickly becoming the new mantra. No one writes a line of code these days without the intent that the code test itself. If there is a bug, it gets caught and fixed before the code goes live, reducing the risk of breakage.
This kind of system has never been deployed on the analytics side. By convention, analytics work has relied on hacks: quick-and-dirty patches that frequently go awry and are just as likely to backfire on the analyst as to clear her obstacles. An analyst winging it to fill a small gap in the proverbial data wall can unwittingly open a huge chasm with a single stroke. Bringing a TDD approach to analytics would go some way toward changing that. It would require that whenever you make a change to your analytics, the change is fully tested before it's deployed. This method takes more time, and may frustrate management, but it results in better quality control.
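As a concrete illustration of what "fully tested before it's deployed" could look like, here is a minimal sketch of a test-first check on campaign tracking URLs. The data, the function name, and the required parameters are all hypothetical; the point is only that a change to tagging logic would fail an automated test before it reached production reporting.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical rule: every campaign URL must carry these UTM parameters.
REQUIRED_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}

def validate_tracking_url(url: str) -> list:
    """Return a sorted list of required UTM parameters missing from a URL."""
    params = set(parse_qs(urlparse(url).query))
    return sorted(REQUIRED_PARAMS - params)

def test_campaign_urls_are_fully_tagged():
    # In practice this list would come from the campaign system itself.
    urls = [
        "https://example.com/?utm_source=newsletter&utm_medium=email&utm_campaign=spring",
    ]
    for url in urls:
        missing = validate_tracking_url(url)
        assert not missing, f"{url} is missing {missing}"
```

A test runner would execute checks like this on every change, so a broken tag surfaces as a failing test rather than as a gap in next quarter's report.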
I read recently that the average time an analyst spends at a given company is 18 months. During that time, the refrain one constantly hears is, "Oh, I inherited a mess. I've had to shut everything down, clean it up, and start all over." One place where things traditionally fall apart quickly is marketing campaign reporting. When one team hands over a collection of spreadsheets to a new team that lacks the institutional knowledge the spreadsheets were built on, it's hardly surprising that things break.
From the business's perspective, shutting everything down every 18 months and starting over is not a sustainable long-term model. I like knowing that people in our industry recognize problems and can work to fix them. But I don't like feeling that this constant cycle of frustration is our destiny. If work needs to be done to fix a system, fine. But wouldn't it be better if you could trust that the fix would last? How great would it be for one analytics engineer to have a smooth handoff to another?
Our industry needs to create governance systems: constraints that ensure that well-intentioned people within an organization can't tinker with data to the point of breakage. To shore up our reporting tools for the long term, we need first to isolate all the ways the system can break, and then automate checks for those failure modes as much as possible.
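One way to automate such a constraint is to encode a naming taxonomy as a rule that every campaign must pass. The pattern below is a hypothetical example of such a rule, not any particular company's convention; it shows how a governance check can flag well-intentioned tinkering before it corrupts the reporting downstream.

```python
import re

# Hypothetical governance rule: campaign names must follow
# <channel>_<region>_<yyyymm>, e.g. "email_emea_201703".
CAMPAIGN_PATTERN = re.compile(r"^(email|social|search|display)_[a-z]+_\d{6}$")

def find_invalid_campaigns(names):
    """Return the campaign names that violate the naming taxonomy."""
    return [n for n in names if not CAMPAIGN_PATTERN.fullmatch(n)]
```

Run against a feed of new campaign names, a check like this rejects an ad-hoc name such as "Spring Sale!!" at entry time, instead of letting it silently fragment the reports it feeds.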
Claravine is just one of several new platforms springing up, attempting to build something that will move the industry toward an environment and workflow that resemble true engineering. The biggest thing we've done is allow analysts to move from spreadsheets to an automated platform. We'd like to go further, so that all the tracking code verification can happen behind the scenes. But that's the future. For now, it's one step at a time.