# Agile Data Conversion

Last year I presented on Agile Data Warehouse – the final frontier at SDEC12. You can find the presentation here if you are interested.

This year, on the same project, I have been challenged with another new frontier. Now that we are executing and evolving the Data Warehouse, the question was posed as to how we can run Data Conversion projects in an Agile way. The project has a requirement to convert massive amounts of data from old legacy systems to a new application. These legacy systems are written in COBOL and have existed since the early 1970s. Basically, they are old-school legacy.

We have encountered the traditional quandary. Because of the project's constraints, the Data Conversion and application development projects need to occur in parallel, and both streams then need to be validated and tested in time for the start of Integration Testing. This is clearly a challenge, as changes in one can affect the other. The issue is that if we wait until all of the application design is done before we consider the conversion specifications materially complete, we will not have adequate time to develop, convert, validate, and reconcile before Integration Testing begins.

Although the application projects and teams have embraced Agile methods, the Data Conversion teams have been quite hesitant. The focus for data conversion has still been to have a complete set of specifications and to test and validate the entire data conversion process.

Once Data Conversion is code complete, there is a large amount of work required to reconcile and validate the converted data. Although there will be some test automation, the investigation required to resolve even a single defect can be significant. In short, data conversion validation will take much longer than application validation. Yet both streams need to feed the Integration Test.
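To make "reconcile and validate" concrete, here is a minimal sketch of what an automated reconciliation check might look like. This is not our actual tooling; the field names (`balance`) and the rounding tolerance are illustrative assumptions.

```python
def reconcile(source_rows, target_rows):
    """Compare a legacy extract against the converted data and
    return a list of discrepancies (empty means the data balances).

    Each row is assumed to be a dict with a numeric "balance" field;
    real conversions would check many more fields and keys.
    """
    issues = []

    # Simplest check first: did every record make it across?
    if len(source_rows) != len(target_rows):
        issues.append(
            f"row count mismatch: {len(source_rows)} source vs "
            f"{len(target_rows)} target"
        )

    # Do the monetary totals balance within a rounding tolerance?
    src_total = sum(r["balance"] for r in source_rows)
    tgt_total = sum(r["balance"] for r in target_rows)
    if abs(src_total - tgt_total) > 0.005:
        issues.append(f"balance mismatch: {src_total} vs {tgt_total}")

    return issues
```

Even a check this simple illustrates why validation is slow: the check can tell you *that* something is off, but tracing one mismatched total back to the offending legacy records is where the investigation time goes.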


The question posed was whether this was a truly Agile process.

We didn’t deliver frequently and we didn’t minimize inventory. I would say we failed on both counts.

We were tasked to determine how we could possibly supply converted data for Integration Testing.


The epiphany came from asking ourselves what an Agile Data Conversion would look like. We discussed that we would be able to do just enough to allow the testing to proceed. It turns out we are already doing some of that when generating Sample files for the package vendor: we generate the files in accordance with the current specification and adapt as we learn more. So that is good – early and frequent feedback.

One thing we were not doing in our development was placing rigor around how we adapted. In this case, doing things in an Agile way actually resulted in more structure. We still had to plan, since we were time-boxed on when we needed to be complete. The change was to ensure we had a plan to adapt, rather than the ad-hoc adaptation that all Agile projects can suffer from.

But the one area where we were not as Agile as we could be was the reconciliation and validation of the data conversions. We had planned to balance all the conversions before we could say the data was ready for Integration Testing. This was going to result in multiple Mock conversions over many months.


## The Solution

The solution we came up with is that we should be able to reconcile and validate the converted data iteratively. After we are able to convert a full set of data, we are not going to wait until we can validate all of it. We are going to find between 5 and 10 clients, and their associated data, that balance and look good. These clients and data will be simple at first (although we will add complex clients if they balance). We will then add to the clients we use for Integration Testing as we are able to validate and balance them along the way. If we can't find 5 clients that balance off the hop, we have a larger issue. And in the spirit of Agile, it is surely better to know that right away than to wait until we try to validate the entire set of data.

Ultimately we will end up validating all the converted data but in an iterative way.

I’ll report back as we see results. We also do have a plan B if needed. 🙂