Avoiding data conversion hell
Why don’t my report counts show the same numbers on your new system as they did on my old?
Ugh, do I ever hate to have this question come up in a project review meeting. It is surely my candidate for the question consultants least like to hear from a client.
And this question usually follows by a few weeks or months a close runner up question, Can you convert all my old data into the new system. Clients who ask this and consultants who answer yes are looking for trouble for themselves.
This has been on my mind as our team approaches the end of a current and some recent projects whose conclusion has been directly linked to resolving data ‘errors.’ Sometimes you find unexpected joy, such as last winter when a conversion from a really old Raisers Edge version to Giftworks provided historical donation totals that matched entirely. Not to be a pessimist, but yup, I was surprised.
In our hearts if not our experience, we all know that data conversion gaps occur way more often. Here are some thoughts for “both sides” to keep in mind in approaching that next software upgrade.
First, as a background philosophical thought, if an organization’s data looked just right in its old system, would you still need to upgrade or replace? Business practices change and the old ways no longer work. We generally welcome change and we can expect staff to look forward to operating in new forward-looking ways. Yet fitting the old data to the new may be, as a colleague said yesterday may be mixing apples and oranges. The old and the new may never align again.
If you do convert piles of old transaction data or program enrollment data, and the new system has involved strategic planning and incorporation of new business rules, accept that reports will look different. Embrace the new and appreciate it.
Second, conversely on the human or philosophical side of things, you have to make an estimate of how much staff not just “hate the old way” but really are ready to “embrace the new way.” I think of a data conversion some years ago where staff continued to run mailing lists from their old system months after they were doing data entry in their spiffy new system, with its way cooler list generator. Hostage syndrome. They had grown hostage to the features that bore down on them in the old system. So if things ARE going to look different, you need to prepare everyone for it 100% or life will be difficult.
Part of the way to do this is to make sure staff at all levels of the organization are in the room when you discuss data migration. It is not sufficient to just look at reports and lists and to talk with the managers who use them. Let everyone hear how admin and program staff actually enter data, the compromises they make, the guerrilla warfare they may be waging against existing data validations to put what they want where they want it. This may be sobering to senior management and support realism about the conversion.
Third, consider if you can just leave historical data where it is. If you have been paying for a proprietary on-line data system that you want to stop paying for, you may have no choice but to extract old data and put it somewhere it can still be used. If not, why not just continue keep the old system available to run historical reports and start fresh with the new? Two pending conversions to Salesforce to replace a collection of spreadsheets and old Access databases have been way pleasant from a decision to not incorporate years of messy and unaligned old data.
Deciding not to convert may be a major policy decision if senior management or the board is used to a certain kind of review. I don’t underestimate it. But it is worth pursuing the question at the outset, before anything starts.
Fourth, consider converting just top level data, such as all primary demographic and status information for current contacts and leaving aside in the old system. Start with just what you know to be clean and essential.
Fifth, a variation on the third, fill in complex and messy old data by hand over time. As new cases or activity occurs for a contact, fill in or correct the gaps in their data from the old system. Meanwhile, don’t try to run multi-year historical reports on the new system
Sixth, if the data is messy, clean it up parallel to and prior to switching to the new system. My experience: most of the time, it is going to be easier to correct messy data either in its present home or in some convenient way station. In another conversion to Salesforce, where the data now resides in Filemaker, we have agreed to freeze data entry for a bit, and pull the data into a combination of Excel or Access where it will be more susceptible to bulk manipulation and checking.
Seventh, tell the consultant everything you can. If you are relatively new and the system is not well documented, find the staff members who have been there long enough to explain quirky things lost in the mists. In a current conversion from a proprietary undocumented membership system to CiviCRM, we have had one gotcha after another. For example, the current system shows five membership levels. The new system will have the same five membership levels. The paper forms show five membership levels. We checked at the beginning. Great! Easy! Except a faction of the members, in their actual extracted data showed a sixth membership level. And CiviCRM importer chokes on those records so it has to be addressed. OK, we figured this out, but if you add up field by field the consultant detective work to resolve things like this, you suddenly have blown a reasonable data migration budget.
Eighth, agree to leave data conversion out of proposal bids and do it on a time and materials basis only. This will enable everyone as the inevitable problems arise to make pragmatic decisions. Yes, it is worth it to write a script to fix this problem. No, it is not worth it, leave it alone. Or, yes, we need this and organizational staff will fix it ourselves or hire temps to do it.
I regularly resolve not to submit proposals where I have to include data conversion in the cost estimate, and I regularly find myself unable to keep that commitment. Even if you allot a few hours as part of a preproject assessment, it is unlikely you will find all the problems. A current project that is not exactly a system migration, but instead will need to enable staff to use purchased business data in a Drupal-based research and organizing framework. The data is good, used nationally and expensive! It should be clean say the developers! Yet even here we find quirky things. What should we make of a crucial research field which sometimes is stored as an actual dollar amount and sometimes as a coded list of dollar values. Since this is an ongoing data feed and not a one-time data conversion, we need to address it so that data searches will work on both kinds of cost valuation figures. And it’s just not realistic to expect to find all this stuff before you get pretty deeply into it.
Nine, if you are the consultant, you often, not always, do a trial conversion, leave it on a development site for a while for testing, and then do the final conversion to go live. It is really important to carefully document the conversion steps. In one really messy conversion, we ran, tested and refined the 40 or so separate conversion scripts several times before deciding the soup was ready. Different people contributed to the scripts. Having the scripts catalogued and organized lessened the pain of having to rerun them. Plus even a year later, it was possible to go back and find the script that massaged this messed up status category or performed some other task when the client asked about “what happened to the people with interest code x”?
Ten, hmm. I am leaving a slot for you, veterans of data conversions. Your thoughts? I could certainly use them…
Meanwhile, this post derived from a suggestio n from fearless editor Laura Q to write about the question consultants least like to hear from clients. When I posed this to my co-workers here at Database Designs, I got so many suggestions, I first thought of doing a “top ten” and decided to just start with my “favorite.” This could turn into an occasional series, and to balance, my colleague Mimi suggested one on questions consultants MOST like to hear from clients.
Ugh, do I ever hate to have this question come up in a project review meeting. It is surely my candidate for the question consultants least like to hear from a client.
And this question usually follows by a few weeks or months a close runner up question, Can you convert all my old data into the new system. Clients who ask this and consultants who answer yes are looking for trouble for themselves.
This has been on my mind as our team approaches the end of a current and some recent projects whose conclusion has been directly linked to resolving data ‘errors.’ Sometimes you find unexpected joy, such as last winter when a conversion from a really old Raisers Edge version to Giftworks provided historical donation totals that matched entirely. Not to be a pessimist, but yup, I was surprised.
In our hearts if not our experience, we all know that data conversion gaps occur way more often. Here are some thoughts for “both sides” to keep in mind in approaching that next software upgrade.
First, as a background philosophical thought, if an organization’s data looked just right in its old system, would you still need to upgrade or replace? Business practices change and the old ways no longer work. We generally welcome change and we can expect staff to look forward to operating in new forward-looking ways. Yet fitting the old data to the new may be, as a colleague said yesterday may be mixing apples and oranges. The old and the new may never align again.
If you do convert piles of old transaction data or program enrollment data, and the new system has involved strategic planning and incorporation of new business rules, accept that reports will look different. Embrace the new and appreciate it.
Second, conversely on the human or philosophical side of things, you have to make an estimate of how much staff not just “hate the old way” but really are ready to “embrace the new way.” I think of a data conversion some years ago where staff continued to run mailing lists from their old system months after they were doing data entry in their spiffy new system, with its way cooler list generator. Hostage syndrome. They had grown hostage to the features that bore down on them in the old system. So if things ARE going to look different, you need to prepare everyone for it 100% or life will be difficult.
Part of the way to do this is to make sure staff at all levels of the organization are in the room when you discuss data migration. It is not sufficient to just look at reports and lists and to talk with the managers who use them. Let everyone hear how admin and program staff actually enter data, the compromises they make, the guerrilla warfare they may be waging against existing data validations to put what they want where they want it. This may be sobering to senior management and support realism about the conversion.
Third, consider if you can just leave historical data where it is. If you have been paying for a proprietary on-line data system that you want to stop paying for, you may have no choice but to extract old data and put it somewhere it can still be used. If not, why not just continue keep the old system available to run historical reports and start fresh with the new? Two pending conversions to Salesforce to replace a collection of spreadsheets and old Access databases have been way pleasant from a decision to not incorporate years of messy and unaligned old data.
Deciding not to convert may be a major policy decision if senior management or the board is used to a certain kind of review. I don’t underestimate it. But it is worth pursuing the question at the outset, before anything starts.
Fourth, consider converting just top level data, such as all primary demographic and status information for current contacts and leaving aside in the old system. Start with just what you know to be clean and essential.
Fifth, a variation on the third, fill in complex and messy old data by hand over time. As new cases or activity occurs for a contact, fill in or correct the gaps in their data from the old system. Meanwhile, don’t try to run multi-year historical reports on the new system
Sixth, if the data is messy, clean it up parallel to and prior to switching to the new system. My experience: most of the time, it is going to be easier to correct messy data either in its present home or in some convenient way station. In another conversion to Salesforce, where the data now resides in Filemaker, we have agreed to freeze data entry for a bit, and pull the data into a combination of Excel or Access where it will be more susceptible to bulk manipulation and checking.
Seventh, tell the consultant everything you can. If you are relatively new and the system is not well documented, find the staff members who have been there long enough to explain quirky things lost in the mists. In a current conversion from a proprietary undocumented membership system to CiviCRM, we have had one gotcha after another. For example, the current system shows five membership levels. The new system will have the same five membership levels. The paper forms show five membership levels. We checked at the beginning. Great! Easy! Except a faction of the members, in their actual extracted data showed a sixth membership level. And CiviCRM importer chokes on those records so it has to be addressed. OK, we figured this out, but if you add up field by field the consultant detective work to resolve things like this, you suddenly have blown a reasonable data migration budget.
Eighth, agree to leave data conversion out of proposal bids and do it on a time and materials basis only. This will enable everyone as the inevitable problems arise to make pragmatic decisions. Yes, it is worth it to write a script to fix this problem. No, it is not worth it, leave it alone. Or, yes, we need this and organizational staff will fix it ourselves or hire temps to do it.
I regularly resolve not to submit proposals where I have to include data conversion in the cost estimate, and I regularly find myself unable to keep that commitment. Even if you allot a few hours as part of a preproject assessment, it is unlikely you will find all the problems. A current project that is not exactly a system migration, but instead will need to enable staff to use purchased business data in a Drupal-based research and organizing framework. The data is good, used nationally and expensive! It should be clean say the developers! Yet even here we find quirky things. What should we make of a crucial research field which sometimes is stored as an actual dollar amount and sometimes as a coded list of dollar values. Since this is an ongoing data feed and not a one-time data conversion, we need to address it so that data searches will work on both kinds of cost valuation figures. And it’s just not realistic to expect to find all this stuff before you get pretty deeply into it.
Nine, if you are the consultant, you often, not always, do a trial conversion, leave it on a development site for a while for testing, and then do the final conversion to go live. It is really important to carefully document the conversion steps. In one really messy conversion, we ran, tested and refined the 40 or so separate conversion scripts several times before deciding the soup was ready. Different people contributed to the scripts. Having the scripts catalogued and organized lessened the pain of having to rerun them. Plus even a year later, it was possible to go back and find the script that massaged this messed up status category or performed some other task when the client asked about “what happened to the people with interest code x”?
Ten, hmm. I am leaving a slot for you, veterans of data conversions. Your thoughts? I could certainly use them…
Meanwhile, this post derived from a suggestio n from fearless editor Laura Q to write about the question consultants least like to hear from clients. When I posed this to my co-workers here at Database Designs, I got so many suggestions, I first thought of doing a “top ten” and decided to just start with my “favorite.” This could turn into an occasional series, and to balance, my colleague Mimi suggested one on questions consultants MOST like to hear from clients.


Comments
I precisely needed to thank
I precisely needed to thank you so much yet again. I do not know the things I could possibly have accomplished without those information shown by you concerning this subject matter. Entirely was a real hard circumstance in my view, nevertheless understanding the professional tactic you processed I'm happier for your guidance and thus wish you recognize what a powerful job you're providing instructing the others through your blog post.
Data Conversion
Tenth – consider an "iterative" approach to the solution. The iterative approach acknowledges that the traditional software development methodology fails because the detail requirements often cannot be known in advance. The ultimate solution is evolved through iterations of relatively minor changes and retesting. Nowadays, the iterative approach is referred to as the Agile development methodology. But even the iterative approach has its challenges. With an ETL tool or with a typical custom programming effort or any other programmer-driven solution, an iterative approach means that the system experts (if there are any) must communicate ever-changing requirements to the programmers who are customizing the ETL interfaces or writing the conversion source code – this is communications quagmire that is prone to error. But Universal Conversion Technologies (UCT) has solved that problem and has been using an iterative approach to solve data migration puzzles for 20 years – well before a name was assigned to our process. With our solution, the changed business rules are entered into our toolset by experts who are non-programmers and the tool generates the new code. If you are dealing with a complex conversion involving legacy systems, multiple systems, home-grown file structures, high data volumes, extremely aggressive timelines, critical accuracy requirements, lack of system knowledge, etc., contact UCT. We have data migration experts with 20+ years of experience apiece. We have a tool that creates conversion code directly from business rules that are entered by system experts (not programmers). We have a methodology that has been used successfully on 100s of complex conversions. Visit UCT at www.uctcorp.com for more information or call (214) 348-2000.