Accurate data is rightly considered one of a business’s most valuable assets. But a database is dynamic: it needs managing, checking and refining. Without such care it decays, becoming less useful, and can even degrade into a positive liability.
Maintaining data accuracy:
- Preserves the brand image
- Enables better business decisions
- Avoids the waste and irritation of duplicate mailings
- Saves time and money
- Ensures a single version of any information held
- Complies with the Data Protection Act.
Data cleansing also plays a large part in combating crime. David Green, business development director at GB Group, warns: There’s a huge growth in business fraud where fraudsters invent bogus company names, etc. For fraud prevention, if nothing else, data cleansing is very important.
But is it really necessary?
Data decays very quickly. According to Experian, about a third of business data erodes each year, so even a relatively small database of 5,000 records changes at the rate of more than four records per day.
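The arithmetic behind that figure is easy to check. The short Python sketch below assumes the decay is spread evenly over a 365-day year; the function name is purely illustrative.

```python
def records_changed_per_day(total_records, annual_decay_rate, days_per_year=365):
    """Average number of records that go out of date each day."""
    return total_records * annual_decay_rate / days_per_year


# A 5,000-record database losing about a third of its accuracy each year.
per_day = records_changed_per_day(5000, 1 / 3)
print(f"{per_day:.1f} records per day")  # prints "4.6 records per day"
```

On the same assumption, a database of 50,000 records would see around 46 changes a day.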
Ed Wrazen, vice president of marketing at Harte-Hanks Trillium Software, says: B2B databases are not just about being able to contact people accurately, but about business intelligence and how to work more effectively.
Clean data is essential, but before rushing to correct errors businesses should know:
- What information the business needs
- Whether data is likely to be bought in
- How much the database is likely to grow in one, two and three years
- How often data will be used.
Steve Tuck, chief strategy officer at Datanomic, explains: People think that data cleansing is difficult. It isn’t. What can be a problem is if they misunderstand what information their business needs or how they use it. If they do, then data cleansing won’t help them; they’ll only end up with a clean database of information they don’t want and then they blame the process.
Anyone contemplating data cleansing should first check how inaccurate their data is, using software that trawls through large volumes of data looking for errors and inconsistencies.
Wrazen of Trillium Software comments: We’ve examined business data where we’ve found as much as 60 per cent duplication. We’ve also reconciled systems where different product codes for the same product meant that it came both first and second in that business’s list of best sellers. We’ve even heard of people spending 20 per cent of their time manually rectifying anomalies in reports generated from different databases.
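The best-seller anomaly is simple to reproduce. The Python sketch below uses invented product codes and sales figures (none of them come from Trillium or its clients) to show how two codes for one product split its sales, and how a code-to-product mapping reunites them.

```python
from collections import Counter

# Hypothetical sales lines: (product code, units sold).
sales = [("WID-001", 120), ("W001", 110), ("GAD-002", 90)]

# Hypothetical reconciliation table: two codes refer to the same product.
code_to_product = {"WID-001": "widget", "W001": "widget", "GAD-002": "gadget"}


def rank_best_sellers(sales_lines, mapping=None):
    """Total units per key, highest first; map codes to products if a mapping is given."""
    totals = Counter()
    for code, units in sales_lines:
        key = mapping.get(code, code) if mapping else code
        totals[key] += units
    return totals.most_common()


raw_ranking = rank_best_sellers(sales)
clean_ranking = rank_best_sellers(sales, code_to_product)
print(raw_ranking)    # the same product appears first and second
print(clean_ranking)  # one entry per product
```

Before reconciliation the widget occupies both first and second place; afterwards it appears once, with the combined total.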
Poor data can cost a company millions in ineffective marketing, where campaigns are tightly focused on an audience that the company turns out not to have.
Horses for courses
Data cleansing by hand or using tools such as Excel is only viable with very small datasets. Even then it’s time-consuming and not necessarily accurate; inevitably people are inconsistent and make mistakes. Cleaning a relational database by hand is also very difficult, as the integrity of the links needs to be preserved. With automatic cleansing, once the business has tuned the software to do what it wants, the rules are applied consistently and continually.
The most appropriate data cleansing solution for individual businesses depends on the nature of the business, issues of confidentiality and security, in-house resources and expertise, etc. Data cleansing solutions typically include:
- Cleaning postcodes and addresses
- Cleaning and coding international addresses
- Ability to work with many languages
- De-duplication
- Data capture services
- Screening mail, fax and telephone preference services
- Verifying telephone numbers
- Screening national telephone number change updates
- Suppressing national change of addresses
- Suppressing goneaways
- Screening mortality and bereavement
- Consultancy and advice.
Some companies prefer to integrate data cleansing software into their CRM systems so that data is cleaned at the point of entry. In large businesses, data is usually dispersed across several sites and different databases interact. If there are duplications the whole picture is skewed. Gartner reports that 50 per cent of CRM systems have limited acceptance with users because of poor data quality; the focus has been on moving the data rather than checking it for accuracy.
SMEs, on the other hand, may find that online data cleansing services are more appropriate than importing their own software.
Terry Hiles, managing director at Capscan, comments: For SMEs such services can be a low-risk, low-cost way of cleaning data; there is rarely a set-up fee, you pay for what you use, the service is available 24/7 from wherever you’re located, it’s easy to use, offers data encryption for security and is fast.
Most outsourcing firms only charge for matched records, i.e. they run your files against their clean list and charge for the records they change. Data cleansing comes within the remit of the DMA, but there is little regulation. In fairness, it’s an extraordinarily difficult area to regulate, as companies are selling negatives: they’re matching something with something else and removing the mistakes.
Guarding against problems
One concern about using a data cleansing service is the provenance and accuracy of clean lists. Tony Sweetman, data quality director at EHS Brann Discovery, says: I’d like data cleansing services to be far more open about where their data comes from and how frequently it’s updated. I’d also like to see more users checking for accuracy rather than assuming that nominally cleaned data is right.
Under the Data Protection Act, businesses are responsible for the quality of the data they hold, so they need to be confident about the service they’re getting. As Mary Sands, marketing manager at DQ Global, says: A major problem with outsourcing data cleansing is that you lose control of your database.
Businesses considering data cleansing should test the software or service by running sample data through it and checking the results. The software should be able to handle names and their variants, acronyms, contact details, product codes, etc. and also trace where characters have been transposed or words have been spelt phonetically.
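To give a feel for what such checks involve, here is a minimal Python sketch of two of them: spotting a pair of swapped adjacent characters, and comparing names by sound using the classic Soundex code. It is an illustration only, not how any particular product works, and commercial tools use far more sophisticated matching.

```python
def is_adjacent_transposition(a, b):
    """True if b is a with exactly one pair of neighbouring characters swapped."""
    if len(a) != len(b) or a == b:
        return False
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1
        and a[diffs[0]] == b[diffs[1]]
        and a[diffs[1]] == b[diffs[0]]
    )


# Soundex digit for each consonant group; vowels, H, W and Y have no digit.
_SOUNDEX_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _SOUNDEX_CODES[_ch] = _digit


def soundex(name):
    """Four-character Soundex code: first letter plus up to three digits."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0]
    previous = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _SOUNDEX_CODES.get(ch, "")
        if code and code != previous:
            encoded += code
        previous = code  # a vowel resets this, so a repeated code counts again
    return (encoded + "000")[:4]


print(is_adjacent_transposition("Jhonson", "Johnson"))
print(soundex("Smith"), soundex("Smyth"))
```

In this sketch "Jhonson" is recognised as a transposed "Johnson", and "Smith" and "Smyth" share a Soundex code, so both pairs would be flagged for review as possible matches.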
Chris Cuffe, managing director at HelpIT, says: A good tip for evaluating software is to run a copy data file through one product and then through another to see how many more problems are identified and vice versa.
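One way to score such a comparison is sketched below in Python. It assumes each product can export the IDs of the records it flagged; the IDs and the two result sets here are invented for illustration.

```python
def compare_flagged(flagged_by_a, flagged_by_b):
    """Split flagged record IDs into those found by both tools and by only one."""
    a, b = set(flagged_by_a), set(flagged_by_b)
    return {"both": a & b, "only_a": a - b, "only_b": b - a}


# Hypothetical results from running the same copy of a file through two products.
tool_a = {"R001", "R002", "R005", "R009"}
tool_b = {"R002", "R005", "R007"}

result = compare_flagged(tool_a, tool_b)
for label in ("both", "only_a", "only_b"):
    print(label, sorted(result[label]))
```

Records flagged by only one product are the ones worth checking by hand: they show either problems the other tool missed or false alarms.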
Data cleansing products are designed to be easy to use and generally include training and support within the licence fee.
Data cleansing amends records to make them uniform, but there are alternatives. Mark Masterson, Sonetto business manager at IVIS Group, explains: Sonetto Product Information Management software identifies duplications within product databases but rather than amending the information, tags and maps the different terms. Every time new data is added it is automatically tagged. Thus green, grn and emerald are all understood as green while if necessary lime can be added later. Sonetto is designed for use in large companies and Tesco.com has adopted it to enhance its supply-chain infrastructure.
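The tag-and-map idea can be illustrated in a few lines of Python. This is a hypothetical sketch of the general approach, not Sonetto's implementation: the class, its methods and the product records are all invented. The stored terms are left untouched and a separate map says what each one means.

```python
class TermMap:
    """Maps variant terms to a canonical term without altering stored records."""

    def __init__(self):
        self._canonical = {}

    def tag(self, term, canonical):
        self._canonical[term.strip().lower()] = canonical

    def resolve(self, term):
        # Unknown terms are returned unchanged.
        return self._canonical.get(term.strip().lower(), term)


colours = TermMap()
for variant in ("green", "grn", "emerald"):
    colours.tag(variant, "green")

# Hypothetical product records; the colour field is never rewritten.
products = [
    {"sku": "A1", "colour": "grn"},
    {"sku": "B2", "colour": "emerald"},
    {"sku": "C3", "colour": "lime"},
]


def skus_matching(colour):
    return [p["sku"] for p in products if colours.resolve(p["colour"]) == colour]


before = skus_matching("green")
colours.tag("lime", "green")  # added later, if the business decides lime counts
after = skus_matching("green")
print(before, after)
```

Because the original values survive, a mapping decision can be reversed or refined later without having to restore the data.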
Cost versus saving
Businesses typically see data cleansing as a cost, but this is inaccurate. Direct savings can be made by using the correct contact details, not duplicating products and addresses, and not mailing people who don’t want to be mailed. Indirect benefits accrue from ensuring that you continue to appear professional and your brand remains respected.
Hiles of Capscan says: The cost of software or services depends on the options required; online ‘on demand’ services start from as little as £25. Contracting the work out to a bureau costs from around £150 and even in-house software under licence or ‘pay-as-you-go’ arrangements can work out surprisingly cheaply. At the other end of the spectrum, data cleansing software embedded in a customer relationship management system which cleans and enhances global data might cost £20,000.
Knowledge is power. It’s an old adage but still true. But if your knowledge base is full of errors, just how powerful are you? Few businesses realise the true cost of inaccurate data, and many see data cleansing as a needless expense. But it’s easy, inexpensive and probably quicker than deciding whose budget should finance it. Why would any professional organisation not do it?
Supplemental: Data cleansing solutions - what’s available?
There are three generic types of data cleansing software available, depending on objectives:
1. Single-file de-duplication. Typically used by companies to remove duplication from a contact management system; such software ensures that each contact only appears once in any file.
2. Two-file de-duplication. Mainly used by companies to introduce new data or merge two databases.
3. Multiple-file merger. Allows merger of many different data files. Most typically used by service providers, but large companies regularly merge data from many different sources.
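The three types differ mainly in how many files are compared. The Python sketch below shows the idea using exact matching on a normalised name and postcode; the field names and sample records are assumptions made for illustration, and real products use much fuzzier matching than this.

```python
def match_key(record):
    """Normalise name and postcode so trivial differences don't hide duplicates."""
    name = " ".join(record["name"].lower().split())
    postcode = record["postcode"].replace(" ", "").upper()
    return (name, postcode)


def dedupe_single(records):
    """Single-file: keep the first record seen for each match key."""
    seen = {}
    for record in records:
        seen.setdefault(match_key(record), record)
    return list(seen.values())


def dedupe_against(master, incoming):
    """Two-file: return only the incoming records not already in the master file."""
    known = {match_key(record) for record in master}
    return [r for r in dedupe_single(incoming) if match_key(r) not in known]


def merge_files(*files):
    """Multiple-file: pool any number of files, then de-duplicate the result."""
    pooled = [record for file in files for record in file]
    return dedupe_single(pooled)


crm = [
    {"name": "Acme Ltd", "postcode": "AB1 2CD"},
    {"name": "acme  ltd", "postcode": "ab12cd"},   # same company, typed differently
    {"name": "Bolt & Co", "postcode": "EF3 4GH"},
]
bought_in = [
    {"name": "ACME LTD", "postcode": "AB1 2CD"},
    {"name": "Crane plc", "postcode": "IJ5 6KL"},
]

print(len(dedupe_single(crm)))
print([r["name"] for r in dedupe_against(crm, bought_in)])
print(len(merge_files(crm, bought_in)))
```

Here the single-file pass reduces the CRM file to two contacts, the two-file pass finds that only Crane plc is new, and the merged file holds three distinct companies.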