How to Keep Data Simple in 2021

It is one thing to have a state-of-the-art database that combines all relevant information in one place. It is another to improve business practices using available data. 

Too often, I find an ample amount of data tucked away somewhere, acting more as a liability to the organization than as an asset. Unless you are dealing with data as a hobby, all data activities must increase revenue, decrease cost or save time. 

The days of bragging about the size of a database are long over, unless someone is still clinging to the notion that Big Data combined with some AI modules is the answer to everything. Users do not care about the size of the database. A hungry person wants a bowl of cooked rice in front of him, now. Emphasizing the yearly yield of rice production in California will not satisfy his hunger.

So, how do we satisfy the user’s hunger for information? First off, do we know what the questions are? Secondly, are the data in the form of answers to those questions? In the age of information overload, we must remember that too much information is not a good thing. Therefore, data players should always be mindful that analytics is about cutting down the noise and providing insights, not raw data.

A database that comes with a 900-page data dictionary could be exciting for data nerds, but not for end-users who have to make a difference in the business immediately. To them, complex and cryptic information is no different from a thick English dictionary handed to a visiting foreigner. Maybe all she needs right now is directions to her hotel, not a full dissertation on Shakespearean plays.

If it’s about customer value, express it in common forms, such as total dollar amount spent, number of transactions, accumulated loyalty points, number of returns or cancellations, years as a customer or weeks since last purchase. It would be even more effective if there were one combined score for all of these, as modeling is the best way to compress complex information. Also, all data variables must be cleaned, standardized, organized and labeled.
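As a rough illustration of what "common forms" could look like in practice, here is a minimal Python sketch that rolls raw transactions up into a handful of plainly labeled customer-level variables. The record layout, the field names and the `summarize_customers` helper are all hypothetical, made up for this example; a real database would have its own structure and cleaning rules.

```python
from datetime import date

# Hypothetical transaction records (an assumption for this sketch):
# (customer_id, purchase_date, amount, was_returned)
transactions = [
    ("C001", date(2021, 1, 5), 120.00, False),
    ("C001", date(2021, 3, 20), 80.00, True),
    ("C002", date(2020, 11, 2), 45.50, False),
]


def summarize_customers(rows, as_of):
    """Roll raw transactions up to one plain-language summary per customer."""
    summary = {}
    for customer_id, purchase_date, amount, was_returned in rows:
        s = summary.setdefault(customer_id, {
            "total_spent": 0.0,        # gross dollars, returns not netted out
            "num_transactions": 0,
            "num_returns": 0,
            "last_purchase": purchase_date,
        })
        s["total_spent"] += amount
        s["num_transactions"] += 1
        s["num_returns"] += int(was_returned)
        s["last_purchase"] = max(s["last_purchase"], purchase_date)

    # Replace the raw date with a figure users can read at a glance.
    for s in summary.values():
        days_since = (as_of - s.pop("last_purchase")).days
        s["weeks_since_last_purchase"] = days_since // 7
    return summary


customer_summary = summarize_customers(transactions, as_of=date(2021, 6, 1))
```

Each customer ends up with a few clearly named numbers instead of a pile of transaction rows, which is the kind of answer-ready form users can act on.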

As another example, estimates of the target’s wealth can be expressed in many ways: household income, home value, total asset amount, expendable and disposable income, wealth score and socio-economic status indicator. Remember, only data geeks would appreciate such variety and minutiae. Be mindful of the users’ end goals, and keep everything simple. Often, a chef’s choice is all they want.

Answers to questions must be expressed in an intuitive fashion, like “There’s a 70% chance of showers tomorrow morning.” Are the answers in Yes/No format, or are they numeric values ranging from 1-5 or 1-100? Is higher better, or are they ranked starting with number one? Are the data real or inferred? Or are they mixed?

When modeling is involved, make the model score clear and easy for everyone to understand. Model scores are often long numbers with many decimal places. They must be grouped into 10 or 20 equal-size segments, with a clear indication of which score groups are “better.”

For example, score group 1 could be the best target on a 1-10 scale, but there are cases where 9 is the best and zero is the worst. If it is even remotely confusing to the users, that means the analyst in charge didn’t finish the job properly.
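The grouping described above can be sketched in a few lines of Python. This is only one way to do it, under the assumption that a higher raw score means a better target and that group 1 should be the best; the `assign_score_groups` name and the sample scores are invented for illustration.

```python
def assign_score_groups(scores, num_groups=10):
    """Rank raw model scores and split them into equal-size groups.

    Assumption for this sketch: a higher raw score is better, and
    group 1 is the best group. Flip `reverse` if the model works the
    other way around.
    """
    # Indices of the scores, ordered from highest score to lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    groups = [0] * len(scores)
    for rank, idx in enumerate(order):
        # Integer arithmetic keeps the groups as equal in size as possible.
        groups[idx] = rank * num_groups // len(scores) + 1
    return groups


# Made-up raw scores with the usual long tail of decimal places.
raw_scores = [0.91237, 0.10455, 0.55102, 0.78934, 0.33018,
              0.62871, 0.04719, 0.86540, 0.47263, 0.21986]
score_groups = assign_score_groups(raw_scores)
```

Whichever direction is chosen, it should be stated once and labeled everywhere the score group appears, so no user has to guess whether 1 or 10 is the group to target.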

Some folks are obsessed with the accuracy of information, but future customer value, for instance, doesn’t need to be accurate to the last decimal place. For most marketing applications, “$2,000-ish” is good enough.
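If predicted values come out of a model with false precision, rounding them for display is simple. Here is a minimal sketch, assuming one significant digit is enough for the audience; the `approximate_dollars` helper and its "-ish" label are hypothetical and exist only for this example.

```python
import math


def approximate_dollars(value, significant_digits=1):
    """Format a dollar estimate at a coarse, human-friendly precision."""
    if value <= 0:
        return "$0"
    # Position of the leading digit, e.g. 3 for values in the thousands.
    magnitude = math.floor(math.log10(value))
    # A negative ndigits tells round() to round to tens, hundreds, etc.
    rounded = round(value, significant_digits - 1 - magnitude)
    return f"${rounded:,.0f}-ish"


estimate = approximate_dollars(2137.48)
```

Here `estimate` is the string `$2,000-ish`, which tells a marketer everything the extra digits would have.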