4 ways to correct bad data and improve your AI


As marketing analytics rapidly evolves into an AI-driven field, one major challenge threatens to derail progress: bad data. While AI excels at turning vast amounts of information into useful insights, its effectiveness depends on well-planned and well-managed datasets.

Bad data leads to bad predictions, biases, flawed insights and unintended outcomes. To address these risks, companies are investing heavily in data cleansing, validation, and governance—an essential, time-consuming, and complex process.

For analysts, prioritizing better measurement and understanding of the business context behind their data is key. That’s why analysts must lead efforts to optimize data for AI. Here are four strategies for extracting insights from flawed datasets while improving data hygiene and planning.

1. Identify supporting data

It is often possible to use other data sources to validate the metrics you are trying to measure. For example, I worked with a retailer who claimed their inventory data was unreliable — which was a big problem. However, point-of-sale (POS) data identified fast-moving SKUs that suddenly showed zero sales.

Although the inventory system showed low inventory levels (but not depletion), the sales patterns clearly indicated an inventory problem affecting revenue. Using this insight, we adjusted replenishment thresholds and triggers to keep high-demand items in stock, minimizing revenue loss.
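As a rough illustration of this kind of cross-check, the sketch below flags SKUs that were selling briskly, then dropped to zero sales while the inventory system still reports stock on hand. The data layout, the function name `flag_suspect_stockouts`, and the thresholds are all hypothetical assumptions made for the example, not the retailer's actual method.

```python
from statistics import mean


def flag_suspect_stockouts(daily_sales, reported_stock,
                           baseline_days=7, zero_days=3, min_avg=5.0):
    """Return SKUs that sold well, then went silent despite reported stock.

    daily_sales: dict of SKU -> list of daily unit sales, oldest first (assumed layout).
    reported_stock: dict of SKU -> units the inventory system claims are on hand.
    The window sizes and min_avg threshold are illustrative assumptions.
    """
    suspects = []
    for sku, sales in daily_sales.items():
        if len(sales) < baseline_days + zero_days:
            continue  # not enough history to judge this SKU
        baseline = sales[-(baseline_days + zero_days):-zero_days]
        recent = sales[-zero_days:]
        fast_mover = mean(baseline) >= min_avg
        went_silent = all(units == 0 for units in recent)
        # POS says "nothing sold", inventory says "still in stock": the two disagree.
        if fast_mover and went_silent and reported_stock.get(sku, 0) > 0:
            suspects.append(sku)
    return sorted(suspects)


# Made-up sample data for illustration only.
daily_sales = {
    "SKU-A": [9, 11, 8, 10, 12, 9, 10, 0, 0, 0],  # fast mover that suddenly stops
    "SKU-B": [1, 0, 2, 0, 1, 0, 1, 0, 0, 0],      # slow mover; zeros are plausible
    "SKU-C": [7, 8, 6, 9, 7, 8, 7, 6, 8, 7],      # healthy, steady sales
}
reported_stock = {"SKU-A": 4, "SKU-B": 12, "SKU-C": 30}

print(flag_suspect_stockouts(daily_sales, reported_stock))  # ['SKU-A']
```

The point of the design is that neither source is trusted on its own: the POS pattern only becomes a signal when it contradicts what the inventory system reports.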

Dig deeper: How to make sure your data is ready for AI

2. Research the ‘bad reputation’

Sometimes a dataset gets a bad rap because of “noisy extremes” that get disproportionate attention. Although noticeable, these errors often represent a small proportion of otherwise correct data.

For example, I worked on household policy data for a personal lines insurer. There were cases where policies were incorrectly grouped under the same household or incorrectly separated. We found that a handful of issues, such as incorrect or duplicate addresses and policies sold by different agents, caused most of the errors. We cleaned up the dataset by writing patch code, turning it into a reliable source.
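The patch code itself is not described here, so the following is only a minimal sketch of one plausible fix: normalizing addresses so that policies written differently, or sold by different agents, land in the same household. The field names, the abbreviation table, and the matching rule (last name plus normalized address) are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny abbreviation table; a real one would be far larger.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd", "apartment": "apt"}


def normalize_address(raw):
    """Lowercase, drop punctuation, and shorten common street words."""
    cleaned = re.sub(r"[^\w\s]", " ", raw.lower())
    tokens = [ABBREVIATIONS.get(token, token) for token in cleaned.split()]
    return " ".join(tokens)


def group_households(policies):
    """Group policy IDs by (last name, normalized address), ignoring the agent."""
    households = defaultdict(list)
    for policy in policies:
        key = (policy["last_name"].strip().lower(),
               normalize_address(policy["address"]))
        households[key].append(policy["policy_id"])
    return dict(households)


# Made-up records: P1 and P2 are the same household entered by different agents.
policies = [
    {"policy_id": "P1", "last_name": "Nguyen", "address": "12 Oak Street, Apt. 4", "agent": "A17"},
    {"policy_id": "P2", "last_name": "nguyen ", "address": "12 oak st apt 4", "agent": "B03"},
    {"policy_id": "P3", "last_name": "Ortiz", "address": "98 Elm Avenue", "agent": "A17"},
]

for key, policy_ids in group_households(policies).items():
    print(key, policy_ids)
```

Counting how many records change household after a pass like this is also a quick way to check whether the dataset's bad reputation comes from a small share of rows.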

3. Distinguish null from zero

Missing data can hinder decision making. So the first step is to determine whether the values are truly missing or simply recorded as zero. Understanding the logic behind how the data is generated is critical because “no activity” (zero) is not the same as “missing information” (null). If the data is indeed missing, there are two questions to ask.

  • Are there proxy values or variables that can estimate missing values? This may involve experimenting with combined variables.
  • Can the business question still be solved using the available data?

In most cases, missing data is more of a hurdle than an insurmountable obstacle.
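To make the distinction concrete, here is a small sketch in which missing values are represented as `None` and genuine "no activity" as `0`. The function name `summarize` and the sample figures are hypothetical; the point is only that treating missing values as zeros changes the answer.

```python
def summarize(values):
    """Report missing values and true zeros separately, and show how the
    average shifts if missing values are wrongly treated as zero."""
    recorded = [v for v in values if v is not None]
    return {
        "missing": len(values) - len(recorded),
        "true_zeros": sum(1 for v in recorded if v == 0),
        "mean_of_recorded": sum(recorded) / len(recorded) if recorded else None,
        "mean_if_missing_were_zero": sum(recorded) / len(values) if values else None,
    }


# Made-up weekly visit counts: None means "not captured", 0 means "no visits".
weekly_visits = [4, 0, None, 6, None, 2]

print(summarize(weekly_visits))
# {'missing': 2, 'true_zeros': 1, 'mean_of_recorded': 3.0, 'mean_if_missing_were_zero': 2.0}
```

In this toy case, silently converting the two missing weeks to zero drops the average from 3.0 to 2.0, which is the kind of distortion that checking the data-generation logic is meant to prevent.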

Dig deeper: The Data Analytics Hierarchy: Where Generative Artificial Intelligence Fits In

4. Use random error to your advantage

Sometimes it takes too much time to fix bad data, or it can’t be fixed at all. However, if the errors are random rather than systematic, they tend to cancel each other out across a large number of records. This makes it possible to measure meaningful differences between groups or periods.

For example, my team worked with web traffic data from two recently merged brands. Each brand had its own analytics platform, which provided somewhat different measurements and faced visitor identification issues.

Since there was no reason to believe that one brand’s platform was systematically more error-prone than the other, we assumed that the errors were random. Segmentation factors were similar for both brands, which allowed us to effectively analyze differences at the segment level. This combined segment-driven strategy saved the company millions.
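The reasoning rests on an assumption worth testing: random (zero-mean) error averages out across many records, while a systematic bias does not. The simulation below is a hedged sketch with made-up numbers, not the analysis from this engagement; the function `noisy_mean` and every parameter value are hypothetical.

```python
import random


def noisy_mean(true_value, n, noise_sd, bias=0.0, rng=None):
    """Average n measurements of true_value, each distorted by zero-mean
    Gaussian noise plus an optional constant (systematic) bias."""
    if rng is None:
        rng = random.Random()
    total = sum(true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n))
    return total / n


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 20_000               # measurements per segment (illustrative)

# True gap between the two segments is 110 - 100 = 10.
random_only = noisy_mean(110, n, 20, rng=rng) - noisy_mean(100, n, 20, rng=rng)

# Same gap, but one platform over-counts by 5 on every measurement.
with_bias = noisy_mean(110, n, 20, bias=5, rng=rng) - noisy_mean(100, n, 20, rng=rng)

print(f"random error only: estimated gap = {random_only:.2f}")
print(f"with systematic bias: estimated gap = {with_bias:.2f}")
```

With purely random error, the estimated gap should land close to the true value of 10; with the constant bias, it should drift toward 15. That contrast is why the "errors are random" assumption deserves an explicit sanity check before relying on group comparisons.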

Making the most of imperfect data in an AI-driven world

These strategies are not exhaustive as each data challenge is unique. Too often, however, companies abandon flawed data sets too soon, focusing solely on the lengthy data remediation process. These interim strategies show how valuable insights can still be extracted from imperfect data sets.

At the same time, companies should not feel limited by their current data. In many cases, generating new, more relevant data can happen quickly, especially in digital marketing. By leveraging supporting data, addressing reputation issues, distinguishing null from zero, and strategically using random errors, analysts can unlock the value of flawed datasets and help build a strong foundation for AI-driven success.

Dig deeper: The AI-Powered Path to Smarter Marketing

Contributing authors are invited to create content for MarTech and are chosen for their expertise and contributions to the martech community. Our contributors work under the oversight of the editorial staff, and contributions are checked for quality and relevance to our readers. The opinions expressed are their own.


