What are the three most important qualities of a great data analyst?
1. They know their data inside and out.
2. They draw useful conclusions from their data.
3. They deliver accurate information.
Robin and I have written many articles addressing quality number one (knowing the data), which provides a solid base for a successful career as a CPG data analyst. I recently tackled the tougher-to-teach skill of drawing useful conclusions. Now, in this post, I’ll address the third item on the list: delivering accurate information.
Great analysts are relentless about quality control. But that doesn’t mean they don’t make mistakes. They do—guaranteed. I’ve been analyzing data for over 30 years (starting as a research assistant in college), and I still make data errors regularly. The question is, who finds the mistakes? You? Your boss? Your boss’s boss? Your (gulp!) client? It’s always better to catch your own errors. Here are some of the methods I use:
1. Be skeptical of your own work
Don’t assume your numbers are right. Assume they could be wrong and regularly double check things as you work. This mindset will help you avoid most errors.
2. Be extra suspicious of surprise findings
Sales are up 50 percent this month when they’ve never been up more than 5 percent any month in the past year? Time to double check your calculations and confirm there’s not a data error.
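If you happen to work with your data in code rather than a spreadsheet, you can even automate this kind of sanity check. Here is a minimal sketch in Python/pandas; the column names and figures are made up purely for illustration and aren’t from any real report.

```python
import pandas as pd

# Hypothetical monthly sales for one brand; numbers are made up for illustration.
sales = pd.DataFrame({
    "month": pd.period_range("2023-01", periods=13, freq="M"),
    "dollar_sales": [100, 102, 101, 104, 103, 105, 104, 106, 105, 107, 106, 108, 162],
})

# Month-over-month percent change.
sales["pct_change"] = sales["dollar_sales"].pct_change() * 100

# Compare the newest month to the swings seen over the prior year.
history = sales["pct_change"].iloc[1:-1]     # prior months (first row has no change)
latest = sales["pct_change"].iloc[-1]
threshold = history.abs().max() * 2          # arbitrary cushion; tune to your data

if abs(latest) > threshold:
    print(f"Surprise finding: latest change of {latest:.1f}% vs. a prior max of "
          f"{history.abs().max():.1f}%. Double check the data before reporting it.")
```

The exact threshold doesn’t matter much; the point is that the flag fires automatically instead of relying on you to notice the jump.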
3. Don’t type in numbers
Try to avoid manually inputting numbers at any point in the process. For example, when I’m making a PowerPoint chart, I copy and paste from Excel even if it’s just a couple of numbers. Linking can also help reduce data transfer errors, though (in my personal experience) linking can create its own data problems, so use it carefully and sparingly.
4. Don’t move data around manually
Just as manually inputting data can introduce errors, so can manually moving data around. For example, if you pull data from some larger system (Nielsen, IRI, retailer POS, whatever) and then realize the columns aren’t in the right order, don’t reorder them manually. Instead, go back, respecify the data request, and let the ideal column order roll in automatically. By the way, I broke this rule the other day and immediately made a mistake—I moved the data without moving the headings, which mixed up retailer names. Luckily, I realized my error because I followed the next rule:
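Re-running the request is still the safest fix, but for anyone who handles extracts in code, the same principle applies: reorder columns by name rather than by hand, so each heading stays attached to its own data. A rough Python/pandas sketch, with a made-up file name and illustrative column names:

```python
import pandas as pd

# Hypothetical extract whose columns arrived in an awkward order.
raw = pd.read_csv("retailer_pos_export.csv")   # e.g. Units, Dollars, Retailer, Week

# Reorder by column NAME, never by cutting and pasting cells, so retailer
# names can't get separated from their own figures.
desired_order = ["Retailer", "Week", "Dollars", "Units"]
report = raw[desired_order]   # raises a KeyError if a name is missing or misspelled
```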
5. Make a habit of cross checking numbers
It may take you a few minutes to figure out the best way to do this. You might have to add an extra step to the process or pull some extra data. But take the time to do it, consistently. I promise you that extra effort will pay off, eventually, in a big way. Here are a few examples of cross checks I use (a scripted sketch of the first two follows this list):
- Compare grand totals from one report to another. This cross check works when you’re looking at the same data set through various lenses. For example, one category report shows brand totals and another shows segment totals. But the grand total should be the same on both reports – compare them and make sure that total really does match. Even if the report or analysis doesn’t include a grand total, create one so you can do the check. You’d be surprised how often this simple exercise uncovers an error!
- Make sure custom subtotals add up. Example: You’ve created two custom time periods that divide up a year of data based on when a new product was introduced. Now, as a cross check, add your two pieces together and compare to the 52-week total that’s undoubtedly available from the original data source. Ensure the numbers match.
- Compare current trends to similar numbers from an earlier period or report. Make sure the differences and/or changes are plausible.
- Compare a few data points to the original data. This is key after doing a lot of data manipulation. To make sure you can do this, clearly label and save a copy of your fresh, pristine data before beginning to make changes. Because I am truly distrustful of myself, this is often not even enough for me, and I will go back and pull fresh numbers from the master data source to make sure nothing’s been lost along the way.
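For anyone who scripts their checks instead of doing them in a spreadsheet, here is a minimal sketch of the first two cross checks above. Every file name, column name, and dollar figure is a placeholder invented for illustration.

```python
import pandas as pd

# Hypothetical extracts of the same category viewed two different ways.
brand_view = pd.read_csv("category_by_brand.csv")      # one row per brand
segment_view = pd.read_csv("category_by_segment.csv")  # one row per segment

# Cross check 1: the grand total should match no matter which lens you use.
brand_total = brand_view["dollar_sales"].sum()
segment_total = segment_view["dollar_sales"].sum()
assert abs(brand_total - segment_total) < 0.01, (
    f"Grand totals disagree: {brand_total:,.2f} vs {segment_total:,.2f}"
)

# Cross check 2: custom time periods should add back up to the full year.
pre_launch = 1_250_000.00    # subtotal for the pre-launch period (illustrative)
post_launch = 2_110_000.00   # subtotal for the post-launch period (illustrative)
full_year = 3_360_000.00     # 52-week total pulled from the original source

assert abs((pre_launch + post_launch) - full_year) < 0.01, (
    "Custom periods do not add up to the 52-week total; check the date ranges."
)
print("Cross checks passed.")
```

In a spreadsheet, the equivalent is simply an extra row with a SUM and a difference formula. Either way, the idea is to build the comparison into the file so it runs every time, not just when you remember to check.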
6. Proofread one more time, preferably the next day
If you can possibly avoid it, never send off a final analysis without taking a break from it, whether it’s a day or a few hours. Your subconscious will work on it in the meantime, and sometimes you’ll see errors you missed or find new implications.
It’s hard to build these methods into demanding timelines—it requires a lot of discipline—but it’s worth it. I’ve improved on this a lot since my early days—when I would write conclusions the morning of client presentations!
What are your tips and tricks for improving accuracy? We’d love to hear them, as would your fellow readers. Please share in the comment box below.
If you enjoyed this article, subscribe to future posts via email. We won’t share your email address with anyone.
Gagan says
Sally,
This is such a great pool of knowledge. Each article is so worth reading. Please please please don’t stop blogging, ever!
Thanks
Sally Martin says
Thanks so much for the positive feedback – glad to be of service!
Linda Boudreau says
Great thoughts, thank you for sharing. Because of the rising importance of data-driven decision making, having a strong data governance team is an important part of the equation, and will be one of the key factors in changing the future of business. There is so much great work being done with data quality tools in various industries such as financial services and health care. It will be interesting to see the impact of these changes down the road.
Linda Boudreau
http://DataLadder.com
Allison Layne Bramhall says
I just made some errors pulling a report (part of a 5-day-long, very manual process), and this definitely helped validate that 1. it happens and 2. there are ways to check. Still in my first few years of full-time data analysis and learn more every day. So glad I found your blog! Thank you!
Sally Martin says
Allison, I’m glad the blog has been helpful. Ugh, 5-day-long manual processes! I have been there and I feel your pain. Maybe there are things you can do to take some of the “manual” out of your process? That’s a great way to cut down on errors. All the best, Sally
Colleen says
I’ve picked up a great deal of very useful knowledge from your blog over the past couple of years. This piece on reducing errors is priceless. It helped to confirm for me that it is so worth taking the time to do the cross checks, refer back to the original data and the hardest part: Waiting a day after you’ve finished to review your work before submitting. Thank you!
Sally Martin says
Colleen, Thanks for letting me know you found this piece helpful. A lot of it comes down to believing that those extra steps will pay off – maybe not this time, or even next time, but one of these days that patience and care will save you in a big way!
Dave says
These are so important to see now and then. I just sent this to a friend who recently sent out reporting that had data sorted incorrectly. His report was sent to one person, who then redistributed it to 374 people, and he started getting all kinds of questions about the validity of the data.
Another lesson: Before redistributing somebody else’s data reporting, look at it to make sure it makes sense.
Thank you for all your posts!