Data Mining for Dummies: Interview with Meta Brown @metabrown312 #DataTalk


At Experian, we believe that big data is good: good for our economy, good for consumers and good for society. In today’s #DataTalk, we had a chance to talk with Meta Brown about her work in data science and her book, “Data Mining for Dummies.”

About Meta Brown

Meta Brown helps organizations use practical data analysis to solve everyday business problems. Meta is a hands-on data miner who has tackled projects with up to $900 million at stake.

She holds a Master of Science degree in Nuclear Engineering from the Massachusetts Institute of Technology and a Bachelor of Science degree in Mathematics from Rutgers University.

You can learn about her work by going to MetaBrown.com and you can follow her on Twitter @MetaBrown312.

Check out the full interview:

Can you share a little bit about yourself and the work you do?
Sure. I’m a statistician; I studied classical statistics and engineering. In the nineties, my employer wanted to get into the data mining business, so I became a data miner and started demonstrating data mining methods, teaching classes and developing training materials. For the same reason, I also began to use and teach text analytics.

Over the years, I became so well known for my writing, teaching and speaking that I began to coach other data analysts in technical communication. Now, most of my clients seek me out for help improving communication between their technical staff and non-technical managers or customers.

Have you always loved working with data?
Not particularly, no! I began working with data in my teens, but even now, I don’t love every minute that I spend on data analysis. A lot of the time must be spent on data preparation, quality checks and other tasks that are necessary, but not a lot of fun. What I love is the end result: unearthing information that provides meaningful guidance for important decisions.

What led you down the path to begin helping businesses solve problems with data?
I accepted an engineering position at a corporation with 6,000 employees, and then found out that I was the only one of them with any significant training in statistics.

How do you define data mining? Is there a simple way to think about it?
Yes, very simple. Data mining is statistics for cheaters! The object of data mining is to empower business people to discover useful information from data.

Business people can’t drop everything to earn a degree in statistics, so in data mining, you use a variety of data analysis methods with little or no regard to theory. You make discoveries quickly, and use field testing in place of theoretical support.

How much data is needed for data mining?
Quantity doesn’t have much to do with it.

Data mining came to be associated with large volumes of data because software developers wanted to spare users the effort of sampling. They built tools that handled the quantities of data that seemed large at the time, and did so quickly.

What’s more important is the relevance of the data to the problem you want to solve, the quality of the data, and having the level of detail that you need. Lack of quality is usually a bigger problem than lack of quantity.

When you start working with a company – what teams or professionals within the organization do you tend to work with?
Data mining usually begins with the marketing department. Most marketing executives are aware of analytics.

But data miners come from many backgrounds, from engineers and others in manufacturing roles to law enforcement experts and academic researchers.

Are there any common challenges you find that companies have with data and data mining?
The most common issue is making investments without defining goals. If you start without a realistic plan for success, you probably won’t succeed.

Why did you decide to write “Data Mining for Dummies”?
I write for the business community. A business person who wants to know what’s what on an unfamiliar topic doesn’t go looking for an academic book! You get a “For Dummies” book!

So, this was an opportunity to write the book that defines data mining for the business community. Who could resist the opportunity to write “Data Mining for Dummies”?

Did you write this book with a certain person in mind? Who can benefit?
“Data Mining for Dummies” is written primarily for data mining novices, especially business people. The only prerequisites are everyday office computer skills and a little feel for numbers.

The book explains basics like what data miners actually do, how to get data, and the fundamentals of many data mining techniques. It has also drawn readers with data mining experience who want to fill gaps in their understanding or get ideas on how to explain data mining more clearly. But if you’re looking for fancy computer code or equations, you won’t like “Data Mining for Dummies,” because you won’t find that stuff in this book!

Why is data mining useful for a business? Any practical examples of where we see the benefits of data mining?
Data mining is useful because it provides concrete evidence, rather than mere opinion, to support decision making, often at a level of detail and relevance that decision makers are not getting from other sources.

Here are a few of my favorites:

  • A retailer discovered that the data collected on a customer’s first day in the frequent shopper program was a strong predictor of how much the customer would spend in the long term. That information allows the retailer to target its marketing plans and get better returns.
  • A planning commission used public property records to predict changes in real estate ownership. Knowing which parcels are likely to change hands creates opportunities to influence land use. There’s a chapter in the book called “A Day in the Life of a Data Miner” that explores the data and the process used for that project in detail.
  • A school administrator investigated expulsion records and discovered that the series of disciplinary actions that led to expulsion often begins with friction over school uniforms. He’s now an outspoken critic of school uniforms!
  • And of course, you may have heard about the Obama for America 2012 team using data in many ways for highly personalized campaigning and fundraising, helping to raise over $1 billion and win the election. That approach is called microtargeting, and you can learn more about it in the book.

In chapter four of your book, you write about the laws of data mining. You write about the importance of business goals, data preparation, the right modeling, patterns, predictions, etc. Is there a certain law that you cover that you think gets overlooked?
First, let me explain that the 9 Laws of Data Mining are principles of data mining practice. I didn’t create them; they were first stated by Tom Khabaza, a pioneering data miner. The fact is, each of the nine laws is often overlooked!

Even the first law of data mining, “business objectives are the origin of every data mining solution,” which is the very heart of the subject, is often neglected. Data mining isn’t intended for finding information that is merely interesting; it’s about finding information that helps solve specific business problems.

That’s why I devoted a chapter to explaining the 9 Laws. And just so everyone understands what’s at stake, that chapter also includes a real-life example of a business that spent over a million dollars on data mining, but wasn’t able to use the final result.

What chapter of your book did you have the most fun writing?
I enjoyed writing Chapter 9: Making New Data. It’s one of four chapters devoted to understanding the major sources of data.

Many data miners consider only the data that’s easily available to them. Depending on their job and skills, that might mean just data that’s already in-house, or just government sources, or just some other source that was not originally created for the specific problem they need to solve.

Often, it’s worthwhile to obtain new data that addresses your own specific business problem, and this data provides a unique competitive advantage because only you have it. So that chapter addresses things like survey research, loyalty programs and experimentation.

What are some of the biggest advancements in data mining you’ve seen lately?
The hot new things in data mining are developments in using data that we haven’t thought of as data in the past. That would include text, audio, images and video.

We’re also seeing new offerings that make data integration and access easier; you must have data access if you want to do data mining!

Meta, where can everyone learn more about you and the work you do?
Visit my website: Metabrown.com. You can find my contact information there, and links to many of my articles on analytics and communication.

Check out other #DataTalk interviews and learn about upcoming events.