Future of Data Mining

I recently read an interesting New York Times article, “For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights.” It discusses where data mining is headed, which is closely tied to marketers’ work in a digital data world. I want to share a few thoughts on it here.

With the rapid development of technology, consumer behavior is becoming more and more transparent. With an abundance of digital data from various sources, marketers want not only the “big data” on their customers but also the meaning behind it, that is, data they can actually act on. As the article notes, the data janitor side of data mining may be boring and tedious, but it is required before the data can be sorted and organized into a form that is easier to analyze. So I believe the work of organizing data from many sources can and must be made more efficient, to leave more time for analysis. This trend is already under way.

As technology develops, more and more mundane work will be handed over to software tools, and even robots, leaving people more time to make intelligent decisions. I believe data scientists and computer programmers will figure out the solutions, developing software that automates the cleaning and organizing of data.

So a marketing team today probably needs not only data scientists but also computer programmers. To organize data from different sources, the data scientists decide what information is useful: which data is needed, what uniform format to use (different sources come in different formats), and how the data should be represented. The programmers then develop visualization tools and programs to sort and organize the data. The two groups have to work together closely to test and improve the programs, and to add calculation and analysis functions that yield real insight into the data. That’s the cool, sexy part the author mentioned.
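To make that division of labor a bit more concrete, here is a minimal Python sketch of what such a program might look like. Everything in it is made up for illustration: the two sources ("crm" and "web"), the field names, the date formats, and the sample rows are all assumptions, not anything taken from the article. The mapping tables stand in for the decisions the data scientists make, and the two functions stand in for the program the programmers write.

```python
from datetime import datetime

# Hypothetical raw rows from two sources, each with its own
# field names and date format (sample data invented for this sketch).
CRM_ROWS = [{"Email": " Ann@Example.com ", "Signup": "03/15/2014", "Spend": "120.50"}]
WEB_ROWS = [{"user_email": "bob@example.com", "signup_date": "2014-02-01", "total_spend": "80"}]

# Decisions made by the analysts: which fields to keep and what
# they are called in the uniform format.
FIELD_MAPS = {
    "crm": {"Email": "email", "Signup": "signup_date", "Spend": "spend"},
    "web": {"user_email": "email", "signup_date": "signup_date", "total_spend": "spend"},
}
DATE_FORMATS = {"crm": "%m/%d/%Y", "web": "%Y-%m-%d"}


def normalize(row, source):
    """Convert one raw row from the given source into the uniform format."""
    mapping = FIELD_MAPS[source]
    renamed = {mapping[key]: value for key, value in row.items() if key in mapping}
    signup = datetime.strptime(renamed["signup_date"], DATE_FORMATS[source])
    return {
        "email": renamed["email"].strip().lower(),
        "signup_date": signup.date().isoformat(),
        "spend": float(renamed["spend"]),
        "source": source,
    }


def combine(sources):
    """Normalize every row from every source and sort by signup date."""
    rows = [normalize(row, name) for name, raw in sources.items() for row in raw]
    return sorted(rows, key=lambda r: r["signup_date"])


records = combine({"crm": CRM_ROWS, "web": WEB_ROWS})
for record in records:
    print(record)
```

In this sketch, adding a third source would only mean adding one more entry to each mapping table, which is roughly the kind of janitor work one would hope to get out of the analyst's way.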

Excel lets many people analyze data the way experts do because the software does most of the work of preparing and organizing the data, so people have more time to spend on the cool stuff. In the same way, improvements in other data mining tools will further free people from mundane work so they can get to the really valuable part of data analysis. We are living in this changing world now and will benefit from it.

Start-ups developing these tools, such as Paxata, which is mentioned in the article, have already started the revolution. Paxata is a self-service adaptive data preparation platform that helps people turn raw data into data that is ready for analytics. Its solutions include importing, exploring, enriching, combining, sharing, and publishing data. This is exactly what the modern data world needs; platforms and tools like this will save analysts time so they can do more than just clean data.


