
The Problem With Public Data

Miranda Partners is publishing our first guest blog, from Miguel Angel Davila, a founding partner at TUKAN. As the blog shows, TUKAN collects, cleans, and standardizes millions of data points for the Mexican market, with a focus on the financial sector. For the data junkies out there like ourselves, we cannot recommend TUKAN highly enough.

In the past few years, one of the major breakthroughs for companies is that it has become considerably easier to generate and make sense of first-party data.

Ten years ago, most companies were not using, or even aware of, the data that they produced.

What changed? The need to understand business operations faster gave rise to great products that help companies make sense of internal data more efficiently, such as Databricks, Tableau and Snowflake, among others. At the same time, rapidly growing mobile and internet penetration has made first-party data collection more efficient.

It should come as no surprise that – according to a study from the World Economic Forum – more than 80% of companies state that they have implemented big-data analytics and machine learning technologies within their organizations.

The incredible thing about these changes is that companies have gone from being completely unaware of the data they produce, to not being able to get their hands on enough data to outsmart their peers and gain a competitive edge. After all, if companies really want to know what’s happening outside their organization, they will need external data.

According to a study conducted by Capgemini, companies that extensively leverage external data in their decision-making processes enjoy (on average) 70% higher revenue per employee and 22% higher profitability. Furthermore, the three main external data sources are data from aggregators, public datasets and open datasets.

If leveraging external data is such a good thing for driving better business decisions and growth, why isn’t everybody doing it? The short answer: because it’s hard.

Out there (in the wild), databases are numerous, scattered, dirty and hard to handle. As a result, analysts and data scientists spend 60 to 80% of their time (on average) prepping and cleaning the data to make it actionable. In a five-day workweek, that leaves just one to two days for actual analysis. This is when companies and data teams realize that, when it comes to data, public has a very different meaning from free.
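To make the workweek arithmetic concrete, here is a minimal sketch in Python. The function name and its parameters are purely illustrative; the only inputs taken from the text are the 60 to 80% range and the five-day week.

```python
def analysis_days_left(prep_share_pct: int, workweek_days: int = 5) -> float:
    """Days left for analysis when data prep takes prep_share_pct percent of the week."""
    # Integer percentages keep the arithmetic exact for these inputs.
    return workweek_days * (100 - prep_share_pct) / 100


# Low and high ends of the 60 to 80% range quoted above.
for pct in (60, 80):
    print(f"{pct}% on prep -> {analysis_days_left(pct)} day(s) left for analysis")
```

At the high end of the range, a five-day week leaves a single day for analysis; at the low end, two.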

If companies truly want to make a significant impact by working with public data, they can’t afford to spend that amount of resources solely on prepping, cleaning and updating public data sources. That’s where TUKAN comes in to help.

We have spent so much time facing the problems associated with processing public data that we’ve made it our mission to make sure no analyst has to spend another second cleaning and prepping the data they need for their analysis.

TUKAN’s service consists of taking care of all the dirty work that comes with using third-party data. We collect, clean and standardize millions of data points and make them actionable through an easy-to-use web platform and API, where our users can find and download data sources quickly, merge datasets across different sources, and create customized tables, charts and reports that update automatically.

Finally, to make sure that every single dataset in our catalog has value, we are building our catalog alongside our clients, who guide us in the direction we need to go. So if you’re curious about incorporating public data into your decision making, sign up for a free demo on our website and we’ll figure out a way to help!