It’s 2018. Big data has evolved. It’s no longer perceived as the phantasmal mystery it once was. No, big data has matured. Interest and adoption have reached a steady state. In fact, big data is such a routine part of an organization’s data skill set that we don’t even call it “big data” anymore. It’s just data now: data that’s become so integral to our decision-making frameworks that we take for granted the extraordinary successes it affords us. In other words, creating value from big data analytics is mainstream.
With this ubiquity, most large organizations already enjoy strategic wins from their existing big data initiatives. For them, the low-hanging fruit has all been picked. These organizations have moved on to subsequent rounds of optimization to create even more value from their data assets. Ask any CDO: it’s in this phase that effective data management becomes paramount to success.
WHY EFFECTIVE DATA MANAGEMENT SPELLS SUCCESS IN BIG DATA
In my work as a data strategist, I often find myself researching the latest and greatest big data tools, technologies, and methodologies. I make new technology recommendations only after accounting for a client’s existing technologies, because we need to leverage existing capabilities in a way that lets the client achieve its goals most cost-effectively.
CONSIDER THE COST OF CREATING VALUE FROM BIG DATA ANALYTICS
Speaking of cost-effectiveness, did you know that in 2017 Glassdoor reported the average salary for data scientists as $120,931 per year? Also in 2017, CrowdFlower’s Data Scientist Report found that data scientists spend 53% of their time collecting, labeling, cleaning, and organizing data.
Doing a little math, that’s about $64,000 per person per year that an employer spends on data preparation man-hours. For organizations with five or more data professionals, it makes absolute sense to acquire a data management tool that automates away most data preparation tasks. That will go a long way toward lowering the cost of creating value from big data analytics.
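For the curious, the back-of-the-envelope math behind that $64,000 figure is simply the reported average salary multiplied by the reported share of time spent on data prep:

```python
# Rough annual cost of data preparation per data scientist, using the
# 2017 figures cited above (Glassdoor salary, CrowdFlower time share).
average_salary = 120_931   # USD per year (Glassdoor, 2017)
prep_time_share = 0.53     # fraction of time spent on data prep (CrowdFlower, 2017)

prep_cost = average_salary * prep_time_share
print(f"${prep_cost:,.0f} per person per year")  # prints "$64,093 per person per year"
```

Multiply that by a team of five or more and the case for automation makes itself.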
That said, below are a few of the capabilities I look for when recommending data management solutions:
- Data Cataloging Support: It’s impossible to maintain 100% accuracy when you catalog big data manually. Data management tools should have built-in data cataloging capabilities.
- Data Discovery Support: When it comes to discovering and identifying relevant datasets, you want an automation tool that surfaces every potentially relevant dataset, so you can narrow down your selection from there.
- Data Munging Capabilities: You want a tool that can do some base level data aggregation, clean-up, and enhancement – so your data scientists can spend their time building models.
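To make that last capability concrete, here’s a minimal, hypothetical sketch of the kind of base-level munging a data management tool might automate; the record names and rules are illustrative, not taken from any particular product:

```python
# Illustrative example: automated deduplication, null handling, and simple
# enhancement (type conversion) on raw records. All names are hypothetical.

raw_records = [
    {"customer": "Acme", "revenue": "1200"},
    {"customer": "Acme", "revenue": "1200"},   # exact duplicate
    {"customer": "Globex", "revenue": None},   # missing value
    {"customer": "Initech", "revenue": "950"},
]

def munge(records):
    """Drop duplicates and null rows, then cast revenue to a number."""
    seen, cleaned = set(), []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key in seen or rec["revenue"] is None:
            continue  # skip duplicates and rows with missing revenue
        seen.add(key)
        cleaned.append(dict(rec, revenue=float(rec["revenue"])))
    return cleaned

print(munge(raw_records))
# [{'customer': 'Acme', 'revenue': 1200.0}, {'customer': 'Initech', 'revenue': 950.0}]
```

Every hour a tool spends doing this kind of clean-up is an hour a data scientist spends building models instead.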
OTHER KEY CHARACTERISTICS OF A WINNING DATA MANAGEMENT SUITE
In an ideal world, you’d want a data management suite that can handle the sophisticated analytical requirements of a data scientist while also accommodating the data management needs of business users. This solution would, of course, need to manage, store, and process all varieties of data structures, volumes, and velocities, but there’s more. It should also offer a front-end interface on which data-driven applications can be built, deployed, and maintained. Alas, that’s a lot to ask…
In recent research, I’ve seen several promising solutions on the market. My brand sponsor, SAP, recently launched a powerful integrated big data management solution called SAP HANA Data Management Suite!
In fact, to celebrate the launch, they sponsored a giveaway in which one member of our community will win the gorgeous keyboard shown below:
Be sure to enter to win today!