The generation, collection and storage of big data have become commonplace, even necessary. Now, enterprises must figure out not only what to do with big data, but also how to store and analyze it at all.
“In brief, big data consists of large quantities of data that are almost always being analyzed,” said Bryan Stoddard, who runs the website CertaHosting. For many enterprises, this means relying on cloud storage. In fact, it’s one of the factors behind the increasing popularity of cloud computing.
“Because the velocity and voracity of big data are so high, it seems warehouses are becoming a less optimal solution,” said Stoddard. “Migration to the cloud, while implementing security measures like encryption, is the way forward for most organizations.”
But while cloud storage is becoming increasingly accessible for organizations of different sizes and with different data needs, enterprise challenges around big data remain. Here are three enterprise considerations to keep in mind as your organization moves forward with big data.
1. Go bigger, get automated
Part of the challenge of dealing with big data is scale.
“As big data grows, organizations will need to scale their resources accordingly to keep up with the pace,” said Andrew Herbert, founder of Cangler Analytics. Investment in automation, as well as serverless compute and storage technologies, is an important part of that, he added.
“Without investment in automation, the rate at which organizations can execute on data analytics use cases is directly proportionate to the number of data engineers, data scientists and data analysts they hire,” he said. “This creates a scalability issue and dramatically increases the cost of data analytics.”
Automation means putting control of the data in the hands of IT staffers, as well as users at every level of the organization, Herbert said.
“Organizations of the future will empower every employee to be a data-driven decision maker in order to handle the large volume of big data,” he said. “Dashboards and data science tools will no longer be the domain of IT; the combination of data democratization and automation will allow everyone in an organization to effectively use data analytics in their day-to-day decision making.”
2. Do you need storage, analysis, sharing?
Along with automation, organizations have to figure out their data storage needs to truly embrace big data’s potential, said Ajay Prasad, founder and president of RepuGen.
“Data storage comes down to two main factors: proper storage and organizational methods of storage,” Prasad said. The latter involves options like having APIs automatically organize data so teams can focus on managing products and customers.
Prasad said cloud computing and storage are a key component. “Many cloud storage platforms depict that cloud data is the next step to safe data storage and sharing,” he said.
3. How complex do you need to get?
Enterprises that must store big data usually reach for two main technologies: Amazon S3 and the Hadoop Distributed File System (HDFS), said Jesse Anderson, managing director of Big Data Institute. S3 and other object stores are common ways to store files, he added, and HDFS is a frequent choice for data storage and processing.
“You can use S3 to store data for processing; however, it won't be as performant as HDFS,” he said. “For ease of operational use, some companies will opt for just using S3 to store and share data, especially in the cloud.”
Those two options are sufficient for simpler needs, Anderson said, but more complex use cases need to consider data optimization with storage.
“This is where we get into NoSQL databases and their core use cases,” he said. “Many big data companies will use both HDFS/S3 and a NoSQL database. The latter lays out data so it isn’t necessary to read 100 billion rows or 1 petabyte of data each time. All reads and writes are efficient, even at scale.”
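To make Anderson's point about data layout concrete, here is a minimal Python sketch, not tied to any particular NoSQL product. It contrasts scanning every record, as with a plain file in S3 or HDFS, against looking records up by the key they are stored under. The record fields (`customer_id`, `ts`, `amount`) and the function names are hypothetical and chosen only for illustration. A real database adds partitioning, persistence and replication on top of the same idea.

```python
from collections import defaultdict

# Hypothetical event records; the field names are illustrative only.
events = [
    {"customer_id": "c1", "ts": 1, "amount": 20},
    {"customer_id": "c2", "ts": 2, "amount": 35},
    {"customer_id": "c1", "ts": 3, "amount": 15},
    {"customer_id": "c3", "ts": 4, "amount": 50},
]


def scan_lookup(rows, customer_id):
    """File-style access: read every row and keep the ones that match."""
    rows_read = 0
    matches = []
    for row in rows:
        rows_read += 1
        if row["customer_id"] == customer_id:
            matches.append(row)
    return matches, rows_read


def build_keyed_store(rows):
    """Key-value-style layout: group rows under the key they are queried by."""
    store = defaultdict(list)
    for row in rows:
        store[row["customer_id"]].append(row)
    return store


def keyed_lookup(store, customer_id):
    """Go straight to one key; only the rows stored under it are read."""
    matches = store.get(customer_id, [])
    return matches, len(matches)


store = build_keyed_store(events)
scan_matches, scan_reads = scan_lookup(events, "c1")
keyed_matches, keyed_reads = keyed_lookup(store, "c1")

print(f"scan read {scan_reads} rows, keyed lookup read {keyed_reads} rows")
```

Both lookups return the same records, but the scan touches every row while the keyed lookup touches only the rows for the requested customer. That gap widens as the data set grows, which is the trade-off behind pairing a file or object store with a database laid out around the queries it must serve.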