by  Chip Plesnarski

Look Up for Data Warehousing

4 min read

Forward-thinking enterprises know that the speed, agility, and scalability needed to survive and compete in the digital landscape require a move to the cloud. The most pressing driver of this need is data. As society grows more connected, data will grow exponentially, as will the need to consolidate, store, and manage it. Nearly 11 zettabytes of user IP data flowed from the cloud in 2018. (That’s 11 billion terabytes of user data processed in that year alone.)

Beyond storage and optimizing capital and operational expenditures, there are many benefits to looking to the cloud for data warehousing, especially when using a cloud-native warehouse approach. Cloud warehousing:

  • Handles larger volumes of data at scale
  • Provides greater visibility into data assets
  • Processes high-volume data more effectively
  • Significantly reduces operational burdens
  • Empowers IT to actively contribute to broader organizational strategies

We advocate for cloud-native warehousing wherever possible because it is the most future-proof approach to data warehousing. This is true for a number of reasons.


Different by design

Cloud-native warehouses are, as their name indicates, built with cloud architecture and constraints in mind from the start. A serverless design, such as that of Google’s BigQuery, eliminates node and cluster configuration management. Separation of computing and storage resources allows for instant load and query capability. Automated management features cover data growth, query performance, disaster recovery, backups, and more. And cloud-native warehouses deliver scalability with no further requirement for resource provisioning.

A cloud-native warehouse is not the same as a cloud-hosted version of a company’s current data warehouse (which would be a data warehouse in the cloud, not cloud-native). A cloud-native approach eliminates much of the ongoing administration and expert management required to maintain a cloud-based version of a massively parallel processing (MPP) data warehouse or a traditional cluster-and-node architecture, either of which will have inherent issues with horizontal scalability and responsiveness when migrated. There is, however, a certain amount of retraining and tweaking that must take place internally to adapt current processes to best practices for cloud-native systems. For enterprises that are still undecided, Google recommends evaluating the following technical and economic dimensions to determine whether continuing to run the data warehouse on premises is right for the company.

Data security and governance

Is data always protected by encryption? Is fine-grained, role-based access control supported? Does the on-prem warehouse provide transparent audit logging for activity, data access, and billing? If the answer to one or more of these questions is “No,” then cloud-native must be considered.

Strategic Initiatives

If a company feels the on-prem warehouse passes for security and governance, then ask the following questions about how well IT is empowered to contribute to the company’s strategic initiatives:

  • Does it scale seamlessly and simplify operations?
  • Does it allow team collaboration across data artifacts and analyses?
  • Does it automate data delivery of new sources on demand?
  • Does it lay a strong foundation for predictive analytics and machine intelligence?

Once again, if a company can’t answer confidently in the affirmative, then cloud-native data warehousing will likely deliver dramatic improvements both economically and operationally.
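The two-stage evaluation described above can be sketched as a simple checklist. This is only an illustration, not a tool Google or SoftServe provides: the function name `evaluate_on_prem`, the verdict strings, and the choice to treat an unanswered question as "No" are all assumptions made for this example. The questions themselves are taken from the text.

```python
from typing import Dict, List, Tuple

# Questions copied from the article; the data structure is illustrative only.
SECURITY: List[str] = [
    "Is data always protected by encryption?",
    "Is fine-grained, role-based access control supported?",
    "Does the warehouse provide transparent audit logging "
    "for activity, data access, and billing?",
]
STRATEGIC: List[str] = [
    "Does it scale seamlessly and simplify operations?",
    "Does it allow team collaboration across data artifacts and analyses?",
    "Does it automate data delivery of new sources on demand?",
    "Does it lay a strong foundation for predictive analytics "
    "and machine intelligence?",
]


def evaluate_on_prem(answers: Dict[str, bool]) -> Tuple[str, List[str]]:
    """Return a (hypothetical) verdict and the questions that failed.

    Assumption: a missing answer counts as "No", mirroring the idea that a
    company must be able to answer confidently in the affirmative.
    """
    # Stage 1: security and governance. Any "No" triggers the recommendation.
    failed_security = [q for q in SECURITY if not answers.get(q, False)]
    if failed_security:
        return "consider cloud-native: security and governance gap", failed_security

    # Stage 2: strategic initiatives, checked only if stage 1 passes.
    failed_strategic = [q for q in STRATEGIC if not answers.get(q, False)]
    if failed_strategic:
        return "consider cloud-native: strategic gap", failed_strategic

    return "on-prem passes this checklist", []
```

For example, passing a dictionary that maps every question to `True` yields the "passes" verdict, while flipping the audit-logging answer to `False` returns the security verdict together with that one question.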


The next logical step is to determine which cloud-native solution is the right one for the company. It’s not enough to ensure parity with critical business functions today. The company’s relationship with data will change dramatically in both the short and long term. As data generation and reliance on that data increase, demand for access will grow in parallel outside of IT.

For the future, enterprise architects must also consider the following about cloud-native warehouse options:

  • How is data access controlled within and beyond IT?
  • Does it have non-expert user-friendly interfaces and features?
  • Does it provide simple paths to high-demand data including marketing, mobile, and SaaS analytics?

Additionally, companies must ensure that the warehouse lives within an ecosystem designed to achieve strategic goals: one with accessible reporting tools, mobile analytics frameworks, and immediate integration of stream processing and ML capabilities.

Next Steps

While looking to the cloud for data management, take a close look at Google Cloud Platform (GCP) for on-demand business insights through serverless data analytics services.

You can leave the complexities of data analytics behind and:

  • Use Google BigQuery, a cloud-native serverless data warehouse that executes queries in seconds instead of minutes, at any scale, for accelerated time to insight
  • Ingest and analyze up to millions of events per second in real time with Cloud Pub/Sub and Cloud Dataflow
  • Get value faster from data processing on Apache Spark and Apache Hadoop with Cloud Dataproc
  • Visualize and explore data, and publish dashboards and reports to share insights, using Google Data Studio and existing third-party BI tools
  • Bring predictive analytics into your applications by adopting machine learning at your own pace using Cloud Machine Learning Engine or pre-trained machine learning APIs

At SoftServe, we are proud GCP partners and award-winning big data and cloud experts. Contact us today and let’s discuss where you are in your cloud-based data management journey.