Jungle Disk Migrates 4.5 PB to Google Cloud
Jungle Disk stored its data with an incumbent cloud provider and needed to move 4.5 PB to Google Cloud Platform (GCP) to reduce costs and gain access to better storage plans.
In collaboration with Google, SoftServe developed a tool to migrate the 4.5 PB of data to Google Cloud. Migrating this much data in a short timeframe without service interruption called for a fast, scalable, reliable, and error-proof solution.
The tool moves data to GCP at ~200 MB/s from an explicitly specified source bucket. It supports error logging and stop/resume/retry logic, and has deployment and monitoring processes in place.
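The stop/resume/retry behavior described above can be illustrated with a minimal sketch. This is not SoftServe's implementation: the function `migrate_objects`, the `copy_one` callback, the in-memory `done` set, and the backoff parameters are all hypothetical names chosen for illustration, and a real tool would persist its progress in a durable store rather than in memory.

```python
import time
from typing import Callable, Iterable, List, Set


def migrate_objects(
    keys: Iterable[str],
    copy_one: Callable[[str], None],   # hypothetical: copies one object, raises on failure
    done: Set[str],                    # hypothetical checkpoint of already-copied keys
    max_attempts: int = 3,
    base_delay: float = 1.0,
    log: Callable[[str], None] = print,
) -> List[str]:
    """Copy each key once, skipping completed keys and retrying failures.

    Returns the keys that still failed after max_attempts tries.
    """
    failed: List[str] = []
    for key in keys:
        if key in done:
            continue  # resume: this object was copied in an earlier run
        for attempt in range(1, max_attempts + 1):
            try:
                copy_one(key)
                done.add(key)  # record progress so a restart can skip this key
                break
            except Exception as exc:
                # error logging: keep a record of every failed attempt
                log(f"attempt {attempt}/{max_attempts} failed for {key}: {exc}")
                if attempt < max_attempts:
                    # exponential backoff before the next retry
                    time.sleep(base_delay * 2 ** (attempt - 1))
        else:
            failed.append(key)  # all attempts exhausted
    return failed
```

Stopping the process and calling the function again with the same `done` set resumes where the previous run left off, and any keys returned as failed can be passed back in for a later retry.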
In phase one of the project, Jungle Disk received a Software Architecture Document that described the general software requirements, the migration tool's architecture design, the technology stack the software would be built on, and a system overview stating the system's general purpose. Additionally, SoftServe provided migration benchmark results based on a working prototype of the tool. From these results, Jungle Disk was able to estimate how long the actual migration would take and how much effort it would require.
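The kind of estimate a benchmark makes possible can be sketched as simple arithmetic. The example below assumes decimal units (1 PB = 10^15 bytes, 1 MB = 10^6 bytes) and a sustained rate with no downtime. The source does not say whether ~200 MB/s is a per-instance or an aggregate figure, so the `workers` parameter is a hypothetical way to model parallel instances, not a description of the actual deployment.

```python
PB = 10**15  # decimal petabyte, in bytes (assumption)
MB = 10**6   # decimal megabyte, in bytes (assumption)


def migration_days(total_bytes: float, bytes_per_second: float, workers: int = 1) -> float:
    """Estimate transfer time in days at a sustained per-worker rate."""
    if workers < 1 or bytes_per_second <= 0:
        raise ValueError("workers must be >= 1 and bytes_per_second must be > 0")
    seconds = total_bytes / (bytes_per_second * workers)
    return seconds / 86_400  # seconds per day


# 4.5 PB at 200 MB/s on a single stream: roughly 260 days
single_stream = migration_days(4.5 * PB, 200 * MB)

# The same volume spread over 10 hypothetical parallel workers: roughly 26 days
ten_workers = migration_days(4.5 * PB, 200 * MB, workers=10)

print(f"{single_stream:.1f} days, {ten_workers:.1f} days")
```

Under these assumptions a single 200 MB/s stream would need most of a year, which shows why the achievable degree of parallelism matters so much when planning a migration of this size.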
The migration tool is deployed to production servers and ready to use. It was built using DevOps best practices and continuous deployment processes. SoftServe’s team collaborated directly with Google to validate the solution prior to implementation.
SoftServe is also leveraging a subset of Google Cloud Platform services including Google Cloud Storage, Kubernetes Engine, Cloud Pub/Sub, Google Stackdriver, and Cloud SQL.
SoftServe’s solution can migrate large volumes of data to Google Cloud Platform with the durability and performance the client expects, and it has logging, monitoring, and deployment processes in place. Jungle Disk’s migration is in progress, and the company has already migrated 100 TB to Google Cloud. SoftServe is also providing additional support to help with bug fixes.
Jungle Disk positions GCP storage as a way to improve data recovery time. In tests conducted for the Jungle Disk software v3.30 release, data recovery with GCP was significantly faster than on other platforms. This improvement will mean substantially better outcomes for Jungle Disk’s customers in the event of an accident or cyber-attack.
GCP also provides geo-replicated data protection, which ensures that client data in GCP storage is not affected by an outage in a single data center.