AWS Performance Data Management Optimization for Global Investment Firm
Our client is a leading global financial services firm focused on multi-asset alternative investments, based in Boston, US. The company specializes in private and public equity, venture capital, credit, impact investing, life sciences, and real estate. The firm invests across a range of industry sectors and geographic regions. As of 2019, the firm manages over one hundred billion dollars of investment capital.
The client generates a large volume of transactional data as part of continuous, around-the-clock, worldwide trading. This data is produced and maintained by a proprietary transaction system that does not support timely analysis and reporting, so the process was neither efficient nor scalable enough to drive the business forward.
The client needed to extract up-to-date data from the transaction system twice a day to support reporting on portfolio management performance, financial position, exposure dynamics, and more. However, each data refresh degraded the main system's performance. The client tried to work around this by running data refreshes only during low trading periods in the US (e.g. at night). But because the client operates globally, offices in other regions may still be trading while one location is idle.
Our client needed a reliable partner to help speed up data refresh processing and reduce the time spent on reporting and analysis. This required guidance from professionals who could analyze the problems and identify how the business goals could be achieved.
The initial project phase included an architecture assessment of the existing systems to identify process bottlenecks. During the assessment, SoftServe's findings included:
- Synchronous workflow: all jobs within it are executed one by one
- No dashboard or metrics providing detailed technical statistics on pipeline execution
- Each database read-replica node could handle only a few processing tasks in parallel
- Single point of failure and no fail-over strategy
After evaluating the client's business requirements and expected outcomes, SoftServe's team recommended a set of improvements.
Three key changes were identified to create the biggest impact on system performance by managing data flows:
- Parallelization of workflow execution: the system splits the data into separate data flows and processes several of them in parallel in the AWS cloud
- Evaluation of databases that can read data directly from Amazon S3, or compaction of small files: because the source consists of a large number of small documents, we suggested grouping the files so that the data processing system treats each group as a single item, which increases processing speed
- Loading of compacted files to the database
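The compaction and parallelization ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: the function names, the size threshold, and the worker count are all hypothetical, and file sizes are passed in rather than read from S3.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable

# Hypothetical target size; the case study does not state the real value.
TARGET_BATCH_BYTES = 128 * 1024 * 1024


def plan_batches(
    files: Iterable[tuple[str, int]],
    target_bytes: int = TARGET_BATCH_BYTES,
) -> list[list[str]]:
    """Group (name, size_in_bytes) pairs into batches of at most ~target_bytes."""
    batches: list[list[str]] = []
    current: list[str] = []
    current_size = 0
    for name, size in files:
        # Close the current batch when adding this file would exceed the target.
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


def run_in_parallel(
    batches: list[list[str]],
    process: Callable[[list[str]], object],
    max_workers: int = 4,
) -> list[object]:
    """Process batches concurrently; results come back in batch order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process, batches))


if __name__ == "__main__":
    sample = [("a.csv", 40), ("b.csv", 50), ("c.csv", 30), ("d.csv", 90)]
    planned = plan_batches(sample, target_bytes=100)
    print(planned)
    print(run_in_parallel(planned, len, max_workers=2))
```

Each batch stands in for one compacted file, so the downstream loader sees a handful of larger items instead of many small ones. A thread pool is used here because this kind of work is usually I/O-bound; a different executor may suit CPU-heavy transformations better.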
Based on the prepared evaluation of transactional data size and daily patterns, performance vs. cost, and current system capabilities, SoftServe proposed two possible AWS-based architecture solutions:
- Out-of-the-box solution
- Customized approach
Both solutions and the suggested databases were tested against the client's needs in Proofs of Concept (PoC), with implementation following as a separate stage.
The customized architecture approach, combined with Amazon Redshift Spectrum, proved to maximize the system's performance and meet the success criteria, so this approach moved on to the implementation stage.
To meet industry standards, SoftServe's team introduced Spectrum schema-on-read tables (reading directly from S3), developed an orchestration function in serverless AWS Lambda, and described the whole stack as Infrastructure as Code (IaC) using AWS CloudFormation.
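In Redshift Spectrum, schema-on-read is expressed through external tables: the table definition only points at a location in S3, and the schema is applied when the files are queried. The Python sketch below builds such a `CREATE EXTERNAL TABLE` statement. The schema name, table name, columns, bucket, and Parquet file format are all hypothetical, since the case study does not describe the real tables, and the external schema is assumed to exist already.

```python
def build_external_table_ddl(schema, table, columns, s3_location, file_format="PARQUET"):
    """Return a CREATE EXTERNAL TABLE statement for a Redshift Spectrum table.

    columns is a list of (column_name, sql_type) pairs.
    Identifiers are not validated here; this is an illustration only.
    """
    column_sql = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n"
        f"    {column_sql}\n"
        f")\n"
        f"STORED AS {file_format}\n"
        f"LOCATION '{s3_location}';"
    )


if __name__ == "__main__":
    # Hypothetical names; the external schema must be created beforehand.
    print(
        build_external_table_ddl(
            "spectrum_staging",
            "trades",
            [("trade_id", "BIGINT"), ("amount", "DECIMAL(18,2)")],
            "s3://example-raw-bucket/trades/",
        )
    )
```

Because the table only references the S3 prefix, new files placed under that prefix become queryable without a separate load step into the staging layer.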
SoftServe drove the successful completion of the client's first AWS project, which introduced AWS services into the client's environment.
The smooth project execution and positive outcomes were a direct result of SoftServe team’s comprehensive understanding of the financial services industry and AWS services.
The project was delivered on time with the following results:
- 50 percent decrease in data processing execution time
- Modern and straightforward deployment procedure for the whole stack using AWS
- Recommendations for data quality assurance in terms of AWS technologies
The Client and SoftServe have learned several lessons and are looking forward to the next improvements and optimization phases.
After multiple evaluations of the tools available on the market, SoftServe's team chose the most effective solution for this scenario. The key lessons learned were:
- Amazon S3 is a very good fit as the "raw data location" for investment services data
- Amazon Redshift Spectrum: an out-of-the-box solution for the staging location, including schema-on-read
- AWS Lambda: an event-based solution used to orchestrate the whole solution
- AWS CloudFormation: Infrastructure as Code (IaC) for easy and fast deployment
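As an illustration of the event-based orchestration noted above, here is a minimal Python sketch of a Lambda handler that reacts to an S3 notification event. It only parses the event and decides on a next step. The step names and the returned structure are hypothetical, and the actual orchestration logic used in the project is not described in this case study.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return s3:// URIs for every object referenced in an S3 notification event."""
    uris = []
    for record in event.get("Records", []):
        s3 = record.get("s3")
        if not s3:
            continue  # skip records that are not S3 notifications
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(s3["object"]["key"])
        uris.append(f"s3://{bucket}/{key}")
    return uris


def handler(event, context):
    """Lambda entry point: report new objects and the (hypothetical) next step."""
    new_objects = extract_s3_objects(event)
    # A real orchestrator would start the next pipeline job here.
    next_step = "load_to_staging" if new_objects else "noop"
    return {"objects": new_objects, "next_step": next_step}
```

Keeping the event parsing in a separate function makes it possible to test the handler locally with a hand-written event dictionary before deploying it.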