Migrated a locally developed Markov model, used in student loan analysis, from the RStudio IDE to the AWS cloud, increasing access, adaptability, and collaboration.
The client, Varde Partners LP, is an investment management firm and a leading global alternative investment adviser. Founded in 1993, the firm has invested more than $68 billion since inception and manages over $14 billion on behalf of a global investor base.
Their goal is to seek opportunities in less competitive markets, which makes large amounts of financial data and information critical to their success. Only high-performing, accurate financial models can help them understand the drivers of performance and give senior management a holistic view of the firm’s performance.
Limited approach to model development
In one particular project to analyze financial Markov models for student loans, the client relied on in-house experts to develop the models, but the models were not easily accessible or adaptable, and other team members could not collaborate on them. The work was done in a local development environment, the RStudio IDE. The data used in model training was retrieved manually from Power BI. The codebase consisted of custom-made scripts, which could not easily be reused or adapted to similar problems.
Blue Orange Digital identified a set of challenges arising from their approach to model development:
- Local development environments are inherently limited. There was no easy way to share results or to engage other team members in model development.
- The existing codebase was inefficient, which slowed development. Since the company required multiple similar models to be developed, this did not align with their long-term goals.
- The existing codebase was not following best practices. This made it expensive to maintain, with each further code change costing the company both money and time.
- Local model development is not sustainable. Without versioning and keeping track of models, it was impossible to reuse them across different applications.
- The existing model was not easily accessible to in-house stakeholders: employees who wanted to contribute to model development, but also those who wanted to access and interpret its results. The financial model could simply not be used to its full potential.
- The inference and its results took place locally and stayed local. Instead of having the ability to share them with the rest of the team, both the results and the implementation details of the model were stuck with the sole developer of the model.
All in all, the limitations of the local development environment did not serve the long-term goals of the client. Because the client intended to develop multiple models for different applications, they needed a way to improve access to model development and to make the results available to a variety of stakeholders. Employees, executives, and even external financial advisors required access to the models and their results.
Model migration: From local development to cloud-based notebooks
Blue Orange Digital planned and carried out a migration that ported a locally developed financial model to the AWS cloud. This way, anyone in the company could be granted access to the model, join in its development, and have easier access to its results.
Our team of engineers broke down the problem into the following steps:
1. Refactored and optimized a locally developed student loan financial model
The existing codebase had originally been written in R, which was not the ideal choice for the task at hand. Our engineers reviewed, refactored, and optimized the existing codebase. They identified hot spots, removed anti-patterns, and brought the codebase up to date with current industry best practices.
The effects of our work could be seen immediately: after refactoring, the model ran 12 times faster than its initial inference speed.
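The client's model itself is not shown in this case study. For readers unfamiliar with the technique, the following is a minimal sketch of how a Markov model can project a loan portfolio over time. The state names, the transition probabilities, and the function names are all hypothetical and chosen only for illustration; they are not the client's figures.

```python
from typing import Dict, List

# Hypothetical loan states (an assumption, not taken from the client's model).
STATES = ["current", "delinquent", "default", "paid_off"]

# Hypothetical monthly transition probabilities; each row sums to 1.
TRANSITIONS: List[List[float]] = [
    [0.90, 0.06, 0.00, 0.04],  # from current
    [0.30, 0.50, 0.20, 0.00],  # from delinquent
    [0.00, 0.00, 1.00, 0.00],  # default is absorbing
    [0.00, 0.00, 0.00, 1.00],  # paid_off is absorbing
]


def validate(matrix: List[List[float]]) -> None:
    """Check that the matrix is square and every row is a probability distribution."""
    size = len(matrix)
    for row in matrix:
        if len(row) != size:
            raise ValueError("transition matrix must be square")
        if any(p < 0 for p in row) or abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row must be non-negative and sum to 1")


def step(distribution: List[float], matrix: List[List[float]]) -> List[float]:
    """Advance the state distribution by one period (row vector times matrix)."""
    size = len(matrix)
    return [
        sum(distribution[i] * matrix[i][j] for i in range(size))
        for j in range(size)
    ]


def project(initial: List[float], matrix: List[List[float]], months: int) -> Dict[str, float]:
    """Return the share of loans in each state after the given number of months."""
    validate(matrix)
    distribution = list(initial)
    for _ in range(months):
        distribution = step(distribution, matrix)
    return dict(zip(STATES, distribution))


if __name__ == "__main__":
    # Start with every loan in the "current" state and look one year ahead.
    for state, share in project([1.0, 0.0, 0.0, 0.0], TRANSITIONS, 12).items():
        print(f"{state}: {share:.3f}")
```

In a sketch like this, most of the run time goes into the repeated matrix products, which is typically where vectorized numerical libraries help.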
2. Migrated the local codebase to a cloud-based notebook environment
Our team deployed the optimized model to cloud-based notebook environments managed with AWS. For this, they relied on Amazon SageMaker, which they used to deploy, manage, and customize notebook instances according to the client’s needs.
The use of notebook instances was crucial in enabling the client and their team to easily collaborate on future model development efforts.
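As an illustration of this step, the sketch below builds the request for one SageMaker notebook instance per user and passes it to the boto3 `create_notebook_instance` call. The naming scheme, the `owner` tag, the instance type, the volume size, and the helper names are assumptions made for this example; the actual configuration used in the project is not described here.

```python
from typing import Any, Dict


def notebook_instance_request(
    owner: str,
    role_arn: str,
    instance_type: str = "ml.t3.medium",  # assumed size, not the client's
    volume_gb: int = 20,                   # assumed volume size
) -> Dict[str, Any]:
    """Build keyword arguments for the SageMaker CreateNotebookInstance API."""
    # Hypothetical naming scheme: one instance per user.
    name = f"loan-model-{owner}".lower().replace("_", "-")
    return {
        "NotebookInstanceName": name,
        "InstanceType": instance_type,
        "RoleArn": role_arn,
        "VolumeSizeInGB": volume_gb,
        "RootAccess": "Disabled",
        # The tag records who owns the instance (an assumed convention).
        "Tags": [{"Key": "owner", "Value": owner}],
    }


def create_notebook_instance(request: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request to AWS. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the request builder works without it

    client = boto3.client("sagemaker")
    return client.create_notebook_instance(**request)
```

Keeping the request builder separate from the AWS call makes the configuration easy to review and test before any resources are created.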
3. Created a guided model development flow
The Blue Orange team architected and implemented a model development flow based on a predefined notebook template, designed with the management of multiple notebook instances in mind. By leveraging ipywidgets and the flexibility of Jupyter notebooks, they implemented an interface that allows users to create custom notebooks. This resulted in a “guided model development flow” that also enabled non-technical employees to duplicate, modify, and interact with their own models, in their own notebooks. The non-technical staff could “copy and paste” a model and adapt it as necessary to run a comparative analysis without circling back to their developers.
On top of that, our team set up a git-based version control system, which made model versioning possible. This has made it easier not only to collaborate on notebooks, but also to keep track of changes over time.
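The “copy and paste” idea can be sketched with the standard library alone, since a Jupyter notebook is a JSON document. The example below copies a template and replaces its parameters cell with user-supplied values. The `parameters` cell tag, the parameter names, and the function names are assumptions made for illustration; the interface built for the client used ipywidgets and is not reproduced here.

```python
import copy
import json
from typing import Any, Dict


def parameters_cell(params: Dict[str, Any]) -> Dict[str, Any]:
    """Build a code cell that assigns each parameter as a Python variable."""
    lines = [f"{name} = {value!r}" for name, value in sorted(params.items())]
    return {
        "cell_type": "code",
        "metadata": {"tags": ["parameters"]},  # assumed tagging convention
        "source": "\n".join(lines),
        "outputs": [],
        "execution_count": None,
    }


def notebook_from_template(template: Dict[str, Any], params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the template whose parameters cell holds the new values."""
    notebook = copy.deepcopy(template)  # never modify the shared template
    other_cells = [
        cell
        for cell in notebook["cells"]
        if "parameters" not in cell.get("metadata", {}).get("tags", [])
    ]
    notebook["cells"] = [parameters_cell(params)] + other_cells
    return notebook


def save_notebook(notebook: Dict[str, Any], path: str) -> None:
    """Write the notebook to disk as an .ipynb file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(notebook, handle, indent=1)
```

A widget-based form could collect the parameter values and call `notebook_from_template` behind the scenes, so that users never edit the template directly.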
4. Provided technical support
Our engineers also addressed the accessibility and security of the implemented solution. Given that access to model development was required on a global level, they made sure to set up user permissions according to the client’s needs. This way, employees from different countries could manage their own notebook instances, running in isolation from all other instances.
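One common way to achieve this kind of isolation on AWS is tag-based access control. The sketch below builds an IAM policy document that lets users start, stop, describe, and open only the notebook instances tagged with their own user name. It assumes that every instance carries an `owner` tag and that users sign in as IAM users (the `${aws:username}` policy variable is not populated for assumed roles). The permissions actually configured for the client are not described in this case study.

```python
import json
from typing import Any, Dict


def notebook_owner_policy() -> Dict[str, Any]:
    """Build an IAM policy limiting users to notebook instances they own."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "sagemaker:StartNotebookInstance",
                    "sagemaker:StopNotebookInstance",
                    "sagemaker:DescribeNotebookInstance",
                    "sagemaker:CreatePresignedNotebookInstanceUrl",
                ],
                "Resource": "arn:aws:sagemaker:*:*:notebook-instance/*",
                # Assumed convention: the instance's "owner" tag must match
                # the name of the IAM user making the request.
                "Condition": {
                    "StringEquals": {
                        "sagemaker:ResourceTag/owner": "${aws:username}"
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON, ready to be reviewed before attaching it.
    print(json.dumps(notebook_owner_policy(), indent=2))
```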
Our team also provided the client with the documentation for the new cloud-based development environment. This included industry best practices, guidelines on model development, as well as an introduction to SageMaker-specific tooling. This enabled the client’s development team to self-manage and make the most out of their new cloud-based model development flow.
The solution provided by Blue Orange Digital laid the foundation for further development of financial models. With the cloud-based notebook environment in place, model development can now engage the whole company.
Notebooks enable user-friendliness
- Financial analysis models are intricate and complicated by nature. Having the infrastructure to make the model development flow more intuitive and simple to use has made these models (and their results) more accessible to a variety of stakeholders.
- Scientists and analysts can easily start using existing models, or build new models based on the existing templates. Executives and managers have access to better reports. Most importantly, results can be reproduced and model performance can be tracked all throughout development.
- The new development flow makes communicating about models and their results much easier. What used to be a one-person effort to do data science now has the chance to benefit and engage the whole organization.
Reduced model inference times
- The optimized model runs much faster and is now in a state in which it can be used as a foundation for the development of other models. According to one of our team members, model inference time went down from 12 minutes to under 1 minute.
- While this may only sound relevant for the development team, it is truly a KPI that affects the performance of the whole financial analysis workflow: model development and prototyping become faster, safer, and more reliable.
Improved Model Accessibility and Data Security
- The client now has the capability to assign model development tasks to different employees (or even external partners) without having to worry about data privacy and security issues. The cloud-based notebook environment can be easily managed and kept under control by their internal IT and operations team, while data scientists and financial analysts can focus on what matters.
- The solution provided by Blue Orange Digital has a twofold effect on the reliability of our client’s financial analysis efforts.
- Firstly, at the model level: the versioning of models has become possible. This enables future model developers to keep track of model changes, track their performance, and even reproduce analysis results, as required.
- Secondly, at the infrastructure level: the SageMaker Notebook environment is a robust solution that will benefit the client for a long time. It can easily be integrated with existing data processing pipelines and extended with new functionality to increase the variety of the existing models.
While our client’s challenges are by no means unique in the financial sector, it took the vast expertise of our team to turn those challenges into opportunities. They carefully took into account existing infrastructure constraints, the availability of modern tooling, and the needs of the in-house experts. The solution they provided has made a significant impact on the model building process and has laid the foundation for future machine learning and analytics workflows.