More and more organizations are relying on public cloud services. However, being in the cloud does not mean the same thing for everyone. Some organizations choose to implement and maintain their own cloud architectures. Others prefer to use entirely managed services, such as Snowflake’s cloud data platform. While the former have complete control over the architectures they use, the latter never have to worry about infrastructure maintenance.
So what’s important when considering the “build vs buy” decision? Is maintaining your own cloud architecture a way for tech-savvy organizations to flex and show off their skills? Or is it really a must, given managed services’ lack of flexibility? When is one more suitable than the other?
A variety of decision factors need to be accounted for when choosing a cloud-based architecture. We distinguish and discuss the following types of considerations: infrastructure, data, and people.
Migrating to the cloud is an iterative process that happens gradually. Organizations find themselves at different levels of cloud adoption, depending on how long ago they started their journey. A cloud migration is driven by specific business goals, and the decision to build or buy should serve those same goals. This is a good time to understand the differences between buying and building data architectures for the cloud.
Organizations at the beginning of the cloud migration process have the most flexibility in choosing between buying and building. This decision should match the goals of their migration strategy. If they are migrating in order to cut maintenance costs, buying a managed service such as Snowflake is likely to help in that direction. Because a managed service removes the need for infrastructure setup and maintenance, less time and fewer resources go into low-level upkeep.
On the other hand, if the migration goals are to counteract the limitations of on-premise, legacy systems and to develop custom architectures, building and owning a unique data architecture for the cloud is the way to go. Organizations that are already maintaining in-house environments are likely to have resources and expertise for maintaining their own infrastructure in the cloud.
The complexity of an organization’s data architecture plays a crucial role in its long-term cloud strategy. It determines which cloud services need to be integrated, which tools need to be used, and how much expertise is required to maintain them.
Simple data architectures usually serve common purposes. They allow the ingestion, transformation, storage, querying, and visualization of data. On-premise architectures are bounded by the technical limits of the tools being used (such as storage or compute power), whereas in the cloud those limits can be lifted by provisioning more resources on demand. The only prerequisite is that the data architecture is built to take advantage of cloud capabilities. Only then can operational requirements such as scalability, availability, and data protection be met.
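To make the stages of a simple data architecture concrete, here is a minimal sketch of ingestion, transformation, storage, and querying in Python. It is an illustration only: the sample orders data and all function names are hypothetical, and an in-memory SQLite database stands in for a real cloud data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this would come from files, APIs, or streams.
RAW_CSV = """order_id,region,amount
1,east,120.50
2,west,80.00
3,east,42.25
"""


def ingest(raw_text):
    """Ingestion: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transformation: cast types and normalize the region label."""
    return [
        (int(r["order_id"]), r["region"].upper(), float(r["amount"]))
        for r in rows
    ]


def store(conn, records):
    """Storage: load the cleaned records into a table."""
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def query_totals(conn):
    """Querying: aggregate revenue per region, ready to feed a dashboard."""
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    )
    return cur.fetchall()


def run_pipeline(raw_text):
    """Run all stages end to end against a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        store(conn, transform(ingest(raw_text)))
        return query_totals(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    for region, total in run_pipeline(RAW_CSV):
        print(region, total)
```

Each stage here is a few lines because the data is tiny and lives on one machine. The build vs buy question is really about who takes responsibility for making these same stages scalable, available, and protected once the data no longer fits that description.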
For the most common data processing purposes, these problems have been solved by tools such as Snowflake. Buying a Snowflake architecture means buying a proven solution that has worked for thousands of other users. Since its cloud data platform is built to make the most of the cloud, there is little need to optimize or improve the underlying architecture yourself.
For custom requirements, however, tools like Snowflake may not offer enough flexibility. In that case, it makes more sense to build an architecture that serves those specific processing needs. Such advice needs to be taken with a grain of salt, though. Building an architecture for the cloud may be easy to achieve; it is the ongoing maintenance and optimization that make the difference, and both fall under the direct responsibility of your team.
At other times, your organization may not be ready to change too much at once, and a custom hybrid workaround may carry you through until it becomes too complicated and expensive to maintain. At that point, it makes more financial sense to update the whole system. If you are there, or if you don't know how much you could save by switching, get a data audit from Blue Orange Digital, a Top AI Development Partner.
An architecture’s complexity is directly affected by the particularities of the data being processed. This includes the different data types being handled, the variety of data sources, and the processes around that data. Since these things change as an organization evolves, it makes sense to think about data considerations in terms of future needs, not just current ones.
Buying Snowflake makes sense when there is a need to handle multiple data sources and multiple data types. By leveraging its data lake technology, Snowflake can deal with both structured and unstructured data out of the box. Its data warehouse is also tightly integrated with the data lake and allows quick access to pre-processed data. At the same time, ETL workloads are built to run concurrently, in a distributed manner, while making use of cloud capabilities. Such functionality supports the performance of both current and future data processing pipelines.
Building data architectures makes the most sense when the data sources are unlikely to change in the future. A low number of data sources, few data types, and basic processing requirements should not result in complex service configurations. Since data lakes and data warehouses are offered by the major cloud vendors, they can be coupled together and implemented to suit specific requirements. This approach also offers more flexibility and the ability to implement custom processing pipelines.
The availability of engineering staff is the most impactful factor in the build vs buy decision. Building an architecture is only possible when skills and expertise with different cloud services are available. Similarly, maintaining and optimizing a cloud platform is only possible when a dedicated team of engineers keeps a close eye on the performance of the different services.
Organizations with existing IT departments are the most likely to develop the expertise needed to build their own cloud platforms. On the other hand, small companies without any internal IT support are the most likely to benefit from buying a managed service such as Snowflake. With less maintenance to handle, they only need to train (or hire) engineering staff to work on their data and applications. With all the infrastructure needs managed for them, they can still enjoy the benefits of running in the cloud without having to handle operational tools and processes.
Skilled IT personnel are needed for a variety of tasks across a cloud data platform, from setting up the cloud architecture and building data transformation pipelines to visualizing data and customizing and connecting different BI dashboard tools.
Since different organizations have different data access requirements, it is mandatory to understand the roles of the different team members. Who needs access to data? Why do they need access to data? Should the data be accessible via code only, or are visualization dashboards also needed?
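One way to answer these questions is to write the roles down explicitly. The sketch below is a hypothetical example in Python: the role names, purposes, and access modes are assumptions made for illustration, not a recommended team structure.

```python
from dataclasses import dataclass
from enum import Enum


class AccessMode(Enum):
    CODE = "code"            # SQL, notebooks, scripts
    DASHBOARD = "dashboard"  # visualization tools


@dataclass(frozen=True)
class TeamRole:
    name: str
    purpose: str       # why this role needs access to data
    modes: frozenset   # how this role reaches the data


# Hypothetical roles; replace with the roles in your own organization.
ROLES = [
    TeamRole("data engineer", "build and monitor pipelines",
             frozenset({AccessMode.CODE})),
    TeamRole("analyst", "explore data and report findings",
             frozenset({AccessMode.CODE, AccessMode.DASHBOARD})),
    TeamRole("executive", "track business metrics",
             frozenset({AccessMode.DASHBOARD})),
]


def needs_dashboards(roles):
    """True if at least one role relies on visualization dashboards."""
    return any(AccessMode.DASHBOARD in role.modes for role in roles)


def code_only_roles(roles):
    """Names of the roles that access data through code alone."""
    return [r.name for r in roles if r.modes == frozenset({AccessMode.CODE})]


if __name__ == "__main__":
    print("Dashboards needed:", needs_dashboards(ROLES))
    print("Code-only roles:", code_only_roles(ROLES))
```

An inventory like this shows quickly whether dashboard tooling has to be part of the platform and how many people will need code-level access to it.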
When building, a significant part of the team is responsible for the design and maintenance of the cloud architecture. First, they need to plan and identify a suitable cloud architecture. Second, they need to build, configure, and piece together a variety of tools and cloud services. Finally, the team is responsible for maintaining the whole architecture, which includes dealing with failures and ensuring that operational requirements are met.
When buying, the team can focus solely on accessing, interpreting, and visualizing data, and on implementing applications, building custom tools, and connecting dashboards that non-technical users can then work with. This is one of the most sought-after Snowflake benefits: it allows skilled engineers to focus on extracting business value from data rather than spending time configuring and maintaining the underlying infrastructure.
Building a cloud data architecture is not a decision that affects the present only. It is meant to remain reliable in the future and to accommodate both current and future business needs. For this reason, our decision guideline takes you through several perspectives that are relevant when setting up a data-centric cloud architecture. Your business needs, internal processes, and team growth prospects are all factors that will lead to the best decision.
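As a rough illustration, the considerations discussed in this article can be turned into a simple checklist. The Python sketch below is a hypothetical heuristic, not a formal method: the field names, the equal weighting of factors, and the rule that a missing infrastructure team settles the question are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    """Hypothetical summary of an organization's situation."""
    goal_is_cutting_maintenance: bool  # infrastructure: why migrate?
    needs_custom_processing: bool      # infrastructure: custom requirements
    many_sources_or_types: bool        # data: variety of sources and types
    sources_likely_to_change: bool     # data: expected future change
    has_infra_engineers: bool          # people: staff to build and maintain


def lean(profile):
    """Return 'build', 'buy', or 'unclear' by tallying which way each factor points."""
    # Assumption: without engineers to run the platform, building is not viable.
    if not profile.has_infra_engineers:
        return "buy"

    build_votes = 1  # an infrastructure team is available
    buy_votes = 0

    if profile.goal_is_cutting_maintenance:
        buy_votes += 1
    if profile.needs_custom_processing:
        build_votes += 1
    if profile.many_sources_or_types:
        buy_votes += 1
    else:
        build_votes += 1
    if profile.sources_likely_to_change:
        buy_votes += 1
    else:
        build_votes += 1

    if build_votes == buy_votes:
        return "unclear"
    return "build" if build_votes > buy_votes else "buy"


if __name__ == "__main__":
    startup = Profile(True, False, True, True, False)
    enterprise = Profile(False, True, False, False, True)
    print("startup:", lean(startup))
    print("enterprise:", lean(enterprise))
```

A tally like this is only a conversation starter. In practice the factors carry different weights for different organizations, and an "unclear" result is a sign that a closer look at costs and future needs is warranted.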
Are you currently deciding whether or not to build your own cloud architecture? Are you considering buying a managed service, like Snowflake? Our team of experienced cloud architects and data engineers can assist you with your decision.
Schedule a 15-minute call with a Blue Orange Digital Solution Architect to discuss which option is right for your data sources and future goals.
Josh Miramant is the CEO and founder of Blue Orange Digital, a data science and machine learning agency with offices in New York City and Washington DC.
Miramant is a popular speaker, futurist, and strategic business and technology advisor to enterprise companies and startups. As an example of thought leadership, Miramant has been featured in IBM ThinkLeaders, Dell Technologies, Global Banking & Finance Review, and the IoT Council of Europe, among others. He can be reached at firstname.lastname@example.org.
Blue Orange Digital is recognized as a “Top 10 AI Development and Consultant Agency” by Clutch and Yahoo Finance for innovations in predictive analytics, automation, and optimization with machine learning in NYC.
They help organizations optimize and automate their businesses, implement data-driven analytic techniques, and understand the implications of new technologies such as artificial intelligence, big data, and the Internet of Things.