Forget About Managing Data Warehouse Infrastructure: Use Amazon Redshift Serverless
Data analytics use is on the rise and organizations are constantly searching for ways to remove the hurdles that limit access for team members with minimal expertise. Data warehouses are necessary systems used to report and analyze data, but they require quite a lot of learning. Not everyone has the time and capability to learn how to operate them.
Amazon states that developers, business professionals, and data analysts don’t have to bother learning them. There’s a solution rocking the scene, called Amazon Redshift. Redshift allows its users to work with data across data warehouses, databases, and data lakes by using SQL. However, today we’re taking Redshift one step further.
Meet Amazon Redshift Serverless. As the name suggests, it runs your analytics in the cloud without any servers or clusters for you to manage. All it asks for are your data and your queries. No more tedious cluster setup and management, and no more paying for a data warehouse that sits inactive. You’re charged only for the seconds when the database is actively loading or querying data.
Features of Amazon Redshift Serverless
Amazon Redshift Serverless automatically provisions data warehouse capacity and scales it to match demand, so business data can be analyzed and reported on without manual tuning. Working through a serverless endpoint, it adjusts capacity within seconds to maintain performance for even highly complex operations, regardless of the workload.
The serverless endpoint brings several other useful features:
- Users don’t have to set up or manage Amazon Redshift provisioned clusters to access and analyze data.
- Querying across data lakes, data warehouses, and operational data sources is fairly simple thanks to the advanced Amazon Redshift SQL functions and its lake house architecture.
- It charges only for the amount of time (in seconds) when the data warehouse is actively processing queries.
You control how data flows into and out of the serverless data warehouse through a console interface, where you can work with data from the Amazon S3 data lake or Redshift managed storage.
Using Amazon Redshift Serverless
Now, let’s get to the part of using the Amazon Redshift console. Using the console requires the right IAM (Identity and Access Management) permissions, so start by attaching a policy that grants access to Amazon Redshift Serverless actions to your IAM user or role.
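As a rough sketch of what such a policy might look like, the Python snippet below builds one that allows every `redshift-serverless` action on all resources. The wildcard scope and the function name `build_serverless_access_policy` are assumptions made for illustration, not the article's exact policy, and a real deployment would likely narrow both the actions and the resources.

```python
import json


def build_serverless_access_policy() -> dict:
    """Return an IAM identity policy document for Redshift Serverless.

    Assumption: wildcard actions and resources, chosen only to keep the
    illustration short. Tighten both before using anything like this.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "redshift-serverless:*",
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON that would be pasted into the IAM policy editor.
    print(json.dumps(build_serverless_access_policy(), indent=2))
```

The printed JSON can be pasted into the IAM console's policy editor and attached to the user or role that will open the Redshift console.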
Choose the Amazon Redshift console from the AWS Management Console and then select Try Amazon Redshift Serverless. The first time you open the serverless endpoint console, if your AWS identity and IAM permissions are correct, you’ll see the Get Started with Amazon Redshift Serverless page.
You are presented with two options to choose from: Use the default settings or Customize Settings. The second allows you to create a serverless database or endpoint on your own. Selecting the second option displays a number of settings you can customize.
- Admin User Credentials: The login information for the initial database administrator (user name and password). This user is granted ownership permission of the database.
- Database Name: The initial (default) database is named ‘dev’ and resides in your AWS Region. You cannot change the name, but your account owns the database.
- Virtual private cloud (VPC): The details of the VPC which holds the created database.
- VPC Security Groups: The IP range and subnets accepted by the VPC are defined by these security groups.
- Subnet: The subnets within the VPC that are associated with the database.
- Customize Encryption Settings: By default, the system uses an AWS-owned KMS key for data encryption. However, you can choose a KMS key that you manage yourself.
- Audit Logging: A list of audit log types that you would like to export. Amazon Redshift Serverless can export user, user activity, and connection logs.
- Permissions: The IAM role linked with the serverless endpoint must have a trust relationship with redshift.amazonaws.com and redshift-serverless.amazonaws.com.
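The Permissions requirement above can be illustrated with a role trust policy. The following is a minimal sketch, assuming a standard IAM trust policy in which both service principals are allowed to call `sts:AssumeRole`; the helper name `build_trust_policy` and the constant `TRUSTED_SERVICES` are hypothetical and exist only for this example.

```python
import json

# Service principals named in the Permissions setting.
TRUSTED_SERVICES = (
    "redshift.amazonaws.com",
    "redshift-serverless.amazonaws.com",
)


def build_trust_policy(services=TRUSTED_SERVICES) -> dict:
    """Return a trust policy letting the given services assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": list(services)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON for the role's "Trust relationships" tab.
    print(json.dumps(build_trust_policy(), indent=2))
```

This document goes in the role's trust relationship, which is separate from the permission policies that decide what the role may do once it has been assumed.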
Before using the serverless endpoint, you have to wait a few minutes for Amazon Redshift Serverless to finish initializing resources for your account. Once this process is done and the environment is set, the Amazon Redshift query editor opens, ready for you to work in.
Final Thoughts
The pricing model is one of the features that set Amazon Redshift Serverless apart from other services. Data warehouse compute capacity is measured in Redshift Processing Units (RPUs). You aren’t billed for capacity that sits idle; instead, you pay for the capacity your workloads consume while they run, measured in RPU-hours and metered per second.
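To make the RPU-hour arithmetic concrete, here is a small sketch that converts RPUs and active seconds into RPU-hours and then into a cost. The function name `estimate_cost` and the price used in the example are hypothetical placeholders, not actual AWS rates, and the sketch ignores any minimum charges or storage fees that may apply.

```python
def estimate_cost(rpus: float, active_seconds: float, price_per_rpu_hour: float) -> float:
    """Estimate compute cost from capacity used and active time.

    rpus:               capacity in use while the workload runs
    active_seconds:     seconds spent actively loading or querying
    price_per_rpu_hour: hypothetical price; look up the real regional rate
    """
    if rpus < 0 or active_seconds < 0 or price_per_rpu_hour < 0:
        raise ValueError("inputs must be non-negative")
    # Per-second metering: convert seconds of activity into RPU-hours.
    rpu_hours = rpus * active_seconds / 3600.0
    return rpu_hours * price_per_rpu_hour


if __name__ == "__main__":
    # 32 RPUs busy for 90 seconds at a placeholder price of 0.50 per RPU-hour.
    print(round(estimate_cost(32, 90, 0.50), 4))
```

With these placeholder numbers the workload uses 0.8 RPU-hours. An idle warehouse contributes zero active seconds and therefore adds nothing to the compute bill.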
Data science holds the key to an organization’s success. Blue Orange Digital assists businesses in multiple industries, from healthcare, real estate, and agriculture to financial services, in extracting and presenting data for better data-based decisions.