AWS Redshift (Part 1)


This content originally appeared on DEV Community and was authored by Dorra B.

As an AWS solutions architect, you must set up a solution that helps the data analysts in your company process large volumes of historical data for some released products. The data scientists and the developers suggest collecting the results of all the queries for additional analytics with Amazon EMR, Athena and SageMaker. Which AWS solution can you use in this context?

To answer this question, you first need to know what type of database you are dealing with.

Generally, we can classify databases into two groups according to the approach they use, which in turn determines the kind of data we eventually want to extract:

1. Online Transaction Processing (OLTP) databases:

An OLTP database, such as one running on Amazon RDS, handles a high volume of short, simple transactions. OLTP databases rely on four main operations: Create, Read, Update and Delete.

For example, with RDS you can CREATE a table containing products and their corresponding prices, READ the content of the table, UPDATE the names or the prices of the products, and DELETE a product that you will no longer sell to customers.

2. Online Analytical Processing (OLAP) databases:

An OLAP database handles a relatively low volume of complex, long-running queries that typically involve aggregations. OLAP databases are used mainly for analytics.
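To make the contrast between the two workloads concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a local stand-in for a relational database. The table, its columns and the sample values are hypothetical, and nothing here connects to RDS or Redshift; the point is only the shape of the statements.

```python
import sqlite3

# In-memory SQLite database used as a local stand-in (assumption: not RDS or Redshift).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# OLTP-style work: short statements that each touch a few rows.
cur.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, category TEXT, price REAL)"
)
cur.executemany(
    "INSERT INTO products (name, category, price) VALUES (?, ?, ?)",
    [
        ("keyboard", "hardware", 40.0),
        ("mouse", "hardware", 20.0),
        ("editor license", "software", 90.0),
    ],
)
cur.execute("UPDATE products SET price = 25.0 WHERE name = 'mouse'")  # UPDATE
cur.execute("DELETE FROM products WHERE name = 'keyboard'")  # DELETE
remaining = cur.execute(
    "SELECT name, price FROM products ORDER BY id"
).fetchall()  # READ

# OLAP-style work: a single query that scans the table and aggregates it.
summary = cur.execute(
    "SELECT category, COUNT(*), AVG(price) "
    "FROM products GROUP BY category ORDER BY category"
).fetchall()

print(remaining)
print(summary)
conn.close()
```

The first group of statements is what an application issues many times per second; the final query is the kind of question an analyst asks occasionally, over far more rows than this toy table holds.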

From these definitions, it is clear that an OLAP database is required in our context. An example of an OLAP database on AWS is Redshift.

Redshift is fully managed by AWS. It is a petabyte-scale data warehouse service.

Unlike RDS and many other OLTP databases, which store data by rows, Redshift stores data by columns. It also uses advanced compression and massively parallel processing (MPP). AWS has advertised that this combination can make Redshift up to ten times faster than other data warehouses for analytical queries.
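The idea behind columnar storage and compression can be sketched in a few lines of Python. This is a toy illustration with hypothetical sample data, not Redshift's actual storage format, and run-length encoding is used here only because it is easy to read.

```python
from itertools import groupby

# Hypothetical sample data: three sales records.
rows = [
    {"product": "mouse", "region": "EU", "amount": 20},
    {"product": "mouse", "region": "EU", "amount": 25},
    {"product": "keyboard", "region": "US", "amount": 40},
]

# Row-oriented layout: each record is stored together, so summing one
# field still walks every full record.
total_from_rows = sum(record["amount"] for record in rows)

# Column-oriented layout: each field is stored as its own sequence, so
# an aggregate reads only the column it needs.
columns = {key: [record[key] for record in rows] for key in rows[0]}
total_from_columns = sum(columns["amount"])


def run_length_encode(values):
    """Collapse consecutive repeated values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Values within a single column tend to repeat, which is what makes
# column-level compression effective.
encoded_region = run_length_encode(columns["region"])

print(total_from_rows, total_from_columns)
print(encoded_region)
```

Both layouts give the same total, but the columnar version never reads the product or region values to compute it, and the repeated region values compress into fewer entries.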

Redshift helps you report on, visualize, and analyze collected data. You can save the results of your queries to an S3 data lake so that you can run additional analytics with AWS services such as Athena and SageMaker.
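One way to save query results to S3 is Redshift's UNLOAD command. The sketch below only builds the SQL text in Python and does not connect to a cluster. The helper name, the table, the bucket and the IAM role ARN are all hypothetical placeholders.

```python
def build_unload_statement(query: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Return a Redshift UNLOAD statement that writes query results to S3 as Parquet."""
    # UNLOAD takes the query as a quoted string, so single quotes inside
    # the query have to be doubled.
    escaped_query = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped_query}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS PARQUET;"
    )


# Hypothetical table, bucket and role names, for illustration only.
statement = build_unload_statement(
    query="SELECT product, SUM(amount) FROM sales WHERE region = 'EU' GROUP BY product",
    s3_prefix="s3://example-analytics-bucket/sales-summary/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftUnloadRole",
)
print(statement)
```

The resulting statement would be run from any SQL client connected to the cluster. Writing Parquet is a design choice for this sketch: it is a columnar format that Athena can query directly from S3.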

Although Redshift is fully managed by AWS, a cluster is deployed in a single Availability Zone by default, and it is not designed to ingest large volumes of data in real time.



Dorra B. | Sciencx (2021-11-12T15:03:25+00:00) AWS Redshift (Part 1). Retrieved from https://www.scien.cx/2021/11/12/aws-redshift-part-1/
