Cloud Data Architect (AWS-Snowflake-Teradata-Oracle) at a consultancy with 10,001+ employees
The Redshift Spectrum is the most valuable feature, but the solution needs to be more optimized
Pros and Cons
- "I have primarily used the Redshift Spectrum feature and found it most valuable."
- "The solution is unable to work fast."
What is our primary use case?
We use Redshift Spectrum for creating temp tables during the Ignition process.
What is most valuable?
I have primarily used the Redshift Spectrum feature and found it most valuable.
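As a rough illustration of the pattern this reviewer mentions (reading S3-backed data through Redshift Spectrum and materializing it into a temporary table), the Python sketch below only builds the SQL text. The schema, table, catalog database, and IAM role names are hypothetical placeholders, not details from the review, and running the statements against a cluster is left out.

```python
def spectrum_temp_table_sql(ext_schema, ext_table, temp_table, catalog_db, iam_role_arn):
    """Build SQL that exposes a data catalog database through Redshift Spectrum
    and copies one external table into a session-scoped temp table.

    All argument values are illustrative placeholders supplied by the caller.
    """
    # Map an external (Spectrum) schema onto a data catalog database.
    create_schema = (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {ext_schema}\n"
        f"FROM DATA CATALOG DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}';"
    )
    # Materialize the external table into a temp table for later processing.
    create_temp = (
        f"CREATE TEMP TABLE {temp_table} AS\n"
        f"SELECT * FROM {ext_schema}.{ext_table};"
    )
    return [create_schema, create_temp]


if __name__ == "__main__":
    for stmt in spectrum_temp_table_sql(
        "spectrum", "events", "tmp_events", "lake_db",
        "arn:aws:iam::123456789012:role/ExampleSpectrumRole",  # hypothetical ARN
    ):
        print(stmt)
```

The temp table lives only for the session, which suits intermediate processing steps where the staged data does not need to persist.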
What needs improvement?
During our last office project, Redshift couldn't perform well even for a data size of 6 TB. Thus, compared to Teradata and Snowflake, the solution needs to work faster. They should extend the plan by including better optimization and readability, as we get while using Teradata. Also, they should provide zero-copy cloning and data-sharing facilities.
For how long have I used the solution?
I have been using the solution for three months.
Buyer's Guide
Amazon Redshift
December 2024
Learn what your peers think about Amazon Redshift. Get advice and tips from experienced pros sharing their opinions. Updated: December 2024.
824,067 professionals have used our research since 2012.
What do I think about the stability of the solution?
I rate the solution's stability as a seven.
What do I think about the scalability of the solution?
There is no issue with the solution's scalability.
How was the initial setup?
The setup was straightforward as I have a POC. It was a simple process.
What's my experience with pricing, setup cost, and licensing?
The solution is available at a mid-range price as compared to other vendors.
What other advice do I have?
While using Redshift, we need to combine it with Glue to complete the process, whereas Databricks offers the same procedure without combining two solutions. Redshift would work well with small businesses if they already use AWS services. They can use Redshift if the database is not that huge. I recommend Snowflake over Redshift. I rate the performance as well as the overall product as a five.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
Senior Director of Product Management at Sprinklr
Operates as a reliable Amazon service and has the capability to gather data from various Amazon sources and can be easily integrated with some maintenance configuration and code
Pros and Cons
- "Redshift is a major service of Amazon and is very scalable. It enables faster recalculations and data management, helping to retrieve data quickly."
- "Working with third-party services requires additional integrations and configurations, which can sometimes add more cost."
What is our primary use case?
I used it as part of the Amazon Connect integration; I had to implement Redshift for a couple of customers. It's used for various use cases involving reporting and exporting data to external sources. I have also used it for some analytics integrations.
The use cases I have typically worked on involve transferring Amazon Connect data to different systems for analytics. The two or three deployments I have done with Redshift are more or less similar because it acts as a kind of data middleware.
Redshift effectively gathers data from various sources and facilitates the integration of that data into different destinations. This is typically used for insights collection, data showcasing, and integration into a standard ETL process.
How has it helped my organization?
So, the overall performance and speed of Redshift have affected the query times.
For the use cases I worked on, particularly on the Connect side, the query times with Redshift are pretty straightforward. We started using Redshift for these cases, and it significantly helped. To achieve faster results from Redshift, we first need to optimize the queries. It does reduce a lot of time in how data is gathered and then presented from the queries.
What is most valuable?
For me, the most valuable feature of Redshift is the way it operates as a reliable Amazon service. It has the capability to gather data from various Amazon sources and can be easily integrated with some maintenance configuration and code; Lambda functions are required for this. It can be used in multiple places.
It all depends on the use cases, how we can actually ship the data, and how we can use the data from multiple sources. It is a typical reliable software and works very efficiently with Amazon.
For Amazon Connect combined with Redshift, the integration is mostly straightforward. Using Redshift always depends on the use cases, as there are other methods Amazon Connect can use to achieve its goals. As for Redshift itself, it can be used to build pipelines.
What needs improvement?
Working with third-party services requires additional integrations and configurations, which can sometimes add more cost.
From the Amazon Connect side of things, we have integrated Redshift. However, as an overall product, I have limited experience.
But from what I have experienced, whenever we do a Redshift integration, it needs to be planned carefully because although Amazon supports multiple data sources and different data consumption, Redshift needs to be configured very effectively and requires dedicated shared knowledge for successful deployments.
What do I think about the scalability of the solution?
Redshift is a major service of Amazon and is very scalable. It enables faster recalculations and data management, helping to retrieve data quickly. It’s a relatively old service within Amazon's offerings, with at least 10,000 customers. I've seen cases in different organizations where users experienced up to a 35x increase in throughput while using Amazon Redshift.
How was the initial setup?
It's pretty much straightforward. I just need some sort of configuration and a bit of integration, and then that's it. We should be able to get that done.
For first-time usage of Redshift, the process is pretty straightforward, thanks to the documentation provided by AWS and the straightforward integration with Amazon Connect.
It didn't take me much time to create, deploy, and configure. It’s very straightforward. However, having some prior knowledge about Redshift can speed up the process significantly.
For me, coming from a different background and learning about Redshift for the first time, I ended up reading some database documentation and doing some trials and testing before committing the production data.
What other advice do I have?
For someone who knows a bit about how databases and data warehousing work, it's quite straightforward to learn Redshift. It's easier for those involved in analysis, reporting, and ETL data warehousing, specifically database developers or data warehousing developers; they can learn it faster.
However, for someone without this background, it might take a bit more time to understand the concepts and how they integrate in different ways.
Overall, I would rate it an eight out of ten because it has been straightforward for my use cases. It's easy to integrate for those use cases.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Apr 29, 2024
Data Analyst Lead at Vectornator
A cost-effective warehouse solution that needs to improve the access limitations
Pros and Cons
- "The solution has very competitive pricing."
- "It would be good to see Redshift as a serverless offering."
What is our primary use case?
Redshift is an AWS warehouse solution. We have structured datasets, and we don't load all the Amplitude data into Redshift. We first do this via Hudl, a data integration solution partner, but then later, it's directly loaded by an integration. Then we run dbt against Redshift. We have our data models in dbt, and we run data analytics tests against the data warehouse.
What is most valuable?
Service accounts are used in both Amazon Redshift and Google Cloud. For example, I could create a service account for my desktop to access Redshift or a service account for multiple users to access Redshift. In BigQuery, creating a service account is very simple, and you get full control over the access, so you can limit what the service account can do. This prevents accidental exposure of data or deletion of data. Only certain features are available, which is very handy.
Postgres syntax requires 25 synthetic scrubs to Postgresify. It's handy, but there are no blockers when using the query. It's more competitive, but the price is very reasonable. I was always aware of what I would pay, and if I reserved servers, I knew what it would cost. There is no alternative in choosing a solution. We had to use the server version of AWS, but it had limited features. A few features were lacking, which couldn't front Redshift against it or access it from the API. We had our nodes, which were sent from Amazon. It has a minimal setup, with two services running only.
It was predictable because the performance was good. When a complex dbt model was running, we reached its limits. If there was a one-node setup, not all the storage was available on the server. For example, in a machine with 72 gigabytes of storage, only four were available in a single setup. I had another node with 64 GB. All the storage of the two servers was available, and when you are running these complex queries, it's not only a bit of computing; it also temporarily eats up the storage. I couldn't use a single server because temporary tables ate up the storage. BigQuery’s authentication is straightforward. Besides that, it's doing what it's expected to do. There are no major problems.
What needs improvement?
It would be good to see Redshift as a serverless offering. The proposition may be unclear, but at the time, there were certain limitations with the pay-as-you-go offering. However, a serverless offering would be more flexible on-demand pricing, which would be good to see because Redshift is not expensive, but I always have to buy a new server if I need more computing than I have. Setting up a new server is an easy task, but it would be better if I could scale my Redshift cluster up or down as needed; still, there is a need for manual control. For example, my analyst team is working on a job that requires a lot of computing and is only needed for this month, week, or even today. The job should scale up and down automatically, but it is not yet fully developed.
For how long have I used the solution?
I have been using Amazon Redshift for one and a half years.
What do I think about the stability of the solution?
We've had some cases where queries would get stuck, and we'd be on them for ages. I don't have the transparency to see what other queries are already running or if we're running out of some kind of resource. There weren't many major problems, but sometimes we'd get these annoying issues, especially when running complex queries.
What do I think about the scalability of the solution?
If we can immediately set up new servers, it's easy to do, but an automatic solution or a threshold would be ideal. This feature may be already available, but I'm not sure. We have three users using this solution. I rate the solution’s scalability a seven out of ten.
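The "threshold" idea mentioned here could be sketched roughly as follows. This is a hypothetical helper, not a Redshift feature or API: the function name, the CPU-utilization input, and every threshold and node limit are assumptions chosen for illustration. It only computes a target node count; applying that count to a cluster would be a separate step.

```python
def target_node_count(current_nodes, cpu_utilization,
                      scale_up_at=0.80, scale_down_at=0.30,
                      min_nodes=2, max_nodes=8):
    """Suggest a node count from a utilization reading (0.0 to 1.0).

    All thresholds and limits are illustrative assumptions.
    """
    if cpu_utilization >= scale_up_at:
        desired = current_nodes + 1      # busy: add one node
    elif cpu_utilization <= scale_down_at:
        desired = current_nodes - 1      # idle: remove one node
    else:
        desired = current_nodes          # within band: no change
    # Clamp the suggestion to the allowed cluster size range.
    return max(min_nodes, min(max_nodes, desired))


if __name__ == "__main__":
    print(target_node_count(2, 0.90))
    print(target_node_count(4, 0.10))
```

Stepping by one node at a time keeps the sketch simple; a real policy would also need a cooldown period so that short spikes do not trigger repeated resizes.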
How are customer service and support?
Amazon Redshift support is not always available, so it can be challenging to reach them. You have to buy time and schedule with them. There is no real need for a technical hub, but it is not there when there is a need.
How was the initial setup?
The initial setup wasn't very complex.
What's my experience with pricing, setup cost, and licensing?
The solution has very competitive pricing. It can be expensive for the first time when you are building your site. Time and the amount of data also take some time to downsize. It would be cheaper than to have a server, but for Plexigos storage, you have to buy a specific size of compute power. Initially, it was more expensive than BigQuery pay-as-you-go, but it got cheaper later. The more data you have, the cheaper the relative ratio becomes. It depends on the use case. In AWS, you must invest and understand the setups, such as what kind of servers you need. Then, you can set up your own, which can be very cheap. Redshift can be engineering-focused to set up, which is not ideal. Azure and Google Cloud are more efficient for data analysts who are not data engineers. But it can be effective once you get used to it and set up a process. If you are utilizing the on-demand stuff, Redshift is the only vendor offering a dedicated service.
What other advice do I have?
From time to time, the solution needs to be restarted for maintenance. I recommend BigQuery over Amazon Redshift. I don't have experience with Snowflake, but it's said to be more feature-rich than BigQuery or HSA. I was happier using BigQuery. Redshift is doing what it's expected to do, but you have to invest in learning the setup. Overall, I rate the solution a seven out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Solutions Architect at a hospitality company with 501-1,000 employees
Simple to configure with cost-effective managed service but limitations from a business intelligence perspective
Pros and Cons
- "Its simplicity in configuration, cost-effectiveness due to being in the cloud and close to our data sources, and the fact that it's a managed service that is scalable and reliable are highly valuable."
- "There might be some limitations from a business intelligence perspective, but nothing we can't find a workaround for."
What is our primary use case?
We use Amazon Redshift in our business intelligence ecosystem. It's simple to configure, cost-effective, and close to our data sources.
How has it helped my organization?
The managed service is scalable and reliable. AWS takes away scalability and reliability components, making it relatively easier for us.
What is most valuable?
Its simplicity in configuration, cost-effectiveness due to being in the cloud and close to our data sources, and the fact that it's a managed service that is scalable and reliable are highly valuable.
What needs improvement?
There are no significant issues preventing us from doing our tasks. However, there might be some limitations from a business intelligence perspective, but nothing we can't find a workaround for.
For how long have I used the solution?
We have been using it for five years or more.
What do I think about the stability of the solution?
We are happy with it, so there are no major stability issues that stand out.
What do I think about the scalability of the solution?
AWS handles scalability and reliability, making it easier for us to manage.
How are customer service and support?
We have two people to continue with support.
How would you rate customer service and support?
Positive
How was the initial setup?
Setting it up was straightforward due to its simplicity and being a managed service.
What about the implementation team?
AWS handles the scalability and reliability components, making it easier to implement.
What other advice do I have?
Ensure that information about specific configurations and internal uses remains anonymous.
I'd rate the solution seven out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Sep 29, 2024
Head of Big Data Department at IBA Group
Provides excellent features, enables fast reporting, and can be deployed easily
Pros and Cons
- "Redshift Spectrum is the most valuable feature."
- "The product must become a bit more serverless."
What is our primary use case?
We use the solution for data storage of reports.
What is most valuable?
Redshift Spectrum is the most valuable feature.
What needs improvement?
The product must become a bit more serverless. Users should have to pay only for the resources they consume.
For how long have I used the solution?
I have been using the solution for one year.
What do I think about the stability of the solution?
The tool is quite stable.
What do I think about the scalability of the solution?
Around 20 people in our organization use the product. The tool’s scalability is good.
How was the initial setup?
The solution is deployed on the cloud. The initial setup was pretty easy.
What's my experience with pricing, setup cost, and licensing?
The product is quite expensive.
Which other solutions did I evaluate?
We also tried using Athena. However, Redshift was faster.
What other advice do I have?
We use the tool because we have everything on AWS. Amazon Redshift is best for fast reporting. People who want to use the solution must try using Athena. If it is not fast enough, they can try Redshift. Overall, I rate the product an eight out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Sr BI and Data Engineer at Datacult
Good for data warehousing but complex setup
Pros and Cons
- "The most valuable feature of Redshift is its cluster."
- "The initial setup is a complex process, especially for someone who is not familiar with nodes and configuring terms like RPUs."
What is our primary use case?
We use it for data warehousing. Currently, I'm setting up a data link with Redshift to fetch data from our data lake.
What is most valuable?
The most valuable feature of Redshift is its cluster.
What needs improvement?
Redshift's serverless technology needs to improve because not everyone is technically inclined. Organizations want to quickly access and import data into their data warehouse without hassle.
Redshift's ETL tool, Glue, is not seamlessly integrated with Redshift. I've encountered many instances where it couldn't fetch the perfect data type from the source, which should be intuitive. Snowflake's ETL tool, on the other hand, is more intuitive and seamless.
For how long have I used the solution?
I have been using this solution for two years. I am working with the latest version.
What do I think about the stability of the solution?
I haven't faced any stability issues because when it works, it runs continuously.
How was the initial setup?
The initial setup is a complex process, especially for someone who is not familiar with nodes and configuring terms like RPUs. You need to consult the documentation to understand what an RPU is.
Moreover, Redshift can be difficult to maintain, especially the Redshift cluster instance.
What about the implementation team?
When it comes to the initial deployment and implementation process of Redshift, there are two types of nodes to choose from: DC2 and RA3, which are for different requirements based on the load. One is for storage, one is for storage and checking, and one is for the computing center.
First, the user needs to know their exact requirement, unlike Snowflake, which automatically scales up and down based on the requirement using the Retrieval Service tool.
The service has not matured yet, and for the Redshift cluster, scaling has to be done manually. The cluster also needs to be set up manually, which is not ideal, especially when Snowflake is already in the market.
It is easy to deploy if you already know how to use Redshift. But if I were a new customer, I might need assistance.
What's my experience with pricing, setup cost, and licensing?
Redshift is a bit less costly than Snowflake, but the effort justifies the cost for Snowflake.
What other advice do I have?
I would suggest starting with a three-node cluster of dc2.large nodes, especially if you are setting up a cluster-based search. We offer a three-month or one-month trial, which will allow you to see if you can handle the manual scaling up, scaling down, and maintenance of Redshift. If not, then you can switch to a serverless data solution.
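As a minimal sketch of that starting point, assuming the reviewer means three nodes of the `dc2.large` node type, the snippet below only assembles the parameters for such a cluster. The key names follow the keyword arguments of the boto3 Redshift client's `create_cluster` call; the identifier and credentials are hypothetical placeholders, and the actual API call is deliberately left out.

```python
def three_node_dc2_params(cluster_id, admin_user, admin_password):
    """Assemble parameters for a three-node dc2.large cluster.

    The values passed in are placeholders supplied by the caller; nothing
    here contacts AWS.
    """
    return {
        "ClusterIdentifier": cluster_id,
        "ClusterType": "multi-node",   # more than one node
        "NodeType": "dc2.large",       # node type suggested in the review
        "NumberOfNodes": 3,
        "MasterUsername": admin_user,
        "MasterUserPassword": admin_password,
    }


if __name__ == "__main__":
    params = three_node_dc2_params("example-cluster", "admin_user", "ChangeMe123")
    # The password is omitted from the printout on purpose.
    print({k: v for k, v in params.items() if k != "MasterUserPassword"})
```

Keeping the parameters in a plain dictionary makes it easy to review the node type and count before anything is provisioned.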
Overall, I would rate it a seven out of ten.
Disclosure: My company has a business relationship with this vendor other than being a customer: Integrator
DWH, BI & Big Data consultant / developer /modeler - independent contractor at Freelancer
Helps us create SQL ETL procedures in a business system
Pros and Cons
- "I like it because the usage is very similar to Microsoft SQL Server. The structure of the query and the temporary tables are very similar."
- "The explain panel in the Redshift database could be better."
What is our primary use case?
I use Amazon Redshift for the creation of SQL ETL procedures in a business system. Business people check this in a front-end application, and it helps them plan sales for the next year.
Redshift is being deployed on a Microsoft Azure server.
There are about six people working on this project and using the solution, but there are many similar projects running on Oracle and Redshift.
What is most valuable?
I like it because the usage is very similar to Microsoft SQL Server. The structure of the query and the temporary tables are very similar. Until recently, I thought it was the superior database, but now I think that Redshift is better.
What needs improvement?
The explain panel in the Redshift database could be better.
For how long have I used the solution?
I have used this solution for 10 months.
What do I think about the stability of the solution?
The solution is stable. I haven't had any problems or downfall with the database in the 10 months that I have used the solution.
Which solution did I use previously and why did I switch?
I have also used Microsoft SQL Server and Oracle.
How was the initial setup?
Setup was difficult because I had to set up 25 connections with different users and passwords. The connections have been predefined, but there were still problems when trying to connect for the first time. I had some problems with some certifications that were malfunctioning. This might have had something to do with the functionality of my keyboard because if I pushed a random combination on the keyboard, it would delete the certificate from the folder and the connection wouldn't work. I think this is a problem with the remote desktop rather than with Redshift.
What other advice do I have?
I would rate this solution as eight out of ten. I can't give it a higher score because there are some issues with variable character columns in the table. Otherwise, it's a great database.
Some of my former colleagues from a previous job have joined my organization, and they have had some issues with their SQL because some things work differently in Redshift, like PARTITION BY. If someone has issues with Redshift, my advice is to check with support.
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Service Manager & Solution Architect at a logistics company with 10,001+ employees
Easy to use and simple to setup, but the performance is low, and there is no tool to support the CDC
Pros and Cons
- "It is quite simple to use and there are no issues with creating the tables."
- "It takes a lot of time to ingest and update the data."
What is our primary use case?
We stored all of the data in the S3 bucket and would like to have it stored in a data warehouse, which is why we chose this database.
It would be very easy for us as an end-user, who would like to access the data, rather than draw it post-transformation and store it at a database level.
What is most valuable?
The TP transactions for the creation of the tables does very well.
It is quite simple to use and there are no issues with creating the tables.
What needs improvement?
The performance of managing updates, deletes, and row-level changes is very low. For example, while you are doing inserts, updates, deletes, and amalgamates, the performance is very, very poor.
If you want to query the database after you have a lot of terabytes of data, the load, performance-wise, is very low.
Looking at the performance of the query, querying the database, and especially with the amalgamates when it is getting updated, it is really poor.
We like this solution and have tried all of the native services; they were working quite well. The only concern about Redshift was managing the cluster, especially the EMR cluster. Our company policy was not to use EMR clusters, especially with the nodes failing. There were many instances of downtime happening. Essentially, there was too much data traffic.
The other drawback was the CDC, as we do not have any tools that can support it.
Creating the structure is easy on the DDL side, but after you create the table and you want to transform the data to store it in a database, the performance is poor.
It takes a lot of time to ingest and update the data. After you ingest the data and someone wants to fetch it in the table, it takes a lot of time performance-wise to return the results.
For how long have I used the solution?
We have been using this solution for three months.
We are using the latest version.
What do I think about the stability of the solution?
There are issues with stability and it should be compared with Snowflake.
What do I think about the scalability of the solution?
This solution is scalable. We scale up and scale down manually when we are required to, we do not have an automatic setup.
We have three or four people using this solution.
How are customer service and technical support?
We have contacted technical support to give our opinion and recommendations or feedback and they agreed that it needs improvement.
Which solution did I use previously and why did I switch?
Previously, we tried the Snowflake database, which works really well. The expectations were really good with the performance, also the DDL, DML operations on the processing of the data.
How was the initial setup?
The initial setup is simple and we did not find it very complex at all.
The time it takes to deploy depends on how many tables you want to create, or how many tables you will merge the data with.
Which other solutions did I evaluate?
We are switching to Azure, although not because of the product or the services that we did not like. It's about AWS being competitors for logistic companies that we are working with. Also for security reasons, we do not know how secure the data is on the cloud.
If you are competitors then you don't know if the data can be accessed by your competitor, and the team can be looking at a demographic, which could impact your sales.
What other advice do I have?
We have only just started using Redshift, but we are not really satisfied with it.
I would rate this solution a six out of ten.
Which deployment model are you using for this solution?
Private Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.