SenioITh677 - PeerSpot reviewer
Senior IT Officer- Head of Administration, System Administration Division for Unix and Linux Servers at a financial services firm with 10,001+ employees
Real User
A cost-effective alternative for managing our big data
Pros and Cons
  • "Now, using this solution, it is much cheaper to have all of the data available for searching, not in real-time, but whenever there is a pending request."
  • "I would like to see more support for containers such as Docker and OpenShift."

What is our primary use case?

We use this solution to look at and manage big data. It's mostly historical data that we offload from our data warehouse, as well as from other databases in other platforms.

We have two different installations. The first one is based on IBM POWER CPUs, and the other one is based on Intel CPUs. Our data center is on-premises. There is some thought of moving to a private cloud, or a private IBM cloud, but we have not proceeded with that yet.

How has it helped my organization?

This solution is a cheaper way for us to offload data that is otherwise expensive to keep. We can move data from outdated database versions, such as Oracle 10, which is now out of support but still hosts some of our historical data. This solution has helped us move our data to the current version.

Previously, we had our data on more expensive platforms. Now, using this solution, it is much cheaper to have all of the data available for searching, not in real-time, but whenever there is a pending request.

What needs improvement?

We have had problems with the backup and with services that require a disaster site. We are still struggling with some of these issues.

We are having trouble with Active Directory and Hive integration.

I would like to see more support for containers such as Docker and OpenShift.

For how long have I used the solution?

About a year and a half.
Buyer's Guide
Cloudera Data Platform
April 2025
Learn what your peers think about Cloudera Data Platform. Get advice and tips from experienced pros sharing their opinions. Updated: April 2025.
848,716 professionals have used our research since 2012.

What do I think about the stability of the solution?

We have had some issues with the code, but it's mostly from the developers. From our side, we don't see any issues with stability, although it may be that we have a lot of unused CPU capacity.

What do I think about the scalability of the solution?

We have not acquired any additional hardware since our initial purchase. However, we expect more use cases to be added, at which point we may have performance or scalability problems.

How was the initial setup?

The initial setup is not very difficult. The configuration is not easy, but somebody with some experience is able to set it up. We had users for whom we had to set up quotas and queues. For us, the basic installation was completed within about a week.

What about the implementation team?

We had IBM set up both of our installations. 

What other advice do I have?

This is a good product, but we still have some issues with backup and with the performance monitors that we install on every system. There may be solutions, but we're struggling to integrate them.

This is a product that I recommend. It's a solution that comes at a lower price, and it works well if you don't have expectations that it will behave like a much more expensive system.

I would rate this solution an eight out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Big Data - Senior Solutions Architect at a tech vendor with 10,001+ employees
Vendor
It is open and there is no lock-in.

What is most valuable?

We evaluated Cloudera and Hortonworks. Based on our evaluation and actual experience in production of 60 nodes and development of 12 nodes, the most valuable features of Hortonworks are:

  • 100% open
  • No lock-in like Cloudera
  • Fast, accurate support
  • Largest number of committers to Hadoop by any measure
  • Hive is better in performance and ease of use compared to Impala

How has it helped my organization?

It helps a lot with data in motion (ingesting and managing data in real time). We are able to deliver third-party monetization of our data to our end customers within a T+20 minute time frame.

What needs improvement?

  • Cost
  • Reliability
  • Speed
  • Ease of use

For how long have I used the solution?

I have used it for three years.

What was my experience with deployment of the solution?

I initially encountered deployment issues, but they were very good in resolving them.

What do I think about the stability of the solution?

I have not encountered stability issues.

What do I think about the scalability of the solution?

I have not encountered any scalability issues at all. That's the key reason we picked HDP over Cloudera, as Cloudera has issues and doesn't support compression of Hive data in ORC format. They push only their own products, which is not good.

How are customer service and technical support?

Customer Service:

Customer service has been excellent from day one until now, and our admin is comfortable with the SLA and turnaround time.

Technical Support:

Technical support is very good and proactive with SmartSense.

Which solution did I use previously and why did I switch?

We previously used a different solution: we switched from Cloudera. Initially, we went with Cloudera because it was a popular choice in the market, then realized it was a bad choice. Before we scaled from 6 nodes to 12 nodes and before we went live in production, we scrapped it due to Impala's performance and lock-in.

How was the initial setup?

Using Ambari, it was easy to set up, and we even tried AWS for a test cluster.

What about the implementation team?

An in-house team implemented it: two admins, seven developers, one data scientist, one PM and 22 business users at the customer (end-user side).

What was our ROI?

ROI is 300%.

What's my experience with pricing, setup cost, and licensing?

Comparing all three flavors, Hortonworks is the best. If all goes well, we might use open source alone within the next three years; with the others you can't, due to lock-in.

Which other solutions did I evaluate?

Before choosing this product, we also evaluated Cloudera.

What other advice do I have?

It is the best in terms of product vision and actual delivery.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Solution Architect at MIMOS Berhad
Real User
Top 20 Leaderboard
It gives us semantic analysis based on the feeds from social networking data, clickstream data, etc., but it needs to support disaster recovery features such as mirroring.

What is most valuable?

  • It's the one and only complete open source big data platform
  • Ambari-managed admin configuration for HDFS, YARN, Hive, HBase, etc.
  • Customized dashboards
  • Web-based HDFS browser
  • SQL editor for Hive
  • Apache Phoenix - OLTP and operational analytics on Hadoop
  • Apache Zeppelin - A web-based notebook that enables interactive data analytics

How has it helped my organization?

  • Maintenance of our own data lake at the enterprise level
  • Storage and analysis of server logs
  • Applying operational intelligence at the enterprise level, based on analysis of data from various department units
  • Semantic analysis based on the feeds from social networking data, clickstream data, etc.

What needs improvement?

  • Rolling upgrade
  • Disaster recovery features such as mirroring should be supported

For how long have I used the solution?

We've used it for one year.

What was my experience with deployment of the solution?

No issues encountered.

What do I think about the stability of the solution?

No issues encountered.

What do I think about the scalability of the solution?

No issues encountered.

How are customer service and technical support?

Customer Service:

3/10

Technical Support:

3/10

Which solution did I use previously and why did I switch?

No previous solution was in place.

How was the initial setup?

It's easy to set up.

What about the implementation team?

We did it in-house.

What's my experience with pricing, setup cost, and licensing?

We rely entirely on the community edition, along with other features that can be implemented on top.

Which other solutions did I evaluate?

No other solutions were looked at.

What other advice do I have?

Study, analyze, and compare the features of other big data platforms against your requirements before choosing the appropriate one.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
it_user347793 - PeerSpot reviewer
Principal Consultant - Big Data with 501-1,000 employees
Vendor
It is improving rapidly, but like other flavors of Hadoop there is room for improvement.

What is most valuable?

  • Ambari
  • Hive
  • Sqoop
  • Flume
  • Spark

How has it helped my organization?

The Hadoop value proposition is in expanded functionality, linear scalability, and reduced software and infrastructure costs. Hadoop offers several generic frameworks for batch, real-time, and iterative processing, such as MapReduce, Spark, and Spark Streaming. Additionally, these frameworks provide libraries for predictive analytics and machine learning. This type of expanded functionality is not easily achieved on any other single platform.

What needs improvement?

The file system should provide indexed access to individual records with in-place update and delete. Security should also be integrated through a common interface for authentication, authorization, disk encryption, network encryption, the data access layer, data masking, etc.

Hadoop does not provide improved performance compared to a traditional RDBMS unless it is processing batches in the TB-PB range or the Hadoop platform has significantly more resources available.

For how long have I used the solution?

I have implemented various flavors of Hadoop over the past five years, including platform configuration and application development.

What was my experience with deployment of the solution?

Deployment is improving rapidly, but like other flavors of Hadoop there are always issues.

What do I think about the stability of the solution?

Stability is improving rapidly, but like other flavors of Hadoop there are always issues.

How are customer service and technical support?

5/10 - Responsive, but like all flavors of Hadoop, there are too many tickets to be reasonably triaged and supported.

Which solution did I use previously and why did I switch?

I have implemented various flavors of Hadoop such as Hortonworks and Cloudera over the past five years, including platform configuration and application development.

How was the initial setup?

Straightforward once you know what you’re doing.

What about the implementation team?

I work for a vendor team.

What was our ROI?

ROI is one of the main reasons organizations pursue Hadoop. Cost per TB is a compelling factor.

Which other solutions did I evaluate?

Hadoop is complex. It takes a dedicated approach from individuals with a broad range of technology skills and commitment to overcome challenges that do not normally present themselves in well-established technologies.

What other advice do I have?

Hadoop is complex. It takes a dedicated approach from individuals with a broad range of technology skills and commitment to overcome challenges that do not normally present themselves in well-established technologies.

Disclosure: My company has a business relationship with this vendor other than being a customer: We're partners.
PeerSpot user
Infrastructure Engineer at Zirous, Inc.
Real User
Top 20
It's increased the amount of data that we store from sensor data and weblogs, which gives us a greater scope of data to analyze. However, I'd like to see an increase in usability for Apache Storm.

What is most valuable?

The HDFS (Java-based file system) and Hive Utilities are proving to be most useful.

How has it helped my organization?

Hortonworks has allowed my organization to increase the amount of data that we regularly store from sensor data and weblogs, which in turn gives us a greater scope of data to analyze.

What needs improvement?

I would like to see an increase in usability for the Apache Storm engine within the data platform.

For how long have I used the solution?

I have been using it for less than a year.

What was my experience with deployment of the solution?

When initializing our cluster, we did not allocate enough space to our VAR partition and that ended up causing some issues with the networking to our onsite Tomcat server.

How are customer service and technical support?

Customer Service:

The level of customer service is fairly low.

Technical Support:

The level of technical support is fairly low.

Which solution did I use previously and why did I switch?

We started off with this product.

How was the initial setup?

It was both straightforward and complex: everything was easy to set up, but many of the behind-the-scenes configuration changes for customization could be rather time-consuming.

What about the implementation team?

We used an in-house team. My advice is to study hard and read all the documentation thoroughly before starting any implementation. It is paramount that one understands the system before implementing it.

What was our ROI?

There is currently no ROI, as we are still in the POC phase with most of our products.

What other advice do I have?

Be sure that the product is necessary for the situation.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
it_user344022 - PeerSpot reviewer
Data science engineer at a tech services company with 501-1,000 employees
Consultant
We are capable of processing various data science tasks, e.g. natural language processing or log processing.

What is most valuable?

  • Open-source
  • Big community

How has it helped my organization?

It is a different paradigm from standard relational databases. We can also process tasks other than those related to the standard database world. For example, we are capable of processing various data science tasks, such as natural language processing or log processing.

What needs improvement?

  • Stability
  • It needs to be more mature
  • Security
  • User friendliness

For how long have I used the solution?

I've used it for three years alongside MapR and Cloudera.

What was my experience with deployment of the solution?

Almost every part of the Hadoop ecosystem has its problems and bugs.

What do I think about the stability of the solution?

Almost every part of the Hadoop ecosystem has its problems and bugs.

What do I think about the scalability of the solution?

Almost every part of the Hadoop ecosystem has its problems and bugs.

How are customer service and technical support?

The paid service is pretty good, but if you don't pay, there is documentation available in the community which is pretty good.

Which solution did I use previously and why did I switch?

I have some experience with Cloudera, which is very similar to Hortonworks, but there are parts of it which are not open source. I'm working more with Hortonworks because all its parts are open source and my company has a longer partnership with Hortonworks.

How was the initial setup?

It is easy if you have good administrators. It is also easy if you want to just play with it on your laptop. For real work and stability, I definitely recommend some paid support.

What about the implementation team?

I was involved in multiple projects. Usually, it was done in-house with paid support.

What was our ROI?

Every project is different. Since Hadoop is long-term infrastructure, there is no simple ROI. Also, each customer has different expectations.

Disclosure: My company has a business relationship with this vendor other than being a customer: We have a partnership with all major Hadoop vendors.
PeerSpot user
Manager at a tech services company with 201-500 employees
Real User
Leaderboard
A seamless solution with a solid workflow
Pros and Cons
  • "The data platform is pretty neat. The workflow is also really good."
  • "It would also be nice if there were less coding involved."

What is our primary use case?

We use this solution for the hospitality industry. 

How has it helped my organization?

We used it for end-to-end data processing and data manipulation.

What is most valuable?

The data platform is pretty neat. The workflow is also really good. 

What needs improvement?

The NiFi platform could be enhanced. This refers to the data ingestion in a workflow. 

It would also be nice if there was less coding involved. 

For how long have I used the solution?

I have been using this solution for six years. 

How are customer service and support?

The technical support is okay, but not excellent. They can take a while to respond. 

What other advice do I have?

If you wish to use this solution, make sure you compare it with some other solutions first to make sure it's right for your needs. 

Overall, on a scale from one to ten, I would give this solution a rating of nine.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
it_user346956 - PeerSpot reviewer
Cyber Security and Analytics Engineer at a government with 1,001-5,000 employees
Vendor
We can collect data from different databases, and where the data is similar, it allows for a detailed analysis from a single data store. It could improve, though, on the ability to update data.

Valuable Features

Ease of deployment and management of the Hadoop cluster are features we've found most valuable.

Improvements to My Organization

It allows our organization to collect data from databases that are different, and where the data is similar, it allows for a detailed analysis from a single data store.

Room for Improvement

The ability to update data is an area where the product could improve.

Use of Solution

I've used it for one year.

Deployment Issues

We had an issue during deployment. You have to be sure that your base image is perfect and that your infrastructure is properly configured or issues will occur.

Customer Service and Technical Support

We don't have their paid support, but I have had discussions with their engineers and they have been extremely helpful. So based on that, I would give them 8/10.

Initial Setup

The initial setup is complex. It mainly stems from small issues that typically pop up and also a lack of experience in deploying the product. I highly suggest taking the Hortonworks Training prior to deploying.

Implementation Team

We used an in-house team. Take your time and utilize the free resources provided by Hortonworks.

ROI

At this point, I don't believe I could provide an ROI, as we aren't fully utilizing the product.

Pricing, Setup Cost and Licensing

If possible, I would suggest paying for the professional services which would give you on-site engineers to help deploy the cluster.

Other Solutions Considered

We did look at Cloudera, but due to having literally no money to spend for the project, we chose Hortonworks due to its being completely free and open source.

Other Advice

Take your time and script as much as you can so that all base images are the same.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Buyer's Guide
Download our free Cloudera Data Platform Report and get advice and tips from experienced pros sharing their opinions.
Updated: April 2025