What is our primary use case?
We use Apache Hadoop for analytics purposes.
What is most valuable?
The ability to take a large amount of data and deliver the appropriate slices and summary charts is the most valuable capability I have found.
This stands in contrast to some of the other tools that are available, such as SQL and SAS, which likely cannot handle such a large volume of data. Even R, for instance, struggles at that scale.
Apache Hadoop manages large volumes of data with relative ease, which is a very beneficial feature.
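To give a purely illustrative idea of the kind of slicing and summarizing I mean, a minimal Hadoop Streaming job written in Python might look like the sketch below. The column layout (region and amount), the script name, and the HDFS paths are assumptions made for the example, not details of my actual environment.

#!/usr/bin/env python3
"""Toy Hadoop Streaming job: summarize a large CSV of transactions by region.

Illustrative sketch only. A job like this would typically be submitted with
the hadoop-streaming jar that ships with the distribution, for example:

  hadoop jar hadoop-streaming.jar \
    -input /data/transactions -output /data/region_summary \
    -mapper "summarize.py map" -reducer "summarize.py reduce" \
    -file summarize.py
"""
import sys


def mapper():
    # Emit "region<TAB>amount" for every input line; malformed lines are skipped.
    for line in sys.stdin:
        parts = line.strip().split(",")
        if len(parts) < 2:
            continue
        region, amount = parts[0], parts[1]
        try:
            print(f"{region}\t{float(amount)}")
        except ValueError:
            continue


def reducer():
    # Hadoop sorts the mapper output by key, so we can aggregate one region at a time.
    current, total, count = None, 0.0, 0
    for line in sys.stdin:
        key, _, value = line.strip().partition("\t")
        if key != current:
            if current is not None:
                print(f"{current}\t{total}\t{count}")
            current, total, count = key, 0.0, 0
        total += float(value)
        count += 1
    if current is not None:
        print(f"{current}\t{total}\t{count}")


if __name__ == "__main__":
    # Run as "summarize.py map" for the mapper phase or "summarize.py reduce" for the reducer phase.
    reducer() if len(sys.argv) > 1 and sys.argv[1] == "reduce" else mapper()

The same script serves as both mapper and reducer; the exact location of the streaming jar and the input paths vary by distribution and cluster, so treat those parts as placeholders.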
What needs improvement?
In terms of processing speed, I believe that Hadoop itself, as well as the Hadoop-linked software, could be better. When you are analyzing massive amounts of data, you also want it to happen quickly, so faster processing speed is definitely an area for improvement.
I am not sure whether something in the cloud architecture essentially makes it a little slow, but speed is one area. Second, the Hadoop-linked programs and software that are available could do much more, and much better, in terms of UI and UX.
As I mentioned, this is probably the only other feature we can improve a little bit, because the terminal and coding screen in Hadoop is somewhat outdated and looks like an old C++ console screen.
If the UI and UX can be improved slightly, I believe it will go a long way toward increasing adoption and effectiveness.
For how long have I used the solution?
I have been using Apache Hadoop for six months.
What do I think about the stability of the solution?
It is far more stable than some of the other software that I have tried, and the current version of the Hadoop software is becoming increasingly stable.
Each new version that is released is more stable and easier to use than the last.
What do I think about the scalability of the solution?
From what I have seen in my current enterprise, once I joined the organization it was fairly simple to get it set up for an employee, and the same has been true for everyone onboarded into my role. I would imagine that it is fairly scalable across an enterprise.
I am fairly certain that between 10,000 and 15,000 employees use it.
How are customer service and support?
I have not had any direct experience with technical support.
We have an in-house technical support team that handles it.
Which solution did I use previously and why did I switch?
I have since changed careers; I no longer use any automation tools, nor does my job require me to compare the capabilities of other tools.
I now work with risk analytics tools. I work with data these days, so I use technologies like Hive, Shiny for R, and other data-intensive programs.
Shiny is a package that you can use with R. As a result of changing roles, I am now in a position that is more data-centric and less focused on process automation.
We currently have proprietary tools and proprietary cloud software, so I don't really need to use any external cloud vendors. Aside from that, I only use the third-party technologies I have already mentioned, primarily Hadoop and R.
It is one of the prime, cornerstone pieces of software that we use, and I have never been in a position to make a like-for-like comparison with another product.
How was the initial setup?
As it is deployed as proprietary software within the enterprise I currently work for, I had no trouble setting it up.
What's my experience with pricing, setup cost, and licensing?
I am not sure about the price, but in terms of the usability and utility of the software as a whole, I would rate it three and a half to four out of five.
Which other solutions did I evaluate?
When I was a digital transformation consultant for my previous employer, I downloaded the tools and read the reviews.
That work involved learning about workflow automation as well as process automation. I looked at a number of these platforms as part of it, but I have never actually used them.
What other advice do I have?
I would recommend this solution for data professionals who have to work hands-on with big data.
For instance, if you work with smaller or more finite data sets, that is, data sets that do not keep updating themselves, I would most likely recommend R or even Excel, where you can do a lot of analysis. However, for data professionals who work with large amounts of data, I would strongly recommend Hadoop. It's a little more technical, but it does the job.
I would rate Apache Hadoop an eight out of ten. I would like to see some improvements, but I appreciate the utility it provides.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.