What is our primary use case?
I'm involved in architecting and implementing Elasticsearch-based solutions, catering to various use cases including IIoT, cybersecurity, IT Ops, and general logging and monitoring.
The intention of this article is not to compare AWS Elasticsearch with Elastic ELK Elasticsearch and declare a winner at the end. Elasticsearch by itself is one of the coolest and most versatile Big Data stacks out there. If you are planning to use it in your organization, or are trying to evaluate whether it is the right stack for your product or solution, this article offers some insights from an architect's perspective.
How has it helped my organization?
I'm not the right person to answer this question as I'm the service provider. My clients are the right people to answer.
What is most valuable?
The Spaces feature in Kibana is really useful. I can ingest all data and then offer multi-tenancy on a single stack to various departments (internal) or customers (external). This feature isn't available in AWS Elasticsearch, and Machine Learning isn't available either.
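As a rough illustration of the one-Space-per-tenant idea, here is a minimal sketch that prepares calls to Kibana's Spaces API (POST /api/spaces/space). The Kibana URL, the tenant IDs and the helper name build_space_request are my own placeholders, not part of any client setup. The requests are only built and printed, not sent.

```python
import json
import urllib.request

# Assumption: Kibana is reachable at this URL; replace with your own endpoint.
KIBANA_URL = "http://localhost:5601"


def build_space_request(space_id, name, disabled_features=None):
    """Build (but do not send) a request that creates one Kibana Space."""
    body = {
        "id": space_id,
        "name": name,
        # Features listed here are hidden inside this Space only.
        "disabledFeatures": disabled_features or [],
    }
    return urllib.request.Request(
        url=f"{KIBANA_URL}/api/spaces/space",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "kbn-xsrf": "true"},
        method="POST",
    )


# Hypothetical tenants: one Space per department or customer.
tenants = {"it-ops": "IT Ops", "plant-floor": "Plant Floor (IIoT)"}
requests_to_send = [build_space_request(sid, name) for sid, name in tenants.items()]

for req in requests_to_send:
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
    # To actually create the Space, add authentication and call
    # urllib.request.urlopen(req) against a running Kibana.
```

Each tenant then gets its own dashboards and saved objects inside its Space, while the data still lives in one stack.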
Other useful features such as Canvas (used to create live infographics) and Lens (used to explore and create visualisations using a drag-and-drop feature) are available only in Elastic's ELK Elasticsearch.
In the last 18 months Elastic has really caught up and also gone way beyond AWS by putting together all the missing components that make ELK Elasticsearch the most comprehensive stack in the entire Big Data ecosystem. Comprehensive because one stack addresses all three essential technical components of an end-to-end system: collect, store and visualise terabytes (and even petabytes) of structured or semi-structured data with ease.
What needs improvement?
Enhance the Spaces feature to make it fully multi-tenant by enabling role-based access control (RBAC) at the Space level rather than at the overall Kibana or stack level, as it is currently.
Elastic needs to work on its Machine Learning offering. Currently it is designed as a black box, which doesn't work for a serious user (a Data Scientist) because it gives no control over the underlying algorithm. It's like a point-and-shoot camera vs a DSLR. The offering started with single-metric (univariate) anomaly detection on time-series data. Now there is multivariate detection, which is good, but beyond this we cannot build any other Machine Learning models, such as traditional supervised models. Anomaly detection relies mostly on unsupervised algorithms, and it is too broad a problem space for a black box to solve fully.
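To make the 'control over the algorithm' point concrete, here is a minimal sketch of univariate anomaly detection using a rolling z-score. This is a deliberately simple stand-in that I chose for illustration; it is not the algorithm Elastic uses. The function name, window size, threshold and sample readings are all my own assumptions.

```python
from statistics import mean, stdev


def zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value deviates strongly from the preceding window."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any change at all counts as a deviation.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical CPU readings with one obvious spike at index 7.
cpu = [41, 40, 42, 39, 41, 40, 42, 95, 41, 40]
print(zscore_anomalies(cpu))  # prints [7]
```

A Data Scientist would want to tune exactly these knobs (window, threshold, the statistic itself), which is the kind of control a black box does not expose.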
Make index’s metadata searchable (or referenceable in search queries).
For how long have I used the solution?
What do I think about the stability of the solution?
Elastic ELK Elasticsearch is one of the most stable Big Data engines and the simplest to maintain and scale. Redundancy is built into the design so there is no single point of failure. We can configure a DR easily and if something goes wrong, we can restore the system into a brand new cluster in hours.
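A basic DR routine can be sketched with Elasticsearch's snapshot and restore REST endpoints. The sketch below only lists the calls in order and sends nothing. The repository name, snapshot name, filesystem path and the logs-* index pattern are placeholders I made up, and an "fs" repository additionally assumes path.repo is configured on every node.

```python
import json

# Assumptions: repository and snapshot names are placeholders.
REPO = "dr_backups"
SNAPSHOT = "nightly-001"


def dr_plan(repo=REPO, snapshot=SNAPSHOT, location="/mnt/es_backups"):
    """Return the REST calls for a basic snapshot-and-restore DR routine."""
    return [
        # 1. Register a shared-filesystem snapshot repository (source cluster).
        ("PUT", f"/_snapshot/{repo}",
         {"type": "fs", "settings": {"location": location}}),
        # 2. Take a snapshot and wait until it finishes.
        ("PUT", f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true", None),
        # 3. On the new cluster (same repository registered there), restore
        #    the hypothetical logs-* indices without cluster-wide state.
        ("POST", f"/_snapshot/{repo}/{snapshot}/_restore",
         {"indices": "logs-*", "include_global_state": False}),
    ]


for method, path, body in dr_plan():
    print(method, path, json.dumps(body) if body else "")
```

In practice steps 1 and 2 would run on a schedule against the production cluster, and step 3 only when a replacement cluster has to be brought up.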
What do I think about the scalability of the solution?
Elasticsearch scales horizontally by design, like any Big Data system. We just add more nodes, the cluster redistributes the data onto the new nodes, and it becomes faster and more agile automatically. Cross-cluster replication comes with a Platinum license, but it is a specialised feature and not a common need.
Which solution did I use previously and why did I switch?
I have worked with all the flavours of Elasticsearch viz. Elastic.co's ELK which is popularly known as the ELK stack (pronounced as 'yelk'), AWS Elasticsearch and Open Distro plugins for Elasticsearch.
All (including Solr that comes with Hadoop) are built on a common underlying technology, Apache Lucene. The difference is the added features that I call 'batteries included'. To be precise, Elastic's ELK Elasticsearch, unlike others, comes with free enterprise-grade apps (called plugins in Kibana) and a bunch of cool and useful Kibana features. It also features a good deal of engineering automation conveniences built into the stack.
Did you know that the original founders of Elasticsearch are the folks at Elastic.co, the company that has recently transitioned to an open-core philosophy by design? But since AWS took the initial lead and started offering the stack as the AWS Elasticsearch service, that service became more popular and the preferred option for the uninformed. Elastic, on the other hand, was busy innovating and adding so much muscle to the stack that it is no longer limited to being just the fastest search engine on the planet. In fact, the keyword ‘search’ in Elasticsearch is not relevant anymore and, moreover, it is misleading.
How was the initial setup?
Initial setup is indeed straightforward and fast because it will mostly be a single-node cluster. But as the data volume grows and we start seeing a performance lag, the stack requires scaling (by adding more nodes) and professional intervention to get the capacity design and configuration fine-tuning right.
What about the implementation team?
It is always a good idea to engage a professional vendor to implement it right the first time and save yourself a lot of time in experimenting and trying to figure out the optimisation hacks and how-to’s all by yourself.
What was our ROI?
A stack like Elasticsearch that enables heavy lifting of the data effortlessly comes with its intrinsic yet obvious ROI. If one is not able to realise the ROI it means either the data is bad (garbage in, garbage out) or the stack is not implemented properly.
What's my experience with pricing, setup cost, and licensing?
The basic license is free, and it comes with a lot of features that aren't supposed to be free! With a Gold license, we get Alerting (called Watcher) and some modest enterprise features. Note that if alerting is a must-have feature for you, you can install open-source alerting tools like Open Distro Alerting or ElastAlert and avoid the Gold license cost. Active Directory integration, SAML, SSO, Machine Learning, etc. come with the Platinum license. Licensing is per node, per annum for an on-premises installation; for the Elastic-managed Cloud service, the cost is baked into the hourly pay-as-you-go fee. Kibana does not need a separate license, so it's free.
If you don't want alerting, Active Directory or LDAP integration and are good with native authentication, the basic license will suffice. The basic license also comes with many internal stack features, which are free. For example, data segregation into hot and warm storage, automatic configuration, and rolling over the index after achieving a certain size limit.
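The hot/warm segregation and size-based rollover mentioned above are configured through an index lifecycle management (ILM) policy. Here is a minimal sketch of such a policy body. The policy name, the thresholds and the node attribute data: warm are assumptions of mine; the warm nodes would need to be tagged with a matching node.attr.data setting.

```python
import json


def hot_warm_policy(max_size="50gb", max_age="30d", warm_after="7d"):
    """Build an ILM policy body: roll over in hot, then move to warm nodes."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Start a new index once either limit is reached.
                        "rollover": {"max_size": max_size, "max_age": max_age}
                    }
                },
                "warm": {
                    # Counted from rollover, not from index creation.
                    "min_age": warm_after,
                    "actions": {
                        # Assumption: warm nodes carry node.attr.data: warm.
                        "allocate": {"require": {"data": "warm"}}
                    },
                },
            }
        }
    }


# Hypothetical policy name; the body would be sent with
# PUT /_ilm/policy/logs-hot-warm
print("PUT /_ilm/policy/logs-hot-warm")
print(json.dumps(hot_warm_policy(), indent=2))
```

The policy still has to be attached to an index template with a rollover alias before it takes effect; that part is left out of this sketch.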
The SIEM (Security Information and Event Management) app is free. So is another cool app called Uptime, which helps us monitor the uptime of servers and web services. We can do this without any third-party licensing cost. Just turn on the apps, ingest data using Beats, and the apps will start populating. Over time they become mission-critical to your business.
For example, the SIEM app will automatically populate the dashboards and allow us to monitor network traffic, successful logins, unsuccessful login attempts, and anomalous security events. All that comes off the shelf and is free. You'll pay a lot, on the other hand, for a traditional SIEM like ArcSight or LogRhythm.
Another free app called Infrastructure (formerly known as Metrics) helps monitor the server infrastructure by configuring lightweight data collectors called Beats: Metricbeat, which collects system metrics on both Windows and Linux, and Auditbeat, which collects audit and file-integrity data, primarily on Linux. Metricbeat will start pumping all the system performance metrics into the stack and help monitor memory, CPU and disk utilization.
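Once the metrics are in the stack, they can also be queried directly rather than only through the app. Below is a minimal sketch of a search body that averages CPU usage per host over the last few minutes. The index pattern metricbeat-* and the field names host.name and system.cpu.total.pct are assumptions based on Metricbeat's default system module and may differ in your setup.

```python
import json


def cpu_by_host_query(minutes=15, top_n=10):
    """Build a search body: average CPU per host over a recent time window."""
    return {
        "size": 0,  # only aggregations are needed, not the raw documents
        "query": {"range": {"@timestamp": {"gte": f"now-{minutes}m"}}},
        "aggs": {
            "hosts": {
                "terms": {"field": "host.name", "size": top_n},
                "aggs": {
                    "avg_cpu": {"avg": {"field": "system.cpu.total.pct"}}
                },
            }
        },
    }


# Assumed index pattern; the body would be sent with
# POST /metricbeat-*/_search
print("POST /metricbeat-*/_search")
print(json.dumps(cpu_by_host_query(), indent=2))
```

The same shape works for memory or disk by swapping the field in the avg aggregation.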
Which other solutions did I evaluate?
I have worked with all the flavours of Elasticsearch viz. Elastic.co's ELK which is popularly known as the ELK stack (pronounced as 'yelk'), AWS Elasticsearch and Open Distro plugins for Elasticsearch.
All (including Solr that comes with Hadoop) are built on a common underlying technology, Apache Lucene. The difference is the added features that I call 'batteries included'. To be precise, Elastic's ELK, unlike the others, comes with free enterprise-grade apps (called plugins in Kibana), a bunch of cool and useful Kibana features, and a good deal of engineering automation built into the stack.
Moreover, the original founders of Elasticsearch are the folks at Elastic.co, the company that's built on an open-core philosophy. AWS took the initial lead and offered the stack as the AWS Elasticsearch service, catering mostly to search-engine use cases. But ELK, with all its goodness, is much more than a search engine! In fact, the keyword 'search' in Elasticsearch is very misleading.
What other advice do I have?
You can spin up the Elastic ELK Elasticsearch fully-managed service on AWS, GCP, or Azure, or have your own on-premises installation and dockerize it, whereas AWS Elasticsearch is available only on AWS. That's the hosting difference.
Elastic ELK Elasticsearch comes with a support-only subscription, and there are a lot of updates happening. Kibana is constantly improved and there’s a new release every two weeks.
Disclosure: I am a real user, and this review is based on my own experience and opinions.