Apache Spark vs Outerthought Lily comparison

Apache Spark and Outerthought Lily are both solutions in the Hadoop category. Apache Spark is ranked #1 with an average rating of 8.6, while Outerthought Lily is ranked #16. Apache Spark holds a 17.5% mindshare in Hadoop, compared to Outerthought Lily’s 1.3%. Additionally, 90% of Apache Spark users are willing to recommend the solution.

Apache Spark

Read 65 Apache Spark reviews

1,404 Views
1,090 Comparison Views

90% willing to recommend

Outerthought Lily

49 Views
38 Comparison Views

Executive Summary

We performed a comparison between Apache Spark and Outerthought Lily based on real PeerSpot user reviews.

Find out what your peers are saying about Apache, Cloudera, Amazon Web Services (AWS) and others in Hadoop.

To learn more, read our detailed Hadoop Report (Updated: March 2025).

Helped 848,716 peers since 2012

Review summaries and opinions

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:

Categories and Ranking

Apache Spark

Ranking in Hadoop: 1st
Average Rating: 8.4
Reviews Sentiment: 7.7
Number of Reviews: 65
Ranking in other categories: Compute Service (4th), Java Frameworks (2nd)

Outerthought Lily

Ranking in Hadoop: 16th
Average Rating: 0.0
Number of Reviews: not listed
Ranking in other categories: none

Mindshare comparison

As of April 2025, the mindshare of Apache Spark in the Hadoop category is 17.5%, down from 21.4% the previous year. The mindshare of Outerthought Lily is 1.3%, up from 0.4% the previous year. Mindshare is calculated from PeerSpot user engagement data.

Featured Reviews

Ilya Afanasyev

Senior Software Development Engineer at Yahoo!

Reliable, able to expand, and handle large amounts of data well

We use batch processing. It works well with our formats and file versions, and there's a lot of functionality. Each hour, our pipeline copies the changes from MongoDB to a specific file. Previously, the pipeline copied all of the data from every table on each run, even when nothing had changed. The tables hold a lot of data, and the latest MongoDB version makes it possible to read only the changed data. This reduced the cost and configuration of the cluster, and we saved about $150,000. The solution is scalable. It's a stable product.

Top Industries

By visitors reading reviews

Financial Services Firm

27%

Computer Software Company

13%

Manufacturing Company

Comms Service Provider

No data available

Company Size

By reviewers

Large Enterprise

Midsize Enterprise

Small Business

No data available

Questions from the Community

What do you like most about Apache Spark?

We use Spark to process data from different data sources.

What is your experience regarding pricing and costs for Apache Spark?

Compared to other solutions like Doc DB, Spark is more costly due to the need for extensive infrastructure. It requires significant investment in infrastructure, which can be expensive. While cloud...

What needs improvement with Apache Spark?

The Spark solution could improve in scheduling tasks and managing dependencies. Spark alone cannot handle sequential tasks, requiring environments like Airflow scheduler or scripts. For instance, o...

Comparisons

Spring Boot vs Apache Spark

Compared 27% of the time

SAP HANA vs Apache Spark

Compared 12% of the time

AWS Batch vs Apache Spark

Compared 11% of the time

Cloudera Distribution for Hadoop vs Apache Spark

Compared 7% of the time

Spark SQL vs Apache Spark

Compared 7% of the time

Cloudera Distribution for Hadoop vs Outerthought Lily

Compared 100% of the time

Also Known As

No data available

Lily

Overview

Spark provides programmers with an application programming interface centered on a data structure called the resilient distributed dataset (RDD), a read-only multiset of data items distributed over a cluster of machines and maintained in a fault-tolerant way. It was developed in response to limitations in the MapReduce cluster computing paradigm, which forces a particular linear dataflow structure on distributed programs: MapReduce programs read input data from disk, map a function across the data, reduce the results of the map, and store reduction results on disk. Spark's RDDs function as a working set for distributed programs that offers a (deliberately) restricted form of distributed shared memory.
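
The difference between the linear MapReduce dataflow and a reusable working set can be sketched in plain Python. This is only an illustration on a single machine, not Spark's API: the function names `mapreduce_pass`, `working_set_passes`, and `demo` are hypothetical, the JSON input file is an assumption, and nothing here models distribution across a cluster or fault tolerance.

```python
import json
import tempfile
from functools import reduce
from pathlib import Path


def mapreduce_pass(in_path, out_path, map_fn, reduce_fn, initial):
    """One MapReduce-style pass: read from disk, map, reduce, write to disk."""
    records = json.loads(Path(in_path).read_text())   # read input data from disk
    mapped = map(map_fn, records)                      # map a function across the data
    result = reduce(reduce_fn, mapped, initial)        # reduce the results of the map
    Path(out_path).write_text(json.dumps(result))      # store the reduction result on disk
    return result


def working_set_passes(in_path, jobs):
    """Load the data once, then reuse the in-memory working set for every job."""
    # A tuple stands in for a read-only collection (hypothetical simplification).
    records = tuple(json.loads(Path(in_path).read_text()))
    return [reduce(r, map(m, records), init) for m, r, init in jobs]


def demo():
    def add(a, b):
        return a + b

    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "input.json"
        src.write_text(json.dumps([1, 2, 3, 4]))

        # Linear dataflow: each job is its own disk-to-disk pass over the input.
        total = mapreduce_pass(src, Path(tmp) / "sum.json", lambda x: x, add, 0)
        squares = mapreduce_pass(src, Path(tmp) / "sq.json", lambda x: x * x, add, 0)

        # Working set: one read, then two jobs over the same in-memory data.
        reused = working_set_passes(src, [
            (lambda x: x, add, 0),
            (lambda x: x * x, add, 0),
        ])
        return total, squares, reused
```

Calling `demo()` runs both styles on the same small input. In the first style every job re-reads the input file and writes its own output file, while in the second the data is read once and shared by both jobs, which is the property the working-set description above refers to.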

Apache

Lily Enterprise sits on top of the Cloudera or Hortonworks Hadoop platforms, and considers the entire Hadoop stack and all data streams, including data warehouses, data lakes, CRM systems, mobile/online or social media activities, contact center applications and POS systems. Lily combines them into one cumulative, organized and actionable solution, and delivers real-time execution of predictive analytical models to help create individual customer profiles, Lily Customer DNA, to give organizations the most accurate, atomic-level view of their customers.

Outerthought

Sample Customers

NASA JPL, UC Berkeley AMPLab, Amazon, eBay, Yahoo!, UC Santa Cruz, TripAdvisor, Taboola, Agile Lab, Art.com, Baidu, Alibaba Taobao, EURECOM, Hitachi Solutions

ING, Orange, France Telecom, Alpha Credit, Turkcell, Eni, Zain Group, AXA, Rogers, Toyota, Belfius

We monitor all Hadoop reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.