Amazon EMR

Amazon Elastic MapReduce is a web service that makes it easy to quickly process vast amounts of data.

20 Alternatives To Amazon EMR

Apache Beam

Apache Beam provides an advanced unified programming model to implement batch and streaming data processing jobs.

Apache Spark

Apache Spark is an engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.

Apache Spark for Azure HDInsight

This article provides an introduction to Spark in HDInsight and the different scenarios in which you can use a Spark cluster in HDInsight.

Azure Data Lake Store

Azure Data Lake Storage Gen2 is highly scalable and secure storage for big data analytics. Optimize costs and maximize efficiency through full integration with other Azure products.

Azure HDInsight

Azure HDInsight is a managed Apache Hadoop cloud service that lets you run Apache Spark, Apache Hive, Apache Kafka, Apache HBase, and more.

Databricks

Databricks provides a Unified Analytics Platform that accelerates innovation by unifying data science, engineering and business.

Datameer

Datameer is a business-user-focused business intelligence (BI) platform for Hadoop.

Google Cloud Dataflow

Google Cloud Dataflow is a fully managed cloud service and programming model for batch and streaming big data processing.

Google Cloud Dataproc

Managed Apache Spark and Apache Hadoop service that is fast, easy to use, and low cost

Hadoop

Open-source software for reliable, scalable, distributed computing

Hadoop HDFS

HDFS (the Hadoop Distributed File System) is a distributed file system that makes it possible to scale a single Apache Hadoop cluster to hundreds (and even thousands) of nodes.

Hazelcast

Clustering and highly scalable data distribution platform for Java

Hortonworks Data Platform

The Hortonworks Data Platform is a 100% open source distribution of Apache Hadoop that is truly…

Hortonworks

Hadoop-Related

IBM Analytics Engine

Analytics Engine is a combined Apache Spark and Apache Hadoop service for creating analytics applications.

MapR

MapR is a leading high-performance data management or IT management solution that integrates Apache Drill, Hadoop and Spark with real-time global event streaming, scalable enterprise storage, and database capabilities in order to control large appli…

Microsoft Azure HDInsight

Azure HDInsight is an Apache Hadoop distribution powered by the cloud.

Platfora

BI and Analytics Platform

Qubole

Qubole delivers a self-service platform for big data analytics built on Amazon, Microsoft and Google Clouds.