Apache Spark

Apache Spark is an open-source engine developed specifically for handling large-scale data processing and analytics. Spark offers the ability to access data in a variety of sources, including Hadoop Distributed File System (HDFS), OpenStack Swift, Amazon S3 and Cassandra.

Apache Spark is designed to accelerate analytics on Hadoop while providing a complete suite of complementary tools that include a fully-featured machine learning library (MLlib), a graph processing engine (GraphX) and stream processing.

Apache Spark originated at UC Berkeley s AMPLab in 2009 and was donated in 2013 to the Apache Software Foundation, where it has become the most active project in terms of contributions.

One of the key reasons behind Apache Spark s popularity, both with developers and in enterprises, is its speed and efficiency. Spark runs programs in memory up to 100 times faster than Hadoop MapReduce and up to 10 times faster on disk. Spark is natively designed to run in-memory, enabling it to support iterative analysis and more rapid, less expensive data crunching.

Forrest Stroud
Forrest Stroud
Forrest is an experienced, entrepreneurial and well-rounded professional with 15+ years covering technology, business software, website design, programming and more.

Top Articles

The Complete List of 1500+ Common Text Abbreviations & Acronyms

UPDATED: This article was updated April 6, 2021 by Web Webster   From A3 to ZZZ we list 1,559 text message and online chat abbreviations to...

How to Create a Website Shortcut on Your Desktop

UPDATED: This article was updated April 6, 2021 by Web Webster   This Webopedia guide will show you how to create a desktop shortcut to a...

Windows Operating System History & Versions

The Windows operating system (Windows OS) refers to a family of operating systems developed by Microsoft Corporation. We look at the history of Windows...

What are the 5 Generations of Computers?

UPDATED: This article was updated on April 6, 2021 by Web Webster   Learn about each of the 5 generations of computers and major technology developments...

Random Access Memory (RAM)...

UPDATED: This article Updated April 6, 2021 by Web Webster   Random Access Memory (RAM)...

OEM – original equipment...

UPDATED: This article was updated April 6, 2021 by Web Webster OEM (pronounced as...

Best ERP Software for...

UPDATED: This page was updated April 6, 2021 by Web Webster   Enterprise resource planning...