Click here

Apache Hadoop

Hadoop, formally called Apache Hadoop, is an Apache Software Foundation project and open source software platform for scalable, distributed computing. Hadoop can provide fast and reliable analysis of both structured data and unstructured data. Given its capabilities to handle large data sets, it's often associated with the phrase big data.

Recommended Reading: What is Open Source software?

The Apache Hadoop software library is essentially a framework that allows for the distributed processing of large datasets across clusters of computers using a simple programming model. Hadoop can scale up from single servers to thousands of machines, each offering local computation and storage.

The Apache Hadoop software library can detect and handle failures at the application layer, so it can deliver a highly-available service on top of a cluster of computers, each of which may be prone to failures.

As listed on the The Apache Hadoop project Web site, the Apache Hadoop project includes a number of subprojects, including:



Top Terms

Connect with Webopedia

  • The Difference Between Adware & Spyware

    Not technically fitting into either the virus or spam category we have spyware and adware, which are growing concerns for Internet users.

    Read More »

Did You Know? Archive »

  • Quick Reference Archive »