Hadoop, formally called Apache Hadoop, is an Apache Software Foundation project and open source software platform for scalable, distributed computing. Hadoop can provide fast and reliable analysis of both structured data and unstructured data. Given its capabilities to handle large data sets, it's often associated with the phrase big data.
Recommended Reading: What is Open Source software?
The Apache Hadoop software library is essentially a framework that allows for the distributed processing of large datasets across clusters of computers using a simple programming model. Hadoop can scale up from single servers to thousands of machines, each offering local computation and storage.
The Apache Hadoop software library can detect and handle failures at the application layer, so it can deliver a highly-available service on top of a cluster of computers, each of which may be prone to failures.
As listed on the The Apache Hadoop project Web site, the Apache Hadoop project includes a number of subprojects, including:
Stay up to date on the latest developments in Internet terminology with a free weekly newsletter from Webopedia. Join to subscribe now.
From wacky alarm clocks to lecture hall tools and after class entertainment, these Android apps are a good fit for a student's life and budget. Read More »Sharing Threat Intelligence
A growing number of startups make the sharing of threat intelligence a key part of their solutions. Read More »Smiley Faces and Symbols
A text smiley face is used to convey a facial expression or emotion in texting and online chat conversations. This Webopedia guide shows you how... Read More »
The Open System Interconnection (OSI) model defines a networking framework to implement protocols in seven layers. Use this handy guide to compare... Read More »Network Fundamentals Study Guide
Networking fundamentals teaches the building blocks of modern network design. Learn different types of networks, concepts, architecture and... Read More »Computer Architecture Study Guide
This Webopedia study guide describes the different parts of a computer system and their relations. Read More »