Hadoop, formally called Apache Hadoop, is an Apache Software Foundation project and open source software platform for scalable, distributed computing. Hadoop can provide fast and reliable analysis of both structured data and unstructured data. Given its capabilities to handle large data sets, it's often associated with the phrase big data.
Recommended Reading: What is Open Source software?
The Apache Hadoop software library is essentially a framework that allows for the distributed processing of large datasets across clusters of computers using a simple programming model. Hadoop can scale up from single servers to thousands of machines, each offering local computation and storage.
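The "simple programming model" referred to here is MapReduce. A minimal sketch of the map/shuffle/reduce idea, in plain Python rather than Hadoop's actual Java APIs, looks like this (the function names are illustrative, not part of Hadoop):

```python
# Illustrative sketch of the map/shuffle/reduce model behind Hadoop
# MapReduce, counting words across "documents". Plain Python for
# clarity -- this does not use Hadoop's real APIs.
from collections import defaultdict

def map_phase(documents):
    """Map step: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce step: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["big data big ideas", "big data tools"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts["big"])  # prints 3
```

In a real Hadoop cluster, the map and reduce steps run in parallel across many machines, with each node processing the data stored locally on it.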
The Apache Hadoop software library can detect and handle failures at the application layer, so it can deliver a highly-available service on top of a cluster of computers, each of which may be prone to failures.
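Application-layer failure handling of this kind can be sketched, under simplified assumptions, as a scheduler that retries a failed task on another node instead of failing the whole job. The worker names and `run_with_retries` helper below are illustrative inventions, not Hadoop's real scheduler API:

```python
# Simplified sketch of application-layer fault tolerance: if a task
# fails on one node, retry it on another rather than abort the job.
# The node names and helper function are hypothetical, for illustration.
def run_with_retries(task, workers, max_attempts=3):
    """Try a task on successive workers, returning the first success."""
    last_error = None
    for worker in workers[:max_attempts]:
        try:
            return task(worker)
        except RuntimeError as err:
            last_error = err  # record the failure, move to the next node
    raise RuntimeError(f"task failed on all attempts: {last_error}")

def flaky_task(worker):
    """Pretend the task only succeeds on healthy nodes."""
    if worker == "node-1":  # simulate a crashed node
        raise RuntimeError(f"{worker} is down")
    return f"result from {worker}"

print(run_with_retries(flaky_task, ["node-1", "node-2", "node-3"]))
# prints: result from node-2
```

The point of doing this in software is that the cluster can be built from inexpensive, failure-prone commodity hardware while still delivering a highly available service.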
As listed on the Apache Hadoop project Web site, the Apache Hadoop project includes a number of related modules and subprojects, including Hadoop Common, the Hadoop Distributed File System (HDFS), Hadoop YARN and Hadoop MapReduce.