Hadoop
(Language: English)
Sold out
Free shipping
Book
41.00 €
Product details
Product information for "Hadoop"
Blurb for "Hadoop"
Ready to unlock the power of your data? With this comprehensive guide, you'll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters.
You'll find illuminating case studies that demonstrate how Hadoop is used to solve specific problems. This third edition covers recent changes to Hadoop, including material on the new MapReduce API, as well as MapReduce 2 and its more flexible execution model (YARN).
- Store large datasets with the Hadoop Distributed File System (HDFS)
- Run distributed computations with MapReduce
- Use Hadoop's data and I/O building blocks for compression, data integrity, serialization (including Avro), and persistence
- Discover common pitfalls and advanced features for writing real-world MapReduce programs
- Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud
- Load data from relational databases into HDFS, using Sqoop
- Perform large-scale data processing with the Pig query language
- Analyze datasets with Hive, Hadoop's data warehousing system
- Take advantage of HBase for structured and semi-structured data, and ZooKeeper for building distributed systems
Table of contents for "Hadoop"
Chapter 1 Meet Hadoop
Data!
Data Storage and Analysis
Comparison with Other Systems
A Brief History of Hadoop
The Apache Hadoop Project
Chapter 2 MapReduce
A Weather Dataset
Analyzing the Data with Unix Tools
Analyzing the Data with Hadoop
Scaling Out
Hadoop Streaming
Hadoop Pipes
Chapter 3 The Hadoop Distributed Filesystem
The Design of HDFS
HDFS Concepts
The Command-Line Interface
Hadoop Filesystems
The Java Interface
Data Flow
Parallel Copying with distcp
Hadoop Archives
Chapter 4 Hadoop I/O
Data Integrity
Compression
Serialization
File-Based Data Structures
Chapter 5 Developing a MapReduce Application
The Configuration API
Configuring the Development Environment
Writing a Unit Test
Running Locally on Test Data
Running on a Cluster
Tuning a Job
MapReduce Workflows
Chapter 6 How MapReduce Works
Anatomy of a MapReduce Job Run
Failures
Job Scheduling
Shuffle and Sort
Task Execution
Chapter 7 MapReduce Types and Formats
MapReduce Types
Input Formats
Output Formats
Chapter 8 MapReduce Features
Counters
Sorting
Joins
Side Data Distribution
MapReduce Library Classes
Chapter 9 Setting Up a Hadoop Cluster
Cluster Specification
Cluster Setup and Installation
SSH Configuration
Hadoop Configuration
Post Install
Benchmarking a Hadoop Cluster
Hadoop in the Cloud
Chapter 10 Administering Hadoop
HDFS
Monitoring
Maintenance
Chapter 11 Pig
Installing and Running Pig
An Example
Comparison with Databases
Pig Latin
User-Defined Functions
Data Processing Operators
Pig in Practice
Chapter 12 HBase
HBasics
Concepts
Installation
Clients
Example
HBase Versus RDBMS
Praxis
Chapter 13 ZooKeeper
Installing and Running ZooKeeper
An Example
The ZooKeeper Service
Building Applications with ZooKeeper
ZooKeeper in Production
Chapter 14 Case Studies
Hadoop Usage at Last.fm
Hadoop and Hive at Facebook
Nutch Search Engine
Log Processing at Rackspace
Cascading
TeraByte Sort on Apache Hadoop
Appendix Installing Apache Hadoop
Prerequisites
Installation
Configuration
Appendix Cloudera's Distribution for Hadoop
Prerequisites
Standalone Mode
Pseudo-Distributed Mode
Fully Distributed Mode
Hadoop-Related Packages
Appendix Preparing the NCDC Weather Data
Colophon
About the author: Tom White
Tom White has been an Apache Hadoop committer since February 2007, and is a member of the Apache Software Foundation. He works for Cloudera, a company set up to offer Hadoop support and training. Previously he was an independent Hadoop consultant, working with companies to set up, use, and extend Hadoop. He has written numerous articles for O'Reilly, java.net, and IBM's developerWorks, and has spoken at several conferences, including ApacheCon 2008, where he spoke on Hadoop. Tom has a Bachelor's degree in Mathematics from the University of Cambridge and a Master's in Philosophy of Science from the University of Leeds, UK.
Bibliographic details
- Author: Tom White
- 2012, 3rd revised and updated edition, 682 pages, dimensions: 17.9 x 23.3 cm, paperback, English
- Publisher: O'Reilly Media
- ISBN-10: 1449311520
- ISBN-13: 9781449311520
Language:
English