Foreword
         Preface
         Part Ⅰ. Hadoop Fundamentals
         1. Meet Hadoop
         Data!
         Data Storage and Analysis
         Querying All Your Data
         Beyond Batch
         Comparison with Other Systems
         Relational Database Management Systems
         Grid Computing
         Volunteer Computing
         A Brief History of Apache Hadoop
         What's in This Book?
         2. MapReduce
         A Weather Dataset
         Data Format
         Analyzing the Data with Unix Tools
         Analyzing the Data with Hadoop
         Map and Reduce
         Java MapReduce
         Scaling Out
         Data Flow
         Combiner Functions
         Running a Distributed MapReduce Job
         Hadoop Streaming
         Ruby
         Python
         3. The Hadoop Distributed Filesystem
         The Design of HDFS
         HDFS Concepts
         Blocks
         Namenodes and Datanodes
         Block Caching
         HDFS Federation
         HDFS High Availability
         The Command-Line Interface
         Basic Filesystem Operations
         Hadoop Filesystems
         Interfaces
         The Java Interface
         Reading Data from a Hadoop URL
         Reading Data Using the FileSystem API
         Writing Data
         Directories
         Querying the Filesystem
         Deleting Data
         Data Flow
         Anatomy of a File Read
         Anatomy of a File Write
         Coherency Model
         Parallel Copying with distcp
         Keeping an HDFS Cluster Balanced
         4. YARN
         Anatomy of a YARN Application Run
         Resource Requests
         Application Lifespan
         Building YARN Applications
         YARN Compared to MapReduce 1
         Scheduling in YARN
         Scheduler Options
         Capacity Scheduler Configuration
         Fair Scheduler Configuration
         Delay Scheduling
         Dominant Resource Fairness
         Further Reading
         5. Hadoop I/O
         Data Integrity
         Data Integrity in HDFS
         LocalFileSystem
         ChecksumFileSystem
         Compression
         Codecs
         Compression and Input Splits
         Using Compression in MapReduce
         Serialization
         The Writable Interface
         Writable Classes
         Implementing a Custom Writable
         Serialization Frameworks
         File-Based Data Structures
         SequenceFile
         MapFile
         Other File Formats and Column-Oriented Formats
         Part Ⅱ. MapReduce
         6. Developing a MapReduce Application
         The Configuration API
         Combining Resources
         Variable Expansion
         Setting Up the Development Environment
         Managing Configuration
         GenericOptionsParser, Tool, and ToolRunner
         Writing a Unit Test with MRUnit
         Mapper
         Reducer
         Running Locally on Test Data
         Running a Job in a Local Job Runner
         Testing the Driver
         Running on a Cluster
         Packaging a Job
         Launching a Job
         The MapReduce Web UI
         Retrieving the Results
         Debugging a Job
         Hadoop Logs
         Remote Debugging
         Tuning a Job
         Profiling Tasks
         MapReduce Workflows
         Decomposing a Problem into MapReduce Jobs
         JobControl
         Apache Oozie
         7. How MapReduce Works
         Anatomy of a MapReduce Job Run
         Job Submission
         Job Initialization
         Task Assignment
         Task Execution
         Progress and Status Updates
         Job Completion
         Failures
         Task Failure
         Application Master Failure
         Node Manager Failure
         Resource Manager Failure
         Shuffle and Sort
         The Map Side
         The Reduce Side
         Configuration Tuning
         Task Execution
         The Task Execution Environment
         Speculative Execution
         Output Committers
         8. MapReduce Types and Formats
         MapReduce Types
         The Default MapReduce Job
         Input Formats
         Input Splits and Records
         Text Input
         Binary Input
         Multiple Inputs
         Database Input (and Output)
         Output Formats
         Text Output
         Binary Output
         Multiple Outputs
         Lazy Output
         Database Output
         ……
         9. MapReduce Features
         Part Ⅲ. Hadoop Operations
         10. Setting Up a Hadoop Cluster
         11. Administering Hadoop
         Part Ⅳ. Related Projects
         12. Avro
         13. Parquet
         14. Flume
         15. Sqoop
         16. Pig
         17. Hive
         18. Crunch
         19. Spark
         20. HBase
         21. ZooKeeper
         Part Ⅴ. Case Studies
         22. Composable Data at Cerner
         23. Biological Data Science: Saving Lives with Software
         24. Cascading
         A. Installing Apache Hadoop
         B. Cloudera's Distribution Including Apache Hadoop
         C. Preparing the NCDC Weather Data
         D. The Old and New Java MapReduce APIs
         Index