Hadoop: The Definitive Guide (4th Edition, English Reprint)

Publisher: Southeast University Press

Author: Tom White

Pages: 726

Publication date: 2015-8

Price: 99.00

Binding: Paperback

ISBN: 9787564159177

Tags:

hadoop
Programming
BigData
Hadoop
Big Data
Distributed Storage
Distributed Computing
MapReduce
YARN
HDFS
Data Analysis
Cloud Computing
Technical Classics

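Description

In Hadoop: The Definitive Guide (4th Edition, Revised, English-language Reprint), author Tom White adds new chapters on YARN and on several Hadoop-related projects, including Parquet, Flume, Crunch, and Spark. You will learn about the latest changes to Hadoop and explore case studies of Hadoop in healthcare systems and genomic data processing.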

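About the Author

Tom White is an engineer at Cloudera and a member of the Apache Software Foundation, and has been an Apache Hadoop committer since 2007. He has written numerous articles for oreilly.com, java.net, and IBM's developerWorks, and speaks regularly about Hadoop at industry conferences.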

Table of Contents

Foreword
Preface
Part I. Hadoop Fundamentals
1. Meet Hadoop
Data!
Data Storage and Analysis
Querying All Your Data
Beyond Batch
Comparison with Other Systems
Relational Database Management Systems
Grid Computing
Volunteer Computing
A Brief History of Apache Hadoop
What's in This Book?
2. MapReduce
A Weather Dataset
Data Format
Analyzing the Data with Unix Tools
Analyzing the Data with Hadoop
Map and Reduce
Java MapReduce
Scaling Out
Data Flow
Combiner Functions
Running a Distributed MapReduce Job
Hadoop Streaming
Ruby
Python
3. The Hadoop Distributed Filesystem
The Design of HDFS
HDFS Concepts
Blocks
Namenodes and Datanodes
Block Caching
HDFS Federation
HDFS High Availability
The Command-Line Interface
Basic Filesystem Operations
Hadoop Filesystems
Interfaces
The Java Interface
Reading Data from a Hadoop URL
Reading Data Using the FileSystem API
Writing Data
Directories
Querying the Filesystem
Deleting Data
Data Flow
Anatomy of a File Read
Anatomy of a File Write
Coherency Model
Parallel Copying with distcp
Keeping an HDFS Cluster Balanced
4. YARN
Anatomy of a YARN Application Run
Resource Requests
Application Lifespan
Building YARN Applications
YARN Compared to MapReduce 1
Scheduling in YARN
Scheduler Options
Capacity Scheduler Configuration
Fair Scheduler Configuration
Delay Scheduling
Dominant Resource Fairness
Further Reading
5. Hadoop I/O
Data Integrity
Data Integrity in HDFS
LocalFileSystem
ChecksumFileSystem
Compression
Codecs
Compression and Input Splits
Using Compression in MapReduce
Serialization
The Writable Interface
Writable Classes
Implementing a Custom Writable
Serialization Frameworks
File-Based Data Structures
SequenceFile
MapFile
Other File Formats and Column-Oriented Formats
Part II. MapReduce
6. Developing a MapReduce Application
The Configuration API
Combining Resources
Variable Expansion
Setting Up the Development Environment
Managing Configuration
GenericOptionsParser, Tool, and ToolRunner
Writing a Unit Test with MRUnit
Mapper
Reducer
Running Locally on Test Data
Running a Job in a Local Job Runner
Testing the Driver
Running on a Cluster
Packaging a Job
Launching a Job
The MapReduce Web UI
Retrieving the Results
Debugging a Job
Hadoop Logs
Remote Debugging
Tuning a Job
Profiling Tasks
MapReduce Workflows
Decomposing a Problem into MapReduce Jobs
JobControl
Apache Oozie
7. How MapReduce Works
Anatomy of a MapReduce Job Run
Job Submission
Job Initialization
Task Assignment
Task Execution
Progress and Status Updates
Job Completion
Failures
Task Failure
Application Master Failure
Node Manager Failure
Resource Manager Failure
Shuffle and Sort
The Map Side
The Reduce Side
Configuration Tuning
Task Execution
The Task Execution Environment
Speculative Execution
Output Committers
8. MapReduce Types and Formats
MapReduce Types
The Default MapReduce Job
Input Formats
Input Splits and Records
Text Input
Binary Input
Multiple Inputs
Database Input (and Output)
Output Formats
Text Output
Binary Output
Multiple Outputs
Lazy Output
Database Output
……
9. MapReduce Features
Part III. Hadoop Operations
10. Setting Up a Hadoop Cluster
11. Administering Hadoop
Part IV. Related Projects
12. Avro
13. Parquet
14. Flume
15. Sqoop
16. Pig
17. Hive
18. Crunch
19. Spark
20. HBase
21. ZooKeeper
Part V. Case Studies
22. Composable Data at Cerner
23. Biological Data Science: Saving Lives with Software
24. Cascading
A. Installing Apache Hadoop
B. Cloudera's Distribution Including Apache Hadoop
C. Preparing the NCDC Weather Data
D. The Old and New Java MapReduce APIs
Index

Reader Reviews

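First, the translation is poor: many sentences are rendered carelessly and simply do not read smoothly, so you often have to stop, parse the sentence, and work out the meaning slowly. Second, the book was translated by many people, a lot of whom do not even understand code. One code listing left me baffled, so I checked the original source, and sure enough, there were five errors in four lines. The haphazard code indentation also shows that the translators had never written code, and...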

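Chinese edition, page 412: "So in theory, anything can be represented in binary form and then converted into a string of long integers, or the data structure can be serialized directly, to serve as the key." Original, page 460: "..., so theoretically anything can serve as row key, from strings to binary representations of long or even serialized ..."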

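The book does not reveal much about implementation architecture; it mostly covers Hadoop from the user's perspective, including MapReduce, HDFS, Hive, Pig, HBase, and ZooKeeper. It touches on nearly everything about using Hadoop, including installation and usage. You can even install Hadoop on your own computer and work through the book's examples...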

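I entered the Douban China-pub prize draw and was lucky enough to win the second Chinese edition of Hadoop: The Definitive Guide. Compared with the first edition, it adds chapters on Hive and Sqoop, the translation quality is much improved, and the English index has been retained. The book's coverage of Hadoop is fairly comprehensive; readers eager to get hands-on can pretty much take the book, along with Google and Baidu, and get started right away. My personal impression is "...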
