Hadoop: The Definitive Guide

Tom White has been an Apache Hadoop committer since February 2007, and is a member of the Apache Software Foundation. He works for Cloudera, a company set up to offer Hadoop support and training. Previously he was an independent Hadoop consultant, working with companies to set up, use, and extend Hadoop. He has written numerous articles for O'Reilly, java.net, and IBM's developerWorks, and has spoken at several conferences, including ApacheCon 2008, where he spoke on Hadoop. Tom has a Bachelor's degree in Mathematics from the University of Cambridge and a Master's in Philosophy of Science from the University of Leeds, UK.

Publisher: O'Reilly Media
Author: Tom White
Pages: 756
Publication date: 2015-4-11
Price: USD 49.99
Binding: Paperback
ISBN: 9781491901632
Tags:
  • Hadoop
  • Big Data
  • BigData
  • Computer Science
  • Distributed Systems
  • hadoop
  • Machine Learning
  • O'Reilly

Get ready to unlock the power of your data. With the fourth edition of this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters.

Using Hadoop 2 exclusively, author Tom White presents new chapters on YARN and several Hadoop-related projects such as Parquet, Flume, Crunch, and Spark. You’ll learn about recent changes to Hadoop, and explore new case studies on Hadoop’s role in healthcare systems and genomics data processing.

  • Learn fundamental components such as MapReduce, HDFS, and YARN
  • Explore MapReduce in depth, including steps for developing applications with it
  • Set up and maintain a Hadoop cluster running HDFS and MapReduce on YARN
  • Learn two data formats: Avro for data serialization and Parquet for nested data
  • Use data ingestion tools such as Flume (for streaming data) and Sqoop (for bulk data transfer)
  • Understand how high-level data processing tools like Pig, Hive, Crunch, and Spark work with Hadoop
  • Learn the HBase distributed database and the ZooKeeper distributed configuration service

Reviews

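The translation is poor in many places, and you need to check it against the English original to understand it. Still, it remains a good choice for learning quickly. I suggest the translator weigh the importance of each part: the unimportant parts can be translated roughly, but the important parts deserve real effort, rather than getting the priorities backwards. For example, the data flow section in Chapter 3, such a classic passage, was translated terribly. I wonder whether the translator will...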

The official site of Cobub Razor, an app analytics tool, has an article on choosing and using Hadoop YARN schedulers. I think it is well written and recommend it: http://www.cobub.com/the-selection-and-use-of-hadoop-yarn-scheduler/

For details, see: http://www.cnblogs.com/aprilrain/archive/2013/03/07/2947664.html

User comments

I read the first three parts; now it is time to go read the source code.

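I read Parts 1 and 2, which gave me a basic understanding of Hadoop. Next I need to consolidate it through real projects. I also need to learn related technologies such as Hive, HBase, and Spark.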

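Really damn long. It covers most of the tools in the ecosystem and works well as a summary and review. Readers without hands-on experience should read the first two parts, which cover the MapReduce and YARN core, and then skim the rest to learn what each tool is for.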

Great.

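I read the first half (the principles) in the English fourth edition, and just skimmed the second half (related projects and case studies) in the Chinese third edition. True to the Definitive Guide style, it has plenty of substance and plenty of filler. Hadoop is also complex and hard to use, so it would be only natural if Spark displaced it.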
