圖書標籤: Hadoop 大數據 BigData 計算機 分布式 hadoop 機器學習 O'Reilly
发表于2024-08-03
Hadoop: The Definitive Guide pdf epub mobi txt 電子書 下載 2024
Get ready to unlock the power of your data. With the fourth edition of this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters.
Using Hadoop 2 exclusively, author Tom White presents new chapters on YARN and several Hadoop-related projects such as Parquet, Flume, Crunch, and Spark. You’ll learn about recent changes to Hadoop, and explore new case studies on Hadoop’s role in healthcare systems and genomics data processing.
Learn fundamental components such as MapReduce, HDFS, and YARN
Explore MapReduce in depth, including steps for developing applications with it
Set up and maintain a Hadoop cluster running HDFS and MapReduce on YARN
Learn two data formats: Avro for data serialization and Parquet for nested data
Use data ingestion tools such as Flume (for streaming data) and Sqoop (for bulk data transfer)
Understand how high-level data processing tools like Pig, Hive, Crunch, and Spark work with Hadoop
Learn the HBase distributed database and the ZooKeeper distributed configuration service
Tom White has been an Apache Hadoop committer since February 2007, and is a member of the Apache Software Foundation. He works for Cloudera, a company set up to offer Hadoop support and training. Previously he was as an independent Hadoop consultant, working with companies to set up, use, and extend Hadoop. He has written numerous articles for O'Reilly, java.net and IBM's developerWorks, and has spoken at several conferences, including at ApacheCon 2008 on Hadoop. Tom has a Bachelor's degree in Mathematics from the University of Cambridge and a Master's in Philosophy of Science from the University of Leeds, UK.
經典
評分知識體係和原理講解清晰,結閤實踐項目,會更容易理解一些。
評分第四版全麵基於hadoop2,相比前版進行一些重要增添和順序調整,之前版本就不要看瞭。繼續那麼全麵而透徹實用。
評分閱讀瞭第1,2部分,算是對Hadoop有瞭基本的認知,接下來需要結閤實際項目夯實。其他相關的技術如Hive,HBase,Spark也需要去學習。
評分還好我用的時候不需要寫 Java(
首先,翻译太差,很多句子就是瞎翻,根本不通顺,很多时候你要停下来断句,慢慢去理解。 然后,这本书是很多人去翻译的,很多人连代码都不懂,曾经一段代码看到我蒙圈,去看了一下源代码,好家伙,四行有五个错误。另外,从代码瞎缩进也可以看出这是群没写过代码的人翻的,而且...
評分首先,翻译太差,很多句子就是瞎翻,根本不通顺,很多时候你要停下来断句,慢慢去理解。 然后,这本书是很多人去翻译的,很多人连代码都不懂,曾经一段代码看到我蒙圈,去看了一下源代码,好家伙,四行有五个错误。另外,从代码瞎缩进也可以看出这是群没写过代码的人翻的,而且...
評分书中没有透露太多实现架构方面的细节,更多的是从使用者的角度上介绍了Hadoop的各种知识,包括MapReduce, HDFS, Hive, Pig, HBase, ZooKeeper。几乎涉及了Hadoop的所有关于使用方面的知识,包括安装和使用。 你甚至可以直接在自己的电脑上装上一个Hadoop,对着书中的例子实际演...
評分 評分其实也不算全部读完了,读它主要是为了技术选型,考虑升级持久层架构、提高系统可扩展性,仔细研读了前几章,对Hadoop、MapReduce、HDFS的模型、机制、使用场景有了一定了解。后面几章及其生态圈内的其他项目抱着了解的心态简单浏览了一下。整体感觉还行,至少从我看过的章节来...
Hadoop: The Definitive Guide pdf epub mobi txt 電子書 下載 2024