Apache Hadoop is ideal for organizations with a growing need to store and process massive application datasets. Hadoop: The Definitive Guide is a comprehensive resource for using Hadoop to build reliable, scalable, distributed systems. Programmers will find details for analyzing large datasets with Hadoop, and administrators will learn how to set up and run Hadoop clusters. The book includes case studies that illustrate how Hadoop solves specific problems.
Organizations large and small are adopting Apache Hadoop to deal with huge application datasets. Hadoop: The Definitive Guide provides you with the key for unlocking the wealth this data holds. Hadoop is ideal for storing and processing massive amounts of data, but until now, information on this open-source project has been lacking -- especially with regard to best practices. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems. Programmers will find details for analyzing large datasets with Hadoop, and administrators will learn how to set up and run Hadoop clusters.
With case studies that illustrate how Hadoop solves specific problems, this book helps you:
* Learn the Hadoop Distributed File System (HDFS), including ways to use its many APIs to transfer data
* Write distributed computations with MapReduce, Hadoop's most vital component
* Become familiar with Hadoop's data and IO building blocks for compression, data integrity, serialization, and persistence
* Learn the common pitfalls and advanced features for writing real-world MapReduce programs
* Design, build, and administer a dedicated Hadoop cluster
* Use HBase, Hadoop's database for structured and semi-structured data
And more. Hadoop: The Definitive Guide is still in progress, but you can get started on this technology with the Rough Cuts edition, which lets you read the book online or download it in PDF format as the manuscript evolves.
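I entered the Douban China-pub giveaway and was lucky enough to win the second Chinese edition of Hadoop: The Definitive Guide. Compared with the first edition, it adds chapters on Hive and Sqoop, the translation quality has improved considerably, and the English index has been retained. The book's coverage of Hadoop is fairly comprehensive; readers eager to get hands-on can pretty much pick it up and, with help from Google and Baidu, make that dream come true right away. Personally, I feel "...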
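The translation is poor in many places, and you need to check it against the English to understand it. Still, for getting up to speed quickly, it remains a good choice. I suggest the translators weigh the importance of each part: the unimportant parts can be translated roughly, but the important ones deserve real effort, and the priorities should not be reversed. For example, the data flow section in Chapter 3, such a classic passage, was translated into a complete mess. I wonder whether the translators...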
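Chinese edition, page 412: "So in theory, anything can be represented in binary form and then converted into a string of long integers, or the data structure can be serialized directly, to serve as the key." Original, page 460: ..., so theoretically anything can serve as row key, from strings to binary representations of long or even serialized ...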
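I read a few chapters of the Chinese edition. It is full of errors, and such basic ones that I really could not keep reading. I recommend reading the original instead. The translators have some nerve: failing to understand the English is one thing, but they cannot even write coherent Chinese. How are they not embarrassed? What is "However, ......, however" supposed to be? Some kind of "however style"?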
Not very interesting; I mainly read the ZooKeeper part.
An authoritative work.
I needed it for a report, so I read the part about HDFS.
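I read this book during my internship at Baidu. Over those three months I learned about cloud computing and Hadoop while getting familiar with Java, software engineering project management, and so on. I even wrote outlines for a few chapters, though I can no longer make sense of them myself. Anyway, this is a good introductory book on Hadoop. If you want to dig deeper into Hadoop programming, there is another book, Pro Hadoop, that you can consult. But to fully understand this book, a solid foundation in Java (reflection, serialization, multithreading, GC) and in network programming (sockets, RPC) is very important. Otherwise you may end up completely lost, as I was. I will need to go through it again later.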