Book tags: data mining, computer science, machine learning, Data, Coursera, CS, data analysis, software engineering
Published on 2024-10-21
Mining of Massive Datasets pdf epub mobi txt e-book download 2024
Written by leading authorities in database and Web technologies, this book is essential reading for students and practitioners alike. The popularity of the Web and Internet commerce provides many extremely large datasets from which information can be gleaned by data mining. This book focuses on practical algorithms that have been used to solve key problems in data mining and can be applied successfully to even the largest datasets. It begins with a discussion of the map-reduce framework, an important tool for parallelizing algorithms automatically. The authors explain the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing. Other chapters cover the PageRank idea and related tricks for organizing the Web, the problems of finding frequent itemsets and clustering. This second edition includes new and extended coverage on social networks, machine learning and dimensionality reduction.
Jure Leskovec is Assistant Professor of Computer Science at Stanford University. His research focuses on mining large social and information networks. Problems he investigates are motivated by large scale data, the Web and on-line media. This research has won several awards including a Microsoft Research Faculty Fellowship, the Alfred P. Sloan Fellowship, Okawa Foundation Fellowship, and numerous best paper awards. His research has also been featured in popular press outlets such as the New York Times, the Wall Street Journal, the Washington Post, MIT Technology Review, NBC, BBC, CBC and Wired. Leskovec has also authored the Stanford Network Analysis Platform (SNAP, http://snap.stanford.edu), a general purpose network analysis and graph mining library that easily scales to massive networks with hundreds of millions of nodes and billions of edges. You can follow him on Twitter at @jure.
The content is good, but for a technically oriented book it stays somewhat on the surface.
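It took me six months, reading on and off, to finish it. The ideas of hashing and approximation were a real eye-opener. My first read was rather rushed; this book deserves rereading and plenty of practice.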
This is the reference textbook for a course next semester. I hear the professor is quite good, so I plan to study the course seriously.
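There are a great many bugs and nowhere to report them. Reading it is extremely painful, and I forget the earlier parts as I read on; perhaps the algorithms in it are inherently like that. Bottom line: at least the latest research results of roughly the past 15 years are strung together here in a way that even undergraduates can follow.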
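What characterizes Web data mining, and what theory and techniques does it add compared with ML? (1) It covers roughly 20 papers, presented in a unified language and at a uniform mathematical depth. (2) Hashing is used very heavily and in various ways, as follows. a. Speeding up retrieval, e.g. an index. b. Randomly grouping data. c. Defining data mappings and repeating those mappings. This is the most basic function. But for new data the mapping will...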
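This is not a traditional "data mining" textbook. It reads more like an account of the problems that data mining runs into in Internet application scenarios (large data volumes) and their solutions. Honestly, though, the book is quite hard to follow. I picked up a few good ideas. Idea 1: always exploit the "sparsity" of the data fully; for example, when the data is sufficiently sparse, hashing can be used to "aggregate" the data into "effective...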
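The content follows the routine that algorithm analysis should have: proofs of correctness, running time, and storage. It is very detailed. Three algorithms are covered per week, and it is highly likely that you forget all of them after understanding them. More intuitive explanations would have helped. Ullman's exposition definitely has problems, and anyone who denies that is not being objective; I often spend two hours puzzling over a single sentence, for example the DGIM algorithm has a...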
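I had originally planned to read the English edition, Mining of Massive Datasets, but the Chinese edition was on sale and the translator swore in the preface that the translation had been done with great care, so I bought the Chinese one. I could not get past the first chapter. The Chinese prose is terrible: many sentences are hopelessly ambiguous, the text stumbles along, and it is off-putting to read. So I have once again decided to give up on Chinese translations like this.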
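The "very poor" rating is for the Chinese translation. The Chinese edition was translated by Wang Bin of the Institute of Computing Technology, Chinese Academy of Sciences, but the translation is awful. My guess is that after receiving the English manuscript, Wang handed it off to students to translate; judging by the quality, I really cannot praise it. The above was written purely to vent my dissatisfaction, because the translator's preface says he translated it independently, over more than seven months and through many rounds of revision. If...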