Data Science from Scratch

Publisher: O'Reilly Media
Author: Joel Grus
Pages: 330
Publication date: 2015-4-28
Price: USD 39.99
Binding: Paperback
ISBN: 9781491901427
Tags:
  • Python
  • Data science
  • Machine learning
  • Programming
  • Statistical learning
  • Computing
  • Math/Statistics/Data
  • Statistics
  • Data analysis
  • Algorithms
  • Data mining
  • From scratch
  • Hands-on

Description

Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch.

If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with the hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out.

  • Get a crash course in Python
  • Learn the basics of linear algebra, statistics, and probability, and understand how and when they’re used in data science
  • Collect, explore, clean, munge, and manipulate data
  • Dive into the fundamentals of machine learning
  • Implement models such as k-nearest neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering
  • Explore recommender systems, natural language processing, network analysis, MapReduce, and databases

About the Author

Joel Grus

Joel Grus is a software engineer at Google. Before that he worked as a data scientist at multiple startups. He lives in Seattle, where he regularly attends data science happy hours. He blogs infrequently at joelgrus.com.

Table of Contents

Chapter 1: Introduction
The Ascendance of Data
What Is Data Science?
Motivating Hypothetical: DataSciencester
Chapter 2: A Crash Course in Python
The Basics
The Not-So-Basics
For Further Exploration
Chapter 3: Visualizing Data
matplotlib
Bar Charts
Line Charts
Scatterplots
For Further Exploration
Chapter 4: Linear Algebra
Vectors
Matrices
For Further Exploration
Chapter 5: Statistics
Describing a Single Set of Data
Correlation
Simpson’s Paradox
Some Other Correlational Caveats
Correlation and Causation
For Further Exploration
Chapter 6: Probability
Dependence and Independence
Conditional Probability
Bayes’s Theorem
Random Variables
Continuous Distributions
The Normal Distribution
The Central Limit Theorem
For Further Exploration
Chapter 7: Hypothesis and Inference
Statistical Hypothesis Testing
Example: Flipping a Coin
Confidence Intervals
P-hacking
Example: Running an A/B Test
Bayesian Inference
For Further Exploration
Chapter 8: Gradient Descent
The Idea Behind Gradient Descent
Estimating the Gradient
Using the Gradient
Choosing the Right Step Size
Putting It All Together
Stochastic Gradient Descent
For Further Exploration
Chapter 9: Getting Data
stdin and stdout
Reading Files
Scraping the Web
Using APIs
Example: Using the Twitter APIs
For Further Exploration
Chapter 10: Working with Data
Exploring Your Data
Cleaning and Munging
Manipulating Data
Rescaling
Dimensionality Reduction
For Further Exploration
Chapter 11: Machine Learning
Modeling
What Is Machine Learning?
Overfitting and Underfitting
Correctness
The Bias-Variance Trade-off
Feature Extraction and Selection
For Further Exploration
Chapter 12: k-Nearest Neighbors
The Model
Example: Favorite Languages
The Curse of Dimensionality
For Further Exploration
Chapter 13: Naive Bayes
A Really Dumb Spam Filter
A More Sophisticated Spam Filter
Implementation
Testing Our Model
For Further Exploration
Chapter 14: Simple Linear Regression
The Model
Using Gradient Descent
Maximum Likelihood Estimation
For Further Exploration
Chapter 15: Multiple Regression
The Model
Further Assumptions of the Least Squares Model
Fitting the Model
Interpreting the Model
Goodness of Fit
Digression: The Bootstrap
Standard Errors of Regression Coefficients
Regularization
For Further Exploration
Chapter 16: Logistic Regression
The Problem
The Logistic Function
Applying the Model
Goodness of Fit
Support Vector Machines
For Further Investigation
Chapter 17: Decision Trees
What Is a Decision Tree?
Entropy
The Entropy of a Partition
Creating a Decision Tree
Putting It All Together
Random Forests
For Further Exploration
Chapter 18: Neural Networks
Perceptrons
Feed-Forward Neural Networks
Backpropagation
Example: Defeating a CAPTCHA
For Further Exploration
Chapter 19: Clustering
The Idea
The Model
Example: Meetups
Choosing k
Example: Clustering Colors
Bottom-up Hierarchical Clustering
For Further Exploration
Chapter 20: Natural Language Processing
Word Clouds
n-gram Models
Grammars
An Aside: Gibbs Sampling
Topic Modeling
For Further Exploration
Chapter 21: Network Analysis
Betweenness Centrality
Eigenvector Centrality
Directed Graphs and PageRank
For Further Exploration
Chapter 22: Recommender Systems
Manual Curation
Recommending What’s Popular
User-Based Collaborative Filtering
Item-Based Collaborative Filtering
For Further Exploration
Chapter 23: Databases and SQL
CREATE TABLE and INSERT
UPDATE
DELETE
SELECT
GROUP BY
ORDER BY
JOIN
Subqueries
Indexes
Query Optimization
NoSQL
For Further Exploration
Chapter 24: MapReduce
Example: Word Count
Why MapReduce?
MapReduce More Generally
Example: Analyzing Status Updates
Example: Matrix Multiplication
An Aside: Combiners
For Further Exploration
Chapter 25: Go Forth and Do Data Science
IPython
Mathematics
Not from Scratch
Find Data
Do Data Science

Reviews

This book can serve as Data Science 101. It is simply a guide to learning data science with Python, and I think the most valuable part is the For Further Exploration sections.

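Data science is a booming industry with limitless prospects, and some have called data scientist "the sexiest job of the 21st century." This book explains data science work from scratch, teaches the hacking skills that work requires, and familiarizes readers with the core knowledge of data science: mathematics and statistics. The author chose Python, a powerful and easy-to-learn language environment, and builds the tools by hand...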

User Comments

Read it and still can't do it T T

An introduction to data science

Introductory, with a witty writing style.

helpful

Covers a little of everything, but none of it in much depth.
