Apache Sqoop Cookbook pdf epub mobi txt 電子書下載2025

簡體網頁||繁體網頁

☆☆☆☆☆

出版者:O'Reilly Media

作者:Kathleen Ting

出品人:

頁數:94

译者:

出版時間:2013-7-26

價格:USD 14.99

裝幀:Paperback

isbn號碼:9781449364625

叢書系列:

圖書標籤:

sqoop
hadoop
Hadoop
Programming
英文原版
數據分析
tech
rdbms
Sqoop
Big Data
Hadoop
Data Integration
Data Migration
Database
Java
ETL
Cookbook
Apache

下載連結在頁面底部

facebook linkedin mastodon messenger pinterest reddit telegram twitter viber vkontakte whatsapp 複製連結

想要找書就要到小美書屋

book.quotespace.org

立刻按 ctrl+D收藏本頁

你會得到大驚喜!!

具體描述

Integrating data from multiple sources is essential in the age of big data, but it can be a challenging and time-consuming task. This handy cookbook provides dozens of ready-to-use recipes for using Apache Sqoop, the command-line interface application that optimizes data transfers between relational databases and Hadoop.

Sqoop is both powerful and bewildering, but with this cookbook’s problem-solution-discussion format, you’ll quickly learn how to deploy and then apply Sqoop in your environment. The authors provide MySQL, Oracle, and PostgreSQL database examples on GitHub that you can easily adapt for SQL Server, Netezza, Teradata, or other relational systems.

Transfer data from a single database table into your Hadoop ecosystem

Keep table data and Hadoop in sync by importing data incrementally

Import data from more than one database table

Customize transferred data by calling various database functions

Export generated, processed, or backed-up data from Hadoop to your database

Run Sqoop within Oozie, Hadoop’s specialized workflow scheduler

Load data into Hadoop’s data warehouse (Hive) or database (HBase)

Handle installation, connection, and syntax issues common to specific database vendors

著者簡介

圖書目錄

Chapter 1 Getting Started
Downloading and Installing Sqoop
Installing JDBC Drivers
Installing Specialized Connectors
Starting Sqoop
Getting Help with Sqoop
Chapter 2 Importing Data
Transferring an Entire Table
Specifying a Target Directory
Importing Only a Subset of Data
Protecting Your Password
Using a File Format Other Than CSV
Compressing Imported Data
Speeding Up Transfers
Overriding Type Mapping
Controlling Parallelism
Encoding NULL Values
Importing All Your Tables
Chapter 3 Incremental Import
Importing Only New Data
Incrementally Importing Mutable Data
Preserving the Last Imported Value
Storing Passwords in the Metastore
Overriding the Arguments to a Saved Job
Sharing the Metastore Between Sqoop Clients
Chapter 4 Free-Form Query Import
Importing Data from Two Tables
Using Custom Boundary Queries
Renaming Sqoop Job Instances
Importing Queries with Duplicated Columns
Chapter 5 Export
Transferring Data from Hadoop
Inserting Data in Batches
Exporting with All-or-Nothing Semantics
Updating an Existing Data Set
Updating or Inserting at the Same Time
Using Stored Procedures
Exporting into a Subset of Columns
Encoding the NULL Value Differently
Exporting Corrupted Data
Chapter 6 Hadoop Ecosystem Integration
Scheduling Sqoop Jobs with Oozie
Specifying Commands in Oozie
Using Property Parameters in Oozie
Installing JDBC Drivers in Oozie
Importing Data Directly into Hive
Using Partitioned Hive Tables
Replacing Special Delimiters During Hive Import
Using the Correct NULL String in Hive
Importing Data into HBase
Importing All Rows into HBase
Improving Performance When Importing into HBase
Chapter 7 Specialized Connectors
Overriding Imported boolean Values in PostgreSQL Direct Import
Importing a Table Stored in Custom Schema in PostgreSQL
Exporting into PostgreSQL Using pg_bulkload
Connecting to MySQL
Using Direct MySQL Import into Hive
Using the upsert Feature When Exporting into MySQL
Importing from Oracle
Using Synonyms in Oracle
Faster Transfers with Oracle
Importing into Avro with OraOop
Choosing the Proper Connector for Oracle
Exporting into Teradata
Using the Cloudera Teradata Connector
Using Long Column Names in Teradata
Colophon
· · · · · · (收起)