Informatica Big Data Management User Guide


Table of contents for the guide above:

Preface
Informatica Resources
Informatica My Support Portal
Informatica Documentation
Informatica Product Availability Matrixes
Informatica Web Site
Informatica How-To Library
Informatica Knowledge Base
Informatica Support YouTube Channel
Informatica Marketplace
Informatica Velocity
Informatica Global Customer Support
Chapter 1: Introduction to Informatica Big Data Management
Informatica Big Data Management Overview
Example
Big Data Management Tasks
Read from and Write to Big Data Sources and Targets
Perform Data Discovery
Perform Data Lineage on Big Data Sources
Stream Machine Data
Manage Big Data Relationships
Big Data Process
Step 1. Collect the Data
Step 2. Cleanse the Data
Step 3. Transform the Data
Step 4. Process the Data
Step 5. Monitor Jobs
Big Data Management Component Architecture
Clients and Tools
Application Services
Repositories
Third-Party Applications
Big Data Management Connectivity Architecture
Hadoop Ecosystem Architecture
Chapter 2: Connections
Connections Overview
Hadoop Connection Properties
HDFS Connection Properties
HBase Connection Properties
Hive Connection Properties
Creating a Connection to Access Sources or Targets
Creating a Hadoop Connection
Chapter 3: Mappings in a Hadoop Environment
Mappings in a Hadoop Environment Overview
Data Warehouse Optimization Mapping Example
Hive Engine Architecture
Informatica Blaze Engine Architecture
High-Level Steps to Run a Mapping in the Hadoop Environment
Sources in a Hadoop Environment
Flat File Sources
Hive Sources
Relational Sources
Targets in a Hadoop Environment
Flat File Targets
HDFS Flat File Targets
Hive Targets
Relational Targets
Transformations in a Hadoop Environment
Variable Ports in a Hadoop Environment
Functions in a Hadoop Environment
Mappings in a Hadoop Environment
Data Types in a Hadoop Environment
Parameters in a Hadoop Environment
Parameter Usage
Create and Use Hadoop Parameters
Workflows that Run Mappings in a Hadoop Environment
Configuring a Mapping to Run in a Hadoop Environment
Mapping Execution Plans
Hive Engine Execution Plan Details
Blaze Engine Execution Plan Details
Viewing the Execution Plan for a Mapping in the Developer Tool
Monitor Jobs
Accessing the Monitoring URL
Monitor Blaze Engine Jobs
Monitoring a Mapping
Hadoop Environment Logs
Blaze Engine Logs
Hive Engine Logs
Viewing Hadoop Environment Logs in the Administrator Tool
Viewing Logs in the Blaze Job Monitor
Optimization for the Hadoop Environment
Truncating Partitions in a Hive Target
Enabling Data Compression on Temporary Staging Tables
Parallel Sorting
Troubleshooting a Mapping in a Hadoop Environment
Chapter 4: Mappings in the Native Environment
Mappings in the Native Environment Overview
Data Processor Mappings
HDFS Mappings
HDFS Data Extraction Mapping Example
Hive Mappings
Hive Mapping Example
Social Media Mappings
Twitter Mapping Example
Chapter 5: Profiles
Profiles Overview
Native and Hadoop Environments
Supported Data Source and Run-time Environments
Run-time Environment Setup and Validation
Run-time Environment and Profile Performance
Profile Types on Hadoop
Column Profiles on Hadoop
Rule Profiles on Hadoop
Data Domain Discovery on Hadoop
Running a Profile on Hadoop in the Developer Tool
Running a Profile on Hadoop in the Analyst Tool
Running Multiple Data Object Profiles on Hadoop
Monitoring a Profile
Troubleshooting
Chapter 6: Native Environment Optimization
Native Environment Optimization Overview
Processing Big Data on a Grid
Data Integration Service Grid
Grid Optimization
Processing Big Data on Partitions
Partitioned Model Repository Mappings
Partition Optimization
High Availability
Appendix A: Data Type Reference
Data Type Reference Overview
Hive Complex Data Types
Hive Data Types and Transformation Data Types

