Index

Please note that index links point to page beginnings from the print edition. Locations are approximate in e-readers, and you may need to page down one or more times after clicking a link to get to the indexed material.

References to figures are in italics.

$JAVA_HOME, 118

@IF function, 109–110

A

“A Relational Model of Data for Large Shared Data Banks” (Codd), 19

ACID, 242–243

Ada, 34

Advanced Replication, 79

After (A) image, 103

agents, 58

See also gateways

Agile, integrations with Agile development, 182–183

Apache Hadoop. See Hadoop

Apache Oozie, 245

Apache Pig, 245

APEX, 226–230

application integrations, 6

apply (replicat) process, 94–95

architecture

GoldenGate, 91–95

Oracle Data Integrator, 114–118, 147–153

archive logs, 77

ASCII-formatted files, 102–104

asynchronous mode, 79

atomicity, 242

auditing, 223

B

bad files, 49

BASE, 243

Before (B) image, 103

Berkeley DB, 243

Big Data

defined, 234

and GoldenGate, 246–247

and Oracle Data Integrator, 244–245

overview, 234–236

querying of, 20

volume, variety, and velocity (the three V’s), 234–235

Big Data Appliance

Cloudera, 237

NoSQL, 237

Oracle Big Data SQL, 240

Oracle Loader for Hadoop, 238

Oracle R distribution, 238

Oracle SQL Connector for Hadoop, 239–240

Oracle XQuery for Hadoop, 238

overview, 236–237

Big Data Connectors, 244

big endian systems, 74

business purpose, 187–189

C

capture (extract) process, 92–94

CDBs. See multitenant container databases (CDBs)

challenges

business purpose, 187–189

change, 184–187

data problems, 190–192

designing for integrations, 181–184

integrations with Agile development, 182–183

latency, 194–195

managing mapping tables, 198–200

overview, 180–181

performance, 201–202

standardization, 189–190

synchronizing data and copies, 192–194

testing, 200–201

tool issues, 195–197

change, 184–187

Cloudera, 237

Codd, Edgar, 19

commits, 29, 30

common and uniform access, 5

common data, 6

common data storage, 6

Common Infrastructure Object, 130

communication, 11–12

of the business purpose, 187–189

complete refresh, 61

components needed for integration, 8–11

config.sh, 135

connected users, 55

consistency, 243

consistent naming, 189–190

consolidation use-cases, 89, 90

COPY command, 41–43

Create Table As Select command, 32, 42

CTAS. See Create Table As Select command

Cutting, Doug, 240

D

data classification, 9

data cleansing, 10, 13

automated updates, 210–211

defined, 206–207

governing data sources, 217–221

manual updates, 210–211

standards, 211, 213–217

steps for, 212–213

tools, 211, 221–225, 226–230

using Oracle Data Integrator, 222–225

See also data quality; master data management (MDM)

data context, 218–219

Data Control Language. See DCL

Data Definition Language. See DDL

data distribution use-cases, 90

data integration

components needed for, 8–11

defined, 3–4

designing systems for, 7

history of, 4–6

today, 6–11

data lakes, 235

Data Manipulation Language. See DML

data marts, 5

data merge, 9

data migrations. See migrations

data owners, 11

data pools, 235

data pump (extract) process, 94

Data Pump utility, 69–72

data quality, 10, 190–192, 206

assessing the data, 210

cleansing the data, 210–211

establishing data context, 210

evolving business requirements, 210

monitoring, 211–212

tools, 196, 219–220

using Oracle Data Integrator, 222–225

See also data cleansing; master data management (MDM)

data replication. See replication

data reservoirs, 235, 236

data sources, 217–221

data stewards, 220

data types, 10

data validation, 13, 176–177

data warehouses, 5, 10, 192

database links, 54–57

DataNodes, 241

DBMS_HS_PASSTHROUGH package, 59

DCL, 33

DDL, 31–32

Create Table As Select command, 32

TRUNCATE command, 28

WHERE clause, 32

decision flow chart, 11–13

Definition Generator (DEFGEN) utility, 95–96

configuring, 96–97

parameters, 97, 98

running, 98–99

DELETE statements, 27–28

delimiter separated values (DSV) files, 100

designing systems for integration, 7, 181–184

DML

commits, 29, 30

DELETE statements, 27–28

INSERT statements, 24–25

MERGE statements, 28–29

overview, 23

ROLLBACK statements, 30

rollbacks, 29

savepoints, 29, 30–31

transactions, 29–31

UPDATE statements, 25–26, 27

VALUES clause, 24

WHERE clause, 25–26

durability, 243

E

endianness, 74

ETL, 9

See also ODI agent

event triggers, 40–41

export/import utility

full database export, 68

invoking export from the command line, 65

invoking export interactively from the command line, 65–66

invoking export via parameter files, 67–69

overview, 64, 65

external data sources, 3

external tables, 20, 50–52

Extract, Transform, and Load. See ETL

F

federated queries, 193

File System, 241

filtering data, 2, 7

fixed user links, 55

fixed-length format, 47

flat files, 99, 147

configuring a flat file physical schema, 148–151

delimiter separated values (DSV) files, 100

extract parameters to write, 101

generating, 100–102

generating an ASCII-formatted file, 102–104

length separated values (LSV) files, 100

megabyte clause, 102

parameters, 102

types of, 100

force logging, 93–94

FORMATASCII parameter, 102, 105

full database export, 68

See also export/import utility

functions, PL/SQL, 34–36

Fusion Middleware Configuration Wizard, 135–140

G

gateways, 58–60

GoldenGate, 14, 82–83, 88, 114

After (A) image, 103

adaptors, 246–247

apply (replicat) process, 94–95

architecture, 91–95

Before (B) image, 103

benefits of using, 88

and Big Data, 246–247

capture (extract) process, 92–94

changing data using @IF function, 109–110

compressed updates (V), 103

creating native database loader files, 104–106

data pump (extract) process, 94

Definition Generator (DEFGEN) utility, 95–99

extracting for database utility usage, 105–106

flat files, 99–104

force logging, 93–94

functions, 107–109

supplemental logging, 93–94

supported format parameters, 104

testing data with, 107–110

trail files, 91, 95

use-cases, 88–91

user exit functions, 106

Google Corporation, 241

GROUP BY clause, 21

H

Hadoop, 192

clusters, 20

MapReduce, 241, 242

Oracle Loader for Hadoop, 238

Oracle SQL Connector for Hadoop, 239–240

Oracle XQuery for Hadoop, 238

overview, 240–241

Hadoop Distributed File System, 239, 241

HAVING clause, 21

HBase, 244

HDFS, 239, 241

heterogeneous data, 191

heterogeneous platforms, 221

Heterogeneous Services, 58

Hive, 245

Hive KMs, 245

Hive Query Language (HQL), 245

HQL, 245

I

import. See export/import utility

incremental refresh, 61

INSERT statements, 24–25

instead-of triggers, 40

integrated knowledge modules (IKMs), 171

See also Knowledge Modules (KMs)

integrating data. See data integration

Integrator Studio. See ODI Studio

internal data, 3

Internet of Things, 234

IoT, 234

isolation, 243

J

Java Development Kit (JDK), 136–137

Java EE agents, 117

Java Messaging Service, 99

Java Virtual Machine (JVM), 117, 134–135

JSON, 191, 192

K

Knowledge Modules (KMs), 244–245

See also integrated knowledge modules (IKMs)

L

latency, 194–195

LCRs. See logical change records (LCRs)

length separated values (LSV) files, 100

little endian systems, 74

Loading Knowledge Modules (LKMs), 244–245

logical change records (LCRs), 80, 83–84

logical schemas, 152–153, 162–163

LogMiner, 76–78

Lovelace, Ada, 34

M

manual integrations, 5

mappings, 147–151, 166–172

managing mapping tables, 198–200

running, 172–176

simulation, 174–175

step-by-step execution, 175–176

MapReduce, 241, 242

master data management (MDM), 9, 190, 199, 224

overview, 208–209

process, 209–213

See also data cleansing; data quality; metadata management

master repository, 116

materialized views, 60–64

MDM. See master data management (MDM)

MERGE statements, 28–29

metadata management, 10–11, 198–200

migrations, 5, 9

near-zero-downtime migrations, 89, 90–91

planning for, 185–186

transportable tablespaces, 72–75

multitenant container databases (CDBs), 75–76

multitenant databases. See multitenant container databases (CDBs)

N

NameNode, 241

near-zero-downtime migrations, 89, 90–91

net service name, 56

NonStop, 82

NoSQL, 192, 235, 237, 242–243

querying of, 20

null, 24

Nutch, 241

O

ODI. See Oracle Data Integrator

ODI agent, 134–141

scripting startup, 141

starting manually, 141

ODI Studio, 141–142, 222

OGG. See GoldenGate

OLTP, 80

online transaction processing. See OLTP

Oozie, 245

Oracle Application Express (APEX), 226–230

Oracle Big Data Appliance. See Big Data Appliance

Oracle Big Data SQL, 240

Oracle Data Integrator, 14, 224

adding a database data model, 163–166

architecture, 114–118, 147–153

and Big Data, 244–245

configuring a flat file physical schema, 148–151

configuring a topology, 147–153

configuring the ODI agent, 134–141

Console, 118

context menus, 154

creating a database logical schema, 162–163

creating a database physical schema, 159–162

creating a new model, 154–155

creating a project, 167–172

as a data flow modeler, 153–166

defining a datastore for a model, 155–158

deploying the binaries, 118–124

designing models, 153–166

initial connection and wallet configuration, 143–146

installation, 118–124

interacting with Oracle Enterprise Manager 12c, 118

Java EE agents, 117

logical architecture, 151–153

mappings, 166–176

master repository, 116

overview, 114

physical architecture, 147–151

preparing the repository, 124–133

repositories, 115–116

Repository Creation Utility (RCU), 124–133

rules for data quality and cleansing, 222–225

running mappings, 172–176

run-time agents, 117

setting connections, 142–143

setting up the target side of the integration, 159–166

standalone agents, 117

standalone co-located agents, 117

starting, 141–146

Studio, 141–142, 222

users, 116

validating a data integration, 176–177

verifying the repository, 133–134

work repository, 115–116

Oracle Enterprise Data Quality, 224

See also data quality

Oracle Enterprise Manager 12c, 118

Oracle Enterprise Metadata Management (OEMM), 200

Oracle GoldenGate. See GoldenGate

Oracle Inventory, 119

Oracle Loader for Hadoop, 238

Oracle Master Data Management, 224

See also master data management (MDM)

Oracle R distribution, 238

Oracle SQL Connector for Hadoop, 239–240

Oracle Streams, 79–82

Oracle Technology Network, 226

Oracle Universal Installer, 119, 120

Oracle Wallet Manager (OWM), 145

Oracle XQuery for Hadoop, 238

ORDER BY clause, 21–22

OUI. See Oracle Universal Installer

outbound servers, 84

P

packages, 37–38

par files. See parameter files

parameter files, 50

passwords, 145, 146

PDBs. See pluggable databases

performance, 201–202

physical schemas, 159–162

Pig, 245

planning a data integration, 180–181

anticipating other uses, 184

business purpose, 187–189

for change, 184–187

data quality, 190–192

designing for integrations, 181–184

integrations with Agile development, 182–183

involving the business and data owners, 183

latency, 194–195

managing mapping tables, 198–200

performance, 201–202

reference data, 183

standardization, 189–190

synchronizing data and copies, 192–194

testing, 200–201

tool issues, 195–197

PL/SQL

functions, 34–36

overview, 33–34

packages, 37–38

stored procedures, 36–37

SYSDATE operator, 37

triggers, 38–41

See also SQL

pluggable databases, 54, 75, 144–145

moving, 76

procedures. See stored procedures

PySpark, 245

Q

queries, 19

federated queries, 193

subqueries, 23

R

RCU wizard, 124–133

RDBMS systems, 19

record indicators, 104

redo logs, 77

refreshing materialized views, 61–62

remote databases, 55

hiding, 57

replication, 9

repositories, 115–116, 124–133

verifying, 133–134

Repository Creation Utility (RCU), 124–133

reverse engineering, 158, 165

ROLLBACK statements, 30

rollbacks, 29

row triggers, 39

run-time agents, 117

S

savepoints, 29, 30–31

scrubbing data. See data cleansing

SELECT statement, 19–23, 35–36

service names, 56, 71

snapshots, 60

See also materialized views

Spark, 245

SPOOL command, 43–45

SQL

DCL, 33

DDL, 31–32

DML, 23–31

external tables, 20

fields, 20

GROUP BY clause, 21

HAVING clause, 21

joins, 22

ORDER BY clause, 21–22

overview, 19

queries, 19

ROWID, 22

SELECT statement, 19–23, 35–36

semicolons, 20

subqueries, 23

tables, 20

views, 20

WHERE clause, 20–21, 35

See also PL/SQL

SQL Developer Data Modeler, 153

SQL Parser, 60

SQL*Loader

control file, 45–49

dealing with the bad file, 49

invoking, 49–50

overview, 45

SQL*Plus, 14

COPY command, 41–43

SPOOL command, 43–45

SQLLOADER, 105

SQLService, 58

Sqoop, 245

standalone agents, 117

standalone co-located agents, 117

standardization, 189–190

standards

data cleansing, 211, 213–217

field naming for, 215

outliers, 216

when standards don’t work, 215–217

statement triggers, 39–40

store and forward, 79

stored procedures, 36–37

Streams. See Oracle Streams

structured data, 10

Structured Query Language. See SQL

supplemental logging, 78, 93–94

synchronizing data and copies, 192–194

synchronous mode, 79

SYSDATE operator, 37

T

tablespace migrations. See transportable tablespaces

Tandem space, 82

testing, 200–201

with GoldenGate, 107–110

timing, 12–13

tnsname. See TNSNAMES.ora file

TNSNAMES.ora file, 56

tools, 13–14, 195–197, 211

combining, 224

data cleansing, 221–225, 226–230

data quality, 219–220

trail files, 91, 95

transactions, 29–31

transport databases, 75–76

transportable tablespaces, 72–75

triggers

event triggers, 40–41

instead-of triggers, 40

overview, 38

row triggers, 39

statement triggers, 39–40

TRUNCATE command, 28

U

unidirectional use-cases, 90

unstructured data, 10

UPDATE statements, 25–26, 27

UPSERT statement. See MERGE statements

use-cases

consolidation, 89, 90

data distribution, 90

near-zero-downtime migrations, 89, 90–91

overview, 88–89

unidirectional, 89

users, and Oracle Data Integrator, 116

V

validation. See data validation

value of data, 2–3

volume of data, 192

W

wallet password, 145, 146

WebLogic Domain, 135

WebLogic Server, 135

WHERE clause, 20–21, 25–26, 32, 35

work repository, 115–116

X

XML, 191

XStream API

overview, 83

XStream In, 84, 85

XStream Out, 83–84
