A
access. See data access
active data warehouse,
14–
16
active metadata repository,
99–
100
administration
management, of DW 2.0 environment,
358–
361
prioritization/conflicts,
358
scheduling and milestones,
359
aggregate data
granularity managers,
235
airline metadata example,
106–
109
analysis
creating new, from DW 2.0 data,
276–
277
analytical productivity and response time,
243–
244
analytical response time,
241
applications
active data warehouse,
14–
16
from application to corporate data,
216,
219–
221
evolution of data warehousing, –, –, –
10
transaction monitor and response time,
171–
172
transaction processing. See OLTP
architecture
architectural administration,
348–
351
building “real” data warehouse,
21–
22
creating new analysis,
273–
277
flow of data through,
203
new paradigm of DW 2.0,
24,
25
Archival Sector,
76–
86
architectural administration,
348–
349
life cycle of data,
27–
30
archiving data.
See also Archival Sector
ASCII/EBCDIC conversions,
223
attacks on data, sensing,
185–
187
audience and star schemas,
18–
19
audit trail
correcting/resetting data,
330–
331
B
backflow of data
from exploration facilities,
152–
154
from Integrated Sector,
212–
213
balancing entry for bad data,
330
bank account information,
116
banking transactions,
193,
217
batch mode
exception-based flow of data,
212
BI (business intelligence) universe,
97–
98
big bang approach, –
brittleness and star schemas,
18
business
corporate data and Integrated Sector,
62
corporate information factory,
12,
13,
134
enterprise knowledge coordination stream,
129–
133
impact of data warehousing,
11
representation and data models,
157–
158
“Business Intelligence Road Map” (Moss),
125–
126
business perspective
evolution of data warehousing, –,
14
granularity managers,
238
monitoring DW 2.0 environment,
178
statistical processing,
155–
156
technology infrastructure,
121–
122
C
calculations and data mapping,
317
capacity/disk storage
active data warehouse,
15
DBMS activities,
evolution to DW 2.0 environment,
10
history of data warehousing,
CDC (changed data capture),
226
changes
infrastructure. See technology infrastructure
mitigating business change,
119
rapid business changes,
114
CMSM (cross-media storage manager),
74–
75,
211–
212
code
checking automatically generated,
257–
258
concepts data models,
291
consultants, managing,
359–
361
continuous time span data
beginning/ending sequence of records,
197–
198
nonoverlapping records,
197
corporate data
enterprise knowledge coordination stream,
129–
133
corporate information factory,
12,
13,
134
correcting/resetting data,
330–
331
costs/cost justification
active data warehouse,
16
creating new analysis,
273–
277
creating new analysis from DW 2.0 data,
276–
277
DW 2.0 implementation,
271,
273
economics and evolution of data warehousing,
10–
11
factoring cost of DW 2.0,
277–
278
factors affecting, for new analysis,
276
first-generation vs. DW 2.0,
281–
282
historical information,
280–
281
macro-level justification,
271–
272
micro-level justification,
272–
273
perspective of business users,
282
real economics of DW 2.0,
279
reality of information,
278
time value of information,
279–
280
value of integration,
280
credit card data,
11,
295
cross-media storage manager (CMSM),
74–
75,
211–
212
D
data access
in DW 2.0 environment,
33–
34
volumes. See volumes of data
data correction stream,
133
data flow. See flow of data
data integration
introduction, –
data item set (dis) level,
159–
160
data marts
vs. exploration facilities,
152
transforming data from,
325
data models
business representation,
157–
158
corporate, and seven streams approach,
131
granular vs. summarized data,
159
intellectual road map,
157
perspective of business users,
166
data profiling
enterprise knowledge coordination stream,
129–
133
tools and reverse-engineered data model,
288–
289
data quality
data profiling inconsistencies,
294–
296
reverse-engineered data model,
288–
289
data warehouse
building “real” vs. variations,
21–
22
business appeal of DW 2.0,
24
business perspective,
26,
90–
92
compared to data mart,
21
data mart. See data marts
defined,
different development approach, –
DW 2.0 technology foundation,
45–
46
environment. See environment, DW 2.0
exploration warehouse,
13
first-generation. See history of data warehousing
integrating data, –
new paradigm of DW 2.0,
24,
25
shaping factors of DW 2.0,
23–
24
speed of data movement into/through,
331
star schema approach,
15,
18–
19
useful applications,
51–
52
data warehouse monitor
falling probability of data access,
209–
210
data warehouse utility (DWU),
332–
337
DBMS (data base management systems)
data warehouse utilities,
332–
337
purpose of, –
DDL (data definition language),
290
Decision Support Systems (DSS) processing,
41,
70–
71
default values and ETL processing,
223
development
data warehouse approach, –
PCs and 4GL technology, –
dis (data item set) level,
159–
160
discontinuity of data,
197
disk storage. See capacity/disk storage
diversity and metadata,
41
dividing data and technology infrastructure,
121
domains in data quality tool set,
287
Dow Jones Industrial average,
194
DSS (Decision Support Systems) processing,
41,
70–
71
DW 2.0. See data warehouse; environment, DW 2.0
DWU (data warehouse utility),
332–
337
E
EBCDIC/ASCII conversions,
223
ELT (extract/load/transform) processing
perspective of business users,
227–
228
email
as unstructured data,
299
end-user perspective. See business perspective
enterprise knowledge coordination stream,
129–
133
enterprise reference model stream,
130
enterprise-wide metadata
metadata in DW 2.0 environment,
97–
98
environment, DW 2.0
data warehouse,
data warehouse monitor,
171
management administration,
358–
361
migration to unstructured,
267–
269
preparing unstructured data,
38–
40
referential integrity,
52
responding to business changes,
47–
48
spider web, –,
structured/unstructured data,
34–
35,
86–
90
technology foundation,
45–
46
transaction monitor and response time,
171–
172
ERD (entity relationship level),
159–
160
ETL (extract/transform/load) processing
from application to corporate data,
216,
219–
221
code creation/parametrically driven,
225
complex transformations,
221
data flow in DW 2.0,
48,
205
in DW 2.0 environment,
215–
216
migration shock absorber,
267
real-time processing,
218
technology to prepare data,
308
unstructured processing,
87–
88
evolution of data warehousing. See history of data warehousing
exception-based flow of data,
210–
213
exploration facilities
data marts compared to,
152
frequency of analysis,
147
refreshing exploration data,
149
sources for exploration processing,
149
using data internally,
155
exploration processing,
146
exploration warehouse,
13,
24
extensibility
nonextensibility and data marts,
20
extract/load/transform (ELT) processing,
226–
227
extracts
ETL. See ETL (extract/transform/load) processing
proliferation and data marts,
20
F
federated data warehouse
variations of data warehouses,
14–
15
first-generation data warehousing. See history of data warehousing
flow of data
in DW 2.0 environment,
48–
49
falling probability of data access,
209–
210
perspective of business users,
213–
214
throughout architecture,
203
frequency of analysis,
147
frequent flyer programs,
11
H
hardware/software selection,
256
heuristics
analysis and statistical processing,
145–
146
highway analogy for workload,
64,
66
historical data
data warehouse,
federated data warehouse,
17
history of data warehousing
from business perspective, –,
14
capacity/disk storage,
data warehouse environment,
DBMS, –
DW 2.0 compared to first-generation,
23–
24
early progression of systems,
master files,
online applications, –
PCs and 4GL technology, –
spider web environment, –,
I
Improving Data Warehouse and Business Information Quality (English),
137
indexing
passive indexes for archival data,
81–
83,
344
information factory development
infrastructure
technology. See technology infrastructure
integrated data
in data warehouse, –
evolution of data warehousing,
10
federated data warehouse,
17
scope of, and data models,
158–
159
value of integration,
280
Integrated Sector,
62–
71
continuous time span data,
63
data key reconciliation,
62
life cycle of data,
27–
30
referential integrity,
68–
69
subject-oriented detailed data,
62–
63
transactions, and time-variant data,
193–
194
integrity of data
active data warehouse,
15
statistical comparison,
144–
145
Interactive Sector,
55–
61
life cycle of data,
27–
30
referential integrity,
58
intersector/intrasector referential integrity,
52
IT (information technology)
reducing IT response time,
115
technology infrastructure,
112–
113
M
macro-level cost justification,
271–
272
maintaining metadata,
106
management administration of DW 2.0 environment,
358–
361
mangled characters, monitoring,
175
maps, level of detail,
159–
160
master files,
metadata
active/passive repositories,
99–
100
building infrastructure,
266
as by-product of granularity manager,
237–
238
infrastructure and performance,
248
in Interactive Sector,
31–
33
reusability of data and analysis,
96,
249
transformation process,
341
using, airline example,
106–
109
methodology
seven streams approach,
129–
139
micro-level cost justification,
272–
273
migration
adding Archival Sector,
264–
265
adding components incrementally,
262–
264
building metadata infrastructure,
266
creating enterprise metadata,
265–
266
ETL as shock absorber,
267
perspective of business users,
269–
270
swallowing source systems,
266–
267
to unstructured environment,
267–
269
milestones and scheduling,
359
Monitor it and report it domain,
287
monitoring DW 2.0 environment
application monitoring,
172
by architectural administrator,
350
peak-period processing,
172–
174
perspective of business users,
178
transaction monitor and response time,
171–
172
transaction queue monitoring,
171
transaction record monitoring,
172
unmatched foreign keys,
174–
175
Moss, Larissa, and spiral methodology,
125–
128
Mythical Man Month, The (Brooks),
115
O
ODS (operational data store),
13
offline data and security,
182–
184
OLAP (online analytical processing),
20
OLTP (online transaction processing)
federated data warehouse,
16–
17
online applications/processing
active data warehouse,
14–
16
evolution of data warehousing, –
10
history of data warehousing, –, –
transaction processing. See OLTP
online response performance,
239–
241
operational application systems
operational/legacy systems environment,
313
P
paradigm of DW 2.0,
24,
25
parallelization
batch, and performance,
249
granularity managers,
237
transaction processing,
249–
250
passive indexes for archival data,
81–
83,
344
passive metadata repository,
99–
100
password flooding attacks,
186
patient’s records,
51,
52
PCs and 4GL technology, –
peak-period processing,
172–
174
performance
analytical productivity and response time,
243–
244
analytical response time,
241
batch parallelization,
249
checking automatically generated code,
257–
258
data models and Interactive Sector,
161–
162
in DW 2.0 environment,
239
exploration facilities,
252
federated data warehouse,
16–
17
hardware/software selection,
256
heuristic processing,
243
metadata infrastructure,
248–
249
monitoring environment,
246–
247
parallelization for transaction processing,
249–
250
perspective of business users,
258–
259
physically grouping data,
257
protecting Interactive Sector,
254–
255
reducing IT response time,
115
removing dormant data,
245–
246
separating farmers/explorers,
256–
257
separation of transactions into classes,
253–
254
service level agreements,
254
transaction monitor and response time,
171–
172
physically grouping data,
257
pointers and unstructured processing,
87
preprogrammed complex transactions,
340
prioritization/conflicts and administration,
358
probability of data access
for different sectors,
30–
31
processing in DW 2.0 environment,
339–
344
proliferation and star schemas,
19
protecting Interactive Sector,
254–
255
R
random data access,
33–
34
rationalization of textual data,
39–
40
reading text for analytical processing,
299–
300
reality of information,
278
real-time ETL processing,
218
reasonability checking,
224
reconciliation
encoded values for data mapping,
318
refreshing exploration data,
149
relational data base,
309
repositories for metadata,
98–
100
resetting data values,
330
response time
analytical productivity,
243–
244
transaction monitoring,
171–
172
return on investment (ROI),
128
reverse-engineered data model,
288–
289
road maps
“Business Intelligence Road Map” (Moss),
125–
126
DW/BI project road map,
137–
138
intellectual road map,
157
S
sales data
in Integrated Sector,
62–
63
semantically stable data,
117
scheduling and milestones,
359
SDLC (systems development life cycle),
123
searches.
See also data access; queries
sectors. See also Archival Sector; Integrated Sector; Interactive Sector; Near Line Sector
reasons for different sectors,
30–
31
security
data warehouse monitor,
185
password flooding attacks,
186
perspective of business users,
187–
188
protected unstructured data,
187
Self Organizing Map (SOM),
165
semantic relationships
mitigating business change,
119
mixing stable/unstable data,
118
separating stable/unstable data,
118
semantically temporal/static data,
116–
117
semistructured data/value,
307–
308
sequential data access,
33–
34
seven streams approach
data correction stream,
133
data profiling and mapping stream,
133
DW/BI project road map,
137–
138
enterprise knowledge coordination stream,
129–
133
enterprise reference model stream,
129
information factory development stream,
133
infrastructure stream,
133
total information quality management stream,
134–
137
shared data mart data,
327–
328
slivers and spiral methodology,
127
sniffing and data warehouse monitor,
176
SOM (Self Organizing Map),
165
source data system of records,
316
sources
best source data from operational environment,
316
exploration processing,
149
swallowing source systems,
266–
267
specific/general text,
39–
40
speed of data movement,
331
spellings, alternate,
105,
305
spider web environment
history of data warehousing, –
transition to data warehouse environment,
stable/unstable data, semantically features of,
117
statistical comparison,
144–
145
statistical processing
active data warehouse,
16,
141
data marts and exploration facilities,
152
exploration facilities,
147
exploration processing,
146
frequency of analysis,
147
integrity of comparisons,
144–
145
perspective of business analyst,
155–
156
refreshing exploration data,
149
sources for exploration processing,
149
using exploration data internally,
155
using statistical analysis,
143
storage. See capacity/disk storage
structured data
vs. unstructured data,
34–
35
subject-oriented detailed data,
62–
63
subjects
subject area definitions,
101–
102
summary data
granular vs. summarized data,
159
synonyms
replacement/concatenation,
303
system of record
best source data from operational environment,
316
operational/legacy systems environment,
313
perspective of business users,
319–
320
systems development life cycle (SDLC),
123
systems/technology administration,
355–
358
T
taxonomies
unstructured processing,
87
technology
for different sectors,
34
evolution of data warehousing,
federated data warehouse,
17
responding to business changes,
47–
48
seven streams approach,
129–
139
technology infrastructure
creating snapshots of data,
119–
120
getting off treadmill,
115
mitigating business change,
119
mixing semantically stable/unstable data,
118
rapid business changes,
114
reducing IT response time,
115
semantically stable data,
117
semantically temporal data,
116–
117
semantically temporal/static data,
115–
116
separating semantically stable/unstable data,
118
text across languages,
305
textual analytical processing,
300–
301
textual data
in DW 2.0 environment,
34–
35
evolution of data warehousing,
10
external glossaries/taxonomies,
304–
305
homographic resolution,
303–
304
perspective of business users,
310
relational data base,
309
semistructured data/value,
307–
308
synonym replacement/concatenation,
303
text across languages,
305
unstructured processing,
86–
90
time, performance. See performance
time value of information,
279–
280
time-variant data
beginning/ending sequence of records,
197–
198
end-user perspective,
200
nonoverlapping records,
197
structure of DW 2.0 data,
191–
192
time relativity in Interactive Sector,
192
transactions in Integrated Sector,
193–
194
TIQM (total information quality management) stream,
134–
137
total information quality management (TIQM) stream,
134–
137
total quality data management (TQdM),
134
TQdM (total quality data management),
134
transaction monitor
application monitoring,
172
transaction processing.
See also OLTP
in DW 2.0 environment,
339–
344
Interactive Sector,
56–
57
life cycle of data,
27–
30
parallelization and performance,
249–
250
preprogrammed transactions,
340
simple/complex transactions,
339–
340
transformation
of data. See ETL (extract/transform/load) processing
triggers for data flow,
206–
207