Entity Information Life Cycle for Big Data
by Yinle Zhou, John R. Talburt
Cover image
Title page
Table of Contents
Copyright
Foreword
Preface
Acknowledgements
Chapter 1. The Value Proposition for MDM and Big Data
Definition and Components of MDM
The Business Case for MDM
Dimensions of MDM
The Challenge of Big Data
MDM and Big Data – The N-Squared Problem
Concluding Remarks
Chapter 2. Entity Identity Information and the CSRUD Life Cycle Model
Entities and Entity References
Managing Entity Identity Information
Entity Identity Information Life Cycle Management Models
Concluding Remarks
Chapter 3. A Deep Dive into the Capture Phase
An Overview of the Capture Phase
Building the Foundation
Understanding the Data
Data Preparation
Selecting Identity Attributes
Assessing ER Results
Data Matching Strategies
Concluding Remarks
Chapter 4. Store and Share – Entity Identity Structures
Entity Identity Information Management Strategies
Dedicated MDM Systems
The Identity Knowledge Base
MDM Architectures
Concluding Remarks
Chapter 5. Update and Dispose Phases – Ongoing Data Stewardship
Data Stewardship
The Automated Update Process
The Manual Update Process
Asserted Resolution
EIS Visualization Tools
Managing Entity Identifiers
Concluding Remarks
Chapter 6. Resolve and Retrieve Phase – Identity Resolution
Identity Resolution
Identity Resolution Access Modes
Confidence Scores
Concluding Remarks
Chapter 7. Theoretical Foundations
The Fellegi-Sunter Theory of Record Linkage
The Stanford Entity Resolution Framework
Entity Identity Information Management
Concluding Remarks
Chapter 8. The Nuts and Bolts of Entity Resolution
The ER Checklist
Cluster-to-Cluster Classification
Selecting an Appropriate Algorithm
Concluding Remarks
Chapter 9. Blocking
Blocking
Blocking by Match Key
Dynamic Blocking versus Preresolution Blocking
Blocking Precision and Recall
Match Key Blocking for Boolean Rules
Match Key Blocking for Scoring Rules
Concluding Remarks
Chapter 10. CSRUD for Big Data
Large-Scale ER for MDM
The Transitive Closure Problem
Distributed, Multiple-Index, Record-Based Resolution
An Iterative, Nonrecursive Algorithm for Transitive Closure
Iteration Phase: Successive Closure by Reference Identifier
Deduplication Phase: Final Output of Components
ER Using the Null Rule
The Capture Phase and IKB
The Identity Update Problem
Persistent Entity Identifiers
The Large Component and Big Entity Problems
Identity Capture and Update for Attribute-Based Resolution
Concluding Remarks
Chapter 11. ISO Data Quality Standards for Master Data
Background
Goals and Scope of the ISO 8000-110 Standard
Four Major Components of the ISO 8000-110 Standard
Simple and Strong Compliance with ISO 8000-110
ISO 22745 Industrial Systems and Integration
Beyond ISO 8000-110
Concluding Remarks
Appendix A. Some Commonly Used ER Comparators
References