by Rajiv Tiwari
Hadoop for Finance Essentials
Table of Contents
Hadoop for Finance Essentials
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Errata
Piracy
Questions
1. Big Data Overview
What is big data?
Data volume
Data velocity
Data variety
Big data technology evolution
History
Current
Future
The big data landscape
Storage
NoSQL
NoSQL database types
Resource management
Data governance
Batch computing
Real-time computing
Data integration tools
Machine learning
Business intelligence and virtualization
Careers in big data
Hadoop architecture
HDFS cluster
MapReduce V1
MapReduce V2 – YARN
The Hadoop jungle explained
Big data tamed
Hadoop – the hero
HDFS – Hadoop Distributed Filesystem
MapReduce
HBase
Hive
Pig
Zookeeper
Oozie
Flume
Sqoop
Hadoop distributions
Distribution – on premise
Distribution – cloud
Summary
2. Big Data in Financial Services
Big data use cases across industry sectors
Healthcare
Human science
Telecom
Online retailer
Why big data in the financial sector?
Big data use cases in the financial sector
Data archival on HDFS
Regulatory
Fraud detection
Tick data
Risk management
Customer behavior prediction
Sentiment analysis – unstructured
Other use cases
Big data evolution in finance
Big data tools – what to learn
Getting your data into HDFS
Querying data from HDFS
SQL on Hadoop
Real time
Data governance and operations
ETL tools
Data analytics and business intelligence
Big data implementations in finance
The key challenges
Overcoming the challenges
Generate interest – play area
Pilot with a low-cost project
Hadoop is live – now scale it up
Summary
3. Hadoop in the Cloud
The big data cloud story
The why
The when
What's the catch?
Project details – risk simulations in the cloud
Solution
The current world
The target world
Data collection
Configuring the Hadoop cluster
Data upload
Data transformation
Data analysis
Summary
4. Data Migration Using Hadoop
Project details – archive your transaction data
Solution
Project Phase 1 – split trade data into DW and Hadoop
The current world
The target world
Data collection
How to do it
Data analysis
HDFS shell
Hive queries
Pig
Project Phase 2 – migrate data from relational DW into Hadoop
The current world
The target world
Data collection
Check the connection to the relational database
Import into Hadoop
Initial data migration
Periodic incremental data migration
Import into Hive
Data analysis
Summary
5. Getting Started
Project details – risk and regulatory reporting
Solution
The current world
The target world
Data collection
Option 1 – Apache Oozie
Option 2 – ETL tool ingestion
Data transformation
Hive or Pig?
Hive
Step 1 – Staging
Step 2 – Output results
Pig
Step 1 – Staging
Step 2 – Output results
Other small use case to calculate risk – IR01
Java MapReduce
Data analysis
BI tools
Summary
6. Getting Experienced
Real-time big data
Project details – identifying fraudulent transactions
Solution
The current world
The target world
The Markov Chain Model execution – batch mode
The Storm architecture
The Spark architecture
Data collection
Using Storm
Using Spark
Data transformation
Using Storm
Using Spark
Summary
7. Scale It Up
Scale it up – actually horizontally
A few more big data use cases
Use case – fraud again
Solution
Use case – customer complaints
Solution
Use case – algorithm trading
Solution
Use case – forex trading
Solution
Use case – social media based trading
Solution
Use case – no big data
Solution
The data lake
The lambda architecture
Big data governance
The Apache Falcon overview
Security
Summary
8. Sustain the Momentum
The Hadoop distribution upgrade cycle
Best practices and standards
Environments
Integration with the BI and ETL tools
Tips
Business
Infrastructure
Coding
New trends
Summary
Index