Explains the mathematics, theory, and methods of Big Data as applied to finance and investing

Data science has fundamentally changed Wall Street: applied mathematics and software code increasingly drive finance and investment-decision tools. Big Data Science in Finance examines the mathematics, theory, and practical use of the revolutionary techniques that are transforming the industry. Designed for mathematically advanced students and discerning financial practitioners alike, this energizing book presents new, cutting-edge content based on world-class research taught in leading Financial Mathematics and Engineering programs. Marco Avellaneda, a leader in quantitative finance, and quantitative methodology author Irene Aldridge help readers harness the power of Big Data.

Comprehensive in scope, this book offers in-depth instruction on how to separate signal from noise, how to deal with missing data values, and how to use Big Data techniques in decision-making. Key topics include data clustering, data storage optimization, Big Data dynamics, Monte Carlo methods and their applications in Big Data analysis, and more. This valuable book:

  • Provides a complete account of Big Data that includes proofs, step-by-step applications, and code samples
  • Explains the difference between Principal Component Analysis (PCA) and Singular Value Decomposition (SVD)
  • Covers vital topics in the field in a clear, straightforward manner
  • Compares, contrasts, and discusses Big Data and Small Data
  • Includes Cornell University-tested educational materials such as lesson plans, end-of-chapter questions, and downloadable lecture slides

Big Data Science in Finance: Mathematics and Applications is an important, up-to-date resource for students in economics, econometrics, finance, applied mathematics, industrial engineering, and business courses, and for investment managers, quantitative traders, risk and portfolio managers, and other financial practitioners.

Table of Contents

  1. Cover
  2. Title Page
  3. Copyright
  4. Preface
    1. Reference
  5. Chapter 1: Why Big Data?
    1. Introduction
    2. Appendix 1.A Coding Big Data in Python
    3. Reference
    4. Notes
  6. Chapter 2: Neural Networks in Finance
    1. Introduction
    2. Neural Network Construction Methodology
    3. The Architecture of Neural Networks
    4. Choosing the Activation Function
    5. Construction and Training of Neural Networks
    6. Model Selection via Dropout
    7. Overfitting
    8. Adding Complexity
    9. Big Data in Machine Learning
    10. Coding a Simple Neural Network for One Instrument from Daily Data
    11. Defining Target Outputs
    12. Testing Performance
    13. Adding Activation Levels
    14. Convergence
    15. Choosing Input Variables
    16. Conclusion
    17. Appendix 2.A Building a Neural Network in Python
    18. References
  7. Chapter 3: Supervised Learning
    1. Introduction
    2. Supervised Learning
    3. Conclusion
    4. Appendix 3.A Python for Supervised Models
    5. References
  8. Chapter 4: Modeling Human Behavior with Semi-Supervised Learning
    1. Introduction
    2. Performance Evaluation via Cross-Validation
    3. Generative Models
    4. Other SSL Models and Enhancements
    5. Conclusion
    6. Appendix 4.A Python for Semi-Supervised Models
    7. References
  9. Chapter 5: Letting the Data Speak with Unsupervised Learning
    1. Introduction
    2. Dimensionality Reduction in Finance
    3. Conclusion
    4. Appendix 5.A PCA and SVD in Python
    5. References
  10. Chapter 6: Big Data Factor Models
    1. Why PCA and SVD Deliver Optimal Factorization
    2. Eigenportfolios
    3. Using Factors to Predict Returns
    4. Factor Discovery
    5. Instrumented PCA
    6. The Three-Pass Model
    7. Risk-Premium PCA
    8. Nonlinear Factorization
    9. Correlation-Based Factors
    10. Hierarchical PCA (HPCA)
    11. Disadvantages of PCA and SVD
    12. Conclusion
    13. Appendix 6.A Python for Big Data Factor Models
    14. References
    15. Note
  11. Chapter 7: Data as a Signal versus Noise
    1. Introduction
    2. Random Data Shows in Eigenvalue Distribution
    3. Application: What's in the Data Bag?
    4. The Marčenko-Pastur Theorem
    5. Spike Model: Which Value to Pick on the “Elbow”?
    6. Dealing with Highly Correlated Data
    7. Deconstructing the Mona Lisa
    8. What's in the Data Bag?
    9. Applications
    10. The Karhunen-Loève Transform
    11. Data Imputation
    12. Missing Eigenvalues
    13. The Tracy-Widom Distribution
    14. Identifying (and Replacing) Missing Values in Streaming Data (the Johnson-Lindenstrauss Lemma)
    15. Conclusion
    16. Appendix 7 Finding the Optimal Number of Eigenvectors in Python
    17. References
  12. Chapter 8: Applications: Unsupervised Learning in Option Pricing and Stochastic Modeling
    1. Introduction
    2. Application 1: Unsupervised Learning in Options Pricing
    3. Application 2: Optimizing Markov Chains with the Perron-Frobenius Theorem
    4. Conclusion
    5. Appendix 8.A Determining the Percentage of Variation Explained by the Top Principal Components in Python
    6. References
    7. Note
  13. Chapter 9: Data Clustering
    1. Introduction
    2. Clustering Methodology
    3. Clustering Financial Data
    4. Empirical Results
    5. Conclusion
    6. Appendix 9.A Clustering with Python
    7. References
  14. Conclusion
  15. Index
  16. End User License Agreement