Book Description

Security has become a "big data" problem. The growth rate of malware has accelerated to tens of millions of new files per year while our networks generate an ever-larger flood of security-relevant data each day. In order to defend against these advanced attacks, you'll need to know how to think like a data scientist.

In Malware Data Science, security data scientist Joshua Saxe introduces machine learning, statistics, social network analysis, and data visualization, and shows you how to apply these methods to malware detection and analysis.

You'll learn how to:

• Analyze malware using static analysis
• Observe malware behavior using dynamic analysis
• Identify adversary groups through shared code analysis
• Catch 0-day vulnerabilities by building your own machine learning detector
• Measure malware detector accuracy
• Identify malware campaigns, trends, and relationships through data visualization

Whether you're a malware analyst looking to add skills to your existing arsenal, or a data scientist interested in attack detection and threat intelligence, Malware Data Science will help you stay ahead of the curve.

Table of Contents

  1. Cover Page
  2. Title Page
  3. Copyright Page
  4. Dedication
  5. About the Authors
  6. About the Technical Reviewer
  7. BRIEF CONTENTS
  8. CONTENTS IN DETAIL
  9. FOREWORD by Anup Ghosh
  10. ACKNOWLEDGMENTS
  11. INTRODUCTION
    1. What Is Data Science?
    2. Why Data Science Matters for Security
    3. Applying Data Science to Malware
    4. Who Should Read This Book?
    5. About This Book
    6. How to Use the Sample Code and Data
  12. 1 BASIC STATIC MALWARE ANALYSIS
    1. The Microsoft Windows Portable Executable Format
    2. Dissecting the PE Format Using pefile
    3. Examining Malware Images
    4. Examining Malware Strings
    5. Summary
  13. 2 BEYOND BASIC STATIC ANALYSIS: X86 DISASSEMBLY
    1. Disassembly Methods
    2. Basics of x86 Assembly Language
    3. Disassembling ircbot.exe Using pefile and capstone
    4. Factors That Limit Static Analysis
    5. Summary
  14. 3 A BRIEF INTRODUCTION TO DYNAMIC ANALYSIS
    1. Why Use Dynamic Analysis?
    2. Dynamic Analysis for Malware Data Science
    3. Basic Tools for Dynamic Analysis
    4. Limitations of Basic Dynamic Analysis
    5. Summary
  15. 4 IDENTIFYING ATTACK CAMPAIGNS USING MALWARE NETWORKS
    1. Nodes and Edges
    2. Bipartite Networks
    3. Visualizing Malware Networks
    4. Building Networks with NetworkX
    5. Adding Nodes and Edges
    6. Network Visualization with GraphViz
    7. Building Malware Networks
    8. Building a Shared Image Relationship Network
    9. Summary
  16. 5 SHARED CODE ANALYSIS
    1. Preparing Samples for Comparison by Extracting Features
    2. Using the Jaccard Index to Quantify Similarity
    3. Using Similarity Matrices to Evaluate Malware Shared Code Estimation Methods
    4. Building a Similarity Graph
    5. Scaling Similarity Comparisons
    6. Building a Persistent Malware Similarity Search System
    7. Running the Similarity Search System
    8. Summary
  17. 6 UNDERSTANDING MACHINE LEARNING–BASED MALWARE DETECTORS
    1. Steps for Building a Machine Learning–Based Detector
    2. Understanding Feature Spaces and Decision Boundaries
    3. What Makes Models Good or Bad: Overfitting and Underfitting
    4. Major Types of Machine Learning Algorithms
    5. Summary
  18. 7 EVALUATING MALWARE DETECTION SYSTEMS
    1. Four Possible Detection Outcomes
    2. Considering Base Rates in Your Evaluation
    3. Summary
  19. 8 BUILDING MACHINE LEARNING DETECTORS
    1. Terminology and Concepts
    2. Building a Toy Decision Tree–Based Detector
    3. Building Real-World Machine Learning Detectors with sklearn
    4. Building an Industrial-Strength Detector
    5. Evaluating Your Detector’s Performance
    6. Next Steps
    7. Summary
  20. 9 VISUALIZING MALWARE TRENDS
    1. Why Visualizing Malware Data Is Important
    2. Understanding Our Malware Dataset
    3. Using matplotlib to Visualize Data
    4. Using seaborn to Visualize Data
    5. Summary
  21. 10 DEEP LEARNING BASICS
    1. What Is Deep Learning?
    2. How Neural Networks Work
    3. Training Neural Networks
    4. Types of Neural Networks
    5. Summary
  22. 11 BUILDING A NEURAL NETWORK MALWARE DETECTOR WITH KERAS
    1. Defining a Model’s Architecture
    2. Compiling the Model
    3. Training the Model
    4. Evaluating the Model
    5. Enhancing the Model Training Process with Callbacks
    6. Summary
  23. 12 BECOMING A DATA SCIENTIST
    1. Paths to Becoming a Security Data Scientist
    2. A Day in the Life of a Security Data Scientist
    3. Traits of an Effective Security Data Scientist
    4. Where to Go from Here
  24. APPENDIX AN OVERVIEW OF DATASETS AND TOOLS
    1. Overview of Datasets
    2. Tool Implementation Guide
  25. Index