
Learn how to write high-quality kernel module code, solve common Linux kernel programming issues, and understand the fundamentals of Linux kernel internals

Key Features

  • Discover how to write kernel code using the Loadable Kernel Module framework
  • Explore industry-grade techniques to perform efficient memory allocation and data synchronization within the kernel
  • Understand the essentials of key internals topics such as kernel architecture, memory management, CPU scheduling, and kernel synchronization

Book Description

Linux Kernel Programming is a comprehensive introduction for those new to the Linux kernel and module development. This easy-to-follow guide will have you up and running with writing kernel code in next to no time. This book uses the latest 5.4 Long-Term Support (LTS) Linux kernel, which will be maintained from November 2019 through to December 2025. By working with the 5.4 LTS kernel throughout the book, you can be confident that your knowledge will continue to be valid for years to come.

This Linux book begins by showing you how to build the kernel from source. Next, you'll learn how to write your first kernel module using the powerful Loadable Kernel Module (LKM) framework. The book then covers key kernel internals topics including Linux kernel architecture, memory management, and CPU scheduling. Next, you'll delve into the fairly complex topic of concurrency within the kernel, understand the issues it can cause, and learn how they can be addressed with various locking technologies (mutexes, spinlocks, and atomic and refcount operators). You'll also benefit from more advanced material on cache effects, a primer on lock-free techniques within the kernel, deadlock avoidance (with lockdep), and kernel lock debugging techniques.

By the end of this kernel book, you'll have a detailed understanding of the fundamentals of writing Linux kernel module code for real-world projects and products.

What you will learn

  • Write high-quality modular kernel code (LKM framework) for 5.x kernels
  • Configure and build a kernel from source
  • Explore the Linux kernel architecture
  • Get to grips with key internals regarding memory management within the kernel
  • Understand and work with various dynamic kernel memory allocation/deallocation APIs
  • Discover key internals aspects regarding CPU scheduling within the kernel
  • Gain an understanding of kernel concurrency issues
  • Find out how to work with key kernel synchronization primitives

Who this book is for

This book is for Linux programmers beginning to find their way with Linux kernel development. Linux kernel and driver developers looking to overcome frequent and common kernel development issues, as well as understand kernel internals, will benefit from this book. A basic understanding of the Linux CLI and C programming is required.

Table of Contents

  1. Title Page
  2. Copyright and Credits
    1. Linux Kernel Programming
  3. Dedication
  4. Contributors
    1. About the author
    2. About the reviewers
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Download the color images
    6. Conventions used
    7. Get in touch
    8. Reviews
  6. Section 1: The Basics
  7. Kernel Workspace Setup
    1. Technical requirements
    2. Running Linux as a guest VM
    3. Installing a 64-bit Linux guest
    4. Turn on your x86 system's virtualization extension support
    5. Allocate sufficient space to the disk
    6. Install the Oracle VirtualBox Guest Additions
    7. Experimenting with the Raspberry Pi
    8. Setting up the software – distribution and packages
    9. Installing software packages
    10. Installing the Oracle VirtualBox guest additions
    11. Installing required software packages
    12. Installing a cross toolchain and QEMU
    13. Installing a cross compiler
    14. Important installation notes
    15. Additional useful projects
    16. Using the Linux man pages
    17. The tldr variant
    18. Locating and using the Linux kernel documentation
    19. Generating the kernel documentation from source
    20. Static analysis tools for the Linux kernel
    21. Linux Trace Toolkit next generation
    22. The procmap utility
    23. Simple Embedded ARM Linux System FOSS project
    24. Modern tracing and performance analysis with [e]BPF
    25. The LDV - Linux Driver Verification - project
    26. Summary
    27. Questions
    28. Further reading
  8. Building the 5.x Linux Kernel from Source - Part 1
    1. Technical requirements
    2. Preliminaries for the kernel build
    3. Kernel release nomenclature
    4. Kernel development workflow – the basics
    5. Types of kernel source trees
    6. Steps to build the kernel from source
    7. Step 1 – obtaining a Linux kernel source tree
    8. Downloading a specific kernel tree
    9. Cloning a Git tree
    10. Step 2 – extracting the kernel source tree
    11. A brief tour of the kernel source tree
    12. Step 3 – configuring the Linux kernel
    13. Understanding the kbuild build system
    14. Arriving at a default configuration
    15. Obtaining a good starting point for kernel configuration
    16. Kernel config for typical embedded Linux systems
    17. Kernel config using distribution config as a starting point
    18. Tuned kernel config via the localmodconfig approach
    19. Getting started with the localmodconfig approach
    20. Tuning our kernel configuration via the make menuconfig UI
    21. Sample usage of the make menuconfig UI
    22. More on kbuild
    23. Looking up the differences in configuration
    24. Customizing the kernel menu – adding our own menu item
    25. The Kconfig* files
    26. Creating a new menu item in the Kconfig file
    27. A few details on the Kconfig language
    28. Summary
    29. Questions
    30. Further reading
  9. Building the 5.x Linux Kernel from Source - Part 2
    1. Technical requirements
    2. Step 4 – building the kernel image and modules
    3. Step 5 – installing the kernel modules
    4. Locating the kernel modules within the kernel source
    5. Getting the kernel modules installed
    6. Step 6 – generating the initramfs image and bootloader setup
    7. Generating the initramfs image on Fedora 30 and above
    8. Generating the initramfs image – under the hood
    9. Understanding the initramfs framework
    10. Why the initramfs framework?
    11. Understanding the basics of the boot process on the x86
    12. More on the initramfs framework
    13. Step 7 – customizing the GRUB bootloader
    14. Customizing GRUB – the basics
    15. Selecting the default kernel to boot into
    16. Booting our VM via the GNU GRUB bootloader
    17. Experimenting with the GRUB prompt
    18. Verifying our new kernel's configuration
    19. Kernel build for the Raspberry Pi
    20. Step 1 – cloning the kernel source tree
    21. Step 2 – installing a cross-toolchain
    22. First method – package install via apt
    23. Second method – installation via the source repo
    24. Step 3 – configuring and building the kernel
    25. Miscellaneous tips on the kernel build
    26. Minimum version requirements
    27. Building a kernel for another site
    28. Watching the kernel build run
    29. A shortcut shell syntax to the build procedure
    30. Dealing with compiler switch issues
    31. Dealing with missing OpenSSL development headers
    32. Summary
    33. Questions
    34. Further reading
  10. Writing Your First Kernel Module - LKMs Part 1
    1. Technical requirements
    2. Understanding kernel architecture – part 1
    3. User space and kernel space
    4. Library and system call APIs
    5. Kernel space components
    6. Exploring LKMs
    7. The LKM framework
    8. Kernel modules within the kernel source tree
    9. Writing our very first kernel module
    10. Introducing our Hello, world LKM C code
    11. Breaking it down
    12. Kernel headers
    13. Module macros
    14. Entry and exit points
    15. Return values
    16. The 0/-E return convention
    17. The ERR_PTR and PTR_ERR macros
    18. The __init and __exit keywords
    19. Common operations on kernel modules
    20. Building the kernel module
    21. Running the kernel module
    22. A quick first look at the kernel printk()
    23. Listing the live kernel modules
    24. Unloading the module from kernel memory
    25. Our lkm convenience script
    26. Understanding kernel logging and printk
    27. Using the kernel memory ring buffer
    28. Kernel logging and systemd's journalctl
    29. Using printk log levels
    30. The pr_ convenience macros
    31. Wiring to the console
    32. Writing output to the Raspberry Pi console
    33. Enabling the pr_debug() kernel messages
    34. Rate limiting the printk instances
    35. Generating kernel messages from the user space
    36. Standardizing printk output via the pr_fmt macro
    37. Portability and the printk format specifiers
    38. Understanding the basics of a kernel module Makefile
    39. Summary
    40. Questions
    41. Further reading
  11. Writing Your First Kernel Module - LKMs Part 2
    1. Technical requirements
    2. A "better" Makefile template for your kernel modules
    3. Configuring a "debug" kernel
    4. Cross-compiling a kernel module
    5. Setting up the system for cross-compilation
    6. Attempt 1 – setting the "special" environment variables
    7. Attempt 2 – pointing the Makefile to the correct kernel source tree for the target
    8. Attempt 3 – cross-compiling our kernel module
    9. Attempt 4 – cross-compiling our kernel module
    10. Gathering minimal system information
    11. Being a bit more security-aware
    12. Licensing kernel modules
    13. Emulating "library-like" features for kernel modules
    14. Performing library emulation via multiple source files
    15. Understanding function and variable scope in a kernel module
    16. Understanding module stacking
    17. Trying out module stacking
    18. Passing parameters to a kernel module
    19. Declaring and using module parameters
    20. Getting/setting module parameters after insertion
    21. Module parameter data types and validation
    22. Validating kernel module parameters
    23. Overriding the module parameter's name
    24. Hardware-related kernel parameters
    25. Floating point not allowed in the kernel
    26. Auto-loading modules on system boot
    27. Module auto-loading – additional details
    28. Kernel modules and security – an overview
    29. Proc filesystem tunables affecting the system log
    30. The cryptographic signing of kernel modules
    31. Disabling kernel modules altogether
    32. Coding style guidelines for kernel developers
    33. Contributing to the mainline kernel
    34. Getting started with contributing to the kernel
    35. Summary
    36. Questions
    37. Further reading
  12. Section 2: Understanding and Working with the Kernel
  13. Kernel Internals Essentials - Processes and Threads
    1. Technical requirements
    2. Understanding process and interrupt contexts
    3. Understanding the basics of the process VAS
    4. Organizing processes, threads, and their stacks – user and kernel space
    5. User space organization
    6. Kernel space organization
    7. Summarizing the current situation
    8. Viewing the user and kernel stacks
    9. Traditional approach to viewing the stacks
    10. Viewing the kernel space stack of a given thread or process
    11. Viewing the user space stack of a given thread or process
    12. [e]BPF – the modern approach to viewing both stacks
    13. The 10,000-foot view of the process VAS
    14. Understanding and accessing the kernel task structure
    15. Looking into the task structure
    16. Accessing the task structure with current
    17. Determining the context
    18. Working with the task structure via current
    19. Built-in kernel helper methods and optimizations
    20. Trying out the kernel module to print process context info
    21. Seeing that the Linux OS is monolithic
    22. Coding for security with printk
    23. Iterating over the kernel's task lists
    24. Iterating over the task list I – displaying all processes
    25. Iterating over the task list II – displaying all threads
    26. Differentiating between the process and thread – the TGID and the PID
    27. Iterating over the task list III – the code
    28. Summary
    29. Questions
    30. Further reading
  14. Memory Management Internals - Essentials
    1. Technical requirements
    2. Understanding the VM split
    3. Looking under the hood – the Hello, world C program
    4. Going beyond the printf() API
    5. VM split on 64-bit Linux systems
    6. Virtual addressing and address translation
    7. The process VAS – the full view
    8. Examining the process VAS
    9. Examining the user VAS in detail
    10. Directly viewing the process memory map using procfs
    11. Interpreting the /proc/PID/maps output
    12. The vsyscall page
    13. Frontends to view the process memory map
    14. The procmap process VAS visualization utility
    15. Understanding VMA basics
    16. Examining the kernel segment
    17. High memory on 32-bit systems
    18. Writing a kernel module to show information about the kernel segment
    19. Viewing the kernel segment on a Raspberry Pi via dmesg
    20. Macros and variables describing the kernel segment layout
    21. Trying it out – viewing kernel segment details
    22. The kernel VAS via procmap
    23. Trying it out – the user segment
    24. The null trap page
    25. Viewing kernel documentation on the memory layout
    26. Randomizing the memory layout – KASLR
    27. User-mode ASLR
    28. KASLR
    29. Querying/setting KASLR status with a script
    30. Physical memory
    31. Physical RAM organization
    32. Nodes
    33. Zones
    34. Direct-mapped RAM and address translation
    35. Summary
    36. Questions
    37. Further reading
  15. Kernel Memory Allocation for Module Authors - Part 1
    1. Technical requirements
    2. Introducing kernel memory allocators
    3. Understanding and using the kernel page allocator (or BSA)
    4. The fundamental workings of the page allocator
    5. Freelist organization
    6. The workings of the page allocator
    7. Working through a few scenarios
    8. The simplest case
    9. A more complex case
    10. The downfall case
    11. Page allocator internals – a few more details
    12. Learning how to use the page allocator APIs
    13. Dealing with the GFP flags
    14. Freeing pages with the page allocator
    15. Writing a kernel module to demo using the page allocator APIs
    16. Deploying our lowlevel_mem_lkm kernel module
    17. The page allocator and internal fragmentation
    18. The exact page allocator APIs
    19. The GFP flags – digging deeper
    20. Never sleep in interrupt or atomic contexts
    21. Understanding and using the kernel slab allocator
    22. The object caching idea
    23. Learning how to use the slab allocator APIs
    24. Allocating slab memory
    25. Freeing slab memory
    26. Data structures – a few design tips
    27. The actual slab caches in use for kmalloc
    28. Writing a kernel module to use the basic slab APIs
    29. Size limitations of the kmalloc API
    30. Testing the limits – memory allocation with a single call
    31. Checking via the /proc/buddyinfo pseudo-file
    32. Slab allocator – a few additional details
    33. Using the kernel's resource-managed memory allocation APIs
    34. Additional slab helper APIs
    35. Control groups and memory
    36. Caveats when using the slab allocator
    37. Background details and conclusions
    38. Testing slab allocation with ksize() – case 1
    39. Testing slab allocation with ksize() – case 2
    40. Interpreting the output from case 2
    41. Graphing it
    42. Slab layer implementations within the kernel
    43. Summary
    44. Questions
    45. Further reading
  16. Kernel Memory Allocation for Module Authors - Part 2
    1. Technical requirements
    2. Creating a custom slab cache
    3. Creating and using a custom slab cache within a kernel module
    4. Creating a custom slab cache
    5. Using the new slab cache's memory
    6. Destroying the custom cache
    7. Custom slab – a demo kernel module
    8. Understanding slab shrinkers
    9. The slab allocator – pros and cons – a summation
    10. Debugging at the slab layer
    11. Debugging through slab poisoning
    12. Trying it out – triggering a UAF bug
    13. SLUB debug options at boot and runtime
    14. Understanding and using the kernel vmalloc() API
    15. Learning to use the vmalloc family of APIs
    16. A brief note on memory allocations and demand paging
    17. Friends of vmalloc()
    18. Specifying the memory protections
    19. Testing it – a quick Proof of Concept
    20. Why make memory read-only?
    21. The kmalloc() and vmalloc() APIs – a quick comparison
    22. Memory allocation in the kernel – which APIs to use when
    23. Visualizing the kernel memory allocation API set
    24. Selecting an appropriate API for kernel memory allocation
    25. A word on DMA and CMA
    26. Stayin' alive – the OOM killer
    27. Reclaiming memory – a kernel housekeeping task and OOM
    28. Deliberately invoking the OOM killer
    29. Invoking the OOM killer via Magic SysRq
    30. Invoking the OOM killer with a crazy allocator program
    31. Understanding the rationale behind the OOM killer
    32. Case 1 – vm.overcommit set to 2, overcommit turned off
    33. Case 2 – vm.overcommit set to 0, overcommit on, the default
    34. Demand paging and OOM
    35. Understanding the OOM score
    36. Summary
    37. Questions
    38. Further reading
  17. The CPU Scheduler - Part 1
    1. Technical requirements
    2. Learning about the CPU scheduling internals – part 1 – essential background
    3. What is the KSE on Linux?
    4. The POSIX scheduling policies
    5. Visualizing the flow
    6. Using perf to visualize the flow
    7. Visualizing the flow via alternate (CLI) approaches
    8. Learning about the CPU scheduling internals – part 2
    9. Understanding modular scheduling classes
    10. Asking the scheduling class
    11. A word on CFS and the vruntime value
    12. Threads – which scheduling policy and priority
    13. Learning about the CPU scheduling internals – part 3
    14. Who runs the scheduler code?
    15. When does the scheduler run?
    16. The timer interrupt part
    17. The process context part
    18. Preemptible kernel
    19. CPU scheduler entry points
    20. The context switch
    21. Summary
    22. Questions
    23. Further reading
  18. The CPU Scheduler - Part 2
    1. Technical requirements
    2. Visualizing the flow with LTTng and trace-cmd
    3. Visualization with LTTng and Trace Compass
    4. Recording a kernel tracing session with LTTng
    5. Reporting with a GUI – Trace Compass
    6. Visualizing with trace-cmd
    7. Recording a sample session with trace-cmd record
    8. Reporting and interpretation with trace-cmd report (CLI)
    9. Reporting and interpretation with a GUI frontend
    10. Understanding, querying, and setting the CPU affinity mask
    11. Querying and setting a thread's CPU affinity mask
    12. Using taskset(1) to perform CPU affinity
    13. Setting the CPU affinity mask on a kernel thread
    14. Querying and setting a thread’s scheduling policy and priority
    15. Within the kernel – on a kernel thread
    16. CPU bandwidth control with cgroups
    17. Looking up cgroups v2 on a Linux system
    18. Trying it out – a cgroups v2 CPU controller
    19. Converting mainline Linux into an RTOS
    20. Building RTL for the mainline 5.x kernel (on x86_64)
    21. Obtaining the RTL patches
    22. Applying the RTL patch
    23. Configuring and building the RTL kernel
    24. Mainline and RTL – technical differences summarized
    25. Latency and its measurement
    26. Measuring scheduling latency with cyclictest
    27. Getting and applying the RTL patchset
    28. Installing cyclictest (and other required packages) on the device
    29. Running the test cases
    30. Viewing the results
    31. Measuring scheduler latency via modern BPF tools
    32. Summary
    33. Questions
    34. Further reading
  19. Section 3: Delving Deeper
  20. Kernel Synchronization - Part 1
    1. Critical sections, exclusive execution, and atomicity
    2. What is a critical section?
    3. A classic case – the global i ++
    4. Concepts – the lock
    5. A summary of key points
    6. Concurrency concerns within the Linux kernel
    7. Multicore SMP systems and data races
    8. Preemptible kernels, blocking I/O, and data races
    9. Hardware interrupts and data races
    10. Locking guidelines and deadlocks
    11. Mutex or spinlock? Which to use when
    12. Determining which lock to use – in theory
    13. Determining which lock to use – in practice
    14. Using the mutex lock
    15. Initializing the mutex lock
    16. Correctly using the mutex lock
    17. Mutex lock and unlock APIs and their usage
    18. Mutex lock – via [un]interruptible sleep?
    19. Mutex locking – an example driver
    20. The mutex lock – a few remaining points
    21. Mutex lock API variants
    22. The mutex trylock variant
    23. The mutex interruptible and killable variants
    24. The mutex io variant
    25. The semaphore and the mutex
    26. Priority inversion and the RT-mutex
    27. Internal design
    28. Using the spinlock
    29. Spinlock – simple usage
    30. Spinlock – an example driver
    31. Test – sleep in an atomic context
    32. Testing on a 5.4 debug kernel
    33. Testing on a 5.4 non-debug distro kernel
    34. Locking and interrupts
    35. Using spinlocks – a quick summary
    36. Summary
    37. Questions
    38. Further reading
  21. Kernel Synchronization - Part 2
    1. Using the atomic_t and refcount_t interfaces
    2. The newer refcount_t versus older atomic_t interfaces
    3. The simpler atomic_t and refcount_t interfaces
    4. Examples of using refcount_t within the kernel code base
    5. 64-bit atomic integer operators
    6. Using the RMW atomic operators
    7. RMW atomic operations – operating on device registers
    8. Using the RMW bitwise operators
    9. Using bitwise atomic operators – an example
    10. Efficiently searching a bitmask
    11. Using the reader-writer spinlock
    12. Reader-writer spinlock interfaces
    13. A word of caution
    14. The reader-writer semaphore
    15. Cache effects and false sharing
    16. Lock-free programming with per-CPU variables
    17. Per-CPU variables
    18. Working with per-CPU
    19. Allocating, initialization, and freeing per-CPU variables
    20. Performing I/O (reads and writes) on per-CPU variables
    21. Per-CPU – an example kernel module
    22. Per-CPU usage within the kernel
    23. Lock debugging within the kernel
    24. Configuring a debug kernel for lock debugging
    25. The lock validator lockdep – catching locking issues early
    26. Examples – catching deadlock bugs with lockdep
    27. Example 1 – catching a self deadlock bug with lockdep
    28. Fixing it
    29. Example 2 – catching an AB-BA deadlock with lockdep
    30. lockdep – annotations and issues
    31. lockdep annotations
    32. lockdep issues
    33. Lock statistics
    34. Viewing lock stats
    35. Memory barriers – an introduction
    36. An example of using memory barriers in a device driver
    37. Summary
    38. Questions
    39. Further reading
  22. About Packt
    1. Why subscribe?
  23. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think