Common Types of Problems

There are many reasons why the kernel may crash or otherwise misbehave. However, most failures are variations on a few common errors, and once you know which class of problem you are dealing with, it becomes much easier to know where to look in your code. Let’s take a broad look at some of the problems you may encounter during kernel development.

  • Race Conditions: A general class of bugs in which multiple threads of execution conflict with each other and the outcome depends on which thread gets there first. Race conditions are quite common and are often due to poor design or poor locking in multi-threaded environments. They can be tricky to reproduce and may stay hidden for a long time before discovery, because things go wrong only when a particular sequence of events happens in a specific order, for example, an order that depends on user input.
  • Deadlocks: Happen when locking is poorly implemented and a thread waits on an event that will never occur or a lock that will never be released, possibly because a second thread failed to release the lock after use. Once this happens, the condition can spread to other threads in the system that also need the lock. From a user’s point of view, it can look like their application has hung, and they may be unable to force quit it because it is stuck waiting for an event in the kernel.
  • Lock Contention: A performance problem that occurs when many threads need the same lock and spend excessive time waiting for it rather than doing useful work. Lock contention is usually the result of poor design and can be prevented by implementing a proper locking scheme. The general rule is to lock access to data and not to code. Protecting large blocks of code may seem easier than fine-grained locking of shared data only; however, it decreases performance and makes deadlocks more likely.
  • Access to Invalid Memory: The most common cause of kernel panics. Unlike a user space program, which is simply aborted, the kernel panics if the CPU raises an invalid memory exception. If a debugger is enabled, the kernel will drop into the debugger instead of showing the grey screen of death on Mac OS X or rebooting, which is the behavior under iOS. Buffer overruns will sometimes cause an invalid memory exception; if the buffer happens to be followed by valid memory, however, silent memory corruption may occur instead.
  • Memory and Resource Leaks: Can happen, for example, if a driver unloads without properly disposing of resources such as objects and buffers. A leak can also occur when an extension allocates memory each time it receives a request but fails to free it once the request is finished. The kernel has no garbage collection, so leaks can accumulate over time and eventually cause a kernel panic.
  • Illegal Instruction/Operand: These exceptions are raised by the CPU if it detects an invalid instruction or an invalid argument to an instruction. This can happen as the result of memory corruption, or because a poorly written driver attempts to use features not present on the CPU, for example, using the SSE3 instruction set on machines that do not support it.
  • Blocking in Primary Interrupt Context: Results in a panic, as you cannot block in primary interrupt context. Blocking requires a scheduled thread, because it is implemented by voluntarily putting the thread to sleep: the thread’s state is saved and later restored when the scheduler determines it is time to run that thread again. A primary interrupt handler cannot be put to sleep and resumed in this way; it must run to completion without being interrupted. Many kernel APIs may block under certain circumstances. For example, memory allocation may block if the system is low on memory, while some memory is paged out to disk to satisfy the request. Because of this, functions such as IOMalloc() or even IOLog() cannot be used in primary interrupt context.
  • Volunteered Panics: Happen when the kernel voluntarily decides to crash because it has determined that something is about to go horribly wrong or that an exceptional condition has occurred from which it cannot recover. One example is the failure of a memory allocation that is not allowed to block. Your driver can panic the kernel by calling IOPanic(), which is a wrapper for the panic() function.
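
To make the advice about race conditions and "lock data, not code" more concrete, here is a minimal user space sketch. It uses POSIX threads rather than the kernel's own locking primitives, purely so that it can be compiled and run as an ordinary program, and the names shared_counter, worker, and run_counter_demo are hypothetical and chosen only for illustration. The lock is stored next to the data it protects and is held only for the read-modify-write of that data.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* The lock lives next to the data it protects ("lock data, not code"). */
struct shared_counter {
    pthread_mutex_t lock;
    long            value;
};

struct worker_args {
    struct shared_counter *counter;
    long                   iterations;
};

/* Each worker holds the lock only for the read-modify-write of value.
 * Without the lock, concurrent increments could be lost (a race). */
static void *worker(void *arg)
{
    struct worker_args *args = arg;
    for (long i = 0; i < args->iterations; i++) {
        pthread_mutex_lock(&args->counter->lock);
        args->counter->value++;
        pthread_mutex_unlock(&args->counter->lock);
    }
    return NULL;
}

/* Runs nthreads workers (at most 16) and returns the final counter value. */
long run_counter_demo(int nthreads, long iterations)
{
    struct shared_counter counter;
    struct worker_args args = { &counter, iterations };
    pthread_t threads[16];
    long result;

    assert(nthreads > 0 && nthreads <= 16);
    pthread_mutex_init(&counter.lock, NULL);
    counter.value = 0;

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, &args);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);

    result = counter.value;
    pthread_mutex_destroy(&counter.lock);
    return result;
}
```

Calling run_counter_demo(4, 100000) should return 400000 on every run. If the lock and unlock calls are removed from worker(), the result typically varies from run to run, which is the kind of timing-dependent behavior that makes race conditions hard to reproduce.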

There are of course many other problems that can occur, but most are variants of the preceding typical ones.
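
One common way a deadlock of the kind described earlier arises is when two threads take the same pair of locks in opposite orders. The following user space sketch, again using POSIX threads instead of kernel primitives, shows one possible remedy: always acquiring the locks in a fixed order (here, by address). The account structure and the functions lock_pair, transfer, and run_transfer_demo are hypothetical names introduced for this example only.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

struct account {
    pthread_mutex_t lock;
    long            balance;
};

/* Always take the lock at the lower address first, so that every thread
 * acquires any given pair of locks in the same order. */
static void lock_pair(struct account *a, struct account *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        struct account *tmp = a; a = b; b = tmp;
    }
    pthread_mutex_lock(&a->lock);
    pthread_mutex_lock(&b->lock);
}

static void unlock_pair(struct account *a, struct account *b)
{
    pthread_mutex_unlock(&a->lock);
    pthread_mutex_unlock(&b->lock);
}

/* Moves amount between two accounts; from and to must be different. */
void transfer(struct account *from, struct account *to, long amount)
{
    lock_pair(from, to);
    from->balance -= amount;
    to->balance   += amount;
    unlock_pair(from, to);
}

struct transfer_job {
    struct account *from;
    struct account *to;
    int             rounds;
};

static void *transfer_worker(void *arg)
{
    struct transfer_job *job = arg;
    for (int i = 0; i < job->rounds; i++)
        transfer(job->from, job->to, 1);
    return NULL;
}

/* Two threads transfer in opposite directions at the same time.
 * Returns the combined balance, which must stay at 2000. */
long run_transfer_demo(int rounds)
{
    struct account a, b;
    pthread_t t1, t2;
    int rc;

    pthread_mutex_init(&a.lock, NULL);
    pthread_mutex_init(&b.lock, NULL);
    a.balance = 1000;
    b.balance = 1000;

    struct transfer_job ab = { &a, &b, rounds };
    struct transfer_job ba = { &b, &a, rounds };
    rc = pthread_create(&t1, NULL, transfer_worker, &ab);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, transfer_worker, &ba);
    assert(rc == 0);
    (void)rc;
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    pthread_mutex_destroy(&a.lock);
    pthread_mutex_destroy(&b.lock);
    return a.balance + b.balance;
}
```

If transfer() instead locked from first and to second, the two threads in run_transfer_demo() could each hold one lock while waiting forever for the other. Ordering by address is only one convention; any fixed global order that all code paths respect serves the same purpose.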
