Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Chapter 11. Software Tracing for Event-Driven Systems

There has never been an unexpectedly short debugging period in the history of computers.

—Steven Levy

In any real-life project, getting the code written, compiled, and successfully linked is only the first step. The system still needs to be tested, validated, and tuned for best performance and resource consumption. A single-step debugger is frequently not helpful because it stops the system and exactly hinders seeing live interactions within the application. Clogging up high-performance code with printf() statements is usually too intrusive and simply unworkable in most embedded systems, which typically don't have adequate screens to print to. So the questions are: How can you monitor the behavior of a running real-time system without degrading the system itself? How can you discover and document elusive, intermittent bugs that are caused by subtle interactions among concurrent components? How do you design and execute repeatable unit and integration tests of your system? How do you ensure that a system runs reliably for long periods of time and gets top processor performance?

Techniques based on software tracing can answer many of these questions. Software tracing is a method for obtaining diagnostic information in a live environment without the need to stop the application to get the system feedback. Software tracing always involves some form of a target system instrumentation to log interesting discrete events for subsequent retrieval from the system and analysis.

Due to the inversion of a control, software tracing is particularly effective and powerful in combination with the event-driven programming model. An instrumented event-driven framework can provide much more comprehensive and detailed information than any traditional RTOS.

In this chapter, I describe the software-tracing system called Quantum Spy, which is part of the QP event-driven platform. I begin with a quick introduction to software tracing concepts. Next I walk you through an example of a software tracing session. I then describe the target-resident software-tracing component, called QS, explaining in detail the generation of trace data, the various filters, buffering, transmission protocol, and porting QS. Subsequently, I present the QSPY host application for receiving, displaying, storing, and analyzing the trace data. Finally I explain the steps required to add the QS software tracing component to a QP application.

Software Tracing Concepts

In a nutshell, software tracing is similar to peppering the code with printf() statements for logging and debugging, except that software tracing is much less intrusive and more selective than the primitive printf(). This quick overview introduces the basic concepts and describes some features you can expect from a commercial-grade software-tracing system.

Figure 11.1 shows a typical setup for software tracing. The embedded target system is executing instrumented code, which logs the trace data into a RAM buffer inside the target. From that buffer the trace data is sent over a data link to a host computer that stores, displays, and analyzes the information. This configuration means that a software tracing always requires two components: a target-resident component for collecting and sending the trace data and a host-resident component to receive, decompress, visualize, and analyze the data.

Note
Software-tracing instrumentation logs interesting discrete events that occur in the target system. I will call these discrete events trace records, to avoid confusing them with the application-level events.

Figure 11.1. Typical setup for collecting software trace data.

A good tracing solution is minimally intrusive, which means that it can provide visibility into the running code with minimal impact on the target system behavior. Properly implemented and used, it will let you diagnose a live system without interrupting or significantly altering the behavior of the system under investigation.

Of course, it's always possible that the overhead of software tracing, no matter how small, will have some effect on the target system behavior, which is known as the probe effect (a.k.a. the Heisenberg effect). To help you determine whether that is occurring, you must be able to configure the instrumentation in and out both at compile time as well as at runtime.

To minimize the probe effect, a good trace system performs efficient, selective logging of trace records using as little processing and memory resources of the target as possible. Selective logging means that the tracing system provides user-definable, fine-granularity filters so that the target-resident component only collects events of interest—you can filter as many or as few instrumented events as you need. That way you can make the tracing as noninvasive as necessary.

To minimize the RAM usage, the target-resident trace component typically uses a circular trace buffer that is continuously updated, and new data overwrites the old when the buffer “wraps around” due to limited size or transmission rate to the host. This reflects the typically applied last-is-best policy in collecting the trace data. To focus on certain periods of time, software tracing provides configurable software triggers that can start and stop trace collection before the new data overwrites the old data of interest in the circular buffer.

To further maximize the amount of data collected in the trace buffer, the target-resident component typically applies some form of data compression to squeeze more trace information into the buffer and to minimize the bandwidth required to uplink the data to the host.

However, perhaps the most important characteristic of a flexible software-tracing system is the separation of trace logging (what information is being traced) from the data transmission mechanism (how and when exactly the data is sent to the host). This separation of concerns allows the transmissions to occur in the least time-critical paths of the code, such as the idle loop. Furthermore, clients should be able to employ any data transmission mechanism available on the target, meaning both the physical transport layer (e.g., serial port, SPI, USB, Ethernet, etc.) as well as implementation strategy (polling, interrupt, DMA, etc.). The tracing facility should tolerate and be able to detect any RAM buffer overruns due to bursts of tracing data production rate or insufficient transmission rate to the host.

Finally, the tracing facility must allow consolidating data from all parts of the system, including concurrently executing threads and interrupts. This means that the instrumentation facilities must be reentrant (i.e., both thread-safe and interrupt-safe). Also, to be able to correlate all this data, most tracing systems provide precise timestamping of the trace records.

Quantum Spy Software-Tracing System

As I mentioned in the introduction to this chapter, software tracing is especially effective and powerful in combination with the event-driven active object computing model (see Chapter 6). A running application built of active objects is a highly structured affair where all important system interactions funnel through the real-time framework and the state-machine engine. This offers a unique opportunity to instrument these relatively small parts of the overall code to gain unprecedented insight into the entire system.

Quantum Spy is a software-tracing system that enables live monitoring of event-driven QP applications with minimal target system resources and without stopping or significantly slowing down the code. The Quantum Spy system consists of the target-resident component, called QS, and the application running on a host workstation, called QSPY.

Many operating systems provide software-tracing capabilities. However, Quantum Spy takes software tracing to the entirely new level. Due to inversion of control, an instrumented state machine framework, as opposed to merely an RTOS, is capable of providing incomparably more comprehensive information about the running system, even without adding any instrumentation to the application code. For example, the QS trace data is thorough enough to produce complete sequence diagrams and detailed state machine activity for all state machines in the system. You can selectively monitor all event exchanges, event queues, event pools, and time events because all these elements are controlled by the framework. Additionally, if you use one of the kernels built into QP (the vanilla kernel or the preemptive QK kernel), you can obtain all the data available to a traditional RTOS as well, such as context switches and mutex activity.

Example of a Software-Tracing Session

To show you how software-tracing works in practice, I present an example of a software-tracing session. I use the dining philosophers problem (DPP) test application, which I introduced in Chapter 9. All versions of the DPP application included in the code accompanying this book contain the QS instrumentation. The tracing instrumentation becomes active when you build the “Spy” configuration.

Figure 11.2 shows how to collect the software trace data from the QK/DOS version of the DPP application located in the directory <qp>qpcexamples80x86qk cpp101ldppspydpp-spy.exe. You can rebuild the “Spy” configuration by loading the DPP-SPY.PRJ project into the Turbo C++ IDE. You need to run the DPP-SPY.EXE executable on a target PC with a serial port. You connect the serial port of the target machine to the serial port of a Windows or a Linux host workstation via a NULL-modem cable. On the host workstation, you need to start the QSPY host application that decompresses and visualizes the QS trace data.

Figure 11.2. Collecting software trace data from a 80x86 target.

The Windows executable of the QSPY host application is located in the directory <qp>qpc oolsqspywin32vc2005Release. Assuming that this directory is your current directory or is in your path, you invoke this console application by typing the following command at the command prompt:

qspy –c COM1 –b 115200

The first command-line parameter –c COM1 tells the QSPY host application to receive the trace data from COM1. If your target is connected to a different COM port, you need to adjust the COM number. The second parameter configures the baud rate of the serial port to 115200.

Note
In the particular case of a Windows PC, you can use the same machine as the target and the host at the same time. You need to use a machine with two serial ports, which you connect with a NULL modem cable. You can use one serial port for the DPP target application running in a DOS-window and the other for the QSPY host application.

You might also use a Linux host machine. In case of Linux, you must first build the executable by running the Makefile located in the directory <qp>/qpc/tools/qspy/linux/gnu/. You invoke the Linux executable by typing the following command at the command prompt:

qspy –c /dev/ttyS0 –b 115200

The first parameter –c /dev/ttyS0 tells the QSPY application to receive the trace data from the ttyS0 serial device. If you connected a different serial port to the target, you need to adjust the ttyS number.

As I mentioned before, all DPP applications included in the code accompanying this book are instrumented for software tracing, and I encourage you to try them all. For example, you can collect trace data from the EV-LM3S811 board (see Figure 11.1). The EV-LM3S811 board sends the QS trace data through the UART0 connected to the Virtual COM Port (VCP) provided by the USB debugger, so the QSPY host application can conveniently receive the trace data on the host PC. No additional serial cable is needed.

The Human-Readable Trace Output

The QSPY host application is just a simple console-type program without any fancy user interface. QSPY application displays the trace data in a human-readable textual format. Listing 11.1 shows fragments of such a data log generated from the DOS/QK version of the DPP application.

Listing 11.1. Fragments of the software trace log from the DOS/QK version of the DPP application
qspy host application 3.5.00
Copyright (c) Quantum Leaps, LLC.
Mon Feb 25 12:20:13 2008
-T 4
-O 4
-F 4
-S 1
-E 2
-Q 1
-P 2
-B 2
-C 2
Obj Dic: 16CA18D8->l_smlPoolSto
Obj Dic: 16CA1900->l_tableQueueSto
Obj Dic: 16CA1914->l_philoQueueSto[0]
. . . . . .
EQ.INIT: Obj=l_tableQueueSto Len= 5
0000000000 AO.ADD : Active=16CA1CB8 Prio= 6
Obj Dic: 16CA1CB8->&l_table
Fun Dic: 141E0006->&QHsm_top
Fun Dic: 12DA00C9->&Table_initial
Fun Dic: 12DA020B->&Table_serving
Sig Dic: 00000008,Obj=16CA1CB8 ->HUNGRY_SIG
0000000000 AO.SUB : Active=l_table Sig=DONE_SIG
0000000000 AO.SUB : Active=l_table Sig=TERMINATE_SIG
Q_INIT : Obj=l_table Source=QHsm_top Target=Table_serving
0000000000 ==>Init: Obj=l_table New=Table_serving
0000070346 QF_isrE: IsrNest= 1, CurrPrio=255
TICK : Ctr= 1
0000070367 QF_isrX: IsrNest= 1, CurrPrio=255
0000135566 QF_isrE: IsrNest= 1, CurrPrio=255
TICK : Ctr= 2
0000135581 QF_isrX: IsrNest= 1, CurrPrio=255
. . . . .
0000461783 QF_isrE: IsrNest= 1, CurrPrio=255
TICK : Ctr= 7
TE.ADRM: Obj=l_philo[1].timeEvt Act=l_philo[1]
0000461797 TE.POST: Obj=l_philo[1].timeEvt Sig=TIMEOUT_SIG Act=l_philo[1]
0000461808 AO.FIFO: Obj=l_philo[1] Evt(Sig=TIMEOUT_SIG, Pool=0, Ref= 0) Queue(nUsed= 5, nMax= 5)
0000461824 QF_isrX: IsrNest= 1, CurrPrio=255
0000461836 AO.GETL: Active= l_philo[1] Evt(Sig=TIMEOUT_SIG, Pool=0, Ref= 0)
0000461850 NEW : Evt(Sig=HUNGRY_SIG, size= 3)
0000461862 MP.GET : Obj=l_smlPoolSto nFree= 9 nMin= 9
0000461874 AO.FIFO: Obj=l_table Evt(Sig=HUNGRY_SIG, Pool=1, Ref= 0) Queue(nUsed= 5, nMax= 5)
0000461886 AO.GETL: Active= l_table Evt(Sig=HUNGRY_SIG, Pool=1, Ref= 1)
0000461906 NEW : Evt(Sig=EAT_SIG, size= 3)
0000461917 MP.GET : Obj=l_smlPoolSto nFree= 8 nMin= 8
0000461929 PUBLISH: Evt(Sig=EAT_SIG, Pool=1, Ref= 0)
0000461941 AO.FIFO: Obj=l_philo[4] Evt(Sig=EAT_SIG, Pool=1, Ref= 1) Queue(nUsed= 5, nMax= 5)
0000461953 AO.FIFO: Obj=l_philo[3] Evt(Sig=EAT_SIG, Pool=1, Ref= 2) Queue(nUsed= 5, nMax= 5)
0000461965 AO.FIFO: Obj=l_philo[2] Evt(Sig=EAT_SIG, Pool=1, Ref= 3) Queue(nUsed= 5, nMax= 5)
0000461977 AO.FIFO: Obj=l_philo[1] Evt(Sig=EAT_SIG, Pool=1, Ref= 4) Queue(nUsed= 5, nMax= 5)
0000461987 AO.FIFO: Obj=l_philo[0] Evt(Sig=EAT_SIG, Pool=1, Ref= 5) Queue(nUsed= 5, nMax= 5)
0000462001 GC-ATT : Evt(Sig=EAT_SIG, Pool=1, Ref= 5)
0000462018 Intern : Obj=l_table Sig=HUNGRY_SIG Source=Table_serving
0000462030 GC : Evt(Sig=HUNGRY_SIG, Pool=1, Ref= 1)
0000462042 MP.PUT : Obj=l_smlPoolSto nFree= 9
0000462054 AO.GETL: Active= l_philo[4] Evt(Sig=EAT_SIG, Pool=1, Ref= 5)
0000462065 Intern : Obj=l_philo[4] Sig=EAT_SIG Source=Philo_thinking
0000462077 GC-ATT : Evt(Sig=EAT_SIG, Pool=1, Ref= 4)
0000462089 AO.GETL: Active= l_philo[3] Evt(Sig=EAT_SIG, Pool=1, Ref= 4)
0000462101 Intern : Obj=l_philo[3] Sig=EAT_SIG Source=Philo_thinking
0000462111 GC-ATT : Evt(Sig=EAT_SIG, Pool=1, Ref= 3)
0000462123 AO.GETL: Active= l_philo[2] Evt(Sig=EAT_SIG, Pool=1, Ref= 3)
0000462135 Intern : Obj=l_philo[2] Sig=EAT_SIG Source=Philo_thinking
0000462146 GC-ATT : Evt(Sig=EAT_SIG, Pool=1, Ref= 2)
Q_ENTRY: Obj=l_philo[1] State=Philo_hungry
0000462159 ==>Tran: Obj=l_philo[1] Sig=TIMEOUT_SIG Source=Philo_thinking New=Philo_hungry
0000462171 AO.GETL: Active= l_philo[1] Evt(Sig=EAT_SIG, Pool=1, Ref= 2)
0000462183 QK_muxL: OrgPrio= 2, CurrPrio= 5
0000462195 QK_muxU: OrgPrio= 2, CurrPrio= 5
0000462207 TE.ARM : Obj=l_philo[1].timeEvt Act=l_philo[1] nTicks= 8 Interval= 0
Q_ENTRY: Obj=l_philo[1] State=Philo_eating
0000462219 ==>Tran: Obj=l_philo[1] Sig=EAT_SIG Source=Philo_hungry New=Philo_eating
0000462231 GC-ATT : Evt(Sig=EAT_SIG, Pool=1, Ref= 1)
0000462243 AO.GETL: Active= l_philo[0] Evt(Sig=EAT_SIG, Pool=1, Ref= 1)
0000462255 Intern : Obj=l_philo[0] Sig=EAT_SIG Source=Philo_thinking
0000462265 GC : Evt(Sig=EAT_SIG, Pool=1, Ref= 1)
0000462277 MP.PUT : Obj=l_smlPoolSto nFree= 10
0000527134 QF_isrE: IsrNest= 1, CurrPrio=255
TICK : Ctr= 8
0000527153 QF_isrX: IsrNest= 1, CurrPrio=255
0000592283 QF_isrE: IsrNest= 1, CurrPrio=255
. . . . . .

Note
The QSPY host application supports also exporting data to the powerful MATLAB environment, as described in Section 11.5. MATLAB is a registered trademark of The Mathworks, Inc.

The QS trace log shown in Listing 11.1 contains quite detailed information because most QS records are enabled (not blocked in the QS filters). The following bullet items highlight the most interesting parts of the trace and illustrate how you can interpret the trace data:

• The QS log always contains the QSPY application version number, the date and time of the run, and all the configuration options used by the QSPY host application.
• A log typically starts with the dictionary records that provide a mapping between addresses of various objects in memory and their symbolic names. The dictionary entries don't have timestamps.
• After the dictionaries, you see the active object initialization. For example, the EQ.INIT record indicates event queue initialization with the l_tableQueueSto buffer. After this the AO.ADD trace record you see adding the Table object with priority 6. At this point, the time-tick interrupt is not configured, so all timestamps are 0000000000 (timestamps are always placed in the first eight columns).
• Active object initialization can contain dictionary entries for items that are encapsulated within the active object. For example, initialization of Table inserts an object dictionary entry for l_table object and three function dictionary entries for state handlers QHsm_top, Table_initial and Table_serving. Finally, the topmost initial transition is taken from QHsm_top to Table_serving.
• After the active object initialization, interrupts are enabled, and the first Tick interrupt arrives at the timestamp 0000070346. You can find out the type of the interrupt by the unique priority number. For example, the priority of the Tick interrupt is 0xFF == 255.
• The Tick interrupt occurs seven times. You can determine the ticking rate by comparing the timestamps between interrupt entry of Tick 1 and 2, which is ((0000135566 – 0000070346)/7 = 65220 ~= 0x10000). In the case of the DPP application, the timestamp is provided from counter-0 of the 8254 timer/counter, which is driven from the oscillator running at 1.193182MHz. The same counter-0 of the 8254 also generates time tick interrupts every 0x10000 number of counts (the 18.2Hz DOS tick).
• In the Tick 7 interrupt entered at timestamp 0000461783, you see that a time event posts TIMEOUT_SIG events to the l_philo[1] active objects. This triggers a lot of activity in the application. In fact, over 42 trace records occur before the next Tick 8.

Note
The QSPY human-readable format contains many cryptic names for various trace records. The “QSPY Reference Manual” available in the code accompanying this book (see Section 11.6.6) contains documentation of all predefined QS trace records and their parameters.

QS Target Component

The target-resident component of the Quantum Spy tracing system is called QS. The QS target component consists of the ring buffer, the QS filters, and the instrumentation added to QEP, QF, QK, and the application, as shown in Figure 11.3.

Figure 11.3. Structure of the QS target component.

Software tracing with QS is incomparably less intrusive than the primitive printf() statements because all the data formatting is removed from the target system and is done after the fact in the host. Additionally, the data-logging overhead incurred in the time-critical path of the target code is reduced to just storing the data into the trace buffer but typically does not include the overhead of sending the data out of the target device. In QS, data logging and sending to the host are separated so that the target system can typically perform the transmission outside of the time-critical path—for example, in the idle loop of the target CPU.

A nice byproduct of removing the data formatting from the target is a natural data compression compared to a formatted output. For example, ASCII representation of a single byte takes two hexadecimal digits (and three decimal digits), so avoiding the formatting gives at least a factor of two in data density. On top of this natural compression, QS uses such techniques as data dictionaries, and compressed format information, which in practice result in a compression factor of 4 to 5 compared to the expanded human-readable format.

Obviously, QS cannot completely eliminate the overhead of software tracing. But with the fine-granularity filters available in QS, you can make this impact as small as necessary. For greatest flexibility, QS uses two complementary levels of filters (see Figure 11.3). The first level is filtering based on trace record type, such as entry to a state, or publishing an event. This level works globally for all state machines and event publications in the entire system. The second level of filtering is component-specific. You can set up a filter to trace only a specific state machine object, for example. Combination of such two complementary filtering criteria results in very selective tracing capabilities.

Most QS trace records are timestamped. QS provides an efficient API for obtaining platform-specific timestamp information. Given the right timer-counter resource in your target system, you can provide QS with as precise timestamp information as required. The timestamp size is configurable to 1, 2, or 4 bytes.

One of its greatest QS strengths is the data transmission protocol. The QS protocol is very lightweight but has many the elements of the High Level Data Link Control (HDLC) protocol [HDLC] defined by the International Standards Organization (ISO). The protocol has provisions for detecting transmission errors and allows for instantaneous resynchronization after any error, such as data dropouts due to RAM buffer overruns.

Finally, QS contains a lightweight API for implementing data transmission to the host. The API supports any implementation strategy (polling, interrupt, DMA, etc.) and any physical transport layer (e.g., serial port, SPI, USB, Ethernet, data file, etc.)

QS Source Code Organization

Listing 11.2 shows the platform-independent directories and files comprising the QS software-tracing component in C. The structure of the C++ version is almost identical, except the implementation files have the .cpp extension.

Listing 11.2. Platform-independent QS source code organization
<qp>qpc - QP/C root directory (<qp>qpcpp for QP/C++)
|
+-include/ - QP platform-independent header files
| +-qs.h - QS platform-independent active interface
| +-qs_dummy.h - QS platform-independent inactive interface
|
+-qs/ - QS target component
| +-source/ - QS platform-independent source code (*.C files)
| | +-qs_pkg.h - internal, packet-scope interface for QS implementation
| | +-qs.c - internal ring buffer and formatted output functions
| | +-qs_.c - definition of basic unformatted output functions
| | +-qs_blk.c - definition of block-oriented interface QS_getBlock()
| | +-qs_byte.c - definition of byte-oriented interface QS_getByte()
| | +-qs_f32.c - definition of 32-bit floating point output QS_f32()
| | +-qs_f64.c - definition of 64-bit floating point output QS_f64()
| | +-qs_mem.c - definition of memory-block output
| | +-qs_str.c - definition of zero-terminated string output
|
+-ports - Platform-specific QP ports
| +- . . .
+-examples - Platform-specific QP examples
| +- . . .

The QS source files contain typically just one function or a data structure definition per file. This design aims at deploying QS as a fine-granularity library that you statically link with your applications. Fine granularity means that the QS library consists of several small loosely coupled modules (object files) rather than a single module that contains all functionality.

The QS Platform-Independent Header Files `qs.h` and `qs_dummy.h`

As most software tracing systems for C or C++, QS relies heavily on the C preprocessor for the tracing instrumentation to be enabled or disabled at compile time without changing the instrumented source code.

Note
Most QS facilities are provided in form of preprocessor macros. Depending on the global macro Q_SPY, the QS facilities are either defined to provide actual QS services or are “dummied-out” to prevent any code generation when the global macro Q_SPY is not defined. That way, the QS instrumentation can be left in the code at all times but becomes active only when the code is compiled with the macro Q_SPY defined. You typically define the macro Q_SPY externally through the compiler option (usually –D).

Listing 11.3 shows the platform-independent header file <qp>qpcincludeqs.h, which specifies the active interface to all QS facilities. The platform-independent header file <qp>qpcincludeqs_dummy.h, shown in Listing 11.4, specifies the inactive QS interface. Typically, you never need to explicitly include either of these header files in your application, because they are already included by all instrumented QP components. If the macro Q_SPY is defined, the QP components include the qs.h header file; otherwise they include the qs_dummy.h header file.

Listing 11.3. Active QS interface (fragments of the header file <qp>qpcincludeqs.h)
#ifndef qs_h
#define qs_h
#ifndef Q_SPY
(1) #error "Q_SPY must be defined to include qs.h"
(1) #endif
(2) enum QSpyRecords {
(2) /* QEP records */
(3) QS_QEP_STATE_ENTRY, /**< a state was entered */
(3) QS_QEP_STATE_EXIT, /**< a state was exited */
(3) . . .
(3) /* QF records */
(4) QS_QF_ACTIVE_ADD, /**< an AO has been added to QF (started) */
(4) QS_QF_ACTIVE_REMOVE, /**< an AO has been removed from QF (stopped) */
(4) QS_QF_ACTIVE_SUBSCRIBE, /**< an AO subscribed to an event */
(4) QS_QF_ACTIVE_UNSUBSCRIBE, /**< an AO unsubscribed to an event */
(4) QS_QF_ACTIVE_POST_FIFO, /**< an event was posted (FIFO) directly to AO */
(4) . . .
(4) /* QK records */
(5) QS_QK_MUTEX_LOCK, /**< the QK mutex was locked */
(5) QS_QK_MUTEX_UNLOCK, /**< the QK mutex was unlocked */
(5) QS_QK_SCHEDULE, /**< the QK scheduled a new task to execute */
(5) . . .
(5) /* Miscellaneous QS records */
(6) QS_SIG_DICTIONARY, /**< signal dictionary entry */
(6) QS_OBJ_DICTIONARY, /**< object dictionary entry */
(6) QS_FUN_DICTIONARY, /**< function dictionary entry */
(6) QS_ASSERT, /** assertion failed */
(6) . . .
(6) /* User records */
(7) QS_USER /**< the first record available for user QS records */
(7) };
(7) . . .
(7) /* Macros for adding QS instrumentation to the client code .................*/
(8) #define QS_INIT(arg_) QS_onStartup(arg_)
(9) #define QS_EXIT() QS_onCleanup()
(10) #define QS_FILTER_ON(rec_) QS_filterOn(rec_)
(11) #define QS_FILTER_OFF(rec_) QS_filterOff(rec_)
(12) #define QS_FILTER_SM_OBJ(obj_) (QS_smObj_ = (obj_))
(12) #define QS_FILTER_AO_OBJ(obj_) (QS_aoObj_ = (obj_))
(12) #define QS_FILTER_MP_OBJ(obj_) (QS_mpObj_ = (obj_))
(12) #define QS_FILTER_EQ_OBJ(obj_) (QS_eqObj_ = (obj_))
(12) #define QS_FILTER_TE_OBJ(obj_) (QS_teObj_ = (obj_))
(12) #define QS_FILTER_AP_OBJ(obj_) (QS_apObj_ = (obj_))
(12) /* Macros to generate user QS records (formatted data output) ..............*/
(13) #define QS_BEGIN(rec_, obj_) . . .
(14) #define QS_END() . . .
(15) #define QS_BEGIN_NOLOCK(rec_, obj_) . . .
(16) #define QS_END_NOLOCK() . . .
(16) . . .
(17) #define QS_I8 (w_, d_) QS_u8((uint8_t) (((w_) << 4)) | QS_I8_T, (d_))
(18) #define QS_U8 (w_, d_) QS_u8((uint8_t) (((w_) << 4)) | QS_U8_T, (d_))
(18) #define QS_I16(w_, d_) QS_u16((uint8_t)(((w_) << 4)) | QS_I16_T, (d_))
(18) #define QS_U16(w_, d_) QS_u16((uint8_t)(((w_) << 4)) | QS_U16_T, (d_))
(18) #define QS_I32(w_, d_) QS_u32((uint8_t)(((w_) << 4)) | QS_I32_T, (d_))
(18) #define QS_U32(w_, d_) QS_u32((uint8_t)(((w_) << 4)) | QS_U32_T, (d_))
(19) #define QS_F32(w_, d_) QS_f32((uint8_t)(((w_) << 4)) | QS_F32_T, (d_))
(20) #define QS_F64(w_, d_) QS_f64((uint8_t)(((w_) << 4)) | QS_F64_T, (d_))
(21) #define QS_STR(str_) QS_str(str_)
(22) #define QS_STR_ROM(str_) QS_str_ROM(str_)
(23) #define QS_MEM(mem_, size_) QS_mem((mem_), (size_))
(23) #if (QS_OBJ_PTR_SIZE == 1)
(24) #define QS_OBJ(obj_) QS_u8(QS_OBJ_T, (uint8_t)(obj_))
(24) #elif (QS_OBJ_PTR_SIZE == 2)
(25) #define QS_OBJ(obj_) QS_u16(QS_OBJ_T, (uint16_t)(obj_))
(25) #elif (QS_OBJ_PTR_SIZE == 4)
(26) #define QS_OBJ(obj_) QS_u32(QS_OBJ_T, (uint32_t)(obj_))
(26) #else
(27) #define QS_OBJ(obj_) QS_u32(QS_OBJ_T, (uint32_t)(obj_))
(27) #endif
(27) #if (QS_FUN_PTR_SIZE == 1)
(28) #define QS_FUN(fun_) QS_u8(QS_FUN_T, (uint8_t)(fun_))
(28) #elif (QS_FUN_PTR_SIZE == 2)
(28) . . .
(28) #endif
(28) #if (Q_SIGNAL_SIZE == 1)
(29) #define QS_SIG(sig_, obj_) QS_u8 (QS_SIG_T, (sig_)); QS_OBJ_(obj_)
(29) #elif (Q_SIGNAL_SIZE == 2)
(29) . . .
(29) #endif
(29) /* Dictionary records ...................................................*/
(30) #define QS_OBJ_DICTIONARY(obj_) . . .
(31) #define QS_FUN_DICTIONARY(fun_) . . .
(32) #define QS_SIG_DICTIONARY(sig_, obj_) . . .
(32) . . .
(32) /* Macros used only internally in the QP code ..............................*/
(33) #define QS_BEGIN_(rec_, obj_) . . .
(34) #define QS_END_() . . .
(35) #define QS_BEGIN_NOLOCK_(rec_, obj_) . . .
(36) #define QS_END_NOLOCK_() . . .
(36) /* QS functions for managing the QS trace buffer ...........................*/
(37) void QS_initBuf(uint8_t sto[], uint32_t stoSize);
(38) uint16_t QS_getByte(void); /* byte-oriented interface */
(39) uint8_t const *QS_getBlock(uint16_t *pNbytes); /* block-oriented interface */
(39) /* QS callback functions, typically implemented in the BSP .................*/
(40) uint8_t QS_onStartup(void const *arg);
(41) void QS_onCleanup(void);
(42) void QS_onFlush(void);
(43) QSTimeCtr QS_onGetTime(void);
(43) #endif /* qs_h */

Listing 11.4. Inactive QS interface (fragments of the header file <qp>qpcincludeqs_dummy.h)
#ifndef qs_dummy_h
#define qs_dummy_h
#ifdef Q_SPY
(1) #error "Q_SPY must NOT be defined to include qs_dummy.h"
(1) #endif
(2) #define QS_INIT(arg_) ((uint8_t)1)
(3) #define QS_EXIT() ((void)0)
(3) #define QS_DUMP() ((void)0)
(3) #define QS_FILTER_ON(rec_) ((void)0)
(3) #define QS_FILTER_OFF(rec_) ((void)0)
(3) #define QS_FILTER_SM_OBJ(obj_) ((void)0)
(3) . . .
(4) #define QS_GET_BYTE(pByte_) ((uint16_t)0xFFFF)
(5) #define QS_GET_BLOCK(pSize_) ((uint8_t *)0)
(6) #define QS_BEGIN(rec_, obj_) if (0) {
(7) #define QS_END() }
(7) #define QS_BEGIN_NOLOCK(rec_, obj_) QS_BEGIN(rec_, obj_)
(7) #define QS_END_NOLOCK() QS_END()
(7) #define QS_I8(width_, data_) ((void)0)
(7) #define QS_U8(width_, data_) ((void)0)
(7) . . .
(7) #define QS_SIG(sig_, obj_) ((void)0)
(7) #define QS_OBJ(obj_) ((void)0)
(7) #define QS_FUN(fun_) ((void)0)
(7) #define QS_SIG_DICTIONARY(sig_, obj_) ((void)0)
(7) #define QS_OBJ_DICTIONARY(obj_) ((void)0)
(7) #define QS_FUN_DICTIONARY(fun_) ((void)0)
(7) #define QS_FLUSH() ((void)0)
(7) . . .
(7) #endif /* qs_dummy_h */

(1) A compile-time error is reported if the qs.h header file is included without defining the Q_SPY macro (see also Listing 11.4(1)).
(2) The enumeration QSpyRecords defines all the standard QS record types.
(3-5) Each QP component generates specific QS record types. For example, standard QS records are designed for entering a state (3), adding an active object to the framework (4), or locking a QK mutex (5).
(6) Standard QS records include also miscellaneous records, like the dictionary records (see Section 11.3.8).
(7) The list QS records can be extended by adding user-defined records. The user records must start at the numerical value determined by QS_USER. Currently, QS supports up to 256 records, from which the first 70 are reserved for the standard, predefined records. This leaves 186 records for application-specific records. I discuss application-specific records in Section 11.3.9.
(8,9) As I mentioned before, all QS services are defined as preprocessor macros. That way, you can leave them in the code, even if software tracing is disabled. Here the services for initializing and terminating QS are specified.
(10,11) These two macros implement the global QS filter, which turns tracing of a given QS trace record on or off. I discuss QS filters in Section 11.3.5.
(12) This macro implements the local QS filter. This filter type allows you selectively trace only specified state machine object. I discuss QS filters in Section 11.3.5.
(13,14) These macros open and close an application-specific QS trace record. I discuss the application-specific records in Section 11.3.9.
(15,16) These macros also open and close an application-specific QS trace record, except they don't lock and unlock interrupts. These macros are supposed to be used inside an already established critical section.

Note
The QS trace buffer is obviously a shared resource, which must be protected against corruption. QS uses interrupt locking (critical section) as the mutual exclusion mechanism. The macro QS_BEGIN() locks interrupts and the macro QS_END() unlocks interrupts, so the whole QS record is saved to the QS buffer in one critical section. You should avoid producing big trace records because this could extend interrupt latency.

(17,18) These macros are used to output an unsigned 8-bit integer and a signed 8-bit integer in an application-specific trace record.
(19,20) These two macros output a 32-bit and 64-bit IEEE floating-point numbers, respectively, to an application-specific trace record.
(21) This macro outputs a zero-terminated string to an application-specific trace record.
(22) This macro outputs a zero-terminated string allocated in ROM to an application-specific trace record.

Note
Some Harvard CPU architectures use different instructions to access data in program memory (ROM) than in RAM.

(23) This macro outputs a memory block of specified length to an application-specific trace record. The block size cannot exceed 255 bytes.
(24-27) The macro QS_OBJ() outputs an object pointer to an application-specific trace record. Note how the actual macro definition depends on the object pointer size defined by the macro QS_OBJ_PTR_SIZE. This idiom is used quite often in QS.
(28) The macro QS_FUN() outputs a function pointer to an application-specific trace record. Note how the actual macro definition depends on the function pointer size defined by the macro QS_FUN_PTR_SIZE.
(29) The macro QS_SIG() outputs a signal value to an application-specific trace record. Note how the actual macro definition depends on the signal size defined by the macro Q_SIGNAL_SIZE.

Note
The macro QS_SIG() outputs both the signal value and state machine object pointer. This is done to avoid ambiguities, when numerical signal values are reused in different state machines.

(30-32) These macros output various dictionary trace records to the QS trace buffer. I discuss dictionary trace records in Section 11.3.8.
(33,34) These internal QS macros open and close an internal QS trace record.
(35,36) These internal macros also open and close an internal QS trace record, except they don't lock and unlock interrupts. These macros are supposed to be used inside an already established critical section.
(37) The function QS_initBuf() initializes the QS buffer. The caller must provide the storage for the buffer and its size. The function must be called before QS trace buffer can be used, typically from QS_onStartup().
(38) The function QS_getByte() obtains 1 byte from the QS trace buffer (see Section 11.3.7).
(39) The function QS_getBlock() obtains a contiguous block of data in QS trace buffer (see Section 11.3.7).
(40) The QS_onStartup() callback function initializes the QS tracing output and the trace buffer (see QS_initBuf()). The function returns the status of initialization. It is called from the macro QS_INIT().
(41) The QS_onCleanup() callback function cleans up the QS tracing output. The function is called from the macro QS_EXIT().
(42) The QS_onFlush() callback function flushes the QS trace buffer to the host. The function typically busy-waits until the whole buffer is transmitted to the host. QS_onFlush() is called at the end of each dictionary record (see Section 11.3.8), but you can also call it explicitly via macro QS_FLUSH(). I provide an example of the QS_onFlush() callback implementation in Section 11.6.2.
(43) The QS_onGetTime() callback function returns the timestamp for a QS record. I provide an example of the QS_onGetTime() callback implementation in Section 11.6.3.

(1) A compile-time error is reported if the qs_dummy.h header file is included when the Q_SPY macro is defined (see also Listing 11.3(1)).
(2) The dummy QS initialization always returns 1, meaning successful initialization.
(3) Most other QS dummy macros are defined as ((void)0), which is a valid empty C expression that can be terminated with a semicolon.
(4) The dummy QS macro for obtaining a byte from the trace buffer always returns 0xFFFF, which means that end of data is reached. I discuss QS API for accessing the trace buffer in Section 11.3.7.
(5) The dummy QS macro for obtaining a block of data to output always returns a NULL pointer, which means that there is no data in the buffer. I discuss QS API for accessing the trace buffer in Section 11.3.7.
(6,7) The dummy QS macros for opening and closing a trace record compile as if (0) {...}. Any active code between the braces is eliminated because of the FALSE condition of the if statement.

Note
Some trace records might contain temporary variables and expressions that are only used for the trace output. The “if (0) {” statement establishes a new scope to define such temporary variables.

QS Critical Section

The QS target component must protect the internal integrity of the trace buffer, which is shared among concurrently running tasks and interrupts (see Figure 11.3). To guarantee mutually exclusive access to the trace buffer, QS uses the same mechanism as the rest of the QP platform, that is, QS locks interrupts on entry to the critical section of code and unlocks interrupts again on exit.

When QS detects that the QF critical section macros QF_INT_LOCK()/QF_INT_UNLOCK() are defined, QS uses the provided definitions for its own critical section. However, when you use QS without the QF real-time framework, you need to define the platform-specific interrupt locking/unlocking policy of QS in the qs_port.h header file, as shown in Listing 11.5.

Listing 11.5. QS macros for interrupt locking and unlocking (file qs_port.h)
#define QS_INT_KEY_TYPE . . .
#define QS_INT_LOCK(key_) . . .
#define QS_INT_UNLOCK(key_) . . .

Note
QS can be used with just the QEP component or even completely standalone, without any other QP components. In these cases, QS must provide its own, independent critical section mechanism.

The QS macros are exactly analogous to the QF macros QF_INT_KEY_TYPE, QF_INT_LOCK(), and QF_INT_UNLOCK(), respectively. Refer to Section 7.3 in Chapter 7 for more details.

General Structure of QS Records

Like all software-tracing systems, QS logs the tracing data in discrete chunks called QS trace records. These trace records have the general structure shown in Listing 11.6.

Listing 11.6. General structure of a QS record
QS_BEGIN_xxx(record_type) /* trace record begin */
QS_yyy(data); /* QS data element */
QS_zzz(data); /* QS data element */
. . . /* QS data element */
QS_END_xxx() /* trace record end */

Each trace record always begins with one variant of the macro QS_BEGIN_xxx() and ends with the matching macro QS_END_xxx().

Note
The macros QS_BEGIN_xxx() and QS_END_xxx() are not terminated with the semicolon.

Sandwiched between these two macros are the data-generating macros that actually insert individual data elements into the QS trace buffer. QS provides four variants of the begin/end macro pairs for different purposes (Listing 11.3(13-16 and 33-36)).

The first two variants (Listing 11.3(13-16)) are for creating application-specific QS records (see Section 11.3.9). The QS_BEGIN()/QS_END() pair locks interrupts at the beginning of the record and unlocks at the end. The pair QS_BEGIN_NOLOCK()/QS_END_NOLOCK() is for application-specific records without entering the QS critical and should be used only within an already established critical section.

The third and fourth variants of the begin-record/end-record QS macros (Listing 11.3(33-36)) are for internal use within QP components to generate the predefined QS records. Such predefined records are generated with QS_BEGIN_()/QS_END_() or QS_BEGIN_NOLOCK_()/QS_END_NOLOCK_() macro pairs, depending whether a critical section must be entered or not.

QS Filters

One of the main roles of the begin-record macros QS_BEGIN_xxx() is to implement the filtering of QS records before they reach the trace buffer. QS provides two complementary levels of filtering: the global on/off filter and local filters (see Figure 11.3).

Global On/Off Filter

The global on/off filter is based on the record IDs. The qs.h header file provides the enumeration of all the predefined internal trace records IDs that are already instrumented into the QP components (Listing 11.3(2)). The enumeration of the predefined records ends with the QS_USER value, which is the first numerical value available for application-specific trace records. I discuss the application-specific trace records in Section 11.3.9.

The global on/off filter is efficiently implemented by means of an array of bitmasks QS_glbFilter_[], where each bit represents one trace record. Currently the QS_glbFilter_[] array contains 32 bytes for a total of 32*8 bits for 256 different trace records. A little more than a quarter of these records are already taken by the predefined QP trace records. The remaining three quarters are available for application-specific use.

The macro QS_BEGIN() for opening a trace record shows how the global on/off filter is implemented:

#define QS_BEGIN(rec_, obj_)
if (((QS_glbFilter_[(uint8_t)(rec_) >> 3U]
& (1U << ((uint8_t)(rec_) & 7U))) != 0) . . .

The global on/off filter works by checking the state of the bit corresponding to the given trace record argument “rec_.” This check is accomplished by the familiar expression (bimask & bit) != 0, where the bitmask is QS_glbFilter_[(uint8_t)(rec_) >> 3U] and the bit is (1U << ((uint8_t)(rec_) & 7U)). Note that for any constant value of the argument rec_, both the bitmask and the bit are compile-time constants. For example, a global filter check for a record ID of 46, say, costs as much as the expression (QS_glbFilter_[5] & 0x40) != 0).

Note
The global filter is specifically implemented to use byte-size computations only to be efficient even on 8-bit machines.

QS provides a simple interface for setting and clearing individual bits in the QS_glbFilter_[] bitmask array. The macro QS_FILTER_ON(rec_) turns on the bit corresponding to record “rec_.” Conversely, the macro QS_FILTER_OFF(rec_) turns off the bit corresponding to record “rec_.” In both cases, the special constant QS_ALL_RECORDS affects all records. Specifically, QS_FILTER_ON(QS_ALL_RECORDS) turns on all records, and QS_FILTER_OFF(QS_ALL_RECORDS) turns off all records. Examples of these macros are provided in Listing 11.16.

Just after QS initialization, the global on/off filter is set to OFF for all records types. You need to explicitly turn the filter ON for some records to enable the tracing.

Note
Globally disabling all records through QS_FILTER_OFF(QS_ALL_RECORDS) is a useful way of implementing a software-tracing trigger. You can use this trigger to rapidly stop the tracing after an interesting event, to prevent new trace data from overwriting interesting data in case the data uplink to the host cannot keep up with the production rate of new trace data.

Local Filters

The local filters allow generation of trace records only for specified objects. For example, you might set up a local filter to log only activities of a given state machine object. Independently, you might set up another local filter to log only activities of a given memory pool.

The Table 11.1

Table 11.1. Local filter summary

Local Filter	Object Type	Example	Applies to QS Records
QS_FILTER_SM_OBJ()	State machine	QS_FILTER_SM_OBJ(&l_qhsmTst);	QS_QEP_STATE_EMPTY,QS_QEP_STATE_ENTRY,QS_QEP_STATE_EXIT,QS_QEP_STATE_INIT,QS_QEP_INIT_TRAN,QS_QEP_INTERN_TRAN,QS_QEP_TRAN,QS_QEP_IGNORED
QS_FILTER_AO_OBJ()	Active object	QS_FILTER_AO_OBJ(&l_philo[3]);	QS_QF_ACTIVE_ADD,QS_QF_ACTIVE_REMOVE,QS_QF_ACTIVE_SUBSCRIBE,QS_QF_ACTIVE_UNSUBSCRIBE,QS_QF_ACTIVE_POST_FIFO,QS_QF_ACTIVE_POST_LIFO,QS_QF_ACTIVE_GET,QS_QF_ACTIVE_GET_LAST
QS_FILTER_MP_OBJ()^[1]	Memory pool	QS_FILTER_MP_OBJ(l_regPoolSto);	QS_QF_MPOOL_INIT,QS_QF_MPOOL_GET QS_QF_MPOOL_PUT,
QS_FILTER_EQ_OBJ()^[2]	Event queue	QS_FILTER_EQ_OBJ(l_philQueueSto[3]);	QS_QF_EQUEUE_INIT,QS_QF_EQUEUE_POST_FIFO,QS_QF_EQUEUE_POST_LIFO,QS_QF_EQUEUE_GET,QS_QF_EQUEUE_GET_LAST
QS_FILTER_TE_OBJ()	Time event	QS_FILTER_TE_OBJ(&l_philo[3].timeEvt);	QS_QF_TICK,QS_QF_TIMEEVT_ARM,QS_QF_TIMEEVT_AUTO_DISARM,QS_QF_TIMEEVT_DISARM_ATTEMPT,QS_QF_TIMEEVT_DISARM,QS_QF_TIMEEVT_REARM,QS_QF_TIMEEVT_POST,QS_QF_TIMEEVT_PUBLISH
QS_FILTER_AP_OBJ()	Generic application object	QS_FILTER_AP_OBJ(&myAppObject);	Application-specific records starting with `QS_USER`

The first column of Table 11.1 enlists the QS macros you need to use to set/clear the local filters. For example, you specify the state machine local filter by invoking:

QS_FILTER_SM_OBJ(aStateMachinePointer);

where aStateMachinePointer is the pointer to the state machine object you want to trace.

You deactivate any local filter by passing the NULL pointer to the appropriate QS macro. For example, to open up the local filter for all state machine objects, you write the following code:

QS_FILTER_SM_OBJ(0);

Just after QS initialization, all local filters is set to NULL, meaning that the local filters are open for all objects.

The highlighted code in the QS_BEGIN() macro definition shows the actual implementation of the local filter for the application-specific objects:

#define QS_BEGIN(rec_, obj_)
if (((QS_glbFilter_[(uint8_t)(rec_) >> 3U]
& (1U << ((uint8_t)(rec_) & 7U))) != 0)
&& ((QS_apObj_ == (void *)0) || (QS_apObj_ == (obj_))))
{
. . .

The QS local filters are closely related to the object dictionary records (see Section 11.3.8) and both facilities consistently use the same conventions.

QS Data Protocol

The data transmission protocol used in QS to transmit trace data from the target to the host is one of its greatest strengths. The protocol is very lightweight but has many elements of the HDLC protocol defined by the ISO.

The QS protocol has been specifically designed to simplify the data management overhead in the target yet allow detection of any data dropouts due to the trace buffer overruns. The protocol has not only provisions for detecting gaps in the data and other errors but allows for instantaneous resynchronization after any error, to minimize data loss.

The QS protocol transmits each trace record in an HDLC-like frame. The upper part of Figure 11.4 shows the serial data stream transmitted from the target containing frames of different lengths. The bottom part of Figure 11.4 shows the details of a single frame:

1 Each frame starts with the Frame Sequence Number byte. The target QS component increments the Frame Sequence Number for every frame inserted into the circular buffer. The Sequence Number naturally rolls over from 255 to 0. The Frame Sequence Number allows the QSPY host component to detect any data discontinuities.
2 Following the Fame Sequence Number is the Record ID byte, which is one of the predefined QS records (see Listing 11.3(2)) or an application-specific record (see Section 11.3.9).
3 Following the Record ID is zero or more data bytes.
4 Following the data is the Checksum. The Checksum is computed over the frame Sequence Number, the Record ID, and all the data bytes. The next section gives the detailed checksum computation formula.
5 Following the Checksum is the HDLC Flag, which delimits the frame. The HDLC flag is the 01111110 binary string (0x7E hexadecimal). Note that the QS protocol uses only one HDLC Flag at the end of each frame and no HDLC Flag at the beginning of a frame. In other words, only one Flag is inserted between frames.

Figure 11.4. QS transmission protocol.

The QS target component performs the HDLC-like framing described above at the time the bytes are inserted into the circular trace buffer. This means that the data in the buffer is already cleanly divided into frames and can be transmitted in any chunks, typically not aligned with the frame boundaries.

Transparency

One of the most important characteristics of HDLC-type protocols is establishing very easily identifiable frames in the serial data stream. Any receiver of such a protocol can instantaneously synchronize to the frame boundary by simply finding the Flag byte. This is because the special Flag byte can never occur within the content of a frame. To avoid confusing unintentional Flag bytes that can naturally occur in the data stream with an intentionally sent Flag, HDLC uses a technique known as transparency (a.k.a. byte stuffing or escaping) to make the Flag bytes transparent during the transmission. Whenever the transmitter encounters a Flag byte in the data, it inserts a 2-byte escape sequence to the output stream. The first byte is the Escape byte, defined as binary 01111101 (hexadecimal 0x7D). The second byte is the original byte XOR-ed with 0x20.

Of course, now the Escape byte itself must also be transparent to avoid interpreting an unintentional Escape byte as the 2-byte escape sequence. The procedure of escaping the Escape byte is identical to that of escaping the Flag byte.

The transparency of the Flag and Escape bytes complicates slightly the computation of the Checksum. The transmitter computes the Checksum over the Fame Sequence Number, the Record ID, and all data bytes before performing any byte stuffing. The receiver must apply the exact reversed procedure of performing the byte unstuffing before computing the Checksum.

An example might make this clearer. Suppose that the following trace record needs to be inserted to the trace buffer (the transparent bytes are shown in bold):

Record ID = 0x7D, Record Data = 0x7D 0x08 0x01

Assuming that the current Frame Sequence Number is, say, 0x7E, the Checksum will be computed over the following bytes:

Checksum == (uint8_t)(~(0x7E + 0x7D + 0x7D + 0x08 + 0x01)) == 0x7E

and the actual frame inserted into the QS trace buffer will be as follows:

0x7D 0x5E 0x7D 0x5D 0x7D 0x5D 0x08 0x01 0x7D 0x5E 0x7E

Obviously, this is a degenerated example, where the Frame Sequence Number, the Record ID, a data byte, and the Checksum itself turned out to be the transparent bytes. Typical overhead of transparency with real trace data is one escape sequence per several trace records.

Endianness

In addition to the HDLC-like framing, the QS transmission protocol specifies the endianness of the data to be little endian. All multibyte data elements, such as 16- and 32-bit integers, pointers, and floating-point numbers, are inserted into the QS trace buffer in the little-endian byte order (least significant byte first). The QS data inserting macros (see Listing 11.3(17-23)) place the data in the trace buffer in a platform-neutral manner, meaning that the data is inserted into the buffer in the little-endian order regardless of the endianness of the CPU. Also, the data-inserting macros copy the data to the buffer 1 byte at a time, thus avoiding any potential data misalignment problems. Many embedded CPUs, such as ARM, require certain alignment of 16- and 32-bit quantities.

QS Trace Buffer

As described in the previous section, the QS target component performs the HDLC-like framing at the time the bytes are inserted into the QS trace buffer. This means that only complete frames are placed in the buffer, which is the pivotal point in the design of the QS target component and has two important consequences.

First, the use of HDLC-formatted data in the trace buffer allows decoupling the data insertion into the trace buffer from the data removal out of the trace buffer. You can simply remove the data in whichever chunks you like, without any consideration for frame boundaries. You can employ just about any repetition physical data link available on the target for transferring the trace data from the target to the host.

Second, the use of the formatted data in the buffer enables the “last is best” tracing policy. The QS transmission protocol maintains both the Frame Sequence Number and the Checksum over each trace record, which means that any data corruption caused by overrunning the old data with the new data can be always reliably detected. Therefore, the new trace data is simply inserted into the circular trace buffer, regardless of whether it perhaps overwrites the old data that hasn't been sent out yet or is in the process of being sent. The burden of detecting any data corruption is placed on the QSPY host component. When you start missing the frames (which the host component easily detects by discontinuities in the Frame Sequence Number), you have several options. Your can apply some additional filtering, increase the size of the buffer, or improve the data transfer throughput.

Initializing the QS Trace Buffer `QS_initBuf()`

Before you can start producing trace data, you must initialize the QS trace buffer by calling the QS_initBuf() function. Typically, you invoke this function from the QS_onStartup() callback, which you typically define in your application. Listing 11.7 shows an example of initializing the QS trace buffer.

Listing 11.7. Initializing QS trace buffer with QS_initBuf(
(1) #ifdef Q_SPY /* define QS callbacks */
(2) uint8_t QS_onStartup(void const *arg) {
(3) static uint8_t qsBuf[2*1024]; /* buffer for Quantum Spy */
(4) QS_initBuf(qsBuf, sizeof(qsBuf));
(4) Initialize the QS data link
(4)
(4) return success; /* return 1 for success and 0 for failure */
(4) }
(4) #endif /* Q_SPY */

(1) The QS callback functions (such as QS_onStartup()) are defined only when QS tracing is enabled.
(2) At a minimum, the QS_onStartup() callback function must initialize the QS trace buffer.
(3) You need to statically allocate the storage for the QS trace buffer. The size of the buffer depends on the nature of your application and the data link to the host. Obviously, a bigger buffer is needed if you want to trace events occurring at a high rate and producing a higher volume of trace data. On the other hand, using a higher-bandwidth data link to the host allows you to reduce the size of the trace buffer.
(4) The QS_initBuf() function initializes internal QS variables to use the provided trace buffer.

Note
QS can work with a trace buffer of any size, but smaller buffers will lose data if the buffer “wraps around.” You will always know when any data loss occurs, however, because the QS data protocol maintains a sequence number in every trace record (see Section 11.3.6). When the QSPY host application detects a discontinuity in the sequence numbers, it produces the following message:
*** Incorrect record past seq=xxx
*** Dropped yy records

You have several options to avid losing trace records due to data overruns. You can increase the size of the trace buffer (Listing 11.7(3)) or apply more filtering to reduce the amount of trace data produced (see Section 11.3.5). You can also employ a faster data link to the host. Finally, sometimes you can improve the data throughput by changing the policy of sending trace data to the host. For example, using only the idle processing might not utilize the full available bandwidth of the data link if the idle processing executes too infrequently.

Byte-Oriented Interface: `QS_getByte()`

The lack of any constraints on removing the data from the trace buffer means that you can remove 1 byte at a time at arbitrary time instances. QS provides the function QS_getByte() for such byte-oriented interfaces. The signature of QS_getByte() is shown in Listing 11.3(38). Listing 11.8 shows an example of how to use this function.

Listing 11.8. Using QS_getByte() to output data to a 16550-compatible UART
(1) void QF_onIdle(void) { /* called with interrupts LOCKED */
(2) QF_INT_UNLOCK(dummy); /* always unlock interrupts */
(3) #ifdef Q_SPY
(4) if ((inportb(l_uart_base + 5) & (1 << 5)) != 0) { /* THR Empty? */
(5) uint8_t fifo = UART_16550_TXFIFO_DEPTH;/*depth of the 16550 Tx FIFO */
(6) uint16_t b;
(7) QF_INT_LOCK(dummy);
(8) while ((fifo != 0)
(9) && ((b = QS_getByte()) != QS_EOD)) /* get the next byte */
(9) {
(10) QF_INT_UNLOCK(dummy);
(11) outportb(l_base + 0, (uint8_t)b); /* insert byte into TX FIFO */
(12) --fifo;
(13) QF_INT_LOCK(dummy);
(13) }
(14) QF_INT_UNLOCK(dummy);
(14) }
(14) #endif
(14) }

The QS_getByte() function returns the byte in the least significant 8 bits of the 16-bit return value if the byte is available. If the trace buffer has no more data, the function returns QS_EOD (end-of-data), which is defined in qs.h as ((uint16_t)0xFFFF).

Note
The function QS_getByte() does not lock interrupts internally and is not reentrant. You should always design your software such that QS_getByte() is called with interrupts locked. In addition, an application should consistently use either the byte-oriented interface QS_getByte() or the block-oriented interface QS_getBlock() (see the next subsection), but never both at the same time.

(1) Idle processing is ideal for implementing trace data output. In this example, I use the QF_onIdle() idle callback of the cooperative vanilla kernel (see Section 8.2.4 in Chapter 8).
(2) As explained at the end of Section 7.11.1 in Chapter 7, the QF_onIdle() idle callback is invoked with interrupts locked and it always must unlock interrupts.
(3) The QS trace buffer output is performed only when QS is active, that is, when the macro Q_SPY is defined.
(4) The Transmitter Holding Register Empty bit of the 16550 UART is checked.
(5) The 16550 UART can accept up to the TX FIFO depth bytes (typically 16).
(6) The temporary variable ‘b’ will hold the return value from QS_getByte(). Note that it is 2 bytes wide.
(7) Interrupts are locked before calling QS_getByte().
(8) The loop continues until there is room in the TX FIFO.
(9) The QS_getByte() function is called to obtain the next trace byte to transmit. The return value of QS_EOD indicates end of data.
(10) Interrupts can be unlocked.
(11) The trace byte is written to the Transmitter Holding Register.
(12) One less byte is available in the TX FIFO.
(13) Interrupts are locked to make another call to QS_getByte().
(14) Interrupts are unlocked before returning to the caller.

Block-Oriented Interface: `QS_getBlock()`

QS also provides an alternative block-oriented interface for obtaining a contiguous block of data at a time. QS provides the function QS_getBlock() for such block-oriented interface. The signature of QS_getBlock() is shown in Listing 11.3(39). Such a block-oriented interface is very useful for DMA-type transfers. Listing 11.9 shows an example of how to use this function.

Listing 11.9. Using QS_getBlock() to output data to a 16550-compatible UART
(1) void QF_onIdle(void) { /* called with interrupts LOCKED */
(2) QF_INT_UNLOCK(dummy); /* always unlock interrupts */
(3) #ifdef Q_SPY
(4) if ((inportb(l_uart_base + 5) & (1 << 5)) != 0) { /* THR Empty? */
(5) uint16_t fifo = UART_16550_TXFIFO_DEPTH; /* 16550 Tx FIFO depth */
(6) uint8_t const *block;
(7) QF_INT_LOCK(dummy);
(8) block = QS_getBlock(&fifo); /* try to get next block to transmit */
(9) QF_INT_UNLOCK(dummy);
(10) while (fifo-- -- != 0) { /* any bytes in the block? */
(11) outportb(l_uart_base + 0, *block++);
(11) }
(11) }
(11) #endif
(11) }

If any bytes are available at the time of the call, the function returns the pointer to the beginning of the data block within the QS trace buffer and writes the number of contiguous bytes in the block to the location pointed to by pNbytes. The value of *pNbytes is also used as input to limit the maximum size of the data block that the caller can accept. Note that the bytes are not copied from the trace buffer.

If no bytes are available in the QS buffer when the function is called, the function returns a NULL pointer and sets the value pointed to by pNbytes to zero.

You should not assume that the QS trace buffer becomes empty after QS_getBlock() returns a data block with fewer bytes than the initial value of *pNbytes. Sometimes the data block falls close to the end of the trace buffer and you need to call QS_getBlock() again to obtain the rest of the data that “wrapped around” to the beginning of the QS data buffer. After the QS_getBlock() returns a memory block to the caller, the caller must transfer all the bytes in the returned block before calling QS_getBlock() again.

Note
The function QS_getBlock() does not lock interrupts internally and is not reentrant. You should always design your software such that QS_getBlock() is called with interrupts locked.

(1) Idle processing is ideal for implementing trace data output. In this example, I use the QF_onIdle() idle callback of the cooperative vanilla kernel (see Section 8.2.4 in Chapter 8).
(2) As explained at the end of Section 7.11.1 in Chapter 7, the QF_onIdle() idle callback is invoked with interrupts locked and it always must unlock interrupts.
(3) The QS trace buffer output is performed only when QS is active, that is, when the macro Q_SPY is defined.
(4) The Transmitter Holding Register Empty bit of the 16550 UART is checked.
(5) The 16550 UART can accept up to the TX FIFO depth bytes (typically 16).
(6) The temporary pointer ‘block’ will hold the return value from QS_getBlock().
(7) Interrupts are locked before calling QS_getBlock().
(8) The QS_getBlock() function is called to obtain the contiguous block of trace data to transmit.
(9) Interrupts can be unlocked.
(10) The loop continues while there is room in the TX FIFO.
(11) The trace byte is written to the Transmitter Holding Register.

Dictionary Trace Records

By the time you compile and load your application image to the target, the symbolic information about the object names, function names, and signal names is stripped from the code. Therefore, if you want to have the symbolic information available to the QSPY host-resident component, you need to supply it somehow to the software-tracing system.

QS provides special trace records designed expressly for including the symbolic information about the target code in the trace itself. The dictionary records included in the trace for the QSPY host application are very much like the symbolic information embedded in the object files for the traditional single-step debugger.

The dictionary trace records are not absolutely required to generate the trace in the same way as the symbolic information in the object files is not absolutely required to debug code. However, in both cases, the availability of the symbolic information greatly improves productivity in working with the software trace or the debugger.

QS supports three types of dictionary trace records: object dictionary, function dictionary, and signal dictionary. The following subsections cover these types in detail.

Note
As all QS trace records, the dictionary trace records are generated in a critical section of code, that is, interrupts are locked for the time the data is inserted into the QS trace buffer. Additionally, after unlocking interrupts, the callback function OS_onFlush() is invoked at the end of each dictionary record. This callback function typically busy-waits until all data are sent out to the host, which might take considerable time. For that reason, dictionary entries should be generated only during the system initialization, when the real-time constraints do not yet apply.

Object Dictionaries

Object dictionaries are generated with the macro QS_OBJ_DICTIONARY() that associates the address of the object in memory with its symbolic name. Listings 11.19 and 11.20 provide some examples of how you use this macro. The QS_OBJ_DICTIONARY() macro takes only one argument, the address of the object, and uses internally the “stringizing” preprocessor operator to convert the provided argument to a C string. Therefore, you should invoke the QS_OBJ_DICTIONARY() macro with meaningfully named persistent objects, such as &l_table, or &l_philo[3], and not generic pointers, such as “me” (or “this” in C++), because the latter will not help you much in recognizing the object name in the trace.

Table 11.2

Table 11.2. Object dictionaries required for the predefined QS records

Object Type	Example(s)	QS Records
State machine	QS_OBJ_DICTIONARY(&l_table); See Listing 11.19(3)	QS_QEP_STATE_EMPTY,QS_QEP_STATE_ENTRY,QS_QEP_STATE_EXIT,QS_QEP_STATE_INIT,QS_QEP_INIT_TRAN,QS_QEP_INTERN_TRAN,QS_QEP_TRAN,QS_QEP_IGNORED
Active object	QS_OBJ_DICTIONARY(&l_philo[0]); See Listing 11.20(4)	QS_QF_ACTIVE_ADD, QS_QF_ACTIVE_REMOVE, QS_QF_ACTIVE_SUBSCRIBE, QS_QF_ACTIVE_UNSUBSCRIBE, QS_QF_ACTIVE_POST_FIFO, QS_QF_ACTIVE_POST_LIFO, QS_QF_ACTIVE_GET, QS_QF_ACTIVE_GET_LAST
Memory pool^[1]	QS_OBJ_DICTIONARY(l_smlPoolSto); See Listing 11.16(7)	QS_QF_MPOOL_INIT,QS_QF_MPOOL_GET QS_QF_MPOOL_PUT,
Event queue^[2]	QS_OBJ_DICTIONARY(l_philQueueSto[0]); See Listing 11.16(9)	QS_QF_EQUEUE_INIT,QS_QF_EQUEUE_POST_FIFO,QS_QF_EQUEUE_POST_LIFO,QS_QF_EQUEUE_GET,QS_QF_EQUEUE_GET_LAST
Time event	QS_OBJ_DICTIONARY(&l_philo[0].timeEvt); See Listing 11.20(5)	QS_QF_TICK,QS_QF_TIMEEVT_ARM,QS_QF_TIMEEVT_AUTO_DISARM,QS_QF_TIMEEVT_DISARM_ATTEMPT,QS_QF_TIMEEVT_DISARM,QS_QF_TIMEEVT_REARM,QS_QF_TIMEEVT_POST,QS_QF_TIMEEVT_PUBLISH

The object dictionary records are closely related to the QS local filters (see Section 11.3.5). Both facilities consistently use the same conventions. For example, a local filter for a specific memory pool is selected by means of the QS_FILTER_MP_OBJ() macro, which accepts a pointer to the memory buffer managed by the memory pool. Similarly, a local filter for a specific event queue is selected by means of the QS_FILTER_EQ_OBJ() macro, which accepts a pointer to the ring buffer managed by the event queue.

Function Dictionaries

Function dictionaries are generated with the macro QS_FUN_DICTIONARY(), which associates the address of the function in memory with its symbolic name. Listing 11.19(4-6) provides examples of how you use this macro. The main purpose of the function dictionaries is to provide symbolic names for state-handler functions.

The QS_FUN_DICTIONARY() macro takes only one argument: the address of the function, and uses internally the “stringization” preprocessor operator to convert the provided argument to a C string.

Signal Dictionaries

Signal dictionaries are generated with the macro QS_SIG_DICTIONARY() that associates the numerical value of the event signal and the state machine object to the symbolic name of the signal.

The reason for using both the signal value and the state machine object rather than just the signal value is that a signal value alone is not sufficient to uniquely identify the symbolic signal. Only the globally published signals are required to be systemwide unique. Other signals, used only locally, can have completely different meanings for different state machines in the system.

The QS_SIG_DICTIONARY() macro takes two arguments: the numerical value of the signal and the address of the state machine object. The macro uses internally the “stringization” preprocessor operator to convert the provided signal argument to a string. The state machine object is not converted to a string, so the actual variable name you use is irrelevant.

Listing 11.19(7-10) provides examples of how you use the QS_SIG_DICTIONARY() macro. Listing 11.19(7-9) shows how to specify globally published signals that are associated with multiple state machines. In this case, you specify NULL as the state machine object. In contrast, Listing 11.19(10) shows a dictionary entry for the local signal TIMEOUT_SIG. This signal is associated only with the Philosopher state machines.

Application-Specific QS Trace Records

The application-specific QS records allow you to generate tracing information from the application-level code. You can think of the application-specific records as an equivalent to printf() but with much less overhead. Listing 11.10 shows an example of an application-specific QS record.

Listing 11.10. Example of an application-specific trace record
QS_BEGIN(MY_QS_RECORD, myObjectPointer) /* trace record begin */
QS_STR("Hello"); /* string data element */
QS_U8(3, n); /* uint8_t data, 3-decimal digits format */
. . . /* QS data */
QS_MEM(buf, sizeof(buf)); /* memory block of a given size */
QS_END() /* trace record end */

In most cases, the application-specific records are enclosed with the QS_BEGIN()/QS_END() pair of macros. This pair of macros locks interrupts at the beginning and unlocks at the end of each record (see Section 11.3.3). Occasionally you would want to generate trace data from within already established critical sections or ISRs. In such rare occasions, you would use the macros QS_BEGIN_NOLOCK()/QS_END_NOLOCK() to avoid nesting of critical sections.

The record-begin macro QS_BEGIN() takes two arguments. The first argument (e.g., MY_QS_RECORD) is the enumerated record type, which is used in the global on/off filter (Section 11.3.5) and is part of the each record header. The application-specific record types must start with the value QS_USER to avoid overlap with the predefined QS records already instrumented into the QP components.

The second argument (e.g., myObjectPointer) is used for the local filter, which allows you to selectively log only specific application-level objects. Listing 11.21 shows an example of an application-specific trace record, including the use of the second parameter of the QS_BEGIN() macro.

Note
If you don't want to use the local filter for a given application-specific trace record, you can use NULL as the second argument to the macros QS_BEGIN() or QS_BEGIN_NOLOCK(). That way, the trace record will always be produced, regardless of the setting of the application-specific local filter.

Sandwiched between the QS_BEGIN()/QS_END() macros are data elements that you want to store in the trace record. The macros for generating the data elements are shown in Listing 11.3(17-23). The supported data elements include signed and unsigned integers of 8-bit, 16-bit, and 32-bit size; floating-point numbers of 32-bit and 64-bit size; zero-terminated strings; and variable-size memory blocks. Special macros are also provided for inserting platform-dependent elements, such as event signals, object pointers, and function pointers. For these configurable or platform-specific data elements, QS logs only the minimal number of bytes required on the given platform.

The biggest challenge in supporting arbitrary trace records is that the host-resident component (the QSPY application) doesn't “know” the structure of such records, so the data type information must be stored with the data itself. Figure 11.5 shows the encoding of the application-specific trace record from Listing 11.10. The application-specific trace record, like all QS records, starts with the Sequence Number and the Record ID (see Section 11.3.6). Every application-specific trace record also contains the timestamp immediately following the Record ID. The number of bytes used by the timestamp is configurable by the macro QS_TIME_SIZE. After the timestamp, you see the data elements, such as the “Hello” string, an unsigned byte ‘n,’ some other data, and finally a memory block. Each of these data elements starts with a format byte, which actually contains both the data-type information (in the lower nibble) and the format width for displaying that element (in the upper nibble). For example, the data element QS_U8(3, n) will cause the value ‘n’ to be encoded as uint8_t with the format width of 3 decimal digits. The maximum allowed format width is 15 decimal digits.

Figure 11.5. Encoding of the application-specific trace record from Listing 11.10 (escaping bytes are omitted for clarity).

As shown in Listing 11.10, you can place many data elements of any kind in any order inside an application-specific record. The only limitation is that a complete record must fit in the QS trace buffer. Of course, you should avoid big trace records anyway, to keep the critical sections short (QS records are always placed in the buffer in a critical section of code). Furthermore, you might want to conserve the buffer space.

Porting and Configuring QS

When you use QS in your application, you are responsible for adapting QS to the CPU, compiler, and kernel/RTOS of your choice. Such adaptation is called a port. The code accompanying this book contains the QS ports and application examples for 80x86 (vanilla, QK, µC/OS-II, and Linux) as well as for Cortex-M3 (vanilla and QK kernels).

The source code for a QS port is organized in the same way as any other port of a QP component, as described in Chapter 8. The QS platform-specific code consists only of qs_port.h and QS callback functions defined typically in the board support package (bsp.c) of your application. Listing 11.11 shows an example of the qs_port.h header file for 80x86, QK/DOS, large memory model. Section 11.6 provides examples of the QS callback functions such as QS_onInit(), QS_onCleanup(), QS_onFlush(), and QS_onGetTime().

Listing 11.11. QS port header file for 80x86, QK/DOS, large memory model (<qp>qpcports80x86qk cpp101lqp_port.h)
#ifndef qs_port_h
#define qs_port_h
(1) #define QS_OBJ_PTR_SIZE 4
(2) #define QS_FUN_PTR_SIZE 4
(3) #define QS_TIME_SIZE 4
(4) #include "qf_port.h" /* use QS with QF */
(5) #include "qs.h" /* QS platform-independent public interface */
(5) #endif /* qs_port_h */

(1) The macro QS_OBJ_PTR_SIZE specifies the size (in bytes) of an object pointer on the particular platform.
(2) The macro QS_FUN_PTR_SIZE specifies the size (in bytes) of a function pointer on the particular platform.
(3) The macro QS_TIME_SIZE configures the size (in bytes) of the QS time stamp QSTimeCtr (see also Listing 11.3(43)).
(4) The QF port header file qf_port.h is included if QS is used together with the QF real-time framework.

Note
When QS is combined with QF, the QS critical section is the same as it is defined in the qf_port.h header file. However, QS can also be used with just the QEP component, or even completely standalone. In these cases, QS must provide its own, independent critical section mechanism by defining the macros QS_INT_KEY_TYPE, QS_INT_LOCK(), and QS_INT_UNLOCK() (see Section 11.3.3).

The QSPY Host Application

As described in Section 11.2, the host-resident component for the Quantum Spy software-tracing system is the QSPY host application. QSPY is a simple console application without any fancy GUI because its purpose is to provide only the QS data parsing, storing, and exporting to such powerful tools as MATLAB. QSPY has been designed from the ground up to be platform-neutral. The application is written in portable C++ and ports to Linux and Windows with various compilers are provided.

QSPY is easily adaptable to various target-host communication links. Out of the box, the QSPY host application supports serial (RS232), TCP/IP, and file communication links. Adding other communication links is easy because the data link is accessed only through a generic hardware abstraction layer (HAL).

The QSPY application accepts several command-line parameters to configure the data link and all target dependencies, such as pointer sizes, signal sizes, and the like. This means that the single QSPY host application can process data from any embedded target. The application has been tested with a wide range of 8-, 16-, or 32-bit CPUs.

QSPY provides a simple consolidated, human-readable textual output to the screen. If the QS trace data contains dictionary trace records (see Section 11.3.8), QSPY applies this symbolic information to output the provided identifiers for objects, signals, and states. Otherwise, QSPY outputs the hexadecimal values of various pointers and signals.

Finally, QSPY can export the trace data in the matrix format readable by MATLAB. A special MATLAB script to import QSPY trace data to MATLAB is provided. Once the data is available in MATLAB matrices, it can be conveniently manipulated and visualized with this powerful tool.

QSPY comes with a Reference Manual in electronic format (see Section 11.6.6), which contains detailed explanations of all command-line options, the human-readable format, and the MATLAB interface.

Installing QSPY

The QSPY host application is included in the code accompanying this book in the directory <qp>qpc oolsqspy (for QP/C) and also in <qp>qpcpp ools qspy (for QP/C++). The two versions are actually identical except that QSPY for QP/C includes the C-version of the qs.h header file, whereas the QP/C++ version includes the C++ version of qs.h. Listing 11.12 shows the contents of the QP Root Directory after the installation of QS component.

Listing 11.12. Source code organization for the QSPY host application
<qp>qpc - QP/C root directory (<qp>qpcpp for QP/C++)
|
+-doxygen - QP/C documentation generated with Doxygen
| +-html - "QP/C Reference Manual" in HTML format
| | +-index.html - The starting HTML page for the "QP/C Reference Manual"
| | | (contains the "QSPY Reference Manual")
| | +- . . .
| +-qpc.chm - "QP/C Reference Manual" in CHM Help format
| (contains the "QSPY Reference Manual")
|
+-include/ - QP platform-independent header files
| +-qs.h - QS platform-independent header file (used by QSPY)
|
+-tools - Tools directory
| +-qspy - QSPY host application
| | +-include - platform-independent include
| | | +-dict.h - dictionary class header file
| | | +-getopt.h - command-line option parser
| | | +-hal.h - Hardware Abstraction Layer header file
| | | +-qspy.h - QSPY parser header file
| | +-source - platform-independent sources (C++)
| | | +-dict.cpp - dictionary class implementation
| | | +-getopt.c - command-line option parser
| | | +-main.cpp - main() entry point
| | | +-qspy.cpp - QSpy parser
| | |
| | +-linux - Linux version of QSPY
| | | +-gnu - GNU compiler
| | | | +-dbg - debug build directory
| | | | +-rel - release build directory
| | | | +-com.cpp - serial port HAL for Linux
| | | | +-tcp.cpp - TCP/IP port HAL for Linux
| | | | +-Makefile - make file to build QSPY for Linux
| | |
| | +-win32 - Win32 (Windows) version of QSPY
| | | +-mingw - MinGW compiler (GNU)
| | | | +-dbg - debug build directory
| | | | +-rel - release build directory
| | | | | +-qspy.exe – QSPY executable
| | | | +-com.cpp - serial port HAL for Win32
| | | | +-tcp.cpp - TCP/IP port HAL for Win32
| | | | +-make.bat – Simple batch script to build QSPY
| | |
| | | +-vc2005 - Visual C++ 2005 toolset
| | | | +-Debug - debug build directory
| | | | +-Release - release build directory
| | | | | +-qspy.exe – QSPY executable
| | | | +-com.cpp - serial port HAL for Win32
| | | | +-tcp.cpp - TCP/IP port HAL for Win32
| | | | +-qspy.sln - Visual C++ Solution to build QSPY for Win32
| | |
| | +-matlab - MATLAB scripts
| | | +-qspy.m - MATLAB script to import the QS data into MATLAB
| | | +-dpp.spy - Example of a QS binary file from DPP application
| | | +-philo_timing.m - example MATLAB script to generate timing diagrams
| | | for the DPP example

Note
The QSPY host application includes the header file <qp>qpcincludeqs.h. The qs.h header file provides the link between the QS target-resident component and the QSPY host-resident component.

Building QSPY Application from Sources

The QSPY source code is written in portable C++, with ports to Windows and Linux already provided (see Listing 11.12). Note that the QSPY host application is coupled with the QS target component through the header file <qp>qpcincludeqs.h, which enumerates the predefined QS records.

Building QSPY for Windows with Visual C++ 2005

The Win32 executable of the QSPY application is provided in the file <qp>qpc oolsqspywin32vc2005Releaseqspy.exe. This executable should run on any version of 32-bit Windows.

If you want to rebuild the application, the directory <qp>qpc oolsqspywin32vc2005 contains the Microsoft Visual C++ 2005 solution file qspy.sln to build the QSPY application. You simply load the solution file to the Visual C++ 2005 IDE and start the build by pressing F7.

Building QSPY for Windows with MinGW

Alternatively, you can use the open source MinGW (Minimalist GNU for Windows) toolset available from www.mingw.org to build the QSPY executable. The directory <qp>qpc oolsqspywin32mingw contains a simple batch file make.bat to build the QSPY application. You probably need to modify the definition of the MINGW symbol at the top of the batch file to point it to the location where you installed the MinGW toolset. By default, make.bat produces the debug version of the application in the directory <qp>qpc oolsqspywin32mingwdbg. To produce the release version, add the ‘rel’ parameter to the make.bat script (make rel). The release version is produced in the release directory: <qp>qpc oolsqspywin32mingw el.

Building QSPY for Linux

The directory <qp>qpc oolsqspylinuxgnu contains the Makefile for building QSPY for Linux. By default, the Makefile produces the debug version of the application in the directory <qp>qpc oolsqspylinuxgnudbg. To produce the release version, add the ‘rel’ target to the make (make rel). The release version is produced in the release directory: <qp>qpc oolsqspylinuxgnu el.

Invoking QSPY

The QSPY host application is designed to work with all possible target CPUs and data links, which requires a wide range of configurability. For example, for any given target CPU, the QSPY application must “know” the size of object pointers, function pointers, event signals, timestamp size, and so on. You provide this information to QSPY by means of command-line parameters, which are summarized in Table 11.3

Table 11.3. Summary of QSpy command-line options

Option	Example	Default	Must Match QP Macro (QP Port Header File)	Comments
-h	-h			Help; prints the summary of options
-q	-q			Quiet mode (no stdout output)
-o	-o qs.txt			Produces output to the specified file
-s	-s qs.spy			Saves the binary input to the specified file; not compatible with -f
-m	-m qs.mat			Generates MATLAB output to the specified file
-c	-c COM2	COM1		COM port selection; not compatible with –t, -p, -f
-b	-b 115200	38400		Baud rate selection; not compatible with –t, -p, -f
-t	-t			TCP/IP input selection; not compatible with –c, -b, -f
-p	-p 6602	6601		TCP/IP server port number; not compatible with –c, -b, -f
-f	-f qs.spy			File input selection; not compatible with –c, -b, -t, -p
-T	-T 2	4	QS_TIME_SIZE (qs_port.h)	Time stamp size in bytes; valid values: 1, 2, 4
-O	-O 2	4	QS_OBJ_PTR_SIZE (qs_port.h)	Object pointer size in bytes; valid values: 1, 2, 4
-F	-F 2	4	QS_FUN_PTR_SIZE (qs_port.h)	Function pointer size in bytes; valid values: 1, 2, 4
-S	-S 2	1	Q_SIGNAL_SIZE (qep_port.h)	Signal size in bytes; valid values: 1, 2, 4
-E	-E 1	2	QF_EVENT_SIZ_SIZE (qf_port.h)	Event-size size in bytes (i.e., the size of variables that hold event size); valid values: 1, 2, 4
-Q	-Q 1	2	QF_EQUEUE_CTR_SIZE (qf_port.h)	Queue counter size in bytes; valid values 1, 2, 4
-P	-P 4	2	QF_MPOOL_CTR_SIZE (qf_port.h)	Pool counter size in bytes; valid values: 1, 2, 4
-B	-B 1	2	QF_MPOOL_SIZ_SIZE (qf_port.h)	Block size size in bytes (i.e., the size of variables that hold memory block size); valid values 1, 2, 4
-C	-C 4	2	QF_TIMEEVT_CTR_SIZE (qf_port.h)	Time event counter size; valid values: 1, 2, 4

Your main concern when invoking QSPY is to match exactly the target system you are using. The fourth column of Table 11.3 lists the configuration macros used by the target system as well as the platform-specific QP header files where those macros are defined. You need to use the corresponding QSPY command-line option only when the QP macro differs from the default. The default values assumed by QSPY are consistent with the defaults used in QP.

Note
When you do not match the QSPY host application with the QS target component, the QSPY application will be unable to correctly parse the mismatched trace records and will start generating the following errors:
********** 028: Error xx bytes unparsed
********** 014: Error -yy bytes unparsed
The number in front of the error indicates the Record ID of the trace record that could not be parsed.

Exporting Trace Data to MATLAB

The QSPY host application can also export trace data to MATLAB, which is a popular numerical computing environment and a high-level technical programming language. Created by The MathWorks, Inc., MATLAB allows easy manipulation and plotting of data represented as matrices.

Figure 11.6 summarizes the interface between the QSPY host application and MATLAB. The interface consists of the QSPY MATLAB output file, the qspy.m MATLAB script, and MATLAB matrices generated by the script in the current MATLAB workspace. The following sections explain these elements.

Figure 11.6. Exporting trace data to MATLAB.

Analyzing Trace Data with MATLAB

When you invoke QSPY with the –m <file name> option, the QSPY application generates a MATLAB-readable file of the specified name in addition to the human-readable format discussed in the previous section.

The MATLAB output file is an ASCII file that contains all the trace records formatted for MATLAB. However, the various trace records in the MATLAB file are still in the same order as they were produced in the target and don't yet form proper MATLAB matrices, which are the most natural way of representing data within MATLAB. You can find an example of a QSPY MATLAB output in the file <qp>qpcexamples80x86qk cpp101ldppdpp.mat.

The directory <qp>qpc oolsqspymatlab contains the MATLAB script qspy.m, which reads in the QSPY MATLAB file and converts the data into several MATLAB matrices in the current workspace. Assuming that the directory <qp>qpc oolsqspymatlab is included in the MATLAB path, you invoke the script from the MATLAB command window as follows:

Q_FILE=‘<qp>qpcexamples80x86qk cpp101ldppdpp.mat’; qspy

The variable Q_FILE is set to the file name of the QSPY MATLAB file. Note that the qspy.m script is intentionally not a MATLAB function because its main purpose is to fill the current workspace with matrices that remain after the script is done, which is not possible with a function that runs in a separate temporary workspace.

At this point you have all the data conveniently represented in MATLAB matrices. After filling in the matrices, the qspy.m script executes the ‘whos’ command to show the created objects. The matrices with the prefix Q_ contain the time-ordered trace data. All MATLAB matrices are documented in the “QSPY Reference Manual,” which is available in electronic format in the code accompanying this book (see Section 11.6.6).

Just to demonstrate what you can do with the data, the Figure 11.7 shows the timing diagrams for all Philosopher state machines in the DPP application.

MATLAB plot showing timing diagrams of the five Philosophers generated from the QS trace data. The vertical axis represents states “thinking” (lowest), “hungry” (middle), and “eating” (top).

Figure 11.7. MATLAB plot showing timing diagrams of the five Philosophers generated from the QS trace data. The vertical axis represents states “thinking” (lowest), “hungry” (middle), and “eating” (top).

The plots shown in Figure 11.7 have been generated by running the script philo_timing.m provided in the <qp>qpc oolsqspymatlab directory. Assuming that this directory is in the MATLAB path, you simply type the script's name at the MATLAB prompt:

» philo_timing

The philo_timing.m script displays the data from the MATLAB matrix Q_STATE, which you generated by running the qspy.m script. The Q_STATE matrix contains all the state machine information. Section 11.5.4 explains how this plot has been generated.

MATLAB Output File

The QSPY MATLAB file is in ASCII format and Listing 11.13 shows a snippet of the QSPY MATLAB file generated from the DPP application (see also Section 11.2.1).

Listing 11.13. Fragment of the QSPY MATLAB file for the DPP application
. . . . . .
62 Philo_initial= 308543675;
62 Philo_thinking= 308544589;
62 Philo_hungry= 308544835;
62 Philo_eating= 308545073;
60 HUNGRY_SIG=[ 8 382343546];
60 TIMEOUT_SIG=[ 10 382343546];
12 0 4 382343546
3 382343546 337510406 308544589
50 0 64 64
51 0 64 64
32 0 382343694 382343546 9 0
1 382343546 308544589
4 0 382343546 308544589
. . . . . .
33 382344192 382344044
37 528197 382344192 10 382344044
14 528208 10 382344044 0 0 5 5
33 382343860 382343712
37 528222 382343860 10 382343712
14 528234 10 382343712 0 0 5 5
42 528249 1 255
17 528262 10 382344044 0 0
28 528276 3 8
24 528287 382343384 9 9
14 528299 8 382344376 1 0 5 5
17 528312 8 382344376 1 1
28 528333 3 4
24 528343 382343384 8 8
26 528355 4 1
14 528367 4 382344210 1 1 5 5
14 528379 4 382344044 1 2 5 5
14 528391 4 382343878 1 3 5 5
14 528403 4 382343712 1 4 5 5
14 528415 4 382343546 1 5 5 5
. . . . . .

The QSPY MATLAB file is stored in portable ASCII format for cross-platform portability, but is really not intended to be human-readable. The purpose of Listing 11.13 is simply to demonstrate that the data is mostly numerical, with the only exception of the “dictionary” entries, which actually are stored as MATLAB commands.

The MATLAB output file shown in Listing 11.13 contains all the trace records formatted for MATLAB. However, the various records at this stage are still in the same order as they were produced in the target and don't yet form proper MATLAB matrices, which are the most natural way of representing data within MATLAB.

MATLAB Script `qspy.m`

The MATLAB script qspy.m, located in the directory <qp>qpc oolsqspymatlab, is designed to read the QSPY MATLAB file and sort the different records into various MATLAB matrices for subsequent analysis. In Section 11.5.1, I described how to invoke the script from MATLAB. Here I discuss the qspy.m script itself, which is shown in Listing 11.14.

Listing 11.14. Script qspy.m for Importing QSPY trace data to MATLAB
(1) % the string Q_FILE must be defined
(2) fid = fopen(Q_FILE, ‘r’);
(2) if fid == -1
(3) error(‘file not found’)
(3) end
(4) Q_STATE = []; % sate entry/exit, init, tran, internal tran, ignored
(4) Q_EQUEUE = []; % QEQueue
(4) Q_MPOOL = []; % QMPool
(4) Q_NEW = []; % new/gc
(4) Q_ACTIVE = []; % active add/remove, subscribe/unsubscribe
(4) Q_PUB = []; % publish/publish attempt
(4) Q_TIME = []; % time event arm/disarm/rearm, clock tick
(4) Q_INT_LOCK = []; % interrupt locking/unlocking
(4) Q_ISR_LOCK = []; % ISR entry/exit
(4) Q_MUTEX = []; % QK mutex locking/unlocking
(4) Q_SCHED = []; % QK scheduler events
(4) Q_TOT = 0; % total number of records processed
(5) while feof(fid) == 0
(5) line = fgetl(fid);
(5) Q_TOT = Q_TOT+1;
(6) rec = sscanf(line, '%d’, 1); % extract the record type
(7) switch rec % discriminate based on the record type
(7) % QEP trace records
(8) case 1 % QS_QEP_STATE_ENTRY
(8) Q_STATE(size(Q_STATE,1)+1,:) = ...
(8) [NaN 1 sscanf(line, '%*u %u %u’)' NaN 1];
(8) case 2 % QS_QEP_STATE_EXIT
(8) Q_STATE(size(Q_STATE,1)+1,:) = ...
(8) [NaN 2 sscanf(line, '%*u %u %u’)' NaN 1];
(8) case 3 % QS_QEP_STATE_INIT
(8) Q_STATE(size(Q_STATE,1)+1,:) = ...
(8) [NaN 3 sscanf(line, '%*u %u %u %u’)' 1];
(8) case 4 % QS_QEP_INIT_TRAN
(8) tmp = sscanf(line, '%*u %u %u %u’)';
(8) Q_STATE(size(Q_STATE,1)+1,:) = ...
(8) [tmp(1) 3 tmp(2) NaN tmp(3) 1];
(8) case 5 % QS_QEP_INTERN_TRAN
(8) Q_STATE(size(Q_STATE,1)+1,:) = ...
(8) [sscanf(line, '%*u %u %u %u %u’)' NaN 1];
(8) case 6 % QS_QEP_TRAN
(8) Q_STATE(size(Q_STATE,1)+1,:) = ...
(8) [sscanf(line, '%*u %u %u %u %u’)' 1];
(8) case 7 % QS_QEP_IGNORED
(8) Q_STATE(size(Q_STATE,1)+1,:) = ...
(8) [sscanf(line, '%*u %u %u %u %u’)' NaN 0];
(8) % QF trace records
(8) case 10 % QS_QF_ACTIVE_ADD
(8) tmp = sscanf(line, '%*u %u %u %u %u’)';
(8) Q_ACTIVE(size(Q_ACTIVE,1)+1,:) = [tmp(1) NaN tmp(2) tmp(3) 1];
(8) . . . .
(8) % Miscallaneous QS records
(8) case 60 % QS_SIG_DICTIONARY
(8) eval(line(5:end));
(8) case 61 % QS_OBJ_DICTIONARY
(8) eval(line(5:end));
(8) case 62 % QS_FUN_DICTIONARY
(8) eval(line(5:end));
(8) . . .
(8) % User records
(9) % . . .
(9) end
(9) end
(9) % cleanup ...
(10) fclose(fid);
(10) clear fid;
(10) clear line;
(10) clear rec;
(10) clear tmp;
(10) % display status information...
(11) Q_TOT
(12) whos

(1) This is a simple MATLAB script, not a MATLAB function, because the main purpose of qspy.m is to create new data that remain in the workspace after the script finishes so that you can use them for further computations. (A MATLAB function runs in a separate workspace, which disappears after the function returns.)
(2) The variable Q_FILE contains the name of the QSPY MATLAB file and must be defined before starting the qspy.m script.
(3) The specified MATLAB file must exist.
(4) The QSPY MATLAB matrices are cleared to be filled with trace data.
(5) The entire MATLAB file is processed one line at a time.
(6) The first number in each line is the QS record type (see Listing 11.3(2)).
(7) The switch statement discriminates based on the record type.
(8) Different case statements read specific record types and place the data into MATLAB matrices described in the “QSPY Reference Manual.”
(9) Here you can extend the script to include user-specific (your) trace records.
(10) In this section the temporary data is cleaned up (this is a simple script, not a function).
(11,12) The total number of records and the contents of the current workspace are displayed.

MATLAB Matrices Generated by `qspy.m`

The qspy.m script generates 11 MATLAB matrices (see Listing 11.14). Each of these matrices contains different group of related QS trace records. The “QSPY Reference Manual,” available in electronic format in the code accompanying this book (see Section 11.6.6), contains documentation for all MATLAB matrices generated by the qspy.m script. Here I give you only one example, the matrix Q_STATE, which stores all QS records pertaining to state machine activities in the system. Table 11.4

Table 11.4. Q_STATE matrix (N-by-6 ) produced by the qspy.m script

MATLAB Index	1	2	3	4	5	6
QS Record	Timesstamp	Signal	State Machine Object	Source State	New State	Event Handled
QS_QEP_STATE_ENTRY	NaN	1	√^[2]	√	NaN	1
QS_QEP_STATE_EXIT	NaN	2	√^[2]	√	NaN	1
QS_QEP_STATE_INIT	NaN	3	√^[2]	√	√	1
QS_QEP_STATE_INIT_TRAN	√	3	√^[2]	NaN	√	1
QS_QEP_STATE_INTERN_TRAN	√	√^[1]	√^[2]	√	NaN	1
QS_QEP_STATE_TRAN	√	√^[1]	√^[2]	√	√	1
QS_QEP_STATE_IGNORED	√	√^[1]	√^[2]	√	NaN	0

By MATLAB convention, the different variables such as timestamp, event signal, and the like are put into columns, allowing observations to vary down through the rows. Therefore, a data set consisting of N time samples of six variables is stored in a matrix of size N-by-6. The checkmark ‘√’ in a given cell of the matrix represents data available from the QSPY file. Other values represent data added by the qspy.m script to fill all the matrix cells and to allow unambiguous identification of the trace records. For example, the following six index matrices unambiguously select the QS trace record from the matrix:

QS Record	MATLAB Index Matrix
QS_QEP_STATE_ENTRY	Q_STATE(:,2) == 1
QS_QEP_STATE_EXIT	Q_STATE(:,2) == 2
QS_QEP_STATE_INIT	Q_STATE(:,2) == 3
QS_QEP_STATE_INIT_TRAN	isnan(Q_STATE(:,4))
QS_QEP_STATE_INTERN_TRAN	Q_STATE(:,2) > 3 & isnan(Q_STATE(:,5))
QS_QEP_STATE_TRAN	Q_STATE(:,2) > 3 & ~isnan(Q_STATE(:,5))
QS_QEP_STATE_IGNORED	~Q_STATE(:,6)

As an example of using the information contained in the matrix Q_STATE, consider the timing diagrams for the Philosopher active objects shown in Figure 11.7. These timing diagrams have been generated with the script philo_timing.m shown in Listing 11.15.

Listing 11.15. MATLAB script philo_timing.m that generates timing diagrams shown in Figure 11.7
(1) t=Q_STATE(:,2)>3 & ~isnan(Q_STATE(:,5)); % QS_QEP_STATE_TRAN
(2) o=Q_STATE(:,3) == l_philo_0_;
(3) subplot(5,1,1); stairs(Q_STATE(o & t,1),Q_STATE(o & t,5),‘r’)
(4) o=Q_STATE(:,3) == l_philo_1_;
(4) subplot(5,1,2); stairs(Q_STATE(o & t,1),Q_STATE(o & t,5),‘r’)
(4) o=Q_STATE(:,3) == l_philo_2_;
(4) subplot(5,1,3); stairs(Q_STATE(o & t,1),Q_STATE(o & t,5),‘r’)
(4) o=Q_STATE(:,3) == l_philo_3_;
(4) subplot(5,1,4); stairs(Q_STATE(o & t,1),Q_STATE(o & t,5),‘r’)
(4) o=Q_STATE(:,3) == l_philo_4_;
(4) subplot(5,1,5); stairs(Q_STATE(o & t,1),Q_STATE(o & t,5),‘r’)
(4) xlabel(‘time stamp’); zoom on

(1) The index matrix ‘t’ selects only the rows of the Q_STATE matrix that correspond to state transitions (the QS_QEP_STATE_TRAN trace record). The conditions used for a transition are: signal >3 (user-defined signal) and new state is available (not a NaN).
(2) The index matrix ‘o’ (object) selects only the Philosopher 0 state machine object. This line of code makes use of the dictionary entry l_pholo_0_.
(3) The timing diagram for Philosopher 0 is drawn using the index matrices ‘t’ and ‘o.’
(4) The index matrix ‘o’ is created for Philosopher 1, 2, 3, and 4 state machine objects and timing diagrams for these objects are drawn.

Obviously, this short demonstration barely scratches the surface of the possibilities. Refer to the “QSPY Reference Manual” for the description of other MATLAB matrices. The rest is MATLAB.

Adding QS Software Tracing to a QP Application

In this section I show you how to add QS software tracing to the DPP application, which I already used as an example of a software-tracing session at the beginning of this chapter (Section 11.2.1).

The DPP example for 80x86 with the QK preemptive kernel demonstrates the use of QS with all QP components: QEP, QF, and QK. The example is located in the directory <qp>qpcexamples80x86qk cpp101ldpp. You can rebuild the “Spy” configuration by loading the DPP-SPY.PRJ project into the Turbo C++ IDE. The application links to the “Spy” versions of the QP libraries located in the directory <qp>qpcports80x86qk cpp101lspy.

The DPP example demonstrates all aspects of QS setup. In particular it demonstrates how to send the QS trace data over a serial port (UART) and how to timestamp QS records with submicrosecond precision using the 8254 timer/counter found in every x86-based PC.

Initializing QS and Setting Up the Filters

(1) When the QS tracing is enabled (i.e., the macro Q_SPY is defined), the header file qp_port.h includes the QS active interface qs.h (see Listing 11.3).
(2) The BSP initialization also initializes QS (see Listing 11.17
Listing 11.16. Initialization of QS, setting up the filters, and generating dictionary entries (file <qp>qpcexamples80x86qk cpp101ldppmain.c)
- (1) #include "qp_port.h"
  (1) #include "dpp.h"
  (1) #include "bsp.h"
  (1) /* Local-scope objects --------------------------------------------------*/
  (1) static QEvent const *l_tableQueueSto[N_PHILO];
  (1) static QEvent const *l_philoQueueSto[N_PHILO][N_PHILO];
  (1) static QSubscrList l_subscrSto[MAX_PUB_SIG];
  (1) static union SmallEvent {
  (1) void *min_size;
  (1) TableEvt te;
  (1) /* other event types to go into this pool */
  (1) } l_smlPoolSto[2*N_PHILO]; /* storage for the small event pool */
  (1) /*......................................................................*/
  (1) int main(int argc, char *argv[]) {
  (1) uint8_t n;
  (1) Philo_ctor(); /* instantiate all Philosopher active objects */
  (1) Table_ctor(); /* instantiate the Table active object */
- (2) BSP_init(argc, argv); /* initialize the BSP (including QS) */
  (2) QF_init(); /* initialize the framework and the underlying RT kernel */
  (2) /* setup the QS filters ... */
- (3) QS_FILTER_ON (QS_ALL_RECORDS);
- (4) QS_FILTER_OFF(QS_QF_INT_LOCK);
- (5) QS_FILTER_OFF(QS_QF_INT_UNLOCK);
- (6) QS_FILTER_OFF(QS_QK_SCHEDULE);
  (6) /* provide object dictionaries... */
- (7) QS_OBJ_DICTIONARY(l_smlPoolSto);
- (8) QS_OBJ_DICTIONARY(l_tableQueueSto);
- (9) QS_OBJ_DICTIONARY(l_philoQueueSto[0]);
  (9) QS_OBJ_DICTIONARY(l_philoQueueSto[1]);
  (9) QS_OBJ_DICTIONARY(l_philoQueueSto[2]);
  (9) QS_OBJ_DICTIONARY(l_philoQueueSto[3]);
  (9) QS_OBJ_DICTIONARY(l_philoQueueSto[4]);
  (9) QF_psInit(l_subscrSto, Q_DIM(l_subscrSto)); /* init publish-subscribe */
  (9) /* initialize event pools... */
  (9) QF_poolInit(l_smlPoolSto, sizeof(l_smlPoolSto), sizeof(l_smlPoolSto[0]));
  (9) for (n = 0; n < N_PHILO; ++n) { /* start the active objects... */
  (9) QActive_start(AO_Philo[n], (uint8_t)(n + 1),
  (9) l_philoQueueSto[n], Q_DIM(l_philoQueueSto[n]),
  (9) (void *)0, 0, (QEvent *)0);
  (9) }
  (9) QActive_start(AO_Table, (uint8_t)(N_PHILO + 1),
  (9) l_tableQueueSto, Q_DIM(l_tableQueueSto),
  (9) (void *)0, 0, (QEvent *)0);
  (9) QF_run(); /* run the QF application */
  (9) return 0;
  (9) }
(3) Right after initialization, all QS global filters are disabled. Here I enable all global filters.
(4-6) I disable the high-volume trace records to avoid overrunning the QS trace buffer (see Section 11.3.7).
(7-9) I provide object dictionary entries (see Section 11.3.8) for all local-scope objects defined in this module. Note that I need to do it here because these objects are only known in this translation unit.

Defining Platform-Specific QS Callbacks

Listing 11.17 shows the platform-specific QS callback functions for QK/DOS. The QS tracing uses one of the standard 16550 UARTs (COM1-COM4) for data output at 115200 baud rate. I defer the discussion of the QS timestamp (the QS_onGetTime() callback) to the next section.

(1) The macro QS_INIT() initializes the QS component. The macro returns FALSE when the initialization fails (e.g., the specified COM port cannot be opened).
(2) Failing to initialize QS causes an error. Note that the assertion only fires when QS is active because the dummy version of the QS_INIT() macro always returns TRUE (see Listing 11.4(2)).
(3) The QK preemptive kernel calls the QK_onIdle() callback function from the idle loop (see Section 10.3.5 in Chapter 10). This callback is an ideal place to perform output of the trace data in the least intrusive way. The QS output in the QK_onIdle() function uses the QS_getBlock() API, which I already discussed in Listing 11.9 in Section 11.3.7.
(4) The function UART_config() configures one of the standard UARTs of the 80x86-based PC (COM1-COM4).
(5) The callback function QS_onStartup() initializes the QS component.
(6) The function QS_onStartup() must always initialize the QS trace buffer by calling QS_initBuf() (see Section 11.3.7).
(7) The callback QS_onCleanup() function performs cleanup of QS. This function has nothing to clean up in this case.
(8) The callback function QS_onFlush() flushes the entire trace buffer to the host. This function is called after each dictionary trace record to avoid overrunning the trace buffer during the system initialization.
(9) The function QS_onFlush() busy-waits until the data is sent out. Note that this policy might be only appropriate during the initial transient.
(10) The callback function QS_onGetTime() provides the timestamp for the QS trace records. I discuss this function in the next section.

Generating QS Timestamps with the `QS_onGetTime()` Callback

Most QS trace records are timestamped, which ties all the trace records to the common timeline. To be truly useful, the QS timestamps should have microsecond-level resolution or better, which is only possible with a dedicated clock or a timer device. The callback function QS_onGetTime() encapsulates the particular method of obtaining the timestamp. QS always calls this callback function inside a critical section of code.

In case of a standard PC, the time can be obtained from the 8284 timer/counter. The counter-0 of the 8254 chip is a 16-bit down-counter that is set up to generate the standard 18.2Hz clock-tick interrupt when it underflows from 0 to 0xFFFF. The counting rate is 1.193182MHz, which works out to approximately 0.838 microseconds per count.

The basic idea of using this 16-bit down-counter for 32-bit timestamping of the QS records is shown in Figure 11.8. The system clock-tick interrupt maintains the 32-bit-wide timestamp at tick l_tickTime. The clock-tick interrupt increments l_tickTime by 0x10000 counts to account for the 16-bit rollover. The complete 32-bit timestamp is constructed by adding l_tickTime and the on-chip count, or actually (0x10000 – the on-chip count), because the 8284 timer/counter counts down.

Figure 11.8. Using channel 0 of the 8254 PIT to generate 32-bit QS timestamp.

The method just described provides fine-granularity timestamp most of the time but occasionally can be off by the full period of the 16-bit counter (0x10000 counts). This can happen because the once-per-period “sampling” rate of the system tick interrupt is not sufficient to completely resolve the timer-cycle ambiguity. This “undersampling” allows a small time window in which the 16-bit clock rolls over but before the system clock interrupt increments l_tickTime by 0x10000 counts. This can happen due to interrupt locking. As a remedy, the QS_onGetTime() contains a protection against the time going “backward.” This solution assumes that the function QS_getTime() is called at least once per system clock tick. Listing 11.18 shows the QS timestamp implementation.

Listing 11.18. Generating QS timestamp (file <qp>qpcexamples80x86qk cpp101ldppsp.c)
/* Local-scope objects --------------------------------------------------*/
#ifdef Q_SPY
static QSTimeCtr l_tickTime; /* keeps timestamp at tick */
static uint32_t l_lastTime; /* last timestamp */
#endif
. . .
void interrupt ISR_tmr(void) {
uint8_t pin;

#ifdef Q_SPY
(1) l_tickTime += 0x10000; /* add 16-bit rollover */
(1) #endif
(1) QK_ISR_ENTRY(pin, TMR_ISR_PRIO); /* inform QK about entering the ISR */
(1) QF_tick(); /* call QF_tick() outside of critical section */
(1) QK_ISR_EXIT(pin); /* inform QK about exiting the ISR */
(1) }
(1) /*.....................................................................*/
(1) #ifdef Q_SPY /* define QS callbacks */
(1) . . .
(2) QSTimeCtr QS_onGetTime(void) { /* invoked with interrupts locked */
(2) uint32_t now;
(2) uint16_t count16; /* 16-bit count from the 8254 */
(3) if (l_tickTime != 0) { /* time tick has started? */
(4) outportb(0x43, 0); /* latch the 8254's counter-0 count */
(5) count16 = (uint16_t)inportb(0x40);/* read the low byte of counter-0 */
(6) count16 += ((uint16_t)inportb(0x40) << 8); /* add on the hi byte */
(7) now = l_tickTime + (0x10000 - count16);
(8) if (l_lastTime > now) { /* are we going "back" in time? */
(9) now += 0x10000; /* assume that there was one rollover */
(9) }
(10) l_lastTime = now;
(10) }
(10) else {
(11) now = 0;
(11) }
(12) return (QSTimeCtr)now;
(12) }
(12) #endif /* Q_SPY */

(1) The system clock tick interrupt increments the timestamp at tick l_tickTime by 0x10000 to account for the 16-bit counter rollover.
(2) The QS_onGetTime() is always called inside a critical section of code. The QSTimeCtr type is defined according to the macro QS_TIME_SIZE (see Listing 11.11(3)).
(3,11) During the initial transient, before the clock-tick interrupt is enabled, the QS_onGetTime() returns 0.
(4) The counter-0 in the 8254 is latched.
(5,6) The two halves of the counter can be now safely read, starting with the least significant byte.
(7) The complete 32-bit timestamp ‘now’ is constructed.
(8) Due to undersampling of the counter, the check is performed to find out if a counter rollover has been missed. (Note: This check assumes that QS_onGetTime() is called at least once per rollover period.)
(9) If so, the counter is corrected by 0x10000.
(10) The last timestamp value is updated for the next time around.
(12) The timestamp value is returned to the caller.

Generating QS Dictionary Records from Active Objects

The few dictionary records generated from main() (Listing 11.16) provide only the symbolic information available at the global level. However, the encapsulated application components, such as the Philosopher and Table active objects, also need to provide the symbolic information in the trace to the QSPY host application. Listings 11.19 and 11.20 show how to generate dictionary trace records from individual components without compromising their encapsulation.

Listing 11.19. Generating dictionary trace records from the Table active object (file <qp>qpcexamples80x86qk cpp101ldpp able.c)
(1) static Table l_table; /* the single instance of the Table active object */
(1) . . .
(2) QState Table_initial(Table *me, QEvent const *e) {
(2) (void)e; /* suppress the compiler warning about unused parameter */
(3) QS_OBJ_DICTIONARY(&l_table);
(4) QS_FUN_DICTIONARY(&QHsm_top);
(5) QS_FUN_DICTIONARY(&Table_initial);
(6) QS_FUN_DICTIONARY(&Table_serving);
(7) QS_SIG_DICTIONARY(DONE_SIG, 0); /* global signals */
(8) QS_SIG_DICTIONARY(EAT_SIG, 0);
(9) QS_SIG_DICTIONARY(TERMINATE_SIG, 0);
(10) QS_SIG_DICTIONARY(HUNGRY_SIG, me); /* signal just for Table */
(10) /* signal HUNGRY_SIG is posted directly */
(10) QActive_subscribe((QActive *)me, DONE_SIG);
(10) QActive_subscribe((QActive *)me, TERMINATE_SIG);
(10) return Q_TRAN(&Table_serving);
(10) }

Listing 11.20. Generating dictionary trace records from the Philosopher active objects (file <qp>qpcexamples80x86qk cpp101ldppphilo.c)
(1) static Philo l_philo[N_PHILO]; /* storage for all Philos */
(1) . . .
(1) /*......................................................................*/
(2) QState Philo_initial(Philo *me, QEvent const *e) {
(2) static uint8_t registered; /* starts off with 0, per C-standard */
(3) if (!registered) {
(3) registered = (uint8_t)1;
(4) QS_OBJ_DICTIONARY(&l_philo[0]);
(5) QS_OBJ_DICTIONARY(&l_philo[0].timeEvt);
(5) QS_OBJ_DICTIONARY(&l_philo[1]);
(5) QS_OBJ_DICTIONARY(&l_philo[1].timeEvt);
(5) QS_OBJ_DICTIONARY(&l_philo[2]);
(5) QS_OBJ_DICTIONARY(&l_philo[2].timeEvt);
(5) QS_OBJ_DICTIONARY(&l_philo[3]);
(5) QS_OBJ_DICTIONARY(&l_philo[3].timeEvt);
(5) QS_OBJ_DICTIONARY(&l_philo[4]);
(5) QS_OBJ_DICTIONARY(&l_philo[4].timeEvt);
(6) QS_FUN_DICTIONARY(&Philo_initial);
(6) QS_FUN_DICTIONARY(&Philo_thinking);
(6) QS_FUN_DICTIONARY(&Philo_hungry);
(6) QS_FUN_DICTIONARY(&Philo_eating);
(6) randomize(); /* initialize the random number generator just once */
(6) }
(7) QS_SIG_DICTIONARY(HUNGRY_SIG, me); /* signal for each Philos */
(8) QS_SIG_DICTIONARY(TIMEOUT_SIG, me); /* signal for each Philos */
(8) QActive_subscribe((QActive *)me, EAT_SIG);
(8) return Q_TRAN(&Philo_thinking); /* top-most initial transition */
(8) }

Listing 11.21. Generating application-specific trace records (file <qp>qpcexamples80x86qk cpp101ldppsp.c)
#ifdef Q_SPY
. . .
enum AppRecords { /* application-specific trace records */
(1) PHILO_STAT = QS_USER
(1) };
(1) #endif
(1) . . .
(1) /*......................................................................*/
(1) void BSP_displyPhilStat(uint8_t n, char const *stat) {
(1) Video_printStrAt(17, 12 + n, VIDEO_FGND_YELLOW, stat);
(2) QS_BEGIN(PHILO_STAT, AO_Philo[n]) /* application-specific record begin */
(3) QS_U8(1, n); /* Philosopher number */
(4) QS_STR(stat); /* Philosopher status */
(5) QS_END()
(5) }

(1) The Table active object instance is declared at file scope and is strictly encapsulated inside table.c.
(2) Generally, the topmost initial pseudostate (see Section 4.5.3 in Chapter 4) is the best place for generating the dictionary records. The QF real-time framework executes the topmost initial transition directly from QActive_start(); therefore, the dictionary records are generated during the initial transient, before the real-time constraints start to apply.
(3) Note that the object dictionary QS_OBJ_DICTIONARY() is generated using the address of the static variable l_table rather than the generic pointer “me.” Obviously, the name “l_table” is more descriptive than “me.”
(4-6) The function dictionary records provide symbolic names for the state-handler functions. Note that a function dictionary record for QHsm_top() should be also provided.
(7-9) The signal dictionary records for globally published signals must be associated with all state machines in the system. You achieve this by using zero as the second parameter to the macro QS_SIG_DICTIONARY().
(10) The HUNGRY_SIG signal is posted directly to the Table active object, so it does need to have the same meaning globally. You can associate such signals with a particular state machine object by providing the address of the state machine as the second parameter to the macro QS_SIG_DICTIONARY().

(1) All Philosopher active object instances are declared at file scope and are strictly encapsulated inside philo.c.
(2) The Philo_initial() initial pseudostate is invoked once for every Philosopher active object.
(3) The test of the static variable registered ensures that the code inside the if statement runs only once.
(4) An object dictionary is generated for the Philosopher active object. Note that the object dictionary QS_OBJ_DICTIONARY() is generated using the address of the static variable l_philo[0] rather than the generic pointer “me”. Obviously, the name “l_philo[0]” is more descriptive than “me.”
(5) An object dictionary entry is also generated for the private time event member of the Philosopher active object.
(6) The function dictionary entries for all Philosopher state-handler functions are generated only once.
(7,8) In contrast, the signal dictionary records for local signals are generated for every Philosopher instance.

Adding Application-Specific Trace Records

Although QS generates a very detailed trace even without adding any instrumentation to the application, it also allows inserting application-specific records into the trace (see Section 11.3.9). Listing 11.21 provides an example of a simple application-specific trace record that reports the Philosopher status.

(1) The application-specific trace record types need to be enumerated. Note that the user-level record types do not start from zero but rather are offset by the constant QS_USER.
(2) An application-specific trace record starts with QS_BEGIN(). The first parameter is the record type. The second parameter is the object pointer corresponding to this trace record, which is set to AO_philo[n]. This means that you have an option to use the application-specific local filter QS_FILTER_AP_OBJ(AO_philo[<n>]) to selectively trace only the AO_philo[<n>] object, where <n>=0..4. As described in Section 11.3.9, you can also set the second parameter to NULL, which will disable the local filter.
(3) The QS_U8() macro outputs the byte ‘n’ to the trace record to be formatted as 1 using one digit.
(4) The QS_STR() macro outputs the string ‘stat’ to the trace record.
(5) An application-specific trace record ends with QS_END().

The following QSPY trace output shows the application-specific trace records generated by the DPP application:

. . .
0000525113 User000: 4 eating
. . .
0000591471 User000: 3 hungry
. . .
0000591596 User000: 2 hungry
. . .
0000591730 User000: 0 hungry
. . .
0000852276 User000: 4 thinking
. . .
0000852387 User000: 3 eating
. . .
0000983937 User000: 1 thinking
. . .
0000984047 User000: 0 eating
. . .
0001246064 User000: 3 thinking

“QSPY Reference Manual”

The source code available from the companion Website to this book at www.quantum-leaps.com/psicc2/ contains the “QSPY Reference Manual” in HTML and CHM-Help formats (see Figure 11.9). The “QSPY Reference Manual” contains descriptions of the command-line options, human-readable format, and all MATLAB matrices generated by the qspy.m script.

Figure 11.9. Screen shots of the “QSPY Reference Manual,” which is available in HTML and CHM-help formats.

The “QSPY Reference Manual” is part of the bigger manual “QP Reference Manual” (see Section 7.12 in Chapter 7), which is located in <qp>qpcdoxygen directory. The HTML documentation is found in <qp>qpcdoxygenhtml, whereas the CHM Help format is located in <qp>qpcqpc.chm.

Summary

Testing and debugging your system can often take more calendar time and effort than analysis, design, and coding combined. The biggest problem, especially in embedded systems domain, is the very limited visibility into the target system.

Software tracing is a method for obtaining diagnostic information in a live environment without the need to stop the application to get the system feedback. Software-tracing techniques are especially effective and powerful in combination with the event-driven paradigm because all important system interactions funnel through the event-driven infrastructure. The inversion of control so common in event-driven architectures offers a unique opportunity to instrument just the event-driven infrastructure, to gain unprecedented insight into the entire system.

The Quantum Spy software-tracing system allows you to monitor the behavior of QP applications in a live environment without degrading the application itself. It allows you to discover and document elusive, intermittent bugs that are caused by subtle interactions among concurrent components. It enables executing repeatable unit and integration tests of your system. It can help you ensure that your system runs reliably for long periods of time and gets top processor performance.

Quantum Spy implements many advanced features, such as sophisticated data filtering, good data compression, robust transmission protocol, high configurability, generic application-specific tracing, and interfacing with external tools such as MATLAB. Overall, the QS target-resident module and the QSPY host application were the most difficult components of the QP event-driven platform to develop and test. I also consider them the most valuable.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 11. Software Tracing for Event-Driven Systems

Create new playlist

Sign In

Sign Up