What is Trace?

Trace can take several forms and generally falls into two categories: instrumented and non-instrumented.

Instrumented Trace

The application itself generates data when an event occurs (e.g. an RTOS context switch) or at periodic intervals. This data is then transmitted and reconstructed externally to generate a timeline of system activity. Examples of instrumented trace include Segger SystemView and Tracealyzer from Percepio.
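
As a rough illustration of where the CPU cost of instrumented trace comes from, the sketch below logs timestamped events into a small ring buffer that a low-priority task drains for transmission. All names here (trace_log, trace_drain, the event IDs and the stand-in timestamp counter) are hypothetical and are not taken from SystemView or Tracealyzer; a real target would read a free-running hardware timer instead.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical event IDs; a real trace tool defines its own. */
enum trace_event_id {
    TRACE_EVT_CONTEXT_SWITCH = 1,
    TRACE_EVT_ISR_ENTER      = 2,
    TRACE_EVT_ISR_EXIT       = 3
};

struct trace_event {
    uint32_t timestamp;  /* tick count when the event was logged */
    uint16_t id;         /* one of trace_event_id */
    uint16_t param;      /* event-specific value, e.g. a task number */
};

/* Power of two so the index stays consistent if the counters wrap. */
#define TRACE_BUF_LEN 64u

static struct trace_event trace_buf[TRACE_BUF_LEN];
static volatile uint32_t trace_head;  /* total events written */
static uint32_t trace_tail;           /* total events read out */

/* Stand-in time source (assumption): a simple incrementing counter. */
static uint32_t fake_ticks;
static uint32_t trace_timestamp(void) { return fake_ticks++; }

/* Called by the application at each event of interest. Every call
 * costs CPU cycles, which is why instrumentation affects timing.
 * Returns 1 on success, 0 if the buffer is full and the event is dropped. */
int trace_log(uint16_t id, uint16_t param)
{
    if (trace_head - trace_tail >= TRACE_BUF_LEN) {
        return 0;
    }
    struct trace_event *e = &trace_buf[trace_head % TRACE_BUF_LEN];
    e->timestamp = trace_timestamp();
    e->id = id;
    e->param = param;
    trace_head++;
    return 1;
}

/* Copies up to 'max' pending events into 'out' for transmission to
 * the host. Returns the number of events copied. */
size_t trace_drain(struct trace_event *out, size_t max)
{
    size_t n = 0;
    while (n < max && trace_tail != trace_head) {
        out[n++] = trace_buf[trace_tail % TRACE_BUF_LEN];
        trace_tail++;
    }
    return n;
}
```

A call such as trace_log(TRACE_EVT_CONTEXT_SWITCH, 3) would sit in the RTOS context switch hook. The sketch ignores locking between interrupt and task context, which a real implementation would need.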

Advantages              Disadvantages
Low bandwidth           Uses CPU resources
No hardware required    Will affect application timing
                        Timing accuracy may be affected by system events
                        Low resolution

Non-Instrumented Trace

The target processor contains internal logic to generate a data stream that represents executed instructions. The Embedded Trace Macrocell (ETM), an optional part of the ARM Cortex-M architecture, is an example of this and is the one the QTrace probe is designed for. Some trace schemes can also include user-specified variables or CPU registers in the data stream, though this is not implemented in the Cortex-M3/M4 ETM. Trace data is transmitted on multi-purpose processor GPIO pins to be decoded externally.

Advantages                      Disadvantages
No CPU cycles required          Requires an external hardware decoder
Does not affect system timing   High bandwidth, generates many MB/s of trace data
Highest resolution possible     Requires up to 5 I/O pins and a larger debug connector
Real-time tracing               Routing of trace signals is critical
                                Expensive (QTrace is an exception)

Non-instrumented trace, generally referred to simply as ‘trace’, has two forms: buffered and continuously streamed.

Buffered Trace

A large external RAM buffer is filled with trace data after a previously configured trigger event occurs. The trace data must be uploaded to a PC, decoded and then analysed to pick out the event of interest.
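
The trigger-then-fill behaviour can be pictured with a minimal sketch like the one below. It is an illustration only, under the assumption that the trigger condition is evaluated elsewhere and passed in as a flag; the structure and function names are hypothetical, and the buffer is tiny so the behaviour is easy to follow.

```c
#include <stdint.h>
#include <stddef.h>

/* Tiny for illustration; a real capture buffer is far larger. */
#define CAPTURE_LEN 8u

struct capture {
    uint8_t data[CAPTURE_LEN];
    size_t  used;       /* number of bytes stored so far */
    int     triggered;  /* nonzero once the trigger has fired */
};

/* Feeds one trace byte into the capture. 'trigger' is nonzero when
 * the configured trigger condition is met on this byte.
 * Returns 1 while the capture can still accept data, 0 once full. */
int capture_feed(struct capture *c, uint8_t byte, int trigger)
{
    if (!c->triggered) {
        if (!trigger) {
            return 1;  /* still waiting for the trigger: byte discarded */
        }
        c->triggered = 1;
    }
    if (c->used < CAPTURE_LEN) {
        c->data[c->used++] = byte;
    }
    return c->used < CAPTURE_LEN;
}
```

Once capture_feed returns 0 nothing further is recorded, which is the limitation that continuous streaming avoids.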

Streamed Trace

Trace data is continually streamed to an analyser application which decodes and presents the real-time data as live disassembly and source level views. The views can be paused for analysis whilst trace is still being streamed and decoded in the background at full speed. Not being limited by a hardware buffer means streamed trace also allows full code coverage and profiling of an application.

QTrace is implemented as a continuously streamed real-time trace system.

ETM Trace

The Embedded Trace Macrocell (ETM) is an optional part of the ARM Cortex-M architecture and is part of the larger ARM CoreSight debugging ecosystem which includes JTAG/SWD, Embedded Trace Buffer (ETB), Instrumentation Trace Macrocell (ITM), etc. A presentation from ARM outlines these features.

The ETM hardware block outputs its trace data stream on up to five multi-purpose GPIO lines programmed to operate as trace output pins: one clock signal and up to four data lines (QTrace requires all five signals). The trace clock typically runs at half the system clock speed, and the trace data is clocked out on both clock edges, i.e. double data rate (DDR).



The trace data is arranged into 16-byte frames which are clocked into, and decoded by, an external hardware decoder such as the QTrace probe. The bytes contain compressed information such as instruction type (branch or not), branch address, exception information, etc. The trace data is streamed to a PC application, e.g. QTrace Analyser, which translates it into the corresponding disassembly and high-level source for display.

The trace data rate depends on clock speed and the instructions being executed. Branches with target addresses unknown at compile time and interrupt entry/exit typically generate the most trace data. A processor running at 200 MHz can generate trace data at tens of MB/s.
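
The trace port described above sets an upper bound on this rate. The sketch below works out that ceiling, assuming the typical arrangement of a trace clock at half the system clock with data transferred on both clock edges; the function name is hypothetical. The actual rate is normally lower because it depends on the code being executed.

```c
#include <stdint.h>

/* Peak raw trace port throughput in bytes per second.
 * core_hz:   system clock frequency in Hz
 * data_pins: number of trace data lines in use (up to 4)
 * Assumes the trace clock is core_hz / 2 and that data is clocked
 * out on both edges (DDR). */
uint64_t trace_port_peak_bytes_per_sec(uint64_t core_hz, unsigned data_pins)
{
    uint64_t trace_clk_hz      = core_hz / 2u;
    uint64_t transfers_per_sec = trace_clk_hz * 2u;  /* two edges per clock */
    uint64_t bits_per_sec      = transfers_per_sec * data_pins;
    return bits_per_sec / 8u;
}
```

For a 200 MHz core with four data lines this gives 100,000,000 bytes per second, i.e. a ceiling of 100 MB/s, which leaves headroom over the tens of MB/s such a processor typically generates.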

ETM isn’t usually an option for devices with low pin counts, e.g. 48 pins or fewer, even though the core may implement it. However, there is usually a higher pin count version of the processor available that does provide access to the ETM signals.

Convince your hardware department to design in a part with ETM and add a trace connector, at least on the rev. A board. It really will pay dividends!

Preconceptions of Trace

A common perception of trace is that it’s a heavyweight tool which is only used for tackling nasty problems or for code coverage certification. Trace is used in these scenarios but it offers so much more.

Another general view of trace is that it’s expensive. It is true that trace tools tend to have a large price tag and are generally only used by large organisations. However, the low-cost QTrace solution makes trace affordable for developers who are on a tight budget.

What can QTrace do?

With unlimited real-time tracing it is possible to obtain lots of information about your application that’s just not possible with conventional debugging methods. Below is some of the functionality offered by QTrace:

● Determine what percentage of functions, and the paths within them, have been executed
● Identify which functions are being called most frequently and which are taking the most CPU cycles
● Calculate the rate at which an area of code is being executed without needing to toggle an I/O pin
● See which functions and condition branches have been executed without stopping the CPU
  (perfect for debugging motion control, communication protocols, PID controllers, etc.)
● See which interrupts are occurring and how frequently
● Show a call stack without having to stop the CPU
● Review tens of milliseconds of execution (C source and disassembly) prior to a CPU exception
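
To give a feel for how coverage and profiling figures like those above can be derived from a stream of executed instruction addresses, here is a minimal sketch. It is not QTrace's implementation: the func_info table (which would normally come from the application's symbol information) and both function names are hypothetical, and instruction counts are used only as a rough stand-in for CPU cycles.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical per-function record built from symbol information. */
struct func_info {
    const char *name;
    uint32_t    start;        /* first address of the function */
    uint32_t    end;          /* one past the last address */
    uint64_t    instr_count;  /* instructions seen inside the function */
    uint64_t    entry_count;  /* times execution reached 'start' */
};

/* Accounts for one executed instruction address from the decoded trace.
 * Counting arrivals at 'start' only approximates the number of calls. */
void profile_add(struct func_info *funcs, size_t nfuncs, uint32_t pc)
{
    for (size_t i = 0; i < nfuncs; i++) {
        if (pc >= funcs[i].start && pc < funcs[i].end) {
            funcs[i].instr_count++;
            if (pc == funcs[i].start) {
                funcs[i].entry_count++;
            }
            return;
        }
    }
}

/* Percentage (rounded down) of functions executed at least once. */
unsigned coverage_percent(const struct func_info *funcs, size_t nfuncs)
{
    size_t hit = 0;
    for (size_t i = 0; i < nfuncs; i++) {
        if (funcs[i].instr_count > 0) {
            hit++;
        }
    }
    return nfuncs ? (unsigned)(hit * 100u / nfuncs) : 0u;
}
```

Because the input is the trace stream rather than instrumentation in the application, none of this accounting costs the target any CPU cycles. A linear search is used for brevity; a sorted table with a binary search would be the natural choice for a real symbol table.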


Seeing what your code is doing in real-time is incredibly powerful. Incorporating trace into your everyday debugging process really does reduce development time.
