What is Trace?

See inside your code

Real-time trace reveals precisely how, when and where your code is executed.

There are two categories of trace: instrumented and non-instrumented (aka Real-Time Trace).

Instrumented Trace

The application generates trace data when an event occurs, e.g. an RTOS context switch or a mutex being locked, or at periodic intervals. This data is then transmitted via a standard interface, e.g. SWO, USB, etc., and reconstructed externally to generate a timeline of system activity. Examples of instrumented trace include Segger SystemView and Tracealyzer from Percepio.

| Advantages | Disadvantages |
| --- | --- |
| Low bandwidth | Low resolution |
| No trace probe required | Uses CPU resources |
| Low or zero cost | Affects application timing |
| | Timing accuracy may be affected by system events |
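
To make the idea concrete, here is a minimal Python sketch of instrumented trace: the "target" side records timestamped events and the "host" side rebuilds a timeline from them. The event IDs, the `TraceRecorder` class and the `build_timeline` helper are all hypothetical names chosen for illustration; they are not the API of SystemView, Tracealyzer or any other tool.

```python
import time
from collections import namedtuple

# Hypothetical event record: a timestamp, an event ID and one payload value.
TraceEvent = namedtuple("TraceEvent", "timestamp_ns event_id payload")

# Illustrative event IDs, not taken from any real tool.
EVENT_CONTEXT_SWITCH = 1
EVENT_MUTEX_LOCK = 2

EVENT_NAMES = {
    EVENT_CONTEXT_SWITCH: "context switch to task",
    EVENT_MUTEX_LOCK: "mutex locked",
}


class TraceRecorder:
    """'Target' side: every record() call costs CPU time in the application."""

    def __init__(self, clock=time.monotonic_ns):
        self._clock = clock
        self.events = []

    def record(self, event_id, payload=0):
        self.events.append(TraceEvent(self._clock(), event_id, payload))


def build_timeline(events):
    """'Host' side: convert raw events to (offset from first event, text)."""
    if not events:
        return []
    start = events[0].timestamp_ns
    return [
        (e.timestamp_ns - start,
         f"{EVENT_NAMES.get(e.event_id, 'unknown event')} {e.payload}")
        for e in events
    ]


if __name__ == "__main__":
    recorder = TraceRecorder()
    recorder.record(EVENT_CONTEXT_SWITCH, payload=3)
    recorder.record(EVENT_MUTEX_LOCK, payload=7)
    for offset_ns, text in build_timeline(recorder.events):
        print(f"+{offset_ns} ns: {text}")
```

The sketch also shows where the disadvantages come from: the recording call runs on the application's CPU, and only the events the developer chose to instrument appear in the timeline.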

Non-Instrumented (Real-Time) Trace

The target processor contains internal logic that generates a data stream representing executed instructions. It requires no intervention from the CPU and has no effect on application timing. The Embedded Trace Macrocell (ETM), an optional part of the ARM Cortex-M architecture, is an example of this, and it is what the QTrace probe is designed for. Some trace schemes can also include user-specified variables or CPU registers in the data stream, though this is typically not implemented in Cortex-M cores. Trace data is transmitted at high speed on multi-function processor GPIO pins and decoded by an external probe. This data is then used to generate detailed and extremely insightful views of CPU execution.

| Advantages | Disadvantages |
| --- | --- |
| No CPU cycles required | Requires an external hardware decoder |
| Highest resolution possible | High bandwidth, can generate tens of MB/s of trace data |
| Does not affect system timing | Requires 5 I/O pins and a 20-pin debug connector |
| Tracing happens in real-time | Routing of trace signals is important |
| Every instruction is traced | Expensive (QTrace is an exception!) |

Non-instrumented trace, generally referred to as ‘trace’, has two forms: buffered and continuously streamed.

Buffered Trace

A large external RAM buffer is filled with trace data after a previously configured trigger event occurs. The trace data must be uploaded to a PC, decoded and then analysed to pick out the event of interest.

Streamed Trace

Trace data is continually streamed to an analyser application which decodes and presents the real-time data as live disassembly and source level views. The views can be paused for analysis whilst trace is still being streamed and decoded in the background at full speed. Not being limited by a hardware buffer means streamed trace also allows full code coverage and profiling of an application.
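
The buffer limitation can be estimated with simple arithmetic. The sketch below assumes a purely hypothetical 100 MB capture buffer and a trace rate of 20 MB/s (a value chosen from the "tens of MB/s" range quoted on this page); neither figure describes a specific product.

```python
def buffer_fill_seconds(buffer_bytes, trace_rate_bytes_per_s):
    """Seconds of execution a capture buffer can hold at a steady trace rate."""
    if trace_rate_bytes_per_s <= 0:
        raise ValueError("trace rate must be positive")
    return buffer_bytes / trace_rate_bytes_per_s


if __name__ == "__main__":
    # Both values are assumptions for illustration only.
    seconds = buffer_fill_seconds(100_000_000, 20_000_000)
    print(f"Buffer holds about {seconds:.1f} s of trace")
```

Under these assumptions a buffered system captures only a few seconds of execution, whereas a streamed system has no such window.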

👉 QTrace is a continuously streamed real-time trace system with no hardware buffer limitations.

ETM Trace

The Embedded Trace Macrocell (ETM) is an optional part of the ARM Cortex-M architecture and is part of the larger ARM CoreSight debugging ecosystem which includes JTAG/SWD, Embedded Trace Buffer (ETB), Instrumentation Trace Macrocell (ITM), etc. A presentation from ARM outlines these features.

The ETM hardware block generates a trace data stream using up to five multi-purpose GPIO lines programmed to operate as trace output pins: a clock signal and up to 4 data lines (QTrace requires all 4). The trace data is transmitted on both edges of the trace clock, i.e. double data rate (DDR), and the trace clock typically runs at half the CPU clock speed. For devices faster than 200 MHz, e.g. STM32H7xx, the trace clock is user-programmed to a sub-multiple of the CPU clock to maintain signal integrity.
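
The figures in the paragraph above imply an upper bound on what the trace port can carry. The following sketch works through that arithmetic; the function name and the 200 MHz example clock are assumptions for illustration, and the result is the raw port capacity, not the amount of trace data a given application actually produces.

```python
def trace_port_bytes_per_second(trace_clock_hz, data_lines=4, ddr=True):
    """Peak raw capacity of a parallel trace port, in bytes per second."""
    edges_per_cycle = 2 if ddr else 1  # DDR transfers data on both clock edges
    bits_per_second = trace_clock_hz * edges_per_cycle * data_lines
    return bits_per_second / 8


if __name__ == "__main__":
    cpu_clock_hz = 200_000_000           # assumed example CPU clock
    trace_clock_hz = cpu_clock_hz // 2   # trace clock at half the CPU clock
    peak = trace_port_bytes_per_second(trace_clock_hz)
    print(f"Peak port capacity: {peak / 1e6:.0f} MB/s")
```

With 4 data lines and DDR, one byte is transferred per trace clock cycle, so the peak capacity in bytes per second equals the trace clock frequency in hertz.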

The trace data is arranged into 16-byte frames which are clocked into, and decoded by, an external hardware decoder such as the QTrace probe. The bytes contain compressed information such as instruction type (branch or not), branch address, exception information, etc. The trace data is streamed to a PC application, e.g. QTrace Analyser, which translates it into the corresponding disassembly and high-level source for display.
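
As a rough illustration of the framing step only, the sketch below cuts a captured byte stream into 16-byte frames and keeps any trailing partial frame for the next batch. It assumes the stream is already aligned to a frame boundary and does not attempt to decode the frame contents; a real decoder also has to find synchronisation and unpack the compressed trace. The function name is hypothetical.

```python
FRAME_SIZE = 16  # bytes per trace frame, as described above


def split_into_frames(stream: bytes):
    """Return (list of complete 16-byte frames, leftover bytes)."""
    complete_len = len(stream) - (len(stream) % FRAME_SIZE)
    frames = [stream[i:i + FRAME_SIZE]
              for i in range(0, complete_len, FRAME_SIZE)]
    return frames, stream[complete_len:]


if __name__ == "__main__":
    captured = bytes(range(40))  # stand-in data, not real trace output
    frames, leftover = split_into_frames(captured)
    print(len(frames), "complete frames,", len(leftover), "bytes left over")
```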

The trace data rate depends on clock speed and the instructions being executed. Branches with target addresses unknown at compile time and interrupt entry/exit typically generate the most trace data. A processor running at 200 MHz can generate a trace data rate of tens of MB/s.

ETM isn’t usually an option for devices with low pin counts, e.g. 48 pins or fewer, even though the core may implement it. However, there is usually a higher pin count version of the processor available that does provide access to the trace signals. These are brought out to a trace header, which is a version of the standard 10-way, 1.27 mm pitch JTAG/SWD connector extended to 20 pins to accommodate the trace signals, as shown below:

Figure: 20-way, 1.27 mm pitch trace header

👉 Get your hardware department to use a processor with ETM and add a trace connector, at least on the rev.A board. It really will pay dividends.

Preconceptions of Trace

A common perception of trace is that it’s a heavyweight tool which is only used for tackling nasty problems or for code coverage certification. Trace is used in these scenarios but it offers so much more. There are many additional features that make trace an invaluable tool when developing code.

Another general view of trace is that it’s expensive. It’s definitely true that trace tools have a large price tag and are generally only used by large organisations with deep pockets. However, the low cost QTrace solution now makes real-time trace affordable for developers who are on a tight budget.

What can QTrace do?

With unlimited real-time tracing it is possible to obtain lots of information about your application that just can’t be obtained with conventional debugging methods. Below is some of the functionality offered by QTrace:

● Determine the percentage of functions, and paths within them, that have been executed
● Identify which functions are being called most frequently and which are taking the most CPU cycles
● Calculate the rate at which an area of code is being executed without needing to toggle an I/O pin
● See which functions and condition branches have been executed without stopping the CPU (perfect for debugging motion control, communication protocols, PID controllers, etc.)
● See which interrupts are occurring and how frequently
● Show a call stack without having to stop the CPU
● Review tens of milliseconds of execution prior to a CPU exception (C/C++ source and disassembly views)
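
To show how the first two items in the list above could be derived in principle, the sketch below takes a list of executed instruction addresses (as a decoder might produce) and computes per-function hit counts and function-level coverage. The symbol table, addresses and function names are invented for illustration, and this is not a description of how QTrace Analyser is implemented.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical symbol table: (start address, function name), sorted by address.
SYMBOLS = [
    (0x0800_0100, "main"),
    (0x0800_0200, "pid_update"),
    (0x0800_0300, "uart_isr"),
    (0x0800_0400, "unused_helper"),
]
IMAGE_END = 0x0800_0500  # first address past the last function (assumed)
_STARTS = [start for start, _ in SYMBOLS]


def function_at(address):
    """Map an instruction address to its function name, or None if outside."""
    if not _STARTS[0] <= address < IMAGE_END:
        return None
    return SYMBOLS[bisect_right(_STARTS, address) - 1][1]


def profile(executed_addresses):
    """Return (instructions executed per function, % of functions executed)."""
    hits = Counter()
    for address in executed_addresses:
        name = function_at(address)
        if name is not None:
            hits[name] += 1
    coverage = 100.0 * len(hits) / len(SYMBOLS)
    return hits, coverage


if __name__ == "__main__":
    trace = [0x0800_0100, 0x0800_0104, 0x0800_0200,
             0x0800_0204, 0x0800_0208, 0x0800_0300]
    hits, coverage = profile(trace)
    print(hits.most_common(), f"{coverage:.0f}% of functions executed")
```

Counting executed instructions per function is only a proxy for CPU cycles, but it illustrates why a complete, unbuffered instruction stream makes coverage and profiling possible without stopping the CPU.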

💡 Seeing your code execute in real-time is incredibly powerful, and incorporating trace into your everyday debugging really will reduce development time.

For more details about QTrace, go to the home page or browse the menus at the top of the page.