RH850 Time/Performance Unit


The RH850 Time/Performance Measurement Unit (TMU) consists of 4 Time and 4 Performance counter units. In multi-core devices each PE core has its own TMU.


These units offer the possibility of time and count measurements from various sources. With this plugin you can:

measure time (by counting debug clock cycles) with the Time unit

count events (executed instructions, interrupts, flash accesses...) with the Performance unit

do measurements between two events (start and stop condition)

save the minimal, maximal or last value encountered between several measurement cycles

accumulate results or reset counter on every start event

set a threshold for a measurement

break the CPU on counter overflow or threshold violation


The Time/Performance Unit plugin offers a convenient way to control this unit.

Using the Time/Performance Unit plugin

To open the plugin, you need to have an RH850 workspace loaded in winIDEA. You can then open it through Plugins / Time Performance Unit.


Please note shared resources:

1. The Renesas Cycle Counter plugin uses the first counter of Time Unit 0 on the first core (PE1). If the Time/Performance Unit plugin configures that counter, the default Renesas Cycle Counter operation is suspended until the CPU is reset or the counter is released by the Time/Performance Unit plugin.

2. RH850 trace and the Time/Performance Unit plugin use the same event-triggering resource. Therefore, trace and the TMU should not be used at the same time.


The window displays counter and saved values in HEX and in decimal form (HEX / decimal). The currently enabled units are indicated by bright green color.

The Threshold and Overflow columns display the status of Threshold Violation (TVF) and counter Overflow (OVF) flags.


The Time/Performance unit is controlled with these buttons:

Edit TMU configuration. Opens the TMU Configuration options window.

Refresh the view

Reset counters


RH850 Time/Performance unit plugin window


TMU Configuration options

The Configuration window can be opened with the Options button in the plugin or through the Hardware / SoC Debug Module drop-down menu.


Note that the settings in the plugin are not automatically applied to the CPU. To configure the Time/Performance unit, you need to apply them: click the Apply Current button when you are done with the configuration.

Save / Load plugin settings

The On-Chip debug tab offers a convenient way to save and load the plugin settings. Note that the TMU configuration needs to be applied after emulation is started.

RH850 Time/Performance Unit plugin / Options / On-Chip Debug tab

Save plugin settings

Once you configure the Time/Performance unit for a specific use case, you can save the configuration to speed up the configuration process next time you need it. Click save and choose the save destination. By default winIDEA will save the preset with the .xsoc extension in the default templates folder. It is advised not to change the destination directory, because the Load feature only offers presets from this directory.

Load plugin settings

Presets that were saved to the default templates folder can be loaded through the Load command. Don't forget to click the Apply Current button after loading them.

Apply after emulation start

Even though the plugin remains configured, the TMU settings need to be applied to the CPU again after the debug session is started. Use this option to apply the TMU settings automatically in that case.

Advanced Time/Performance unit configuration

Each core features its own TMU. The plugin options display a pair of configuration tabs (TPU, SRC) for each core. Together they provide fine control over the available Time/Performance units.

Time/Performance unit behavior settings

The Time/Performance unit is configured through the core-specific TPU tabs.

RH850 Time/Performance Unit plugin / Options / TPU PE1 tab

Start / stop condition

To enable the unit you need to select the Auto or Advanced mode.

The Auto mode configures the start and stop condition to the start and stop of the CPU, which is sufficient in most cases.

The Advanced mode lets you configure the start and stop condition to different kinds of events generated by the CPU's on-chip debug logic (execution watchpoints, data access watchpoints, sequencer events). If used, these events must be configured manually through the SRC tab.


To configure the unit, click the ... button inside the unit settings.

Advanced Time Unit settings


The start / stop mode selection will only be available if the Advanced mode was selected in the TPU dialog. Make sure to configure the events in the SRC tab once you select the desired events.

Counter and save settings

Save value can be configured to either:

save the maximum, minimum or last counter value encountered across multiple start / stop events or

store the threshold value. In this case select whether the threshold limit represents the maximal or minimal acceptable value.
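
The save behaviours listed above can be modelled in a few lines of Python. This is only a behavioural sketch, not the plugin's or the chip's implementation: the names SaveMode, saved_value and violates_threshold are hypothetical, the counter values are made up, and the strict comparison against the threshold is an assumption.

```python
from enum import Enum


class SaveMode(Enum):
    """Hypothetical names for the three save behaviours described above."""
    MAX = "max"
    MIN = "min"
    LAST = "last"


def saved_value(cycle_results, mode):
    """Return what the save register would hold after the given cycles.

    cycle_results: counter values, one per completed start/stop cycle.
    """
    if not cycle_results:
        return None  # no measurement cycle has completed yet
    if mode is SaveMode.MAX:
        return max(cycle_results)
    if mode is SaveMode.MIN:
        return min(cycle_results)
    return cycle_results[-1]  # SaveMode.LAST


def violates_threshold(counter_value, threshold, limit_is_max):
    """Check one completed measurement against the threshold limit.

    Assumption: the comparison is strict, so a value equal to the
    limit is still treated as acceptable.
    """
    if limit_is_max:
        return counter_value > threshold
    return counter_value < threshold


results = [1200, 950, 20350, 1100]  # made-up counter values
for mode in SaveMode:
    print(mode.name, saved_value(results, mode))
print(violates_threshold(20350, 20000, limit_is_max=True))
```

With a maximal limit, only the third made-up measurement would count as a violation; with a minimal limit of the same value, the other three would.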


Specify the events you wish to count:

Time units can count debug clock ticks (which can be used for measuring time) and stop events (used for counting events).

Note: Debug clock depends on the settings in winIDEA.

If you are using the LPD debug port, the debug clock is the one specified in the Hardware / CPU Setup / SoC / LPD clock setting. The higher the LPD clock, the more accurate the measurement will be. Note, however, that the LPD clock cannot be set to an arbitrary value. If the LPD clock frequency is set too high, the debug connection to the CPU will become unreliable. For best performance we suggest choosing one of the predefined values available from the drop-down menu.

When the JTAG port is used, an 8 MHz clock is used for iC5000 / iC5500 / iC6000 and a 2 MHz clock for iC3000.

Note that the debug clock might not be exactly the same as specified, because the CPU limits the available debug clock frequencies. To obtain the actual debug clock frequency, use a qualified frequency counter and connect it to the TCK and GND pins on the RH850 debug connector. Refer to the hardware reference to learn about the adapter pinout.
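
Because the Time unit counts debug clock ticks, the debug clock frequency determines the resolution of a time measurement. The sketch below shows the arithmetic for the clock values mentioned above; the function names are hypothetical, and the frequency passed in should be the measured one whenever it differs from the configured value.

```python
def tick_period_ns(debug_clock_hz):
    """Duration of one debug clock tick in nanoseconds."""
    if debug_clock_hz <= 0:
        raise ValueError("debug clock frequency must be positive")
    return 1_000_000_000 / debug_clock_hz


def ticks_to_seconds(ticks, debug_clock_hz):
    """Convert a Time unit counter value (in ticks) to seconds."""
    if debug_clock_hz <= 0:
        raise ValueError("debug clock frequency must be positive")
    return ticks / debug_clock_hz


# Clock values taken from the text; the measured frequency may differ.
for label, hz in [("LPD 20 MHz", 20_000_000),
                  ("JTAG 8 MHz", 8_000_000),
                  ("JTAG 2 MHz", 2_000_000)]:
    print(f"{label}: one tick = {tick_period_ns(hz)} ns")
```

A lower debug clock gives a coarser resolution: at 2 MHz one tick covers ten times as much time as at 20 MHz.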


Performance units can count various SoC events, which are device-specific. The most common events are listed below. Please contact the chip vendor for an explanation of individual performance items.

SoC event types and their descriptions:

Opcodes: The number of instructions executed
Branches: The number of branch instructions executed
IrqAckEI: The number of acknowledged EI level interrupt requests
IrqAckFE: The number of acknowledged FE level interrupt requests
AsyncExceptions: The number of acknowledged asynchronous exceptions
SyncExceptions: The number of acknowledged synchronous exceptions
AllCycles: The CPU clock cycle number (note that if the CPU is in low power mode, this count will not increase)
NoIrqCycles: The CPU clock cycles of no interrupt handling (note that if the CPU is in low power mode, this count will not increase)
IrqDisabledCycles: The CPU clock cycles of interrupt disabled by DI/EI (note that if the CPU is in low power mode, this count will not increase)
FetchRequests: The number of instruction fetch requests
FetchFromFlash: The number of instruction fetch requests to FLASH
FetchFromVCIBus: The number of instruction fetch requests to VCI bus
AccessDataCache: The number of data fetch requests to Flash ROM
DCacheHit: The number of non-wait responses to the above requests in the data sub-cache
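
Some of these counts become more informative when combined. As an illustration only, the sketch below derives an interrupt load from AllCycles and NoIrqCycles and a hit ratio from AccessDataCache and DCacheHit. It assumes that both counters of a pair were started and stopped on the same condition, and that the event semantics match the short descriptions above. The function names and the numbers are made up.

```python
def fraction(part, whole):
    """Return part / whole after checking that the pair is plausible."""
    if whole <= 0:
        raise ValueError("the reference count must be positive")
    if not 0 <= part <= whole:
        raise ValueError("the partial count must lie between 0 and the reference count")
    return part / whole


def interrupt_load(all_cycles, no_irq_cycles):
    """Share of CPU clock cycles spent handling interrupts."""
    return 1.0 - fraction(no_irq_cycles, all_cycles)


def data_cache_hit_ratio(access_data_cache, dcache_hit):
    """Share of data fetch requests answered without wait states."""
    return fraction(dcache_hit, access_data_cache)


# Made-up counter readings taken over the same start/stop interval.
print(f"interrupt load: {interrupt_load(1_000_000, 900_000):.1%}")
print(f"data cache hit ratio: {data_cache_hit_ratio(4_000, 3_000):.1%}")
```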

Count accumulation

Note that the counter can either restart from 0 each time the start condition is fulfilled, or accumulate its value across successive measurement cycles.
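
The difference between the two behaviours can be shown with a small model. It is a sketch with a hypothetical function name and made-up per-cycle counts, not a description of the hardware registers.

```python
def counter_after_each_cycle(cycle_counts, accumulate):
    """Return the counter value at the end of every start/stop cycle.

    cycle_counts: events (or ticks) counted in each individual cycle.
    accumulate:   True  -> the counter keeps its value across cycles,
                  False -> the counter restarts from 0 on every start event.
    """
    counter = 0
    readings = []
    for count in cycle_counts:
        if not accumulate:
            counter = 0  # reset on the start event
        counter += count
        readings.append(counter)
    return readings


cycles = [100, 250, 75]  # made-up counts for three measurement cycles
print(counter_after_each_cycle(cycles, accumulate=False))
print(counter_after_each_cycle(cycles, accumulate=True))
```

Accumulation gives a running total over all cycles, while resetting gives the result of each cycle on its own.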

CPU behavior manipulation

The unit configuration dialog offers additional settings to control the CPU:

Break CPU on overflow

The CPU will break when the counter overflows. Use this to eliminate false measurements.

Note that the CPU will only break at the point when the flag is set. If the flag has been previously set, then the CPU will not break. Reset the counters to clear the flags before you start with the measurements.

The CPU will not break on the same instruction at which the flag was set. The break will occur a couple of CPU cycles later.

Break CPU on threshold violation

The CPU will break when the threshold violation occurs. Counter values are compared to the threshold limit after each measurement ends (when stop condition is fulfilled).

Note that the CPU will only break at the point when the flag is set. If the flag has been previously set, then the CPU will not break. Reset the counters to clear the flags before you start with the measurements.

The CPU will not break right at the instruction at which the flag was set. The break will occur a couple of CPU cycles later.
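
The notes on both break options describe the same flag behaviour: a break is requested only when the flag changes from clear to set, and a flag that is already set causes no further break until the counters are reset. A minimal model of this, with a hypothetical class name and ignoring the delay of a couple of CPU cycles, could look like this:

```python
class StickyFlag:
    """Model of an OVF or TVF flag that stays set until it is reset."""

    def __init__(self):
        self.is_set = False

    def condition_occurred(self):
        """Record an overflow or threshold violation.

        Returns True only if the flag was clear before, which is the
        only case in which a break would be requested.
        """
        newly_set = not self.is_set
        self.is_set = True
        return newly_set

    def reset(self):
        """Clear the flag, as resetting the counters does."""
        self.is_set = False


flag = StickyFlag()
print(flag.condition_occurred())  # first violation: break requested
print(flag.condition_occurred())  # flag already set: no break
flag.reset()
print(flag.condition_occurred())  # after reset: break requested again
```

This is why the counters should be reset before a measurement starts: a flag left over from an earlier run would hide the next violation.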


CPU event settings

The CPU events are configured through the core-specific SRC tabs:

enable the on-chip logic with the Enabled checkbox.

configure the comparators that you need for the plugin. Make sure to enable Trigger / Watchpoint generation for the required comparators. When Watchpoint generation is enabled, this is indicated by the TRIG status in the SRC tab.


RH850 Time/Performance Unit plugin / Options / SRC PE1 tab


The SRC tab is similar to the one used for manual trace configuration, because it uses the same CPU module. Do not change settings that relate to trace. Note that this plugin should not be used together with trace, as both configurations cannot be applied simultaneously. Please refer to the device reference manual for more details on the available on-chip logic.

Plugin usage example

Make sure to apply the counter settings after the CPU reset (or configure this in the plugin options). Reset the flags and counters before you begin with measurements.


If you pay attention to the settings in the previous screenshots you will see that the plugin is configured to provide the following measurements:

Time unit 0
Counts debug clock cycles between CPU start and stop. Considering we are using the LPD port with a debug clock frequency of 20 MHz, one debug clock cycle lasts 50 ns. To get the total execution time between start and stop, multiply the decimal counter value by 50 ns.

Time unit 1
Stops the CPU if the number of debug clocks counted between the entry into dummyFunction and its exit is higher than 20000 (0x4E20). Considering we are using the LPD port with a debug clock frequency of 20 MHz, one debug clock cycle lasts 50 ns. This means that the CPU will effectively stop if the function execution takes longer than 1 ms (20000 * 50 ns). The image below indicates that dummyFunction took over 10 ms to execute, and therefore the CPU stopped soon after exiting it.

Performance unit 0
Counts the CPU clock cycles when the IRQ is not active.

Performance unit 1
Counts all CPU clock cycles.

RH850 Time/Performance unit plugin window
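
The arithmetic behind the Time unit 0 and Time unit 1 settings in this example can be checked with a short script. It assumes the 20 MHz LPD debug clock stated above. The helper names are hypothetical, and the counter reading of 215000 is a made-up value chosen only to represent a run of more than 10 ms.

```python
DEBUG_CLOCK_HZ = 20_000_000  # LPD debug clock used in this example


def time_limit_to_ticks(limit_us, clock_hz=DEBUG_CLOCK_HZ):
    """Threshold (in debug clock ticks) for a time limit in microseconds."""
    return limit_us * clock_hz // 1_000_000


def ticks_to_ms(ticks, clock_hz=DEBUG_CLOCK_HZ):
    """Convert a Time unit counter value to milliseconds."""
    return ticks * 1000 / clock_hz


def format_counter(value):
    """Format a value the way the plugin window shows it (HEX / decimal)."""
    return f"0x{value:X} / {value}"


threshold = time_limit_to_ticks(1000)   # 1 ms limit for dummyFunction
print(format_counter(threshold))        # threshold to enter in the plugin

reading = 215_000                       # made-up Time unit 1 counter value
print(f"{ticks_to_ms(reading)} ms, violation: {reading > threshold}")
```

Working backwards from the time limit to the tick count avoids conversion mistakes when the debug clock frequency changes.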