AMD_F17H_ZEN2_EVENTS(3CPC) CPU Performance Counters Library Functions

NAME


amd_f17h_zen2_events - AMD Family 17h Zen2 processor performance monitoring
events

DESCRIPTION


This manual page describes events specific to AMD Family 17h Zen2
processors. For more information, please consult the appropriate AMD BIOS
and Kernel Developer's guide or Open-Source Register Reference.

Each of the events listed below includes the AMD mnemonic which matches the
name found in the AMD manual and a brief summary of the event. If
available, a more detailed description of the event follows and then any
additional unit values that modify the event. Each unit can be combined to
create a new event in the system by placing the '.' character between the
event name and the unit name.
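
As an illustration of the naming rule above, the following minimal sketch joins an event name and a unit name with the '.' character. The helper name event_spec is hypothetical and is not part of libcpc; it only builds the string and does not program any counter.

```python
def event_spec(event, unit=None):
    """Join an event name and an optional unit name with '.'.

    Hypothetical helper for illustration only; it is not a libcpc call.
    """
    if not event:
        raise ValueError("event name must not be empty")
    if unit is None:
        # An event may also be named without any unit.
        return event
    if not unit:
        raise ValueError("unit name must not be empty")
    return f"{event}.{unit}"


print(event_spec("FpDispFaults", "YmmSpillFault"))
print(event_spec("LsSTLF"))
```

The first call combines the FpDispFaults event with its YmmSpillFault unit; the second names an event that has no units.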

The following events are supported:

FpRetSseAvxOps
Core::X86::Pmc::Core::FpRetSseAvxOps - Retired SSE/AVX FLOPs

This is a retire-based event that counts retired SSE/AVX FLOPs. The
number of events logged per cycle can vary from 0 to 64. This event
is a MergeEvent since it can count above 15.

This event has the following units which may be used to modify the
behavior of the event:

MacFLOPs
MacFLOPs count as 2 FLOPs. Does not provide a useful count
without use of the MergeEvent feature.

DivFLOPs
Divide/square root FLOPs. Does not provide a useful count
without use of the MergeEvent feature.

MultFLOPs
Multiply FLOPs. Does not provide a useful count without use
of the MergeEvent feature.

AddSubFLOPs
Add/subtract FLOPs. Does not provide a useful count without
use of the MergeEvent feature.

FpRetiredSerOps
Core::X86::Pmc::Core::FpRetiredSerOps - Retired Serializing Ops

The number of serializing Ops retired.

This event has the following units which may be used to modify the
behavior of the event:

SseBotRet
SSE bottom-executing uOps retired.

SseCtrlRet
SSE control word mispredict traps due to mispredictions in
RC, FTZ or DAZ, or changes in mask bits.

X87BotRet
x87 bottom-executing uOps retired.

X87CtrlRet
x87 control word mispredict traps due to mispredictions in
RC or PC, or changes in mask bits.

FpDispFaults
Core::X86::Pmc::Core::FpDispFaults - FP Dispatch Faults

Floating Point Dispatch Faults.

This event has the following units which may be used to modify the
behavior of the event:

YmmSpillFault
YMM Spill fault.

YmmFillFault
YMM Fill fault.

XmmFillFault
XMM Fill fault.

x87FillFault
x87 Fill fault.

LsBadStatus2
Core::X86::Pmc::Core::LsBadStatus2 - Bad Status 2

Store To Load Interlock (STLI) events are loads that were unable to
complete because of a possible match with an older store, where the
older store could not do STLF for some reason. There are a number
of reasons why this occurs, and this perfmon event organizes them
into three major groups.

This event has the following units which may be used to modify the
behavior of the event:

StliOther
Non-forwardable conflict; used to reduce STLIs via
software. All reasons. The most common among these is that
there is only a partial overlap between the store and the
load, for example there's an 8B store to address A and a
16B load starting at address A. STLF can't be performed in
this case because only some of the load's data is coming
from the store, so the load gets StliOther. Another
StliOther case is if the load hits a non-cacheable store
that's sitting in the non-cacheable buffers (WCBs).

LsLocks
Core::X86::Pmc::Core::LsLocks - Retired Lock Instructions

LsRetClClush
Core::X86::Pmc::Core::LsRetClClush - Retired CLFLUSH Instructions

The number of retired CLFLUSH instructions. This is a non-
speculative event.

LsRetCpuid
Core::X86::Pmc::Core::LsRetCpuid - Retired CPUID Instructions

The number of CPUID instructions retired.

LsDispatch
Core::X86::Pmc::Core::LsDispatch - LS Dispatch

Counts the number of operations dispatched to the LS unit.

LsSmiRx
Core::X86::Pmc::Core::LsSmiRx - SMIs Received

Counts the number of SMIs received.

LsIntTaken
Core::X86::Pmc::Core::LsIntTaken - Interrupts Taken

Counts the number of interrupts taken.

LsRdTsc
Core::X86::Pmc::Core::LsRdTsc - Time Stamp Counter Reads

Counts the number of reads of the TSC (RDTSC instructions). The
count is speculative.

LsSTLF Core::X86::Pmc::Core::LsSTLF - Store to Load Forward

Number of STLF hits.

LsStCommitCancel2
Core::X86::Pmc::Core::LsStCommitCancel2 - Store Commit Cancels 2

This event has the following units which may be used to modify the
behavior of the event:

StCommitCancelWcbFull
A non-cacheable store and the non-cacheable commit buffer
is full.

LsDcAccesses
Core::X86::Pmc::Core::LsDcAccesses - Data Cache Accesses

The number of accesses to the data cache for load and store
references. This may include certain microcode scratchpad accesses,
although these are generally rare. Each increment represents an
eight-byte access, although the instruction may only be accessing a
portion of that. This event is a speculative event.

LsMabAlloc
Core::X86::Pmc::Core::LsMabAlloc - DC Miss By Type

This event has the following units which may be used to modify the
behavior of the event:

DcPrefetcher

Stores

Loads

LsRefillsFromSys
Core::X86::Pmc::Core::LsRefillsFromSys - Data Cache Refills from
System

Demand Data Cache Fills by Data Source.

This event has the following units which may be used to modify the
behavior of the event:

LS_MABRESP_RMT_DRAM
DRAM or IO from different die.

LS_MABRESP_RMT_CACHE
Hit in cache; Remote CCX and the address's Home Node is on
a different die.

LS_MABRESP_LCL_DRAM
DRAM or IO from this thread's die.

LS_MABRESP_LCL_CACHE
Hit in cache; local CCX (not Local L2), or Remote CCX and
the address's Home Node is on this thread's die.

MABRESP_LCL_L2
Local L2 hit.

LsL1DTlbMiss
Core::X86::Pmc::Core::LsL1DTlbMiss - L1 DTLB Miss

This event has the following units which may be used to modify the
behavior of the event:

TlbReload1GL2Miss
DTLB reload to a 1G page that missed in the L2 TLB.

TlbReload2ML2Miss
DTLB reload to a 2M page that missed in the L2 TLB.

TlbReloadCoalescedPageMiss

TlbReload4KL2Miss
DTLB reload to a 4K page that missed in the L2 TLB.

TlbReload1GL2Hit
DTLB reload to a 1G page that hit in the L2 TLB.

TlbReload2ML2Hit
DTLB reload to a 2M page that hit in the L2 TLB.

TlbReloadCoalescedPageHit

TlbReload4KL2Hit
DTLB reload to a 4K page that hit in the L2 TLB.
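
The hit and miss units above pair up by page size, so one possible use is a per-page-size L2 TLB miss ratio. The sketch below assumes the counts have already been collected into a dictionary keyed by unit name; the function name and the sample values are invented for illustration. The two coalesced-page units are left out because this page gives no description for them.

```python
PAGE_SIZES = ("4K", "2M", "1G")


def l2_tlb_miss_ratio_by_page(counts):
    """Return {page size: misses / (hits + misses)} for L1 DTLB reloads.

    counts maps LsL1DTlbMiss unit names to raw counter values.
    A page size with no reloads at all maps to None.
    """
    ratios = {}
    for size in PAGE_SIZES:
        hits = counts.get(f"TlbReload{size}L2Hit", 0)
        misses = counts.get(f"TlbReload{size}L2Miss", 0)
        total = hits + misses
        ratios[size] = misses / total if total else None
    return ratios


# Invented sample values, not measurements.
sample = {
    "TlbReload4KL2Hit": 900,
    "TlbReload4KL2Miss": 100,
    "TlbReload2ML2Hit": 30,
    "TlbReload2ML2Miss": 10,
}
print(l2_tlb_miss_ratio_by_page(sample))
```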

LsMisalAccesses
Core::X86::Pmc::Core::LsMisalAccesses - Misaligned loads

LsPrefInstrDisp
Core::X86::Pmc::Core::LsPrefInstrDisp - Prefetch Instructions
Dispatched

Software Prefetch Instructions Dispatched (Speculative).

This event has the following units which may be used to modify the
behavior of the event:

PrefetchNTA
PrefetchNTA instruction. See AMD64 Architecture
Programmer's Manual Volume 3: Instruction-Set Reference,
order# 24594 PREFETCHlevel.

PrefetchW
PrefetchW instruction. See AMD64 Architecture Programmer's
Manual Volume 3: Instruction-Set Reference, order# 24594
PREFETCHlevel.

Prefetch
PrefetchT0, T1 and T2 instructions. See AMD64 Architecture
Programmer's Manual Volume 3: Instruction-Set Reference,
order# 24594 PREFETCHlevel.

LsInefSwPref
Core::X86::Pmc::Core::LsInefSwPref - Ineffective Software
Prefetches

The number of software prefetches that did not fetch data outside
of the processor core.

This event has the following units which may be used to modify the
behavior of the event:

MabMchCnt
Software PREFETCH instruction saw a match on an already-
allocated miss request buffer.

DataPipeSwPfDcHit
Software PREFETCH instruction saw a DC hit.

LsSwPfDcFills
Core::X86::Pmc::Core::LsSwPfDcFills - Software Prefetch Data Cache
Fills

Software Prefetch Data Cache Fills by Data Source.

This event has the following units which may be used to modify the
behavior of the event:

LS_MABRESP_RMT_DRAM
DRAM or IO from different die.

LS_MABRESP_RMT_CACHE
Hit in cache; Remote CCX and the address's Home Node is on
a different die.

LS_MABRESP_LCL_DRAM
DRAM or IO from this thread's die.

LS_MABRESP_LCL_CACHE
Hit in cache; local CCX (not Local L2), or Remote CCX and
the address's Home Node is on this thread's die.

MABRESP_LCL_L2
Local L2 hit.

LsHwPfDcFills
Core::X86::Pmc::Core::LsHwPfDcFills - Hardware Prefetch Data Cache
Fills

Hardware Prefetch Data Cache Fills by Data Source.

This event has the following units which may be used to modify the
behavior of the event:

LS_MABRESP_RMT_DRAM
DRAM or IO from different die.

LS_MABRESP_RMT_CACHE
Hit in cache; Remote CCX and the address's Home Node is on a
different die.

LS_MABRESP_LCL_DRAM
DRAM or IO from this thread's die.

LS_MABRESP_LCL_CACHE
Hit in cache; local CCX (not Local L2), or Remote CCX and
the address's Home Node is on this thread's die.

MABRESP_LCL_L2
Local L2 hit.

LsNotHaltedCyc
Core::X86::Pmc::Core::LsNotHaltedCyc - Cycles not in Halt

LsTlbFlush
Core::X86::Pmc::Core::LsTlbFlush - All TLB Flushes

IcCacheFillL2
Core::X86::Pmc::Core::IcCacheFillL2 - Instruction Cache Refills
from L2

The number of 64-byte instruction cache lines fulfilled from the L2
cache.

IcCacheFillSys
Core::X86::Pmc::Core::IcCacheFillSys - Instruction Cache Refills
from System

The number of 64-byte instruction cache lines fulfilled from system
memory or another cache.

BpL1TlbMissL2TlbHit
Core::X86::Pmc::Core::BpL1TlbMissL2TlbHit - L1 ITLB Miss, L2 ITLB
Hit

The number of instruction fetches that miss in the L1 ITLB but hit
in the L2 ITLB.

BpL1TlbMissL2TlbMiss
Core::X86::Pmc::Core::BpL1TlbMissL2TlbMiss - L1 ITLB Miss, L2 ITLB
Miss

The number of instruction fetches that miss in both the L1 and L2
TLBs.

This event has the following units which may be used to modify the
behavior of the event:

IF1G Instruction fetches to a 1 GB page.

IF2M Instruction fetches to a 2 MB page.

IF4K Instruction fetches to a 4 KB page.

BpL1BTBCorrect
Core::X86::Pmc::Core::BpL1BTBCorrect - L1 Branch Prediction
Overrides Existing Prediction (speculative)

BpL2BTBCorrect
Core::X86::Pmc::Core::BpL2BTBCorrect - L2 Branch Prediction
Overrides Existing Prediction (speculative)

BpDynIndPred
Core::X86::Pmc::Core::BpDynIndPred - Dynamic Indirect Predictions

Indirect Branch Prediction for potential multi-target branch
(speculative)

BpDeReDirect
Core::X86::Pmc::Core::BpDeReDirect - Decoder Overrides Existing
Branch Prediction (speculative)

BpL1TlbFetchHit
Core::X86::Pmc::Core::BpL1TlbFetchHit - ITLB Instruction Fetch Hits

The number of instruction fetches that hit in the L1 ITLB.

This event has the following units which may be used to modify the
behavior of the event:

IF1G Instruction fetches to a 1 GB page.

IF2M Instruction fetches to a 2 MB page.

IF4K Instruction fetches to a 4 KB page.

DeDisUopQueueEmptyDi0
Core::X86::Pmc::Core::DeDisUopQueueEmptyDi0 - Micro-Op Queue Empty

Cycles where the Micro-Op Queue is empty.

DeDisUopsFromDecoder
Core::X86::Pmc::Core::DeDisUopsFromDecoder - UOps Dispatched From
Decoder

Ops dispatched from either the decoders, OpCache or both.

This event has the following units which may be used to modify the
behavior of the event:

OpCacheDispatched
Count of dispatched Ops from OpCache.

DecoderDispatched
Count of dispatched Ops from Decoder.

DeDisDispatchTokenStalls1
Core::X86::Pmc::Core::DeDisDispatchTokenStalls1 - Dispatch Resource
Stall Cycles 1

Cycles where a dispatch group is valid but does not get dispatched
due to a Token Stall.

This event has the following units which may be used to modify the
behavior of the event:

FPMiscRsrcStall
FP Miscellaneous resource unavailable. Applies to the
recovery of mispredicts with FP ops.

FPSchRsrcStall
FP scheduler resource stall. Applies to ops that use the FP
scheduler.

FpRegFileRsrcStall
Floating point register file resource stall. Applies to all
FP ops that have a destination register.

TakenBrnchBufferRsrc
Taken branch buffer resource stall.

IntSchedulerMiscRsrcStall
Integer Scheduler miscellaneous resource stall.

StoreQueueRsrcStall
Store Queue resource stall. Applies to all ops with store
semantics.

LoadQueueRsrcStall
Load Queue resource stall. Applies to all ops with load
semantics.

IntPhyRegFileRsrcStall
Integer Physical Register File resource stall. Applies to
all ops that have an integer destination register.

DeDisDispatchTokenStalls0
Core::X86::Pmc::Core::DeDisDispatchTokenStalls0 - Dispatch Resource
Stall Cycles 0

Cycles where a dispatch group is valid but does not get dispatched
due to a token stall.

This event has the following units which may be used to modify the
behavior of the event:

ScAguDispatchStall
SC AGU dispatch stall.

RetireTokenStall
RETIRE Tokens unavailable.

AGSQTokenStall
AGSQ Tokens unavailable.

ALUTokenStall
ALU tokens total unavailable.

ALSQ3_0_TokenStall

ALSQ2RsrcStall
ALSQ 2 Resources unavailable.

ALSQ1RsrcStall
ALSQ 1 Resources unavailable.

ExRetInstr
Core::X86::Pmc::Core::ExRetInstr - Retired Instructions

ExRetCops
Core::X86::Pmc::Core::ExRetCops - Retired Uops

The number of micro-ops retired. This count includes all processor
activity (instructions, exceptions, interrupts, microcode assists,
etc.). The number of events logged per cycle can vary from 0 to 8.

ExRetBrn
Core::X86::Pmc::Core::ExRetBrn - Retired Branch Instructions

The number of branch instructions retired. This includes all types
of architectural control flow changes, including exceptions and
interrupts.

ExRetBrnMisp
Core::X86::Pmc::Core::ExRetBrnMisp - Retired Branch Instructions
Mispredicted

The number of branch instructions retired, of any type, that were
not correctly predicted. This includes those for which prediction
is not attempted (far control transfers, exceptions and
interrupts).
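
Counts from ExRetBrnMisp are often easier to interpret relative to ExRetBrn or ExRetInstr. The following sketch derives two common ratios from raw counts. The function name and the sample numbers are invented for illustration; the ratios are a conventional derived metric and are not defined by this page or by the AMD manual.

```python
def mispredict_stats(ex_ret_brn, ex_ret_brn_misp, ex_ret_instr):
    """Return (misprediction rate, mispredicts per 1000 instructions).

    Arguments are raw counts of ExRetBrn, ExRetBrnMisp and ExRetInstr
    taken over the same measurement interval.
    """
    rate = ex_ret_brn_misp / ex_ret_brn if ex_ret_brn else 0.0
    mpki = 1000.0 * ex_ret_brn_misp / ex_ret_instr if ex_ret_instr else 0.0
    return rate, mpki


# Invented sample values, not measurements.
rate, mpki = mispredict_stats(200000, 4000, 1000000)
print(f"mispredict rate: {rate:.2%}, per 1000 instructions: {mpki:.1f}")
```

Because ExRetBrnMisp also counts transfers for which prediction is not attempted, the rate includes far control transfers, exceptions and interrupts.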

ExRetBrnTkn
Core::X86::Pmc::Core::ExRetBrnTkn - Retired Taken Branch
Instructions

The number of taken branches that were retired. This includes all
types of architectural control flow changes, including exceptions
and interrupts.

ExRetBrnTknMisp
Core::X86::Pmc::Core::ExRetBrnTknMisp - Retired Taken Branch
Instructions Mispredicted

The number of retired taken branch instructions that were
mispredicted.

ExRetBrnFar
Core::X86::Pmc::Core::ExRetBrnFar - Retired Far Control Transfers

The number of far control transfers retired including far
call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and
interrupts. Far control transfers are not subject to branch
prediction.

ExRetNearRet
Core::X86::Pmc::Core::ExRetNearRet - Retired Near Returns

The number of near return instructions (RET or RET Iw) retired.

ExRetNearRetMispred
Core::X86::Pmc::Core::ExRetNearRetMispred - Retired Near Returns
Mispredicted

The number of near returns retired that were not correctly
predicted by the return address predictor. Each such
mispredict incurs the same penalty as a mispredicted conditional
branch instruction.

ExRetBrnIndMisp
Core::X86::Pmc::Core::ExRetBrnIndMisp - Retired Indirect Branch
Instructions Mispredicted

The number of indirect branches retired that were not correctly
predicted. Each such mispredict incurs the same penalty as a
mispredicted conditional branch instruction. Note that only EX
mispredicts are counted.

ExRetMmxFpInstr
Core::X86::Pmc::Core::ExRetMmxFpInstr - Retired MMX/FP Instructions

The number of MMX, SSE or x87 instructions retired. The UnitMask
allows the selection of the individual classes of instructions as
given in the table. Each increment represents one complete
instruction. Since this event includes non-numeric instructions it
is not suitable for measuring MFLOPs.

This event has the following units which may be used to modify the
behavior of the event:

SseInstr
SSE instructions (SSE, SSE2, SSE3, SSSE3, SSE4A, SSE41,
SSE42, AVX).

MmxInstr
MMX instructions.

X87Instr
x87 instructions.

ExRetCond
Core::X86::Pmc::Core::ExRetCond - Retired Conditional Branch
Instructions

ExDivBusy
Core::X86::Pmc::Core::ExDivBusy - Div Cycles Busy count

ExDivCount
Core::X86::Pmc::Core::ExDivCount - Div Op Count

ExTaggedIbsOps
Core::X86::Pmc::Core::ExTaggedIbsOps - Tagged IBS Ops

This event has the following units which may be used to modify the
behavior of the event:

IbsCountRollover
Number of times an op could not be tagged by IBS because of
a previous tagged op that has not retired.

IbsTaggedOpsRet
Number of Ops tagged by IBS that retired.

IbsTaggedOps
Number of Ops tagged by IBS.

ExRetFusBrnchInst
Core::X86::Pmc::Core::ExRetFusBrnchInst - Retired Fused Branch
Instructions

The number of fused branch instructions retired per cycle. The
number of events logged per cycle can vary from 0 to 8.

L2RequestG1
Core::X86::Pmc::Core::L2RequestG1 - Requests to L2 Group1

All L2 Cache Requests (Breakdown 1 - Common).

This event has the following units which may be used to modify the
behavior of the event:

RdBlkL Data Cache Reads (including hardware and software
prefetch).

RdBlkX Data Cache Stores.

LsRdBlkC_S
Data Cache Shared Reads.

CacheableIcRead
Instruction Cache Reads.

ChangeToX
Data Cache State Change Requests. Request change to
writable, check L2 for current state.

PrefetchL2Cmd

L2HwPf L2 Prefetcher. All prefetches accepted by L2 pipeline, hit
or miss. Types of PF and L2 hit/miss broken out in a
separate perfmon event.

Group2 Miscellaneous events covered in more detail by
Core::X86::Pmc::Core::L2RequestG2 (PMCx061).

L2RequestG2
Core::X86::Pmc::Core::L2RequestG2 - Requests to L2 Group2

All L2 Cache Requests (Breakdown 2 - Rare).

This event has the following units which may be used to modify the
behavior of the event:

Group1 Miscellaneous events covered in more detail by
Core::X86::Pmc::Core::L2RequestG1 (PMCx060).

LsRdSized
Data cache read sized.

LsRdSizedNC
Data cache read sized non-cacheable.

IcRdSized
Instruction cache read sized.

IcRdSizedNC
Instruction cache read sized non-cacheable.

SmcInval
Self-modifying code invalidates.

BusLocksOriginator
Bus locks.

BusLocksResponses
Bus Lock Response.

L2CacheReqStat
Core::X86::Pmc::Core::L2CacheReqStat - Core to L2 Cacheable Request
Access Status

L2 Cache Request Outcomes (not including L2 Prefetch).

This event has the following units which may be used to modify the
behavior of the event:

LsRdBlkCS
Data Cache Shared Read Hit in L2.

LsRdBlkLHitX
Data Cache Read Hit in L2.

LsRdBlkLHitS
Data Cache Read Hit on Shared Line in L2.

LsRdBlkX
Data Cache Store or State Change Hit in L2.

LsRdBlkC
Data Cache Req Miss in L2 (all types).

IcFillHitX
Instruction Cache Hit Modifiable Line in L2.

IcFillHitS
Instruction Cache Hit Clean Line in L2.

IcFillMiss
Instruction Cache Req Miss in L2.
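
The units above separate L2 hits from L2 misses, which suggests a simple hit ratio. The sketch below groups the units according to their descriptions on this page and assumes that the units count disjoint requests, which this page does not state. The function name and the sample values are invented for illustration.

```python
# Grouping follows the unit descriptions for L2CacheReqStat.
HIT_UNITS = (
    "LsRdBlkCS", "LsRdBlkLHitX", "LsRdBlkLHitS",
    "LsRdBlkX", "IcFillHitX", "IcFillHitS",
)
MISS_UNITS = ("LsRdBlkC", "IcFillMiss")


def l2_hit_ratio(counts):
    """Return hits / (hits + misses), or None if nothing was counted.

    counts maps L2CacheReqStat unit names to raw counter values.
    Assumes the units do not overlap (an assumption, not a documented fact).
    """
    hits = sum(counts.get(unit, 0) for unit in HIT_UNITS)
    misses = sum(counts.get(unit, 0) for unit in MISS_UNITS)
    total = hits + misses
    return hits / total if total else None


# Invented sample values, not measurements.
sample = {"LsRdBlkLHitX": 500, "LsRdBlkX": 200, "IcFillHitS": 50,
          "LsRdBlkC": 200, "IcFillMiss": 50}
print(l2_hit_ratio(sample))
```

L2 prefetch requests are excluded from this event, so the ratio covers core requests only.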

L2PfHitL2
Core::X86::Pmc::Core::L2PfHitL2 - L2 Prefetch Hit in L2

L2PfMissL2HitL2
Core::X86::Pmc::Core::L2PfMissL2HitL2 - L2 Prefetcher Hits in L3

Counts all L2 prefetches accepted by the L2 pipeline which miss the
L2 cache and hit the L3.

L2PfMissL2L3
Core::X86::Pmc::Core::L2PfMissL2L3 - L2 Prefetcher Misses in L3

Counts all L2 prefetches accepted by the L2 pipeline which miss the
L2 and the L3 caches.

SEE ALSO


cpc(3CPC)

illumos March 25, 2019 illumos