AMD_F19H_ZEN3_EVENTS(3CPC) CPU Performance Counters Library Functions
NAME
amd_f19h_zen3_events - AMD Family 19h Zen3 processor performance
monitoring events
DESCRIPTION
This manual page describes events specific to AMD Family 19h Zen3
processors. For more information, please consult the appropriate AMD
BIOS and Kernel Developer's guide or Open-Source Register Reference.
Each of the events listed below includes the AMD mnemonic which matches
the name found in the AMD manual and a brief summary of the event. If
available, a more detailed description of the event follows and then
any additional unit values that modify the event. Each unit can be
combined to create a new event in the system by placing the '.'
character between the event name and the unit name.
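The naming rule above can be sketched in a few lines of Python. This is an illustration only: the helper name event_with_unit and the KNOWN_UNITS table are hypothetical, and the table holds just a few event and unit names copied from this page.

```python
# Hypothetical helper: the function name and the small table below are
# illustrative; the event and unit names are copied from this page.
KNOWN_UNITS = {
    "FpRetSseAvxOps": {"MacFLOPs", "DivFLOPs", "MultFLOPs", "AddSubFLOPs"},
    "LsMisalLoads": {"MA4K", "MA64"},
    "LsRetCpuid": set(),  # this event lists no units
}


def event_with_unit(event, unit=None):
    """Return 'Event' or 'Event.Unit', joined with the '.' character."""
    units = KNOWN_UNITS[event]  # raises KeyError for events not in the table
    if unit is None:
        return event
    if unit not in units:
        raise ValueError(f"{event} has no unit named {unit}")
    return f"{event}.{unit}"


if __name__ == "__main__":
    print(event_with_unit("FpRetSseAvxOps", "MacFLOPs"))
    print(event_with_unit("LsRetCpuid"))
```

The resulting string is the form in which an event and unit pair would be named; how that name is passed to the counter interfaces is outside the scope of this sketch.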
The following events are supported:
FpRetSseAvxOps Core::X86::Pmc::Core::FpRetSseAvxOps - Retired SSE/AVX FLOPs
This is a retire-based event. The number of retired SSE/AVX
FLOPs. The number of events logged per cycle can vary from 0 to
64. This event requires the use of the MergeEvent since it can
count above 15 events per cycle. See 2.1.17.3 [Large Increment
per Cycle Events]. It does not provide a useful count without
the use of the MergeEvent.
This event has the following units which may be used to modify
the behavior of the event:
MacFLOPs Multiply-Accumulate FLOPs. Each MAC operation is
counted as 2 FLOPs.
DivFLOPs Divide/square root FLOPs.
MultFLOPs Multiply FLOPs.
AddSubFLOPs Add/subtract FLOPs.
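Because each MAC operation is counted as 2 FLOPs, a MacFLOPs count has to be halved to recover the number of MAC operations. The following sketch shows that arithmetic and a simple rate calculation. The function names and the sample numbers are made up for illustration, and the counts are assumed to have been collected with the MergeEvent as described above.

```python
def mac_operations(mac_flops):
    """Number of MAC operations behind a MacFLOPs count (2 FLOPs each)."""
    return mac_flops // 2


def gflops(total_flops, seconds):
    """FLOP rate in units of 10^9 FLOPs per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return total_flops / seconds / 1e9


if __name__ == "__main__":
    # Made-up sample values, not measurements.
    sample_mac_flops = 1_000_000
    print(mac_operations(sample_mac_flops))
    print(gflops(2_000_000_000, 1.0))
```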
FpRetiredSerOps Core::X86::Pmc::Core::FpRetiredSerOps - Retired Serializing Ops
The number of serializing Ops retired.
This event has the following units which may be used to modify
the behavior of the event:
SseBotRet SSE/AVX bottom-executing ops retired.
SseCtrlRet SSE/AVX control word mispredict traps.
X87BotRet x87 bottom-executing ops retired.
X87CtrlRet x87 control word mispredict traps due to mispredictions
in RC or PC, or changes in Exception Mask bits.
FpDispFaults Core::X86::Pmc::Core::FpDispFaults - FP Dispatch Faults
Floating Point Dispatch Faults.
This event has the following units which may be used to modify
the behavior of the event:
YmmSpillFault YMM Spill fault.
YmmFillFault YMM Fill fault.
XmmFillFault XMM Fill fault.
x87FillFault x87 Fill fault.
LsBadStatus2 Core::X86::Pmc::Core::LsBadStatus2 - Bad Status 2
This event has the following units which may be used to modify
the behavior of the event:
StliOther Store-to-load conflicts: A load was unable to complete
due to a non-forwardable conflict with an older store.
Most commonly, a load's address range partially but not
completely overlaps with an uncompleted older store.
Software can avoid this problem by using same-size and
same-alignment loads and stores when accessing the same
data. Vector/SIMD code is particularly susceptible to
this problem; software should construct wide vector
stores by manipulating vector elements in registers
using shuffle/blend/swap instructions prior to storing
to memory, instead of using narrow element-by-element
stores.
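The partial-overlap case described for StliOther can be illustrated with a simple address-range check. This is a simplified model under stated assumptions, not the hardware's actual forwarding rule: it only compares byte ranges, and the function name and category labels are hypothetical.

```python
def classify_overlap(store_addr, store_size, load_addr, load_size):
    """Compare a load's byte range with an older store's byte range.

    Returns "disjoint", "contained" (the load lies entirely inside the
    store) or "partial" (the ranges overlap, but the store does not
    cover the whole load). Only "partial" models the conflict described
    in the text; real forwarding conditions are more detailed.
    """
    store_end = store_addr + store_size
    load_end = load_addr + load_size
    if load_end <= store_addr or store_end <= load_addr:
        return "disjoint"
    if store_addr <= load_addr and load_end <= store_end:
        return "contained"
    return "partial"


if __name__ == "__main__":
    # A 4-byte element store followed by a 16-byte vector load of the
    # same location: the pattern the text advises against.
    print(classify_overlap(0x1000, 4, 0x1000, 16))
    # Same-size, same-alignment store and load.
    print(classify_overlap(0x1000, 16, 0x1000, 16))
```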
LsLocks Core::X86::Pmc::Core::LsLocks - Retired Lock Instructions
This event has the following units which may be used to modify
the behavior of the event:
BusLock Comparable to legacy bus lock.
LsRetClClush Core::X86::Pmc::Core::LsRetClClush - Retired CLFLUSH
Instructions
The number of retired CLFLUSH instructions. This is a non-
speculative event.
LsRetCpuid Core::X86::Pmc::Core::LsRetCpuid - Retired CPUID Instructions
The number of CPUID instructions retired.
LsDispatch Core::X86::Pmc::Core::LsDispatch - LS Dispatch
Counts the number of operations dispatched to the LS unit.
LsSmiRx Core::X86::Pmc::Core::LsSmiRx - SMIs Received
Counts the number of SMIs received.
LsIntTaken Core::X86::Pmc::Core::LsIntTaken - Interrupts Taken
Counts the number of interrupts taken.
LsSTLF Core::X86::Pmc::Core::LsSTLF - Store to Load Forward
Number of STLF hits.
LsStCommitCancel2 Core::X86::Pmc::Core::LsStCommitCancel2 - Store Commit Cancels
2
This event has the following units which may be used to modify
the behavior of the event:
StCommitCancelWcbFull A non-cacheable store and the non-cacheable commit
buffer is full.
LsMabAlloc Core::X86::Pmc::Core::LsMabAlloc - LS MAB Allocates by Type
Counts when a LS pipe allocates a MAB entry.
LsDmndFillsFromSys Core::X86::Pmc::Core::LsDmndFillsFromSys - Demand Data Cache
Fills by Data Source
Demand Data Cache Fills by Data Source.
This event has the following units which may be used to modify
the behavior of the event:
MemIoRemote From DRAM or IO connected in different Node.
ExtCacheRemote From CCX Cache in different Node.
MemIoLocal From DRAM or IO connected in same node.
ExtCacheLocal From cache of different CCX in same node.
IntCache From L3 or different L2 in same CCX.
LclL2 From Local L2 to the core.
LsAnyFillsFromSys Core::X86::Pmc::Core::LsAnyFillsFromSys - Any Data Cache Fills
by Data Source
Any Data Cache Fills by Data Source.
This event has the following units which may be used to modify
the behavior of the event:
MemIoRemote From DRAM or IO connected in different Node.
ExtCacheRemote From CCX Cache in different Node.
MemIoLocal From DRAM or IO connected in same node.
ExtCacheLocal From cache of different CCX in same node.
IntCache From L3 or different L2 in same CCX.
LclL2 From Local L2 to the core.
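One way to read the per-source fill counts is as shares of the total. The sketch below assumes that each fill is attributed to exactly one of the units listed above, so that the unit counts can be summed; that assumption, the function names, and the sample counts are illustrative and not taken from the AMD documentation.

```python
def fill_shares(counts):
    """Map each data-source unit to its fraction of all counted fills."""
    total = sum(counts.values())
    if total == 0:
        return {unit: 0.0 for unit in counts}
    return {unit: n / total for unit, n in counts.items()}


def remote_share(counts):
    """Fraction of fills that came from a different node."""
    shares = fill_shares(counts)
    return shares.get("MemIoRemote", 0.0) + shares.get("ExtCacheRemote", 0.0)


if __name__ == "__main__":
    # Made-up sample counts keyed by the unit names on this page.
    sample = {"LclL2": 50, "IntCache": 25, "MemIoLocal": 15, "MemIoRemote": 10}
    print(fill_shares(sample))
    print(remote_share(sample))
```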
LsL1DTlbMiss Core::X86::Pmc::Core::LsL1DTlbMiss - L1 DTLB Misses
This event has the following units which may be used to modify
the behavior of the event:
TlbReload1GL2Miss DTLB reload to a 1G page that also missed in the L2
TLB.
TlbReload2ML2Miss DTLB reload to a 2M page that also missed in the L2
TLB.
TlbReloadCoalescedPageMiss DTLB reload to a coalesced page that also missed in the
L2 TLB.
TlbReload4KL2Miss DTLB reload to a 4K page that missed in the L2 TLB.
TlbReload1GL2Hit DTLB reload to a 1G page that hit in the L2 TLB.
TlbReload2ML2Hit DTLB reload to a 2M page that hit in the L2 TLB.
TlbReloadCoalescedPageHit DTLB reload to a coalesced page that hit in the L2
TLB.
TlbReload4KL2Hit DTLB reload to a 4K page that hit in the L2 TLB.
LsMisalLoads Core::X86::Pmc::Core::LsMisalLoads - Misaligned loads
This event has the following units which may be used to modify
the behavior of the event:
MA4K The number of 4KB misaligned (i.e., page crossing)
loads.
MA64 The number of 64B misaligned (i.e., cacheline crossing)
loads.
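Whether a load crosses a 64-byte cache line or a 4 KB page follows from its address and size alone. The sketch below shows that check; the function names are hypothetical, and it makes no claim about how the hardware attributes a given load to the MA4K and MA64 units.

```python
CACHE_LINE = 64   # bytes, per the MA64 description
PAGE = 4096       # bytes, per the MA4K description


def crosses(addr, size, boundary):
    """True if the bytes [addr, addr + size) span a multiple of boundary."""
    if size <= 0:
        raise ValueError("size must be positive")
    return addr // boundary != (addr + size - 1) // boundary


def crosses_cache_line(addr, size):
    return crosses(addr, size, CACHE_LINE)


def crosses_page(addr, size):
    return crosses(addr, size, PAGE)


if __name__ == "__main__":
    print(crosses_cache_line(60, 8))   # bytes 60..67 straddle byte 64
    print(crosses_cache_line(56, 8))   # bytes 56..63 stay in one line
    print(crosses_page(4092, 8))       # bytes 4092..4099 straddle byte 4096
```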
LsPrefInstrDisp Core::X86::Pmc::Core::LsPrefInstrDisp - Prefetch Instructions
Dispatched
Software Prefetch Instructions Dispatched (Speculative).
This event has the following units which may be used to modify
the behavior of the event:
PREFETCHNTA PrefetchNTA instruction. See docAPM3 PREFETCHlevel.
PREFETCHW PrefetchW instruction. See docAPM3 PREFETCHW.
PREFETCH PrefetchT0, T1 and T2 instructions. See docAPM3
PREFETCHlevel.
LsInefSwPref Core::X86::Pmc::Core::LsInefSwPref - Ineffective Software
Prefetches
The number of software prefetches that did not fetch data
outside of the processor core.
This event has the following units which may be used to modify
the behavior of the event:
MabMchCnt Software PREFETCH instruction saw a match on an
already-allocated miss request buffer.
DataPipeSwPfDcHit Software PREFETCH instruction saw a DC hit.
LsSwPfDcFills Core::X86::Pmc::Core::LsSwPfDcFills - Software Prefetch Data
Cache Fills
Software Prefetch Data Cache Fills by Data Source.
This event has the following units which may be used to modify
the behavior of the event:
MemIoRemote From DRAM or IO connected in different Node.
ExtCacheRemote From CCX Cache in different Node.
MemIoLocal From DRAM or IO connected in same node.
ExtCacheLocal From cache of different CCX in same node.
IntCache From L3 or different L2 in same CCX.
LclL2 From Local L2 to the core.
LsHwPfDcFills Core::X86::Pmc::Core::LsHwPfDcFills - Hardware Prefetch Data
Cache Fills
Hardware Prefetch Data Cache Fills by Data Source.
This event has the following units which may be used to modify
the behavior of the event:
MemIoRemote From DRAM or IO connected in different Node.
ExtCacheRemote From CCX Cache in different Node.
MemIoLocal From DRAM or IO connected in same node.
ExtCacheLocal From cache of different CCX in same node.
IntCache From L3 or different L2 in same CCX.
LclL2 From Local L2 to the core.
LsAllocMabCount Core::X86::Pmc::Core::LsAllocMabCount - Count of Allocated Mabs
This event counts the in-flight L1 data cache misses (allocated
Miss Address Buffers) divided by 4 and rounded down each cycle
unless used with the MergeEvent functionality. If the
MergeEvent is used, it counts the exact number of outstanding
L1 data cache misses. See 2.1.17.3 [Large Increment per Cycle
Events].
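The difference between the two counting modes can be seen with a small simulation. The per-cycle values below are made up, and the function name is hypothetical; the sketch only reproduces the arithmetic stated above (divide by 4 and round down each cycle, or count exactly when the MergeEvent is used).

```python
def accumulate_mab_count(in_flight_per_cycle, merged):
    """Sum the per-cycle increments for LsAllocMabCount.

    in_flight_per_cycle: number of allocated MABs in each cycle.
    merged: True models use of the MergeEvent (exact count);
            False models the default (count // 4 each cycle).
    """
    if merged:
        return sum(in_flight_per_cycle)
    return sum(n // 4 for n in in_flight_per_cycle)


if __name__ == "__main__":
    samples = [0, 3, 4, 7, 9]  # made-up in-flight misses per cycle
    print(accumulate_mab_count(samples, merged=False))
    print(accumulate_mab_count(samples, merged=True))
```

With these sample values the unmerged total is 4 and the merged total is 23, so multiplying the unmerged count by 4 only approximates the exact figure.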
LsNotHaltedCyc Core::X86::Pmc::Core::LsNotHaltedCyc - Cycles not in Halt
LsTlbFlush Core::X86::Pmc::Core::LsTlbFlush - All TLB Flushes
Requires unit mask 0xFF to engage event for counting.
IcCacheFillL2 Core::X86::Pmc::Core::IcCacheFillL2 - Instruction Cache Refills
from L2
The number of 64-byte instruction cache lines fulfilled from
the L2 cache.
IcCacheFillSys Core::X86::Pmc::Core::IcCacheFillSys - Instruction Cache
Refills from System
The number of 64-byte instruction cache lines fulfilled from
system memory or another cache.
BpL1TlbMissL2TlbHit Core::X86::Pmc::Core::BpL1TlbMissL2TlbHit - L1 ITLB Miss, L2
ITLB Hit
The number of instruction fetches that miss in the L1 ITLB but
hit in the L2 ITLB.
BpL1TlbMissL2TlbMiss Core::X86::Pmc::Core::BpL1TlbMissL2TlbMiss - ITLB Reload from
Page-Table walk
The number of valid fills into the ITLB originating from the LS
Page-Table Walker. Tablewalk requests are issued for L1-ITLB
and L2-ITLB misses.
This event has the following units which may be used to modify
the behavior of the event:
Coalesced4K Walk for >4K Coalesced page.
IF1G Walk for 1G page.
IF2M Walk for 2M page.
IF4K Walk for 4K page.
BpL2BTBCorrect Core::X86::Pmc::Core::BpL2BTBCorrect - L2 Branch Prediction
Overrides Existing Prediction (speculative)
BpDynIndPred Core::X86::Pmc::Core::BpDynIndPred - Dynamic Indirect
Predictions
The number of times a branch used the indirect predictor to
make a prediction.
BpDeReDirect Core::X86::Pmc::Core::BpDeReDirect - Decode Redirects
The number of times the instruction decoder overrides the
predicted target.
BpL1TlbFetchHit Core::X86::Pmc::Core::BpL1TlbFetchHit - L1 TLB Hits for
Instruction Fetch
The number of instruction fetches that hit in the L1 ITLB.
This event has the following units which may be used to modify
the behavior of the event:
IF1G L1 Instruction TLB hit (1G page size).
IF2M L1 Instruction TLB hit (2M page size).
IF4K L1 Instruction TLB hit (4K or 16K page size).
IcTagHitMiss Core::X86::Pmc::Core::IcTagHitMiss - IC Tag Hit/Miss Events
Counts various IC tag related hit and miss events.
OpCacheHitMiss Core::X86::Pmc::Core::OpCacheHitMiss - Op Cache Hit/Miss
Counts Op Cache micro-tag hit/miss events.
DeSrcOpDisp Core::X86::Pmc::Core::DeSrcOpDisp - Source of Op Dispatched
From Decoder
Counts the number of ops dispatched from the decoder classified
by op source. See docRevG erratum #1287.
This event has the following units which may be used to modify
the behavior of the event:
OpCache Count of ops fetched from Op Cache and dispatched.
x86Decoder Count of ops fetched from Instruction Cache and
dispatched.
DeDisCopsFromDecoder Core::X86::Pmc::Core::DeDisCopsFromDecoder - Types of Ops
Dispatched From Decoder
Counts the number of ops dispatched from the decoder classified
by op type. The UnitMask value encodes which types of ops are
counted.
DeDisDispatchTokenStalls1 Core::X86::Pmc::Core::DeDisDispatchTokenStalls1 - Dispatch
Resource Stall Cycles 1
Cycles where a dispatch group is valid but does not get
dispatched due to a Token Stall. Also counts cycles when the
thread is not selected to dispatch but would have been stalled
due to a Token Stall.
This event has the following units which may be used to modify
the behavior of the event:
FpFlushRecoveryStall FP Flush recovery stall.
FPSchRsrcStall FP scheduler resource stall. Applies to ops that use
the FP scheduler.
FpRegFileRsrcStall Floating point register file resource stall. Applies
to all FP ops that have a destination register.
TakenBrnchBufferRsrc Taken branch buffer resource stall.
StoreQueueRsrcStall Store Queue resource stall. Applies to all ops with
store semantics.
LoadQueueRsrcStall Load Queue resource stall. Applies to all ops with
load semantics.
IntPhyRegFileRsrcStall Integer Physical Register File resource stall. Applies
to all ops that have an integer destination register.
DeDisDispatchTokenStalls2 Core::X86::Pmc::Core::DeDisDispatchTokenStalls2 - Dynamic
Tokens Dispatch Stall Cycles 2
Cycles where a dispatch group is valid but does not get
dispatched due to a token stall.
This event has the following units which may be used to modify
the behavior of the event:
RetireTokenStall Insufficient Retire Queue tokens available.
IntSch3TokenStall No tokens for Integer Scheduler Queue 3 available.
IntSch2TokenStall No tokens for Integer Scheduler Queue 2 available.
IntSch1TokenStall No tokens for Integer Scheduler Queue 1 available.
IntSch0TokenStall No tokens for Integer Scheduler Queue 0 available.
ExRetInstr Core::X86::Pmc::Core::ExRetInstr - Retired Instructions
The number of instructions retired.
ExRetOps Core::X86::Pmc::Core::ExRetOps - Retired Ops
The number of macro-ops retired.
ExRetBrn Core::X86::Pmc::Core::ExRetBrn - Retired Branch Instructions
The number of branch instructions retired. This includes all
types of architectural control flow changes, including
exceptions and interrupts.
ExRetBrnMisp Core::X86::Pmc::Core::ExRetBrnMisp - Retired Branch
Instructions Mispredicted
The number of retired branch instructions that were
mispredicted.
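A common use of ExRetBrnMisp together with ExRetBrn is a misprediction ratio. The sketch below shows that division; the function name and the sample counts are made up, and both counts are assumed to cover the same measurement interval.

```python
def branch_mispredict_ratio(ex_ret_brn_misp, ex_ret_brn):
    """Fraction of retired branch instructions that were mispredicted."""
    if ex_ret_brn == 0:
        return 0.0  # no retired branches in the interval
    return ex_ret_brn_misp / ex_ret_brn


if __name__ == "__main__":
    # Made-up sample counts, not measurements.
    print(branch_mispredict_ratio(50, 1000))
```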
ExRetBrnTkn Core::X86::Pmc::Core::ExRetBrnTkn - Retired Taken Branch
Instructions
The number of taken branches that were retired. This includes
all types of architectural control flow changes, including
exceptions and interrupts.
ExRetBrnTknMisp Core::X86::Pmc::Core::ExRetBrnTknMisp - Retired Taken Branch
Instructions Mispredicted
The number of retired taken branch instructions that were
mispredicted.
ExRetBrnFar Core::X86::Pmc::Core::ExRetBrnFar - Retired Far Control
Transfers
The number of far control transfers retired including far
call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and
interrupts. Far control transfers are not subject to branch
prediction.
ExRetNearRet Core::X86::Pmc::Core::ExRetNearRet - Retired Near Returns
The number of near return instructions (RET or RET Iw) retired.
ExRetNearRetMispred Core::X86::Pmc::Core::ExRetNearRetMispred - Retired Near
Returns Mispredicted
The number of near returns retired that were not correctly
predicted by the return address predictor. Each such
mispredict incurs the same penalty as a mispredicted conditional
branch instruction.
ExRetBrnIndMisp Core::X86::Pmc::Core::ExRetBrnIndMisp - Retired Indirect Branch
Instructions Mispredicted
The number of indirect branches retired that were not correctly
predicted. Each such mispredict incurs the same penalty as a
mispredicted conditional branch instruction. Note that only EX
mispredicts are counted.
ExRetMmxFpInstr Core::X86::Pmc::Core::ExRetMmxFpInstr - Retired MMX/FP
Instructions
The number of MMX, SSE or x87 instructions retired. The
UnitMask allows the selection of the individual classes of
instructions as given in the table. Each increment represents
one complete instruction. Since this event includes non-numeric
instructions it is not suitable for measuring MFLOPs.
This event has the following units which may be used to modify
the behavior of the event:
SseInstr SSE instructions (SSE, SSE2, SSE3, SSSE3, SSE4A, SSE41,
SSE42, AVX).
MmxInstr MMX instructions.
X87Instr x87 instructions.
ExRetIndBrchInstr Core::X86::Pmc::Core::ExRetIndBrchInstr - Retired Indirect
Branch Instructions
The number of indirect branches retired.
ExRetCond Core::X86::Pmc::Core::ExRetCond - Retired Conditional Branch
Instructions
ExDivBusy Core::X86::Pmc::Core::ExDivBusy - Div Cycles Busy count
ExDivCount Core::X86::Pmc::Core::ExDivCount - Div Op Count
ExRetMsprdBrnchInstrDirMsmtch Core::X86::Pmc::Core::ExRetMsprdBrnchInstrDirMsmtch - Retired
Mispredicted Branch Instructions due to Direction Mismatch
The number of retired conditional branch instructions that were
not correctly predicted because of a branch direction mismatch.
ExTaggedIbsOps Core::X86::Pmc::Core::ExTaggedIbsOps - Tagged IBS Ops
Counts Op IBS related events.
This event has the following units which may be used to modify
the behavior of the event:
IbsCountRollover Number of times an op could not be tagged by IBS
because of a previous tagged op that has not retired.
IbsTaggedOpsRet Number of Ops tagged by IBS that retired.
IbsTaggedOps Number of Ops tagged by IBS.
ExRetFusedInstr Core::X86::Pmc::Core::ExRetFusedInstr - Retired Fused
Instructions
Counts retired fused instructions.
L2RequestG1 Core::X86::Pmc::Core::L2RequestG1 - Requests to L2 Group1
All L2 Cache Requests
This event has the following units which may be used to modify
the behavior of the event:
RdBlkL Data Cache Reads (including hardware and software
prefetch).
RdBlkX Data Cache Stores.
LsRdBlkC_S Data Cache Shared Reads.
CacheableIcRead Instruction Cache Reads.
ChangeToX Data Cache State Change Requests. Request change to
writable, check L2 for current state.
PrefetchL2Cmd L2HwPf L2 Prefetcher. All prefetches accepted by L2 pipeline,
hit or miss. Types of PF and L2 hit/miss broken out in
a separate perfmon event.
L2CacheReqStat Core::X86::Pmc::Core::L2CacheReqStat - Core to L2 Cacheable
Request Access Status
L2 Cache Request Outcomes (not including L2 Prefetch).
This event has the following units which may be used to modify
the behavior of the event:
LsRdBlkCS Data Cache Shared Read Hit in L2.
LsRdBlkLHitX Data Cache Read Hit in L2.
LsRdBlkLHitS Data Cache Read Hit Non-Modifiable Line in L2.
LsRdBlkX Data Cache Store or State Change Hit in L2.
LsRdBlkC Data Cache Req Miss in L2 (all types).
IcFillHitX Instruction Cache Hit Modifiable Line in L2.
IcFillHitS Instruction Cache Hit Non-Modifiable Line in L2.
IcFillMiss Instruction Cache Req Miss in L2.
L2PfHitL2 Core::X86::Pmc::Core::L2PfHitL2 - L2 Prefetch Hit in L2
L2PfMissL2HitL3 Core::X86::Pmc::Core::L2PfMissL2HitL3 - L2 Prefetcher Hits in
L3
Counts all L2 prefetches accepted by the L2 pipeline which miss
the L2 cache and hit the L3.
L2PfMissL2L3 Core::X86::Pmc::Core::L2PfMissL2L3 - L2 Prefetcher Misses in L3
Counts all L2 prefetches accepted by the L2 pipeline which miss
the L2 and the L3 caches.
SEE ALSO
cpc(3CPC)
illumos March 25, 2019 illumos