AMD_F19H_ZEN4_EVENTS(3CPC) CPU Performance Counters Library Functions
NAME
amd_f19h_zen4_events - AMD Family 19h Zen4 processor performance
monitoring events
DESCRIPTION
This manual page describes events specific to AMD Family 19h Zen4
processors. For more information, please consult the appropriate AMD
BIOS and Kernel Developer's guide or Open-Source Register Reference.
Each of the events listed below includes the AMD mnemonic which matches
the name found in the AMD manual and a brief summary of the event. If
available, a more detailed description of the event follows and then
any additional unit values that modify the event. Each unit can be
combined to create a new event in the system by placing the '.'
character between the event name and the unit name.
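As a minimal sketch of the naming rule above, the Python helper below joins an event name and a unit name with the '.' character. The helper name qualified_event is hypothetical and is not part of any cpc interface; it only builds the string.

```python
def qualified_event(event: str, unit: str) -> str:
    """Join an event name and a unit name with '.', as described above."""
    if not event or not unit:
        raise ValueError("both an event name and a unit name are required")
    if "." in event or "." in unit:
        raise ValueError("names must not already contain '.'")
    return f"{event}.{unit}"


# Example using an event and unit listed on this page.
print(qualified_event("FpRetx87FpOps", "MulOps"))
```

The validation is only a guard for the sketch; the system may accept or reject names by different rules.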
The following events are supported:
FpRetx87FpOps
Core::X86::Pmc::Core::FpRetx87FpOps - Retired x87 FP Ops
The number of x87 floating-point Ops that have retired.
This event has the following units which may be used to modify
the behavior of the event:
DivSqrROps
Divide and square root Ops.
MulOps Multiply Ops.
AddSubOps
Add/subtract Ops.
FpRetSseAvxOps
Core::X86::Pmc::Core::FpRetSseAvxOps - Retired SSE/AVX FLOPs
The number of retired SSE/AVX FLOPs. This is a retire-based
event. The number of events logged per cycle can vary from 0 to
64. Because it can count more than 15 events per cycle, this
event requires the use of the MergeEvent and does not provide a
useful count without it. See 2.1.13.3 [Large Increment per Cycle
Events] in the AMD manual.
This event has the following units which may be used to modify
the behavior of the event:
BfloatMacFLOPs
bfloat Multiply-Accumulate FLOPs. Each bfloat MAC
operation is counted as 2 FLOPS.
MacFLOPs
Multiply-Accumulate FLOPs. Each MAC operation is
counted as 2 FLOPS. This event does not include bfloat
MAC operations.
DivFLOPs
Divide/square root FLOPs.
MultFLOPs
Multiply FLOPs.
AddSubFLOPs
Add/subtract FLOPs.
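Because each MAC operation is counted as 2 FLOPS, raw unit counts need a small conversion before they can be read as operation counts or rates. The following is a rough Python sketch, assuming the per-unit counts have already been collected with the MergeEvent in use and that the units count disjoint classes of operations. The function name, the sample numbers, and the elapsed time are invented for illustration.

```python
def flop_summary(counts: dict, seconds: float) -> dict:
    """Summarize FpRetSseAvxOps unit counts taken over `seconds`."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    # Assumption: the units are disjoint, so their sum is the total.
    total = sum(counts.values())
    # Each MAC is counted as 2 FLOPs, so halve to recover MAC operations.
    mac_flops = counts.get("MacFLOPs", 0) + counts.get("BfloatMacFLOPs", 0)
    return {
        "total_flops": total,
        "mac_ops": mac_flops // 2,
        "gflops": total / seconds / 1e9,
    }


# Hypothetical counts, not measured data.
sample = {
    "MacFLOPs": 4_000_000_000,
    "MultFLOPs": 1_000_000_000,
    "AddSubFLOPs": 1_000_000_000,
}
summary = flop_summary(sample, seconds=2.0)
print(summary)
```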
FpRetiredSerOps
Core::X86::Pmc::Core::FpRetiredSerOps - Retired Serializing Ops
The number of serializing Ops retired.
This event has the following units which may be used to modify
the behavior of the event:
SseBotRet
SSE/AVX bottom-executing ops retired.
SseCtrlRet
SSE/AVX control word mispredict traps.
X87BotRet
x87 bottom-executing ops retired.
X87CtrlRet
x87 control word mispredict traps due to mispredictions
in RC or PC, or changes in Exception Mask bits.
FpOpsRetiredByWidth
Core::X86::Pmc::Core::FpOpsRetiredByWidth - Retired FP Ops By
Width
This event has the following units which may be used to modify
the behavior of the event:
Pack512uOpsRetired
Number of packed 512-bit ops retired.
Pack256uOpsRetired
Number of packed 256-bit ops retired.
Pack128uOpsRetired
Number of packed 128-bit ops retired.
ScalaruOpsRetired
Number of scalar ops retired.
MMXuOpsRetired
Number of MMX ops retired.
x87uOpsRetired
Number of x87 ops retired.
FpOpsRetiredByType
Core::X86::Pmc::Core::FpOpsRetiredByType - Retired FP Ops By
Type
Note: Shuffle op counts may count for instructions that are not
necessarily thought of as including shuffles. For example,
Horizontal Add, Dot Product, and certain MOV instructions may
include or use only shuffle type ops.
SseAvxOpsRetired
Core::X86::Pmc::Core::SseAvxOpsRetired - INT Ops Retired
Note: Shuffle op counts may count for instructions that are not
necessarily thought of as including shuffles. For example,
Horizontal Add, Dot Product, and certain MOV instructions may
include or use only shuffle type ops.
FpPackOpsRetired
Core::X86::Pmc::Core::FpPackOpsRetired - Packed FP Ops Retired
Note: Shuffle op counts may count for instructions that are not
necessarily thought of as including shuffles. For example,
Horizontal Add, Dot Product, and certain MOV instructions may
include or use only shuffle type ops.
PackedIntOpType
Core::X86::Pmc::Core::PackedIntOpType - Packed INT Ops Retired
Note: Shuffle op counts may count for instructions that are not
necessarily thought of as including shuffles. For example,
Horizontal Add, Dot Product, and certain MOV instructions may
include or use only shuffle type ops. This event also counts FP
data type packed and scalar MOV and shuffle.
FpDispFaults
Core::X86::Pmc::Core::FpDispFaults - FP Dispatch Faults
Floating-point Dispatch Faults.
This event has the following units which may be used to modify
the behavior of the event:
YmmSpillFault
YMM Spill fault.
YmmFillFault
YMM Fill fault.
XmmFillFault
XMM Fill fault.
x87FillFault
x87 Fill fault.
LsBadStatus2
Core::X86::Pmc::Core::LsBadStatus2 - Bad Status 2
This event has the following units which may be used to modify
the behavior of the event:
StliOther
Store-to-load conflicts: A load was unable to complete
due to a non-forwardable conflict with an older store.
Most commonly, a load's address range partially but not
completely overlaps with an uncompleted older store.
Software can avoid this problem by using same-size and
same-alignment loads and stores when accessing the same
data. Vector/SIMD code is particularly susceptible to
this problem; software should construct wide vector
stores by manipulating vector elements in registers
using shuffle/blend/swap instructions prior to storing
to memory, instead of using narrow element-by-element
stores.
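The overlap condition described for StliOther can be illustrated with a small Python sketch. It models only the geometric case named in the text (a load that partially but not completely overlaps an older store) and does not attempt to model the hardware's actual forwarding rules; the function name and the addresses are hypothetical.

```python
def overlap_kind(store_addr: int, store_size: int,
                 load_addr: int, load_size: int) -> str:
    """Classify how a load's byte range relates to an older store's range."""
    store_end = store_addr + store_size
    load_end = load_addr + load_size
    if load_end <= store_addr or store_end <= load_addr:
        return "none"        # ranges are disjoint
    if store_addr <= load_addr and load_end <= store_end:
        return "contained"   # load lies entirely inside the store
    return "partial"         # the case the text warns about


# Same-size, same-alignment access: the load is fully contained.
print(overlap_kind(0x1000, 8, 0x1000, 8))
# A 16-byte vector load following a narrow 4-byte element store.
print(overlap_kind(0x1000, 4, 0x1000, 16))
```

A "contained" result in this sketch only means the partial-overlap condition is absent; whether the hardware can forward in that case is outside what the text states.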
LsLocks
Core::X86::Pmc::Core::LsLocks - Retired Lock Instructions
This event has the following units which may be used to modify
the behavior of the event:
BusLock
Comparable to legacy bus lock.
LsRetClClush
Core::X86::Pmc::Core::LsRetClClush - Retired CLFLUSH
Instructions
The number of retired CLFLUSH instructions. This is a
non-speculative event.
LsRetCpuid
Core::X86::Pmc::Core::LsRetCpuid - Retired CPUID Instructions
The number of CPUID instructions retired.
LsDispatch
Core::X86::Pmc::Core::LsDispatch - LS Dispatch
Counts the number of operations dispatched to the LS unit.
UnitMask events are ADDed.
LsSmiRx
Core::X86::Pmc::Core::LsSmiRx - SMIs Received
Counts the number of SMIs received.
LsIntTaken
Core::X86::Pmc::Core::LsIntTaken - Interrupts Taken
Counts the number of interrupts taken.
This event has the following units which may be used to modify
the behavior of the event:
IntTaken
Number of Interrupts taken. This event is also counted
when UnitMask[7:0]=0.
LsSTLF Core::X86::Pmc::Core::LsSTLF - Store to Load Forward
Number of STLF hits.
LsStCommitCancel2
Core::X86::Pmc::Core::LsStCommitCancel2 - Store Commit Cancels
2
This event has the following units which may be used to modify
the behavior of the event:
StCommitCancelWcbFull
The store is non-cacheable and the non-cacheable commit
buffer is full.
LsMabAlloc
Core::X86::Pmc::Core::LsMabAlloc - LS MAB Allocates by Type
Counts when a LS pipe allocates a MAB entry.
LsDmndFillsFromSys
Core::X86::Pmc::Core::LsDmndFillsFromSys - Demand Data Cache
Fills by Data Source
Demand Data Cache Fills by Data Source.
This event has the following units which may be used to modify
the behavior of the event:
AlternateMemories_NearFar
Requests that return from Extension Memory.
Dram_IO_Far
Requests that target another NUMA node and return from
DRAM or MMIO from another NUMA node, either from the
same or different socket.
FarCache_NearFar
Requests that return from another CCX cache in a
different NUMA node.
Dram_IO_Near
Requests that target the same NUMA node and return from
either DRAM or MMIO in the same NUMA node.
NearCache_NearFar
Requests that return from another CCX cache in the same
NUMA node.
LocalCcx
Data returned from L3 or different L2 in the same CCX.
LocalL2
Data returned from the local L2.
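One way to read the data-source units together is as shares of all demand fills. The Python sketch below assumes one count per unit has been collected separately and that the units do not overlap; the function names and the sample counts are hypothetical.

```python
def fill_shares(counts: dict) -> dict:
    """Return each unit's fraction of the total fills."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {unit: n / total for unit, n in counts.items()}


def far_fraction(counts: dict) -> float:
    """Fraction of fills served from a different NUMA node."""
    total = sum(counts.values())
    far = counts.get("Dram_IO_Far", 0) + counts.get("FarCache_NearFar", 0)
    return far / total if total else 0.0


# Hypothetical counts, not measured data.
sample_fills = {
    "LocalL2": 700,
    "LocalCcx": 200,
    "Dram_IO_Near": 60,
    "Dram_IO_Far": 30,
    "FarCache_NearFar": 10,
}
shares = fill_shares(sample_fills)
print(shares["LocalL2"], far_fraction(sample_fills))
```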
LsAnyFillsFromSys
Core::X86::Pmc::Core::LsAnyFillsFromSys - Any Data Cache Fills
by Data Source
Any Data Cache Fills by Data Source.
This event has the following units which may be used to modify
the behavior of the event:
AlternateMemories_NearFar
Requests that return from Extension Memory.
Dram_IO_Far
Requests that target another NUMA node and return from
DRAM or MMIO from another NUMA node, either from the
same or different socket.
FarCache_NearFar
Requests that return from another CCX cache in a
different NUMA node.
Dram_IO_Near
Requests that target the same NUMA node and return from
either DRAM or MMIO in the same NUMA node.
ExtCacheLocal
Requests that return from another CCX cache in the same
NUMA node.
LocalCcx
Data returned from L3 or different L2 in the same CCX.
LocalL2
Data returned from the local L2.
LsL1DTlbMiss
Core::X86::Pmc::Core::LsL1DTlbMiss - L1 DTLB Misses
This event has the following units which may be used to modify
the behavior of the event:
TlbReload1GL2Miss
DTLB reload to a 1-G page that also missed in the L2
TLB.
TlbReload2ML2Miss
DTLB reload to a 2-M page that also missed in the L2
TLB.
TlbReloadCoalescedPageMiss
DTLB reload to a coalesced page that also missed in the
L2 TLB.
TlbReload4KL2Miss
DTLB reload to a 4-K page that missed the L2 TLB.
TlbReload1GL2Hit
DTLB reload to a 1-G page that hit in the L2 TLB.
TlbReload2ML2Hit
DTLB reload to a 2-M page that hit in the L2 TLB.
TlbReloadCoalescedPageHit
DTLB reload to a coalesced page that hit in the L2 TLB.
TlbReload4KL2Hit
DTLB reload to a 4-K page that hit in the L2 TLB.
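Since every unit of this event names either an L2 TLB hit or an L2 TLB miss, the unit counts can be folded into a single ratio. This is a sketch only: it assumes per-unit counts are already available and relies on the unit names listed above ending in "Hit" or "Miss". The function name and the sample values are invented.

```python
def l2_tlb_hit_ratio(counts: dict) -> float:
    """Fraction of L1 DTLB misses that were satisfied by the L2 TLB."""
    hits = sum(n for unit, n in counts.items() if unit.endswith("Hit"))
    misses = sum(n for unit, n in counts.items() if unit.endswith("Miss"))
    total = hits + misses
    return hits / total if total else 0.0


# Hypothetical counts, not measured data.
sample_tlb = {
    "TlbReload4KL2Hit": 900,
    "TlbReload2ML2Hit": 50,
    "TlbReload4KL2Miss": 40,
    "TlbReload2ML2Miss": 10,
}
print(l2_tlb_hit_ratio(sample_tlb))
```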
LsMisalLoads
Core::X86::Pmc::Core::LsMisalLoads - Misaligned loads
This event has the following units which may be used to modify
the behavior of the event:
MA4K The number of 4-KB misaligned (i.e., page crossing)
loads.
MA64 The number of 64-B misaligned (i.e., cacheline
crossing) loads.
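The two misalignment cases can be checked arithmetically for a given address and access size. The Python sketch below reports the geometric facts only; whether the hardware counts a page-crossing load under both units is not stated here, and the function names are hypothetical.

```python
def crosses(addr: int, size: int, boundary: int) -> bool:
    """True if the bytes [addr, addr + size) span a multiple of `boundary`."""
    return size > 0 and (addr // boundary) != ((addr + size - 1) // boundary)


def classify_load(addr: int, size: int) -> list:
    """List which of the boundaries named above the load crosses."""
    kinds = []
    if crosses(addr, size, 64):      # 64-B cacheline crossing
        kinds.append("MA64")
    if crosses(addr, size, 4096):    # 4-KB page crossing
        kinds.append("MA4K")
    return kinds


print(classify_load(0x1000, 8))   # aligned 8-byte load
print(classify_load(0x103C, 8))   # straddles a cacheline boundary
print(classify_load(0x1FFC, 8))   # straddles a page boundary
```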
LsPrefInstrDisp
Core::X86::Pmc::Core::LsPrefInstrDisp - Prefetch Instructions
Dispatched
Software Prefetch Instructions Dispatched (Speculative).
This event has the following units which may be used to modify
the behavior of the event:
PREFETCHNTA
PrefetchNTA instruction. See docAPM3 PREFETCHlevel.
PREFETCHW
PrefetchW instruction. See docAPM3 PREFETCHW.
PREFETCH
PrefetchT0, T1 and T2 instructions. See docAPM3
PREFETCHlevel.
LsWcbCloseFlush
Core::X86::Pmc::Core::LsWcbCloseFlush - Write Combine Buffer
Close Flush
UnitMask events are ADDed. Multiple WCBs can report events at the
same time.
LsInefSwPref
Core::X86::Pmc::Core::LsInefSwPref - Ineffective Software
Prefetches
The number of software prefetches that did not fetch data
outside of the processor core.
This event has the following units which may be used to modify
the behavior of the event:
MabMchCnt
Software PREFETCH instruction saw a match on an
already-allocated miss request buffer.
DataPipeSwPfDcHit
Software PREFETCH instruction saw a DC hit.
LsSwPfDcFills
Core::X86::Pmc::Core::LsSwPfDcFills - Software Prefetch Data
Cache Fills
Software Prefetch Data Cache Fills by Data Source.
This event has the following units which may be used to modify
the behavior of the event:
AlternateMemories_NearFar
Requests that return from Extension Memory.
Dram_IO_Far
Requests that target another NUMA node and return from
DRAM or MMIO from another NUMA node, either from the
same or different socket.
FarCache_NearFar
Requests that return from another CCX cache in a
different NUMA node.
Dram_IO_Near
Requests that target the same NUMA node and return from
either DRAM or MMIO in the same NUMA node.
NearCache_NearFar
Requests that return from another CCX cache in the same
NUMA node.
LocalCcx
Data returned from L3 or different L2 in the same CCX.
LocalL2
Data returned from the local L2.
LsHwPfDcFills
Core::X86::Pmc::Core::LsHwPfDcFills - Hardware Prefetch Data
Cache Fills
Hardware Prefetch Data Cache Fills by Data Source.
This event has the following units which may be used to modify
the behavior of the event:
AlternateMemories_NearFar
Requests that return from Extension Memory.
Dram_IO_Far
Requests that target another NUMA node and return from
DRAM or MMIO from another NUMA node, either from the
same or different socket.
FarCache_NearFar
Requests that return from another CCX cache in a
different NUMA node.
Dram_IO_Near
Requests that target the same NUMA node and return from
either DRAM or MMIO in the same NUMA node.
NearCache_NearFar
Requests that return from another CCX cache in the same
NUMA node.
LocalCcx
Data returned from L3 or different L2 in the same CCX.
LocalL2
Data returned from the local L2.
LsAllocMabCount
Core::X86::Pmc::Core::LsAllocMabCount - Count of Allocated Mabs
This event counts the in-flight L1 data cache misses (allocated
Miss Address Buffers) each cycle.
LsNotHaltedCyc
Core::X86::Pmc::Core::LsNotHaltedCyc - Cycles not in Halt
LsTlbFlush
Core::X86::Pmc::Core::LsTlbFlush - All TLB Flushes
LsNotHaltedP0Cyc
Core::X86::Pmc::Core::LsNotHaltedP0Cyc - P0 Freq Cycles not in
Halt
This event has the following units which may be used to modify
the behavior of the event:
P0FreqCyc
Counts at the P0 frequency (same as
Core::X86::Msr::MPERF) when not in Halt.
IcCacheFillL2
Core::X86::Pmc::Core::IcCacheFillL2 - Instruction Cache Refills
from L2
The number of 64-byte instruction cache lines fulfilled from
the L2 cache.
IcCacheFillSys
Core::X86::Pmc::Core::IcCacheFillSys - Instruction Cache
Refills from System
The number of 64-byte instruction cache lines fulfilled from
system memory or another cache.
BpL1TlbMissL2TlbHit
Core::X86::Pmc::Core::BpL1TlbMissL2TlbHit - L1 ITLB Miss, L2
ITLB Hit
The number of instruction fetches that miss in the L1 ITLB but
hit in the L2 ITLB.
BpL1TlbMissL2TlbMiss
Core::X86::Pmc::Core::BpL1TlbMissL2TlbMiss - ITLB Reload from
Page-Table walk
The number of valid fills into the ITLB originating from the LS
Page-Table Walker. Tablewalk requests are issued for L1-ITLB
and L2-ITLB misses.
This event has the following units which may be used to modify
the behavior of the event:
Coalesced4K
Walk for >4-K Coalesced page.
IF1G Walk for 1-G page.
IF2M Walk for 2-M page.
IF4K Walk for 4-K page.
BpL2BTBCorrect
Core::X86::Pmc::Core::BpL2BTBCorrect - L2 Branch Prediction
Overrides Existing Prediction (speculative)
BpDynIndPred
Core::X86::Pmc::Core::BpDynIndPred - Dynamic Indirect
Predictions
The number of times a branch used the indirect predictor to
make a prediction.
BpDeReDirect
Core::X86::Pmc::Core::BpDeReDirect - Decode Redirects
The number of times the instruction decoder overrides the
predicted target.
BpL1TlbFetchHit
Core::X86::Pmc::Core::BpL1TlbFetchHit - L1 TLB Hits for
Instruction Fetch
The number of instruction fetches that hit in the L1 ITLB.
This event has the following units which may be used to modify
the behavior of the event:
IF1G L1 Instruction TLB hit (1-G page size).
IF2M L1 Instruction TLB hit (2-M page size).
IF4K L1 Instruction TLB hit (4-K or 16-K page size).
ResyncsOrNcRedirects
Core::X86::Pmc::Core::ResyncsOrNcRedirects - Resyncs
Counts the number of HW resyncs (pipeline restarts) or NC
redirects. NC redirects occur when the front-end transitions to
fetching from UC (un-cacheable) memory.
IcTagHitMiss
Core::X86::Pmc::Core::IcTagHitMiss - IC Tag Hit/Miss Events
Counts various IC tag related hit and miss events.
OpCacheHitMiss
Core::X86::Pmc::Core::OpCacheHitMiss - Op Cache Hit/Miss
Counts Op Cache micro-tag hit/miss events.
DeOpQueueEmpty
Core::X86::Pmc::Core::DeOpQueueEmpty - Op Queue Empty
Cycles where the Op Queue is empty.
DeSrcOpDisp
Core::X86::Pmc::Core::DeSrcOpDisp - Source of Op Dispatched
From Decoder
Counts the number of ops dispatched from the decoder classified
by op source.
This event has the following units which may be used to modify
the behavior of the event:
LoopBuffer
Count of ops dispatched from Loop Buffer.
OpCache
Count of ops fetched from Op Cache and dispatched.
Decoder
Count of ops fetched from Instruction Cache and
dispatched.
DeDisOpsFromDecoder
Core::X86::Pmc::Core::DeDisOpsFromDecoder - Types of Ops
Dispatched From Decoder
Counts the number of ops dispatched from the decoder classified
by op type. The UnitMask value encodes which types of ops are
counted.
DeDisDispatchTokenStalls1
Core::X86::Pmc::Core::DeDisDispatchTokenStalls1 - Dispatch
Resource Stall Cycles 1
Cycles where a dispatch group is valid but does not get
dispatched due to a Token Stall. UnitMask bits select the stall
types included in the count.
This event has the following units which may be used to modify
the behavior of the event:
FpFlushRecoveryStall
Counts FP Flush Recovery stall cycles.
FPSchRsrcStall
Counts FP Scheduler token stall cycles.
FpRegFileRsrcStall
Counts FP Register File token stall cycles. This
applies to all ops that have an FP or SIMD destination
register.
TakenBrnchBufferRsrc
Counts Taken Branch Buffer token stall cycles.
StoreQueueRsrcStall
Store Queue resource stall. Counts Store Queue token
stall cycles.
LoadQueueRsrcStall
Load Queue resource stall. Counts Load Queue token
stall cycles.
IntPhyRegFileRsrcStall
Integer Physical Register File resource stall. Counts
Integer PRF token stall cycles. This applies to all ops
that have an integer destination register.
DeDisDispatchTokenStalls2
Core::X86::Pmc::Core::DeDisDispatchTokenStalls2 - Dynamic
Tokens Dispatch Stall Cycles 2
Cycles where a dispatch group is valid but does not get
dispatched due to a token stall. UnitMask bits select the stall
types included in the count.
This event has the following units which may be used to modify
the behavior of the event:
RetireTokenStall
Counts Retire Queue token stall cycles.
IntSch3TokenStall
Counts Integer Scheduler Queue 3 token stall cycles.
IntSch2TokenStall
Counts Integer Scheduler Queue 2 token stall cycles.
IntSch1TokenStall
Counts Integer Scheduler Queue 1 token stall cycles.
IntSch0TokenStall
Counts Integer Scheduler Queue 0 token stall cycles.
DeNoDispatchPerSlot
Core::X86::Pmc::Core::DeNoDispatchPerSlot - Dispatch Stalls Per
Slot
Counts the number of dispatch slots (each cycle) that remained
unused for reasons selected by StallReason.
DeAdditionalResourceStalls
Core::X86::Pmc::Core::DeAdditionalResourceStalls - Dispatch
Additional Resource Stalls
This PMC event counts additional resource stalls that are not
captured by Core::X86::Pmc::Core::DeDisDispatchTokenStalls1 or
Core::X86::Pmc::Core::DeDisDispatchTokenStalls2.
ExRetInstr
Core::X86::Pmc::Core::ExRetInstr - Retired Instructions
The number of instructions retired.
ExRetOps
Core::X86::Pmc::Core::ExRetOps - Retired Ops
The number of macro-ops retired.
ExRetBrn
Core::X86::Pmc::Core::ExRetBrn - Retired Branch Instructions
The number of branch instructions retired. This includes all
types of architectural control flow changes, including
exceptions and interrupts.
ExRetBrnMisp
Core::X86::Pmc::Core::ExRetBrnMisp - Retired Branch
Instructions Mispredicted
The number of retired branch instructions that were
mispredicted.
ExRetBrnTkn
Core::X86::Pmc::Core::ExRetBrnTkn - Retired Taken Branch
Instructions
The number of taken branches that were retired. This includes
all types of architectural control flow changes, including
exceptions and interrupts.
ExRetBrnTknMisp
Core::X86::Pmc::Core::ExRetBrnTknMisp - Retired Taken Branch
Instructions Mispredicted
The number of retired taken branch instructions that were
mispredicted.
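Counts from ExRetInstr, ExRetBrn, and ExRetBrnMisp are commonly combined into ratios. The Python sketch below shows one such combination, assuming the three counts were gathered over the same interval. The function name and the sample numbers are hypothetical, and ExRetBrn includes exceptions and interrupts, so the rate is only an approximation of a pure branch misprediction rate.

```python
def branch_metrics(ret_instr: int, ret_brn: int, ret_brn_misp: int) -> dict:
    """Derive a mispredict rate and mispredicts per 1000 instructions."""
    if ret_instr <= 0 or ret_brn <= 0:
        raise ValueError("instruction and branch counts must be positive")
    return {
        "mispredict_rate": ret_brn_misp / ret_brn,
        "mpki": 1000 * ret_brn_misp / ret_instr,
    }


# Hypothetical counts, not measured data.
metrics = branch_metrics(ret_instr=10_000_000,
                         ret_brn=2_000_000,
                         ret_brn_misp=50_000)
print(metrics)
```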
ExRetBrnFar
Core::X86::Pmc::Core::ExRetBrnFar - Retired Far Control
Transfers
The number of far control transfers retired, including far
call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and
interrupts. Far control transfers are not subject to branch
prediction.
ExRetNearRet
Core::X86::Pmc::Core::ExRetNearRet - Retired Near Returns
The number of near return instructions (RET or RET Iw) retired.
ExRetNearRetMispred
Core::X86::Pmc::Core::ExRetNearRetMispred - Retired Near
Returns Mispredicted
The number of near returns retired that were not correctly
predicted by the return address predictor. Each such mispredict
incurs the same penalty as a mispredicted conditional branch
instruction.
ExRetBrnIndMisp
Core::X86::Pmc::Core::ExRetBrnIndMisp - Retired Indirect Branch
Instructions Mispredicted
The number of indirect branches retired that were not correctly
predicted. Each such mispredict incurs the same penalty as a
mispredicted conditional branch instruction. Note that only EX
mispredicts are counted.
ExRetMmxFpInstr
Core::X86::Pmc::Core::ExRetMmxFpInstr - Retired MMX/FP
Instructions
The number of MMX, SSE or x87 instructions retired. The
UnitMask allows the selection of the individual classes of
instructions listed below. Each increment represents one
complete instruction. Since this event includes non-numeric
instructions, it is not suitable for measuring MFLOPs.
This event has the following units which may be used to modify
the behavior of the event:
SseInstr
SSE instructions (SSE, SSE2, SSE3, SSSE3, SSE4A, SSE41,
SSE42, AVX).
MmxInstr
MMX instructions.
X87Instr
x87 instructions.
ExRetIndBrchInstr
Core::X86::Pmc::Core::ExRetIndBrchInstr - Retired Indirect
Branch Instructions
The number of indirect branches retired.
ExRetCond
Core::X86::Pmc::Core::ExRetCond - Retired Conditional Branch
Instructions
ExDivBusy
Core::X86::Pmc::Core::ExDivBusy - Div Cycles Busy count
ExDivCount
Core::X86::Pmc::Core::ExDivCount - Div Op Count
ExNoRetire
Core::X86::Pmc::Core::ExNoRetire - Cycles With No Retire
This event counts cycles when the hardware thread does not
retire any ops for reasons selected by UnitMask[4:0]. UnitMask
events [4:0] are mutually exclusive. If multiple reasons apply
for a given cycle, the lowest numbered UnitMask event is
counted.
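The tie-breaking rule above (the lowest numbered UnitMask event wins when several reasons apply in the same cycle) can be sketched as a lowest-set-bit selection. This Python fragment illustrates the rule as worded here, not the hardware logic; the function name is hypothetical and the meaning of each bit is not given on this page.

```python
def counted_reason(applicable_mask: int) -> int:
    """Return the UnitMask[4:0] bit index credited for a cycle, or -1."""
    mask = applicable_mask & 0x1F          # only bits [4:0] take part
    if mask == 0:
        return -1                          # no selected reason applies
    lowest = mask & -mask                  # isolate the lowest set bit
    return lowest.bit_length() - 1


# Reasons 2 and 4 both apply in a cycle: only reason 2 is counted.
print(counted_reason(0b10100))
```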
ExRetUcodeInstr
Core::X86::Pmc::Core::ExRetUcodeInstr - Retired Microcoded
Instructions
Retired Microcoded Instructions.
ExRetUcodeOps
Core::X86::Pmc::Core::ExRetUcodeOps - Retired Microcode Ops
The number of microcode ops that have retired.
ExRetMsprdBrnchInstrDirMsmtch
Core::X86::Pmc::Core::ExRetMsprdBrnchInstrDirMsmtch - Retired
Mispredicted Branch Instructions due to Direction Mismatch
The number of retired conditional branch instructions that were
not correctly predicted because of a branch direction mismatch.
ExRetUncondBrnchInstrMispred
Core::X86::Pmc::Core::ExRetUncondBrnchInstrMispred - Retired
Unconditional Indirect Branch Instructions Mispredicted
The number of retired unconditional indirect branch
instructions that were mispredicted.
ExRetUncondBrnchInstr
Core::X86::Pmc::Core::ExRetUncondBrnchInstr - Retired
Unconditional Branch Instructions
The number of retired unconditional branch instructions.
ExTaggedIbsOps
Core::X86::Pmc::Core::ExTaggedIbsOps - Tagged IBS Ops
Counts Op IBS related events.
This event has the following units which may be used to modify
the behavior of the event:
IbsCountRollover
Number of times an op could not be tagged by IBS
because of a previous tagged op that has not retired.
IbsTaggedOpsRet
Number of Ops tagged by IBS that retired.
IbsTaggedOps
Number of Ops tagged by IBS.
ExRetFusedInstr
Core::X86::Pmc::Core::ExRetFusedInstr - Retired Fused
Instructions
Counts retired fused instructions.
L2RequestG1
Core::X86::Pmc::L2::L2RequestG1 - Requests to L2 Group1
All L2 Cache Requests (Breakdown 1 - Common)
This event has the following units which may be used to modify
the behavior of the event:
RdBlkL Data Cache Reads (including hardware and software
prefetch).
RdBlkX Data Cache Stores.
LsRdBlkC_S
Data Cache Shared Reads.
CacheableIcRead
Instruction Cache Reads.
ChangeToX
Data Cache State Change Requests. Request change to
writable, check L2 for current state.
PrefetchL2Cmd
L2HwPf L2 Prefetcher. All prefetches accepted by the L2 pipeline,
hit or miss. Types of PF and L2 hit/miss are broken out in
a separate perfmon event.
Group2. Read-write
MiscRequests. Various non-cacheable requests: non-cached
data reads, non-cached instruction reads, and
self-modifying code checks.
L2CacheReqStat
Core::X86::Pmc::L2::L2CacheReqStat - Core to L2 Cacheable
Request Access Status
L2 Cache Request Outcomes (not including L2 Prefetch).
This event has the following units which may be used to modify
the behavior of the event:
LsRdBlkCS
Data Cache Shared Read Hit in L2.
LsRdBlkLHitX
Data Cache Read Hit in L2.
LsRdBlkLHitS
Data Cache Read Hit Non-Modifiable Line in L2.
LsRdBlkX
Data Cache Store or State Change Hit in L2.
LsRdBlkC
Data Cache Req Miss in L2.
IcFillHitX
Instruction Cache Hit Modifiable Line in L2.
IcFillHitS
Instruction Cache Hit Non-Modifiable Line in L2.
IcFillMiss
Instruction Cache Req Miss in L2.
L2PfHitL2
Core::X86::Pmc::L2::L2PfHitL2 - L2 Prefetch Hit in L2
Counts all L2 prefetches accepted by L2 pipeline which hit in
the L2 cache.
This event has the following units which may be used to modify
the behavior of the event:
L1Region
L1Region
L1Stride
L1Stride
L1Stream
L1Stream
L2Stride
L2Stride
L2Burst
L2Burst
L2Up_Down
L2 Up/Down
L2NextLine
L2NextLine
L2Stream
L2Stream
L2PfMissL2HitL2
Core::X86::Pmc::L2::L2PfMissL2HitL2 - L2 Prefetcher Hits in L3
Counts all L2 prefetches accepted by the L2 pipeline which miss
the L2 cache and hit the L3.
This event has the following units which may be used to modify
the behavior of the event:
L1Region
L1Region
L1Stride
L1Stride
L1Stream
L1Stream
L2Stride
L2Stride
L2Burst
L2Burst
L2Up_Down
L2 Up/Down
L2NextLine
L2NextLine
L2Stream
L2Stream
L2PfMissL2L3
Core::X86::Pmc::L2::L2PfMissL2L3 - L2 Prefetcher Misses in L3
Counts all L2 prefetches accepted by the L2 pipeline which miss
the L2 and the L3 caches.
This event has the following units which may be used to modify
the behavior of the event:
L1Region
L1Region
L1Stride
L1Stride
L1Stream
L1Stream
L2Stride
L2Stride
L2Burst
L2Burst
L2Up_Down
L2 Up/Down
L2NextLine
L2NextLine
L2Stream
L2Stream
SEE ALSO
cpc(3CPC)
illumos March 25, 2019 illumos
NAME
amd_f19h_zen4_events - AMD Family 19h Zen4 processor performance
monitoring events
DESCRIPTION
This manual page describes events specfic to AMD Family 19h Zen4
processors. For more information, please consult the appropriate AMD
BIOS and Kernel Developer's guide or Open-Source Register Reference.
Each of the events listed below includes the AMD mnemonic which matches
the name found in the AMD manual and a brief summary of the event. If
available, a more detailed description of the event follows and then
any additional unit values that modify the event. Each unit can be
combined to create a new event in the system by placing the '.'
character between the event name and the unit name.
The following events are supported:
FpRetx87FpOps
Core::X86::Pmc::Core::FpRetx87FpOps - Retired x87 FP Ops
The number of x87 floating-point Ops that have retired.
This event has the following units which may be used to modify
the behavior of the event:
DivSqrROps
Divide and square root Ops.
MulOps Multiply Ops.
AddSubOps
Add/subtract Ops.
FpRetSseAvxOps
Core::X86::Pmc::Core::FpRetSseAvxOps - Retired SSE/AVX FLOPs
This is a retire-based event. The number of retired SSE/AVX
FLOPs. The number of events logged per cycle can vary from 0 to
64. This event requires the use of the MergeEvent since it can
count above 15 events per cycle. See 2.1.13.3 [Large Increment
per Cycle Events]. It does not provide a useful count without
the use of the MergeEvent.
This event has the following units which may be used to modify
the behavior of the event:
BfloatMacFLOPs
bfloat Multiply-Accumulate FLOPs. Each bfloat MAC
operation is counted as 2 FLOPS.
MacFLOPs
Multiply-Accumulate FLOPs. Each MAC operation is
counted as 2 FLOPS. This event does not include bfloat
MAC operations.
DivFLOPs
Divide/square root FLOPs.
MultFLOPs
Multiply FLOPs.
AddSubFLOPs
Add/subtract FLOPs.
FpRetiredSerOps
Core::X86::Pmc::Core::FpRetiredSerOps - Retired Serializing Ops
The number of serializing Ops retired.
This event has the following units which may be used to modify
the behavior of the event:
SseBotRet
SSE/AVX bottom-executing ops retired.
SseCtrlRet
SSE/AVX control word mispredict traps.
X87BotRet
x87 bottom-executing ops retired.
X87CtrlRet
x87 control word mispredict traps due to mispredictions
in RC or PC, or changes in Exception Mask bits.
FpOpsRetiredByWidth
Core::X86::Pmc::Core::FpOpsRetiredByWidth - Retired FP Ops By
Width
This event has the following units which may be used to modify
the behavior of the event:
Pack512uOpsRetired
Number of packed 512-bit ops retired.
Pack256uOpsRetired
Number of packed 256-bit ops retired.
Pack128uOpsRetired
Number of packed 128-bit ops retired.
ScalaruOpsRetired
Number of scalar ops retired.
MMXuOpsRetired
Number of MMX ops retired.
x87uOpsRetired
Number of x87 ops retired.
FpOpsRetiredByType
Core::X86::Pmc::Core::FpOpsRetiredByType - Retired FP Ops By
Type
Note: Shuffle op counts may count for instructions that are not
necessarily thought of as including shuffles. For example,
Horizontal Add, Dot Product, and certain MOV instructions may
include or use only shuffle type ops.
SseAvxOpsRetired
Core::X86::Pmc::Core::SseAvxOpsRetired - INT Ops Retired
Note: Shuffle op counts may count for instructions that are not
necessarily thought of as including shuffles. For example,
Horizontal Add, Dot Product, and certain MOV instructions may
include or use only shuffle type ops.
FpPackOpsRetired
Core::X86::Pmc::Core::FpPackOpsRetired - Packed FP Ops Retired
Note: Shuffle op counts may count for instructions that are not
necessarily thought of as including shuffles. For example,
Horizontal Add, Dot Product, and certain MOV instructions may
include or use only shuffle type ops.
PackedIntOpType
Core::X86::Pmc::Core::PackedIntOpType - Packed INT Ops Retired
Note: Shuffle op counts may count for instructions that are not
necessarily thought of as including shuffles. For example,
Horizontal Add, Dot Product, and certain MOV instructions may
include or use only shuffle type ops. This event also counts FP
data type packed and scalar MOV and shuffle.
FpDispFaults
Core::X86::Pmc::Core::FpDispFaults - FP Dispatch Faults
Floating-point Dispatch Faults.
This event has the following units which may be used to modify
the behavior of the event:
YmmSpillFault
YMM Spill fault.
YmmFillFault
YMM Fill fault.
XmmFillFault
XMM Fill fault.
x87FillFault
x87 Fill fault.
LsBadStatus2
Core::X86::Pmc::Core::LsBadStatus2 - Bad Status 2
This event has the following units which may be used to modify
the behavior of the event:
StliOther
Store-to-load conflicts: A load was unable to complete
due to a non-forwardable conflict with an older store.
Most commonly, a load's address range partially but not
completely overlaps with an uncompleted older store.
Software can avoid this problem by using same-size and
same-alignment loads and stores when accessing the same
data. Vector/SIMD code is particularly susceptible to
this problem; software should construct wide vector
stores by manipulating vector elements in registers
using shuffle/blend/swap instructions prior to storing
to memory, instead of using narrow element-by-element
stores.
LsLocks
Core::X86::Pmc::Core::LsLocks - Retired Lock Instructions
This event has the following units which may be used to modify
the behavior of the event:
BusLock
Comparable to legacy bus lock.
LsRetClClush
Core::X86::Pmc::Core::LsRetClClush - Retired CLFLUSH
Instructions
The number of retired CLFLUSH instructions. This is a non-
speculative event.
LsRetCpuid
Core::X86::Pmc::Core::LsRetCpuid - Retired CPUID Instructions
The number of CPUID instructions retired.
LsDispatch
Core::X86::Pmc::Core::LsDispatch - LS Dispatch
Counts the number of operations dispatched to the LS unit. Unit
Masks events are ADDed.
LsSmiRx
Core::X86::Pmc::Core::LsSmiRx - SMIs Received
Counts the number of SMIs received.
LsIntTaken
Core::X86::Pmc::Core::LsIntTaken - Interrupts Taken
Counts the number of interrupts taken.
This event has the following units which may be used to modify
the behavior of the event:
IntTaken
Number of Interrupts taken. This event is also counted
when UnitMask[7:0]=0.
LsSTLF Core::X86::Pmc::Core::LsSTLF - Store to Load Forward
Number of STLF hits.
LsStCommitCancel2
Core::X86::Pmc::Core::LsStCommitCancel2 - Store Commit Cancels
2
This event has the following units which may be used to modify
the behavior of the event:
StCommitCancelWcbFull
A non-cacheable store and the non-cacheable commit
buffer is full.
LsMabAlloc
Core::X86::Pmc::Core::LsMabAlloc - LS MAB Allocates by Type
Counts when a LS pipe allocates a MAB entry.
LsDmndFillsFromSys
Core::X86::Pmc::Core::LsDmndFillsFromSys - Demand Data Cache
Fills by Data Source
Demand Data Cache Fills by Data Source.
This event has the following units which may be used to modify
the behavior of the event:
AlternateMemories_NearFar
Requests that return from Extension Memory.
Dram_IO_Far
Requests that target another NUMA node and return from
DRAM or MMIO from another NUMA node, either from the
same or different socket.
FarCache_NearFar
Requests that return from another CCX cache in a
different NUMA node.
Dram_IO_Near
Requests that target the same NUMA node and return from
either DRAM or MMIO in the same NUMA node.
NearCache_NearFar
Requests that return from another CCX cache in the same
NUMA node.
LocalCcx
Data returned from L3 or different L2 in the same CCX.
LocalL2
Data returned from the local L2.
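As noted at the top of this manual page, a unit is combined with an
event by placing the '.' character between the event name and the unit
name. The following Python sketch illustrates that naming rule only.
The helper event_with_unit is hypothetical; it performs string
formatting and does not call into cpc(3CPC).

```python
def event_with_unit(event: str, unit: str) -> str:
    """Join an event name and one unit name with '.', as this page describes."""
    if not event or not unit:
        raise ValueError("both an event name and a unit name are required")
    if "." in event or "." in unit:
        raise ValueError("names must not already contain '.'")
    return f"{event}.{unit}"


# Demand fills served by the local L2, and by DRAM or MMIO on the same node.
local_l2 = event_with_unit("LsDmndFillsFromSys", "LocalL2")
near_dram = event_with_unit("LsDmndFillsFromSys", "Dram_IO_Near")
```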
LsAnyFillsFromSys
Core::X86::Pmc::Core::LsAnyFillsFromSys - Any Data Cache Fills
by Data Source
Any Data Cache Fills by Data Source.
This event has the following units which may be used to modify
the behavior of the event:
AlternateMemories_NearFar
Requests that return from Extension Memory.
Dram_IO_Far
Requests that target another NUMA node and return from
DRAM or MMIO from another NUMA node, either from the
same or different socket.
FarCache_NearFar
Requests that return from another CCX cache in a
different NUMA node.
Dram_IO_Near
Requests that target the same NUMA node and return from
either DRAM or MMIO in the same NUMA node.
ExtCacheLocal
Requests that return from another CCX cache in the same
NUMA node.
LocalCcx
Data returned from L3 or different L2 in the same CCX.
LocalL2
Data returned from the local L2.
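When each data source unit has been counted separately, the share of
fills served by each source can be derived from the raw counts. The
sketch below assumes the units are disjoint (each fill is attributed to
exactly one source), which this page does not state explicitly. The
function name fill_shares and the sample counts are illustrative only.

```python
def fill_shares(counts):
    """Map each unit name to its fraction of the total fill count."""
    total = sum(counts.values())
    if total == 0:
        # No fills observed: report zero for every source.
        return {name: 0.0 for name in counts}
    return {name: n / total for name, n in counts.items()}


# Made-up counts, keyed by unit names from the list above.
sample = {"LocalL2": 3, "LocalCcx": 1}
shares = fill_shares(sample)
```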
LsL1DTlbMiss
Core::X86::Pmc::Core::LsL1DTlbMiss - L1 DTLB Misses
This event has the following units which may be used to modify
the behavior of the event:
TlbReload1GL2Miss
DTLB reload to a 1-G page that also missed in the L2
TLB.
TlbReload2ML2Miss
DTLB reload to a 2-M page that also missed in the L2
TLB.
TlbReloadCoalescedPageMiss
DTLB reload to a coalesced page that also missed in the
L2 TLB.
TlbReload4KL2Miss
DTLB reload to a 4-K page that missed the L2 TLB.
TlbReload1GL2Hit
DTLB reload to a 1-G page that hit in the L2 TLB.
TlbReload2ML2Hit
DTLB reload to a 2-M page that hit in the L2 TLB.
TlbReloadCoalescedPageHit
DTLB reload to a coalesced page that hit in the L2 TLB.
TlbReload4KL2Hit
DTLB reload to a 4-K page that hit in the L2 TLB.
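The units above pair up by page size: one counts reloads that hit in
the L2 TLB and one counts reloads that missed it. A minimal Python
sketch of turning those counts into a per-page-size L2 TLB miss
fraction follows. The function name and the input dictionary are
assumptions made for illustration; only the unit names come from this
page.

```python
MISS_UNITS = {
    "4K": "TlbReload4KL2Miss",
    "Coalesced": "TlbReloadCoalescedPageMiss",
    "2M": "TlbReload2ML2Miss",
    "1G": "TlbReload1GL2Miss",
}
HIT_UNITS = {
    "4K": "TlbReload4KL2Hit",
    "Coalesced": "TlbReloadCoalescedPageHit",
    "2M": "TlbReload2ML2Hit",
    "1G": "TlbReload1GL2Hit",
}


def l2_tlb_miss_fraction(counts):
    """Per page size: share of L1 DTLB misses that also missed the L2 TLB."""
    result = {}
    for size in MISS_UNITS:
        miss = counts.get(MISS_UNITS[size], 0)
        hit = counts.get(HIT_UNITS[size], 0)
        total = hit + miss
        result[size] = miss / total if total else 0.0
    return result
```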
LsMisalLoads
Core::X86::Pmc::Core::LsMisalLoads - Misaligned loads
This event has the following units which may be used to modify
the behavior of the event:
MA4K The number of 4-KB misaligned (i.e., page crossing)
loads.
MA64 The number of 64-B misaligned (i.e., cacheline
crossing) loads.
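The two units are defined by which boundary a load crosses. The Python
sketch below shows only that geometric test for a load of a given
address and size. It is not a model of the hardware: whether a
page-crossing load is also counted under MA64 is not stated on this
page. The function names are hypothetical.

```python
def crosses_boundary(addr: int, size: int, boundary: int) -> bool:
    """True if the bytes [addr, addr + size) span two boundary-sized blocks."""
    if size <= 0:
        raise ValueError("size must be positive")
    first_block = addr // boundary
    last_block = (addr + size - 1) // boundary
    return first_block != last_block


def classify_load(addr: int, size: int) -> dict:
    """Report which boundaries an access crosses, keyed by unit name."""
    return {
        "MA64": crosses_boundary(addr, size, 64),    # cacheline crossing
        "MA4K": crosses_boundary(addr, size, 4096),  # page crossing
    }
```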
LsPrefInstrDisp
Core::X86::Pmc::Core::LsPrefInstrDisp - Prefetch Instructions
Dispatched
Software Prefetch Instructions Dispatched (Speculative).
This event has the following units which may be used to modify
the behavior of the event:
PREFETCHNTA
PrefetchNTA instruction. See docAPM3 PREFETCHlevel.
PREFETCHW
PrefetchW instruction. See docAPM3 PREFETCHW.
PREFETCH
PrefetchT0, T1 and T2 instructions. See docAPM3
PREFETCHlevel.
LsWcbCloseFlush
Core::X86::Pmc::Core::LsWcbCloseFlush - Write Combine Buffer
Close Flush
UnitMask events are ADDed. Multiple WCBs can report events at
the same time.
LsInefSwPref
Core::X86::Pmc::Core::LsInefSwPref - Ineffective Software
Prefetches
The number of software prefetches that did not fetch data
outside of the processor core.
This event has the following units which may be used to modify
the behavior of the event:
MabMchCnt
Software PREFETCH instruction saw a match on an
already-allocated miss request buffer.
DataPipeSwPfDcHit
Software PREFETCH instruction saw a DC hit.
LsSwPfDcFills
Core::X86::Pmc::Core::LsSwPfDcFills - Software Prefetch Data
Cache Fills
Software Prefetch Data Cache Fills by Data Source.
This event has the following units which may be used to modify
the behavior of the event:
AlternateMemories_NearFar
Requests that return from Extension Memory.
Dram_IO_Far
Requests that target another NUMA node and return from
DRAM or MMIO from another NUMA node, either from the
same or different socket.
FarCache_NearFar
Requests that return from another CCX cache in a
different NUMA node.
Dram_IO_Near
Requests that target the same NUMA node and return from
either DRAM or MMIO in the same NUMA node.
NearCache_NearFar
Requests that return from another CCX cache in the same
NUMA node.
LocalCcx
Data returned from L3 or different L2 in the same CCX.
LocalL2
Data returned from the local L2.
LsHwPfDcFills
Core::X86::Pmc::Core::LsHwPfDcFills - Hardware Prefetch Data
Cache Fills
Hardware Prefetch Data Cache Fills by Data Source.
This event has the following units which may be used to modify
the behavior of the event:
AlternateMemories_NearFar
Requests that return from Extension Memory.
Dram_IO_Far
Requests that target another NUMA node and return from
DRAM or MMIO from another NUMA node, either from the
same or different socket.
FarCache_NearFar
Requests that return from another CCX cache in a
different NUMA node.
Dram_IO_Near
Requests that target the same NUMA node and return from
either DRAM or MMIO in the same NUMA node.
NearCache_NearFar
Requests that return from another CCX cache in the same
NUMA node.
LocalCcx
Data returned from L3 or different L2 in the same CCX.
LocalL2
Data returned from the local L2.
LsAllocMabCount
Core::X86::Pmc::Core::LsAllocMabCount - Count of Allocated Mabs
This event counts the in-flight L1 data cache misses (allocated
Miss Address Buffers) each cycle.
LsNotHaltedCyc
Core::X86::Pmc::Core::LsNotHaltedCyc - Cycles not in Halt
LsTlbFlush
Core::X86::Pmc::Core::LsTlbFlush - All TLB Flushes
LsNotHaltedP0Cyc
Core::X86::Pmc::Core::LsNotHaltedP0Cyc - P0 Freq Cycles not in
Halt
This event has the following units which may be used to modify
the behavior of the event:
P0FreqCyc
Counts at the P0 frequency (same as
Core::X86::Msr::MPERF) when not in Halt.
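Because P0FreqCyc advances at the fixed P0 frequency, comparing it with
LsNotHaltedCyc over the same interval gives a rough estimate of the
average clock while not halted. The sketch below assumes that
LsNotHaltedCyc counts at the actual core clock, which this page does
not state, and that the P0 frequency is known from elsewhere. The
function name and parameters are hypothetical.

```python
def effective_mhz(not_halted_cyc: int, p0_freq_cyc: int, p0_mhz: float) -> float:
    """Estimate the average non-halted clock from the two cycle counts.

    not_halted_cyc: LsNotHaltedCyc count (assumed to tick at the core clock)
    p0_freq_cyc:    LsNotHaltedP0Cyc.P0FreqCyc count over the same interval
    p0_mhz:         P0 frequency in MHz, supplied by the caller
    """
    if p0_freq_cyc == 0:
        return 0.0  # the core was halted for the whole interval
    return p0_mhz * not_halted_cyc / p0_freq_cyc
```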
IcCacheFillL2
Core::X86::Pmc::Core::IcCacheFillL2 - Instruction Cache Refills
from L2
The number of 64-byte instruction cache lines fulfilled from
the L2 cache.
IcCacheFillSys
Core::X86::Pmc::Core::IcCacheFillSys - Instruction Cache
Refills from System
The number of 64-byte instruction cache lines fulfilled from
system memory or another cache.
BpL1TlbMissL2TlbHit
Core::X86::Pmc::Core::BpL1TlbMissL2TlbHit - L1 ITLB Miss, L2
ITLB Hit
The number of instruction fetches that miss in the L1 ITLB but
hit in the L2 ITLB.
BpL1TlbMissL2TlbMiss
Core::X86::Pmc::Core::BpL1TlbMissL2TlbMiss - ITLB Reload from
Page-Table walk
The number of valid fills into the ITLB originating from the LS
Page-Table Walker. Tablewalk requests are issued for L1-ITLB
and L2-ITLB misses.
This event has the following units which may be used to modify
the behavior of the event:
Coalesced4K
Walk for >4-K Coalesced page.
IF1G Walk for 1-G page.
IF2M Walk for 2-M page.
IF4K Walk for 4-K page.
BpL2BTBCorrect
Core::X86::Pmc::Core::BpL2BTBCorrect - L2 Branch Prediction
Overrides Existing Prediction (speculative)
BpDynIndPred
Core::X86::Pmc::Core::BpDynIndPred - Dynamic Indirect
Predictions
The number of times a branch used the indirect predictor to
make a prediction.
BpDeReDirect
Core::X86::Pmc::Core::BpDeReDirect - Decode Redirects
The number of times the instruction decoder overrides the
predicted target.
BpL1TlbFetchHit
Core::X86::Pmc::Core::BpL1TlbFetchHit - L1 TLB Hits for
Instruction Fetch
The number of instruction fetches that hit in the L1 ITLB.
This event has the following units which may be used to modify
the behavior of the event:
IF1G L1 Instruction TLB hit (1-G page size).
IF2M L1 Instruction TLB hit (2-M page size).
IF4K L1 Instruction TLB hit (4-K or 16-K page size).
ResyncsOrNcRedirects
Core::X86::Pmc::Core::ResyncsOrNcRedirects - Resyncs
Counts the number of HW resyncs (pipeline restarts) or NC
redirects. NC redirects occur when the front-end transitions to
fetching from UC (un-cacheable) memory.
IcTagHitMiss
Core::X86::Pmc::Core::IcTagHitMiss - IC Tag Hit/Miss Events
Counts various IC tag related hit and miss events.
OpCacheHitMiss
Core::X86::Pmc::Core::OpCacheHitMiss - Op Cache Hit/Miss
Counts Op Cache micro-tag hit/miss events.
DeOpQueueEmpty
Core::X86::Pmc::Core::DeOpQueueEmpty - Op Queue Empty
Cycles where the Op Queue is empty.
DeSrcOpDisp
Core::X86::Pmc::Core::DeSrcOpDisp - Source of Op Dispatched
From Decoder
Counts the number of ops dispatched from the decoder classified
by op source.
This event has the following units which may be used to modify
the behavior of the event:
LoopBuffer
Count of ops dispatched from Loop Buffer.
OpCache
Count of ops fetched from Op Cache and dispatched.
Decoder
Count of ops fetched from Instruction Cache and
dispatched.
DeDisOpsFromDecoder
Core::X86::Pmc::Core::DeDisOpsFromDecoder - Types of Ops
Dispatched From Decoder
Counts the number of ops dispatched from the decoder classified
by op type. The UnitMask value encodes which types of ops are
counted.
DeDisDispatchTokenStalls1
Core::X86::Pmc::Core::DeDisDispatchTokenStalls1 - Dispatch
Resource Stall Cycles 1
Cycles where a dispatch group is valid but does not get
dispatched due to a Token Stall. UnitMask bits select the stall
types included in the count.
This event has the following units which may be used to modify
the behavior of the event:
FpFlushRecoveryStall
Counts FP Flush Recovery stall cycles.
FPSchRsrcStall
Counts FP Scheduler token stall cycles.
FpRegFileRsrcStall
Counts FP Register File token stall cycles. This
applies to all ops that have an FP or SIMD destination
register.
TakenBrnchBufferRsrc
Counts Taken Branch Buffer token stall cycles.
StoreQueueRsrcStall
Store Queue resource stall. Counts Store Queue token
stall cycles.
LoadQueueRsrcStall
Load Queue resource stall. Counts Load Queue token
stall cycles.
IntPhyRegFileRsrcStall
Integer Physical Register File resource stall. Counts
Integer PRF token stall cycles. This applies to all ops
that have an integer destination register.
DeDisDispatchTokenStalls2
Core::X86::Pmc::Core::DeDisDispatchTokenStalls2 - Dynamic
Tokens Dispatch Stall Cycles 2
Cycles where a dispatch group is valid but does not get
dispatched due to a token stall. UnitMask bits select the stall
types included in the count.
This event has the following units which may be used to modify
the behavior of the event:
RetireTokenStall
Counts Retire Queue token stall cycles.
IntSch3TokenStall
Counts Integer Scheduler Queue 3 token stall cycles.
IntSch2TokenStall
Counts Integer Scheduler Queue 2 token stall cycles.
IntSch1TokenStall
Counts Integer Scheduler Queue 1 token stall cycles.
IntSch0TokenStall
Counts Integer Scheduler Queue 0 token stall cycles.
DeNoDispatchPerSlot
Core::X86::Pmc::Core::DeNoDispatchPerSlot - Dispatch Stalls Per
Slot
Counts the number of dispatch slots (each cycle) that remained
unused for reasons selected by StallReason.
DeAdditionalResourceStalls
Core::X86::Pmc::Core::DeAdditionalResourceStalls - Dispatch
Additional Resource Stalls
This PMC event counts additional resource stalls that are not
captured by Core::X86::Pmc::Core::DeDisDispatchTokenStalls1 or
Core::X86::Pmc::Core::DeDisDispatchTokenStalls2.
ExRetInstr
Core::X86::Pmc::Core::ExRetInstr - Retired Instructions
The number of instructions retired.
ExRetOps
Core::X86::Pmc::Core::ExRetOps - Retired Ops
The number of macro-ops retired.
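ExRetInstr and ExRetOps are commonly combined with a cycle count such
as LsNotHaltedCyc to derive instructions per cycle and ops per
instruction. A minimal Python sketch of that arithmetic follows. The
function names are hypothetical, and the counts are assumed to have
been collected over the same interval.

```python
def ipc(ret_instr: int, not_halted_cyc: int) -> float:
    """Retired instructions per non-halted cycle."""
    return ret_instr / not_halted_cyc if not_halted_cyc else 0.0


def ops_per_instr(ret_ops: int, ret_instr: int) -> float:
    """Average number of retired macro-ops per retired instruction."""
    return ret_ops / ret_instr if ret_instr else 0.0
```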
ExRetBrn
Core::X86::Pmc::Core::ExRetBrn - Retired Branch Instructions
The number of branch instructions retired. This includes all
types of architectural control flow changes, including
exceptions and interrupts.
ExRetBrnMisp
Core::X86::Pmc::Core::ExRetBrnMisp - Retired Branch
Instructions Mispredicted
The number of retired branch instructions that were
mispredicted.
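ExRetBrnMisp is usually read relative to ExRetBrn (a misprediction
rate) or to ExRetInstr (mispredictions per thousand instructions). The
Python sketch below shows both calculations. The function names are
hypothetical, and all counts are assumed to cover the same interval.

```python
def mispredict_rate(brn_misp: int, brn: int) -> float:
    """Fraction of retired branches that were mispredicted."""
    return brn_misp / brn if brn else 0.0


def mispredicts_per_kilo_instr(brn_misp: int, ret_instr: int) -> float:
    """Mispredicted branches per 1000 retired instructions."""
    return brn_misp * 1000 / ret_instr if ret_instr else 0.0
```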
ExRetBrnTkn
Core::X86::Pmc::Core::ExRetBrnTkn - Retired Taken Branch
Instructions
The number of taken branches that were retired. This includes
all types of architectural control flow changes, including
exceptions and interrupts.
ExRetBrnTknMisp
Core::X86::Pmc::Core::ExRetBrnTknMisp - Retired Taken Branch
Instructions Mispredicted
The number of retired taken branch instructions that were
mispredicted.
ExRetBrnFar
Core::X86::Pmc::Core::ExRetBrnFar - Retired Far Control
Transfers
The number of far control transfers retired, including far
call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and
interrupts. Far control transfers are not subject to branch
prediction.
ExRetNearRet
Core::X86::Pmc::Core::ExRetNearRet - Retired Near Returns
The number of near return instructions (RET or RET Iw) retired.
ExRetNearRetMispred
Core::X86::Pmc::Core::ExRetNearRetMispred - Retired Near
Returns Mispredicted
The number of near returns retired that were not correctly
predicted by the return address predictor. Each such mispredict
incurs the same penalty as a mispredicted conditional branch
instruction.
ExRetBrnIndMisp
Core::X86::Pmc::Core::ExRetBrnIndMisp - Retired Indirect Branch
Instructions Mispredicted
The number of indirect branches retired that were not correctly
predicted. Each such mispredict incurs the same penalty as a
mispredicted conditional branch instruction. Note that only EX
mispredicts are counted.
ExRetMmxFpInstr
Core::X86::Pmc::Core::ExRetMmxFpInstr - Retired MMX/FP
Instructions
The number of MMX, SSE or x87 instructions retired. The
UnitMask allows the selection of the individual classes of
instructions listed below. Each increment represents one
complete instruction. Since this event includes non-numeric
instructions, it is not suitable for measuring MFLOPs.
This event has the following units which may be used to modify
the behavior of the event:
SseInstr
SSE instructions (SSE, SSE2, SSE3, SSSE3, SSE4A, SSE41,
SSE42, AVX).
MmxInstr
MMX instructions.
X87Instr
x87 instructions.
ExRetIndBrchInstr
Core::X86::Pmc::Core::ExRetIndBrchInstr - Retired Indirect
Branch Instructions
The number of indirect branches retired.
ExRetCond
Core::X86::Pmc::Core::ExRetCond - Retired Conditional Branch
Instructions
ExDivBusy
Core::X86::Pmc::Core::ExDivBusy - Div Cycles Busy count
ExDivCount
Core::X86::Pmc::Core::ExDivCount - Div Op Count
ExNoRetire
Core::X86::Pmc::Core::ExNoRetire - Cycles With No Retire
This event counts cycles when the hardware thread does not
retire any ops for reasons selected by UnitMask[4:0]. UnitMask
events [4:0] are mutually exclusive. If multiple reasons apply
for a given cycle, the lowest numbered UnitMask event is
counted.
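The attribution rule above can be illustrated with a short Python
sketch. It reads "lowest numbered" as the lowest set bit position
within UnitMask[4:0], which is an interpretation rather than something
this page spells out. The function name is hypothetical.

```python
def counted_reason(active_mask: int) -> int:
    """Return the UnitMask bit index a cycle is attributed to, or -1 if none.

    active_mask has one bit set for every no-retire reason that applies
    in the cycle; only bits [4:0] participate.
    """
    active_mask &= 0x1F
    if active_mask == 0:
        return -1
    lowest = active_mask & -active_mask  # isolate the lowest set bit
    return lowest.bit_length() - 1
```

For example, a cycle in which the reasons at bits 2 and 4 both apply
would be attributed to bit 2 only, so the per-reason counts can be
summed without counting any cycle twice.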
ExRetUcodeInstr
Core::X86::Pmc::Core::ExRetUcodeInstr - Retired Microcoded
Instructions
Retired Microcoded Instructions.
ExRetUcodeOps
Core::X86::Pmc::Core::ExRetUcodeOps - Retired Microcode Ops
The number of microcode ops that have retired.
ExRetMsprdBrnchInstrDirMsmtch
Core::X86::Pmc::Core::ExRetMsprdBrnchInstrDirMsmtch - Retired
Mispredicted Branch Instructions due to Direction Mismatch
The number of retired conditional branch instructions that were
not correctly predicted because of a branch direction mismatch.
ExRetUncondBrnchInstrMispred
Core::X86::Pmc::Core::ExRetUncondBrnchInstrMispred - Retired
Unconditional Indirect Branch Instructions Mispredicted
The number of retired unconditional indirect branch
instructions that were mispredicted.
ExRetUncondBrnchInstr
Core::X86::Pmc::Core::ExRetUncondBrnchInstr - Retired
Unconditional Branch Instructions
The number of retired unconditional branch instructions.
ExTaggedIbsOps
Core::X86::Pmc::Core::ExTaggedIbsOps - Tagged IBS Ops
Counts Op IBS related events.
This event has the following units which may be used to modify
the behavior of the event:
IbsCountRollover
Number of times an op could not be tagged by IBS
because of a previous tagged op that has not retired.
IbsTaggedOpsRet
Number of Ops tagged by IBS that retired.
IbsTaggedOps
Number of Ops tagged by IBS.
ExRetFusedInstr
Core::X86::Pmc::Core::ExRetFusedInstr - Retired Fused
Instructions
Counts retired fused instructions.
L2RequestG1
Core::X86::Pmc::L2::L2RequestG1 - Requests to L2 Group1
All L2 Cache Requests (Breakdown 1 - Common)
This event has the following units which may be used to modify
the behavior of the event:
RdBlkL Data Cache Reads (including hardware and software
prefetch).
RdBlkX Data Cache Stores.
LsRdBlkC_S
Data Cache Shared Reads.
CacheableIcRead
Instruction Cache Reads.
ChangeToX
Data Cache State Change Requests. Request change to
writable, check L2 for current state.
PrefetchL2Cmd
L2HwPf L2 Prefetcher. All prefetches accepted by the L2 pipeline,
hit or miss. Types of PF and L2 hit/miss are broken out in
a separate perfmon event.
Group2 MiscRequests. Various noncacheable requests: non-cached
data reads, non-cached instruction reads, self-modifying
code checks.
L2CacheReqStat
Core::X86::Pmc::L2::L2CacheReqStat - Core to L2 Cacheable
Request Access Status
L2 Cache Request Outcomes (not including L2 Prefetch).
This event has the following units which may be used to modify
the behavior of the event:
LsRdBlkCS
Data Cache Shared Read Hit in L2.
LsRdBlkLHitX
Data Cache Read Hit in L2.
LsRdBlkLHitS
Data Cache Read Hit Non-Modifiable Line in L2.
LsRdBlkX
Data Cache Store or State Change Hit in L2.
LsRdBlkC
Data Cache Req Miss in L2.
IcFillHitX
Instruction Cache Hit Modifiable Line in L2.
IcFillHitS
Instruction Cache Hit Non-Modifiable Line in L2.
IcFillMiss
Instruction Cache Req Miss in L2.
L2PfHitL2
Core::X86::Pmc::L2::L2PfHitL2 - L2 Prefetch Hit in L2
Counts all L2 prefetches accepted by the L2 pipeline which hit
in the L2 cache.
This event has the following units which may be used to modify
the behavior of the event:
L1Region
L1Region
L1Stride
L1Stride
L1Stream
L1Stream
L2Stride
L2Stride
L2Burst
L2Burst
L2Up_Down
L2 Up/Down
L2NextLine
L2NextLine
L2Stream
L2Stream
L2PfMissL2HitL2
Core::X86::Pmc::L2::L2PfMissL2HitL2 - L2 Prefetcher Hits in L3
Counts all L2 prefetches accepted by the L2 pipeline which miss
the L2 cache and hit the L3.
This event has the following units which may be used to modify
the behavior of the event:
L1Region
L1Region
L1Stride
L1Stride
L1Stream
L1Stream
L2Stride
L2Stride
L2Burst
L2Burst
L2Up_Down
L2 Up/Down
L2NextLine
L2NextLine
L2Stream
L2Stream
L2PfMissL2L3
Core::X86::Pmc::L2::L2PfMissL2L3 - L2 Prefetcher Misses in L3
Counts all L2 prefetches accepted by the L2 pipeline which miss
the L2 and the L3 caches.
This event has the following units which may be used to modify
the behavior of the event:
L1Region
L1Region
L1Stride
L1Stride
L1Stream
L1Stream
L2Stride
L2Stride
L2Burst
L2Burst
L2Up_Down
L2 Up/Down
L2NextLine
L2NextLine
L2Stream
L2Stream
SEE ALSO
cpc(3CPC)
illumos March 25, 2019 illumos