AMD_F1AH_ZEN5_EVENTS(3CPC) CPU Performance Counters Library Functions
NAME
amd_f1ah_zen5_events - AMD Family 1ah Zen5 processor performance
monitoring events
DESCRIPTION
This manual page describes events specific to AMD Family 1ah Zen5
processors. For more information, please consult the appropriate AMD
BIOS and Kernel Developer's guide or Open-Source Register Reference.
Each of the events listed below includes the AMD mnemonic which matches
the name found in the AMD manual and a brief summary of the event. If
available, a more detailed description of the event follows and then
any additional unit values that modify the event. Each unit can be
combined to create a new event in the system by placing the '.'
character between the event name and the unit name.
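The naming rule above can be illustrated with a minimal sketch. The helper name event_spec is hypothetical and is not part of any library; it only shows how an event name and one unit name are joined with the '.' character.

```python
def event_spec(event, unit=None):
    """Join an event name and an optional unit name with '.'.

    Hypothetical helper for illustration only; it does not validate
    that the unit actually belongs to the event.
    """
    if unit is None:
        return event
    return f"{event}.{unit}"


# An event on its own, and the same event modified by one of its units.
print(event_spec("FP_Dispatch_Faults"))
print(event_spec("FP_Dispatch_Faults", "YmmSpillFault"))
```

The resulting string is the form that the description above calls a new event in the system.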
The following events are supported:
Retired_x87_FP_Ops Core::X86::Pmc::Core::Retired_x87_FP_Ops - FP retired x87 uops
Number of retired x87 arithmetic operations. Can be used to
calculate x87 FLOPs.
This event has the following units which may be used to modify
the behavior of the event:
DivSqrROps x87 Divide or square root uops.
MulOps x87 Multiply uops.
AddSubOps x87 Add/subtract uops.
Retired_SSE_AVX_FLOPs Core::X86::Pmc::Core::Retired_SSE_AVX_FLOPs - FP retired SSE
and AVX FLOPs
Number of SSE and AVX floating point arithmetic operations
retired. Number of arithmetic operations retired is dependent
on number of uops retired, data size (scalar/128/256/512), data
type (BF16/FP16/FP32/FP64) and type of operation
(add/sub/mul/mac/...). Use MergeEvent feature for accurate
results.
Retired_FP_uOps Core::X86::Pmc::Core::Retired_FP_uOps - FP uops retired by size
Report number of FP uops retired by size. Can be used to
determine how vectorized code is and how much MMX / x87 content
is in the code.
This event has the following units which may be used to modify
the behavior of the event:
Pack512uOpsRetired Packed 512-bit uops retired.
Pack256uOpsRetired Packed 256-bit uops retired.
Pack128uOpsRetired Packed 128-bit uops retired.
ScalaruOpsRetired Scalar uops retired.
MMXuOpsRetired MMX uops retired.
x87uOpsRetired x87 uops retired.
FP_Ops_Retired Core::X86::Pmc::Core::FP_Ops_Retired - FP uops retired sorted
by vector or scalar
Number of FP uops retired of selected type sorted by vector
(AVX/SSE packed) or scalar (x87, AVX/SSE scalar). Can be used
to profile FP codes.
INT_Ops_Retired Core::X86::Pmc::Core::INT_Ops_Retired - FP executed integer
type uops sorted by vector or scalar
Number of integer uops executed in the FP unit and retired, of the
selected type, sorted by vector (SSE/AVX) or scalar (MMX). Can be
used to profile vector INT / MMX codes.
Packed_FP_Ops_Retired Core::X86::Pmc::Core::Packed_FP_Ops_Retired - FP uops retired
sorted by packed 128 or packed 256
Number of FP uops retired of selected type sorted by 128-bit
packed dest (XMM) or 256-bit packed dest (YMM). Can be used to
profile FP codes.
Packed_INT_Ops_Retired Core::X86::Pmc::Core::Packed_INT_Ops_Retired - FP executed
packed integer uops sorted by packed 128 or packed 256
Number of integer uops executed in the FP unit and retired, of the
selected type, sorted by 128-bit packed dest (XMM) or 256-bit packed
dest (YMM). Can be used to profile FP codes.
FP_Dispatch_Faults Core::X86::Pmc::Core::FP_Dispatch_Faults - FP Dispatch Faults
Number of FP dispatch faults triggered by type. Dispatch
fill/spill faults occur when FP either does not have the data
needed to operate on in its local registers (fill), or FP needs
to empty out upper register data for proper SSE merging
behavior when executing AVX code (spill).
This event has the following units which may be used to modify
the behavior of the event:
YmmSpillFault YMM spill fault
YmmFillFault YMM fill fault
XmmFillFault XMM fill fault
x87FillFault x87 fill fault
Bad_Status_2_STLI Core::X86::Pmc::Core::Bad_Status_2_STLI - Bad Status 2
Store To Load Interlock (STLI) events are loads that were unable to
complete because of a possible match with an older store, and
the older store could not do Store To Load Forwarding (STLF)
for some reason.
This event has the following units which may be used to modify
the behavior of the event:
StliOther Store-to-load conflicts: A load was unable to complete
due to a non-forwardable conflict with an older store.
Most commonly, a load's address range partially but not
completely overlaps with an uncompleted older store.
Software can avoid this problem by using same-size and
same-alignment loads and stores when accessing the same
data. Vector/SIMD code is particularly susceptible to
this problem; software should construct wide vector
stores by manipulating vector elements in registers
using shuffle/blend/swap instructions prior to storing
to memory, instead of using narrow element-by-element
stores.
Retired_Lock_Instructions Core::X86::Pmc::Core::Retired_Lock_Instructions - Retired Lock
Instructions
Counts retired atomic read-modify-write instructions with a
LOCK prefix.
CLFLUSH Core::X86::Pmc::Core::CLFLUSH - Retired CLFLUSH Instructions
The number of retired CLFLUSH instructions. This is a non-
speculative event.
CPUID Core::X86::Pmc::Core::CPUID - Retired CPUID Instructions
The number of CPUID instructions retired.
LS_Dispatch Core::X86::Pmc::Core::LS_Dispatch - LS Dispatch
Counts the number of operations dispatched to the LS unit. Counts
for the selected unit masks are added together.
This event has the following units which may be used to modify
the behavior of the event:
LdOpSt Dispatch of a single op that performs a load from and
store to the same memory address.
PureSt Dispatch of a single op that performs a memory store.
PureLd Dispatch of a single op that performs a memory load.
SMI_or_SMM_cycles Core::X86::Pmc::Core::SMI_or_SMM_cycles - SMIs Received
Counts the number of System Management Interrupts (SMIs)
received.
Interrupts_Taken Core::X86::Pmc::Core::Interrupts_Taken - Interrupts Taken
Counts the number of interrupts taken.
This event has the following units which may be used to modify
the behavior of the event:
NumInterrupts Number of interrupts taken. This event is also counted
when UnitMask[7:0]=0.
Store_to_Load_Forward Core::X86::Pmc::Core::Store_to_Load_Forward - Store to Load
Forward
Number of STLF hits.
Store_Globally_Visible_Cancels_2 Core::X86::Pmc::Core::Store_Globally_Visible_Cancels_2 - Store
Globally Visible Cancels 2
Counts reasons why a Store Coalescing Buffer (SCB) commit is
canceled.
This event has the following units which may be used to modify
the behavior of the event:
OlderStVisibleDepCancel Older SCB we are waiting on to become globally visible
was unable to become globally visible.
LS_MAB_Allocates_by_Type Core::X86::Pmc::Core::LS_MAB_Allocates_by_Type - LS MAB
Allocates by Type
Counts when an LS pipe allocates a Miss Address Buffer (MAB)
entry to make a miss request.
Demand_DC_Fills_by_Data_Source Core::X86::Pmc::Core::Demand_DC_Fills_by_Data_Source - Demand
Data Cache Fills by Data Source
Counts fills into the DC that were initiated by demand ops, per
data source.
This event has the following units which may be used to modify
the behavior of the event:
AlternateMemories_NearFar Requests that return from Extension Memory.
DramIO_Far Requests that target another NUMA node and return from
DRAM or MMIO.
NearFarCache_Far Requests that target another NUMA node and return from
another CCX's cache.
DramIO_Near Requests that target the same NUMA node and return from
DRAM or MMIO.
NearFarCache_Near Requests that target the same NUMA node and return from
another CCX's cache.
LocalCcx Data returned from L3 or different L2 in the same CCX.
LocalL2 Data returned from local L2.
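As a rough illustration of how the per-source counts might be used, the sketch below turns one count per unit into a share of all demand fills. The function name fill_shares and the sample numbers are hypothetical, and the sketch assumes each unit was measured separately over the same interval and that each fill is attributed to a single source.

```python
def fill_shares(counts):
    """Return each data source's share of the total fill count.

    counts maps a unit name (for example "LocalL2") to the count
    observed for that unit. Hypothetical helper for illustration.
    """
    total = sum(counts.values())
    if total == 0:
        # No fills observed; report a zero share for every source.
        return {name: 0.0 for name in counts}
    return {name: n / total for name, n in counts.items()}


# Made-up sample counts, not measurements.
sample = {"LocalL2": 75, "LocalCcx": 20, "DramIO_Near": 5}
for name, share in fill_shares(sample).items():
    print(f"{name}: {share:.2%}")
```

A high share for DramIO_Far or NearFarCache_Far would suggest that demand loads often cross NUMA nodes.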
Any_DC_Fills_by_Data_Source Core::X86::Pmc::Core::Any_DC_Fills_by_Data_Source - Any Data
Cache Fills by Data Source
Counts all fills into the DC, per data source.
This event has the following units which may be used to modify
the behavior of the event:
AlternateMemories_NearFar Requests that return from Extension Memory.
DramIO_Far Requests that target another NUMA node and return from
DRAM or MMIO.
NearFarCache_Far Requests that target another NUMA node and return from
another CCX's cache.
DramIO_Near Requests that target the same NUMA node and return from
DRAM or MMIO.
NearFarCache_Near Requests that target the same NUMA node and return from
another CCX's cache.
LocalCcx Data returned from L3 or different L2 in the same CCX.
LocalL2 Data returned from local L2.
L1_DTLB_Reloads Core::X86::Pmc::Core::L1_DTLB_Reloads - L1 DTLB Reloads
Counts L1DTLB reloads.
This event has the following units which may be used to modify
the behavior of the event:
TlbReload1GL2Miss DTLB reload to a 1G page that missed in the L2DTLB.
TlbReload2ML2Miss DTLB reload to a 2M page that missed in the L2DTLB.
TlbReloadCoalescedPageMiss DTLB reload to a coalesced page that missed in the
L2DTLB.
TlbReload4KL2Miss DTLB reload to a 4K page that missed in the L2DTLB.
TlbReload1GL2Hit DTLB reload to a 1G page that hit in the L2DTLB.
TlbReload2ML2Hit DTLB reload to a 2M page that hit in the L2DTLB.
TlbReloadCoalescedPageHit DTLB reload to a coalesced page that hit in the L2DTLB.
TlbReload4KL2Hit DTLB reload to a 4K page that hit in the L2DTLB.
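One possible use of these units is to estimate what fraction of L1 DTLB reloads also missed in the L2 DTLB. The sketch below is an assumption about how the counts could be combined, not a formula taken from the AMD manual; the function name is hypothetical and the unit names are the ones listed above.

```python
MISS_UNITS = (
    "TlbReload1GL2Miss",
    "TlbReload2ML2Miss",
    "TlbReloadCoalescedPageMiss",
    "TlbReload4KL2Miss",
)
HIT_UNITS = (
    "TlbReload1GL2Hit",
    "TlbReload2ML2Hit",
    "TlbReloadCoalescedPageHit",
    "TlbReload4KL2Hit",
)


def l2_dtlb_miss_ratio(counts):
    """Fraction of L1 DTLB reloads that missed in the L2 DTLB.

    counts maps unit names to observed counts; units that were not
    measured are treated as zero. Hypothetical helper.
    """
    misses = sum(counts.get(u, 0) for u in MISS_UNITS)
    hits = sum(counts.get(u, 0) for u in HIT_UNITS)
    total = hits + misses
    return misses / total if total else 0.0


# Made-up sample counts, not measurements.
print(l2_dtlb_miss_ratio({"TlbReload4KL2Hit": 90, "TlbReload4KL2Miss": 10}))
```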
Misaligned_Load_Flows Core::X86::Pmc::Core::Misaligned_Load_Flows - Misaligned Load
Flows
The number of misaligned load flows.
This event has the following units which may be used to modify
the behavior of the event:
MA4K The number of 4KB misaligned (i.e., page crossing)
loads or LdOpSt.
MA64 The number of 64B misaligned (i.e., cacheline crossing)
loads or LdOpSt.
Software_Prefetch_Dispatched Core::X86::Pmc::Core::Software_Prefetch_Dispatched - Prefetch
Instructions Dispatched
Software Prefetch Instructions Dispatched (speculative)
This event has the following units which may be used to modify
the behavior of the event:
PREFETCHNTA PrefetchNTA instruction. See docAPM3 PREFETCHlevel.
PREFETCHW PrefetchW instruction. See docAPM3 PREFETCHlevel.
PREFETCH PrefetchT0, T1, and T2 instructions. See docAPM3
PREFETCHlevel.
WCB_Close Core::X86::Pmc::Core::WCB_Close - Write Combining Buffer Close
Counts events that cause a Write Combining Buffer (WCB) entry
to close.
This event has the following units which may be used to modify
the behavior of the event:
FullLine64B All 64 bytes of the WCB entry have been written.
Ineffective_Software_Prefetches Core::X86::Pmc::Core::Ineffective_Software_Prefetches - Ineffective Software Prefetches
The number of software prefetches that did not fetch data
outside of the processor core.
This event has the following units which may be used to modify
the behavior of the event:
MabHit Software PREFETCH instruction saw a match on an
already-allocated miss request.
DcHit Software PREFETCH instruction saw a DC hit.
Software_Prefetch_Data_Cache_Fills Core::X86::Pmc::Core::Software_Prefetch_Data_Cache_Fills - Software Prefetch Data Cache Fills by Data Source
Counts fills into the DC that were initiated by software
prefetch instructions, per data source.
This event has the following units which may be used to modify
the behavior of the event:
AlternateMemories_NearFar Requests that return from Extension Memory.
DramIO_Far Requests that target another NUMA node and return from
DRAM or MMIO.
NearFarCache_Far Requests that target another NUMA node and return from
another CCX's cache.
DramIO_Near Requests that target the same NUMA node and return from
DRAM or MMIO.
NearFarCache_Near Requests that target the same NUMA node and return from
another CCX's cache.
LocalCcx Data returned from L3 or different L2 in the same CCX.
LocalL2 Data returned from local L2.
Hardware_Prefetch_Data_Cache_Fills Core::X86::Pmc::Core::Hardware_Prefetch_Data_Cache_Fills - Hardware Prefetch Data Cache Fills by Data Source
Counts fills into the DC that were initiated by hardware
prefetches, per data source.
This event has the following units which may be used to modify
the behavior of the event:
AlternateMemories_NearFar Requests that return from Extension Memory.
DramIO_Far Requests that target another NUMA node and return from
DRAM or MMIO.
NearFarCache_Far Requests that target another NUMA node and return from
another CCX's cache.
DramIO_Near Requests that target the same NUMA node and return from
DRAM or MMIO.
NearFarCache_Near Requests that target the same NUMA node and return from
another CCX's cache.
LocalCcx Data returned from L3 or different L2 in the same CCX.
LocalL2 Data returned from local L2.
Allocated_DC_misses Core::X86::Pmc::Core::Allocated_DC_misses - Allocated DC misses
Counts the number of in-flight DC misses each cycle.
Cycles_Not_in_Halt Core::X86::Pmc::Core::Cycles_Not_in_Halt - Cycles Not in Halt
Counts cycles when the thread is not in a HALTed state.
TLB_Flush_Events Core::X86::Pmc::Core::TLB_Flush_Events - All TLB Flushes
TLB flush events.
P0_frequency_Cycles_Not_in_Halt Core::X86::Pmc::Core::P0_frequency_Cycles_Not_in_Halt - P0 Freq
Cycles not in Halt
Counts cycles not in Halt, at the P0 P-state frequency,
regardless of the current P-state.
This event has the following units which may be used to modify
the behavior of the event:
P0_frequency_Cycles_Not_in_Halt Counts at the P0 frequency (same as
Core::X86::Msr::MPERF) when not in Halt.
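Because this event advances at the fixed P0 frequency, it can serve as a time base for the unhalted portion of a run. The sketch below assumes, by analogy with the usual APERF/MPERF ratio, that Cycles_Not_in_Halt advances at the actual core clock; that assumption and the function name are not taken from this page, and the nominal P0 frequency must be supplied by the caller.

```python
def effective_frequency_mhz(cycles_not_in_halt, p0_cycles_not_in_halt, p0_mhz):
    """Estimate the average unhalted clock frequency in MHz.

    Assumes Cycles_Not_in_Halt counts actual core clocks while
    P0_frequency_Cycles_Not_in_Halt counts at the nominal P0
    frequency (p0_mhz). Hypothetical helper for illustration.
    """
    if p0_cycles_not_in_halt == 0:
        return 0.0
    return p0_mhz * cycles_not_in_halt / p0_cycles_not_in_halt


# Made-up sample values: 1.5x as many core clocks as P0 clocks.
print(effective_frequency_mhz(6_000, 4_000, 3000.0))
```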
Instruction_Cache_Refills_from_L2 Core::X86::Pmc::Core::Instruction_Cache_Refills_from_L2 - Instruction Cache Refills From L2
The number of 64 byte instruction cache lines fulfilled from
the L2 cache.
Instruction_Cache_Refills_from_System Core::X86::Pmc::Core::Instruction_Cache_Refills_from_System - Instruction Cache Refills from System
The number of 64 byte instruction cache lines fulfilled from
system memory or another cache.
L1_ITLB_Miss_L2_ITLB_Hit Core::X86::Pmc::Core::L1_ITLB_Miss_L2_ITLB_Hit - L1 ITLB Miss,
L2 ITLB Hit
The number of instruction fetches that miss in the L1 ITLB but
hit in the L2 ITLB.
ITLB_Reload_from_Page_Table_walk Core::X86::Pmc::Core::ITLB_Reload_from_Page_Table_walk - L1
ITLB Miss, L2 ITLB Miss
The number of instruction fetches that miss in both the L1 ITLB
and L2 ITLB.
This event has the following units which may be used to modify
the behavior of the event:
Coalesced_4k Walk for >4k Coalesced page (implemented as 16k)
walk_1G Walk for 1G page
walk_2M Walk for 2M page
walk_4K Walk to 4k page
BP_Correct Core::X86::Pmc::Core::BP_Correct - BP Pipe Correction or Cancel
The Branch Predictor flushed its own pipeline due to internal
conditions such as a second level prediction structure. Does
not count the number of bubbles caused by these internal
flushes.
Variable_Target_Predictions Core::X86::Pmc::Core::Variable_Target_Predictions - Variable
Target Predictions
The number of times a branch used the indirect predictor to
make a prediction.
Decoder_Overrides_Existing_Branch_Prediction_Speculative Core::X86::Pmc::Core::Decoder_Overrides_Existing_Branch_Prediction_Speculative - Early Redirects
Number of times that an Early Redirect is sent to Branch
Predictor. This happens when either the decoder or dispatch
logic is able to detect that the Branch Predictor needs to be
redirected.
ITLB_Hits Core::X86::Pmc::Core::ITLB_Hits - ITLB Instruction Fetch Hits
The number of instruction fetches that hit in the L1ITLB.
This event has the following units which may be used to modify
the behavior of the event:
IF1G L1 Instruction TLB Hit (1G page size)
IF2M L1 Instruction TLB Hit (2M page size)
IF4K L1 Instruction TLB Hit (4k or 16k coalesced page size)
BP_redirects Core::X86::Pmc::Core::BP_redirects - BP Redirects
Counts redirects of the branch predictor. To support legacy
software, counts both EX mispredict and resyncs when
unit_mask[7:0] is set to 0.
This event has the following units which may be used to modify
the behavior of the event:
ExRedir Mispredict redirect from EX (execution-time)
Resync Resync redirect (Retire-time) from RT
Fetch_IBS_events Core::X86::Pmc::Core::Fetch_IBS_events - Fetch IBS events
Counts significant Fetch IBS State transitions.
This event has the following units which may be used to modify
the behavior of the event:
SampleVal Counts the number of valid Fetch Instruction Based
Sampling (fetch IBS) samples that were collected. Each
valid sample also created an IBS interrupt.
SampleFiltered Counts the number of Fetch IBS tagged fetches that were
discarded due to IBS filtering. When a tagged fetch is
discarded the Fetch IBS facility will automatically tag
a new fetch.
SampleDiscarded Counts when the Fetch IBS facility discards an IBS
tagged fetch for reasons other than IBS filtering. When
a tagged fetch is discarded the Fetch IBS facility will
automatically tag a new fetch.
FetchTagged Counts the number of fetches tagged for Fetch IBS. Not
all tagged fetches create an IBS interrupt and valid
fetch sample.
IC_Tag_Hit_Miss_events Core::X86::Pmc::Core::IC_Tag_Hit_Miss_events - IC Tag Hit and
Miss Events
Counts the number of microtag and full tag events as selected
by unit mask.
Op_Cache_hit_miss Core::X86::Pmc::Core::Op_Cache_hit_miss - Op Cache Hit or Miss
Counts Op Cache micro-tag hit/miss events.
Dispatch_Empty Core::X86::Pmc::Core::Dispatch_Empty - Op Queue Empty
Cycles where the Op Queue is empty.
Source_of_Op_Dispatched_From_Decoder Core::X86::Pmc::Core::Source_of_Op_Dispatched_From_Decoder - Source of Op Dispatched From Decoder
Counts the number of ops dispatched from the decoder classified
by op source.
This event has the following units which may be used to modify
the behavior of the event:
Op_Cache Count of ops dispatched from OpCache
x86_decoder Count of ops dispatched from x86 decoder
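The two units can be combined into an op cache coverage figure. The following is a minimal sketch under the assumption that the two units together account for all ops dispatched from the decoder; the function name and the sample counts are hypothetical.

```python
def op_cache_share(op_cache_ops, x86_decoder_ops):
    """Fraction of dispatched ops that came from the op cache.

    Arguments are the counts observed for the Op_Cache and
    x86_decoder units. Hypothetical helper for illustration.
    """
    total = op_cache_ops + x86_decoder_ops
    return op_cache_ops / total if total else 0.0


# Made-up sample counts, not measurements.
print(op_cache_share(300, 100))
```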
Types_of_Ops_Dispatched_From_Decoder Core::X86::Pmc::Core::Types_of_Ops_Dispatched_From_Decoder - Types of Ops Dispatched From Decoder
Counts the number of ops dispatched from the decoder classified
by op type. The UnitMask value encodes which types of ops are
counted.
Dispatch_Stall_Cycles_Dynamic_Tokens_Part_1 Core::X86::Pmc::Core::Dispatch_Stall_Cycles_Dynamic_Tokens_Part_1 - Dynamic Tokens Dispatch Stall Cycles 1
Cycles where a dispatch group is valid but does not get
dispatched due to a Token Stall. UnitMask bits select the stall
types included in the count.
This event has the following units which may be used to modify
the behavior of the event:
FPSchRsrcStall FP NSQ token stall
TakenBrnchBufferRsrc Taken branch buffer resource stall.
StoreQueueRsrcStall STQ Tokens unavailable
LoadQueueRsrcStall Load Queue Token Stall.
IntPhyRegFileRsrcStall Integer Physical Register File resource stall.
Dispatch_Stall_Cycles_Dynamic_Tokens_Part_2 Core::X86::Pmc::Core::Dispatch_Stall_Cycles_Dynamic_Tokens_Part_2 - Dynamic Tokens Dispatch Stall Cycles 2
Cycles where a dispatch group is valid but does not get
dispatched due to a token stall. UnitMask bits select the stall
types included in the count.
This event has the following units which may be used to modify
the behavior of the event:
RetQ Retire queue tokens unavailable
EX_Flush_recovery Integer Execution flush recovery pending
AGTokens Agen tokens unavailable
ALTokens ALU tokens unavailable
No_Dispatch_per_Slot Core::X86::Pmc::Core::No_Dispatch_per_Slot - No_Dispatch_per_Slot
Counts the number of dispatch slots (each cycle) that remained
unused for reasons selected by UnitMask.
Additional_Resource_Stalls Core::X86::Pmc::Core::Additional_Resource_Stalls - Dispatch
Additional Resource Stalls
This PMC event counts additional resource stalls that are not
captured by Dispatch_Stall_Cycles_Dynamic_Tokens_Part_1 or
Dispatch_Stall_Cycles_Dynamic_Tokens_Part_2.
Retired_Instructions Core::X86::Pmc::Core::Retired_Instructions - Retired
Instructions
The number of instructions retired.
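A common derived metric is instructions per cycle. The sketch below divides a Retired_Instructions count by a Cycles_Not_in_Halt count taken over the same interval; pairing these two events is a conventional choice rather than something this page prescribes, and the function name is hypothetical.

```python
def instructions_per_cycle(retired_instructions, cycles_not_in_halt):
    """Return retired instructions per unhalted cycle.

    Both counts are assumed to cover the same thread and interval.
    Hypothetical helper for illustration.
    """
    if cycles_not_in_halt == 0:
        return 0.0
    return retired_instructions / cycles_not_in_halt


# Made-up sample counts, not measurements.
print(instructions_per_cycle(2_000_000, 1_000_000))
```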
Retired_Macro_Ops Core::X86::Pmc::Core::Retired_Macro_Ops - Retired Macro-Ops
The number of macro-ops retired.
Retired_Branch_Instructions Core::X86::Pmc::Core::Retired_Branch_Instructions - Retired
Branch Instructions
The number of branch instructions retired. This includes all
types of architectural control flow changes, including
exceptions and interrupts.
Retired_Branch_Instructions_Mispredicted Core::X86::Pmc::Core::Retired_Branch_Instructions_Mispredicted - Retired Branch Instructions Mispredicted.
The number of retired branch instructions that were
mispredicted. Note that only EX mispredicts are counted.
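The retired and mispredicted branch counts are typically combined into a misprediction rate and a mispredictions per thousand instructions (MPKI) figure. The sketch below shows one way to compute both; the function name and sample numbers are hypothetical, and the three counts are assumed to come from Retired_Branch_Instructions, Retired_Branch_Instructions_Mispredicted and Retired_Instructions over the same interval.

```python
def branch_mispredict_stats(branches, mispredicted, instructions):
    """Return (misprediction rate, mispredictions per 1000 instructions).

    Hypothetical helper for illustration. A zero denominator yields
    0.0 for the corresponding figure.
    """
    rate = mispredicted / branches if branches else 0.0
    mpki = 1000 * mispredicted / instructions if instructions else 0.0
    return rate, mpki


# Made-up sample counts, not measurements.
rate, mpki = branch_mispredict_stats(1_000, 50, 10_000)
print(f"rate={rate:.3f} mpki={mpki:.1f}")
```

Since only EX mispredicts are counted, both figures leave out redirects that are resolved elsewhere in the pipeline.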
Retired_Taken_Branch_Instructions Core::X86::Pmc::Core::Retired_Taken_Branch_Instructions - Retired Taken Branch Instructions
The number of taken branches that were retired. This includes
all types of architectural control flow changes, including
exceptions and interrupts.
Retired_Taken_Branch_Instructions_Mispredicted Core::X86::Pmc::Core::Retired_Taken_Branch_Instructions_Mispredicted - Retired Taken Branch Instructions Mispredicted.
The number of retired taken branch instructions that were
mispredicted. Note that only EX mispredicts are counted.
Retired_Far_Control_Transfers Core::X86::Pmc::Core::Retired_Far_Control_Transfers - Retired
Far Control Transfers
The number of far control transfers retired including far
call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and
interrupts. Far control transfers are not subject to branch
prediction.
Retired_Near_Return_Branch_Instructions Core::X86::Pmc::Core::Retired_Near_Return_Branch_Instructions - Retired Near Return Branch Instructions
The number of near return instructions (RET [C3] or RET Iw
[C2]) retired.
Retired_Near_Return_Branch_Instructions_Mispredicted Core::X86::Pmc::Core::Retired_Near_Return_Branch_Instructions_Mispredicted - Retired Near Return Branch Instructions Mispredicted
The number of near returns retired that were not correctly
predicted by the return address predictor. Each such mispredict
incurs the same penalty as a mispredicted conditional branch
instruction. Note that only EX mispredicts are counted.
Retired_Indirect_Branch_Instructions_Mispredicted Core::X86::Pmc::Core::Retired_Indirect_Branch_Instructions_Mispredicted - Retired Indirect Branch Instructions Mispredicted
The number of indirect branches retired that were not correctly
predicted. Each such mispredict incurs the same penalty as a
mispredicted conditional branch instruction. Note that only EX
mispredicts are counted.
Retired_MMX_FP_Instructions Core::X86::Pmc::Core::Retired_MMX_FP_Instructions - Retired MMX
FP Instructions
The number of MMX, SSE or x87 instructions retired. The
UnitMask allows the selection of the individual classes of
instructions as given in the table. Each increment represents
one complete instruction. Since this event includes non-numeric
instructions, it is not suitable for measuring MFLOPs.
This event has the following units which may be used to modify
the behavior of the event:
SSE SSE instructions (SSE, SSE2, SSE3, SSSE3, SSE4A, SSE41,
SSE42, AVX).
MMX MMX instructions
X87 x87 instructions
Retired_Indirect_Branch_Instructions Core::X86::Pmc::Core::Retired_Indirect_Branch_Instructions - Retired Indirect Branch Instructions
The number of indirect branches retired.
Retired_Conditional_Branch_Instructions Core::X86::Pmc::Core::Retired_Conditional_Branch_Instructions - Retired Conditional Branch Instructions
Count of conditional branch instructions that retired.
Div_Cycles_Busy_count Core::X86::Pmc::Core::Div_Cycles_Busy_count - Div Cycles Busy
count
Counts cycles when the divider is busy
Div_Op_Count Core::X86::Pmc::Core::Div_Op_Count - Div Op Count
Counts number of divide ops
Cycles_with_no_retire Core::X86::Pmc::Core::Cycles_with_no_retire - Cycles with no
retire
This event counts cycles when the hardware thread does not
retire any ops for reasons selected by UnitMask[4:0]. UnitMask
events [4:0] are mutually exclusive. If multiple reasons apply
for a given cycle, the lowest numbered UnitMask event is
counted.
This event has the following units which may be used to modify
the behavior of the event:
ThreadNotSelected The number of cycles where ops could have retired (i.e.
did not fall into the sub-events [0]...[3]) but did not
retire because the thread arbitration did not select
the thread for retire.
Other The number of cycles where ops could have retired (self
and older ops are complete), but were stopped from
retirement for other reasons: retire breaks, traps,
faults, etc.
NotCompleteSelf The number of cycles where the oldest retire slot did
not have its completion bits set.
Empty The number of cycles when there were no valid ops in
the retire queue. This may be caused by front-end
bottlenecks or pipeline redirects.
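The mutual exclusion rule for this event (when several reasons apply in a cycle, only the lowest numbered UnitMask event is counted) behaves like a priority encoder. The sketch below models that rule. The bit numbers in the example table are assumptions made for illustration, except that the text above implies ThreadNotSelected follows sub-events [0] through [3]; consult the AMD manual for the real encoding.

```python
def counted_reason(applicable, bit_of):
    """Return the reason credited for one cycle, or None.

    applicable is the set of reasons that hold in the cycle and
    bit_of maps each reason to its UnitMask bit number. The reason
    with the lowest bit number wins. Hypothetical helper.
    """
    if not applicable:
        return None
    return min(applicable, key=lambda name: bit_of[name])


# Hypothetical bit assignment, for illustration only.
EXAMPLE_BITS = {
    "Empty": 0,
    "NotCompleteSelf": 2,
    "Other": 3,
    "ThreadNotSelected": 4,
}

print(counted_reason({"Other", "Empty"}, EXAMPLE_BITS))
```

Under this rule the per-unit counts do not overlap, so they can be summed without double counting a cycle.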
Retired_Microcoded_Instructions Core::X86::Pmc::Core::Retired_Microcoded_Instructions - Retired
Microcoded Instructions
The number of retired microcoded instructions.
Retired_Microcode_Ops Core::X86::Pmc::Core::Retired_Microcode_Ops - Retired Microcode
Ops
The number of microcode ops that have retired.
Retired_Conditional_Branch_Instructions_Mispredicted Core::X86::Pmc::Core::Retired_Conditional_Branch_Instructions_Mispredicted - Retired Conditional Branch Instructions Mispredicted
The number of retired conditional branch instructions that were
not correctly predicted because of a branch direction mismatch.
Retired_Unconditional_Branch_Instructions_Mispredicted Core::X86::Pmc::Core::Retired_Unconditional_Branch_Instructions_Mispredicted - Retired Unconditional Branch Instructions Mispredicted
The number of retired unconditional indirect branch
instructions that were mispredicted.
Retired_Unconditional_Branch_Instructions Core::X86::Pmc::Core::Retired_Unconditional_Branch_Instructions - Retired Unconditional Branch Instructions
The number of unconditional branch instructions retired.
Tagged_IBS_Ops Core::X86::Pmc::Core::Tagged_IBS_Ops - Tagged IBS Ops
Counts Op IBS related events
This event has the following units which may be used to modify
the behavior of the event:
IbsCountRollover Number of times an op could not be tagged by IBS
because of a previous tagged op that has not yet
signaled interrupt.
IbsTaggedOpsRet Number of Ops tagged by IBS that retired
IbsTaggedOps Number of Ops tagged by IBS
Retired_fused_instructions Core::X86::Pmc::Core::Retired_fused_instructions - Retired
Fused Instructions
Counts retired fused instructions.
L2RequestG1 Core::X86::Pmc::L2::L2RequestG1 - Requests to L2 Group1
All L2 Cache Requests (Breakdown 1 - Common)
This event has the following units which may be used to modify
the behavior of the event:
RdBlkL Data Cache Reads (including hardware and software
prefetch).
RdBlkX Data Cache Stores
LsRdBlkC_S Data Cache Shared Reads
CacheableIcRead Instruction Cache Reads.
LsPrefetchL2Cmd L2HwPf All prefetches accepted by L2 pipeline, hit or miss.
Types of PF and L2 hit/miss broken out in a separate
perfmon event
Group2 Various Noncacheable requests. Non-cached Data Reads,
Non-cached Instruction Reads, Self-modifying code
checks.
L2RequestG2 Core::X86::Pmc::L2::L2RequestG2 - Requests to L2 Group2
All L2 Cache Requests (Breakdown 2 - Rare).
This event has the following units which may be used to modify
the behavior of the event:
LsRdSized LS sized read, coherent non-cacheable.
LsRdSizedNC LS sized read, non-coherent, non-cacheable.
L2WcbReq Core::X86::Pmc::L2::L2WcbReq - Write Combining Buffer Requests
Write Combining Buffer operations. For information on Write
Combining see docAPM2 sections: Memory System, Memory Types,
Buffering and Combining Memory Writes.
This event has the following units which may be used to modify
the behavior of the event:
WcbClose Write Combining Buffer close
L2CacheReqStat Core::X86::Pmc::L2::L2CacheReqStat - Core to L2 Cacheable
Request Access Status
L2 Cache Request Outcomes (not including L2 Prefetch).
This event has the following units which may be used to modify
the behavior of the event:
LsRdBlkCS Data Cache Shared Read Hit in L2.
LsRdBlkLHitX Data Cache Read Hit Modifiable Line in L2.
LsRdBlkLHitS Data Cache Read Hit Non-Modifiable Line in L2.
LsRdBlkX Data Cache Store Hit in L2.
LsRdBlkC Data Cache Req Miss in L2.
IcFillHitX Instruction Cache Hit Modifiable Line in L2.
IcFillHitS Instruction Cache Hit Non-Modifiable Line in L2.
IcFillMiss Instruction Cache Req Miss in L2.
L2PfHitL2 Core::X86::Pmc::L2::L2PfHitL2 - L2 Prefetch Hit in L2
Counts all L2 prefetches accepted by L2 pipeline which hit in
the L2 cache.
L2PfMissL2HitL3 Core::X86::Pmc::L2::L2PfMissL2HitL3 - L2 Prefetcher Hits in L3
Counts all L2 prefetches accepted by the L2 pipeline which miss
the L2 cache and hit the L3.
L2PfMissL2L3 Core::X86::Pmc::L2::L2PfMissL2L3 - L2 Prefetcher Misses in L3
Counts all L2 prefetches accepted by the L2 pipeline which miss
the L2 and the L3 caches
L2FillRspSrc Core::X86::Pmc::L2::L2FillRspSrc - L2 Fill Response Source
Counts fill responses based on their source. Selecting an event
mask of 0xfe will count all L3 responses to fill requests. This
event is similar to LS PMC 0x44.
This event has the following units which may be used to modify
the behavior of the event:
AlternateMemories_NearFar Requests that return from Extension Memory
DramIO_Far Requests that target another NUMA node and return from
either DRAM or MMIO from another NUMA node, either from
the same or different NUMA node.
NearFarCache_Far Requests that target another NUMA node and return from
another CCX's cache.
DramIO_Near Requests that target the same NUMA node and return from
either DRAM or MMIO from the same NUMA node.
NearFarCache_Near Requests that target the same NUMA node and return from
another CCX's cache.
LocalCcx Data returned from L3 or different L2 in the same CCX.
SEE ALSO
cpc(3CPC)
illumos March 25, 2019 illumos