AMD_F1AH_ZEN5_EVENTS(3CPC) CPU Performance Counters Library Functions

NAME

amd_f1ah_zen5_events - AMD Family 1ah Zen5 processor performance
monitoring events

DESCRIPTION

This manual page describes events specific to AMD Family 1ah Zen5
processors. For more information, please consult the appropriate AMD
BIOS and Kernel Developer's Guide or Open-Source Register Reference.

Each of the events listed below includes the AMD mnemonic which matches
the name found in the AMD manual and a brief summary of the event. If
available, a more detailed description of the event follows and then
any additional unit values that modify the event. Each unit can be
combined to create a new event in the system by placing the '.'
character between the event name and the unit name.
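
As a minimal sketch of the naming rule above, the following Python
helper joins an event name and a unit name with the '.' character. The
helper name qualified_event is hypothetical and is not part of any
library; the event and unit names are taken from this page.

```python
def qualified_event(event: str, unit: str) -> str:
    """Join an event name and one of its unit names with '.'."""
    if not event or not unit:
        raise ValueError("both an event name and a unit name are required")
    return f"{event}.{unit}"


# Example using an event and a unit documented on this page.
name = qualified_event("Retired_x87_FP_Ops", "MulOps")
print(name)
```

The resulting string is the combined event name that the text above
describes; how it is passed to a particular tool is outside the scope
of this sketch.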

The following events are supported:

Retired_x87_FP_Ops
Core::X86::Pmc::Core::Retired_x87_FP_Ops - FP retired x87 uops

Number of retired x87 arithmetic operations. Can be used to
calculate x87 FLOPs.

This event has the following units which may be used to modify
the behavior of the event:

DivSqrROps
x87 Divide or square root uops.

MulOps x87 Multiply uops.

AddSubOps
x87 Add/subtract uops.

Retired_SSE_AVX_FLOPs
Core::X86::Pmc::Core::Retired_SSE_AVX_FLOPs - FP retired SSE
and AVX FLOPs

Number of SSE and AVX floating point arithmetic operations
retired. Number of arithmetic operations retired is dependent
on number of uops retired, data size (scalar/128/256/512), data
type (BF16/FP16/FP32/FP64) and type of operation
(add/sub/mul/mac/...). Use MergeEvent feature for accurate
results.

Retired_FP_uOps
Core::X86::Pmc::Core::Retired_FP_uOps - FP uops retired by size

Report number of FP uops retired by size. Can be used to
determine how vectorized code is and how much MMX / x87 content
is in the code.

This event has the following units which may be used to modify
the behavior of the event:

Pack512uOpsRetired
Packed 512-bit uops retired.

Pack256uOpsRetired
Packed 256-bit uops retired.

Pack128uOpsRetired
Packed 128-bit uops retired.

ScalaruOpsRetired
Scalar uops retired.

MMXuOpsRetired
MMX uops retired.

x87uOpsRetired
x87 uops retired.
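
One way to judge how vectorized code is from these units is to turn
the per-unit counts into shares of the total. The sketch below assumes
the counts were collected separately, one per unit; the function name
uop_mix and the sample numbers are made up for illustration.

```python
from typing import Dict


def uop_mix(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each unit's share of the total FP uops retired."""
    total = sum(counts.values())
    if total == 0:
        return {unit: 0.0 for unit in counts}
    return {unit: value / total for unit, value in counts.items()}


# Made-up counter readings, one per unit of Retired_FP_uOps.
sample = {
    "Pack512uOpsRetired": 500,
    "Pack256uOpsRetired": 300,
    "Pack128uOpsRetired": 100,
    "ScalaruOpsRetired": 80,
    "MMXuOpsRetired": 0,
    "x87uOpsRetired": 20,
}
mix = uop_mix(sample)
# Share of uops that were packed (vector) at any width.
packed = sum(share for unit, share in mix.items() if unit.startswith("Pack"))
print(f"packed share: {packed:.2f}")
```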

FP_Ops_Retired
Core::X86::Pmc::Core::FP_Ops_Retired - FP uops retired sorted
by vector or scalar

Number of FP uops retired of selected type sorted by vector
(AVX/SSE packed) or scalar (x87, AVX/SSE scalar). Can be used
to profile FP codes.

INT_Ops_Retired
Core::X86::Pmc::Core::INT_Ops_Retired - FP executed integer
type uops sorted by vector or scalar

Number of integer uops executed in the FP unit and retired, of the
selected type, sorted by vector (SSE/AVX) or scalar (MMX). Can be
used to profile vector INT / MMX codes.

Packed_FP_Ops_Retired
Core::X86::Pmc::Core::Packed_FP_Ops_Retired - FP uops retired
sorted by packed 128 or packed 256

Number of FP uops retired of selected type sorted by 128-bit
packed dest (XMM) or 256-bit packed dest (YMM). Can be used to
profile FP codes.

Packed_INT_Ops_Retired
Core::X86::Pmc::Core::Packed_INT_Ops_Retired - FP executed
packed integer uops sorted by packed 128 or packed 256

Number of integer uops executed in the FP unit and retired, of the
selected type, sorted by 128-bit packed dest (XMM) or 256-bit
packed dest (YMM). Can be used to profile FP codes.

FP_Dispatch_Faults
Core::X86::Pmc::Core::FP_Dispatch_Faults - FP Dispatch Faults

Number of FP dispatch faults triggered by type. Dispatch
fill/spill faults occur when FP either does not have the data
needed to operate on in its local registers (fill), or FP needs
to empty out upper register data for proper SSE merging
behavior when executing AVX code (spill).

This event has the following units which may be used to modify
the behavior of the event:

YmmSpillFault
YMM spill fault

YmmFillFault
YMM fill fault

XmmFillFault
XMM fill fault

x87FillFault
x87 fill fault

Bad_Status_2_STLI
Core::X86::Pmc::Core::Bad_Status_2_STLI - Bad Status 2

Store To Load Interlock (STLI) events are loads that were unable
to complete because of a possible match with an older store that
could not do Store To Load Forwarding (STLF) for some reason.

This event has the following units which may be used to modify
the behavior of the event:

StliOther
Store-to-load conflicts: A load was unable to complete
due to a non-forwardable conflict with an older store.
Most commonly, a load's address range partially but not
completely overlaps with an uncompleted older store.
Software can avoid this problem by using same-size and
same-alignment loads and stores when accessing the same
data. Vector/SIMD code is particularly susceptible to
this problem; software should construct wide vector
stores by manipulating vector elements in registers
using shuffle/blend/swap instructions prior to storing
to memory, instead of using narrow element-by-element
stores.
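
The "partially but not completely overlaps" case described above can
be sketched as a simple address-range check. This is only a model of
the wording on this page, not the hardware's actual forwarding rules;
the function name partially_overlaps is hypothetical.

```python
def partially_overlaps(load_addr: int, load_size: int,
                       store_addr: int, store_size: int) -> bool:
    """True if the load overlaps the store but is not fully inside it."""
    load_end = load_addr + load_size
    store_end = store_addr + store_size
    overlaps = load_addr < store_end and store_addr < load_end
    contained = store_addr <= load_addr and load_end <= store_end
    return overlaps and not contained


# A narrow 4-byte store followed by a wide 8-byte load of the same data.
print(partially_overlaps(0, 8, 0, 4))
# Same-size, same-alignment store and load.
print(partially_overlaps(0, 8, 0, 8))
```

In this model the first call reports a partial overlap and the second
does not, which matches the advice above to use same-size and
same-alignment loads and stores.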

Retired_Lock_Instructions
Core::X86::Pmc::Core::Retired_Lock_Instructions - Retired Lock
Instructions

Counts retired atomic read-modify-write instructions with a
LOCK prefix.

CLFLUSH
Core::X86::Pmc::Core::CLFLUSH - Retired CLFLUSH Instructions

The number of retired CLFLUSH instructions. This is a
non-speculative event.

CPUID Core::X86::Pmc::Core::CPUID - Retired CPUID Instructions

The number of CPUID instructions retired.

LS_Dispatch
Core::X86::Pmc::Core::LS_Dispatch - LS Dispatch

Counts the number of operations dispatched to the LS unit. Unit
mask events are added together.

This event has the following units which may be used to modify
the behavior of the event:

LdOpSt Dispatch of a single op that performs a load from and
store to the same memory address.

PureSt Dispatch of a single op that performs a memory store.

PureLd Dispatch of a single op that performs a memory load.

SMI_or_SMM_cycles
Core::X86::Pmc::Core::SMI_or_SMM_cycles - SMIs Received

Counts the number of System Management Interrupts (SMIs)
received.

Interrupts_Taken
Core::X86::Pmc::Core::Interrupts_Taken - Interrupts Taken

Counts the number of interrupts taken.

This event has the following units which may be used to modify
the behavior of the event:

NumInterrupts
Number of interrupts taken. This event is also counted
when UnitMask[7:0]=0.

Store_to_Load_Forward
Core::X86::Pmc::Core::Store_to_Load_Forward - Store to Load
Forward

Number of STLF hits.

Store_Globally_Visible_Cancels_2
Core::X86::Pmc::Core::Store_Globally_Visible_Cancels_2 - Store
Globally Visible Cancels 2

Counts reasons why a Store Coalescing Buffer (SCB) commit is
canceled.

This event has the following units which may be used to modify
the behavior of the event:

OlderStVisibleDepCancel
Older SCB we are waiting on to become globally visible
was unable to become globally visible.

LS_MAB_Allocates_by_Type
Core::X86::Pmc::Core::LS_MAB_Allocates_by_Type - LS MAB
Allocates by Type

Counts when an LS pipe allocates a Miss Address Buffer (MAB)
entry to make a miss request.

Demand_DC_Fills_by_Data_Source
Core::X86::Pmc::Core::Demand_DC_Fills_by_Data_Source - Demand
Data Cache Fills by Data Source

Counts fills into the DC that were initiated by demand ops, per
data source.

This event has the following units which may be used to modify
the behavior of the event:

AlternateMemories_NearFar
Requests that return from Extension Memory.

DramIO_Far
Requests that target another NUMA node and return from
DRAM or MMIO.

NearFarCache_Far
Requests that target another NUMA node and return from
another CCX's cache.

DramIO_Near
Requests that target the same NUMA node and return from
DRAM or MMIO.

NearFarCache_Near
Requests that target the same NUMA node and return from
another CCX's cache.

LocalCcx
Data returned from L3 or different L2 in the same CCX.

LocalL2
Data returned from local L2.

Any_DC_Fills_by_Data_Source
Core::X86::Pmc::Core::Any_DC_Fills_by_Data_Source - Any Data
Cache Fills by Data Source

Counts all fills into the DC, per data source.

This event has the following units which may be used to modify
the behavior of the event:

AlternateMemories_NearFar
Requests that return from Extension Memory.

DramIO_Far
Requests that target another NUMA node and return from
DRAM or MMIO.

NearFarCache_Far
Requests that target another NUMA node and return from
another CCX's cache.

DramIO_Near
Requests that target the same NUMA node and return from
DRAM or MMIO.

NearFarCache_Near
Requests that target the same NUMA node and return from
another CCX's cache.

LocalCcx
Data returned from L3 or different L2 in the same CCX.

LocalL2
Data returned from local L2.

L1_DTLB_Reloads
Core::X86::Pmc::Core::L1_DTLB_Reloads - L1 DTLB Reloads

Counts L1DTLB reloads

This event has the following units which may be used to modify
the behavior of the event:

TlbReload1GL2Miss
DTLB reload to a 1G page that missed in the L2DTLB.

TlbReload2ML2Miss
DTLB reload to a 2M page that missed in the L2DTLB.

TlbReloadCoalescedPageMiss
DTLB reload to a coalesced page that missed in the
L2DTLB.

TlbReload4KL2Miss
DTLB reload to a 4K page that missed in the L2DTLB.

TlbReload1GL2Hit
DTLB reload to a 1G page that hit in the L2DTLB.

TlbReload2ML2Hit
DTLB reload to a 2M page that hit in the L2DTLB.

TlbReloadCoalescedPageHit
DTLB reload to a coalesced page that hit in the L2DTLB.

TlbReload4KL2Hit
DTLB reload to a 4K page that hit in the L2DTLB.
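
Because every unit of this event is either an L2DTLB hit or an L2DTLB
miss, the counts can be folded into a single miss ratio. The sketch
below assumes one count per unit and relies on the "Hit" and "Miss"
suffixes of the unit names listed above; the function name and the
sample numbers are illustrative only.

```python
from typing import Dict


def l2_dtlb_miss_ratio(counts: Dict[str, int]) -> float:
    """Fraction of L1 DTLB reloads that also missed in the L2DTLB."""
    hits = sum(v for unit, v in counts.items() if unit.endswith("Hit"))
    misses = sum(v for unit, v in counts.items() if unit.endswith("Miss"))
    total = hits + misses
    return misses / total if total else 0.0


# Made-up counter readings for two of the units.
sample = {"TlbReload4KL2Hit": 900, "TlbReload4KL2Miss": 100}
print(l2_dtlb_miss_ratio(sample))
```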

Misaligned_Load_Flows
Core::X86::Pmc::Core::Misaligned_Load_Flows - Misaligned Load
Flows

The number of misaligned load flows.

This event has the following units which may be used to modify
the behavior of the event:

MA4K The number of 4KB misaligned (i.e., page crossing)
loads or LdOpSt.

MA64 The number of 64B misaligned (i.e., cacheline crossing)
loads or LdOpSt.
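
Whether an access is misaligned in the sense of MA64 or MA4K depends
only on its address and size. The following sketch checks whether an
access spans two blocks of a given size; the function name
crosses_boundary is hypothetical.

```python
def crosses_boundary(address: int, size: int, boundary: int) -> bool:
    """True if [address, address + size) spans two boundary-sized blocks."""
    if size < 1 or boundary < 1:
        raise ValueError("size and boundary must be positive")
    first_block = address // boundary
    last_block = (address + size - 1) // boundary
    return first_block != last_block


# An 8-byte load at offset 60 crosses a 64-byte cacheline.
print(crosses_boundary(60, 8, 64))
# An 8-byte load at offset 56 stays within one cacheline.
print(crosses_boundary(56, 8, 64))
# An 8-byte load at offset 4092 crosses a 4 KB page.
print(crosses_boundary(4092, 8, 4096))
```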

Software_Prefetch_Dispatched
Core::X86::Pmc::Core::Software_Prefetch_Dispatched - Prefetch
Instructions Dispatched

Software Prefetch Instructions Dispatched (speculative)

This event has the following units which may be used to modify
the behavior of the event:

PREFETCHNTA
PrefetchNTA instruction. See docAPM3 PREFETCHlevel.

PREFETCHW
PrefetchW instruction. See docAPM3 PREFETCHlevel.

PREFETCH
PrefetchT0, T1, and T2 instructions. See docAPM3
PREFETCHlevel.

WCB_Close
Core::X86::Pmc::Core::WCB_Close - Write Combining Buffer Close

Counts events that cause a Write Combining Buffer (WCB) entry
to close.

This event has the following units which may be used to modify
the behavior of the event:

FullLine64B
All 64 bytes of the WCB entry have been written.

Ineffective_Software_Prefetches
Core::X86::Pmc::Core::Ineffective_Software_Prefetches -
Ineffective Software Prefetches

The number of software prefetches that did not fetch data
outside of the processor core.

This event has the following units which may be used to modify
the behavior of the event:

MabHit Software PREFETCH instruction saw a match on an
already-allocated miss request.

DcHit Software PREFETCH instruction saw a DC hit.

Software_Prefetch_Data_Cache_Fills
Core::X86::Pmc::Core::Software_Prefetch_Data_Cache_Fills -
Software Prefetch Data Cache Fills by Data Source

Counts fills into the DC that were initiated by software
prefetch instructions, per data source.

This event has the following units which may be used to modify
the behavior of the event:

AlternateMemories_NearFar
Requests that return from Extension Memory.

DramIO_Far
Requests that target another NUMA node and return from
DRAM or MMIO.

NearFarCache_Far
Requests that target another NUMA node and return from
another CCX's cache.

DramIO_Near
Requests that target the same NUMA node and return from
DRAM or MMIO.

NearFarCache_Near
Requests that target the same NUMA node and return from
another CCX's cache.

LocalCcx
Data returned from L3 or different L2 in the same CCX.

LocalL2
Data returned from local L2.

Hardware_Prefetch_Data_Cache_Fills
Core::X86::Pmc::Core::Hardware_Prefetch_Data_Cache_Fills -
Hardware Prefetch Data Cache Fills by Data Source

Counts fills into the DC that were initiated by hardware
prefetches, per data source.

This event has the following units which may be used to modify
the behavior of the event:

AlternateMemories_NearFar
Requests that return from Extension Memory.

DramIO_Far
Requests that target another NUMA node and return from
DRAM or MMIO.

NearFarCache_Far
Requests that target another NUMA node and return from
another CCX's cache.

DramIO_Near
Requests that target the same NUMA node and return from
DRAM or MMIO.

NearFarCache_Near
Requests that target the same NUMA node and return from
another CCX's cache.

LocalCcx
Data returned from L3 or different L2 in the same CCX.

LocalL2
Data returned from local L2.

Allocated_DC_misses
Core::X86::Pmc::Core::Allocated_DC_misses - Allocated DC misses

Counts the number of in-flight DC misses each cycle.

Cycles_Not_in_Halt
Core::X86::Pmc::Core::Cycles_Not_in_Halt - Cycles Not in Halt

Counts cycles when the thread is not in a HALTed state

TLB_Flush_Events
Core::X86::Pmc::Core::TLB_Flush_Events - All TLB Flushes

TLB flush events.

P0_frequency_Cycles_Not_in_Halt
Core::X86::Pmc::Core::P0_frequency_Cycles_Not_in_Halt - P0 Freq
Cycles not in Halt

Counts cycles not in Halt, at the P0 P-state frequency,
regardless of the current Pstate.

This event has the following units which may be used to modify
the behavior of the event:

P0_frequency_Cycles_Not_in_Halt
Counts at the P0 frequency (same as
Core::X86::Msr::MPERF) when not in Halt.
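
Since this event advances at the fixed P0 frequency, comparing it with
Cycles_Not_in_Halt over the same interval gives an estimate of the
average clock while the thread was running. The sketch below assumes
that Cycles_Not_in_Halt advances at the current core clock and that
the P0 frequency is known from elsewhere; the function name and the
sample numbers are illustrative.

```python
def effective_frequency_mhz(cycles_not_in_halt: int,
                            p0_cycles_not_in_halt: int,
                            p0_frequency_mhz: float) -> float:
    """Estimate the average non-halted clock from the two cycle counts."""
    if p0_cycles_not_in_halt <= 0:
        raise ValueError("P0-frequency cycle count must be positive")
    return p0_frequency_mhz * cycles_not_in_halt / p0_cycles_not_in_halt


# Made-up readings: the core ran faster than P0 during the interval.
print(effective_frequency_mhz(3_000_000, 2_000_000, 3000.0))
```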

Instruction_Cache_Refills_from_L2
Core::X86::Pmc::Core::Instruction_Cache_Refills_from_L2 -
Instruction Cache Refills From L2

The number of 64 byte instruction cache lines fulfilled from
the L2 cache.

Instruction_Cache_Refills_from_System
Core::X86::Pmc::Core::Instruction_Cache_Refills_from_System -
Instruction Cache Refills from System

The number of 64 byte instruction cache lines fulfilled from
system memory or another cache.

L1_ITLB_Miss_L2_ITLB_Hit
Core::X86::Pmc::Core::L1_ITLB_Miss_L2_ITLB_Hit - L1 ITLB Miss,
L2ITLB Hit

The number of instruction fetches that miss in the L1 ITLB but
hit in the L2 ITLB.

ITLB_Reload_from_Page_Table_walk
Core::X86::Pmc::Core::ITLB_Reload_from_Page_Table_walk - L1
ITLB Miss, L2 ITLB Miss

The number of instruction fetches that miss in both the L1 ITLB
and L2 ITLB.

This event has the following units which may be used to modify
the behavior of the event:

Coalesced_4k
Walk for >4k Coalesced page (implemented as 16k)

walk_1G
Walk for 1G page

walk_2M
Walk for 2M page

walk_4K
Walk for 4K page

BP_Correct
Core::X86::Pmc::Core::BP_Correct - BP Pipe Correction or Cancel

The Branch Predictor flushed its own pipeline due to internal
conditions such as a second level prediction structure. Does
not count the number of bubbles caused by these internal
flushes.

Variable_Target_Predictions
Core::X86::Pmc::Core::Variable_Target_Predictions - Variable
Target Predictions

The number of times a branch used the indirect predictor to
make a prediction.

Decoder_Overrides_Existing_Branch_Prediction_Speculative
Core::X86::Pmc::Core::Decoder_Overrides_Existing_Branch_Prediction_Speculative
- Early Redirects

Number of times that an Early Redirect is sent to Branch
Predictor. This happens when either the decoder or dispatch
logic is able to detect that the Branch Predictor needs to be
redirected.

ITLB_Hits
Core::X86::Pmc::Core::ITLB_Hits - ITLB Instruction Fetch Hits

The number of instruction fetches that hit in the L1ITLB.

This event has the following units which may be used to modify
the behavior of the event:

IF1G L1 Instruction TLB Hit (1G page size)

IF2M L1 Instruction TLB Hit (2M page size)

IF4K L1 Instruction TLB Hit (4k or 16k coalesced page size)

BP_redirects
Core::X86::Pmc::Core::BP_redirects - BP Redirects

Counts redirects of the branch predictor. To support legacy
software, counts both EX mispredict and resyncs when
unit_mask[7:0] is set to 0.

This event has the following units which may be used to modify
the behavior of the event:

ExRedir
Mispredict redirect from EX (execution-time)

Resync Resync redirect (Retire-time) from RT

Fetch_IBS_events
Core::X86::Pmc::Core::Fetch_IBS_events - Fetch IBS events

Counts significant Fetch IBS State transitions.

This event has the following units which may be used to modify
the behavior of the event:

SampleVal
Counts the number of valid Fetch Instruction Based
Sampling (fetch IBS) samples that were collected. Each
valid sample also created an IBS interrupt.

SampleFiltered
Counts the number of Fetch IBS tagged fetches that were
discarded due to IBS filtering. When a tagged fetch is
discarded the Fetch IBS facility will automatically tag
a new fetch.

SampleDiscarded
Counts when the Fetch IBS facility discards an IBS
tagged fetch for reasons other than IBS filtering. When
a tagged fetch is discarded the Fetch IBS facility will
automatically tag a new fetch.

FetchTagged
Counts the number of fetches tagged for Fetch IBS. Not
all tagged fetches create an IBS interrupt and valid
fetch sample.

IC_Tag_Hit_Miss_events
Core::X86::Pmc::Core::IC_Tag_Hit_Miss_events - IC Tag Hit and
Miss Events

Counts the number of microtag and full tag events as selected
by unit mask.

Op_Cache_hit_miss
Core::X86::Pmc::Core::Op_Cache_hit_miss - Op Cache Hit or Miss

Counts Op Cache micro-tag hit/miss events.

Dispatch_Empty
Core::X86::Pmc::Core::Dispatch_Empty - Op Queue Empty

Cycles where the Op Queue is empty.

Source_of_Op_Dispatched_From_Decoder
Core::X86::Pmc::Core::Source_of_Op_Dispatched_From_Decoder -
Source of Op Dispatched From Decoder

Counts the number of ops dispatched from the decoder classified
by op source.

This event has the following units which may be used to modify
the behavior of the event:

Op_Cache
Count of ops dispatched from OpCache

x86_decoder
Count of ops dispatched from x86 decoder

Types_of_Ops_Dispatched_From_Decoder
Core::X86::Pmc::Core::Types_of_Ops_Dispatched_From_Decoder -
Types of Ops Dispatched From Decoder

Counts the number of ops dispatched from the decoder classified
by op type. The UnitMask value encodes which types of ops are
counted.

Dispatch_Stall_Cycles_Dynamic_Tokens_Part_1
Core::X86::Pmc::Core::Dispatch_Stall_Cycles_Dynamic_Tokens_Part_1
- Dynamic Tokens Dispatch Stall Cycles 1

Cycles where a dispatch group is valid but does not get
dispatched due to a Token Stall. UnitMask bits select the stall
types included in the count.

This event has the following units which may be used to modify
the behavior of the event:

FPSchRsrcStall
FP NSQ token stall

TakenBrnchBufferRsrc
taken branch buffer resource stall.

StoreQueueRsrcStall
STQ Tokens unavailable

LoadQueueRsrcStall
Load Queue Token Stall.

IntPhyRegFileRsrcStall
Integer Physical Register File resource stall.

Dispatch_Stall_Cycles_Dynamic_Tokens_Part_2
Core::X86::Pmc::Core::Dispatch_Stall_Cycles_Dynamic_Tokens_Part_2
- Dynamic Tokens Dispatch Stall Cycles 2

Cycles where a dispatch group is valid but does not get
dispatched due to a token stall. UnitMask bits select the stall
types included in the count.

This event has the following units which may be used to modify
the behavior of the event:

RetQ Retire queue tokens unavailable

EX_Flush_recovery
Integer Execution flush recovery pending

AGTokens
Agen tokens unavailable

ALTokens
ALU tokens unavailable

No_Dispatch_per_Slot
Core::X86::Pmc::Core::No_Dispatch_per_Slot -
No_Dispatch_per_Slot

Counts the number of dispatch slots (each cycle) that remained
unused for reasons selected by UnitMask.

Additional_Resource_Stalls
Core::X86::Pmc::Core::Additional_Resource_Stalls - Dispatch
Additional Resource Stalls

This PMC event counts additional resource stalls that are not
captured by Dispatch_Stall_Cycles_Dynamic_Tokens_Part_1 or
Dispatch_Stall_Cycles_Dynamic_Tokens_Part_2.

Retired_Instructions
Core::X86::Pmc::Core::Retired_Instructions - Retired
Instructions

The number of instructions retired.

Retired_Macro_Ops
Core::X86::Pmc::Core::Retired_Macro_Ops - Retired Macro-Ops

The number of macro-ops retired.

Retired_Branch_Instructions
Core::X86::Pmc::Core::Retired_Branch_Instructions - Retired
Branch Instructions

The number of branch instructions retired. This includes all
types of architectural control flow changes, including
exceptions and interrupts.

Retired_Branch_Instructions_Mispredicted
Core::X86::Pmc::Core::Retired_Branch_Instructions_Mispredicted
- Retired Branch Instructions Mispredicted.

The number of retired branch instructions that were
mispredicted. Note that only EX mispredicts are counted.
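
The Retired_Instructions, Retired_Branch_Instructions and
Retired_Branch_Instructions_Mispredicted events are commonly combined
into a misprediction rate and a mispredictions-per-thousand-instructions
figure. A minimal sketch, assuming all three counts cover the same
interval; the function name and sample values are made up.

```python
from typing import Tuple


def branch_stats(instructions: int, branches: int,
                 mispredicted: int) -> Tuple[float, float]:
    """Return (misprediction rate, mispredictions per 1000 instructions)."""
    rate = mispredicted / branches if branches else 0.0
    mpki = 1000 * mispredicted / instructions if instructions else 0.0
    return rate, mpki


rate, mpki = branch_stats(1_000_000, 200_000, 4_000)
print(f"rate={rate:.3f} mpki={mpki:.1f}")
```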

Retired_Taken_Branch_Instructions
Core::X86::Pmc::Core::Retired_Taken_Branch_Instructions -
Retired Taken Branch Instructions

The number of taken branches that were retired. This includes
all types of architectural control flow changes, including
exceptions and interrupts.

Retired_Taken_Branch_Instructions_Mispredicted
Core::X86::Pmc::Core::Retired_Taken_Branch_Instructions_Mispredicted
- Retired Taken Branch Instructions Mispredicted.

The number of retired taken branch instructions that were
mispredicted. Note that only EX mispredicts are counted.

Retired_Far_Control_Transfers
Core::X86::Pmc::Core::Retired_Far_Control_Transfers - Retired
Far Control Transfers

The number of far control transfers retired including far
call/jump/return, IRET, SYSCALL and SYSRET, plus exceptions and
interrupts. Far control transfers are not subject to branch
prediction.

Retired_Near_Return_Branch_Instructions
Core::X86::Pmc::Core::Retired_Near_Return_Branch_Instructions -
Retired Near Return Branch Instructions

The number of near return instructions (RET [C3] or RET Iw
[C2]) retired.

Retired_Near_Return_Branch_Instructions_Mispredicted
Core::X86::Pmc::Core::Retired_Near_Return_Branch_Instructions_Mispredicted
- Retired Near Return Branch Instructions Mispredicted

The number of near returns retired that were not correctly
predicted by the return address predictor. Each such mispredict
incurs the same penalty as a mispredicted conditional branch
instruction. Note that only EX mispredicts are counted.

Retired_Indirect_Branch_Instructions_Mispredicted
Core::X86::Pmc::Core::Retired_Indirect_Branch_Instructions_Mispredicted
- Retired Indirect Branch Instructions Mispredicted

The number of indirect branches retired that were not correctly
predicted. Each such mispredict incurs the same penalty as a
mispredicted conditional branch instruction. Note that only EX
mispredicts are counted.

Retired_MMX_FP_Instructions
Core::X86::Pmc::Core::Retired_MMX_FP_Instructions - Retired MMX
FP Instructions

The number of MMX, SSE or x87 instructions retired. The
UnitMask allows the selection of the individual classes of
instructions as given in the table. Each increment represents
one complete instruction. Since this event includes non-numeric
instructions, it is not suitable for measuring MFLOPs.

This event has the following units which may be used to modify
the behavior of the event:

SSE SSE instructions (SSE, SSE2, SSE3, SSSE3, SSE4A, SSE41,
SSE42, AVX).

MMX MMX instructions

X87 x87 instructions

Retired_Indirect_Branch_Instructions
Core::X86::Pmc::Core::Retired_Indirect_Branch_Instructions -
Retired Indirect Branch Instructions

The number of indirect branches retired.

Retired_Conditional_Branch_Instructions
Core::X86::Pmc::Core::Retired_Conditional_Branch_Instructions -
Retired Conditional Branch Instructions

Count of conditional branch instructions that retired

Div_Cycles_Busy_count
Core::X86::Pmc::Core::Div_Cycles_Busy_count - Div Cycles Busy
count

Counts cycles when the divider is busy

Div_Op_Count
Core::X86::Pmc::Core::Div_Op_Count - Div Op Count

Counts number of divide ops

Cycles_with_no_retire
Core::X86::Pmc::Core::Cycles_with_no_retire - Cycles with no
retire

This event counts cycles when the hardware thread does not
retire any ops for reasons selected by UnitMask[4:0]. UnitMask
events [4:0] are mutually exclusive. If multiple reasons apply
for a given cycle, the lowest numbered UnitMask event is
counted.

This event has the following units which may be used to modify
the behavior of the event:

ThreadNotSelected
The number of cycles where ops could have retired (i.e.
did not fall into the sub-events [0]...[3]) but did not
retire because the thread arbitration did not select
the thread for retire.

Other The number of cycles where ops could have retired (self
and older ops are complete), but were stopped from
retirement for other reasons: retire breaks, traps,
faults, etc.

NotCompleteSelf
The number of cycles where the oldest retire slot did
not have its completion bits set.

Empty The number of cycles when there were no valid ops in
the retire queue. This may be caused by front-end
bottlenecks or pipeline redirects.
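
The rule that only the lowest numbered UnitMask event is counted when
several reasons apply in the same cycle can be sketched as a
lowest-set-bit selection. The sketch works on bit indices only, because
this page does not list which bit belongs to which unit; the function
name counted_reason is hypothetical.

```python
from typing import Optional


def counted_reason(active_reasons: int) -> Optional[int]:
    """Return the bit index counted for one cycle, or None if none apply."""
    active_reasons &= 0x1F  # only UnitMask[4:0] take part
    if active_reasons == 0:
        return None
    lowest = active_reasons & -active_reasons  # isolate the lowest set bit
    return lowest.bit_length() - 1


# Reasons [2] and [4] apply in the same cycle: only [2] is counted.
print(counted_reason(0b10100))
```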

Retired_Microcoded_Instructions
Core::X86::Pmc::Core::Retired_Microcoded_Instructions - Retired
Microcoded Instructions

The number of retired microcoded instructions.

Retired_Microcode_Ops
Core::X86::Pmc::Core::Retired_Microcode_Ops - Retired Microcode
Ops

The number of microcode ops that have retired.

Retired_Conditional_Branch_Instructions_Mispredicted
Core::X86::Pmc::Core::Retired_Conditional_Branch_Instructions_Mispredicted
- Retired Conditional Branch Instructions Mispredicted

The number of retired conditional branch instructions that were
not correctly predicted because of a branch direction mismatch.

Retired_Unconditional_Branch_Instructions_Mispredicted
Core::X86::Pmc::Core::Retired_Unconditional_Branch_Instructions_Mispredicted
- Retired Unconditional Branch Instructions Mispredicted

The number of retired unconditional indirect branch
instructions that were mispredicted.

Retired_Unconditional_Branch_Instructions
Core::X86::Pmc::Core::Retired_Unconditional_Branch_Instructions
- Retired Unconditional Branch Instructions

Retired Unconditional Branch Instructions

Tagged_IBS_Ops
Core::X86::Pmc::Core::Tagged_IBS_Ops - Tagged IBS Ops

Counts Op IBS related events

This event has the following units which may be used to modify
the behavior of the event:

IbsCountRollover
Number of times an op could not be tagged by IBS
because of a previous tagged op that has not yet
signaled interrupt.

IbsTaggedOpsRet
Number of Ops tagged by IBS that retired

IbsTaggedOps
Number of Ops tagged by IBS

Retired_fused_instructions
Core::X86::Pmc::Core::Retired_fused_instructions - Retired
Fused Instructions

Counts retired fused instructions.

L2RequestG1
Core::X86::Pmc::L2::L2RequestG1 - Requests to L2 Group1

All L2 Cache Requests (Breakdown 1 - Common)

This event has the following units which may be used to modify
the behavior of the event:

RdBlkL Data Cache Reads (including hardware and software
prefetch).

RdBlkX Data Cache Stores

LsRdBlkC_S
Data Cache Shared Reads

CacheableIcRead
Instruction Cache Reads.

LsPrefetchL2Cmd

L2HwPf All prefetches accepted by L2 pipeline, hit or miss.
Types of PF and L2 hit/miss broken out in a separate
perfmon event

Group2 Various Noncacheable requests. Non-cached Data Reads,
Non-cached Instruction Reads, Self-modifying code
checks.

L2RequestG2
Core::X86::Pmc::L2::L2RequestG2 - Requests to L2 Group2

All L2 Cache Requests (Breakdown 2 - Rare).

This event has the following units which may be used to modify
the behavior of the event:

LsRdSized
LS sized read, coherent non-cacheable.

LsRdSizedNC
LS sized read, non-coherent, non-cacheable.

L2WcbReq
Core::X86::Pmc::L2::L2WcbReq - Write Combining Buffer Requests

Write Combining Buffer operations. For information on Write
Combining see docAPM2 sections: Memory System, Memory Types,
Buffering and Combining Memory Writes.

This event has the following units which may be used to modify
the behavior of the event:

WcbClose
Write Combining Buffer close

L2CacheReqStat
Core::X86::Pmc::L2::L2CacheReqStat - Core to L2 Cacheable
Request Access Status

L2 Cache Request Outcomes (not including L2 Prefetch).

This event has the following units which may be used to modify
the behavior of the event:

LsRdBlkCS
Data Cache Shared Read Hit in L2.

LsRdBlkLHitX
Data Cache Read Hit Modifiable Line in L2.

LsRdBlkLHitS
Data Cache Read Hit Non-Modifiable Line in L2.

LsRdBlkX
Data Cache Store Hit in L2.

LsRdBlkC
Data Cache Req Miss in L2.

IcFillHitX
Instruction Cache Hit Modifiable Line in L2.

IcFillHitS
Instruction Cache Hit Non-Modifiable Line in L2.

IcFillMiss
Instruction Cache Req Miss in L2.

L2PfHitL2
Core::X86::Pmc::L2::L2PfHitL2 - L2 Prefetch Hit in L2

Counts all L2 prefetches accepted by L2 pipeline which hit in
the L2 cache.

L2PfMissL2HitL3
Core::X86::Pmc::L2::L2PfMissL2HitL3 - L2 Prefetcher Hits in L3

Counts all L2 prefetches accepted by the L2 pipeline which miss
the L2 cache and hit the L3.

L2PfMissL2L3
Core::X86::Pmc::L2::L2PfMissL2L3 - L2 Prefetcher Misses in L3

Counts all L2 prefetches accepted by the L2 pipeline which miss
the L2 and the L3 caches

L2FillRspSrc
Core::X86::Pmc::L2::L2FillRspSrc - L2 Fill Response Source

Counts fill responses based on their source. Selecting an event
mask of 0xfe will count all L3 responses to fill requests. This
event is similar to LS PMC 0x44.

This event has the following units which may be used to modify
the behavior of the event:

AlternateMemories_NearFar
Requests that return from Extension Memory

DramIO_Far
Requests that target another NUMA node and return from
either DRAM or MMIO from another NUMA node, either from
the same or different NUMA node.

NearFarCache_Far
Requests that target another NUMA node and return from
another CCX's cache.

DramIO_Near
Requests that target the same NUMA node and return from
either DRAM or MMIO from the same NUMA node.

NearFarCache_Near
Requests that target the same NUMA node and return from
another CCX's cache.

LocalCcx
Data returned from L3 or different L2 in the same CCX.
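
To see what the event mask of 0xfe mentioned in the description
selects, the mask can be expanded into its set bit positions. The
sketch below is plain bit arithmetic; this page does not state which
bit corresponds to which unit, so no unit names are attached to the
bits, and the function name set_bits is hypothetical.

```python
from typing import List


def set_bits(mask: int) -> List[int]:
    """Return the positions of the bits set in an 8-bit mask."""
    return [bit for bit in range(8) if mask & (1 << bit)]


# 0xfe selects every bit of the 8-bit mask except bit 0.
print(set_bits(0xFE))
```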
