
Page 8 The AMD Athlon™ MP Processor May 2003
To keep such a highly superscalar microarchitecture supplied with instructions and
data, the AMD Athlon MP processor implements a large, on-chip cache architecture,
particularly in the L1 cache closest to the core. The AMD Athlon MP processor’s
high-performance, on-chip cache architecture includes a dual-ported 128KB (two separate
64KB) split-L1 cache with separate snoop ports, and an integrated full-speed, 16-way
set-associative, 512KB L2 cache using a 72-bit (64-bit data + 8-bit ECC) interface.
The AMD Athlon MP processor’s large integrated full-speed L1 cache consists of
two separate 64KB, two-way set-associative data and instruction caches, which are
much larger than the Intel Xeon processor’s L1 cache (128KB vs. 8KB + 12K µop).
Because of the larger L1 cache, applications running on the AMD Athlon MP processor
can run faster, since more instructions and data stay local to the
processor. Applications benefit from the larger caches through better
support for instruction and data-set locality. The data cache also has eight banks to
provide maximum parallelism for running multiple applications, and it supports
concurrent accesses by two 64-bit loads or stores. The instruction cache contains
predecode data to assist multiple, high-performance instruction decoders. Both
instruction and data caches are dual-ported and contain dedicated snoop ports
designed to keep system coherency traffic, common in systems with many
devices, from interfering with application performance.
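
As a rough illustration of the cache organization described above, the following Python sketch derives the number of sets from the stated capacities and associativities and splits an address into tag, set index, and byte offset. The 64-byte line size, the simple modulo indexing, and the names CacheGeometry, split, and conflicting_addresses are assumptions made for illustration only; the text states just the cache sizes and associativities.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CacheGeometry:
    """Size and associativity of one cache (illustrative model only)."""
    size_bytes: int
    ways: int
    line_bytes: int = 64  # ASSUMPTION: line size is not given in the text

    @property
    def num_sets(self) -> int:
        # Each set holds `ways` lines of `line_bytes` bytes.
        return self.size_bytes // (self.ways * self.line_bytes)

    def split(self, address: int) -> Tuple[int, int, int]:
        """Return (tag, set_index, byte_offset), assuming modulo indexing."""
        offset = address % self.line_bytes
        line_number = address // self.line_bytes
        set_index = line_number % self.num_sets
        tag = line_number // self.num_sets
        return tag, set_index, offset


# Figures taken from the text: 64KB two-way L1 data cache, 512KB 16-way L2.
L1_DATA = CacheGeometry(size_bytes=64 * 1024, ways=2)
L2 = CacheGeometry(size_bytes=512 * 1024, ways=16)


def conflicting_addresses(geometry: CacheGeometry, count: int) -> List[int]:
    """Return `count` distinct addresses that all map to set 0."""
    stride = geometry.num_sets * geometry.line_bytes
    return [i * stride for i in range(count)]


if __name__ == "__main__":
    print("L1 data sets:", L1_DATA.num_sets)
    print("L2 sets:", L2.num_sets)
    print("split(0x12345) in L1:", L1_DATA.split(0x12345))
```

Under these assumptions, lines that map to the same set compete for two ways in the L1 data cache but for 16 ways in the L2 cache, which is why higher associativity leaves more room for data that would otherwise conflict.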
The AMD Athlon MP processor also includes an integrated, full-speed, 16-way
set-associative, exclusive 512KB L2 cache. When the processor requests data, it first
searches for the data in its L1 cache. If the processor finds the data there, the
result is known as a cache hit, and the processor retrieves the data from the
low-latency L1 cache. If the data is not in the L1 cache, the processor looks for it
in its L2 cache and once again attempts to obtain a cache hit. If the L2 cache also
misses, the processor must then request the data from the slower system memory.
With the additional 256KB L2 cache over
previous AMD Athlon MP processors, the AMD Athlon MP processor with 512KB L2
cache increases the performance of server applications such as email, exchange, file,
print, and networking applications by keeping more frequently accessed instructions
and data close to the CPU. Depending on the environment, larger L2 caches can
greatly benefit server and workstation applications that work with large datasets,
such as database and messaging applications. Higher set-associativity increases the
hit rate by reducing conflicts between data that map to the same cache set. Each
piece of data has more possible locations in the L2 cache, so important data is more
likely to remain in the L2 cache memory instead of system memory. With
an exclusive cache architecture, the contents of the L1 caches are not duplicated in
the L2 cache. The 512KB of L2 cache and 128KB of L1 cache therefore provide a total
usable storage space of 640KB.
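
The lookup order and the exclusive arrangement described above can be sketched in Python as follows. This is a minimal model under stated assumptions, not a description of the actual hardware: the 64-byte line size, LRU replacement, and the policy of moving a line evicted from L1 into L2 are all assumed, and only the 64KB data side of the L1 cache is simulated. The names SetAssociativeCache, ExclusiveHierarchy, and usable_capacity_kb are hypothetical.

```python
from collections import OrderedDict

LINE_BYTES = 64  # ASSUMPTION: line size is not given in the text


class SetAssociativeCache:
    """LRU set-associative cache that tracks line numbers only (no data)."""

    def __init__(self, size_bytes, ways):
        self.ways = ways
        self.num_sets = size_bytes // (ways * LINE_BYTES)
        self.sets = [OrderedDict() for _ in range(self.num_sets)]

    def _set_for(self, line):
        return self.sets[line % self.num_sets]

    def contains(self, line):
        return line in self._set_for(line)

    def touch(self, line):
        # Mark the line as most recently used.
        self._set_for(line).move_to_end(line)

    def remove(self, line):
        self._set_for(line).pop(line, None)

    def insert(self, line):
        """Insert a line; return the evicted LRU line number, or None."""
        target = self._set_for(line)
        victim = None
        if line not in target and len(target) >= self.ways:
            victim, _ = target.popitem(last=False)
        target[line] = True
        target.move_to_end(line)
        return victim

    def lines(self):
        return {line for s in self.sets for line in s}


class ExclusiveHierarchy:
    """L1 data cache plus exclusive L2 (sizes and ways from the text)."""

    def __init__(self, l1_bytes=64 * 1024, l1_ways=2,
                 l2_bytes=512 * 1024, l2_ways=16):
        self.l1 = SetAssociativeCache(l1_bytes, l1_ways)
        self.l2 = SetAssociativeCache(l2_bytes, l2_ways)

    def access(self, address):
        """Return 'L1', 'L2', or 'memory': where the line was found."""
        line = address // LINE_BYTES
        if self.l1.contains(line):
            self.l1.touch(line)
            return "L1"
        if self.l2.contains(line):
            self.l2.remove(line)  # exclusive: the line leaves L2 when promoted
            source = "L2"
        else:
            source = "memory"
        victim = self.l1.insert(line)
        if victim is not None:
            self.l2.insert(victim)  # assumed policy: L1 victim moves to L2
        return source


def usable_capacity_kb(l1_kb, l2_kb, exclusive):
    """Distinct data the hierarchy can hold; an inclusive L2 duplicates L1."""
    return l1_kb + l2_kb if exclusive else l2_kb


if __name__ == "__main__":
    h = ExclusiveHierarchy()
    for addr in (0, 0, 32768, 65536, 0):
        print(hex(addr), "->", h.access(addr))
    print("usable KB:", usable_capacity_kb(128, 512, exclusive=True))
```

In this model a line is never present in both levels at once, so the capacities add: 128KB of L1 plus 512KB of L2 gives the 640KB figure quoted above, whereas a strictly inclusive L2 of the same size would hold only 512KB of distinct data.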