# AMD64 Architecture. Gabe Emerson

#### Introduction:

AMD64 is a 64-bit superset of the x86 architecture originally designed by Intel. Created by Advanced Micro Devices, Inc., AMD64 extends the functionality of earlier 32-bit architectures with the addition of 64-bit registers. The architecture retains full 32-bit and 16-bit register and compiler compatibility, allowing consumers to easily upgrade from x86 and previous AMD 32-bit systems. Even though the execution core is RISC-like, it includes integrated pipeline decoders that translate x86 CISC instructions. AMD64 can operate in several modes: "Long Mode" for 64-bit OSs, which includes a "Compatibility Mode" for running existing 32-bit and 16-bit applications under a 64-bit OS, and "Legacy Mode" for 32-bit operating systems, which includes "real mode" for legacy 16-bit OSs.

Prior to the first commercial release, this architecture had a variety of names, including Hammer, K8 and x86-64 (IA-32e is Intel's name for its own compatible implementation). The name "AMD64" was chosen in 2003, and the first commercial server CPU, the Opteron, was released in April 2003. A consumer-grade desktop processor, the Athlon 64, was released in September 2003. AMD64 is a relatively new architecture; its main competitors are HP and Intel, who are jointly developing the IA-64 "Itanium". Other competitors for the server market include existing 64-bit manufacturers such as SGI and Apple. Sun Microsystems, a former competitor, recently entered a partnership with AMD.

Speeds of current AMD64 processors vary from 1.4 GHz for the original low-end Opteron to 2.6 GHz for the most recent Athlon FX chip. Tests have indicated that Athlon chips using the 90 nm manufacturing process (estimated to be available in late 2005) will be able to run at speeds up to 3.7 GHz.

#### **Registers:**

The AMD64 architecture contains sixteen general-purpose 64-bit integer registers, eight of which are used only in 64-bit mode. There are also sixteen 128-bit registers used for SSE/SSE2 multimedia applications, doubling the register space available for these operations.
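As a rough illustration of what the wider registers mean for legacy code, the Python sketch below models one 64-bit general-purpose register and the narrower views of it. RAX, EAX, AX and AL are the standard x86 names for the accumulator; the helper functions `sub_register` and `register_file_bytes` and the sample value are made up for this example. The last lines total the register storage implied by the counts above, assuming the earlier SSE register file held eight 128-bit registers.

```python
# Sketch: narrower views of a 64-bit general-purpose register.
# sub_register() and register_file_bytes() are hypothetical helpers.

MASKS = {
    "RAX": 0xFFFF_FFFF_FFFF_FFFF,  # full 64-bit register
    "EAX": 0xFFFF_FFFF,            # low 32 bits, as seen by 32-bit code
    "AX": 0xFFFF,                  # low 16 bits, as seen by 16-bit code
    "AL": 0xFF,                    # low 8 bits
}

def sub_register(value, name):
    """Return the part of a 64-bit value visible under the given name."""
    return value & MASKS[name]

def register_file_bytes(count, width_bits):
    """Total storage in bytes for `count` registers of `width_bits` each."""
    return count * width_bits // 8

rax = 0x1122_3344_5566_7788  # arbitrary sample value
for name in MASKS:
    print(f"{name}: {sub_register(rax, name):#x}")

print("16 x 64-bit integer registers:", register_file_bytes(16, 64), "bytes")
print(" 8 x 128-bit SSE registers:   ", register_file_bytes(8, 128), "bytes")
print("16 x 128-bit SSE registers:   ", register_file_bytes(16, 128), "bytes")
```

The masks make the compatibility point concrete: 32-bit and 16-bit code simply operates on the low part of the same physical register.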

#### Memory:

Both the Athlon and Opteron chips support up to 8 GB of RAM (using SDRAM DIMMs), although the architecture can address up to 1 TB of physical and 256 TB of virtual memory. The memory controller is included on-die and runs at the same clock rate as the CPU, which significantly decreases memory latency. Older Athlon chips include a single-channel 64-bit controller, while newer Athlon FX chips offer a dual-channel 128-bit controller. The internal bandwidth of the Athlon chip is 6.4 GB/sec.
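The address-space and bandwidth figures can be checked with a little arithmetic. The sketch below is illustrative only: the 40-bit physical and 48-bit virtual address widths, and the 400 million transfers per second on the memory bus, are assumptions chosen because they reproduce the numbers quoted above. The function names are hypothetical.

```python
# Sketch: address-space and peak-bandwidth arithmetic.
# Address widths and the transfer rate are assumptions, not figures
# taken from the text.

TIB = 2 ** 40  # bytes in one terabyte (binary)

def addressable_bytes(address_bits):
    """Bytes reachable with an address of the given width."""
    return 2 ** address_bits

def peak_bandwidth_gb_per_s(bus_width_bits, transfers_per_second):
    """Peak bandwidth in GB/s (decimal) for a bus of the given width."""
    return bus_width_bits / 8 * transfers_per_second / 1e9

print("physical:", addressable_bytes(40) // TIB, "TB")
print("virtual: ", addressable_bytes(48) // TIB, "TB")
print("single-channel 64-bit:", peak_bandwidth_gb_per_s(64, 400e6), "GB/s")
print("dual-channel 128-bit: ", peak_bandwidth_gb_per_s(128, 400e6), "GB/s")
```

Under these assumptions the dual-channel case matches the 6.4 GB/sec figure, and the single-channel controller would peak at half of that.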

Most AMD64 models have 128 KB of level 1 cache (half dedicated to data and half to instructions), and 1 MB of level 2 cache, both on-chip. The latency of the L1 cache is 3 clock cycles, while for L2 the best-case latency is 11 and the worst 20 cycles (cache latency is affected by the AMD cache organization, which allows unused L1 content to be dropped to L2). The L1 cache includes a "Victim Buffer" to hold data dropped from it, which allows a more efficient transfer when the L1 cache needs to acquire new data. L2 cache latency is lower with an empty victim buffer, but the average case is that of a full buffer and full cache.
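To see how much the L2 best and worst cases matter overall, the latencies above can be plugged into the usual average-access-time formula for a two-level cache. This is a minimal sketch: the hit rates and the main-memory latency are hypothetical placeholders rather than measured values, and it assumes the L2 latency is paid in addition to the L1 latency on an L1 miss.

```python
# Sketch: average access time (in cycles) for a two-level cache.
# Latencies come from the text; hit rates and memory latency are
# hypothetical placeholders.

L1_LATENCY = 3
L2_BEST, L2_WORST = 11, 20

def average_access_cycles(l1_hit_rate, l2_hit_rate, l2_latency, memory_latency):
    """L1 time + L1 miss rate * (L2 time + L2 miss rate * memory time)."""
    l1_miss = 1.0 - l1_hit_rate
    l2_miss = 1.0 - l2_hit_rate
    return L1_LATENCY + l1_miss * (l2_latency + l2_miss * memory_latency)

for label, l2_latency in (("best", L2_BEST), ("worst", L2_WORST)):
    cycles = average_access_cycles(0.95, 0.90, l2_latency, memory_latency=150)
    print(f"L2 {label} case: {cycles:.2f} cycles on average")
```

With a high L1 hit rate the gap between the two L2 cases shrinks to a fraction of a cycle on average, so the victim buffer's state matters most for workloads that miss L1 often.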



#### **Pipeline:**

The newer implementations of AMD64 (such as the FX models) have 12 integer and 17 floating-point pipeline stages, compared to 10 and 15 for previous models. Athlon XP chips have 3 x86 decoders, 3 floating-point pipelines, and 3 integer pipelines, allowing 9 instructions per cycle compared to 6 with the Pentium 4.

The extended pipeline allows for pre-execution decoding of operations into "Micro-Ops" or µOPs, two paths for Vector and Direct operations, scheduling for integer and FP operations, encoders, and separate execution units. The basic functions of each integer stage are:

- IF1: Access L1 instruction cache.
- IF2: Load 16 bytes of instruction code.
- IPick: Pick 3 instructions from up to 24 bytes.
- ID1: Identify and route instruction bytes.
- ID2: Decode instructions.
- IPack: Get 3 integer instructions from the queue.
- ID: Finish pack and decode.
- IDispatch: Redirect instructions.
- IS: Schedule instructions.
- EXEC: Execution.
- Addr: Address calculation.
- LS: Load/store.
- DC: Data cache write.
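The stage list above can be turned into a small timing model to show why a deeper pipeline still sustains high throughput. The sketch below is an idealised assumption, not a description of the real scheduler: every stage takes one cycle, groups of up to three instructions move together, and there are no stalls or branch mispredictions. The function names are hypothetical.

```python
import math

# Stage names as listed above. The timing model is idealised:
# one cycle per stage, no stalls, no mispredictions.
INTEGER_STAGES = ["IF1", "IF2", "IPick", "ID1", "ID2", "IPack", "ID",
                  "IDispatch", "IS", "EXEC", "Addr", "LS", "DC"]

def ideal_cycles(num_instructions, issue_width=3, stages=INTEGER_STAGES):
    """Cycles to finish all instructions when one group enters per cycle."""
    if num_instructions == 0:
        return 0
    groups = math.ceil(num_instructions / issue_width)
    return len(stages) + groups - 1

def stage_of(group_index, cycle, stages=INTEGER_STAGES):
    """Stage occupied by a group at a given cycle (both 0-based), or None."""
    position = cycle - group_index
    return stages[position] if 0 <= position < len(stages) else None

for n in (3, 30, 300):
    print(f"{n} instructions: {ideal_cycles(n)} cycles")
print("group 2 at cycle 5 is in stage", stage_of(2, 5))
```

In this model the pipeline depth is paid once as start-up latency; after that, each additional group of three instructions costs only one more cycle.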



Figure 1: Hammer Microarchitectural Block Diagram

### **Multiprocessing:**

AMD's on-die memory controller and "HyperTransport" hardware interconnect (essentially a direct link between CPUs) allow for much higher bandwidth between multiple processors. A 2-processor AMD64 system can attain memory bandwidth of 10.6 GB/sec, vs 4.3 GB/sec for the Intel Xeon. A 4-processor AMD64 system has a maximum memory bandwidth of 21.2 GB/sec, compared to 6.4 GB/sec for the Xeon MP (the only Intel chip which will support more than two CPUs). Intel's bandwidth is bottlenecked by the bus architecture, which AMD avoids through use of the HyperTransport system. Current server motherboards allow up to 8-way multiprocessing with AMD chips (8 CPUs in a single system), and AMD is developing server chips with multiple CPU cores.
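The scaling behind these figures can be sketched in a few lines. The per-CPU bandwidth below is inferred from the 2-processor figure in the text (10.6 / 2), and the linear scaling is an idealised assumption; the function names are hypothetical. The second function models the contrasting shared-bus case, where a fixed total is divided among the CPUs.

```python
# Sketch: memory bandwidth with per-CPU controllers vs a shared bus.
# PER_CPU_GB_S is inferred from the 2-way figure in the text;
# linear scaling is an idealised assumption.

PER_CPU_GB_S = 10.6 / 2

def aggregate_bandwidth(num_cpus, per_cpu=PER_CPU_GB_S):
    """Total bandwidth when every CPU brings its own memory controller."""
    return num_cpus * per_cpu

def per_cpu_share_on_shared_bus(num_cpus, bus_gb_s):
    """Bandwidth available to each CPU when all share one bus."""
    return bus_gb_s / num_cpus

for cpus in (1, 2, 4, 8):
    print(f"{cpus} CPUs: {aggregate_bandwidth(cpus):.1f} GB/s aggregate")
print(f"4 CPUs on a 6.4 GB/s bus: "
      f"{per_cpu_share_on_shared_bus(4, 6.4):.1f} GB/s each")
```

The 4-processor result reproduces the 21.2 GB/sec quoted above, while on a shared bus each added CPU reduces the share available to the others.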

### **Special/Unique Features:**

The current line of Athlon chips remains the industry leader in 64-bit technology. Intel's efforts to create a comparable chip have largely relied on reverse-engineering the AMD64 architecture, a technique AMD itself used to retain x86 compatibility during the original design process. While Intel's current 64-bit Itanium chip offers higher maximum speeds, it does not offer native compatibility with 32-bit applications and is much less efficient at running legacy code.

### **Conclusion:**

AMD64 is a useful architecture for those wishing to upgrade from 32-bit to 64-bit computing while retaining efficient backwards compatibility with legacy code. It is also an easily scalable architecture, allowing for faster computation-intensive execution with multiple processors and a deeper pipeline than Intel x86 architectures are currently capable of. The fact that AMD's main competitor is reverse-engineering AMD64 features into its new products indicates the level of advancement that this architecture has to offer.

# List of sources, evaluated for reliability:

http://www.amd.com/us-en/Processors/ProductInformation/0,,30\_118\_9331,00.html AMD corporate site. Should be accurate for technical info, but may be biased on evaluations and comparative analysis of their products compared to others. (Other subsections of AMD.com were also used as sources)

http://en.wikipedia.org/wiki/AMD64

Reputable open information system, content on this particular topic is limited.

http://www.linux-mag.com/2004-07/athlon\_01.html

http://www.extremetech.com/article2/0,3973,1561875,00.asp?kc=ETRSS02129TX1K0000532

http://www.gamepc.com/labs/view\_content.asp?id=amd64platforms&page=1&cookie%5Ftest=1

http://www.computerworld.com/hardwaretopics/hardware/story/0,10801,92083,00.html The previous four sources are all reputable magazines, each with its own focus or slant (games, Linux applications, hardware, etc.).

http://www.digit-life.com/articles2/amd-hammer-family/amd-hammer-family2-add0.html Offers a great deal of in-depth technical data, but of unknown quality and reputability. Some information conflicts with other sources.

http://www.pcstats.com/articleview.cfm?articleID=1466 Fairly reputable consumer-information and hardware review website.

http://www.devx.com/amd/Article/20960

Application developer website. AMD64 section co-authored by AMD, so information may be biased towards their products.

http://www.thejemreport.com/modules.php?op=modload&name=News&file=article&sid=117

Personal hardware review site. Unknown degree of reliability.