Alpha 21064A Microprocessors
Data Sheet

Order Number: EC-QFGKC-TE

This document contains information about the following Alpha microprocessors: 21064A-200, 21064A-233, 21064A-275, 21064A-275-PC, and 21064A-300.

Revision/Update Information: This document supersedes the Alpha 21064A-233, -275 Microprocessor Data Sheet, EC-QFGKB-TE.

Digital Equipment Corporation
Maynard, Massachusetts

January 1996

While Digital believes the information included in this publication is correct as of the date of publication, it is subject to change without notice.

Digital Equipment Corporation makes no representations that the use of its products in the manner described in this publication will not infringe on existing or future patent rights, nor do the descriptions contained in this publication imply the granting of licenses to make, use, or sell equipment or software in accordance with the description.

(c) Digital Equipment Corporation 1995, 1996. Printed in U.S.A. All rights reserved.

AlphaGeneration, Digital, Digital Semiconductor, OpenVMS, VAX, VAX DOCUMENT, the AlphaGeneration design mark, and the DIGITAL logo are trademarks of Digital Equipment Corporation.

Digital Semiconductor is a Digital Equipment Corporation business.

GRAFOIL is a registered trademark of Union Carbide Corporation. Windows NT is a trademark of Microsoft Corporation. All other trademarks and registered trademarks are the property of their respective owners.

This document was prepared using VAX DOCUMENT Version 2.1.

Contents

1 Overview
2 Signal Names and Functions
3 Instruction Set
  3.1 Instruction Summary
  3.2 IEEE Floating-Point Instructions
  3.3 21064A IEEE Floating-Point Conformance
  3.4 VAX Floating-Point Instructions
  3.5 Required PALcode Function Codes
  3.6 Opcodes Reserved for PALcode
  3.7 Opcodes Reserved for Digital
  3.8 Instructions Specific to the 21064A
4 Internal Processor Registers
  4.1 Ibox Internal Processor Registers
    4.1.1 Translation Buffer Tag Register (TB_TAG)
    4.1.2 Instruction Translation Buffer Page Table Entry Register (ITB_PTE)
    4.1.3 Instruction Cache Control and Status Register (ICCSR)
      4.1.3.1 Performance Counters
    4.1.4 Instruction Translation Buffer Page Table Entry Temporary Register (ITB_PTE_TEMP)
    4.1.5 Exceptions Address Register (EXC_ADDR)
    4.1.6 Clear Serial Line Interrupt Register (SL_CLR)
    4.1.7 Serial Line Receive Register (SL_RCV)
    4.1.8 Instruction Translation Buffer ZAP Register (ITBZAP)
    4.1.9 Instruction Translation Buffer ASM Register (ITBASM)
    4.1.10 Instruction Translation Buffer IS Register (ITBIS)
    4.1.11 Processor Status Register (PS)
    4.1.12 Exception Summary Register (EXC_SUM)
    4.1.13 PAL_BASE Address Register (PAL_BASE)
    4.1.14 Hardware Interrupt Request Register (HIRR)
    4.1.15 Software Interrupt Request Register (SIRR)
    4.1.16 Asynchronous Trap Request Register (ASTRR)
    4.1.17 Hardware Interrupt Enable Register (HIER)
    4.1.18 Software Interrupt Enable Register (SIER)
    4.1.19 AST Interrupt Enable Register (ASTER)
    4.1.20 Serial Line Transmit Register (SL_XMIT)
  4.2 Abox Internal Processor Registers
    4.2.1 Translation Buffer Control Register (TB_CTL)
    4.2.2 Data Translation Buffer Page Table Entry Register (DTB_PTE)
    4.2.3 Data Translation Buffer Page Table Entry Temporary Register (DTB_PTE_TEMP)
    4.2.4 Memory Management Control and Status Register (MM_CSR)
    4.2.5 Virtual Address Register (VA)
    4.2.6 Data Translation Buffer ZAP Register (DTBZAP)
    4.2.7 Data Translation Buffer ASM Register (DTBASM)
    4.2.8 Data Translation Buffer Invalidate Single Register (DTBIS)
    4.2.9 Flush Instruction Cache Register (FLUSH_IC)
    4.2.10 Flush Instruction Cache ASM Register (FLUSH_IC_ASM)
    4.2.11 Abox Control Register (ABOX_CTL)
    4.2.12 Alternate Processor Mode Register (ALT_MODE)
    4.2.13 Cycle Counter Register (CC)
    4.2.14 Cycle Counter Control Register (CC_CTL)
    4.2.15 Bus Interface Unit Control Register (BIU_CTL)
    4.2.16 Cache Status Register (C_STAT)
    4.2.17 Bus Interface Unit Status Register (BIU_STAT)
    4.2.18 Bus Interface Unit Address Register (BIU_ADDR)
    4.2.19 Fill Address Register (FILL_ADDR)
    4.2.20 Fill Syndrome Register (FILL_SYNDROME)
    4.2.21 Backup Cache Tag Register (BC_TAG)
  4.3 PAL_TEMP Registers
  4.4 Lock Registers
  4.5 Internal Processor Registers Reset State
5 Electrical Characteristics
  5.1 DC Characteristics
  5.2 AC Characteristics
6 Thermal Considerations
  6.1 Critical Parameters of Thermal Design
7 Mechanical Specifications
  7.1 Package Information
  7.2 21064A Pins
  7.3 Signal Pin Lists

Figures

1 Block Diagram of the Alpha 21064A Microprocessor
2 Translation Buffer Tag Register
3 Instruction Translation Buffer Page Table Entry Register
4 ICCSR Register
5 ITB_PTE_TEMP Register
6 Exception Address Register
7 Clear Serial Line Interrupt Register
8 Serial Line Receive Register
9 Processor Status Register
10 Exception Summary Register
11 PAL_BASE Address Register
12 Hardware Interrupt Request Register
13 Software Interrupt Request Register
14 Asynchronous Trap Request Register
15 Hardware Interrupt Enable Register
16 Software Interrupt Enable Register
17 AST Interrupt Enable Register
18 Serial Line Transmit Register
19 Translation Buffer Control Register
20 Data Translation Buffer Page Table Entry Register
21 Data Translation Buffer Page Table Entry Temporary Register
22 Memory Management Control and Status Register
23 Abox Control Register
24 Alternate Processor Mode Register
25 Cycle Counter Register
26 Cycle Counter Control Register
27 21064A Bus Interface Unit Control Register
28 Cache Status Register
29 Bus Interface Unit Status Register
30 Bus Interface Unit Address Register
31 Fill Address Register
32 FILL_SYNDROME Register
33 Backup Cache Tag Register
34 Clock Termination
35 Input Clock Timing Diagram
36 Reset Timing
37 Reset Timing--End of Preload Sequence
38 Output Delay Time Measurement
39 Setup and Hold Time Measurement
40 READ_BLOCK Timing Diagram
41 WRITE_BLOCK Timing Diagram
42 BARRIER Timing Diagram
43 FETCH/FETCH_M Timing Diagram
44 Package Components and Temperature Measurement Locations
45 Heat Sink Dimensions
46 Package Dimensions
47 PGA Cavity Down View

Tables

1 Data, Address, and Parity/ECC Bus Signals
2 Primary Cache Invalidate Signals
3 External Cache Control Signals
4 Fast Lock Mode Signals
5 External Cycle Control Signals
6 Interrupt Signals
7 Instruction Cache Initialization and Serial ROM Interface Signals
8 Initialization Signals
9 Clock Signals
10 Performance Monitoring Signals
11 Other Signals
12 Instruction Format and Opcode Notation
13 Architecture Instructions
14 IEEE Floating-Point Instruction Function Codes
15 VAX Floating-Point Instruction Function Codes
16 Required PALcode Function Codes
17 Opcodes Specific to the 21064A
18 Opcodes Reserved for Digital
19 Instructions Specific to the 21064A
20 ICCSR Fields and Description
21 BHE, BPE Branch Prediction Selection (Conditional Branches Only)
22 Performance Counter 0 Input Selection (in ICCSR)
23 Performance Counter 1 Input Selection (in ICCSR)
24 Clear Serial Line Interrupt Register Fields
25 Exception Summary Register Fields
26 Hardware Interrupt Request Register Fields
27 Hardware Interrupt Enable Register Fields
28 Memory Management Control and Status Register
29 Abox Control Register Fields
30 Alternate Processor Mode Register
31 Bus Interface Unit Control Register Fields
32 BC_SIZE
33 BC_PA_DIS
34 Cache Status Register Fields
35 Bus Interface Unit Status Register Fields
36 Syndromes for Single-Bit Errors
37 Backup Cache Tag Register Fields
38 Internal Process Register Reset State
39 21064A Maximum Ratings (PRELIMINARY ESTIMATES)
40 DC Input/Output Characteristics
41 testClkIn Pin States
42 Input Clock Timing
43 External Cycles
44 21064A-200 Thermal Characteristics in a Forced-Air Environment
45 21064A-233 Thermal Characteristics in a Forced-Air Environment
46 21064A-275 and 21064A-275-PC Thermal Characteristics in a Forced-Air Environment
47 21064A-300 Thermal Characteristics in a Forced-Air Environment
48 Pin List
49 Data Signals Pin List
50 Address Signals Pin List
51 Parity/ECC Bus Signals Pin List
52 Primary Cache Invalidate Signals Pin List
53 External Cache Control Signals Pin List
54 Interrupt Signals Pin List
55 Instruction Cache Initialization Signals Pin List
56 Serial ROM Interface Signals Pin List
57 Initialization Signals Pin List
58 Load/Lock and Store/Conditional Fast Lock Mode Signals Pin List
59 Clock Signals Pin List
60 Performance Monitoring Signals Pin List
61 Other Signals Pin List
62 Power Pin List
63 Ground Pin List
64 Spare Pin List

1 Overview

This document describes the Alpha 21064A microprocessors (21064A). The five versions of the 21064A differ in clock frequency, as indicated by their labels (200, 233, 275, 275-PC, and 300).

The 21064A-200, 21064A-233, 21064A-275, and 21064A-300 are functionally identical. Their memory management is flexible enough to support the memory management functions of different operating system environments.
The 21064A-275-PC is functionally identical to the other four except in its memory management functions. The 21064A-275-PC supports only the memory management functions necessary for the Windows NT operating system and other operating systems that use the Windows NT memory management model.

The label 21064A is used for functions and operations that are identical across the five devices. The label 21064A-275-PC identifies information that is unique to that one device.

The 21064A has the features listed here:

* 64-bit RISC microprocessor that implements the Alpha architecture
* Superscalar; 200, 233, 275, and 300 MHz
* System clock frequency is the processor clock frequency divided by any value from 2 to 17
* Dual instruction issue that yields a peak instruction execution rate of 400, 466, 550, or 600 MIPS
* Superpipelined
* Two on-chip caches (with data and tag parity protection)
    16-Kbyte instruction cache (Icache)
    16-Kbyte data cache (Dcache)
* 32 integer registers and 32 floating-point registers (64-bit)
* 4K x 2-bit branch prediction history table
* External data path selectable for 128 bits or 64 bits
* Byte parity mode available for the external data bus
* Fast lock mode available for use with LDx_L and STx_C instructions
* Backward compatible with the 21064 pin layout and software
* Programmable external cache
    Set size
    Set speed
* 43-bit virtual address
* 34-bit physical address
* IEEE and VAX floating-point data types
* Performance monitoring
* 64-bit internal data paths

The 21064A and associated PALcode implement IEEE single- and double-precision, VAX F_floating and G_floating data types, and support longword (32-bit) and quadword (64-bit) integers. Byte (8-bit) and word (16-bit) support is provided by byte manipulation instructions. Limited hardware support is provided for the VAX D_floating data type.
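The peak execution rates and the system clock range quoted in the feature list follow from simple arithmetic: dual issue gives two instructions per CPU clock, and the system clock is the CPU clock divided by an integer from 2 to 17. The sketch below only illustrates that arithmetic; the function and constant names are hypothetical and do not come from this data sheet.

```python
# Illustrative arithmetic only; names are assumptions, not data sheet terms.
CPU_CLOCK_MHZ = (200, 233, 275, 300)   # clock frequencies from the feature list
ISSUE_WIDTH = 2                        # dual instruction issue
MIN_DIVISOR, MAX_DIVISOR = 2, 17       # allowed system clock divisors


def peak_mips(cpu_mhz, issue_width=ISSUE_WIDTH):
    """Peak rate in MIPS: instructions per cycle times cycles per microsecond."""
    return cpu_mhz * issue_width


def system_clock_mhz(cpu_mhz, divisor):
    """System clock frequency for a given divisor in the range 2 to 17."""
    if not MIN_DIVISOR <= divisor <= MAX_DIVISOR:
        raise ValueError("divisor must be between 2 and 17")
    return cpu_mhz / divisor
```

For example, `peak_mips(233)` gives 466, matching the second rate in the feature list, and `system_clock_mhz(200, 4)` gives 50.0.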
The 21064A consists of four independent functional units:

* Integer execution unit (Ebox)
* Floating-point unit (Fbox)
* Load/store or address unit (Abox)
* Branch unit

Other sections include the central control unit (Ibox), and the Icache and Dcache.

Ebox--Contains a 64-bit fully pipelined integer execution data path including: adder, logic box, barrel shifter, byte extract and mask, and independent integer multiplier. The Ebox also contains a 32-entry, 64-bit integer register file.

Fbox--Contains a fully pipelined floating-point unit and independent divider that supports both IEEE and VAX floating-point data types. IEEE single-precision and double-precision floating-point data types are supported. VAX F_floating and G_floating data types are fully supported, and limited support is provided for the D_floating data type.

Abox--Contains five major sections: address translation data path, load silo, write buffer, data cache interface, and the external bus interface unit (BIU). The Abox supports all integer and floating-point load and store instructions including address calculation and translation, and cache control logic.

Ibox--Performs instruction fetch, resource checks, and dual-instruction issue to the Ebox, Abox, Fbox, or branch unit. In addition, the Ibox controls pipeline stalls, aborts, and restarts.

Pipeline Organization

The 21064A uses a 7-stage pipeline for integer operate and memory reference instructions, and a 10-stage pipeline for floating-point operate instructions. The Ibox maintains state for all pipeline stages to track outstanding register writes.

Cache Organization

The 21064A contains two on-chip caches--data cache (Dcache) and instruction cache (Icache). The 21064A also supports an external cache.

Icache--Contains 16K bytes and is a direct-mapped cache with 32-byte blocks. Virtual address bit 13 and physical address bits [12:5] are the index into the cache.
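A 16K-byte direct-mapped cache with 32-byte blocks holds 512 blocks, so its index needs 9 bits: the one virtual address bit and the eight physical address bits named above. The following is a minimal sketch of that index calculation. The function name is hypothetical, and placing virtual address bit 13 as the most significant index bit is an assumption made for illustration; the text above states only which bits form the index.

```python
# Sketch of the Icache index described above. Bit ordering within the index
# (VA bit 13 as the top bit) is an assumption, not stated in the data sheet.
BLOCK_BYTES = 32
CACHE_BYTES = 16 * 1024
NUM_BLOCKS = CACHE_BYTES // BLOCK_BYTES   # 512 blocks, so a 9-bit index


def icache_index(virtual_addr, physical_addr):
    """Combine VA bit 13 with PA bits [12:5] into a 9-bit block index."""
    va_13 = (virtual_addr >> 13) & 0x1        # one bit from the virtual address
    pa_12_5 = (physical_addr >> 5) & 0xFF     # eight bits from the physical address
    return (va_13 << 8) | pa_12_5
```

Address bits [4:0] select a byte within the 32-byte block and do not take part in the index.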
Dcache--Contains 16K bytes and is a write-through, read-allocate cache with 32-byte blocks. It can be used in either of two modes.

* 8K-byte direct-mapped cache (to support 21064 designs). Physical address [12:5] is the index into the cache.
* 16K-byte cache. The Dcache appears externally as a 2-way set-associative cache; physical address [12:5] is the index into the cache. The Dcache appears internally as a direct-mapped cache; virtual address [13] and physical address [12:5] are the index into the cache.

External Cache--The 21064A supports an external cache that is made with readily available static RAMs. The 21064A directly controls RAM operation using its programmable external cache interface, allowing each hardware implementation to make its own external cache speed and configuration trade-offs. The external cache interface supports cache sizes of 0, 256K bytes, 512K bytes, 1M bytes, 2M bytes, 4M bytes, 8M bytes, and 16M bytes. The external cache operates at sub-multiples of the 21064A clock.

Virtual Address Space

The architecture virtual address is a 64-bit unsigned integer that specifies a byte location within the virtual address space. The 21064A implements a 43-bit subset of the virtual address space.

Physical Address Space

The 21064A uses a 34-bit physical address to support 16G bytes of physical address space.

Backward Compatibility

The 21064A is backward compatible with the 21064. The compatibility includes pin layout, PALcode, and application programs. The following restrictions apply to the compatibility between the 21064A and 21064.

* The 21064A has internal pulldown resistors on inputs that are unused spare pins on the 21064. If these spare pins are unconnected on a module designed for the 21064, then there will be no migration problem with these pins.
* Two pins have been reallocated for other uses. If these pins were not used on a module designed for the 21064, then there will be no migration problem with these pins.
  On the 21064 the two pins are tagEq_l and tagAdr_h 17; on the 21064A they are lockWE_h and lockFlag_h, respectively.
* The behavior of the tagOK protocol on the 21064A differs from that of the 21064. Designers should investigate the effect of the change if this protocol is used in existing 21064 modules.

Figure 1 shows a block diagram of the 21064A microprocessor.

Figure 1 Block Diagram of the Alpha 21064A Microprocessor

[Block diagram not reproduced. Labels in the figure: Instruction Cache (Icache), Branch History Table, Instruction Fetch/Decode Unit (Ibox), Integer Execution Unit (Ebox), Integer Register File (IRF), Floating-Point Execution Unit (Fbox), Floating-Point Register File (FRF), Load/Store Unit (Abox), Write Buffer, Address Generator, DTB, Load Silo, ITB, BIU, External Cache Control, System Timing, Data Bus (128 bits), Address Bus, and Data Cache (Dcache). Drawing MLO-012077.]

2 Signal Names and Functions

The tables in this section list the various signals grouped by function. The information in the Type column identifies a signal as input (I), output (O), or bidirectional (B). Signals with an _h suffix are active (asserted) when high. Those with an _l suffix are active (asserted) when low.

Table 1 describes the 21064A data, address, and parity/ECC bus signals.

Table 1 Data, Address, and Parity/ECC Bus Signals

Signal                           Type  Count  Function
data_h [127:0]                   B     128    Provide the data path between the 21064A and the system.
adr_h [33:5]                     B     29     Provide the address path between the 21064A and the system. These address bits provide granularity down to 32-byte internal cache blocks.
check_h [27:0]                   B     28     Provide a path for parity or ECC bits between the 21064A and the rest of the system.

Table 2 describes the 21064A primary cache invalidate signals.
Table 2 Primary Cache Invalidate Signals

Signal                           Type  Count  Function
iAdr_h [12:5]                    I     8      Used to index blocks in the Dcache for Dcache invalidates.
dInvReq_h [1:0] (1)              I     2      Used by external logic to invalidate the Dcache entry indexed by iAdr_h [12:5].

(1) dInvReq_h 1 at PGA location C24 was a spare pin on the 21064. It has an internal pulldown that draws a maximum current of 200 µA at 2.4 V dc.

Table 3 describes the 21064A external cache control signals.

Table 3 External Cache Control Signals

Signal                           Type  Count  Function
tagCEOE_h                        O     1      Controls tag and tag control RAM chip enable or output enable during 21064A-controlled external cache accesses.
tagCtlWE_h                       O     1      Controls tag control RAM write enable during 21064A-controlled transactions.
tagCtlV_h, tagCtlS_h, tagCtlD_h  B     3      Provide read/write path for external cache valid, shared, and dirty bits. The following combinations of the tagCtl RAM bits are allowed. The tagCtlS_h bit can be viewed as a write-protect bit.

    tagCtlV_h  tagCtlS_h  tagCtlD_h  Meaning
    L          X          X          Invalid
    H          L          L          Valid, private
    H          L          H          Valid, private, dirty
    H          H          L          Valid, shared
    H          H          H          Valid, shared, dirty

tagCtlP_h                        B     1      Carries parity across tagCtlV_h, tagCtlD_h, and tagCtlS_h.
tagAdr_h [33:18]                 I     16     Carries the address contents of the tagAdr RAM to the 21064A address comparator and parity checker.
tagAdrP_h                        I     1      Carries the parity contents of the tagAdr RAM to the 21064A address comparator and parity checker.
tagOk_h, tagOk_l                 I     2      Bus interface control signals that allow external logic to stall a CPU-controlled access to the external cache RAMs at the last possible moment.
dataCEOE_h [3:0]                 O     4      Controls data RAM output enable or chip enable during 21064A-controlled cache accesses.
dataWE_h [3:0]                   O     4      Controls data RAM write enable during 21064A-controlled cache accesses.
dataA_h [4:3]                    O     2      Controls data RAM adr_h [4:3] during 21064A-controlled cache accesses.
holdReq_h                        I     1      Asserted by external logic to gain access to the external cache.
holdAck_h                        O     1      Asserted by the 21064A to indicate that external logic has access to the external cache.
dMapWE_h [1:0] (1)               O     2      Controls the write enable inputs of the (optional) data cache backmap RAM during 21064A-controlled external cache reads.

(1) dMapWE_h 1 at PGA location M24 was a spare pin on the 21064.

Table 4 describes the signals that allow the 21064A to perform LDx_L and STx_C transactions to and from an external cache.

Table 4 Fast Lock Mode Signals

Signal                           Type  Count  Function
lockWE_h (1)                     O     1      The 21064A can probe the Bcache for an LDx_L transaction. If there is a Bcache hit, the 21064A asserts lockWE_h, allowing external logic to set a lock flag bit and load a lock address register.
lockFlag_h (2)                   I     1      Allows external logic to indicate the state of the lock flag bit (set or clear). When the 21064A performs an STx_C transaction, it may probe the Bcache and test this signal. If the signal is asserted, the 21064A performs the write to the Bcache while asserting lockWE_h. This transaction allows the external logic to clear the lock flag bit.

(1) lockWE_h at PGA location P24 was used for the signal tagEq_l by the 21064.
(2) lockFlag_h at PGA location R23 was used for the signal tagAdr_h 17 by the 21064.

Table 5 describes the 21064A external cycle control signals.

Table 5 External Cycle Control Signals

Signal                           Type  Count  Function
dOE_l                            I     1      Used by external logic to tell the 21064A to drive the data bus during external write transactions.
dWSel_h [1:0]                    I     2      Used by external logic to tell the 21064A which part of the 32-byte block of write data should be driven onto the data bus.
dRAck_h [2:0]                    I     3      Informs the 21064A that read data is valid on the data bus, and indicates whether data should be cached and whether ECC or parity checking should be attempted.
    Read data acknowledge types are:

    dRAck_h 2  dRAck_h 1  dRAck_h 0  Type
    L          L          L          IDLE
    H          L          L          OK_NCACHE_NCHK
    H          L          H          OK_NCACHE
    H          H          L          OK_NCHK
    H          H          H          OK

cReq_h [2:0]                     O     3      Used by the 21064A to specify a cycle type at the start of an external cycle. The cycle types are:

    cReq_h 2  cReq_h 1  cReq_h 0  Type
    L         L         L         IDLE
    L         L         H         BARRIER
    L         H         L         FETCH
    L         H         H         FETCH_M
    H         L         L         READ_BLOCK
    H         L         H         WRITE_BLOCK
    H         H         L         LDL_L/LDQ_L
    H         H         H         STL_C/STQ_C

cWMask_h [7:0]                   O     8      Supplies longword write masks to external logic during write cycles; contains miss address bits and other miss information during other cycles.
cAck_h [2:0]                     I     3      Used by external logic to acknowledge an external cycle. Acknowledge types are:

    cAck_h 2  cAck_h 1  cAck_h 0  Type
    L         L         L         IDLE
    L         L         H         HARD_ERROR
    L         H         L         SOFT_ERROR
    L         H         H         STL_C_FAIL/STQ_C_FAIL
    H         L         L         OK

Table 6 describes the 21064A interrupt signals.

Table 6 Interrupt Signals

Signal                           Type  Count  Function
irq_h [5:0]                      I     6      These signal lines have two uses:
    * During normal operation they provide six types of external interrupts to the 21064A.
    * At reset, they provide initialization information to the 21064A.
sysClkDiv_h (1)                  I     1      At reset, this line provides initialization information to the 21064A.

When reset_l is asserted, irq_h [4:3] encodes the delay in CPU clock cycles, from sysClkOut1 to sysClkOut2, as follows:

    irq_h 4  irq_h 3  Delay
    L        L        0
    L        H        1
    H        L        2
    H        H        3

(1) sysClkDiv_h at PGA location AA16 was a spare pin on the 21064. It has an internal pulldown that draws a maximum current of 200 µA at 2.4 V dc.
When reset_l is asserted, sysClkDiv_h and irq_h [2:0] encode the value of the divisor used to generate the system clock from the CPU clock, as follows:

    sysClkDiv_h  irq_h [2:0]  Ratio
    L            L L L        2
    L            L L H        3
    L            L H L        4
    L            L H H        5
    L            H L L        6
    L            H L H        7
    L            H H L        8
    L            H H H        9
    H            L L L        10
    H            L L H        11
    H            L H L        12
    H            L H H        13
    H            H L L        14
    H            H L H        15
    H            H H L        16
    H            H H H        17

When reset_l is asserted, irq_h 5 is used to select 128-bit or 64-bit mode. If irq_h 5 is asserted, then 128-bit mode is selected.

Table 7 describes the 21064A instruction cache initialization and serial ROM interface signals.

Table 7 Instruction Cache Initialization and Serial ROM Interface Signals

Signal                           Type  Count  Function
icMode_h [2:0] (1)               I     3      Determines which Icache initialization mode is used after reset. The 21064A implements several Icache modes used by Digital to support chip and module level testing.

    icMode_h [2:0]          Mode
    L L L                   Serial ROM
    L L H                   Disabled
    All other combinations  Digital reserved

sRomOE_l                         O     1      In serial ROM mode, supplies the output enable to the external serial ROM, serving both as an output enable and as a reset.
sRomD_h                          I     1      In serial ROM mode, inputs external serial ROM data to the 21064A.
sRomClk_h                        O     1      In serial ROM mode, supplies the clock to the external serial ROM that causes it to advance to the next bit.

The signals sRomOE_l, sRomD_h, and sRomClk_h also serve as simple parallel I/O pins to drive a diagnostic terminal. When the serial ROM is not being read, output signal sRomOE_l is false. This means that sRomOE_l can be wired to the active high enable of an RS422 receiver driving onto sRomD_h and to the active high enable of an RS422 driver driving from sRomClk_h.
The CPU allows sRomD_h to be read and sRomClk_h to be written by PALcode; this is sufficient hardware support to implement a software-driven serial interface. 1 icMode_h 2 at PGA location AD7 was a spare pin on the 21064. It has an internal pulldown that draws a maximum current of 200 A at 2.4 V dc. 14 Table 8 describes the initialization signal pins dcOk_h, reset_l and reset_SClk_h. Table 8 Initialization Signals Signal Type Count Function dcOk_h I 1 Switches clock sources between an on-chip ring oscillator and the external clock oscillator. reset_l I 1 Forces the CPU into a known state. reset_SClk_h1 I 1 A test signal. It forces the system clock divider into a known state. 1 reset_SClk_h at PGA location AA11 was a spare pin on the 21064. It has an internal pulldown that draws a maximum current of 200 A at 2.4 V dc. Table 9 describes the 21064A clock signals. Table 9 Clock Signals Signal Type Count Function clkIn_h, clkIn_l I 2 Supplies the 21064A with a differential clock from external logic. testClkIn_h, testClkIn_l I 2 Test signals; should be tied to Vdd and Vss, respectively. cpuClkOut_h O 1 Supplies the internal chip clock for use by the external interface. The low-to-high transition of cpuClkOut_h is the ``CPU clock'' used in the timing specification for the tagOk_h and tagOk_l signals. sysClkOut1_h, sysClkOut1_l O 2 Provides the system clock for use by the external interface. The low-to-high transition of sysClkOut1_h provides the system clock that is used as a timing reference throughout this document. sysClkOut2_h, sysClkOut2_l O 2 Provide delayed system clock to the external interface. The delay is between zero and three CPU clock cycles. 15 Table 10 describes the performance monitoring signals. Table 10 Performance Monitoring Signals Signal Type Count Function perf_cnt_h [1:0] I 2 Provides the 21064A internal performance monitoring hardware with access to off-chip events. Table 11 describes some other signals common to the 21064A. 
Table 11 Other Signals

Signal      Type  Count  Function
tristate_l  I     1      The assertion of this signal forces all the 21064A signals, with the exception of cpuClkOut_h, to the high-impedance state.
cont_l      I     1      The assertion of this signal causes the 21064A to connect all signals to Vss, with the exception of certain clock signals and vRef.
vRef        I     1      Supplies a reference voltage of 1.4 V to the input signal sense circuits.
eclOut_h    I     1      Digital reserved; should be tied to Vss.

3 Instruction Set

This section provides information about instructions for the 21064A.

3.1 Instruction Summary

This section contains a summary of all Alpha architecture instructions. All values are in hexadecimal radix. Table 12 describes the contents of the Format and Opcode columns that are in Table 13.

Table 12 Instruction Format and Opcode Notation

Instruction Format    Format Symbol  Opcode Notation  Meaning
Branch                Bra            oo               oo is the 6-bit opcode field.
Floating-point        F-P            oo.fff           oo is the 6-bit opcode field. fff is the 11-bit function code field.
Memory                Mem            oo               oo is the 6-bit opcode field.
Memory/function code  Mfc            oo.ffff          oo is the 6-bit opcode field. ffff is the 16-bit function code in the displacement field.
Memory/branch         Mbr            oo.h             oo is the 6-bit opcode field. h is the high-order two bits of the displacement field.
Operate               Opr            oo.ff            oo is the 6-bit opcode field. ff is the 7-bit function code field.
PALcode               Pcd            oo               oo is the 6-bit opcode field; the particular PALcode instruction is specified in the 26-bit function code field.

Table 13 shows architecture instructions. Table 14 shows qualifiers for IEEE floating-point instructions and Table 15 shows qualifiers for VAX floating-point instructions.
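The notation in Table 12 corresponds to fixed bit fields of the 32-bit instruction word. The following is a minimal Python sketch of that correspondence, not a definitive decoder. The bit positions in the comments (opcode in bits 31:26, function fields as shown) are taken from the Alpha architecture's instruction formats rather than from this data sheet, and the helper name format_opcode is an assumption made for illustration.

```python
def format_opcode(word: int, fmt: str) -> str:
    """Render a 32-bit Alpha instruction word in the Table 12 notation.

    fmt is one of the Table 12 format symbols:
    Bra, F-P, Mem, Mfc, Mbr, Opr, Pcd.
    (format_opcode is a hypothetical helper, not part of any Digital tool.)
    """
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("instruction word must fit in 32 bits")

    opcode = (word >> 26) & 0x3F  # oo: 6-bit opcode field, bits 31:26

    if fmt in ("Bra", "Mem", "Pcd"):
        # Only the opcode appears in the notation; for Pcd the 26-bit
        # function code (bits 25:0) selects the PALcode instruction.
        return f"{opcode:02X}"
    if fmt == "F-P":
        return f"{opcode:02X}.{(word >> 5) & 0x7FF:03X}"  # fff: bits 15:5
    if fmt == "Opr":
        return f"{opcode:02X}.{(word >> 5) & 0x7F:02X}"   # ff: bits 11:5
    if fmt == "Mfc":
        return f"{opcode:02X}.{word & 0xFFFF:04X}"        # ffff: bits 15:0
    if fmt == "Mbr":
        return f"{opcode:02X}.{(word >> 14) & 0x3:X}"     # h: bits 15:14
    raise ValueError(f"unknown format symbol: {fmt}")
```

For example, a word built from opcode 16 and function code 0A0 renders as 16.0A0 in the F-P format, which is the notation used for ADDT in Table 13.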
Table 13 Architecture Instructions

Mnemonic       Format  Opcode   Description
ADDF           F-P     15.080   Add F_floating
ADDG           F-P     15.0A0   Add G_floating
ADDL           Opr     10.00    Add longword
ADDL/V         Opr     10.40    Add longword
ADDQ           Opr     10.20    Add quadword
ADDQ/V         Opr     10.60    Add quadword
ADDS           F-P     16.080   Add S_floating
ADDT           F-P     16.0A0   Add T_floating
AND            Opr     11.00    Logical product
BEQ            Bra     39       Branch if = zero
BGE            Bra     3E       Branch if ≥ zero
BGT            Bra     3F       Branch if > zero
BIC            Opr     11.08    Bit clear
BIS            Opr     11.20    Logical sum
BLBC           Bra     38       Branch if low bit clear
BLBS           Bra     3C       Branch if low bit set
BLE            Bra     3B       Branch if ≤ zero
BLT            Bra     3A       Branch if < zero
BNE            Bra     3D       Branch if ≠ zero
BR             Bra     30       Unconditional branch
BSR            Mbr     34       Branch to subroutine
CALL_PAL       Pcd     00       Trap to PALcode
CMOVEQ         Opr     11.24    CMOVE if = zero
CMOVGE         Opr     11.46    CMOVE if ≥ zero
CMOVGT         Opr     11.66    CMOVE if > zero
CMOVLBC        Opr     11.16    CMOVE if low bit clear
CMOVLBS        Opr     11.14    CMOVE if low bit set
CMOVLE         Opr     11.64    CMOVE if ≤ zero
CMOVLT         Opr     11.44    CMOVE if < zero
CMOVNE         Opr     11.26    CMOVE if ≠ zero
CMPBGE         Opr     10.0F    Compare byte
CMPEQ          Opr     10.2D    Compare signed quadword equal
CMPGEQ         F-P     15.0A5   Compare G_floating equal
CMPGLE         F-P     15.0A7   Compare G_floating less than or equal
CMPGLT         F-P     15.0A6   Compare G_floating less than
CMPLE          Opr     10.6D    Compare signed quadword less than or equal
CMPLT          Opr     10.4D    Compare signed quadword less than
CMPTEQ         F-P     16.0A5   Compare T_floating equal
CMPTLE         F-P     16.0A7   Compare T_floating less than or equal
CMPTLT         F-P     16.0A6   Compare T_floating less than
CMPTUN         F-P     16.0A4   Compare T_floating unordered
CMPULE         Opr     10.3D    Compare unsigned quadword less than or equal
CMPULT         Opr     10.1D    Compare unsigned quadword less than
CPYS           F-P     17.020   Copy sign
CPYSE          F-P     17.022   Copy sign and exponent
CPYSN          F-P     17.021   Copy sign negate
CVTDG          F-P     15.09E   Convert D_floating to G_floating
CVTGD          F-P     15.0AD   Convert G_floating to D_floating
CVTGF          F-P     15.0AC   Convert G_floating to F_floating
CVTGQ          F-P     15.0AF   Convert G_floating to quadword
CVTLQ          F-P     17.010   Convert longword to quadword
CVTQF          F-P     15.0BC   Convert quadword to F_floating
CVTQG          F-P     15.0BE   Convert quadword to G_floating
CVTQL          F-P     17.030   Convert quadword to longword
CVTQL/SV       F-P     17.530   Convert quadword to longword
CVTQL/V        F-P     17.130   Convert quadword to longword
CVTQS          F-P     16.0BC   Convert quadword to S_floating
CVTQT          F-P     16.0BE   Convert quadword to T_floating
CVTST          F-P     16.2AC   Convert S_floating to T_floating
CVTTQ          F-P     16.0AF   Convert T_floating to quadword
CVTTS          F-P     16.0AC   Convert T_floating to S_floating
DIVF           F-P     15.083   Divide F_floating
DIVG           F-P     15.0A3   Divide G_floating
DIVS           F-P     16.083   Divide S_floating
DIVT           F-P     16.0A3   Divide T_floating
EQV            Opr     11.48    Logical equivalence
EXCB           Mfc     18.0400  Exception barrier
EXTBL          Opr     12.06    Extract byte low
EXTLH          Opr     12.6A    Extract longword high
EXTLL          Opr     12.26    Extract longword low
EXTQH          Opr     12.7A    Extract quadword high
EXTQL          Opr     12.36    Extract quadword low
EXTWH          Opr     12.5A    Extract word high
EXTWL          Opr     12.16    Extract word low
FBEQ           Bra     31       Floating branch if = zero
FBGE           Bra     36       Floating branch if ≥ zero
FBGT           Bra     37       Floating branch if > zero
FBLE           Bra     33       Floating branch if ≤ zero
FBLT           Bra     32       Floating branch if < zero
FBNE           Bra     35       Floating branch if ≠ zero
FCMOVEQ        F-P     17.02A   FCMOVE if = zero
FCMOVGE        F-P     17.02D   FCMOVE if ≥ zero
FCMOVGT        F-P     17.02F   FCMOVE if > zero
FCMOVLE        F-P     17.02E   FCMOVE if ≤ zero
FCMOVLT        F-P     17.02C   FCMOVE if < zero
FCMOVNE        F-P     17.02B   FCMOVE if ≠ zero
FETCH          Mfc     18.8000  Prefetch data
FETCH_M        Mfc     18.A000  Prefetch data, modify intent
INSBL          Opr     12.0B    Insert byte low
INSLH          Opr     12.67    Insert longword high
INSLL          Opr     12.2B    Insert longword low
INSQH          Opr     12.77    Insert quadword high
INSQL          Opr     12.3B    Insert quadword low
INSWH          Opr     12.57    Insert word high
INSWL          Opr     12.1B    Insert word low
JMP            Mbr     1A.0     Jump
JSR            Mbr     1A.1     Jump to subroutine
JSR_COROUTINE  Mbr     1A.3     Jump to subroutine return
LDA            Mem     08       Load address
LDAH           Mem     09       Load address high
LDF            Mem     20       Load F_floating
LDG            Mem     21       Load G_floating
LDL            Mem     28       Load sign-extended longword
LDL_L          Mem     2A       Load sign-extended longword locked
LDQ            Mem     29       Load quadword
LDQ_L          Mem     2B       Load quadword locked
LDQ_U          Mem     0B       Load unaligned quadword
LDS            Mem     22       Load S_floating
LDT            Mem     23       Load T_floating
MB             Mfc     18.4000  Memory barrier
MF_FPCR        F-P     17.025   Move from FPCR
MSKBL          Opr     12.02    Mask byte low
MSKLH          Opr     12.62    Mask longword high
MSKLL          Opr     12.22    Mask longword low
MSKQH          Opr     12.72    Mask quadword high
MSKQL          Opr     12.32    Mask quadword low
MSKWH          Opr     12.52    Mask word high
MSKWL          Opr     12.12    Mask word low
MT_FPCR        F-P     17.024   Move to FPCR
MULF           F-P     15.082   Multiply F_floating
MULG           F-P     15.0A2   Multiply G_floating
MULL           Opr     13.00    Multiply longword
MULL/V         Opr     13.40    Multiply longword
MULQ           Opr     13.20    Multiply quadword
MULQ/V         Opr     13.60    Multiply quadword
MULS           F-P     16.082   Multiply S_floating
MULT           F-P     16.0A2   Multiply T_floating
ORNOT          Opr     11.28    Logical sum with complement
RC             Mfc     18.E000  Read and clear
RET            Mbr     1A.2     Return from subroutine
RPCC           Mfc     18.C000  Read process cycle counter
RS             Mfc     18.F000  Read and set
S4ADDL         Opr     10.02    Scaled add longword by 4
S4ADDQ         Opr     10.22    Scaled add quadword by 4
S4SUBL         Opr     10.0B    Scaled subtract longword by 4
S4SUBQ         Opr     10.2B    Scaled subtract quadword by 4
S8ADDL         Opr     10.12    Scaled add longword by 8
S8ADDQ         Opr     10.32    Scaled add quadword by 8
S8SUBL         Opr     10.1B    Scaled subtract longword by 8
S8SUBQ         Opr     10.3B    Scaled subtract quadword by 8
SLL            Opr     12.39    Shift left logical
SRA            Opr     12.3C    Shift right arithmetic
SRL            Opr     12.34    Shift right logical
STF            Mem     24       Store F_floating
STG            Mem     25       Store G_floating
STS            Mem     26       Store S_floating
STL            Mem     2C       Store longword
STL_C          Mem     2E       Store longword conditional
STQ            Mem     2D       Store quadword
STQ_C          Mem     2F       Store quadword conditional
STQ_U          Mem     0F       Store unaligned quadword
STT            Mem     27       Store T_floating
SUBF           F-P     15.081   Subtract F_floating
SUBG           F-P     15.0A1   Subtract G_floating
SUBL           Opr     10.09    Subtract longword
SUBL/V         Opr     10.49    Subtract longword
SUBQ           Opr     10.29    Subtract quadword
SUBQ/V         Opr     10.69    Subtract quadword
SUBS           F-P     16.081   Subtract S_floating
SUBT           F-P     16.0A1   Subtract T_floating
TRAPB          Mfc     18.0000  Trap barrier
UMULH          Opr     13.30    Unsigned multiply quadword high
WMB            Mfc     18.4400  Write memory barrier
XOR            Opr     11.40    Logical difference
ZAP            Opr     12.30    Zero bytes
ZAPNOT         Opr     12.31    Zero bytes not

3.2 IEEE Floating-Point Instructions

Table 14 lists the hexadecimal value of the 11-bit function code field for the IEEE floating-point instructions, with and without qualifiers. The opcode for these instructions is 16 (hexadecimal).

Table 14 IEEE Floating-Point Instruction Function Codes

Mnemonic  None  /C    /M    /D    /U    /UC   /UM   /UD
ADDS      080   000   040   0C0   180   100   140   1C0
ADDT      0A0   020   060   0E0   1A0   120   160   1E0
CMPTEQ    0A5   -     -     -     -     -     -     -
CMPTLT    0A6   -     -     -     -     -     -     -
CMPTLE    0A7   -     -     -     -     -     -     -
CMPTUN    0A4   -     -     -     -     -     -     -
CVTQS     0BC   03C   07C   0FC   -     -     -     -
CVTQT     0BE   03E   07E   0FE   -     -     -     -
CVTTS     0AC   02C   06C   0EC   1AC   12C   16C   1EC
DIVS      083   003   043   0C3   183   103   143   1C3
DIVT      0A3   023   063   0E3   1A3   123   163   1E3
MULS      082   002   042   0C2   182   102   142   1C2
MULT      0A2   022   062   0E2   1A2   122   162   1E2
SUBS      081   001   041   0C1   181   101   141   1C1
SUBT      0A1   021   061   0E1   1A1   121   161   1E1

Mnemonic  /SU   /SUC  /SUM  /SUD  /SUI  /SUIC  /SUIM  /SUID
ADDS      580   500   540   5C0   780   700    740    7C0
ADDT      5A0   520   560   5E0   7A0   720    760    7E0
CMPTEQ    5A5   -     -     -     -     -      -      -
CMPTLT    5A6   -     -     -     -     -      -      -
CMPTLE    5A7   -     -     -     -     -      -      -
CMPTUN    5A4   -     -     -     -     -      -      -
CVTQS     -     -     -     -     7BC   73C    77C    7FC
CVTQT     -     -     -     -     7BE   73E    77E    7FE
CVTTS     5AC   52C   56C   5EC   7AC   72C    76C    7EC
DIVS      583   503   543   5C3   783   703    743    7C3
DIVT      5A3   523   563   5E3   7A3   723    763    7E3
MULS      582   502   542   5C2   782   702    742    7C2
MULT      5A2   522   562   5E2   7A2   722    762    7E2
SUBS      581   501   541   5C1   781   701    741    7C1
SUBT      5A1   521   561   5E1   7A1   721    761    7E1

Mnemonic  None  /S
CVTST     2AC   6AC

Mnemonic  None  /C    /V    /VC   /SV   /SVC  /SVI  /SVIC
CVTTQ     0AF   02F   1AF   12F   5AF   52F   7AF   72F

Mnemonic  /D    /VD   /SVD  /SVID  /M    /VM   /SVM  /SVIM
CVTTQ     0EF   1EF   5EF   7EF    06F   16F   56F   76F

3.3 21064A IEEE Floating-Point Conformance

The 21064A supports the IEEE floating-point operations as defined by the Alpha architecture. Support for a complete implementation of the IEEE Standard for Binary Floating-Point Arithmetic (ANSI/IEEE Standard 754-1985) is provided by a combination of hardware and software as described in the Alpha Architecture Reference Manual.

Additional information about writing code to support precise exception handling (necessary for complete conformance to the standard) is in the Alpha Architecture Reference Manual.

The following information is specific to the 21064A:

* Invalid operation (INV)

  The invalid operation trap is always enabled. If the trap occurs, then the destination register is UNPREDICTABLE. This exception is signaled if any VAX architecture operand is non-finite (reserved operand or dirty zero) and the operation can take an exception. (Certain instructions, such as CPYS, never take an exception.) This exception is signaled if any IEEE operand is non-finite (NAN, INF, denorm) and the operation can take an exception. This trap is also signaled for an IEEE format divide of +/- 0 divided by +/- 0.
  If the exception occurs, then FPCR[INV] is set and the trap is signaled to the Ibox.

* Divide by zero (DZE)

  The divide-by-zero trap is always enabled. If the trap occurs, then the destination register is UNPREDICTABLE. For VAX architecture format, this exception is signaled whenever the numerator is valid and the denominator is zero. For IEEE format, this exception is signaled whenever the numerator is valid and non-zero, with a denominator of +/- 0. If the exception occurs, then FPCR[DZE] is set and the trap is signaled to the Ibox. For IEEE format divides, 0/0 signals INV, not DZE.

* Floating overflow (OVF)

  The floating overflow trap is always enabled. If the trap occurs, then the destination register is UNPREDICTABLE. The exception is signaled if the rounded result exceeds in magnitude the largest finite number that can be represented by the destination format. This applies only to operations whose destination is a floating-point data type. If the exception occurs, then FPCR[OVF] is set and the trap is signaled to the Ibox.

* Underflow (UNF)

  The underflow trap can be disabled. If underflow occurs, then the destination register is forced to a true zero, consisting of a full 64 bits of zero. This is done even if the proper IEEE result would have been -0. The exception is signaled if the rounded result is smaller in magnitude than the smallest finite number that can be represented by the destination format. If the exception occurs, then FPCR[UNF] is set. If the trap is enabled, then the trap is signaled to the Ibox.

* Inexact (INE)

  The inexact trap can be disabled. The destination register always contains the properly rounded result, whether or not the trap is enabled. The exception is signaled if the rounded result is different from what would have been produced if infinite precision (infinitely wide data) were available. For floating-point results, this requires both an infinite precision exponent and fraction. For integer results, this requires an infinite precision integer. If the exception occurs, then FPCR[INE] is set. If the trap is enabled, then the trap is signaled to the Ibox.

  The IEEE-754 specification allows INE to occur concurrently with either OVF or UNF. Whenever OVF is signaled (if the inexact trap is enabled), INE is also signaled. Whenever UNF is signaled (if the inexact trap is enabled), INE is also signaled. The inexact trap also occurs concurrently with integer overflow. All valid opcodes that enable INE also enable both overflow and underflow.

  If a CVTQL results in an integer overflow (IOV), then FPCR[INE] is automatically set. (The INE trap is never signaled to the Ibox because there is no CVTQL opcode that enables the inexact trap.)

  DIVx/I behavior is slightly different. If the DIVx/I instruction does not take an input exception (that is, no INV or DZE), then the Fbox calculates and stores the correct rounded result. The Fbox calculates the inexact flag, setting FPCR[INE] if appropriate, and trapping on DIVx/SI instructions only when the result is really inexact.

* Integer overflow (IOV)

  The integer overflow trap can be disabled. The destination register always contains the low-order bits ([64] or [32]) of the true result (not the truncated bits). Integer overflow can occur with CVTTQ, CVTGQ, or CVTQL. In conversions from floating to quadword or longword integer, an integer overflow occurs if the rounded result is outside the range -2^63 .. 2^63 - 1. In conversions from quadword integer to longword integer, an integer overflow occurs if the result is outside the range -2^31 .. 2^31 - 1. If the exception occurs, then the appropriate bit in the FPCR is set. If the trap is enabled, then the trap is signaled to the Ibox.

* Software completion (SWC)

  The software completion signal is not recorded in the FPCR. The state of this signal is always sent to the Ibox. If the Ibox detects the assertion of any of the listed exceptions concurrent with the assertion of the SWC signal, then it sets EXC_SUM[SWC].

Input exceptions always take priority over output exceptions. If both exception types occur, then only the input exception is recorded in the FPCR, and only the input exception is signaled to the Ibox.

Programming Note

Because underflow cannot occur for CMPTxx, there is no difference in function or performance between CMPTxx/S and CMPTxx/SU. It is intended that software generate CMPTxx/SU in place of CMPTxx/S.

3.4 VAX Floating-Point Instructions

Table 15 lists the hexadecimal value of the 11-bit function code field for the VAX floating-point instructions, with and without qualifiers. The opcode for these instructions is 15 (hexadecimal).

Table 15 VAX Floating-Point Instruction Function Codes

Mnemonic  None  /C    /U    /UC   /S    /SC   /SU   /SUC
ADDF      080   000   180   100   480   400   580   500
CVTDG     09E   01E   19E   11E   49E   41E   59E   51E
ADDG      0A0   020   1A0   120   4A0   420   5A0   520
CMPGEQ    0A5   -     -     -     4A5   -     -     -
CMPGLT    0A6   -     -     -     4A6   -     -     -
CMPGLE    0A7   -     -     -     4A7   -     -     -
CVTGF     0AC   02C   1AC   12C   4AC   42C   5AC   52C
CVTGD     0AD   02D   1AD   12D   4AD   42D   5AD   52D
CVTQF     0BC   03C   -     -     -     -     -     -
CVTQG     0BE   03E   -     -     -     -     -     -
DIVF      083   003   183   103   483   403   583   503
DIVG      0A3   023   1A3   123   4A3   423   5A3   523
MULF      082   002   182   102   482   402   582   502
MULG      0A2   022   1A2   122   4A2   422   5A2   522
SUBF      081   001   181   101   481   401   581   501
SUBG      0A1   021   1A1   121   4A1   421   5A1   521

Mnemonic  None  /C    /V    /VC   /S    /SC   /SV   /SVC
CVTGQ     0AF   02F   1AF   12F   4AF   42F   5AF   52F

3.5 Required PALcode Function Codes

The opcodes listed in Table 16 are required for all Alpha implementations. The notation used is oo.ffff, where oo is the hexadecimal 6-bit opcode and ffff is the hexadecimal 26-bit function code.
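As a worked illustration of the oo.ffff notation, the sketch below packs an opcode and a function code into a 32-bit instruction word and renders it back. It assumes the PALcode instruction format places the opcode in bits 31:26 and the function code in bits 25:0, as defined by the Alpha architecture. The helper names encode_call_pal and pal_notation are hypothetical, and the three function codes are the ones listed in Table 16.

```python
CALL_PAL_OPCODE = 0x00  # CALL_PAL opcode, from Table 13

# Required PALcode function codes, from Table 16.
REQUIRED_PAL_FUNCTIONS = {
    "HALT": 0x0000,    # privileged
    "DRAINA": 0x0002,  # privileged
    "IMB": 0x0086,     # unprivileged
}


def encode_call_pal(function_code: int) -> int:
    """Build a 32-bit CALL_PAL word from a 26-bit function code."""
    if not 0 <= function_code < (1 << 26):
        raise ValueError("function code must fit in 26 bits")
    return (CALL_PAL_OPCODE << 26) | function_code


def pal_notation(word: int) -> str:
    """Render a PALcode-format word in oo.ffff notation."""
    opcode = (word >> 26) & 0x3F
    function_code = word & 0x3FFFFFF
    return f"{opcode:02X}.{function_code:04X}"
```

Because the CALL_PAL opcode is 00, the encoded word is numerically equal to its function code; pal_notation(encode_call_pal(0x86)) gives 00.0086, the notation used for IMB.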
Table 16 Required PALcode Function Codes

Mnemonic  Type          Function Code
DRAINA    Privileged    00.0002
HALT      Privileged    00.0000
IMB       Unprivileged  00.0086

3.6 Opcodes Reserved for PALcode

The opcodes listed in Table 17 are reserved by the Alpha architecture and used in implementing PALcode for the 21064A. See the Alpha Architecture Reference Manual for more information.

Table 17 Opcodes Specific to the 21064A

21064A Mnemonic  Opcode  Architecture Mnemonic
HW_MFPR          19      PAL19
HW_LD            1B      PAL1B
HW_MTPR          1D      PAL1D
HW_REI           1E      PAL1E
HW_ST            1F      PAL1F

3.7 Opcodes Reserved for Digital

Table 18 lists the opcodes that are reserved for Digital.

Table 18 Opcodes Reserved for Digital

Mnemonic  Opcode    Mnemonic  Opcode    Mnemonic  Opcode
OPC01     01        OPC02     02        OPC03     03
OPC04     04        OPC05     05        OPC06     06
OPC07     07        OPC0A     0A        OPC0C     0C
OPC0D     0D        OPC0E     0E        OPC14     14

3.8 Instructions Specific to the 21064A

Table 19 lists instructions that are specific to the 21064A.

Table 19 Instructions Specific to the 21064A

Mnemonic  Operation                           Type
HW_MTPR   Move data to processor register     PALmode, privileged
HW_MFPR   Move data from processor register   PALmode, privileged
HW_LD     Load data from memory               PALmode, privileged
HW_ST     Store data in memory                PALmode, privileged
HW_REI    Return from PALmode exception       PALmode, privileged

Programming Note

PALcode uses the HW_LD and HW_ST instructions to access memory outside the realm of normal Alpha memory management.

4 Internal Processor Registers

This section describes the internal processor registers of the 21064A microprocessor. The section is organized as follows:

* Ibox and Abox Internal Processor Registers
* PAL_TEMP Registers
* Lock Registers
* Internal Processor Registers Reset State

For the 21064A-275-PC, the Abox Control Register SPE_1 field, described in Section 4.2.11, functions differently from the other four 21064A microprocessors. This register field controls the memory management operation mode.
4.1 Ibox Internal Processor Registers

This section describes each Ibox internal processor register (IPR).

4.1.1 Translation Buffer Tag Register (TB_TAG)

The TB_TAG register is a write-only register that holds the tag for the next translation buffer update operation in the Instruction Translation Buffer (ITB) or the Data Translation Buffer (DTB). The tag is written to a temporary register and not transferred to the ITB or DTB until the Instruction Translation Buffer Page Table Entry (ITB_PTE) or the Data Translation Buffer Page Table Entry (DTB_PTE) register is written. The entry to be written is chosen at the time of the ITB_PTE or DTB_PTE write operation by a not-last-used algorithm, implemented in hardware. Figure 2 shows the TB_TAG register format.

Note

Writing to the Instruction Translation Buffer Tag array (ITB_TAG) is only performed while in PALmode, regardless of the state of the hardware enable (HWE) bit in the ICCSR register.

Figure 2 Translation Buffer Tag Register

Small page format:

    Bits   Field
    63:43  IGN
    42:13  VA[42:13]
    12:00  IGN

GH = 11 (binary) format (ITB only):

    Bits   Field
    63:43  IGN
    42:22  VA[42:22]
    21:00  IGN

4.1.2 Instruction Translation Buffer Page Table Entry Register (ITB_PTE)

The ITB_PTE register is a read/write register, representing twelve page table entries split into two distinct arrays. The first eight page table entries provide small page (8 KB) translations while the remaining four provide large page (4 MB) translations. The entry to be written is chosen by a not-last-used algorithm, implemented in hardware for each array independently, and by the status of the TB_CTL.

Writes to the ITB_PTE register use the memory format bit positions as described in the Alpha Architecture Reference Manual, with the exception that some fields are ignored. The ITB's tag array is updated simultaneously from the TB_TAG register when the ITB_PTE register is written.

Reads of the ITB_PTE register require two instructions. The first instruction sends the PTE data to the Instruction Translation Buffer Page Table Entry Temporary register (ITB_PTE_TEMP) and the second instruction, reading from the ITB_PTE_TEMP register, returns the PTE entry to the register file. Reading or writing the ITB_PTE register increments the TB entry pointer corresponding to the large/small page selection indicated by the TB_CTL, which allows reading the entire set of ITB_PTE register entries. Figure 3 shows the ITB_PTE register format.

Note

Reading and writing the ITB_PTE register is only performed while in PALmode, regardless of the state of the HWE bit in the ICCSR IPR.

Figure 3 Instruction Translation Buffer Page Table Entry Register

Write format:

    Bits   Field
    63:53  IGN
    52:32  PFN[33:13]
    31:12  IGN
    11     URE
    10     SRE
    09     ERE
    08     KRE
    07:05  IGN
    04     ASM
    03:00  IGN

Read format:

    Bits   Field
    63:35  RAZ
    34     ASM
    33:13  PFN[33:13]
    12     URE
    11     SRE
    10     ERE
    09     KRE
    08:00  RAZ

4.1.3 Instruction Cache Control and Status Register (ICCSR)

The ICCSR register contains various Ibox hardware enables. The only architecturally defined bit in this register is the floating-point enable (FPE), which enables floating-point instructions. When FPE is cleared, all floating-point instructions generate exceptions to the FEN entry point in PALcode. Most of this register is cleared by hardware at reset. Fields that are not cleared at reset include ASN, PC0, and PC1.

The hardware enable (HWE) bit allows the special privileged architecture library code (PALcode) instructions to execute in kernel mode. This bit is intended for diagnostic or operating system alternative PALcode routines only. It does not allow access to the ITB registers if not running in PALmode.

Figure 4 shows the ICCSR register format. Table 20 lists the ICCSR register fields and a brief description. Table 21 lists branch states controlled by the Branch Prediction Enable (BPE) and Branch History Enable (BHE) bits in the ICCSR.
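The ITB_PTE read format described in Section 4.1.2 can be unpacked with a few shifts and masks. The following Python sketch is illustrative only. It assumes the read-format field positions shown in Figure 3 (ASM in bit 34, PFN[33:13] in bits 33:13, and URE, SRE, ERE, and KRE in bits 12 through 09); the function name and dictionary keys are hypothetical. On hardware, the 64-bit value would come from the two-instruction ITB_PTE and ITB_PTE_TEMP read sequence executed in PALmode.

```python
def decode_itb_pte_read(value: int) -> dict:
    """Split an ITB_PTE read-format value into its fields.

    Field positions are an assumption read from Figure 3 (read format).
    """
    if not 0 <= value < (1 << 64):
        raise ValueError("value must fit in 64 bits")

    pfn = (value >> 13) & 0x1FFFFF  # PFN[33:13] is 21 bits wide

    return {
        "asm": bool((value >> 34) & 1),  # address space match
        "pfn": pfn,
        "phys_base": pfn << 13,          # physical address bits [33:13]
        "ure": bool((value >> 12) & 1),  # user read enable
        "sre": bool((value >> 11) & 1),  # supervisor read enable
        "ere": bool((value >> 10) & 1),  # executive read enable
        "kre": bool((value >> 9) & 1),   # kernel read enable
    }
```

The read-as-zero (RAZ) fields are simply ignored here; a stricter model could check that bits 63:35 and 08:00 are zero.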
Figure 4 ICCSR Register

Write format:

    Bits   Field
    63:53  MBZ
    52:47  ASN[5:0]
    46     RES
    45:44  PCE
    43     RES
    42     FPE
    41     MAP
    40     HWE
    39     DI
    38     BHE
    37     JSE
    36     BPE
    35     PIPE
    34:32  PCMUX1[2:0]
    31:12  MBZ
    11:08  PCMUX0[3:0]
    07:05  MBZ
    04     RES
    03     PC0
    02:01  MBZ
    00     PC1

Read format:

    Bits   Field
    63:46  RAZ
    45:44  PCE
    43:35  RAZ
    34     RES
    33:28  ASN[5:0]
    27:24  RES[5:2]
    23     FPE
    22     MAP
    21     HWE
    20     DI
    19     BHE
    18     JSE
    17     BPE
    16     PIPE
    15:13  PCMUX1[2:0]
    12:09  PCMUX0[3:0]
    08:03  RAZ
    02     PC1
    01     PC0
    00     RAZ

Table 20 ICCSR Fields and Description

Field   Type  Description
ASN     RW    The ASN field is used in conjunction with the Icache to further qualify cache entries and avoid some cache flushes. The ASN is written to the Icache during fill operations and compared with the I-stream data on fetch operations. Mismatches invalidate the fetch without affecting the Icache. (See the Alpha Architecture Reference Manual.)
RES     RW,0  The RES state bits are reserved by Digital and should not be used by software.
PCE     RW    If both of these bits are clear, they disable both performance counters. If either bit is set, both performance counters will increment in their usual fashion.
FPE     RW,0  If set, floating-point instructions can be issued. If clear, floating-point instructions cause FEN exceptions.
MAP     RW,0  If set, it allows superpage I-stream memory mapping of virtual PC [33:13] directly to physical PC [33:13], essentially bypassing the ITB for virtual PC addresses containing virtual PC [42:41] = 2. Superpage mapping is allowed in kernel mode only. The Icache ASM bit is always set. If clear, superpage mapping is disabled.
HWE     RW,0  If set, it allows the five reserved-opcode instructions (PAL19, PAL1B, PAL1D, PAL1E, and PAL1F) to be issued in kernel mode. If cleared, attempts to execute reserved-opcode instructions while not in PALmode result in OPCDEC exceptions.
DI      RW,0  If set, it enables dual issue. If cleared, instructions can only single issue.
BHE     RW,0  Used in conjunction with BPE. See Table 21 for programming information.
JSE     RW,0  If set, it enables the JSR stack to push a return address. If cleared, the JSR stack is disabled.
BPE     RW,0  Used in conjunction with BHE. See Table 21 for programming information.
PIPE    RW,0  If clear, it causes all hardware interlocked instructions to drain the machine and waits for the write buffer to empty before issuing the next instruction. Examples of instructions that do not cause the pipe to drain include HW_MTPR, HW_REI, conditional branches, and instructions that have a destination register of R31. If set, the pipeline proceeds normally.
PCMUX1  RW,0  See Table 23 for programming information.
PCMUX0  RW,0  See Table 22 for programming information.
PC1     RW    If clear, it enables the performance counter 1 interrupt request after 2^12 events counted. If set, it enables the performance counter 1 interrupt request after 2^8 events counted.
PC0     RW    If clear, it enables the performance counter 0 interrupt request after 2^16 events counted. If set, it enables the performance counter 0 interrupt request after 2^12 events counted.

Note

Using the HW_MTPR instruction to update the EXC_ADDR register while in the native mode is restricted to bit [0] being equal to 0. The combination of the native mode and EXC_ADDR bit [0] being equal to one causes UNDEFINED behavior. This combination is only possible through the use of the HWE bit.

Table 21 BHE, BPE Branch Prediction Selection (Conditional Branches Only)

BPE  BHE  Prediction
0    X    Not Taken
1    0    Sign of Displacement
1    1    Branch History Table

4.1.3.1 Performance Counters

The performance counters are reset to zero upon powerup. Otherwise, they are never cleared. The counters are intended as a means of counting events over a long period of time, relative to the event frequency. They provide no means of extracting intermediate counter values.
The performance counters may be enabled or disabled using ICCSR [45:44] (PCE [1:0]).

Since the counters continuously accumulate selected events, despite interrupts being enabled, the first interrupt after selecting a new counter input has an error bound as large as the selected overflow range. Some inputs can overcount events occurring simultaneously with D-stream errors that abort the actual event very late in the pipeline. For example, when counting load instructions, an attempt to execute a load that results in a TB miss exception increments the performance counter after the first aborted execution attempt and again after the TB fill routine, when the load instruction reissues and completes.

Performance counter interrupts are reported six cycles after the event that caused the counter to overflow. Additional delay can occur before an interrupt is serviced if the processor is executing PALcode, which always disables interrupts. Events occurring during the interval between counter overflow and interrupt service are counted toward the next interrupt. An interrupt is missed only in the case of a complete counter wraparound while interrupts are disabled.

The six cycles before an interrupt is triggered imply that a maximum of 12 instructions may have completed before the start of the interrupt service routine. When counting Icache misses, no intervening instructions can complete and the exception PC contains the address of the last Icache miss. Branch mispredictions allow a maximum of only two instructions to complete before the start of the interrupt service routine.

Table 22 lists performance counter 0 inputs and Table 23 lists performance counter 1 inputs.

Table 22 Performance Counter 0 Input Selection (in ICCSR)

MUX0 [3:0]  Input                Comment
000X        Total Issues/2       Counts total issues divided by 2; a dual issue increments the count by 1.
001X        Pipeline Dry         Counts cycles where nothing issued due to lack of valid I-stream data. Causes include Icache fill, misprediction, branch delay slots, and pipeline drain for exception.
010X        Load Instructions    Counts all load instructions.
011X        Pipeline Frozen      Counts cycles where nothing issued due to resource conflict.
100X        Branch Instructions  Counts all conditional branches, unconditional branches, JSR, and HW_REI instructions.
1011        PALmode              Counts cycles while executing in PALmode.
1010        Total cycles         Counts total cycles.
110X        Total Non-issues/2   Counts total non-issues divided by 2 ("no issue" increments count by 1).
111X        PERF_CNT_H [0]       Counts external events supplied to a pin at a selected system clock cycle interval.

Table 23 Performance Counter 1 Input Selection (in ICCSR)

MUX1 [2:0]  Input               Comment
000         Dcache miss         Counts total Dcache misses.
001         Icache miss         Counts total Icache misses.
010         Dual issues         Counts cycles of dual issue.
011         Branch Mispredicts  Counts both conditional branch mispredictions and JSR or HW_REI mispredictions. Conditional branch mispredictions cost 4 cycles and others cost 5 cycles of dry pipeline delay.
100         FP Instructions     Counts total floating-point operate instructions, that is, no FP branch, load, or store.
101         Integer Operate     Counts integer operate instructions, including LDA and LDAH with a destination other than R31.
110         Store Instructions  Counts total store instructions.
111         PERF_CNT_H [1]      Counts external events supplied to a pin at a selected system clock cycle interval.

4.1.4 Instruction Translation Buffer Page Table Entry Temporary Register (ITB_PTE_TEMP)

The ITB_PTE_TEMP register is a read-only holding register for ITB_PTE read data. Reads of the ITB_PTE register require two instructions to return data to the register file. The two instructions are as follows:

1. Read the ITB_PTE register data to the ITB_PTE_TEMP register.
2. Read the ITB_PTE_TEMP register data to the integer register file.

The ITB_PTE_TEMP register is updated on all ITB accesses, both read and write.
Because any ITB access overwrites the ITB_PTE_TEMP register, a read of the ITB_PTE to the ITB_PTE_TEMP should be followed closely by a read of the ITB_PTE_TEMP to the register file. Figure 5 shows the ITB_PTE_TEMP register format.

Note: The ITB_PTE_TEMP register can be read only while in PALmode, regardless of the state of the HWE bit in the ICCSR.

Figure 5 ITB_PTE_TEMP Register
    [63:35] RAZ, [34] ASM, [33:13] PFN[33:13], [12] URE, [11] SRE, [10] ERE, [09] KRE, [08:00] RAZ

4.1.5 Exceptions Address Register (EXC_ADDR)

The EXC_ADDR register is a read/write register used to restart the system after exceptions or interrupts. Software can read the register by way of the HW_MFPR instruction and write it by way of the HW_MTPR instruction. Hardware can also write the EXC_ADDR directly.

The HW_REI instruction executes a jump to the address contained in the EXC_ADDR register. The EXC_ADDR register is written by hardware after an exception to provide a return address for PALcode. The instruction pointed to by the EXC_ADDR register did not complete its execution.

The LSB of the EXC_ADDR register is used to indicate PALmode to the hardware. When the LSB is clear, the HW_REI instruction executes a jump to native (non-PAL) mode, enabling address translation.

CALL_PAL exceptions load the EXC_ADDR with the PC of the instruction following the CALL_PAL. This allows CALL_PAL service routines to return without needing to increment the value in the EXC_ADDR register.

This feature requires careful treatment in PALcode. Arithmetic traps and machine check exceptions can preempt CALL_PAL exceptions, resulting in an incorrect value being saved in the EXC_ADDR register. In the case of an arithmetic trap or machine check exception (and only in these cases), EXC_ADDR [1] takes on special meaning. PALcode servicing these two exceptions must:

* Interpret a 0 in EXC_ADDR [1] as indicating that the PC in EXC_ADDR [63:2] is too large by a value of 4 bytes, and subtract 4 before executing a HW_REI from this address.
* Interpret a 1 in EXC_ADDR [1] as indicating that the PC in EXC_ADDR [63:2] is correct, and clear the value of EXC_ADDR [1].

All other PALcode entry points except reset can expect EXC_ADDR [1] to be 0.

The logic allows the following code sequence to conditionally subtract 4 from the address in the EXC_ADDR register without the use of an additional register. This code sequence must be present in arithmetic trap and machine check flows only.

    HW_MFPR  Rx, EXC_ADDR    ; read EXC_ADDR into GPR
    SUBQ     Rx, 2, Rx       ; subtract 2 causing borrow if bit [1]=0
    BIC      Rx, 2, Rx       ; clear bit [1]
    HW_MTPR  Rx, EXC_ADDR    ; write back to EXC_ADDR

If EXC_ADDR [1] is 0, the subtraction borrows from bit [2], which reduces the PC in bits [63:2] by 4 bytes and leaves bit [1] set until the BIC clears it. If EXC_ADDR [1] is 1, the subtraction only clears bit [1] and the PC is unchanged.

Figure 6 shows the exception address register format.

Figure 6 Exception Address Register
    [63:02] PC[63:2], [01] IGN, [00] PAL

4.1.6 Clear Serial Line Interrupt Register (SL_CLR)

The SL_CLR is a write-only register that clears the:

* Serial line interrupt request
* Performance counter interrupt requests
* CRD interrupt request

The indicated bit must be written with a zero to clear the selected interrupt source. Figure 7 shows the clear serial line interrupt register format. Table 24 lists the register fields and a description.

Figure 7 Clear Serial Line Interrupt Register
    [63:33] IGN, [32] SLC, [31:16] IGN, [15] PC1, [14:09] IGN, [08] PC0, [07:03] IGN, [02] CRD, [01:00] IGN

Table 24 Clear Serial Line Interrupt Register Fields

Field  Type  Description
CRD    W0C   Clears the correctable read error interrupt request
PC1    W0C   Clears the performance counter 1 interrupt request
PC0    W0C   Clears the performance counter 0 interrupt request
SLC    W0C   Clears the serial line interrupt request

4.1.7 Serial Line Receive Register (SL_RCV)

The SL_RCV register contains a single read-only bit (RCV). This bit is used with the interrupt control registers, the sRomD_h pin, and the sRomClk_h pin to provide an on-chip serial line function. The RCV bit is functionally connected to the sRomD_h pin after the Icache is loaded from the external serial ROM.
Using a software timing loop, the RCV bit can be read to receive external data one bit at a time. A serial line interrupt is requested on detection of any transition on the receive line; the transition sets the SLR bit in the HIRR. The serial line interrupt can be disabled by clearing the SLE bit in the HIER register. Figure 8 shows the Serial Line Receive Register format.

Figure 8 Serial Line Receive Register
    [63:04] RAZ, [03] RCV, [02:00] RAZ

4.1.8 Instruction Translation Buffer ZAP Register (ITBZAP)

A write to this register invalidates all twelve instruction translation buffer (ITB) entries. It also resets both NLU pointers to their initial state. The ITBZAP register is written only in PALmode.

4.1.9 Instruction Translation Buffer ASM Register (ITBASM)

A write to this register invalidates all ITB entries in which the ITB_PTE ASM bit is equal to zero. The ITBASM register is written only in PALmode.

4.1.10 Instruction Translation Buffer IS Register (ITBIS)

A write to the ITBIS register invalidates all twelve ITB entries. It also resets both NLU pointers to their initial state. The ITBIS register is written only in PALmode. This register functions the same as the ITBZAP register.

4.1.11 Processor Status Register (PS)

The PS register is a read/write register containing only the current mode bits of the architecturally defined PS. Figure 9 shows the PS register format. See the Alpha Architecture Reference Manual for additional information.

Figure 9 Processor Status Register
    Write Format: [63:05] IGN, [04] CM1, [03] CM0, [02:00] IGN
    Read Format:  [63:35] RAZ, [34] CM1, [33:02] RAZ, [01] CM0, [00] RAZ

4.1.12 Exception Summary Register (EXC_SUM)

The EXC_SUM register records the various types of arithmetic traps that occurred since the last time the EXC_SUM was written (cleared). When the result of an arithmetic operation produces an arithmetic trap, the corresponding EXC_SUM bit is set.
The register containing the result of the operation is recorded in the exception register write mask parameter, as a single bit in a 64-bit shift register specifying registers F31-F0 and I31-I0. The EXC_SUM register provides a one-bit window to the exception register write mask parameter. The write mask is visible only through the EXC_SUM register.

Each read of the EXC_SUM shifts the mask by one bit, in the order F31-F0 and then I31-I0. The read also clears the corresponding bit. The EXC_SUM must be read 64 times to extract the complete mask and clear the entire register. If no integer traps are present (IOV=0), only the first 32 bits, which correspond to the floating-point registers, need to be read and cleared.

Any write to EXC_SUM clears bits [8:2] and does not affect the write mask bit.

The write mask register bit clears three cycles after a read. Code intended to read the register must allow at least three cycles between reads so that the clear and shift operations complete and successive reads return successive bits.

Figure 10 shows the exception summary register format. Table 25 lists the register fields and descriptions.

Figure 10 Exception Summary Register
    [63:34] RAZ, [33] MSK, [32:09] RAZ, [08] IOV, [07] INE, [06] UNF, [05] FOV, [04] DZE, [03] INV, [02] SWC, [01:00] RAZ

Table 25 Exception Summary Register Fields

Field  Type  Description
SWC    WA    Indicates software completion possible. The bit is set after a floating-point instruction containing the /S modifier completes with an arithmetic trap and all previous floating-point instructions that trapped since the last HW_MTPR EXC_SUM also contained the /S modifier. The SWC bit is cleared whenever a floating-point instruction without the /S modifier completes with an arithmetic trap. The bit remains cleared regardless of additional arithmetic traps until the register is written by way of an HW_MTPR instruction. The bit is always cleared upon any HW_MTPR write to the EXC_SUM register.
INV    WA    Indicates invalid operation.
DZE    WA    Indicates divide by zero.
FOV    WA    Indicates floating-point overflow.
UNF    WA    Indicates floating-point underflow.
INE    WA    Indicates floating inexact error.
IOV    WA    Indicates Fbox convert to integer overflow or integer arithmetic overflow.
MSK    RC    Exception Register Write Mask IPR window.

4.1.13 PAL_BASE Address Register (PAL_BASE)

The PAL_BASE register is a read/write register containing the base address for PALcode. This register is cleared by the hardware at reset. Figure 11 shows the PAL_BASE address register format.

Figure 11 PAL_BASE Address Register
    [63:34] IGN/RAZ, [33:14] PAL_BASE[33:14], [13:00] IGN/RAZ

4.1.14 Hardware Interrupt Request Register (HIRR)

The HIRR is a read-only register providing a record of all currently outstanding interrupt requests and summary bits at the time of the read. For each bit of the HIRR [5:0], there is a corresponding bit of the Hardware Interrupt Enable register (HIER) that must be set to request an interrupt. In addition to returning the status of the hardware interrupt requests, a read of the HIRR returns the state of the software interrupt and AST requests.

Note: A read of the HIRR can return a value of zero if the hardware interrupt was released before the read (passive release).

The register guarantees that the HWR bit reflects the status as shown by the HIRR bits. All interrupt requests are blocked while executing in PALmode. Figure 12 shows the hardware interrupt request register format. Table 26 lists the register fields and gives a description of each.
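As a rough illustration of how a simulator or analysis tool might interpret a value read from the HIRR, the Python sketch below splits a 64-bit read value into the fields named in Table 26. The bit positions are an assumption taken from the read format in Figure 12 and should be checked against it. The names HIRR_FIELDS and decode_hirr are hypothetical and are not part of any Digital software.

```python
# Assumed HIRR read-format layout (read from Figure 12): name -> (low bit, width).
HIRR_FIELDS = {
    "HWR": (1, 1),
    "SWR": (2, 1),
    "ATR": (3, 1),
    "CRR": (4, 1),
    "HIRR_5_3": (5, 3),   # copies of Irq_h [5:3]
    "PC1": (8, 1),
    "PC0": (9, 1),
    "HIRR_2_0": (10, 3),  # copies of Irq_h [2:0]
    "SLR": (13, 1),
    "SIRR": (14, 15),     # SIRR [15:1]
    "ASTRR": (29, 4),     # ASTRR [3:0] (USEK)
}


def decode_hirr(value):
    """Split a 64-bit HIRR read value into named fields."""
    fields = {
        name: (value >> low) & ((1 << width) - 1)
        for name, (low, width) in HIRR_FIELDS.items()
    }
    # HIRR [5:0] is stored as two separate 3-bit groups; rejoin them.
    fields["HIRR"] = (fields.pop("HIRR_5_3") << 3) | fields.pop("HIRR_2_0")
    return fields
```

Because a read of the SIRR or ASTRR returns the same set of request and summary bits, the same decoder would apply to those reads under the same layout assumption.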
Figure 12 Hardware Interrupt Request Register
    [63:33] RAZ, [32:29] ASTRR [3:0] (USEK), [28:14] SIRR [15:1], [13] SLR, [12:10] HIRR [2:0], [09] PC0, [08] PC1, [07:05] HIRR [5:3], [04] CRR, [03] ATR, [02] SWR, [01] HWR, [00] RAZ

Table 26 Hardware Interrupt Request Register Fields

Field         Type  Description
HWR           RO    Is set if any hardware interrupt request and corresponding enable is set.
SWR           RO    Is set if any software interrupt request and corresponding enable is set.
ATR           RO    Is set if any AST request and corresponding enable is set. This bit also requires that the processor mode be equal to or higher than the request mode. SIER [2] must be set to allow AST interrupt requests.
CRR           RO    CRD correctable read error interrupt request. This interrupt is cleared by way of the SL_CLR register.
HIRR [5:0]    RO    Contains delayed copies of the Irq_h [5:0] pins.
PC1           RO    Performance counter 1 interrupt request.
PC0           RO    Performance counter 0 interrupt request.
SLR           RO    Serial line interrupt request. Also see SL_RCV, SL_XMIT, and SL_CLR.
SIRR [15:1]   RO    Corresponds to software interrupt requests 15 through 1.
ASTRR [3:0]   RO    Corresponds to AST requests 3 through 0 (USEK).

4.1.15 Software Interrupt Request Register (SIRR)

The SIRR is a read/write register used to control software interrupt requests. For each bit of the SIRR, there is a corresponding bit of the Software Interrupt Enable register (SIER) that must be set to request an interrupt. Reads of the SIRR return the complete set of interrupt request registers and summary bits (see Table 26 for details). All interrupt requests are blocked while executing in PALmode. Figure 13 shows the SIRR format.

Figure 13 Software Interrupt Request Register
    Write Format: [63:48] IGN, [47:33] SIRR [15:1], [32:00] IGN
    Read Format:  [63:33] RAZ, [32:29] ASTRR [3:0] (USEK), [28:14] SIRR [15:1], [13] SLR, [12:10] HIRR [2:0], [09] PC0, [08] PC1, [07:05] HIRR [5:3], [04] CRR, [03] ATR, [02] SWR, [01] HWR, [00] RAZ

4.1.16 Asynchronous Trap Request Register (ASTRR)

The ASTRR is a read/write register.
It contains bits to request AST interrupts in each of the processor modes. To generate an AST interrupt, the corresponding enable bit in the ASTER must be set. Also, the processor must be in the selected processor mode or higher privilege as described by the current value of the PS CM bits. AST interrupts are enabled if SIER [2] is set. This provides a mechanism to lock out AST requests over certain IPL levels.

All interrupt requests are blocked while executing in PALmode. Reads of the ASTRR return the complete set of interrupt request registers and summary bits. See Table 26 for details. Figure 14 shows the ASTRR format.

Figure 14 Asynchronous Trap Request Register
    Write Format: [63:52] IGN, [51] UAR, [50] SAR, [49] EAR, [48] KAR, [47:00] IGN
    Read Format:  [63:33] RAZ, [32:29] ASTRR [3:0] (USEK), [28:14] SIRR [15:1], [13] SLR, [12:10] HIRR [2:0], [09] PC0, [08] PC1, [07:05] HIRR [5:3], [04] CRR, [03] ATR, [02] SWR, [01] HWR, [00] RAZ

4.1.17 Hardware Interrupt Enable Register (HIER)

The HIER is a read/write register. It enables the corresponding bits of the HIRR to request interrupts. The PC0, PC1, SLE, and CRE bits of this register enable the:

* Performance counters
* Serial line
* Correctable read interrupts

There is a one-to-one correspondence between the interrupt requests and enable bits. As with the reads of the interrupt request registers, reads of the HIER return the complete set of interrupt enable registers. See Table 26 for details. Figure 15 shows the hardware interrupt enable register format. Table 27 lists the register fields and a description of each.
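Because the HIER write format differs from its read format, a helper that assembles the write value can prevent misplaced bits. The Python sketch below is one possible way to do this. It assumes the write-format positions read from Figure 15 (SLE at bit 32, PC1 at bit 15, HIER [5:0] at bits [14:9], PC0 at bit 8, CRE at bit 2); the function name and its parameters are hypothetical.

```python
def hier_write_value(irq_enables=0, pc0=False, pc1=False, sle=False, cre=False):
    """Build a value for an HW_MTPR write to the HIER (assumed Figure 15 layout).

    irq_enables is a 6-bit mask; bit n enables the Irq_h [n] pin interrupt.
    """
    if not 0 <= irq_enables <= 0x3F:
        raise ValueError("irq_enables must fit in 6 bits (Irq_h [5:0])")
    value = irq_enables << 9      # HIER [5:0] in bits [14:9]
    value |= int(pc0) << 8        # performance counter 0 enable
    value |= int(pc1) << 15       # performance counter 1 enable
    value |= int(sle) << 32       # serial line interrupt enable
    value |= int(cre) << 2        # correctable read interrupt enable
    return value
```

For example, hier_write_value(0b000001, pc0=True) enables only the Irq_h [0] pin and the performance counter 0 interrupt. All other bits are written as zero, which is harmless because the remaining write-format bits are ignored.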
Figure 15 Hardware Interrupt Enable Register
    Write Format: [63:33] IGN, [32] SLE, [31:16] IGN, [15] PC1, [14:09] HIER [5:0], [08] PC0, [07:03] IGN, [02] CRE, [01:00] IGN
    Read Format:  [63:33] RAZ, [32] UAE, [31] SAE, [30] EAE, [29] KAE, [28:14] SIER [15:1], [13] SLE, [12:10] HIER [2:0], [09] PC0, [08] PC1, [07:05] HIER [5:3], [04] CRE, [03:00] RAZ

Table 27 Hardware Interrupt Enable Register Fields

Field         Type  Description
HIER [5:0]    RW    Interrupt enables for pins Irq_h [5:0].
SIER [15:1]   RW    Corresponds to software interrupt requests 15 through 1.
ASTER [3:0]   RW    Corresponds to ASTRR enable 3 through 0 (USEK).
PC1           RW    Performance counter 1 interrupt enable.
PC0           RW    Performance counter 0 interrupt enable.
SLE           RW    Serial line interrupt enable. Also see SL_RCV, SL_XMIT, and SL_CLR.
CRE           RW    CRD correctable read error interrupt enable. This interrupt request is cleared by way of the SL_CLR register.

4.1.18 Software Interrupt Enable Register (SIER)

The SIER is a read/write register. It enables the corresponding bits of the SIRR to request interrupts. There is a one-to-one correspondence between the interrupt requests and enable bits. As with the reads of the interrupt request registers, reads of the SIER return the complete set of interrupt enable registers. See Table 26 for details. Figure 16 shows the software interrupt enable register format.

Figure 16 Software Interrupt Enable Register
    Write Format: [63:48] IGN, [47:33] SIER [15:1], [32:00] IGN
    Read Format:  [63:33] RAZ, [32] UAE, [31] SAE, [30] EAE, [29] KAE, [28:14] SIER [15:1], [13] SLE, [12:10] HIER [2:0], [09] PC0, [08] PC1, [07:05] HIER [5:3], [04] CRE, [03:00] RAZ

4.1.19 AST Interrupt Enable Register (ASTER)

The ASTER is a read/write register. It enables the corresponding bits of the ASTRR to request interrupts. There is a one-to-one correspondence between the interrupt requests and enable bits.
As with the reads of the interrupt request registers, reads of the ASTER return the complete set of interrupt enable registers. See Table 26 for details. Figure 17 shows the ASTER format.

Figure 17 AST Interrupt Enable Register
    Write Format: [63:52] IGN, [51] UAE, [50] SAE, [49] EAE, [48] KAE, [47:00] IGN
    Read Format:  [63:33] RAZ, [32] UAE, [31] SAE, [30] EAE, [29] KAE, [28:14] SIER [15:1], [13] SLE, [12:10] HIER [2:0], [09] PC0, [08] PC1, [07:05] HIER [5:3], [04] CRE, [03:00] RAZ

4.1.20 Serial Line Transmit Register (SL_XMIT)

The SL_XMIT register contains a single write-only bit (TMT). This bit is used with the interrupt control registers, the sRomD_h pin, and the sRomClk_h pin to provide an on-chip serial line function. The TMT bit is functionally connected to the sRomClk_h pin after the Icache is loaded from the external serial ROM. Writing the TMT bit under a software timing loop can be used to transmit data off chip, one bit at a time. Figure 18 shows the SL_XMIT register format.

Figure 18 Serial Line Transmit Register
    [63:05] IGN, [04] TMT, [03:00] IGN

4.2 Abox Internal Processor Registers

The following sections describe the Abox internal processor registers.

4.2.1 Translation Buffer Control Register (TB_CTL)

The granularity hint (GH) field selects between the TB page mapping sizes. There are two sizes in the ITB and four sizes in the DTB. When only two sizes are provided, the large-page-select (GH=11(bin)) field selects the largest mapping size (512 * 8 KB). All other values select the smallest (8 KB) size. The GH field affects both reads and writes to the ITB and DTB. Figure 19 shows the translation buffer control register format. See the Alpha Architecture Reference Manual for additional information.

Figure 19 Translation Buffer Control Register
    [63:07] IGN, [06:05] GH, [04:00] IGN

4.2.2 Data Translation Buffer Page Table Entry Register (DTB_PTE)

The DTB_PTE register is a read/write register representing the 32-entry DTB.
The entry to be written is chosen by a not-last-used (NLU) algorithm implemented in the hardware. A DTB round robin (DTB_RR) algorithm can be selected by setting ABOX_CTL [9]. Writes to the DTB_PTE use the memory format bit positions as described in the Alpha Architecture Reference Manual, with the exception that some fields are ignored. The valid bit is not represented in hardware.

The DTB's tag array is updated simultaneously from the TB_Tag register when the DTB_PTE register is written. Reads of the DTB_PTE require two instructions. The first instruction sends the PTE data to the Data Translation Buffer Page Table Entry Temporary register (DTB_PTE_TEMP). The second instruction, reading from the DTB_PTE_TEMP register, returns the PTE entry to the register file. Reading or writing the DTB_PTE register increments the TB entry pointer of the DTB, which allows reading the entire set of DTB_PTE entries. Figure 20 shows the DTB_PTE register format.

Figure 20 Data Translation Buffer Page Table Entry Register
    [63:53] IGN, [52:32] PFN[33:13], [31:16] IGN, [15] UWE, [14] SWE, [13] EWE, [12] KWE, [11] URE, [10] SRE, [09] ERE, [08] KRE, [07:05] IGN, [04] ASM, [03] IGN, [02] FOW, [01] FOR, [00] IGN

4.2.3 Data Translation Buffer Page Table Entry Temporary Register (DTB_PTE_TEMP)

The DTB_PTE_TEMP register is a read-only holding register for DTB_PTE read data. Reads of the DTB_PTE require two instructions to return the data to the register file. The two instructions are as follows:

* Read the DTB_PTE register data to the DTB_PTE_TEMP register.
* Read the DTB_PTE_TEMP register data to the integer register file.

Figure 21 shows the DTB_PTE_TEMP register format.
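To make the DTB_PTE_TEMP layout concrete, the following Python sketch decodes a 64-bit value read from the register into a physical page base and a set of protection flags. It is only an illustration: the bit positions are assumed from Figure 21, the page size is assumed to be the smallest (8 KB) mapping, and all names are hypothetical.

```python
# Assumed DTB_PTE_TEMP layout (read from Figure 21); verify before relying on it.
PFN_MASK = ((1 << 34) - 1) & ~((1 << 13) - 1)  # PFN[33:13] sits in bits [33:13]
PAGE_OFFSET_MASK = (1 << 13) - 1               # offset within an 8 KB page
FLAG_BITS = {
    "ASM": 34,
    "URE": 12, "SRE": 11, "ERE": 10, "KRE": 9,
    "UWE": 8, "SWE": 7, "EWE": 6, "KWE": 5,
    "FOW": 4, "FOR": 3,
}


def decode_dtb_pte_temp(value):
    """Return (page_base, flags) for a 64-bit DTB_PTE_TEMP read value."""
    flags = {name: bool((value >> bit) & 1) for name, bit in FLAG_BITS.items()}
    return value & PFN_MASK, flags


def physical_address(pte_temp, virtual_address):
    """Combine the page base with the 8 KB page offset of a virtual address."""
    page_base, _ = decode_dtb_pte_temp(pte_temp)
    return page_base | (virtual_address & PAGE_OFFSET_MASK)
```

Under this layout the PFN field already occupies physical address bits [33:13], so the page base is obtained with a mask and no shift.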
Figure 21 Data Translation Buffer Page Table Entry Temporary Register
    [63:35] RAZ, [34] ASM, [33:13] PFN[33:13], [12] URE, [11] SRE, [10] ERE, [09] KRE, [08] UWE, [07] SWE, [06] EWE, [05] KWE, [04] FOW, [03] FOR, [02:00] RAZ

4.2.4 Memory Management Control and Status Register (MM_CSR)

When D-stream faults occur, the information about the fault is latched and saved in the MM_CSR register. The Virtual Address register (VA) and the MM_CSR register are locked against further updates until the software reads the VA register. PALcode must explicitly unlock this register whenever its entry point is higher in priority than a DTB miss. The MM_CSR bits are modified by the hardware only when the register is not locked and a memory management error or a DTB miss occurs. The MM_CSR is unlocked after reset. Figure 22 shows the MM_CSR register format. Table 28 lists the register fields and a brief description.

Figure 22 Memory Management Control and Status Register
    [63:15] RAZ, [14:09] OPCODE, [08:04] RA, [03] FOW, [02] FOR, [01] ACV, [00] WR

Table 28 Memory Management Control and Status Register

Field   Type  Description
WR      RO    Set if reference that caused error was a write.
ACV     RO    Set if reference caused an access violation.
FOR     RO    Set if reference was a read and the PTE's FOR bit was set.
FOW     RO    Set if reference was a write and the PTE's FOW bit was set.
RA      RO    RA field of the faulting instruction.
OPCODE  RO    Opcode field of the faulting instruction.

4.2.5 Virtual Address Register (VA)

When D-stream faults or DTB misses occur, the effective virtual address associated with the fault or miss is latched in the read-only VA register. The VA and MM_CSR registers are locked against further updates until the software reads the VA register. The VA register is unlocked after reset. PALcode must explicitly unlock this register whenever its entry point is higher in priority than a DTB miss.

4.2.6 Data Translation Buffer ZAP Register (DTBZAP)

The DTBZAP is a pseudo-register.
A write to this register invalidates all 32 DTB entries. It also resets the not-last-used (NLU) pointer to its initial state.

4.2.7 Data Translation Buffer ASM Register (DTBASM)

The DTBASM is a pseudo-register. A write to this register invalidates all 32 DTB entries in which the ASM bit is equal to zero.

4.2.8 Data Translation Buffer Invalidate Single Register (DTBIS)

A write to this pseudo-register invalidates the DTB entry that maps the virtual address held in the integer register. The integer register is identified by the Rb field of the HW_MTPR instruction used to perform the write.

4.2.9 Flush Instruction Cache Register (FLUSH_IC)

A write to this pseudo-register flushes the entire instruction cache.

4.2.10 Flush Instruction Cache ASM Register (FLUSH_IC_ASM)

A write to this pseudo-register invalidates all Icache blocks in which the ASM bit is clear.

4.2.11 Abox Control Register (ABOX_CTL)

Figure 23 shows the Abox control register format. Table 29 lists the register fields and descriptions.

Figure 23 Abox Control Register
    [63:16] MBZ, [15] DOUBLE_INVAL, [14] NOCHK_PAR, [13] F_TAG_ERR, [12] DC_16K, [11] DC_FHIT, [10] DC_ENA, [09] DTB_RR, [08] NCACHE_NDISTURB, [07] STC_NORESULT, [06] EMD_EN, [05] SPE_2, [04] SPE_1, [03] IC_SBUF_EN, [02] CRD_EN, [01] MCHK_EN, [00] WB_DIS (bits [15:0] are all WO)

Table 29 Abox Control Register Fields

Field    Type  Description
WB_DIS   WO,0  Write buffer unload disable. When set, this bit prevents the write buffer from sending write data to the BIU. It should be set for diagnostics only.
MCHK_EN  WO,0  Machine check enable. When this bit is set, the Abox generates a machine check when errors that are not correctable by the hardware are encountered. When this bit is cleared, uncorrectable errors do not cause a machine check. However, the BIU_STAT, DC_STAT, BIU_ADDR, and FILL_ADDR registers are updated and locked when the errors occur.
CRD_EN   WO,0  Corrected read data interrupt enable. When this bit is set, the Abox generates an interrupt request whenever a pin bus transaction is terminated with a cAck_h code of SOFT_ERROR.
IC_SBUF_EN  WO,0  Icache stream buffer enable. When set, this bit enables operation of a single-entry Icache stream buffer.
SPE_1    WO,0  When this bit is set, it enables one-to-one superpage mapping of the D-stream virtual addresses with VA [42:30] = 1FFE (Hex) to the physical addresses with PA [33:30] = 0 (Hex). Access is only allowed in kernel mode.
               Note: For the 21064A-275-PC, this bit must always be set when virtual-to-physical mapping is enabled. Operation in native mode (not PALmode) with this bit clear will cause 21064A-275-PC operation to be UNPREDICTABLE.
SPE_2    WO,0  When this bit is set, it enables one-to-one superpage mapping of the D-stream virtual addresses with VA [33:13] directly to physical addresses PA [33:13], if virtual address bits VA [42:41] = 2. Virtual address bits VA [40:34] are ignored in this translation. Access is only allowed in kernel mode.
EMD_EN   WO,0  Limited hardware support is provided for big endian data formats by way of bit [6] of the ABOX_CTL register. When set, this bit inverts the physical address bit [2] for all D-stream references. It is intended that the chip endian mode be selected during initialization of PALcode only.
STC_NORESULT  WO,0  When clear, the 21064A implements lock operations in conformance with the Alpha architecture. When set, the 21064A does not conform to the Alpha architecture, and the following two items apply:
               * The result written into the register identified by Ra in STL_C/STQ_C and HW_ST/C instructions is UNPREDICTABLE.
                 This allows the Ibox to restart the memory reference pipeline when the STL_C/STQ_C is transferred from the write buffer to the BIU, and so increases the repetition rate with which STL_C/STQ_C instructions can be processed.
               * LDL_L/LDQ_L, STL_C/STQ_C, and HW_ST/C instructions will invalidate the Dcache line associated with their generated address. These invalidates will not be visible to load or store instructions that issue in the two CPU cycles after the LDL_L/LDQ_L, STL_C/STQ_C, or HW_ST/C issues.
               This bit is cleared by chip reset.
NCACHE_NDISTURB  WO,0  When this bit is set, it enables a mode that makes noncacheable only those external reads for which the 21064A does not probe the external cache. This bit is cleared by chip reset.
DTB_RR   WO,0  When this bit is set, it selects the round robin replacement algorithm in the DTB.
DC_ENA   WO,0  Dcache enable. When clear, this bit disables and flushes the Dcache. When set, this bit enables the Dcache.
DC_FHIT  WO,0  Dcache force hit. When set, this bit forces all D-stream references to hit in the Dcache. This bit takes precedence over DC_ENA. That is, when DC_FHIT is set and DC_ENA is clear, all D-stream references hit in the Dcache.
DC_16K   WO,0  Set to select 16K byte Dcache. Clear to select 8K byte Dcache.
F_TAG_ERR  WO,0  Set to generate bad Dcache tag parity on fills.
NOCHK_PAR  WO,0  Set to disable checking of Icache and Dcache parity.
DOUBLE_INVAL  WO,0  When set, asserting dInvReq_h [0] invalidates both Dcache blocks addressed by iAdr_h [12:5].

4.2.12 Alternate Processor Mode Register (ALT_MODE)

The ALT_MODE is a write-only register. The AM field specifies the alternate processor mode used by HW_LD and HW_ST instructions that have their ALT bit (bit [14]) set. Figure 24 shows the alternate processor mode register format and Table 30 lists the register modes.
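The AM encoding can be expressed in a few lines. The Python sketch below places the mode code from Table 30 into ALT_MODE [4:3] to form the value that would be written to the register. The dictionary and function names are hypothetical and serve only as an illustration.

```python
# Mode codes as listed in Table 30 (ALT_MODE [4:3]).
ALT_MODE_CODES = {
    "kernel": 0b00,
    "executive": 0b01,
    "supervisor": 0b10,
    "user": 0b11,
}


def alt_mode_write_value(mode):
    """Return the ALT_MODE write value for the named processor mode."""
    return ALT_MODE_CODES[mode] << 3  # AM field occupies bits [4:3]
```

For example, selecting supervisor mode gives a write value of 10 (Hex), and all bits outside [4:3] are left zero because the register ignores them.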
Figure 24 Alternate Processor Mode Register
    [63:05] IGN, [04:03] AM, [02:00] IGN

Table 30 Alternate Processor Mode Register

ALT_MODE [4:3]  Mode
00              Kernel
01              Executive
10              Supervisor
11              User

4.2.13 Cycle Counter Register (CC)

The 21064A supports a cycle counter, as described in the Alpha Architecture Reference Manual. When enabled, the CC increments once each CPU cycle. The HW_MTPR Rn, CC instruction writes CC [63:32] with the value held in Rn [63:32]; CC [31:0] is not changed. This register is read by the RPCC instruction as defined in the Alpha Architecture Reference Manual. Figure 25 shows the register format when read by the HW_MFPR Rn, CC instruction (read format) and when written by the HW_MTPR Rn, CC instruction (write format).

Figure 25 Cycle Counter Register
    Read Format:  [63:32] OFFSET, [31:00] COUNTER
    Write Format: [63:32] OFFSET, [31:00] IGN

4.2.14 Cycle Counter Control Register (CC_CTL)

The HW_MTPR Rn, CC_CTL instruction writes CC [31:0] with the value held in Rn [31:0]; CC [63:32] is not changed. CC [3:0] must be written with zero. If Rn [32] is set, the counter is enabled; otherwise, the counter is disabled. CC_CTL is a write-only register. Figure 26 shows the register format when written by the HW_MTPR Rn, CC_CTL instruction.

Figure 26 Cycle Counter Control Register
    [63:33] IGN, [32] ENABLE, [31:00] COUNTER

4.2.15 Bus Interface Unit Control Register (BIU_CTL)

Figure 27 shows the bus interface unit control register format. Table 31 lists the register fields and gives a description of each.
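The BC_RD_SPD, BC_WR_SPD, and BC_WE_CTL descriptions in Table 31 reduce to simple arithmetic, which the Python sketch below illustrates. It covers only the field values: each speed field holds one less than the RAM time in CPU cycles, and BC_WE_CTL bit n corresponds to cycle n+2 of the RAM write access. The function names are hypothetical, and the only BIU_CTL position used is the one stated in the table (BC_WE_CTL [0] is BIU_CTL [13]).

```python
def bc_rd_spd(read_access_cycles):
    """BC_RD_SPD field value for a read access time of 3 to 16 CPU cycles."""
    if not 3 <= read_access_cycles <= 16:
        raise ValueError("read access time must be 3 to 16 CPU cycles")
    return read_access_cycles - 1


def bc_wr_spd(write_cycle_cycles):
    """BC_WR_SPD field value for a write cycle time of 2 to 16 CPU cycles."""
    if not 2 <= write_cycle_cycles <= 16:
        raise ValueError("write cycle time must be 2 to 16 CPU cycles")
    return write_cycle_cycles - 1


def bc_we_ctl(asserted_cycles):
    """15-bit BC_WE_CTL mask from the CPU cycles (counted from 1) in which
    the write enable and chip enable pins should be asserted.

    Cycle 1 is rejected because the pins are never asserted in the first
    cycle of a RAM write access.
    """
    mask = 0
    for cycle in asserted_cycles:
        if not 2 <= cycle <= 16:
            raise ValueError("write enable can be asserted in cycles 2 to 16 only")
        mask |= 1 << (cycle - 2)  # BC_WE_CTL [0] is the second cycle
    return mask


def bc_we_ctl_in_biu_ctl(asserted_cycles):
    """Position the mask within BIU_CTL, where BC_WE_CTL [0] is bit [13]."""
    return bc_we_ctl(asserted_cycles) << 13
```

As a usage example under these assumptions, a RAM with a 5-cycle read access time and a 4-cycle write that asserts write enable in cycles 2 and 3 would use BC_RD_SPD = 4, BC_WR_SPD = 3, and BC_WE_CTL = 3.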
Figure 27 21064A Bus Interface Unit Control Register
    Fields, in ascending bit order: BC_ENA, ECC, OE, BC_FHIT, BC_RD_SPD, BC_WR_SPD, DELAY_WDATA, BC_WE_CTL [15:1], BC_SIZE, BAD_TCP, BC_PA_DIS, BAD_DP, BYTE_PARITY, SYS_WRAP, IMAP_EN, BC_BURST_SPD, BC_BURST_ALL, FAST_LOCK, MBZ

Table 31 Bus Interface Unit Control Register Fields

Field    Type  Description
BC_ENA   WO,0  External cache enable. When this bit is cleared, the external cache is disabled. When the Bcache is disabled, the BIU does not probe the external cache tag store for read/write references; it launches a request on cReq_h immediately.
ECC      WO,0  When this bit is clear, the 21064A generates/expects parity on four of the check_h pins. When this bit is set, the 21064A generates/expects ECC on the check_h pins.
OE       WO,0  When this bit is set, the 21064A does not assert its chip enable pins during RAM write cycles, thus enabling these pins to be connected to the output enable pins of the cache RAMs.
               Caution: The output enable bit in the BIU_CTL register (BIU_CTL [2]) must be set if the system uses SRAMs in the output enable mode (that is, if the tagCEOE and/or dataCEOE signals are connected to the output enable input of the SRAM and the 21064A enable is always enabled). If this bit is inadvertently cleared, the tag and data SRAMs will be enabled during writes, and damage can result.
BC_FHIT  WO,0  External cache force hit. When this bit is set and the BC_ENA bit is also set, all pin bus READ_BLOCK and WRITE_BLOCK transactions are forced to hit in external cache. Tag and tag control parity are ignored. The BC_ENA bit takes precedence over BC_FHIT. When BC_ENA is cleared and BC_FHIT is set, no tag probes occur and external requests are directed to the cReq_h pins.
               Note: The BC_PA_DIS field takes precedence over the BC_FHIT bit.
BC_RD_SPD  WO,0  External cache read speed. This field indicates to the BIU the read access time of the RAMs used to implement the off-chip external cache, measured in CPU cycles. It should be written with a value equal to one less than the read access time of the external cache RAMs. 21064A access times for reads must be in the range of 3 to 16 CPU cycles, which means the values for the BC_RD_SPD field are in the range of 2 to 15.
BC_WR_SPD  WO,0  External cache write speed. This field indicates to the BIU the write cycle time of the RAMs used to implement the off-chip external cache, measured in CPU cycles. It should be written with a value equal to one less than the write cycle time of the external cache RAMs. The access times for writes must be in the range of 2 to 16 CPU cycles, which means the values for the BC_WR_SPD field are in the range of 1 to 15.
DELAY_WDATA  WO,0  When this bit is set, it changes the timing of the data bus during external cache writes.
BC_WE_CTL  WO,0  External cache write enable control. This field is used to control the timing of the write enable and chip enable pins during writes into the data and tag control RAMs. It consists of 15 bits, where each bit determines the value placed on the write enable and chip enable pins during a given CPU cycle of the RAM write access. When a given bit of the BC_WE_CTL is set, the write enable and chip enable pins are asserted during the corresponding CPU cycle of the RAM access. BC_WE_CTL [0] (bit [13] in BIU_CTL) corresponds to the second cycle of the write access, BC_WE_CTL [1] (bit [14] in BIU_CTL) to the third CPU cycle, and so on. The write enable pins will never be asserted in the first CPU cycle of a RAM write access. Unused bits in the BC_WE_CTL field must be written with zeros.
BC_SIZE  WO,0  This field is used to indicate the size of the external cache. See Table 32 for the encodings.
BAD_TCP  WO,0  When set, this bit causes the 21064A to write bad parity into the tag control RAM whenever it does a fast external RAM write. (Diagnostic use only.)
BC_PA_DIS  WO,0  This 4-bit field may be used to prevent the CPU chip from using the external cache to service reads and writes based upon the quadrant of physical address space that they reference. The correspondence between this bit field and the physical address space is shown in Table 33. When a read or write reference is presented to the BIU, the values of BC_PA_DIS, BC_ENA, and the physical address bits [33:32] determine whether to attempt to use the external cache to satisfy the reference. If the external cache is not to be used for a given reference, the BIU does not probe the tag store and makes the appropriate system request immediately. The value of BC_PA_DIS has NO impact on which portions of the physical address space can be cached in the primary caches. System components control this by way of the dRAck_h field of the pin bus.
BAD_DP   WO,0  When this bit is set, the 21064A inverts the value placed on bits [0], [7], [14], and [21] of the check_h [27:0] field during off-chip writes. This produces bad parity when the 21064A is in parity mode, and bad check bit codes when in ECC mode. (Diagnostic use only.)
SYS_WRAP  WO,0  When this bit is set, it indicates that the system returns read response data wrapped around the requested chunk. This bit is cleared by chip reset.
BC_BURST_SPD  WO,0  When these bits are cleared, the timing of all Bcache reads is controlled by the value of BC_RD_SPD. When these bits are set in 128-bit mode, the second read takes BC_BURST_SPD+1 cycles. When these bits are set in 64-bit mode, the second and fourth reads take BC_BURST_SPD+1 cycles. If BC_BURST_ALL is set, the third read takes BC_BURST_SPD+1 cycles also.
BC_BURST_ALL  WO,0  In 64-bit mode this bit is set if BC_BURST_SPD should be used to time the third (of four) RAM read cycle.

BYTE_PARITY  WO,0  If set when BIU_CTL ECC is cleared, external byte parity is selected. If set when BIU_CTL ECC is set, this bit is ignored.

IMAP_EN  WO,0  Set to allow dMapWE_h [1:0] to assert for I-stream backup cache reads.

FAST_LOCK  WO,0  When set, FAST_LOCK mode operation is selected. FAST_LOCK mode can only be used when BIU_CTL [2] OE is also set, indicating that OE mode Bcache RAMs are used.

Table 32 lists the encoding for BC_SIZE. Table 33 lists the BIU_CTL physical addresses.

Table 32 BC_SIZE

BC_SIZE  Cache Size  BC_SIZE  Cache Size
000      128 KB      100      2 MB
001      256 KB      101      4 MB
010      512 KB      110      8 MB
011      1 MB        111      16 MB

Table 33 BC_PA_DIS

BIU_CTL Bits  Physical Address  BIU_CTL Bits  Physical Address
32            PA [33:32] = 0    34            PA [33:32] = 2
33            PA [33:32] = 1    35            PA [33:32] = 3

4.2.16 Cache Status Register (C_STAT)

The C_STAT is a read-only register and is only used by the diagnostics. Figure 28 shows the 21064A Dcache status register format. Table 34 lists the register fields and gives a description of each.

Figure 28 Cache Status Register (bit-field diagram; fields CHIP_ID, DC_HIT, DC_ERR, and IC_ERR, with the remaining bits RAZ)

Table 34 Cache Status Register Fields

Field  Type  Description

CHIP_ID  RO  These bits identify the devices as listed here:
* 001 (binary): Early version of 21064A
* 011 (binary): Production version of 21064A

DC_HIT  RO  This bit indicates whether the last load or store instruction processed by the Abox hit (DC_HIT set) or missed (DC_HIT clear) the Dcache. Loads that miss the Dcache can be completed without requiring external reads. (Diagnostic use only.)

DC_ERR  RC  Set by Dcache parity error.

IC_ERR  RC  Set by Icache parity error.

4.2.17 Bus Interface Unit Status Register (BIU_STAT)

The BIU_STAT is a read-only register.
Bits [6:0] of the BIU_STAT register are locked against further updates when one of the following bits is set:
* BIU_HERR
* BIU_SERR
* BC_TPERR
* BC_TCPERR

The address associated with the error is latched and locked in the BIU_ADDR register. Bits [6:0] of the BIU_STAT register and BIU_ADDR are also spuriously locked when a parity error or an uncorrectable ECC error occurs during a primary cache fill operation. The BIU_STAT bits [7:0] and BIU_ADDR are unlocked when the BIU_ADDR register is read.

When FILL_ECC or FILL_DPERR is set, BIU_STAT bits [13:8] are locked against further updates. The address associated with the error is latched and locked in the FILL_ADDR register. The BIU_STAT bits [14:8] and FILL_ADDR are unlocked when the FILL_ADDR register is read.

This register is not unlocked or cleared by reset and needs to be explicitly cleared by PALcode.

Figure 29 shows the bus interface unit status register format. Table 35 lists the register fields and gives a description of each.

Figure 29 Bus Interface Unit Status Register (bit-field diagram; fields, from the low-order end: BIU_HERR, BIU_SERR, BC_TPERR, BC_TCPERR, BIU_CMD, FATAL1, FILL_ECC, FILL_CRD, FILL_DPERR, FILL_IRD, FILL_QW, FATAL2; upper bits RAZ)

Table 35 Bus Interface Unit Status Register Fields

Field  Type  Description

BIU_HERR  RO  When this bit is set, it indicates that an external cycle was terminated with the cAck_h pins indicating HARD_ERROR.

BIU_SERR  RO  When this bit is set, it indicates that an external cycle was terminated with the cAck_h pins indicating SOFT_ERROR.

BC_TPERR  RO  When this bit is set, it indicates that an external cache tag probe encountered bad parity in the tag address RAM.

BC_TCPERR  RO  When this bit is set, it indicates that an external cache tag probe encountered bad parity in the tag control RAM.

BIU_CMD  RO  This field latches the cycle type on the cReq_h pins when a BIU_HERR, BIU_SERR, BC_TPERR, or BC_TCPERR error occurs.
FATAL1  RO  When this bit is set, it indicates that an external cycle was terminated with the cAck_h pins indicating HARD_ERROR or that an external cache tag probe encountered bad parity in the tag address RAM or the tag control RAM while one of BIU_HERR, BIU_SERR, BC_TPERR, or BC_TCPERR was already set.

FILL_ECC  RO  ECC error. When this bit is set, it indicates that primary cache fill data received from outside the CPU chip contained an ECC error.

FILL_CRD  RO  Correctable read. This bit only has meaning when FILL_ECC is set. When this bit is set, it indicates that the information latched in BIU_STAT [13:8], FILL_ADDR, and FILL_SYNDROME relates to an error quadword which does not contain multi-bit errors in either of its component longwords.

FILL_DPERR  RO  Fill parity error. When this bit is set, it indicates that the BIU received data with a parity error from outside the CPU chip while performing either a Dcache or Icache fill. FILL_DPERR is only meaningful when the CPU chip is in parity mode, as opposed to ECC mode.

FILL_IRD  RO  This bit is only meaningful when either FILL_ECC or FILL_DPERR is set. The FILL_IRD bit is set to indicate that the error that caused FILL_ECC or FILL_DPERR to set occurred during an Icache fill, and clear to indicate that the error occurred during a Dcache fill.

FILL_QW  RO  This field is only meaningful when either FILL_ECC or FILL_DPERR is set. The FILL_QW field identifies the quadword within the hexaword primary cache fill block which caused the error. It can be used together with FILL_ADDR [33:5] to get the complete physical address of the bad quadword.

FATAL2  RO  When this bit is set, it indicates that a primary cache fill operation resulted in either a multi-bit ECC error or in a parity error while FILL_ECC or FILL_DPERR was already set.
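The field descriptions in Table 35 can be turned into a small decoder for diagnostic tooling. The following Python sketch is illustrative only: the bit positions are an assumption inferred from the lock ranges given in the text ([6:0], bit [7], [13:8], bit [14]) and the field order in Figure 29, and the function names are hypothetical.

```python
# Assumed BIU_STAT layout as (lsb, width) pairs. Inferred from the lock ranges
# in the text and the field order in Figure 29; verify before relying on it.
BIU_STAT_FIELDS = {
    "BIU_HERR":   (0, 1),
    "BIU_SERR":   (1, 1),
    "BC_TPERR":   (2, 1),
    "BC_TCPERR":  (3, 1),
    "BIU_CMD":    (4, 3),
    "FATAL1":     (7, 1),
    "FILL_ECC":   (8, 1),
    "FILL_CRD":   (9, 1),
    "FILL_DPERR": (10, 1),
    "FILL_IRD":   (11, 1),
    "FILL_QW":    (12, 2),
    "FATAL2":     (14, 1),
}


def decode_biu_stat(value):
    """Split a raw BIU_STAT value into its named fields."""
    return {
        name: (value >> lsb) & ((1 << width) - 1)
        for name, (lsb, width) in BIU_STAT_FIELDS.items()
    }


def summarize_biu_stat(value):
    """Describe the latched errors, honoring the 'only meaningful when' rules."""
    f = decode_biu_stat(value)
    notes = []
    bus = [n for n in ("BIU_HERR", "BIU_SERR", "BC_TPERR", "BC_TCPERR") if f[n]]
    if bus:
        # BIU_CMD is latched only when one of these four errors occurs.
        notes.append("bus/tag error: %s (BIU_CMD=%d)" % (", ".join(bus), f["BIU_CMD"]))
    if f["FILL_ECC"] or f["FILL_DPERR"]:
        kind = "ECC" if f["FILL_ECC"] else "parity"
        stream = "Icache" if f["FILL_IRD"] else "Dcache"
        note = "%s error on %s fill, quadword %d" % (kind, stream, f["FILL_QW"])
        if f["FILL_ECC"] and f["FILL_CRD"]:
            note += " (correctable)"
        notes.append(note)
    if f["FATAL1"]:
        notes.append("FATAL1: second bus/tag error while one was already latched")
    if f["FATAL2"]:
        notes.append("FATAL2: second fill error while one was already latched")
    return notes
```

For example, under the assumed layout a value with FILL_ECC, FILL_CRD, and FILL_IRD set and FILL_QW equal to 2 is reported as a correctable ECC error on an Icache fill in quadword 2.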
4.2.18 Bus Interface Unit Address Register (BIU_ADDR)

The BIU_ADDR is a read-only register that contains the physical address associated with errors reported by BIU_STAT [7:0]. Its contents are meaningful only when one of BIU_HERR, BIU_SERR, BC_TPERR, or BC_TCPERR is set. Reads of the BIU_ADDR register unlock both BIU_ADDR and BIU_STAT [7:0].

The BIU_ADDR bits [33:5] contain the values of adr_h bits [33:5] associated with the pin bus transaction that resulted in the error indicated in BIU_STAT [7:0]. If the BIU_CMD field of the BIU_STAT register indicates that the transaction that received the error was READ_BLOCK or load_locked, then BIU_ADDR [4:2] are UNPREDICTABLE. If the BIU_CMD field of the BIU_STAT register encodes any pin bus command other than READ_BLOCK or load_locked, then BIU_ADDR bits [4:2] will contain zeros. The BIU_ADDR bits [63:34] and BIU_ADDR bits [1:0] always read as zero.

Figure 30 shows the bus interface unit address register (BIU_ADDR) format.

Figure 30 Bus Interface Unit Address Register (bit-field diagram; bits [63:34] RAZ, bits [33:5] ADDRESS, bits [4:2] RB/LL, bits [1:0] RAZ)

4.2.19 Fill Address Register (FILL_ADDR)

The FILL_ADDR is a read-only register that contains the physical address associated with errors reported by BIU_STAT bits [14:8]. Its contents are meaningful only when FILL_ECC or FILL_DPERR is set. Reads of the FILL_ADDR unlock FILL_ADDR, BIU_STAT bits [14:8], and FILL_SYNDROME.

The FILL_ADDR bits [33:5] identify the 32-byte cache block that the CPU was attempting to read when the error occurred. If the FILL_IRD bit of the BIU_STAT register is clear, it indicates that the error occurred during a D-stream cache fill. At such times, FILL_ADDR bits [4:2] contain bits [4:2] of the physical address generated by the load instruction that triggered the cache fill. If FILL_IRD is set, then FILL_ADDR bits [4:2] are UNPREDICTABLE. The FILL_ADDR bits [63:34] and FILL_ADDR bits [1:0] will read as zero.
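As an illustration of how FILL_ADDR [33:5] combines with the FILL_QW field of BIU_STAT, the sketch below computes the address of the failing quadword. It assumes that FILL_QW supplies address bits [4:3] (four 8-byte quadwords per 32-byte block); the function names are hypothetical and the mapping should be checked against the hardware reference manual.

```python
# Mask selecting address bits [33:5], the 32-byte block address.
ADDR_MASK_33_5 = ((1 << 34) - 1) & ~0x1F


def fill_block_address(fill_addr):
    """Return the 32-byte cache block address held in FILL_ADDR [33:5]."""
    return fill_addr & ADDR_MASK_33_5


def bad_quadword_address(fill_addr, fill_qw):
    """Combine FILL_ADDR [33:5] with FILL_QW to locate the bad quadword.

    Assumption: FILL_QW (0..3) selects address bits [4:3] within the block.
    FILL_ADDR bits [4:2] are ignored because they may be UNPREDICTABLE.
    """
    if not 0 <= fill_qw <= 3:
        raise ValueError("FILL_QW is a 2-bit field")
    return fill_block_address(fill_addr) | (fill_qw << 3)
```

The low five bits of FILL_ADDR are masked off first, so the result does not depend on whether the fill was for the I-stream or the D-stream.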
Figure 31 shows the fill address register (FILL_ADDR) format.

Figure 31 Fill Address Register (bit-field diagram; bits [63:34] RAZ, bits [33:5] ADDRESS, bits [4:2] PA/UNP, bits [1:0] RAZ)

4.2.20 Fill Syndrome Register (FILL_SYNDROME)

The FILL_SYNDROME register is a 14-bit read-only register. If the chip is in ECC mode and an ECC error is recognized during a primary cache fill operation, the syndrome bits associated with the bad quadword are locked in the FILL_SYNDROME register. The FILL_SYNDROME bits [6:0] contain the syndrome associated with the lower longword of the quadword, and FILL_SYNDROME bits [13:7] contain the syndrome associated with the upper longword of the quadword. A syndrome value of zero means that no errors were found in the associated longword. See Table 36 for a list of syndromes associated with correctable single-bit errors. The FILL_SYNDROME register is unlocked when the FILL_ADDR register is read.

If the chip is in parity mode and a parity error is recognized during a primary cache fill operation, the FILL_SYNDROME register indicates which of the longwords in the quadword got bad parity. The FILL_SYNDROME bit [0] is set to indicate that the lower longword was corrupted, and FILL_SYNDROME bit [7] is set to indicate that the upper longword was corrupted. The FILL_SYNDROME bits [13:8] and [6:1] are RAZ in parity mode.

Figure 32 shows the fill syndrome register format.

Figure 32 FILL_SYNDROME Register (bit-field diagram; bits [63:14] RAZ, bits [13:7] HI[6:0], bits [6:0] LO[6:0])

Table 36 Syndromes for Single-Bit Errors

Data Bit  Syndrome (Hex)  Data Bit  Syndrome (Hex)  Check Bit  Syndrome (Hex)
00        4F              16        0E              00         01
01        4A              17        0B              01         02
02        52              18        13              02         04
03        54              19        15              03         08
04        57              20        16              04         10
05        58              21        19              05         20
06        5B              22        1A              06         40
07        5D              23        1C
08        23              24        62
09        25              25        64
10        26              26        67
11        29              27        68
12        2A              28        6B
13        2C              29        6D
14        31              30        70
15        34              31        75

4.2.21 Backup Cache Tag Register (BC_TAG)

The BC_TAG is a read-only register.
Unless locked, the BC_TAG register is loaded with the results of every backup cache tag probe. When a tag or tag control parity error or primary fill data error (parity or ECC) occurs, this register is locked against further updates. The software may read the LSB of this register by using the HW_MFPR instruction. Each time an HW_MFPR from BC_TAG completes, the contents of BC_TAG are shifted one bit position to the right, so that the entire register can be read using a sequence of HW_MFPRs. The software may unlock the BC_TAG register using an HW_MTPR to BC_TAG.

Successive HW_MFPRs from the BC_TAG register must be separated by at least one null cycle.

Figure 33 shows the backup cache tag register format. Table 37 lists the register fields and gives a description of each.

Figure 33 Backup Cache Tag Register (bit-field diagram; fields TAGADR_P, TAG [33:17], TAGCTL_V, TAGCTL_S, TAGCTL_D, TAGCTL_P, and HIT, with the upper bits RAZ)

Note: Unused tag bits in the TAG field of this register are always clear, based on the size of the external cache as determined by the BC_SIZE field of the BIU_CTL register.

Table 37 Backup Cache Tag Register Fields

Field  Type  Description

TAGADR_P  RO  Reflects the state of the tagAdrP_h signal of the 21064A when a tag, tag control, or data parity error occurs.

TAG  RO  Contains the tag that is being currently probed.

TAGCTL_V  RO  Reflects the state of the tagCtlV_h signal of the 21064A when a tag, tag control, or data parity error occurs.

TAGCTL_S  RO  Reflects the state of the tagCtlS_h signal of the 21064A when a tag, tag control, or data parity error occurs.

TAGCTL_D  RO  Reflects the state of the tagCtlD_h signal of the 21064A when a tag, tag control, or data parity error occurs.

TAGCTL_P  RO  Reflects the state of the tagCtlP_h signal of the 21064A when a tag, tag control, or data parity error occurs.

HIT  RO  When set, indicates that there was a tag match when a tag, tag control, or data parity error occurred.
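The shift-on-read behavior of BC_TAG can be modeled in a few lines. The Python sketch below is a simulation, not PALcode: FakeBcTag is a hypothetical stand-in for the IPR, the 23-bit length and the field positions are assumptions based on the tick marks in Figure 33, and the required null cycle between successive HW_MFPRs is not modeled.

```python
class FakeBcTag:
    """Hypothetical stand-in for BC_TAG: a read returns the LSB, then shifts right."""

    def __init__(self, value):
        self.value = value

    def hw_mfpr(self):
        bit = self.value & 1
        self.value >>= 1
        return bit


def read_bc_tag(read_lsb, nbits=23):
    """Reassemble the register from nbits successive single-bit reads.

    On real hardware each read would be an HW_MFPR from BC_TAG, with at
    least one null cycle between successive reads.
    """
    value = 0
    for position in range(nbits):
        value |= (read_lsb() & 1) << position
    return value


# Assumed layout as (lsb, width) pairs; check against Figure 33 before use.
BC_TAG_FIELDS = {
    "HIT":      (0, 1),
    "TAGCTL_P": (1, 1),
    "TAGCTL_D": (2, 1),
    "TAGCTL_S": (3, 1),
    "TAGCTL_V": (4, 1),
    "TAG":      (5, 17),   # holds TAG [33:17]
    "TAGADR_P": (22, 1),
}


def decode_bc_tag(value):
    """Split a reassembled BC_TAG value into its named fields."""
    return {
        name: (value >> lsb) & ((1 << width) - 1)
        for name, (lsb, width) in BC_TAG_FIELDS.items()
    }
```

A typical use would be decode_bc_tag(read_bc_tag(reg.hw_mfpr)), after which the simulated register has been shifted out completely.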
4.3 PAL_TEMP Registers

The CPU chip contains 32 64-bit registers that are accessible by way of the HW_MxPR instructions. These registers provide temporary storage for PALcode.

4.4 Lock Registers

There are two registers per processor that are associated with the LDQ_L/LDL_L and STQ_C/STL_C instructions: the lock_flag register and the locked_physical_address register. The use of these registers is described in the Alpha Architecture Reference Manual. These registers are required by the architecture but are not implemented on the 21064A. They must be implemented in the application.

4.5 Internal Processor Registers Reset State

Table 38 lists the state of all the internal processor registers (IPRs) immediately following reset. The table also specifies which registers need to be initialized by power-up PALcode.

Table 38 Internal Processor Register Reset State

IPR           Reset State                        Comments
TB_TAG        UNDEFINED
ITB_PTE       UNDEFINED
ICCSR         cleared except ASN, PC0, PC1       Floating-point disabled, single issue mode, Pipe mode enabled, JSR predictions disabled, branch predictions disabled, branch history table disabled, performance counters reset to zero, Perf Cnt0: Total Issues/2, Perf Cnt1: Dcache Misses, superpage disabled
ITB_PTE_TEMP  UNDEFINED
EXC_ADDR      UNDEFINED
SL_RCV        UNDEFINED
ITBZAP        n/a                                PALcode must do an ITBZAP on reset before writing the ITB (must do HW_MTPR to ITBZAP register).
ITBASM        n/a
ITBIS         n/a
PS            UNDEFINED                          PALcode must set processor status.
EXC_SUM       UNDEFINED                          PALcode must clear exception summary and exception register write mask by doing 64 reads.
PAL_BASE      cleared                            Cleared on reset.
HIRR          n/a
SIRR          UNDEFINED                          PALcode must initialize.
ASTRR         UNDEFINED                          PALcode must initialize.
HIER          UNDEFINED                          PALcode must initialize.
SIER          UNDEFINED                          PALcode must initialize.
ASTER         UNDEFINED                          PALcode must initialize.
SL_XMIT       UNDEFINED                          PALcode must initialize. Appears on external pin.
TB_CTL          UNDEFINED    PALcode must select between SP/LP DTB prior to any TB fill.
DTB_PTE         UNDEFINED
DTB_PTE_TEMP    UNDEFINED
MM_CSR          UNDEFINED
VA              UNDEFINED    Unlocked on reset.
DTBZAP          n/a          PALcode must do a DTBZAP on reset before writing the DTB (must do HW_MTPR to DTBZAP register).
DTBASM          n/a
DTBIS           n/a
BIU_ADDR        UNDEFINED    Potentially locked.
BIU_STAT        UNDEFINED    Potentially locked.
SL_CLR          UNDEFINED    PALcode must initialize.
C_STAT          UNDEFINED    Potentially locked.
FILL_ADDR       UNDEFINED    Potentially locked.
ABOX_CTL        cleared      Write buffer enabled, machine checks disabled, correctable read interrupts disabled, Icache stream buffer disabled, super pages 1 and 2 disabled, endian mode disabled, Dcache disabled, forced hit mode off. (STC_NORESULT disabled, NCACHE_NDISTURB disabled)
ALT_MODE        UNDEFINED
CC              UNDEFINED    Unlocked on reset. Cycle counter is disabled on reset.
CC_CTL          UNDEFINED
BIU_CTL         cleared      Bcache disabled, parity mode enabled, chip enable asserts during RAM write cycles, Bcache forced-hit mode disabled. BC_PA_DIS field cleared. BAD_TCP cleared. BAD_DP cleared. DELAY_WDATA cleared. SYS_WRAP cleared.
FILL_SYNDROME   UNDEFINED    Potentially locked.
BC_TAG          UNDEFINED    Potentially locked.
PAL_TEMP [31:0] UNDEFINED

Note: The Bcache parameters listed here are all undetermined on reset and must be initialized in the BIU_CTL register before enabling the Bcache.
* Bcache RAM read speed (BC_RD_SPD)
* Bcache RAM write speed (BC_WR_SPD)
* Bcache delay write data (DELAY_WDATA)
* Bcache write enable control (BC_WE_CTL)
* Bcache size (BC_SIZE)

5 Electrical Characteristics

Table 39 lists the maximum ratings for the 21064A.
Table 39 21064A Maximum Ratings (PRELIMINARY ESTIMATES)

Characteristics                 Ratings
Storage temperature             -55°C to 125°C (-67°F to 257°F)
Supply voltage                  Vss -0.5 V, Vdd 3.6 V
Junction temperature            15°C to 90°C (59°F to 194°F)
Voltage applied to pins:
  3 V tolerant pins             -0.5 V to Vdd + 0.5 V
  5 V tolerant pins             -0.5 V to 5.5 V
Case temperature:
  21064A-200                    0°C to 73°C (32°F to 167.4°F)
  21064A-233                    0°C to 71°C (32°F to 160°F)
  21064A-275, 21064A-275-PC     0°C to 67°C (32°F to 153°F)
  21064A-300                    0°C to 65°C (32°F to 149.0°F)
Maximum power @Vdd=3.46 V:
  21064A-200                    24.0 W
  21064A-233                    28.0 W
  21064A-275, 21064A-275-PC     33.0 W
  21064A-300                    36.0 W

Note: See the Alpha 21064 and Alpha 21064A Microprocessors Hardware Reference Manual for formulas to calculate peak power and to calculate maximum power at other values of Vdd and clock frequency.

Caution: Stress beyond the absolute maximum ratings can cause permanent damage to the 21064A. Exposure to absolute maximum rating conditions for extended periods of time can affect the 21064A reliability.

5.1 DC Characteristics

The 21064A uses CMOS/TTL voltage levels. In CMOS mode, the Vss pins are connected to 0.0 V and the Vdd pins are connected to 3.3 V nominal +/- 5%.

Caution: To prevent damage to the 21064A, it is important that the Vdd power supply be stable before any of its input or bidirectional pins are allowed to rise above 4.0 V.

The vRef analog input should be connected to a 1.4 V +/-10% reference supply.

The clkIn_h and clkIn_l are differential signals generated from an external oscillator circuit. The signals can be ac coupled (if Vcc to the oscillator is greater than Vdd), with nominal dc bias of Vdd/2 set by a high-impedance (greater than 1K ohm) resistive network on the chip. The signals need not be ac coupled if Vdd is used as the Vcc supply to the oscillator.

The 21064A signal input pins are CMOS inputs that use standard TTL levels, set by vRef. Table 40 lists the dc input characteristics.

The following signals are sampled before vRef is stable.
These signals cannot be driven above the power supply.
* dcOk_h
* tristate_l (3.3 V)
* cont_l (3.3 V)
* eclOut_h (GND)

The 21064A output pins are 3.3 V CMOS outputs. These output signals can be driven between Vdd and Vss. Timing is specified to standard TTL levels. Table 40 lists the dc output characteristics.

The bidirectional pins are ordinary 3.3 V CMOS bidirectional pins.

Table 40 DC Input/Output Characteristics

Symbol  Description                                                Min     Max    Units  Test Conditions
Vdd     Power supply voltage                                       3.135   3.465  V      -
Vih     High-level input voltage (except dcOk_h and cont_l)        2.0     -      V      -
Vihs    High-level input voltage (static pins dcOk_h and cont_l)   2.7     -      V      -
Vil     Low-level input voltage                                    -       0.8    V      -
Voh     High-level output voltage (Ioh = 100 µA)                   2.4     -      V      -
Vol     Low-level output voltage (Iol = 3.2 mA)                    -       0.4    V      -
Vdiffc  Differential clock input swing (duty cycle 45-55%)         300 mV  3.0 V         -
Iil     Input leakage current (except eclOut_h)                    -100    100    µA     0