
TX System RISC

TX79 Core Architecture

(Symmetric 2-way superscalar 64-bit CPU) Rev. 2.0

The information contained herein is subject to change without notice.

The information contained herein is presented only as a guide for the applications of our

products. No responsibility is assumed by TOSHIBA for any infringements of patents or

other rights of the third parties which may result from its use. No license is granted by

implication or otherwise under any patent or patent rights of TOSHIBA or others.

TOSHIBA is continually working to improve the quality and reliability of its products.

Nevertheless, semiconductor devices in general can malfunction or fail due to their

inherent electrical sensitivity and vulnerability to physical stress.

It is the responsibility of the buyer, when utilizing TOSHIBA products, to comply with

the standards of safety in making a safe design for the entire system, and to avoid

situations in which a malfunction or failure of such TOSHIBA products could cause loss

of human life, bodily injury or damage to property.

In developing your designs, please ensure that TOSHIBA products are used within

specified operating ranges as set forth in the most recent TOSHIBA products

specifications.

Also, please keep in mind the precautions and conditions set forth in the “Handling

Guide for Semiconductor Devices,” or “TOSHIBA Semiconductor Reliability

Handbook” etc.

The Toshiba products listed in this document are intended for usage in general

electronics applications (computer, personal equipment, office equipment, measuring

equipment, industrial robotics, domestic appliances, etc.).

These Toshiba products are neither intended nor warranted for usage in equipment that

requires extraordinarily high quality and/or reliability or a malfunction or failure of

which may cause loss of human life or bodily injury (“Unintended Usage”).

Unintended Usage includes atomic energy control instruments, airplane or spaceship

instruments, transportation instruments, traffic signal instruments, combustion control

instruments, medical instruments, all types of safety devices, etc. Unintended Usage of

Toshiba products listed in this document shall be made at the customer’s own risk.

The products described in this document may include products subject to the foreign

exchange and foreign trade laws.

Preface

Thank you for choosing Toshiba semiconductor products. This is the year 2000 edition of the user’s

manual for the architecture of the TX79 RISC microprocessor core, a member of the TX System RISC

Family of Toshiba microprocessors.

This user’s manual is designed to be easily understood by engineers who are designing a Toshiba

microprocessor into their products for the first time. No special knowledge of this architecture is

assumed – the contents include basic information about the architecture of the TX79 microprocessor

core as well as more advanced, in-depth descriptions.

Toshiba are continually updating technical publications. Any comments and suggestions regarding any

Toshiba document are most welcome and will be taken into account when subsequent editions are

prepared. To receive updates to the information in this manual, or for additional information about this

architecture, please contact your nearest Toshiba office or authorized Toshiba dealer.

April 2001

Contents

Handling Precautions

C790 User’s Manual

1. Introduction

1.1 Features

1.2 Related Documents

1.3 Revision History

1.4 Conventions Used in This Manual

1.5 Restrictions for Use of the C790 CPU Core

2. Architecture Overview

2.1 Block Diagram and Functional Block Descriptions

2.1.1 PC Unit

2.1.2 MMU

2.1.3 Caches

2.1.4 Issue Logic and Staging Registers

2.1.5 GPR (General Purpose Registers) and FPR (Floating-Point Registers)

2.1.6 The Five Execution Pipes

2.1.6.1 I0 and I1 Pipes

2.1.6.2 LS - Load/Store Pipe

2.1.6.3 BR - Branch Pipe

2.1.6.4 C1 - COP1/FPU Pipe

2.1.7 Operand/Bypass logic

2.1.8 Response Buffer and Writeback Buffer

2.1.9 UCAB

2.1.10 Result and Move Buses

2.1.11 Bus Interface Unit and BIU Bus

2.2 Superscalar Pipeline Operation

2.2.1 Integer Instruction Pipeline Stages

2.2.2 C1 (COP1/FPU) Instruction Pipeline Stages

2.2.3 Classification and Routing of Instructions According to Execution Pipelines

2.2.4 Instruction Issue Combinations

2.3 Registers

2.3.1 CPU Registers

2.3.2 FPU Registers

2.3.3 COP0 Registers

2.4 Memory Management

2.5 Cache Memory

2.6 Bus Interface

2.7 Floating Point Unit

2.8 Performance Counter

2.9 Debug and Tracing Functions

3. Instruction Set Overview and Summary

3.1 Introduction

3.2 CPU Instruction Set Formats

3.3 Instruction Set Summary

3.3.1 Load/Store Instructions

3.3.1.1 Normal Loads and Stores

3.3.1.2 Multimedia Loads and Stores

3.3.1.3 Coprocessor Loads and Stores

3.3.1.4 Data Formats and Addressing

3.3.1.5 Defining Access Types

3.3.1.6 Scheduling a Load Delay Slot

3.3.2 Computational Instructions

3.3.2.1 ALU Immediate Instructions

3.3.2.2 Three Operand Register-Type Instructions

3.3.2.3 Shift Instructions

3.3.2.4 Multiply and Divide Instructions

3.3.2.5 64-Bit Operations

3.3.3 Jump and Branch Instructions

3.3.3.1 Jump Instructions

3.3.3.2 Branch Instructions

3.3.4 Miscellaneous Instructions

3.3.4.1 Exception Instructions

3.3.4.2 Serialization Instructions

3.3.4.3 MIPS IV Instructions

3.3.5 System Control Coprocessor (COP0) Instructions

3.3.6 Coprocessor 1 (COP1)

3.3.6.1 Coprocessor 1 (COP1) Instructions

3.3.7 C790-Specific Instructions

3.3.7.1 Integer Multiply / Divide Instructions

3.3.7.2 Multimedia Instructions

3.4 User Instruction Latency and Repeat Rate

4. CPU and COP0 Registers

4.1 CPU Registers

4.1.1 General Purpose Registers

4.1.2 HI and LO Registers

4.1.3 Shift Amount (SA) Register

4.1.4 Program Counter (PC)

4.2 System Control Coprocessor (COP0) Registers

4.2.1 Index Register (0)

4.2.2 Random Register (1)

4.2.3 EntryLo0 Register (2), and EntryLo1 Register (3)

4.2.4 Context Register (4)

4.2.5 PageMask Register (5)

4.2.6 Wired Register (6)

4.2.7 BadVAddr Register (8)

4.2.8 Count Register (9)

4.2.9 EntryHi Register (10)

4.2.10 Compare Register (11)

4.2.11 Status Register (12)

4.2.11.1 Status Register Format

4.2.11.2 Status Register Modes and Access States

4.2.12 Cause Register (13)

4.2.13 EPC Register (14)

4.2.14 PRId Register (15)

4.2.15 Config Register (16)

4.2.16 BadPAddr Register (23)

4.2.17 Debug Registers (24)

4.2.18 Performance Counter Registers (25)

4.2.19 TagLo (28) and TagHi (29) Registers

4.2.20 ErrorEPC (30)

5. Exception Processing and Reset

5.1 The Exception Handling Process

5.1.1 Level 1 Exceptions

5.1.2 Level 2 Exceptions

5.2 Exception Vector Locations

5.3 Cause Register Setting

5.4 Masking an exception

5.5 Detailed Description

5.5.1 Exception Priority

5.5.2 Reset Exception

5.5.3 Non-Maskable Interrupt (NMI) Exception

5.5.4 Performance Counter Exception

5.5.5 Debug Exception

5.5.6 Address Error Exception

5.5.7 TLB Refill Exception

5.5.8 TLB Invalid Exception

5.5.9 TLB Modified Exception

5.5.10 Bus Error Exception

5.5.11 System Call Exception

5.5.12 BREAK Instruction Exception

5.5.13 Reserved Instruction Exception

5.5.14 Coprocessor Unusable Exception

5.5.15 Interrupt Exception

5.5.16 SIO Exception

5.5.17 Integer Overflow Exception

5.5.18 Trap Exception

5.5.19 Floating-Point Exception

6. Memory Management

6.1 Translation Look-aside Buffer (TLB)

6.1.1 Translation Status

6.1.2 Multiple Matches

6.2 Address Spaces

6.2.1 Virtual Address Space

6.2.2 Physical Address Space

6.2.3 Virtual-to-Physical Address Translation

6.2.4 32-bit Address Translation Mode

6.2.5 Operating Modes

6.2.6 User Mode Operations

6.2.7 Supervisor Mode Operations

6.2.8 Kernel Mode Operations

6.3 System Control Coprocessor

6.3.1 Format of a TLB Entry

6.4 Virtual-to-Physical Address Translation Process

6.5 TLB Instructions

7. Caches

7.1 Cache Features

7.2 Organization of the Caches

7.2.1 Data Cache

7.2.2 Instruction Cache

7.2.3 Tag Structure

7.2.3.1 Data Cache Tag Structure

7.2.3.2 Instruction Cache Tag Structure

7.2.4 State of Cache Tags After Reset

7.3 Cache Operations

7.3.1 Line Replacement Algorithm

7.3.2 Non-blocking Loads and Hit Under Miss

7.3.3 Cache Miss and Hit Operations

7.3.4 Data Cache Writeback Policy

7.3.5 Data Cache State Transitions

7.3.6 Instruction Cache State Transitions

7.3.7 Data Cache Lock Function

7.3.7.1 Operations During Lock

7.3.8 Relationship Between Cached and Uncached Operations

7.4 Uncached Accelerated Buffer

7.4.1 UCAB Configuration

7.4.2 Tag Structure

7.4.3 Non-blocking Loads and Hit Under Miss

7.5 Cache Control Registers

7.6 CACHE Instruction

8. CPU Bus

8.1 Introduction

8.1.1 Terminology

8.1.2 Signal Naming Convention

8.2 CPU Bus Architecture

8.2.1 CPU Bus Connectivity for Address and Control Paths

8.2.2 CPU Bus Connectivity for Data Paths

8.3 CPU Bus Signal Descriptions

8.3.1 Address Bus Signals

8.4 Overview of CPU Bus Operations

8.4.1 CPU Bus Operations

8.4.2 Processor Requests

8.4.2.1 Read Requests

8.4.2.2 Write Requests

8.4.3 Bus Error Operations

8.5 CPU Bus Transaction Protocols and Timing

8.5.1 Arbitration Operations

8.5.1.1 Cycle Stealing

8.5.2 CPU Single Operations

8.5.2.1 CPU Single Reads

8.5.2.2 CPU Single Writes

8.5.2.3 CPU Single Read-Write-Read-Write Cycles

8.5.3 CPU Burst Operations

8.5.3.1 CPU Burst Reads

8.5.3.2 CPU Burst Writes

8.5.3.3 CPU Burst Read-Write Cycles

8.5.3.4 CPU Burst Write-Read Cycles

8.5.4 CPU Non-Pipeline Single Operations

8.5.4.1 CPU Non-Pipeline Single Reads

8.5.4.2 CPU Non-Pipeline Single Writes

8.5.5 CPU Non-Pipeline Burst Operations

8.5.5.1 CPU Non-Pipeline Burst Reads

8.5.5.2 CPU Non-Pipeline Burst Writes

8.5.6 Bus Error Operations

8.5.6.1 Bus Error Exceptions

8.5.6.2 CPU Bus Cycle Termination

8.5.6.3 Bus Error Timing with No Pending Operation

8.5.6.4 Bus Error Timing with One Pending Operation

8.5.6.5 Bus Error Timing with Two Pending Operations

9. Performance Counter

9.1 Overview

9.2 Performance Counters and Performance Control Registers

9.2.1 Accessing Counters and Registers

9.2.2 State of Performance Counter Control Registers Upon Reset

9.3 Counter Operation

9.3.1 Counter Events

9.3.1.1 Event Descriptions

9.3.2 Handling Performance Counter Exceptions

9.3.3 Priority of Counter Exceptions

9.3.4 Initializing Counters

9.3.5 The Note to Read Counters

10. Floating-Point Unit, CP1 (Option)..............................................................................................10-1

10.1 Overview.................................................................................................................................10-2

10.2 Floating Point Register...........................................................................................................10-2

10.2.1 Floating-Point General Registers (FGRs).......................................................................10-2

10.2.2 Floating-Point Registers (FPRs)......................................................................................10-4

10.2.3 Floating-Point Control Registers .....................................................................................10-4

10.2.4 Accessing the FP Control and Implementation/Revision Registers ...............................10-9

10.3 Floating-Point Formats.........................................................................................................10-10

10.4 Binary Fixed-Point Format....................................................................................................10-12

10.5 Floating-Point Instruction Set Summary...............................................................................10-13

10.5.1 Load, Store and Move Instructions (Table 10-10).........................................................10-13

10.5.2 Conversion Instructions (Table 10-11)...........................................................................10-14

10.5.3 Computational Instructions (Table 10-12) .....................................................................10-14

10.5.4 Compare and Branch Instructions (Table 10-13)..........................................................10-15

11. Floating-Point Exception (Option) ............................................................................................11-1

11.1 Introduction.............................................................................................................................11-2

11.2 Exception Types.....................................................................................................................11-2

11.3 Exception Trap Processing ....................................................................................................11-3

11.4 Flags.......................................................................................................................................11-3

11.5 FPU Exceptions......................................................................................................................11-5

11.6 Saving and Restoring State....................................................................................................11-9

11.7 Trap Handlers for IEEE Standard 754 Exceptions.................................................................11-9

12. PC Trace.......................................................................................................................................12-1

12.1 Real-Time PC Tracing............................................................................................................12-2

12.1.1 Classification of Branch and Jump Instructions..............................................................12-2

12.1.2 PC Trace Signals.............................................................................................................12-3

12.1.3 Priority of Target Addresses............................................................................................12-7

12.1.4 Examples of PC Tracing..................................................................................................12-8

12.1.4.1 Sequential Execution................................................................................................12-9

12.1.4.2 Conditional Branch..................................................................................................12-10

12.1.4.3 Indirect Jump (Target in Phase A) ..........................................................................12-11

12.1.4.4 Indirect Jump (Target in Phase B) ..........................................................................12-12

12.1.4.5 Indirect Jump (During Target PC Output)...............................................................12-13

12.1.4.6 Exception (Target in Phase B) ................................................................................12-14

12.1.4.7 Exception (During Target PC Output).....................................................................12-15

12.1.4.8 Exception Generated by Branch or Jump Instruction.............................................12-16

12.1.4.9 Exception Generated by Branch Delay Slot Instruction .........................................12-17

12.1.4.10 Exception Generated by Target Instruction ............................................................12-18

12.1.4.11 Back to Back Exceptions (Case I) ..........................................................................12-19

12.1.4.12 Back to Back Exceptions (Case II) .........................................................................12-20

13. Hardware Breakpoint..................................................................................................................13-1

13.1 Hardware Breakpoint..............................................................................................................13-2

13.1.1 Hardware Breakpoint signal............................................................................................13-2

13.2 Breakpoint Registers..............................................................................................................13-3

13.2.1 Breakpoint Control Register (BPC) .................................................................................13-4

13.2.2 Instruction Address Breakpoint Register (IAB) / Instruction Address Breakpoint Mask Register (IABM)...............................................................................................................13-7

13.2.3 Data Address Breakpoint Register (DAB) / Data Address Breakpoint Mask Register (DABM)............................................................................................................................13-7

13.2.4 Data Value Breakpoint Register (DVB) / Data Value Breakpoint Mask Register (DVBM)....13-8

13.3 Setting Breakpoint..................................................................................................................13-8

13.3.1 Sequence of Setting Breakpoint......................................................................................13-9

13.3.2 Instruction Breakpointing...............................................................................................13-14

13.3.3 Data Address Breakpointing..........................................................................................13-16

13.3.4 Breakpointing by Data Address and Value....................................................................13-18

13.3.5 Data Value Breakpointing..............................................................................................13-19

13.4 Triggering External Probes...................................................................................................13-20

13.5 Important notice on using hardware breakpoint...................................................................13-20

A. CPU Instruction Set Details ........................................................................................................A-1

A.1 Description of an Instruction....................................................................................................A-2

A.1.1 Instruction Mnemonic and Name ..................................................................................... A-2

A.1.2 Instruction Encoding Picture............................................................................................. A-2

A.1.3 Format .............................................................................................................................. A-2

A.1.4 Purpose ............................................................................................................................ A-2

A.1.5 Description........................................................................................................................ A-2

A.1.6 Restrictions....................................................................................................................... A-2

A.1.7 Operation.......................................................................................................................... A-2

A.1.8 Exceptions........................................................................................................................ A-2

A.1.9 Programming Notes, Implementation Notes.................................................................... A-3

A.2 Instruction Description Notation and Functions ...................................................................... A-3

A.2.1.1 Pseudocode Language Statement Execution........................................................... A-3

A.2.1.2 Pseudocode Symbols................................................................................................ A-3

A.2.2 Definitions of Pseudocode Functions Used in Instruction Descriptions .......................... A-4

A.2.2.1 Coprocessor General Register Access Pseudocode Functions ............................... A-4

A.2.2.2 Load and Store Memory Pseudocode Functions...................................................... A-6

A.2.2.3 Miscellaneous Functions............................................................................................ A-8

A.3 CPU Instruction Formats......................................................................................................... A-9

A.4 Instruction Descriptions......................................................................................................... A-10

A.5 CPU Instruction Encoding...................................................................................................A-141

B. C790-Specific Instruction Set Details........................................................................................B-1

B.1 Conventions Used in This Chapter ......................................................................................... B-2

B.1.1 Instruction Description Notation and Functions ............................................................... B-2

B.1.2 Pseudocode Language Statement Execution.................................................................. B-2

B.1.3 Pseudocode Symbols....................................................................................................... B-2

B.2 Definitions for Pseudocode Functions Used in Operation Descriptions................................. B-2

B.3 Summary of C790-Specific Instructions.................................................................................. B-3

B.3.1 Multiply and Multiply-Add Instructions.............................................................................. B-3

B.3.2 Multimedia Instructions.....................................................................................................B-3

B.4 Instruction Set Details ............................................................................................................. B-6

B.5 C790-Specific Instruction Encoding.................................................................................... B-163

C. COP0 System Control Coprocessor Instruction Set Details...................................................C-1

C.1.1 Notes on the CACHE Instruction Sub-operations............................................................C-7

Cache Virtual Address................................................................................................................C-7

Cache Physical Address ............................................................................................................C-7

BTAC Virtual Address.................................................................................................................C-7

BTAC Index Bits .........................................................................................................................C-7

COP0 Not Usable.......................................................................................................................C-7

TLB Exceptions on Cache Operations.......................................................................................C-8

Hit Sub-operation Accesses.......................................................................................................C-8

Breakpoint Exception .................................................................................................................C-8

Address Error Exception ............................................................................................................C-8

C.1.2 Sub-Operation Descriptions.............................................................................................C-9

C.1.3 Updates of Data Tag Status Bits ....................................................................................C-13

C.2 COP0 Instruction Encoding...................................................................................................C-41

D. COP1 (FPU) Instruction Set Details ...........................................................................................D-1

D.1 Conventions Used in This Chapter .........................................................................................D-2

D.1.1 Instruction Description Notation and Functions ...............................................................D-2

D.1.2 Pseudocode Language Statement Execution..................................................................D-2

D.1.3 Pseudocode Symbols.......................................................................................................D-2

D.2 Definitions for Pseudocode Functions Used in Operation Descriptions.................................D-2

D.3 Instruction Descriptions...........................................................................................................D-3

D.4 COP1 Instruction Encoding...................................................................................................D-40

FIGURES

Figure 2-1. C790 Block Diagram .....................................................................................................2-2

Figure 2-2. C790 Integer Instruction Pipeline..................................................................................2-5

Figure 2-3. FPU Pipeline..................................................................................................................2-8

Figure 2-4. Instruction Routing in Logical Pipes and Physical Pipes............................................2-10

Figure 3-1. CPU Instruction Formats...............................................................................................3-3

Figure 3-2. Big-Endian Byte Ordering .............................................................................................3-6

Figure 3-3. Little-Endian Byte Ordering...........................................................................................3-6

Figure 3-4. Little-Endian Data in a Doubleword ..............................................................................3-7

Figure 3-5. Big-Endian Data in a Doubleword.................................................................................3-7

Figure 3-6. Big-Endian Misaligned Word Addressing......................................................................3-8

Figure 3-7. Little-Endian Misaligned Word Addressing...................................................................3-8

Figure 4-1. CPU Registers...............................................................................................................4-3

Figure 4-2. Index Register ...............................................................................................................4-6

Figure 4-3. Random Register ..........................................................................................................4-7

Figure 4-4. EntryLo0 and EntryLo1 Registers.................................................................................4-8

Figure 4-5. Context Register Format...............................................................................................4-9

Figure 4-6. PageMask Register.....................................................................................................4-10

Figure 4-7. Wired Register.............................................................................................................4-11

Figure 4-8. Wired Register Boundary............................................................................................4-11

Figure 4-9. BadVAddr Register......................................................................................................4-12

Figure 4-10. Count Register ..........................................................................................................4-13

Figure 4-11. EntryHi Register ........................................................................................................4-14

Figure 4-12. Compare Register.....................................................................................................4-15

Figure 4-13. Status Register..........................................................................................................4-16

Figure 4-14. Cause Register..........................................................................................................4-19

Figure 4-15. EPC Register.............................................................................................................4-21

Figure 4-16. PRId Register............................................................................................................4-22

Figure 4-17. Config Register Format.............................................................................................4-23

Figure 4-18. BadPAddr Register Format .......................................................................................4-25

Figure 4-19. Performance Counter Registers ...............................................................................4-28

Figure 4-20. TagLo and TagHi Registers.......................................................................................4-31

Figure 4-21. ErrorEPC Register.....................................................................................................4-33

Figure 5-1. Level 1 Exception processing flowchart........................................................................5-4

Figure 5-2. Level 2 Exception processing flowchart........................................................................5-6

Figure 6-1. Overview of a Virtual-to-Physical Address Translation.................................................6-3

Figure 6-2. 32-bit Mode Virtual Address Translation.......................................................................6-5

Figure 6-3. State Transition among Operating Modes.....................................................................6-6

Figure 6-4. User Mode Virtual Address Space................................................................................6-8

Figure 6-5. Supervisor Mode Virtual Address Space....................................................................6-10

Figure 6-6. Kernel Mode Address Space ......................................................................................6-11

Figure 6-7. COP0 Registers and the TLB......................................................................................6-14

Figure 6-8. Format of a TLB Entry.................................................................................................6-15

Figure 6-9. TLB Address Translation.............................................................................................6-19

Figure 7-1. Organization of Data Cache..........................................................................................7-3

Figure 7-2. Organization of Instruction Cache.................................................................................7-4

Figure 7-3. Read Missed Processed in Sequential Order.............................................................7-10

Figure 7-4. Data Cache Transition Diagram, Writeback Protocol.................................................7-11

Figure 7-5. Instruction Cache Transition Diagram.........................................................................7-12

Figure 8-1. CPU Bus Architecture ...................................................................................................8-4

Figure 8-2. CPU Bus Address and Control Path Connections in System.......................................8-5

Figure 8-3. CPU Bus Data Path Connections in System ................................................................8-6

Figure 8-4. Connection of Arbitration Signals................................................................................8-14

Figure 8-5. Arbitration Protocol......................................................................................................8-15

Figure 8-6. Cycle Stealing Protocol...............................................................................................8-15

Figure 8-7. CPU Single Reads ......................................................................................................8-16

Figure 8-8. CPU Single Writes.......................................................................................................8-17

Figure 8-9. CPU Single Read-Write-Read-Write Cycles...............................................................8-18

Figure 8-10. CPU Burst Reads......................................................................................................8-19

Figure 8-11. CPU Burst Writes.......................................................................................................8-20

Figure 8-12. CPU Burst Read-Write Cycles..................................................................................8-21

Figure 8-13. CPU Burst Write-Read Cycles..................................................................................8-21

Figure 8-14. CPU Non-Pipeline Single Reads ..............................................................................8-22

Figure 8-15. CPU Non-Pipeline Single Writes...............................................................................8-23

Figure 8-16. CPU Non-Pipeline Burst Reads................................................................................8-23

Figure 8-17. CPU Non-Pipeline Burst Writes ................................................................................8-24

Figure 8-18. One Operation with BUSERR* as the Last SYSDACK*...........................................8-27

Figure 8-19. One Operation with BUSERR* as SYSAACK*.........................................................8-27

Figure 8-20. One Operation with BUSERR* as SYSAACK* and the Last SYSDACK*...............8-28

Figure 8-21. Two Operations with Bus Error as the Last SYSDACK*...........................................8-29

Figure 9-1. Format of the Performance Counter Control Register PCCR........................................9-2

Figure 9-2. Format of Performance Counter Registers PCR0 and PCR1 .......................................9-2

Figure 9-3. CAUSE Register Fields................................................................................................9-10

Figure 10-1. FP Registers..............................................................................................................10-3

Figure 10-2. Implementation/Revision Register ............................................................................10-5

Figure 10-3. FP Control/Status Register Bit Assignments ............................................................10-6

Figure 10-4. Control/Status Register Cause, Flag, and Enable Fields.........................................10-7

Figure 10-5. Single-Precision Floating-Point Format ..................................................................10-10

Figure 10-6. Double-Precision Floating-Point Format.................................................................10-10

Figure 10-7. Binary Word Fixed-Point Format.............................................................................10-12

Figure 10-8. Binary Long Fixed-Point Format.............................................................................10-12

Figure 11-1. Control/Status Register Exception/Flag/Trap/Enable Bits ........................................11-2

Figure 12-1. Priority of Outputting Jump or Exception Target.......................................................12-7

Figure 12-2. Waveform for Sequential Execution........................................................................12-9

Figure 12-3. Waveform for Conditional Branch...........................................................................12-10

Figure 12-4. Waveform for Indirect Jump (Target in Phase A)....................................................12-11

Figure 12-5. Waveform for Indirect Jump (Target in Phase B)....................................................12-12

Figure 12-6. Waveform for Indirect Jump (During Target PC Output).........................................12-13

Figure 12-7. Waveform for Exception (Target in Phase B)..........................................................12-14

Figure 12-8. Waveform for Exception (During Target PC Output)...............................................12-15

Figure 12-9. Waveform for Exception Generated by Branch or Jump Instruction .......................12-16

Figure 12-10. Waveform for Exception Generated by Branch Delay Slot Instruction..................12-17

Figure 12-11. Waveform for Exception Generated by Target Instruction....................................12-18

Figure 12-12. Waveform for Back to Back Exceptions (Case I)...................................................12-19

Figure 12-13. Waveform for Back to Back Exceptions (Case II)..................................................12-20

Figure 13-1. Overall Structure of Hardware Breakpoint................................................................13-3

Figure 13-2. Instruction Address Breakpoint Register...................................................................13-7

Figure 13-3. Instruction Address Breakpoint Mask Register.........................................................13-7

Figure 13-4. Data Address Breakpoint Register............................................................................13-7

Figure 13-5. Data Address Breakpoint Mask Register..................................................................13-7

Figure 13-6. Data Value Breakpoint Register................................................................................13-8

Figure 13-7. Data Value Breakpoint Mask Register......................................................................13-8

Figure 13-8. Hardware Breakpoint detection flow (Setting) ........................................................13-10

Figure 13-9. Hardware Breakpoint detection flow (IAB)..............................................................13-11

Figure 13-10. Hardware Breakpoint detection flow (DAB/DVB) (1/2).........................................13-12

Figure A-1. CPU Instruction Formats .............................................................................................A-9

TABLES

Table 1-1. Restriction List ...............................................................................................................1-6

Table 2-1. Categories of Instructions and How They Are Routed................................................2-11

Table 2-2. Concurrently Issued Instruction Categories .................................................................2-13

Table 2-3. Coprocessor 0 Registers ..............................................................................................2-15

Table 3-1. Load / Store Instructions.................................................................................................3-4

Table 3-2. Multimedia Load / Store Instructions..............................................................................3-5

Table 3-3. Coprocessor Load / Store Instructions...........................................................................3-5

Table 3-4. Defining Access Types (Big-Endian)............................................................................3-10

Table 3-5. Defining Access Types (Little-Endian)..........................................................................3-12

Table 3-6. ALU Immediate Instructions..........................................................................................3-14

Table 3-7. Three Operand Register-Type Instructions ..................................................................3-15

Table 3-8. Shift Instructions ...........................................................................................................3-15

Table 3-9. Multiply and Divide Instructions....................................................................................3-15

Table 3-10. Jump Instructions Jumping Within a 256 MByte Region............................................3-16

Table 3-11. Jump Instructions to Absolute Address ......................................................................3-16

Table 3-12. PC-Relative Conditional Branch Instructions Comparing 2 Registers.......................3-17

Table 3-13. PC-Relative Conditional Branch Instructions Comparing Against Zero.....................3-17

Table 3-14. Exception Instructions.................................................................................................3-18

Table 3-15. Serialization Instructions.............................................................................................3-18

Table 3-16. MIPS IV Instructions ...................................................................................................3-19

Table 3-17. System Control Coprocessor Instructions..................................................................3-20

Table 3-18. Coprocessor 1 Instructions.........................................................................................3-21

Table 3-19. C790-Specific Multiply and Divide Instructions ..........................................................3-22

Table 3-20. Multimedia Instructions...............................................................................................3-23

Table 3-21. Latencies and Repeat Rates for User Instruction.......................................................3-25

Table 4-1. Coprocessor 0 Registers ................................................................................................4-5

Table 4-2. Index Register Field Description.....................................................................................4-6

Table 4-3. Random Register Fields .................................................................................................4-7

Table 4-4. EntryLo0 and EntryLo1 Register Fields..........................................................................4-8

Table 4-5. Context Register Fields...................................................................................................4-9

Table 4-6. PageMask Register Field..............................................................................................4-10

Table 4-7. Wired Register Field Descriptions ................................................................................4-11

Table 4-8. BadVAddr Register Field...............................................................................................4-12

Table 4-9. Count Register Field.....................................................................................................4-13

Table 4-10. EntryHi Register Fields...............................................................................................4-14

Table 4-11. Compare Register Field..............................................................................................4-15

Tables

xiv

Table 4-12. Status Register Fields.................................................................................................4-17

Table 4-13. Cause Register Fields.................................................................................................4-19

Table 4-14. EPC Register Field .....................................................................................................4-21

Table 4-15. PRId Register Fields...................................................................................................4-22

Table 4-16. Config Register Fields.................................................................................................4-23

Table 4-17. BadPAddr Register Fields...........................................................................................4-25

Table 4-18. Performance Counter Control Register Fields ...........................................................4-29

Table 4-19. Performance Counter Register 0 Fields.....................................................................4-30

Table 4-20. Performance Counter Register 1 Fields.....................................................................4-30

Table 4-21. TagLo Register Fields.................................................................................................4-32

Table 4-22. TagHi Register Fields..................................................................................................4-32

Table 4-23. ErrorEPC Register Field .............................................................................................4-33

Table 5-1. Exception Levels.............................................................................................................5-2

Table 5-2. Exception Vectors for Level 1 exceptions.......................................................................5-7

Table 5-3. Exception Vectors for Level 2 exceptions.......................................................................5-7

Table 5-4. Cause.ExcCode Field.....................................................................................................5-8

Table 5-5. Cause.EXC2 Field ..........................................................................................................5-8

Table 5-6. Masking exceptions .........................................................................................................5-9

Table 5-7. Exception Priority Order................................................................................................5-10

Table 6-1 Processor Modes.............................................................................................................6-6

Table 6-2. Address Space................................................................................................................6-7

Table 6-3. User Mode Segments.....................................................................................................6-9

Table 6-4. Supervisor Mode Segments .........................................................................................6-10

Table 6-5. Kernel Mode Segments ................................................................................................6-12

Table 6-6 TLB Page Coherency (C) Bit Values .............................................................................6-17

Table 6-7. TLB Instructions............................................................................................................6-20

Table 7-1. Cache Configuration.......................................................................................................7-2

Table 7-2. Cache Size and Access Bits...........................................................................................7-5

Table 7-3. Data Cache Line States...................................................................................................7-6

Table 7-4. LRF Line Replacement Algorithm...................................................................................7-8

Table 7-5. Quadword Retrieved Address PA[5:4]..........................................................................7-10

Table 7-6. UCAB Configuration......................................................................................................7-14

Table 7-7. UCAB Size and Access Bits .........................................................................................7-14

Table 8-1. System Signal Naming Convention................................................................................8-3

Table 8-2. Bus Transaction Types ...................................................................................................8-8

Table 8-3. CPU Transfer Size..........................................................................................................8-9

Table 8-4. Bus Error Exceptions....................................................................................................8-25

Table 8-5. Operation Termination Sequence.................................................................................8-26

Table 9-1. PCCR Register Bits ........................................................................................................9-2

Table 9-2. Writing Performance Counters and Registers using MTC0............................................9-3

Table 9-3. Reading Performance Counters and Registers using MFC0.........................................9-3

Table 9-4. Mnemonics to Access the Performance Counters and Registers....................................9-3

Table 9-5. Counter Events ...............................................................................................................9-6

Table 9-6. Definition of Data Cache Miss ........................................................................................9-7

Table 10-1. Floating-Point Control Register Assignments.............................................................10-4

Table 10-2. FCR0 Fields................................................................................................................10-5

Table 10-3. Control/Status Register Fields....................................................................................10-6

Table 10-4. Flush Values of Denormalized Results.......................................................................10-7

Table 10-5. Rounding Mode Bit Decoding.....................................................................................10-9

Table 10-6. Equations for Calculating Values in Single and Double-Precision Floating-Point Format.....10-11

Table 10-7. Floating-Point Format Parameter Values .................................................................10-11

Table 10-8. Minimum and Maximum Floating-Point Values ........................................................10-11

Table 10-9. Binary Fixed-Point Format Fields .............................................................................10-12

Table 10-10. FPU Instruction Set (Optional): Load, Move and Store Instruction........................10-13

Table 10-11. FPU Instruction Set(Optional): Conversion Instruction...........................................10-14

Table 10-12. FPU Instruction Set(Optional): Computational Instruction .....................................10-14

Table 10-13. FPU Instruction Set(Optional): Compare and Branch Instruction..........................10-15

Table 11-1. Default FPU Exception Actions.................................................................................11-3

Table 11-2. FPU Exception-Causing Conditions..........................................................................11-4

Table 11-3. Values of Overflow Results........................................................................................11-7

Table 12-1. Classification of Branch and Jump Instruction ...........................................................12-2

Table 12-2. Exception Vector Address Codes...............................................................................12-6

Table 13-1. Set a new value into breakpoint registers ..................................................................13-4

Table 13-2. Get the value from breakpoint registers .....................................................................13-4

Table 13-3. BPC Register Fields....................................................................................................13-5

Table A-1. Symbols in Instruction Operation Statements...............................................................A-3

Table A-2. Coprocessor General Register Access Functions........................................................A-5

Table A-3. Load and Store Functions.............................................................................................A-6

Table A-4. AccessLength Specifications for Loads / Stores...........................................................A-7

Table A-5. Miscellaneous Functions...............................................................................................A-8

Table B-1. Quotient and Remainder Signs......................................................................................B-8

Table C-1. CACHE Instruction Op Field Encoding.........................................................................C-6

Table C-2. Data Tag Status Bit Modifications ................................................................................C-13

Table D-1. FPU Comparisons Without Special Operand Exceptions.............................................D-9

Table D-2 FPU Comparisons With Special Operand Exceptions for QNaNs ..............................D-10

Handling Precautions

1 Using Toshiba Semiconductors Safely

1-1

1. Using Toshiba Semiconductors Safely

TOSHIBA is continually working to improve the quality and the reliability of its products.

Nevertheless, semiconductor devices in general can malfunction or fail due to their inherent

electrical sensitivity and vulnerability to physical stress. It is the responsibility of the buyer, when

utilizing TOSHIBA products, to observe standards of safety, and to avoid situations in which a

malfunction or failure of a TOSHIBA product could cause loss of human life, bodily injury or

damage to property.

In developing your designs, please ensure that TOSHIBA products are used within specified

operating ranges as set forth in the most recent products specifications. Also, please keep in mind

the precautions and conditions set forth in the TOSHIBA Semiconductor Reliability Handbook.

2 Safety Precautions

2-1

2. Safety Precautions

This section lists important precautions which users of semiconductor devices (and anyone else)

should observe in order to avoid injury and damage to property, and to ensure safe and correct use

of devices.

Please be sure that you understand the meanings of the labels and the graphic symbol described

below before you move on to the detailed descriptions of the precautions.

[Explanation of labels]

Indicates an imminently hazardous situation which will result in death or

serious injury if you do not follow instructions.

Indicates a potentially hazardous situation which could result in death or

serious injury if you do not follow instructions.

Indicates a potentially hazardous situation which, if not avoided, may result

in minor injury or moderate injury.

[Explanation of graphic symbol]

Graphic symbol Meaning

Indicates that caution is required (laser beam is dangerous to eyes).

2.1 General Precautions regarding Semiconductor Devices

Do not use devices under conditions exceeding their absolute maximum ratings (e.g. current, voltage, power dissipation or

temperature).

This may cause the device to break down, degrade its performance, or cause it to catch fire or explode resulting in injury.

Do not insert devices in the wrong orientation.

Make sure that the positive and negative terminals of power supplies are connected correctly. Otherwise the rated maximum

current or power dissipation may be exceeded and the device may break down or undergo performance degradation, causing it to

catch fire or explode and resulting in injury.

When power to a device is on, do not touch the device’s heat sink.

Heat sinks become hot, so you may burn your hand.

Do not touch the tips of device leads.

Because some types of device have leads with pointed tips, you may prick your finger.

When conducting any kind of evaluation, inspection or testing, be sure to connect the testing equipment’s electrodes or probes to

the pins of the device under test before powering it on.

Otherwise, you may receive an electric shock causing injury.

Before grounding an item of measuring equipment or a soldering iron, check that there is no electrical leakage from it.

Electrical leakage may cause the device which you are testing or soldering to break down, or could give you an electric shock.

Always wear protective glasses when cutting the leads of a device with clippers or a similar tool.

If you do not, small bits of metal flying off the cut ends may damage your eyes.

2.2 Precautions Specific to Each Product Group

2.2.1 Optical semiconductor devices

When a visible semiconductor laser is operating, do not look directly into the laser beam or look through the optical system.

This is highly likely to impair vision, and in the worst case may cause blindness.

If it is necessary to examine the laser apparatus, for example to inspect its optical characteristics, always wear the appropriate

type of laser protective glasses as stipulated by IEC standard IEC825-1.

Ensure that the current flowing in an LED device does not exceed the device’s maximum rated current.

This is particularly important for resin-packaged LED devices, as excessive current may cause the package resin to blow up,

scattering resin fragments and causing injury.

When testing the dielectric strength of a photocoupler, use testing equipment which can shut off the supply voltage to the

photocoupler. If you detect a leakage current of more than 100 µA, use the testing equipment to shut off the photocoupler’s

supply voltage; otherwise a large short-circuit current will flow continuously, and the device may break down or burst into flames,

resulting in fire or injury.

When incorporating a visible semiconductor laser into a design, use the device’s internal photodetector or a separate

photodetector to stabilize the laser’s radiant power so as to ensure that laser beams exceeding the laser’s rated radiant power

cannot be emitted.

If this stabilizing mechanism does not work and the rated radiant power is exceeded, the device may break down or the

excessively powerful laser beams may cause injury.

2.2.2 Power devices

Never touch a power device while it is powered on. Also, after turning off a power device, do not touch it until it has thoroughly

discharged all remaining electrical charge.

Touching a power device while it is powered on or still charged could cause a severe electric shock, resulting in death or serious

injury.

When conducting any kind of evaluation, inspection or testing, be sure to connect the testing equipment’s electrodes or probes to

the device under test before powering it on.

When you have finished, discharge any electrical charge remaining in the device.

Connecting the electrodes or probes of testing equipment to a device while it is powered on may result in electric shock, causing

injury.

Do not use devices under conditions which exceed their absolute maximum ratings (current, voltage, power dissipation,

temperature etc.).

This may cause the device to break down, causing a large short-circuit current to flow, which may in turn cause it to catch fire or

explode, resulting in fire or injury.

Use a unit which can detect short-circuit currents and which will shut off the power supply if a short-circuit occurs.

If the power supply is not shut off, a large short-circuit current will flow continuously, which may in turn cause the device to catch

fire or explode, resulting in fire or injury.

When designing a case for enclosing your system, consider how best to protect the user from shrapnel in the event of the device

catching fire or exploding.

Flying shrapnel can cause injury.

When conducting any kind of evaluation, inspection or testing, always use protective safety tools such as a cover for the device.

Otherwise you may sustain injury caused by the device catching fire or exploding.

Make sure that all metal casings in your design are grounded to earth.

Even in modules where a device’s electrodes and metal casing are insulated, capacitance in the module may cause the

electrostatic potential in the casing to rise.

Dielectric breakdown may cause a high voltage to be applied to the casing, causing electric shock and injury to anyone touching it.

When designing the heat radiation and safety features of a system incorporating high-speed rectifiers, remember to take the

device’s forward and reverse losses into account.

The leakage current in these devices is greater than that in ordinary rectifiers; as a result, if a high-speed rectifier is used in an

extreme environment (e.g. at high temperature or high voltage), its reverse loss may increase, causing thermal runaway to occur.

This may in turn cause the device to explode and scatter shrapnel, resulting in injury to the user.

A design should ensure that, except when the main circuit of the device is active, reverse bias is applied to the device gate while

electricity is conducted to control circuits, so that the main circuit will become inactive.

Malfunction of the device may cause serious accidents or injuries.

When conducting any kind of evaluation, inspection or testing, either wear protective gloves or wait until the device has cooled

properly before handling it.

Devices become hot when they are operated. Even after the power has been turned off, the device will retain residual heat which

may cause a burn to anyone touching it.

2.2.3 Bipolar ICs (for use in automobiles)

If your design includes an inductive load such as a motor coil, incorporate diodes or similar devices into the design to prevent

negative current from flowing in.

The load current generated by powering the device on and off may cause it to function erratically or to break down, which could in

turn cause injury.

Ensure that the power supply to any device which incorporates protective functions is stable.

If the power supply is unstable, the device may operate erratically, preventing the protective functions from working correctly. If

protective functions fail, the device may break down causing injury to the user.

3 General Safety Precautions and Usage Considerations

3-1

3. General Safety Precautions and Usage Considerations

This section is designed to help you gain a better understanding of semiconductor devices, so as to

ensure the safety, quality and reliability of the devices which you incorporate into your designs.

3.1 From Incoming to Shipping

3.1.1 Electrostatic discharge (ESD)

When handling individual devices (which are not yet mounted on a printed

circuit board), be sure that the environment is protected against

electrostatic electricity. Operators should wear anti-static clothing, and

containers and other objects which come into direct contact with devices

should be made of anti-static materials and should be grounded to earth via

a 0.5- to 1.0-MΩ protective resistor.

Please follow the precautions described below; this is particularly important

for devices which are marked “Be careful of static.”

(1) Work environment

• When humidity in the working environment decreases, the human body and other insulators

can easily become charged with static electricity due to friction. Maintain the recommended

humidity of 40% to 60% in the work environment, while also taking into account the fact that

moisture-proof-packed products may absorb moisture after unpacking.

• Be sure that all equipment, jigs and tools in the working area are grounded to earth.

• Place a conductive mat over the floor of the work area, or take other appropriate measures, so

that the floor surface is protected against static electricity and is grounded to earth. The surface

resistivity should be 10^4 to 10^8 Ω/sq and the resistance between surface and ground, 7.5 × 10^5 to

10^8 Ω.

• Cover the workbench surface also with a conductive mat (with a surface resistivity of 10^4 to

10^8 Ω/sq, for a resistance between surface and ground of 7.5 × 10^5 to 10^8 Ω). The purpose of this

is to disperse static electricity on the surface (through resistive components) and ground it to

earth. Workbench surfaces must not be constructed of low-resistance metallic materials that

allow rapid static discharge when a charged device touches them directly.

• Pay attention to the following points when using automatic equipment in your workplace:

(a) When picking up ICs with a vacuum unit, use a conductive rubber fitting on the end of the

pick-up wand to protect against electrostatic charge.

(b) Minimize friction on IC package surfaces. If some rubbing is unavoidable due to the device’s

mechanical structure, minimize the friction plane or use material with a small friction

coefficient and low electrical resistance. Also, consider the use of an ionizer.

(c) In sections which come into contact with device lead terminals, use a material which

dissipates static electricity.

(d) Ensure that no statically charged bodies (such as work clothes or the human body) touch

the devices.

(e) Make sure that sections of the tape carrier which come into contact with installation

devices or other electrical machinery are made of a low-resistance material.

(f) Make sure that jigs and tools used in the assembly process do not touch devices.

(g) In processes in which packages may retain an electrostatic charge, use an ionizer to

neutralize the ions.

• Make sure that CRT displays in the working area are protected against static charge, for

example by a VDT filter. As much as possible, avoid turning displays on and off. Doing so can

cause electrostatic induction in devices.

• Keep track of charged potential in the working area by taking periodic measurements.

• Ensure that work chairs are protected by an anti-static textile cover and are grounded to the

floor surface by a grounding chain. (Suggested resistance between the seat surface and

grounding chain is 7.5 × 10^5 to 10^12 Ω.)

• Install anti-static mats on storage shelf surfaces. (Suggested surface resistivity is 10^4 to 10^8

Ω/sq; suggested resistance between surface and ground is 7.5 × 10^5 to 10^8 Ω.)

• For transport and temporary storage of devices, use containers (boxes, jigs or bags) that are

made of anti-static materials or materials which dissipate electrostatic charge.

• Make sure that cart surfaces which come into contact with device packaging are made of

materials which will conduct static electricity, and verify that they are grounded to the floor

surface via a grounding chain.

• In any location where the level of static electricity is to be closely controlled, the ground

resistance level should be Class 3 or above. Use different ground wires for all items of

equipment which may come into physical contact with devices.
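
The suggested resistance figures in the work-environment list can be collected into a small lookup for checking periodic measurements. The following is only a minimal sketch: the dictionary keys and the function name are hypothetical, and the limits assume that the quoted figures are powers of ten (for example, 10^4 to 10^8 Ω/sq).

```python
# Suggested ranges taken from the work-environment list; names are hypothetical.
# Values are (minimum, maximum) in ohms, or ohms per square for surface resistivity.
SUGGESTED_RANGES = {
    "mat_surface_resistivity": (1e4, 1e8),   # floor, workbench and shelf mats (ohm/sq)
    "mat_surface_to_ground": (7.5e5, 1e8),   # floor, workbench and shelf mats (ohm)
    "chair_seat_to_chain": (7.5e5, 1e12),    # work chair seat to grounding chain (ohm)
}


def within_suggested_range(item: str, measured: float) -> bool:
    """Return True if a measured value lies inside the suggested range for the item."""
    low, high = SUGGESTED_RANGES[item]
    return low <= measured <= high
```

For example, a mat measured at 5 × 10^5 Ω between surface and ground falls below the suggested minimum of 7.5 × 10^5 Ω, so `within_suggested_range("mat_surface_to_ground", 5e5)` reports it as out of range.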

(2) Operating environment

• Operators must wear anti-static clothing and conductive shoes (or

a leg or heel strap).

• Operators must wear a wrist strap grounded to earth via a

resistor of about 1 MΩ.

• Soldering irons must be grounded from iron tip to earth, and must be used only at low voltages

(6 V to 24 V).

• If the tweezers you use are likely to touch the device terminals, use anti-static tweezers and in

particular avoid metallic tweezers. If a charged device touches a low-resistance tool, rapid

discharge can occur. When using vacuum tweezers, attach a conductive chucking pad to the tip,

and connect it to a dedicated ground used especially for anti-static purposes (suggested

resistance value: 10^4 to 10^8 Ω).

• Do not place devices or their containers near sources of strong electrical fields (such as above a

CRT).

• When storing printed circuit boards which have devices mounted on them, use a board

container or bag that is protected against static charge. To avoid the occurrence of static charge

or discharge due to friction, keep the boards separate from one another and do not stack them

directly on top of one another.

• Ensure, if possible, that any articles (such as clipboards) which are brought to any location

where the level of static electricity must be closely controlled are constructed of anti-static

materials.

• In cases where the human body comes into direct contact with a device, be sure to wear anti-

static finger covers or gloves (suggested resistance value: 10^8 Ω or less).

• Equipment safety covers installed near devices should have resistance ratings of 10^9 Ω or less.

• If a wrist strap cannot be used for some reason, and there is a possibility of imparting friction to

devices, use an ionizer.

• The transport film used in TCP products is manufactured from materials in which static

charges tend to build up. When using these products, install an ionizer to prevent the film from

being charged with static electricity. Also, ensure that no static electricity will be applied to the

product’s copper foils by taking measures to prevent static occurring in the peripheral

equipment.

3.1.2 Vibration, impact and stress

Handle devices and packaging materials with care. To avoid damage

to devices, do not toss or drop packages. Ensure that devices are not

subjected to mechanical vibration or shock during transportation.

Ceramic package devices and devices in canister-type packages which

have empty space inside them are subject to damage from vibration

and shock because the bonding wires are secured only at their ends.

Plastic molded devices, on the other hand, have a relatively high level

of resistance to vibration and mechanical shock because their bonding

wires are enveloped and fixed in resin. However, when any device or package type is installed in

target equipment, it is to some extent susceptible to wiring disconnections and other damage from

vibration, shock and stressed solder junctions. Therefore when devices are incorporated into the

design of equipment which will be subject to vibration, the structural design of the equipment

must be thought out carefully.

If a device is subjected to especially strong vibration, mechanical shock or stress, the package or

the chip itself may crack. In products such as CCDs which incorporate window glass, this could

cause surface flaws in the glass or cause the connection between the glass and the ceramic to

separate.

Furthermore, it is known that stress applied to a semiconductor device through the package

changes the resistance characteristics of the chip because of piezoelectric effects. In analog circuit

design attention must be paid to the problem of package stress as well as to the dangers of

vibration and shock as described above.

3.2 Storage

3.2.1 General storage

• Avoid storage locations where devices will be exposed to moisture or direct sunlight.

• Follow the instructions printed on the device cartons regarding

transportation and storage.

• The storage area temperature should be kept within a

temperature range of 5°C to 35°C, and relative humidity should

be maintained at between 45% and 75%.

• Do not store devices in the presence of harmful (especially

corrosive) gases, or in dusty conditions.

• Use storage areas where there is minimal temperature fluctuation. Rapid temperature changes

can cause moisture to form on stored devices, resulting in lead oxidation or corrosion. As a result,

the solderability of the leads will be degraded.

• When repacking devices, use anti-static containers.

• Do not allow external forces or loads to be applied to devices while they are in storage.

• If devices have been stored for more than two years, their electrical characteristics should be

tested and their leads should be tested for ease of soldering before they are used.
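
The numeric limits in the general storage list (5°C to 35°C, 45% to 75% relative humidity, retest after more than two years) can be expressed as simple checks, for instance in a stockroom logging script. This is a minimal sketch only; the function names are hypothetical and the two-year limit is interpreted here as the second anniversary of the storage date.

```python
from datetime import date


def general_storage_ok(temp_c: float, rh_percent: float) -> bool:
    """Check a reading against the general storage limits quoted in the list above."""
    return 5.0 <= temp_c <= 35.0 and 45.0 <= rh_percent <= 75.0


def needs_retest(stored_on: date, today: date) -> bool:
    """Return True once devices have been stored for more than two years.

    Assumption: "two years" means the second anniversary of stored_on.
    """
    try:
        limit = stored_on.replace(year=stored_on.year + 2)
    except ValueError:
        # stored_on is 29 February and the target year is not a leap year
        limit = stored_on.replace(year=stored_on.year + 2, day=28)
    return today > limit
```

A reading of 20°C at 50% RH passes `general_storage_ok`, while 40°C at 50% RH does not. Note that the moisture-proof packing limits in section 3.2.2 differ and are not covered by this check.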

3.2.2 Moisture-proof packing

Moisture-proof packing should be handled with care. The handling

procedure specified for each packing type should be followed scrupulously.

If the proper procedures are not followed, the quality and reliability of

devices may be degraded. This section describes general precautions for

handling moisture-proof packing. Since the details may differ from device

to device, refer also to the relevant individual datasheets or databook.

(1) General precautions

• Follow the instructions printed on the device cartons regarding transportation and storage.

• Do not drop or toss device packing. The laminated aluminum material in it can be rendered

ineffective by rough handling.

• The storage area temperature should be kept within a temperature range of 5°C to 30°C, and

relative humidity should be maintained at 90% (max). Use devices within 12 months of the date

marked on the package seal.

• If the 12-month storage period has expired, or if the 30% humidity indicator shown in Figure 1

is pink when the packing is opened, it may be advisable, depending on the device and packing

type, to bake the devices at high temperature to remove any moisture. Please refer to the table

below. After the pack has been opened, use the devices in a 5°C to 30°C, 60% RH environment

and within the effective usage period listed on the moisture-proof package. If the effective usage

period has expired, or if the packing has been stored in a high-humidity environment, bake the

devices at high temperature.

Packing   Moisture removal

Tray      If the packing bears the “Heatproof” marking or indicates the maximum temperature which it can
          withstand, bake at 125°C for 20 hours. (Some devices require a different procedure.)

Tube      Transfer devices to trays bearing the “Heatproof” marking or indicating the temperature which they
          can withstand, or to aluminum tubes before baking at 125°C for 20 hours.

Tape      Devices packed on tape cannot be baked and must be used within the effective usage period after
          unpacking, as specified on the packing.

• When baking devices, protect the devices from static electricity.

• Moisture indicators can detect the approximate humidity level at a standard temperature of

25°C. 6-point indicators and 3-point indicators are currently in use, but eventually all indicators

will be 3-point indicators.

Figure 1 Humidity indicator: (a) 6-point indicator (10% to 60% in steps of 10%); (b) 3-point indicator (20%, 30%, 40%). Both are marked "DANGER IF PINK" and "READ AT LAVENDER BETWEEN PINK & BLUE"; the 6-point indicator is also marked "CHANGE DESICCANT".
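The storage and baking rules in this section can be summarised as a small decision helper. The following Python sketch is illustrative only: the function and parameter names are hypothetical, the limits are the general ones quoted above, and the individual datasheet takes precedence wherever it differs.

```python
from typing import Optional

# General limits quoted in this section; individual devices may differ.
MAX_STORAGE_MONTHS = 12
BAKE_TEMP_C = 125
BAKE_HOURS = 20


def needs_moisture_removal(months_since_seal: float, indicator_30_pink: bool) -> bool:
    """Return True if moisture removal may be advisable when the pack is opened."""
    return months_since_seal > MAX_STORAGE_MONTHS or indicator_30_pink


def moisture_removal_procedure(packing: str) -> Optional[str]:
    """Return the general procedure for a packing type, or None if baking is not possible."""
    kind = packing.lower()
    if kind == "tray":
        return f"Bake at {BAKE_TEMP_C}°C for {BAKE_HOURS} hours (heatproof tray only)."
    if kind == "tube":
        return (f"Transfer to heatproof trays or aluminum tubes, then bake at "
                f"{BAKE_TEMP_C}°C for {BAKE_HOURS} hours.")
    if kind == "tape":
        return None  # devices packed on tape cannot be baked
    raise ValueError(f"unknown packing type: {packing!r}")
```

A return value of None for tape packing reflects the rule that such devices must simply be used within the effective usage period.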


3.3 Design

Care must be exercised in the design of electronic equipment to achieve the desired reliability. It is
important not only to adhere to specifications concerning absolute maximum ratings and
recommended operating conditions, but also to consider the overall environment in
which equipment will be used, including factors such as the ambient temperature, transient noise
and voltage and current surges, as well as mounting conditions which affect device reliability. This
section describes some general precautions which you should observe when designing circuits and
when mounting devices on printed circuit boards.

For more detailed information about each product family, refer to the relevant individual technical

datasheets available from Toshiba.

3.3.1 Absolute maximum ratings

Do not use devices under conditions in which their absolute maximum ratings

(e.g. current, voltage, power dissipation or temperature) will be exceeded. A

device may break down or its performance may be degraded, causing it to

catch fire or explode resulting in injury to the user.

The absolute maximum ratings are rated values which must not be

exceeded during operation, even for an instant. Although absolute

maximum ratings differ from product to product, they essentially

concern the voltage and current at each pin, the allowable power

dissipation, and the junction and storage temperatures.

If the voltage or current on any pin exceeds the absolute maximum

rating, the device’s internal circuitry can become degraded. In the worst

case, heat generated in internal circuitry can fuse wiring or cause the semiconductor chip to break

down.

If storage or operating temperatures exceed rated values, the package seal can deteriorate or the

wires can become disconnected due to the differences between the thermal expansion coefficients

of the materials from which the device is constructed.

3.3.2 Recommended operating conditions

The recommended operating conditions for each device are those necessary to guarantee that the

device will operate as specified in the datasheet.

If greater reliability is required, derate the device’s absolute maximum ratings for voltage, current,

power and temperature before using it.

3.3.3 Derating

When incorporating a device into your design, reduce its rated absolute maximum voltage, current,

power dissipation and operating temperature in order to ensure high reliability.

Since derating differs from application to application, refer to the technical datasheets available

for the various devices used in your design.
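As a rough illustration of derating, the sketch below scales each absolute maximum rating by a designer-chosen factor. The factor of 0.5 and the rating values are hypothetical; this section does not prescribe any particular derating rule, so take actual figures from the relevant datasheets.

```python
def derate(absolute_maximum: float, factor: float) -> float:
    """Return a design limit obtained by scaling an absolute maximum rating.

    `factor` is a hypothetical derating factor in (0, 1] chosen by the designer.
    """
    if not 0.0 < factor <= 1.0:
        raise ValueError("derating factor must be in (0, 1]")
    return absolute_maximum * factor


# Hypothetical ratings for illustration only (not taken from any datasheet).
ratings = {"voltage_V": 4.0, "current_mA": 20.0, "power_W": 2.0}
design_limits = {name: derate(value, 0.5) for name, value in ratings.items()}
```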

3.3.4 Unused pins

If unused pins are left open, some devices can exhibit input instability problems, resulting in

malfunctions such as abrupt increase in current flow. Similarly, if the unused output pins on a

device are connected to the power supply pin, the ground pin or to other output pins, the IC may

malfunction or break down.


Since the details regarding the handling of unused pins differ from device to device and from pin

to pin, please follow the instructions given in the relevant individual datasheets or databook.

CMOS logic IC inputs, for example, have extremely high impedance. If an input pin is left open, it

can easily pick up extraneous noise and become unstable. In this case, if the input voltage level

reaches an intermediate level, it is possible that both the P-channel and N-channel transistors

will be turned on, allowing unwanted supply current to flow. Therefore, ensure that the unused

input pins of a device are connected to the power supply (Vcc) pin or ground (GND) pin of the same

device. For details of what to do with the pins of heat sinks, refer to the relevant technical

datasheet and databook.

3.3.5 Latch-up

Latch-up is an abnormal condition inherent in CMOS devices, in which Vcc gets shorted to ground.

This happens when a parasitic PN-PN junction (thyristor structure) internal to the CMOS chip is

turned on, causing a large current of the order of several hundred mA or more to flow between Vcc

and GND, eventually causing the device to break down.

Latch-up occurs when the input or output voltage exceeds the rated value, causing a large current

to flow in the internal chip, or when the voltage on the Vcc (Vdd) pin exceeds its rated value,

forcing the internal chip into a breakdown condition. Once the chip falls into the latch-up state,

even though the excess voltage may have been applied only for an instant, the large current

continues to flow between Vcc (Vdd) and GND (Vss). This causes the device to heat up and, in

extreme cases, to emit gas fumes as well. To avoid this problem, observe the following precautions:

(1) Do not allow voltage levels on the input and output pins either to rise above Vcc (Vdd) or to

fall below GND (Vss). Also, follow any prescribed power-on sequence, so that power is applied

gradually or in steps rather than abruptly.

(2) Do not allow any abnormal noise signals to be applied to the device.

(3) Set the voltage levels of unused input pins to Vcc (Vdd) or GND (Vss).

(4) Do not connect output pins to one another.
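Precaution (1) amounts to a range check on every input and output pin. The following is a minimal Python sketch, assuming sampled pin voltages are available as a list; the function names are hypothetical, and the exact permitted margin is set by the individual device's ratings rather than by this check.

```python
from typing import Optional, Sequence


def pin_voltage_ok(v_pin: float, vcc: float, gnd: float = 0.0) -> bool:
    """Return True if a pin voltage stays within the GND (Vss) to Vcc (Vdd) range."""
    return gnd <= v_pin <= vcc


def first_violation(samples: Sequence[float], vcc: float, gnd: float = 0.0) -> Optional[int]:
    """Return the index of the first sample outside [gnd, vcc], or None if all are in range."""
    for index, voltage in enumerate(samples):
        if not pin_voltage_ok(voltage, vcc, gnd):
            return index
    return None
```

Because latch-up can be triggered by an excursion lasting only an instant, a check of this kind is only as good as the sampling rate of the measurement behind it.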

3.3.6 Input/Output protection

Wired-AND configurations, in which outputs are connected together, cannot be used, since this

short-circuits the outputs. Outputs should, of course, never be connected to Vcc (Vdd) or GND

(Vss).

Furthermore, ICs with tri-state outputs can undergo performance degradation if a shorted output

current is allowed to flow for an extended period of time. Therefore, when designing circuits, make

sure that tri-state outputs will not be enabled simultaneously.

3.3.7 Load capacitance

Some devices display increased delay times if the load capacitance is large. Also, large charging

and discharging currents will flow in the device, causing noise. Furthermore, since outputs are

shorted for a relatively long time, wiring can become fused.

Consult the technical information for the device being used to determine the recommended load

capacitance.


3.3.8 Thermal design

The failure rate of semiconductor devices is greatly increased as operating temperatures increase.

As shown in Figure 2, the internal thermal stress on a device is the sum of the ambient

temperature and the temperature rise due to power dissipation in the device. Therefore, to

achieve optimum reliability, observe the following precautions concerning thermal design:

(1) Keep the ambient temperature (Ta) as low as possible.

(2) If the device’s dynamic power dissipation is relatively large, select the most appropriate

circuit board material, and consider the use of heat sinks or of forced air cooling. Such

measures will help lower the thermal resistance of the package.

(3) Derate the device’s absolute maximum ratings to minimize thermal stress from power

dissipation.

θja = θjc + θca

θja = (Tj–Ta) / P

θjc = (Tj–Tc) / P

θca = (Tc–Ta) / P

in which
θja = thermal resistance between junction and surrounding air (°C/W)
θjc = thermal resistance between junction and package surface, or internal thermal resistance (°C/W)
θca = thermal resistance between package surface and surrounding air, or external thermal resistance (°C/W)
Tj = junction temperature or chip temperature (°C)
Tc = package surface temperature or case temperature (°C)
Ta = ambient temperature (°C)
P = power dissipation (W)

Figure 2 Thermal resistance of package (θjc between Tj and Tc; θca between Tc and Ta)
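Rearranging θja = (Tj–Ta) / P gives Tj = Ta + P × θja, which is the form usually needed at design time. The short Python sketch below applies it; the function names and the numeric values in the note that follows are hypothetical and do not describe any particular package.

```python
def junction_temperature(ta_c: float, power_w: float, theta_jc: float, theta_ca: float) -> float:
    """Estimate Tj (°C) from Tj = Ta + P * (θjc + θca)."""
    theta_ja = theta_jc + theta_ca  # θja = θjc + θca
    return ta_c + power_w * theta_ja


def case_temperature(ta_c: float, power_w: float, theta_ca: float) -> float:
    """Estimate Tc (°C) from Tc = Ta + P * θca."""
    return ta_c + power_w * theta_ca
```

With Ta = 40°C, P = 2 W, θjc = 5°C/W and θca = 20°C/W, these give Tj = 90°C and Tc = 80°C. Lowering θca with a heat sink or forced air cooling reduces both figures by the same amount.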

3.3.9 Interfacing

When connecting inputs and outputs between devices, make sure input voltage (VIL/VIH) and

output voltage (VOL/VOH) levels are matched. Otherwise, the devices may malfunction. When

connecting devices operating at different supply voltages, such as in a dual-power-supply system,

be aware that erroneous power-on and power-off sequences can result in device breakdown. For

details of how to interface particular devices, consult the relevant technical datasheets and

databooks. If you have any questions or doubts about interfacing, contact your nearest Toshiba

office or distributor.
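The level-matching requirement can be expressed as two inequalities: the driver's minimum VOH must not be lower than the receiver's minimum VIH, and the driver's maximum VOL must not exceed the receiver's maximum VIL. A minimal Python sketch follows; the function names are hypothetical, and real values must come from the datasheets of both devices.

```python
from typing import Tuple


def levels_compatible(voh_min: float, vol_max: float, vih_min: float, vil_max: float) -> bool:
    """Return True if the driver's guaranteed output levels satisfy the receiver's input thresholds."""
    return voh_min >= vih_min and vol_max <= vil_max


def noise_margins(voh_min: float, vol_max: float, vih_min: float, vil_max: float) -> Tuple[float, float]:
    """Return (high-level margin, low-level margin) in volts; a negative value indicates a mismatch."""
    return (voh_min - vih_min, vil_max - vol_max)
```

The margins are worth computing even when the levels are compatible, since a small margin leaves little room for noise on the connecting wiring.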


3.3.10 Decoupling

Spike currents generated during switching can cause Vcc (Vdd) and GND (Vss) voltage levels to

fluctuate, causing ringing in the output waveform or a delay in response speed. (The power supply

and GND wiring impedance is normally 50 Ω to 100 Ω.) For this reason, the impedance of power

supply lines with respect to high frequencies must be kept low. This can be accomplished by using

thick and short wiring for the Vcc (Vdd) and GND (Vss) lines and by installing decoupling

capacitors (of approximately 0.01 µF to 1 µF capacitance) as high-frequency filters between Vcc

(Vdd) and GND (Vss) at strategic locations on the printed circuit board.
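To see why capacitors in this range act as high-frequency filters, it helps to estimate their impedance. The Python sketch below uses the ideal-capacitor formula |Z| = 1 / (2πfC); it ignores equivalent series resistance and inductance, and the 10 MHz evaluation frequency is an arbitrary assumption chosen for illustration.

```python
import math


def capacitive_reactance(freq_hz: float, capacitance_f: float) -> float:
    """Return the impedance magnitude of an ideal capacitor, 1 / (2*pi*f*C), in ohms."""
    if freq_hz <= 0 or capacitance_f <= 0:
        raise ValueError("frequency and capacitance must be positive")
    return 1.0 / (2.0 * math.pi * freq_hz * capacitance_f)


# Decoupling values from the range quoted above, evaluated at a hypothetical 10 MHz.
reactance_ohms = {
    label: capacitive_reactance(10e6, farads)
    for label, farads in (("0.01 uF", 0.01e-6), ("0.1 uF", 0.1e-6), ("1 uF", 1e-6))
}
```

At this frequency every value in the range presents well under 2 Ω, far below the 50 Ω to 100 Ω wiring impedance mentioned above. In a real capacitor, parasitic inductance limits how far this trend continues at higher frequencies.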

For low-frequency filtering, it is a good idea to install a 10- to 100-µF capacitor on the printed

circuit board (one capacitor will suffice). If the capacitance is excessively large, however, (e.g.

several thousand µF) latch-up can be a problem. Be sure to choose an appropriate capacitance

value.

An important point about wiring is that, in the case of high-speed logic ICs, noise is caused mainly

by reflection and crosstalk, or by the power supply impedance. Reflections cause increased signal

delay, ringing, overshoot and undershoot, thereby reducing the device’s safety margins with

respect to noise. To prevent reflections, reduce the wiring length by increasing the device

mounting density so as to lower the inductance (L) and capacitance (C) in the wiring. Extreme

care must be taken, however, when taking this corrective measure, since it tends to cause

crosstalk between the wires. In practice, there must be a trade-off between these two factors.

3.3.11 External noise

Printed circuit boards with long I/O or signal pattern lines are vulnerable to induced noise or
surges from outside sources. Consequently, malfunctions or breakdowns can result from
overcurrent or overvoltage, depending on the types of device used. To protect against noise, lower
the impedance of the pattern line or insert a noise-canceling circuit. Protective measures must
also be taken against surges.

For details of the appropriate protective measures for a particular device, consult the relevant
databook.

3.3.12 Electromagnetic interference

Widespread use of electrical and electronic equipment in recent years has brought with it radio

and TV reception problems due to electromagnetic interference. To use the radio spectrum

effectively and to maintain radio communications quality, each country has formulated

regulations limiting the amount of electromagnetic interference which can be generated by

individual products.

Electromagnetic interference includes conduction noise propagated through power supply and

telephone lines, and noise from direct electromagnetic waves radiated by equipment. Different

measurement methods and corrective measures are used to assess and counteract each specific

type of noise.

Difficulties in controlling electromagnetic interference derive from the fact that there is no

method available which allows designers to calculate, at the design stage, the strength of the

electromagnetic waves which will emanate from each component in a piece of equipment. For this

reason, it is only after the prototype equipment has been completed that the designer can take

measurements using a dedicated instrument to determine the strength of electromagnetic

interference waves. Yet it is possible during system design to incorporate some measures for the

prevention of electromagnetic interference, which can facilitate taking corrective measures once

the design has been completed. These include installing shields and noise filters, and increasing


the thickness of the power supply wiring patterns on the printed circuit board. One effective

method, for example, is to devise several shielding options during design, and then select the most

suitable shielding method based on the results of measurements taken after the prototype has

been completed.

3.3.13 Peripheral circuits

In most cases semiconductor devices are used with peripheral circuits and components. The input

and output signal voltages and currents in these circuits must be chosen to match the

semiconductor device’s specifications. The following factors must be taken into account.

(1) Inappropriate voltages or currents applied to a device’s input pins may cause it to operate

erratically. Some devices contain pull-up or pull-down resistors. When designing your system,

remember to take the effect of this on the voltage and current levels into account.

(2) The output pins on a device have a predetermined external circuit drive capability. If this

drive capability is greater than that required, either incorporate a compensating circuit into

your design or carefully select suitable components for use in external circuits.

3.3.14 Safety standards

Each country has safety standards which must be observed. These safety standards include

requirements for quality assurance systems and design of device insulation. Such requirements

must be fully taken into account to ensure that your design conforms to the applicable safety

standards.

3.3.15 Other precautions

(1) When designing a system, be sure to incorporate fail-safe and other appropriate measures

according to the intended purpose of your system. Also, be sure to debug your system under

actual board-mounted conditions.

(2) If a plastic-package device is placed in a strong electric field, surface leakage may occur due to

the charge-up phenomenon, resulting in device malfunction. In such cases take appropriate

measures to prevent this problem, for example by protecting the package surface with a

conductive shield.

(3) With some microcomputers and MOS memory devices, caution is required when powering on

or resetting the device. To ensure that your design does not violate device specifications,

consult the relevant databook for each constituent device.

(4) Ensure that no conductive material or object (such as a metal pin) can drop onto and short the

leads of a device mounted on a printed circuit board.

3.4 Inspection, Testing and Evaluation

3.4.1 Grounding

Ground all measuring instruments, jigs, tools and soldering irons to earth.

Electrical leakage may cause a device to break down or may result in electric

shock.


3.4.2 Inspection Sequence

• Do not insert devices in the wrong orientation. Make sure that the positive and negative
electrodes of the power supply are correctly connected. Otherwise, the rated maximum current or
maximum power dissipation may be exceeded and the device may break down or undergo
performance degradation, causing it to catch fire or explode, resulting in injury to the user.

• When conducting any kind of evaluation, inspection or testing using AC power with a peak
voltage of 42.4 V or DC power exceeding 60 V, be sure to connect the electrodes or probes of the
testing equipment to the device under test before powering it on. Connecting the electrodes or
probes of testing equipment to a device while it is powered on may result in electric shock,
causing injury.

(1) Apply voltage to the test jig only after inserting the device securely into it. When applying or

removing power, observe the relevant precautions, if any.

(2) Make sure that the voltage applied to the device is off before removing the device from the

test jig. Otherwise, the device may undergo performance degradation or be destroyed.

(3) Make sure that no surge voltages from the measuring equipment are applied to the device.

(4) The chips housed in tape carrier packages (TCPs) are bare chips and are therefore exposed.

During inspection take care not to crack the chip or cause any flaws in it.

Electrical contact may also cause a chip to become faulty. Therefore make sure that nothing

comes into electrical contact with the chip.

3.5 Mounting

There are essentially two main types of semiconductor device package: lead insertion and surface

mount. During mounting on printed circuit boards, devices can become contaminated by flux or

damaged by thermal stress from the soldering process. With surface-mount devices in particular,

the most significant problem is thermal stress from solder reflow, when the entire package is

subjected to heat. This section describes a recommended temperature profile for each mounting

method, as well as general precautions which you should take when mounting devices on printed

circuit boards. Note, however, that even for devices with the same package type, the appropriate

mounting method varies according to the size of the chip and the size and shape of the lead frame.

Therefore, please consult the relevant technical datasheet and databook.

3.5.1 Lead forming

• Always wear protective glasses when cutting the leads of a device with clippers or a similar
tool. If you do not, small bits of metal flying off the cut ends may damage your eyes.

• Do not touch the tips of device leads. Because some types of device have leads with pointed
tips, you may prick your finger.

Semiconductor devices must undergo a process in which the leads are cut and formed before the

devices can be mounted on a printed circuit board. If undue stress is applied to the interior of a

device during this process, mechanical breakdown or performance degradation can result. This is

attributable primarily to differences between the stress on the device’s external leads and the

stress on the internal leads. If the relative difference is great enough, the device’s internal leads,

adhesive properties or sealant can be damaged. Observe these precautions during the lead-

forming process (this does not apply to surface-mount devices):


(1) Lead insertion hole intervals on the printed circuit board should match the lead pitch of the

device precisely.

(2) If lead insertion hole intervals on the printed circuit board do not precisely match the lead

pitch of the device, do not attempt to forcibly insert devices by pressing on them or by pulling

on their leads.

(3) For the minimum clearance specification between a device and a printed circuit board, refer
to the relevant device’s datasheet and databook. If necessary, achieve the required clearance
by forming the device’s leads appropriately. Do not use the spacers which are used to raise
devices above the surface of the printed circuit board during soldering to achieve clearance.
These spacers normally continue to expand due to heat, even after the solder has begun to
solidify; this applies severe stress to the device.

(4) Observe the following precautions when forming the leads of a device prior to mounting.

• Use a tool or jig to secure the lead at its base (where the lead meets the device package) while

bending so as to avoid mechanical stress to the device. Also avoid bending or stretching device

leads repeatedly.

• Be careful not to damage the lead during lead forming.

• Follow any other precautions described in the individual datasheets and data books for each

device and package type.

3.5.2 Socket mounting

(1) When socket mounting devices on a printed circuit board, use sockets which match the

inserted device’s package.

(2) Use sockets whose contacts have the appropriate contact pressure. If the contact pressure is

insufficient, the socket may not make a perfect contact when the device is repeatedly inserted

and removed; if the pressure is excessively high, the device leads may be bent or damaged

when they are inserted into or removed from the socket.

(3) When soldering sockets to the printed circuit board, use sockets whose construction prevents

flux from penetrating into the contacts or which allows flux to be completely cleaned off.

(4) Make sure the coating agent applied to the printed circuit board for moisture-proofing

purposes does not stick to the socket contacts.

(5) If the device leads are severely bent by a socket as it is inserted or removed and you wish to

repair the leads so as to continue using the device, make sure that this lead correction is only

performed once. Do not use devices whose leads have been corrected more than once.

(6) If the printed circuit board with the devices mounted on it will be subjected to vibration from

external sources, use sockets which have a strong contact pressure so as to prevent the

sockets and devices from vibrating relative to one another.

3.5.3 Soldering temperature profile

The soldering temperature and heating time vary from device to device. Therefore, when

specifying the mounting conditions, refer to the individual datasheets and databooks for the

devices used.


(1) Using a soldering iron

Complete soldering within ten seconds for lead temperatures of up to 260°C, or within three

seconds for lead temperatures of up to 350°C.

(2) Using medium infrared ray reflow

• Heating top and bottom with long or medium infrared rays is recommended (see Figure 3).

Figure 3 Heating top and bottom with long or medium infrared rays (long infrared ray heater for preheating, medium infrared ray heater for reflow, arranged along the product flow)

• Complete the infrared ray reflow process within 30 seconds at a package surface temperature of

between 210°C and 240°C.

• Refer to Figure 4 for an example of a good temperature profile for infrared or hot air reflow.

Figure 4 Sample temperature profile for infrared or hot air reflow (package surface temperature (°C) against time (in seconds); marked levels 140, 160, 210 and 240°C; marked intervals 60-120 seconds and 30 seconds or less)

(3) Using hot air reflow

• Complete hot air reflow within 30 seconds at a package surface temperature of between 210°C

and 240°C.

• For an example of a recommended temperature profile, refer to Figure 4 above.

(4) Using solder flow

• Apply preheating for 60 to 120 seconds at a temperature of 150°C.

• For lead insertion-type packages, complete solder flow within 10 seconds, with the
temperature at the stopper (or, if there is no stopper, at a location more than 1.5 mm from
the body) not exceeding 260°C.


• For surface-mount packages, complete soldering within 5 seconds at a temperature of 250°C or

less in order to prevent thermal stress in the device.

• Figure 5 shows an example of a recommended temperature profile for surface-mount packages

using solder flow.

Figure 5 Sample temperature profile for solder flow (package surface temperature (°C) against time (in seconds); marked levels 140, 160 and 250°C; marked intervals 60-120 seconds and 5 seconds or less)
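The general time and temperature limits given in this subsection can be collected into a lookup table for a quick sanity check of a planned process. This Python sketch is illustrative only: the method keys and the function name are hypothetical, and the limits in the individual datasheets take precedence.

```python
# General limits from this subsection: (maximum temperature in °C, maximum time in seconds).
# For the reflow methods, the time limit applies to the period spent at 210°C to 240°C.
SOLDERING_LIMITS = {
    "iron_up_to_260": (260, 10),
    "iron_up_to_350": (350, 3),
    "infrared_reflow": (240, 30),
    "hot_air_reflow": (240, 30),
    "flow_lead_insertion": (260, 10),
    "flow_surface_mount": (250, 5),
}


def within_limits(method: str, peak_temp_c: float, seconds: float) -> bool:
    """Return True if the peak temperature and heating time respect the general limits."""
    max_temp_c, max_seconds = SOLDERING_LIMITS[method]
    return peak_temp_c <= max_temp_c and seconds <= max_seconds
```

A check like this covers only the peak stage; the 60 to 120 second preheating stage shown in the sample profiles needs to be verified separately.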

3.5.4 Flux cleaning and ultrasonic cleaning

(1) When cleaning circuit boards to remove flux, make sure that no residual reactive ions such as

Na or Cl remain. Note that organic solvents react with water to generate hydrogen chloride

and other corrosive gases which can degrade device performance.

(2) Washing devices with water will not cause any problems. However, make sure that no

reactive ions such as sodium and chlorine are left as a residue. Also, be sure to dry devices

sufficiently after washing.

(3) Do not rub device markings with a brush or with your hand during cleaning or while the

devices are still wet from the cleaning agent. Doing so can rub off the markings.

(4) The dip cleaning, shower cleaning and steam cleaning processes all involve the chemical

action of a solvent. Use only recommended solvents for these cleaning methods. When

immersing devices in a solvent or steam bath, make sure that the temperature of the liquid is

50°C or below, and that the circuit board is removed from the bath within one minute.

(5) Ultrasonic cleaning should not be used with hermetically-sealed ceramic packages such as a

leadless chip carrier (LCC), pin grid array (PGA) or charge-coupled device (CCD), because the

bonding wires can become disconnected due to resonance during the cleaning process. Even if

a device package allows ultrasonic cleaning, limit the duration of ultrasonic cleaning to as

short a time as possible, since long hours of ultrasonic cleaning degrade the adhesion between

the mold resin and the frame material. The following ultrasonic cleaning conditions are

recommended:

Frequency: 27 kHz to 29 kHz

Ultrasonic output power: 300 W or less (0.25 W/cm² or less)

Cleaning time: 30 seconds or less

Suspend the circuit board in the solvent bath during ultrasonic cleaning in such a way that

the ultrasonic vibrator does not come into direct contact with the circuit board or the device.
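The recommended ultrasonic cleaning conditions lend themselves to a simple automated check. In the Python sketch below, the function name is hypothetical and the power density is computed over an assumed radiating area supplied by the caller, since this section does not say which area the W/cm² figure refers to.

```python
def ultrasonic_conditions_ok(freq_khz: float, power_w: float, area_cm2: float, seconds: float) -> bool:
    """Return True if all recommended ultrasonic cleaning conditions are met.

    `area_cm2` is the area assumed for the power density limit (an assumption,
    not stated in the source).
    """
    if area_cm2 <= 0:
        raise ValueError("area must be positive")
    power_density = power_w / area_cm2  # W/cm²
    return (
        27.0 <= freq_khz <= 29.0
        and power_w <= 300.0
        and power_density <= 0.25
        and seconds <= 30.0
    )
```

Note that this check applies only to packages for which ultrasonic cleaning is permitted at all; it should not be used with the hermetically-sealed ceramic packages listed above.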


3.5.5 No cleaning

If analog devices or high-speed devices are used without being cleaned, flux residues may cause

minute amounts of leakage between pins. Similarly, dew condensation, which occurs in

environments containing residual chlorine when power to the device is on, may cause between-

lead leakage or migration. Therefore, Toshiba recommends that these devices be cleaned.

However, if the flux used contains only a small amount of halogen (0.05W% or less), the devices

may be used without cleaning and without any problems.

3.5.6 Mounting tape carrier packages (TCPs)

(1) When tape carrier packages (TCPs) are mounted, measures must be taken to prevent

electrostatic breakdown of the devices.

(2) If devices are being picked up from tape, or outer lead bonding (OLB) mounting is being

carried out, consult the manufacturer of the insertion machine which is being used, in order

to establish the optimum mounting conditions in advance and to avoid any possible hazards.

(3) The base film, which is made of polyimide, is hard and thin. Be careful not to cut or scratch

your hands or any objects while handling the tape.

(4) When punching tape, try not to scatter broken pieces of tape too much.

(5) Treat the extra film, reels and spacers left after punching as industrial waste, taking care not

to destroy or pollute the environment.

(6) Chips housed in tape carrier packages (TCPs) are bare chips and therefore have their reverse

side exposed. To ensure that the chip will not be cracked during mounting, ensure that no

mechanical shock is applied to the reverse side of the chip. Electrical contact may also cause a

chip to fail. Therefore, when mounting devices, make sure that nothing comes into electrical

contact with the reverse side of the chip.

If your design requires connecting the reverse side of the chip to the circuit board, please

consult Toshiba or a Toshiba distributor beforehand.

3.5.7 Mounting chips

Devices delivered in chip form tend to degrade or break under external forces much more easily

than plastic-packaged devices. Therefore, caution is required when handling this type of device.

(1) Mount devices in a properly prepared environment so that chip surfaces will not be exposed to

polluted ambient air or other polluted substances.

(2) When handling chips, be careful not to expose them to static electricity.

In particular, measures must be taken to prevent static damage during the mounting of chips.

With this in mind, Toshiba recommends mounting all peripheral parts first and then mounting

chips last (after all other components have been mounted).

(3) Make sure that PCBs (or any other kind of circuit board) on which chips are being mounted do

not have any chemical residues on them (such as the chemicals which were used for etching

the PCBs).

(4) When mounting chips on a board, use the method of assembly that is most suitable for

maintaining the appropriate electrical, thermal and mechanical properties of the

semiconductor devices used.

* For details of devices in chip form, refer to the relevant device’s individual datasheets.


3.5.8 Circuit board coating

When devices are to be used in equipment requiring a high degree of reliability or in extreme

environments (where moisture, corrosive gas or dust is present), circuit boards may be coated for

protection. However, before doing so, you must carefully consider the possible stress and

contamination effects that may result and then choose the coating resin which results in the

minimum level of stress to the device.

3.5.9 Heat sinks

(1) When attaching a heat sink to a device, be careful not to apply excessive force to the device in

the process.

(2) When attaching a device to a heat sink by fixing it at two or more locations, evenly tighten all

the screws in stages (i.e. do not fully tighten one screw while the rest are still only loosely

tightened). Finally, fully tighten all the screws up to the specified torque.

(3) Drill holes for screws in the heat sink exactly as specified. Smooth the

surface by removing burrs and protrusions or indentations which might

interfere with the installation of any part of the device.

(4) A coating of silicone compound can be applied between the heat sink and

the device to improve heat conductivity. Be sure to apply the coating

thinly and evenly; do not use too much. Also, be sure to use a non-volatile

compound, as volatile compounds can crack after a time, causing the heat

radiation properties of the heat sink to deteriorate.

(5) If the device is housed in a plastic package, use caution when selecting the type of silicone

compound to be applied between the heat sink and the device. With some types, the base oil

separates and penetrates the plastic package, significantly reducing the useful life of the

device.

Two recommended silicone compounds in which base oil separation is not a problem are

YG6260 from Toshiba Silicone.

(6) Heat-sink-equipped devices can become very hot during operation. Do not touch them, or you

may sustain a burn.

3.5.10 Tightening torque

(1) Make sure the screws are tightened with fastening torques not exceeding the torque values

stipulated in individual datasheets and databooks for the devices used.

(2) Do not allow a power screwdriver (electrical or air-driven) to touch devices.

3.5.11 Repeated device mounting and usage

Do not remount or re-use devices which fall into the categories listed below; these devices may

cause significant problems relating to performance and reliability.

(1) Devices which have been removed from the board after soldering

(2) Devices which have been inserted in the wrong orientation or which have had reverse current

applied

(3) Devices which have undergone lead forming more than once


3.6 Protecting Devices in the Field

3.6.1 Temperature

Semiconductor devices are generally more sensitive to temperature than are other electronic

components. The various electrical characteristics of a semiconductor device are dependent on the

ambient temperature at which the device is used. It is therefore necessary to understand the

temperature characteristics of a device and to incorporate device derating into circuit design. Note

also that if a device is used above its maximum temperature rating, device deterioration is more

rapid and it will reach the end of its usable life sooner than expected.

3.6.2 Humidity

Resin-molded devices are sometimes improperly sealed. When these devices are used for an

extended period of time in a high-humidity environment, moisture can penetrate into the device

and cause chip degradation or malfunction. Furthermore, when devices are mounted on a regular

printed circuit board, the impedance between wiring components can decrease under high-

humidity conditions. In systems which require a high signal-source impedance, circuit board

leakage or leakage between device lead pins can cause malfunctions. The application of a

moisture-proof treatment to the device surface should be considered in this case. On the other

hand, operation under low-humidity conditions can damage a device due to the occurrence of

electrostatic discharge. Unless damp-proofing measures have been specifically taken, use devices

only in environments with appropriate ambient moisture levels (i.e. within a relative humidity

range of 40% to 60%).

3.6.3 Corrosive gases

Corrosive gases can cause chemical reactions in devices, degrading device characteristics.

For example, sulphur-bearing corrosive gases emanating from rubber placed near a device

(accompanied by condensation under high-humidity conditions) can corrode a device’s leads. The

resulting chemical reaction between leads forms foreign particles which can cause electrical

leakage.

3.6.4 Radioactive and cosmic rays

Most industrial and consumer semiconductor devices are not designed with protection against

radioactive and cosmic rays. Devices used in aerospace equipment or in radioactive environments

must therefore be shielded.

3.6.5 Strong electrical and magnetic fields

Devices exposed to strong magnetic fields can undergo a polarization phenomenon in their

plastic material, or within the chip, which gives rise to abnormal symptoms such as impedance

changes or increased leakage current. Failures have been reported in LSIs mounted near

malfunctioning deflection yokes in TV sets. In such cases the device’s installation location must be

changed or the device must be shielded against the electrical or magnetic field. Shielding against

magnetism is especially necessary for devices used in an alternating magnetic field because of the

electromotive forces generated in this type of environment.

3.6.6 Interference from light (ultraviolet rays, sunlight, fluorescent lamps and

incandescent lamps)

Light striking a semiconductor device generates electromotive force due to photoelectric effects. In

some cases the device can malfunction. This is especially true for devices in which the internal

chip is exposed. When designing circuits, make sure that devices are protected against incident

light from external sources. This problem is not limited to optical semiconductors and EPROMs.

All types of device can be affected by light.

3.6.7 Dust and oil

Just like corrosive gases, dust and oil can cause chemical reactions in devices, which will

adversely affect a device’s electrical characteristics. To avoid this problem, do not use devices in

dusty or oily environments. This is especially important for optical devices because dust and oil

can affect a device’s optical characteristics as well as its physical integrity and the electrical

performance factors mentioned above.

3.6.8 Fire

Semiconductor devices are combustible; they can emit smoke and catch fire if heated sufficiently.

When this happens, some devices may generate poisonous gases. Devices should therefore never

be used in close proximity to an open flame or a heat-generating body, or near flammable or

combustible materials.

3.7 Disposal of devices and packing materials

When discarding unused devices and packing materials, follow all procedures specified by local

regulations in order to protect the environment against contamination.

4. Precautions and Usage Considerations

This section describes matters specific to each product group which need to be taken into

consideration when using devices. If the same item is described in Sections 3 and 4, the

description in Section 4 takes precedence.

4.1 Microcontrollers

4.1.1 Design

(1) Using resonators which are not specifically recommended for use

Resonators recommended for use with Toshiba products in microcontroller oscillator applications

are listed in Toshiba databooks along with information about oscillation conditions. If you use a

resonator not included in this list, please consult Toshiba or the resonator manufacturer

concerning the suitability of the device for your application.

(2) Undefined functions

In some microcontrollers certain instruction code values do not constitute valid processor

instructions. Also, it is possible that the values of bits in registers will become undefined. Take

care in your applications not to use invalid instructions or to let register bit values become

undefined.

Chapter 1 Introduction

1. Introduction

This user’s manual describes the C790 superscalar microprocessor for the system designer,

paying special attention to the software interface and the bus interface.

The C790 is a superscalar integrated implementation of the subset of the 64-bit MIPS IV

Instruction Set Architecture. It also implements a large extension to this instruction set

specially tailored for multimedia applications. It contains a CPU, a floating point

execution unit (Coprocessor 1), and primary instruction and data caches.

Two instructions can be decoded each cycle. These instructions are issued in-order and are

always completed in-order (see Note 1). Data cache misses are non-blocking. A single outstanding

cache miss does not stall the pipeline, so that load misses or uncached loads are retired

out-of-order. Multiply, Multiply-Accumulate, Divide, Prefetch, and Coprocessor 1

instructions are also retired out-of-order.

Note 1: However, some instructions are retired out-of-order.

1.1 Features

The C790 core has the following features:

• 2-way superscalar pipeline

• 128-bit (two 64-bit) data path and 128-bit system bus

• Instruction set architecture

• 64-bit MIPS III instruction set implementation (except LL, SC, LLD and

SCD)

• Selected MIPS IV instruction set implementation (Prefetch and Move

conditional instructions)

• Three-operand Multiply and Multiply-Accumulate instructions

• 128-bit (Quadword) load/store instructions

• 128-bit multimedia instructions which configure the 128-bit data path as two

64-bit, four 32-bit, eight 16-bit or sixteen 8-bit paths

• Configurable Endianness

• Branch prediction with Branch History Table (BHT) and Branch Target Address

Cache (BTAC)

• Large on-chip caches

• Instruction cache: 32KB, 2-way set associative

• Data cache: 32KB, 2-way set-associative (with write-back protocol)

• Non-blocking load, hit under miss and early restart on first quadword

• Data cache line locking

• Prefetch functions

• 64 Byte cache line

• Fast integer Multiply and Multiply-Accumulate operations

• Memory management unit

• 48-entry (96 pages) fully associative translation look-aside buffer (TLB)

• 32-bit physical address space and 32-bit virtual address space

• IEEE754-1985 compatible FPU (MIPS III ISA supported)

• Performance counters supported

• Debug support

• Multi-stepping of instruction execution

• Hardware breakpoint on instruction addresses

• Hardware breakpoint on data address and data value

• PC tracing capability

• 128-bit demultiplexed data bus and 32-bit address bus

• Pipelined addresses

• Bus error supported

• Multiple masters supported
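The cache figures in the feature list (32 KB, 2-way set associative, 64-byte line) fix the number of sets and the width of the address fields. The following is a minimal sketch of that arithmetic only; the function names are illustrative, and whether the C790 indexes its caches with virtual or physical address bits is not assumed here.

```python
# Geometry taken from the feature list: 32 KB, 2-way set associative, 64-byte lines.
CACHE_BYTES = 32 * 1024
WAYS = 2
LINE_BYTES = 64


def cache_geometry(cache_bytes=CACHE_BYTES, ways=WAYS, line_bytes=LINE_BYTES):
    """Return (sets, offset_bits, index_bits) for a set-associative cache."""
    sets = cache_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1  # log2, valid for powers of two
    index_bits = sets.bit_length() - 1
    return sets, offset_bits, index_bits


def split_address(addr):
    """Split a 32-bit address into (tag, set index, byte offset within the line)."""
    sets, offset_bits, index_bits = cache_geometry()
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> offset_bits) & (sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset
```

With these parameters each cache has 256 sets, a 6-bit line offset and an 8-bit set index, and each 64-byte line holds four 16-byte quadwords.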

1.2 Related Documents

The following documents should be referenced:

[1] MIPS R4000 Microprocessor User’s Manual

[2] MIPS R10000 Microprocessor User’s Manual

[3] MIPS IV Instruction Set (Revision 3.2)

1.3 Revision History

Rev. 1.0: June 24th, 1999

Rev. 1.1: December 25th, 1999

Add IEEE754 compatible FPU feature (both single- and double-precision)

Rev. 1.2: March, 2000

Publish

Rev. 2.0: April, 2001

Fixed many typos

1.4 Conventions Used in This Manual

The names of registers, fields, and instructions are italicized as in this example:

The Status register (SR) is a read/write register that contains the operating mode,

interrupt enabling, and diagnostic states of the processor.

When a name is first introduced, it is shown in bold type.

Ranges are denoted by a colon as in the following example:

The 4-bit Coprocessor Usability (CU[3:0]) field controls the usability of four possible

coprocessors.
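The range notation can be read as an inclusive bit slice. The sketch below is illustrative only: the helper names are hypothetical, and the placement of the CU field at Status bits 31:28 is an assumption carried over from the MIPS R4000 Status register layout, not something stated in this section.

```python
def field(value, hi, lo):
    """Extract bits hi..lo (inclusive, hi >= lo) of value, as in the NAME[hi:lo] notation."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


# Assumption: CU[3:0] sits at Status[31:28], as in the MIPS R4000 Status register.
CU_HI, CU_LO = 31, 28


def coprocessor_usable(status, n):
    """True if bit n of the CU field is set in the given Status value."""
    cu = field(status, CU_HI, CU_LO)
    return bool((cu >> n) & 1)
```

For example, a Status value of 0x30000000 gives CU[3:0] = 0b0011, so coprocessors 0 and 1 are marked usable under this assumed layout.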

Conventions used in instruction descriptions are defined at the beginning of Appendices A,

B, C, and D.

1.5 Restrictions for Use of the C790 CPU Core

1. Revision History

Revision  Date      Contents
1.0       4/2/2001  FLX01-FLX06; Restrictions for User's Manual Rev. 2.0

Items 1 through 6 in the description below are the restrictions that must be obeyed

when using the C790 CPU core (User's Manual Rev. 2.0).

Table 1-1. Restriction List

ID     Contents
FLX01  TLB exceptions mask bus errors.
FLX02  Bus errors are masked when Status.ERL==1 or Status.EXL==1.
FLX03  AdEL occurs in index-type ICACHE or BTAC CACHE instructions.
FLX04  kuseg becomes an uncached area when an error exception (Status.ERL==1) occurs.
FLX05  First two instructions in an exception handler are executed as NOP when a bus error occurs.
FLX06  Unexpected instruction-fetch bus errors occur when executing a Crashme program.

2. Description

2.1 TLB exceptions mask bus errors (FLX01)

2.1.1 Phenomenon

There are cases in which TLB exceptions occurring immediately after a bus error

mask the bus error, so that the bus error cannot be detected.

2.1.2 Corrective measures

This is caused by bus error exceptions having a lower priority than TLB

exceptions in instruction fetch and data access (refer to “5.5.1 Exception Priority”).

Check the following when programming a TLB exception handler.

1) Using the TLB exception handler, check for occurrence of any bus error

exceptions before a page refill.

2) Using the TLB exception handler, check for occurrence of any bus error

exceptions if a page that should be refilled is incorrect.

3) Using the TLB exception handler, execute at Status.EXL==0 and

Status.ERL==0 after the TLB exception handler stores to EPC, Cause, and

Status registers.

Pending bus errors can be confirmed by referring to Status.BEM.

2.2 Bus errors are masked when Status.ERL==1 or Status.EXL = 1 (FLX02)

2.2.1 Phenomenon

Even if a bus error occurs during instruction fetch in an exception handler

(Status.EXL==1 or Status.ERL==1), the CPU does not accept the exception and

executes instruction code with indeterminate values read from the bus.

2.2.2 Corrective measures

This is caused by bus error exceptions being masked by Status.EXL==1 or

Status.ERL==1. Do not cause exceptions due to instruction fetch in

Status.EXL==1 or Status.ERL==1. Generating exceptions in an exception handler

is dangerous. For example:

1) The JR instruction may potentially cause an address error or a bus error. Do

not use the JR instruction when Status.EXL==1 or Status.ERL==1.

2) A mapped region may potentially cause a TLB exception. Be sure to execute

using an unmapped region like that below:

0x8000_0000 – 0x9FFF_FFFF: kseg0

0xA000_0000 – 0xBFFF_FFFF: kseg1
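As an illustration of the second point, a build or debug script could check that every address an exception handler executes from lies in one of the two unmapped regions listed above. This is a minimal sketch; the function and constant names are hypothetical.

```python
# Unmapped regions as listed above (inclusive bounds).
KSEG0 = (0x8000_0000, 0x9FFF_FFFF)
KSEG1 = (0xA000_0000, 0xBFFF_FFFF)


def in_region(va, region):
    """True if the 32-bit virtual address va falls inside the inclusive region."""
    low, high = region
    return low <= va <= high


def is_unmapped(va):
    """True if va is in kseg0 or kseg1, so an instruction fetch cannot raise a TLB exception."""
    return in_region(va, KSEG0) or in_region(va, KSEG1)


def handler_is_safe(start, length_bytes):
    """Check the first and last byte of a handler; both must be unmapped."""
    return is_unmapped(start) and is_unmapped(start + length_bytes - 1)
```

Checking both ends of the handler is sufficient here because kseg0 and kseg1 are adjacent, so any range that starts and ends in them stays within unmapped space.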

2.3 AdEL occurs in index-type ICACHE or BTAC CACHE instructions (FLX03)

2.3.1 Phenomenon

When executing index-type CACHE instructions below in either the User mode or

Supervisor mode, operation occasionally becomes undefined and generates AdEL

(Address Error exception; load and inst fetch).

There are five index-type ICACHE suboperations as listed below.

00111 CACHE IXIN I$ index invalidate

00000 CACHE IXLTG I$ index load tag

00100 CACHE IXSTG I$ index store tag

00001 CACHE IXLDT I$ index load data

00101 CACHE IXSDT I$ index store data

There are four BTAC CACHE suboperations as listed below.

00010 CACHE BXLBT index load BTAC

00110 CACHE BXSBT index store BTAC

01100 CACHE BFH BTAC flush

01010 CACHE BHINBT hit invalidate BTAC

However, there is no problem when Status.KSU==Kernel. Please note that

Status.KSU==Kernel includes the kernel mode at Status.EXL==1 or

Status.ERL==1 as well. There is also no problem when Status.CU[0]==0, and

Status.KSU==User mode or Supervisor mode.

2.3.2 Corrective measures

In Status.CU[0]==1 and Status.KSU==Supervisor or User, execute under

VA[31]==0 when executing either index-type ICACHE or BTAC CACHE

instructions. VA here represents base reg + offset.
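The VA[31]==0 condition can be checked ahead of time for a given base register value and offset. The sketch below assumes the standard MIPS encoding in which the CACHE offset is a 16-bit signed immediate; the function names are illustrative.

```python
def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x1_0000 if imm & 0x8000 else imm


def cache_va(base, offset):
    """VA = base reg + offset, truncated to 32 bits (offset assumed to be a signed 16-bit immediate)."""
    return (base + sign_extend16(offset)) & 0xFFFF_FFFF


def satisfies_flx03(base, offset):
    """True if VA[31] == 0, the condition required by the FLX03 corrective measure."""
    return ((cache_va(base, offset) >> 31) & 1) == 0
```

Note that a negative offset applied to a small base wraps around to an address with VA[31]==1, so checking the base register alone is not enough.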

2.4 kuseg becomes an uncached area when an error exception

(Status.ERL = 1) occurs (FLX04)

2.4.1 Phenomenon

There are cases in which kuseg (0x0000_0000 – 0x7FFF_FFFF) becomes

uncached in an error exception handler (Status.ERL==1) and data consistency

with cached area (kseg, ksseg, kseg0) is lost.

2.4.2 Corrective measures

In an error exception handler (Status.ERL==1), when accessing kuseg

(0x0000_0000 – 0x7FFF_FFFF), access it after guarding using SYNC.L as follows:

SYNC.L

SW kuseg

2.5 First two instructions in an exception handler are executed as NOP when a

bus error occurs (FLX05)

2.5.1 Phenomenon

There are cases in which the first two instructions in an exception handler are

executed as NOP instructions, when a certain exception occurs and then a bus error

occurs immediately before jumping to the exception handler.

2.5.2 Corrective measures

Place NOP in the first two instruction locations in all exception handlers.

2.6 Unexpected instruction-fetch bus-errors occur when executing a Crashme

program (FLX06)

2.6.1 Phenomenon

In Kernel mode or Supervisor mode, unexpected instruction-fetch bus errors

occur when attempting to execute the Linux program "Crashme", since

prohibited instruction-sequences that do not obey the following programming

restrictions are executed.

In User mode, such a phenomenon doesn’t occur.

2.6.2 Corrective measures

In Kernel mode or Supervisor mode, obey the following programming

restrictions:

1) A CACHE instruction must not be placed in a branch delay slot.

2) SYNC.P must be located immediately before or immediately after any

CACHE instruction.
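The two restrictions lend themselves to a simple static check over an instruction listing. The following is a sketch under stated assumptions: the program is represented as a list of mnemonic strings, and the set of branch mnemonics is an illustrative subset rather than the complete C790 list.

```python
# Illustrative subset of branch/jump mnemonics; extend as needed for real listings.
BRANCHES = {"BEQ", "BNE", "BGEZ", "BLTZ", "J", "JAL", "JR", "JALR"}


def check_flx06(program):
    """Return a list of (index, message) for CACHE instructions that break the FLX06 rules."""
    problems = []
    for i, op in enumerate(program):
        if op != "CACHE":
            continue
        prev_op = program[i - 1] if i > 0 else None
        next_op = program[i + 1] if i + 1 < len(program) else None
        # Rule 1: the instruction after a branch occupies its delay slot.
        if prev_op in BRANCHES:
            problems.append((i, "CACHE in branch delay slot"))
        # Rule 2: SYNC.P must be adjacent to the CACHE instruction.
        if prev_op != "SYNC.P" and next_op != "SYNC.P":
            problems.append((i, "no adjacent SYNC.P"))
    return problems
```

An empty result means that the listing satisfies both restrictions as modelled here; it does not check any other errata.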

Chapter 2 Architecture Overview

2. Architecture Overview

This chapter includes an overview of the C790 architecture. It discusses the following

items:

• Block diagram and main modules

• Superscalar pipeline operation

• Instruction set

• Registers

• Memory Management

• Cache Memory

• Bus interface

• Floating Point Unit

• Performance Monitors

• Debug Support

2.1 Block Diagram and Functional Block Descriptions

This section presents a block diagram of the main modules of the C790 and summarizes

the modules.

[Block diagram. Blocks shown, with the sections that describe them:

PC Unit: PC Pipe and BTAC (64-entry, fully associative) (2.1.1)
MMU: 48-entry TLB, ITLB (2 entries), DTLB (4 entries) (2.1.2)
Instruction Cache (I-Cache): Tag, BHT, Predecode and Inst RAMs (32 KB, 2-way set associative) (2.1.3)
Data Cache (D-Cache): 32 KB, 2-way set associative (2.1.3)
Issue Logic and Staging Registers: 2-issue, in-order (2.1.4)
GPR: 32 x 128-bit wide registers; FPR: 32 x 64-bit wide registers (2.1.5)
Execution pipes: I0, I1, LS, BR and C1 (COP1/FPU) (2.1.6)
Operand/Bypass Logic (2.1.7)
Response Buffer and WBB (2.1.8)
UCAB (2.1.9)
Result and Move Buses (2.1.10)
Bus Interface Unit, BIU Bus and 128-bit CPU Bus (2.1.11)

Other labels: COP0 Registers, Virtual Address Computation Logic, I-Cache Output Pipeline Control, TLB Refill Bus, Instruction Virtual Address (IVA), Instruction Physical Address (IPA), Data Virtual Address (DVA), Data Physical Address (DPA).]

Figure 2-1. C790 Block Diagram

2.1.1 PC Unit

The 32-bit Program Counter (PC) holds the address of the instruction which is being

executed. The PC unit also contains a 64-entry Branch Target Address Cache (BTAC) which stores

branch target addresses used during branch prediction.

2.1.2 MMU

The Memory Management Unit supports the address translation functions of the CPU. It

supplies the DTLB (Data Translation Lookaside Buffer) and ITLB (Instruction

Translation Lookaside Buffer) with data via the TLB Refill Bus. Usage of these buffers is

described in Chapter 6.

2.1.3 Caches

Operation of the Instruction Cache and the Data Cache is described in Chapter 7. For

each branch instruction present in the instruction cache, two bits of branch history are

stored in the Branch History Table (BHT).

2.1.4 Issue Logic and Staging Registers

The issue logic decides how to route instructions to appropriate pipes. It issues up to 2

instructions every cycle. Routing is described and discussed later in section 2.2.

2.1.5 GPR (General Purpose Registers) and FPR (Floating-Point

Registers)

The General-Purpose Registers and the Floating-Point Registers are discussed in Section

2.3.

2.1.6 The Five Execution Pipes

2.1.6.1 I0 and I1 Pipes

There are two integer ALU pipelines (I0 and I1), each of which contains a complete 64-bit

ALU, Shifter and Multiply-Accumulate unit. The I0 pipeline contains the SA register used

for funnel shift operations. The two 64-bit ALU pipelines can be configured dynamically

(on an instruction-by-instruction basis) into a single 128-bit execution pipeline to

execute 128-bit Multimedia ALU, Shift and Multiply-Accumulate instructions.

Furthermore, the two ALU pipelines share a single 128-bit multimedia aligner.
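To illustrate what treating the 128-bit data path as independent lanes means, the sketch below models a lane-wise wrapping add on Python integers. It is a generic model of a parallel add, not the definition of any particular C790 instruction; the function names and the choice of wrap-around (rather than saturating) arithmetic are assumptions made for the example.

```python
MASK128 = (1 << 128) - 1


def lanes(value, width):
    """Split a 128-bit value into 128 // width lanes, least significant lane first."""
    mask = (1 << width) - 1
    return [(value >> (i * width)) & mask for i in range(128 // width)]


def pack(lane_values, width):
    """Inverse of lanes(): reassemble lane values into one 128-bit integer."""
    result = 0
    for i, lane in enumerate(lane_values):
        result |= (lane & ((1 << width) - 1)) << (i * width)
    return result & MASK128


def parallel_add(a, b, width):
    """Add a and b lane by lane; carries do not propagate between lanes."""
    if width not in (8, 16, 32, 64):
        raise ValueError("lane width must be 8, 16, 32 or 64 bits")
    mask = (1 << width) - 1
    summed = [(x + y) & mask for x, y in zip(lanes(a, width), lanes(b, width))]
    return pack(summed, width)
```

The same operands give different results at different lane widths: adding 0x01 to 0xFF wraps to zero in an 8-bit lane but carries into bit 8 in a 16-bit lane.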

2.1.6.2 LS - Load/Store Pipe

The Load/Store (LS) pipe contains logic to support a single 128-bit Load and Store

instruction.

2.1.6.3 BR - Branch Pipe

The Branch (BR) pipe contains logic to implement a single Branch instruction including

Branch comparators.

2.1.6.4 C1 - COP1/FPU Pipe

The C1 pipe contains logic to support a single/double Floating Point coprocessor unit

(COP1).

2.1.7 Operand/Bypass logic

This module takes data from the GPRs and from the Result and Move Buses, and routes

the data to the pipelines.

2.1.8 Response Buffer and Writeback Buffer

The Writeback Buffer (WBB) is an 8 entry by 16 byte (one quadword) FIFO queuing up

stores prior to accessing the CPU bus. It increases C790 performance by decoupling the

processor from the latencies of the CPU bus. It is also used during the gathering operation

of uncached accelerated stores; sequential stores less than a quadword in length are

gathered in the WBB, thereby reducing bus bandwidth usage.
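The queuing and gathering behaviour can be pictured with a small software model. This is a sketch only: the depth and entry size come from the text above, but the class, its method names, and the policy of gathering only into the most recently queued entry are assumptions made for illustration, not a description of the actual hardware.

```python
from collections import deque

QUADWORD = 16  # bytes per WBB entry
DEPTH = 8      # number of WBB entries


class WritebackBuffer:
    """Toy model of an 8-entry FIFO of quadword stores with optional gathering."""

    def __init__(self):
        self.fifo = deque()

    def store(self, addr, data, gather=False):
        """Queue a store; return False if the buffer is full (the pipeline would stall)."""
        base = addr & ~(QUADWORD - 1)
        offset = addr - base
        if offset + len(data) > QUADWORD:
            raise ValueError("store crosses a quadword boundary")
        if gather and self.fifo and self.fifo[-1]["base"] == base:
            entry = self.fifo[-1]          # merge into the newest entry (assumed policy)
        elif len(self.fifo) < DEPTH:
            entry = {"base": base, "bytes": {}}
            self.fifo.append(entry)
        else:
            return False
        for i, byte in enumerate(data):
            entry["bytes"][offset + i] = byte
        return True

    def drain(self):
        """Remove and return the oldest entry, modelling one write on the CPU bus."""
        return self.fifo.popleft() if self.fifo else None
```

In this model four sequential word stores with gathering enabled occupy one entry and drain as a single bus write, whereas without gathering they occupy four entries.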

2.1.9 UCAB

The Uncached Accelerated Buffer (UCAB) is a 1 entry by 8 quadword buffer. It caches 128

sequential bytes of data during an uncached accelerated load miss. Subsequent loads from

the uncached accelerated address space get their data from this buffer if the address hits

in the UCAB, thereby eliminating bus latencies and providing higher performance.
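The size follows from 8 quadwords of 16 bytes each, that is, 128 bytes. The hit check can be sketched as below, assuming (this is not stated above) that the buffered data is a 128-byte aligned block; the class and method names are hypothetical.

```python
QUADWORD_BYTES = 16
UCAB_QUADWORDS = 8
UCAB_BYTES = QUADWORD_BYTES * UCAB_QUADWORDS  # 128 bytes in the single entry


class Ucab:
    """Toy model of a one-entry buffer for uncached accelerated loads."""

    def __init__(self):
        self.base = None  # start address of the buffered block, or None if empty

    def load(self, addr):
        """Return 'hit' if addr is in the buffered block, else refill and return 'miss'."""
        block = addr - (addr % UCAB_BYTES)  # assumed 128-byte alignment
        if self.base == block:
            return "hit"
        self.base = block
        return "miss"
```

After one miss at the start of a block, the remaining loads that walk sequentially through the same 128 bytes hit in the buffer and need no bus access in this model.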

2.1.10 Result and Move Buses

The Result and Move Buses convey data between execution units, the data cache, and the

Operand/Bypass Logic unit.

2.1.11 Bus Interface Unit and BIU Bus

The BIU connects the core to the rest of the system. It interfaces the core’s internal bus

signals to the CPU Bus.

2.2 Superscalar Pipeline Operation

The C790 has a six-stage superscalar pipeline. It can fetch, decode and execute a

maximum of two instructions in parallel each cycle.

This section discusses in more detail the five execution pipelines listed in Section 2.1. It

also discusses how instructions are routed among pipes.

2.2.1 Integer Instruction Pipeline Stages

The C790 contains four integer pipelines: the I0 and the I1 pipes, and the Load/Store and

Branch pipes. Each pipe consists of the following six stages with each stage having 2

phases:

• I: Instruction Address Select

• Q: Instruction Queue

• R: Register Fetch

• A: Execution

• D: Data Fetch

• W: Write-back

Figure 2-2 shows the six stages of an integer instruction pipeline.

[Pipeline diagram: two instruction rows, each passing through the stages I, Q, R, A, D and W, with the current CPU cycle marked.]

Figure 2-2. C790 Integer Instruction Pipeline
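The overlap of instructions in the six-stage pipeline can be reproduced as a text diagram. The sketch below assumes an ideal pipeline with no stalls, in which an instruction advances one stage per cycle; the function names and the use of "." for an idle cycle are illustrative choices.

```python
STAGES = "IQRADW"  # I, Q, R, A, D, W in pipeline order


def stage_at(entry_cycle, cycle):
    """Stage occupied in `cycle` by an instruction that entered the I stage in `entry_cycle`."""
    k = cycle - entry_cycle
    return STAGES[k] if 0 <= k < len(STAGES) else None


def diagram(entry_cycles):
    """One text row per instruction; column n is CPU cycle n, '.' marks an idle cycle."""
    last = max(entry_cycles) + len(STAGES)
    return [
        "".join(stage_at(start, c) or "." for c in range(last))
        for start in entry_cycles
    ]
```

Calling `diagram([0, 0, 1, 1])` models two instruction pairs entering in consecutive cycles, which is the best case for a 2-way superscalar pipeline.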

I: Instruction Address Select

During the I stage, the following occurs:

• The sequential address is calculated

• The branch address is calculated

• The instruction address is selected from the following sources

• Sequential address

• Actual Branch / Jump address

• Predicted Branch Target address from the BTAC

• Exception vector address

• EPC and Error PC

Q: Instruction Queue

During the Q stage, the following occurs:

• The instruction translation look-aside buffer (ITLB) does the virtual-to-physical

address translation

• The instruction cache (data, Tag, steering bits & BHT) fetch begins

• TLB read for instruction fetch starts

• The instruction cache fetch is completed

• TLB read for instruction fetch completes

• The instruction cache Tag hit check is determined and the way selection is

done

• The appropriate instructions are selected by the steering bits

R: Register Fetch

During the R stage the following occurs:

• Instructions are bussed to the appropriate execution units

• Register file is read

• Execution unit structural hazards are determined

• Instructions are decoded, data dependencies are determined and the

appropriate instructions are issued

A: Execution

During the A stage, the following occurs:

• Results from the D or W stages are bypassed

• The execution units start and complete the integer arithmetic, logical, shift and

multimedia instructions

• The iterative steps of the Multiply, Multiply-Accumulate, or Divide instructions

are executed

• The virtual address for load and store instructions is calculated

• The branch condition is determined

• The DTLB is read

• The Data Cache and UCAB read starts

D: Data Fetch

During the D stage, the following occurs:

• The TLB read for a data access

• The Data Cache and UCAB read is completed

• The Data Cache Tag checking is completed

• Load or register data is obtained from COP1 (FPU)

• COP0 registers are read

• Data alignment and way selection is done for the data from the Data Cache

• Data sign extension is done

• Complete updating BHT bits and the BTAC

• All the exceptions are detected

W: Write Back

During the W stage, the following occurs:

• For store operations data is written to the Data Cache

• Data for coprocessor data transfer instructions is transferred to COP1 (FPU)

• For register-to-register and load instructions, the result is written to the

register file

• COP0, COP1 (FPU) registers are written for coprocessor data transfer

instructions

2.2.2 C1 (COP1/FPU) Instruction Pipeline Stages

The C790’s C1 (COP1/FPU) pipeline consists of the following eight stages:

• I: Instruction Address Select

• Q: Instruction Queue

• R: Register Fetch

• T: COP1 Register Fetch

• X: FP Execution 1st Stage

• Y: FP Execution 2nd Stage

• Z: FP Execution 3rd Stage

• S: Register File Write Stage

The eight stages of the pipeline for COP1/FPU are shown in Figure 2-3 with some pipeline

stages identified with two letters. COP1 instructions execute simultaneously in the main

integer pipeline I0 and the coprocessor 1 pipeline. The first letter identifies the main

integer pipeline stage and the second letter identifies the coprocessor pipeline stage.

[Pipeline diagram: overlapped COP1 instructions passing through the stages I, Q, R, A/T, D/X, W/Y, Z and S, with the current CPU cycle marked.]

Figure 2-3. FPU Pipeline
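The two-letter stage labels follow mechanically from lining up the six integer stages with the eight COP1 stages. A short sketch of that pairing, with illustrative names:

```python
INTEGER_STAGES = "IQRADW"   # main integer pipeline
COP1_STAGES = "IQRTXYZS"    # coprocessor 1 pipeline


def combined_labels():
    """Label each COP1 cycle: shared stages once, differing stages as 'integer/COP1'."""
    labels = []
    for i, cop1 in enumerate(COP1_STAGES):
        integer = INTEGER_STAGES[i] if i < len(INTEGER_STAGES) else None
        if integer is None or integer == cop1:
            labels.append(cop1)                 # I, Q, R are common; Z, S have no integer stage
        else:
            labels.append(f"{integer}/{cop1}")  # first letter integer, second letter COP1
    return labels
```

The result is the sequence I, Q, R, A/T, D/X, W/Y, Z, S, showing that a COP1 instruction is still in the Z and S stages after the integer pipeline has finished its W stage.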

The I, Q, and R stages were previously described in Section 2.2.1. The following describes

stages specific to the COP1 pipeline:

T: COP1 Register Fetch

During the T stage, the following occurs:

• Register file read for operands

• Bypass muxes from the S Stage/W Stage for S/T overlap.

X: FP Execution 1st Stage

This stage is the first step for floating point operations.

During the X stage, the following occurs:

• Detect Exceptions for input data.

• Detect Exception possibilities for result.

• The Booth function/Wallace multiplication is performed for multiply, and

denormalization is performed for add/subtract.

Y: FP Execution 2nd Stage

This stage is the second step for floating point operations.

The following occurs:

• Overflow/underflow test on the exponent is done

• Normalization for multiplication is done.

• Add/subtract the significand for add/subtract operations.

• Count leading zeros, to determine the shift amount for the normalization

Z: FP Execution 3rd Stage

This stage is the third step for floating point operations.

The following occurs:

• Overflow/underflow detection

• Exponent readjustment

• Shift the significand for normalization

• Round the result

• Detect inexact exception

S: Register File Write Stage

During the S stage, the following occurs:

• FPR registers are written.

• FCSR31 is updated.

• Bypass values are passed to the T stage.

2.2.3 Classification and Routing of Instructions According to

Execution Pipelines

This section discusses how the five execution pipelines are used in conjunction with

instruction routing. Figure 2-4 identifies the specific execution pipelines into which

instructions of a particular class are routed, and shows which physical execution units

handle instructions from a particular logical pipe. Instruction categories are identified in

italics, and are shown within the physical pipes where they are executed. ALU

instructions can be executed in either integer pipe I0 or I1. COP1 Operate, and COP1

Move instructions execute in two pipes as shown, as does the Wide Operate.

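[Routing diagram. Logical Pipe0 and Logical Pipe1 feed the physical pipes, which are labelled with the following instruction categories:

I0 pipe: ALU, SA Operate, MAC0
I1 pipe: ALU, SYNC, ERET, COP0, MAC1
LS pipe: Load/Store, Prefetch, CACHE
BR pipe: Branch
C1 pipe: C1 Move, C1 Compute

COP1 Move, COP1 Operate and Wide Operate are each drawn across two physical pipes.]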

Figure 2-4. Instruction Routing in Logical Pipes and Physical Pipes

Table 2-1 shows the categories of instructions and the execution pipelines that can execute

those instructions. The instructions in a single category have the same issuing policy.

Instructions which require more than a single execution pipeline are identified in the

pipeline column with the (✔&) symbol. For example, COP1 Move requires both the LS

and the C1 execution pipelines. On the other hand, the ALU instructions can be executed

in either the I0 or the I1 execution pipelines.

Table 2-1. Categories of Instructions and How They Are Routed

Category           I0   I1   LS   BR   C1   Instructions
Load/Store                   ✔              Load, Store, Wide Load, Wide Store, Prefetch, CACHE
SYNC                    ✔                   Synchronization
ERET                    ✔                   Exception return
SA Operate         ✔                        Move to/from SA register
COP0                    ✔                   COP0 Coprocessor move, COP0 Coprocessor operations
COP1 Move (1)                ✔&        ✔    COP1 Coprocessor move, COP1 Coprocessor Load/Store
COP1 Operate (2)   ✔&                  ✔    COP1 Operate Instructions
ALU (3)            ✔    ✔                   Arithmetic, Shift, Logical, Trap, SYSCALL, BREAK
MAC0               ✔                        Multiply and Multiply-Accumulate for HI/LO register, MFHI/LO, MTHI/LO
MAC1                    ✔                   Multiply and Multiply-Accumulate for HI1/LO1 register, MFHI1/LO1, MTHI1/LO1
Branch                            ✔         Branch, Jump, Jump/Link, All Coprocessor Branches
Wide Operate (4)   ✔&   ✔                   Wide ALU, Wide shift, Wide MAC, Funnel shift, Wide HI/LO Moves

1 COP1 Move instructions execute concurrently in the LS and the C1 pipes.

2 COP1 Operate instructions execute concurrently in the I0 and the C1 pipes.

3 ALU instructions can be executed in either the I0 or the I1 pipes.

4 Wide Operate instructions execute concurrently in the I0 and the I1 pipes.

2.2.4 Instruction Issue Combinations

The C790 always fetches two instructions. A pair of staging registers acts as a ‘bellows’

between the Q and the R stage. If an instruction can’t be issued in a particular cycle, it is

saved in the staging registers. In the next cycle the C790 again fetches two instructions

and tries to issue two (the one left over in the staging register from the previous cycle and

the next sequential one from the pair that is fetched). So the C790 always tries to issue

two instructions each cycle whenever it can.

The two instructions that get issued go to the R-stage of the pipeline and get associated

with one of two logical pipes: Pipe0 and Pipe1. The instructions are then routed to an

appropriate physical pipe for processing.

Instruction categories that can get issued to logical Pipe0 are:

1. ALU

2. Branch

3. Wide Operate

4. SA Operate

5. MAC0

6. COP1 Operate

An alternate way to view this is to recognize that logical Pipe0 is made up of the I0, C1

and BR execution pipelines. When issuing Wide Operate instructions logical Pipe0 also

uses the I1 execution pipeline.

Instruction categories that can get issued to logical Pipe1 are:

1. ALU

2. Branch

3. SYNC

4. ERET

5. Load/Store

6. COP1 Move

7. COP0

8. MAC1

An alternate way to view this is to recognize that logical Pipe1 is made up of the I1, LS,

C1 and BR execution pipelines.

Most instruction categories are statically bound to a single logical pipe, that is, they can only

be issued to that particular logical pipe. However, the ALU and Branch instruction categories

can be issued to either of the two logical pipes. Thus the binding of these two instruction

categories to a particular logical pipe is done at instruction issue time.

There are some special cases of instruction sequences that are not allowed in the MIPS

ISA. An instruction from the Branch category is not allowed to have another instruction

from either the Branch or ERET category in its branch delay slot. So the following pairs of

instructions are illegal and effectively never issued together:

1. Branch - Branch

2. Branch - ERET
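The binding rules above can be expressed as a small pairing check. This is a simplified sketch: it uses only the two category lists and the two illegal pairs given in this section, and it ignores data dependencies and the structural hazards that arise when a Wide Operate also occupies the I1 pipe. The function names are illustrative.

```python
# Categories that can be issued to each logical pipe, as listed in this section.
PIPE0 = {"ALU", "Branch", "Wide Operate", "SA Operate", "MAC0", "COP1 Operate"}
PIPE1 = {"ALU", "Branch", "SYNC", "ERET", "Load/Store", "COP1 Move", "COP0", "MAC1"}

# (older, younger) pairs that are illegal because of the branch delay slot rule.
ILLEGAL = {("Branch", "Branch"), ("Branch", "ERET")}


def pipe_assignments(first, second):
    """All (pipe of first, pipe of second) bindings that use two different logical pipes."""
    options = []
    if first in PIPE0 and second in PIPE1:
        options.append((0, 1))
    if first in PIPE1 and second in PIPE0:
        options.append((1, 0))
    return options


def can_pair(first, second):
    """True if the two categories may be issued together under this simplified model."""
    if (first, second) in ILLEGAL:
        return False
    return bool(pipe_assignments(first, second))
```

Two ALU instructions can be bound either way round, while two categories that are both tied to the same logical pipe, such as MAC0 and SA Operate, can never be issued in the same cycle.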

The following sequences of instructions are also not allowed in the C790. Branch-Likely

instructions are a subset of the Branch category (limited to the branch likely instructions).

1. Branch - SYNC.P

2. Branch - SYNC.L

3. Branch - CACHE *1

4. Branch-Likely - MTSA

5. Branch-Likely - MTSAB

6. Branch-Likely - MTSAH

7. Branch-Likely - TLBR *2

8. Branch-Likely - TLBWI *2

9. Branch-Likely - TLBWR *2

*1 CACHE instruction must be guarded by Sync instructions.

    Sync.P              Sync.L
    CACHE I$     or     CACHE D$
    Sync.P              Sync.L

*2 TLBR, TLBWI, TLBWR instructions must be followed by Sync.P

    TLBxx
    Sync.P
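The delay-slot restrictions listed above lend themselves to a simple lookup. The following Python sketch is illustrative only: the function name `delay_slot_allowed` is hypothetical, and the slot argument is assumed to be either a category name ("Branch", "ERET") or one of the mnemonics named in the lists.

```python
# Not allowed in the delay slot of any Branch-category instruction.
FORBIDDEN_AFTER_ANY_BRANCH = {"BRANCH", "ERET", "SYNC.P", "SYNC.L", "CACHE"}

# Additionally not allowed in the delay slot of a Branch-Likely instruction.
FORBIDDEN_AFTER_BRANCH_LIKELY = {"MTSA", "MTSAB", "MTSAH", "TLBR", "TLBWI", "TLBWR"}


def delay_slot_allowed(branch_is_likely, slot):
    """Return True if `slot` may occupy the delay slot of the branch."""
    slot = slot.upper()
    if slot in FORBIDDEN_AFTER_ANY_BRANCH:
        return False
    if branch_is_likely and slot in FORBIDDEN_AFTER_BRANCH_LIKELY:
        return False
    return True
```

Because Branch-Likely instructions are a subset of the Branch category, the first set applies to them as well.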

The following table shows the instruction categories which can be issued concurrently to

the two logical pipes. All combinations are legal except the ones marked with an “X”. The

combinations marked with a “Y” can be issued concurrently, i.e., enter the R stage

together but then the younger instruction stalls in the A stage for a single cycle in order to

avoid a resource hazard.

Table 2-2. Concurrently Issued Instruction Categories

LOGICAL PIPE0 (columns): SA Oper., COP1 Oper., ALU, MAC0, Branch, Wide Oper.

LOGICAL PIPE1 (rows)    Marked combinations
Load/Store
ERET                    X
SYNC
LZC                     Y
COP1 Move
ALU                     Y
MAC1                    Y
Branch                  X
COP0

X: illegal combination
Y: Can be issued concurrently but it will stall due to a structural hazard.


2.3 Registers

The C790 extends the normal MIPS compatible register set by extending the general
purpose registers (GPRs) from 64-bits to 128-bits, adding an additional pair of HI/LO
registers for the I1 pipe and adding the SA register for the funnel shift instruction.

2.3.1 CPU Registers

The C790 has 128-bit wide GPRs. The upper 64 bits of the GPRs are only used by the
C790-specific “Quad Load/Store” and “Multimedia (Parallel)” instructions.

The HI1 and LO1, which are the upper 64 bits of each of the 128-bit HI and LO registers,
are also used by new multiply and divide instructions, such as MULT1, MULTU1, DIV1,
DIVU1, MADD1, MADDU1, MFHI1, MFLO1, MTHI1, and MTLO1, which are non-parallel
I1 pipeline-specific instructions.

The SA register contains the shift amount used by the 256-bit funnel shift instruction.

2.3.2 FPU Registers

The floating point unit (COP1) has 64-bit wide floating point registers. It also contains 2

floating point control registers.


2.3.3 COP0 Registers

Table 2-3 identifies the COP0 registers of the C790.

Table 2-3. Coprocessor 0 Registers

Register No.  Register Name  Description                                                       Purpose
0             Index          Programmable register to select TLB entry for reading or writing  MMU
1             Random         Pseudo-random counter for TLB replacement                         MMU
2             EntryLo0       Low half of TLB entry for even PFN (Physical page number)         MMU
3             EntryLo1       Low half of TLB entry for odd PFN (Physical page number)          MMU
4             Context        Pointer to kernel virtual PTE table                               Exception
5             PageMask       Mask that sets the TLB page size                                  MMU
6             Wired          Number of wired TLB entries                                       MMU
7             (Reserved)     Undefined                                                         Undefined
8             BadVAddr       Bad virtual address                                               Exception
9             Count          Timer count                                                       Exception
10            EntryHi        High half of TLB entry (Virtual page number and ASID)             MMU
11            Compare        Timer compare                                                     Exception
12            Status         Processor Status Register                                         Exception
13            Cause          Cause of the last exception taken                                 Exception
14            EPC            Exception Program Counter                                         Exception
15            PRId           Processor Revision Identifier                                     MMU
16            Config         Configuration Register                                            MMU
17-22         (Reserved)     Undefined                                                         Undefined
23            BadPAddr       Bad Physical Address                                              Exception
24            Debug          This is used for Debug function                                   Debug
25            Perf           Performance Counter and Control Register                          Exception
26-27         (Reserved)     Undefined                                                         Undefined
28            TagLo          Cache Tag register (low bits)                                     MMU
29            TagHi          Cache Tag register (high bits)                                    MMU
30            ErrorPC        Error Exception Program Counter                                   Exception
31            (Reserved)     Undefined                                                         Undefined


2.4 Memory Management

The C790 processor provides a memory management unit (MMU) which uses an on-chip

translation look-aside buffer (TLB) to translate virtual addresses into physical addresses.

The C790 supports the MIPS compatible 32-bit address and 64-bit data mode. Only 32-bit
virtual and physical addresses have been implemented. There is no requirement for
address sign extension. Address error exception checking will not be done on the “upper”
32-bits (which are ignored). The only conditions that will generate the address error
exception are address alignment errors and segment protection errors. In Kernel mode,
the program counter can wrap around from kseg3 to kuseg without causing an address
error exception.

Since there is only one addressing mode, all the four MIPS ISAs (I, II, III, IV) and the

C790 specific ISA are available without any restrictions in all of the three processor modes

(with the appropriate MIPS ISA coprocessor usable restrictions). As such the reserved

instruction (RI) exception will occur only when the processor really tries to execute an

undefined opcode.

Features

• MIPS III-compatible 32-bit MMU
• Operating Modes: User, Supervisor, and Kernel
• TLB: 48 entries of even/odd page pairs (96 pages)
  Fully associative
• Page Size: 4 KB, 16 KB, 64 KB, 256 KB, 1 MB, 4 MB, 16 MB
• ITLB: 2 entries
• DTLB: 4 entries
• Address Sizes: Virtual Address Size = 32 bit, 2 Gbyte per user Process
  Physical Address Size = 32 bit, 4 Gbyte
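As an illustration of the even/odd page pairs listed above, the following Python sketch splits a 32-bit virtual address into a pair number, an even/odd selector and a page offset. It assumes the usual MIPS TLB convention, which is not spelled out in this section, that the address bit just above the page offset selects the even or odd page of a pair. The function name `split_vaddr` is hypothetical.

```python
# Page sizes listed in the feature summary above, in bytes.
PAGE_SIZES = [4 << 10, 16 << 10, 64 << 10, 256 << 10, 1 << 20, 4 << 20, 16 << 20]


def split_vaddr(vaddr, page_size):
    """Return (pair_number, odd, offset) for a 32-bit virtual address.

    Assumption: the bit just above the page offset selects the even (0)
    or odd (1) page of a TLB entry, as in other MIPS TLB designs.
    """
    if page_size not in PAGE_SIZES:
        raise ValueError("unsupported page size")
    vaddr &= 0xFFFFFFFF                      # only 32-bit addresses are implemented
    shift = page_size.bit_length() - 1       # log2(page_size)
    offset = vaddr & (page_size - 1)
    odd = (vaddr >> shift) & 1
    pair_number = vaddr >> (shift + 1)
    return pair_number, odd, offset
```

With 4 KB pages, address 0x00003ABC falls in the odd page of pair 1 at offset 0xABC, and 48 entries of two pages each give the 96 pages quoted above.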


2.5 Cache Memory

The C790 core contains both an instruction cache and a separate data cache.

Features

The following are the main features of the caches:

• Separate Instruction Cache and Data Cache

• Virtually indexed and physically tagged caches

• Write-back policy for the Data Cache

• Data Cache and Instruction Cache burst read sequential ordering

• Cache Size: Instruction Cache: 32 KB

Data Cache: 32 KB

• Line Size: 64 Bytes

• Refill size: 64 Bytes

• Associativity: 2-way set-associative

• Write Policy: Write-back and write allocate

• Data order for block reads: Sequential ordering

• Data order for block writes: Sequential ordering

• Instruction cache miss restart: After all data received

• Data cache miss restart: Early restart on first quadword

• Cache parity: No

• Cache Locking: Data Cache Line Lock.

Controlled by CACHE instruction

• Cache Snooping: No

• Non-blocking load: Yes

• Hit Under Miss: Yes (Multiple hits under one miss are supported)

• Data Cache Prefetch: Yes
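The cache sizes above determine the number of sets and the address bits used for indexing. The Python sketch below works through that arithmetic. It assumes conventional set-associative indexing (line offset in the low bits, set index directly above it), which this section does not state explicitly. The names are illustrative.

```python
CACHE_BYTES = 32 * 1024   # each of the instruction and data caches
LINE_BYTES = 64
WAYS = 2

SETS = CACHE_BYTES // (LINE_BYTES * WAYS)    # 32 KB / (64 B * 2 ways) = 256 sets
OFFSET_BITS = LINE_BYTES.bit_length() - 1    # 6 bits of byte offset within a line
INDEX_BITS = SETS.bit_length() - 1           # 8 bits of set index


def line_offset(addr):
    """Byte offset within the 64-byte cache line."""
    return addr & (LINE_BYTES - 1)


def cache_index(vaddr):
    """Set index, assumed to be taken from virtual address bits 13..6."""
    return (vaddr >> OFFSET_BITS) & (SETS - 1)
```

Under these assumptions the index uses address bits 13..6. With the smallest 4 KB page size, bits 13 and 12 lie above the page offset, which is why the index comes from the virtual address while the tag is physical.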


2.6 Bus Interface

The C790 CPU core is connected to the rest of the system, and to external devices, through
the group of on-chip C790 system bus signals called the CPU Bus.

Features

• Separate data and address buses (Demultiplexed operation)

• 128-bit data bus

• Clocked synchronous operations

• Peak transfer rate of 2.1 GB/sec (@ 133 MHz bus clock)

• 8/16/32/64/128-bit and burst accesses

• Multimaster capability

• Pipelined operations

• No turn-around or dead cycles between transfers

The CPU Bus does not provide:

• Cache coherency support

• Split transactions
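The quoted peak transfer rate follows from the bus width and clock. The short Python calculation below reproduces it, assuming one full-width transfer per bus clock with no dead cycles, as the feature list suggests. The function name is illustrative.

```python
def peak_bandwidth_bytes_per_sec(bus_bits, clock_hz):
    """Peak rate assuming one full-width transfer on every bus clock."""
    return (bus_bits // 8) * clock_hz


# 128-bit data bus at 133 MHz: 16 bytes * 133,000,000 per second.
peak = peak_bandwidth_bytes_per_sec(128, 133_000_000)
peak_gb = peak / 1e9   # 2.128, quoted as 2.1 GB/sec
```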

2.7 Floating Point Unit

The floating point unit is IEEE 754-1985 compatible, the same as the FPU in the TX49HF
CPU core.

Main Features:

• Tightly coupled to the C790 Integer pipeline.

• Supports both double and single precision formats as defined in the IEEE-754
  specification
• No hardware support for denormalized numbers in the IEEE-754 specification.
  Software (exception handler) supports it.
• The FPU supports five IEEE exceptions and one MIPS defined exception.
• ADD, SUB, MUL, DIV, ABS, MOV, NEG, SQRT, compare and convert are
  supported


2.8 Performance Counter

The performance counter provides the means for gathering statistical information about

the internal events of the CPU and the pipeline during program execution. The statistics

gathered during program execution aid in tuning the performance of hardware and

software systems based on the processor.

The performance counter consists of one control register and two counters. The control

register controls the functions of the performance counter while the counters count the

number of events specified by the control register.

Features:

• Two performance counter registers

• Over twenty different events within the processor can be counted

• Counting can be selectively enabled in User, Supervisor, Kernel, and Exception

modes

2.9 Debug and Tracing Functions

The C790 supports real-time PC tracing. Pipeline status, target addresses of indirect

jumps, and exception vectors are made available on special signals. The executed

instruction sequence can be restored from signals and the source program.

Features:

• One Instruction Address Breakpoint register

• One Instruction Address Breakpoint Mask register

• One Data Address Breakpoint register

• One Data Address Breakpoint Mask register

• One Data Value Breakpoint register

• One Data Value Breakpoint Mask register

• Each breakpoint individually enabled

• Breakpoint function can be selectively enabled in User, Supervisor, Kernel, and

Exception modes

• External Trigger signal can be generated when breakpoint occurs

• 11 signals used to provide real-time PC tracing function


Chapter 3 Instruction Set Overview and Summary

3-1

3. Instruction Set Overview and Summary

This chapter provides an overview of the C790 instruction set. Refer to Appendices A - D

for detailed descriptions of individual instructions.


3.1 Introduction

The C790 supports all MIPS III instructions with the exception of 64-bit multiply, 64-bit

divide, Load Linked and Store Conditional instructions. It also supports a limited number

of MIPS IV instructions and additional C790-specific instructions, such as Multiply/Add

instructions and multimedia instructions.

The instruction set can be divided into the following groups:

• Load and Store

• Computational

• Jump and Branch

• Miscellaneous

• System Control Coprocessor (COP0)

• Coprocessor 1 (COP1)

• C790-specific


3.2 CPU Instruction Set Formats

There are three instruction formats: immediate (I-type), jump (J-type), and register
(R-type), as shown in Figure 3-1. The use of a small number of instruction formats simplifies
instruction decoding (thus producing higher frequency operations) and allows the compiler
to synthesize more complicated (and less frequently used) operations and address modes
from these three formats as needed.

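I-type (Immediate)

   31    26 25    21 20    16 15                             0
  +--------+--------+--------+--------------------------------+
  |   op   |   rs   |   rt   |           immediate            |
  +--------+--------+--------+--------------------------------+

J-type (Jump)

   31    26 25                                               0
  +--------+--------------------------------------------------+
  |   op   |                      target                      |
  +--------+--------------------------------------------------+

R-type (Register)

   31    26 25    21 20    16 15    11 10     6 5            0
  +--------+--------+--------+--------+--------+--------------+
  |   op   |   rs   |   rt   |   rd   |   sa   |    funct     |
  +--------+--------+--------+--------+--------+--------------+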

op 6-bit operation code

rs 5-bit source register specifier

rt 5-bit target (source/destination) register or branch condition

immediate 16-bit immediate value, branch displacement or address displacement

target 26-bit jump target address

rd 5-bit destination register specifier

sa 5-bit shift amount

funct 6-bit function field

Figure 3-1. CPU Instruction Formats
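The field boundaries in Figure 3-1 can be expressed directly as shifts and masks. The following Python sketch extracts every field from a 32-bit instruction word. Which fields are meaningful depends on the format selected by `op` (and `funct` for R-type). The function name `decode_fields` is hypothetical.

```python
def decode_fields(word):
    """Extract all Figure 3-1 fields from a 32-bit instruction word."""
    word &= 0xFFFFFFFF
    return {
        "op": word >> 26,                 # bits 31..26
        "rs": (word >> 21) & 0x1F,        # bits 25..21
        "rt": (word >> 16) & 0x1F,        # bits 20..16
        "rd": (word >> 11) & 0x1F,        # bits 15..11 (R-type)
        "sa": (word >> 6) & 0x1F,         # bits 10..6  (R-type)
        "funct": word & 0x3F,             # bits 5..0   (R-type)
        "immediate": word & 0xFFFF,       # bits 15..0  (I-type)
        "target": word & 0x03FFFFFF,      # bits 25..0  (J-type)
    }
```

For example, the word 0x00221821 decodes to op 0, rs 1, rt 2, rd 3, sa 0 and funct 0x21.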


3.3 Instruction Set Summary

The C790 supports MIPS III instructions (see Note 1) as well as a limited number of MIPS IV

instructions. A large number of C790-specific instructions, such as multiply/add

instructions and multimedia instructions have also been implemented.

3.3.1 Load/Store Instructions

The instructions in this group transfer data of different sizes: bytes, halfwords, words,

doublewords and quadwords. Signed and unsigned integers of different sizes are

supported by loads that either sign-extend or zero-extend the data loaded into the

register.

Load and store instructions are immediate (I-type) instructions that move data between

memory and the general registers. The only addressing mode that load and store

instructions directly support is base register plus 16-bit signed immediate offset.
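The base-plus-offset addressing mode can be illustrated with a short Python sketch. It assumes the result is truncated to 32 bits, in line with the 32-bit addresses described in Section 2.4. The function names are illustrative.

```python
def sign_extend16(imm):
    """Interpret a 16-bit immediate field as a signed value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def effective_address(base, imm16):
    """Base register plus sign-extended 16-bit offset, truncated to 32 bits."""
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF
```

An offset field of 0xFFFC therefore addresses the word 4 bytes below the base register, and the reachable range is -32768 to +32767 bytes around the base.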

3.3.1.1 Normal Loads and Stores

The C790 does not support Load Linked and Store Conditional instructions, LL, LLD, SC

and SCD. For details of these instructions refer to Appendix A.

Table 3-1. Load / Store Instructions

Mnemonic  Description              Defined in
LB        Load Byte                MIPS I
LBU       Load Byte Unsigned       MIPS I
LD        Load Doubleword          MIPS III
LDL       Load Doubleword Left     MIPS III
LDR       Load Doubleword Right    MIPS III
LH        Load Halfword            MIPS I
LHU       Load Halfword Unsigned   MIPS I
LW        Load Word                MIPS I
LWL       Load Word Left           MIPS I
LWR       Load Word Right          MIPS I
LWU       Load Word Unsigned       MIPS III
SB        Store Byte               MIPS I
SD        Store Doubleword         MIPS III
SDL       Store Doubleword Left    MIPS III
SDR       Store Doubleword Right   MIPS III
SH        Store Halfword           MIPS I
SW        Store Word               MIPS I
SWL       Store Word Left          MIPS I
SWR       Store Word Right         MIPS I

1 Note: The C790 does not support the following MIPS III instructions:

64-bit multiply and divide instructions (DMULT, DMULTU, DDIV, DDIVU)

Semaphore instructions (LL, LLD, SC, SCD)


3.3.1.2 Multimedia Loads and Stores

The C790 implements 128-bit (quadword) load and store instructions for multimedia
purposes. For details of these instructions refer to Appendix B.

Table 3-2. Multimedia Load / Store Instructions

Mnemonic Description Defined in

LQ Load Quadword C790

SQ Store Quadword C790

3.3.1.3 Coprocessor Loads and Stores

These loads and stores are coprocessor instructions. A particular coprocessor is enabled if
the corresponding CU bit is set in the COP0 Status register. Otherwise executing one of these

instructions generates a Coprocessor Unusable exception. For details of these instructions

refer to Appendices C and D.

Table 3-3. Coprocessor Load / Store Instructions

Mnemonic  Description                           Defined in
LDC1      Load Doubleword to Floating Point     MIPS II
LWC1      Load Word to Floating Point           MIPS I
SDC1      Store Doubleword from Floating Point  MIPS II
SWC1      Store Word from Floating Point        MIPS I

3.3.1.4 Data Formats and Addressing

The C790 processor uses five data formats:

• 128-bit quadword

• 64-bit doubleword

• 32-bit word

• 16-bit halfword

• 8-bit byte

Byte ordering within each of the larger data formats — halfword, word, doubleword — can

be configured in either big-endian or little-endian order. Endianness refers to the location

of byte 0 within the multi-byte data structure. Figure 3-2 and Figure 3-3 show the

ordering of bytes within words and the ordering of words within multiple-word structures

for the big-endian and little-endian conventions.

When the C790 processor is configured as a big-endian system, byte 0 is the most-

significant (leftmost) byte, thereby providing compatibility with MC 68000® and IBM 370®

conventions. Figure 3-2 shows this configuration.


                      Bit #   31   24 23   16 15    8 7     0
Higher     Word
Address    Address
              12                 12      13      14      15
               8                  8       9      10      11
               4                  4       5       6       7
Lower          0                  0       1       2       3
Address

Figure 3-2. Big-Endian Byte Ordering

When configured as a little-endian system, byte 0 is always the least-significant

(rightmost) byte, which is compatible with iAPX® x86 and DEC VAX® conventions.

                      Bit #   31   24 23   16 15    8 7     0
Higher     Word
Address    Address
              12                 15      14      13      12
               8                 11      10       9       8
               4                  7       6       5       4
Lower          0                  3       2       1       0
Address

Figure 3-3. Little-Endian Byte Ordering

In this text, bit 0 is always the least-significant (rightmost) bit: thus, bit designations are

always little-endian (although no instructions explicitly designate bit positions within

words).
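The two byte orderings can be demonstrated with Python's built-in integer-to-bytes conversion, where index 0 of the resulting byte string corresponds to byte 0 (the lowest address) of the word. This is a generic illustration rather than anything specific to the C790. The helper name `byte0` is hypothetical.

```python
value = 0x0A0B0C0D

# Index 0 of each result is the byte stored at the lowest address.
big = value.to_bytes(4, "big")         # bytes 0A 0B 0C 0D
little = value.to_bytes(4, "little")   # bytes 0D 0C 0B 0A


def byte0(word, order):
    """Return byte 0 of a 32-bit word under the given byte order."""
    return word.to_bytes(4, order)[0]
```

Byte 0 is the most-significant byte (0x0A) in the big-endian layout and the least-significant byte (0x0D) in the little-endian layout, matching Figures 3-2 and 3-3.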


Figure 3-4 and Figure 3-5 show little-endian and big-endian byte ordering in doublewords.

          Most-significant byte                         Least-significant byte
Bit #     63  56 55  48 47  40 39  32 31  24 23  16 15   8 7    0
Byte #       7      6      5      4      3      2      1      0

          (Figure labels: Least significant Word, Halfword, Byte)

Bits in a Byte
Bit #     7 6 5 4 3 2 1 0

Figure 3-4. Little-Endian Data in a Doubleword

          Most-significant byte                         Least-significant byte
Bit #     63  56 55  48 47  40 39  32 31  24 23  16 15   8 7    0
Byte #       0      1      2      3      4      5      6      7

          (Figure labels: Least significant Word, Halfword, Byte)

Bits in a Byte
Bit #     7 6 5 4 3 2 1 0

Figure 3-5. Big-Endian Data in a Doubleword


The CPU uses byte addressing for halfword, word, doubleword, and quadword accesses
with the following alignment constraints:

• Halfword accesses must be aligned on an even byte boundary (0, 2, 4...).

• Word accesses must be aligned on a byte boundary divisible by four (0, 4, 8...).

• Doubleword accesses must be aligned on a byte boundary divisible by eight (0, 8,
  16...).
• Quadword accesses must be aligned on a byte boundary divisible by sixteen (0,
  16, 32...).
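The alignment constraints above reduce to a divisibility test on the byte address. A minimal Python sketch, with an illustrative table and function name:

```python
# Access sizes in bytes for each data format.
ACCESS_BYTES = {
    "byte": 1,
    "halfword": 2,
    "word": 4,
    "doubleword": 8,
    "quadword": 16,
}


def is_aligned(addr, access):
    """Return True if `addr` is naturally aligned for the access size."""
    return addr % ACCESS_BYTES[access] == 0
```

For example, address 6 is valid for a halfword access but not for a word access, and address 24 is valid for a doubleword but not for a quadword.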

The following special instructions load and store words that are not aligned on 4-byte
(word) or 8-byte (doubleword) boundaries:

LWL LWR SWL SWR

LDL LDR SDL SDR

These instructions are used in pairs to provide addressing of misaligned words.

Addressing misaligned data incurs one additional instruction cycle over that required for

addressing aligned data. This extra cycle is because of an extra instruction for the “pair”

(e.g., LWL and LWR form a pair). Also note that the CPU moves the unaligned data at the

same rate as a hardware mechanism.

Figure 3-6 and Figure 3-7 show the access of a misaligned word that has byte address 3.

            Bit #   31   24 23   16 15    8 7     0
Higher Address          4       5       6
Lower Address                                   3

Figure 3-6. Big-Endian Misaligned Word Addressing

            Bit #   31   24 23   16 15    8 7     0
Higher Address                  6       5       4
Lower Address           3

Figure 3-7. Little-Endian Misaligned Word Addressing


3.3.1.5 Defining Access Types

Access type indicates the size of the C790 processor data item to be loaded or stored, set
by the load or store instruction opcode.

Regardless of access type or byte ordering (endianness), the address given specifies the low-

order byte in the addressed field. For a big-endian configuration, the low-order byte is the

most-significant byte; for a little-endian configuration, the low-order byte is the least-

significant byte.

The access type, together with the four low-order bits of the address, defines the bytes
accessed within the addressed quadword (shown in Table 3-4 and Table 3-5). Only the
combinations shown in Table 3-4 and Table 3-5 are permissible; other combinations cause
address error exceptions.


Table 3-4. Defining Access Types (Big-Endian)

Access Type   Low-Order Address   Bytes Accessed, Big endian
Mnemonic      Bits 3 2 1 0        (bit 127 ... bit 0)
Quadword      0 0 0 0             0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Doubleword    0 0 0 0             0 1 2 3 4 5 6 7
              1 0 0 0             8 9 10 11 12 13 14 15
Septibyte     0 0 0 0             0 1 2 3 4 5 6
              0 0 0 1             1 2 3 4 5 6 7
              1 0 0 0             8 9 10 11 12 13 14
              1 0 0 1             9 10 11 12 13 14 15
Sextibyte     0 0 0 0             0 1 2 3 4 5
              0 0 1 0             2 3 4 5 6 7
              1 0 0 0             8 9 10 11 12 13
              1 0 1 0             10 11 12 13 14 15
Quintibyte    0 0 0 0             0 1 2 3 4
              0 0 1 1             3 4 5 6 7
              1 0 0 0             8 9 10 11 12
              1 0 1 1             11 12 13 14 15
Word          0 0 0 0             0 1 2 3
              0 1 0 0             4 5 6 7
              1 0 0 0             8 9 10 11
              1 1 0 0             12 13 14 15
Triplebyte    0 0 0 0             0 1 2
              0 0 0 1             1 2 3
              0 1 0 0             4 5 6
              0 1 0 1             5 6 7
              1 0 0 0             8 9 10
              1 0 0 1             9 10 11
              1 1 0 0             12 13 14
              1 1 0 1             13 14 15
Halfword      0 0 0 0             0 1
              0 0 1 0             2 3
              0 1 0 0             4 5
              0 1 1 0             6 7
              1 0 0 0             8 9
              1 0 1 0             10 11
              1 1 0 0             12 13
              1 1 1 0             14 15
Byte          0 0 0 0             0
              0 0 0 1             1
              0 0 1 0             2
              0 0 1 1             3
              0 1 0 0             4
              0 1 0 1             5
              0 1 1 0             6
              0 1 1 1             7
              1 0 0 0             8
              1 0 0 1             9
              1 0 1 0             10
              1 0 1 1             11
              1 1 0 0             12
              1 1 0 1             13
              1 1 1 0             14
              1 1 1 1             15


Table 3-5. Defining Access Types (Little-Endian)

Access Type   Low-Order Address   Bytes Accessed, Little endian
Mnemonic      Bits 3 2 1 0        (bit 127 ... bit 0)
Quadword      0 0 0 0             15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Doubleword    0 0 0 0             7 6 5 4 3 2 1 0
              1 0 0 0             15 14 13 12 11 10 9 8
Septibyte     0 0 0 0             6 5 4 3 2 1 0
              0 0 0 1             7 6 5 4 3 2 1
              1 0 0 0             14 13 12 11 10 9 8
              1 0 0 1             15 14 13 12 11 10 9
Sextibyte     0 0 0 0             5 4 3 2 1 0
              0 0 1 0             7 6 5 4 3 2
              1 0 0 0             13 12 11 10 9 8
              1 0 1 0             15 14 13 12 11 10
Quintibyte    0 0 0 0             4 3 2 1 0
              0 0 1 1             7 6 5 4 3
              1 0 0 0             12 11 10 9 8
              1 0 1 1             15 14 13 12 11
Word          0 0 0 0             3 2 1 0
              0 1 0 0             7 6 5 4
              1 0 0 0             11 10 9 8
              1 1 0 0             15 14 13 12
Triplebyte    0 0 0 0             2 1 0
              0 0 0 1             3 2 1
              0 1 0 0             6 5 4
              0 1 0 1             7 6 5
              1 0 0 0             10 9 8
              1 0 0 1             11 10 9
              1 1 0 0             14 13 12
              1 1 0 1             15 14 13
Halfword      0 0 0 0             1 0
              0 0 1 0             3 2
              0 1 0 0             5 4
              0 1 1 0             7 6
              1 0 0 0             9 8
              1 0 1 0             11 10
              1 1 0 0             13 12
              1 1 1 0             15 14
Byte          0 0 0 0             0
              0 0 0 1             1
              0 0 1 0             2
              0 0 1 1             3
              0 1 0 0             4
              0 1 0 1             5
              0 1 1 0             6
              0 1 1 1             7
              1 0 0 0             8
              1 0 0 1             9
              1 0 1 0             10
              1 0 1 1             11
              1 1 0 0             12
              1 1 0 1             13
              1 1 1 0             14
              1 1 1 1             15

3.3.1.6 Scheduling a Load Delay Slot

A load instruction that does not allow its result to be used by the instruction immediately
following is called a delayed load instruction. The instruction slot immediately following
this delayed load instruction is referred to as the load delay slot.

In the C790 processor, the instruction immediately following a load instruction can use
the contents of the loaded register. In such cases, however, hardware interlocks insert
additional clock cycles. Consequently, scheduling load delay slots can be desirable, both
for performance and R-Series processor compatibility. However, the scheduling of load
delay slots is not absolutely required.


3.3.2 Computational Instructions

The instructions in this group perform two’s complement arithmetic, logical operations, or

shifts on integers represented in two’s complement notation.

Computational instructions can be either in register (R-type) format, in which both

operands are registers, or in immediate (I-type) format, in which one operand is a 16-bit

immediate.

Computational instructions perform the following operations on register values:

• Arithmetic

• Logical

• Shift

• Multiply

• Divide

These operations fit in the following four categories of computational instructions:

• ALU immediate instructions

• Three-Operand Register-Type instructions

• Shift instructions

• Multiply and Divide instructions

For detailed information of individual instructions, refer to Appendix A.

*Note: The C790 does not support 64-bit Multiply and Divide instructions, DMULT, DMULTU,

DDIV, and DDIVU.

3.3.2.1 ALU Immediate Instructions

Table 3-6. ALU Immediate Instructions

Mnemonic  Description                            Defined in
ADDI      Add Immediate                          MIPS I
ADDIU     Add Immediate Unsigned                 MIPS I
SLTI      Set on Less Than Immediate             MIPS I
SLTIU     Set on Less Than Immediate Unsigned    MIPS I
ANDI      AND Immediate                          MIPS I
ORI       OR Immediate                           MIPS I
XORI      Exclusive OR Immediate                 MIPS I
LUI       Load Upper Immediate                   MIPS I
DADDI     Doubleword Add Immediate               MIPS III
DADDIU    Doubleword Add Immediate Unsigned      MIPS III


3.3.2.2 Three Operand Register-Type Instructions

Table 3-7. Three Operand Register-Type Instructions

Mnemonic  Description                    Defined in
ADD       Add                            MIPS I
ADDU      Add Unsigned                   MIPS I
SUB       Subtract                       MIPS I
SUBU      Subtract Unsigned              MIPS I
DADD      Doubleword Add                 MIPS III
DADDU     Doubleword Add Unsigned        MIPS III
DSUB      Doubleword Subtract            MIPS III
DSUBU     Doubleword Subtract Unsigned   MIPS III
SLT       Set Less Than                  MIPS I
SLTU      Set Less Than Unsigned         MIPS I
AND       AND                            MIPS I
OR        OR                             MIPS I
XOR       Exclusive OR                   MIPS I
NOR       NOR                            MIPS I

3.3.2.3 Shift Instructions

Table 3-8. Shift Instructions

Mnemonic  Description                                  Defined in
SLL       Shift Left Logical                           MIPS I
SRL       Shift Right Logical                          MIPS I
SRA       Shift Right Arithmetic                       MIPS I
SLLV      Shift Left Logical Variable                  MIPS I
SRLV      Shift Right Logical Variable                 MIPS I
SRAV      Shift Right Arithmetic Variable              MIPS I
DSLL      Doubleword Shift Left Logical                MIPS III
DSRL      Doubleword Shift Right Logical               MIPS III
DSRA      Doubleword Shift Right Arithmetic            MIPS III
DSLL32    Doubleword Shift Left Logical + 32           MIPS III
DSRL32    Doubleword Shift Right Logical + 32          MIPS III
DSRA32    Doubleword Shift Right Arithmetic + 32       MIPS III
DSLLV     Doubleword Shift Left Logical Variable       MIPS III
DSRLV     Doubleword Shift Right Logical Variable      MIPS III
DSRAV     Doubleword Shift Right Arithmetic Variable   MIPS III

3.3.2.4 Multiply and Divide Instructions

These are the standard MIPS instructions for multiply, divide, and move to / from HI / LO
registers, executed on the I0 pipeline’s MAC unit. See also the discussion of the C790-specific
Multiply and Divide instructions.

Table 3-9. Multiply and Divide Instructions

Mnemonic  Description         Defined in
MULT      Multiply            MIPS I
MULTU     Multiply Unsigned   MIPS I
DIV       Divide              MIPS I
DIVU      Divide Unsigned     MIPS I
MFHI      Move From HI        MIPS I
MTHI      Move To HI          MIPS I
MFLO      Move From LO        MIPS I
MTLO      Move To LO          MIPS I

3.3.2.5 64-Bit Operations

The result of operations that use incorrectly sign-extended 32-bit values for 64-bit
operations is unpredictable.
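As an illustration of what a correctly sign-extended value looks like, the Python sketch below treats a 32-bit value held in a 64-bit register as correctly sign-extended when bits 63..32 all equal bit 31. This is the conventional MIPS definition and is assumed here rather than taken from this section. The function names are hypothetical.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def sign_extend32(value):
    """Sign-extend the low 32 bits of `value` to a 64-bit pattern."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value |= 0xFFFFFFFF00000000
    return value


def is_sign_extended32(reg64):
    """True if bits 63..32 of the register all equal bit 31."""
    reg64 &= MASK64
    return reg64 == sign_extend32(reg64)
```

A register holding 0x0000000080000000 fails the check, because bit 31 is set while the upper 32 bits are zero.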


3.3.3 Jump and Branch Instructions

The architecture defines PC-relative conditional branches, a PC-region unconditional

jump, an absolute (register) unconditional jump, and a similar set of procedure calls that

record a return link address in a general register. For convenience, these are all referred

to here as branches.

All branches have an architectural delay of one instruction. When a branch is taken, the

instruction immediately following the branch instruction, in the branch delay slot, is

executed before the branch to the target instruction takes place. Conditional branches

come in two versions that treat the instruction in the delay slot differently when the

branch is not taken and execution falls through. The ‘branch’ instructions execute the

instruction in the delay slot, but the ‘branch likely’ instructions do not. (They are said to

‘nullify’ it.)

By convention, if an exception or interrupt prevents the completion of an instruction

occupying a branch delay slot, the instruction stream is continued by re-executing the

branch instruction. To permit this, branches must be restartable; procedure calls may not

use the register in which the return link is stored (usually register 31) to determine the

branch target address.

For detailed information of individual instructions, refer to Appendix A. Branch on

Coprocessor instructions are covered under coprocessor’s discussions.

3.3.3.1 Jump Instructions

Subroutine calls in high-level languages are usually implemented with Jump or Jump and
Link instructions, both of which are J-type instructions. In J-type format, the 26-bit target
address is shifted left 2 bits and combined with the high-order 4 bits of the current program
counter to form an absolute address.
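The J-type target computation can be sketched as follows. The `pc` argument stands for whatever program counter value the architecture uses for the upper four bits. The function name `jump_target` is hypothetical.

```python
def jump_target(pc, target26):
    """Combine the upper 4 PC bits with the 26-bit target shifted left by 2."""
    return (pc & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)
```

Because only 28 address bits come from the instruction, a J or JAL can reach any word inside the current 256 MByte region (2^28 bytes) but cannot leave it.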

Returns, dispatches, and large cross-page jumps are usually implemented with the Jump

Register or Jump and Link Register instructions. Both are R-type instructions that take

the 32-bit byte address contained in one of the general purpose registers.

Table 3-10. Jump Instructions Jumping Within a 256 MByte Region

Mnemonic  Description     Defined in
J         Jump            MIPS I
JAL       Jump and Link   MIPS I

Table 3-11. Jump Instructions to Absolute Address

Mnemonic  Description              Defined in
JR        Jump Register            MIPS I
JALR      Jump and Link Register   MIPS I


3.3.3.2 Branch Instructions

All branch instruction target addresses are computed by adding the address of the
instruction in the branch delay slot to the 16-bit offset (shifted left 2 bits and sign-extended
to 32 bits). All branches occur with a delay of one instruction.

In the case of a Branch Likely instruction, if the branch is not taken, the instruction in the
delay slot is nullified.
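The branch target computation can be sketched in Python as follows. It assumes 4-byte instructions, so that the delay slot sits at the branch address plus 4, and truncates the result to 32 bits. The function name `branch_target` is hypothetical.

```python
def branch_target(branch_pc, offset16):
    """Delay-slot address plus the sign-extended 16-bit offset shifted left 2."""
    delay_slot = branch_pc + 4          # assumes 32-bit (4-byte) instructions
    offset = offset16 & 0xFFFF
    if offset & 0x8000:                 # sign-extend the 16-bit field
        offset -= 0x10000
    return (delay_slot + (offset << 2)) & 0xFFFFFFFF
```

An offset field of 0xFFFF (that is, -1) therefore branches back to the branch instruction itself, and the reachable range is roughly 128 KBytes either side of the delay slot.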

Table 3-12. PC-Relative Conditional Branch Instructions Comparing 2 Registers

Mnemonic  Description                                  Defined in
BEQ       Branch on Equal                              MIPS I
BNE       Branch on Not Equal                          MIPS I
BLEZ      Branch on Less Than or Equal to Zero         MIPS I
BGTZ      Branch on Greater Than Zero                  MIPS I
BEQL      Branch on Equal Likely                       MIPS II
BNEL      Branch on Not Equal Likely                   MIPS II
BLEZL     Branch on Less Than or Equal to Zero Likely  MIPS II
BGTZL     Branch on Greater Than Zero Likely           MIPS II

Table 3-13. PC-Relative Conditional Branch Instructions Comparing Against Zero

Mnemonic  Description                                              Defined in
BLTZ      Branch on Less Than Zero                                 MIPS I
BGEZ      Branch on Greater Than or Equal to Zero                  MIPS I
BLTZAL    Branch on Less Than Zero and Link                        MIPS I
BGEZAL    Branch on Greater Than or Equal to Zero and Link         MIPS I
BLTZL     Branch on Less Than Zero Likely                          MIPS II
BGEZL     Branch on Greater Than or Equal to Zero Likely           MIPS II
BLTZALL   Branch on Less Than Zero and Link Likely                 MIPS II
BGEZALL   Branch on Greater Than or Equal to Zero and Link Likely  MIPS II


3.3.4 Miscellaneous Instructions

3.3.4.1 Exception Instructions

Exception instructions have as their sole purpose causing an exception that will transfer

control to a software exception handler in the kernel. System call and breakpoint

instructions cause exceptions unconditionally. The trap instructions cause exceptions

conditionally based upon the result of a comparison. For details of these instructions, refer
to the individual instruction descriptions in Appendix A.

Table 3-14. Exception Instructions

Mnemonic  Description                                   Defined in
BREAK     Breakpoint                                    MIPS I
SYSCALL   System Call                                   MIPS I
TGE       Trap if Greater or Equal                      MIPS II
TGEU      Trap if Greater or Equal Unsigned             MIPS II
TLT       Trap if Less Than                             MIPS II
TLTU      Trap if Less Than Unsigned                    MIPS II
TEQ       Trap if Equal                                 MIPS II
TNE       Trap if Not Equal                             MIPS II
TGEI      Trap if Greater or Equal Immediate            MIPS II
TGEIU     Trap if Greater or Equal Immediate Unsigned   MIPS II
TLTI      Trap if Less Than Immediate                   MIPS II
TLTIU     Trap if Less Than Immediate Unsigned          MIPS II
TEQI      Trap if Equal Immediate                       MIPS II
TNEI      Trap if Not Equal Immediate                   MIPS II

3.3.4.2 Serialization Instructions

The order in which memory accesses from load and store instructions appear outside the

C790 is not specified by the architecture. The SYNC (or SYNC.L) instruction creates a

point in the executing instruction stream at which the relative order of some loads and
stores is known. Loads and stores executed before the SYNC (or SYNC.L) are retired before
loads and stores after the SYNC (or SYNC.L) can start.

In order to guarantee the completion of certain instructions a SYNC.P instruction can be

used. Instructions executed before a SYNC.P instruction are completed before instructions

after the SYNC.P can start. For detail of this instruction refer to SYNC instruction as

described in Appendix A.

Table 3-15. Serialization Instructions

Mnemonic        Description       Defined in
SYNC (Note 2)   Synchronization   MIPS II

Note 2: This includes the SYNC, SYNC.L and SYNC.P instructions.


3.3.4.3 MIPS IV Instructions

The C790 supports a subset of the MIPS IV instructions: the Conditional Move instructions
and the Prefetch instruction.

Conditional move operations allow ‘IF’ statements to be represented without branches.

‘THEN’ and ‘ELSE’ clauses are computed unconditionally and the results are placed in a

temporary register. Conditional move operations then transfer the temporary results to

their true register.
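The branch-free form of an ‘IF’ statement can be modelled in Python as follows. The sketch assumes the standard MIPS IV semantics of MOVN (move if the condition register is not zero) and MOVZ (move if it is zero). The Python function names and the `select` helper are illustrative only.

```python
def movn(rd, rs, rt):
    """MOVN rd, rs, rt: return the new rd, which becomes rs if rt is not zero."""
    return rs if rt != 0 else rd


def movz(rd, rs, rt):
    """MOVZ rd, rs, rt: return the new rd, which becomes rs if rt is zero."""
    return rs if rt == 0 else rd


def select(cond, then_value, else_value):
    """Branch-free `result = then_value if cond else else_value`.

    Both values are computed beforehand; the conditional move then picks one.
    """
    result = else_value                       # start from the ELSE result
    result = movn(result, then_value, cond)   # overwrite when cond != 0
    return result
```

Both clauses are evaluated unconditionally, so this form is only suitable when neither clause has side effects that must be suppressed.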

The Prefetch instruction fetches data expected to be used in the near future and places it

in the data cache.

For details of these instructions, refer to the individual instruction descriptions in
Appendix A.
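The branch-free IF/THEN/ELSE pattern described above can be sketched in Python. This is an illustrative model only, assuming the usual MIPS IV semantics (MOVN copies rs to rd when rt is non-zero, MOVZ when rt is zero); the function names and the `select_branchless` helper are hypothetical and are not part of the C790 documentation.

```python
def movn(rd, rs, rt):
    """Model of MOVN rd, rs, rt: the new rd is rs if rt != 0, otherwise rd is unchanged."""
    return rs if rt != 0 else rd


def movz(rd, rs, rt):
    """Model of MOVZ rd, rs, rt: the new rd is rs if rt == 0, otherwise rd is unchanged."""
    return rs if rt == 0 else rd


def select_branchless(cond, then_val, else_val):
    """Hypothetical helper: both clauses are already computed into temporaries.

    Start from the ELSE result, then conditionally overwrite it with the
    THEN result, mirroring a MOVN on the condition register.
    """
    result = else_val
    result = movn(result, then_val, cond)
    return result
```

The model returns the new register value instead of mutating state, which keeps the "destination unchanged" case explicit.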

Table 3-16. MIPS IV Instructions

Mnemonic Description Defined in

MOVN Move Conditional on Not Zero MIPS IV

MOVZ Move Conditional on Zero MIPS IV

PREF Prefetch MIPS IV

3.3.5 System Control Coprocessor (COP0) Instructions

COP0 instructions perform operations specifically on the System Control Coprocessor

registers to manipulate the memory management, exception handling, performance

monitor, and debug facilities of the processor.

COP0 instructions are enabled if the processor is in Kernel mode, or if bit 28 (CU[0]) is
set in the Status register. Otherwise, executing one of these instructions generates a
Coprocessor Unusable Exception.

For details of COP0 instructions refer to Appendix C.

Table 3-17. System Control Coprocessor Instructions

Mnemonic Description Defined in

BC0F      Branch on Coprocessor 0 False                          MIPS I
BC0T      Branch on Coprocessor 0 True                           MIPS I
BC0FL     Branch on Coprocessor 0 False Likely                   MIPS II
BC0TL     Branch on Coprocessor 0 True Likely                    MIPS II
CACHE     Cache Operation                                        R4000
DI        Disable Interrupt                                      C790
EI        Enable Interrupt                                       C790
ERET      Exception Return                                       R4000
TLBR      Read Indexed TLB Entry                                 R4000
TLBWI     Write Index TLB Entry                                  R4000
TLBWR     Write Random TLB Entry                                 R4000
TLBP      Probe TLB for Matching Entry                           R4000
MTC0      Move To System Control Coprocessor                     R4000
MFC0      Move From System Control Coprocessor                   R4000
MTPC      Move To Performance Counter                            C790
MFPC      Move From Performance Counter                          C790
MTPS      Move To Performance Event Specifier                    C790
MFPS      Move From Performance Event Specifier                  C790
MTBPC     Move To Breakpoint Control Register                    C790
MFBPC     Move From Breakpoint Control Register                  C790
MTDAB     Move To Data Address Breakpoint Register               C790
MFDAB     Move From Data Address Breakpoint Register             C790
MTDABM    Move To Data Address Breakpoint Mask Register          C790
MFDABM    Move From Data Address Breakpoint Mask Register        C790
MTIAB     Move To Instruction Address Breakpoint Register        C790
MFIAB     Move From Instruction Address Breakpoint Register      C790
MTIABM    Move To Instruction Address Breakpoint Mask Register   C790
MFIABM    Move From Instruction Address Breakpoint Mask Register C790
MTDVB     Move To Data Value Breakpoint Register                 C790
MFDVB     Move From Data Value Breakpoint Register               C790
MTDVBM    Move To Data Value Breakpoint Mask Register            C790
MFDVBM    Move From Data Value Breakpoint Mask Register          C790

3.3.6 Coprocessor 1 (COP1)

Coprocessor instructions perform operations in their respective coprocessors. Coprocessor
loads and stores are I-type, and coprocessor computational instructions have
coprocessor-dependent formats. Coprocessor load and store instructions are summarized in
3.3.1.3.

3.3.6.1 Coprocessor 1 (COP1) Instructions

COP1 instructions are enabled if bit 29 (CU[1]) is set in the Status register. Otherwise,
executing one of these instructions generates a Coprocessor Unusable Exception. For
details of COP1 instructions refer to Appendix D.

Table 3-18. Coprocessor 1 Instructions

Mnemonic Description Defined in

BC1F          Branch on Floating Point False                          MIPS I
BC1T          Branch on Floating Point True                           MIPS I
LWC1          Load Word to Floating Point                             MIPS I
LDC1          Load Doubleword to Floating Point                       MIPS II
SWC1          Store Word from Floating Point                          MIPS I
SDC1          Store Doubleword from Floating Point                    MIPS II
MFC1          Move Word from Floating Point                           MIPS I
MTC1          Move Word to Floating Point                             MIPS I
DMFC1         Move Doubleword from Floating Point                     MIPS III
DMTC1         Move Doubleword to Floating Point                       MIPS III
CFC1          Move Control Word from Floating Point                   MIPS I
CTC1          Move Control Word to Floating Point                     MIPS I
CVT.D.fmt     Floating Point Convert to Double Floating Point         MIPS I, III
CVT.L.fmt     Floating Point Convert to Long Fixed Point              MIPS III
CVT.S.fmt     Floating Point Convert to Single Floating Point         MIPS I, III
CVT.W.fmt     Floating Point Convert to Word Fixed Point              MIPS I
ADD.fmt       Floating Point Add                                      MIPS I
SUB.fmt       Floating Point Subtract                                 MIPS I
MUL.fmt       Floating Point Multiply                                 MIPS I
DIV.fmt       Floating Point Divide                                   MIPS I
ABS.fmt       Floating Point Absolute                                 MIPS I
MOV.fmt       Floating Point Move                                     MIPS I
NEG.fmt       Floating Point Negate                                   MIPS I
SQRT.fmt      Floating Point Square Root                              MIPS II
C.cond.fmt    Floating Point Compare                                  MIPS I
CEIL.L.fmt    Floating Point Ceiling Convert to Long Fixed Point      MIPS III
CEIL.W.fmt    Floating Point Ceiling Convert to Word Fixed Point      MIPS II
FLOOR.L.fmt   Floating Point Floor Convert to Long Fixed Point        MIPS III
FLOOR.W.fmt   Floating Point Floor Convert to Word Fixed Point        MIPS II
ROUND.L.fmt   Floating Point Round to Long Fixed Point                MIPS III
ROUND.W.fmt   Floating Point Round to Word Fixed Point                MIPS II
TRUNC.L.fmt   Floating Point Truncate to Long Fixed Point             MIPS III
TRUNC.W.fmt   Floating Point Truncate to Word Fixed Point             MIPS II

3.3.7 C790-Specific Instructions

The C790 extends its instruction set from the original MIPS architecture. The following

instructions are supported:

• Three-operand Multiply and Multiply/Add instructions

• Multiply instructions for Pipeline 1

• Multimedia instructions

• Enable interrupt and Disable interrupt instructions

For more information, refer to Appendices B and C.

3.3.7.1 Integer Multiply / Divide Instructions

The standard MIPS instructions for multiply, divide and move to / from HI / LO registers

execute on the I0 pipeline’s MAC unit. A complete set of new instructions has also been

defined to execute on the I1 pipeline’s MAC unit. All of these instructions are shown in the

following table.

Table 3-19. C790-Specific Multiply and Divide Instructions

OpCode    Description                      OpCode    Description
(Three-Operand Multiply and Multiply-Add)  DIV1      Divide 1
MADD      Multiply/Add                     DIVU1     Divide Unsigned 1
MADDU     Multiply/Add Unsigned            MADD1     Multiply/Add 1
MULT      Multiply (3-operand)             MADDU1    Multiply/Add Unsigned 1
MULTU     Multiply Unsigned (3-operand)    MFHI1     Move From HI 1
(Multiply Instructions for Pipeline 1)     MFLO1     Move From LO 1
MULT1     Multiply 1                       MTHI1     Move To HI 1
MULTU1    Multiply Unsigned 1              MTLO1     Move To LO 1

The C790 supports three-operand multiply instructions that store the multiply result to a
general purpose register in addition to the LO register. With these instructions, software
does not need a separate MFLO instruction to move data from the LO register to a general
purpose register.

• MULT rd, rs, rt     HI || LO = rs * rt (signed)
                      rd = new LO contents

• MULTU rd, rs, rt    HI || LO = rs * rt (unsigned)
                      rd = new LO contents

The C790 also supports new multiply-add instructions, MADD and MADDU. These

instructions execute multiply-accumulate operations using the HI and LO registers as

accumulators.

• MADD rd, rs, rt     HI || LO += rs * rt (signed)
                      rd = new LO contents

• MADDU rd, rs, rt    HI || LO += rs * rt (unsigned)
                      rd = new LO contents
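The register-transfer notation above can be modeled with a short Python sketch. It assumes the conventional MIPS behavior in which the operands are 32-bit words and the 64-bit product is split into a 32-bit HI half and a 32-bit LO half; the sign extension of HI and LO into 64-bit registers is deliberately left out. The function names are hypothetical; Appendix B remains the authoritative definition.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def _operand(value, signed):
    """Interpret the low 32 bits of a register as a signed or unsigned word."""
    value &= MASK32
    if signed and value & 0x80000000:
        return value - (1 << 32)
    return value


def mult(rs, rt, signed=True):
    """Model of MULT/MULTU rd, rs, rt. Returns (hi, lo, rd) with rd = new LO."""
    product = (_operand(rs, signed) * _operand(rt, signed)) & MASK64
    hi, lo = product >> 32, product & MASK32
    return hi, lo, lo


def madd(hi, lo, rs, rt, signed=True):
    """Model of MADD/MADDU rd, rs, rt: HI || LO accumulates rs * rt."""
    acc = ((hi & MASK32) << 32) | (lo & MASK32)
    acc = (acc + _operand(rs, signed) * _operand(rt, signed)) & MASK64
    hi, lo = acc >> 32, acc & MASK32
    return hi, lo, lo
```

For example, `mult(3, 4)` yields HI = 0 and LO = rd = 12, and a following `madd(0, 12, 3, 4)` accumulates to LO = rd = 24, without an intervening MFLO.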

3.3.7.2 Multimedia Instructions

The C790 defines a new set of instructions to support multimedia applications. These
instructions are shown in Table 3-20. Most of these instructions do parallel operations on
data by combining the execution units of the two pipelines (I0 and I1). They form a 128-bit
path and then do parallel operations on either two 64-bit data items, four 32-bit data
items, eight 16-bit data items, or sixteen 8-bit data items.

In order to support the 128-bit datapath, 128-bit load/store operations are also
implemented.
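As an illustration of what "parallel operations" on a 128-bit value means, the following Python sketch models a lane-wise wraparound add in the style of PADDB, PADDH and PADDW. It is a behavioral sketch only: the function name and the use of a Python integer to hold a 128-bit register are assumptions, and saturating variants such as PADDSH are not modeled.

```python
def parallel_add(a, b, lane_bits):
    """Add two 128-bit values lane by lane, wrapping around within each lane.

    lane_bits = 8, 16 or 32 corresponds to byte, halfword or word lanes.
    """
    if lane_bits not in (8, 16, 32):
        raise ValueError("lane_bits must be 8, 16 or 32")
    lane_mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, 128, lane_bits):
        # Shifting first discards lower lanes, so no carry crosses a lane boundary.
        lane = ((a >> shift) + (b >> shift)) & lane_mask
        result |= lane << shift
    return result
```

With halfword lanes, adding 0x0001FFFF and 0x00010001 gives 0x00020000 because the low lane wraps to zero without carrying into its neighbor; with word lanes the same inputs give 0x00030000.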

Table 3-20. Multimedia Instructions

OpCode     Description

(Arithmetic)
PADDB      Parallel Add Byte
PSUBB      Parallel Subtract Byte
PADDH      Parallel Add Halfword
PSUBH      Parallel Subtract Halfword
PADDW      Parallel Add Word
PSUBW      Parallel Subtract Word
PADSBH     Parallel Add/Subtract Halfword
PADDSB     Parallel Add with Signed Saturation Byte
PSUBSB     Parallel Subtract with Signed Saturation Byte
PADDSH     Parallel Add with Signed Saturation Halfword
PSUBSH     Parallel Subtract with Signed Saturation Halfword
PADDSW     Parallel Add with Signed Saturation Word
PSUBSW     Parallel Subtract with Signed Saturation Word
PADDUB     Parallel Add with Unsigned Saturation Byte
PSUBUB     Parallel Subtract with Unsigned Saturation Byte
PADDUH     Parallel Add with Unsigned Saturation Halfword
PSUBUH     Parallel Subtract with Unsigned Saturation Halfword
PADDUW     Parallel Add with Unsigned Saturation Word
PSUBUW     Parallel Subtract with Unsigned Saturation Word

(Min/Max)
PMAXH      Parallel Maximum Halfword
PMINH      Parallel Minimum Halfword
PMAXW      Parallel Maximum Word
PMINW      Parallel Minimum Word

(Absolute)
PABSH      Parallel Absolute Halfword
PABSW      Parallel Absolute Word

(Multiply and Divide)
PMULTW     Parallel Multiply Word
PMULTUW    Parallel Multiply Unsigned Word
PDIVW      Parallel Divide Word
PDIVUW     Parallel Divide Unsigned Word
PMADDW     Parallel Multiply/Add Word
PMADDUW    Parallel Multiply/Add Unsigned Word
PMSUBW     Parallel Multiply/Subtract Word
PMFHI      Parallel Move From HI
PMFLO      Parallel Move From LO
PMTHI      Parallel Move To HI
PMTLO      Parallel Move To LO
PMULTH     Parallel Multiply Halfword
PMADDH     Parallel Multiply/Add Halfword
PMSUBH     Parallel Multiply/Subtract Halfword
PMFHL      Parallel Move From HI/LO
PMTHL      Parallel Move To HI/LO
PHMADH     Parallel Horizontal Multiply/Add Halfword
PHMSBH     Parallel Horizontal Multiply/Subtract Halfword
PDIVBW     Parallel Divide Broadcast Word

OpCode     Description

(SA Operation)
MFSA       Move from SA Register
MTSA       Move to SA Register
MTSAB      Move Byte Count to SA Register
MTSAH      Move Halfword Count to SA Register

(Shift)
PSLLH      Parallel Shift Left Logical Halfword
PSRLH      Parallel Shift Right Logical Halfword
PSRAH      Parallel Shift Right Arithmetic Halfword
PSLLW      Parallel Shift Left Logical Word
PSRLW      Parallel Shift Right Logical Word
PSRAW      Parallel Shift Right Arithmetic Word
PSLLVW     Parallel Shift Left Logical Variable Word
PSRLVW     Parallel Shift Right Logical Variable Word
PSRAVW     Parallel Shift Right Arithmetic Variable Word

(Logical)
PAND       Parallel AND
POR        Parallel OR
PXOR       Parallel XOR
PNOR       Parallel NOR

(Compare)
PCGTB      Parallel Compare for Greater Than Byte
PCEQB      Parallel Compare for Equal Byte
PCGTH      Parallel Compare for Greater Than Halfword
PCEQH      Parallel Compare for Equal Halfword
PCGTW      Parallel Compare for Greater Than Word
PCEQW      Parallel Compare for Equal Word

(Quadword Load Store)
LQ         Load Quadword
SQ         Store Quadword

(Pack/Extend)
PPACB      Parallel Pack To Byte
PPACH      Parallel Pack To Halfword
PINTEH     Parallel Interleave Even Halfword
PPACW      Parallel Pack To Word
PEXTUB     Parallel Extend Upper From Byte
PEXTLB     Parallel Extend Lower From Byte
PEXTUH     Parallel Extend Upper From Halfword
PEXTLH     Parallel Extend Lower From Halfword
PEXTUW     Parallel Extend Upper From Word
PEXTLW     Parallel Extend Lower From Word
PEXT5      Parallel Extend from 5 bits
PPAC5      Parallel Pack to 5 bits

(Others)
PCPYH      Parallel Copy Halfword
PCPYLD     Parallel Copy Lower Doubleword
PCPYUD     Parallel Copy Upper Doubleword
PREVH      Parallel Reverse Halfword
PINTH      Parallel Interleave Halfword
PEXEH      Parallel Exchange Even Halfword
PEXCH      Parallel Exchange Center Halfword
PEXEW      Parallel Exchange Even Word
PEXCW      Parallel Exchange Center Word
PROT3W     Parallel Rotate 3 Word
QFSRV      Quadword Funnel Shift Right Variable
PLZCW      Parallel Leading Zero Count Word

3.4 User Instruction Latency and Repeat Rate

Table 3-21 shows the latencies and repeat rates for all user instructions executed in the
I0, I1, BR, LS and C1 execution pipelines. Kernel instructions are not included, nor are
instructions not issued to these execution pipelines. See Figure 2-1 and Figure 2-4 for
the execution pipeline names.

Table 3-21. Latencies and Repeat Rates for User Instruction

Instruction Type           Execution  Latency  Repeat Rate  Comment

Integer Instructions
Add/Sub/Logical/Set        I0/I1      1        1
MF/MT/HI/LO                I0/I1      1        1
Shift/LUI                  I0/I1      1        1
Branch/Jump                BR         1        1
Conditional Move           I0/I1      1        1
MULT/MULTU                 I0         4        2            Latency relative to Lo/Hi/GPR
MULT1/MULTU1               I1         4        2            Latency relative to Lo1/Hi1/GPR
DIV/DIVU                   I0         37       37           Latency relative to Lo/Hi
DIV1/DIVU1                 I1         37       37           Latency relative to Lo1/Hi1
MADD/MADDU                 I0         4        2            Latency relative to Lo/Hi/GPR
MADD1/MADDU1               I1         4        2            Latency relative to Lo1/Hi1/GPR
Load                       LS         1        1            Assuming cache hit
Store                      LS         -        1            Assuming cache hit
Multimedia Multiply        I0+I1      4        2
Multimedia Multiply/Add    I0+I1      4        2
Multimedia Divide          I0+I1      37       37

Floating-Point Instructions
ADD.S/SUB.S/C.cond.S       C1         6        2
ADD.D/SUB.D/C.cond.D       C1         8        2
ABS/NEG/MOV                C1         6        2
CVT                        C1         8        2
MUL.S                      C1         6        2
MUL.D                      C1         8        2
DIV.S                      C1         21       15
DIV.D                      C1         35       29
SQRT.S                     C1         21       15
SQRT.D                     C1         35       29
MFC1/MTC1                  C1+LS      2        1
DMFC1/DMTC1                C1+LS      2        1
CFC1/CTC1                  C1+LS      2        1
LWC1/LDC1                  C1+LS      2        1            Assuming cache hit
SWC1/SDC1                  C1+LS      -        1

Chapter 4 CPU and COP0 Registers

4. CPU and COP0 Registers

This chapter describes the CPU registers and the System Control Coprocessor (COP0)

registers.

The CPU registers group consists of:

• General Purpose Registers (GPRs),
• Multiply and Divide registers (HI and LO registers) that hold the results of
  integer multiply and divide,
• The SA register, which is used by the funnel shift instructions,
• The Program Counter (PC) register.

The COP0 registers control the processor state and report its status. These registers can
be read using the MFC0 instruction and written using the MTC0 instruction.

4.1 CPU Registers

The central processing unit (CPU) provides the following registers:

• 32 128-bit General Purpose Registers (GPR)
• Four registers that hold the results of integer multiply and divide operations
  (HI0, LO0, HI1, and LO1)
• Shift Amount (SA) register
• Program Counter

The C790 has 128-bit-wide General Purpose Registers (GPRs). The upper 64 bits of the
GPRs are only used by the C790-specific “Quad Load/Store” and “Multimedia (Parallel)”
instructions.

HI0 and LO0 are the standard 64-bit HI and LO registers. HI1 and LO1, which are the
upper 64 bits of the 128-bit HI and LO registers, are only used by the new multiply and
divide instructions, such as MULT1, MULTU1, DIV1, DIVU1, MADD1, MADDU1, MFHI1,
MFLO1, MTHI1, and MTLO1. All these instructions are equivalent to existing
instructions which operate on the HI0 and LO0 registers.

The Shift Amount (SA) register specifies the shift amount used by the funnel shift
instruction. The shaded registers in Figure 4-1 are new architecturally-visible registers
that are specific to the C790.

[Figure: General Purpose Registers $0 to $31, each 128 bits wide (bits 127:64 and 63:0);
HI and LO registers, each 128 bits wide, with HI1/LO1 in the upper 64 bits and
HI (HI0)/LO (LO0) in the lower 64 bits; SA register (bits 31:0); Program Counter (PC).]

Figure 4-1. CPU Registers

4.1.1 General Purpose Registers

The standard 64-bit CPU general purpose registers have been extended to 128-bit

registers. New instructions have been defined to use the upper 64-bits of these registers.

Two of the CPU general purpose registers have special assigned functions:

• r0 is hardwired to a value of zero, and can be used as the target register for any
  instruction whose result is to be discarded. r0 can also be used as a source when
  a zero value is needed.
• r31 is the link register used by the Jump and Link instructions. In general, it
  should not be used by other instructions.

4.1.2 HI and LO Registers

The standard 64-bit HI and LO registers have been extended to 128-bit registers. New
instructions have been defined to use the upper 64 bits of these registers.

HI0 and LO0 are the standard 64-bit HI and LO registers. HI1 and LO1 are the upper 64
bits of the 128-bit HI and LO registers.

These four registers (HI0, LO0, HI1, LO1) store:

• the product of integer multiply operations, or
• the accumulation of integer multiply-accumulate operations, or
• the quotient (in LO0 or LO1) and remainder (in HI0 or HI1) of integer divide
  operations.

4.1.3 Shift Amount (SA) Register

The SA register specifies the shift amount used by the funnel shift instruction. This is a
new architecturally-visible register and it needs to be saved and restored as part of the
processor state. New instructions have been defined to move values between this register
and the general purpose registers.

4.1.4 Program Counter (PC)

The Program Counter (PC) holds the address of the instruction which is being executed.
The PC is incremented automatically by 4 when the executed instruction is not a
control-transfer instruction (that is, not a branch, jump, ERET, SYSCALL, or TRAP).
Control-transfer instructions change the value of the PC to the target address specified
by them. An exception also changes the contents of the PC to the specified exception
vector address.

4.2 System Control Coprocessor (COP0) Registers

COP0 registers are listed in Table 4-1.

Table 4-1. Coprocessor 0 Registers

Register  Register
No.       Name        Description                                                      Purpose
0         Index       Programmable register to select TLB entry for reading or writing MMU
1         Random      Pseudo-random counter for TLB replacement                        MMU
2         EntryLo0    Low half of TLB entry for even PFN (Physical page number)        MMU
3         EntryLo1    Low half of TLB entry for odd PFN (Physical page number)         MMU
4         Context     Pointer to kernel virtual PTE table in 32-bit addressing mode    Exception
5         PageMask    Mask that sets the TLB page size                                 MMU
6         Wired       Number of wired TLB entries                                      MMU
7         (Reserved)  Undefined                                                        Undefined
8         BadVAddr    Bad virtual address                                              Exception
9         Count       Timer count                                                      Exception
10        EntryHi     High half of TLB entry (Virtual page number and ASID)            MMU
11        Compare     Timer compare                                                    Exception
12        Status      Processor Status Register                                        Exception
13        Cause       Cause of the last exception taken                                Exception
14        EPC         Exception Program Counter                                        Exception
15        PRId        Processor Revision Identifier                                    MMU
16        Config      Configuration Register                                           MMU
17        (Reserved)  Undefined                                                        Undefined
18        (Reserved)  Undefined                                                        Undefined
19        (Reserved)  Undefined                                                        Undefined
20        (Reserved)  Undefined                                                        Undefined
21        (Reserved)  Undefined                                                        Undefined
22        (Reserved)  Undefined                                                        Undefined
23        BadPAddr    Bad physical address                                             Exception
24        Debug       This is used for Debug function                                  Debug
25        Perf        Performance Counter and Control Register                         Exception
26        (Reserved)  Undefined                                                        Undefined
27        (Reserved)  Undefined                                                        Undefined
28        TagLo       Cache Tag register (low bits)                                    Cache
29        TagHi       Cache Tag register (high bits)                                   Cache
30        ErrorEPC    Error Exception Program Counter                                  Exception
31        (Reserved)  Undefined                                                        Undefined

4.2.1 Index Register (0)

Bits    Field   Width
31      P       1
30:6    0       25
5:0     Index   6

Figure 4-2. Index Register

The Index register is a 32-bit read/write register containing six bits to index an entry in
the TLB. The high-order bit of the register records the success or failure of a TLB Probe
(TLBP) instruction.

The Index register also specifies the TLB entry affected by TLB Read (TLBR) or TLB
Write Index (TLBWI) instructions.

Figure 4-2 shows the format of the Index register; Table 4-2 describes the Index register
fields.

Table 4-2. Index Register Field Description

Field  Bits  Description                                              Type        Initial Value
P      31    Probe failure. Set to 1 when the previous TLB Probe      Read/Write  Undefined
             (TLBP) instruction was unsuccessful.
Index  5:0   Index to the TLB entry affected by the TLB Read and      Read/Write  Undefined
             TLB Write instructions.
0      30:6  Reserved. Must be written as zeroes, and returns zeroes  Read-only   0
             when read.
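The field layout in Table 4-2 can be illustrated with a small Python decoder. This is a sketch for reading a value obtained from the Index register (for example after a TLBP); the function name and the dictionary return type are hypothetical.

```python
def decode_index(value):
    """Split a 32-bit Index register value into its P and Index fields."""
    return {
        "P": (value >> 31) & 0x1,   # bit 31: 1 means the previous TLBP failed
        "Index": value & 0x3F,      # bits 5:0: TLB entry number
    }


def probe_hit(value):
    """Return the matching TLB entry number, or None if the probe failed."""
    fields = decode_index(value)
    return None if fields["P"] else fields["Index"]
```

A value of 0x80000000 therefore reports a failed probe, while 0x00000005 reports a hit on TLB entry 5.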

4.2.2 Random Register (1)

Bits    Field    Width
31:6    0        26
5:0     Random   6

Figure 4-3. Random Register

The Random register is a read-only register. The least significant six bits index an entry
in the TLB. This register decrements every cycle an instruction is executed. Its value
ranges between an upper and a lower bound, as follows:

• A lower bound is set by the number of TLB entries reserved for exclusive use by
  the operating system (the contents of the Wired register).
• An upper bound is set by the total number of TLB entries (47 maximum).

The Random register specifies the entry in the TLB that is affected by the TLB Write
Random (TLBWR) instruction. The register does not need to be read for this purpose;
however, the register is readable to verify proper operation of the processor.

To simplify testing, the Random register is set to the value of the upper bound upon
system reset. This register is also set to the upper bound when the Wired register is
written.

Figure 4-3 shows the format of the Random register; Table 4-3 describes the Random
register fields.

Table 4-3. Random Register Fields

Field   Bits  Description                                           Type       Initial Value
Random  5:0   TLB Random index.                                     Read-only  Upper bound (47)
0       31:6  Reserved. Must be written as zeroes, and returns      Read-only  0
              zeroes when read.
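The bounded countdown described for the Random register can be sketched as follows. The text above states only that the value decrements and stays between the Wired value and 47; the wrap back to the upper bound after the lower bound is reached is an assumption borrowed from conventional MIPS behavior, and the function names are hypothetical.

```python
TLB_UPPER_BOUND = 47  # highest TLB entry number, per the text above


def next_random(current, wired):
    """Return the next Random value: decrement, wrapping to the upper bound.

    Assumption: once the lower bound (the Wired value) is reached, the
    register reloads with the upper bound.
    """
    if current <= wired:
        return TLB_UPPER_BOUND
    return current - 1


def random_sequence(wired, steps):
    """List the first `steps` Random values after reset or a Wired write."""
    value = TLB_UPPER_BOUND
    values = []
    for _ in range(steps):
        values.append(value)
        value = next_random(value, wired)
    return values
```

With Wired set to 45, the model cycles through entries 47, 46 and 45 only, so entries 0 to 44 are never selected by TLBWR.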

4.2.3 EntryLo0 Register (2), and EntryLo1 Register (3)

EntryLo0 and EntryLo1 (identical layout)

Bits    Field   Width
31:26   0       6
25:6    PFN     20
5:3     C       3
2       D       1
1       V       1
0       G       1

Figure 4-4. EntryLo0 and EntryLo1 Registers

The EntryLo0 and EntryLo1 registers are a pair of registers that share the same format:

• EntryLo0 is used for even virtual pages.
• EntryLo1 is used for odd virtual pages.

The EntryLo0 and EntryLo1 registers are read/write registers. They hold the physical
page frame number (PFN) of the TLB entry for even and odd pages, respectively, when
performing TLB read and write operations.

Figure 4-4 shows the format of the EntryLo0 and EntryLo1 registers; Table 4-4 describes
the EntryLo0 and EntryLo1 register fields.

Table 4-4. EntryLo0 and EntryLo1 Register Fields

Field  Bits   Description                                                      Type        Initial Value
PFN    25:6   Page frame number; the upper bits of the physical address.       Read/Write  Undefined
C      5:3    Specifies the TLB page coherency attribute.                      Read/Write  Undefined
              000(0): Reserved
              001(1): Reserved
              010(2): Uncached
              011(3): Cacheable, write-back, write allocate
              100(4): Reserved
              101(5): Reserved
              110(6): Reserved
              111(7): Uncached Accelerated
D      2      Dirty. If this bit is set, the page is marked as dirty and       Read/Write  Undefined
              therefore writable. This bit is actually a write-protect bit
              that software can use to prevent alteration of data.
V      1      Valid. If this bit is set, it indicates that the TLB entry is    Read/Write  Undefined
              valid; otherwise, a TLBL or TLBS miss will occur.
G      0      Global. If this bit is set in both EntryLo0 and EntryLo1, then   Read/Write  Undefined
              the processor ignores the ASID during TLB look-up.
0      31:26  Reserved. Must be written as zeroes, and returns zeroes when     Read-only   0
              read.
              EntryLo0[31] is reserved for Kernel use. It contains the
              written value. This bit has no effect on any CPU or TLB
              operation.

Reserved codes in the C field may not be written correctly into a TLB entry by the TLBWI
or TLBWR instruction.
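To make the bit positions in Table 4-4 concrete, the following Python sketch packs and unpacks an EntryLo value. The function and constant names are hypothetical; the sketch rejects the reserved C codes listed above instead of guessing how the hardware treats them, and it leaves the reserved bits 31:26 as zero.

```python
C_UNCACHED = 2
C_CACHEABLE_WRITE_BACK = 3
C_UNCACHED_ACCELERATED = 7
VALID_C_CODES = {C_UNCACHED, C_CACHEABLE_WRITE_BACK, C_UNCACHED_ACCELERATED}


def pack_entrylo(pfn, c, d, v, g):
    """Build an EntryLo0/EntryLo1 value from its PFN, C, D, V and G fields."""
    if c not in VALID_C_CODES:
        raise ValueError("reserved coherency code")
    return ((pfn & 0xFFFFF) << 6) | (c << 3) | ((d & 1) << 2) | ((v & 1) << 1) | (g & 1)


def unpack_entrylo(value):
    """Split an EntryLo0/EntryLo1 value into (pfn, c, d, v, g)."""
    return (
        (value >> 6) & 0xFFFFF,  # PFN, bits 25:6
        (value >> 3) & 0x7,      # C, bits 5:3
        (value >> 2) & 0x1,      # D, bit 2
        (value >> 1) & 0x1,      # V, bit 1
        value & 0x1,             # G, bit 0
    )
```

For instance, a valid, writable, cacheable, non-global mapping to page frame 0x12345 packs to 0x0048D15E.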

4.2.4 Context Register (4)

Bits    Field     Width
31:23   PTEBase   9
22:4    BadVPN2   19
3:0     0         4

Figure 4-5. Context Register Format

The Context register is a read/write register containing the pointer to an entry in the
page table entry (PTE) array. This array is an operating system data structure that stores
virtual-to-physical address translations. When there is a TLB miss, the CPU loads the
TLB with the missing translation from the PTE array. Normally, the operating system
uses the Context register to address the current page map which resides in the
kernel-mapped segment, kseg3. The Context register duplicates some of the information
provided in the BadVAddr register, but the information is arranged in a form that is more
useful for a software TLB exception handler. Figure 4-5 shows the format of the Context
register; Table 4-5 describes the Context register fields.

Table 4-5. Context Register Fields

Field    Bits   Description                                                  Type        Initial Value
PTEBase  31:23  This field is a read/write field for use by the operating    Read/Write  Undefined
                system. It is normally written with a value that allows the
                operating system to use the Context register as a pointer
                into the current PTE array in memory.
BadVPN2  22:4   This field is written by hardware on a miss. It contains     Read-only   Undefined
                the virtual page number (VPN) of the most recent virtual
                address that did not have a valid translation.
0        3:0    Reserved. Must be written as zeroes, and returns zeroes      Read-only   0
                when read.

The 19-bit BadVPN2 field contains bits 31:13 of the virtual address that caused the TLB

miss; bit 12 is excluded because a single TLB entry maps to an even-odd page pair. For a 4

KB page size, this format can directly address the pair-table of 8-byte PTEs. For other

page and PTE sizes, shifting and masking this value produces the appropriate address.
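The way the Context value doubles as a pointer can be checked with a short Python sketch. It places bits 31:13 of the faulting address into bits 22:4, so each even-odd page pair advances the value by 16 bytes (two 8-byte PTEs). The function name is hypothetical, and `pte_base` here stands for the 9-bit PTEBase field value, not a full address.

```python
def context_value(pte_base, bad_vaddr):
    """Compose the Context register from PTEBase and a faulting virtual address."""
    bad_vpn2 = (bad_vaddr >> 13) & 0x7FFFF      # bits 31:13 of the address
    return ((pte_base & 0x1FF) << 23) | (bad_vpn2 << 4)
```

With PTEBase = 0x180 and a miss at virtual address 0x00402ABC, the result is 0xC0002010. A miss at 0x00403ABC, the odd page of the same pair, gives the same value, which is why bit 12 is excluded from BadVPN2.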

4.2.5 PageMask Register (5)

Bits    Field   Width
31:25   0       7
24:13   MASK    12
12:0    0       13

Figure 4-6. PageMask Register

The PageMask register is a read/write register used for reading or writing the TLB. It
holds a comparison mask that sets the variable page size for each TLB entry, as shown in
Table 4-6.

Table 4-6. PageMask Register Field

Field  Bits    Description                                            Type        Initial Value
MASK   24:13   Page comparison mask.                                  Read/Write  Undefined
               0000 0000 0000: Page Size = 4 Kbytes
               0000 0000 0011: Page Size = 16 Kbytes
               0000 0000 1111: Page Size = 64 Kbytes
               0000 0011 1111: Page Size = 256 Kbytes
               0000 1111 1111: Page Size = 1 Mbyte
               0011 1111 1111: Page Size = 4 Mbytes
               1111 1111 1111: Page Size = 16 Mbytes
0      31:25,  Reserved. Must be written as zeroes, and returns       Read-only   0
       12:0    zeroes when read.

TLB read and write operations use this register as either a source or a destination; when

virtual addresses are presented for translation into physical address, the corresponding

bits in the TLB identify which virtual address bits among bits 24:13 are used in the

comparison. When the Mask field is not one of the values shown in Table 4-6, the

operation of the TLB is undefined.
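The seven encodings in Table 4-6 follow a simple pattern: the page size equals (MASK + 1) × 4 Kbytes. The Python sketch below decodes a PageMask register value on that basis and rejects any other MASK value, since the TLB behavior is undefined for them. The function name and the choice to raise an exception are illustrative assumptions.

```python
VALID_MASKS = (0x000, 0x003, 0x00F, 0x03F, 0x0FF, 0x3FF, 0xFFF)


def page_size_bytes(pagemask):
    """Return the page size selected by a PageMask register value."""
    mask = (pagemask >> 13) & 0xFFF          # MASK field, bits 24:13
    if mask not in VALID_MASKS:
        raise ValueError("MASK value not listed in Table 4-6")
    return (mask + 1) * 4096
```

For example, a register value of 0x00006000 (MASK = 0000 0000 0011) selects 16-Kbyte pages.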

4.2.6 Wired Register (6)

Bits    Field   Width
31:6    0       26
5:0     Wired   6

Figure 4-7. Wired Register

The Wired register is a read/write register that specifies the boundary between the wired
and random entries of the TLB as shown in Figure 4-8. Wired entries are fixed,
non-replaceable entries which cannot be overwritten by a TLB write operation. Random
entries can be overwritten. Figure 4-7 shows the format of the Wired register. Table 4-7
describes the register fields.

The Wired register is set to 0 upon system reset. Writing this register also sets the
Random register to the value of its upper bound as shown in Figure 4-8.

[Figure: TLB entries 0 to 47. Entries below the Wired register value are wired entries;
entries from the Wired register value up to 47 are random entries.]

Figure 4-8. Wired Register Boundary

Writing a value greater than 47 into this register produces undefined results.

Table 4-7. Wired Register Field Descriptions

Field  Bits  Description                                          Type        Initial Value
Wired  5:0   TLB Wired boundary (the number of wired TLB          Read/Write  0
             entries)
0      31:6  Reserved. Must be written as zeroes, and returns     Read-only   0
             zeroes when read.

4.2.7 BadVAddr Register (8)

31 0

BadVAddr

32

Figure 4-9. BadVAddr Register

The Bad Virtual Address register (BadVAddr) is a read-only register that displays the
most recent virtual address that caused one of the following exceptions: TLB Invalid, TLB
Modified, TLB Refill, or Address Error.

Figure 4-9 shows the format of the BadVAddr register; Table 4-8 describes the register
fields.

Table 4-8. BadVAddr Register Field

Field     Bits  Description                                                Type       Initial Value
BadVAddr  31:0  The most recent virtual address that caused a TLB Invalid, Read-only  Undefined
                TLB Modified, TLB Refill, or Address Error exception.

Note: The BadVAddr register does not save any information for bus errors, since bus
errors are not addressing errors.

4.2.8 Count Register (9)

31 0

Count

32

Figure 4-10. Count Register

The Count register acts as a real-time timer. It is incremented every CPU clock cycle. The
timer interrupt signaled through IP[7] can be disabled through the interrupt mask bit,
IM[7]. This register can be read or written.

Figure 4-10 shows the format of the Count register. Table 4-9 describes the register fields.

Table 4-9. Count Register Field

Field  Bits  Description                                       Type        Initial Value
Count  31:0  32-bit timer, incrementing at the CPU clock rate. Read/Write  Undefined

4.2.9 EntryHi Register (10)

Bits    Field   Width
31:13   VPN2    19
12:8    0       5
7:0     ASID    8

Figure 4-11. EntryHi Register

The EntryHi register holds the high-order bits of a TLB entry for TLB read and write
operations. The EntryHi register is accessed by the TLB Probe, TLB Write Random, TLB
Write Indexed, and TLB Read Indexed instructions.

When either a TLB Refill, TLB Invalid, or TLB Modified exception occurs, the EntryHi
register is loaded with the virtual page number (VPN2) and the ASID of the virtual
address that did not have a matching TLB entry.

Figure 4-11 shows the format of the EntryHi register. Table 4-10 describes the register
fields.

Table 4-10. EntryHi Register Fields

Field  Bits   Description                                             Type        Initial Value
VPN2   31:13  Virtual page number divided by two (maps to two         Read/Write  Undefined
              pages).
ASID   7:0    Address space ID field. An 8-bit field that lets        Read/Write  Undefined
              multiple processes share the TLB; each process can
              have a distinct mapping of otherwise identical virtual
              page numbers.
0      12:8   Reserved. Must be written as zeroes, and returns        Read-only   0
              zeroes when read.
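A short Python sketch shows how an EntryHi value relates to a virtual address and an ASID, using the bit positions from Table 4-10. The function names are hypothetical, and the sketch only covers field packing; it does not model the TLB comparison itself or the effect of the PageMask register.

```python
def make_entryhi(vaddr, asid):
    """Compose an EntryHi value from a 32-bit virtual address and an 8-bit ASID."""
    vpn2 = (vaddr >> 13) & 0x7FFFF          # address bits 31:13
    return (vpn2 << 13) | (asid & 0xFF)     # bits 12:8 stay zero (reserved)


def split_entryhi(value):
    """Return (vpn2, asid) from an EntryHi value."""
    return (value >> 13) & 0x7FFFF, value & 0xFF
```

For virtual address 0x00402ABC and ASID 0x2A this gives 0x0040202A, that is, VPN2 = 0x201 and ASID = 0x2A.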

4.2.10 Compare Register (11)

31 0

Compare

32

Figure 4-12. Compare Register

The Compare register acts as a timer (see also the Count register); it maintains a stable
value that does not change on its own. When the value of the Count register equals the
value of the Compare register, interrupt bit IP[7] in the Cause register is set. This causes
an interrupt as soon as the interrupt is enabled. Writing a value to the Compare register,
as a side effect, clears the timer interrupt.

For diagnostic purposes, the Compare register is a read/write register. In normal use,
however, the Compare register is write-only. Figure 4-12 shows the format of the Compare
register. Table 4-11 describes the register fields.

Table 4-11. Compare Register Field

Field    Bits  Description                                                Type        Initial Value
Compare  31:0  The Compare register holds a stable value that is          Read/Write  Undefined
               compared to the Count register. When the value of the
               Count register equals the value of the Compare register,
               interrupt IP[7] occurs.
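The interaction between Count and Compare can be modeled with a small Python class. This is a behavioral sketch under stated assumptions: the class and method names are hypothetical, one call to `tick()` stands for one CPU clock cycle, and Count is assumed to wrap from 0xFFFFFFFF to 0 because it is a 32-bit register.

```python
MASK32 = 0xFFFFFFFF


class TimerModel:
    """Toy model of the Count/Compare timer and the IP[7] pending bit."""

    def __init__(self, count=0, compare=0):
        self.count = count & MASK32
        self.compare = compare & MASK32
        self.ip7 = False

    def tick(self):
        """Advance Count by one cycle; set IP[7] when Count equals Compare."""
        self.count = (self.count + 1) & MASK32
        if self.count == self.compare:
            self.ip7 = True

    def write_compare(self, value):
        """Writing Compare also clears the pending timer interrupt."""
        self.compare = value & MASK32
        self.ip7 = False
```

A handler that wants a periodic interrupt would typically rewrite Compare with a later value on each timer interrupt, which both rearms the timer and clears IP[7].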

4.2.11 Status Register (12)

Bits    Field         Width
31:28   CU (CU[3:0])  4
27      0             1
26      FR            1
25:24   0             2
23      DEV           1
22      BEV           1
21:19   0             3
18      CH            1
17      EDI           1
16      EIE           1
15      IM[7]         1
14:13   0             2
12      BEM           1
11:10   IM[3:2]       2
9:5     0             5
4:3     KSU           2
2       ERL           1
1       EXL           1
0       IE            1

Figure 4-13. Status Register

The Status register (SR) is a read/write register that contains the operating mode,
interrupt enabling, and the diagnostic states of the processor. Figure 4-13 shows the
format of the Status register. The more important Status register fields are described
below:

• The 3-bit Interrupt Mask (IM) field controls the enabling of three interrupt
signals. Interrupts must be enabled before they can be asserted. Interrupts are
recognized by the processor when the corresponding bits are set in both the
Interrupt Mask and the Interrupt Enable fields of the Status register and the
Interrupt Pending field of the Cause register. The C790 does not support
software interrupts. IM[7] corresponds to the internal timer interrupt and IM[3:2]
corresponds to the Int[1:0] signals.

• The 4-bit Coprocessor Usability (CU) field (CU[3:0]) controls the usability of four
possible coprocessors. Regardless of the CU[0] bit setting, COP0 is always
usable in Kernel mode. For all other cases, an access to an unusable coprocessor
causes an exception. The C790 supports coprocessor 1 (FPU).

4.2.11.1 Status Register Format

Table 4-12 describes the Status register fields. All bits in the Status register are readable
and writable.

Table 4-12. Status Register Fields

Field | Bits | Description | Type | Initial Value
CU (CU[3:0]) | 31:28 | Controls the usability of each of the four coprocessor unit numbers. COP0 is always usable when in Kernel mode, regardless of the setting of the CU[0] bit. 1 → usable; 0 → unusable | Read/Write | Undefined
FR | 26 | Enable additional floating point registers. 0 → 16 registers; 1 → 32 registers | Read/Write | 0
DEV | 23 | Controls the location of Performance counter and debug/SIO exception vectors. 0 → normal; 1 → bootstrap | Read/Write | Undefined
BEV | 22 | Controls the location of TLB refill and general exception vectors. 0 → normal; 1 → bootstrap | Read/Write | 1
CH | 18 | Cache Hit (tag match and valid state) or Miss indication for last CACHE Hit Invalidate and CACHE Hit Write-back Invalidate for the Data cache. 0 → miss; 1 → hit | Read/Write | Undefined
EDI | 17 | EI/DI instruction Enable: When this bit is set, the EI and DI instructions can operate in User, Supervisor and Kernel modes and as such set or clear the EIE bit to enable or disable all interrupts (except NMI). When this bit is cleared, EI and DI operate as NOPs in User and Supervisor modes and execute properly in Kernel mode. | Read/Write | Undefined
EIE | 16 | Enable IE: This bit enables or disables the IE (Interrupt Enable) bit. This bit is cleared by the DI instruction and set by the EI instruction. 0 → disables all interrupts regardless of the value of the IE bit. 1 → enables the IE bit. (All interrupts are enabled if IE=1, EXL=0, and ERL=0.) Note: IM enables individual interrupts. | Read/Write | Undefined
IM[7,3:2] | 15, 11:10 | Interrupt Mask: controls the enabling of each of the external and internal interrupts. An interrupt is taken if interrupts are enabled, and the corresponding bits are set in both the Interrupt Mask field of the Status register and the Interrupt Pending field of the Cause register. 0 → disabled; 1 → enabled. Note: The enabling of this bit is valid only when EIE=1, IE=1, EXL=0 and ERL=0. | Read/Write | Undefined
BEM | 12 | Bus Error Mask: controls the updating of the BadPAddr register and signaling a bus error exception. 0 → update BadPAddr and signal a bus error exception. 1 → do not update BadPAddr and stop signaling a bus error exception. This bit is set to 1 when it is a 0 and a bus error is signaled. | Read/Write | Undefined
KSU | 4:3 | Kernel/Supervisor/User Mode bits (binary): 00 → Kernel; 01 → Supervisor; 10 → User; 11 → Reserved | Read/Write | Undefined
ERL | 2 | Error Level: set by the processor when a Reset, NMI, performance counter, SIO or debug exception is taken. 0 → normal; 1 → error | Read/Write | 1
EXL | 1 | Exception Level: set by the processor when any exception other than Reset, NMI, performance counter, or debug exception is taken. 0 → normal; 1 → exception | Read/Write | Undefined
IE | 0 | Interrupt Enable. 0 → disables all interrupts; 1 → enables all interrupts (if EIE=1, ERL=0, and EXL=0) | Read/Write | Undefined
0 | 27, 25:24, 21:19, 14:13, 9:5 | Reserved. Must be written as zeroes, and returns zeroes when read. | Read-only | 0

4.2.11.2 Status Register Modes and Access States

Fields of the Status register set the modes and access states below.

Interrupt Enable: Interrupts are enabled when all of the following conditions are true:
• Status.IE = 1,
• and Status.EIE = 1,
• and Status.EXL = 0,
• and Status.ERL = 0
If these conditions are met, setting the IM bits enables the appropriate interrupts.
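The enable conditions above reduce to a pair of bit tests. The following is a minimal sketch using the bit positions from Table 4-12; the macro and function names are hypothetical, and the code operates on plain register values rather than reading COP0 itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Status register bit positions (Table 4-12). */
#define SR_IE   (1u << 0)
#define SR_EXL  (1u << 1)
#define SR_ERL  (1u << 2)
#define SR_EIE  (1u << 16)
#define SR_IM   ((1u << 15) | (3u << 10))  /* IM[7] and IM[3:2] */

/* True when IE = 1, EIE = 1, EXL = 0 and ERL = 0. */
static bool interrupts_enabled(uint32_t status)
{
    return (status & (SR_IE | SR_EIE | SR_EXL | SR_ERL)) == (SR_IE | SR_EIE);
}

/* True when some source has both its IM bit (Status) and its IP bit
   (Cause) set; IP occupies the same bit positions as IM. */
static bool interrupt_taken(uint32_t status, uint32_t cause)
{
    return interrupts_enabled(status) && (status & cause & SR_IM) != 0;
}
```

For example, a Status value with EIE, IM[7] and IE set (0x00018001) combined with a Cause value that has IP[7] set (0x00008000) yields `true` from `interrupt_taken`.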

SIO Enable: A level 2 exception by SIO is enabled when the following condition is true:
• Status.ERL = 0
If this condition is met, asserting the SIO signal causes a Debug exception to occur.

Operating Modes: The following CPU Status register bit settings are required for User,
Kernel, and Supervisor modes.
• The processor is in User mode when KSU = 10 (binary) and EXL = 0 and ERL = 0.
• The processor is in Supervisor mode when KSU = 01 (binary) and EXL = 0 and ERL = 0.
• The processor is in Kernel mode when KSU = 00 (binary) or EXL = 1 or ERL = 1.
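The three mode rules above can be expressed as a single decode function. This is a sketch under the assumption that the caller already holds a copy of the Status register value; the `cpu_mode` enum and `operating_mode` name are hypothetical.

```c
#include <stdint.h>

typedef enum { MODE_KERNEL, MODE_SUPERVISOR, MODE_USER, MODE_RESERVED } cpu_mode;

#define SR_EXL      (1u << 1)
#define SR_ERL      (1u << 2)
#define SR_KSU(sr)  (((sr) >> 3) & 0x3u)   /* Status.KSU, bits 4:3 */

/* EXL or ERL forces Kernel mode regardless of KSU. */
static cpu_mode operating_mode(uint32_t status)
{
    if (status & (SR_EXL | SR_ERL))
        return MODE_KERNEL;
    switch (SR_KSU(status)) {
    case 0:  return MODE_KERNEL;
    case 1:  return MODE_SUPERVISOR;
    case 2:  return MODE_USER;
    default: return MODE_RESERVED;   /* KSU = 11 is reserved */
    }
}
```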

Kernel Address Space Accesses: Access to the kernel address space is allowed when the
processor is in Kernel mode.

Supervisor Address Space Accesses: Access to the supervisor address space is allowed
when the processor is in Kernel mode or Supervisor mode, as described above.

User Address Space Accesses: Access to the user address space is allowed in Kernel,
Supervisor, and User modes.

4.2.12 Cause Register (13)

Bits   31  30   29:28  27:19  18:16  15     14:13  12    11:10    9:7  6:2      1:0
Field  BD  BD2  CE     0      EXC2   IP[7]  0      SIOP  IP[3:2]  0    ExcCode  0
Width  1   1    2      9      3      1      2      1     2        3    5        2

Figure 4-14. Cause Register

The 32-bit read-only Cause register describes the cause of the most recent exception.
Figure 4-14 shows the fields of this register. Table 4-13 describes the Cause register fields.
All bits in the Cause register are read-only.

Table 4-13. Cause Register Fields

Field | Bits | Description | Type | Initial Value
BD | 31 | Set by the processor when any exception other than Reset, NMI, performance counter, or debug occurs and is taken in a branch delay slot. 1 → delay slot; 0 → normal | Read-only | Undefined
BD2 | 30 | Indicates whether the last NMI, performance counter, debug, or SIO exception taken occurred in a branch delay slot. 1 → delay slot; 0 → normal | Read-only | Undefined
CE | 29:28 | Coprocessor unit number referenced when a Coprocessor Unusable exception is taken. | Read-only | Undefined
EXC2 | 18:16 | Indicates the exception codes for level 2 exceptions (Performance Counter, Reset, Debug, SIO and NMI exceptions). See the EXC2 codes below. | Read-only | Undefined
IP[7,3:2] | 15, 11:10 | Indicates an interrupt is pending. 1 → interrupt pending; 0 → no interrupt | Read-only | Undefined, Int[1:0]
SIOP | 12 | Indicates an SIO signal is pending. 1 → SIO signal is pending; 0 → no SIO signal is pending | Read-only | SIO
ExcCode | 6:2 | Exception code field. See the ExcCode codes below. | Read-only | Undefined
0 | 27:19, 14:13, 9:7, 1:0 | Reserved. Must be written as zeroes, and returns zeroes when read. | Read-only | 0

EXC2 codes:
  000 (0):    Res (Reset)
  001 (1):    NMI (Non-maskable Interrupt)
  010 (2):    PerfC (Performance Counter)
  011 (3):    Dbg (Debug) and SIO (SIO)
  1xx (4-7):  Reserved

ExcCode codes:
  00000 (0):  Int (Interrupt)
  00001 (1):  Mod (TLB modification exception)
  00010 (2):  TLBL (TLB exception (load or instruction fetch))
  00011 (3):  TLBS (TLB exception (store))
  00100 (4):  AdEL (Address error exception (load or instruction fetch))
  00101 (5):  AdES (Address error exception (store))
  00110 (6):  IBE (Bus error exception (instruction fetch))
  00111 (7):  DBE (Bus error exception (data reference: load or store))
  01000 (8):  Sys (Syscall exception)
  01001 (9):  Bp (Breakpoint exception)
  01010 (10): RI (Reserved instruction exception)
  01011 (11): CpU (Coprocessor Unusable exception)
  01100 (12): Ov (Arithmetic overflow exception)
  01101 (13): Tr (Trap exception)
  01110 (14): Reserved
  01111 (15): FPE (Floating-Point exception)
  (16-31):    Reserved
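As an illustration of how the Cause fields in Table 4-13 might be extracted in software, the sketch below unpacks a raw 32-bit Cause value. The `cause_fields` struct and `decode_cause` function are hypothetical names; only the bit positions come from the table.

```c
#include <stdbool.h>
#include <stdint.h>

/* Selected Cause register fields (Table 4-13). */
typedef struct {
    bool     bd;        /* bit 31: level 1 exception in a branch delay slot */
    bool     bd2;       /* bit 30: level 2 exception in a branch delay slot */
    unsigned ce;        /* bits 29:28: coprocessor number (CpU exception)   */
    unsigned exc2;      /* bits 18:16: level 2 exception code               */
    unsigned exc_code;  /* bits 6:2: level 1 exception code                 */
} cause_fields;

static cause_fields decode_cause(uint32_t cause)
{
    cause_fields f;
    f.bd       = (cause >> 31) & 1u;
    f.bd2      = (cause >> 30) & 1u;
    f.ce       = (cause >> 28) & 0x3u;
    f.exc2     = (cause >> 16) & 0x7u;
    f.exc_code = (cause >> 2) & 0x1Fu;
    return f;
}
```

For instance, the value 0x9000002C decodes to BD = 1, CE = 1 and ExcCode = 11 (CpU), which would correspond to a COP1 instruction in a branch delay slot while CU[1] is clear.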

4.2.13 EPC Register (14)

Bits   31:0
Field  EPC
Width  32

Figure 4-15. EPC Register

The Exception Program Counter (EPC) is a read/write register that contains the address
at which processing resumes after an exception has been serviced.

For synchronous exceptions, the EPC register contains either:
• the virtual address of the instruction that was the direct cause of the exception,
or
• the virtual address of the immediately preceding branch or jump instruction
(when the instruction is in a branch delay slot, and the BD bit in the Cause
register is set).
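A handler that needs the address of the instruction that actually raised the exception can combine EPC with Cause.BD. The sketch below assumes 4-byte instructions, so the delay slot sits at the branch address plus 4; the function name is hypothetical.

```c
#include <stdint.h>

#define CAUSE_BD (1u << 31)   /* Cause.BD: exception taken in a delay slot */

/* When BD is set, EPC holds the branch address and the offending
   instruction is the delay slot that follows it. */
static uint32_t faulting_instruction(uint32_t epc, uint32_t cause)
{
    return (cause & CAUSE_BD) ? epc + 4u : epc;
}
```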

On the occurrence of an exception, if the EXL bit in the Status register is set to 1, the
processor does not update the EPC register. Figure 4-15 shows the format of the EPC
register. Table 4-14 describes the EPC register fields.

Table 4-14. EPC Register Field

Field | Bits | Description | Type | Initial Value
EPC | 31:0 | Contains the address at which processing can resume after an exception has been serviced. | Read/Write | Undefined

4.2.14 PRId Register (15)

Bits   31:16  15:8  7:0
Field  0      Imp   Rev
Width  16     8     8

Figure 4-16. PRId Register

The 32-bit read-only Processor Revision Identifier (PRId) register contains information
identifying the implementation and revision level of the C790 and COP0. Figure 4-16
shows the format of the PRId register; Table 4-15 describes the PRId register fields.

The low-order byte (bits 7:0) of the PRId register is interpreted as a revision number, and
the high-order byte (bits 15:8) is interpreted as an implementation number. The
implementation number of the C790 processor is 0x38. The contents of the high-order
halfword (bits 31:16) of the register are reserved.

The revision number is stored as a value in the form y.x, where y is a major revision number
in bits 7:4 and x is a minor revision number in bits 3:0.

The revision number can distinguish some chip revisions, but there is no guarantee that
changes to the chip will necessarily be reflected in the PRId register, or that changes to
the revision number necessarily reflect real chip changes. For this reason, these values are
not listed and software should not rely on the revision number in the PRId register to
characterize the chip.
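The PRId layout described above can be unpacked with three shifts and masks. This is a sketch only: the helper names are hypothetical, and the sample value used in the note below is made up for illustration, since the manual does not list actual revision numbers.

```c
#include <stdint.h>

#define PRID_IMP_C790 0x38u   /* implementation number given in the text */

/* Implementation number, bits 15:8. */
static unsigned prid_imp(uint32_t prid)       { return (prid >> 8) & 0xFFu; }
/* Major revision y, bits 7:4. */
static unsigned prid_rev_major(uint32_t prid) { return (prid >> 4) & 0xFu; }
/* Minor revision x, bits 3:0. */
static unsigned prid_rev_minor(uint32_t prid) { return prid & 0xFu; }
```

With a hypothetical PRId value of 0x00003821, these helpers return implementation 0x38 and revision 2.1. In line with the caution above, software would normally test only `prid_imp(prid) == PRID_IMP_C790`.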

Table 4-15. PRId Register Fields

Field | Bits | Description | Type | Initial Value
Imp | 15:8 | Implementation number | Read-only | 0x38
Rev | 7:0 | Revision number of each mask | Read-only | Revision number
0 | 31:16 | Reserved. Must be written as zeroes, and returns zeroes when read. | Read-only |

4.2.15 Config Register (16)

Bits   31  30:28  27:19  18   17   16   15  14  13   12   11:9  8:6  5:3  2:0
Field  0   EC     0      DIE  ICE  DCE  BE  0   NBE  BPE  IC    DC   0    K0
Width  1   3      9      1    1    1    1   1   1    1    3     3    3    3

Figure 4-17. Config Register Format

The Config register specifies various configuration options which can be selected. Figure
4-17 shows the format of the Config register; Table 4-16 describes the Config register fields.

Some configuration options, as defined by Config bits 30:28, 15 and 11:6, are set by the
hardware during reset and are included in the Config register as read-only status bits for
the software to access. Other configuration options, such as bits 18:16 and 13:12, are set by
hardware during reset and can be modified by software. The remaining options, such as
bits 2:0, are read/write and controlled by software; on reset these fields are undefined.

Table 4-16. Config Register Fields

Field | Bits | Description | Type | Initial Value
EC | 30:28 | Bus clock ratio. 000: processor clock frequency divided by 2; 001 ~ 111: (Reserved) | Read-only | 0
DIE | 18 | Double issue enable. 0 → Single issue; 1 → Double issue | Read/Write | 0
ICE | 17 | Setting this bit to 1 enables the instruction cache. 0 → Instruction cache disable; 1 → Instruction cache enable. The CACHE instruction for the instruction cache is enabled regardless of the value of this bit. | Read/Write | 0
DCE | 16 | Setting this bit to 1 enables the data cache. 0 → Data cache disable; 1 → Data cache enable. If the cache is disabled, the PREF instruction becomes a NOP. | Read/Write | 0
BE | 15 | Big Endian. 0 → Little Endian; 1 → Big Endian | Read-only | Pin
NBE | 13 | Setting this bit to 1 enables non-blocking load. 0 → Disable non-blocking loads and hit under miss; 1 → Enable non-blocking loads and hit under miss | Read/Write | 0
BPE | 12 | Setting this bit to 1 enables branch prediction. 0 → Disable Branch Prediction; 1 → Enable Branch Prediction | Read/Write | 0
IC | 11:9 | Instruction cache size (Instruction cache size = 2^(12+IC) bytes). 011 → 32 KB | Read-only | 011
DC | 8:6 | Data cache size (Data cache size = 2^(12+DC) bytes). 011 → 32 KB | Read-only | 011
K0 | 2:0 | kseg0 coherency algorithm. 000: Reserved; 001: Reserved; 010: Uncached; 011: Cacheable, write-back, write allocate; 100: Reserved; 101: Reserved; 110: Reserved; 111: Uncached Accelerated | Read/Write | Undefined
0 | 31, 27:19, 14, 5:3 | Reserved. Must be written as zeroes, and returns zeroes when read. | Read-only | 0

With single issue enabled (DIE = 0), the C790 always fetches two instructions but only

issues a single instruction.
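The cache size encoding in Table 4-16 (size = 2^(12+IC) or 2^(12+DC) bytes) can be checked with a short calculation. The sketch below uses hypothetical helper names and takes the Config value as a plain integer.

```c
#include <stdint.h>

/* Instruction cache size in bytes: 2^(12 + Config.IC), IC in bits 11:9. */
static uint32_t icache_bytes(uint32_t config)
{
    return 1u << (12u + ((config >> 9) & 0x7u));
}

/* Data cache size in bytes: 2^(12 + Config.DC), DC in bits 8:6. */
static uint32_t dcache_bytes(uint32_t config)
{
    return 1u << (12u + ((config >> 6) & 0x7u));
}
```

With IC = DC = 011 (binary), as listed for the C790, both helpers return 32768 bytes, matching the 32 KB entries in the table.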

4.2.16 BadPAddr Register (23)

Bits   31:4     3:0
Field  BdPAddr  0
Width  28       4

Figure 4-18. BadPAddr Register Format

The Bad Physical Address register (BadPAddr) is a read-only register that contains the
most recent physical address that caused a bus error. It is updated with a new value
whenever Status.BEM is clear (0). Once this bit is set (on the occurrence of a bus error)
the register holds the value.

Figure 4-18 shows the BadPAddr register format; Table 4-17 describes the register fields.

Table 4-17. BadPAddr Register Fields

Field | Bits | Description | Type | Initial Value
BdPAddr | 31:4 | Physical address value | Read-only | Undefined
0 | 3:0 | Reserved. Returns zeroes when read. | Read-only | 0

4.2.17 Debug Registers (24)

There are seven separately addressable debug registers, which are all assigned to COP0,
register 24.

Each of the seven registers is accessed by specifying a subaccess code, which occupies
bits 2:0 of the instruction code.

Breakpoint Control Register (BPC) (subaccess code 0)

Bits   31   30   29   28   27  26   25   24   23   22  21   20   19   18   17   16   15   14:3  2    1    0
Field  IAE  DRE  DWE  DVE  0   IUE  ISE  IKE  IXE  0   DUE  DSE  DKE  DXE  ITE  DTE  BED  0     DWB  DRB  IAB

See Table 13-3 for a detailed description of individual BPC register fields.

Instruction Address Breakpoint (IAB) (subaccess code 2)

Bits   31:2  1:0
Field  IAB   0
Width  30    2

Instruction Address Breakpoint Mask Register (IABM) (subaccess code 3)

Bits   31:2  1:0
Field  IABM  0
Width  30    2

Data Address Breakpoint Register (DAB) (subaccess code 4)

Bits   31:0
Field  DAB
Width  32

Data Address Breakpoint Mask Register (DABM) (subaccess code 5)

Bits   31:0
Field  DABM
Width  32

Data Value Breakpoint Register (DVB) (subaccess code 6)

Bits   31:0
Field  DVB
Width  32

Data Value Breakpoint Mask Register (DVBM) (subaccess code 7)

Bits   31:0
Field  DVBM
Width  32

4.2.18 Performance Counter Registers (25)

There are three separately addressable performance counter registers, which are all
assigned to COP0, register 25.

Each of the three registers is accessed by specifying a subaccess code, which occupies
bits 1:0 of the instruction code.

All performance counter registers are read/write registers.

Performance Counter Control Register (PCCR)

Bits   31   30:20  19:15   14  13  12  11    10  9:5     4   3   2   1     0
Field  CTE  0      EVENT1  U1  S1  K1  EXL1  0   EVENT0  U0  S0  K0  EXL0  0
Width  1    11     5       1   1   1   1     1   5       1   1   1   1     1

Performance Counter Register 0 (PCR0)

Bits   31    30:0
Field  OVFL  VALUE
Width  1     31

Performance Counter Register 1 (PCR1)

Bits   31    30:0
Field  OVFL  VALUE
Width  1     31

Figure 4-19. Performance Counter Registers

Table 4-18 lists the field definitions for the Performance Counter Control register.

Table 4-18. Performance Counter Control Register Fields

Field | Bits | Description | Type | Initial Value
CTE | 31 | Enables event counting (CTR1, CTR0) and exception generation. 0 → Disable; 1 → Enable | Read/Write | 0
EVENT1 | 19:15 | Sets the event to be monitored by PCR1. See the EVENT1 codes below. | Read/Write | Undefined
EVENT0 | 9:5 | Sets the event to be monitored by PCR0. See the EVENT0 codes below. | Read/Write | Undefined
U1, U0 | 14, 4 | Enables event counting (PCR1/PCR0) in User mode. 0 → Disable; 1 → Enable | Read/Write | Undefined
S1, S0 | 13, 3 | Enables event counting (PCR1/PCR0) in Supervisor mode. 0 → Disable; 1 → Enable | Read/Write | Undefined
K1, K0 | 12, 2 | Enables event counting (PCR1/PCR0) in Kernel mode. 0 → Disable; 1 → Enable | Read/Write | Undefined
EXL1, EXL0 | 11, 1 | Enables event counting (PCR1/PCR0) when the EXL bit is set in the Status register. 0 → Disable; 1 → Enable | Read/Write | Undefined
0 | 30:20, 10, 0 | Reserved. Must be written as zero, and returns zero when read. | Read-only | 0

EVENT1 codes:
  00000 (0)   Low-order branch issued
  00001 (1)   Processor cycle
  00010 (2)   Dual instruction issue
  00011 (3)   Branch mispredicted
  00100 (4)   TLB miss
  00101 (5)   DTLB miss
  00110 (6)   Data cache miss
  00111 (7)   WBB single request unavailable
  01000 (8)   WBB burst request unavailable
  01001 (9)   WBB burst request almost full
  01010 (10)  WBB burst request full
  01011 (11)  CPU data bus busy
  01100 (12)  Instruction completed
  01101 (13)  Non-BDS instruction completed
  01110 (14)  COP1 instruction completed
  01111 (15)  Store completed
  10000 (16)  No event
  (17-31)     Reserved

EVENT0 codes:
  00000 (0)   Reserved
  00001 (1)   Processor cycle
  00010 (2)   Single instruction issue
  00011 (3)   Branch issue
  00100 (4)   BTAC miss
  00101 (5)   ITLB miss
  00110 (6)   Instruction cache miss
  00111 (7)   DTLB accessed
  01000 (8)   Non-blocking load
  01001 (9)   WBB single request
  01010 (10)  WBB burst request
  01011 (11)  CPU address bus busy
  01100 (12)  Instruction completed
  01101 (13)  Non-BDS instruction completed
  01110 (14)  Reserved
  01111 (15)  Load completed
  10000 (16)  No event
  (17-31)     Reserved

Table 4-19 lists the field definitions for the Performance Counter register 0 (PCR0).

Table 4-19. Performance Counter Register 0 Fields

Field | Bits | Description | Type | Initial Value
OVFL | 31 | Overflow flag | Read/Write | Undefined
VALUE | 30:0 | The actual counter | Read/Write | Undefined

Table 4-20 lists the field definitions for the Performance Counter register 1 (PCR1).

Table 4-20. Performance Counter Register 1 Fields

Field | Bits | Description | Type | Initial Value
OVFL | 31 | Overflow flag | Read/Write | Undefined
VALUE | 30:0 | The actual counter | Read/Write | Undefined
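To illustrate how the PCCR fields in Table 4-18 combine, the sketch below builds one possible control word and splits a PCR value into its flag and count. The macro and function names are hypothetical, and the choice of events (data cache miss on PCR1, instruction cache miss on PCR0, counted in User and Kernel modes) is only an example.

```c
#include <stdint.h>

/* PCCR field encodings (Table 4-18). */
#define PCCR_CTE        (1u << 31)
#define PCCR_EVENT1(e)  (((uint32_t)(e) & 0x1Fu) << 15)
#define PCCR_U1         (1u << 14)
#define PCCR_K1         (1u << 12)
#define PCCR_EVENT0(e)  (((uint32_t)(e) & 0x1Fu) << 5)
#define PCCR_U0         (1u << 4)
#define PCCR_K0         (1u << 2)

/* Example setting: PCR1 counts data cache misses (EVENT1 = 6) and PCR0
   counts instruction cache misses (EVENT0 = 6), in User and Kernel modes. */
static uint32_t pccr_cache_miss_config(void)
{
    return PCCR_CTE
         | PCCR_EVENT1(6) | PCCR_U1 | PCCR_K1
         | PCCR_EVENT0(6) | PCCR_U0 | PCCR_K0;
}

/* Split a PCR0/PCR1 value into its overflow flag and 31-bit count. */
static unsigned pcr_overflow(uint32_t pcr) { return pcr >> 31; }
static uint32_t pcr_value(uint32_t pcr)    { return pcr & 0x7FFFFFFFu; }
```

The reserved bits (30:20, 10 and 0) stay zero in the resulting word, as the table requires.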

4.2.19 TagLo (28) and TagHi (29) Registers

TagLo

Bits   31:12   11:7         6  5  4  3  2:0
Field  PTagLo  Special use  D  V  R  L  Su
Width  20      5            1  1  1  1  3

TagHi

Bits   31:0
Field  Special use
Width  32

Figure 4-20. TagLo and TagHi Registers

The TagLo and TagHi registers are 32-bit read/write registers used by the CACHE
instruction. For writing to the data cache tags, the TagLo register contains the fields as
shown above and the TagHi register is not used. For writing to the data cache data portion,
the TagLo register contains the data value. For writing to the instruction cache tags, the
TagLo register contains the fields as defined above except that bits 3 and 6 are also
reserved bits. For writing to the instruction cache data portion, the TagLo register
contains the data (instruction) and the TagHi register contains the steering bits and bits
for the BHT as defined in Chapter 7. When reading from the caches, the values in the
TagLo and TagHi registers are the same as described above for writing. These registers are
also used for manipulating the BTAC. See the description of the CACHE instruction in
Appendix C for details. Figure 4-20 shows the format of these registers for some of the
cache operations.

Table 4-21 lists the field definitions of the TagLo register.

Table 4-21. TagLo Register Fields

Field | Bits | Description | Type | Initial Value
PTagLo[31:12] | 31:12 | PTagLo[31:12] specifies the 20-bit physical address tag of the cache. | Read/Write | Undefined
D | 6 | Dirty: 0 → Clean; 1 → Dirty | Read/Write | Undefined
V | 5 | Valid: 0 → Invalid; 1 → Valid | Read/Write | Undefined
R | 4 | LRF Replacement: This bit participates in the calculation determining which cache way will be used for the next replacement. See Section 7.3.1 for details. | Read/Write | Undefined
L | 3 | Lock: This bit is only used for the data cache. For instruction cache operations this bit is treated as a reserved bit. 0 → For this line, this side is not locked. 1 → For this line, this side is locked. | Read/Write | Undefined
Special use, Su | 11:7, 2:0 | Used by the CACHE instruction to manipulate the branch target address cache. Refer to Chapter 7 for details. | Read/Write | Undefined

Table 4-22. TagHi Register Fields

Field | Bits | Description | Type | Initial Value
Special use | 31:0 | The TagHi register is used by the CACHE instruction to manipulate some of the bits of the instruction cache. Refer to Chapter 7 for details. | Read/Write | Undefined
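As an illustration of the data cache tag layout in Table 4-21, the sketch below unpacks a TagLo value. The struct and function names are hypothetical; the special-use bits (11:7 and 2:0) are deliberately left undecoded because their meaning is defined in Chapter 7.

```c
#include <stdbool.h>
#include <stdint.h>

/* TagLo fields for data cache tag operations (Table 4-21). */
typedef struct {
    uint32_t ptag;   /* PTagLo[31:12]: physical address bits 31:12 */
    bool     dirty;  /* D, bit 6 */
    bool     valid;  /* V, bit 5 */
    bool     lrf;    /* R, bit 4 */
    bool     lock;   /* L, bit 3 (data cache only) */
} taglo_fields;

static taglo_fields decode_taglo(uint32_t taglo)
{
    taglo_fields f;
    f.ptag  = taglo >> 12;
    f.dirty = (taglo >> 6) & 1u;
    f.valid = (taglo >> 5) & 1u;
    f.lrf   = (taglo >> 4) & 1u;
    f.lock  = (taglo >> 3) & 1u;
    return f;
}
```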

4.2.20 ErrorEPC (30)

Bits   31:0
Field  ErrorEPC
Width  32

Figure 4-21. ErrorEPC Register

The ErrorEPC register is similar to the EPC register, except that ErrorEPC is used on
nonmaskable interrupt (NMI), debug, SIO, and performance counter exceptions.

The read/write ErrorEPC register contains the virtual address at which instruction
processing can resume after servicing an error. This address can be:
• the virtual address of the instruction that caused the exception
• the virtual address of the immediately preceding branch or jump instruction
(when the instruction is in a branch delay slot, and the BD2 bit in the Cause
register is set).

Table 4-23 lists the field definition of the ErrorEPC register.

Table 4-23. ErrorEPC Register Field

Field | Bits | Description | Type | Initial Value
ErrorEPC | 31:0 | Contains the virtual address at which instruction processing can resume after servicing an error. | Read/Write | Undefined

Chapter 5 Exception Processing and Reset

5-1

5. Exception Processing and Reset

This chapter describes the exception processing, including level 1 and level 2 exceptions.

5.1 The Exception Handling Process

Exceptions can be recognized while the processor is in any of its three operating modes:
User, Supervisor, or Kernel.

Exceptions are categorized into two groups, level 1 exceptions and level 2 exceptions,
as shown in Table 5-1.

Table 5-1. Exception Levels

Level 1 Exceptions | Level 2 Exceptions
Interrupt | Reset
TLB Modified | NMI
TLB Refill | Performance Counter
TLB Invalid | Debug
Address Error | SIO
Syscall |
Break |
Trap |
Reserved Instruction |
Coprocessor Unusable |
Integer Overflow |
Bus Error |
Floating Point Exception |

Compatibility Note: Level 2 exceptions are a generalization of the “error level” exception
processing defined in earlier MIPS implementations.

5.1.1 Level 1 Exceptions

Exception Processing

When the processor takes a level 1 exception, the processor switches to Kernel mode.
Rather than set the Status.KSU bits to effect the switch, the Status.EXL bit is set to 1.
Whenever Status.EXL is 1, the operating mode is Kernel mode, regardless of the setting of
Status.KSU.

Then the processor saves the virtual address of the instruction canceled by the exception.
This address is saved in the EPC register. If the canceled instruction is in the delay slot of
a branch instruction, the Cause.BD bit is set to 1 and EPC is set to the address of the
branch instruction (rather than the delay slot). For non-delay-slot instructions,
Cause.BD is set to 0. If the Status.EXL bit was 1 before the exception is taken, EPC and
Cause.BD are not set. The exception service routine examines Cause.BD to determine the
true address of the instruction that raised the exception.

In addition to setting EPC, Cause.BD, and Status.EXL, the 5-bit field Cause.ExcCode is
also set. This field specifies the cause of the exception. The Cause.CE field is also
set when a Coprocessor Unusable exception is raised.

After setting those bits, the processor jumps to the exception vector address.

The basic exception handling operation is shown in Figure 5-1, the Level 1 exception
processing flowchart.

Disabled exceptions in level 1 exception handler

Once a level 1 exception service routine is entered, interrupts and bus errors are
unconditionally disabled.

C790 Programming Note: The only level 1 exceptions that are unconditionally
disabled within a level 1 exception handler are external interrupts and bus errors.
All other level 1 exceptions still occur and are recognized (if enabled). A software
system that makes use of such exceptions must use extreme care. In particular,
it must make sure that it has saved EPC and Cause.BD somewhere (e.g. in a
software managed stack) before the exception occurs.

[Flowchart, rendered as steps]
1. Set Cause.ExcCode.
   Cause.CE ← coprocessor number when CpU exception
   Set BadVAddr when AdES, AdEL or any TLB exception
   Set Context and EntryHi when any TLB exception
   Set BadPAddr when Bus Error
2. Status.EXL ?
   = 1: Offset ← 0x180
   = 0: Instruction in branch delay slot ?
          No:  EPC ← PC,     Cause.BD ← 0
          Yes: EPC ← PC - 4, Cause.BD ← 1
        Exception ?
          = TLB Refill: Offset ← 0x0
          = Interrupt:  Offset ← 0x200
          = Others:     Offset ← 0x180
3. Status.EXL ← 1
4. Status.BEV ?
   = 0 (normal):    PC ← 0x8000 0000 + Offset
   = 1 (bootstrap): PC ← 0xBFC0 0200 + Offset

Figure 5-1. Level 1 Exception processing flowchart

5.1.2 Level 2 Exceptions

Exception Processing

When the processor takes a level 2 exception, the processor switches to Kernel mode by
setting Status.ERL to 1.

The address of the instruction where the level 2 exception was recognized is stored in the
ErrorEPC register. If the canceled instruction is in the delay slot of a branch instruction,
the Cause.BD2 bit is set to 1 and ErrorEPC is set to the address of the branch instruction
(rather than the delay slot). For non-delay-slot instructions, Cause.BD2 is set to 0. In
addition, the cause of the exception is stored in Cause.EXC2.

After setting those bits, the processor jumps to the exception vector address.

The basic level 2 exception handling operation is shown in Figure 5-2, the Level 2
exception processing flowchart.

Disabled Exceptions in level 2 exceptions

When executing a level 2 exception service routine, the following exceptions are disabled.
• NMI, Interrupt, and Bus error
• Debug, SIO and Performance counter

C790 Implementation Note: Any external exception that is not level-sensitive (e.g.
NMI) must be held until it is recognized; i.e. at least until the level 2 handler is
exited.

C790 Programming Note: It is the programmer’s responsibility to ensure that all
other internal exceptions (e.g. OVERFLOW) never occur within a level 2 handler.
If they do occur, the corresponding level 1 exception handler will be entered.
Since both Status.EXL and Status.ERL will be set when servicing this (nested)
exception, the ERET used to exit the service routine will operate incorrectly.

C790 Programming Note: When Status.ERL = 1, the user address region, Kuseg,
becomes a 2^31-byte unmapped, uncached address space (that is, mapped directly
to physical addresses 0x0000 0000 - 0x7FFF FFFF).

[Flowchart, rendered as steps]
1. Set Cause.EXC2.
2. Instruction in branch delay slot ?
     No:  ErrorEPC ← PC,     Cause.BD2 ← 0
     Yes: ErrorEPC ← PC - 4, Cause.BD2 ← 1
3. Status.ERL ← 1
4. Exception ?
   = Performance Counter: Offset ← 0x80
   = Debug or SIO:        Offset ← 0x100
   = Reset or NMI:        Status.BEV ← 1, then go to step 6
5. Status.DEV ?
   = 0 (normal):    PC ← 0x8000 0000 + Offset
   = 1 (bootstrap): PC ← 0xBFC0 0200 + Offset
6. Exception ?
   = Reset: Status.BEM ← 0
            Config.DIE/ICE/DCE ← 0
            Config.NBE/BPE ← 0
            Random ← 47
            Wired ← 0
            PCCR.CTE ← 0
            BPC.IAE/DRE/DWE ← 0
   = NMI:   (no additional register changes)
   PC ← 0xBFC0 0000

Figure 5-2. Level 2 Exception processing flowchart

5.2 Exception Vector Locations

Exception vector addresses for level 1 exceptions are shown in Table 5-2.
The vector address for TLB refill depends on the Status.EXL bit. The vector addresses for
level 1 exceptions also depend on the Status.BEV bit.

Table 5-2. Exception Vectors for Level 1 exceptions

Exceptions | Vector Address (BEV = 0) | Vector Address (BEV = 1)
TLB Refill (EXL = 0) | 0x8000 0000 | 0xBFC0 0200
TLB Refill (EXL = 1) | 0x8000 0180 | 0xBFC0 0380
Interrupt | 0x8000 0200 | 0xBFC0 0400
Others | 0x8000 0180 | 0xBFC0 0380
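The level 1 vector selection in Table 5-2 and Figure 5-1 can be sketched as a base-plus-offset computation. The enum and function names below are hypothetical, and the function models the selection in software only.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { EXC_TLB_REFILL, EXC_INTERRUPT, EXC_OTHER } l1_exception;

/* exl is the value of Status.EXL before the exception is taken. */
static uint32_t level1_vector(l1_exception exc, bool exl, bool bev)
{
    uint32_t offset;

    if (exl)
        offset = 0x180u;                 /* nested: general vector */
    else if (exc == EXC_TLB_REFILL)
        offset = 0x000u;
    else if (exc == EXC_INTERRUPT)
        offset = 0x200u;
    else
        offset = 0x180u;

    /* BEV selects the bootstrap base instead of the normal base. */
    return (bev ? 0xBFC00200u : 0x80000000u) + offset;
}
```

Because the bootstrap base is 0xBFC0 0200 rather than 0xBFC0 0000, the same offsets produce the BEV = 1 column of the table (for example 0xBFC0 0200 + 0x200 = 0xBFC0 0400 for interrupts).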

Exception vector addresses for level 2 exceptions are shown in Table 5-3.
The vector addresses for level 2 exceptions also depend on the Status.DEV bit.

Table 5-3. Exception Vectors for Level 2 exceptions

Exceptions | Vector Address (DEV = 0) | Vector Address (DEV = 1)
Reset, NMI | 0xBFC0 0000 | 0xBFC0 0000
Performance Counter | 0x8000 0080 | 0xBFC0 0280
Debug, SIO | 0x8000 0100 | 0xBFC0 0300
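A corresponding sketch for the level 2 vectors in Table 5-3 follows. As before, the enum and function names are hypothetical and the code only models the address selection.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { EXC2_RESET, EXC2_NMI, EXC2_PERF_COUNTER, EXC2_DEBUG_SIO } l2_exception;

/* dev is the value of Status.DEV; Reset and NMI ignore it. */
static uint32_t level2_vector(l2_exception exc, bool dev)
{
    uint32_t base = dev ? 0xBFC00200u : 0x80000000u;

    switch (exc) {
    case EXC2_PERF_COUNTER: return base + 0x080u;
    case EXC2_DEBUG_SIO:    return base + 0x100u;
    default:                return 0xBFC00000u;   /* Reset, NMI */
    }
}
```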

5.3 Cause Register Setting

The Cause.ExcCode bits are set when a level 1 exception is taken.
The Cause.ExcCode setting is shown in Table 5-4.

Table 5-4. Cause.ExcCode Field

ExcCode Exception

0 Int (Interrupt)

1 Mod (TLB modification exception)

2 TLBL (TLB exception; load or inst fetch)

3 TLBS (TLB exception; store)

4 AdEL (Address error exception; load or inst fetch)

5 AdES (Address error exception; store)

6 IBE (Bus error exception; instruction fetch)

7 DBE (Bus error exception; load or store)

8 Sys (Syscall exception)

9 Bp (Breakpoint exception)

10 RI (Reserved instruction exception)

11 CpU (Coprocessor Unusable exception)

12 Ov (Integer Overflow exception)

13 Tr (Trap exception)

14 Reserved

15 FPE (Floating Point Exception)

16-31 Reserved

The Cause.EXC2 bits are set when a level 2 exception is taken.
The Cause.EXC2 setting is shown in Table 5-5.

Table 5-5. Cause.EXC2 Field

EXC2 Exception

0 Res (Reset exception)

1 NMI (Non-Maskable Interrupt)

2 PerfC (Performance Counter exception)

3 Dbg (Debug exception), SIO (SIO exception)

4 SS (Single Step)

5-7 Reserved
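A handler or debugger typically maps these codes back to their mnemonics. The following minimal sketch encodes Tables 5-4 and 5-5 as lookup functions; the function names are hypothetical, and extracting the fields from the Cause register is left out because the bit positions are not given in this section.

```c
/* Mnemonic for a Cause.ExcCode value (Table 5-4). */
static const char *exccode_name(unsigned code)
{
    static const char *const names[16] = {
        "Int", "Mod", "TLBL", "TLBS", "AdEL", "AdES", "IBE", "DBE",
        "Sys", "Bp", "RI", "CpU", "Ov", "Tr", "Reserved", "FPE"
    };
    return code < 16 ? names[code] : "Reserved"; /* 16-31 are reserved */
}

/* Mnemonic for a Cause.EXC2 value (Table 5-5). */
static const char *exc2_name(unsigned code)
{
    static const char *const names[5] = {
        "Res", "NMI", "PerfC", "Dbg/SIO", "SS"
    };
    return code < 5 ? names[code] : "Reserved"; /* 5-7 are reserved */
}
```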


5.4 Masking an exception

The following exceptions can be masked by setting bits in the Status register:

NMI, Performance Counter, Debug, Bus Error, Interrupt, and SIO

Table 5-6 shows which bits mask each of those exceptions. An exception marked with “X”
is masked by setting (BEM, EXL, or ERL) or clearing (IE or IM) the corresponding bit in
the Status register.

Table 5-6. Masking exceptions

                              Mask bit (in Status register)
Exception                     IE    IM    BEM   EXL   ERL
Reset
NMI                                                   X
Performance Counter                                   X
Debug                                                 X
SIO                                                   X
Address error
TLB Refill/Invalid/Modify
Bus error                                 X     X     X
Syscall
Break
Reserved instruction
Coprocessor Unusable
Interrupt                     X     X           X     X
Integer overflow
Trap
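The masking rules of Table 5-6 can be expressed as a single predicate. The sketch below assumes the relevant Status bits have already been extracted into a hypothetical structure (status_bits_t), with im standing for the IM bit of the interrupt in question; bit positions within the Status register are deliberately not modeled.

```c
#include <stdbool.h>

/* Hypothetical view of the Status bits that appear in Table 5-6. */
typedef struct { bool ie, im, bem, exl, erl; } status_bits_t;

/* Hypothetical grouping of exceptions by their masking behavior. */
typedef enum {
    EXC_NMI, EXC_PERF_COUNTER, EXC_DEBUG, EXC_SIO,
    EXC_BUS_ERROR, EXC_INTERRUPT, EXC_UNMASKABLE
} exc_class_t;

/* Returns true if the exception is currently masked (Table 5-6). */
static bool is_masked(exc_class_t e, status_bits_t s)
{
    switch (e) {
    case EXC_NMI:
    case EXC_PERF_COUNTER:
    case EXC_DEBUG:
    case EXC_SIO:
        return s.erl;                            /* masked by ERL = 1 */
    case EXC_BUS_ERROR:
        return s.bem || s.exl || s.erl;          /* any of the three set */
    case EXC_INTERRUPT:
        return !s.ie || !s.im || s.exl || s.erl; /* IE/IM clear or EXL/ERL set */
    case EXC_UNMASKABLE:
        return false;                            /* Reset, Syscall, Trap, ... */
    }
    return false;
}
```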


5.5 Detailed Description

5.5.1 Exception Priority

Exception priority rules determine which exception is taken first if multiple exceptions
occur on the same instruction. Table 5-7 shows the priority order of the exceptions.

Table 5-7. Exception Priority Order

Reset (highest priority)

NMI

Performance Counter

Instruction Breakpoint (debug)

Address error - Instruction fetch

TLB refill - Instruction fetch

TLB invalid - Instruction fetch

Bus Error - Instruction fetch

Single Step

SYSCALL, BREAK, Reserved Instruction*, Floating Point Exception or Coprocessor Unusable*

Interrupt

Data address/value breakpoint (debug)

SIO

Integer overflow, Trap

Address error - data access

TLB refill - data access

TLB invalid - data access

TLB modified - data access

Bus error - data access (lowest priority)

*The exception priority between the Reserved Instruction exception (RI) and the
Coprocessor Unusable exception (CpU)

The two exceptions have the same priority. However, when Status.CU[1] = 0, an attempt
to execute any FPU (COP1) instruction causes a CpU exception. When Status.CU[1] = 1,
the attempt is reported as an FPE(E): unimplemented FPU exception in the COP1
sub-instructions.

On the other hand, an attempt to execute any COP0 class Reserved Instruction causes
an RI exception regardless of Status.CU[0].
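The priority order of Table 5-7 can be modeled by numbering the rows from highest to lowest priority and picking the lowest-numbered pending one. The enumeration and the pending bitmask below are hypothetical illustration devices, not a description of how the hardware arbitrates.

```c
#include <stdint.h>

/* Rows of Table 5-7, highest priority first (hypothetical names). */
typedef enum {
    PRI_RESET = 0, PRI_NMI, PRI_PERF_COUNTER, PRI_INSN_BREAKPOINT,
    PRI_ADDR_ERROR_FETCH, PRI_TLB_REFILL_FETCH, PRI_TLB_INVALID_FETCH,
    PRI_BUS_ERROR_FETCH, PRI_SINGLE_STEP,
    PRI_SYS_BREAK_RI_FPE_CPU,   /* SYSCALL, BREAK, RI, FPE or CpU */
    PRI_INTERRUPT, PRI_DATA_BREAKPOINT, PRI_SIO, PRI_OVERFLOW_TRAP,
    PRI_ADDR_ERROR_DATA, PRI_TLB_REFILL_DATA, PRI_TLB_INVALID_DATA,
    PRI_TLB_MODIFIED_DATA, PRI_BUS_ERROR_DATA,
    PRI_COUNT
} exc_priority_t;

/* pending has bit n set if the exception with priority index n is raised.
 * Returns the index of the exception to take, or -1 if none is pending. */
static int highest_priority(uint32_t pending)
{
    for (int i = 0; i < PRI_COUNT; i++) {
        if (pending & (UINT32_C(1) << i))
            return i;
    }
    return -1;
}
```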


5.5.2 Reset Exception

Cause

The RESET exception occurs when the Reset* signal is asserted and then deasserted. This
exception is not maskable.

Exception Level: 2

Vector Address: 0xBFC00000

Processing

The RESET exception vector is located within uncached and unmapped address space.
Hence the cache and TLB need not be initialized in order to process the exception.

The contents of all registers in the CPU are undefined when this exception is recognized,
except for the following register fields:

• In the Status register, Status.ERL and Status.BEV are set to 1. Status.BEM is set to 0.
All other bits except for 0-fixed bits are undefined.

• In the Cause register, Cause.EXC2 is set to 0 (to indicate that a Reset occurred).
All other bits except for 0-fixed bits are undefined.

• In the Config register, the DIE, ICE, DCE, NBE, and BPE bits are set to 0.
All other bits except for fixed-value, read-only bits are undefined.

• The Random register is initialized to the value of its upper bound (47).

• The Wired register is initialized to 0.

• The Counter Enable flag in the Performance Counter Control register (PCCR.CTE) is
set to 0.

• The breakpoint address enable flags in the Breakpoint Control register, BPC.IAE,
BPC.DRE, and BPC.DWE, are all set to 0.

• The Valid, Dirty, LRF, and Lock bits of the data cache and the Valid and LRF bits of
the instruction cache are initialized to 0 on reset.

Servicing

The RESET exception is serviced by:

• initializing all processor registers, coprocessor registers, caches, and the memory

system

• performing diagnostic tests

• bootstrapping the operating system


5.5.3 Non-Maskable Interrupt (NMI) Exception

Cause

The Non-Maskable Interrupt (NMI) exception occurs in response to the falling edge of the
NMI* signal. The NMI exception is maskable by setting the Status.ERL bit. It is
recognized regardless of the settings of the Status.EXL and Status.IE bits.

Exception Level: 2

Vector Address: 0xBFC00000

Processing

NMI and RESET exceptions share the same exception vector. This vector is located within
uncached and unmapped address space; therefore, the cache and TLB need not be
initialized in order to process the exception.

When the NMI exception is recognized, all register contents are preserved with the
following exceptions:

• The ErrorEPC register, which contains the restart PC, and Cause.BD2, which records
whether the NMI was recognized in a branch delay slot.

• The Status.ERL and Status.BEV flags are both set to 1.

• Cause.EXC2 is set to 1 (NMI).

Servicing

Note that the NMI service routine entry address does not depend on the Status.BEV flag.
In fact, the Status.BEV bit is unconditionally set to 1 before the NMI handler is entered.
It is up to the NMI service routine to restore the setting of the Status.BEV bit prior to exit.


5.5.4 Performance Counter Exception

Cause

A Performance Counter exception occurs when a performance counter overflows and the
conditions described in Section 9.3.2 are met. This exception is maskable by setting the
Status.ERL bit.

Exception Level: 2

Vector Address: 0x8000 0080 (DEV = 0), 0xBFC0 0280 (DEV = 1)

Processing

The value of Cause.EXC2 is set to 2 (PerfC). The ErrorEPC register contains the address
of the instruction where the Performance Counter exception was detected unless it is in a
branch delay slot, in which case the ErrorEPC register contains the address of the
preceding branch instruction and Cause.BD2 is set.

Servicing

When this exception is recognized, control is transferred to the applicable service routine.


5.5.5 Debug Exception

Cause

A DEBUG exception occurs whenever hardware breakpoint conditions as described in
Chapter 13 are detected. This exception is maskable by setting the Status.ERL bit.

Exception Level: 2

Vector Address: 0x8000 0100 (DEV = 0), 0xBFC0 0300 (DEV = 1)

Processing

The value of Cause.EXC2 is set to 3 (Dbg). The ErrorEPC register contains the address of
the instruction where the debug exception was detected unless it is in a branch delay slot,
in which case the ErrorEPC register contains the address of the preceding branch
instruction and Cause.BD2 is set. Note that the Load data value breakpoint exception is
imprecise. That is, the instruction where the breakpoint is detected is not the load
instruction that triggers the breakpoint; see Chapter 13 for more details.

Servicing

When this exception is recognized, control is transferred to the applicable service routine.


5.5.6 Address Error Exception

Cause

The Address Error exception occurs when an attempt is made to do one of the following:

• load or store a doubleword that is not aligned on a doubleword boundary

• load, fetch, or store a word that is not aligned on a word boundary

• load or store a halfword that is not aligned on a halfword boundary

• reference the kernel address space from User or Supervisor mode

• reference the supervisor address space from User mode

This exception is not maskable.

Exception Level: 1

Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)

Processing

The value of Cause.ExcCode is set to 4 (AdEL) or 5 (AdES), depending on whether the
exception was caused by an instruction reference (AdEL), a load operation (AdEL), or a
store operation (AdES).

When this exception is recognized, the virtual address that was not properly aligned or
that referenced protected address space is stored in the BadVAddr register. This update
occurs even if the exception occurs within a level 1 or level 2 exception handler. The
contents of the VPN field of the Context and EntryHi registers are undefined, as are the
contents of the EntryLo register.

The EPC register contains the address of the instruction that caused the exception, unless
this instruction is in a branch delay slot. If it is in a branch delay slot, the EPC register
contains the address of the preceding branch instruction and Cause.BD is set to indicate
that the branch delay slot instruction actually caused the exception.
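The alignment conditions listed under Cause reduce to a mask test on the low address bits. The following is a minimal sketch with hypothetical function names; it covers only the alignment cases, not the segment protection cases, and assumes the access size is 2, 4, or 8 bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if an access of size_bytes (2, 4 or 8) at vaddr is not
 * aligned on its natural boundary. */
static bool is_misaligned(uint32_t vaddr, unsigned size_bytes)
{
    return (vaddr & (size_bytes - 1u)) != 0;
}

/* ExcCode reported for an address error: 5 (AdES) for a store,
 * 4 (AdEL) for a load or instruction fetch (Table 5-4). */
static unsigned address_error_code(bool is_store)
{
    return is_store ? 5u : 4u;
}
```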


5.5.7 TLB Refill Exception

Cause

The TLB refill exception occurs when there is no TLB entry to match a reference to a
mapped address space. This exception is not maskable.

Exception Level: 1

Vector Address: EXL = 0: 0x8000 0000 (BEV = 0), 0xBFC0 0200 (BEV = 1)
                EXL = 1: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)

Processing

The value of Cause.ExcCode is set to either 2 (TLBL) or 3 (TLBS). This code indicates
whether the exception was caused by an instruction reference or load operation (TLBL)
or by a store operation (TLBS).

When this exception is recognized, the BadVAddr, Context, and EntryHi registers are
updated to hold the virtual address that failed address translation. The EntryHi register
also contains the ASID for which the translation fault occurred. These actions take place
even if the exception is recognized within a level 1 or level 2 exception handler. The
Random register normally contains a valid location in which to place the replacement TLB
entry. The contents of the EntryLo register are undefined. The EPC register contains the
address of the instruction that caused the exception, unless this instruction is in a branch
delay slot, in which case the EPC register contains the address of the preceding branch
instruction and Cause.BD is set.

The EPC register and the BD bit in the Cause register together identify the address of
the instruction causing the exception.

Servicing

To service this exception, the contents of the Context register are used as a virtual address
to fetch memory locations containing the physical page frame and access control bits for a
pair of TLB entries. The two entries are placed into the EntryLo0/EntryLo1 registers; the
EntryHi and EntryLo registers are then written into the TLB.

It is possible that the virtual address used to obtain the physical address and access
control information is on a page that is not resident in the TLB. This condition is
processed by allowing a TLB refill exception in the TLB refill handler. This second
exception goes to the common exception vector because the EXL bit of the Status register
is set.
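To illustrate how the Context register can serve directly as the address of a page table entry pair, the sketch below computes a Context value from a page table base and the failing virtual address. It assumes the R4000-style 32-bit Context layout (PTEBase in bits 31:23, BadVPN2 in bits 22:4), which is not stated in this section and should be checked against the Context register description; the function name is hypothetical.

```c
#include <stdint.h>

/* Assumed layout: Context[31:23] = PTEBase, Context[22:4] = BadVPN2,
 * where BadVPN2 is virtual address bits 31:13 (one even/odd 4 KB pair).
 * The shift by 4 leaves room for a 16-byte pair of page table entries. */
static uint32_t context_value(uint32_t pte_base, uint32_t bad_vaddr)
{
    uint32_t bad_vpn2 = bad_vaddr >> 13;
    return (pte_base & 0xFF800000u) | (bad_vpn2 << 4);
}
```

Under this assumption, both pages of an even/odd pair yield the same Context value, so a single load sequence can fetch the entries for EntryLo0 and EntryLo1.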


5.5.8 TLB Invalid Exception

Cause

The TLB invalid exception occurs when a virtual address reference matches a TLB entry
that is marked invalid (TLB valid bit cleared). This exception is not maskable.

Exception Level: 1

Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)

Processing

The value of Cause.ExcCode is set to either 2 (TLBL) or 3 (TLBS). This code indicates
whether the exception was caused by an instruction reference or load operation (TLBL)
or by a store operation (TLBS).

When this exception is recognized, the BadVAddr, Context, and EntryHi registers are
loaded with the virtual address that failed address translation. The EntryHi register also
contains the ASID for which the translation fault occurred. These actions occur even if the
exception is recognized within a level 1 or level 2 exception handler. The Random register
normally contains a valid location in which to put the replacement TLB entry. The
contents of the EntryLo register are undefined.

The EPC register contains the address of the instruction that caused the exception unless
this instruction is in a branch delay slot, in which case the EPC register contains the
address of the preceding branch instruction and the BD bit of the Cause register is set.

Servicing

A TLB entry is typically marked invalid when one of the following is true:

• a virtual address does not exist

• the virtual address exists, but is not in main memory (a page fault)

• a trap is desired on any reference to the page (for example, to maintain a
reference bit)

After servicing the cause of a TLB Invalid exception, the TLB entry is located with TLBP
(TLB Probe), and replaced by an entry with that entry’s Valid bit set.


5.5.9 TLB Modified Exception

Cause

The TLB modified exception occurs when a store operation generates a virtual address
that matches a TLB entry that is marked valid but is not dirty and therefore is not
writable. This exception is not maskable.

Exception Level: 1

Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)

Processing

The value of Cause.ExcCode is set to 1 (Mod) and the BadVAddr, Context, and EntryHi
registers contain the virtual address that failed address translation. The EntryHi register
also contains the ASID for which the translation fault occurred. These actions occur even
if the exception is recognized within a level 1 or level 2 exception handler. The contents of
the EntryLo register are undefined.

The EPC register contains the address of the instruction that caused the exception unless
that instruction is in a branch delay slot, in which case the EPC register contains the
address of the preceding branch instruction and the BD bit of the Cause register is set.

Servicing

The kernel uses the failed virtual address or virtual page number to identify the
corresponding access control information. The page identified may or may not permit
write accesses; if writes are not permitted, a write protection violation occurs.

If write accesses are permitted, the page frame is marked dirty/writable by the kernel in
its own data structures. The TLBP instruction places the index of the TLB entry that
must be altered into the Index register. The EntryLo register is loaded with a word
containing the physical page frame and access control bits (with the D bit set), and the
EntryHi and EntryLo registers are written into the TLB.


5.5.10 Bus Error Exception

Cause

A Bus Error exception is raised when the BUSERR* signal is asserted during bus
transactions. This exception is masked when Status.BEM, Status.EXL, or Status.ERL is
set to 1.

Exception Level: 1

Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)

Processing

The value of Cause.ExcCode is set to 6 (IBE) or 7 (DBE), indicating whether the exception
was caused by an instruction reference (IBE), a load operation (DBE), or a store operation
(DBE). BadPAddr is set to the physical address that caused the bus error when the
Status.BEM bit is 0.

The EPC register and the BD bit in the Cause register point to the address of the
instruction currently being executed by the processor.

Note that there is no necessary relationship between a bus error and the instruction being

executed currently. For example, a bus error may be caused by instruction prefetch, or by

a data cache line operation that is unrelated to any instruction. Furthermore, it could be

caused by a load or store that was issued several instructions prior to the instruction that

was executing when the bus error was recognized.

If a bus error is caused by a load or store instruction, the instruction is retired. If the

instruction is a store, the nature of how memory is updated depends on the memory

subsystem’s design. If the instruction is a load, the value loaded into the destination

register is indeterminate. If a data value breakpoint is pending for the memory address

accessed, breakpoint recognition is implementation dependent.

Servicing

In the C790 the bus error exception is imprecise and is therefore difficult to recover from
and continue processing. If a bus error occurs during an instruction or data cache refill,
the loaded cache line contains undefined values. Since it is not possible in general to
determine the offending address (from the EPC), the entire data and instruction cache
contents should be invalidated by using the Index Invalidate suboperation of the CACHE
instruction. (See the CACHE instruction’s definition for details on how to do this.)


5.5.11 System Call Exception

Cause

A SYSCALL exception occurs as a result of executing the SYSCALL instruction. This
exception is not maskable.

Exception Level: 1

Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)

Processing

The value of Cause.ExcCode is set to 8 (Sys). The EPC register contains the address of the
SYSCALL instruction unless it is in a branch delay slot, in which case the EPC register
contains the address of the preceding branch instruction and Cause.BD is set.

Servicing

When this exception is recognized, control is transferred to the applicable system routine.

To resume execution, the EPC register must be altered so that the SYSCALL instruction
does not re-execute; this is accomplished by adding a value of 4 to the EPC register (EPC
register + 4) before returning.

If a SYSCALL instruction is in a branch delay slot, a more complicated algorithm, beyond
the scope of this description, may be required.
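The resume rule above can be sketched as follows. The function name and its out-parameter convention are hypothetical; the branch delay slot case is only detected, not solved, since it requires interpreting the branch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Computes the PC at which to resume after a SYSCALL exception.
 * Returns false when Cause.BD is set: EPC then points at the preceding
 * branch, and the branch must be interpreted to find the resume address. */
static bool syscall_resume_pc(uint32_t epc, bool bd, uint32_t *resume)
{
    if (bd)
        return false;
    *resume = epc + 4u; /* skip the SYSCALL instruction itself */
    return true;
}
```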


5.5.12 BREAK Instruction Exception

Cause

A BREAK exception occurs as a result of executing the BREAK instruction. This exception
is not maskable.

Exception Level: 1

Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)

Processing

The value of Cause.ExcCode is set to 9 (Bp). The EPC register contains the address of the
BREAK instruction unless it is in a branch delay slot, in which case the EPC register
contains the address of the preceding branch instruction and Cause.BD is set.

Servicing

When a BREAK exception is recognized, control is transferred to the applicable system
routine. Additional distinctions can be made by loading the instruction whose address the
EPC register contains and analyzing the unused bits of the BREAK instruction (bits 25:6).
A value of 4 must be added to the contents of the EPC register (EPC register + 4) to
locate the instruction if it resides in a branch delay slot.

To resume execution, the EPC register must be altered so that the BREAK instruction
does not re-execute; this is accomplished by adding a value of 4 to the EPC register (EPC
register + 4) before returning.

If a BREAK instruction is in a branch delay slot, interpretation of the branch instruction
is required to resume execution.
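Locating the BREAK instruction and extracting its code field might look like the following minimal sketch. The function names are hypothetical; the only facts used are the bit range 25:6 and the EPC + 4 adjustment for a branch delay slot given above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Address of the instruction that raised the exception: EPC itself, or
 * EPC + 4 when Cause.BD indicates a branch delay slot. */
static uint32_t faulting_insn_addr(uint32_t epc, bool bd)
{
    return bd ? epc + 4u : epc;
}

/* Extracts the otherwise unused bits 25:6 of a BREAK instruction word. */
static uint32_t break_code(uint32_t insn)
{
    return (insn >> 6) & 0xFFFFFu; /* 20-bit field */
}
```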


5.5.13 Reserved Instruction Exception

Cause

The Reserved Instruction exception occurs when one of the following conditions occurs:

• an attempt is made to execute an instruction with an undefined major opcode
(bits 31:26)

• an attempt is made to execute a SPECIAL instruction with an undefined minor
opcode (bits 5:0)

• an attempt is made to execute a REGIMM instruction with an undefined minor
opcode (bits 20:16)

• an attempt is made to execute an MMI instruction with an undefined minor
opcode (bits 10:0)

• an attempt is made to execute a COPz instruction with an undefined minor
opcode (bits 25:21)

Note: In the C790, 64-bit operations are always valid in User, Supervisor, and Kernel
mode.

This exception is not maskable.

Exception Level: 1

Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)

Processing

The value of Cause.ExcCode is set to 10 (RI). The EPC register contains the address of the
reserved instruction unless it is in a branch delay slot, in which case the EPC register
contains the address of the preceding branch instruction.
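A handler that wants to report or emulate the offending instruction needs the opcode fields named under Cause. The helpers below simply extract the bit ranges listed there; the function names are hypothetical, and which field values are actually undefined is not modeled.

```c
#include <stdint.h>

/* Major opcode, bits 31:26. */
static unsigned major_opcode(uint32_t insn)  { return insn >> 26; }

/* SPECIAL minor opcode, bits 5:0. */
static unsigned special_minor(uint32_t insn) { return insn & 0x3Fu; }

/* REGIMM minor opcode, bits 20:16. */
static unsigned regimm_minor(uint32_t insn)  { return (insn >> 16) & 0x1Fu; }

/* MMI minor opcode, bits 10:0. */
static unsigned mmi_minor(uint32_t insn)     { return insn & 0x7FFu; }

/* COPz minor opcode, bits 25:21. */
static unsigned copz_minor(uint32_t insn)    { return (insn >> 21) & 0x1Fu; }
```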


5.5.14 Coprocessor Unusable Exception

Cause

The Coprocessor Unusable exception occurs when an attempt is made to execute a
coprocessor instruction for either:

• a corresponding coprocessor unit that has not been marked usable via the
Status.CU[ ] bits or

• COP0 instructions, when the unit has been marked not usable and the process
executes in either User or Supervisor mode.

NOTE: COP0 instructions always execute in Kernel mode, regardless of the
setting of Status.CU[0]. Also note that the operation of the COP0 instructions EI
and DI is not controlled by Status.CU[0]. Instead, the Status.EDI bit specifies
whether the EI and DI instructions execute in User and Supervisor modes. In
case execution is suppressed, EI and DI behave as no-operations in User and
Supervisor modes; they do not signal an exception.

This exception is not maskable.

Exception Level: 1

Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)

Processing

The value of Cause.ExcCode is set to 11 (CpU) and the field Cause.CE (Coprocessor Usage
Error) is set to indicate which of the four coprocessors was referenced. The EPC register
contains the address of the unusable coprocessor instruction unless it is in a branch delay
slot, in which case the EPC register contains the address of the preceding branch
instruction.

Servicing

The coprocessor unit to which an attempted reference was made is identified by the CE
(Coprocessor Usage Error) field, which results in one of the following situations:

• If the process is entitled access to the coprocessor, the coprocessor is marked
usable and the corresponding user state is restored to the coprocessor.

• If the process is entitled access to the coprocessor, but the coprocessor does not
exist or has failed, interpretation of the coprocessor instruction is possible.

• If the BD bit is set in the Cause register, the branch instruction must be
interpreted; then the coprocessor instruction can be emulated and execution
resumed with the EPC register advanced past the coprocessor instruction.
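The conditions under Cause can be condensed into one predicate. The sketch below uses a hypothetical function name, takes the four Status.CU bits as a 4-bit value with CU[n] in bit n, and deliberately ignores the EI and DI special case described in the NOTE.

```c
#include <stdbool.h>

/* Returns true if executing an instruction for coprocessor cop (0..3)
 * raises a Coprocessor Unusable exception.
 * cu: the four Status.CU bits, CU[n] in bit n.
 * kernel_mode: true when the processor is in Kernel mode.
 * EI/DI, which are governed by Status.EDI instead, are not modeled. */
static bool raises_cpu_exception(unsigned cop, unsigned cu, bool kernel_mode)
{
    if (cop == 0 && kernel_mode)
        return false;               /* COP0 is always usable in Kernel mode */
    return ((cu >> cop) & 1u) == 0; /* unusable when CU[cop] is clear */
}
```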


5.5.15 Interrupt Exception

Cause

The Interrupt exception occurs when one of the three interrupt signals is asserted. The
significance of the interrupts is dependent upon the specific system implementation.

Each of the three interrupts can be masked by clearing the corresponding bit in the
Int-Mask (IM) field of the Status register, and all three interrupts can be masked at once
by clearing the IE bit or the EIE bit of the Status register.

All three interrupts are also masked at once when the EXL or ERL bit of the Status
register is set to 1.

Interrupt IP[7] is set when the Count register is equal to the Compare register.

Exception Level: 1

Vector Address: 0x8000 0200 (BEV = 0), 0xBFC0 0400 (BEV = 1)

Processing

The value of Cause.ExcCode is set to 0 (Int). The IP field of the Cause register indicates
current interrupt requests. It is possible that more than one of the bits is set
simultaneously (or even that no bits are set) if the interrupt is asserted and then
deasserted before this register is read.

Servicing

If the interrupt is hardware-generated, the interrupt condition is cleared by correcting the
condition causing the interrupt pin to be asserted.

Due to the on-chip write buffer, a store to an external device (possibly clearing the
interrupt) may not occur until after other instructions in the pipeline finish. Hence, the
user must ensure that the store will occur before the return from exception instruction
(ERET) is executed. This can be ensured by executing a SYNC instruction. Otherwise the
interrupt may be serviced again even though there is no actual interrupt pending.
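The masking conditions under Cause combine into a simple filter over the pending requests. In the sketch below the function name is hypothetical, and IP and IM are treated as 8-bit fields indexed the same way (IP[n] and IM[n] in bit n); only IP[7] is named in this section, so the positions of the other two interrupt bits are left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns the set of interrupt requests that can currently be taken.
 * ip: Cause.IP requests, im: Status.IM mask (bit n enables interrupt n).
 * All interrupts are blocked when IE or EIE is clear, or EXL or ERL is set. */
static uint8_t deliverable_interrupts(uint8_t ip, uint8_t im,
                                      bool ie, bool eie, bool exl, bool erl)
{
    if (!ie || !eie || exl || erl)
        return 0;
    return (uint8_t)(ip & im);
}
```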


5.5.16 SIO Exception

Cause

The SIO exception occurs when the SIOInt signal is asserted. This exception is maskable
by setting the Status.ERL bit.

Exception Level: 2

Vector Address: 0x8000 0100 (DEV = 0), 0xBFC0 0300 (DEV = 1)

Processing

The value of Cause.EXC2 is set to 3 (Dbg, SIO). Cause.SIOP is set to 1. The ErrorEPC
register contains the address of the instruction where the SIO exception was detected
unless it is in a branch delay slot, in which case the ErrorEPC register contains the
address of the preceding branch instruction and Cause.BD2 is set.

Servicing

When this exception is recognized, control is transferred to the applicable service routine.


5.5.17 Integer Overflow Exception

Cause

An Integer Overflow exception occurs when an ADD, ADDI, SUB, DADD, DADDI, or DSUB
instruction results in a 2’s complement overflow. This exception is not maskable.

Exception Level: 1

Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)

Processing

The value of Cause.ExcCode is set to 12 (Ov). The EPC register contains the address of the
instruction that caused the exception unless the instruction is in a branch delay slot, in
which case the EPC register contains the address of the preceding branch instruction and
the BD bit of the Cause register is set.
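For reference, 2's complement overflow of a 32-bit addition or subtraction can be detected from the sign bits of the operands and the result. The sketch below is a generic illustration of that condition with hypothetical function names; it does not describe the C790 data path.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a + b overflows in 32-bit 2's complement arithmetic: both
 * operands have the same sign and the result's sign differs from it. */
static bool add32_overflows(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t sum = ua + ub; /* unsigned arithmetic wraps, no UB */
    return ((ua ^ sum) & (ub ^ sum) & 0x80000000u) != 0;
}

/* True if a - b overflows: the operands have different signs and the
 * result's sign differs from the sign of a. */
static bool sub32_overflows(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t diff = ua - ub;
    return ((ua ^ ub) & (ua ^ diff) & 0x80000000u) != 0;
}
```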


5.5.18 Trap Exception

Cause

The TRAP exception occurs when a TGE, TGEU, TLT, TLTU, TEQ, TNE, TGEI, TGEIU,
TLTI, TLTIU, TEQI, or TNEI instruction results in a TRUE condition. This exception is
not maskable.

Exception Level: 1

Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)

Processing

The value of Cause.ExcCode is set to 13 (Tr). The EPC register contains the address of the
instruction causing the exception unless the instruction is in a branch delay slot, in which
case the EPC register contains the address of the preceding branch instruction and
Cause.BD is set.


5.5.19 Floating-Point Exception

Cause

The Floating-Point exception is used by the floating-point coprocessor. This exception is
not maskable.

Exception Level: 1

Vector Address: 0x8000 0180 (BEV = 0), 0xBFC0 0380 (BEV = 1)

Processing

The common exception vector is used for this exception, and the FPE code in the Cause
register is set.

The contents of the Floating-Point Control/Status register indicate the cause of this

exception.

This exception is cleared by clearing the appropriate bit in the Floating-Point

Control/Status register.

For an unimplemented instruction exception, the kernel should emulate the instruction;

for other exceptions, the kernel should pass the exception to the user program that caused

the exception.

Chapter 6 Memory Management

6-1

6. Memory Management

The C790 processor provides a memory management unit (MMU) which uses an on-chip

translation look-aside buffer (TLB) to translate virtual addresses into physical addresses.

The C790 supports the MIPS compatible 32-bit address and 64-bit data mode. Only 32-bit
virtual and physical addresses have been implemented. There is no requirement for
address sign extension, and address error exception checking is not done on the “upper”
32 bits (which are ignored). The only conditions that generate the address error exception
are address alignment errors and segment protection errors. In Kernel mode, the program
counter wraps around from kseg3 to kuseg without raising an address error exception.

Since there is only one addressing mode, all the four MIPS ISAs (I, II, III, IV) and the

C790 specific ISA are available without any restrictions in all of the three processor modes

(with the appropriate MIPS ISA coprocessor usable restrictions). As such the reserved

instruction (RI) exception will occur only when the processor really tries to execute an

undefined opcode.

This chapter describes the processor virtual and physical address spaces, the virtual-to-

physical address translation, the operation of the TLB in making these translations, and

those System Control Coprocessor (COP0) registers that provide the software interface to

the TLB.


6.1 Translation Look-aside Buffer (TLB)

Mapped virtual addresses are translated into physical addresses using an on-chip TLB. The TLB is a fully associative memory that holds 48 entries, which provide mapping to 48 odd/even page pairs (96 pages). When address mapping is indicated, each TLB entry is checked simultaneously for a match with the virtual address, which is extended with an ASID stored in the low 8 bits of the EntryHi register.

A mapped page ranges in size from 4 KB to 16 MB, each size being four times the previous one; that is, 4 KB, 16 KB, 64 KB, 256 KB, 1 MB, 4 MB, or 16 MB.
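The page sizes and the entry count above determine how much memory the TLB can map at once. The following is a minimal sketch of that arithmetic; the names PAGE_SIZES and tlb_reach are illustrative only and do not come from this document.

```python
# Supported page sizes: 4 KB up to 16 MB, each four times the previous one.
PAGE_SIZES = [4 * 1024 * 4**k for k in range(7)]

TLB_ENTRIES = 48       # number of TLB entries, from the text above
PAGES_PER_ENTRY = 2    # each entry maps one odd/even page pair


def tlb_reach(page_size: int) -> int:
    """Bytes mapped when every entry holds a pair of pages of this size."""
    return TLB_ENTRIES * PAGES_PER_ENTRY * page_size


if __name__ == "__main__":
    for size in PAGE_SIZES:
        print(f"{size // 1024:>6} KB pages: reach {tlb_reach(size) // 1024} KB")
```

With 4 KB pages throughout, the 96 mapped pages cover 384 KB; larger page sizes raise this proportionally.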

6.1.1 Translation Status

In the C790 processor, as in the R4000, each TLB entry holds two sets of mapping information, one for each page of an odd/even page pair. The translation result is therefore categorized into one of three states: hit, miss, or invalid.

Upon address translation, if no virtual address match is found in any of the 48 entries, the translation result is categorized as a TLB miss.

In this case, an exception is taken and software refills the TLB from the page table

resident in memory. Software can write over a selected TLB entry or use a hardware

mechanism to write into a random entry.

If there is a match on translation, the following takes place in the TLB hardware.

1. The translation information for the odd page and the even page is read out of the matching entry. The page size is extracted at the same time.

2. The TLB selects one of the two sets of translation information according to the page size extracted above and the virtual address. This becomes the translation result in the TLB.

The translation result includes a valid flag that indicates whether the translation information is valid. If the flag is marked ‘valid’, the translation is handled as a TLB hit. The physical page number is extracted from the TLB and concatenated with the offset to form the physical address (see Figure 6-1).

If the flag is marked ‘invalid’, the translation result is recognized as TLB invalid. In this case, an exception is taken to request that software update the entry that matched during translation; software locates that entry by probing the TLB with the TLBP operation.
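The hit and invalid cases for an entry that has already matched can be sketched as follows. This is an illustrative model, not the hardware implementation: the class and function names are hypothetical, the PFN is assumed to hold physical address bits 31:12, and the choice of the address bit just above the page offset as the odd/even selector is an assumption based on the usual R4000-style convention.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PageInfo:
    pfn: int      # physical address bits 31:12 (assumed field meaning)
    valid: bool   # the valid flag described above


@dataclass
class MatchedEntry:
    offset_bits: int   # 12 for 4 KB pages, up to 24 for 16 MB pages
    even: PageInfo
    odd: PageInfo


def translate(entry: MatchedEntry, va: int) -> Tuple[str, Optional[int]]:
    """Classify a lookup that already matched `entry` as a hit or invalid."""
    # Assumption: the address bit just above the offset picks odd or even.
    page = entry.odd if (va >> entry.offset_bits) & 1 else entry.even
    if not page.valid:
        return "invalid", None
    offset_mask = (1 << entry.offset_bits) - 1
    frame = (page.pfn << 12) & ~offset_mask & 0xFFFFFFFF
    # The offset bypasses the TLB and is concatenated with the frame address.
    return "hit", frame | (va & offset_mask)
```

A miss is the remaining case, in which no entry matches at all and this function is never reached.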

6.1.2 Multiple Matches

A multiple match is the condition in which two or more entries match during address translation. This is strictly prohibited, and software is expected never to allow it to occur.

The C790 processor does NOT provide any means of detecting this condition in hardware, such as TLB shutdown. The result of this condition is undefined, and further execution may produce incorrect results.


6.2 Address Spaces

This section describes the virtual and physical address spaces and the manner in which

virtual addresses are converted or “translated” into physical addresses in the TLB.

6.2.1 Virtual Address Space

The C790 only implements 32 bits of virtual address space. There is no requirement for

address sign extension and no checking will be done on the upper 32 bits of the address.

Figure 6-1 shows the translation of a virtual address into a physical address.

[Figure 6-1 diagram: the virtual address (ASID, VPN, Offset) is compared against the TLB entries (G, ASID, VPN, PFN) to produce the physical address (PFN, Offset). The callouts in the figure read as follows.]

1. Virtual address (VA) represented by the virtual page number (VPN) is concatenated with the ASID and compared with the tags in the TLB.

2. If there is a match, the page frame number (PFN) representing the upper bits of the physical address (PA) is output from the TLB.

4. The Offset, which does not pass through the TLB, is then concatenated to the PFN.

Figure 6-1. Overview of a Virtual-to-Physical Address Translation

As shown in Figure 6-2, the virtual address is extended with an 8-bit address space identifier (ASID), which reduces the frequency of TLB flushing when switching contexts. This 8-bit ASID is in the COP0 EntryHi register as described later in this chapter.


6.2.2 Physical Address Space

Using a 32-bit address, the processor physical address space encompasses 4 GB. The following section describes the translation of a virtual address to a physical address.

6.2.3 Virtual-to-Physical Address Translation

Converting a virtual address to a physical address begins by comparing the virtual

address from the processor with the virtual addresses in the TLB; there is a match when

the virtual page number (VPN) of the address is the same as the VPN field of the entry,

and either:

• the Global (G) bit of the TLB entry is set, or

• the ASID field of the virtual address (taken from the 8-bit ASID field of the

EntryHi register) is the same as the ASID field of the TLB entry.
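The match condition above can be written as a single predicate. This is only a sketch; the function and parameter names are hypothetical, and the VPN values are assumed to have already been reduced to the bits that are compared for the entry's page size.

```python
def tlb_entry_matches(va_vpn: int, current_asid: int,
                      entry_vpn: int, entry_asid: int,
                      entry_global: bool) -> bool:
    """Return True when a TLB entry matches the virtual address."""
    if va_vpn != entry_vpn:
        return False
    # A set G bit makes the entry match regardless of the ASID.
    return entry_global or current_asid == entry_asid
```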

If there is no match, a TLB Miss exception is taken by the processor and software can

refill the TLB from a page table of virtual / physical addresses in memory.

If there is a virtual address match in the TLB, the physical address is output from the TLB and concatenated with the Offset, which represents an address within the page frame space. The Offset does not pass through the TLB. At the same time, the valid bit output from the TLB is checked to qualify the translation. If this bit is not set, a TLB Invalid exception is taken by the processor and software can update the TLB.

Virtual-to-physical translation is described in greater detail throughout the remainder of

this chapter. Figure 6-9, shown at the end of this chapter, is a detailed flow diagram of

this process.


6.2.4 32-bit Address Translation Mode

The C790 supports only 32-bit address translation mode. 64-bit addressing mode is not supported.

Figure 6-2 shows the virtual-to-physical address translation of a 32-bit address.

• The top portion of Figure 6-2 shows a virtual address with a 12-bit Offset, which corresponds to a 4-KB page size. The remaining 20 bits of the address represent the VPN, and index the 1M-entry page table.

• The bottom portion of Figure 6-2 shows a virtual address with a 24-bit Offset, which corresponds to a 16-MB page size. The remaining 8 bits of the address represent the VPN, and index the 256-entry page table.
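The two cases above differ only in where the address is split. A small sketch, with a hypothetical function name, shows the split for any supported Offset width:

```python
def split_virtual_address(va: int, offset_bits: int):
    """Split a virtual address into (VPN, Offset) for a given Offset width."""
    va &= 0xFFFFFFFF                      # the upper 32 bits are ignored
    vpn = va >> offset_bits
    offset = va & ((1 << offset_bits) - 1)
    return vpn, offset
```

For example, the address 0x1234 5678 splits into VPN 0x12345 and Offset 0x678 with 4-KB pages, and into VPN 0x12 and Offset 0x345678 with 16-MB pages.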

[Figure 6-2 diagram, summarized:]

Virtual Address with 1M (2^20) 4-Kbyte pages:
  Bits 39:32   ASID     (8 bits)
  Bits 31:12   VPN      (20 bits)
  Bits 11:0    Offset   (12 bits)

Virtual Address with 256 (2^8) 16-Mbyte pages:
  Bits 39:32   ASID     (8 bits)
  Bits 31:24   VPN      (8 bits)
  Bits 23:0    Offset   (24 bits)

32-bit Physical Address (bits 31:0): PFN, Offset

In both cases, the virtual-to-physical translation is performed in the TLB and the Offset is passed unchanged to physical memory. Bits 31, 30 and 29 of the virtual address select user, supervisor, or kernel address spaces.

Figure 6-2. 32-bit Mode Virtual Address Translation


6.2.5 Operating Modes

The processor has the three standard MIPS operating modes:

• User mode

• Supervisor mode

• Kernel mode

Selection between the three modes can be made by the operating system (when in Kernel mode) by writing into the KSU field of the Status register. The processor is forced into Kernel mode when it is handling a Level 1 exception (the EXL bit is set; also called the Exception Level mode in R-series processors) or a Level 2 exception (the ERL bit is set; also called the Error Level mode in R-series processors).

In the following table, dashes represent ‘don’t cares’.

Table 6-1 Processor Modes

Description                                KSU   ERL   EXL
32-bit User mode                           10    0     0
32-bit Supervisor mode                     01    0     0
32-bit Kernel mode                         00    0     0
32-bit Kernel mode (Level 1 exception)     -     0     1
32-bit Kernel mode (Level 2 exception)     -     1     -
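Table 6-1 can be read as a priority decode: ERL first, then EXL, then KSU. The sketch below takes the three fields as separate arguments rather than extracting them from the Status register; the function name is hypothetical, and the KSU value 11 is rejected because the table does not list it.

```python
def processor_mode(ksu: int, erl: int, exl: int) -> str:
    """Decode the operating mode from the KSU, ERL and EXL fields."""
    if erl:
        return "kernel (Level 2 exception)"
    if exl:
        return "kernel (Level 1 exception)"
    modes = {0b00: "kernel", 0b01: "supervisor", 0b10: "user"}
    if ksu not in modes:
        raise ValueError("KSU value not listed in Table 6-1")
    return modes[ksu]
```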

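Figure 6-3 shows the state transitions among these three modes.

[Figure 6-3 diagram: an exception moves the processor from User mode or Supervisor mode into Kernel mode; ERET with KSU = 01 moves it from Kernel mode to Supervisor mode; ERET with KSU = 10 moves it from Kernel mode to User mode.]

Figure 6-3 State Transition among Operating Modes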


Table 6-2 summarizes the address space for each operating mode.

Table 6-2. Address Space

Virtual Address               32-bit User Mode      32-bit Supervisor Mode    32-bit Kernel Mode
0xFFFF FFFF to 0xE000 0000    Address Error         Address Error             kseg3 (0.5 GB), Mapped
0xDFFF FFFF to 0xC000 0000    Address Error         sseg (0.5 GB), Mapped     ksseg (0.5 GB), Mapped
0xBFFF FFFF to 0xA000 0000    Address Error         Address Error             kseg1 (0.5 GB), Unmapped*, Uncached
0x9FFF FFFF to 0x8000 0000    Address Error         Address Error             kseg0 (0.5 GB), Unmapped*, Cached**
0x7FFF FFFF to 0x0000 0000    useg (2 GB), Mapped   suseg (2 GB), Mapped      kuseg (2 GB), Mapped (becomes unmapped if ERL is 1)

* Note: Virtual addresses of Kernel segments, kseg0 and kseg1, are not mapped through the TLB and are always translated into physical addresses from 0x0000 0000 to 0x1FFF FFFF.

** Note: The kseg0 cache algorithm is controlled by the K0 field in the Config register.
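Table 6-2 amounts to a lookup on the address range and the operating mode. The following sketch encodes it directly; the names SEGMENTS and segment_for are illustrative, and a result of None stands for an Address Error.

```python
SEGMENTS = [
    # (first address, last address, {mode: segment name})
    (0x00000000, 0x7FFFFFFF,
     {"user": "useg", "supervisor": "suseg", "kernel": "kuseg"}),
    (0x80000000, 0x9FFFFFFF, {"kernel": "kseg0"}),
    (0xA0000000, 0xBFFFFFFF, {"kernel": "kseg1"}),
    (0xC0000000, 0xDFFFFFFF, {"supervisor": "sseg", "kernel": "ksseg"}),
    (0xE0000000, 0xFFFFFFFF, {"kernel": "kseg3"}),
]


def segment_for(va: int, mode: str):
    """Return the segment name, or None when the access is an Address Error."""
    va &= 0xFFFFFFFF                      # the upper 32 bits are ignored
    for first, last, names in SEGMENTS:
        if first <= va <= last:
            return names.get(mode)
    return None
```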


6.2.6 User Mode Operations

In User mode, a single, uniform virtual address space, labeled User segment, is available;

its size is:

• 2 GB (2^31 bytes) (useg)

Figure 6-4 shows User mode virtual address space.

[Figure 6-4 diagram, 32-bit virtual address: 0x0000 0000 to 0x7FFF FFFF is useg (2 GB, Mapped); 0x8000 0000 to 0xFFFF FFFF causes an Address Error.]

Figure 6-4. User Mode Virtual Address Space

The User segment starts at address 0x0000 0000 and the current active user process resides in useg. The TLB identically maps all references to useg from all modes, and controls cache accessibility.

The processor operates in User mode when the Status register contains the following bit values:

• KSU bits = 10₂

• and EXL = 0

• and ERL = 0


Table 6-3 lists the characteristics of the User mode segment, useg.

Table 6-3. User Mode Segments

Address Bit Values   Status Register Bit Values    Segment Name   Virtual Address Range              Segment Size
                     KSU    EXL   ERL
A[31] = 0            10₂    0     0                useg           0x0000 0000 through 0x7FFF FFFF    2 Gbyte (2^31 bytes)

User Mode, User Space (useg)

In User mode (KSU = 10₂ in the Status register), when the most-significant bit of the 32-bit virtual address is set to 0, the useg virtual address space is selected; it covers the 2^31 bytes (2 GB) of the current user address space. All valid User mode virtual addresses have their most-significant bit cleared to 0; any attempt to reference an address with the most-significant bit set while in User mode causes an Address Error exception.

The system maps all references to useg through the TLB. Bit settings within the TLB entry for the page determine the cacheability of a reference. The virtual address is extended with the contents of the 8-bit ASID field to form a unique virtual address.

This mapped space starts at virtual address 0x0000 0000 and runs through 0x7FFF FFFF.


6.2.7 Supervisor Mode Operations

Supervisor mode is designed for layered operating systems in which a true kernel runs in C790 Kernel mode, and the rest of the operating system runs in Supervisor mode.

The processor operates in Supervisor mode when the Status register contains the following bit values:

• KSU = 01₂

• and EXL = 0

• and ERL = 0

[Figure 6-5 diagram, 32-bit virtual address: 0x0000 0000 to 0x7FFF FFFF is suseg (2 GB, Mapped); 0x8000 0000 to 0xBFFF FFFF causes an Address Error; 0xC000 0000 to 0xDFFF FFFF is sseg (0.5 GB, Mapped); 0xE000 0000 to 0xFFFF FFFF causes an Address Error.]

Figure 6-5. Supervisor Mode Virtual Address Space

Table 6-4. Supervisor Mode Segments

Address Bit Values   Status Register Bit Values    Segment Name   Virtual Address Range              Segment Size
                     KSU    EXL   ERL
A[31] = 0            01₂    0     0                suseg          0x0000 0000 through 0x7FFF FFFF    2 Gbyte (2^31 bytes)
A[31:29] = 110₂      01₂    0     0                sseg           0xC000 0000 through 0xDFFF FFFF    0.5 Gbyte (2^29 bytes)

Supervisor Mode, User Space (suseg)

In Supervisor mode (KSU = 01₂ in the Status register), when the most-significant bit of the 32-bit virtual address is set to 0, the suseg virtual address space is selected; it covers the 2^31 bytes (2 Gbytes) of the current user address space. The virtual address is extended with the contents of the 8-bit ASID field to form a unique virtual address.

This mapped space starts at virtual address 0x0000 0000 and runs through 0x7FFF FFFF.

Supervisor Mode, Supervisor Space (sseg)

In Supervisor mode (KSU = 01₂ in the Status register), when the three most-significant bits of the 32-bit virtual address are 110₂, the sseg virtual address space is selected; it covers 2^29 bytes (512 Mbytes) of the current supervisor address space. The virtual address is extended with the contents of the 8-bit ASID field to form a unique virtual address.

This mapped space begins at virtual address 0xC000 0000 and runs through 0xDFFF FFFF.


6.2.8 Kernel Mode Operations

The processor operates in Kernel mode when the Status register contains one of the following values:

• KSU = 00₂

• or EXL = 1

• or ERL = 1

The processor enters Kernel mode whenever an exception is detected and it remains in Kernel mode until an Exception Return (ERET) instruction is executed. The ERET instruction restores the processor to the mode existing prior to the exception.

Kernel mode virtual address space is divided into regions differentiated by the high-order bits of the virtual address, as shown in Figure 6-6.

Table 6-5 lists the characteristics of the kernel mode segments.

[Figure 6-6 diagram, summarized:]

32-bit Virtual Address:
  0xE000 0000 to 0xFFFF FFFF   kseg3   0.5 GB, Mapped
  0xC000 0000 to 0xDFFF FFFF   ksseg   0.5 GB, Mapped
  0xA000 0000 to 0xBFFF FFFF   kseg1   0.5 GB, Unmapped, Uncached
  0x8000 0000 to 0x9FFF FFFF   kseg0   0.5 GB, Unmapped, Cached
  0x0000 0000 to 0x7FFF FFFF   kuseg   2 GB, Mapped (becomes unmapped if ERL=1)

32-bit Physical Address (0x0000 0000 to 0xFFFF FFFF): kseg0 and kseg1 both correspond to the 0.5 GB Kernel Boot and I/O region at 0x0000 0000 to 0x1FFF FFFF. The mapped segments (kuseg, ksseg, kseg3) are translated by the TLB.

Figure 6-6. Kernel Mode Address Space


Table 6-5. Kernel Mode Segments

Address Bit Values   Segment Name   Virtual Address Range              Segment Size
A[31] = 0            kuseg          0x0000 0000 through 0x7FFF FFFF    2 Gbyte (2^31 bytes)
A[31:29] = 100₂      kseg0          0x8000 0000 through 0x9FFF FFFF    0.5 Gbyte (2^29 bytes)
A[31:29] = 101₂      kseg1          0xA000 0000 through 0xBFFF FFFF    0.5 Gbyte (2^29 bytes)
A[31:29] = 110₂      ksseg          0xC000 0000 through 0xDFFF FFFF    0.5 Gbyte (2^29 bytes)
A[31:29] = 111₂      kseg3          0xE000 0000 through 0xFFFF FFFF    0.5 Gbyte (2^29 bytes)

Status register bit values (KSU, EXL, ERL) for all rows: KSU = 00₂ or EXL = 1 or ERL = 1.

Kernel Mode, User Space (kuseg)

In Kernel mode (KSU = 00₂ or EXL = 1 or ERL = 1 in the Status register), when the most-significant bit of the virtual address, A[31], is a 0, the 32-bit kuseg virtual address space is selected; it covers the full 2^31 bytes (2 GB) of the current user address space. The virtual address is extended with the contents of the 8-bit ASID field to form a unique virtual address.

When ERL = 1 in the Status register, the user address region, kuseg, becomes a 2^31-byte unmapped, uncached address space (that is, mapped directly to physical addresses 0x0000 0000 through 0x7FFF FFFF).

Kernel Mode, Kernel Space 0 (kseg0)

In Kernel mode (KSU = 00₂ or EXL = 1 or ERL = 1 in the Status register), when the most-significant three bits of the virtual address are 100₂, the 32-bit kseg0 virtual address space is selected; it is the 2^29-byte (512 MB) kernel physical space.

References to kseg0 are not mapped through the TLB; the physical address selected is defined by subtracting 0x8000 0000 from the virtual address. The K0 field of the Config register, described in this chapter, controls cacheability and coherency.

Kernel Mode, Kernel Space 1 (kseg1)

In Kernel mode (KSU = 00₂ or EXL = 1 or ERL = 1 in the Status register), when the most-significant three bits of the 32-bit virtual address are 101₂, the 32-bit kseg1 virtual address space is selected; it is the 2^29-byte (512 MB) kernel physical space.

References to kseg1 are not mapped through the TLB; the physical address selected is defined by subtracting 0xA000 0000 from the virtual address.

Caches are disabled for accesses to these addresses, and physical memory (or memory-mapped I/O device registers) is accessed directly.
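The fixed translation for the two unmapped kernel segments can be sketched as a simple subtraction. The function name is hypothetical; addresses outside kseg0 and kseg1 return None because they are handled elsewhere (by the TLB or, with ERL = 1, by the kuseg rule).

```python
def unmapped_physical_address(va: int):
    """Translate a kseg0 or kseg1 virtual address without using the TLB."""
    va &= 0xFFFFFFFF
    if 0x80000000 <= va <= 0x9FFFFFFF:    # kseg0
        return va - 0x80000000
    if 0xA0000000 <= va <= 0xBFFFFFFF:    # kseg1
        return va - 0xA0000000
    return None                           # not an unmapped kernel segment
```

Both segments therefore address the same 512 MB of physical memory, 0x0000 0000 through 0x1FFF FFFF, and differ only in cacheability.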

Kernel Mode, Supervisor Space (ksseg)

In Kernel mode (KSU = 00₂ in the Status register), when the most-significant three bits of the 32-bit virtual address are 110₂, the ksseg virtual address space is selected; it is the current 2^29-byte (512 MB) supervisor virtual space. The virtual address is extended with the contents of the 8-bit ASID field to form a unique virtual address.


Kernel Mode, Kernel Space 3 (kseg3)

In Kernel mode (KSU = 00₂ in the Status register), when the most-significant three bits of the 32-bit virtual address are 111₂, the kseg3 virtual address space is selected; it is the current 2^29-byte (512 MB) kernel virtual space. The virtual address is extended with the contents of the 8-bit ASID field to form a unique virtual address.


6.3 System Control Coprocessor

The System Control Coprocessor (COP0) is implemented as an integral part of the CPU,

and supports memory management, address translation, exception handling, and other

privileged operations. The COP0 registers shown in Figure 6-7 plus a 48-entry TLB make

up the MMU.

Each COP0 register has a unique number that identifies it; this number is referred to as the register number. For instance, the PageMask register is register number 5.

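[Figure 6-7 diagram: the COP0 registers listed below, shown with their register numbers alongside the TLB, which has 48 entries (0 to 47) of 128 bits each (bits 127 to 0). Part of the TLB is marked as “Safe” entries (see Random register, contents of TLB Wired).]

Register     Register number
Index        0
Random       1
EntryLo0     2
EntryLo1     3
Context      4
PageMask     5
Wired        6
BadVAddr     8
EntryHi      10
Status       12

Figure 6-7. COP0 Registers and the TLB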


6.3.1 Format of a TLB Entry

Figure 6-8 shows the TLB entry format for the 32-bit address translation mode. Each field of an entry has a corresponding field in the EntryHi, EntryLo0, EntryLo1, or PageMask registers. For example, the Mask field of the TLB entry is also held in the PageMask register.

The formats of the EntryHi, EntryLo0, EntryLo1, and PageMask registers are nearly the same as the TLB entry. The one exception is the Global field (G bit), which is used in the TLB, but is reserved in the EntryHi register. The following register tables describe the TLB entry fields shown in Figure 6-8.

[Figure 6-8 diagram: 128-bit TLB entry in 32-bit mode of the C790 processor.]

  Bits 127:121   0       (7 bits)
  Bits 120:109   MASK    (12 bits)
  Bits 108:96    0       (13 bits)
  Bits 95:77     VPN2    (19 bits)
  Bit  76        G       (1 bit)
  Bits 75:72     0       (4 bits)
  Bits 71:64     ASID    (8 bits)
  Bits 63:58     0       (6 bits)
  Bits 57:38     PFN     (20 bits)
  Bits 37:35     C       (3 bits)
  Bit  34        D       (1 bit)
  Bit  33        V       (1 bit)
  Bit  32        0       (1 bit)
  Bits 31:26     0       (6 bits)
  Bits 25:6      PFN     (20 bits)
  Bits 5:3       C       (3 bits)
  Bit  2         D       (1 bit)
  Bit  1         V       (1 bit)
  Bit  0         0       (1 bit)

Figure 6-8. Format of a TLB Entry


PageMask Register

  Bits 31:25   0      (7 bits)
  Bits 24:13   MASK   (12 bits)
  Bits 12:0    0      (13 bits)

MASK  Page comparison mask.

0     Reserved. Must be written as zeroes, and returns zeroes when read.

EntryHi Register

  Bits 31:13   VPN2   (19 bits)
  Bits 12:8    0      (5 bits)
  Bits 7:0     ASID   (8 bits)

VPN2  Virtual page number divided by two (maps to two pages).

ASID  Address space ID field. An 8-bit field that lets multiple processes share the TLB; each process has a distinct mapping of otherwise identical virtual page numbers.

0     Reserved. Must be written as zeroes, and returns zeroes when read.

EntryLo0 Register and EntryLo1 Register (identical layout)

  Bits 31:26   0      (6 bits)
  Bits 25:6    PFN    (20 bits)
  Bits 5:3     C      (3 bits)
  Bit  2       D      (1 bit)
  Bit  1       V      (1 bit)
  Bit  0       G      (1 bit)

PFN   Page frame number; the upper bits of the physical address.

C     Specifies the TLB page coherency attribute; see Table 6-6.

D     Dirty. If this bit is set, the page is marked as dirty and, therefore, writable. This bit is actually a write-protect bit that software can use to prevent alteration of data.

V     Valid. If this bit is set, it indicates that the TLB entry is valid; otherwise, a TLB invalid exception occurs.

G     Global. If this bit is set in both EntryLo0 and EntryLo1, then the processor ignores the ASID during TLB lookup.

0     Reserved. Must be written as zeroes, and returns zeroes when read.
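The register layouts above translate directly into shift-and-mask code. The following sketch packs and unpacks the EntryLo and EntryHi fields using the bit positions given in this section; the function names are hypothetical helpers, not part of any C790 software interface.

```python
def pack_entry_lo(pfn: int, c: int, d: int, v: int, g: int) -> int:
    """Build an EntryLo0/EntryLo1 value: PFN 25:6, C 5:3, D 2, V 1, G 0."""
    return (((pfn & 0xFFFFF) << 6) | ((c & 0x7) << 3)
            | ((d & 1) << 2) | ((v & 1) << 1) | (g & 1))


def unpack_entry_lo(value: int) -> dict:
    """Split an EntryLo0/EntryLo1 value into its fields."""
    return {
        "pfn": (value >> 6) & 0xFFFFF,
        "c": (value >> 3) & 0x7,
        "d": (value >> 2) & 1,
        "v": (value >> 1) & 1,
        "g": value & 1,
    }


def pack_entry_hi(vpn2: int, asid: int) -> int:
    """Build an EntryHi value: VPN2 31:13, reserved 12:8 zero, ASID 7:0."""
    return ((vpn2 & 0x7FFFF) << 13) | (asid & 0xFF)
```

Reserved bits are left as zero, as the field descriptions require.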

The TLB page coherency attribute (C) bits specify whether references to the page should be cached, uncached, or uncached accelerated. Table 6-6 shows the coherency attributes selected by the C bits.


Table 6-6 TLB Page Coherency (C) Bit Values

C[5:3] Value Page Coherency Attribute

0 Reserved

1 Reserved

2 Uncached

3 Cacheable, write-back, write-allocate

4 Reserved

5 Reserved

6 Reserved

7 Uncached, Accelerated

Write-back with allocate fetches the line with the missed data both on load misses and on

store misses. Therefore, storing data to such pages is always performed to the data cache

and will not be sent to the write buffer.

Uncached accelerated data provides a special kind of acceleration for handling uncached data. On a load of an uncached accelerated data item (which can range in size from a byte to a quadword), the C790 always fetches an aligned 128-byte quantity from memory. These eight quadwords are placed in a special 128-byte buffer in the CPU called the uncached accelerated buffer, or UCAB. Any subsequent loads which “hit” the UCAB get their data from the UCAB. This process reduces bus traffic. The UCAB is invalidated under the following conditions:

• Any load operation which doesn’t hit the buffer, or

• any store operation, or

• a SYNC (or SYNC.L) operation, or

• any exception.
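The UCAB behavior described above can be modeled as a one-entry buffer keyed by the aligned 128-byte block address. This is a simplified sketch under stated assumptions: the class name is hypothetical, only uncached accelerated loads are modeled, and a missing load is assumed to refill the buffer with the newly fetched block.

```python
UCAB_BYTES = 128


class Ucab:
    """One-entry model of the uncached accelerated buffer."""

    def __init__(self):
        self.base = None                  # aligned address held, or None

    def load(self, pa: int) -> bool:
        """Return True on a UCAB hit; on a miss, refetch the aligned block."""
        base = pa & ~(UCAB_BYTES - 1)
        if self.base == base:
            return True
        self.base = base                  # old contents are invalidated
        return False

    def invalidate(self):
        """Called for any store, SYNC (or SYNC.L), or exception."""
        self.base = None
```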

For uncached accelerated stores, the C790 write-back buffer (128-bit x 8) also has some

special features. On the first store of an uncached accelerated write the write-back buffer

will mark the fact that this is an uncached accelerated write to a particular address.

Subsequent uncached accelerated stores which hit within the same 128-bit address

boundary will be accumulated (gathered) within the same write buffer entry. This process

of data gathering reduces bus traffic. The gathering process will be terminated under the

following conditions:

• Any store which can’t be gathered (different attribute or different address), or

• any load operation, or

• a SYNC (or SYNC.L) operation, or

• any exception.


6.4 Virtual-to-Physical Address Translation Process

In the supported 32-bit mode, the highest 8 to 20 bits of the virtual address (depending

upon the page size) are compared to the contents of the TLB virtual page number. The 8-

bit ASID is only compared if the global bit, G, is not set.

If a TLB entry matches, the physical address and access control bits (C, D, and V) are retrieved from the matching TLB entry. While the V bit of the entry must be set for a valid translation to take place, it is not involved in the determination of a matching TLB entry.

Figure 6-9 illustrates the TLB address translation process.


[Figure 6-9 flowchart, summarized from its decision boxes:]

Input: Virtual Address (VPN and ASID).

• User mode, Supervisor mode, or Kernel mode: is the access allowed for the address? If not, an Address Error exception is taken. For valid address space, see the section describing Operating Modes in this chapter.

• Mapped area? If not, the access is an unmapped access and no TLB lookup is made.

• VPN match? If the VPN matches, G = 1 or an ASID match completes the match. If no entry matches, a TLB Refill exception is taken.

• V = 1? If not, a TLB Invalid exception is taken.

• Write access with D = 1? If the access is a write and the page is not dirty, a TLB Mod exception is taken.

• C = 010 or 111? If so, the access is non-cacheable and goes to main memory; otherwise the access goes to the cache.

Output: Physical Address.

Figure 6-9. TLB Address Translation


If there is no TLB entry that matches the virtual address, a TLB miss exception occurs. If the access control bits (D and V) indicate that the access is not valid, a TLB modified or TLB invalid exception occurs.

If the C bits equal 010₂ (Uncached) or 111₂ (Uncached Accelerated), the physical address that is generated directly accesses main memory, bypassing the cache.
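The checks described in this section follow a fixed order once the address is known to lie in a mapped segment. The sketch below is illustrative only: the function name and the returned strings are hypothetical, and the rule that a write to a page with D = 0 raises the TLB modified exception follows the description of the D bit as a write-protect bit.

```python
def check_mapped_access(matched: bool, v: int, d: int, c: int,
                        is_write: bool) -> str:
    """Return the outcome of a mapped access after the TLB lookup."""
    if not matched:
        return "TLB miss exception"
    if not v:
        return "TLB invalid exception"
    if is_write and not d:
        return "TLB modified exception"
    if c in (0b010, 0b111):               # Uncached or Uncached Accelerated
        return "access main memory"
    return "access cache"
```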

6.5 TLB Instructions

Table 6-7 lists the instructions that the CPU provides for working with the TLB. See Appendix C for a detailed description of these instructions.

Table 6-7. TLB Instructions

OpCode Description of Instruction

TLBP Translation Look-aside Buffer Probe

TLBR Translation Look-aside Buffer Read

TLBWI Translation Look-aside Buffer Write Index

TLBWR Translation Look-aside Buffer Write Random

Chapter 7 Caches


7. Caches

The C790 core contains both an instruction cache and a separate data cache. The processor also contains a small read-only cache memory for the uncached accelerated area.

This chapter describes the cache structures, operation of the caches, and cache control.


7.1 Cache Features

The two caches are configured as shown in Table 7-1:

Table 7-1. Cache Configuration

Cache               Size     Organization   Line Size   Refill Size
Instruction Cache   32 KB    2-Way          64 bytes    64 bytes
Data Cache          32 KB    2-Way          64 bytes    64 bytes

The following are the main features of the caches:

• Separate Instruction Cache and Data Cache

• Virtually indexed and physically tagged caches

• 64 Byte line size

• 64 Byte Refill size

• 2-way set-associative cache for higher performance

• Write-back policy for the Data Cache

• Missed quadword first sequential order burst refills for the Data Cache

• Data Cache line locking

• Non-Blocking Loads

• Data cache supports multiple Hits under a single miss

• No Snoop capability

No cache snoop capability has been provided. The user may choose to use CACHE instructions to keep coherency between caches and main memory.


7.2 Organization of the Caches

Organization of the caches is illustrated in Figure 7-1 and Figure 7-2. Both the Instruction Cache and the Data Cache are 2-way set-associative. Each cache line consists of a tag and data.

Each cache has a data line size of 64 bytes.

7.2.1 Data Cache

The Data Cache is connected to the CPU via a 128-bit bus. Therefore, the Data Cache can supply to the CPU or the coprocessors up to a quadword of data per access.

The following diagram shows Data Cache structure. Tags are discussed in detail in a later

section.

[Figure 7-1 diagram: the Virtual Index selects one of 256 entries in each of Way0 and Way1. Each entry holds a physical tag (Phys. Tag0 or Phys. Tag1: the L, R, V, and D bits plus a 20-bit PFN) and 64 bytes of data (Data0 or Data1).]

L   Lock Bit    For description, see Section 7.3.7, Data Cache Lock Function
R   LRF Bit     For description, see Section 7.3.1, Line Replacement Algorithm
V   Valid Bit   For description, see Section 7.2.3, Tag Structure
D   Dirty Bit   For description, see Section 7.2.3, Tag Structure

Figure 7-1. Organization of Data Cache


7.2.2 Instruction Cache

The Instruction Cache is connected to the CPU pipeline via a 64-bit bus. This enables the

CPU to fetch two instructions per cycle from the Instruction Cache.

The following diagram shows Instruction Cache structure. Tags are discussed in detail in

a later section.

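[Figure 7-2 diagram: the Virtual Index selects one of 256 entries in each of Way0 and Way1. Each entry holds a physical tag (Phys. Tag0 or Phys. Tag1: the R and V bits plus a 20-bit PFN) and 64 bytes of data (Data0 or Data1).]

R   LRF Bit
V   Valid Bit

Figure 7-2. Organization of Instruction Cache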


7.2.3 Tag Structure

The general structure of a tag consists of a set of state bits and a physical page frame number or PFN field. The Data Cache and the Instruction Cache have different numbers of state bits; for more information, refer to the discussions in the following sections.

The size of the tag and the number of virtual address bits indexing the caches are dependent upon the size of the cache, address space, and set associativity. The C790 supports 32-bit virtual and physical addresses as shown in the figure below:

Virtual Address (VA):    VPN in bits 31:12 (bits 13:12 marked separately), OFFSET in bits 11:0

Physical Address (PA):   PFN in bits 31:12 (bits 13:12 marked separately), OFFSET in bits 11:0

Since the cache line size is fixed at 64 bytes, that is, four quadwords per entry, the Tag

Cache associated with each way will have one tag for every four quadwords. Table 7-2

shows cache sizes, address bits and tag size.

Table 7-2. Cache Size and Access Bits

Cache         Size   Way     Size of Each Way   Cache Virtual Address Index Bits   Tag Cache Size of Each Way   Tag Virtual Address Index
Data          32 K   2 WAY   256 x 64 Bytes     13:4                               256 x 20 Bits                13:6
Instruction   32 K   2 WAY   256 x 64 Bytes     13:3                               256 x 20 Bits                13:6

While the caches are indexed by the virtual address, the tag comparison is physical. This

is possible because the caches and the TLB are accessed in parallel. So, when the tags

have been accessed, the page frame number is ready to be compared against the

translated virtual address for a cache hit or miss.
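The virtually indexed, physically tagged lookup can be sketched as follows. The function names are hypothetical; the index uses virtual address bits 13:6 (the tag index from Table 7-2), which selects one of the 256 lines in each way, and the hit test compares the stored PFN against the PFN produced by the TLB.

```python
LINE_BYTES = 64
LINES_PER_WAY = 256


def cache_line_index(va: int) -> int:
    """Line index within a way: virtual address bits 13:6."""
    return (va >> 6) & (LINES_PER_WAY - 1)


def line_offset(va: int) -> int:
    """Byte offset within the 64-byte line: address bits 5:0."""
    return va & (LINE_BYTES - 1)


def tag_hit(tag_pfn: int, tag_valid: bool, translated_pfn: int) -> bool:
    """A way hits when its tag is valid and its PFN equals the TLB output."""
    return tag_valid and tag_pfn == translated_pfn
```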

C790 Programming Note:

Overlap between the cache index bit range and the PFN bit range causes the “cache aliasing problem”. The C790 does not have any hardware mechanism to detect cache aliasing. It is the programmer’s responsibility to avoid it. When a physical page is mapped to different virtual pages, VPN[13:12] must be the same in all of those virtual addresses. The conservative way to guarantee this is to ensure that VPN[13:12] == PFN[13:12] whenever a page is mapped.
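The rule in this note reduces to comparing address bits 13:12. The following sketch, with hypothetical function names, checks two virtual mappings of the same physical page and also checks the conservative rule for a single mapping.

```python
def may_alias(va_a: int, va_b: int) -> bool:
    """True if two virtual addresses of one physical page differ in bits 13:12."""
    return (((va_a ^ va_b) >> 12) & 0b11) != 0


def follows_conservative_rule(va: int, pa: int) -> bool:
    """True if VPN[13:12] == PFN[13:12] for this mapping."""
    return ((va >> 12) & 0b11) == ((pa >> 12) & 0b11)
```

Mappings at 0x0000 1000 and 0x0000 5000 agree in bits 13:12 and index the same cache lines, whereas 0x0000 1000 and 0x0000 2000 do not.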


7.2.3.1 Data Cache Tag Structure

In addition to the physical page frame number (PFN), each Data Cache Tag entry also contains additional Cache State bits as shown below. All lines in both ways of the Data Cache have these four state bits. Cache line state bits are also illustrated in Figure 7-1.

Two state bits, DIRTY and VALID, together identify which of three states the Data Cache line is in: Valid Clean, Valid Dirty, or Invalid. Table 7-3 shows the state of the Data Cache line as a function of the DIRTY and VALID bits.

Table 7-3. Data Cache Line States

Dirty Bit (D) Valid Bit (V) Cache Line State

X 0 Invalid

0 1 Valid Clean

1 1 Valid Dirty

The LRF bit is the Least-Recently-Filled line replacement bit. The LRF bits serve as a replacement algorithm between the two ways of the Data Cache. A refill access to a cache line in a way will flip the LRF bit to point to the other way as the least recently filled. For details of the LRF line update operation refer to Section 7.3.1.

As Figure 7-1 illustrates, Data Cache lines in each way have a LOCK bit. The LOCK bit, as explained in Section 7.3.7, Data Cache Lock Function, locks lines in one of the ways to keep data from being replaced.

7.2.3.2 Instruction Cache Tag Structure

In addition to the physical page frame number (PFN), each Instruction Cache Tag entry also contains two additional Cache State bits as shown below. All lines in both ways of the Instruction Cache have these two state bits.

The Instruction Cache VALID state bit defines whether each line is in the Valid or Invalid state.

The LRF bit is the Least-Recently-Filled line replacement bit. LRF bits serve as a replacement algorithm between the two ways of the Instruction Cache. A refill access to a cache line in a way will flip the LRF bit to point to the other way as the least recently filled. For details of the LRF line update operation refer to Section 7.3.1.

Data Cache Tag Fields: Dirty (D), Valid (V), LRF (R), Lock (L), PFN

Instruction Cache Tag Fields: Valid (V), LRF (R), PFN

Note: Even if a CACHE instruction tries to set the V = 0, D = 1 state, the Dirty bit is forced to zero in the C790 implementation.


7.2.4 State of Cache Tags After Reset

For all Data Cache tags the following fields are initialized to 0 upon reset:

• Valid

• Dirty

• LRF

• Lock

For all Instruction Cache tags the following fields are initialized to 0 upon reset:

• Valid

• LRF

All other fields in the Instruction Cache and the Data Cache contents are undefined upon

reset.


7.3 Cache Operations

This section describes cache operation in regard to read/write policies, coherency, write-back policy, and the lock function.

7.3.1 Line Replacement Algorithm

The line replacement policy for both the Instruction Cache and the Data Cache is based on the Least Recently Filled (LRF) algorithm. In this policy, the LRF bit of a way is modified (inverted) only when a cache line refill occurs to the corresponding way. Load/store accesses to the Data Cache do not modify the LRF bit. The bit indicating which way is the least recently filled way is the XOR of the two LRF bits of the two ways of the cache.

Table 7-4. LRF Line Replacement Algorithm

Current     Current             Refill   New         New
Way0 LRF    Way1 LRF    XOR     Way      Way0 LRF    Way1 LRF
0           0           0       0        1           0
1           0           1       1        1           1
1           1           0       0        0           1
0           1           1       1        0           0

The column under XOR indicates the way which could be refilled (line replaced) on the next refill at that line location. Note that the table shown above is valid only when none of the ways of the cache line is locked. If a way of the cache line is locked, then regardless of the state of the LRF bits, the least recently filled way will always be the unlocked way.

The behavior is also slightly different for the Instruction and Data Caches when one of the ways is invalid. For the Data Cache the algorithm is followed exactly as given above, irrespective of the ways being valid or invalid. For the Instruction Cache the algorithm given above is followed as long as both ways are valid. Once a way becomes invalid, that way gets priority of being filled over the valid way, irrespective of the LRF bits.
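The replacement rules above can be modeled in a few lines. The following is a minimal sketch in C under stated assumptions, not a description of the actual C790 logic: the structure, field, and function names are hypothetical, and the case where both Instruction Cache ways are invalid is not covered by the text, so the sketch falls back to the LRF bits there.

```c
#include <assert.h>
#include <stdbool.h>

/* Per-index replacement state for one 2-way set. Names are hypothetical;
 * the rules follow Section 7.3.1 as written. */
typedef struct {
    bool lrf[2];    /* LRF bit of way 0 and way 1 */
    bool lock[2];   /* LOCK bit (Data Cache only; keep false for the I-cache) */
    bool valid[2];  /* VALID bit of each way */
} set_state_t;

/* Return the way (0 or 1) that the next refill replaces. */
int lrf_pick_way(const set_state_t *s, bool is_icache)
{
    /* Locking both ways is undefined in the manual; reject it here. */
    assert(!(s->lock[0] && s->lock[1]));

    if (s->lock[0]) return 1;   /* only the unlocked way can be replaced */
    if (s->lock[1]) return 0;

    /* Instruction Cache only: a single invalid way is filled first.
     * Assumption: if both ways are invalid, fall through to the LRF bits. */
    if (is_icache && s->valid[0] != s->valid[1])
        return s->valid[0] ? 1 : 0;

    /* Table 7-4: the XOR of the two LRF bits names the refill way. */
    return (s->lrf[0] != s->lrf[1]) ? 1 : 0;
}

/* Perform a refill: choose the way, invert its LRF bit, mark it valid. */
int lrf_refill(set_state_t *s, bool is_icache)
{
    int way = lrf_pick_way(s, is_icache);
    s->lrf[way] = !s->lrf[way];
    s->valid[way] = true;
    return way;
}
```

Starting from both LRF bits at 0 and calling lrf_refill four times for the Data Cache walks through the four rows of Table 7-4 in order, alternating between way 0 and way 1.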

7.3.2 Non-blocking Loads and Hit Under Miss

The Data Cache supports non-blocking load and hit under miss to improve performance.

When a Data Cache miss occurs or an uncached load instruction is issued, Non-blocking load allows the pipeline to continue instruction execution until one of the following occurs:

1. A subsequent non-load/store/pref instruction has data dependency with the load that is pending (to be retired).

2. A pipeline0 stalls.


Hit under miss is a feature that allows access (load or store) to the Data Cache while a previous load miss (cached, uncached or uncached accelerated), a previous store miss (cached) or a previous prefetch miss (cached) is still pending. In this case, access to the cache proceeds and the pipe does not stall.

Uncached loads also do not stall the pipeline while they are pending (to be retired). The

pipeline continues instruction execution until one of the following occurs:

1. A subsequent load/store/pref instruction has data dependency with the load that

is pending (to be retired).

2. A Data Cache miss occurs or a miss occurs on the Uncached Accelerated Buffer.

3. An Uncached load instruction is issued.

To summarize, Non-blocking load and Hit under miss allow the pipeline to continue instruction execution until one of the following occurs when a Data Cache miss occurs or an uncached load instruction is issued:

1. A subsequent instruction has data dependency with the load that is pending (to be retired).

2. A Data Cache miss occurs or a miss occurs on the Uncached Accelerated Buffer.

3. An uncached load instruction is issued.

4. A pipeline0 stalls.

Loads to the GPRs (IU) and FPRs (FPU) all follow the non-blocking protocol (when it is enabled). Loads to COP1 are always blocking.

7.3.3 Cache Miss and Hit Operations

In case of a Data Cache hit, the cache provides data to the CPU in 128-bit (single quadword) quantities. In case of an Instruction Cache hit, the cache provides data (“instruction”) in 64-bit quantities. CPU reads or writes to the Data Cache in quantities less than 128 bits are specified by the least significant four bits of the address, bits 3:0.

Cache misses are processed by the cache controller in 64-byte quantities - one cache line. Since the caches are connected to the system bus via a 128-bit bus, cache refill takes a burst of 4 bus cycles (8 CPU cycles); that is, four quadwords are transferred in 4 bus cycles (actual transfer time can be more due to bus arbitration etc.). These reads are performed in sequential order for both the Instruction Cache and the Data Cache. The quadword for which the address missed is always fetched first.

Table 7-5 indicates the sequential order. PA[5:4] are two least-significant address bits that are put out on the CPU Bus. Figure 7-3 illustrates the case where the second quadword, shaded area, missed and shows the order in which data are read from main memory.


Table 7-5. Quadword Retrieved Address PA[5:4]

             Starting Block Address PA[5:4]
Bus Cycle    00    01    10    11
1            00    01    10    11
2            01    10    11    00
3            10    11    00    01
4            11    00    01    10

Quadword PA[5:4]    11          10          01          00
Width               128 bits    128 bits    128 bits    128 bits
Read order          Third       Second      First       Fourth

Figure 7-3. Read Missed Processed in Sequential Order
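The wrap-around order in Table 7-5 and Figure 7-3 can be reproduced with simple modular arithmetic. The following is a minimal sketch in C; the function name and the output array are hypothetical, and only the PA[5:4] sequence comes from the table.

```c
#include <assert.h>
#include <stdint.h>

/* Fill order[0..3] with the PA[5:4] value fetched in bus cycles 1..4 of a
 * cache line refill, starting at the quadword that missed (Table 7-5). */
void refill_order(uint32_t miss_pa, unsigned order[4])
{
    assert(order);
    unsigned start = (miss_pa >> 4) & 0x3u;    /* PA[5:4] of the missed quadword */
    for (unsigned cycle = 0; cycle < 4; cycle++)
        order[cycle] = (start + cycle) & 0x3u; /* sequential, wrapping within the line */
}
```

For a miss on the second quadword (PA[5:4] = 01) the function produces 01, 10, 11, 00, which matches the read order shown in Figure 7-3.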

In case of a write miss to the Data Cache (for an allocate-on-write address), the cache controller will read in sequential order a cache line from main memory. Whether the cache line being replaced is first written out to memory or not, due to the DIRTY bit being set, is discussed in the next section.

The Instruction Cache processes cache misses in bursts of 4 quadwords, just like the Data Cache. Furthermore, in case of an Instruction Cache miss, the pipeline starts in the same cycle the final quadword is stored into the Instruction Cache.

7.3.4 Data Cache Writeback Policy

Data cache lines are written back to the memory in the following cases:

1. The processor executes the Index Writeback Invalidate CACHE instruction suboperation as defined in Appendix C and the line data are dirty, or the Hit Writeback Invalidate or Hit Writeback without Invalidate CACHE suboperations hit on the Data Cache and the line data are dirty.

2. A read or write miss occurs and the line data are dirty. In this case the line has

to be written to memory before it can be replaced by the miss data.


7.3.5 Data Cache State Transitions

As discussed previously, lines in the Data Cache can be in one of several states: Invalid, Valid Clean or Valid Dirty.

Invalid means the Data Cache entry does not contain valid data. Upon a miss, the cache can load data into this cache line with no further actions.

The Valid Clean state indicates that there are valid data in the Data Cache line and they are the same as memory. All writeback segments have their data in the Valid Clean state until they are written to by the processor.

The C790 supports the write-back protocol, hence the need for a Valid Dirty state. A Data Cache line transitions to the Valid Dirty state when the cache line is written to without reflecting the operation on the bus - the writeback protocol. In this case, the data in the cache does not match the data in memory.

Figure 7-4 shows the transition diagram of the Data Cache performing according to the writeback policy. For details on the CACHE operation, refer to Appendix C.

[State diagram with three states: Invalid, Valid Clean, Valid Dirty.
Transitions into Valid Clean: Read Miss; PREF Miss; CACHE Index Store Tag (if V = 1, D = 0); CACHE Hit W/B without Invalidate (if hit).
Transitions into Valid Dirty: CPU Write; Write Miss; CACHE Index Store Tag (if V = 1, D = 1).
Transitions into Invalid: CACHE Index Invalidate; CACHE Index WriteBack Invalidate; CACHE Hit WriteBack Invalidate (if hit); CACHE Hit Invalidate (if hit); CACHE Index Store Tag (if V = 0); Reset.
CPU Read leaves Valid Clean and Valid Dirty unchanged; CPU Write leaves Valid Dirty unchanged.]

Figure 7-4. Data Cache Transition Diagram, Writeback Protocol


7.3.6 Instruction Cache State Transitions

Cache lines in the Instruction Cache can be in either of two states: Invalid or Valid.

Invalid means the Instruction Cache entry does not contain valid instruction data. Upon a miss, the cache can load instructions into this cache line with no further actions.

The Valid state indicates that there are valid instructions in the cache line and so there is no need for miss processing.

The transition diagram for the Instruction Cache is simple; refer to Figure 7-5. For details on the CACHE instructions refer to Appendix C.

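[State diagram with two states: INVALID, VALID.
Transitions into VALID: CPU Read Miss; CACHE Fill; CACHE Index Store Tag (if V = 1).
Transitions into INVALID: CACHE Index Store Tag (if V = 0); CACHE Index Invalidate; CACHE Hit Invalidate (if hit); Reset.
CPU Read leaves VALID unchanged.]

Figure 7-5. Instruction Cache Transition Diagram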

7.3.7 Data Cache Lock Function

In a 2-way set-associative Data Cache, such as the one present in the C790, there is no

explicit way of forcing data to be retained in the cache. The LRF-based mechanism

dynamically determines which cache line should be replaced. A Data Cache lock function

has been defined to aid in retaining critical pieces of data in the Data Cache under strict

program control.

Each entry on each way of the Data Cache has a Lock (L) bit. The Lock bit aids in locking

the line by writing directly into it. After locking the line, the LRF bit is no longer

meaningful. Thus, if one of the ways for a particular line is locked, the other way is the

only way available for caching. Thus, once a line is locked with a particular physical

address tag, any other virtual address which maps onto the same cache line will have only

a direct mapped location rather than a 2-way location.

To lock the Data Cache, the following two CACHE instruction suboperations can be used:

INDEX STORE TAG (DCACHE)

INDEX STORE DATA (DCACHE)

For details of the above CACHE instruction suboperations refer to Section 7.6. To lock a Data Cache line, the following code sequence can be used:


    li    t0,0x00010068   //PTagLo = 0x00010, D=V=L=1, R=0
    mtc0  t0,$28          //t0 -> TagLo
    sync.l
    cache 18,0(r0)        //TagLo -> Tag(way0)
    sync.l
    la    s0,0x00010000
    sw    t1,0(s0)        //store contents of t1 into
                          //locked cache line

In this example, the tag has been modified using the CACHE instruction and the data has been updated using a Store instruction.

The following restrictions apply to line locking:

• The result of re-locking a locked line is undefined

• The results of locking both ways of a cache line are undefined

To unlock Data Cache lines, the following code sequence can be used:

    li    t0,0x00010060   //D=V=1, L=R=0
    mtc0  t0,$28          //t0 -> TagLo
    sync.l
    cache 18,0(r0)        //TagLo -> Tag(way0)
    sync.l

7.3.7.1 Operations During Lock

When the lock bit is set for a cache line (index), only the other way is available for handling cache misses. The misses are blocking. A write access to a locked line in the Data Cache takes place only to the cache without affecting the state of memory. Writes to locked cache lines will not set the DIRTY (D) bit.

7.3.8 Relationship Between Cached and Uncached Operations

Uncached and Uncached Accelerated load and store operations are always executed in order on the CPU bus. Cached load operations can precede earlier store data present in buffers on the CPU bus. All store data present in buffers prevents a SYNC (or SYNC.L) instruction from completing until the store data has been sent either to the Data Cache or the CPU bus.

Stores with the uncached and uncached accelerated attributes bypass the Data Cache

completely.


7.4 Uncached Accelerated Buffer

The C790 has a small read-only cache memory for the uncached accelerated area to reduce bus traffic. This read-only cache, the Uncached Accelerated Buffer (UCAB), is filled only by the refill process caused by a load miss on the UCAB. Once load instructions hit on the UCAB, data are provided directly from the UCAB. The UCAB is invalidated under the following conditions:

• Any load operation which doesn’t hit the UCAB, or

• Any store operation, or

• A SYNC (or SYNC.L) operation, or

• Any exception

Snoop is not supported for the UCAB.

7.4.1 UCAB Configuration

The UCAB is configured as shown in Table 7-6.

Table 7-6. UCAB Configuration

                              Size        Organization   Line Size   Refill Size
Uncached Accelerated Buffer   128 bytes   Direct Map     128 bytes   128 bytes

7.4.2 Tag Structure

The UCAB is also indexed by the virtual address; the tag comparison is physical. Table 7-7 shows the UCAB size and access bits.

Table 7-7. UCAB Size and Access Bits

        Size     Way Size                  UCAB Virtual Index Bits   UCAB Tag Size   UCAB Tag Virtual Index Bits
UCAB    128 B    Direct Map 1×128 Bytes    6:4                       1×25 Bits

The least significant 5 bits of the UCAB Tag ([11:7]) are identical to the virtual address bits [11:7]. The UCAB Tag has one valid bit. The UCAB Tag doesn’t have Dirty, LRF, or Lock bits. The valid bit of the UCAB Tag is initialized to 0 upon reset.

7.4.3 Non-blocking Loads and Hit Under Miss

The UCAB also supports non-blocking load and hit under miss, as the Data Cache does. Non-blocking load and Hit under miss allow the pipeline to continue instruction execution until one of the following occurs when an Uncached Accelerated Buffer miss occurs:

1. A subsequent instruction has data dependency with the load that is pending (to be retired).

2. A Data Cache miss occurs or a miss occurs on the UCAB.

3. An uncached load instruction is issued.

4. A pipeline0 stalls.


7.5 Cache Control Registers

The operations of the caches are controlled by certain programmable bits in the Config register. These bits are:

ICE   Instruction Cache Enable
DCE   Data Cache Enable
IC    Instruction Cache Size
DC    Data Cache Size
IB    Icache Line Size
DB    Dcache Line Size

For details of these configuration bits refer to the COP0 register section.

The two cache tag registers TagLo and TagHi are 32-bit read/write registers that hold the tag and state of the cache line during initialization and diagnostics. The Tag registers are manipulated by MTC0 and CACHE instructions.

TagLo

31        12 11     7  6  5  4  3  2     0
PTagLo       0         D  V  R  L  0

where

PTagLo   Specifies physical address bits 31:12
D        Cache State DIRTY bit (Not used for the Instruction Cache)
V        Cache State VALID bit
R        LRF Bit
L        LOCK Bit (Not used for the Instruction Cache)
0        Must be written as zeros, will return zero on reads
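The TagLo field positions can be captured as bit masks, which makes it easier to check constants such as the 0x00010068 value used in the lock sequence of Section 7.3.7. The following is a minimal sketch in C; the macro and function names are hypothetical, and only the bit positions come from the layout above.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions taken from the TagLo layout; all names are hypothetical. */
#define TAGLO_PTAG_SHIFT 12u     /* PTagLo occupies bits 31:12 */
#define TAGLO_D (1u << 6)        /* DIRTY */
#define TAGLO_V (1u << 5)        /* VALID */
#define TAGLO_R (1u << 4)        /* LRF   */
#define TAGLO_L (1u << 3)        /* LOCK  */

/* Build a TagLo value from physical address bits 31:12 and state flags.
 * Bits 11:7 and 2:0 stay zero, as the register requires. */
uint32_t taglo_pack(uint32_t ptag, uint32_t flags)
{
    assert(ptag <= 0xFFFFFu);    /* PTagLo is a 20-bit field */
    assert((flags & ~(TAGLO_D | TAGLO_V | TAGLO_R | TAGLO_L)) == 0u);
    return (ptag << TAGLO_PTAG_SHIFT) | flags;
}

/* Extract PTagLo (physical address bits 31:12) from a TagLo value. */
uint32_t taglo_ptag(uint32_t taglo)
{
    return taglo >> TAGLO_PTAG_SHIFT;
}
```

With these definitions, taglo_pack(0x00010, TAGLO_D | TAGLO_V | TAGLO_L) gives 0x00010068 (the lock example) and taglo_pack(0x00010, TAGLO_D | TAGLO_V) gives 0x00010060 (the unlock example).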

The TagHi register contains instruction- and operation-specific items (see the next section).


7.6 CACHE Instruction

For information on the CACHE instruction, please refer to Appendix C.

Chapter 8 CPU Bus

8-1

8. CPU Bus

The C790 CPU core is connected to the rest of the system [1], and to external devices, through the group of on-chip C790 system bus signals called the CPU Bus. This chapter defines the architecture of the CPU Bus and describes it in the context of an overall system design.

This chapter describes the following:

• the CPU Bus architecture and agents on the CPU Bus

• the types of transactions possible between agents on the bus

• the bus protocols for transactions

1 The system consists of a DMA Controller (DMAC) as a master, and various slave devices.


8.1 Introduction

The CPU Bus is an on-chip bus in a highly integrated processor. All agents (see definitions in Section 8.1.1 below) on the CPU Bus are equipped with a CPU Bus interface unit connected via CPU Bus signals. An agent acts like a master when it initiates reads or writes on the bus. An agent acts like a slave when it responds to reads or writes initiated by a master. For the CPU Bus to operate properly, an arbiter is needed to perform arbitration between the CPU and the other bus masters. The arbiter is located in the CPU, and CPU arbitration behavior is discussed in Section 8.5.1, Arbitration Operations.

The following are main features of the CPU Bus:

• Separate data and address buses (Demultiplexed operation)

• 128-bit data bus

• Clocked synchronous operations

• Peak transfer rate of 2.1 GB/sec (@ 133 MHz bus clock)

• 8/16/32/64/128-bit and burst accesses

• Multimaster capability

• Pipelined operations

• No turn-around or dead cycles between transfers
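The quoted peak transfer rate follows from the bus width and the bus clock. The following is a minimal sketch in C of that arithmetic, assuming one full 128-bit transfer per bus clock with no dead cycles and decimal gigabytes; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Peak bytes per second for a bus that moves one full-width transfer on
 * every bus clock. The function name is hypothetical. */
uint64_t peak_bytes_per_sec(unsigned bus_width_bits, uint64_t bus_clock_hz)
{
    assert(bus_width_bits % 8u == 0u);   /* width must be a whole number of bytes */
    return (uint64_t)(bus_width_bits / 8u) * bus_clock_hz;
}
```

For a 128-bit bus at 133 MHz this gives 16 bytes x 133,000,000 = 2,128,000,000 bytes per second, which rounds to the 2.1 GB/sec figure in the feature list.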

The CPU Bus does not provide:

• Cache coherency support

• Split transactions


8.1.1 Terminology

Address Phase is the cycles during which an address is driven on the CPU Bus through the cycle the address is acknowledged.

Agent refers to different devices on the CPU Bus.

Assert means taking a signal to its active level. An active high signal is “1” when asserted, and an active low signal is “0” when asserted.

CPU means the C790 CPU. The terms CPU and C790 are used interchangeably in this chapter.

Data Phase is the cycles during which data are driven on the bus through the cycle they are acknowledged.

DMAC is the DMA Controller in the system.

Master means the current bus master on the CPU Bus.

MEM refers to the system memory controller.

Negate/Deassert means taking a signal to its inactive state. An active high signal is “0” when deasserted. An active low signal is “1” when negated.

* (after signal name) means active low signal.

8.1.2 Signal Naming Convention

Table 8-1 shows the prefixes used for naming signals in a system incorporating the C790

CPU Bus.

Table 8-1. System Signal Naming Convention

Signal Prefix   Signal Type

CPU             Signals from the CPU multiplexed or logically combined with the DMAC signals
                to form the system signals. These signals include: CPUADDR, CPUBE*,
                CPURD*, CPUWR*, CPUTSIZE, CPUASTART*, CPUDSTART*, CPUDATA.

SYS             The combined or multiplexed signals from any agents on the CPU Bus. These
                signals include: SYSADDR, SYSBE*, SYSRD*, SYSWR*, SYSTSIZE,
                SYSASTART*, SYSDSTART*, SYSAACK*, SYSDACK*, SYSDATA.


8.2 CPU Bus Architecture

The CPU Bus design is a synchronous pipelined bus with separate data (128-bit) and

address buses running at half the clock frequency of the CPU. The CPU is connected to

the rest of the system and external devices through this bus. Figure 8-1 illustrates the

architecture of the bus and identifies different agents that can be on the bus.

[Block diagram. Blocks shown: CPU (with I$, D$, WBB, and the CPU Bus Interface), CPU Bus, Memory Controller, DMAC, I/O Devices.]

Figure 8-1. CPU Bus Architecture


8.2.1 CPU Bus Connectivity for Address and Control Paths

Figure 8-2 illustrates the system-level interconnections for address paths of the CPU Bus.

Support logic is needed to handle the fact that the system contains multiple masters.

AGNT* is used to control the multiplexer in the support logic that selects a master to be

connected to the CPU Bus.

[Block diagram. Blocks shown: C790 CPU, DMAC, Mux, Memory Controller, I/O Devices, and a D-type flip-flop (D, Q) with AGNT* and BUSCLK.
CPU signals: CPUADDR, CPUBE*, CPUTSIZE, CPURD*, CPUWR*, CPUASTART*.
DMAC signals: DMAADDR, DMATSIZE, DMARD*, DMAWR*, DMAASTART*.
System signals: SYSADDR, SYSBE*, SYSTSIZE, SYSRD*, SYSWR*, SYSASTART*.
Address acknowledge signals: SYSAACK*, DMAAACK*, MEMAACK*, IOAACK*.]

Figure 8-2. CPU Bus Address and Control Path Connections in System


8.2.2 CPU Bus Connectivity for Data Paths

Figure 8-3 illustrates the system-level interconnections for data paths of the CPU Bus.

For read cycles, the support logic must control the multiplexer so that the correct source of

data is put on SYSDATA.

For write cycles, the support logic must detect whether the cycle is a CPU cycle or a DMA

cycle, and use this to control the multiplexer.

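[Block diagram. Blocks shown: C790 CPU, DMAC, Memory Controller, I/O Devices, Mux.
Data signals: CPUDATA, DMADATA, MEMDATA, IODATA, SYSDATA.
Data start signals: CPUDSTART*, DMADSTART*, SYSDSTART*.
Data acknowledge signals: SYSDACK*, DMADACK*, MEMDACK*, IODACK*.]

Figure 8-3. CPU Bus Data Path Connections in System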


8.3 CPU Bus Signal Descriptions

This section describes the CPU Bus signals and their usage in different bus operations.

8.3.1 Address Bus Signals

CPUADDR[31:4] CPU address bus

CPUADDR[31:4] bits are valid during the address phase and can be sampled by the slave

when CPUASTART* is sampled low.

SYSADDR[31:4] System address bus

SYSADDR[31:4] are multiplexed outputs selecting between CPUADDR[31:4] and DMA

address. They are valid during the address phase and can be sampled by the slave when

SYSASTART* is sampled low.

CPUBE[15:0]* CPU byte enables

CPUBE[i]*, driven during the address phase, indicates valid data on byte i of CPUDATA[127:0] during the data phase. CPU byte enables can be sampled by the slave when CPUASTART* is sampled low. CPU byte enables are used only in CPU single cycles.

SYSBE[15:0]* System byte enables

SYSBE[i]*, driven during the address phase, indicates valid data on byte i of SYSDATA[127:0] during the data phase. System byte enables can be sampled by the slave when SYSASTART* is sampled low. System byte enables are used only in CPU single cycles.
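The byte enables are active low, so a lane that carries valid data reads as 0. The following is a minimal sketch in C of how a 16-bit enable pattern could be derived for a partial-quadword access. It rests on an assumption that is not stated in this section: byte lane i carries the byte at address offset i within the quadword, which may not hold in big endian mode. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Compute an active-low 16-bit byte-enable pattern for an access of `size`
 * bytes at physical address `pa`, confined to one quadword.
 * Assumption (not from the manual): lane i holds the byte at offset i. */
uint16_t byte_enables_n(uint32_t pa, unsigned size)
{
    unsigned first = pa & 0xFu;                  /* address bits 3:0 pick the first byte */
    assert(size >= 1u && first + size <= 16u);   /* must not cross the quadword */
    uint32_t active = ((1u << size) - 1u) << first;   /* 1 = lane carries valid data */
    return (uint16_t)(~active & 0xFFFFu);        /* asserted lanes are driven low */
}
```

Under that assumption a 4-byte access at offset 4 yields 0xFF0F, and a full quadword access yields 0x0000 (all sixteen enables asserted).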


CPUTRANSTYPE[4:0] CPU transaction type

CPUTRANSTYPE[4:0], driven during the address phase, indicates the type of operation. CPU transaction type can be sampled by the slave when CPUASTART* is sampled low.

Table 8-2. Bus Transaction Types

CPUTRANSTYPE     Type of Bus Transaction
00000            Not defined or miscellaneous
00001 - 00111    Reserved
01000            Data Cache Refill due to Load Miss
01001            Data Cache Refill due to Prefetch Instruction
01010            Data Cache Refill due to Store Miss
01011            Uncached Load
01100            Uncached Accelerated Load
01101 - 01111    Reserved
10000            Instruction Cache Miss Refill
10001            Cache Instruction - Fill Suboperation
10010            Uncached Execution
10011 - 10111    Reserved
11000            Data Cache Write-back due to Load/Store Miss
11001            Data Cache Write-back due to Cache Instruction
11010            Uncached Store
11011            Uncached Accelerated Store
11100            Non-allocated Store
11101 - 11111    Reserved
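A bus monitor or simulation model might decode CPUTRANSTYPE with a simple lookup. The following is a minimal sketch in C; the function names are hypothetical, and the encodings and descriptions are copied from Table 8-2.

```c
#include <assert.h>

/* Shared string for every reserved encoding, so callers can compare pointers. */
static const char TT_RESERVED[] = "Reserved";

/* Return the Table 8-2 description for a 5-bit CPUTRANSTYPE value. */
const char *cpu_transtype_name(unsigned type)
{
    assert(type < 32u);   /* CPUTRANSTYPE is a 5-bit field */
    switch (type) {
    case 0x00: return "Not defined or miscellaneous";
    case 0x08: return "Data Cache Refill due to Load Miss";
    case 0x09: return "Data Cache Refill due to Prefetch Instruction";
    case 0x0A: return "Data Cache Refill due to Store Miss";
    case 0x0B: return "Uncached Load";
    case 0x0C: return "Uncached Accelerated Load";
    case 0x10: return "Instruction Cache Miss Refill";
    case 0x11: return "Cache Instruction - Fill Suboperation";
    case 0x12: return "Uncached Execution";
    case 0x18: return "Data Cache Write-back due to Load/Store Miss";
    case 0x19: return "Data Cache Write-back due to Cache Instruction";
    case 0x1A: return "Uncached Store";
    case 0x1B: return "Uncached Accelerated Store";
    case 0x1C: return "Non-allocated Store";
    default:   return TT_RESERVED;   /* all remaining encodings */
    }
}

/* Nonzero when the encoding falls in one of the reserved ranges. */
int cpu_transtype_is_reserved(unsigned type)
{
    return cpu_transtype_name(type) == TT_RESERVED;
}
```

Using a default branch for the reserved ranges keeps the table compact; a stricter model could instead treat a reserved encoding as an error.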

CPURD* CPU read

The CPU asserts this signal to indicate a read operation. This signal can be sampled when CPUASTART* is sampled low. This signal is active during the address phase. CPURD* is used in transfers initiated by the CPU.

CPUWR* CPU write

The CPU asserts this signal to indicate a write operation. This signal can be sampled when CPUASTART* is sampled low. This signal is active during the address phase. CPUWR* is used in transfers initiated by the CPU.


CPUTSIZE[1:0] CPU transfer size

While driven by the CPU, these signals indicate the size of the transfer in the current CPU initiated bus cycle. They are driven during the address phase and can be sampled starting at the edge where CPUASTART* is sampled low.

Table 8-3. CPU Transfer Size

CPUTSIZE[1:0]   Transfer Size
00              1 Quadword (Single Cycle)
11              4 Quadwords

SYSTSIZE[2:0] System transfer size

While driven by the system, these signals indicate the size of the transfer in the current system bus cycle. They are driven during the address phase and can be sampled starting at the edge where SYSASTART* is sampled low.

CPUASTART* CPU address start

Driven by the CPU, it indicates the start of the address phase. Address, byte enable, and control signals (CPUADDR[31:4], CPUBE[15:0]*, CPURD*, CPUWR*, and CPUTSIZE) can be sampled to determine the type of cycle requested starting where CPUASTART* is sampled low. CPUASTART* is driven active for only one cycle.

SYSASTART* System address start

SYSASTART* is driven by the system; it indicates the start of the address phase. Address, byte enable, and control signals can be sampled to determine the type of cycle requested starting where SYSASTART* is sampled low. SYSASTART* is driven active for only one cycle.

SYSAACK* System address acknowledge

This signal is an input to all the agents on the CPU Bus indicating that address and control signals have been sampled by the slave. The master terminates the address phase one cycle after sampling SYSAACK* low.

CPUDATA[127:0] CPU data bus

This is a 128-bit data bus output from the CPU.

SYSDATA[127:0] System data bus

This is the 128-bit data bus input to all devices on the CPU Bus.


CPUDSTART* CPU data start

During read/write operations, this output from the CPU indicates the start of data phase. For CPU write operations, the slave can sample data from the bus one cycle after CPUDSTART* has been asserted. For CPU read operations, the slave can output data on the bus any cycle after the cycle CPUDSTART* has been asserted.

SYSDSTART* System data start

During read/write operations, this output from the system indicates the start of data phase. Data transfer can begin one cycle after SYSDSTART* has been asserted. For DMA cycles, if the slave providing the data cannot supply data in the next cycle after the assertion of SYSDSTART*, it is the responsibility of the designer to come up with a new DMA protocol.

SYSDACK* System data acknowledge

This signal is an input to all the agents on the bus indicating the valid status of data on the bus. During read cycles, it indicates read data are available on the bus to be sampled by the master. During write cycles, it indicates the slave has sampled the data. This signal should be asserted for each data transfer during burst operations. During read transactions, data are sampled one cycle after SYSDACK* has been asserted. During write transactions, the master drives new data on the bus one cycle after detecting SYSDACK* low.

BUSERR* Bus error

This signal is an input to the CPU and the DMAC which indicates that a bus error has occurred during the transaction. BUSERR* serves to terminate the bus protocol and return bus ownership to the CPU.

INT[1:0]* Interrupt request lines

These signals are interrupt inputs to the CPU.

SIOINT* Serial I/O interrupt request

This line provides the serial I/O interrupt from the I/O controller.

NMI* Non-maskable interrupt

Non-maskable interrupt input to the CPU.

SYSBIGENDIAN Big Endian enable

This input signal is sampled during cold reset and makes the CPU operate as big endian when it is asserted. The input level of this signal must not be changed during operation.


CPCOND0 Coprocessor conditions

These lines are an input to the CPU as test conditions for some of the branch instructions.

RESET* Reset

Input to the CPU. When this line is asserted, the CPU, DMAC and slave devices execute a reset.

CPUCLK CPU clock

CPU clock

BUSCLK Bus clock

Bus clock: 1/2, 1/3 or 1/4 frequency of the CPUCLK.

AREQ* Address bus request

This signal is an output from the DMAC to the CPU. When it is asserted, the DMAC requests the address bus mastership.

AGNT* Address bus grant

This signal is an output from the CPU to grant the bus mastership to the DMAC. This signal is asserted in response to assertion of the AREQ* signal.

REL* Bus release request

This signal is asserted by the CPU to request that the current bus owner release the CPU Bus.


8.4 Overview of CPU Bus Operations

This section discusses CPU Bus operations; it covers processor requests, DMA operations,

and bus error operation.

In this section descriptions show CPU signals followed by the system lines, in parentheses,

onto which they are asserted. For example: CPUASTART* (SYSASTART*) means

CPUASTART* is asserted on the SYSASTART* line. Where a value is given, the bits

output by the CPU are shown, followed by the bits, in parentheses, on the system lines.

For example, if we have 11 on CPUTSIZE[1:0] during a CPU bus cycle, then we will get 011 on SYSTSIZE[2:0]. This will be shown as 11 (011).

8.4.1 CPU Bus Operations

The CPU Bus is different from conventional buses in that it allows pipeline operations. In this case, pipeline implies up to two outstanding requests before any data transaction has taken place. For instance, the CPU may issue two back-to-back read requests to main memory before any data have been returned. Note that at any time, there can only be two outstanding requests on the bus. The master requiring more than two operations has to wait until the first request has been serviced completely prior to issuing the third one.
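The two-request limit can be pictured as a small counter that a master checks before starting a new address phase. The following is a minimal sketch in C of that bookkeeping only; the structure and function names are hypothetical, and the model ignores signal-level timing.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_OUTSTANDING 2u   /* limit stated in Section 8.4.1 */

/* Hypothetical bookkeeping for requests whose data phase is not yet complete. */
typedef struct { unsigned outstanding; } bus_pipe_t;

/* Try to start a new address phase; fails when two requests are pending. */
bool bus_issue(bus_pipe_t *b)
{
    if (b->outstanding >= MAX_OUTSTANDING)
        return false;        /* the master must wait */
    b->outstanding++;
    return true;
}

/* Call when the oldest request has been serviced completely. */
void bus_complete(bus_pipe_t *b)
{
    assert(b->outstanding > 0u);
    b->outstanding--;
}
```

In this model a third bus_issue call returns false until bus_complete has been called once, which corresponds to waiting for the first request to be serviced.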

8.4.2 Processor Requests

The CPU issues single requests, burst requests or a series of requests to other agents on the bus. These requests are referred to as processor requests initiated through the CPU Bus interface.

The processor requests are in response to the following system events:

• Load miss

• Store miss

• Write-back buffer writes (dirty data cache lines, uncached writes, etc.)

• Uncached loads and uncached accelerated loads

• Instruction miss and uncached instruction fetch

Processor read/write requests can be a burst, quadword, or partial quadword of data to and from the main memory or any other system resources. A processor-initiated burst is always 4 quadwords.

8.4.2.1 Read Requests

The CPU initiates read requests by driving address and control on the bus and asserting

CPUASTART* (SYSASTART*) to indicate valid address and control. The CPU will keep

driving address and control until the slave device has acknowledged the address phase by

asserting address acknowledge, SYSAACK*. For burst reads, the CPU drives CPUTSIZE

(SYSTSIZE) to 11 (011) to indicate burst reads. The CPU also indicates that it is ready to

accept read data by asserting CPUDSTART* (SYSDSTART*). The slave device returns the

requested data on the data bus by asserting SYSDACK*, data acknowledge.


8.4.2.2 Write Requests

The CPU initiates write requests by driving address and control on the bus and asserting CPUASTART* (SYSASTART*). The CPU also drives data on the bus and indicates that by asserting CPUDSTART* (SYSDSTART*). The slave device accepts the address and data by asserting SYSAACK* and SYSDACK*, respectively. Burst writes are indicated by driving CPUTSIZE (SYSTSIZE) to 11 (011) during the address phase.

8.4.3 Bus Error Operations

Bus error occurs when the CPU or DMA initiates cycles but there are no devices on the

CPU Bus responding to the cycles. The absence of response to either the address phase or

the data phase will cause the bus error condition. The bus error is always imprecise.

When bus error occurs, all the agents including the CPU, DMAC, and slave devices on the

CPU Bus will terminate the current bus cycle.

In the case where CPU is the initiator of the cycle, there can be two types of bus error:

• Data load/store bus error

• Instruction fetch bus error

Bus error sets the corresponding exception bit in the CAUSE register. Subsequently, the CPU will jump to the proper error handler for the examination of the exception. However, the bus error exception is imprecise. There is no guarantee that the CPU can recover from this error condition.

In case the DMAC is the initiator of the cycle, the types of bus error depend on the implementation of the DMAC. After a bus error occurs, the DMAC will release the bus mastership back to the CPU and assert interrupt or NMI to the CPU. The interrupt or NMI routine will then handle the bus error condition for the DMAC.


8.5 CPU Bus Transaction Protocols and Timing

This section describes transaction protocols and the timing for the following CPU Bus op-

erations:

• Arbitration

• CPU single operations (one quadword)

• CPU burst operations (four quadwords)

• CPU non-pipelined single operations (one quadword)

• CPU non-pipelined burst operations (four quadwords)

• Bus error operations

8.5.1 Arbitration Operations

An arbiter is required to mediate between devices requesting the CPU Bus. The arbiter is
located in the CPU. The CPU is the default bus master; AREQ* and AGNT* are both
deasserted during RESET.

A master other than the CPU may request the bus by asserting the request signal, AREQ*.
In response to the AREQ* signal, the CPU will issue the grant signal, AGNT*, to grant
the address bus to the requesting master. In the cycle in which the bus master samples
AGNT* active, it starts its address phases; it deasserts AREQ* at the beginning of
the last address phase. When the corresponding data phases commence, the CPU or the
requesting master starts the data transfers depending on the DMA transfer. Data phases
follow the exact order of address phases. The arbitration signals are shown in Figure 8-4.

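[Block diagram: the CPU and a bus master both attach to the CPU Bus and are connected
to each other by the AREQ*, AGNT*, and REL* signals.]

Figure 8-4. Connection of Arbitration Signals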

The arbitration priority in using the CPU Bus is that the DMAC always has higher priority
than the CPU. When both the CPU and the DMAC arbitrate for the CPU Bus, the arbiter
grants the bus mastership to the DMAC. The CPU can assert REL* to the DMAC in an
effort to get the bus ownership back from the DMAC. The CPU will proceed with the
transfer once the DMAC has released the CPU Bus.

The arbitration cycles and protocol are shown in Figure 8-5. In response to the DMAC asserting its
request AREQ*, the arbiter asserts AGNT* in cycle 3, which is the arbitration cycle. The DMAC
samples AGNT* asserted and begins its address phases. When the DMAC begins the last
address phase, it deasserts its request line AREQ* in cycle 4. The arbiter then waits for the
SYSAACK* cycle to deassert AGNT* and return bus mastership to the CPU.


Figure 8-5. Arbitration Protocol

8.5.1.1 Cycle Stealing

Cycle stealing refers to the CPU’s ability to preempt a master in order to perform a bus

operation. This operation could be either due to the write back buffer (WBB) being almost

full (having more than 64 bytes filled up) or the CPU needing to perform an instruction or

data read. These operations are collectively referred to as cycle stealing operations.

Figure 8-6 illustrates the cycle stealing protocol. The arbiter asserts the REL* (Release)
signal in response to the CPU's request cycles. The master deasserts its request after
having finished its operations. When the master has begun the last address phase, it
deasserts the AREQ* signal, indicating to the arbiter that the bus will be relinquished,
as shown in cycle 9. When the address phase ends, the address bus is returned
to the CPU by the deassertion of AGNT* in cycle 12. The arbiter deasserts REL* at the
same time AGNT* is deasserted. The data phases follow the same order as the address
phases.
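The conditions under which the CPU preempts another master can be written as a small predicate. The following is only a sketch of the rule stated above, not the arbiter's actual logic; the structure `steal_inputs` and the function `should_assert_rel` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Threshold from the text: the WBB counts as almost full when more
 * than 64 bytes are filled. */
#define WBB_ALMOST_FULL_BYTES 64u

/* Hypothetical snapshot of the state the arbiter examines. */
struct steal_inputs {
    bool   master_owns_bus;   /* another master currently holds the grant */
    size_t wbb_bytes_filled;  /* bytes waiting in the write back buffer   */
    bool   cpu_read_pending;  /* instruction or data read is waiting      */
};

/* Returns true when the arbiter would ask the master to release the bus. */
bool should_assert_rel(const struct steal_inputs *in)
{
    assert(in != NULL);
    if (!in->master_owns_bus)
        return false;  /* the CPU already owns the bus: nothing to steal */
    return in->wbb_bytes_filled > WBB_ALMOST_FULL_BYTES
        || in->cpu_read_pending;
}
```

Note that the comparison is strict: exactly 64 filled bytes does not trigger a release request under this reading of "more than 64 bytes".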

Figure 8-6. Cycle Stealing Protocol

[Timing diagrams. Figure 8-5 plots BUSCLK (cycles 1-9), AREQ*, AGNT*, SYSADDR,
SYSASTART*, and SYSAACK*, with address bus ownership passing from the CPU to the master
and back to the CPU. Figure 8-6 plots the same signals plus REL* over BUSCLK cycles 1-19,
with ownership passing from the CPU to the master, through the master's last address,
and back to the CPU.]

8.5.2 CPU Single Operations

CPU Single operations transfer one quadword.

In single operations, the CPU drives the address, byte enables, and the read/write signals

and indicates their valid status by asserting CPUASTART* (SYSASTART*). The slave

samples valid address and control lines and responds by asserting SYSAACK*. In single

operations, CPUTSIZE (SYSTSIZE) is always 00 (000).

When the CPU detects SYSAACK* active and is ready to put another address on the bus,
it will start another address phase. The bus only supports two levels of address pipelining.
That means only two address phases can be outstanding before any data phase begins.

The CPU indicates that it is ready to accept/supply data by asserting CPUDSTART*
(SYSDSTART*) one cycle prior to actually accepting/supplying it. For read cycles, the
slave supplies the data and indicates that the data is ready by asserting SYSDACK*. For
write cycles, the CPU supplies data one cycle after CPUDSTART* (SYSDSTART*) is
asserted, and the slave accepts the data by asserting SYSDACK*.
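The two-level address pipelining limit can be modeled with a simple counter of outstanding address phases. This is an illustrative sketch only: the type `addr_pipeline` and its functions are hypothetical, and it assumes that an address phase stops being outstanding when its data phase completes.

```c
#include <assert.h>
#include <stdbool.h>

/* The bus supports two levels of address pipelining. */
#define MAX_OUTSTANDING_ADDR 2

/* Hypothetical tracker for address phases that still await their data. */
struct addr_pipeline {
    int outstanding;
};

/* A new address phase may start only while fewer than two are outstanding. */
bool can_start_address_phase(const struct addr_pipeline *p)
{
    return p->outstanding < MAX_OUTSTANDING_ADDR;
}

/* Record that an address phase has been acknowledged by the slave. */
void address_phase_started(struct addr_pipeline *p)
{
    assert(can_start_address_phase(p));
    p->outstanding++;
}

/* Record that the oldest outstanding operation finished its data phase. */
void data_phase_completed(struct addr_pipeline *p)
{
    assert(p->outstanding > 0);
    p->outstanding--;
}
```

Because data phases follow the order of the address phases, a plain counter is enough here; no per-operation identifiers are needed.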

8.5.2.1 CPU Single Reads

The fastest CPU single read is 2 cycles. Address and data phases for AddrA illustrate the

fastest CPU single read cycle. The CPU asserts CPUASTART* (SYSASTART*) to begin

the address phase in cycle 1. The slave device asserts SYSAACK* in cycle 1 to indicate

that it has sampled the address. The CPU then begins another address phase in cycle 3.

The assertion of SYSDACK* by the slave device in cycle 1 triggers the CPU to sample

SYSDATA at the end of cycle 2.

Figure 8-7. CPU Single Reads

[Timing diagram over BUSCLK cycles 1-10: SYSADDR (AddrA, AddrB, AddrC, AddrD),
SYSTSIZE (0 for each address), SYSRD*, SYSWR*, SYSASTART*, SYSAACK*, SYSDSTART*,
SYSDACK*, and SYSDATA (A, B, C, D).]

8.5.2.2 CPU Single Writes

The fastest CPU single write is 2 cycles. Address and data phases for AddrA illustrate the

fastest CPU single write cycle. The CPU always drives data onto CPUDATA one cycle

after the assertion of CPUDSTART* (SYSDSTART*). For example, in Figure 8-8, the CPU drives

CPUDATA in cycle 2 which is one cycle after the assertion of CPUDSTART*

(SYSDSTART*) in cycle 1. The slave device samples SYSDATA one cycle after the

assertion of SYSDACK*.

Figure 8-8. CPU Single Writes

[Timing diagram over BUSCLK cycles 1-10: SYSADDR (AddrA, AddrB, AddrC, AddrD),
SYSTSIZE (0 for each address), SYSRD*, SYSWR*, SYSASTART*, SYSAACK*, SYSDSTART*,
SYSDACK*, CPUDATA (A, B, C, D), and SYSDATA (A, B, C, D).]

8.5.2.3 CPU Single Read-Write-Read-Write Cycles

All adjacent address phases are read-write or write-read cycles. AddrA is a read address

and AddrB is a write address, and so on.

Figure 8-9. CPU Single Read-Write-Read-Write Cycles

[Timing diagram over BUSCLK cycles 1-10: SYSADDR (AddrA through AddrE), SYSTSIZE
(0 for each address), SYSRD*, SYSWR*, SYSASTART*, SYSAACK*, SYSDSTART*, SYSDACK*,
SYSDATA (A, B, C, D), and CPUDATA (B and D, the write data).]

8.5.3 CPU Burst Operations

CPU Burst operations transfer four quadwords. In burst operations, the CPU drives the
address and control signals and indicates their validity by asserting CPUASTART*
(SYSASTART*). The slave samples valid address and control lines and asserts SYSAACK*
to acknowledge the address phase. The address phase spans the cycles from the assertion of
CPUASTART* (SYSASTART*) to one cycle after SYSAACK* is asserted.

When the CPU detects SYSAACK* active and has another address ready, it will start
another address phase.

The CPU indicates that it is ready to accept/supply data by asserting CPUDSTART*
(SYSDSTART*) one cycle prior to actually accepting/supplying it. For read cycles, the
slave supplies the data and indicates that the data is valid by asserting SYSDACK* one
cycle prior to the data being available. For write cycles, the CPU supplies data one cycle
after CPUDSTART* (SYSDSTART*) is asserted, and the slave accepts the data by asserting
SYSDACK*. For burst cycles, SYSDACK* is asserted once for each quadword transferred.

CPUTSIZE (SYSTSIZE) indicates the number of quadwords in the transfer. CPU-initiated
cycles use only the values 00 (for CPU Single operations) and 11 (for CPU
Burst operations), which are a single quadword and a burst of 4 quadwords, respectively.
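As a quick illustration of the CPUTSIZE encoding used by CPU-initiated cycles, the sketch below maps the two values named above to transfer lengths. The function names are hypothetical, and the 16-byte quadword size is an assumption (a 128-bit quadword) that this section does not state explicitly.

```c
#include <assert.h>

#define TSIZE_SINGLE   0x0u  /* binary 00: one quadword            */
#define TSIZE_BURST4   0x3u  /* binary 11: burst of four quadwords */
#define QUADWORD_BYTES 16u   /* assumption: a quadword is 128 bits */

/* Number of quadwords moved by a CPU-initiated cycle, or 0 for an
 * encoding that CPU-initiated cycles do not use. */
unsigned tsize_quadwords(unsigned tsize)
{
    assert(tsize <= 0x3u);
    switch (tsize) {
    case TSIZE_SINGLE: return 1u;
    case TSIZE_BURST4: return 4u;
    default:           return 0u;
    }
}

/* Transfer length in bytes under the quadword-size assumption above. */
unsigned tsize_bytes(unsigned tsize)
{
    return tsize_quadwords(tsize) * QUADWORD_BYTES;
}
```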

8.5.3.1 CPU Burst Reads

The fastest CPU burst read is 5 cycles. Address and data phases for AddrA illustrate the
fastest CPU burst read cycle. There are four SYSDACK* sent by the slave device for every
CPU burst read cycle. The slave device asserts SYSDACK* in cycles 1, 2, 3, and 4 to
indicate that data can be sampled at the end of cycles 2, 3, 4, and 5 by the CPU.

Figure 8-10. CPU Burst Reads

[Timing diagram over BUSCLK cycles 1-10: SYSADDR (AddrA, AddrB, AddrC, AddrD),
SYSTSIZE (3 for each address), SYSRD*, SYSWR*, SYSASTART*, SYSAACK*, SYSDSTART*,
SYSDACK*, and SYSDATA (A1, A2, A3, A4, B1, B2, B3, B4).]

8.5.3.2 CPU Burst Writes

The fastest CPU burst write is 5 cycles. Address and data phases for AddrA illustrate the
fastest CPU burst write cycle. After assertion of CPUDSTART* (SYSDSTART*) in cycle 1,
the CPU drives the first data on CPUDATA in cycle 2. As SYSDACK* is sampled asserted
in cycles 1, 2, 3, and 4, the CPU drives new data on CPUDATA at the end of cycles 2, 3,
4, and 5.

Figure 8-11. CPU Burst Writes

[Timing diagram over BUSCLK cycles 1-10: SYSADDR (AddrA, AddrB, AddrC, AddrD),
SYSTSIZE (3 for each address), SYSRD*, SYSWR*, SYSASTART*, SYSAACK*, SYSDSTART*,
SYSDACK*, CPUDATA, and SYSDATA, each data bus carrying A1-A4, B1-B4, then C1.]

8.5.3.3 CPU Burst Read-Write Cycles

All adjacent address phases are read-write or write-read cycles. AddrA is a read address

and AddrB is a write address, and so on.

Figure 8-12. CPU Burst Read-Write Cycles

8.5.3.4 CPU Burst Write-Read Cycles

All adjacent address phases are read-write or write-read cycles. AddrA is a write address

and AddrB is a read address, and so on.

Figure 8-13. CPU Burst Write-Read Cycles

[Timing diagrams for Figure 8-12 and Figure 8-13: each plots BUSCLK, SYSADDR (AddrA,
AddrB, AddrC), SYSTSIZE (3 for each address), SYSRD*, SYSWR*, SYSASTART*, SYSAACK*,
SYSDSTART*, SYSDACK*, SYSDATA, and CPUDATA, with four data beats per burst
(A1-A4, B1-B4) followed by C1.]

8.5.4 CPU Non-Pipeline Single Operations

The CPU Bus can support non-pipeline operations as well as pipeline operations. The

non-pipeline operations are done simply by delaying the assertion of SYSAACK* until the

last SYSDACK* of the bus transaction. The advantage of this is that the peripheral does

not need to save the current address; it just decodes the address on the address bus for the

current operation. Using this mode of operation simplifies the peripheral interfaces to the

CPU Bus but it degrades the system performance.

8.5.4.1 CPU Non-Pipeline Single Reads

All adjacent address phases are read cycles.

Figure 8-14. CPU Non-Pipeline Single Reads

[Timing diagram over BUSCLK cycles 1-10: SYSADDR (AddrA, AddrB, AddrC), SYSTSIZE
(0 for each address), SYSRD*, SYSWR*, SYSASTART*, SYSAACK*, SYSDSTART*, SYSDACK*,
and SYSDATA (A, B, C).]

8.5.4.2 CPU Non-Pipeline Single Writes

All adjacent address phases are write cycles.

Figure 8-15. CPU Non-Pipeline Single Writes

8.5.5 CPU Non-Pipeline Burst Operations

8.5.5.1 CPU Non-Pipeline Burst Reads

All adjacent address phases are read cycles.

Figure 8-16. CPU Non-Pipeline Burst Reads

[Timing diagrams for Figure 8-15 and Figure 8-16, each over BUSCLK cycles 1-10 with
SYSADDR, SYSTSIZE, SYSRD*, SYSWR*, SYSASTART*, SYSAACK*, SYSDSTART*, and
SYSDACK*. Figure 8-15 shows AddrA, AddrB, and AddrC with SYSTSIZE 0 and data A, B, C
on both CPUDATA and SYSDATA. Figure 8-16 shows AddrA and AddrB with SYSTSIZE 3 and
SYSDATA carrying A1-A4 and B1-B4.]

8.5.5.2 CPU Non-Pipeline Burst Writes

All adjacent address phases are write cycles.

Figure 8-17. CPU Non-Pipeline Burst Writes

[Timing diagram over BUSCLK cycles 1-10: SYSADDR (AddrA, AddrB), SYSTSIZE (3 for each
address), SYSRD*, SYSWR*, SYSASTART*, SYSAACK*, SYSDSTART*, SYSDACK*, CPUDATA,
and SYSDATA, each data bus carrying A1-A4 and B1-B4.]

8.5.6 Bus Error Operations

A bus error occurs when no slave responds to the address or data phase of a
bus cycle. When a bus error occurs, the current bus operation is terminated, and the system
proceeds with the next bus operation. Without bus error detection, the CPU Bus would
remain waiting indefinitely for the SYSAACK* or SYSDACK* signals.

Bus error is generated by the CPU Bus monitor logic. The monitor logic checks
that both the address and data phases of the current CPU Bus cycle receive
SYSAACK* and SYSDACK*, respectively. When there is no SYSAACK* or
SYSDACK* response to the address or data phase of the current CPU Bus cycle for a
pre-defined period of time, a bus error is generated by asserting BUSERR* for one CPU
Bus clock. Bus error has higher priority than SYSAACK* or SYSDACK* if they are
detected in the same cycle.

Bus error is always asserted in reference to the data phase of the cycle. The exact timing
window runs from the cycle in which SYSDSTART* is asserted to the cycle before the
assertion of the next SYSDSTART*. The bus error signal is sampled when the system is
waiting for the assertion of SYSDACK* and/or SYSAACK* of the operation corresponding
to the current data phase. For example, if the address phase of a certain cycle has no
response from the slave devices, the bus monitor logic will wait until the SYSDSTART* of
the corresponding data phase before generating the bus error. The bus monitor logic can
generate the bus error any time before the next data phase begins.
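The behavior of the bus monitor logic can be approximated by a watchdog counter that runs while a data phase is waiting for its acknowledge. The sketch below is a simplified software model, not the actual monitor implementation: the type `bus_monitor`, its functions, and the `timeout_cycles` parameter (standing in for the "pre-defined period of time") are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical watchdog model of the CPU Bus monitor logic. */
struct bus_monitor {
    unsigned timeout_cycles;  /* assumed value of the pre-defined period */
    unsigned waited;          /* bus clocks spent waiting so far         */
    bool     waiting;         /* true while an acknowledge is awaited    */
};

void monitor_init(struct bus_monitor *m, unsigned timeout_cycles)
{
    assert(timeout_cycles > 0);
    m->timeout_cycles = timeout_cycles;
    m->waited = 0;
    m->waiting = false;
}

/* Call when the data-start signal of an operation is asserted. */
void monitor_data_phase_start(struct bus_monitor *m)
{
    m->waiting = true;
    m->waited = 0;
}

/* Advance one bus clock. 'ack' is true if the awaited acknowledge was
 * seen this cycle. Returns true if the bus error signal is asserted
 * during this cycle (it stays asserted for this one clock only). */
bool monitor_clock(struct bus_monitor *m, bool ack)
{
    if (!m->waiting)
        return false;          /* bus error is ignored outside the window */
    m->waited++;
    if (m->waited >= m->timeout_cycles) {
        m->waiting = false;    /* bus error outranks a same-cycle ack */
        return true;
    }
    if (ack)
        m->waiting = false;    /* operation completed normally */
    return false;
}
```

The order of the two checks in `monitor_clock` reflects the stated priority: when the timeout and an acknowledge coincide, the bus error wins.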

8.5.6.1 Bus Error Exceptions

As mentioned before, two operations can be pipelined on the CPU bus, and these two
operations can be initiated from either the CPU as master or the DMAC as master.

If the bus error occurs in the CPU initiated operation, the following occurs:

• a bus error exception due to instruction fetch or data access is generated

• the bus error instruction or data address is recorded in the BadPAddr Register
of COP0

• the Status.BEM bit is set (This bit is the bus error mask (BEM) in the COP0
Status Register).

Once a bus error occurs, any further bus errors are ignored until Status.BEM is cleared by
the bus error exception handler.

If the bus error occurs in the DMA initiated operation (DMA cycle), the DMAC will finish
the pending pipeline operations, disable itself, release the CPU Bus, and cause an
interrupt. The interrupt routine will then service and re-enable the DMAC accordingly. Table
8-4 summarizes the exception generation:

Table 8-4. Bus Error Exceptions

Operation with the Bus Error        Exception Generated
CPU Initiated Instruction Fetch     Bus Error Exception - Instruction Fetch
CPU Initiated Data Access           Bus Error Exception - Data Access
DMA Cycle                           Interrupt Exception

8.5.6.2 CPU Bus Cycle Termination

Two pipeline operations can be in progress at any time, but if a bus error occurs, only the
operation with the bus error is terminated. That is, the occurrence of a bus error with one
master does not affect the program execution of another master. For example, if a bus error
occurs when the first and second operations are initiated from the DMAC and CPU,
respectively, the CPU Bus will terminate the DMA operation and continue with the CPU
operation. Table 8-5 summarizes the CPU Bus cycle sequence for all types of CPU Bus cycle
termination.

Table 8-5. Operation Termination Sequence

First Operation    Second
with Bus Error     Operation       CPU Bus Cycle Sequence

CPU Cycle #1       CPU Cycle #2    1. CPU Cycle #1 is terminated.
                                   2. Bus Error Exception occurs.
                                   3. CPU Cycle #2 continues on.

CPU Cycle #1       DMA Cycle #2    1. CPU Cycle #1 is terminated.
                                   2. Bus Error Exception occurs.
                                   3. DMA Cycle #2 continues on.

DMA Cycle #1       CPU Cycle #2    1. DMA Cycle #1 is terminated.
                                   2. CPU Cycle #2 continues on.
                                   3. DMAC releases CPU Bus, disables itself (disables
                                      further requests until the interrupt routine
                                      re-enables the DMAC), and generates an interrupt.
                                   4. CPU cycles continue on.

DMA Cycle #1       DMA Cycle #2    1. DMA Cycle #1 is terminated.
                                   2. DMA Cycle #2 continues on.
                                   3. DMAC releases CPU Bus, disables itself (disables
                                      further requests until the interrupt routine
                                      re-enables the DMAC), and generates an interrupt.
                                   4. CPU cycles continue on.

8.5.6.3 Bus Error Timing with No Pending Operation

If there are no pending operations on the bus, BUSERR* is ignored at all times.

8.5.6.4 Bus Error Timing with One Pending Operation

If there is one pending operation on the bus, BUSERR* is sampled while waiting for the
assertion of SYSAACK* or SYSDACK*. If BUSERR* is asserted, the bus cycle will
continue as if the SYSAACK* and/or the last SYSDACK* has been asserted. Figure 8-18,
Figure 8-19, and Figure 8-20 illustrate the bus error associated with one pending
operation. In these figures, BUSERR* is ignored before CPUDSTART* and after BUSERR* is
asserted, because the bus is not waiting for the assertion of SYSAACK* or SYSDACK*.

Figure 8-18. One Operation with BUSERR* as the Last SYSDACK*

Figure 8-19. One Operation with BUSERR* as SYSAACK*

[Timing diagrams for Figure 8-18 and Figure 8-19: each plots BUSCLK, CPUASTART*,
CPUADDR (Addr), CPUWR*, CPUTSIZE (3), SYSAACK*, CPUDSTART*, CPUDATA (D0 onward),
SYSDACK*, and BUSERR*, with the BUSERR* trace divided into Ignored, Bus Error
Detection, and Ignored intervals.]

Figure 8-20. One Operation with BUSERR* as SYSAACK* and the Last SYSDACK*

8.5.6.5 Bus Error Timing with Two Pending Operations

If there are two pending operations on the bus, BUSERR* is sampled while waiting for the

assertion of SYSDACK*. If BUSERR* is asserted, the bus cycle will continue as if the last

SYSDACK* has been asserted. The bus cycle will then proceed with the data phase of the

next operation. The bus error that occurred is for the first pending operation.

Figure 8-21 illustrates the bus error associated with two pending operations. In this figure,
BUSERR* is ignored after BUSERR* is asserted because the bus is no longer waiting for the
assertion of SYSDACK* corresponding to operation AddrA with the bus error, and
detection of bus error for operation AddrB does not start until the assertion of CPUDSTART*.

[Timing diagram for Figure 8-20: BUSCLK, CPUASTART*, CPUADDR (Addr), CPUWR*,
CPUTSIZE (3), SYSAACK*, CPUDSTART*, CPUDATA (D0, D1, D2), SYSDACK*, and BUSERR*,
with the BUSERR* trace divided into Ignored, Bus Error Detection, and Ignored intervals.]

Figure 8-21. Two Operations with Bus Error as the Last SYSDACK*

[Timing diagram: BUSCLK, CPUASTART*, CPUADDR (AddrA, AddrB), CPUWR*, CPUTSIZE
(3 for each address), SYSAACK*, CPUDSTART*, CPUDATA (A0, A1, A2, then B0), SYSDACK*,
and BUSERR*, with the BUSERR* trace divided into Ignored, Bus Error Detection (for A),
Ignored, and Bus Error Detection for B intervals.]

9. Performance Counter

The performance counter provides the means for gathering statistical information about

the internal events of the CPU and the pipeline during program execution. The statistics

gathered during program execution aid in tuning the performance of hardware and

software systems based on the processor.


9.1 Overview

The performance counter consists of one control register and two counters. The control

register controls the functions of the monitor while the counters count the number of

events specified by the control register.

9.2 Performance Counters and Performance Control Registers

The Performance Counter Control Register, or PCCR, and Performance Counter Registers
PCR0 and PCR1 are mapped into COP0 Register 25. Both the control register and the
counters are read/write registers accessible by the MTPC, MTPS, MTC0, MFPC, MFPS and
MFC0 instructions. Each counter is capable of counting one event as specified by the
control register.

The format of the PCCR is shown in Figure 9-1, and the format of PCR0 and PCR1 is
shown in Figure 9-2.

Bits     Field     Width
31       CTE       1
30:20    0         11
19:15    EVENT1    5
14       U1        1
13       S1        1
12       K1        1
11       EXL1      1
10       0         1
9:5      EVENT0    5
4        U0        1
3        S0        1
2        K0        1
1        EXL0      1
0        0         1

Figure 9-1. Format of the Performance Counter Control Register PCCR

Bits     Field     Width
31       OVFL      1
30:0     VALUE     31

Figure 9-2. Format of Performance Counter Registers PCR0 and PCR1
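The field positions in Figures 9-1 and 9-2 translate directly into mask and shift constants. The following is a minimal sketch of how software might compose a PCCR value and split a counter value; the macro and function names are hypothetical and are not defined by this document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PCCR field positions, as laid out in Figure 9-1. */
#define PCCR_CTE           (1u << 31)
#define PCCR_EVENT1_SHIFT  15
#define PCCR_U1            (1u << 14)
#define PCCR_S1            (1u << 13)
#define PCCR_K1            (1u << 12)
#define PCCR_EXL1          (1u << 11)
#define PCCR_EVENT0_SHIFT  5
#define PCCR_U0            (1u << 4)
#define PCCR_S0            (1u << 3)
#define PCCR_K0            (1u << 2)
#define PCCR_EXL0          (1u << 1)
#define PCCR_EVENT_MASK    0x1Fu
#define PCCR_MODE_MASK     0x781Eu  /* all U/S/K/EXL bits of both counters */

/* PCR0/PCR1 field positions, as laid out in Figure 9-2. */
#define PCR_OVFL           (1u << 31)
#define PCR_VALUE_MASK     0x7FFFFFFFu

/* Compose a PCCR value from two event numbers, a set of mode bits
 * (any OR of the PCCR_U, S, K, EXL flags), and the global enable. */
uint32_t pccr_pack(unsigned event0, unsigned event1,
                   uint32_t mode_bits, bool enable)
{
    assert(event0 <= PCCR_EVENT_MASK && event1 <= PCCR_EVENT_MASK);
    return (enable ? PCCR_CTE : 0u)
         | ((uint32_t)event1 << PCCR_EVENT1_SHIFT)
         | ((uint32_t)event0 << PCCR_EVENT0_SHIFT)
         | (mode_bits & PCCR_MODE_MASK);
}

unsigned pccr_event0(uint32_t pccr)
{
    return (pccr >> PCCR_EVENT0_SHIFT) & PCCR_EVENT_MASK;
}

unsigned pccr_event1(uint32_t pccr)
{
    return (pccr >> PCCR_EVENT1_SHIFT) & PCCR_EVENT_MASK;
}

/* Split a counter register into its count and its overflow flag. */
uint32_t pcr_value(uint32_t pcr)      { return pcr & PCR_VALUE_MASK; }
bool     pcr_overflowed(uint32_t pcr) { return (pcr & PCR_OVFL) != 0; }
```

For example, `pccr_pack(1, 6, PCCR_U0 | PCCR_K0 | PCCR_U1, true)` selects event 1 for PCR0 in User and Kernel modes and event 6 for PCR1 in User mode, with counting enabled.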

The interpretation of the PCCR register bits is as follows:

Table 9-1. PCCR Register Bits

Field      Function                                                        Initial Value
CTE        If 1, PCR0 and PCR1 counting and exception generation is       0
           enabled.
EVENT0/1   Event counted by PCR0/1; see Table 9-5 for details.            Undefined
U0/1       PCR0/1 counts event EVENT0/1 when in User mode.                Undefined
S0/1       PCR0/1 counts event EVENT0/1 when in Supervisor mode.          Undefined
K0/1       PCR0/1 counts event EVENT0/1 when in non-exception Kernel      Undefined
           mode; i.e. with both STATUS.EXL and STATUS.ERL set to 0.
EXL0/1     PCR0/1 counts event EVENT0/1 when in Level 1 exception         Undefined
           handler.

9.2.1 Accessing Counters and Registers

The counter control register PCCR and the two performance counter registers PCR0 and
PCR1 are accessed by using MTC0* and MFC0* instructions. All three registers are
mapped to COP0 register 25. Table 9-2 illustrates how these registers are written by using
the MTC0 instruction, and Table 9-3 illustrates the encoding of the MFC0 instructions
used to read the registers.

Table 9-4 shows special mnemonics to access the Performance Counters and Registers.

Table 9-2. Writing Performance Counters and Registers using MTC0

OpCode[15:11]   OpCode[1:0]   Operation
11001           00            Move to Counter Control Register
11001           01            Move to Performance Counter Register 0
11001           10            unused
11001           11            Move to Performance Counter Register 1

Table 9-3. Reading Performance Counters and Registers using MFC0

OpCode[15:11]   OpCode[1:0]   Operation
11001           00            Move from Counter Control Register
11001           01            Move from Performance Counter Register 0
11001           10            unused
11001           11            Move from Performance Counter Register 1
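To make the encodings in Tables 9-2 and 9-3 concrete, the sketch below assembles the 32-bit instruction word for a move to or from one of the three registers. Only OpCode[15:11] and OpCode[1:0] come from the tables above; the COP0 major opcode (010000 in bits 31:26), the MF/MT sub-opcodes (00000 and 00100 in bits 25:21), and the general register field in bits 20:16 are taken from the standard MIPS COP0 move encoding and should be treated as assumptions here, as should leaving bits 10:2 at zero. The names `perf_sel` and `encode_perf_move` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* OpCode[1:0] selector values from Tables 9-2 and 9-3. */
enum perf_sel {
    PERF_SEL_PCCR = 0x0,  /* 00: Counter Control Register       */
    PERF_SEL_PCR0 = 0x1,  /* 01: Performance Counter Register 0 */
    PERF_SEL_PCR1 = 0x3   /* 11: Performance Counter Register 1 */
};

/* Build the instruction word for a move between general register 'rt'
 * and the selected performance register. */
uint32_t encode_perf_move(bool to_cop0, unsigned rt, enum perf_sel sel)
{
    assert(rt < 32u);
    uint32_t word = 0x10u << 26;      /* assumed COP0 major opcode      */
    if (to_cop0)
        word |= 0x04u << 21;          /* assumed MT sub-opcode; MF is 0 */
    word |= (uint32_t)rt << 16;       /* general register number        */
    word |= 25u << 11;                /* OpCode[15:11] = 11001          */
    word |= (uint32_t)sel;            /* OpCode[1:0]                    */
    return word;
}
```

A toolchain that already provides the MTPC, MTPS, MFPC, and MFPS mnemonics makes this kind of manual encoding unnecessary; the sketch is only meant to show where the table fields sit in the word.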

Table 9-4. Mnemonics to Access the Performance Counters and Registers

MTPC    Move to Performance Counter
MTPS    Move to Performance Event Specifier
MFPC    Move from Performance Counter
MFPS    Move from Performance Event Specifier

* MTPC, MTPS, MFPC and MFPS are special encodings of MTC0 and MFC0.

9.2.2 State of Performance Counter Control Registers Upon Reset

The CTE bit of the Performance Counter Control Register PCCR is initialized to 0 upon
reset. This prevents event counting and interrupt generation until the control registers
are initialized. It also allows a precise way for counters to be initialized by software; see
section 9.3.2 for more details. Note that the remaining bits of PCCR and both registers
PCR0 and PCR1 must be initialized by software.

9.3 Counter Operation

The performance counters PCR0 and PCR1 increment by 1 whenever their corresponding
count event occurs and the counter is enabled. The count event for PCR0 is specified by
PCCR.EVENT0 and the count event for PCR1 is specified by PCCR.EVENT1. The
encoding of the EVENT field is specified in Table 9-5, and discussed in detail later. A
counter is enabled only when both of the following conditions are satisfied:

1. The global counter enable flag PCCR.CTE is set to 1, and

2. The current privilege mode matches the permitted privilege mode for each
counter. The values in PCCR.U0, PCCR.S0, PCCR.K0, and PCCR.EXL0 specify the
permitted privilege modes for PCR0, and PCCR.U1, PCCR.S1, PCCR.K1, and
PCCR.EXL1 specify the permitted privilege modes for PCR1. For example, if the
current privilege mode is SUPERVISOR, PCR0 will operate only if PCCR.S0 is set
to 1. Note that there is no “ERL0” or “ERL1” flag in PCCR. This is because
counters are unconditionally disabled when in level 2 handlers.
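The two enabling conditions above can be expressed as a single predicate over the PCCR value and the current mode. This is a sketch for illustration only; the `cpu_mode` enumeration and the function `counter_enabled` are hypothetical names, and the bit positions are those of Figure 9-1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of the current processor mode. */
enum cpu_mode {
    MODE_USER,
    MODE_SUPERVISOR,
    MODE_KERNEL,  /* non-exception Kernel mode (EXL = 0, ERL = 0) */
    MODE_EXL,     /* Level 1 exception handler                    */
    MODE_ERL      /* Level 2 handler: counters always disabled    */
};

/* Returns true if counter 0 or counter 1 counts in the given mode. */
bool counter_enabled(uint32_t pccr, int counter, enum cpu_mode mode)
{
    assert(counter == 0 || counter == 1);
    if ((pccr & (1u << 31)) == 0)
        return false;                   /* condition 1: CTE must be set */

    /* The EXL flag sits at bit 1 (counter 0) or bit 11 (counter 1);
     * the K, S, and U flags follow in the next three higher bits. */
    unsigned base = (counter == 0) ? 1u : 11u;
    switch (mode) {
    case MODE_EXL:        return ((pccr >> base) & 1u) != 0;
    case MODE_KERNEL:     return ((pccr >> (base + 1u)) & 1u) != 0;
    case MODE_SUPERVISOR: return ((pccr >> (base + 2u)) & 1u) != 0;
    case MODE_USER:       return ((pccr >> (base + 3u)) & 1u) != 0;
    case MODE_ERL:        return false; /* there is no ERL flag in PCCR */
    }
    return false;
}
```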


9.3.1 Counter Events

A counter increments if it is enabled and its trigger event occurs. The permissible values
for PCCR.EVENT0 and PCCR.EVENT1 are as shown in Table 9-5 below. The events are
described in Section 9.3.1.1, Event Descriptions.

Table 9-5. Counter Events

Event   Counter 0                        Counter 1
0       reserved                         Low-order branch issued
1       Processor cycle                  Processor cycle
2       Single instruction issue         Dual instruction issue
3       Branch issued                    Branch mispredicted
4       BTAC miss                        JTLB miss
5       ITLB miss                        DTLB miss
6       I$ miss                          D$ miss
7       DTLB accessed                    WBB single request unavailable
8       Non-blocking load/store          WBB burst request unavailable
9       WBB single request               WBB burst request almost full
10      WBB burst request                WBB burst request full
11      CPU address bus busy             CPU data bus busy
12      Instruction completed            Instruction completed
13      Non-BDS instruction completed    Non-BDS instruction completed
14      reserved                         COP1 instruction completed
15      Load completed                   Store completed
16      No event                         No event
17-31   reserved                         reserved

9.3.1.1 Event Descriptions

In event descriptions, the word ‘branch’ (for example, ‘branch issued’ or ‘branch
mispredicted’) means any ‘transfer of control’ instruction that is subject to prediction (that is,
all the conditional branch instructions, J, and JAL). The JR, JALR, ERET, SYSCALL,
BREAK, and TRAP instructions are not included.

Branch issued
    This event is triggered whenever a branch is issued to a functional
    pipe. Note that a branch that is issued in a pipelined
    implementation may get canceled if an instruction prior to it
    signals an exception.

Branch mispredicted
    This event is triggered whenever the predicted branch address
    (taken or not-taken) is incorrect. Note that a branch that is issued
    in a pipelined implementation may get canceled if an instruction
    prior to it signals an exception.

BTAC miss
    This event is triggered whenever the instruction address lookup
    into the BTAC fails. Counts low-order (even) branch instructions
    that miss the BTAC. Note that a high-order (odd) branch does not
    refer to the BTAC.

COP1 instruction completed
    This event is triggered when a COP1 instruction completes. The
    event is signaled even if the COP1 instruction completes
    successfully, but appears in the branch delay slot of a branch-
    likely instruction and is therefore nullified.

CPU address bus busy
    Generates a signal once every BUSCLK (not CPU clock) that the
    CPU address bus is unavailable. The CPU address bus is
    considered unavailable whenever it is busy, or when two addresses
    have been issued but the data for the first address has yet to
    return.

Data cache miss
    This event is triggered whenever a data cache miss is detected.
    See Table 9-6 for the D$ miss definition.

Table 9-6. Definition of Data Cache Miss

Access   DCE   Page Attr.               Hit/Miss
Load     0     Uncached, UCA, Cached    Miss
         1     Uncached, UCA            Miss
         1     Cached                   Hit/Miss
Store    0     Uncached, UCA, Cached    Hit
         1     Uncached, UCA            Hit
         1     Cached                   Hit/Miss
Pref     0     Uncached, UCA, Cached    Uncount *
         1     Uncached, UCA            Uncount *
         1     Cached                   Hit/Miss

In this event, a data cache miss is defined as any load/store/pref
instruction which may generate bus read operations to get missed data from
external memory.

* Prefetch to the Uncached or UCA page is considered as nop.

DTLB accessed
    Barring canceled instructions, this event counts the total number
    of executed loads and stores. Thus, ‘data cache miss’ divided by
    ‘DTLB accessed’ provides a good estimate of the D$ miss rate
    (assuming no uncached loads/stores occur). Also, ‘DTLB miss’
    divided by ‘DTLB accessed’ provides the DTLB miss rate. The DTLB is
    accessed even when an unmapped page is accessed, if the minor
    revision number is 0x10 or later.

DTLB miss
    This event is triggered whenever a DTLB miss is detected. The DTLB
    is accessed even when an unmapped page is accessed, if the
    minor revision number is 0x10 or later.

Dual instruction issued
    This event is signaled whenever both functional pipes of the C790
    are issued instructions*. The event counter is incremented by 1.

Instruction cache miss
    This event is triggered whenever an instruction cache miss is
    detected.

Instruction completed
    This event triggers when an instruction completes. Note that some
    instructions (e.g. SYSCALL, TEQ, TEQI, etc.) signal exceptions as
    a normal part of their operation. Such instructions are considered
    complete whether or not the “normal” exception was raised.
    Therefore, an “instruction complete” event is signaled even if a
    TEQ succeeds (i.e. raises a Trap exception). However, if a “true”
    exception occurs (e.g. a counter exception is signaled while the
    TEQ is executing), the instruction is canceled and no “instruction
    complete” signal is generated. Similarly, an instruction in the
    branch delay slot (BDS) of a branch-likely instruction is counted
    as complete even if the BDS instruction is nullified. If the BDS
    instruction is canceled because of a “true” exception, no
    “instruction completed” event is signaled.
    C790 Implementation Note: Up to two instructions can complete
    every cycle in the C790. When two instructions do complete, the
    event counter is incremented by 2.

ITLB miss
    This event is triggered whenever an ITLB miss is detected.

JTLB miss
    This event is triggered whenever a JTLB miss is detected.

Load completed
    This event triggers when a load instruction completes. Note that
    the event is signaled even if the load appears in the branch delay
    slot of a branch-likely instruction that is not taken and is therefore
    nullified.

Low-order branch issued
    Counts the number of branches issued that appeared in
    the low-order (even) position of an instruction pair fetch. This
    count is needed since only these branches are subject to BTAC
    lookup.

No event
    This “event” effectively disables the corresponding counter. It is
    useful principally if only one of the two counters need be activated.

Non-BDS instruction completed (for stepping)
    This event triggers when an instruction that does not have a
    branch delay slot completes. In particular, it does not trigger when
    a branch or jump instruction completes. However, it does trigger
    when the instruction in the branch delay slot of the branch or
    jump completes. In the case of a branch-likely instruction, the
    instruction in the branch delay slot triggers the event even if this
    instruction is nullified. Note: this event is useful for stepping over
    instructions.

* (Dual instruction issued) * 2 + (Single instruction issued) = instruction issued
  (Instruction issued) − (Instruction completed) = instruction canceled


Non-blocking load/store (1st cache miss)
    This event is signaled whenever a cached load/store/pref
    instruction misses on the Data Cache and there is no pending
    data cache miss, UCAB miss, or uncached load.

Processor cycle
    This event triggers on every processor clock cycle.

Single instruction issued
    This event is signaled whenever only one of the functional pipes
    of the C790 is issued an instruction*.

Store completed
    This event triggers when a store instruction completes. Note that
    the event is signaled even if the store appears in the branch delay
    slot of a branch-likely instruction that is not taken and is
    therefore nullified.

WBB Single Request
    A non-burst request was made to the WBB.

WBB Burst Request
    A burst request was made to the WBB.

WBB Single Request unavailable
    A non-burst request was made to the WBB, but there were
    insufficient free entries in the WBB to service it. All 8 entries are
    used at that time.

WBB Burst Request unavailable
    A burst request was made to the WBB, but the WBB was
    completely full or there were not enough free entries to service the
    request. 5, 6, 7, or 8 entries are used at that time.

WBB Burst Request almost full
    A burst request was made to the WBB, and even though there
    were free entries, there were not enough to service the request. 5,
    6, or 7 entries are used at that time.

WBB Burst Request full
    A burst request was made to the WBB, but the WBB was
    completely full. All 8 entries are used at that time.
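Several of the event descriptions suggest derived statistics: the issue and cancel counts from the footnote formulas, and the miss rates obtained by dividing by 'DTLB accessed'. The sketch below shows that arithmetic. The `perf_sample` structure and function names are hypothetical, and since only two counters exist, the sample is assumed to be assembled from several runs of the same workload.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical collection of event counts gathered over repeated runs. */
struct perf_sample {
    uint64_t dual_issued;    /* Dual instruction issued   */
    uint64_t single_issued;  /* Single instruction issued */
    uint64_t completed;      /* Instruction completed     */
    uint64_t dcache_miss;    /* Data cache miss           */
    uint64_t dtlb_miss;      /* DTLB miss                 */
    uint64_t dtlb_accessed;  /* DTLB accessed             */
};

/* (Dual instruction issued) * 2 + (Single instruction issued) */
uint64_t instructions_issued(const struct perf_sample *s)
{
    return s->dual_issued * 2u + s->single_issued;
}

/* (Instruction issued) - (Instruction completed) */
uint64_t instructions_canceled(const struct perf_sample *s)
{
    uint64_t issued = instructions_issued(s);
    assert(s->completed <= issued);
    return issued - s->completed;
}

/* Estimated D$ miss rate; meaningful only if no uncached accesses occur. */
double dcache_miss_rate(const struct perf_sample *s)
{
    if (s->dtlb_accessed == 0)
        return 0.0;
    return (double)s->dcache_miss / (double)s->dtlb_accessed;
}

/* DTLB miss rate. */
double dtlb_miss_rate(const struct perf_sample *s)
{
    if (s->dtlb_accessed == 0)
        return 0.0;
    return (double)s->dtlb_miss / (double)s->dtlb_accessed;
}
```

Counts taken from separate runs only combine meaningfully when the workload is deterministic, so rates computed this way are best read as estimates.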


9.3.2 Handling Performance Counter Exceptions

A performance counter exception is detected by an instruction if the following condition

holds true:

~STATUS.ERL && PCCR.CTE && (CTR0.OVFL || CTR1.OVFL)

Note that software should not rely on the exception occurring if the instruction is nullified, i.e. if it appears in the branch delay slot of a branch-likely instruction that is not taken.

C790 Implementation Note:

The C790 implementation always counts events that occur within nullified instructions.

The instruction detecting a counter exception is canceled by the exception, and instruction execution continues as follows:

    if ( in branch delay slot ) {
        ErrorEPC = PC - 4;
        CAUSE.BD2 = 1;
    }
    else {
        ErrorEPC = PC;
        CAUSE.BD2 = 0;
    }

    if ( STATUS.DEV )
        PC = 0xBFC00280;    // Uncached counter xcp handler
    else
        PC = 0x80000080;    // “Normal” counter xcp handler

    STATUS.ERL = 1;
    CAUSE.EXC2 = 2;         // Counter exception

The description above makes use of the BD2 and EXC2 fields in the CAUSE register. Both are fields newly introduced in the C790 and occupy the bit positions shown below.

[Figure: bit layout of the 32-bit CAUSE register, with BD in bit 31 and BD2 in bit 30, followed by the CE, EXC2, IP7, SIOP, IP3, IP2 and EXC fields; all remaining bits read as 0.]

Figure 9-3. CAUSE Register Fields
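The detection condition and the entry sequence of this section can be sketched as a small software model. This is only an illustration under stated assumptions: the `CounterModel` structure, its field names and both function names are hypothetical and do not correspond to any hardware interface; only the condition, the ErrorEPC/BD2 rule, the two vector addresses and the EXC2 code are taken from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of the state named in the pseudocode. */
typedef struct {
    uint32_t pc;          /* address of the instruction detecting the exception */
    uint32_t error_epc;   /* ErrorEPC */
    bool     status_erl;  /* STATUS.ERL */
    bool     status_dev;  /* STATUS.DEV */
    bool     pccr_cte;    /* PCCR.CTE */
    bool     ctr0_ovfl;   /* CTR0.OVFL */
    bool     ctr1_ovfl;   /* CTR1.OVFL */
    bool     cause_bd2;   /* CAUSE.BD2 */
    unsigned cause_exc2;  /* CAUSE.EXC2 */
} CounterModel;

/* ~STATUS.ERL && PCCR.CTE && (CTR0.OVFL || CTR1.OVFL) */
static bool counter_exception_pending(const CounterModel *m)
{
    assert(m != NULL);
    return !m->status_erl && m->pccr_cte && (m->ctr0_ovfl || m->ctr1_ovfl);
}

/* Applies the entry sequence; returns true if the exception was taken. */
static bool take_counter_exception(CounterModel *m, bool in_branch_delay_slot)
{
    assert(m != NULL);
    if (!counter_exception_pending(m))
        return false;

    if (in_branch_delay_slot) {
        m->error_epc = m->pc - 4;   /* point back at the branch */
        m->cause_bd2 = true;
    } else {
        m->error_epc = m->pc;
        m->cause_bd2 = false;
    }

    /* STATUS.DEV selects the uncached or the "normal" handler address. */
    m->pc = m->status_dev ? 0xBFC00280u : 0x80000080u;

    m->status_erl = true;           /* enter level 2 exception mode */
    m->cause_exc2 = 2;              /* counter exception */
    return true;
}
```

Because the entry sequence sets STATUS.ERL, a second call on the same model reports that no exception is pending, which mirrors the "~STATUS.ERL" term of the condition.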

C790 Programming Note:

Note that the “normal” exception entry point is in kseg0 space. That is, the address is unmapped and the caching policy is determined by CONFIG.K0. If you don’t want to disturb the cache while counting and stepping, kseg0 should be configured in “uncached” mode. If preserving cache contents is secondary to the speed of servicing the counter exception, kseg0 should be configured in “cached” mode.


9.3.3 Priority of Counter Exceptions

Counter exceptions have the highest priority after cold reset and NMI. If a cold reset occurs, the processor is initialized, so a simultaneous counter exception is discarded. If an NMI occurs, the NMI handler is entered with either PCR0.OVFL or PCR1.OVFL (or both) set to 1, and ErrorEPC pointing at the instruction causing the counter overflow. (ErrorEPC is used because NMI is handled as a level 2 exception.) Once the NMI handler exits, the instruction that caused the overflow is re-executed. However, since PCR0.OVFL or PCR1.OVFL is 1, the instruction is canceled once more and the counter exception handler is entered.

9.3.4 Initializing Counters

Let us look at the code sequence needed to initialize counters and activate them. In the example below, PCR0 is set up to count clocks in all operating modes and report a counter exception after the count exceeds 2^31. CTR1 is set up to count stores while in supervisor mode only, and report a counter exception after the count exceeds 2^31. The code must be executed while in level 2 exception mode (ERL=1).

    STATUS.ERL = 1;        // Set ERL (to inhibit counting)
    ErrorEPC = <target instruction where counting is to start>

    PCR0 = 0;              // Init CTR0, and …
    PCCR.EVENT0 = 1;       // … set up to count clocks …
    PCCR.U0 = 1;           // … in all privilege modes
    PCCR.S0 = 1;
    PCCR.K0 = 1;
    PCCR.EXL0 = 1;

    PCR1 = 0;              // Init CTR1, and …
    PCCR.EVENT1 = 15;      // … set up to count completed stores …
    PCCR.U1 = 0;           // … while in supervisor mode
    PCCR.S1 = 1;
    PCCR.K1 = 0;
    PCCR.EXL1 = 0;

    PCCR.CTE = 1;          // Enable global counter flag

    ERET                   // Execute ERET to clear ERL -
                           // counting begins with ERET’s target
                           // Note that the ERET instruction also
                           // guarantees that the COP0 state
                           // updated (e.g. CCR) is valid.


9.3.5 Notes on Reading Counters

Whenever you read a counter with MFC0 or MFPC, make sure that no counted event can occur during the read; otherwise you may get a wrong number. For example, a counter for a TLB event should be read from an unmapped area, and a counter for the instruction completion event should be read with ERL=1 (level 2 exception) or in another mode in which counting is disabled.

The exact point at which an event is counted is implementation-dependent. It depends on the number of pipeline stages, among other things.

To write code that is robust across silicon and mask versions, read the counters after flushing the pipeline with the SYNC.P instruction. Because the C790 is a pipelined processor, this is required for instruction-completion-type events.

Some inaccuracy is inherent in event counting. Do not be surprised if different numbers are observed on different silicon or mask versions.

Chapter 10 Floating-Point Unit , CP1

10-1

10. Floating-Point Unit, CP1 (Option)

This chapter describes the floating-point operations, including the programming model,

instruction set and formats.

The floating-point operations fully conform to the requirements of ANSI/IEEE Standard 754-1985, IEEE Standard for Binary Floating-Point Arithmetic.


10.1 Overview

All floating-point instructions, as defined in the MIPS ISA for the floating-point coprocessor, CP1, are processed by a hardware unit separate from the one that executes integer instructions.

The floating-point execution unit can be disabled by the coprocessor usability (CU) bit defined in the CP0 Status register.

10.2 Floating Point Register

10.2.1 Floating-Point General Registers (FGRs)

CP1 has a set of Floating-Point General Purpose registers (FGRs) that can be accessed in the following ways:

• As 32 general purpose registers (32 FGRs), each of which is 32 bits wide when the FR bit in the CPU Status register equals 0; or as 32 general purpose registers (32 FGRs), each of which is 64 bits wide when FR equals 1. The CPU accesses these registers through move, load, and store instructions.

• As 16 floating-point registers (see the next section for a description of FPRs), each of which is 64 bits wide, when the FR bit in the CPU Status register equals 0. The FPRs hold values in either single- or double-precision floating-point format. Each FPR corresponds to adjacently numbered FGRs as shown in Figure 10-1.

• As 32 floating-point registers (see the next section for a description of FPRs), each of which is 64 bits wide, when the FR bit in the CPU Status register equals 1. The FPRs hold values in either single- or double-precision floating-point format. Each FPR corresponds to an FGR as shown in Figure 10-1.


FR = 0                                              FR = 1
Floating-point     Floating-Point General           Floating-point     Floating-Point General
Registers (FPR)    Purpose Registers (FGR),         Registers (FPR)    Purpose Registers (FGR),
                   bits 31..0                                          bits 63..0

FPR0               FGR0 (least), FGR1 (most)        FPR0               FGR0
                                                    FPR1               FGR1
FPR2               FGR2 (least), FGR3 (most)        FPR2               FGR2
                                                    FPR3               FGR3
••                 ••                               ••                 ••
FPR28              FGR28 (least), FGR29 (most)      FPR28              FGR28
                                                    FPR29              FGR29
FPR30              FGR30 (least), FGR31 (most)      FPR30              FGR30
                                                    FPR31              FGR31

Floating-point Control Registers (FCR)
Control/Status Register (FCR31), bits 31..0         Implementation/Revision Register (FCR0), bits 31..0

Figure 10-1. FP Registers


10.2.2 Floating-Point Registers (FPRs)

The FPU provides:

• 16 Floating-Point registers (FPRs) when the FR bit in the Status register equals 0, or

• 32 Floating-Point registers (FPRs) when the FR bit in the Status register equals 1.

These 64-bit registers hold floating-point values during floating-point operations and are physically formed from the Floating-Point General Purpose registers (FGRs). When the FR bit in the Status register equals 1, each FPR references a single 64-bit FGR.

The FPRs hold values in either single- or double-precision floating-point format. If the FR bit equals 0, only even numbers (the least register) can be used to address FPRs. When the FR bit is set to 1, all FPR register numbers are valid.

If the FR bit equals 0 during a double-precision floating-point operation, the general registers are accessed in pairs. Thus, in a double-precision operation, selecting Floating-Point Register 0 (FPR0) actually addresses the adjacent Floating-Point General Purpose registers FGR0 and FGR1.
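The FPR-to-FGR correspondence described above can be written down as a small lookup. The sketch below is illustrative only: the `FprMapping` type and the `map_double_fpr` function are hypothetical names, and the code models the numbering rule for a double-precision operand, not any hardware access.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type: which FGRs back a given FPR number. */
typedef struct {
    bool valid;     /* false if the FPR number cannot be used with this FR value */
    int  low_fgr;   /* FGR holding the least-significant half (or the whole value) */
    int  high_fgr;  /* FGR holding the most-significant half, or -1 if none */
} FprMapping;

static FprMapping map_double_fpr(int fpr, bool fr)
{
    assert(fpr >= 0 && fpr < 32);
    FprMapping m = { false, -1, -1 };

    if (fr) {
        /* FR = 1: every FPR number is valid and refers to one 64-bit FGR. */
        m.valid = true;
        m.low_fgr = fpr;
    } else if (fpr % 2 == 0) {
        /* FR = 0: only even numbers address FPRs; the value spans an FGR pair. */
        m.valid = true;
        m.low_fgr = fpr;
        m.high_fgr = fpr + 1;
    }
    return m;
}
```

For example, with FR = 0 the call `map_double_fpr(0, false)` reports FGR0 as the least and FGR1 as the most significant register, matching the FPR0 case in the text.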

10.2.3 Floating-Point Control Registers

The MIPS RISC architecture defines 32 floating-point control registers (FCRs); the C790 processor implements two of these registers: FCR0 and FCR31. These FCRs are described below:

• The Implementation/Revision register (FCR0) holds revision information.

• The Control/Status register (FCR31) controls and monitors exceptions, holds the result of compare operations, and establishes rounding modes.

• FCR1 to FCR30 are reserved.

Table 10-1 lists the assignments of the FCRs.

Table 10-1. Floating-Point Control Register Assignments

FCR Number       Use
FCR0             Coprocessor implementation and revision register
FCR1 to FCR30    Reserved
FCR31            Rounding mode, cause, trap enables, and flags

Implementation and Revision Register (FCR0)

The read-only Implementation and Revision register (FCR0) specifies the implementation and revision number of CP1. This information can determine the coprocessor revision and performance level, and can also be used by diagnostic software.

Figure 10-2 shows the layout of the register; Table 10-2 describes the Implementation and Revision register (FCR0) fields.

Implementation/Revision Register (FCR0)

Bits     31:16   15:8   7:0
Field    0       Imp    Rev
Width    16      8      8

Figure 10-2. Implementation/Revision Register

Table 10-2. FCR0 Fields

Field   Description                                Initial value
Imp     Implementation number                      0x38
Rev     Revision number in the form of y.x         Revision Number
0       Reserved. Returns zeroes when read.

The revision number is a value of the form y.x, where:

• y is a major revision number held in bits 7:4.

• x is a minor revision number held in bits 3:0.

The revision number distinguishes some chip revisions; however, there is no guarantee that changes to the chip are necessarily reflected by the revision number, or that changes to the revision number necessarily reflect real chip changes. For this reason revision number values are not listed, and software should not rely on the revision number to characterize the chip.
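As an illustration of the field layout above, the following sketch splits a 32-bit FCR0 value into its Imp, major and minor parts. It assumes the value has already been read (for example with CFC1); the `Fcr0Fields` type and `decode_fcr0` function are hypothetical names, and, as noted above, the decoded revision should be used for diagnostics only.

```c
#include <stdint.h>

/* Hypothetical container for the decoded FCR0 fields. */
typedef struct {
    unsigned imp;    /* implementation number, bits 15:8 */
    unsigned major;  /* y of revision y.x, bits 7:4 */
    unsigned minor;  /* x of revision y.x, bits 3:0 */
} Fcr0Fields;

static Fcr0Fields decode_fcr0(uint32_t fcr0)
{
    Fcr0Fields f;
    f.imp   = (fcr0 >> 8) & 0xFFu;
    f.major = (fcr0 >> 4) & 0xFu;
    f.minor = fcr0 & 0xFu;
    return f;
}
```

A made-up value such as 0x00003812 would decode to implementation number 0x38 and revision 1.2; the revision digits here are purely illustrative, since actual revision values are not listed.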

IEEE Standard 754

IEEE Standard 754 specifies that floating-point operations detect certain exceptional cases, raise flags, and can invoke an exception handler when an exception occurs. These features are implemented in the MIPS architecture with the Cause, Enable, and Flag fields of the Control/Status register. The Flag bits implement IEEE 754 exception status flags, and the Cause and Enable bits implement exception handling.

Control/Status Register (FCR31)

The Control/Status register (FCR31) contains control and status information that can be accessed by instructions in either Kernel or User mode. FCR31 also controls the arithmetic rounding mode and enables User mode traps, as well as identifying any exceptions that may have occurred in the most recently executed floating-point instruction, along with any exceptions that may have occurred without being trapped.

Figure 10-3 shows the format of the Control/Status register, and Table 10-3 describes the Control/Status register fields. Figure 10-4 shows the Control/Status register Cause, Flag, and Enable fields.

Control/Status Register (FCR31)

Bits     31:25   24   23   22:18   17:12         11:7        6:2         1:0
Field    0       FS   C    0       Cause         Enables     Flags       RM
                                   E V Z O U I   V Z O U I   V Z O U I
Width    7       1    1    5       6             5           5           2

Figure 10-3. FP Control/Status Register Bit Assignments

Table 10-3. Control/Status Register Fields

Field     Description
FS        When set, denormalized results can be flushed instead of causing an unimplemented operation exception.
C         Condition bit. See description of Control/Status register Condition bit.
Cause     Cause bits. See Figure 10-4 and the description of Control/Status register Cause, Flag, and Enable bits.
Enables   Enable bits. See Figure 10-4 and the description of Control/Status register Cause, Flag, and Enable bits.
Flags     Flag bits. See Figure 10-4 and the description of Control/Status register Cause, Flag, and Enable bits.
RM        Rounding mode bits. See Table 10-5 and the description of Control/Status register Rounding Mode Control bits.

Exception                      Cause Bits   Enable Bits   Flag Bits
                               (Bit #)      (Bit #)       (Bit #)
E  Unimplemented Operation     17           (none)        (none)
V  Invalid Operation           16           11            6
Z  Division by Zero            15           10            5
O  Overflow                    14           9             4
U  Underflow                   13           8             3
I  Inexact Operation           12           7             2

Figure 10-4. Control/Status Register Cause, Flag, and Enable Fields
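The bit assignments of Figures 10-3 and 10-4 translate directly into shift-and-mask helpers. The sketch below assumes a 32-bit FCR31 value that has already been read; all constant and function names are hypothetical, chosen only for this illustration.

```c
#include <stdint.h>

/* Bit offset of each exception within the Cause, Enable and Flag fields.
   The E (Unimplemented Operation) bit exists only in the Cause field. */
enum { EXC_I = 0, EXC_U = 1, EXC_O = 2, EXC_Z = 3, EXC_V = 4, EXC_E = 5 };

/* RM: bits 1:0 */
static unsigned fcr31_rm(uint32_t v)      { return v & 0x3u; }
/* Flags: bits 6:2 (V Z O U I) */
static unsigned fcr31_flags(uint32_t v)   { return (v >> 2) & 0x1Fu; }
/* Enables: bits 11:7 (V Z O U I) */
static unsigned fcr31_enables(uint32_t v) { return (v >> 7) & 0x1Fu; }
/* Cause: bits 17:12 (E V Z O U I) */
static unsigned fcr31_cause(uint32_t v)   { return (v >> 12) & 0x3Fu; }
/* C (condition): bit 23 */
static unsigned fcr31_c(uint32_t v)       { return (v >> 23) & 0x1u; }
/* FS (flush denormalized results): bit 24 */
static unsigned fcr31_fs(uint32_t v)      { return (v >> 24) & 0x1u; }
```

With these helpers, a test such as `fcr31_cause(v) & (1u << EXC_O)` checks whether the most recent operation raised Overflow.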

Control/Status Register FS Bit

The FS bit enables the flushing of denormalized values. When the FS bit is set and the Underflow and Inexact Enable bits are not set, denormalized results are flushed instead of causing an Unimplemented Operation exception. Results are flushed to either 0 or the minimum normalized value, depending upon the rounding mode (see Table 10-4 below), and the Underflow and Inexact bits of the Cause and Flag fields are set.

Table 10-4. Flush Values of Denormalized Results

Denormalized    Flushed Result by Rounding Mode
Result          RN      RZ      RP          RM
Positive        +0      +0      +2^Emin     +0
Negative        -0      -0      -0          -2^Emin

Control/Status Register Condition Bit

When a floating-point Compare operation takes place, the result is stored at bit 23, the Condition bit. The C bit is set to 1 if the condition is true; the bit is cleared to 0 if the condition is false. Bit 23 is affected only by compare and CTC1 instructions.

Control/Status Register Cause, Flag, and Enable Fields

Figure 10-4 illustrates the Cause, Flag, and Enable fields of the Control/Status register. The Cause and Flag fields are updated by all conversion, computational (except MOV.fmt), CTC1, reserved, and unimplemented instructions. All other instructions have no effect on these fields.

Cause Bits

Bits 17:12 in the Control/Status register contain Cause bits, as shown in Figure 10-4, which reflect the results of the most recently executed floating-point instruction. The Cause bits are a logical extension of the CP0 Cause register; they identify the exceptions raised by the last floating-point operation. If the corresponding Enable bit is set at the time of the exception, a floating-point exception is raised and trapped by the CPU. If more than one exception occurs on a single instruction, each appropriate bit is set.

The Cause bits are updated by most floating-point operations. The Unimplemented Operation (E) bit is set to 1 if software emulation is required, otherwise it remains 0. The other bits are set to 1 or 0 to indicate the occurrence or non-occurrence (respectively) of an IEEE 754 exception. Within the set of floating-point instructions that update the Cause bits, the Cause field indicates the exceptions raised by the most recently executed instruction.

When a floating-point exception is taken, no results are stored, and the only state affected is the Cause bits.

Enable Bits

A floating-point exception is generated any time a Cause bit and the corresponding Enable bit are set. A floating-point operation that sets an enabled Cause bit forces an immediate floating-point exception, as does setting both Cause and Enable bits with CTC1.

There is no enable for Unimplemented Operation (E). An Unimplemented exception always generates a floating-point exception.

Before returning from a floating-point exception, software must first clear the enabled Cause bits with a CTC1 instruction to prevent a repeat of the exception trapping. Thus, User mode programs can never observe enabled Cause bits set; if this information is required in a User mode handler, it must be passed somewhere other than the Status register.

For a floating-point operation that sets only unenabled Cause bits, no floating-point exception occurs and the default result defined by IEEE 754 is stored. In this case, the exceptions that were caused by the immediately previous floating-point operation can be determined by reading the Cause field.
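The trapping rule of this subsection (an exception is generated when a Cause bit and its Enable bit are both set, and always for E) can be expressed as one predicate over an FCR31 value. This is a sketch under the bit positions given in Figure 10-4; the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the given FCR31 value would generate a floating-point
   exception: any Cause bit with its Enable bit set, or the E Cause bit. */
static bool fp_exception_generated(uint32_t fcr31)
{
    uint32_t cause   = (fcr31 >> 12) & 0x3Fu;  /* bits 17:12, E V Z O U I */
    uint32_t enables = (fcr31 >> 7) & 0x1Fu;   /* bits 11:7,    V Z O U I */
    bool unimplemented = ((cause >> 5) & 1u) != 0u;

    /* The low five Cause bits line up one-to-one with the Enable bits. */
    return unimplemented || ((cause & 0x1Fu) & enables) != 0u;
}
```

This also shows why a handler has to clear the enabled Cause bits with CTC1 before returning: as long as they remain set together with their Enable bits, the predicate stays true.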

Flag Bits

The Flag bits are cumulative and indicate the exceptions that were raised by the operations that were executed since the bits were explicitly reset. Flag bits are set to 1 if an IEEE 754 exception is raised, otherwise they remain unchanged. The Flag bits are never cleared as a side effect of floating-point operations; however, they can be set or cleared by writing a new value into the Control/Status register, using a CTC1 instruction.

When a floating-point exception is trapped, the flag bits are not set by the hardware; floating-point exception software is responsible for setting these bits before invoking a user handler.

Control/Status Register Rounding Mode Control Bits

Bits 1 and 0 in the Control/Status register constitute the Rounding Mode (RM) field. As shown in Table 10-5, these bits specify the rounding mode that CP1 uses for all floating-point operations.

Table 10-5. Rounding Mode Bit Decoding

RM (1:0)   Mnemonic   Description
0          RN         Round result to nearest representable value; round to value with least-significant bit 0 when the two nearest representable values are equally near.
1          RZ         Round toward 0: round to value closest to and not greater in magnitude than the infinitely precise result.
2          RP         Round toward +∞: round to value closest to and not less than the infinitely precise result.
3          RM         Round toward −∞: round to value closest to and not greater than the infinitely precise result.

10.2.4 Accessing the FP Control and Implementation/Revision Registers

The Control/Status and the Implementation/Revision registers are read by a Move Control From Coprocessor 1 (CFC1) instruction.

The bits in the Control/Status register can be set or cleared by writing to the register using a Move Control To Coprocessor 1 (CTC1) instruction. The Implementation/Revision register is a read-only register. There are no pipeline hazards (between any instructions) associated with floating-point control registers.


10.3 Floating-Point Formats

CP1 performs both 32-bit (single-precision) and 64-bit (double-precision) IEEE standard floating-point operations. The 32-bit single-precision format has a 24-bit signed-magnitude fraction field (f+s) and an 8-bit exponent (e), as shown in Figure 10-5.

Bits     31     30:23      22:0
Field    s      e          f
         Sign   Exponent   Fraction
Width    1      8          23

Figure 10-5. Single-Precision Floating-Point Format

The 64-bit double-precision format has a 53-bit signed-magnitude fraction field (f+s) and an 11-bit exponent, as shown in Figure 10-6.

Bits     63     62:52      51:0
Field    s      e          f
         Sign   Exponent   Fraction
Width    1      11         52

Figure 10-6. Double-Precision Floating-Point Format

As shown in the above figures, numbers in floating-point format are composed of three fields:

• sign field, s

• biased exponent, e = E + bias

• fraction, f = b1 b2 .... b(p-1)

where bias = 127, p = 24 in single precision, and bias = 1023, p = 53 in double precision.

The range of the unbiased exponent E includes every integer between the two values Emin and Emax inclusive, together with two other reserved values:

• Emin − 1 (to encode 0 and denormalized numbers)

• Emax + 1 (to encode ∞ and NaNs [Not a Number])

For single- and double-precision formats, each representable nonzero numerical value has exactly one encoding.

For single- and double-precision formats, the value of a number, v, is determined by the equations shown in Table 10-6.

Table 10-6. Equations for Calculating Values in Single and Double-Precision Floating-Point Format

Equation                          Condition
v = NaN                           E = Emax+1 and f ≠ 0, regardless of s
v = (−1)^s × ∞                    E = Emax+1 and f = 0
v = (−1)^s × 2^E × (1.f)          Emin ≤ E ≤ Emax
v = (−1)^s × 2^Emin × (0.f)       E = Emin−1 and f ≠ 0
v = (−1)^s × 0                    E = Emin−1 and f = 0

For all floating-point formats, if v is NaN, the most-significant bit of f determines whether the value is a signaling or quiet NaN: v is a signaling NaN if the most-significant bit of f is set, otherwise, v is a quiet NaN.
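The equations of Table 10-6 can be applied mechanically to a raw single-precision bit pattern. The sketch below is a host-side illustration, not a description of the hardware: the type and function names are hypothetical, the result is returned as a `double` so that every single-precision value is represented exactly, and the NaN classification follows the rule stated above (most-significant fraction bit set means signaling).

```c
#include <math.h>    /* INFINITY and NAN macros only */
#include <stdint.h>

typedef enum {
    KIND_ZERO, KIND_DENORMAL, KIND_NORMAL, KIND_INFINITY, KIND_SNAN, KIND_QNAN
} ValueKind;

typedef struct { ValueKind kind; double value; } DecodedSingle;

/* Multiplies x by 2^e exactly using repeated doubling or halving. */
static double scale_by_pow2(double x, int e)
{
    for (; e > 0; --e) x *= 2.0;
    for (; e < 0; ++e) x *= 0.5;
    return x;
}

static DecodedSingle decode_single(uint32_t bits)
{
    const int bias = 127, emin = -126, emax = 127;
    const int s = (int)((bits >> 31) & 1u);
    const int E = (int)((bits >> 23) & 0xFFu) - bias;  /* unbiased exponent */
    const uint32_t f = bits & 0x7FFFFFu;               /* 23 fraction bits */
    const double sign = s ? -1.0 : 1.0;
    const double frac = (double)f / 8388608.0;         /* 0.f, i.e. f / 2^23 */
    DecodedSingle d;

    if (E == emax + 1) {
        if (f != 0u) {                                 /* v = NaN */
            d.kind = (f & 0x400000u) ? KIND_SNAN : KIND_QNAN;
            d.value = NAN;
        } else {                                       /* v = (-1)^s * inf */
            d.kind = KIND_INFINITY;
            d.value = sign * INFINITY;
        }
    } else if (E == emin - 1) {
        if (f != 0u) {                                 /* v = (-1)^s 2^Emin (0.f) */
            d.kind = KIND_DENORMAL;
            d.value = sign * scale_by_pow2(frac, emin);
        } else {                                       /* v = (-1)^s 0 */
            d.kind = KIND_ZERO;
            d.value = sign * 0.0;
        }
    } else {                                           /* v = (-1)^s 2^E (1.f) */
        d.kind = KIND_NORMAL;
        d.value = sign * scale_by_pow2(1.0 + frac, E);
    }
    return d;
}
```

For instance, the pattern 0x3F800000 has s = 0, E = 0 and f = 0, so the normalized equation gives the value 1.0, while 0x00000001 falls under the denormalized equation.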

Table 10-7 defines the values for the format parameters; minimum and maximum floating-point values are given in Table 10-8.

Table 10-7. Floating-Point Format Parameter Values

Parameter                  Single    Double
Emax                       +127      +1023
Emin                       −126      −1022
Exponent bias              +127      +1023
Exponent width in bits     8         11
Integer bit                hidden    hidden
Fraction width in bits     23†       52†
Format width in bits       32        64

† Excluding the sign bit.

Table 10-8. Minimum and Maximum Floating-Point Values

Type                    Value
Float Minimum           1.40129846e-45
Float Minimum Norm      1.17549435e-38
Float Maximum           3.40282347e+38
Double Minimum          4.9406564584124654e-324
Double Minimum Norm     2.2250738585072014e-308
Double Maximum          1.7976931348623157e+308


10.4 Binary Fixed-Point Format

Binary fixed-point values are held in 2’s complement format. Unsigned fixed-point values are not directly provided by the floating-point instruction set. Figure 10-7 illustrates binary word fixed-point format and Figure 10-8 illustrates binary long fixed-point format; Table 10-9 lists the binary fixed-point format fields.

Bits     31     30:0
Field    Sign   Integer
Width    1      31

Figure 10-7. Binary Word Fixed-Point Format

Bits     63     62:0
Field    Sign   Integer
Width    1      63

Figure 10-8. Binary Long Fixed-Point Format

Field assignments of the binary fixed-point format are:

Table 10-9. Binary Fixed-Point Format Fields

Field      Description
sign       sign bit
integer    integer value (2’s complement)


10.5 Floating-Point Instruction Set Summary

Each instruction is 32 bits long, and aligned on a word boundary. This section gives an overview of the instructions for the floating-point unit. A detailed description of each instruction is provided in Appendix D.

10.5.1 Load, Store and Move Instructions (Table 10-10)

Load and Store instructions move data between memory and FPU general purpose registers (FGR), and Move instructions move data directly between CPU and FPU general purpose registers (FGR). These instructions do not perform format conversions and therefore never cause floating-point exceptions. The instruction immediately following a load can use the contents of the loaded register. However, in such a case the hardware interlocks, requiring additional real cycles. Thus, the scheduling of load delay slots is required to avoid the interlocking.

Table 10-10. FPU Instruction Set (Optional): Load, Move and Store Instruction

Instruction   Description                                    Note
LWC1          Load Word to FPU (coprocessor 1)               MIPS I
SWC1          Store Word from FPU (coprocessor 1)            MIPS I
MTC1          Move Word to FPU (coprocessor 1)               MIPS I
MFC1          Move Word from FPU (coprocessor 1)             MIPS I
CTC1          Move Control Word to FPU (coprocessor 1)       MIPS I
CFC1          Move Control Word from FPU (coprocessor 1)     MIPS I
LDC1          Load Doubleword to FPU (coprocessor 1)         MIPS II
SDC1          Store Doubleword from FPU (coprocessor 1)      MIPS II
DMTC1         Move Doubleword to FPU (coprocessor 1)         MIPS III
DMFC1         Move Doubleword from FPU (coprocessor 1)       MIPS III

10.5.2 Conversion Instructions (Table 10-11)

Conversion instructions perform conversion operations between the various data formats.

Table 10-11. FPU Instruction Set (Optional): Conversion Instruction

Instruction    Description                                            Note
CVT.S.fmt      Floating-Point Convert to Single FP Format             MIPS I
CVT.W.fmt      Floating-Point Convert to Word Fixed-Point Format      MIPS I
CVT.D.fmt      Floating-Point Convert to Double FP Format             MIPS I
ROUND.W.fmt    Floating-point Round to Word Fixed-Point               MIPS II
TRUNC.W.fmt    Floating-point Truncate to Word Fixed-Point            MIPS II
CEIL.W.fmt     Floating-point Ceiling Convert to Word Fixed-Point     MIPS II
FLOOR.W.fmt    Floating-point Floor Convert to Word Fixed-Point       MIPS II
CVT.L.fmt      Floating-Point Convert to Long Fixed-Point Format      MIPS III
ROUND.L.fmt    Floating-point Round to Long Fixed-Point               MIPS III
TRUNC.L.fmt    Floating-point Truncate to Long Fixed-Point            MIPS III
CEIL.L.fmt     Floating-point Ceiling Convert to Long Fixed-Point     MIPS III
FLOOR.L.fmt    Floating-point Floor Convert to Long Fixed-Point       MIPS III

10.5.3 Computational Instructions (Table 10-12)

Computational instructions perform arithmetic operations on floating-point values in the FPU registers. There are two categories of computational instructions:

• 3-Operand Register-Type instructions, which perform floating-point addition, subtraction, multiplication, and division operations

• 2-Operand Register-Type instructions, which perform floating-point absolute value, move, negate, and square root operations.

Table 10-12. FPU Instruction Set (Optional): Computational Instruction

Instruction   Description                       Note
ADD.fmt       Floating-point Add                MIPS I
SUB.fmt       Floating-point Subtract           MIPS I
MUL.fmt       Floating-point Multiply           MIPS I
DIV.fmt       Floating-point Divide             MIPS I
ABS.fmt       Floating-point Absolute Value     MIPS I
MOV.fmt       Floating-point Move               MIPS I
NEG.fmt       Floating-point Negate             MIPS I
SQRT.fmt      Floating-point Square root        MIPS II

10.5.4 Compare and Branch Instructions (Table 10-13)

Compare instructions perform comparisons of the contents of registers and set a conditional bit based on the results. Branch on FPU Condition instructions perform a branch to the specified target if the specified coprocessor condition is met.

Table 10-13. FPU Instruction Set (Optional): Compare and Branch Instruction

Instruction   Description               Note
C.cond.fmt    Floating-point Compare    MIPS I
BC1T          Branch on FPU True        MIPS I
BC1F          Branch on FPU False       MIPS I

Chapter 11 Floating-Point Exception

11-1

11. Floating-Point Exception (Option)

This chapter describes FPU floating-point exceptions, including FPU exception types, exception trap processing, exception flags, saving and restoring state when handling an exception, and trap handlers for IEEE Standard 754 exceptions.

A floating-point exception occurs whenever the FPU cannot handle either the operands or the results of a floating-point operation in its normal way. The FPU responds by generating an exception to initiate a software trap or by setting a status flag.

11.1 Introduction

This chapter describes floating-point exceptions, including FPU exception type, exception trap processing, exception flags, saving and restoring state when handling an exception, and trap handlers for IEEE Standard 754 exceptions.

11.2 Exception Types

The FP Control/Status register described in Chapter 10 contains an Enable bit for each

exception type; exception Enable bits determine whether an exception will cause the FPU

to initiate a trap or set a status flag.

• If a trap is taken, the FPU remains in the state found at the beginning of the

operation and a software exception handling routine executes.

• If no trap is taken, an appropriate value is written into the FPU destination register

and execution continues.

The FPU supports the five IEEE Standard 754 exceptions:

• Inexact (I)

• Underflow (U)

• Overflow (O)

• Division by Zero (Z)

• Invalid Operation (V)

Cause bits, Enables, and Flag bits (status flags) are used.

The FPU adds a sixth exception type, Unimplemented Operation (E). This exception indicates the use of a software implementation. The Unimplemented Operation exception has no Enable or Flag bit; whenever this exception occurs, an unimplemented exception trap is taken.

Figure 11-1 shows the Control/Status register bits that support exceptions.

Exception               Cause Bits   Enable Bits   Flag Bits
                        (Bit #)      (Bit #)       (Bit #)
E  Unimplemented        17           (none)        (none)
V  Invalid              16           11            6
Z  Division by Zero     15           10            5
O  Overflow             14           9             4
U  Underflow            13           8             3
I  Inexact              12           7             2

Figure 11-1. Control/Status Register Exception/Flag/Trap/Enable Bits


11.3 Exception Trap Processing

When a floating-point exception trap is taken, the Cause register indicates the floating-

point coprocessor is the cause of the exception trap.

The Floating-Point Exception (FPE) code is used, and the Cause bits of the floating-point

Control/Status register indicate the reason for the floating-point exception. These bits are,

in effect, an extension of the system coprocessor Cause register.

11.4 Flags

A Flag bit is provided for each IEEE exception. This Flag bit is set to 1 on the assertion of its corresponding exception when no corresponding exception trap is signaled.

The Flag bit is reset by writing a new value into the Status register; flags can be saved and restored by software either individually or as a group.

When no exception trap is signaled, the floating-point coprocessor takes a default action, providing a substitute value for the exception-causing result of the floating-point operation. The particular default action taken depends upon the type of exception. Table 11-1 lists the default action taken by the FPU for each of the IEEE exceptions.

Table 11-1. Default FPU Exception Actions

Field   Description           Rounding Mode   Default action
I       Inexact exception     Any             Supply a rounded result
U       Underflow exception   RN              Modify underflow values to 0 with the sign of the intermediate result
                              RZ              Modify underflow values to 0 with the sign of the intermediate result
                              RP              Modify positive underflows to the format’s smallest positive finite number; modify negative underflows to −0.
                              RM              Modify negative underflows to the format’s smallest negative finite number; modify positive underflows to 0.
O       Overflow exception    RN              Modify overflow values to ∞ with the sign of the intermediate result
                              RZ              Modify overflow values to the format’s largest finite number with the sign of the intermediate result
                              RP              Modify negative overflows to the format’s most negative finite number; modify positive overflows to +∞
                              RM              Modify positive overflows to the format’s largest finite number; modify negative overflows to −∞
Z       Division by zero      Any             Supply a properly signed ∞
V       Invalid operation     Any             Supply 2^31 − 1 result (Word Fixed-Point); supply 2^63 − 1 result (Long Fixed-Point); otherwise supply a quiet Not a Number
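The Overflow rows of Table 11-1 can be illustrated for the single-precision format with a short function. This is a sketch of the table only, with hypothetical names; it uses the host's `FLT_MAX` and `INFINITY` to stand for the format's largest finite number and ∞.

```c
#include <float.h>   /* FLT_MAX */
#include <math.h>    /* INFINITY macro only */
#include <stdbool.h>

/* Rounding modes, numbered as in the RM field. Names are hypothetical. */
typedef enum { ROUND_RN = 0, ROUND_RZ = 1, ROUND_RP = 2, ROUND_RM = 3 } RoundingMode;

/* Default single-precision result supplied on overflow when the Overflow
   trap is disabled, given the sign of the intermediate result. */
static float default_overflow_result(bool negative, RoundingMode rm)
{
    switch (rm) {
    case ROUND_RN: return negative ? -INFINITY : INFINITY;
    case ROUND_RZ: return negative ? -FLT_MAX  : FLT_MAX;
    case ROUND_RP: return negative ? -FLT_MAX  : INFINITY;
    case ROUND_RM: return negative ? -INFINITY : FLT_MAX;
    }
    return 0.0f;  /* not reached for valid rounding modes */
}
```

The pattern is the same as for underflow flushing: the directed modes return ∞ only when rounding moves away from zero in the direction of the result's sign, and the largest finite magnitude otherwise.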

Chapter 11 Floating-Point Exception

11-4

The FPU detects the eight exception causes internally. When the FPU encounters one of

these unusual situations, it causes either an IEEE exception or an Unimplemented

Operation exception (E).

Table 11-2 lists the exception-causing situations and contrasts the behavior of the FPU

with the requirem ent s of t he IEEE St andar d 754.

Table 11-2.　FPU Exception-Causing Conditions

FPA Internal Result             IEEE Standard 754  Trap Enable  Trap Disable  Notes
Inexact result                  I                  I            I             Loss of accuracy
Exponent overflow               O, I (*1)          O, I         O, I          Normalized exponent > Emax
Division by zero                Z                  Z            Z             Zero is (exponent=Emin −1, mantissa=0)
Overflow on convert to Integer  V                  V (*2)       V (*2)        Source out of integer range, ∞, NaN
Signaling NaN source            V                  V            V
Invalid operation               V                  V            V             0/0, etc.
Exponent underflow              U                  E            UI (*3)       Normalized exponent < Emin
Denormalized or QNaN            None               E            E             Denormalized is (exponent=Emin −1 and mantissa <> 0)

(*1) The IEEE Standard 754 specifies an inexact exception on overflow only if the overflow trap is

disabled.

(*2) Some implementations, such as the TX49, trap as (E) and require software support. The TX79

implementation requires no software support.

(*3) Exponent underflow sets the U and I Cause bits if both the U and I Enable bits are not set and the

FS bit is set; otherwise exponent underflow sets the E Cause bit.


11.5 FPU Exceptions

The following sections describe the conditions that cause the FPU to generate each of its

exceptions, and detail the FPU response to each exception-causing condition.

Inexact Exception (I)


The FPU generates the Inexact exception if one of the following occurs:

• the rounded result of an operation is not exact, or

• the rounded result of an operation overflows, or

• the rounded result of an operation underflows and both the Underflow and Inexact

Enable bits are not set and the FS bit is set.

Trap Enabled Results: If Inexact exception traps are enabled, the result register is not

modified and the source registers are preserved.

Trap Disabled Results: The rounded or overflowed result is delivered to the destination

register if no other software trap occurs.


Invalid Operation Exception (V)


Floating-Point format operation


The Invalid Operation exception is signaled if one or both of the operands are invalid for

an implemented operation. When the exception occurs without a trap, the MIPS ISA

defines the result as a quiet Not a Number (QNaN) for Floating-Point format. The

invalid operations are:

• Addition or subtraction: magnitude subtraction of infinities, such as: (+∞) + (−∞) or

(−∞) − (−∞)

• Multiplication: 0 times ∞, with any signs

• Division: 0/0, or ∞/∞, with any signs

• Comparison of predicates involving ‘<’ or ‘>’ without ‘?’, when the operands are

unordered∗

• Any arithmetic operation, when one or both operands is a signaling NaN. A move

(MOV) operation is not considered to be an arithmetic operation, but absolute value

(ABS) and negate (NEG) are considered to be arithmetic operations.

• Comparison or Conversion From Floating-point Format on a signaling NaN.

• Square root: √x, where x is less than zero.

Software can simulate the Invalid Operation exception for other operations that are

invalid for the given source operands. Examples of these operations include IEEE

Standard 754-specified functions implemented in software, such as Remainder: x REM y,

where y is 0 or x is infinite; conversion of a floating-point number to a decimal format

whose value causes an overflow, is infinity, or is NaN; and transcendental functions,

such as ln(−5) or cos⁻¹(3). Refer to Appendix D for examples or for routines to handle

these cases.

Trap Enabled Results: The result register is not modified, and the source registers are

preserved.

Trap Disabled Results: A quiet NaN is delivered to the destination register if no other

software trap occurs.

Conversion to Integer format


The Invalid Operation exception is also raised when the source operand is an Infinity

(∞) or NaN, or the correctly rounded integer result is outside of the representable range.

Trap Enabled Results: The result register is not modified, and the source registers are

preserved.

Trap Disabled Results: The result value 2^31 −1 (for Word Fixed-Point) or 2^63 −1 (for

Long Fixed-Point) is delivered to the destination register if no

other software trap occurs.
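
The trap-disabled behavior for conversion to integer format can be illustrated with a small software model. This is only a sketch: `convert_to_fixed` is a hypothetical helper, not part of any C790 interface, and Python's `round` (round-half-to-even) merely stands in for the RN rounding mode.

```python
import math

WORD_MAX = 2**31 - 1   # default result for Word Fixed-Point
LONG_MAX = 2**63 - 1   # default result for Long Fixed-Point

def convert_to_fixed(value, long_format=False):
    """Model a float-to-integer conversion with the Invalid Operation trap disabled.

    Returns (result, invalid). Hypothetical helper for illustration only.
    """
    bits = 64 if long_format else 32
    lowest = -(2 ** (bits - 1))
    highest = 2 ** (bits - 1) - 1
    # Infinity or NaN sources are invalid and yield the default value.
    if math.isnan(value) or math.isinf(value):
        return highest, True
    rounded = round(value)  # round-half-to-even, standing in for RN
    # A rounded result outside the representable range is also invalid.
    if rounded < lowest or rounded > highest:
        return highest, True
    return rounded, False
```

Note that, following the text above, the same positive default value is delivered whatever the sign of the out-of-range source.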

∗‘<’, ‘>’ and ‘?’ are the notation in IEEE std 754.

‘?’ means ‘unordered.’ See Compare instruction in Appendix D.


Division-by-Zero Exception (Z)

The Division-by-Zero exception is signaled on an implemented divide operation if the

divisor is zero and the dividend is a finite nonzero number. Software can simulate this

exception for other operations that produce a signed infinity, such as ln(0), sec(π/2),

csc(0), or 0⁻¹.

Trap Enabled Results: The result register is not modified, and the source registers are

preserved.

Trap Disabled Results: The result, when no trap occurs, is a correctly signed infinity.

Overflow Exception (O)


The Overflow exception is signaled when the magnitude of the rounded floating-point

result, with an unbounded exponent range, is larger than the largest finite number of the

destination format. (This exception also signals an Inexact exception.)

Trap Enabled Results: The result register is not modified, and the source registers are

preserved.

Trap Disabled Results: The result, when no trap occurs, is determined by the rounding

mode and the sign of the intermediate result (see Table 11-3).

Table 11-3.　Values of Overflow Results

Denormalized  Flushed result, by Rounding Mode
Result        RN      RZ      RP      RM
Positive      +∞      +Emax   +∞      +Emax
Negative      −∞      −Emax   −Emax   −∞
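
The table can be expressed as a small lookup, shown here as a sketch. It assumes that Emax in the table denotes the largest finite number of the destination format; `overflow_result` and `MAX_SINGLE` are names invented for this illustration, and the single-precision value is used only as an example default.

```python
import math
import struct

# Largest finite single-precision value, taken here as the meaning of "Emax".
MAX_SINGLE = struct.unpack(">f", bytes.fromhex("7f7fffff"))[0]

def overflow_result(mode, negative, max_finite=MAX_SINGLE):
    """Return the trap-disabled overflow result for a rounding mode and sign."""
    # True means the result is an infinity, False means the largest finite number.
    to_infinity = {
        "RN": True,
        "RZ": False,
        "RP": not negative,   # rounding toward +inf: only positive overflows reach +inf
        "RM": negative,       # rounding toward -inf: only negative overflows reach -inf
    }[mode]
    magnitude = math.inf if to_infinity else max_finite
    return -magnitude if negative else magnitude
```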

Underflow Exception (U)


Two related events contribute to the Underflow exception:

• creation of a tiny nonzero result between ±2^Emin which can cause some later exception

because it is so tiny

• extraordinary loss of accuracy during the approximation of such tiny numbers by

denormalized numbers.

IEEE Standard 754 allows a variety of ways to detect these events, but requires they be

detected the same way for all operations.

Tininess can be detected by one of the following methods:

• after rounding (when a nonzero result, computed as though the exponent range were

unbounded, would lie strictly between ±2^Emin)

• before rounding (when a nonzero result, computed as though the exponent range and

the precision were unbounded, would lie strictly between ±2^Emin).

The MIPS architecture requires that tininess be detected after rounding.

Loss of accuracy can be detected by one of the following methods:


• denormalization loss (when the delivered result differs from what would have been

computed if the exponent range were unbounded)

• inexact result (when the delivered result differs from what would have been computed

if the exponent range and precision were both unbounded).

The MIPS architecture requires that loss of accuracy be detected as an inexact result.

Trap Enabled Results: If Underflow or Inexact traps are enabled, or if the FS bit is not

set, then an Unimplemented exception (E) is generated, and the

result register is not modified and the source registers are

preserved.

Trap Disabled Results: If Underflow and Inexact traps are not enabled and the FS bit is

set, the result is determined by the rounding mode and the sign

of the intermediate result (See Table 10-4).
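
The two cases above can be summarized in a short decision sketch. The flushed values follow the underflow rows of Table 11-1; `underflow_action` and its parameters are hypothetical names, and `min_positive` stands for the format's smallest positive finite number, supplied by the caller.

```python
def underflow_action(mode, negative, min_positive, u_enable, i_enable, fs):
    """Model the FPU response to an exponent underflow.

    Returns ("E", None) when the Unimplemented exception is raised, or
    ("UI", value) when the U and I Cause bits are set and a flushed value
    is delivered. Illustrative sketch only.
    """
    # Any enabled Underflow/Inexact trap, or FS cleared, leads to exception E.
    if u_enable or i_enable or not fs:
        return ("E", None)
    if mode in ("RN", "RZ"):
        value = -0.0 if negative else 0.0
    elif mode == "RP":
        value = -0.0 if negative else min_positive
    elif mode == "RM":
        value = -min_positive if negative else 0.0
    else:
        raise ValueError("unknown rounding mode: " + mode)
    return ("UI", value)
```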

Unimplemented Instruction Exception (E)


Any attempt to execute an instruction with an operation code or format code that has been

reserved for future definition sets the Unimplemented bit in the Cause field in the FPU

Control/Status register and traps. The operand and destination registers remain

undisturbed and the instruction is emulated in software. Any of the IEEE Standard 754

exceptions can arise from the emulated operation, and these exceptions are simulated.

The Unimplemented Instruction exception can also be signaled when unusual operands or

result conditions are detected that the implemented hardware cannot handle properly.

These include:

• Denormalized operand, except for Compare instruction

• Quiet Not a Number operand, except for Compare instruction

• Denormalized result or Underflow, when either Underflow or Inexact Enable bit is set

or the FS bit is not set.

• Reserved opcodes

• Unimplemented formats

• Operations which are invalid for their format (for instance, CVT.S.S)

NOTE: Denormalized and NaN operands are only trapped if the instruction is a convert or a

computational operation. A move operation does not trap if its operands are either

denormalized or NaNs.

The use of this exception for such conditions is optional; most of these conditions are

newly developed and are not expected to be widely used in early implementations.

Loopholes are provided in the architecture so that these conditions can be implemented

with assistance provided by software, maintaining full compatibility with the IEEE

Standard 754.

Trap Enabled Results: The result register is not modified, and the source registers are

preserved.

Trap Disabled Results: This trap cannot be disabled.


11.6 Saving and Restoring State

Sixteen doubleword† coprocessor load or store operations save or restore the coprocessor

floating-point register state in memory. The remainder of control and status information

can be saved or restored through CFC1/CTC1 instructions, and saving and restoring the

processor registers. Normally, the Control/Status register is saved first and restored last.

When state is restored, state information in the Control/Status register indicates the

exceptions that are pending. Writing a zero value to the Cause field of the Control/Status

register clears all pending exceptions, permitting normal processing to restart after the

floating-point register state is restored.
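
As an illustration of clearing pending exceptions, the sketch below manipulates a saved Control/Status value in software. It assumes the conventional MIPS FCR31 layout, with the Cause field in bits 17:12 ordered I, U, O, Z, V, E from bit 12 upward; check this against the register description for the actual device. `clear_cause` and `pending_causes` are hypothetical helper names.

```python
CAUSE_SHIFT = 12                    # assumed position of the Cause field (bits 17:12)
CAUSE_MASK = 0x3F << CAUSE_SHIFT    # six Cause bits: E, V, Z, O, U, I
CAUSE_NAMES = "IUOZVE"              # assumed order, from bit 12 upward

def pending_causes(fcr31):
    """List the exception causes recorded in a saved Control/Status value."""
    return [CAUSE_NAMES[i] for i in range(6) if (fcr31 >> (CAUSE_SHIFT + i)) & 1]

def clear_cause(fcr31):
    """Return the value to write back so that no exception is pending."""
    return fcr31 & ~CAUSE_MASK & 0xFFFFFFFF
```

A restore routine would typically write `clear_cause(saved_value)` with CTC1 once the floating-point registers have been reloaded.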

11.7 Trap Handlers for IEEE Standard 754 Exceptions

The IEEE Standard 754 strongly recommends that users be allowed to specify a trap

handler for any of the five standard exceptions so that a software subroutine can return a

value to be used instead of the exceptional operation’s result; the trap handler can either

compute or specify a substitute result to be placed in the destination register of the

operation.

By retrieving an instruction using the processor Exception Program Counter (EPC)

register, the trap handler determines:

• the exceptions that occurred during the operation

• the operation being performed

• the destination format

On Overflow or Underflow exceptions (except for conversions), and on Inexact exceptions,

the trap handler gains access to the correctly rounded result by decoding the source register

field of the instruction code and simulating the operation in software.

On Overflow or Underflow exceptions caused by a floating-point conversion, on Invalid

Operation and on Division-by-Zero exceptions, the trap handler gains access to the

operand values by decoding the source register field of the instruction code.
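
Decoding the instruction fetched through EPC can be sketched as follows. The field positions used are the standard MIPS COP1 arithmetic encoding (opcode 0x11 in bits 31:26, then fmt, ft, fs, fd and function); `decode_cop1` and `FMT_NAMES` are names chosen for this example only.

```python
FMT_NAMES = {16: "S", 17: "D", 20: "W", 21: "L"}  # standard MIPS fmt codes

def decode_cop1(word):
    """Split a 32-bit COP1 arithmetic instruction word into its fields."""
    if (word >> 26) & 0x3F != 0x11:
        raise ValueError("not a COP1 instruction")
    fmt = (word >> 21) & 0x1F
    return {
        "fmt": FMT_NAMES.get(fmt, fmt),   # operand format
        "ft": (word >> 16) & 0x1F,        # second source register
        "fs": (word >> 11) & 0x1F,        # first source register
        "fd": (word >> 6) & 0x1F,         # destination register
        "funct": word & 0x3F,             # operation (0 is ADD)
    }
```

For example, the word 0x46062080 (ADD.S f2, f4, f6) decodes to format S, sources f4 and f6, and destination f2. A handler would then read the source registers and simulate the operation.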

The IEEE Standard 754 recommends that, if enabled, the overflow and underflow traps

take precedence over a separate inexact trap. This prioritization is accomplished in

software; hardware sets the bits for both the Inexact exception and the Overflow or

Underflow exception.

† 32 doublewords if the FR bit is set to 1.


Chapter 12 PC Trace

12. PC Trace

This chapter describes the trace functions present on the C790.

The C790 supports real-time PC tracing. Pipeline status, target addresses of indirect

jumps, and exception vectors are made available on special signals. The executed

instruction sequence can be restored from signals and the source program.

The C790 also supports hardware breakpoints. The breakpoint facility is described in

Chapter 13.


12.1 Real-Time PC Tracing

Trace information and non-sequential Program Counters are made available on special

signal lines of the CPU.

The following trace information is made available:

• Instruction being executed in pipeline 0

• Instruction being executed in pipeline 1

• Current execution status (Normal (sequential), Branch Taken, Jump Target,

Exception Target)

For Indirect jumps, the target address is also made available. For exception vectors, a code

for the exception vector address is made available.

12.1.1 Classification of Branch and Jump Instructions

In this chapter, branches and jumps are classified into three categories (direct

jump, indirect jump and branch) in order to explain the function of PC trace.

The classification is shown in Table 12-1.

Table 12-1. Classification of Branch and Jump Instruction

Class                                           Instruction
Jump (Direct or Indirect Jump)  Direct Jump     J or JAL Instruction
                                Indirect Jump   JR, JALR or ERET Instruction
Branch                                          Any conditional branch Instruction


12.1.2 PC Trace Signals

All PC trace signals operate at half the C790 CPU clock frequency using the BUSCLK

clock signal. Because of the half frequency operation there are pairs of signals which

indicate the status of execution within the CPU pipelines. Phase A signals show the status

corresponding to the even CPU clock cycle and Phase B signals show the status

corresponding to the odd CPU clock cycle.

As can be seen from the following figure the execution status of the CPU pipeline during

time 0 (all time references are in relation to the CPU clock) is put on the phase A signals

at the next rising edge of BUSCLK during time 2. Similarly the execution status of the

CPU pipeline during time 1 is put on the phase B signals.

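[Timing diagram: CPUCLK times 0 to 10 alternate between phase A (even times) and phase B (odd times); BUSCLK runs at half the CPUCLK frequency; the Phase A signals carry the status of times 0, 2, 4, 6 and the Phase B signals carry the status of times 1, 3, 5, 7.]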
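
The relationship between a CPU clock cycle and the trace phase that reports it can be written as a small helper. This is a sketch under the assumption that BUSCLK rises at even CPU times and that both phases of a pair are presented at the same BUSCLK edge; `trace_slot` is a hypothetical name.

```python
def trace_slot(cpu_cycle):
    """Return (phase, visible_at) for the pipeline status of one CPU cycle.

    phase is "A" for even CPU cycles and "B" for odd ones; visible_at is the
    CPU time of the BUSCLK rising edge assumed to present that status.
    """
    phase = "A" if cpu_cycle % 2 == 0 else "B"
    visible_at = (cpu_cycle // 2) * 2 + 2
    return phase, visible_at
```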

The following signals are made available for real-time PC tracing.

• P0EXEA*(Phase A Pipeline 0 Execution Status) Output

• P1EXEA*(Phase A Pipeline 1 Execution Status) Output

• JMPA*(Phase A Jump) Output

• P0EXEB*(Phase B Pipeline 0 Execution Status) Output

• P1EXEB*(Phase B Pipeline 1 Execution Status) Output

• JMPB*(Phase B Jump) Output

• TPCE*(Target PC Enable) Output

• TPC[3:0] (Target PC Bus) Output

(1) P0EXEA* (Phase A Pipeline 0 Execution Status) Output

P0EXEA indicates whether an instruction has completed execution without generating an

exception (retired) via Pipeline 0 during phase A.

0: An instruction was retired.

1: No instruction was retired.


(2) P1EXEA* (Phase A Pipeline 1 Execution Status) Output

P1EXEA indicates whether an instruction retired via Pipeline 1 during phase A. Note that if

this signal is asserted at the same time as P0EXEA* then two instructions were retired

simultaneously during phase A via pipelines 0 and 1 but there is no indication as to which

specific instruction was retired via which pipeline.

0: An instruction was retired.

1: No instruction was retired.

(3) JMPA* (Jump Phase A) Output

A jump was retired during phase A or a conditional branch instruction was retired and the

branch was taken during phase A. Note that exceptions do not assert this signal.

0: Jump or conditional branch instruction was retired.

1: No Jump or conditional branch instruction was retired.

(4) P0EXEB* (Phase B Pipeline 0 Execution Status) Output

P0EXEB indicates whether an instruction retired via Pipeline 0 during phase B.

0: An instruction was retired.

1: No instruction was retired.

(5) P1EXEB* (Phase B Pipeline 1 Execution Status) Output

P1EXEB indicates whether an instruction retired via Pipeline 1 during phase B. Note that if

this signal is asserted at the same time as P0EXEB* then two instructions were retired

simultaneously during phase B via pipelines 0 and 1 but there is no indication as to which

specific instruction was retired via which pipeline.

0: An instruction was retired.

1: No instruction was retired.

(6) JMPB* (Jump Phase B) Output

A jump was retired during phase B or a conditional branch instruction was retired and the

branch was taken during phase B. Note that exceptions do not assert this signal.

0: Jump or conditional branch instruction was retired.

1: No Jump or conditional branch instruction was retired.
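
A trace probe sampling these six active-low signals once per BUSCLK cycle could interpret them as in the sketch below. `decode_bus_cycle` is a hypothetical helper; the arguments are raw pin levels (0 or 1), with 0 meaning asserted as in the descriptions above.

```python
def decode_bus_cycle(p0exea, p1exea, jmpa, p0exeb, p1exeb, jmpb):
    """Interpret one BUSCLK sample of the execution-status signals."""
    return {
        # Number of instructions retired in each phase (0, 1 or 2).
        "retired_a": (p0exea == 0) + (p1exea == 0),
        "retired_b": (p0exeb == 0) + (p1exeb == 0),
        # A jump or taken branch retired in the phase.
        "jump_a": jmpa == 0,
        "jump_b": jmpb == 0,
    }
```

When two instructions retire in the same phase the count is 2, but, as noted above, the signals do not tell which instruction used which pipeline.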


(7) TPCE* (Target PC Enable) Output

When this signal is asserted the TPC bus indicates the type of target PC that will be made

available.

0: TPC bus indicates type of target PC.

1: TPC bus has either the target PC or the exception vector address code

or has no information.

The normal sequence of operation for the TPCE* and the TPC[3:0] signals is as follows:

First TPCE* is asserted and simultaneously TPC[3:0] contains information about the type

of the target PC (non-sequential PC). Next TPCE* is deasserted and either the target PC

for indirect jumps is made available on the TPC[3:0] bus or for exceptions an exception

vector address code is made available on the TPC[3:0] bus.

(8) TPC[3:0] (Target PC) Output

TPC[3:0] either indicates the type of the target PC address or the target address of

indirect jump instructions or exception vector address codes.

TPC[3:0] when TPCE* is asserted

When TPCE* is asserted the type of the target PC address is made available on

TPC[3:0]. Each bit of TPC[3:0] indicates a different type and multiple bits can be

active at the same time.

• TPC[0]: Jump Target during Phase A

When this signal is asserted it indicates that the target instruction of an

Indirect Jump instruction (includes JR, JALR and ERET) is retired during

Phase A. The target address is made available on TPC[3:0] in the next cycle if

neither TPC[2] nor TPC[3] is asserted simultaneously with this signal.

• TPC[1]: Exception Target during Phase A

When this signal is asserted it indicates that the first instruction of an

exception handler is retired during Phase A. The exception vector address is

made available on TPC[3:0] in the next cycle if neither TPC[2] nor TPC[3] are

asserted simultaneously with this signal.

• TPC[2]: Jump Target during Phase B

When this signal is asserted it indicates that the target instruction of an

Indirect Jump instruction is retired during Phase B. The target address is

made available on TPC[3:0] in the next cycle.

• TPC[3]: Exception Target during Phase B

When this signal is asserted it indicates that the first instruction of an

exception handler is retired during Phase B. The exception vector address is

made available on TPC[3:0] in the next cycle.
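
The type information can be decoded as shown in this sketch. It assumes that the TPC bits, like TPCE*, are active low during the type cycle, which matches the values 1110, 1011, 0111 and 1101 that appear in the waveform examples; `decode_tpc_type` and `TPC_TYPES` are names used only here.

```python
TPC_TYPES = (
    "jump target, phase A",       # TPC[0]
    "exception target, phase A",  # TPC[1]
    "jump target, phase B",       # TPC[2]
    "exception target, phase B",  # TPC[3]
)

def decode_tpc_type(tpce, tpc):
    """Return the target types flagged on TPC[3:0] while TPCE* is asserted.

    tpce is the TPCE* pin level (0 = asserted); tpc is the 4-bit bus value.
    An empty list is returned when TPCE* is not asserted.
    """
    if tpce != 0:
        return []
    return [name for bit, name in enumerate(TPC_TYPES) if not (tpc >> bit) & 1]
```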


TPC[3:0] when TPCE* is deasserted

When TPCE* is not asserted TPC[3:0] can be carrying one of the following three types of

information:

1. There is no meaningful information on TPC. This happens most of the time

when the program is executing sequentially.

2. The target address is made available because in the previous cycle TPCE*

was asserted and TPC[0] or TPC[2] were equal to 0. The target address starts

with the least significant four bits of the target instruction address (bits[5:2]).

3. An exception vector address code is made available because in the previous

cycle TPCE* was asserted and TPC[1] or TPC[3] were equal to 0. The

exception vector address codes are shown in Table 12-2.

Table 12-2. Exception Vector Address Codes

Exception            STATUS.BEV  STATUS.DEV  STATUS.EXL  Vector Address  Code (TPC[3:0])
Reset, NMI           x           x           x           0xBFC0 0000     8 (1000)
TLB Miss             1           x           0           0xBFC0 0200     12 (1100)
TLB Miss             0           x           0           0x8000 0000     0 (0000)
TLB Miss             1           x           1           0xBFC0 0380     15 (1111)
TLB Miss             0           x           1           0x8000 0180     3 (0011)
Debug & SIO          x           1           x           0xBFC0 0300     14 (1110)
Debug & SIO          x           0           x           0x8000 0100     2 (0010)
Performance Counter  x           1           x           0xBFC0 0280     13 (1101)
Performance Counter  x           0           x           0x8000 0080     1 (0001)
Interrupt            1           x           x           0xBFC0 0400     9 (1001)
Interrupt            0           x           x           0x8000 0200     4 (0100)
Common               1           x           x           0xBFC0 0380     15 (1111)
Common               0           x           x           0x8000 0180     3 (0011)
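
A trace decoder can turn the 4-bit code back into a vector address with a simple lookup built from Table 12-2. The names `VECTOR_ADDRESS` and `vector_address` are illustrative only.

```python
# Exception vector address code (TPC[3:0]) -> vector address, from Table 12-2.
VECTOR_ADDRESS = {
    0b1000: 0xBFC00000,  # Reset, NMI
    0b1100: 0xBFC00200,  # TLB Miss, BEV=1, EXL=0
    0b0000: 0x80000000,  # TLB Miss, BEV=0, EXL=0
    0b1111: 0xBFC00380,  # TLB Miss (EXL=1) or Common, BEV=1
    0b0011: 0x80000180,  # TLB Miss (EXL=1) or Common, BEV=0
    0b1110: 0xBFC00300,  # Debug & SIO, DEV=1
    0b0010: 0x80000100,  # Debug & SIO, DEV=0
    0b1101: 0xBFC00280,  # Performance Counter, DEV=1
    0b0001: 0x80000080,  # Performance Counter, DEV=0
    0b1001: 0xBFC00400,  # Interrupt, BEV=1
    0b0100: 0x80000200,  # Interrupt, BEV=0
}

def vector_address(code):
    """Return the vector address for a 4-bit code, or None if the code is unused."""
    return VECTOR_ADDRESS.get(code & 0xF)
```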


12.1.3 Priority of Target Addresses

The target address for an indirect jump instruction or an exception vector address code is

made available on TPC[3:0]. For an indirect jump instruction it takes multiple cycles (8

BUSCLK cycles or 16 CPU clock cycles) for the complete target address to be made

available on the TPC[3:0] bus. As such multiple conditions can occur simultaneously and

there are certain priorities associated with putting out the target address. The rules

governing what is made available on the TPC[3:0] bus are listed below:

1. If a new indirect jump instruction is retired while the target address PC for a

previous indirect instruction is still being put out on TPC[3:0], the new indirect

jump instruction’s target PC will be signaled and start coming out on the

TPC[3:0] bus and the previous target PC output will be terminated.

2. If an exception is taken while the target address PC for a previous indirect

instruction is still being put out on TPC[3:0], the exception vector address code

will be signaled and start coming out on the TPC[3:0] bus and the previous

target PC output will be terminated
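
When a target address transfer runs to completion, the eight values captured from TPC[3:0] can be reassembled into the 32-bit address as sketched below. The transfer starts with bits [5:2] and ends with bits [31:30]; the sketch assumes that the final cycle carries those two bits on TPC[1:0] and that address bits [1:0] are zero. `assemble_target` is a hypothetical helper.

```python
def assemble_target(nibbles):
    """Rebuild a 32-bit target address from eight TPC[3:0] samples.

    nibbles[0] holds address bits [5:2], nibbles[1] bits [9:6], and so on;
    only the low two bits of nibbles[7] (bits [31:30]) are used.
    """
    if len(nibbles) != 8:
        raise ValueError("a complete target address needs 8 bus cycles")
    address = 0
    for i, nibble in enumerate(nibbles):
        address |= (nibble & 0xF) << (2 + 4 * i)
    return address & 0xFFFFFFFF  # drop anything above bit 31
```

If a new target is signaled before all eight samples have arrived, the samples collected so far do not form a complete address and this function should not be applied to them.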

The rules are also described in the following flowchart.

[Flowchart. Decisions: "New Indirect Jump or Exception Target Retired?", then, for each of the Indirect Jump and Exception branches, "Previous Target Address Is Being Output Currently?". Actions on the Indirect Jump branch: "Terminate Outputting Current PC Output", "Start Outputting Target Address of Jump". Actions on the Exception branch: "Suspend Outputting Previous Target Address Output", "Output Exception Target", "Resume Outputting Previous Target Address".]

Figure 12-1. Priority of Outputting Jump or Exception Target


12.1.4 Examples of PC Tracing

The following sections contain examples of program execution and the corresponding

waveforms of the PC trace signals. Note that when two instructions are retired

simultaneously, just for the sake of illustration, it is indicated which instruction is

executed in which pipeline. In reality, in this case, it is not known which instruction is

retired from which pipeline.


12.1.4.1 Sequential Execution

This is an example of sequential program execution. The program fragment is as follows:

mul

add

sub

lw r1

add

sub ,,r1

add

The PC trace signals for the program fragment are shown below:

[Waveform figure not reproduced. Signals shown for the fragment above: Phase, CPUCLK, BUSCLK, Pipe 0, Pipe 1, P0EXEA*, P1EXEA*, P0EXEB*, P1EXEB*, JMPA*, JMPB*, TPCE*, TPC[3:0].]

Figure 12-2. Waveform for Sequential Execution


12.1.4.2 Conditional Branch

This is an example of a program with conditional branch instructions. Both the branch

taken and not taken cases are illustrated. The program fragment is as follows:

add

beq L0 # Not Taken

lw

add

beq L1 # Taken

add

....

L1: add

bne L2 # Taken

sll

....

L2: sub

sub

The PC trace signals for the program fragment are shown below:

[Waveform figure not reproduced. Signals shown for the fragment above: Phase, CPUCLK, BUSCLK, Pipe 0, Pipe 1, P0EXEA*, P1EXEA*, P0EXEB*, P1EXEB*, JMPA*, JMPB*, TPCE*, TPC[3:0]. The first beq is marked Not Taken; the second beq and the bne are marked Taken.]

Figure 12-3. Waveform for Conditional Branch


12.1.4.3 Indirect Jump (Target in Phase A)

This is an example of a program with an indirect jump instruction which is retired during

phase B. The program fragment is as follows:

add

jr L1

lw

....

L1: xor

add

ori

sw

sll

sub

The PC trace signals for the program fragment are shown below:

[Waveform figure not reproduced. Signals shown for the fragment above: Phase, CPUCLK, BUSCLK, Pipe 0, Pipe 1, P0EXEA*, P1EXEA*, P0EXEB*, P1EXEB*, JMPA*, JMPB*, TPCE*, TPC[3:0]. TPC[3:0] carries 1110 at the Target, followed by TA[5:2] through TA[31:30]; the sequence is labeled 9 Bus Cycles. TA[x:y] = Target address bit x to y.]

Figure 12-4. Waveform for Indirect Jump (Target in Phase A)


12.1.4.4 Indirect Jump (Target in Phase B)

This is an example of a program with an indirect jump instruction which is retired during

phase A. The program fragment is as follows:

add

jr L1

lw

....

L1: xor

add

ori

sw

sll

sub

The PC trace signals for the program fragment are shown below:

[Waveform figure not reproduced. Signals shown for the fragment above: Phase, CPUCLK, BUSCLK, Pipe 0, Pipe 1, P0EXEA*, P1EXEA*, P0EXEB*, P1EXEB*, JMPA*, JMPB*, TPCE*, TPC[3:0]. TPC[3:0] carries 1011 at the Target, followed by TA[5:2], TA[9:6] and so on through TA[31:30]; the address sequence is labeled 8 Bus Cycles.]

Figure 12-5. Waveform for Indirect Jump (Target in Phase B)


12.1.4.5 Indirect Jump (During Target PC Output)

This is an example of a program with two indirect jump instructions. While the target

address PC associated with the first indirect jump instruction is being put out the second

indirect jump instruction is retired. Thus the first target PC output is terminated and the

second target PC output is signaled and then made available. The program fragment is as

follows:

add

add

jr L1

lw

....

L1: xor

add

jr L2

add

....

L2: sw

sll

sub

The PC trace signals for the program fragment are shown below:

[Waveform figure not reproduced. Signals shown for the fragment above: Phase, CPUCLK, BUSCLK, Pipe 0, Pipe 1, P0EXEA*, P1EXEA*, P0EXEB*, P1EXEB*, JMPA*, JMPB*, TPCE*, TPC[3:0]. TPC[3:0] carries 1110 and TA[5:2] for the first Target, then 1110 and TA[5:2] again for the second Target.]

Figure 12-6. Waveform for Indirect Jump (During Target PC Output)


12.1.4.6 Exception (Target in Phase B)

This is an example of a program which generates an exception. The target instruction

(first instruction of the exception handler) retires in phase B. The program fragment is

shown below. The label ExHnd identifies the first instruction of the exception handler.

add

lw

teq # Generates exception

....

ExHnd: xor

add

sw

sll

sub

The PC trace signals for the program fragment are shown below:

[Waveform figure not reproduced. Signals shown for the fragment above: Phase, CPUCLK, BUSCLK, Pipe 0, Pipe 1, P0EXEA*, P1EXEA*, P0EXEB*, P1EXEB*, JMPA*, JMPB*, TPCE*, TPC[3:0]. TPC[3:0] carries 0111 at the Exception Target, followed by E.Code. More stall cycles might be inserted. E.Code = Exception Vector Code.]

Figure 12-7. Waveform for Exception (Target in Phase B)


12.1.4.7 Exception (During Target PC Output)

This is an example of a program which generates an exception while a target PC from an

earlier indirect jump instruction is being made available. The target PC output is

terminated and the exception vector address code is signaled and then made available.

The target instruction (first instruction of the exception handler) retires in phase B. The

program fragment is shown below. The label ExHnd identifies the first instruction of the

exception handler.

add

lw

teq # Generates exception

....

ExHnd: xor

add

sw

sll

sub

The PC trace signals for the program fragment are shown below:

[Waveform figure not reproduced. Signals shown for the fragment above: Phase, CPUCLK, BUSCLK, Pipe 0, Pipe 1, P0EXEA*, P1EXEA*, P0EXEB*, P1EXEB*, JMPA*, JMPB*, TPCE*, TPC[3:0]. The labels TA13:10, TA17:14, TA21:18, 0111 and E.Code appear on TPC[3:0]. More stall cycles might be inserted. TAxx:yy = Target Address bit xx to yy. E.Code = Exception Vector Code.]

Figure 12-8. Waveform for Exception (During Target PC Output)


12.1.4.8 Exception Generated by Branch or Jump Instruction

This is an example of a program in which an indirect jump instruction generates an

exception. As such the program jumps to the exception handler and the only thing

indicated is the exception vector address code and not the jump. The target instruction

(first instruction of the exception handler) retires in phase B. The program fragment is

shown below. The label ExHnd identifies the first instruction of the exception handler.

add

lw

jr # Generates an exception

nop # Branch delay slot

....

ExHnd: xor

add

sw

sll

sub

The PC trace signals for the program fragment are shown below:

[Waveform figure not reproduced. Signals shown for the fragment above: Phase, CPUCLK, BUSCLK, Pipe 0, Pipe 1, P0EXEA*, P1EXEA*, P0EXEB*, P1EXEB*, JMPA*, JMPB*, TPCE*, TPC[3:0]. TPC[3:0] carries 0111 at the Exception Target, followed by E.Code. More stall cycles might be inserted. E.Code = Exception Vector Code.]

Figure 12-9. Waveform for Exception Generated by Branch or Jump Instruction


12.1.4.9 Exception Generated by Branch Delay Slot Instruction

This is an example of a program in which the branch delay slot instruction generates an

exception. As such the program jumps to the exception handler and the only thing

indicated is the exception vector address code and not the jump. The target instruction

(first instruction of the exception handler) retires in phase B. The program fragment is

shown below. The label ExHnd identifies the first instruction of the exception handler.

add

lw

jr

lw # Generates an exception

....

ExHnd: xor

add

sw

sll

sub

The PC trace signals for the program fragment are shown below:

[Waveform figure not reproduced. Signals shown for the fragment above: Phase, CPUCLK, BUSCLK, Pipe 0, Pipe 1, P0EXEA*, P1EXEA*, P0EXEB*, P1EXEB*, JMPA*, JMPB*, TPCE*, TPC[3:0]. TPC[3:0] carries 0111 at the Exception Target, followed by E.Code. More stall cycles might be inserted. E.Code = Exception Vector Code.]

Figure 12-10. Waveform for Exception Generated by Branch Delay Slot Instruction


12.1.4.10 Exception Generated by Target Instruction

This is an example of a program in which the target instruction of an indirect jump

generates an exception. As such the program jumps to the exception handler and the only

thing indicated is the exception vector address code and not the jump. The target

instruction (first instruction of the exception handler) retires in phase B. The program

fragment is shown below. The label ExHnd identifies the first instruction of the exception

handler.

add

add

lw

jr L1

nop

....

L1: lw # Generates an exception

and

....

ExHnd: xor

add

sw

sll

sub

The PC trace signals for the program fragment are shown below:

[Waveform figure not reproduced. Signals shown for the fragment above: Phase, CPUCLK, BUSCLK, Pipe 0, Pipe 1, P0EXEA*, P1EXEA*, P0EXEB*, P1EXEB*, JMPA*, JMPB*, TPCE*, TPC[3:0]. TPC[3:0] carries 0111 followed by E.Code. More stall cycles might be inserted.]

Figure 12-11. Waveform for Exception Generated by Target Instruction


12.1.4.11 Back to Back Exceptions (Case I)

This is an example of a program in which two back to back exceptions are generated. The

program jumps to the first exception handler but then immediately jumps to the second

exception handler. The target instruction (first instruction of the second exception

handler) retires in phase A. The exception vector address code for the first handler is

never made available. The program fragment is shown below. The label ExHnd1 identifies

the first instruction of the first exception handler and the label ExHnd2 identifies the first

instruction of the second exception handler.

add

add # Generates the first exception

....

ExHnd1: xor # Generates the second exception

xor

....

ExHnd2: sw sll

sub

The PC trace signals for the program fragment are shown below:

(Waveform not reproduced. Signals shown: Phase, CPUCLK, BUSCLK, Pipe 0, Pipe 1, P0EXEA*,
P1EXEA*, P0EXEB*, P1EXEB*, JMPA*, JMPB*, TPCE*, TPC[3:0]. TPC[3:0] shows the value 1101,
labeled E.Code, at the exception target. E.Code = Exception Vector Code. More stall cycles
might be inserted.)

Figure 12-12. Waveform for Back to Back Exceptions (Case I)

12.1.4.12 Back to Back Exceptions (Case II)

This is an example of a program in which two (almost) back to back exceptions are
generated. The program jumps to the first exception handler and then generates an
exception when executing the second instruction of that handler. It then jumps to the
second exception handler. The target instruction (first instruction of the first exception
handler) retires in phase A. In contrast to the case discussed above, the exception vector
address codes for both handlers are made available. The program fragment is shown below.
The label ExHnd1 identifies the first instruction of the first exception handler and the
label ExHnd2 identifies the first instruction of the second exception handler.

        add
        add             # Generates the first exception
        ....
ExHnd1: xor
        xor             # Generates the second exception
        ....
ExHnd2: sw
        sll
        sub

The PC trace signals for the program fragment are shown below:

(Waveform not reproduced. Signals shown: Phase, CPUCLK, BUSCLK, Pipe 0, Pipe 1, P0EXEA*,
P1EXEA*, P0EXEB*, P1EXEB*, JMPA*, JMPB*, TPCE*, TPC[3:0]. TPC[3:0] shows the value 1101,
labeled E.Code, at each of the two exception targets. E.Code = Exception Vector Code. More
stall cycles might be inserted.)

Figure 12-13. Waveform for Back to Back Exceptions (Case II)

Chapter 13 Hardware Breakpoint

13-1

13. Hardware Breakpoint

This chapter describes the hardware breakpoint functions for debugging present on the C790.

13.1 Hardware Breakpoint

The C790 provides a hardware breakpoint mechanism for debugging purposes. (In this
section, the hardware breakpoint is sometimes referred to simply as the "breakpoint".)
This function allows users to set an instruction breakpoint and a data address/value
breakpoint, and signals the occurrence of a breakpoint event to an external probe. The
following summarizes the features of the breakpoint function.

• Provides both instruction and data breakpointing on virtual addresses.
• Instruction address breakpoint with address masking.
• Data breakpoint with masking. A data breakpoint can be set on the following events:
    - Address with masking
    - Value with masking
    - Read/write
• Independent exception event control for instruction and data.
• Individual event control by processor operating mode/exception level.
• Provides a trigger signal to external probes synchronized with the breakpointing event.

Hardware breakpointing is implemented as a part of Coprocessor 0. The breakpoint is
configured by setting 7 Breakpoint registers with special MTC0/MFC0 instructions.

Figure 13-1 shows the basic structure of the breakpoint hardware.

A breakpoint can generate a breakpoint exception, which is categorized as a Level2
exception and has a dedicated exception vector. (See 5. Exception) This exception is
masked only in Level2 mode, and exception generation itself can be controlled by the
Breakpoint Control Register described in the following section. Note that some breakpoint
exceptions are imprecise. For instance, a value breakpoint on a load instruction is
basically imprecise because the load instruction may retire from the pipeline before the
memory contents are actually acquired. The imprecise cases are:

• All data value breakpoints on load instructions
• Data value breakpoints on the SWC1 instruction

13.1.1 Hardware Breakpoint signal

To signal a breakpoint occurrence, the C790 activates a signal called TRIG, whenever a

trigger condition is met.

• TRIG (Trigger Output) Output

This signal is asserted for two BUSCLK cycles when a trigger condition is met.

(Block diagram not reproduced. The Address/Value registers IAB, DAB and DVB are compared
(= ?) under the Mask registers IABM, DABM and DVBM against the fetch PC, the load/store
address and the load/store value respectively. The Breakpoint Control register (BPC)
supplies the enable controls. A breakpoint event drives an exception into Pipeline Control
(Exception Control) and a trigger to the external probe (TRIG*).)

Figure 13-1. Overall Structure of Hardware Breakpoint

13.2 Breakpoint Registers

The hardware breakpoint consists of 3 pairs of breakpoint registers and one control
register, listed below. Each breakpoint register pair includes one breakpoint value
register and one breakpoint mask register.

• Breakpoint Control Register (BPC)
• Instruction Address Breakpoint Registers
    - Instruction Address Breakpoint Register (IAB)
    - Instruction Address Breakpoint Mask Register (IABM)
• Data Address Breakpoint Registers
    - Data Address Breakpoint Register (DAB)
    - Data Address Breakpoint Mask Register (DABM)
• Data Value Breakpoint Registers
    - Data Value Breakpoint Register (DVB)
    - Data Value Breakpoint Mask Register (DVBM)

All 7 registers are 32-bit read/write registers and are assigned to Coprocessor 0
register 24. Because they share one register number, the C790 provides extended MTC0/MFC0
instructions for accessing them, and these instructions must be used instead of the
conventional MTC0/MFC0 instructions. Table 13-1 and Table 13-2 summarize the instructions
for accessing the registers.

Table 13-1. Set a new value into breakpoint registers

Mnemonic  Operation
--------  ---------------------------------------------------
MTBPC     Move to Breakpoint Control Register
MTIAB     Move to Instruction Address Breakpoint Register
MTIABM    Move to Instruction Address Breakpoint Mask Register
MTDAB     Move to Data Address Breakpoint Register
MTDABM    Move to Data Address Breakpoint Mask Register
MTDVB     Move to Data Value Breakpoint Register
MTDVBM    Move to Data Value Breakpoint Mask Register

Table 13-2. Get the value from breakpoint registers

Mnemonic  Operation
--------  -----------------------------------------------------
MFBPC     Move from Breakpoint Control Register
MFIAB     Move from Instruction Address Breakpoint Register
MFIABM    Move from Instruction Address Breakpoint Mask Register
MFDAB     Move from Data Address Breakpoint Register
MFDABM    Move from Data Address Breakpoint Mask Register
MFDVB     Move from Data Value Breakpoint Register
MFDVBM    Move from Data Value Breakpoint Mask Register

13.2.1 Breakpoint Control Register (BPC)

The BPC register contains enable bits and status bits for controlling the breakpointing
of both instructions and data. This register consists of 5 groups of bit fields:

• Breakpoint overall control (bits [31:28])
  These bits control the operation mode of the breakpointing.
• Instruction breakpoint control (bits [26:23])
  These bits specify the processor modes in which the instruction breakpoint is enabled.
• Data breakpoint control (bits [21:18])
  These bits specify the processor modes in which the data breakpoint is enabled.
• Signaling control (bits [17:15])
  These bits control breakpoint exception / trigger generation upon a breakpoint event.
• Breakpoint status (bits [2:0])
  These bits indicate the type of breakpoint event. The breakpoint exception handler uses
  them to identify which breakpoint event occurred.

The following shows the detailed bitmap of BPC register.

Bit    31   30   29   28   27   26   25   24   23   22   21   20   19   18   17   16   15   14..3  2    1    0
Field  IAE  DRE  DWE  DVE  0    IUE  ISE  IKE  IXE  0    DUE  DSE  DKE  DXE  ITE  DTE  BED  0      DWB  DRB  IAB

Table 13-3 describes the BPC register fields.

Table 13-3. BPC Register Fields

Field  Bits    Type        Initial    Description
                           Value
-----  ------  ----------  ---------  ------------------------------------------------------
IAE    31      Read/Write  0          Instruction Address Enable. This bit enables/disables
                                      instruction address breakpointing.
                                      0: disable instruction address breakpointing
                                      1: enable instruction address breakpointing
DRE    30      Read/Write  0          Data Read Enable. This bit enables data load address
                                      breakpointing.
                                      0: disable breakpointing on reads
                                      1: enable breakpointing on reads
DWE    29      Read/Write  0          Data Write Enable. This bit enables data store address
                                      breakpointing.
                                      0: disable breakpointing on writes
                                      1: enable breakpointing on writes
DVE    28      Read/Write  Undefined  Data Value Enable. This bit is valid only when DRE
                                      and/or DWE are set to 1. When DVE is set to 1, data
                                      read breakpoints (DRE == 1) are further qualified by
                                      the value of the data read, and data write breakpoints
                                      (DWE == 1) are further qualified by the value of the
                                      data written. Note that data value breakpoints for
                                      data reads are imprecise. See section 13.1 ("Hardware
                                      Breakpoint") for more details.
rsvd   27      Read        0          Reserved - must be written as zero by software. The
                                      processor returns zero in this bit position when read.
IUE    26      Read/Write  Undefined  Instruction break - User Enable. This bit enables
                                      instruction address breakpointing in (standard) user
                                      mode. This bit is only valid if IAE is set to 1.
                                      0: disable instruction address breakpointing in User mode
                                      1: enable instruction address breakpointing in User mode
ISE    25      Read/Write  Undefined  Instruction break - Supervisor Enable. This bit enables
                                      instruction address breakpointing in supervisor mode.
                                      This bit is only valid if IAE is set to 1.
                                      0: disable instruction address breakpointing in Supervisor mode
                                      1: enable instruction address breakpointing in Supervisor mode
IKE    24      Read/Write  Undefined  Instruction break - Kernel Enable. This bit enables
                                      instruction address breakpointing in non-exception
                                      kernel mode - i.e. when both STATUS.EXL and STATUS.ERL
                                      are 0. This bit is only valid if IAE is set to 1.
                                      0: disable instruction address breakpointing in Kernel mode
                                      1: enable instruction address breakpointing in Kernel mode
IXE    23      Read/Write  Undefined  Instruction break - EXL mode Enable. This bit enables
                                      instruction address breakpointing in exception kernel
                                      mode - i.e. when STATUS.EXL is 1 and STATUS.ERL is 0.
                                      This bit is only valid if IAE is set to 1.
                                      0: disable instruction address breakpointing in EXL mode
                                      1: enable instruction address breakpointing in EXL mode
rsvd   22      Read        0          Reserved - must be written as zero by software. The
                                      processor returns zero in this bit position when read.
DUE    21      Read/Write  Undefined  Data break - User Enable. This bit enables data
                                      breakpointing in User mode. This bit is only valid if
                                      DWE or DRE is set to 1.
                                      0: disable data breakpointing in User mode
                                      1: enable data breakpointing in User mode
DSE    20      Read/Write  Undefined  Data break - Supervisor Enable. This bit enables data
                                      breakpointing in Supervisor mode. This bit is only
                                      valid if DWE or DRE is set to 1.
                                      0: disable data breakpointing in Supervisor mode
                                      1: enable data breakpointing in Supervisor mode
DKE    19      Read/Write  Undefined  Data break - Kernel Enable. This bit enables data
                                      breakpointing in Kernel mode - i.e. when both
                                      STATUS.EXL and STATUS.ERL are 0. This bit is only
                                      valid if DWE or DRE is set to 1.
                                      0: disable data breakpointing in Kernel mode
                                      1: enable data breakpointing in Kernel mode
DXE    18      Read/Write  Undefined  Data break - EXL mode Enable. This bit enables data
                                      breakpointing in Exception Kernel mode - i.e. when
                                      STATUS.EXL is 1 and STATUS.ERL is 0. This bit is only
                                      valid if at least one of DRE or DWE is set to 1.
                                      0: disable data breakpointing in EXL mode
                                      1: enable data breakpointing in EXL mode
ITE    17      Read/Write  Undefined  Instruction Trigger Enable. This bit enables the
                                      generation of the trigger signal when an instruction
                                      breakpoint occurs.
                                      0: disable instruction breakpoint trigger
                                      1: enable instruction breakpoint trigger
DTE    16      Read/Write  Undefined  Data Trigger Enable. This bit enables the generation
                                      of the trigger signal when a data breakpoint occurs.
                                      0: disable data breakpoint trigger
                                      1: enable data breakpoint trigger
BED    15      Read/Write  Undefined  Breakpoint Exception Disable. This bit disables the
                                      entry into the debug exception handler. Note that the
                                      setting of this bit does not affect trigger signal
                                      generation.
                                      0: enable entry into debug exception handler
                                      1: disable entry into debug exception handler
rsvd   14 - 3  Read        0          Reserved - must be written as zeros by software. The
                                      processor returns zeros in these bit positions when
                                      read.
DWB    2       Read/Write  Undefined  Data Write Breakpoint. This status bit indicates
                                      whether a data breakpoint has occurred on a write or
                                      not.
                                      0: no data breakpoint has occurred on a write
                                      1: data breakpoint has occurred on a write
DRB    1       Read/Write  Undefined  Data Read Breakpoint. This status bit indicates
                                      whether a data breakpoint has occurred on a read or
                                      not.
                                      0: no data breakpoint has occurred on a read
                                      1: data breakpoint has occurred on a read
IAB    0       Read/Write  Undefined  Instruction Address Breakpoint. This status bit
                                      indicates whether an instruction address breakpoint
                                      has occurred or not.
                                      0: no instruction address breakpoint has occurred
                                      1: instruction address breakpoint has occurred
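
As an illustration of the BPC layout, the following Python sketch builds and decodes BPC values on a host machine. It is only a model for checking bit arithmetic, not code that runs on the C790. The bit positions come from Table 13-3; the names BPC_BITS, bpc_encode, bpc_decode and bpc_status are hypothetical helpers introduced here.

```python
# Host-side model of the BPC register layout (not C790 code).
# Bit positions are taken from Table 13-3; all helper names are hypothetical.
BPC_BITS = {
    "IAE": 31, "DRE": 30, "DWE": 29, "DVE": 28,
    "IUE": 26, "ISE": 25, "IKE": 24, "IXE": 23,
    "DUE": 21, "DSE": 20, "DKE": 19, "DXE": 18,
    "ITE": 17, "DTE": 16, "BED": 15,
    "DWB": 2, "DRB": 1, "IAB": 0,
}


def bpc_encode(*fields):
    """Return a 32-bit BPC value with the named fields set to 1."""
    value = 0
    for name in fields:
        value |= 1 << BPC_BITS[name]
    return value & 0xFFFFFFFF


def bpc_decode(value):
    """Return the sorted names of all defined fields that are 1 in value."""
    return sorted(name for name, bit in BPC_BITS.items() if (value >> bit) & 1)


def bpc_status(value):
    """Return the breakpoint status fields (bits [2:0]) that are set."""
    return [name for name in ("DWB", "DRB", "IAB") if (value >> BPC_BITS[name]) & 1]
```

A breakpoint exception handler would apply the equivalent of bpc_status to the value read with MFBPC in order to tell which kind of breakpoint event occurred.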

13.2.2 Instruction Address Breakpoint Register (IAB) / Instruction Address Breakpoint Mask Register (IABM)

 31                                  2  1  0
+--------------------------------------+-----+
|                 IAB                  |  0  |
+--------------------------------------+-----+
Figure 13-2. Instruction Address Breakpoint Register

 31                                  2  1  0
+--------------------------------------+-----+
|                 IABM                 |  0  |
+--------------------------------------+-----+
Figure 13-3. Instruction Address Breakpoint Mask Register

This register pair holds the instruction breakpointing address. Both the value in the IAB
register and the current fetch PC are masked by the value in IABM. If the masked values
are equal, the condition for an instruction address breakpoint becomes true. As the fetch
PC is always word-aligned, bit 0 and bit 1 of these registers are fixed to zero.
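
The masked comparison can be sketched in Python as follows. This is a host-side model under the assumption that a mask bit of 1 means "compare this bit" and 0 means "ignore it", as the sample code comments in section 13.3 state; the function names masked_match and iab_match are hypothetical.

```python
# Host-side model of the IAB/IABM comparison (not C790 code).
# Assumption: a 1 in the mask register selects a bit for comparison.
WORD_MASK = 0xFFFFFFFF


def masked_match(candidate, value_reg, mask_reg):
    """True if candidate and value_reg agree on every bit selected by mask_reg."""
    return (candidate & WORD_MASK & mask_reg) == (value_reg & mask_reg)


def iab_match(fetch_pc, iab, iabm):
    """Instruction address condition; bits 1..0 of IAB and IABM read as zero."""
    iab &= 0xFFFFFFFC
    iabm &= 0xFFFFFFFC
    return masked_match(fetch_pc, iab, iabm)
```

With IAB = 0x12345678 and IABM = 0xffffff00, iab_match holds for every word-aligned fetch PC from 0x12345600 to 0x123456fc and fails for 0x12345700.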

13.2.3 Data Address Breakpoint Register (DAB) / Data Address Breakpoint Mask Register (DABM)

This register pair holds the data breakpointing address. Both the value in the DAB
register and the address of the load/store operation are masked by the value in DABM. If
the masked values are equal, the condition for a data address breakpoint becomes true.
These registers are 32 bits wide and readable/writable.

 31                                        0
+--------------------------------------------+
|                    DAB                     |
+--------------------------------------------+
Figure 13-4. Data Address Breakpoint Register

 31                                        0
+--------------------------------------------+
|                    DABM                    |
+--------------------------------------------+
Figure 13-5. Data Address Breakpoint Mask Register

13.2.4 Data Value Breakpoint Register (DVB) / Data Value Breakpoint Mask Register (DVBM)

This register pair holds the value for data value breakpointing. Both the value in DVB and
the lower 32 bits of the load/store data are masked with the value in DVBM. If the masked
values are equal, the condition for a data value breakpoint becomes true. Note that the
data value breakpoint requires data address breakpointing to be active (either or both of
the DRE/DWE bits set in BPC), and therefore a breakpoint event for a data value happens
only if both the data address condition and the data value condition are true.

Note that the comparison of the data value is always performed on 32 bits regardless of
the width of the load/store operation: a store value coming from a GPR is truncated to 32
bits for the comparison, and a load value is appropriately sign-extended or merged with
the contents of the GPR (unaligned cases), after which the least significant 32 bits are
used for the comparison. For instance, the most significant (64+32) bits or 32 bits are
truncated for the data value comparison on LQ/SQ or LD/SD instructions respectively, while
the value from memory is sign-extended to form a 32-bit value for LB/LH instructions.
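
The width handling described above can be sketched in Python. This is a host-side model only: it covers truncation of wide values and sign extension of narrow signed loads, and it does not model the merge with GPR contents for unaligned loads. The names sign_extend, compare_word and dvb_match are hypothetical.

```python
# Host-side model of how a load/store value is reduced to the 32-bit word
# that is compared against DVB under DVBM (not C790 code).
# Assumption: unaligned merge cases are not modeled.
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def compare_word(raw, width_bits, signed_load=False):
    """Return the 32-bit word used for the data value comparison."""
    if signed_load and width_bits < 32:
        raw = sign_extend(raw, width_bits)   # LB/LH style sign extension
    return raw & 0xFFFFFFFF                  # wider values lose their upper bits


def dvb_match(raw, width_bits, dvb, dvbm, signed_load=False):
    """Data value condition: masked equality on the 32-bit comparison word."""
    word = compare_word(raw, width_bits, signed_load)
    return (word & dvbm) == (dvb & dvbm)
```

For example, a signed byte load of 0x80 is compared as 0xFFFFFF80, and a 64-bit store of 0x1122334455667788 is compared as 0x55667788.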

 31                                        0
+--------------------------------------------+
|                    DVB                     |
+--------------------------------------------+
Figure 13-6. Data Value Breakpoint Register

 31                                        0
+--------------------------------------------+
|                    DVBM                    |
+--------------------------------------------+
Figure 13-7. Data Value Breakpoint Mask Register

13.3 Setting Breakpoint

The following sections describe the details of breakpoint control, with some sample code.
As the C790 is a pipelined superscalar processor, several restrictions apply when setting
the breakpoint registers. The main points that need care are:

• When the breakpointing configuration is changed, it is very likely that 3 or more
  registers must be updated. However, because the C790 is a pipelined processor, the
  change is performed in a pipelined manner. This can create a hazardous window in which
  an exception is generated unintentionally.
• The C790 does NOT wait for the data to arrive on a load operation. The instruction
  itself may retire from the pipeline before the data is stored into the register, so the
  breakpointing event occurs later than the completion of the instruction. This not only
  makes some data value breakpoints imprecise, but can also temporarily mask a
  breakpointing event, as in the following case: a data load instruction that should cause
  a data value breakpoint exception results in a cache miss, but in the next cycle another
  Level2 exception, such as an SIO interrupt, is detected and the processor enters Level2
  before the data is acquired. In this scenario, the data value breakpoint exception is
  delayed until the processor returns from Level2 mode.

13.3.1 Sequence of Setting Breakpoint

In order to prevent spurious exceptions while a breakpoint is being reconfigured, the
breakpoint enables must be managed before and after the change. One easy way is to switch
the processor into Level2 mode, which masks the breakpoint exception unconditionally, but
this has the side effect that the user segment becomes unmapped. Therefore, this section
mainly focuses on changing the configuration without changing the processor mode.

The following summarizes the sequence for changing the breakpointing configuration.

1. Synchronize the pipeline
2. Disable the breakpoint exception that is going to be reconfigured
3. Synchronize the pipeline
4. Set appropriate data in the Breakpoint register pairs
5. Set the appropriate configuration in the Breakpoint Control Register, including
   enabling the breakpoint exception
6. Synchronize the pipeline

There are three synchronization points in the sequence. The first ensures that there is
no pending breakpoint exception, for consistency in the breakpoint exception handler. The
second comes right after disabling the breakpoint that is going to be reconfigured. It
separates the change in the control register from the changes to the other breakpoint
registers so that the programmer can safely change the breakpoint. The third comes after
updating the breakpoint control register. Since the C790 issues instructions in order,
changes to a breakpoint register pair always precede the change in the control register.
In this sense, no spurious exception occurs even without this synchronization. However, in
order to catch a breakpointing event right after updating the control register, flushing
the pipeline at this point is strongly recommended.

The first synchronization must be either a SYNC.P or a SYNC.L operation, depending on the
breakpoint that is going to be reconfigured. For an instruction breakpoint, SYNC.P is to
be used; otherwise SYNC.L is to be used. For the second and third synchronizations, SYNC.P
is to be used.

The flow for generating TRIG* and the exception is shown in Figure 13-8, Figure 13-9, and
Figure 13-10. Figure 13-8 describes the flow by which the hardware breakpoint logic
encounters a breakpointing event. Figure 13-9 and Figure 13-10 describe how the exception
is generated and the TRIG* signal is asserted.

The following sections show some simple sample code for configuring the breakpoint
registers. Several programming notes/issues are given in the comments.

(Flowchart not reproduced. Summary of the breakpointing configuration check: if the
processor is in Level2 mode (Status.ERL = 1), there is no breakpoint event. If it is in
Level1 mode (Status.EXL = 1), I/DXE is tested. Otherwise the processor mode in Status.KSU
(2 bits) selects the bit tested: User (10b) tests I/DUE, Supervisor (01b) tests I/DSE, and
Kernel (00b) tests I/DKE. If the tested bit is not set, there is no breakpoint event;
otherwise the flow continues with checking the breakpoint event.)

Figure 13-8. Hardware Breakpoint detection flow (Setting)
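
The mode check of Figure 13-8 can be written out as a small Python model. It assumes, following Table 13-3 and the labels in the figure, that Level2 corresponds to Status.ERL = 1 and Level1 to Status.EXL = 1; treating a Status.KSU encoding not listed in the figure as "no check" is an additional assumption of this sketch. The names mode_enable_field and checking_enabled are hypothetical.

```python
# Host-side model of the mode check in Figure 13-8 (not C790 code).
# Status.KSU encodings as labeled in the figure.
KSU_KERNEL, KSU_SUPERVISOR, KSU_USER = 0b00, 0b01, 0b10

# BPC bit positions of the per-mode enables, from Table 13-3.
MODE_BITS = {
    "IUE": 26, "ISE": 25, "IKE": 24, "IXE": 23,
    "DUE": 21, "DSE": 20, "DKE": 19, "DXE": 18,
}


def mode_enable_field(erl, exl, ksu, kind):
    """Name the BPC field that gates breakpoint checking, or None.

    kind is "I" (instruction) or "D" (data). None means no field applies:
    Level2 mode, or a KSU encoding that the figure does not list.
    """
    if erl:                      # Level2: never a breakpoint event
        return None
    if exl:                      # Level1: the EXL-mode enable applies
        return kind + "XE"
    suffix = {KSU_KERNEL: "KE", KSU_SUPERVISOR: "SE", KSU_USER: "UE"}.get(ksu)
    return None if suffix is None else kind + suffix


def checking_enabled(bpc, erl, exl, ksu, kind):
    """True if the current mode allows the breakpoint event to be checked."""
    field = mode_enable_field(erl, exl, ksu, kind)
    return field is not None and bool((bpc >> MODE_BITS[field]) & 1)
```

With IUE and ISE set in BPC, checking_enabled is true for instruction breakpoints in user and supervisor mode and false in kernel mode, which matches the configuration built by the sample code in section 13.3.2.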

(Flowchart not reproduced. Summary of checking an instruction breakpoint event: if IAE is
not 1, there is no breakpoint event. Otherwise the instruction address and the value in
IAB are both masked and compared. If they are not equal, there is no breakpoint event. If
they are equal, the IAB status bit is set to 1 and the breakpoint is signaled: TRIG* is
asserted if BPC.ITE = 1, and the breakpoint exception is generated unless BPC.BED = 1.)

Figure 13-9. Hardware Breakpoint detection flow (IAB)

(Flowchart not reproduced. Summary of checking a data breakpoint event: for a read, DRE
must be 1; for a write, DWE must be 1; otherwise there is no breakpoint event. The data
address and the value in DAB are both masked and compared. If they are not equal, there is
no breakpoint event. If BPC.DVE = 1, the data value and the value in DVB are also masked
and compared, and there is no breakpoint event unless they are equal. When the conditions
hold, DRB is set to 1 for a read or DWB is set to 1 for a write, and the breakpoint is
signaled.)

Figure 13-10. Hardware Breakpoint detection flow (DAB/DVB) (1/2)

(Flowchart not reproduced. Continuation: once the breakpoint is signaled, TRIG* is
asserted if the trigger enable bit in BPC is 1, and the breakpoint exception is generated
unless BPC.BED = 1.)

Figure 13-10. Hardware Breakpoint detection flow (DAB/DVB) (2/2)
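
The data breakpoint flow of Figure 13-10 can be modeled in Python as below. This host-side sketch leaves out the processor mode check of Figure 13-8 and the timing aspects (imprecise exceptions); it uses DTE as the trigger enable for data breakpoints, following Table 13-3. The names data_breakpoint_event and signaling are hypothetical.

```python
# Host-side model of the data breakpoint flow in Figure 13-10 (not C790 code).
# Bit positions from Table 13-3. The mode check of Figure 13-8 is omitted.
DRE, DWE, DVE = 1 << 30, 1 << 29, 1 << 28
ITE, DTE, BED = 1 << 17, 1 << 16, 1 << 15


def data_breakpoint_event(bpc, is_read, addr, data, dab, dabm, dvb, dvbm):
    """Return the status bit to set ("DRB" or "DWB"), or None for no event."""
    if not (bpc & (DRE if is_read else DWE)):
        return None                                  # access type not enabled
    if (addr & dabm) != (dab & dabm):
        return None                                  # address condition fails
    if (bpc & DVE) and ((data & 0xFFFFFFFF & dvbm) != (dvb & dvbm)):
        return None                                  # value condition fails
    return "DRB" if is_read else "DWB"


def signaling(bpc, is_data):
    """Return (assert_trigger, take_exception) for a signaled breakpoint."""
    trigger = bool(bpc & (DTE if is_data else ITE))
    exception = not (bpc & BED)                      # BED = 1 disables the exception
    return trigger, exception
```

With DRE and DVE set, DAB = 0x1233ffff, DABM = 0xfffc0000, DVB = 0xbabecafe and DVBM = 0x0000ffff, a read of 0x1234cafe from address 0x12310000 yields "DRB", while a write to the same address yields no event because DWE is clear.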

13.3.2 Instruction Breakpointing

The following code sets an instruction breakpoint from 0x1234_5600 to 0x1234_56ff, and

traps if the processor is either in user mode or in supervisor mode.

------------------------------------------------------------------
#
# Setting instruction address breakpoint from 0x1234_5600 to 0x1234_56ff
# in user mode and supervisor mode
#
# 1st sync.
        sync.p                  # A barrier to ensure there is no pending
                                # instruction address breakpoint in the pipe.
                                # Pipeline flushing works for this purpose.
# First, disable instruction breakpointing to avoid spurious exceptions.
# The following is conservative so as not to disturb the configuration for
# data breakpointing.
#
        mfbpc   $4              # get the value in BPC
        bgez    $4, 1f          # skip the following if ( BPC[31] == 0 )
        nop                     # (branch delay slot)
        li      $5, (1 << 31)   # IAE is bit 31 of BPC
        xor     $4, $5, $4      # Reset IAE bit to zero.
        mtbpc   $4              # reload BPC.
# 2nd sync.
        sync.p                  # barrier to ensure the configuration change
                                # of breakpoint function
1:      #
# Reconfigure instruction breakpoint address.
# Note that the least significant 8 bits can be anything because they are
# masked by the IABM register anyway
#
        li      $4, 0x12345678
        mtiab   $4
#
# Setting mask register. A bit is masked (ignored) if the corresponding bit
# in the mask register is reset to zero.
#
        li      $5, 0xffffff00
        mtiabm  $5
#
# Reconfigure instruction breakpoint. For clarity, first reset all the bits
# for the instruction breakpoint, and then set the new configuration.
#
        mfbpc   $4
#
# Reset IUE/ISE/IKE/IXE/ITE/BED/IAB. Resetting IAB is especially important
# in order to identify the cause of the next breakpoint exception correctly.
#
        li      $5, ~( \
                  ( 1 << 26 ) # IUE \
                | ( 1 << 25 ) # ISE \
                | ( 1 << 24 ) # IKE \
                | ( 1 << 23 ) # IXE \
                | ( 1 << 17 ) # ITE \
                | ( 1 << 15 ) # BED \
                | ( 1 << 0 )  # IAB \
                )
        and     $4, $4, $5
#
# Set new configuration to BPC register.
# Note that setting BPC after IAB/IABM is important to avoid a spurious
# exception. BED (bit 15) is left at 0, as cleared above, because BED = 1
# disables the breakpoint exception (see Table 13-3).
#
        li      $6, \
                ( \
                  ( 1 << 31 ) # IAE = 1 to enable Inst. B.P. \
                | ( 1 << 26 ) # IUE = 1 to enable Inst. B.P in user mode. \
                | ( 1 << 25 ) # ISE = 1 to enable Inst. B.P in supv. mode. \
                )
        or      $5, $4, $6
        mtbpc   $5
# 3rd sync.
        sync.p                  # Barrier to ensure the configuration change
------------------------------------------------------------------
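
The register/mask values used in such a sequence can be derived from an address range. The Python sketch below does this on a host machine, assuming the range is a power-of-two size and aligned to that size (the only kind of range a single value/mask pair can express exactly). The function name range_to_value_mask is hypothetical.

```python
# Host-side helper: derive a breakpoint value/mask pair from an address range.
# Assumption: the range size is a power of two and the start is aligned to it.
def range_to_value_mask(start, end):
    """Return (value, mask) so that (addr & mask) == value exactly for start..end."""
    size = end - start + 1
    if size <= 0 or size & (size - 1):
        raise ValueError("range size must be a power of two")
    if start % size:
        raise ValueError("range start must be aligned to the range size")
    mask = ~(size - 1) & 0xFFFFFFFF
    return start & mask, mask
```

For the range 0x1234_5600 to 0x1234_56ff this gives the mask 0xffffff00 used above. Any value that agrees with the returned one in the unmasked bits (such as 0x12345678) is equivalent once it is loaded into the address register.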

13.3.3 Data Address Breakpointing

The following code sets a data address breakpoint from 0x1230_0000 to 0x1233_ffff for
both reading and writing, and traps if the processor is in kernel mode (including
Level1).

------------------------------------------------------------------
#
# Setting data address breakpoint from 0x1230_0000 to 0x1233_ffff
# in kernel(normal,L1) mode
#
# 1st sync.
        sync.l                  # A barrier to ensure there is no pending
                                # data address breakpoint in the pipe.
                                # All load/store buffers must be flushed
                                # for this purpose by SYNC.L
#
# First, reset data-breakpoint related bits to zeros.
# Resetting DWB/DRB is important so that the handler can recognize the
# next breakpoint exception correctly.
#
        mfbpc   $4              # load current configuration
        li      $5, ~( \
                  ( 1 << 30 ) # DRE \
                | ( 1 << 29 ) # DWE \
                | ( 1 << 28 ) # DVE \
                | ( 1 << 21 ) # DUE \
                | ( 1 << 20 ) # DSE \
                | ( 1 << 19 ) # DKE \
                | ( 1 << 18 ) # DXE \
                | ( 1 << 16 ) # DTE \
                | ( 1 << 15 ) # BED \
                | ( 1 << 2 )  # DWB \
                | ( 1 << 1 )  # DRB \
                )
        and     $4, $4, $5
        mtbpc   $4              # reload BPC.
# 2nd sync.
        sync.p                  # barrier to ensure the configuration change
                                # of breakpoint function
#
# Reconfigure data breakpoint address.
# Note that the least significant 18 bits can be anything because they are
# masked by the DABM register anyway
#
        li      $6, 0x12305678
        mtdab   $6
#
# Setting mask register. A bit is masked (ignored) if the corresponding bit
# in the mask register is reset to zero.
#
        li      $5, 0xfffc0000
        mtdabm  $5
#
# Set new configuration to BPC register.
# Note that setting BPC after DAB/DABM is important to avoid a spurious
# exception. BED (bit 15) is left at 0, as cleared above, because BED = 1
# disables the breakpoint exception (see Table 13-3).
#
        li      $6, \
                ( \
                  ( 1 << 30 ) # DRE = 1 to enable Data B.P on read \
                | ( 1 << 29 ) # DWE = 1 to enable Data B.P on write \
                | ( 1 << 19 ) # DKE = 1 to enable Data B.P in kern. mode. \
                | ( 1 << 18 ) # DXE = 1 to enable Data B.P under L1. \
                )
        or      $5, $4, $6      # Note that $4 still holds the value used
                                # on MTBPC.
        mtbpc   $5
# 3rd sync.
        sync.p                  # Barrier to ensure the configuration change
------------------------------------------------------------------

13.3.4 Breakpointing by Data Address and Value

Setting a Data Address and Value breakpoint is done in the same way as setting a Data
Address breakpoint. The following example is the same as the previous one except that the
trap happens only if the data contains 0xCAFE in its least significant 16 bits, and only
on loads.

------------------------------------------------------------------
#
# Setting data address/value breakpoint from 0x1230_0000 to 0x1233_ffff
# with data that contains 0xCAFE in kernel(normal, L1) mode.
#
# 1st sync.
        sync.l                  # A barrier to ensure there is no pending
                                # data address breakpoint in the pipe.
                                # All load/store buffers must be flushed
                                # for this purpose by SYNC.L
#
# First, reset data-breakpoint related bits to zeros.
# Resetting DWB/DRB is important so that the handler can recognize the
# next breakpoint exception correctly.
#
        mfbpc   $4              # load current configuration
        li      $5, ~( \
                  ( 1 << 30 ) # DRE \
                | ( 1 << 29 ) # DWE \
                | ( 1 << 28 ) # DVE \
                | ( 1 << 21 ) # DUE \
                | ( 1 << 20 ) # DSE \
                | ( 1 << 19 ) # DKE \
                | ( 1 << 18 ) # DXE \
                | ( 1 << 16 ) # DTE \
                | ( 1 << 15 ) # BED \
                | ( 1 << 2 )  # DWB \
                | ( 1 << 1 )  # DRB \
                )
        and     $4, $4, $5
        mtbpc   $4              # reload BPC.
# 2nd sync.
        sync.p                  # barrier to ensure the configuration change
                                # of breakpoint function
#
# Reconfigure data breakpoint address.
# Note that the least significant 18 bits can be anything because they are
# masked by the DABM register anyway
#
        li      $6, 0x1233ffff
        mtdab   $6
#
# Setting mask register. A bit is masked (ignored) if the corresponding bit
# in the mask register is reset to zero.
#
        li      $5, 0xfffc0000
        mtdabm  $5
#
# Configure data value.
# Note that the most significant 16 bits can be anything because they are
# masked by the DVBM register anyway
#
        li      $6, 0xbabecafe
        mtdvb   $6
#
# Setting mask register. A bit is masked (ignored) if the corresponding bit
# in the mask register is reset to zero.
#
        li      $5, 0x0000ffff
        mtdvbm  $5
#
# Set new configuration to BPC register.
# Note that setting BPC after DAB/DABM and DVB/DVBM is important to avoid a
# spurious exception. BED (bit 15) is left at 0, as cleared above, because
# BED = 1 disables the breakpoint exception (see Table 13-3).
#
        li      $6, \
                ( \
                  ( 1 << 30 ) # DRE = 1 to enable Data B.P on read \
                | ( 1 << 28 ) # DVE = 1 to enable Data value B.P \
                | ( 1 << 19 ) # DKE = 1 to enable Data B.P in kern. mode. \
                | ( 1 << 18 ) # DXE = 1 to enable Data B.P under L1. \
                )
        or      $5, $4, $6      # Note that $4 still holds the value used
                                # on MTBPC.
        mtbpc   $5
# 3rd sync.
        sync.p                  # Barrier to ensure the configuration change
------------------------------------------------------------------

13.3.5 Data Value Breakpointing

A data value breakpoint can be configured so that it traps on the data value alone, by
setting the DABM register to zero and configuring the data breakpoint in "Data Address and
Value" mode.
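
Why a zero DABM gives a value-only breakpoint can be seen from a short Python model of the two masked comparisons. This is a host-side sketch, not C790 code; the function name address_and_value_match is hypothetical.

```python
# Host-side model: with DABM = 0 every address passes the address comparison,
# so only the data value condition decides the breakpoint event.
def address_and_value_match(addr, data, dab, dabm, dvb, dvbm):
    """True if both the masked address and the masked 32-bit value match."""
    addr_ok = (addr & dabm) == (dab & dabm)
    value_ok = (data & 0xFFFFFFFF & dvbm) == (dvb & dvbm)
    return addr_ok and value_ok
```

With dabm = 0 both sides of the address comparison are 0, so addr_ok is true for any address and any DAB contents.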

13.4 Triggering External Probes

There is one dedicated pad that makes breakpoints visible outside the C790. This pad, the
TRIG* signal, is asserted for two cycles whenever a breakpoint event is detected. Trigger
signal generation is enabled by setting the ITE/DTE bits in the BPC register to 1. Note
that the assertion of the TRIG* signal is not completely synchronized with the occurrence
of the exception: the TRIG* signal is directly connected to the internal breakpoint detect
logic, while an exception, including the breakpoint exception, always occurs along with
the retirement of an instruction. Therefore, the timing of the assertion of the TRIG*
signal and that of the occurrence of the exception may differ. In particular, if the
breakpoint is detected right before entering Level2 mode and the breakpoint exception is
taken imprecisely, the exception may be masked because of the processor's mode change
although the TRIG* signal has already been asserted.

13.5 Important notice on using hardware breakpoint

One important issue not mentioned so far in this section is that breakpoint detection does
not take the ASID into account. This implies not only that software has to handle the
breakpoint registers on a context switch in order to apply breakpointing to a specific
process, but also that an imprecise breakpoint exception may be detected after or in the
middle of a context switch. In such a condition, it may become difficult to identify which
process the breakpoint exception belongs to. This can be avoided by executing a SYNC.L
instruction right before changing the ASID. (Since all imprecise breakpoint events relate
to load/store instructions, executing SYNC.L works as a barrier.)

Related to this issue, as briefly described in section 13.3, a breakpoint exception may be
delayed because of the handling of another Level2 exception, although the breakpoint
exception actually comes first from the point of view of instruction ordering. In such a
condition, because the C790 generates the breakpoint exception after the processor returns
from Level2 (see Note 1), there is no possibility of missing the breakpoint. However, if
the program needs to ensure the order of occurrence between Level2 exceptions, software
has to take care of it (i.e. every Level2 handler has to check for the occurrence of a
breakpoint first). Similarly, if a Level2 exception DOES NOT return to where the exception
was detected, software has to make sure that the breakpoint condition is reset.

Note 1: The C790 tracks the occurrence of a breakpoint exception until the breakpoint
exception is taken.

Index

X-1

INDEX

A

ABS.............................................................................................................................................. 2-18, 11-6, D-4

ABS.fmt....................................................................................................................................3-21, 10-14, D-41

AbsoluteValue.................................................................................................................................................D-4

ADD .......................................................................................................................2-18, 3-15, 5-26, A-11, A-141

ADD. ...............................................................................................................................................................D-5

ADD.fmt ...................................................................................................................................3-21, 10-14, D-41

ADDI ...............................................................................................3-14, 5-26, A-12, A-141, B-163, C-41, D-40

ADDIU.............................................................................................3-14, A-12, A-13, A-141, B-163, C-41, D-40

AddressError......................................................................... A-58, A-67, A-68, A-70, A-79, A-94, A-103, A-116

ADDU..............................................................................................................................3-15, A-11, A-14, A-141

AdEL.............................................................................................................................................4-20, 5-8, 5-15

AdES.............................................................................................................................................4-20, 5-8, 5-15

AGNT...................................................................................................................................8-5, 8-11, 8-14, 8-15

alignment ............. 2-7, 2-16, 3-8, 6-1, A-2, A-6, A-7, A-60, A-64, A-72, A-76, A-95, A-99, A-117, A-121, B-10,

B-162

ALU...................................................................................................................2-3, 2-10, 2-11, 2-12, 2-13, 3-14

AND ................................................................3-14, 3-15, 3-25, A-3, A-15, A-16, A-141, B-4, B-48, C-39, C-40

ANDI ........................................................................................................ 3-14, A-16, A-141, B-163, C-41, D-40

arbiter............................................................................................................................................8-2, 8-14, 8-15

AREQ..........................................................................................................................................8-11, 8-14, 8-15

ASID.......... 2-15, 4-5, 4-8, 4-14, 5-16, 5-17, 5-18, 6-2, 6-3, 6-4, 6-9, 6-10, 6-12, 6-13, 6-16, 6-18, 13-20, C-38

Associativity..................................................................................................................................................2-17

B

BadPAddr..........................................................................................................2-15, 4-5, 4-17, 4-25, 5-19, 8-25

BadVAddr......................................................................................... 2-15, 4-5, 4-9, 4-12, 5-15, 5-16, 5-17, 5-18

BadVPN2........................................................................................................................................................4-9

BC0.....................................................................................................................................................C-41, C-42

BC0F..................................................................................................................................3-20, C-2, C-41, C-42

BC0FL..........................................................................................................................................3-20, C-3, C-42

BC0T............................................................................................................................................3-20, C-4, C-42

BC0TL..........................................................................................................................................3-20, C-5, C-42

BC1...............................................................................................................................................................D-40

BC1F........................................................................................................................ 3-21, 10-15, D-6, D-8, D-40

BC1T........................................................................................................................ 3-21, 10-15, D-7, D-8, D-40

BD2................................................................................................ 4-19, 4-33, 5-5, 5-12, 5-13, 5-14, 5-25, 9-10

BdPAddr........................................................................................................................................................ 4-25

BDS.................................................................................................................................................4-29, 9-6, 9-8

BE.................................................................................................................................................................4-23

BED............................................................................................................................. 13-6, 13-15, 13-16, 13-19

BEM..................... 4-16, 4-17, 4-25, 5-9, 5-11, 5-19, 8-25, A-61, A-62, A-65, A-66, A-73, A-74, A-77, A-78,

A-97, A-98, A-101, A-102, A-119, A-120, A-123, A-124

BEQ ......................................................................................................... 3-17, A-17, A-141, B-163, C-41, D-40

BEQL ....................................................................................................... 3-17, A-18, A-141, B-163, C-41, D-40

BEV...................... 4-16, 4-17, 5-7, 5-11, 5-12, 5-15, 5-16, 5-17, 5-18, 5-19, 5-20, 5-21, 5-22, 5-23, 5-24, 5-26,

5-27, 5-28, 12-6

BFH.................................................................................................................................................................C-6

BGEZ.......................................................................................................................................3-18, A-19, A-142

BGEZAL...................................................................................................................................3-18, A-20, A-142

BGEZALL.................................................................................................................................3-18, A-21, A-142

BGEZL.....................................................................................................................................3-18, A-22, A-142

BGTZ ........................................................................................................3-17, A-23, A-141, B-163, C-41, D-40

BGTZL ...................................................................................................... 3-17, A-24, A-141, B-163, C-41, D-40

BHINBT...........................................................................................................................................................C-6

BHT........................................................................................................................ 1-2, 2-3, 2-6, 2-7, 4-31, C-10

BIU..................................................................................................................................................................2-4

BLEZ.........................................................................................................3-17, A-25, A-141, B-163, C-41, D-40

BLEZL...................................................................................................... 3-17, A-26, A-141, B-163, C-41, D-40

BLTZ ........................................................................................................................................3-18, A-27, A-142

BLTZAL....................................................................................................................................3-18, A-28, A-142

BLTZALL..................................................................................................................................3-18, A-29, A-142

BLTZL ......................................................................................................................................3-18, A-30, A-142

BNE.......................................................................................................... 3-17, A-31, A-141, B-163, C-41, D-40

BNEL........................................................................................................ 3-17, A-32, A-141, B-163, C-41, D-40

bootstrapping.................................................................................................................................................5-11

BPC.........................................................4-26, 5-11, 13-3, 13-4, 13-5, 13-8, 13-14, 13-16, 13-18, 13-19, 13-20

BPE.............................................................................................................................................. 4-23, 5-11, C-9

BR........................................................................................................................................2-3, 2-11, 2-12, 3-26

branch likely.........................................................................................................................................2-13, 9-10

BREAK.......................................................................2-11, 3-18, 5-10, 5-21, 9-7, A-33, A-39, A-141, B-8, B-67

breakpoint............ 1-2, 2-19, 3-18, 5-10, 5-11, 5-14, 5-19, 12-1, 13-1, 13-2, 13-3, 13-4, 13-6, 13-7, 13-8, 13-9,

13-14, 13-16, 13-18, 13-19, 13-20, A-33

breakpoints .........................................................................................................................12-1, 13-5, 13-8, A-2

BTAC...................................1-2, 2-3, 2-6, 2-7, 4-29, 4-31, 9-6, 9-7, 9-8, C-6, C-7, C-9, C-10, C-11, C-13, C-28

BUSERR................................................................................................ 5-19, 8-10, 8-25, 8-26, 8-27, 8-28, 8-29

BXLBT.............................................................................................................................................................C-6

BXSBT............................................................................................................................................................C-6

C

C.cond.D.........................................................................................................................................................D-8

C.cond.fmt ...............................................................................................................................3-21, 10-15, D-41

C.cond.fmt. ...................................................................................................................................D-6, D-7, D-41

C.cond.S.........................................................................................................................................................D-8

Cache................... 1-2, 2-1, 2-3, 2-6, 2-7, 2-15, 2-17, 2-18, 3-20, 4-5, 4-17, 4-29, 8-2, 8-8, 9-7, 9-9, A-6, A-7,

C-6, C-7, C-8, C-9, C-13

CACHE ................ 2-11, 2-13, 2-17, 3-20, 4-17, 4-23, 4-31, 4-32, 5-19, A-141, B-163, C-6, C-7, C-8, C-9, C-10,

C-11, C-12, C-13, C-41, D-40

CacheOp.........................................................................................................................................................C-7

CAUSE.................................................................................................................................................8-13, 9-10

CCR...............................................................................................................................9-2, 9-5, 9-10, 9-11, A-3

CE....................................................................................................................................... 4-19, 4-23, 5-2, 5-23

CEIL..............................................................................................................................................................D-12

CEIL.L.fmt................................................................................................................................3-21, 10-14, D-41

CEIL.W..........................................................................................................................................................D-13

CEIL.W.fmt...............................................................................................................................3-21, 10-14, D-41

CFC1.....................................................................................................................3-21, 10-13, 11-9, D-14, D-40

CH........................................................................................................................................................4-16, 4-17

coherency ...........................................................................................................2-18, 4-8, 4-24, 6-12, 6-16, 8-2

Coherency.....................................................................................................................................................6-17

Config.......................................................................................................... 2-15, 4-5, 4-23, 5-11, 6-7, 6-12, C-9

CONFIG.............................................................................................................................................. 9-10, C-28

consistency...................................................................................................................................................13-9

Context.......................................................................................................2-15, 4-5, 4-9, 5-15, 5-16, 5-17, 5-18

contexts...........................................................................................................................................................6-3

ConvertFmt..........................................................................................D-2, D-16, D-17, D-18, D-19, D-23, D-24

COP0................... 2-7, 2-11, 2-12, 2-13, 2-15, 3-2, 3-20, 4-1, 4-5, 4-16, 4-17, 4-22, 4-28, 5-23, 6-1, 6-3, 6-14,

8-25, 9-2, 9-3, 9-11, A-4, A-141, A-142, B-163, C-1, C-7, C-9, C-10, C-11, C-12, C-14, C-15,

C-17, C-18, C-19, C-20, C-21, C-22, C-23, C-24, C-25, C-26, C-27, C-28, C-29, C-30, C-31,

C-32, C-33, C-34, C-35, C-36, C-41, C-42, D-40

COP1................... 2-3, 2-4, 2-7, 2-8, 2-10, 2-11, 2-12, 2-13, 2-14, 3-2, 3-21, 4-29, 9-6, 9-7, A-8, A-125, A-141,

A-142, B-163, C-16, C-41, D-1, D-2, D-27, D-29, D-40, D-41

coprocessor......... 2-4, 2-7, 2-8, 2-16, 3-5, 3-21, 4-16, 4-17, 5-11, 5-23, 6-1, 10-2, A-4, A-5, A-142, C-1, C-2,

C-3, C-4, C-5, C-14, C-15, C-18, C-28, D-1, D-14, D-15, D-21, D-26

Coprocessor ........ 1-1, 1-5, 2-11, 2-15, 3-2, 3-5, 3-16, 3-20, 3-21, 4-1, 4-5, 4-16, 4-19, 4-20, 5-2, 5-8, 5-9,

5-10, 5-23, 6-1, 6-14, 8-10, 8-11, 13-2, A-3, A-4, A-5, A-8, A-141, A-142, C-1, C-2, C-3,

C-4, C-5, C-7, C-16, C-17, C-18, C-19, C-20, C-21, C-22, C-23, C-24, C-25, C-26, C-27,

C-28, C-29, C-30, C-31, C-32, C-33, C-34, C-35, C-36, C-37, C-38, C-39, C-40, D-4, D-5,

D-6, D-7, D-11, D-12, D-13, D-14, D-15, D-16, D-17, D-18, D-19, D-20, D-21, D-22, D-23,

D-24, D-25, D-26, D-27, D-28, D-29, D-30, D-31, D-32, D-33, D-34, D-35, D-36, D-37, D-38,

D-39

Coprocessor0 ...............................................................................................................................................13-4

Count .................................................................................................2-15, 3-25, 4-5, 4-13, 4-15, 5-24, B-4, B-5

counter................. 2-15, 2-16, 2-19, 3-17, 4-5, 4-17, 4-18, 4-19, 4-28, 4-30, 4-33, 5-5, 5-9, 5-13, 6-1, 9-1, 9-2,

9-3, 9-5, 9-6, 9-8, 9-10, 9-11, C-28, C-35

Counter................ 2-3, 2-15, 2-19, 3-20, 4-1, 4-2, 4-3, 4-4, 4-5, 4-19, 4-21, 4-28, 4-29, 4-30, 5-2, 5-7, 5-8,

5-9, 5-10, 5-11, 5-13, 9-1, 9-2, 9-3, 9-4, 9-5, 9-6, 9-10, 9-11, 12-6, A-4, C-25, C-26, C-35

CPCOND ........................................................................................................................................................A-3

CPCOND0 ............................................................................................................8-10, 8-11, C-2, C-3, C-4, C-5

CPR ..................... A-3, C-17, C-18, C-19, C-20, C-21, C-22, C-23, C-24, C-25, C-26, C-27, C-28, C-29, C-30,

C-31, C-32, C-33, C-34, C-35, C-36

CPUADDR........................................................................................................................................8-3, 8-7, 8-9

CPUASTART ....................................................................................... 8-3, 8-7, 8-8, 8-9, 8-12, 8-13, 8-16, 8-19

CPUBE..............................................................................................................................................8-3, 8-7, 8-9

CPUCLK ........................................................................................................................................................8-11

CPUDATA...................................................................................................................... 8-3, 8-7, 8-9, 8-17, 8-20

CPUDSTART...............................................................8-3, 8-10, 8-12, 8-13, 8-16, 8-17, 8-19, 8-20, 8-26, 8-28

CPURD.............................................................................................................................................8-3, 8-8, 8-9

CPUTRANSTYPE...........................................................................................................................................8-8

CPUTSIZE..........................................................................................................8-3, 8-9, 8-12, 8-13, 8-16, 8-19

CPUWR ............................................................................................................................................8-3, 8-8, 8-9

CTC1......................................................................................... 3-21, 10-7, 10-8, 10-9, 10-13, 11-9, D-15, D-40

CTE.....................................................................................................4-28, 4-29, 5-11, 9-2, 9-4, 9-5, 9-10, 9-11

CTR0...........................................................................................................................................4-29, 9-10, 9-11

CTR1...........................................................................................................................................4-29, 9-10, 9-11

CU........................................................................................... 1-5, 3-5, 3-20, 3-21, 4-16, 4-17, C-1, C-14, C-15

CU0....................................................................................................................................................... 5-23, C-7

CVT...............................................................................................................................................................3-26

CVT.D............................................................................................................................................................D-16

CVT.D.fmt ................................................................................................................................3-21, 10-14, D-41

CVT.L............................................................................................................................................................D-17

CVT.L.fmt.................................................................................................................................3-21, 10-14, D-41

CVT.S............................................................................................................................................................D-18

CVT.S.fmt.................................................................................................................................3-21, 10-14, D-41

CVT.W.fmt................................................................................................................................3-21, 10-14, D-41

CVT.W.S .......................................................................................................................................................D-19

D

DAB...........................................................................................................4-27, 13-3, 13-7, 13-12, 13-16, 13-19

DABM........................................................................................................4-27, 13-3, 13-7, 13-16, 13-18, 13-19

DADD..............................................................................................................................3-15, 5-26, A-34, A-141

DADDI.............................................................................................3-14, 5-26, A-35, A-141, B-163, C-41, D-40

DADDIU..........................................................................................3-14, A-35, A-36, A-141, B-163, C-41, D-40

DADDU..........................................................................................................................3-15, A-34, A-37, A-141

DBE...............................................................................................................................................4-20, 5-8, 5-19

DC.................................................................................................................................................................4-23

DCE ............................................................................................................................4-23, 5-11, 9-7, C-9, C-28

DDIV ...........................................................................................................3-4, 3-14, A-142, B-165, C-42, D-41

DDIVU.........................................................................................................3-4, 3-14, A-142, B-165, C-42, D-41

debug..................................................................................3-20, 4-17, 4-18, 4-19, 4-26, 4-33, 5-10, 5-14, 13-6

DEBUG.........................................................................................................................................................5-14

DEC ................................................................................................................................................................ 3-6

decoupling.......................................................................................................................................................2-4

Demultiplexed........................................................................................................................................2-18, 8-2

DEV................................................................................................ 4-16, 4-17, 5-7, 5-13, 5-14, 5-25, 9-10, 12-6

DHIN...............................................................................................................................................................C-6

DHWBIN.........................................................................................................................................................C-6

DHWOIN.........................................................................................................................................................C-6

DI .................................................................................................3-20, 4-16, 4-17, 5-23, C-1, C-14, C-15, C-42

DIE..............................................................................................................................................4-23, 4-24, 5-11

dirty........................................................................................................ 4-8, 5-18, 6-16, 8-12, A-91, C-11, C-12

Dirty........................................................................................................ 4-8, 4-32, 5-11, 6-16, C-11, C-12, C-13

dispatches.....................................................................................................................................................3-17

displacement............................................................................................................................................3-3, A-9

DIV...........................................................................................2-18, 3-16, 3-26, A-38, A-40, A-80, A-141, D-20

DIV.fmt .....................................................................................................................................3-21, 10-14, D-41

DIV1..................................................................................................2-14, 3-23, 3-26, 4-2, B-3, B-7, B-9, B-163

Divide........................................................1-1, 2-6, 3-14, 3-16, 3-21, 3-22, 3-23, 3-24, 3-26, 4-1, B-3, B-5, B-8

DIVU ...............................................................................................................................3-16, 3-26, A-40, A-141

DIVU1 ...................................................................................................... 2-14, 3-23, 3-26, 4-2, B-3, B-9, B-163

DKE............................................................................................................................. 13-6, 13-16, 13-18, 13-19

DMA...................................................................................8-1, 8-3, 8-6, 8-7, 8-10, 8-12, 8-13, 8-14, 8-25, 8-26

DMAC ...............................................................................................8-1, 8-3, 8-10, 8-11, 8-13, 8-14, 8-25, 8-26

DMFC1...........................................................................................................................3-21, 10-13, D-21, D-40

DMTC1...........................................................................................................................3-21, 10-13, D-22, D-40

DMULT........................................................................................................3-4, 3-14, A-142, B-165, C-42, D-41

DMULTU.....................................................................................................3-4, 3-14, A-142, B-165, C-42, D-41

doubleword .......... 3-5, 3-8, 3-9, 5-15, A-4, A-5, A-6, A-34, A-37, A-41, A-42, A-43, A-44, A-45, A-46, A-47,

A-48, A-49, A-50, A-51, A-58, A-59, A-60, A-63, A-64, A-72, A-94, A-95, A-96, A-99, A-100,

A-118, A-122, B-2, B-64, B-65, B-72, B-74, B-78, B-79, B-80, B-81, B-82, B-83, B-89, B-93,

B-95, B-113, B-120, B-122, B-128, B-129, B-130

DRB ........................................................................................................................................13-6, 13-16, 13-18

DRE .................................................................................................5-11, 13-5, 13-6, 13-8, 13-16, 13-18, 13-19

DSE.........................................................................................................................................13-6, 13-16, 13-18

DSLL........................................................................................................................................3-15, A-41, A-141

DSLL32....................................................................................................................................3-15, A-42, A-141

DSLLV......................................................................................................................................3-15, A-43, A-141

DSRA.......................................................................................................................................3-15, A-44, A-141

DSRA32...................................................................................................................................3-15, A-45, A-141

DSRAV.....................................................................................................................................3-15, A-46, A-141

DSRL .......................................................................................................................................3-15, A-47, A-141

DSRL32 ...................................................................................................................................3-15, A-48, A-141

DSRLV.....................................................................................................................................3-15, A-49, A-141

DSUB..............................................................................................................................3-15, 5-26, A-50, A-141

DSUBU ..........................................................................................................................3-15, A-50, A-51, A-141

DTE............................................................................................................................. 13-6, 13-16, 13-18, 13-20

DTLB.......................................................................................................................2-3, 2-6, 2-16, 4-29, 9-6, 9-8

DUE ........................................................................................................................................13-6, 13-16, 13-18

DVB................................................................................................................................. 4-27, 13-3, 13-8, 13-12

DVBM.............................................................................................................................. 4-27, 13-3, 13-8, 13-18

DVE............................................................................................................................. 13-5, 13-16, 13-18, 13-19

DWB........................................................................................................................................13-6, 13-16, 13-18

DWE............................................................................................................ 5-11, 13-5, 13-6, 13-8, 13-16, 13-18

DXE............................................................................................................................. 13-6, 13-16, 13-18, 13-19

DXIN ...............................................................................................................................................................C-6

DXLDT............................................................................................................................................................C-6

DXLTG............................................................................................................................................................C-6

DXSDT............................................................................................................................................................C-6

DXSTG ...........................................................................................................................................................C-6

DXWBIN .........................................................................................................................................................C-6

E

EC.................................................................................................................................................................4-23

EDI..................................................................................................................4-16, 4-17, 5-23, C-1, C-14, C-15

Edian.............................................................................................................................................................4-23

EI..................................................................................................3-20, 4-16, 4-17, 5-23, C-1, C-14, C-15, C-42

EIE.................................................................................................................4-16, 4-17, 4-18, 5-24, C-14, C-15

endian.................. 3-5, 3-6, 3-7, 3-9, 3-10, 3-11, 3-12, 3-13, A-3, A-6, A-61, A-62, A-65, A-66, A-73, A-74,

A-77, A-78, A-97, A-98, A-101, A-102, A-119, A-120, A-123, A-124

endianess .......................................................................................................................................................3-9

Endianness..............................................................................................................................................1-2, 3-5

EntryHi.................... 2-15, 4-5, 4-14, 5-15, 5-16, 5-17, 5-18, 6-2, 6-3, 6-4, 6-15, C-28, C-37, C-38, C-39, C-40

EntryHI..........................................................................................................................................................6-16

EntryHi7........................................................................................................................................................C-37

EntryLo........................................................................................5-15, 5-16, 5-17, 5-18, 6-15, C-38, C-39, C-40

EntryLo0................................................................................ 2-15, 4-5, 4-8, 5-16, 6-15, 6-16, C-38, C-39, C-40

EntryLo1................................................................................ 2-15, 4-5, 4-8, 5-16, 6-15, 6-16, C-38, C-39, C-40

EPC...................... 2-6, 2-15, 4-5, 4-21, 4-33, 5-2, 5-3, 5-15, 5-16, 5-17, 5-18, 5-19, 5-20, 5-21, 5-22, 5-23,

5-26, 5-27, 11-9, C-16

ERET ............2-11, 2-12, 2-13, 3-20, 4-4, 5-5, 5-24, 6-11, 9-7, 9-11, 12-2, 12-5, C-16, C-38, C-39, C-40, C-42

ERL...................... 4-16, 4-17, 4-18, 5-5, 5-9, 5-11, 5-12, 5-13, 5-14, 5-19, 5-24, 5-25, 6-6, 6-7, 6-8, 6-9, 6-10,

6-11, 6-12, 9-2, 9-10, 9-11, 13-5, 13-6, C-14, C-15, C-16

ERL0...............................................................................................................................................................9-5

ERL1...............................................................................................................................................................9-5

Error..................... 2-6, 2-15, 4-5, 4-12, 4-17, 4-18, 5-2, 5-10, 5-15, 5-19, 5-23, 6-6, 6-7, 6-9, 8-13, 8-25, 8-26,

8-28, A-2, A-54, A-55, A-56, A-57, A-58, A-62, A-66, A-67, A-68, A-70, A-74, A-78, A-79,

A-93, A-94, A-98, A-102, A-103, A-116, A-120, A-124, B-10, B-162, C-7, C-8, D-26, D-34,

D-37

ErrorEPC...............................................................................4-33, 5-5, 5-12, 5-13, 5-14, 5-25, 9-10, 9-11, C-16

ErrorPC..................................................................................................................................................2-15, 4-5

EVENT............................................................................................................................................................9-5

EVENT0................................................................................................................4-28, 4-29, 9-2, 9-5, 9-6, 9-11

EVENT1........................................................................................................................4-28, 4-29, 9-5, 9-6, 9-11

EXC2....................................................................................... 4-19, 5-5, 5-8, 5-11, 5-12, 5-13, 5-14, 5-25, 9-10

ExcCode ................ 4-19, 4-20, 5-2, 5-8, 5-15, 5-16, 5-17, 5-18, 5-19, 5-20, 5-21, 5-22, 5-23, 5-24, 5-26, 5-27

exception.............. 2-15, 2-16, 2-18, 2-19, 3-2, 3-5, 3-16, 3-18, 3-20, 4-4, 4-5, 4-9, 4-12, 4-14, 4-16, 4-17, 4-18,

4-19, 4-20, 4-21, 4-29, 4-33, 5-1, 5-2, 5-3, 5-5, 5-8, 5-9, 5-10, 5-11, 5-12, 5-13, 5-14, 5-15,

5-16, 5-17, 5-18, 5-19, 5-20, 5-21, 5-22, 5-23, 5-24, 5-25, 5-26, 5-27, 6-1, 6-2, 6-4, 6-6,

6-9, 6-11, 6-14, 6-15, 6-16, 6-17, 6-20, 8-13, 8-25, 9-2, 9-7, 9-8, 9-10, 9-11, 10-8, 11-2, 11-3,

12-1, 12-2, 12-3, 12-5, 12-6, 12-7, 12-14, 12-15, 12-16, 12-17, 12-18, 12-19, 12-20, 13-2,

13-4, 13-5, 13-6, 13-8, 13-9, 13-14, 13-15, 13-16, 13-18, 13-19, 13-20, A-2, A-6, A-8, A-11,

A-12, A-13, A-14, A-20, A-21, A-28, A-29, A-33, A-34, A-35, A-36, A-37, A-38, A-39, A-40,

A-50, A-51, A-54, A-55, A-58, A-67, A-68, A-70, A-86, A-87, A-91, A-92, A-94, A-103, A-106,

A-107, A-108, A-109, A-114, A-115, A-116, A-126, A-127, A-128, A-129, A-130, A-131,

A-132, A-133, A-134, A-135, A-136, A-137, A-138, A-142, B-7, B-8, B-9, B-11, B-12, B-13,

B-14, B-20, B-21, B-22, B-23, B-25, B-27, B-28, B-66, B-67, B-68, B-70, B-71, B-84, B-86,

B-91, B-93, B-95, B-111, B-113, B-118, B-120, B-122, B-165, C-1, C-2, C-3, C-4, C-5, C-7,

C-8, C-16, C-17, C-18, C-19, C-20, C-21, C-22, C-23, C-24, C-25, C-26, C-27, C-28, C-29,

C-30, C-31, C-32, C-33, C-34, C-35, C-36, C-37, C-38, C-39, C-40, C-42, D-26, D-37, D-41

Exception............. 2-6, 2-11, 2-15, 2-19, 3-18, 3-20, 3-21, 4-5, 4-18, 4-20, 4-21, 5-1, 5-2, 5-3, 5-4, 5-5, 5-6, 5-7,

5-8, 5-9, 5-10, 5-11, 5-12, 5-13, 5-14, 5-15, 5-16, 5-17, 5-18, 5-19, 5-20, 5-21, 5-22, 5-23,

5-24, 5-25, 5-26, 5-27, 5-28, 6-6, 6-11, 8-25, 8-26, 12-2, 12-5, 12-6, 12-7, 12-14, 12-15,

12-16, 12-17, 12-18, 13-2, 13-6, A-8, A-37, A-79, B-62, C-8

Exceptions .....................................................................................................................................................11-5

execution pipeline..................................................................................... 2-3, 2-5, 2-10, 2-11, 2-12, 3-26, C-16

ExHnd............................................................................................................12-14, 12-15, 12-16, 12-17, 12-18

ExHnd1............................................................................................................................................12-19, 12-2 0

ExHnd2............................................................................................................................................12-19, 12-2 0

EXL...................... 4-16, 4-17, 4-18, 4-21, 4-29, 5-2, 5-5, 5-7, 5-9, 5-12, 5-16, 5-19, 5-24, 6-6, 6-8, 6-9, 6-10,

6-11, 6-12, 9-2, 12-6, 13-5, 13-6, C-14, C-15, C-16

EXL0......................................................................................................................................4-29, 9-2, 9-5, 9-11

EXL1.............................................................................................................................................4-29, 9-5, 9-11

F

FCR...............................................................................................................................................................D-14

FCR0.............................................................................................................................................................10-4

FCR31........................................................................................................................................10-4, 10-6, D-15

FCRs.............................................................................................................................................................10-4

FetchAddress......................................................................................................................................C-10, C-11

FGR ............................................................................................................................................................ 10-13

FGRs.............................................................................................................................................................10-2

FLOOR.L.......................................................................................................................................................D-23

FLOOR.L.fmt ...........................................................................................................................3-21, 10-14, D-41

FLOOR.W. ....................................................................................................................................................D-24

FLOOR.W.fmt ..........................................................................................................................3-21, 10-14, D-41

FP_Control..........................................................................................................................................D-14, D-15

FPE......................................................................................................................................4-20, 5-8, 5-28, 11-3

FPR...................... 2-3, 2-9, D-2, D-4, D-5, D-8, D-12, D-13, D-16, D-17, D-18, D-19, D-20, D-21, D-22, D-23,

D-24, D-26, D-27, D-28, D-30, D-31, D-32, D-33, D-35, D-36, D-37, D-38, D-39

FPRs......................................................................................................................10-2, D-10, D-16, D-17, D-28

FPU ...................... 1-2, 2-3, 2-7, 2-8, 2-14, 2-18, 4-16, 10-13, 10-14, 11-2, 11-5, 11-8, D-1, D-2, D-3, D-14,

D-15, D-27, D-29

FR...............................................................................................................................................4-16, 4-17, 10-2

funnel shift .....................................................................2-3, 2-14, 4-1, 4-2, 4-4, B-17, B-20, B-21, B-22, B-161

Funnel shift ....................................................................................................................................................2-11

G

gathering............................................................................................................2-4, 2-19, 6-17, 9-1, A-8, A-125

General Purpose Registers ........................................................................................2-3, 4-1, 4-2, 4-3, 4-4, A-3

global bit........................................................................................................................................................6-18

GPR..............................................................................................................................................................D-21

GPR10................................................................................................................................................B-21, B-22

GPRLEN.........................................................................................................................................A-3, D-6, D-7

H

HI ......................... 2-11, 2-14, 3-16, 3-22, 3-23, 3-24, 3-26, 4-1, 4-2, 4-3, 4-4, A-38, A-39, A-40, A-80, A-84,

A-86, A-87, B-2, B-5, B-11, B-13, B-23, B-25, B-66, B-67, B-68, B-70, B-84, B-85, B-86,

B-87, B-91, B-92, B-93, B-95, B-101, B-102, B-111, B-113, B-115, B-116, B-118, B-120,

B-122

HI0 ............................................................................................................................................4-2, 4-3, 4-4, B-2

HI1 .................................2-11, 2-14, 4-2, 4-3, 4-4, B-2, B-3, B-7, B-8, B-9, B-12, B-14, B-15, B-18, B-24, B-26

hit under miss ........................................................................................................................................1-2, 4-23

I

IAB...................................................................................................4-27, 13-3, 13-6, 13-7, 13-11, 13-13, 13-14

IABM............................................................................................................................... 4-27, 13-3, 13-7, 13-14

IAE.................................................................................................................................5-11, 13-5, 13-14, 13-15

IBE................................................................................................................................................4-20, 5-8, 5-19

IC ..................................................................................................................................................................4-23

ICE............................................................................................................................................... 4-23, 5-11, C-9

ID .........................................................................................................................................................4-14, 6-16

IE...................................................................................................4-16, 4-17, 4-18, 5-9, 5-12, 5-24, C-14, C-15

IEEE............................2-18, 10-1, 10-8, 10-9, 10-10, 11-2, 11-3, 11-6, 11-7, 11-8, 11-9, D-8, D-12, D-13, D-19

IFL...................................................................................................................................................................C-6

IHIN.................................................................................................................................................................C-6

IKE.....................................................................................................................................................13-5, 13-14

IM...............................................................................................................................4- 13, 4-16, 4- 17, 4-18, 5- 9

imprec is e .............................................................................................5-14, 5-19, 8-13, 13-2, 13-5, 13-8, 13-20

Index.....................2-15, 3-20, 4-5, 4-6, 5-18, 5-19, 6-20, C-7, C-9, C-10, C-11, C-12, C-13, C-37, C-38, C-39

INDEX.............................................................................................................................................................C-6

Index5.................................................................................................................................................C-38, C-39

Init..................................................................................................................................................................9-11

initialize..........................................................................................................................................................9-11

initializing .......................................................................................................................................................5-11

Initializing.......................................................................................................................................................9-11

INT................................................................................................................................................................8-10

interleave ............................................................................................................................................B-88, B-89

interleaved ..........................................................................................................................................B-88, B-89

interrupt........ 1-5, 3-16, 3-22, 4-13, 4-15, 4-16, 4-17, 4-19, 4-33, 5-24, 8-10, 8-13, 8-25, 8-26, 9-4, 13-8, C-16

Interrupt...............3-20, 4-16, 4-17, 4-18, 4-19, 4-20, 5-2, 5-5, 5-7, 5-8, 5-9, 5-10, 5-12, 5-24, 8-10, 8-25, 12-6

Interrupts..............................................................................................................................................4-16, 4-18

INVALIDATE ...................................................................................................................................................C-6

ISE.....................................................................................................................................................13-5, 13-14

Issue ......................................................................................................................................................2-3, 2-12

issues.................................................................................................................................. 2-3, 4-24, 8-12, 13-9

ITE ..........................................................................................................................................13-6, 13-14, 13-20

ITLB .................................................................................................................................2-3, 2-6, 2-16, 9-6, 9-8

IUE..........................................................................................................................................13-5, 13-14, 13-15

IV......................................................................1-1, 1-2, 1-3, 2-16, 3-2, 3-4, 3-19, 6-1, A-82, A-83, A-91, A-141

IXE.....................................................................................................................................................13-5, 13-14

IXIN.................................................................................................................................................................C-6

IXLDT..............................................................................................................................................................C-6

IXLTG..............................................................................................................................................................C-6

IXSDT .............................................................................................................................................................C-6

IXSTG.............................................................................................................................................................C-6

J

J........................... 3-3, 3-17, 9-7, 12-2, A-9, A-17, A-18, A-19, A-22, A-23, A-24, A-25, A-26, A-27, A-30, A-31,

A-32, A-52, A-61, A-62, A-65, A-66, A-73, A-74, A-77, A-78, A-141, B-163, C-41, D-6, D-7,

D-40

JAL.................................................... 3-17, 9-7, 12-2, A-20, A-21, A-28, A-29, A-53, A-141, B-163, C-41, D-40

JALR....................................................................... 3-17, 9-7, 12-2, 12-5, A-20, A-21, A-28, A-29, A-54, A-141

JMPA....................................................................................................................................................12-3, 12-4

JMPB ...................................................................................................................................................12-3, 12-4

JR......................... 3-17, 9-7, 12-2, 12-5, A-17, A-18, A-19, A-22, A-23, A-24, A-25, A-26, A-27, A-30, A-31,

A-32, A-55, A-141, D-6, D-7

JTLB.........................................................................................................................................................9-6, 9-8

K

K0.....................................................................................4-23, 4-24, 4-29, 6-7, 6-12, 9-2, 9-5, 9-10, 9-11, C-28

KB........................ 6-2, 6-5, A-17, A-18, A-19, A-20, A-21, A-22, A-23, A-24, A-25, A-26, A-27, A-28, A-29,

A-30, A-31, A-32

Kernel................... 2-16, 2-19, 3-20, 3-26, 4-16, 4-17, 4-18, 4-29, 5-2, 5-22, 5-23, 6-1, 6-6, 6-7, 6-10, 6-11,

6-12, 6-13, 9-2, 13-5, 13-6, C-1, C-7, C-14, C-15

kseg0 .........................................................................................................................4-24, 6-7, 6-12, 9-10, C-28

kseg1 .....................................................................................................................................................6-7, 6-12

kseg3 ....................................................................................................................2-16, 4-9, 6-1, 6-7, 6-12, 6-13

ksseg......................................................................................................................................................6-7, 6-12

KSU.......................................................4-16, 4-17, 4-18, 5-2, 6-6, 6-8, 6-9, 6-10, 6-11, 6-12, 6-13, C-14, C-15

kuseg .....................................................................................................................................2-16, 6-1, 6-7, 6-12

L

LB...................................................................................................... 3-4, 13-8, A-56, A-141, B-163, C-41, D-40

LBU............................................................................................................ 3-4, A-57, A-141, B-163, C-41, D-40

LD ..............................................................................................3-4, 13-8, A-5, A-58, A-141, B-163, C-41, D-40

LDC1............................................................................ 3-5, 3-21, 3-26, 10-13, A-141, B-163, C-41, D-25, D-40

LDL ..................................................................................3-4, 3-8, A-59, A-60, A-63, A-141, B-163, C-41, D-40

LDR..................................................................................3-4, 3-8, A-59, A-63, A-64, A-141, B-163, C-41, D-40

LH ..........................................................................................3-4, 13-8, A-67, A-141, B-102, B-163, C-41, D-40

LHU............................................................................................................ 3-4, A-68, A-141, B-163, C-41, D-40

li .....................................................................................................................13-14, 13-15, 13-16, 13-18, 13-19

Link ......................................................................................................................................2-11, 3-17, 3-18, 4-4

LL..................................................................................................................1-2, 3-4, A-142, B-165, C-42, D-41

LLD ...............................................................................................................1-2, 3-4, A-142, B-165, C-42, D-41

LO........................ 2-11, 2-14, 3-16, 3-22, 3-23, 3-24, 3-26, 4-1, 4-2, 4-3, 4-4, A-38, A-39, A-40, A-81, A-85,

A-86, A-87, B-2, B-5, B-11, B-13, B-23, B-25, B-66, B-67, B-68, B-70, B-84, B-85, B-86,

B-87, B-91, B-92, B-93, B-95, B-102, B-106, B-111, B-113, B-116, B-117, B-118, B-120,

B-122

LO0..................................................................................................................................4-2, 4-3, 4-4, 6-16, B-2

LO1.......................2-11, 2-14, 4-2, 4-3, 4-4, 6-16, B-2, B-3, B-7, B-8, B-9, B-12, B-14, B-16, B-19, B-24, B-26

LoadMemory...............................A-6, A-56, A-57, A-58, A-60, A-64, A-67, A-68, A-70, A-72, A-76, A-79, B-10

Lock ...............................................................................................................2-17, 4-32, 5-11, C-11, C-12, C-13

Locking.......................................................................................................................................................... 2-17

logical pipe..................................................................................................................................2-10, 2-12, 2-13

LQ.................................................................................... 3-5, 3-25, 13-8, A-141, B-4, B-10, B-163, C-41, D-40

LRF.......................................................................................................4-32, 5-11, C-9, C-10, C-11, C-12, C-13

LUI ..................................................................................................3-14, 3-26, A-69, A-141, B-163, C-41, D-40

LW................................................................................3-4, A-5, A-70, A-141, B-102, B-116, B-163, C-41, D-40

LWC1............................................................................3-5, 3-21, 3-26, 10-13, A-141, B-163, C-41, D-26, D-40

LWC2.......................................................................................................................... A-142, B-165, C-42, D-41

LWL........................................................................ 3-4, 3-8, A-71, A-72, A-75, A-76, A-141, B-163, C-41, D-40

LWR....................................................................... 3-4, 3-8, A-71, A-72, A-75, A-76, A-141, B-163, C-41, D-40

LWU............................................................................................................3-4, A-79, A-141, B-163, C-41, D-40

LZC..............................................................................................................................................2-13, B-4, B-90

M

MAC............................................................................................................................................2-11, 3-16, 3-22

MAC0..........................................................................................................................................2-11, 2-12, 2-13

MAC1..........................................................................................................................................2-11, 2-12, 2-13

MADD ............................................................................................................3-23, 3-26, B-3, B-11, B-13, B-163

MADD1 .........................................................................................2-14, 3-23, 3-26, 4-2, B-3, B-12, B-14, B-163

MADDU...................................................................................................................3-23, 3-26, B-3, B-13, B-163

MADDU1................................................................................................ 2-14, 3-23, 3-26, 4-2, B-3, B-14, B-163

Mask .................... 2-15, 2-19, 3-20, 4-5, 4-10, 4-16, 4-17, 4-27, 5-9, 5-24, 6-15, 13-3, 13-4, 13-7, 13-8, C-20,

C-22, C-24, C-30, C-32, C-34, C-39, C-40

MASK...................................................................................................................................................4-10, 6-16

Maskable................................................................................................................................................5-8, 5-12

MAX..............................................................................................................................................................2-18

MB......................................................................................................................6-2, 6-5, 6-12, 6-13, A-52, A-53

MF0...............................................................................................................................................................C-41

MFBPC ............................................................................................................................3-20, 13-4, C-17, C-41

MFC0................................................................................................................. 3-20, 4-1, 9-3, 13-2, 13-4, C-18

MFC1.............................................................................................................................3-21, 10-13, D-27, D-40

MFDAB ............................................................................................................................3-20, 13-4, C-19, C-41

MFDABM .........................................................................................................................3-20, 13-4, C-20, C-41

MFDVB ............................................................................................................................3-20, 13-4, C-21, C-41

MFDVBM .........................................................................................................................3-20, 13-4, C-22, C-41

MFHI.....................................................................................................................2-11, 3-16, A-80, A-81, A-141

MFHI1.....................................................................................................2-11, 2-14, 3-23, 4-2, B-3, B-15, B-163

MFIAB..............................................................................................................................3-20, 13-4, C-23, C-41

MFIABM...........................................................................................................................3-20, 13-4, C-24, C-41

MFLO..............................................................................................................................3-16, 3-23, A-81, A-141

MFLO1.............................................................................................................2-14, 3-23, 4-2, B-3, B-16, B-163

MFPC..........................................................................................................................3-20, 9-2, 9-3, C-25, C-41

MFPS..........................................................................................................................3-20, 9-2, 9-3, C-26, C-41

MFSA..................................................................................................3-25, A-141, B-5, B-17, B-20, B-21, B-22

MIN ...............................................................................................................................................................2-18

Misaligned.......................................................................................................................................................3-8

misalignment...................................................................................................................................................C-8

mispredicted ............................................................................................................................................9-6, 9-7

Miss................................................................................................................2-17, 4-17, 6-4, 8-8, 9-7, 9-8, 12-6

misses.............................................................................................................................................1-1, 6-17, 9-9

MMI.............................................................................................5-22, A-141, B-163, B-164, B-165, C-41, D-40

MMI0...............................................................................................................................................B-163, B-164

MMI1...............................................................................................................................................B-163, B-164

MMI2...............................................................................................................................................B-163, B-165

MMI3...............................................................................................................................................B-163, B-165

MMU .....................................................................................................................2-3, 2-15, 2-16, 4-5, 6-1, 6-14

mod.........................................................................................................A-38, A-40, B-7, B-9, B-66, B-68, B-70

MOV.....................................................................................................................................................11-6, D-28

MOV.fmt..........................................................................................................................3-21, 10-8, 10-14, D-41

Move1............................................................................................................................................................2-11

MOVN......................................................................................................................................3-19, A-82, A-141

MOVZ.......................................................................................................................................3-19, A-83, A-141

MT0...............................................................................................................................................................C-41

MTBPC ......................................................................................................3-20, 13-4, 13-16, 13-19, C-27, C-41

MTC0................................................................................................................. 3-20, 4-1, 9-3, 13-2, 13-4, C-28

MTC1....................................................................................................................3-21, 3-26, 10-13, D-29, D-40

MTDAB ............................................................................................................................3-20, 13-4, C-29, C-41

MTDABM ......................................................................................................................... 3-20, 13-4, C-30, C-41

MTDVB ............................................................................................................................3-20, 13-4, C-31, C-41

MTDVBM ......................................................................................................................... 3-20, 13-4, C-32, C-41

MTHI...............................................................................................................................2-11, 3-16, A-84, A-141

MTHI1.....................................................................................................2-11, 2-14, 3-23, 4-2, B-3, B-18, B-163

MTIAB..............................................................................................................................3-20, 13-4, C-33, C-41

MTIABM...........................................................................................................................3-20, 13-4, C-34, C-41

MTLO.......................................................................................................................................3-16, A-85, A-141

MTLO1.............................................................................................................2-14, 3-23, 4-2, B-3, B-19, B-163

MTPC..........................................................................................................................3-20, 9-2, 9-3, C-35, C-41

MTPS..........................................................................................................................3-20, 9-2, 9-3, C-36, C-41

MTSA............................................................................................................ 2-13, 3-25, A-141, B-5, B-17, B-20

MTSAB ......................................................................... 2-13, 3-25, A-141, A-142, B-5, B-20, B-21, B-22, B-161

MTSAH ..................................................................................2-13, 3-25, A-141, A-142, B-5, B-20, B-22, B-161

MTSAx..........................................................................................................................................................B-20

MUL .................................................................................................................................................... 2-18, D-30

MUL.fmt ...................................................................................................................................3-21, 10-14, D-41

MULT ......................................................................3-16, 3-23, 3-26, A-80, A-86, A-87, A-141, B-3, B-23, B-25

MULT1 ...........................................................................................2-14, 3-23, 3-26, 4-2, B-3, B-24, B-26, B-163

Multi ................................................................................................................................................................1-2

Multimaster ............................................................................................................................................2-18, 8-2

multimedia.................................................................................................. 1-1, 1-2, 2-3, 2-6, 3-2, 3-4, 3-5, 3-23

Multimedia...........................................................................2-3, 2-14, 3-5, 3-22, 3-23, 3-24, 3-26, 4-2, B-1, B-3

multiply ................. 2-14, 3-2, 3-4, 3-16, 3-22, 3-23, 4-1, 4-2, 4-4, A-8, A-86, A-87, A-125, B-11, B-12, B-13,

B-14, B-23, B-24, B-25, B-26, B-84, B-85, B-86, B-87, B-91, B-92, B-93, B-95, B-111, B-113,

B-118, B-120, B-122, C-16, D-30

Multiply................1-1, 1-2, 2-3, 2-6, 2-9, 2-11, 3-2, 3-14, 3-16, 3-21, 3-22, 3-23, 3-24, 3-26, 4-1, B-1, B-3, B-5

MULTU..................................................................................................3-16, 3-23, 3-26, A-87, A-141, B-3, B-25

MULTU1.................................................................................................. 2-14, 3-23, 3-26, 4-2, B-3, B-26, B-163

N

NaN..................................................................................................... 10-11, 11-6, D-8, D-10, D-11, D-12, D-13

NaNs.............................................................................................................................................................2-18

NBE............................................................................................................................................ 4-23, 5-11, C-28

NEG........................................................................................................................................... 2-18, 11-6, D-31

NEG.fmt...................................................................................................................................3-21, 10-14, D-41

Negate ..............................................................................................................3-21, 8-3, D-2, D-31, D-32, D-33

NMI ..............................4-17, 4-18, 4-19, 4-33, 5-2, 5-5, 5-7, 5-8, 5-9, 5-10, 5-12, 8-10, 8-13, 9-11, 12-6, C-14

nonmaskable ................................................................................................................................................4-33

NOR.....................................................................................................3-15, 3-25, A-3, A-88, A-141, B-4, B-124

Normalization..................................................................................................................................................2-9

NOT ...............................................................................................................6-2, 13-8, 13-20, A-3, A-88, B-124

NotWordValue...... A-11, A-12, A-13, A-14, A-38, A-40, A-86, A-87, A-110, A-111, A-112, A-113, A-114, A-115,

B-7, B-9, B-11, B-12, B-13, B-14, B-23, B-24, B-25, B-26, B-68, B-70, B-93, B-95, B-113,

B-120, B-122

NullifyCurrentInstruction ............................................A-8, A-18, A-21, A-22, A-24, A-26, A-29, A-30, A-32, C-5

O

Offset ....................................................................6-4, 6-5, A-62, A-66, A-74, A-78, A-98, A-102, A-120, A-124

opcode...........................................................................................................................2-16, 3-9, 5-22, 6-1, A-2

OpCode................ 3-23, 3-24, 3-25, 6-20, 9-3, A-141, A-142, B-163, B-164, B-165, C-6, C-25, C-26, C-35,

C-36, C-41, C-42, D-40, D-41

operand.................................................................1-2, 3-14, 3-22, 3-23, A-104, B-1, B-3, D-1, D-4, D-31, D-35

Operand.......................................................................................................................2-4, 3-14, 3-15, 3-23, B-3

OR.....................2-9, 3-14, 3-15, 3-25, A-3, A-88, A-89, A-90, A-139, A-140, A-141, B-4, B-124, B-125, B-160

ORI............................................................................................................3-14, A-90, A-141, B-163, C-41, D-40

Ov .................................................................................................................................................4-20, 5-8, 5-26

Overflow............... 2-9, 4-30, 5-2, 5-8, 5-26, A-11, A-12, A-13, A-14, A-34, A-35, A-36, A-37, A-50, A-51, A-106,

A-107, A-108, A-109, A-114, B-31, B-35, B-37, B-39, B-42, B-44, B-144, B-148, B-150

OVERFLOW ................................................................................................................................................... 5-5

OVFL.......................................................................................................................... 4-28, 4-30, 9-2, 9-10, 9-11

P

P0EXEA...............................................................................................................................................12-3, 12-4

P0EXEB...............................................................................................................................................12-3, 12-4

P1EXEA...............................................................................................................................................12-3, 12-4

P1EXEB...............................................................................................................................................12-3, 12-4

PA ......................................................................................................................C-6, C-7, C-9, C-10, C-11, C-12

PABSH.............................................................................................................................3-24, B-4, B-27, B-164

PABSW ............................................................................................................................3-24, B-4, B-28, B-164

PADDB.............................................................................................................................3-24, B-3, B-29, B-164

PADDH.............................................................................................................................3-24, B-3, B-30, B-164

PADDSB ..........................................................................................................................3-24, B-3, B-31, B-164

PADDSH ..........................................................................................................................3-24, B-3, B-35, B-164

PADDSW .........................................................................................................................3-24, B-3, B-37, B-164

PADDUB ..........................................................................................................................3-24, B-3, B-39, B-164

PADDUH..........................................................................................................................3-24, B-3, B-42, B-164

PADDUW .........................................................................................................................3-24, B-3, B-44, B-164

PADDW............................................................................................................................3-24, B-3, B-46, B-164

PADSBH ..........................................................................................................................3-24, B-3, B- 47, B-164

Page....................................................................................................................2-16, 4-8, 4-10, 6-16, 6-17, 9-7

PageMask........................................................................... 2-15, 4-5, 4-10, 6-14, 6-15, 6-16, C-38, C-39, C-40

PAND ...............................................................................................................................3-25, B-4, B-48, B-165

PC........................ 1-2, 2-3, 2-6, 2-19, 3-16, 3-17, 3-18, 4-1, 4-3, 4-4, 5-12, 9-10, 12-1, 12-2, 12-3, 12-5, 12-7,

12-8, 12-9, 12-10, 12-11, 12-12, 12-13, 12-14, 12-15, 12-16, 12-17, 12-18, 12-19, 12-20,

13-7, A-4, A-9, A-17, A-18, A-19, A-20, A-21, A-22, A-23, A-24, A-25, A-26, A-27, A-28,

A-29, A-30, A-31, A-32, A-52, A-53, A-54, A-55, C-2, C-3, C-4, C-5, C-16, D-6, D-7

PC tracing........................................................................................................................... 1-2, 2-19, 12-1, 12-3

PCEQB ............................................................................................................................3-25, B-4, B-49, B-164

PCEQH............................................................................................................................3-25, B-4, B-52, B-164

PCEQW ...........................................................................................................................3-25, B-4, B-54, B-164

PCGTB.............................................................................................................................3-25, B-4, B-56, B-164

PCGTH ............................................................................................................................3-25, B-4, B-59, B-164

PCGTW ........................................................................................................................... 3-25, B-4, B-61, B-164

PCPYH.............................................................................................................................3-25, B-5, B-63, B-165

PCPYLD...........................................................................................................................3-25, B-5, B-64, B-165

PCPYUD..........................................................................................................................3-25, B-5, B-65, B-165

PDIVBW........................................................................................................3-24, B-5, B-66, B-69, B-71, B-165

PDIVUW .......................................................................................................................... 3- 24, B-5, B- 68, B-165

PDIVW.............................................................................................................................3-24, B-5, B- 70, B-16 5

Perf ........................................................................................................................................................2-15, 4-5

PerfC.............................................................................................................................................4-19, 5-8, 5-13

Performance........ 1-2, 2-1, 2-15, 2-19, 3-20, 4-5, 4-17, 4-19, 4-28, 4-29, 4-30, 5-2, 5-5, 5-7, 5-8, 5-9, 5-10,

5-11, 5-13, 9-1, 9-2, 9-3, 9-4, 9-10, 12-6, C-25, C-26, C-35, C-36

performance monitor.....................................................................................................................................3-20

PEXCH............................................................................................................................. 3-25, B-5, B-72, B-165

PEXCW............................................................................................................................ 3-25, B-5, B-73, B-165

PEXEH............................................................................................................................. 3-25, B-5, B-74, B-165

PEXEW............................................................................................................................3-25, B-5, B-75, B-165

PEXT5..............................................................................................................................3-25, B-5, B-76, B-164

PEXTLB...........................................................................................................................3-25, B-5, B-78, B-164

PEXTLH...........................................................................................................................3-25, B-5, B-79, B-164

PEXTLW ..........................................................................................................................3-25, B-5, B-80, B-164

PEXTUB...........................................................................................................................3-25, B-5, B-81, B-164

PEXTUH ..........................................................................................................................3-25, B-5, B-82, B-164

PEXTUW .........................................................................................................................3-25, B-5, B-83, B-164

PFN...................................................................................... 2-15, 4-5, 4-8, 6-16, C-10, C-11, C-12, C-39, C-40

PHMADH .........................................................................................................................3- 24, B-5, B- 84, B- 16 5

PHMSBH..........................................................................................................................3- 24, B-5, B- 86, B- 16 5

Physical................................................................2-10, 2-15, 2-16, 4-5, 4-25, 6-3, 6-4, 6-18, A-4, A-6, A-7, C-7

PINTEH............................................................................................................................ 3-25, B-5, B-88, B-165

PINTH..............................................................................................................................3-25, B-5, B-89, B-165

PLZCW ............................................................................................................................3-25, B-4, B-90, B-163

PMADDH ............................................ 3-24, B-5, B-91, B-94, B-96, B-112, B-114, B-119, B-121, B-123, B-165

PMADDUW......................................................................................................................3-24, B-5, B-93, B-165

PMADDW ........................................................................................................................3-24, B-5, B-95, B-165

PMAXH............................................................................................................................3-24, B-4, B-97, B-164

PMAXW ...........................................................................................................................3-24, B-4, B-99, B-164

PMFHI............................................................................................................................3-24, B-5, B-101, B-165

PMFHL...........................................................................................................................3-24, B-5, B-102, B-163

PMFLO...........................................................................................................................3-24, B-5, B-106, B-165

PMINH ...........................................................................................................................3-24, B-4, B-107, B-164

PMINW ..........................................................................................................................3-24, B-4, B-109, B-164

PMSUBH.........................................................................................................................3-24, B-5, B-111, B-165

PMSUBW........................................................................................................................3-24, B-5, B-113, B-165

PMTHI.............................................................................................................................3-24, B-5, B-115, B-165

PMTHL............................................................................................................................3-24, B-5, B-116, B-163

PMTLO............................................................................................................................3-24, B-5, B-117, B-165

PMULTH .........................................................................................................................3-24, B-5, B-118, B-165

PMULTUW..................................................................................................................... 3-24, B-5, B-120, B-165

PMULTW .......................................................................................................................3-24, B-5, B-122 , B-1 65

PNOR.............................................................................................................................3- 25, B-4, B- 124 , B-1 65

pointer....................................................................................................................................................4-9, A-92

POR...............................................................................................................................3-25, B-4, B-125, B-165

PPAC5 ...........................................................................................................................3-25, B-5, B-126, B-164

PPACB...........................................................................................................................3-25, B-5, B-128, B-164

PPACH........................................................................................................................... 3-25, B-5, B-129, B-164

PPACW..........................................................................................................................3-25, B-5, B-130, B-164

precise ............................................................................................................................................................9-4

prediction .................................................................................................................................1-2, 2-3, 4-23, 9-7

Prediction......................................................................................................................................................4-23

PREF .......................................................................................3-19, 4-23, A-2, A-91, A-141, B-163, C-41, D-40

prefetch......................................................................................................................................5-19, A-91, A-92

Prefetch.........................................................................................1-1, 1-2, 2-11, 2-17, 3-19, 8-8, 9-7, A-7, A-92

Prefix............................................................................................................................................................... 8-3

PREVH...........................................................................................................................3-25, B-5, B-131, B-165

PRId..............................................................................................................................................2-15, 4-5, 4-22

priorities ........................................................................................................................................................12-7

privilege.......................................................................................................................................... 9-5, 9-11, C-8

privilege mode .......................................................................................................................................9-5, 9-11

Probe .........................................................................................................................3-20, 4-6, 4-14, 5-17, 6-20

PROT3W .......................................................................................................................3-25, B-5, B-132, B-165

Pseudo...................................................................................................................................................2-15, 4-5

pseudocode ..............................................................................................A-1, A-2, A-3, A-4, A-6, A-8, B-2, D-2

Pseudocode.....................................................................................................................A-3, A-4, A-6, B-2, D-2

PSLLH............................................................................................................................ 3-25, B-4, B-133, B-163

PSLLVW ........................................................................................................................ 3-25, B-4, B-134, B-165

PSLLW...........................................................................................................................3-25, B-4, B-135, B-163

PSRAH...........................................................................................................................3-25, B-4, B-136, B-163

PSRAVW .......................................................................................................................3-25, B-4, B-137, B-165

PSRAW..........................................................................................................................3-25, B-4, B-138, B-163

PSRLH...........................................................................................................................3-25, B-4, B-139, B-163

PSRLVW........................................................................................................................ 3-25, B-4, B-140, B-165

PSRLW ..........................................................................................................................3-25, B-4, B-141, B-163

PSUBB........................................................................................................................... 3-24, B-3, B-142, B-164

PSUBH...........................................................................................................................3-24, B-3, B-143, B-164

PSUBSB ........................................................................................................................ 3-24, B-3, B-144, B-164

PSUBSH........................................................................................................................ 3-24, B-3, B-148, B-164

PSUBSW .......................................................................................................................3-24, B-3, B-150, B-164

PSUBUB........................................................................................................................ 3-24, B-3, B-152, B-164

PSUBUH........................................................................................................................ 3-24, B-3, B-155, B-164

PSUBUW....................................................................................................................... 3-24, B-3, B-157 , B-1 64

PSUBW..........................................................................................................................3-24, B-3, B-159 , B-1 64

PTagLo.................................................................................................................................................4-31, 4-32

PTE.................................................................................................................................................2-15, 4-5, 4-9

PTEBase.........................................................................................................................................................4-9

PTEs...............................................................................................................................................................4-9

PXOR............................................................................................................................. 3-25, B-4, B-160, B-165

Q

QFSRV.............................................................................................. 3-25, B-5, B-20, B-21, B-22, B-161, B-164

qNaN..............................................................................................................................................................11-6

Quadword ...................................................................................... 1-2, 3-5, 3-8, 3-10, 3-12, 3-25, 8-9, B-4, B-5

QUADWORD.............................................................................................................................A-7, B-10, B-162

Quintibyte.............................................................................................................................................3-10, 3-12

quotient......................................................................................................................... 4-4, A-38, A-40, B-7, B-9

R

R10000 ...........................................................................................................................................................1-3

R4000 ......................................................................................................................................................1-3, 6-2

random...................................................................................................................................2-15, 4-5, 4-11, 6-2

Random ................................................................2-15, 3-20, 4-5, 4-7, 4-11, 4-14, 5-11, 5-16, 5-17, 6-20, C-40

Random5 ......................................................................................................................................................C-40

Refill..................... 2-3, 2-17, 4-12, 4-14, 5-2, 5-7, 5-9, 5-16, 8-8, A-56, A-57, A-58, A-62, A-66, A-67, A-68,

A-70, A-74, A-78, A-79, A-93, A-94, A-98, A-102, A-103, A-116, A-120, A-124, B-10, B-162,

C-7, C-8, D-26, D-37

REGIMM................................................................................................ 5-22, A-141, A-142, B-163, C-41, D-40

register.............................................................................................................10-2, 10-6, 11-2, 11-3, 11-8, 11-9

Register................ 2-5, 2-6, 2-8, 2-15, 3-14, 3-15, 3-17, 3-20, 3-25, 4-3, 4-4, 4-5, 4-6, 4-7, 4-8, 4-9, 4-10, 4-11,

4-12, 4-13, 4-14, 4-15, 4-16, 4-17, 4-18, 4-19, 4-21, 4-22, 4-23, 4-25, 4-26, 4-27, 4-28,

4-29, 4-30, 4-32, 4-33, 5-8, 6-9, 6-10, 6-12, 6-16, 8-25, 9-2, 9-3, 9-4, 9-10, 10-7, 10-8, 10-9,

13-2, 13-3, 13-4, 13-5, 13-7, 13-8, 13-9, A-3, A-4, A-5, A-9, A-54, B-3, B-5, B-161

registers........................................................................................................................................................10-4

Registers.......2-1, 2-3, 2-14, 2-15, 3-17, 4-1, 4-2, 4-3, 4-4, 4-5, 4-8, 4-26, 4-28, 4-31, 6-14, 9-2, 9-3, 9-4, 13-3

REL.............................................................................................................................................8-11, 8-14, 8-15

Request...........................................................................................................................................................9-9

Res.........................................................................................................................................................4-19, 5-8

Reset.........................................................4-18, 4-19, 5-1, 5-2, 5-7, 5-8, 5-9, 5-10, 5-11, 8-11, 9-4, 12-6, 13-14

RESET...............................................................................................................................5-11, 5-12, 8-11, 8-14

RI .................................................................................................................................2-16, 4-20, 5-8, 5-22, 6-1

Root ..............................................................................................................................................................3-21

Rotate ....................................................................................................................................................3-25, B-5

ROUND.L......................................................................................................................................................D-32

ROUND.L.fmt...........................................................................................................................3-21, 10-14, D-41

ROUND.W ....................................................................................................................................................D-33

ROUND.W.fmt .........................................................................................................................3-21, 10-14, D-41

RSQRT ................................................................................................................................................2-18, 3-26

S

S0...........................................................................................................................................4-29, 9-2, 9-5, 9-11

S1..................................................................................................................................................4-29, 9-5, 9-11

sa......................... 3-3, A-41, A-42, A-44, A-45, A-47, A-48, A-104, A-110, A-112, B-133, B-135, B-136, B-138,

B-139, B-141

SA......................................2-3, 2-11, 2-12, 2-13, 2-14, 3-25, 4-1, 4-2, 4-3, 4-4, B-17, B-20, B-21, B-22, B-161

Saturate ................................. B-34, B-36, B-38, B-41, B-43, B-45, B-147, B-149, B-151, B-154, B-156, B-158

saturation........................B-3, B-31, B-35, B-37, B-39, B-42, B-44, B-144, B-148, B-150, B-152, B-155, B-157

Saturation............................................................................................................................................... 3-24, B-3

SB.............................................................................................................. 3-4, A-93, A-141, B-163, C-41, D-40

SC.................................................................................................................1-2, 3-4, A-142, B-165, C-42, D-41

SCD ..............................................................................................................1-2, 3-4, A-142, B-165, C-42, D-41

SD..............................................................................................3-4, 13-8, A-5, A-94, A-141, B-163, C-41, D-40

SDC1 .....................................................................................3-5, 3-21, 10-13, A-141, B-163, C-41, D-34, D-40

SDL..................................................................................3-4, 3-8, A-95, A-96, A-99, A-141, B-163, C-41, D-40

SDR ...............................................................................3-4, 3-8, A-95, A-99, A-100, A-141, B-163, C-41, D-40

segment..................................................................................................................2-16, 4-9, 6-1, 6-8, 6-9, 13-9

Segment........................................................................................................................................6-9, 6-10, 6-12

Semaphore .....................................................................................................................................................3-4

Septibyte..............................................................................................................................................3-10, 3-12

Serialization .................................................................................................................................................. 3-19

Sextibyte..............................................................................................................................................3-10, 3-12

SH.................................................................................................3-4, A-103, A-141, B-102, B-163, C-41, D-40

Shift..................................................................................... 2-3, 2-11, 3-14, 3-15, 3-25, 3-26, 4-2, 4-4, B-4, B-5

Shifter..............................................................................................................................................................2-3

shutdown.........................................................................................................................................................6-2

sign ...................... 2-7, 2-9, 2-16, 3-4, 3-16, 3-17, 6-1, 6-3, 10-10, 10-11, 10-12, 13-8, A-11, A-12, A-13, A-14,

A-17, A-18, A-19, A-20, A-21, A-22, A-23, A-24, A-25, A-26, A-27, A-28, A-29, A-30, A-31,

A-32, A-35, A-36, A-38, A-39, A-40, A-44, A-45, A-46, A-56, A-57, A-58, A-60, A-64, A-67,

A-68, A-69, A-70, A-71, A-72, A-74, A-75, A-76, A-78, A-79, A-86, A-87, A-92, A-93, A-94,

A-96, A-99, A-100, A-103, A-104, A-105, A-107, A-108, A-110, A-111, A-112, A-113, A-114,

A-115, A-116, A-117, A-118, A-121, A-122, A-128, A-130, A-131, A-134, A-135, A-138,

B-7, B-9, B-10, B-11, B-12, B-13, B-14, B-23, B-24, B-25, B-26, B-68, B-70, B-93, B-95,

B-113, B-120, B-122, B-136, B-137, B-138, B-140, B-162, C-2, C-3, C-4, C-5, C-6, D-2,

D-14, D-27, D-31

Sign.............................................................................................................................................................10-10

sign_extend.......... A-11, A-12, A-13, A-14, A-17, A-18, A-19, A-20, A-21, A-22, A-23, A-24, A-25, A-26, A-27,

A-28, A-29, A-30, A-31, A-32, A-35, A-36, A-38, A-40, A-56, A-57, A-58, A-60, A-64, A-67,

A-68, A-69, A-70, A-72, A-76, A-79, A-92, A-93, A-94, A-96, A-100, A-103, A-104, A-105,

A-107, A-108, A-110, A-111, A-112, A-113, A-114, A-115, A-116, A-118, A-122, A-128,

A-130, A-131, A-134, A-135, A-138, B-10, B-162, C-2, C-3, C-4, C-5, D-14, D-27

Signal............................................................................................................................................... 8-3, 8-7, A-8

SignalException... A-8, A-11, A-12, A-33, A-34, A-35, A-50, A-58, A-67, A-68, A-70, A-79, A-94, A-103, A-114,

A-116, A-126, A-127, A-128, A-129, A-130, A-131, A-132, A-133, A-134, A-135, A-136,

A-137, A-138

SIO........................................4-17, 4-18, 4-19, 4-33, 5-2, 5-5, 5-7, 5-8, 5-9, 5-10, 5-25, 8-10, 12-6, 13-8, C-14

SIOINT..........................................................................................................................................................8-10

SIOP ....................................................................................................................................................4-19, 5-25

sll.................................................12-10, 12-11, 12-12, 12-13, 12-14, 12-15, 12-16, 12-17, 12-18, 12-19, 12-20

SLL......................................................................................................................3-15, A-74, A-78, A-104, A-141

SLLV ...................................................................................................................3-15, A-74, A-78, A-105, A-141

SLT......................................................................................................................3-15, A-82, A-83, A-106, A-141

SLTI.....................................................................................3-14, A-82, A-83, A-107, A-141, B-163, C-41, D-40

SLTIU..................................................................................3-14, A-82, A-83, A-108, A-141, B-163, C-41, D-40

SLTU...................................................................................................................3-15, A-82, A-83, A-109, A-141

SLW ............................................................................................................................................................B-102

Snooping.......................................................................................................................................................2-17

SPECIAL.................................................................................................... 5-22, A-9, A-141, B-163, C-41, D-40

SQ.................................................................................. 3-5, 3-25, 13-8, A-141, B-4, B-162, B-163, C-41, D-40

SQRT.........................................................................................................................................2-18, 3-26, D-35

SQRT.fmt .................................................................................................................................3-21, 10-14, D-41

Square ..........................................................................................................................................................3-21

SquareRoot...................................................................................................................................................D-35

SR..........................................................................................................................................................1-5, 4-16

SRA........................................................................................................................................3-15, A-110, A-141

SRAV ..................................................................................................................................... 3-15, A-111, A-141

SRL........................................................................................................................................3-15, A-112, A-141

SRLV......................................................................................................................................3-15, A-113, A-141

sseg .......................................................................................................................................................6-7, 6-10

State.........................................................................................................................................................6-6, 9-4

Status................... 1-5, 2-15, 3-5, 3-20, 3-21, 4-5, 4-16, 4-17, 4-18, 4-21, 4-25, 4-29, 5-2, 5-5, 5-7, 5-9, 5-11,

5-12, 5-13, 5-14, 5-16, 5-19, 5-23, 5-24, 5-25, 6-2, 6-6, 6-8, 6-9, 6-10, 6-11, 6-12, 6-13,

8-25, 10-2, 10-4, 10-7, 10-8, 10-9, 11-2, 11-8, 11-9, 12-3, 12-4, 13-4, C-1, C-7, C-9, C-13,

C-14, C-15, C-16

STATUS............................................................................................................ 9-2, 9-10, 9-11, 12-6, 13-5, 13-6

steering..................................................................................................................................................2-6, 4-31

SteeringBits ..................................................................................................................................................C-10

stepping .............................................................................................................1-2, 9-8, 9-10, B-20, B-21, B-22

StoreFPR............. D-2, D-4, D-5, D-12, D-13, D-16, D-17, D-18, D-19, D-20, D-23, D-24, D-28, D-30, D-31,

D-32, D-33, D-35, D-36, D-38, D-39

StoreMemory............................................... A-7, A-93, A-94, A-96, A-100, A-103, A-116, A-118, A-122, B-162

SUB............................................................................................................2-18, 3-15, 5-26, A-114, A-141, D-36

SUB.fmt ...................................................................................................................................3-21, 10-14, D-41

Subroutine.....................................................................................................................................................3-17

Subsequent............................................................................................................................................2-4, 6-17

Subtract....................................................................................................................... 3-15, 3-21, 3-24, B-3, B-5

SUBU..........................................................................................................................3-15, A-114, A-115, A-141

supervisor ............................................................................................4-18, 5-15, 6-10, 6-12, 9-11, 13-5, 13-14

Supervisor............ 2-16, 2-19, 4-17, 4-18, 4-29, 5-2, 5-15, 5-22, 5-23, 6-6, 6-7, 6-10, 6-12, 9-2, 13-5, 13-6,

C-1, C-14, C-15

SUPERVISOR ................................................................................................................................................ 9-5

suseg .....................................................................................................................................................6-7, 6-10

SW ....................................................................................................3-4, A-5, A-116, A-141, B-163, C-41, D-40

SWC1............................................................................3-5, 3-21, 10-13, 13-2, A-141, B-163, C-41, D-37, D-40

SWC2.......................................................................................................................... A-142, B-165, C-42, D-41

SWL........................................................................... 3-4, 3-8, A-117, A-118, A-121, A-141, B-163, C-41, D-40

SWR...........................................................................3-4, 3-8, A-117, A-121, A-122, A-141, B-163, C-41, D-40

SYNC................... 2-11, 2-12, 2-13, 3-19, 5-24, 6-17, 13-9, 13-16, 13-18, 13-20, A-125, A-141, C-13, C-27,

C-28, C-29, C-30, C-31, C-32, C-33, C-34, C-35, C-36, C-38, C-39, C-40

Synchronization...................................................................................................................................2-11, 3-19

Sys................................................................................................................................................4-20, 5-8, 5-20

SYS.................................................................................................................................................................8-3

SYSAACK..........................................8-3, 8-9, 8-12, 8-13, 8-14, 8-16, 8-19, 8-22, 8-25, 8-26, 8-27, 8-28, 8-29

SYSADDR................................................................................................................................................8-3, 8-7

SYSASTART................................................................................................8-3, 8-7, 8-9, 8-12, 8-13, 8-16, 8-19

SYSBE.....................................................................................................................................................8-3, 8-7

Syscall......................................................................................................................................4-20, 5-2, 5-8, 5-9

SYSCALL..............................................................................2-11, 3-18, 4-4, 5-10, 5-20, 9-7, 9-8, A-126, A-141

SYSDACK............................ 8-3, 8-10, 8-12, 8-13, 8-16, 8-17, 8-19, 8-20, 8-22, 8-25, 8-26, 8-27, 8-28, A-125

SYSDATA................................................................................................................8-3, 8-6, 8-7, 8-9, 8-16, 8-17

SYSDSTART.........................................................................8-3, 8-10, 8-12, 8-13, 8-16, 8-17, 8-19, 8-20, 8-25

SYSRD............................................................................................................................................................8-3

SYSTSIZE...........................................................................................................8-3, 8-9, 8-12, 8-13, 8-16, 8-19

SYSWR...........................................................................................................................................................8-3

T

Tag..................................................................................................... 2-6, 2-7, 2-15, 4-5, C-9, C-11, C-12, C-13

TAG.................................................................................................................................................................C-6

TagHi................................................................................................................................... 2-15, 4-5, 4-31, 4-32

TagHI...................................................................................................................................................C-10, C-11

TagLo.................................................................................................................................. 2-15, 4-5, 4-31, 4-32

TagLO ............................................................................................................................... C-9, C-10, C-11, C-12

tags..............................................................................................................................................4-31, C-9, C-12

TargetAddress.....................................................................................................................................C-10, C-11

TEQ....................................................................................................................... 3-18, 5-27, 9-8, A-127, A-141

TEQI...................................................................................................................... 3-18, 5-27, 9-8, A-128, A-142

TGE............................................................................................................................... 3-18, 5-27, A-129, A-141

TGEI.............................................................................................................................. 3-18, 5-27, A-130, A-142

TGEIU...........................................................................................................................3-18, 5-27, A-131, A-142

TGEU............................................................................................................................ 3-18, 5-27, A-132, A-141

timer............................................................................................................................................4-13, 4-15, 4-16

TLB ...................... 1-2, 2-3, 2-6, 2-7, 2-15, 2-16, 3-20, 4-5, 4-6, 4-7, 4-8, 4-9, 4-10, 4-11, 4-12, 4-14, 4-17,

4-20, 4-29, 5-2, 5-7, 5-8, 5-9, 5-10, 5-11, 5-12, 5-16, 5-17, 5-18, 6-1, 6-2, 6-3, 6-4, 6-7,

6-8, 6-9, 6-12, 6-14, 6-15, 6-16, 6-17, 6-18, 6-19, 6-20, 12-6, A-6, A-56, A-57, A-58, A-62,

A-66, A-67, A-68, A-70, A-74, A-78, A-79, A-92, A-93, A-94, A-98, A-102, A-103, A-116,

A-120, A-124, B-10, B-162, C-6, C-7, C-8, C-28, C-37, C-38, C-39, C-40, D-26, D-37

TLBEnteries..................................................................................................................................................C-37

TLBL ............................................................................................................................4-8, 4-20, 5-8, 5-16, 5-17

TLBP...............................................................................................3-20, 4-6, 5-17, 5-18, 6-2, 6-20, C-37, C-42

TLBR................................................................................................................2-13, 3-20, 4-6, 6-20, C-38, C-42

TLBS............................................................................................................................4-8, 4-20, 5-8, 5-16, 5-17

TLBWI...................................................................................2-13, 3-20, 4-6, 4-8, 6-20, C-28, C-38, C-39, C-42

TLBWR .................................................................................2-13, 3-20, 4-7, 4-8, 6-20, C-28, C-38, C-40, C-42

TLT................................................................................................................................ 3-18, 5-27, A-133, A-141

TLTI...............................................................................................................................3-18, 5-27, A-134, A-142

TLTIU............................................................................................................................ 3-18, 5-27, A-135, A-142

TLTU............................................................................................................................. 3-18, 5-27, A-136, A-141

TNE............................................................................................................................... 3-18, 5-27, A-137, A-141

TNEI..............................................................................................................................3-18, 5-27, A-138, A-142

TPC................................................................................................................................... 12-3, 12-5, 12-6, 12-7

TPCE ..........................................................................................................................................12-3, 12-5, 12-6

Trace...........................................................................................................................................12-1, 12-2, 12-3

transaction ................................................................................................................. 8-8, 8-10, 8-12, 8-14, 8-22

Translation............................................................................................. 2-3, 6-2, 6-3, 6-4, 6-5, 6-18, 6-19, 6-20

translations .................................................................................................................................... 4-9, 6-1, A-92

Trap...................... 2-11, 3-18, 4-20, 5-2, 5-8, 5-9, 5-10, 5-27, 9-8, A-127, A-128, A-129, A-130, A-131, A-132,

A-133, A-134, A-135, A-136, A-137, A-138

TRAP ..............................................................................................................................................4-4, 5-27, 9-7

TRIG ..................................................................................................................................................13-9, 13-20

Trigger..................................................................................................................................................2-19, 13-6

Triplebyte.............................................................................................................................................3-10, 3-12

TRUNC.L. .....................................................................................................................................................D-38

TRUNC.L.fmt...........................................................................................................................3-21, 10-14, D-41

TRUNC.W.....................................................................................................................................................D-39

TRUNC.W.fmt..........................................................................................................................3-21, 10-14, D-41

U

U0 ..........................................................................................................................................4-29, 9-2, 9-5, 9-11

U1 .................................................................................................................................................4-29, 9-5, 9-11

UCA ................................................................................................................................................................9-7

UCAB...............................................................................................................................2-4, 2-6, 2-7, 6-17, 9-9

unaligned ...........................................3-8, 13-8, A-59, A-63, A-71, A-74, A-75, A-78, A-95, A-99, A-117, A-121

uncached ............. 1-1, 2-4, 5-11, 5-12, 6-12, 6-16, 6-17, 8-12, 9-8, 9-9, 9-10, A-6, A-8, A-56, A-57, A-58, A-60,

A-64, A-67, A-68, A-70, A-72, A-76, A-79, A-91, A-92, A-93, A-94, A-96, A-100, A-103,

A-116, A-118, A-122, A-125, B-10, B-162, C-6, C-7

Uncached.............................................................................2-4, 4-8, 4-24, 6-7, 6-17, 6-20, 8-8, 8-12, 9-7, 9-10

UndefinedResult .. A-8, A-11, A-12, A-13, A-14, A-38, A-40, A-86, A-87, A-110, A-111, A-112, A-113, A-114,

A-115, B-7, B-9, B-11, B-12, B-13, B-14, B-23, B-24, B-25, B-26, B-68, B-70, B-93, B-95,

B-113, B-120, B-122

underflow ............. 2-9, B-29, B-30, B-31, B-35, B-37, B-46, B-47, B-142, B-143, B-144, B-148, B-150, B-152,

B-155, B-157, B-159

Underflow............................................................ B-31, B-35, B-37, B-144, B-148, B-150, B-152, B-155, B-157

UNIX ............................................................................................................................................A-39, B-8, B-67

unmapped...................................................5-11, 5-12, 6-7, 6-12, 9-8, 9-10, 13-9, A-6, C-28, C-38, C-39, C-40

Unmapped ...................................................................................................................................................... 6-7

Unsigned.......................................................................3-4, 3-14, 3-15, 3-16, 3-18, 3-23, 3-24, B-3, B-5, B-158

useg..................................................................................................................................................6-7, 6-8, 6-9

UW..............................................................................................................................................................B-102

V

VA ..............................................................................................................C-6, C-7, C-8, C-9, C-10, C-11, C-12

VALID..............................................................................................................................................................C-9

VALUE ..........................................................................................................................................4-28, 4-30, 9-2

Value FPR.....................................................................................................................................................D-10

ValueFPR..........................................................................................................................D-4, D-12, D-13, D-16

VAX.................................................................................................................................................................3-6

VPN..........................................................................................................................................4-9, 5-15, 6-4, 6-5

VPN2................................................................................................................................4-14, 6-16, C-39, C-40

W

WBB...............................................................................................................................2-4, 4-29, 8-15, 9-6, 9-9

Wide...................................................................................................................................2-10, 2-11, 2-12, 2-13

wired .............................................................................................................................................2-15, 4-5, 4-11

Wired.............................................................................................................................2-15, 4-5, 4-7, 4-11, 5-11

WORD ................................................................................................................. A-7, A-70, A-79, A-116, A-122

writeback.......................................................................................................................................................A-91

Writeback........................................................................................................... 2-4, C-7, C-8, C-11, C-12, C-13

WRITEBACK.........................................................................................................................................C-6, C-13

X

XOR.......................................................................................3-15, 3-25, A-3, A-139, A-140, A-141, B-4, B-160

XORI...................................................................................................... 3-14, A-140, A-141, B-163, C-41, D-40

Appendix A CPU Instruction Set Details

A-1

A. CPU Instruction Set Details

This appendix provides a detailed description of the operation of each instruction. The

instructions are listed in alphabetical order.

Exceptions that may occur due to the execution of each instruction are listed after the

description of each instruction. Descriptions of the immediate cause and manner of

handling exceptions are omitted from the instruction descriptions in this appendix.

Descriptions use a pseudocode notation explained in Section A.2.

For an overview of the instruction set, refer to Chapter 3 of the User’s Manual.

A.1 Description of an Instruction

Each instruction description contains several sections that contain specific information

about the instruction. The following sections describe the contents of each section in detail.

A.1.1 Instruction Mnemonic and Name

The instruction mnemonic and name are printed as page headings for each page in the

instruction description.

A.1.2 Instruction Encoding Picture

The instruction word encoding is shown in pictorial form at the top of the instruction

description. The picture shows the values of all constant fields and the opcode names for

opcode fields in upper-case. It labels all variable fields with lower-case names that are

used in the instruction description. Fields that contain zeroes but are not named are

unused fields that are required to be zero.
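As a rough illustration of how an encoding picture maps onto the bits of an instruction word, the following Python sketch splits a 32-bit word into named fields and checks the constant and must-be-zero fields for SLL. The field positions assume the conventional MIPS register-format layout (opcode, rs, rt, rd, sa, funct), which this section does not itself show; the names FIELDS, decode_rtype, and is_valid_sll are hypothetical and chosen only for this example.

```python
# Assumed layout: conventional MIPS register-format fields (not taken from this section).
FIELDS = {            # name: (shift, width)
    "opcode": (26, 6),
    "rs":     (21, 5),
    "rt":     (16, 5),
    "rd":     (11, 5),
    "sa":     (6, 5),
    "funct":  (0, 6),
}

def decode_rtype(word):
    """Split a 32-bit instruction word into its named fields."""
    return {name: (word >> shift) & ((1 << width) - 1)
            for name, (shift, width) in FIELDS.items()}

def is_valid_sll(word):
    """Check SLL's constant fields (opcode = SPECIAL = 0, funct = 0)
    and the unnamed field in the rs position, which must be zero."""
    f = decode_rtype(word)
    return f["opcode"] == 0 and f["funct"] == 0 and f["rs"] == 0

word = 0x00031100     # SLL with rt = 3, rd = 2, sa = 4
fields = decode_rtype(word)
```

Here rt, rd, and sa are the variable (lower-case) fields, while a nonzero value in the unnamed rs position would make the word an invalid SLL encoding under these assumptions.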

A.1.3 Format

The assembler formats for the instruction and the architecture level at which the

instruction was originally defined are shown.

A.1.4 Purpose

This is a very short statement of the purpose of the instruction.

A.1.5 Description

If a one-line symbolic description of the instruction is feasible, it will appear immediately

to the right of the Description heading. The body of the section is a description of the

operation of the instruction in text, tables, and figures. This description complements the

high-level language description in the Operation section.

A.1.6 Restrictions

This section documents the restrictions on the instructions. Most restrictions fall in the

category of alignment requirements for memory addresses, valid values of operands, and

order of instructions necessary to guarantee correct execution.

A.1.7 Operation

This section describes the operation as pseudocode in a high-level language notation

resembling Pascal. The purpose of this section is to describe the operation of the

instruction clearly in a form with less ambiguity than prose.

A.1.8 Exceptions

This section lists the exceptions that can be caused by the operation of the instruction. It
omits exceptions that can be caused by instruction fetch, performance counters, and
breakpoints. It also omits exceptions that can be caused by asynchronous external events,
e.g. interrupts. Although the Bus Error exception may be caused by the operation of a load,
store, or PREF instruction, this section does not list Bus Error for load, store, or PREF
instructions because the relationship between these instructions and external error
conditions, like Bus Error, is asynchronous and implementation specific.


A.1.9 Programming Notes, Implementation Notes

These sections contain material that is useful for programmers and implementors

respectively but is not necessary to describe the instruction and does not belong in the

description sections.

A.2 Instruction Description Notation and Functions

The Operation sections of the instruction descriptions describe the operation performed by
each instruction using a high-level language notation, or pseudocode. Symbols, functions,
and structures used in the Operation sections are described here.

A.2.1.1 Pseudocode Language Statement Execution

Each of the high-level language statements in an operation description is executed in

sequential order (as modified by conditional and loop constructs).

A.2.1.2 Pseudocode Symbols

Special symbols used in the notation are described in Table A-1.

Table A-1. Symbols in Instruction Operation Statements

Symbol          Meaning

←               Assignment.

=, ≠            Tests for equality and inequality.

||              Bit string concatenation.

x^y             A y-bit string formed by y copies of the single-bit value x.

x_{y..z}        Selection of bits y through z of bit string x.

+, −            Two’s complement or floating point arithmetic: addition, subtraction.

*, ×            Two’s complement or floating point multiplication (both used for either).

div             Two’s complement integer division.

mod             Two’s complement modulo.

/               Floating point division.

<               Two’s complement less than comparison.

not             Bit-wise logical NOT.

nor             Bit-wise logical NOR.

xor             Bit-wise logical XOR.

and             Bit-wise logical AND.

or              Bit-wise logical OR.

GPRLEN          The length in bits (64 in the C790) of the CPU General Purpose Registers.

GPR[x]          CPU General Purpose Register x. The content of GPR[0] is always zero.

CPR[z, x]       Coprocessor unit z, general register x.

CCR[z, x]       Coprocessor unit z, control register x.

CPCOND[z]       Coprocessor unit z condition signal.

BigEndian       Big-endian mode as configured at reset (0→Little, 1→Big) from core boundary signal.

I:, I+n:, I−n:  This occurs as a prefix to operation description lines and functions as a label. It
                indicates the instruction time during which the effects of the pseudocode lines
                appear to occur (i.e., when the pseudocode is “executed”). Unless otherwise
                indicated, all effects of the current instruction appear to occur during the
                instruction time of the current instruction. No label is equivalent to a time
                label of “I:”.

                Sometimes effects of an instruction appear to occur either earlier or later,
                during the instruction time of another instruction. When that happens, the
                instruction operation is written in sections labeled with the instruction time,
                relative to the current instruction I, in which the effect of that pseudocode
                appears to occur. For example, an instruction may have a result that is not
                available until after the next instruction. Such an instruction will have the
                portion of the instruction operation description that writes the result register
                in a section labeled “I+1:”.

                The effect of pseudocode statements for the current instruction labeled “I+1:”
                appears to occur “at the same time” as the effect of pseudocode statements
                labeled “I:” for the following instruction. Within one pseudocode sequence the
                effects of the statements take place in order. However, between sequences of
                statements for different instructions that occur “at the same time”, there is no
                order defined. Programs must not depend on a particular order of evaluation
                between such sections.

PC              The Program Counter value. During the instruction time of an instruction this is
                the address of the instruction word. The address of the instruction that occurs
                during the next instruction time is determined by assigning a value to PC during
                an instruction time. If no value is assigned to PC during an instruction time by
                any pseudocode statement, it is automatically incremented by 4 before the next
                instruction time. A taken branch assigns the target address to PC during the
                instruction time of the instruction in the branch delay slot.

PSIZE           The size, in bits, of the physical address in an implementation.
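The bit-string operators in Table A-1 can be modeled with ordinary integers. The following Python sketch is illustrative only: the helper names replicate, concat, and select are assumptions introduced here, and sign_extend is written to match how the Operation sections appear to use it (this section does not define it).

```python
def replicate(bit: int, count: int) -> int:
    """x^y: a count-bit string made of copies of the single-bit value bit."""
    return (1 << count) - 1 if bit & 1 else 0


def concat(high: int, low: int, low_width: int) -> int:
    """high || low, where low is treated as a low_width-bit string."""
    return (high << low_width) | (low & ((1 << low_width) - 1))


def select(value: int, msb: int, lsb: int) -> int:
    """x_{msb..lsb}: bits msb down to lsb of value, right-aligned."""
    return (value >> lsb) & ((1 << (msb - lsb + 1)) - 1)


def sign_extend(value: int, from_bits: int, to_bits: int = 64) -> int:
    """Copy bit from_bits-1 of value into every higher bit up to to_bits."""
    sign = select(value, from_bits - 1, from_bits - 1)
    low = select(value, from_bits - 1, 0)
    return concat(replicate(sign, to_bits - from_bits), low, from_bits)
```

For example, sign_extend(0x8000, 16) builds the 64-bit string 1^48 || 0x8000, while sign_extend(0x7FFF, 16) leaves the upper 48 bits zero.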

A.2.2 Definitions of Pseudocode Functions Used in

Instruction Descriptions

A variety of functions are used in the pseudocode employed in the instruction descriptions.

These functions are used to make the pseudocode more readable and also to abstract

implementation-specific behavior. These functions are defined in this section. Certain

additional functions specific to a particular coprocessor are described at the beginning of

the appendix for that coprocessor.

A.2.2.1 Coprocessor General Register Access Pseudocode Functions

Defined coprocessors, except for COP0, have instructions to exchange words,
doublewords, and quadwords between coprocessor general registers and the rest of the
system. What a coprocessor does with a word or doubleword supplied to it, and how a
coprocessor supplies a word or doubleword, is defined by the coprocessor itself. The
functions are listed in Table A-2.


Table A-2. Coprocessor General Register Access Functions

COP_LW (z, rt, memword)

    z:          The coprocessor unit number.
    rt:         Coprocessor general register specifier.
    memword:    A 32-bit word value supplied to the coprocessor.

    This is the action taken by coprocessor z when supplied with a word from memory
    during a load word operation. The action is coprocessor-specific. The typical action
    would be to store the contents of memword in coprocessor general register rt.

COP_LD (z, rt, memdouble)

    z:          The coprocessor unit number.
    rt:         Coprocessor general register specifier.
    memdouble:  A 64-bit doubleword value supplied to the coprocessor.

    This is the action taken by coprocessor z when supplied with a doubleword from
    memory during a load doubleword operation. The action is coprocessor-specific. The
    typical action would be to store the contents of memdouble in coprocessor general
    register rt.

dataword ← COP_SW (z, rt)

    z:          The coprocessor unit number.
    rt:         Coprocessor general register specifier.
    dataword:   A 32-bit word value.

    This defines the action taken by coprocessor z to supply a word of data during a store
    word operation. The action is coprocessor-specific. The typical action would be to
    supply the contents of the low-order word in coprocessor general register rt.

datadouble ← COP_SD (z, rt)

    z:          The coprocessor unit number.
    rt:         Coprocessor general register specifier.
    datadouble: A 64-bit doubleword value.

    This defines the action taken by coprocessor z to supply a doubleword of data during
    a store doubleword operation. The action is coprocessor-specific. The typical action
    would be to supply the contents of the doubleword in coprocessor general register rt.


A.2.2.2 Load and Store Memory Pseudocode Functions

Regardless of byte-numbering order (endianness), the address of a halfword, word, or

doubleword is the smallest byte address among the bytes in the object. For a big-endian

ordering this is the most-significant byte; for a little-endian ordering this is the least-

significant byte.

In the operation description pseudocode for load and store operations, the functions listed

in Table A-3 are used to summarize the handling of virtual addresses and accessing

physical memory.

The size of the data item to be loaded or stored is passed in the AccessLength field. The
valid constant names and values are shown in Table A-4. The bytes within the addressed
unit of memory (quadword for 128-bit processors) which are used can be determined
directly from the AccessLength and the four low-order bits of the address.

Table A-3. Load and Store Functions

(pAddr, CCA) ← AddressTranslation (vAddr, IorD, LorS)

pAddr: Physical Address.

CCA: Cache Coherence Algorithm: the method used to access caches and

memory and resolve the reference.

vAddr: Virtual Address.

IorD: Indicates whether access is for Instruction or Data.

LorS: Indicates whether access is for Load or Store

Translate a virtual address to a physical address and a cache coherence algorithm describing the

mechanism used to resolve the memory reference.

Given the virtual address vAddr, and whether the reference is to Instructions or Data (IorD), find the

corresponding physical address (pAddr) and the cache coherence algorithm (CCA) used to resolve the

reference. If the virtual address is in one of the unmapped address spaces the physical address and

CCA are determined directly by the virtual address. If the virtual address is in one of the mapped

address spaces then the TLB is used to determine the physical address and access type; if the

required translation is not present in the TLB or the desired access is not permitted the function fails

and an exception is taken.

MemElem ← LoadMemory (CCA, AccessLength, pAddr, vAddr, IorD)

MemElem: Data is returned in a fixed width with a natural alignment. The width is the

same size as the CPU general purpose register.

CCA: Cache Coherence Algorithm: the method used to access caches and

memory and resolve the reference.

AccessLength: Length, in bytes, of access.

pAddr: Physical Address.

vAddr: Virtual Address.

IorD: Indicates whether access is for Instructions or Data.

Load a value from memory.

Uses the cache and main memory as specified in the Cache Coherence Algorithm (CCA) and the sort
of access (IorD) to find the contents of AccessLength memory bytes starting at physical location pAddr.
The data is returned in the fixed-width, naturally-aligned memory element (MemElem). The low-order
two, three, or four bits of the address and the AccessLength indicate which of the bytes within
MemElem need to be given to the processor. If the memory access type of the reference is uncached,
then only the referenced bytes are read from memory and valid within the memory element. If the access
type is cached, and the data is not present in cache, an implementation-specific size and alignment
block of memory is read and loaded into the cache to satisfy a load reference. At a minimum, the block
is the entire memory element.


StoreMemory (CCA, AccessLength, MemElem, pAddr, vAddr)

CCA: Cache Coherence Algorithm: the method used to access caches and

memory and resolve the reference.

AccessLength: Length, in bytes, of access.

MemElem: Data in the width and alignment of a memory element. The width is the

same size as the CPU general purpose register. For a partial-memory-

element store, only the bytes that will be stored must be valid.

pAddr: Physical Address.

vAddr: Virtual Address.

Store a value to memory.

The specified data is stored into the physical location pAddr using the memory hierarchy (data caches

and main memory) as specified by the Cache Coherence Algorithm (CCA). The MemElem contains

the data for an aligned, fixed-width memory element, though only the bytes that will actually be stored

to memory need to be valid. The low-order four bits of pAddr and the AccessLength field indicate
which of the bytes within the MemElem data should actually be stored; only these bytes in memory will
be changed.

Prefetch (CCA, pAddr, vAddr, DATA, hint)

CCA: Cache Coherence Algorithm: the method used to access caches and

memory and resolve the reference.

pAddr: Physical Address.

vAddr: Virtual Address.

DATA: Indicates that access is for DATA.

hint: Hint that indicates the possible use of the data

Prefetch data from memory.

Prefetch is an advisory instruction for which an implementation specific action is taken. The action

taken may increase performance but must not change the meaning of the program or alter

architecturally-visible state.

Table A-4. AccessLength Specifications for Loads / Stores

AccessLength name   Value   Meaning

QUADWORD            15      16 bytes (128 bits)
DOUBLEWORD          7       8 bytes (64 bits)
SEPTIBYTE           6       7 bytes (56 bits)
SEXTIBYTE           5       6 bytes (48 bits)
QUINTIBYTE          4       5 bytes (40 bits)
WORD                3       4 bytes (32 bits)
TRIPLEBYTE          2       3 bytes (24 bits)
HALFWORD            1       2 bytes (16 bits)
BYTE                0       1 byte (8 bits)
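Each AccessLength value in Table A-4 is the byte count minus one, so the bytes touched within the addressed quadword follow directly from the value and the four low-order address bits. The Python sketch below illustrates this; the function name bytes_used and the error raised when an access would run past the quadword are assumptions made for the example, not behavior stated in this manual. The result is expressed as byte addresses within the quadword; how those bytes map onto register lanes depends on BigEndian and is not modeled.

```python
# AccessLength constants from Table A-4 (value = number of bytes - 1).
ACCESS_LENGTH = {
    "QUADWORD": 15, "DOUBLEWORD": 7, "SEPTIBYTE": 6, "SEXTIBYTE": 5,
    "QUINTIBYTE": 4, "WORD": 3, "TRIPLEBYTE": 2, "HALFWORD": 1, "BYTE": 0,
}


def bytes_used(p_addr: int, access_length: int) -> list:
    """Return the byte offsets (0..15) within the quadword that an access touches."""
    start = p_addr & 0xF          # four low-order bits of the address
    count = access_length + 1     # AccessLength encodes bytes - 1
    if start + count > 16:
        # Assumption for this sketch: an access never spans two quadwords.
        raise ValueError("access would cross the quadword boundary")
    return list(range(start, start + count))
```

For instance, a WORD access at physical address 0x1004 uses byte offsets 4 through 7 of its quadword.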


A.2.2.3 Miscellaneous Functions

Table A-5 describes additional miscellaneous functions for CPU instruction descriptions.

Table A-5. Miscellaneous Functions

SyncOperation (stype)

stype: Type of synchronization operation to be performed.

Based on the value of stype either a memory barrier operation is performed or a pipeline barrier

operation is performed.

In case of a memory barrier all pending loads and stores are retired. Loads are retired when the

destination register is written. Stores are retired when the stored data (in store buffers or write buffers) is

either stored in the data cache, or sent on the processor bus.

All uncached accelerated data gathering operations are terminated.

The uncached accelerated buffer is invalidated.

All bus read processes due to load/store/pref/cache instructions are completed.

All pending bus write processes in the write back buffer are completed.

In case of pipeline barrier all instructions prior to the barrier are completed before the instructions

following the barrier operation are fetched. Note that the barrier operation does not wait for any

instruction which was issued prior to the barrier operation but not retired (e.g., multiply, divide, multicycle

COP1 operations or a pending load which were issued prior to the pipeline barrier operation).

SignalException (Exception)

Exception: The exception condition that exists.

Signal an exception condition.

This will result in an exception that aborts the instruction. The instruction operation pseudocode will

never see a return from this function call.

UndefinedResult()

This function indicates that the result of the operation is undefined.

NullifyCurrentInstruction()

Nullify the current instruction.

This occurs during the instruction time for some instruction and that instruction is not executed further.

This appears for branch-likely instructions during the execution of the instruction in the delay slot and it

kills the instruction in the delay slot.

CoprocessorOperation (z, cop_fun)

z: Coprocessor unit number

cop_fun: Coprocessor function from function field of instruction

Perform the specified Coprocessor operation.


A.3 CPU Instruction Formats

A CPU instruction is a single 32-bit aligned word. There are three instruction formats:

Immediate (I-type), Jump (J-type), and Register (R-type). These formats are shown in

Figure A-1 below:

I-Type (Immediate)

Bits    31..26  25..21  20..16  15..0
Field   op      rs      rt      immediate
Width   6       5       5       16

J-Type (Jump)

Bits    31..26  25..0
Field   op      target
Width   6       26

R-Type (Register)

Bits    31..26  25..21  20..16  15..11  10..6   5..0
Field   op      rs      rt      rd      sa      funct
Width   6       5       5       5       5       6

op          6-bit primary operation code
rd          5-bit destination register specifier
rs          5-bit source register specifier
rt          5-bit target (source/destination) register specifier or
            branch condition
immediate   16-bit signed immediate used for: logical operands, arithmetic
            signed operands, load/store address byte offsets, PC-relative
            branch signed instruction displacement
target      26-bit index shifted left two bits to supply the low-order 28 bits
            of the jump target address
sa          5-bit shift amount
funct       6-bit function field used to specify functions within the primary
            operation code value SPECIAL

Figure A-1. CPU Instruction Formats
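The field boundaries in Figure A-1 translate directly into shifts and masks. The following Python sketch extracts every field from a 32-bit instruction word; the function name decode and the choice to return all fields at once (regardless of which format the opcode actually uses) are assumptions made for illustration.

```python
def decode(word: int) -> dict:
    """Split a 32-bit instruction word into the fields named in Figure A-1."""
    word &= 0xFFFFFFFF
    return {
        "op": (word >> 26) & 0x3F,       # bits 31..26
        "rs": (word >> 21) & 0x1F,       # bits 25..21
        "rt": (word >> 16) & 0x1F,       # bits 20..16
        "rd": (word >> 11) & 0x1F,       # bits 15..11 (R-type)
        "sa": (word >> 6) & 0x1F,        # bits 10..6  (R-type)
        "funct": word & 0x3F,            # bits 5..0   (R-type)
        "immediate": word & 0xFFFF,      # bits 15..0  (I-type)
        "target": word & 0x03FFFFFF,     # bits 25..0  (J-type)
    }
```

As a usage example, the word 0x00221820 decodes to op = 0 (SPECIAL), rs = 1, rt = 2, rd = 3, sa = 0, funct = 0b100000, which is the ADD encoding described later in this appendix.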


A.4 Instruction Descriptions

The user-level CPU instructions are described in alphabetical order in this section.


ADD
Add Word

Bits    31..26            25..21  20..16  15..11  10..6      5..0
Field   SPECIAL (000000)  rs      rt      rd      0 (00000)  ADD (100000)
Width   6                 5       5       5       5          6

Format: ADD rd, rs, rt (MIPS I)

Purpose: To add 32-bit integers. If overflow occurs, then trap.

Description: rd ← rs + rt

The 32-bit word value in GPR rt is added to the 32-bit value in GPR rs to produce a 32-bit
result. If the addition results in 32-bit 2’s complement arithmetic overflow then the
destination register is not modified and an Integer Overflow exception occurs. If it does
not overflow, the 32-bit result is placed into GPR rd.

Restrictions:

If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal),
then the result of the operation is undefined.

Operation:

if (NotWordValue (GPR[rs]_{63..0}) or NotWordValue (GPR[rt]_{63..0})) then UndefinedResult() endif
temp ← GPR[rs]_{63..0} + GPR[rt]_{63..0}
if (32_bit_arithmetic_overflow) then
    SignalException (IntegerOverflow)
else
    GPR[rd]_{63..0} ← sign_extend (temp_{31..0})
endif

Exceptions:

Integer Overflow

Programming Notes:

ADDU performs the same arithmetic operation but does not trap on overflow.
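The ADD operation can be sketched in Python as follows. This is a behavioral model under stated assumptions, not the hardware definition: the register file is a plain list of 64-bit unsigned integers, NotWordValue is interpreted from the Restrictions text (bits 63..31 must all be equal), UndefinedResult() is modeled by raising ValueError, and the IntegerOverflow class stands in for SignalException (IntegerOverflow). All Python names are hypothetical.

```python
MASK64 = (1 << 64) - 1


class IntegerOverflow(Exception):
    """Stands in for SignalException(IntegerOverflow) in this sketch."""


def sign_extend32(value: int) -> int:
    """Sign-extend the low 32 bits of value to a 64-bit unsigned pattern."""
    value &= 0xFFFFFFFF
    return (value | 0xFFFFFFFF00000000) if value & 0x80000000 else value


def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a two's complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def not_word_value(reg: int) -> bool:
    """True when bits 63..31 of the register are not all equal."""
    return (reg & MASK64) != sign_extend32(reg)


def add(gpr: list, rd: int, rs: int, rt: int) -> None:
    if not_word_value(gpr[rs]) or not_word_value(gpr[rt]):
        # Assumption: the undefined result is modeled as an error.
        raise ValueError("operand is not a sign-extended word; result undefined")
    total = to_signed32(gpr[rs]) + to_signed32(gpr[rt])
    if not -(1 << 31) <= total < (1 << 31):
        raise IntegerOverflow()      # destination register is left unmodified
    if rd != 0:                      # GPR[0] is always zero
        gpr[rd] = sign_extend32(total)
```

Adding 0x7FFFFFFF and 1 raises IntegerOverflow and leaves GPR rd untouched, whereas adding −1 and 1 writes zero.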


ADDI
Add Immediate Word

Bits    31..26         25..21  20..16  15..0
Field   ADDI (001000)  rs      rt      immediate
Width   6              5       5       16

Format: ADDI rt, rs, immediate (MIPS I)

Purpose: To add a constant to a 32-bit integer. If overflow occurs, then trap.

Description: rt ← rs + immediate

The 16-bit signed immediate is added to the 32-bit value in GPR rs to produce a 32-bit
result. If the addition results in 32-bit 2’s complement arithmetic overflow then the
destination register is not modified and an Integer Overflow exception occurs. If it does
not overflow, the 32-bit result is placed into GPR rt.

Restrictions:

If GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result
of the operation is undefined.

Operation:

if (NotWordValue (GPR[rs]_{63..0})) then UndefinedResult() endif
temp ← GPR[rs]_{63..0} + sign_extend (immediate)
if (32_bit_arithmetic_overflow) then
    SignalException (IntegerOverflow)
else
    GPR[rt]_{63..0} ← sign_extend (temp_{31..0})
endif

Exceptions:

Integer Overflow

Programming Notes:

ADDIU performs the same arithmetic operation but does not trap on overflow.


ADDIU
Add Immediate Unsigned Word

Bits    31..26          25..21  20..16  15..0
Field   ADDIU (001001)  rs      rt      immediate
Width   6               5       5       16

Format: ADDIU rt, rs, immediate (MIPS I)

Purpose: To add a constant to a 32-bit integer.

Description: rt ← rs + immediate

The 16-bit signed immediate is added to the 32-bit value in GPR rs and the 32-bit
arithmetic result is placed into GPR rt.

No Integer Overflow exception occurs under any circumstances.

Restrictions:

If GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result
of the operation is undefined.

Operation:

if (NotWordValue (GPR[rs]_{63..0})) then UndefinedResult() endif
temp ← GPR[rs]_{63..0} + sign_extend (immediate)
GPR[rt]_{63..0} ← sign_extend (temp_{31..0})

Exceptions:

None

Programming Notes:

The term “unsigned” in the instruction name is a misnomer; this operation is 32-bit

modulo arithmetic that does not trap on overflow. It is appropriate for arithmetic which is

not signed, such as address arithmetic, or integer arithmetic environments that ignore

overflow, such as C language arithmetic.
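The wrap-around behavior described in the note can be seen in a small Python model. This is a sketch under assumptions: registers are 64-bit unsigned integers, the NotWordValue check is omitted for brevity, and the names sign_extend and addiu are hypothetical.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of value to a 64-bit unsigned pattern."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return ((value ^ sign) - sign) & MASK64


def addiu(rs_value: int, immediate: int) -> int:
    """Return the value written to GPR rt: 32-bit modulo sum, sign-extended."""
    temp = rs_value + sign_extend(immediate, 16)
    return sign_extend(temp, 32)   # keep temp_{31..0}, then sign-extend
```

Here addiu(0x7FFFFFFF, 1) yields 0xFFFFFFFF80000000 rather than trapping: the 32-bit sum wraps to the most negative word and is then sign-extended. The immediate is still sign-extended, so addiu(5, 0xFFFE) yields 3.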


ADDU
Add Unsigned Word

Bits    31..26            25..21  20..16  15..11  10..6      5..0
Field   SPECIAL (000000)  rs      rt      rd      0 (00000)  ADDU (100001)
Width   6                 5       5       5       5          6

Format: ADDU rd, rs, rt (MIPS I)

Purpose: To add 32-bit integers.

Description: rd ← rs + rt

The 32-bit word value in GPR rt is added to the 32-bit value in GPR rs and the 32-bit
arithmetic result is placed into GPR rd.

No Integer Overflow exception occurs under any circumstances.

Restrictions:

If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal),
then the result of the operation is undefined.

Operation:

if (NotWordValue (GPR[rs]_{63..0}) or NotWordValue (GPR[rt]_{63..0})) then UndefinedResult() endif
temp ← GPR[rs]_{63..0} + GPR[rt]_{63..0}
GPR[rd]_{63..0} ← sign_extend (temp_{31..0})

Exceptions:

None

Programming Notes:

The term “unsigned” in the instruction name is a misnomer; this operation is 32-bit

modulo arithmetic that does not trap on overflow. It is appropriate for arithmetic which is

not signed, such as address arithmetic, or integer arithmetic environments that ignore

overflow, such as C language arithmetic.


AND
And

Bits    31..26            25..21  20..16  15..11  10..6      5..0
Field   SPECIAL (000000)  rs      rt      rd      0 (00000)  AND (100100)
Width   6                 5       5       5       5          6

Format: AND rd, rs, rt (MIPS I)

Purpose: To do a bitwise logical AND.

Description: rd ← rs AND rt

The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical AND
operation. The result is placed into GPR rd.

Restrictions:

None

Operation:

GPR[rd]_{63..0} ← GPR[rs]_{63..0} and GPR[rt]_{63..0}

Exceptions:

None

Programming Notes:

None


ANDI
And Immediate

Bits    31..26         25..21  20..16  15..0
Field   ANDI (001100)  rs      rt      immediate
Width   6              5       5       16

Format: ANDI rt, rs, immediate (MIPS I)

Purpose: To do a bitwise logical AND with a constant.

Description: rt ← rs AND immediate

The 16-bit immediate is zero-extended to the left and combined with the contents of GPR rs
in a bitwise logical AND operation. The result is placed into GPR rt.

Restrictions:

None

Operation:

GPR[rt]_{63..0} ← zero_extend (immediate) and GPR[rs]_{63..0}

Exceptions:

None

Programming Notes:

None
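Because the immediate is zero-extended rather than sign-extended, ANDI always clears bits 63..16 of the result, even when bit 15 of the immediate is set. A minimal Python sketch, assuming registers are 64-bit unsigned integers (the function name andi is hypothetical):

```python
MASK64 = (1 << 64) - 1


def andi(rs_value: int, immediate: int) -> int:
    """Return the value written to GPR rt by ANDI."""
    zero_extended = immediate & 0xFFFF      # upper 48 bits are zero
    return (rs_value & MASK64) & zero_extended
```

For example, andi with an all-ones register and immediate 0x8000 gives 0x8000, not a sign-extended mask.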


BEQ
Branch on Equal

Bits    31..26        25..21  20..16  15..0
Field   BEQ (000100)  rs      rt      offset
Width   6             5       5       16

Format: BEQ rs, rt, offset (MIPS I)

Purpose: To compare GPRs then do a PC-relative conditional branch.

Description: if (rs = rt) then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of
the instruction following the branch (not the branch itself), in the branch delay slot, to
form a PC-relative effective target address.

If the contents of GPR rs and GPR rt are equal, branch to the effective target address after
the instruction in the delay slot is executed.

Restrictions:

None

Operation:

I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← (GPR[rs]_{63..0} = GPR[rt]_{63..0})
I+1: if condition then
         PC ← PC + tgt_offset
     endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KB. Use

jump (J) or jump register (JR) instructions to branch to more distant addresses.
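The target computation and the ± 128 KB range can be checked with a short Python sketch. It assumes 64-bit address arithmetic with wrap-around; the function name branch_target is hypothetical.

```python
MASK64 = (1 << 64) - 1


def branch_target(branch_pc: int, offset: int) -> int:
    """Effective target of a PC-relative branch located at branch_pc."""
    shifted = (offset & 0xFFFF) << 2        # offset || 0^2, an 18-bit string
    if shifted & (1 << 17):                 # sign bit of the 18-bit value
        shifted -= 1 << 18
    delay_slot_pc = branch_pc + 4           # PC during instruction time I+1
    return (delay_slot_pc + shifted) & MASK64
```

The reachable displacements run from −131072 to +131068 bytes relative to the delay slot. An offset of 0xFFFF (−1 instruction) therefore targets the branch itself.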


BEQL
Branch on Equal Likely

Bits    31..26         25..21  20..16  15..0
Field   BEQL (010100)  rs      rt      offset
Width   6              5       5       16

Format: BEQL rs, rt, offset (MIPS II)

Purpose: To compare GPRs then do a PC-relative conditional branch; execute the delay slot only if
the branch is taken.

Description: if (rs = rt) then branch_likely

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of
the instruction following the branch (not the branch itself), in the branch delay slot, to
form a PC-relative effective target address.

If the contents of GPR rs and GPR rt are equal, branch to the target address after the
instruction in the delay slot is executed. If the branch is not taken, the instruction in the
delay slot is not executed.

Restrictions:

None

Operation:

I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← (GPR[rs]_{63..0} = GPR[rt]_{63..0})
I+1: if condition then
         PC ← PC + tgt_offset
     else
         NullifyCurrentInstruction()
     endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KB. Use

jump (J) or jump register (JR) instructions to branch to more distant addresses.
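The difference between an ordinary branch and a branch-likely instruction lies only in what happens to the delay slot when the condition is false. The Python sketch below summarizes the two cases; the function name branch_outcome and its tuple return value are assumptions made for illustration.

```python
def branch_outcome(pc: int, target: int, condition: bool, likely: bool):
    """Return (delay_slot_executes, address_executed_after_the_delay_slot).

    pc is the address of the branch; pc + 4 is the delay slot and pc + 8
    is the fall-through instruction.
    """
    if condition:
        return True, target        # taken: delay slot runs, then the target
    if likely:
        return False, pc + 8       # not taken, likely: delay slot is nullified
    return True, pc + 8            # not taken, ordinary: delay slot still runs
```

With BEQL and a false condition, the result is (False, pc + 8); with BEQ and the same operands, it is (True, pc + 8).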


BGEZ
Branch on Greater Than or Equal to Zero

Bits    31..26           25..21  20..16        15..0
Field   REGIMM (000001)  rs      BGEZ (00001)  offset
Width   6                5       5             16

Format: BGEZ rs, offset (MIPS I)

Purpose: To test a GPR then do a PC-relative conditional branch.

Description: if (rs ≥ 0) then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of
the instruction following the branch (not the branch itself), in the branch delay slot, to
form a PC-relative effective target address.

If the contents of GPR rs are greater than or equal to zero (sign bit is 0), branch to the
effective target address after the instruction in the delay slot is executed.

Restrictions:

None

Operation:

I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← GPR[rs]_{63..0} ≥ 0^GPRLEN
I+1: if condition then
         PC ← PC + tgt_offset
     endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KB. Use

jump (J) or jump register (JR) instructions to branch to more distant addresses.


BGEZAL
Branch on Greater Than or Equal to Zero and Link

Bits    31..26           25..21  20..16          15..0
Field   REGIMM (000001)  rs      BGEZAL (10001)  offset
Width   6                5       5               16

Format: BGEZAL rs, offset (MIPS I)

Purpose: To test a GPR then do a PC-relative conditional procedure call.

Description: if (rs ≥ 0) then procedure_call

Place the return address link in GPR 31. The return link is the address of the second
instruction following the branch, where execution would continue after a procedure call.

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of
the instruction following the branch (not the branch itself), in the branch delay slot, to
form a PC-relative effective target address.

If the contents of GPR rs are greater than or equal to zero (sign bit is 0), branch to the
effective target address after the instruction in the delay slot is executed.

Restrictions:

GPR 31 must not be used for the source register rs, because such an instruction does not
have the same effect when re-executed. The result of executing such an instruction is
undefined. This restriction permits an exception handler to resume execution by re-
executing the branch when an exception occurs in the branch delay slot.

Operation:

I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← GPR[rs]_{63..0} ≥ 0^GPRLEN
     GPR[31]_{63..0} ← zero_extend (PC+8)
I+1: if condition then
         PC ← PC + tgt_offset
     endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KB. Use

jump and link (JAL) or jump and link register (JALR) instructions for procedure calls to

more distant addresses.
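A Python sketch of the BGEZAL operation follows. Per the Operation pseudocode, the link register is written during instruction time I whether or not the branch is taken. The sketch assumes a list-based register file of 64-bit unsigned integers and models the undefined rs = 31 case by raising ValueError; the function name bgezal is hypothetical, and the delay-slot instruction itself is not modeled.

```python
MASK64 = (1 << 64) - 1


def bgezal(gpr: list, pc: int, rs: int, offset: int) -> int:
    """Update GPR 31 and return the address executed after the delay slot."""
    if rs == 31:
        # Assumption: the undefined result is modeled as an error.
        raise ValueError("GPR 31 must not be used as rs; result undefined")
    condition = ((gpr[rs] >> 63) & 1) == 0      # sign bit is 0
    gpr[31] = (pc + 8) & MASK64                 # return link, always written
    shifted = (offset & 0xFFFF) << 2            # offset || 0^2
    if shifted & (1 << 17):
        shifted -= 1 << 18
    if condition:
        return (pc + 4 + shifted) & MASK64      # delay-slot PC + tgt_offset
    return (pc + 8) & MASK64
```

Calling bgezal with a non-negative GPR rs at pc = 0x1000 and offset = 0x0010 returns 0x1044 and leaves 0x1008 in GPR 31.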


BGEZALL
Branch on Greater Than or Equal to Zero and Link Likely

Bits    31..26           25..21  20..16           15..0
Field   REGIMM (000001)  rs      BGEZALL (10011)  offset
Width   6                5       5                16

Format: BGEZALL rs, offset (MIPS II)

Purpose: To test a GPR then do a PC-relative conditional procedure call; execute the delay slot only
if the branch is taken.

Description: if (rs ≥ 0) then procedure_call_likely

Place the return address link in GPR 31. The return link is the address of the second
instruction following the branch, where execution would continue after a procedure call.

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of
the instruction following the branch (not the branch itself), in the branch delay slot, to
form a PC-relative effective target address.

If the contents of GPR rs are greater than or equal to zero (sign bit is 0), branch to the
effective target address after the instruction in the delay slot is executed. If the branch is
not taken, the instruction in the delay slot is not executed.

Restrictions:

GPR 31 must not be used for the source register rs, because such an instruction does not
have the same effect when re-executed. The result of executing such an instruction is
undefined. This restriction permits an exception handler to resume execution by re-
executing the branch when an exception occurs in the branch delay slot.

Operation:

I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← GPR[rs]_{63..0} ≥ 0^GPRLEN
     GPR[31]_{63..0} ← zero_extend (PC+8)
I+1: if condition then
         PC ← PC + tgt_offset
     else
         NullifyCurrentInstruction()
     endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KB. Use

jump and link (JAL) or jump and link register (JALR) instructions for procedure calls to

more distant addresses.


BGEZL
Branch on Greater Than or Equal to Zero Likely

Bits    31..26           25..21  20..16         15..0
Field   REGIMM (000001)  rs      BGEZL (00011)  offset
Width   6                5       5              16

Format: BGEZL rs, offset (MIPS II)

Purpose: To test a GPR then do a PC-relative conditional branch; execute the delay slot only if the
branch is taken.

Description: if (rs ≥ 0) then branch_likely

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of
the instruction following the branch (not the branch itself), in the branch delay slot, to
form a PC-relative effective target address.

If the contents of GPR rs are greater than or equal to zero (sign bit is 0), branch to the
effective target address after the instruction in the delay slot is executed. If the branch is
not taken, the instruction in the delay slot is not executed.

Restrictions:

None

Operation:

I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← GPR[rs]_{63..0} ≥ 0^GPRLEN
I+1: if condition then
         PC ← PC + tgt_offset
     else
         NullifyCurrentInstruction()
     endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KB. Use

jump (J) or jump register (JR) instructions to branch to more distant addresses.


BGTZ
Branch on Greater Than Zero

Bits    31..26         25..21  20..16     15..0
Field   BGTZ (000111)  rs      0 (00000)  offset
Width   6              5       5          16

Format: BGTZ rs, offset (MIPS I)

Purpose: To test a GPR then do a PC-relative conditional branch.

Description: if (rs > 0) then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of
the instruction following the branch (not the branch itself), in the branch delay slot, to
form a PC-relative effective target address.

If the contents of GPR rs are greater than zero (sign bit is 0 but value not zero), branch to
the effective target address after the instruction in the delay slot is executed.

Restrictions:

None

Operation:

I:   tgt_offset ← sign_extend (offset || 0^2)
     condition ← GPR[rs]_{63..0} > 0^GPRLEN
I+1: if condition then
         PC ← PC + tgt_offset
     endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KB. Use

jump (J) or jump register (JR) instructions to branch to more distant addresses.


BGTZL

Branch on Greater Than Zero Likely

Bits    31..26           25..21   20..16       15..0
Field   BGTZL (010111)   rs       0 (00000)    offset
Width   6                5        5            16

MIPS II

Format: BGTZL rs, offset

Purpose: To test a GPR then do a PC-relative conditional branch; execute the delay slot only if the

branch is taken.

Description: if (rs > 0) then branch_likely

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are greater than zero (sign bit is 0 but value not zero), branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

Restrictions:

None

Operation:

I:    tgt_offset ← sign_extend(offset || 0^2)
      condition ← GPR[rs]_63..0 > 0^GPRLEN
I+1:  if condition then
          PC ← PC + tgt_offset
      else
          NullifyCurrentInstruction()
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KB. Use jump (J) or jump register (JR) instructions to branch to more distant addresses.
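The difference between BGTZ and BGTZL lies only in how the delay slot is treated when the branch is not taken. The following C sketch models that difference; the structure and function names are hypothetical and describe a simulator-style view, not the hardware implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Outcome of a conditional branch as seen by a hypothetical simulator. */
struct branch_outcome {
    bool taken;           /* control transfers to the target */
    bool run_delay_slot;  /* the delay-slot instruction executes */
};

/* BGTZ: the delay-slot instruction always executes. */
static struct branch_outcome bgtz(int64_t rs)
{
    struct branch_outcome o = { rs > 0, true };
    return o;
}

/* BGTZL: the delay-slot instruction is nullified when the branch is not taken. */
static struct branch_outcome bgtzl(int64_t rs)
{
    bool taken = rs > 0;
    struct branch_outcome o = { taken, taken };
    return o;
}
```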


BLEZ

Branch on Less Than or Equal to Zero

Bits    31..26          25..21   20..16       15..0
Field   BLEZ (000110)   rs       0 (00000)    offset
Width   6               5        5            16

MIPS I

Format: BLEZ rs, offset

Purpose: To test a GPR then do a PC-relative conditional branch.

Description: if (rs ≤ 0) then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are less than or equal to zero (sign bit is 1 or value is zero), branch to the effective target address after the instruction in the delay slot is executed.

Restrictions:

None

Operation:

I:    tgt_offset ← sign_extend(offset || 0^2)
      condition ← GPR[rs]_63..0 ≤ 0^GPRLEN
I+1:  if condition then
          PC ← PC + tgt_offset
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KB. Use

jump (J) or jump register (JR) instructions to branch to more distant addresses.


BLEZL

Branch on Less Than or Equal to Zero Likely

Bits    31..26           25..21   20..16       15..0
Field   BLEZL (010110)   rs       0 (00000)    offset
Width   6                5        5            16

MIPS II

Format: BLEZL rs, offset

Purpose: To test a GPR then do a PC-relative conditional branch; execute the delay slot only if the

branch is taken.

Description: if (rs ≤ 0) then branch_likely

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are less than or equal to zero (sign bit is 1 or value is zero), branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

Restrictions:

None

Operation:

I:    tgt_offset ← sign_extend(offset || 0^2)
      condition ← GPR[rs]_63..0 ≤ 0^GPRLEN
I+1:  if condition then
          PC ← PC + tgt_offset
      else
          NullifyCurrentInstruction()
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KB. Use

jump (J) or jump register (JR) instructions to branch to more distant addresses.


BLTZ

Branch on Less Than Zero

Bits    31..26            25..21   20..16         15..0
Field   REGIMM (000001)   rs       BLTZ (00000)   offset
Width   6                 5        5              16

MIPS I

Format: BLTZ rs, offset

Purpose: To test a GPR then do a PC-relative conditional branch.

Description: if (rs < 0) then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are less than zero (sign bit is 1), branch to the effective target address after the instruction in the delay slot is executed.

Restrictions:

None

Operation:

I:    tgt_offset ← sign_extend(offset || 0^2)
      condition ← GPR[rs]_63..0 < 0^GPRLEN
I+1:  if condition then
          PC ← PC + tgt_offset
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KB. Use

jump (J) or jump register (JR) instructions to branch to more distant addresses.


BLTZAL

Branch on Less Than Zero and Link

Bits    31..26            25..21   20..16           15..0
Field   REGIMM (000001)   rs       BLTZAL (10000)   offset
Width   6                 5        5                16

MIPS I

Format: BLTZAL rs, offset

Purpose: To test a GPR then do a PC-relative conditional procedure call.

Description: if (rs < 0) then procedure_call

Place the return address link in GPR 31. The return link is the address of the second instruction following the branch (not the branch itself), where execution would continue after a procedure call.

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch, in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are less than zero (sign bit is 1), branch to the effective target address after the instruction in the delay slot is executed.

Restrictions:

GPR 31 must not be used for the source register rs, because such an instruction does not have the same effect when re-executed. The result of executing such an instruction is undefined. This restriction permits an exception handler to resume execution by re-executing the branch when an exception occurs in the branch delay slot.

Operation:

I:    tgt_offset ← sign_extend(offset || 0^2)
      condition ← GPR[rs]_63..0 < 0^GPRLEN
      GPR[31]_63..0 ← zero_extend(PC + 8)
I+1:  if condition then
          PC ← PC + tgt_offset
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KB. Use

jump and link (JAL) or jump and link register (JALR) instructions for procedure calls to

more distant addresses.


BLTZALL

Branch on Less Than Zero and Link Likely

Bits    31..26            25..21   20..16            15..0
Field   REGIMM (000001)   rs       BLTZALL (10010)   offset
Width   6                 5        5                 16

MIPS II

Format: BLTZALL rs, offset

Purpose: To test a GPR then do a PC-relative conditional procedure call; execute the delay slot only

if the branch is taken.

Description: if (rs < 0) then procedure_call_likely

Place the return address link in GPR 31. The return link is the address of the second instruction following the branch (not the branch itself), where execution would continue after a procedure call.

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch, in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are less than zero (sign bit is 1), branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

Restrictions:

GPR 31 must not be used for the source register rs, because such an instruction does not have the same effect when re-executed. The result of executing such an instruction is undefined. This restriction permits an exception handler to resume execution by re-executing the branch when an exception occurs in the branch delay slot.

Operation:

I:    tgt_offset ← sign_extend(offset || 0^2)
      condition ← GPR[rs]_63..0 < 0^GPRLEN
      GPR[31]_63..0 ← zero_extend(PC + 8)
I+1:  if condition then
          PC ← PC + tgt_offset
      else
          NullifyCurrentInstruction()
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KB. Use jump and link (JAL) or jump and link register (JALR) instructions for procedure calls to more distant addresses.


BLTZL

Branch on Less Than Zero Likely

Bits    31..26            25..21   20..16          15..0
Field   REGIMM (000001)   rs       BLTZL (00010)   offset
Width   6                 5        5               16

MIPS II

Format: BLTZL rs, offset

Purpose: To test a GPR then do a PC-relative conditional branch; execute the delay slot only if the

branch is taken.

Description: if (rs < 0) then branch_likely

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are less than zero (sign bit is 1), branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

Restrictions:

None

Operation:

I:    tgt_offset ← sign_extend(offset || 0^2)
      condition ← GPR[rs]_63..0 < 0^GPRLEN
I+1:  if condition then
          PC ← PC + tgt_offset
      else
          NullifyCurrentInstruction()
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KB. Use

jump (J) or jump register (JR) instructions to branch to more distant addresses.
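BLTZ, BLTZL, BLTZAL and BLTZALL share the REGIMM opcode (000001) and are distinguished by bits 20..16 of the instruction word. The C sketch below decodes only these four encodings; the enum and function names are hypothetical, and other REGIMM encodings are deliberately not handled.

```c
#include <stdint.h>

enum bltz_kind {
    NOT_BLTZ_FAMILY,
    KIND_BLTZ,
    KIND_BLTZL,
    KIND_BLTZAL,
    KIND_BLTZALL
};

/* Hypothetical decoder: classify a 32-bit instruction word. */
static enum bltz_kind classify_bltz(uint32_t insn)
{
    if ((insn >> 26) != 0x01u)          /* opcode must be REGIMM (000001) */
        return NOT_BLTZ_FAMILY;

    switch ((insn >> 16) & 0x1Fu) {     /* bits 20..16 select the variant */
    case 0x00: return KIND_BLTZ;        /* 00000 */
    case 0x02: return KIND_BLTZL;       /* 00010 */
    case 0x10: return KIND_BLTZAL;      /* 10000 */
    case 0x12: return KIND_BLTZALL;     /* 10010 */
    default:   return NOT_BLTZ_FAMILY;  /* other REGIMM encodings not covered */
    }
}
```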


BNE

Branch on Not Equal

Bits    31..26         25..21   20..16   15..0
Field   BNE (000101)   rs       rt       offset
Width   6              5        5        16

MIPS I

Format: BNE rs, rt, offset

Purpose: To compare GPRs then do a PC-relative conditional branch.

Description: if (rs ≠ rt) then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs and GPR rt are not equal, branch to the effective target address after the instruction in the delay slot is executed.

Restrictions:

None

Operation:

I:    tgt_offset ← sign_extend(offset || 0^2)
      condition ← (GPR[rs]_63..0 ≠ GPR[rt]_63..0)
I+1:  if condition then
          PC ← PC + tgt_offset
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KB. Use

jump (J) or jump register (JR) instructions to branch to more distant addresses.


BNEL

Branch on Not Equal Likely

Bits    31..26          25..21   20..16   15..0
Field   BNEL (010101)   rs       rt       offset
Width   6               5        5        16

MIPS II

Format: BNEL rs, rt, offset

Purpose: To compare GPRs then do a PC-relative conditional branch; execute the delay slot only if

the branch is taken.

Description: if (rs ≠ rt) then branch_likely

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs and GPR rt are not equal, branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

Restrictions:

None

Operation:

I:    tgt_offset ← sign_extend(offset || 0^2)
      condition ← (GPR[rs]_63..0 ≠ GPR[rt]_63..0)
I+1:  if condition then
          PC ← PC + tgt_offset
      else
          NullifyCurrentInstruction()
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KB. Use

jump (J) or jump register (JR) instructions to branch to more distant addresses.


BREAK

Breakpoint

Bits    31..26             25..6   5..0
Field   SPECIAL (000000)   code    BREAK (001101)
Width   6                  20      6

MIPS I

Format: BREAK

Purpose: To cause a Breakpoint exception.

Description:

A breakpoint exception occurs, immediately and unconditionally transferring control to the exception handler.

The code field is available for use as software parameters, but is retrieved by the exception handler only by loading the contents of the memory word containing the instruction.
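Because the code field is only available from the instruction word itself, a handler has to load that word and extract bits 25..6. A minimal C sketch of the extraction follows; the function names are hypothetical, and how the handler obtains the instruction word (for example, from the exception program counter) is outside the scope of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if insn is a BREAK: SPECIAL opcode (000000) and function 001101. */
static bool is_break(uint32_t insn)
{
    return (insn >> 26) == 0u && (insn & 0x3Fu) == 0x0Du;
}

/* Extract the 20-bit code field (bits 25..6) from a BREAK instruction word. */
static uint32_t break_code(uint32_t insn)
{
    return (insn >> 6) & 0xFFFFFu;
}
```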

Restrictions:

None

Operation:

SignalException (Breakpoint)

Exceptions:

Breakpoint

Programming Notes:

None


DADD

Doubleword Add

Bits    31..26             25..21   20..16   15..11   10..6       5..0
Field   SPECIAL (000000)   rs       rt       rd       0 (00000)   DADD (101100)
Width   6                  5        5        5        5           6

MIPS III

Format: DADD rd, rs, rt

Purpose: To add 64-bit integers. If overflow occurs, then trap.

Description: rd ← rs + rt

The 64-bit doubleword value in GPR rt is added to the 64-bit value in GPR rs to produce a 64-bit result. If the addition results in 64-bit 2’s complement arithmetic overflow then the destination register is not modified and an Integer Overflow exception occurs. If it does not overflow, the 64-bit result is placed into GPR rd.

Restrictions:

None

Operation:

temp ← GPR[rs]_63..0 + GPR[rt]_63..0
if (64_bit_arithmetic_overflow) then
    SignalException(IntegerOverflow)
else
    GPR[rd]_63..0 ← temp
endif

Exceptions:

Integer Overflow

Programming Notes:

DADDU performs the same arithmetic operation but does not trap on overflow.
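The overflow condition that separates DADD from DADDU can be sketched in C as follows. The function names are hypothetical. The arithmetic is done on unsigned values because signed overflow is undefined behaviour in C, and the final conversion back to `int64_t` assumes a two's complement target.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when rs + rt overflows 64-bit two's complement arithmetic,
 * i.e. when DADD would raise Integer Overflow. */
static bool dadd_overflows(int64_t rs, int64_t rt)
{
    uint64_t a = (uint64_t)rs, b = (uint64_t)rt, sum = a + b;

    /* Overflow iff both operands have the same sign and the sum's sign differs. */
    return ((~(a ^ b) & (a ^ sum)) >> 63) != 0;
}

/* DADDU-style addition: 64-bit modulo arithmetic, never traps. */
static int64_t daddu(int64_t rs, int64_t rt)
{
    return (int64_t)((uint64_t)rs + (uint64_t)rt);
}
```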


DADDI

Doubleword Add Immediate

Bits    31..26           25..21   20..16   15..0
Field   DADDI (011000)   rs       rt       immediate
Width   6                5        5        16

MIPS III

Format: DADDI rt, rs, immediate

Purpose: To add a constant to a 64-bit integer. If overflow occurs, then trap.

Description: rt ← rs + immediate

The 16-bit signed immediate is added to the 64-bit value in GPR rs to produce a 64-bit result. If the addition results in 64-bit 2’s complement arithmetic overflow then the destination register is not modified and an Integer Overflow exception occurs. If it does not overflow, the 64-bit result is placed into GPR rt.

Restrictions:

None

Operation:

temp ← GPR[rs]_63..0 + sign_extend(immediate)
if (64_bit_arithmetic_overflow) then
    SignalException(IntegerOverflow)
else
    GPR[rt]_63..0 ← temp
endif

Exceptions:

Integer Overflow

Programming Notes:

DADDIU performs the same arithmetic operation but does not trap on overflow.


DADDIU

Doubleword Add Immediate Unsigned

Bits    31..26            25..21   20..16   15..0
Field   DADDIU (011001)   rs       rt       immediate
Width   6                 5        5        16

MIPS III

Format: DADDIU rt, rs, immediate

Purpose: To add a constant to a 64-bit integer.

Description: rt ← rs + immediate

The 16-bit signed immediate is added to the 64-bit value in GPR rs and the 64-bit arithmetic result is placed into GPR rt.

No Integer Overflow exception occurs under any circumstances.

Restrictions:

None

Operation:

GPR[rt]_63..0 ← GPR[rs]_63..0 + sign_extend(immediate)

Exceptions:

None

Programming Notes:

The term “unsigned” in the instruction name is a misnomer; this operation is 64-bit

modulo arithmetic that does not trap on overflow. It is appropriate for arithmetic which is

not signed, such as address arithmetic, or integer arithmetic environments that ignore

overflow, such as C language arithmetic.
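The point about the misnomer can be illustrated with a short C sketch: the immediate is still sign-extended, so an immediate of 0xFFFF subtracts one, and the sum simply wraps modulo 2^64. The function name `daddiu` is used here only as a label for the sketch.

```c
#include <stdint.h>

/* DADDIU-style addition: sign-extend the 16-bit immediate, then add
 * modulo 2^64. No overflow is ever signalled. */
static uint64_t daddiu(uint64_t rs, uint16_t imm16)
{
    uint64_t ext = (uint64_t)(int64_t)(int16_t)imm16;  /* sign extension */
    return rs + ext;                                   /* wraps on overflow */
}
```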


DADDU

Doubleword Add Unsigned

Bits    31..26             25..21   20..16   15..11   10..6       5..0
Field   SPECIAL (000000)   rs       rt       rd       0 (00000)   DADDU (101101)
Width   6                  5        5        5        5           6

MIPS III

Format: DADDU rd, rs, rt

Purpose: To add 64-bit integers.

Description: rd ← rs + rt

The 64-bit doubleword value in GPR rt is added to the 64-bit value in GPR rs and the 64-bit arithmetic result is placed into GPR rd.

No Integer Overflow exception occurs under any circumstances.

Restrictions:

None

Operation:

GPR[rd]_63..0 ← GPR[rs]_63..0 + GPR[rt]_63..0

Exceptions:

None

Programming Notes:

The term “unsigned” in the instruction name is a misnomer; this operation is 64-bit

modulo arithmetic that does not trap on overflow. It is appropriate for arithmetic which is

not signed, such as address arithmetic, or integer arithmetic environments that ignore

overflow, such as C language arithmetic.


DIV

Divide Word

Bits    31..26             25..21   20..16   15..6              5..0
Field   SPECIAL (000000)   rs       rt       0 (00 0000 0000)   DIV (011010)
Width   6                  5        5        10                 6

MIPS I

Format: DIV rs, rt

Purpose: To divide 32-bit signed integers.

Description: (LO, HI) ← rs / rt

The 32-bit word value in GPR rs is divided by the 32-bit value in GPR rt, treating both operands as signed values. The 32-bit quotient is placed into special register LO and the 32-bit remainder is placed into special register HI.

No arithmetic exception occurs under any circumstances.

Restrictions:

If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation is undefined.

If the divisor in GPR rt is zero, the arithmetic result value is undefined.

Operation:

if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
q ← GPR[rs]_31..0 div GPR[rt]_31..0
LO_63..0 ← sign_extend(q_31..0)
r ← GPR[rs]_31..0 mod GPR[rt]_31..0
HI_63..0 ← sign_extend(r_31..0)

Exceptions:

None

Supplementary Explanation:

Normally, dividing 0x80000000 (-2147483648), the minimum signed value, by 0xFFFFFFFF (-1) results in an overflow. In this instruction, however, no overflow exception occurs and the result is as follows:

Quotient is 0x80000000 (-2147483648), and remainder is 0x00000000 (0).

The signs of the quotient and the remainder are determined by the signs of the dividend and the divisor, as shown in the table below:


Dividend   Divisor    Quotient   Remainder
Positive   Positive   Positive   Positive
Positive   Negative   Negative   Positive
Negative   Positive   Negative   Negative
Negative   Negative   Positive   Negative

Programming Notes:

In the C790, the integer divide operation proceeds asynchronously and allows other CPU instructions to execute before it is retired. An attempt to read LO or HI before the results are written will wait (interlock) until the results are ready. Asynchronous execution does not affect the program result, but offers an opportunity for performance improvement by scheduling the divide so that other instructions can execute in parallel.

No arithmetic exception occurs under any circumstances. If divide-by-zero or overflow conditions should be detected and some action taken, then the divide instruction is typically followed by additional instructions to check for a zero divisor and/or for overflow. If the divide is asynchronous then the zero-divisor check can execute in parallel with the divide. The action taken on either divide-by-zero or overflow is either a convention within the program itself or, more typically, the system software; one possibility is to take a BREAK exception with a code field value to signal the problem to the system software.

As an example, the C programming language in a UNIX environment expects division by zero to either terminate the program or execute a program-specified signal handler. C does not expect overflow to cause any exceptional condition. If the C compiler uses a divide instruction, it also emits code to test for a zero divisor and execute a BREAK instruction to inform the operating system if one is detected.

In the C790, sign-extended 32-bit values (bits 63..31) are ignored on divide operations.
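The documented DIV results, including the 0x80000000 / -1 case and the sign rules in the table, can be modelled in C roughly as follows. This is a sketch under assumptions: the structure and function names are hypothetical, and the zero-divisor case (architecturally undefined) is reported through a flag rather than given a value. The explicit check for the minimum value is needed because that division is undefined behaviour in C.

```c
#include <stdbool.h>
#include <stdint.h>

struct div_result {
    bool    defined;  /* false when the divisor is zero */
    int64_t lo;       /* quotient, sign-extended to 64 bits */
    int64_t hi;       /* remainder, sign-extended to 64 bits */
};

/* Hypothetical model of DIV on two signed 32-bit operands. */
static struct div_result div_word(int32_t rs, int32_t rt)
{
    struct div_result r = { false, 0, 0 };

    if (rt == 0)
        return r;                 /* result undefined; flagged, not trapped */

    r.defined = true;
    if (rs == INT32_MIN && rt == -1) {
        r.lo = INT32_MIN;         /* quotient 0x80000000, no exception */
        r.hi = 0;                 /* remainder 0 */
        return r;
    }
    r.lo = rs / rt;               /* C99 division truncates toward zero */
    r.hi = rs % rt;               /* remainder takes the dividend's sign */
    return r;
}
```

A caller that wants the behaviour described for C compilers would test `defined` (or the divisor directly) and raise its own error when it is false.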


DIVU

Divide Unsigned Word

Bits    31..26             25..21   20..16   15..6              5..0
Field   SPECIAL (000000)   rs       rt       0 (00 0000 0000)   DIVU (011011)
Width   6                  5        5        10                 6

MIPS I

Format: DIVU rs, rt

Purpose: To divide 32-bit unsigned integers.

Description: (LO, HI) ← rs / rt

The 32-bit word value in GPR rs is divided by the 32-bit value in GPR rt, treating both operands as unsigned values. The 32-bit quotient is placed into special register LO and the 32-bit remainder is placed into special register HI.

No arithmetic exception occurs under any circumstances.

Restrictions:

If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation is undefined.

If the divisor in GPR rt is zero, the arithmetic result is undefined.

Operation:

if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif
q ← (0 || GPR[rs]_31..0) div (0 || GPR[rt]_31..0)
LO_63..0 ← sign_extend(q_31..0)
r ← (0 || GPR[rs]_31..0) mod (0 || GPR[rt]_31..0)
HI_63..0 ← sign_extend(r_31..0)

Exceptions:

None

Programming Notes:

See the Programming Notes for the DIV instruction.


DSLL

Doubleword Shift Left Logical

Bits    31..26             25..21      20..16   15..11   10..6   5..0
Field   SPECIAL (000000)   0 (00000)   rt       rd       sa      DSLL (111000)
Width   6                  5           5        5        5       6

MIPS III

Format: DSLL rd, rt, sa

Purpose: To left shift a doubleword by a fixed amount, 0 to 31 bits.

Description: rd ← rt << sa

The 64-bit doubleword contents of GPR rt are shifted left, inserting zeros into the emptied bits; the result is placed in GPR rd. The bit shift count in the range 0 to 31 is specified by sa.

Restrictions:

None

Operation:

s ← 0 || sa
GPR[rd]_63..0 ← GPR[rt]_(63-s)..0 || 0^s

Exceptions:

None

Programming Notes:

None


DSLL32

Doubleword Shift Left Logical Plus 32

Bits    31..26             25..21      20..16   15..11   10..6   5..0
Field   SPECIAL (000000)   0 (00000)   rt       rd       sa      DSLL32 (111100)
Width   6                  5           5        5        5       6

MIPS III

Format: DSLL32 rd, rt, sa

Purpose: To left shift a doubleword by a fixed amount, 32 to 63 bits.

Description: rd ← rt << (sa + 32)

The 64-bit doubleword contents of GPR rt are shifted left, inserting zeros into the emptied bits; the result is placed in GPR rd. The bit shift count in the range 32 to 63 is specified by sa + 32.

Restrictions:

None

Operation:

s ← 1 || sa    /* 32 + sa */
GPR[rd]_63..0 ← GPR[rt]_(63-s)..0 || 0^s

Exceptions:

None

Programming Notes:

None


DSLLV

Doubleword Shift Left Logical Variable

Bits    31..26             25..21   20..16   15..11   10..6       5..0
Field   SPECIAL (000000)   rs       rt       rd       0 (00000)   DSLLV (010100)
Width   6                  5        5        5        5           6

MIPS III

Format: DSLLV rd, rt, rs

Purpose: To left shift a doubleword by a variable number of bits.

Description: rd ← rt << rs

The 64-bit doubleword contents of GPR rt are shifted left, inserting zeros into the emptied bits; the result is placed in GPR rd. The bit shift count in the range 0 to 63 is specified by the low-order six bits in GPR rs.

Restrictions:

None

Operation:

s ← 0 || GPR[rs]_5..0
GPR[rd]_63..0 ← GPR[rt]_(63-s)..0 || 0^s
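The three left-shift forms differ only in where the shift count comes from. A compact C sketch of the three Operation sections follows; the function names simply mirror the mnemonics, and the masking models the 5-bit sa field and the low-order six bits of GPR rs.

```c
#include <stdint.h>

/* DSLL: shift count is the 5-bit sa field (0 to 31). */
static uint64_t dsll(uint64_t rt, unsigned sa)
{
    return rt << (sa & 31u);
}

/* DSLL32: shift count is sa + 32 (32 to 63). */
static uint64_t dsll32(uint64_t rt, unsigned sa)
{
    return rt << (32u + (sa & 31u));
}

/* DSLLV: shift count is the low-order six bits of rs (0 to 63). */
static uint64_t dsllv(uint64_t rt, uint64_t rs)
{
    return rt << (rs & 63u);
}
```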

Exceptions:

None

Programming Notes:

None


DSRA

Doubleword Shift Right Arithmetic

Bits    31..26             25..21      20..16   15..11   10..6   5..0
Field   SPECIAL (000000)   0 (00000)   rt       rd       sa      DSRA (111011)
Width   6                  5           5        5        5       6

MIPS III

Format: DSRA rd, rt, sa

Purpose: To arithmetic right shift a doubleword by a fixed amount, 0 to 31 bits.

Description: rd ← rt >> sa (arithmetic)

The 64-bit doubleword contents of GPR rt are shifted right, duplicating the sign bit (63) into the emptied bits; the result is placed in GPR rd. The bit shift count in the range 0 to 31 is specified by sa.

Restrictions:

None

Operation:

s ← 0 || sa
GPR[rd]_63..0 ← (GPR[rt]_63)^s || GPR[rt]_63..s

Exceptions:

None

Programming Notes:

None


DSRA32

Doubleword Shift Right Arithmetic Plus 32

Bits    31..26             25..21      20..16   15..11   10..6   5..0
Field   SPECIAL (000000)   0 (00000)   rt       rd       sa      DSRA32 (111111)
Width   6                  5           5        5        5       6

MIPS III

Format: DSRA32 rd, rt, sa

Purpose: To arithmetic right shift a doubleword by a fixed amount, 32 to 63 bits.

Description: rd ← rt >> (sa + 32) (arithmetic)

The doubleword contents of GPR rt are shifted right, duplicating the sign bit (63) into the emptied bits; the result is placed in GPR rd. The bit shift count in the range 32 to 63 is specified by sa + 32.

Restrictions:

None

Operation:

s ← 1 || sa    /* 32 + sa */
GPR[rd]_63..0 ← (GPR[rt]_63)^s || GPR[rt]_63..s

Exceptions:

None

Programming Notes:

None


DSRAV

Doubleword Shift Right Arithmetic Variable

Bits    31..26             25..21   20..16   15..11   10..6       5..0
Field   SPECIAL (000000)   rs       rt       rd       0 (00000)   DSRAV (010111)
Width   6                  5        5        5        5           6

MIPS III

Format: DSRAV rd, rt, rs

Purpose: To arithmetic right shift a doubleword by a variable number of bits.

Description: rd ← rt >> rs (arithmetic)

The doubleword contents of GPR rt are shifted right, duplicating the sign bit (63) into the emptied bits; the result is placed in GPR rd. The bit shift count in the range 0 to 63 is specified by the low-order six bits in GPR rs.

Restrictions:

None

Operation:

s ← GPR[rs]_5..0
GPR[rd]_63..0 ← (GPR[rt]_63)^s || GPR[rt]_63..s
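The arithmetic right shifts replicate bit 63 into the vacated positions. Since right-shifting a negative signed value is implementation-defined in C, the sketch below builds the result from a logical shift plus an explicit sign fill. The helper `sra64` and the wrapper names are hypothetical, and the conversion back to `int64_t` assumes a two's complement target.

```c
#include <stdint.h>

/* Arithmetic right shift by s (0 to 63): logical shift, then fill the
 * top s bits with copies of the original sign bit. */
static int64_t sra64(int64_t value, unsigned s)
{
    uint64_t bits = (uint64_t)value >> s;

    if (value < 0 && s != 0)
        bits |= ~UINT64_C(0) << (64u - s);
    return (int64_t)bits;
}

static int64_t dsra(int64_t rt, unsigned sa)   { return sra64(rt, sa & 31u); }
static int64_t dsra32(int64_t rt, unsigned sa) { return sra64(rt, 32u + (sa & 31u)); }
static int64_t dsrav(int64_t rt, uint64_t rs)  { return sra64(rt, (unsigned)(rs & 63u)); }
```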

Exceptions:

None

Programming Notes:

None


DSRL

Doubleword Shift Right Logical

Bits    31..26             25..21      20..16   15..11   10..6   5..0
Field   SPECIAL (000000)   0 (00000)   rt       rd       sa      DSRL (111010)
Width   6                  5           5        5        5       6

MIPS III

Format: DSRL rd, rt, sa

Purpose: To logical right shift a doubleword by a fixed amount, 0 to 31 bits.

Description: rd ← rt >> sa (logical)

The doubleword contents of GPR rt are shifted right, inserting zeros into the emptied bits; the result is placed in GPR rd. The bit shift count in the range 0 to 31 is specified by sa.

Restrictions:

None

Operation:

s ← 0 || sa
GPR[rd]_63..0 ← 0^s || GPR[rt]_63..s

Exceptions:

None

Programming Notes:

None


DSRL32

Doubleword Shift Right Logical Plus 32

Bits    31..26             25..21      20..16   15..11   10..6   5..0
Field   SPECIAL (000000)   0 (00000)   rt       rd       sa      DSRL32 (111110)
Width   6                  5           5        5        5       6

MIPS III

Format: DSRL32 rd, rt, sa

Purpose: To logical right shift a doubleword by a fixed amount, 32 to 63 bits.

Description: rd ← rt >> (sa + 32) (logical)

The 64-bit doubleword contents of GPR rt are shifted right, inserting zeros into the emptied bits; the result is placed in GPR rd. The bit shift count in the range 32 to 63 is specified by sa + 32.

Restrictions:

None

Operation:

s ← 1 || sa    /* 32 + sa */
GPR[rd]_63..0 ← 0^s || GPR[rt]_63..s

Exceptions:

None

Programming Notes:

None


DSRLV

Doubleword Shift Right Logical Variable

Bits    31..26             25..21   20..16   15..11   10..6       5..0
Field   SPECIAL (000000)   rs       rt       rd       0 (00000)   DSRLV (010110)
Width   6                  5        5        5        5           6

MIPS III

Format: DSRLV rd, rt, rs

Purpose: To logical right shift a doubleword by a variable number of bits.

Description: rd ← rt >> rs (logical)

The 64-bit doubleword contents of GPR rt are shifted right, inserting zeros into the emptied bits; the result is placed in GPR rd. The bit shift count in the range 0 to 63 is specified by the low-order six bits in GPR rs.

Restrictions:

None

Operation:

s ← GPR[rs]_5..0
GPR[rd]_63..0 ← 0^s || GPR[rt]_63..s

Exceptions:

None

Programming Notes:

None


DSUB

Doubleword Subtract

Bits    31..26             25..21   20..16   15..11   10..6       5..0
Field   SPECIAL (000000)   rs       rt       rd       0 (00000)   DSUB (101110)
Width   6                  5        5        5        5           6

MIPS III

Format: DSUB rd, rs, rt

Purpose: To subtract 64-bit integers; trap if overflow.

Description: rd ← rs - rt

The 64-bit doubleword value in GPR rt is subtracted from the 64-bit value in GPR rs to produce a 64-bit result. If the subtraction results in 64-bit 2’s complement arithmetic overflow then the destination register is not modified and an Integer Overflow exception occurs. If it does not overflow, the 64-bit result is placed into GPR rd.

Restrictions:

None

Operation:

temp ← GPR[rs]_63..0 - GPR[rt]_63..0
if (64_bit_arithmetic_overflow) then
    SignalException(IntegerOverflow)
else
    GPR[rd]_63..0 ← temp
endif

Exceptions:

Integer Overflow

Programming Notes:

DSUBU performs the same arithmetic operation but does not trap on overflow.
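For subtraction, the overflow test differs from the addition case: overflow is only possible when the operands have different signs. A C sketch of the condition follows; the function name is hypothetical, and unsigned arithmetic is used so the wrapped difference is well defined in C.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when rs - rt overflows 64-bit two's complement arithmetic,
 * i.e. when DSUB would raise Integer Overflow. */
static bool dsub_overflows(int64_t rs, int64_t rt)
{
    uint64_t a = (uint64_t)rs, b = (uint64_t)rt, diff = a - b;

    /* Overflow iff the operands differ in sign and the result's sign
     * differs from the sign of rs. */
    return (((a ^ b) & (a ^ diff)) >> 63) != 0;
}
```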


DSUBU

Doubleword Subtract Unsigned

Bits    31..26             25..21   20..16   15..11   10..6       5..0
Field   SPECIAL (000000)   rs       rt       rd       0 (00000)   DSUBU (101111)
Width   6                  5        5        5        5           6

MIPS III

Format: DSUBU rd, rs, rt

Purpose: To subtract 64-bit integers.

Description: rd ← rs - rt

The 64-bit doubleword value in GPR rt is subtracted from the 64-bit value in GPR rs and the 64-bit arithmetic result is placed into GPR rd.

No Integer Overflow exception occurs under any circumstances.

Restrictions:

None

Operation:

GPR[rd]_63..0 ← GPR[rs]_63..0 - GPR[rt]_63..0

Exceptions:

None

Programming Notes:

The term “unsigned” in the instruction name is a misnomer; this operation is 64-bit

modulo arithmetic that does not trap on overflow. It is appropriate for arithmetic which is

not signed, such as address arithmetic, or integer arithmetic environments that ignore

overflow, such as C language arithmetic.


J

Jump

Bits    31..26       25..0
Field   J (000010)   instr_index
Width   6            26

MIPS I

Format: J target

Purpose: To branch within the current 256 MB aligned region.

Description:

This is a PC-region branch (not PC-relative); the effective target address is in the “current” 256 MB aligned region. The low 28 bits of the target address are the instr_index field shifted left 2 bits. The remaining upper bits are the corresponding bits of the address of the instruction in the delay slot (not the jump itself).

Jump to the effective target address. Execute the instruction following the jump, in the branch delay slot, before jumping.

Restrictions:

None

Operation:

I:
I+1:  PC ← PC_31..28 || instr_index || 0^2

Exceptions:

None

Programming Notes:

Forming the branch target address by concatenating PC and index bits rather than adding

a signed offset to the PC is an advantage if all program code addresses fit into a 256 MB

region aligned on a 256 MB boundary. It allows a branch to anywhere in the region from

anywhere in the region which a signed relative offset would not allow.

This definition creates the boundary case where the branch instruction is in the last word

of a 256 MB region and can therefore only branch to the following 256 MB region

containing the branch delay slot.
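The target formation, including the boundary case just described, can be sketched in C. The helper name `jump_target` is hypothetical and 32-bit instruction addresses are assumed, matching the PC bits 31..28 used in the Operation section.

```c
#include <stdint.h>

/* Hypothetical helper: effective target of J/JAL. jump_pc is the address
 * of the jump instruction; instr_index is the 26-bit field. */
static uint32_t jump_target(uint32_t jump_pc, uint32_t instr_index)
{
    uint32_t delay_slot = jump_pc + 4u;   /* region is taken from the delay slot */

    /* Upper 4 bits from the delay-slot address, low 28 bits from the field. */
    return (delay_slot & 0xF0000000u) | ((instr_index & 0x03FFFFFFu) << 2);
}
```

With a jump in the last word of a region (for example at 0x0FFFFFFC), the delay slot lies in the next region, so the computed target also falls in that next region.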


JAL

Jump and Link

Bits    31..26         25..0
Field   JAL (000011)   instr_index
Width   6              26

MIPS I

Format: JAL target

Purpose: To procedure call within the current 256 MB aligned region.

Description:

Place the return address link in GPR 31. The return link is the address of the second instruction following the branch, where execution would continue after a procedure call.

This is a PC-region branch (not PC-relative); the effective target address is in the “current” 256 MB aligned region. The low 28 bits of the target address are the instr_index field shifted left 2 bits. The remaining upper bits are the corresponding bits of the address of the instruction in the delay slot (not the jump itself).

Jump to the effective target address. Execute the instruction following the jump, in the branch delay slot, before jumping.

Restrictions:

None

Operation:

I:    GPR[31]_63..0 ← zero_extend(PC + 8)
I+1:  PC ← PC_31..28 || instr_index || 0^2

Exceptions:

None

Programming Notes:

Forming the branch target address by concatenating PC and index bits rather than adding

a signed offset to the PC is an advantage if all program code addresses fit into a 256 MB

region aligned on a 256 MB boundary. It allows a branch to anywhere in the region from

anywhere in the region which a signed relative offset would not allow.

This definition creates the boundary case where the branch instruction is in the last word

of a 256 MB region and can therefore only branch to the following 256 MB region

containing the branch delay slot.


JALR

Jump and Link Register

Bits     31..26              25..21   20..16      15..11   10..6       5..0
Field    SPECIAL (000000)    rs       0 (00000)   rd       0 (00000)   JALR (001001)
Width    6                   5        5           5        5           6

MIPS I

Format: JALR rs (rd = 31 implied)

JALR rd, rs

Purpose: To execute a procedure call to an instruction address in a register.

Description: rd ← return_addr, PC ← rs

Place the return address link in GPR rd. The return link is the address of the second
instruction following the branch, where execution would continue after a procedure call.

Jump to the effective target address in GPR rs. Execute the instruction following the jump,
in the branch delay slot, before jumping.

Restrictions:

Register specifiers rs and rd must not be equal, because such an instruction does not have
the same effect when re-executed. The result of executing such an instruction is undefined.
This restriction permits an exception handler to resume execution by re-executing the
branch when an exception occurs in the branch delay slot.

The effective target address in GPR rs must be naturally aligned. If either of the two
least-significant bits is not zero, then an Address Error exception occurs, not for the
jump instruction, but when the branch target is subsequently fetched as an instruction.

Operation:

I:   temp ← GPR[rs]_{31..0}

     GPR[rd]_{63..0} ← zero_extend(PC + 8)

I+1: PC ← temp

Exceptions:

None

Programming Notes:

This is the only branch-and-link instruction that can select a register for the return link;
all other link instructions use GPR 31. The default register for GPR rd, if omitted in the
assembly language instruction, is GPR 31.


JR

Jump Register

Bits     31..26              25..21   20..6                     5..0
Field    SPECIAL (000000)    rs       0 (000 0000 0000 0000)    JR (001000)
Width    6                   5        15                        6

MIPS I

Format: JR rs

Purpose: To branch to an instruction address in a register.

Description: PC ← rs

Jump to the effective target address in GPR rs. Execute the instruction following the jump,
in the branch delay slot, before jumping.

Restrictions:

The effective target address in GPR rs must be naturally aligned. If either of the two
least-significant bits is not zero, then an Address Error exception occurs, not for the
jump instruction, but when the branch target is subsequently fetched as an instruction.

Operation:

I:   temp ← GPR[rs]_{31..0}

I+1: PC ← temp

Exceptions:

None

Programming Notes:

None
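
As an informal illustration of the JALR and JR restrictions, the following Python sketch models the link write and the deferred alignment check. The names `jalr`, `fetch`, and `AddressError` are hypothetical and exist only for this example; raising `ValueError` for rs = rd is a modeling choice, since the architecture only calls that result undefined.

```python
MASK32 = 0xFFFFFFFF  # assumption: addresses are modeled as 32-bit values


class AddressError(Exception):
    """Stand-in for the Address Error exception (hypothetical name)."""


def jalr(gpr, pc, rd, rs):
    """Model JALR rd, rs: write the link register and return the jump target."""
    if rd == rs:
        raise ValueError("rs and rd must differ; the result is undefined")
    target = gpr[rs] & MASK32    # temp <- GPR[rs], read before the link is written
    gpr[rd] = (pc + 8) & MASK32  # return link, zero-extended
    return target                # no alignment check at the jump itself


def fetch(pc):
    """Model the instruction fetch, where a misaligned target is detected."""
    if pc & 0x3:
        raise AddressError(f"misaligned instruction address {pc:#x}")
    return pc


gpr = [0] * 32
gpr[25] = 0x80001002                    # deliberately misaligned target
target = jalr(gpr, 0x80000100, 31, 25)  # the jump itself completes
```

In this model the misaligned address is accepted by `jalr` and only rejected by `fetch`, mirroring the statement that the exception is taken when the target is fetched rather than at the jump.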


LB

Load Byte

Bits     31..26         25..21   20..16   15..0
Field    LB (100000)    base     rt       offset
Width    6              5        5        16

MIPS I

Format: LB rt, offset (base)

Purpose: To load a byte from memory as a signed value.

Description: rt ← memory [base + offset]

The contents of the 8-bit byte at the memory location specified by the effective address are
fetched, sign-extended, and placed in GPR rt. The 16-bit signed offset is added to the
contents of GPR base to form the effective address.

Restrictions:

None

Operation: (128-bit bus)

vAddr ← sign_extend(offset) + GPR[base]_{31..0}

(pAddr, uncached) ← AddressTranslation(vAddr, DATA, LOAD)

pAddr ← pAddr_{(PSIZE-1)..4} || (pAddr_{3..0} xor BigEndian^4)

memquad ← LoadMemory(uncached, BYTE, pAddr, vAddr, DATA)

byte ← vAddr_{3..0} xor BigEndian^4

GPR[rt]_{63..0} ← sign_extend(memquad_{(7+8*byte)..8*byte})

Exceptions:

TLB Refill

TLB Invalid

Address Error

Programming Notes:

None


LBU

Load Byte Unsigned

Bits     31..26          25..21   20..16   15..0
Field    LBU (100100)    base     rt       offset
Width    6               5        5        16

MIPS I

Format: LBU rt, offset (base)

Purpose: To load a byte from memory as an unsigned value.

Description: rt ← memory [base + offset]

The contents of the 8-bit byte at the memory location specified by the effective address are
fetched, zero-extended, and placed in GPR rt. The 16-bit signed offset is added to the
contents of GPR base to form the effective address.

Restrictions:

None

Operation: (128-bit bus)

vAddr ← sign_extend(offset) + GPR[base]_{31..0}

(pAddr, uncached) ← AddressTranslation(vAddr, DATA, LOAD)

pAddr ← pAddr_{(PSIZE-1)..4} || (pAddr_{3..0} xor BigEndian^4)

memquad ← LoadMemory(uncached, BYTE, pAddr, vAddr, DATA)

byte ← vAddr_{3..0} xor BigEndian^4

GPR[rt]_{63..0} ← zero_extend(memquad_{(7+8*byte)..8*byte})

Exceptions:

TLB Refill

TLB Invalid

Address Error

Programming Notes:

None
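
LB and LBU differ only in how the selected byte is widened to 64 bits. The Python sketch below models the byte-lane selection from a 128-bit quadword and the two extensions. The function name `load_byte` and the representation of the quadword as a Python integer are assumptions made for illustration.

```python
MASK64 = (1 << 64) - 1


def load_byte(memquad, vaddr, signed, big_endian=0):
    """Pick one byte lane from a 128-bit quadword and widen it to 64 bits."""
    # byte <- vAddr(3..0) xor BigEndian^4
    lane = (vaddr & 0xF) ^ (0xF if big_endian else 0x0)
    value = (memquad >> (8 * lane)) & 0xFF
    if signed and (value & 0x80):
        value |= MASK64 ^ 0xFF  # LB: replicate bit 7 into bits 63..8
    return value                # LBU: upper bits stay zero


# Quadword whose byte lane k holds 0x80 + k.
memquad = int.from_bytes(bytes(range(0x80, 0x90)), "little")
print(hex(load_byte(memquad, 3, signed=True)))   # LB-style result
print(hex(load_byte(memquad, 3, signed=False)))  # LBU-style result
```

With a byte whose top bit is set, the signed variant fills bits 63..8 with ones while the unsigned variant leaves them zero.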


LD

Load Doubleword

Bits     31..26         25..21   20..16   15..0
Field    LD (110111)    base     rt       offset
Width    6              5        5        16

MIPS III

Format: LD rt, offset (base)

Purpose: To load a doubleword from memory.

Description: rt ← memory [base + offset]

The contents of the 64-bit doubleword at the memory location specified by the aligned
effective address are fetched and placed in GPR rt. The 16-bit signed offset is added to the
contents of GPR base to form the effective address.

Restrictions:

The effective address must be naturally aligned. If any of the three least-significant bits of

the effective address are non-zero, an Address Error exception occurs.

Operation: (128-bit bus)

vAddr ← sign_extend(offset) + GPR[base]_{31..0}

if (vAddr_{2..0}) ≠ 0^3 then SignalException(AddressError) endif

(pAddr, uncached) ← AddressTranslation(vAddr, DATA, LOAD)

pAddr ← pAddr_{(PSIZE-1)..4} || (pAddr_{3..0} xor (BigEndian || 0^3))

byte ← vAddr_{3..0} xor (BigEndian || 0^3)

memquad ← LoadMemory(uncached, DOUBLEWORD, pAddr, vAddr, DATA)

GPR[rt]_{63..0} ← memquad_{(63+8*byte)..8*byte}

Exceptions:

TLB Refill

TLB Invalid

Address Error

Programming Notes:

None
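
The alignment restriction and doubleword selection for LD can be modeled as follows. This is a sketch under stated assumptions: `load_doubleword` and `AddressError` are hypothetical names, the quadword is a Python integer, and the lane is selected by an exclusive-or with (BigEndian || 0^3), as in the pAddr computation.

```python
MASK64 = (1 << 64) - 1


class AddressError(Exception):
    """Stand-in for the Address Error exception (hypothetical name)."""


def load_doubleword(memquad, vaddr, big_endian=0):
    """Check alignment, then select one 64-bit half of a 128-bit quadword."""
    if vaddr & 0x7:  # any of the three least-significant bits set
        raise AddressError(f"unaligned doubleword address {vaddr:#x}")
    lane = (vaddr & 0xF) ^ (0x8 if big_endian else 0x0)
    return (memquad >> (8 * lane)) & MASK64


memquad = (0x1111111111111111 << 64) | 0x2222222222222222
print(hex(load_doubleword(memquad, 0)))  # low half
print(hex(load_doubleword(memquad, 8)))  # high half
```

An address such as 4 raises the modeled exception before any memory access, matching the order of the steps in the Operation section.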


LDL

Load Doubleword Left

Bits     31..26          25..21   20..16   15..0
Field    LDL (011010)    base     rt       offset
Width    6               5        5        16

MIPS III

Format: LDL rt, offset (base)

Purpose: To load the more-significant part of a doubleword from an unaligned memory address.

Description: rt ← rt MERGE memory [base + offset]

Paired LDL and LDR instructions are used to load a register with a doubleword from

eight consecutive bytes in memory starting at an arbitrary byte address. LDL loads the

left (most-significant) bytes and LDR loads the right (least-significant) bytes.

The instruction adds the 16-bit signed offset to the contents of GPR base to form the
effective address. This is the address of the most-significant byte of a doubleword
composed of eight consecutive bytes in memory. LDL loads from one to eight bytes, the
most-significant bytes of the doubleword, into the corresponding bytes of GPR rt. It loads
the bytes that are in the target doubleword that are also in the aligned doubleword which
contains the byte specified by the effective address.

Conceptually, it starts at the specified byte in memory and loads that byte into the high-

order (left-most) byte of the register; then it loads bytes from memory into the register

until it reaches the low-order byte of the doubleword in memory. The least-significant

(right-most) byte(s) of the register will not be changed.

[Figure: LDL examples. Each panel shows memory bytes 0 to 15 and register $24 before and
after the instruction: LDL $24,11($0) with little-endian memory, and LDL $24,3($0) with
big-endian memory.]

The contents of GPR rt are internally bypassed within the processor so that no NOP is
needed between an immediately preceding load instruction which specifies register rt and
a following LDL (or LDR) instruction which also specifies register rt.


No address exceptions due to alignment are possible.

Restrictions:

None

Operation: (128-bit bus)

vAddr ← sign_extend(offset) + GPR[base]_{31..0}

(pAddr, uncached) ← AddressTranslation(vAddr, DATA, LOAD)

pAddr ← pAddr_{(PSIZE-1)..4} || (pAddr_{3..0} xor BigEndian^4)

if (BigEndian = 0) then

    pAddr ← pAddr_{(PSIZE-1)..3} || 0^3

endif

byte ← 0 || (vAddr_{2..0} xor BigEndian^3)

doubleword ← vAddr_{3} xor BigEndian

memquad ← LoadMemory(uncached, byte, pAddr, vAddr, DATA)

GPR[rt]_{63..0} ← memquad_{(7+8*byte+64*doubleword)..(64*doubleword)} || GPR[rt]_{(55-8*byte)..0}

Given a doubleword in a register and a doubleword in memory, the operation of LDL is as

follows:


LDL, little-endian byte ordering (BigEndianCPU = 0)

Register (bits 63..0, MSB to LSB):            a b c d e f g h
Memory (little-endian bytes 15..0, MSB to LSB): I J K L M N O P Q R S T U V W X

Lowercase letters in the result are register bytes that are unchanged.

vAddr_{3..0}   Register after (bits 63..0)   Type   Offset LEM   Offset BEM
0              X b c d e f g h               0      0            15
1              W X c d e f g h               1      0            14
2              V W X d e f g h               2      0            13
3              U V W X e f g h               3      0            12
4              T U V W X f g h               4      0            11
5              S T U V W X g h               5      0            10
6              R S T U V W X h               6      0            9
7              Q R S T U V W X               7      0            8
8              P b c d e f g h               0      8            7
9              O P c d e f g h               1      8            6
10             N O P d e f g h               2      8            5
11             M N O P e f g h               3      8            4
12             L M N O P f g h               4      8            3
13             K L M N O P g h               5      8            2
14             J K L M N O P h               6      8            1
15             I J K L M N O P               7      8            0


LDL, big-endian byte ordering (BigEndianCPU = 1)

Register (bits 63..0, MSB to LSB): a b c d e f g h
Memory (MSB to LSB):               I J K L M N O P Q R S T U V W X
(I is byte 0 in big-endian numbering and byte 15 in little-endian numbering.)

Lowercase letters in the result are register bytes that are unchanged.

vAddr_{3..0}   Register after (bits 63..0)   Type   Offset LEM   Offset BEM
0              I J K L M N O P               7      0            0
1              J K L M N O P h               6      0            1
2              K L M N O P g h               5      0            2
3              L M N O P f g h               4      0            3
4              M N O P e f g h               3      0            4
5              N O P d e f g h               2      0            5
6              O P c d e f g h               1      0            6
7              P b c d e f g h               0      0            7
8              Q R S T U V W X               7      8            8
9              R S T U V W X h               6      8            9
10             S T U V W X g h               5      8            10
11             T U V W X f g h               4      8            11
12             U V W X e f g h               3      8            12
13             V W X d e f g h               2      8            13
14             W X c d e f g h               1      8            14
15             X b c d e f g h               0      8            15

LEM      Little-endian memory (BigEndian = 0)
BEM      BigEndian = 1
Type     AccessLength sent to memory
Offset   pAddr_{3..0} sent to memory

Exceptions:

TLB Refill

TLB Invalid

Address Error

Programming Notes:

None


LDR

Load Doubleword Right

Bits     31..26          25..21   20..16   15..0
Field    LDR (011011)    base     rt       offset
Width    6               5        5        16

MIPS III

Format: LDR rt, offset (base)

Purpose: To load the less-significant part of a doubleword from an unaligned memory address.

Description: rt ← rt MERGE memory [base + offset]

Paired LDL and LDR instructions are used to load a register with a doubleword from

eight consecutive bytes in memory starting at an arbitrary byte address. LDL loads the

left (most-significant) bytes and LDR loads the right (least-significant) bytes.

The instruction adds the 16-bit signed offset to the contents of GPR base to form the
effective address. This is the address of the least-significant byte of a doubleword
composed of eight consecutive bytes in memory. LDR loads from one to eight bytes, the
least-significant bytes of the doubleword, into the corresponding bytes of GPR rt. It loads
the bytes that are in the target doubleword that are also in the aligned doubleword which
contains the byte specified by the effective address.

Conceptually, it starts at the specified byte in memory and loads that byte into the low-

order (right-most) byte of the register; then it loads bytes from memory into the register

until it reaches the high-order byte of the doubleword in memory. The most significant

(left-most) byte(s) of the register will not be changed.

[Figure: LDR examples. Each panel shows memory bytes 0 to 15 and register $24 before and
after the instruction: LDR $24,4($0) with little-endian memory, and LDR $24,4($0) with
big-endian memory.]

The contents of GPR rt are internally bypassed within the processor so that no NOP is
needed between an immediately preceding load instruction which specifies register rt and
a following LDR (or LDL) instruction which also specifies register rt.


No address exceptions due to alignment are possible.

Restrictions:

None

Operation: (128-bit bus)

vAddr ← sign_extend(offset) + GPR[base]_{31..0}

(pAddr, uncached) ← AddressTranslation(vAddr, DATA, LOAD)

pAddr ← pAddr_{(PSIZE-1)..4} || (pAddr_{3..0} xor BigEndian^4)

if (BigEndian = 1) then

    pAddr ← pAddr_{(PSIZE-1)..3} || 0^3

endif

byte ← 0 || (vAddr_{2..0} xor BigEndian^3)

doubleword ← vAddr_{3} xor BigEndian

memquad ← LoadMemory(uncached, byte, pAddr, vAddr, DATA)

GPR[rt]_{63..0} ← GPR[rt]_{63..(64-8*byte)} || memquad_{(63+64*doubleword)..(64*doubleword+8*byte)}

Given a doubleword in a register and a doubleword in memory, the operation of LDR is as

follows:


LDR, little-endian byte ordering (BigEndianCPU = 0)

Register (bits 63..0, MSB to LSB):            a b c d e f g h
Memory (little-endian bytes 15..0, MSB to LSB): I J K L M N O P Q R S T U V W X

Lowercase letters in the result are register bytes that are unchanged.

vAddr_{3..0}   Register after (bits 63..0)   Type   Offset LEM   Offset BEM
0              Q R S T U V W X               7      0            0
1              a Q R S T U V W               6      1            0
2              a b Q R S T U V               5      2            0
3              a b c Q R S T U               4      3            0
4              a b c d Q R S T               3      4            0
5              a b c d e Q R S               2      5            0
6              a b c d e f Q R               1      6            0
7              a b c d e f g Q               0      7            0
8              I J K L M N O P               7      8            0
9              a I J K L M N O               6      9            0
10             a b I J K L M N               5      10           0
11             a b c I J K L M               4      11           0
12             a b c d I J K L               3      12           0
13             a b c d e I J K               2      13           0
14             a b c d e f I J               1      14           0
15             a b c d e f g I               0      15           0


LDR, big-endian byte ordering (BigEndianCPU = 1)

Register (bits 63..0, MSB to LSB): a b c d e f g h
Memory (MSB to LSB):               I J K L M N O P Q R S T U V W X
(I is byte 0 in big-endian numbering and byte 15 in little-endian numbering.)

Lowercase letters in the result are register bytes that are unchanged.

vAddr_{3..0}   Register after (bits 63..0)   Type   Offset LEM   Offset BEM
0              a b c d e f g I               0      15           0
1              a b c d e f I J               1      14           0
2              a b c d e I J K               2      13           0
3              a b c d I J K L               3      12           0
4              a b c I J K L M               4      11           0
5              a b I J K L M N               5      10           0
6              a I J K L M N O               6      9            0
7              I J K L M N O P               7      8            0
8              a b c d e f g Q               0      7            0
9              a b c d e f Q R               1      6            0
10             a b c d e Q R S               2      5            0
11             a b c d Q R S T               3      4            0
12             a b c Q R S T U               4      3            0
13             a b Q R S T U V               5      2            0
14             a Q R S T U V W               6      1            0
15             Q R S T U V W X               7      0            0

LEM      Little-endian memory (BigEndianMem = 0)
BEM      BigEndianMem = 1
Type     AccessLength sent to memory
Offset   pAddr_{2..0} sent to memory

Exceptions:

TLB Refill

TLB Invalid

Address Error

Programming Notes:

None
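
The way a paired LDL and LDR assemble an unaligned doubleword can be illustrated with a small Python model. It covers only the little-endian case and uses a flat `bytes` object instead of the 128-bit bus; the helper names `ldl` and `ldr` are hypothetical.

```python
MASK64 = (1 << 64) - 1


def ldl(rt, mem, addr):
    """Little-endian LDL: load the bytes from the aligned start through addr
    into the most-significant end of rt, keeping the remaining low bytes."""
    n = (addr & 7) + 1
    base = addr & ~7
    loaded = int.from_bytes(mem[base:base + n], "little")
    keep_bits = 64 - 8 * n
    return ((loaded << keep_bits) | (rt & ((1 << keep_bits) - 1))) & MASK64


def ldr(rt, mem, addr):
    """Little-endian LDR: load the bytes from addr through the end of the
    aligned doubleword into the least-significant end of rt."""
    n = 8 - (addr & 7)
    loaded = int.from_bytes(mem[addr:addr + n], "little")
    keep_mask = MASK64 ^ ((1 << (8 * n)) - 1)
    return (rt & keep_mask) | loaded


mem = bytes(range(16))
value = ldr(0, mem, 3)           # bytes 3..7 into the low five bytes
value = ldl(value, mem, 3 + 7)   # bytes 8..10 into the high three bytes
print(hex(value))
```

For an unaligned doubleword starting at address A, the pair in this model is LDR at A followed by LDL at A + 7. The same helpers reproduce rows of the little-endian tables, for example vAddr 11 for LDL and vAddr 4 for LDR.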


LH

Load Halfword

Bits     31..26         25..21   20..16   15..0
Field    LH (100001)    base     rt       offset
Width    6              5        5        16

MIPS I

Format: LH rt, offset (base)

Purpose: To load a halfword from memory as a signed value.

Description: rt ← memory [base + offset]

The contents of the 16-bit halfword at the memory location specified by the aligned
effective address are fetched, sign-extended, and placed in GPR rt. The 16-bit signed
offset is added to the contents of GPR base to form the effective address.

Restrictions:

The effective address must be naturally aligned. If the least-significant bit of the address

is non-zero, an Address Error exception occurs.

Operation: (128-bit bus)

vAddr ← sign_extend(offset) + GPR[base]_{31..0}

if (vAddr_{0}) ≠ 0 then SignalException(AddressError) endif

(pAddr, uncached) ← AddressTranslation(vAddr, DATA, LOAD)

pAddr ← pAddr_{(PSIZE-1)..4} || (pAddr_{3..0} xor (BigEndian^3 || 0))

memquad ← LoadMemory(uncached, HALFWORD, pAddr, vAddr, DATA)

byte ← vAddr_{3..0} xor (BigEndian^3 || 0)

GPR[rt]_{63..0} ← sign_extend(memquad_{(15+8*byte)..8*byte})

Exceptions:

TLB Refill

TLB Invalid

Address Error

Programming Notes:

None


LHU

Load Halfword Unsigned

Bits     31..26          25..21   20..16   15..0
Field    LHU (100101)    base     rt       offset
Width    6               5        5        16

MIPS I

Format: LHU rt, offset (base)

Purpose: To load a halfword from memory as an unsigned value.

Description: rt ← memory [base + offset]

The contents of the 16-bit halfword at the memory location specified by the aligned
effective address are fetched, zero-extended, and placed in GPR rt. The 16-bit signed
offset is added to the contents of GPR base to form the effective address.

Restrictions:

The effective address must be naturally aligned. If the least-significant bit of the address

is non-zero, an Address Error exception occurs.

Operation: (128-bit bus)

vAddr ← sign_extend(offset) + GPR[base]_{31..0}

if (vAddr_{0}) ≠ 0 then SignalException(AddressError) endif

(pAddr, uncached) ← AddressTranslation(vAddr, DATA, LOAD)

pAddr ← pAddr_{(PSIZE-1)..4} || (pAddr_{3..0} xor (BigEndian^3 || 0))

memquad ← LoadMemory(uncached, HALFWORD, pAddr, vAddr, DATA)

byte ← vAddr_{3..0} xor (BigEndian^3 || 0)

GPR[rt]_{63..0} ← zero_extend(memquad_{(15+8*byte)..8*byte})

Exceptions:

TLB Refill

TLB Invalid

Address Error

Programming Notes:

None


LUI

Load Upper Immediate

Bits     31..26          25..21      20..16   15..0
Field    LUI (001111)    0 (00000)   rt       immediate
Width    6               5           5        16

MIPS I

Format: LUI rt, immediate

Purpose: To load a constant into the upper half of a word.

Description: rt ← immediate || 0^16

The 16-bit immediate is shifted left 16 bits and concatenated with 16 bits of low-order
zeros. The 32-bit result is sign-extended and placed into GPR rt.

Restrictions:

None

Operation:

GPR[rt]_{63..0} ← sign_extend(immediate || 0^16)

Exceptions:

None

Programming Notes:

None
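
The effect of the sign extension in LUI is easy to overlook when the immediate has its top bit set. The following Python sketch models the operation; `sign_extend` and `lui` are hypothetical helper names, and the final line shows the common idiom of combining the result with a 16-bit low half (as an ORI would) to form a 32-bit constant.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of value to a 64-bit pattern."""
    sign = 1 << (bits - 1)
    return ((value & (sign - 1)) - (value & sign)) & MASK64


def lui(immediate):
    """GPR[rt] <- sign_extend(immediate || 0^16)."""
    return sign_extend((immediate & 0xFFFF) << 16, 32)


print(hex(lui(0x1234)))           # upper half set, bits 63..32 zero
print(hex(lui(0x8000)))           # bit 31 set, so bits 63..32 become ones
print(hex(lui(0x1234) | 0x5678))  # constant built from upper and lower halves
```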


LW

Load Word

Bits     31..26         25..21   20..16   15..0
Field    LW (100011)    base     rt       offset
Width    6              5        5        16

MIPS I

Format: LW rt, offset (base)

Purpose: To load a word from memory as a signed value.

Description: rt ← memory [base + offset]

The contents of the 32-bit word at the memory location specified by the aligned effective
address are fetched, sign-extended to the GPR register length if necessary, and placed in
GPR rt. The 16-bit signed offset is added to the contents of GPR base to form the effective
address.

Restrictions:

The effective address must be naturally aligned. If either of the two least-significant bits

of the address are non-zero, an Address Error exception occurs.

Operation: (128-bit bus)

vAddr ← sign_extend(offset) + GPR[base]_{31..0}

if (vAddr_{1..0}) ≠ 0^2 then SignalException(AddressError) endif

(pAddr, uncached) ← AddressTranslation(vAddr, DATA, LOAD)

pAddr ← pAddr_{(PSIZE-1)..4} || (pAddr_{3..0} xor (BigEndian^2 || 0^2))

memquad ← LoadMemory(uncached, WORD, pAddr, vAddr, DATA)

byte ← vAddr_{3..0} xor (BigEndian^2 || 0^2)

GPR[rt]_{63..0} ← sign_extend(memquad_{(31+8*byte)..8*byte})

Exceptions:

TLB Refill

TLB Invalid

Address Error

Programming Notes:

None


LWL

Load Word Left

Bits     31..26          25..21   20..16   15..0
Field    LWL (100010)    base     rt       offset
Width    6               5        5        16

MIPS I

Format: LWL rt, offset (base)

Purpose: To load the more-significant part of a word from an unaligned memory address as a

signed value.

Description: rt ← rt MERGE memory [base + offset]

Paired LWL and LWR instructions are used to load a register with a word from four

consecutive bytes in memory starting at an arbitrary byte address. LWL loads the left

(most-significant) bytes and LWR loads the right (least-significant) bytes.

The instruction adds the 16-bit signed offset to the contents of GPR base to form the
effective address. This is the address of the most-significant byte of a word composed of
four consecutive bytes in memory. LWL loads from one to four bytes, the most-significant
bytes of the word, into the corresponding bytes of GPR rt. It loads the bytes that are in
the target word that are also in the aligned word which contains the byte specified by the
effective address.

Bit 31 of the register is loaded so the loaded word is sign-extended.

Conceptually, it starts at the specified byte in memory and loads that byte into the high-

order (left-most) byte of the register; then it loads bytes from memory into the register

until it reaches the low-order byte of the word in memory. The least-significant (right-

most) byte(s) of the register will not be changed.

[Figure: LWL examples. Each panel shows memory bytes 0 to 7 and register $24 before and
after the instruction: LWL $24,4($0) with little-endian memory, and LWL $24,1($0) with
big-endian memory.]


The contents of GPR rt are internally bypassed within the processor so that no NOP is
needed between an immediately preceding load instruction which specifies register rt and
a following LWL (or LWR) instruction which also specifies register rt.

No address exceptions due to alignment are possible.

Restrictions:

None

Operation: (128-bit bus)

vAddr ← sign_extend(offset) + GPR[base]_{31..0}

(pAddr, uncached) ← AddressTranslation(vAddr, DATA, LOAD)

pAddr ← pAddr_{(PSIZE-1)..4} || (pAddr_{3..0} xor BigEndian^4)

if (BigEndian = 0) then

    pAddr ← pAddr_{(PSIZE-1)..3} || 0^3

endif

byte ← 0^2 || (vAddr_{1..0} xor BigEndian^2)

word ← vAddr_{3..2} xor BigEndian^2

memquad ← LoadMemory(uncached, byte, pAddr, vAddr, DATA)

temp ← memquad_{(32*word+8*byte+7)..32*word} || GPR[rt]_{(23-8*byte)..0}

GPR[rt]_{63..0} ← (temp_{31})^32 || temp

Given a doubleword in a register and a doubleword in memory, the operation of LWL is as

follows:


LWL, little-endian byte ordering (BigEndianCPU = 0)

Register (bits 63..0, MSB to LSB):            a b c d e f g h
Memory (little-endian bytes 15..0, MSB to LSB): I J K L M N O P Q R S T U V W X

Lowercase letters in the result are register bytes that are unchanged.

vAddr_{3..0}   Bits 63..32              Bits 31..0   Type   Offset LEM   Offset BEM
0              Sign bit (31) extended   X f g h      0      0            15
1              Sign bit (31) extended   W X g h      1      0            14
2              Sign bit (31) extended   V W X h      2      0            13
3              Sign bit (31) extended   U V W X      3      0            12
4              Sign bit (31) extended   T f g h      0      4            11
5              Sign bit (31) extended   S T g h      1      4            10
6              Sign bit (31) extended   R S T h      2      4            9
7              Sign bit (31) extended   Q R S T      3      4            8
8              Sign bit (31) extended   P f g h      0      8            7
9              Sign bit (31) extended   O P g h      1      8            6
10             Sign bit (31) extended   N O P h      2      8            5
11             Sign bit (31) extended   M N O P      3      8            4
12             Sign bit (31) extended   L f g h      0      12           3
13             Sign bit (31) extended   K L g h      1      12           2
14             Sign bit (31) extended   J K L h      2      12           1
15             Sign bit (31) extended   I J K L      3      12           0


LWL, big-endian byte ordering (BigEndianCPU = 1)

Register (bits 63..0, MSB to LSB): a b c d e f g h
Memory (MSB to LSB):               I J K L M N O P Q R S T U V W X
(I is byte 0 in big-endian numbering and byte 15 in little-endian numbering.)

Lowercase letters in the result are register bytes that are unchanged.

vAddr_{3..0}   Bits 63..32              Bits 31..0   Type   Offset LEM   Offset BEM
0              Sign bit (31) extended   I J K L      3      12           0
1              Sign bit (31) extended   J K L h      2      12           1
2              Sign bit (31) extended   K L g h      1      12           2
3              Sign bit (31) extended   L f g h      0      12           3
4              Sign bit (31) extended   M N O P      3      8            4
5              Sign bit (31) extended   N O P h      2      8            5
6              Sign bit (31) extended   O P g h      1      8            6
7              Sign bit (31) extended   P f g h      0      8            7
8              Sign bit (31) extended   Q R S T      3      4            8
9              Sign bit (31) extended   R S T h      2      4            9
10             Sign bit (31) extended   S T g h      1      4            10
11             Sign bit (31) extended   T f g h      0      4            11
12             Sign bit (31) extended   U V W X      3      0            12
13             Sign bit (31) extended   V W X h      2      0            13
14             Sign bit (31) extended   W X g h      1      0            14
15             Sign bit (31) extended   X f g h      0      0            15

LEM      Little-endian memory (BigEndianMem = 0)
BEM      BigEndianMem = 1
Type     AccessLength sent to memory
Offset   pAddr_{2..0} sent to memory

Exceptions:

TLB Refill

TLB Invalid

Address Error

Programming Notes:

The architecture provides no direct support for treating unaligned words as unsigned

values, i.e. zeroing bits 63..32 of the destination register when bit 31 is loaded. See SLL or

SLLV for a single-instruction method of propagating the word sign bit in a register into

the upper half of a 64-bit register.


LWR

Load Word Right

Bits     31..26          25..21   20..16   15..0
Field    LWR (100110)    base     rt       offset
Width    6               5        5        16

MIPS I

Format: LWR rt, offset (base)

Purpose: To load the less-significant part of a word from an unaligned memory address as a signed

value.

Description: rt ← rt MERGE memory [base + offset]

Paired LWL and LWR instructions are used to load a register with a word from four

consecutive bytes in memory starting at an arbitrary byte address. LWL loads the left

(most-significant) bytes and LWR loads the right (least-significant) bytes.

The instruction adds the 16-bit signed offset to the contents of GPR base to form the
effective address. This is the address of the least-significant byte of a word composed of
four consecutive bytes in memory. LWR loads from one to four bytes, the least-significant
bytes of the word, into the corresponding bytes of GPR rt. It loads the bytes that are in
the target word that are also in the aligned word which contains the byte specified by the
effective address.

If the word sign bit (bit 31) is loaded from memory into the register by the instruction,

then the loaded word is sign-extended. If the sign bit is not loaded from memory by the

LWR, then bits 63..32 of the destination are unchanged.

Conceptually, it starts at the specified byte in memory and loads that byte into the low-

order (right-most) byte of the register; then it loads bytes from memory into the register

until it reaches the high-order byte of the word in memory. The most significant (left-

most) byte(s) of the register will not be changed.

[Figure: LWR examples. Each panel shows memory bytes 0 to 7 and register $24 before and
after the instruction: LWR $24,1($0) with little-endian memory, and LWR $24,4($0) with
big-endian memory.]

The contents of GPR rt are internally bypassed within the processor so that no NOP is
needed between an immediately preceding load instruction which specifies register rt and
a following LWR (or LWL) instruction which also specifies register rt.

No address exceptions due to alignment are possible.

Restrictions:

None

Operation: (128-bit bus)

vAddr ← sign_extend(offset) + GPR[base]_{31..0}

(pAddr, uncached) ← AddressTranslation(vAddr, DATA, LOAD)

pAddr ← pAddr_{(PSIZE-1)..4} || (pAddr_{3..0} xor BigEndian^4)

if (BigEndian = 1) then

    pAddr ← pAddr_{(PSIZE-1)..3} || 0^3

endif

byte ← 0^2 || (vAddr_{1..0} xor BigEndian^2)

word ← vAddr_{3..2} xor BigEndian^2

memquad ← LoadMemory(uncached, byte, pAddr, vAddr, DATA)

temp ← GPR[rt]_{31..(32-8*byte)} || memquad_{(31+32*word)..(32*word+8*byte)}

if (byte = 4) then

    utemp ← (temp_{31})^32          /* loaded bit 31, must sign extend */

else

    one of the following two behaviors:

    utemp ← GPR[rt]_{63..32}        /* leave what was there alone */

    utemp ← (GPR[rt]_{31})^32       /* sign-extend bit 31 */

endif

GPR[rt]_{63..0} ← utemp || temp

Given a word in a register and a word in memory, the operation of LWR is as follows:


LWR, little-endian byte ordering (BigEndianCPU = 0)

Register (bits 63..0, MSB to LSB):            a b c d e f g h
Memory (little-endian bytes 15..0, MSB to LSB): I J K L M N O P Q R S T U V W X

Lowercase letters in the result are register bytes that are unchanged.

vAddr_{3..0}   Bits 63..32                           Bits 31..0   Type   Offset LEM   Offset BEM
0              Sign bit (31) extended                e f g I      0      15           0
1              Sign bit (31) extended or unchanged   e f I J      1      14           0
2              Sign bit (31) extended or unchanged   e I J K      2      13           0
3              Sign bit (31) extended or unchanged   I J K L      3      12           0
4              Sign bit (31) extended                e f g M      0      11           4
5              Sign bit (31) extended or unchanged   e f M N      1      10           4
6              Sign bit (31) extended or unchanged   e M N O      2      9            4
7              Sign bit (31) extended or unchanged   M N O P      3      8            4
8              Sign bit (31) extended                e f g Q      0      7            8
9              Sign bit (31) extended or unchanged   e f Q R      1      6            8
10             Sign bit (31) extended or unchanged   e Q R S      2      5            8
11             Sign bit (31) extended or unchanged   Q R S T      3      4            8
12             Sign bit (31) extended                e f g U      0      3            12
13             Sign bit (31) extended or unchanged   e f U V      1      2            12
14             Sign bit (31) extended or unchanged   e U V W      2      1            12
15             Sign bit (31) extended or unchanged   U V W X      3      0            12


LWR, big-endian byte ordering (BigEndianCPU = 1)

Register (bits 63..0, MSB to LSB): a b c d e f g h
Memory (MSB to LSB):               I J K L M N O P Q R S T U V W X
(I is byte 0 in big-endian numbering and byte 15 in little-endian numbering.)

Lowercase letters in the result are register bytes that are unchanged.

vAddr_{3..0}   Bits 63..32                           Bits 31..0   Type   Offset LEM   Offset BEM
0              Sign bit (31) extended or unchanged   e f g I      0      15           0
1              Sign bit (31) extended or unchanged   e f I J      1      14           0
2              Sign bit (31) extended or unchanged   e I J K      2      13           0
3              Sign bit (31) extended                I J K L      3      12           0
4              Sign bit (31) extended or unchanged   e f g M      0      11           4
5              Sign bit (31) extended or unchanged   e f M N      1      10           4
6              Sign bit (31) extended or unchanged   e M N O      2      9            4
7              Sign bit (31) extended                M N O P      3      8            4
8              Sign bit (31) extended or unchanged   e f g Q      0      7            8
9              Sign bit (31) extended or unchanged   e f Q R      1      6            8
10             Sign bit (31) extended or unchanged   e Q R S      2      5            8
11             Sign bit (31) extended                Q R S T      3      4            8
12             Sign bit (31) extended or unchanged   e f g U      0      3            12
13             Sign bit (31) extended or unchanged   e f U V      1      2            12
14             Sign bit (31) extended or unchanged   e U V W      2      1            12
15             Sign bit (31) extended                U V W X      3      0            12

LEM      Little-endian memory (BigEndian = 0)
BEM      BigEndianMem = 1
Type     AccessLength sent to memory
Offset   pAddr_{2..0} sent to memory

Exceptions:

TLB Refill

TLB Invalid

Address Error

Programming Notes:

The architecture provides no direct support for treating unaligned words as unsigned

values, i.e. zeroing bits 63..32 of the destination register when bit 31 is loaded. See SLL or

SLLV for a single-instruction method of propagating the word sign bit in a register into

the upper half of a 64-bit register.
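
A paired LWR and LWL can be modeled in the same spirit. The Python sketch below covers only the little-endian case over a flat `bytes` object. The helper names are hypothetical, and where the architecture allows two behaviors for bits 63..32 (when LWR does not load bit 31) this model arbitrarily leaves them unchanged.

```python
MASK64 = (1 << 64) - 1


def sign_extend32(word):
    """Replicate bit 31 of a 32-bit value into bits 63..32."""
    word &= 0xFFFFFFFF
    return word | (MASK64 ^ 0xFFFFFFFF) if word & 0x80000000 else word


def lwl(rt, mem, addr):
    """Little-endian LWL: load the high-order bytes, then sign-extend bit 31."""
    n = (addr & 3) + 1
    base = addr & ~3
    loaded = int.from_bytes(mem[base:base + n], "little")
    keep_bits = 32 - 8 * n
    word = (loaded << keep_bits) | (rt & ((1 << keep_bits) - 1))
    return sign_extend32(word)


def lwr(rt, mem, addr):
    """Little-endian LWR: load the low-order bytes of the word."""
    n = 4 - (addr & 3)
    loaded = int.from_bytes(mem[addr:addr + n], "little")
    if n == 4:
        return sign_extend32(loaded)           # bit 31 was loaded
    keep_mask = MASK64 ^ ((1 << (8 * n)) - 1)  # modeling choice: upper bits unchanged
    return (rt & keep_mask) | loaded


mem = bytes([0x10, 0x32, 0x54, 0x76, 0x98, 0xBA, 0xDC, 0xFE])
value = lwr(0, mem, 1)          # bytes 1..3 into the low three bytes
value = lwl(value, mem, 1 + 3)  # byte 4 into bits 31..24, then sign-extend
print(hex(value))
```

Because LWL always loads bit 31, issuing it last in this model leaves the register holding the sign-extended unaligned word regardless of which behavior LWR chose for the upper half.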


LWU

Load Word Unsigned

Bits     31..26          25..21   20..16   15..0
Field    LWU (100111)    base     rt       offset
Width    6               5        5        16

MIPS III

Format: LWU rt, offset (base)

Purpose: To load a word from memory as an unsigned value.

Description: rt ← memory [base + offset]

The contents of the 32-bit word at the memory location specified by the aligned effective
address are fetched, zero-extended, and placed in GPR rt. The 16-bit signed offset is added
to the contents of GPR base to form the effective address.

Restrictions:

The effective address must be naturally aligned. If either of the two least-significant bits

of the address are non-zero, an Address Error Exception occurs.

Operation: (128-bit bus)

vAddr ← sign_extend(offset) + GPR[base]_{31..0}

if (vAddr_{1..0}) ≠ 0^2 then SignalException(AddressError) endif

(pAddr, uncached) ← AddressTranslation(vAddr, DATA, LOAD)

pAddr ← pAddr_{(PSIZE-1)..4} || (pAddr_{3..0} xor (BigEndian^2 || 0^2))

memquad ← LoadMemory(uncached, WORD, pAddr, vAddr, DATA)

byte ← vAddr_{3..0} xor (BigEndian^2 || 0^2)

GPR[rt]_{63..0} ← 0^32 || memquad_{(31+8*byte)..8*byte}

Exceptions:

TLB Refill

TLB Invalid

Address Error

Programming Notes:

None


MFHI

Move from HI Register

Bits     31..26              25..16             15..11   10..6       5..0
Field    SPECIAL (000000)    0 (00 0000 0000)   rd       0 (00000)   MFHI (010000)
Width    6                   10                 5        5           6

MIPS I

Format: MFHI rd

Purpose: To copy the special purpose HI register to a GPR.

Description: rd ← HI

The contents of special register HI are loaded into GPR rd.

Restrictions:

None

Operation:

GPR[rd]_{63..0} ← HI_{63..0}

Exceptions:

None

Programming Notes:

No restriction is needed because C790 has an interlock mechanism for MULT or DIV

instructions.


MFLO

Move from LO Register

Bits     31..26              25..16             15..11   10..6       5..0
Field    SPECIAL (000000)    0 (00 0000 0000)   rd       0 (00000)   MFLO (010010)
Width    6                   10                 5        5           6

MIPS I

Format: MFLO rd

Purpose: To copy the special purpose LO register to a GPR.

Description: rd ← LO

The contents of special register LO are loaded into GPR rd.

Restrictions:

None

Operation:

GPR [rd] 63..0 ← LO63..0

Exceptions:

None

Programming Notes:

(Same as MFHI)

MOVN

Move Conditional on Not Zero

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    SPECIAL   rs        rt        rd        0         MOVN
Value    000000                                  00000     001011
Width    6         5         5         5         5         6

MIPS IV

Format: MOVN rd, rs, rt

Purpose: To conditionally move a GPR after testing a GPR value.

Description: if (rt ≠ 0) then rd ← rs

If the value in GPR rt is not equal to zero, then the contents of GPR rs are placed into GPR rd.

Restrictions:

None

Operation:

if GPR [rt] 63..0 ≠ 0 then

GPR [rd] 63..0 ← GPR [rs] 63..0

endif

Exceptions:

None

Programming Notes:

The nonzero value tested here is the “condition true” result from the SLT, SLTI, SLTU,

and SLTIU comparison instructions.

MOVZ

Move Conditional on Zero

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    SPECIAL   rs        rt        rd        0         MOVZ
Value    000000                                  00000     001010
Width    6         5         5         5         5         6

MIPS IV

Format: MOVZ rd, rs, rt

Purpose: To conditionally move a GPR after testing a GPR value.

Description: if (rt = 0) then rd ← rs

If the value in GPR rt is equal to zero, then the contents of GPR rs are placed into GPR rd.

Restrictions:

None

Operation:

if GPR [rt] 63..0 = 0 then

GPR [rd] 63..0 ← GPR [rs] 63..0

endif

Exceptions:

None

Programming Notes:

The zero value tested here is the “condition false” result from the SLT, SLTI, SLTU, and

SLTIU comparison instructions.
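As the Programming Notes for MOVN and MOVZ suggest, the two conditional moves pair naturally with the set-on-less-than instructions to select a value without branching. The following Python sketch models that pairing on 64-bit register images; the helper names and the max/min routines are illustrative assumptions, not code from this manual.

```python
MASK64 = (1 << 64) - 1


def to_signed(x: int) -> int:
    """Interpret a 64-bit register image as a signed integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def slt(rs: int, rt: int) -> int:
    return 1 if to_signed(rs) < to_signed(rt) else 0


def movn(rd: int, rs: int, rt: int) -> int:
    """Return the new value of rd: rs if rt is nonzero, else rd unchanged."""
    return rs if (rt & MASK64) != 0 else rd


def movz(rd: int, rs: int, rt: int) -> int:
    """Return the new value of rd: rs if rt is zero, else rd unchanged."""
    return rs if (rt & MASK64) == 0 else rd


def signed_max(a: int, b: int) -> int:
    t = slt(a, b)              # t = 1 ("condition true") when a < b
    return movn(a, b, t)       # replace a with b only when a < b


def signed_min(a: int, b: int) -> int:
    t = slt(a, b)              # t = 0 ("condition false") when a >= b
    return movz(a, b, t)       # replace a with b only when a >= b
```

Both routines use one comparison followed by one conditional move, so no branch is needed for the selection.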

MTHI

Move to HI Register

Bits     31..26    25..21    20..6                 5..0
Field    SPECIAL   rs        0                     MTHI
Value    000000              000 0000 0000 0000    010001
Width    6         5         15                    6

MIPS I

Format: MTHI rs

Purpose: To copy a GPR to the special purpose HI register.

Description: HI ← rs

The contents of GPR rs are loaded into special register HI.

Restrictions:

None

Operation:

HI63..0 ← GPR [rs] 63..0

Exceptions:

None

Programming Notes:

None

MTLO

Move to LO Register

Bits     31..26    25..21    20..6                 5..0
Field    SPECIAL   rs        0                     MTLO
Value    000000              000 0000 0000 0000    010011
Width    6         5         15                    6

MIPS I

Format: MTLO rs

Purpose: To copy a GPR to the special purpose LO register.

Description: LO ← rs

The contents of GPR rs are loaded into special register LO.

Restrictions:

None

Operation:

LO63..0 ← GPR [rs] 63..0

Exceptions:

None

Programming Notes:

None

MULT

Multiply Word

Bits     31..26    25..21    20..16    15..6           5..0
Field    SPECIAL   rs        rt        0               MULT
Value    000000                        00 0000 0000    011000
Width    6         5         5         10              6

MIPS I

Format: MULT rs, rt

Purpose: To multiply 32-bit signed integers.

Description: (LO, HI) ← rs × rt

The 32-bit word value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both operands as signed values, to produce a 64-bit result. The low-order 32-bit word of the result is placed into special register LO, and the high-order 32-bit word is placed into special register HI.

No arithmetic exception occurs under any circumstances.

Restrictions:

If either GPR rt or GPR rs do not contain sign-extended 32-bit values (bits 63..31 equal), then the result of the operation is undefined.

Operation:

if (NotWordValue (GPR[rs]) or NotWordValue (GPR[rt])) then UndefinedResult () endif
prod ← GPR[rs]31..0 * GPR[rt]31..0
LO63..0 ← (prod31)^32 || prod31..0
HI63..0 ← (prod63)^32 || prod63..32

Exceptions:

None

Programming Notes:

In the C790, the integer multiply operation proceeds asynchronously and allows other CPU instructions to execute before it is retired. An attempt to read LO or HI before the results are written will wait (interlock) until the results are ready. Asynchronous execution does not affect the program result, but offers an opportunity for performance improvement by scheduling the multiply so that other instructions can execute in parallel.

Programs that require overflow detection must check for it explicitly.
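The Operation pseudocode and the note on explicit overflow detection can be modeled as follows. This Python sketch is an illustration under stated assumptions: registers are represented as 64-bit integers, and `overflowed_32` is a hypothetical helper showing one common way to test whether the product fits in 32 signed bits (HI must equal the sign fill of LO).

```python
MASK64 = (1 << 64) - 1


def sext32(x: int) -> int:
    """Sign-extend the low 32 bits of x to a 64-bit register image."""
    x &= 0xFFFFFFFF
    return x | 0xFFFFFFFF00000000 if x & 0x80000000 else x


def signed32(x: int) -> int:
    """Interpret the low 32 bits of x as a signed integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def mult(rs: int, rt: int):
    """Return (LO, HI) after a signed 32 x 32 multiply."""
    prod = (signed32(rs) * signed32(rt)) & MASK64
    return sext32(prod), sext32(prod >> 32)


def overflowed_32(lo: int, hi: int) -> bool:
    """True when the signed product does not fit in 32 bits."""
    expected_hi = MASK64 if lo & 0x80000000 else 0
    return hi != expected_hi
```

For example, multiplying 0x7FFFFFFF by 2 leaves HI at zero while bit 31 of LO is set, so the check reports overflow.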

MULTU

Multiply Unsigned Word

Bits     31..26    25..21    20..16    15..6           5..0
Field    SPECIAL   rs        rt        0               MULTU
Value    000000                        00 0000 0000    011001
Width    6         5         5         10              6

MIPS I

Format: MULTU rs, rt

Purpose: To multiply 32-bit unsigned integers.

Description: (LO, HI) ← rs × rt

The 32-bit word value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both operands as unsigned values, to produce a 64-bit result. The low-order 32-bit word of the result is placed into special register LO, and the high-order 32-bit word is placed into special register HI.

No arithmetic exception occurs under any circumstances.

Restrictions:

If either GPR rt or GPR rs do not contain sign-extended 32-bit values (bits 63..31 equal), then the result of the operation is undefined.

Operation:

if (NotWordValue (GPR[rs]) or NotWordValue (GPR[rt])) then UndefinedResult () endif
prod ← (0 || GPR[rs]31..0) * (0 || GPR[rt]31..0)
LO63..0 ← (prod31)^32 || prod31..0
HI63..0 ← (prod63)^32 || prod63..32

Exceptions:

None

Programming Notes:

See the Programming Notes for the MULT instruction.
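A small Python model may help show that MULTU treats the operand words as unsigned while still writing sign-extended 32-bit halves to LO and HI. The function names are hypothetical, and `product64` is an assumed helper that reassembles the unsigned 64-bit product from the two registers.

```python
def sext32(x: int) -> int:
    """Sign-extend the low 32 bits of x to a 64-bit register image."""
    x &= 0xFFFFFFFF
    return x | 0xFFFFFFFF00000000 if x & 0x80000000 else x


def multu(rs: int, rt: int):
    """Return (LO, HI) after an unsigned 32 x 32 multiply."""
    prod = (rs & 0xFFFFFFFF) * (rt & 0xFFFFFFFF)
    return sext32(prod), sext32(prod >> 32)


def product64(lo: int, hi: int) -> int:
    """Reassemble the unsigned 64-bit product from LO and HI."""
    return ((hi & 0xFFFFFFFF) << 32) | (lo & 0xFFFFFFFF)
```

With both operands holding the word 0xFFFFFFFF, the model yields LO = 1 and HI = 0xFFFFFFFFFFFFFFFE, and `product64` recovers 0xFFFFFFFE00000001.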

NOR

Not Or

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    SPECIAL   rs        rt        rd        0         NOR
Value    000000                                  00000     100111
Width    6         5         5         5         5         6

MIPS I

Format: NOR rd, rs, rt

Purpose: To do a bitwise logical NOT OR.

Description: rd ← rs NOR rt

The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical NOR operation. The result is placed into GPR rd.

Restrictions:

None

Operation:

GPR [rd] 63..0 ← GPR [rs] 63..0 nor GPR [rt] 63..0

Exceptions:

None

Programming Notes:

None
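One familiar use of NOR is a bitwise NOT, obtained by using a zero value as the second operand. The Python sketch below models this on 64-bit values; the function names are hypothetical, and the use of a zero operand (GPR 0 in MIPS assembly) is an assumption about how the instruction would typically be applied rather than something stated on this page.

```python
MASK64 = (1 << 64) - 1


def nor(rs: int, rt: int) -> int:
    """Model NOR on 64-bit register images."""
    return ~(rs | rt) & MASK64


def bitwise_not(rs: int) -> int:
    """NOT expressed as NOR with a zero operand."""
    return nor(rs, 0)
```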

OR

Or

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    SPECIAL   rs        rt        rd        0         OR
Value    000000                                  00000     100101
Width    6         5         5         5         5         6

MIPS I

Format: OR rd, rs, rt

Purpose: To do a bitwise logical OR.

Description: rd ← rs OR rt

The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical OR operation. The result is placed into GPR rd.

Restrictions:

None

Operation:

GPR [rd] 63..0 ← GPR [rs] 63..0 or GPR [rt] 63..0

Exceptions:

None

Programming Notes:

None

ORI

Or Immediate

Bits     31..26    25..21    20..16    15..0
Field    ORI       rs        rt        immediate
Value    001101
Width    6         5         5         16

MIPS I

Format: ORI rt, rs, immediate

Purpose: To do a bitwise logical OR with a constant.

Description: rt ← rs OR immediate

The 16-bit immediate is zero-extended to the left and combined with the contents of GPR rs in a bitwise logical OR operation. The result is placed into GPR rt.

Restrictions:

None

Operation:

GPR [rt] 63..0 ← zero_extend (immediate) or GPR [rs] 63..0

Exceptions:

None

Programming Notes:

None
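Because the immediate is zero-extended rather than sign-extended, ORI never alters bits 63..16 of the source value. A minimal Python sketch of that behavior, with a hypothetical function name:

```python
MASK64 = (1 << 64) - 1


def ori(rs: int, immediate: int) -> int:
    """Model ORI: OR a zero-extended 16-bit immediate into a 64-bit value."""
    return (rs | (immediate & 0xFFFF)) & MASK64
```

Even an immediate with bit 15 set, such as 0x8000, leaves the upper bits of the result equal to those of rs.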

PREF

Prefetch

Bits     31..26    25..21    20..16    15..0
Field    PREF      base      hint      offset
Value    110011
Width    6         5         5         16

MIPS IV

Format: PREF hint, offset (base)

Purpose: To prefetch data from memory.

Description: prefetch_memory (base+offset)

PREF adds the 16-bit signed offset to the contents of GPR base to form an effective byte address. It advises that data at the effective address may be used in the near future. If the hint field is 00000 (binary), this instruction prefetches a block of data from main memory into cache.

PREF is an advisory instruction. It may change the performance of the program. For all hint values and all effective addresses, it neither changes architecturally-visible state nor alters the meaning of the program.

PREF does not cause addressing-related exceptions. If it raises an exception condition, the exception condition is ignored. If an addressing-related exception condition is raised and ignored, no data will be prefetched. Even if no data is prefetched in such a case, some action that is not architecturally-visible, such as writeback of a dirty cache line, might take place.

PREF will never generate a memory operation for a location with an uncached memory access type.

The defined hint values are shown in the table below. The C790 only supports hint = 0. The hint table may be extended in future implementations.

Values of hint field for prefetch instruction

Value   Name         Data use and desired prefetch action
0       load         Data is expected to be loaded (not modified). Fetch data as if for a load.
1-31    (Reserved)   (Reserved)


Restrictions:

None

Operation:

vAddr ← sign_extend (offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, LOAD)
Prefetch (uncached, pAddr, vAddr, DATA, hint)

Exceptions:

None

Programming Notes:

Prefetch cannot prefetch data from a mapped location unless the translation for that location is present in the TLB. Locations in memory pages that have not been accessed recently may not have translations in the TLB, so prefetch may not be effective for such locations.

Prefetch on the C790 may not prefetch data when there is an outstanding bus read process due to a data cache miss, an uncached load, or a miss on the uncached accelerated buffer.

Prefetch does not cause addressing exceptions. It will not cause an exception to prefetch using an address pointer value before the validity of the pointer is determined.

Implementation Notes:

A reserved hint field value causes a default prefetch action, the load hint.

SB

Store Byte

Bits     31..26    25..21    20..16    15..0
Field    SB        base      rt        offset
Value    101000
Width    6         5         5         16

MIPS I

Format: SB rt, offset (base)

Purpose: To store a byte to memory.

Description: memory [base + offset] ← rt

The least-significant 8-bit byte of GPR rt is stored in memory at the location specified by the effective address. The 16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

None

Operation: (128-bit bus)

vAddr ← sign_extend (offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor BigEndian^4)
byte ← vAddr3..0 xor BigEndian^4
dataquad ← GPR[rt](127-8*byte)..0 || 0^(8*byte)
StoreMemory (uncached, BYTE, dataquad, pAddr, vAddr, DATA)

Exceptions:

TLB Refill

TLB Invalid

TLB Modified

Address Error

Programming Notes:

None

SD

Store Doubleword

Bits     31..26    25..21    20..16    15..0
Field    SD        base      rt        offset
Value    111111
Width    6         5         5         16

MIPS III

Format: SD rt, offset (base)

Purpose: To store a doubleword to memory.

Description: memory [base + offset] ← rt

The 64-bit doubleword in GPR rt is stored in memory at the location specified by the aligned effective address. The 16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

The effective address must be naturally aligned. If any of the three least-significant bits of

the effective address are non-zero, an Address Error exception occurs.

Operation: (128-bit bus)

vAddr ← sign_extend (offset) + GPR[base]31..0
if (vAddr2..0) ≠ 0^3 then SignalException (AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor (BigEndian || 0^3))
byte ← vAddr3..0 xor (BigEndian || 0^3)
dataquad ← GPR[rt](127-8*byte)..0 || 0^(8*byte)
StoreMemory (uncached, DOUBLEWORD, dataquad, pAddr, vAddr, DATA)

Exceptions:

TLB Refill

TLB Invalid

TLB Modified

Address Error

Programming Notes:

None

SDL

Store Doubleword Left

Bits     31..26    25..21    20..16    15..0
Field    SDL       base      rt        offset
Value    101100
Width    6         5         5         16

MIPS III

Format: SDL rt, offset (base)

Purpose: To store the more-significant part of a doubleword to an unaligned memory address.

Description: memory [base + offset] ← rt

Paired SDL and SDR instructions are used to store a doubleword from a register into eight consecutive bytes in memory starting at an arbitrary byte address. SDL stores the left (most-significant) bytes and SDR stores the right (least-significant) bytes.

The 16-bit signed offset is added to the contents of GPR base to form the effective address of the most-significant byte of the contiguous doubleword in memory. It alters only the doubleword in memory which contains that byte. From one to eight bytes will be stored, depending on the starting byte specified.

Conceptually, it starts at the most-significant byte of the register and copies it to the specified byte in memory; then it copies bytes from register to memory until it reaches the low-order byte of the doubleword in memory.

No address exceptions due to alignment are possible.

[Figure: SDL $24, 10($0) with little-endian memory. Memory bytes 0 to 15 and register $24 (bytes A to H) before the store, and memory after the store.]

[Figure: SDL $24, 1($0). Memory bytes 0 to 15 and register $24 (bytes A to H) before the store, and memory after the store.]

Restrictions:

None

Operation: (128-bit bus)

vAddr ← sign_extend (offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor BigEndian^4)
if (BigEndian = 0) then
    pAddr ← pAddr(PSIZE-1)..3 || 0^3
endif
byte ← 0 || (vAddr2..0 xor BigEndian^3)
if (vAddr3 xor BigEndian = 0) then
    dataquad ← 0^64 || 0^(56-8*byte) || GPR[rt]63..(56-8*byte)
else
    dataquad ← 0^(56-8*byte) || GPR[rt]63..(56-8*byte) || 0^64
endif
StoreMemory (uncached, byte, dataquad, pAddr, vAddr, DATA)

Given a doubleword in a register and a doubleword in memory, the operation of SDL is as

follows:


SDL

Register (bits 63..0, MSB to LSB): A B C D E F G H
Memory (bits 127..0, MSB to LSB; little-endian byte offsets 15..0): i j k l m n o p q r s t u v w x

Little-endian byte ordering (BigEndianCPU = 0)

Destination memory contents after instruction (lowercase bytes are unchanged)

vAddr3..0   bits 127..64      bits 63..0        Type   Offset LEM   Offset BEM
0           i j k l m n o p   q r s t u v w A   0      8            15
1           i j k l m n o p   q r s t u v A B   1      8            14
2           i j k l m n o p   q r s t u A B C   2      8            13
3           i j k l m n o p   q r s t A B C D   3      8            12
4           i j k l m n o p   q r s A B C D E   4      8            11
5           i j k l m n o p   q r A B C D E F   5      8            10
6           i j k l m n o p   q A B C D E F G   6      8            9
7           i j k l m n o p   A B C D E F G H   7      8            8
8           i j k l m n o A   q r s t u v w x   8      0            7
9           i j k l m n A B   q r s t u v w x   9      0            6
10          i j k l m A B C   q r s t u v w x   10     0            5
11          i j k l A B C D   q r s t u v w x   11     0            4
12          i j k A B C D E   q r s t u v w x   12     0            3
13          i j A B C D E F   q r s t u v w x   13     0            2
14          i A B C D E F G   q r s t u v w x   14     0            1
15          A B C D E F G H   q r s t u v w x   15     0            0


SDL

Register (bits 63..0, MSB to LSB): A B C D E F G H
Memory (bits 127..0, MSB to LSB): i j k l m n o p q r s t u v w x

Big-endian byte ordering (BigEndianCPU = 1)

Destination memory contents after instruction (lowercase bytes are unchanged)

vAddr3..0   bits 127..64      bits 63..0        Type   Offset LEM   Offset BEM
0           A B C D E F G H   q r s t u v w x   15     0            0
1           i A B C D E F G   q r s t u v w x   14     0            1
2           i j A B C D E F   q r s t u v w x   13     0            2
3           i j k A B C D E   q r s t u v w x   12     0            3
4           i j k l A B C D   q r s t u v w x   11     0            4
5           i j k l m A B C   q r s t u v w x   10     0            5
6           i j k l m n A B   q r s t u v w x   9      0            6
7           i j k l m n o A   q r s t u v w x   8      0            7
8           i j k l m n o p   A B C D E F G H   7      0            8
9           i j k l m n o p   q A B C D E F G   6      0            9
10          i j k l m n o p   q r A B C D E F   5      0            10
11          i j k l m n o p   q r s A B C D E   4      0            11
12          i j k l m n o p   q r s t A B C D   3      0            12
13          i j k l m n o p   q r s t u A B C   2      0            13
14          i j k l m n o p   q r s t u v A B   1      0            14
15          i j k l m n o p   q r s t u v w A   0      0            15

LEM: Little-endian memory (BigEndianMem = 0)
BEM: Big-endian memory (BigEndianMem = 1)
Type: AccessLength sent to memory
Offset: pAddr3..0 sent to memory

Exceptions:

TLB Refill

TLB Invalid

TLB Modified

Address Error

Programming Notes:

None
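The byte movement described for SDL can be modeled on a byte-addressed 16-byte quadword. The Python sketch below is an illustration only: it models which memory bytes change, not the bus-level dataquad, AccessLength, or offset values, and all function names are hypothetical. Rendering the result in the same left-to-right order as the tables (bits 127..0) reproduces the memory-contents columns.

```python
def sdl(mem: bytearray, addr: int, reg: int, big_endian: bool) -> None:
    """Store the most-significant register bytes, starting at byte address addr."""
    reg_bytes = reg.to_bytes(8, "big")  # reg_bytes[0] is the MSB ("A")
    if big_endian:
        # Bytes run upward from addr to the end of the aligned doubleword.
        for k in range(8 - (addr & 7)):
            mem[addr + k] = reg_bytes[k]
    else:
        # Bytes run downward from addr to the start of the aligned doubleword.
        for k in range((addr & 7) + 1):
            mem[addr - k] = reg_bytes[k]


def quad(text: str, big_endian: bool) -> bytearray:
    """Build a 16-byte quadword from its bits-127..0 rendering."""
    order = range(16) if big_endian else range(15, -1, -1)
    mem = bytearray(16)
    for ch, a in zip(text, order):
        mem[a] = ord(ch)
    return mem


def show(mem: bytearray, big_endian: bool) -> str:
    """Render a quadword in table order (bits 127..0, left to right)."""
    order = range(16) if big_endian else range(15, -1, -1)
    return "".join(chr(mem[a]) for a in order)


def sdl_row(vaddr: int, big_endian: bool) -> str:
    """Memory contents after SDL at vaddr, using the tables' sample data."""
    mem = quad("ijklmnopqrstuvwx", big_endian)
    sdl(mem, vaddr, int.from_bytes(b"ABCDEFGH", "big"), big_endian)
    return show(mem, big_endian)
```

For instance, `sdl_row(10, False)` gives `ijklmABCqrstuvwx`, which corresponds to the little-endian row for vAddr3..0 = 10.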

SDR

Store Doubleword Right

Bits     31..26    25..21    20..16    15..0
Field    SDR       base      rt        offset
Value    101101
Width    6         5         5         16

MIPS III

Format: SDR rt, offset (base)

Purpose: To store the less-significant part of a doubleword to an unaligned memory address.

Description: memory [base + offset] ← rt

Paired SDL and SDR instructions are used to store a doubleword from a register into eight consecutive bytes in memory starting at an arbitrary byte address. SDL stores the left (most-significant) bytes and SDR stores the right (least-significant) bytes.

The SDR instruction adds its sign-extended 16-bit offset to the contents of GPR base to form an effective address which may specify an arbitrary byte. It alters only the doubleword in memory which contains that byte. From one to eight bytes will be stored, depending on the starting byte specified.

Conceptually, it starts at the least-significant (rightmost) byte of the register and copies it to the specified byte in memory; then it copies bytes from register to memory until it reaches the high-order byte of the doubleword in memory. No address exceptions due to alignment are possible.

[Figure: SDR $24, 3($0) with little-endian memory. Memory bytes 0 to 15 and register $24 (bytes A to H) before the store, and memory after the store.]

[Figure: SDR $24, 5($0) with big-endian memory. Memory bytes 0 to 15 and register $24 (bytes A to H) before the store, and memory after the store.]

Restrictions:

None


Operation: (128-bit bus)

vAddr ← sign_extend (offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor BigEndian^4)
if (BigEndian = 0) then
    pAddr ← pAddr(PSIZE-1)..3 || 0^3
endif
byte ← vAddr2..0 xor BigEndian^3
if (vAddr3 xor BigEndian = 0) then
    dataquad ← 0^64 || GPR[rt](63-8*byte)..0 || 0^(8*byte)
else
    dataquad ← GPR[rt](63-8*byte)..0 || 0^(8*byte) || 0^64
endif
StoreMemory (uncached, DOUBLEWORD-byte, dataquad, pAddr, vAddr, DATA)

Given a doubleword in a register and a doubleword in memory, the operation of SDR is as

follows:


SDR

Register (bits 63..0, MSB to LSB): A B C D E F G H
Memory (bits 127..0, MSB to LSB; little-endian byte offsets 15..0): i j k l m n o p q r s t u v w x

Little-endian byte ordering (BigEndianCPU = 0)

Destination memory contents after instruction (lowercase bytes are unchanged)

vAddr3..0   bits 127..64      bits 63..0        Type   Offset LEM   Offset BEM
0           i j k l m n o p   A B C D E F G H   7      0            0
1           i j k l m n o p   B C D E F G H x   6      1            0
2           i j k l m n o p   C D E F G H w x   5      2            0
3           i j k l m n o p   D E F G H v w x   4      3            0
4           i j k l m n o p   E F G H u v w x   3      4            0
5           i j k l m n o p   F G H t u v w x   2      5            0
6           i j k l m n o p   G H s t u v w x   1      6            0
7           i j k l m n o p   H r s t u v w x   0      7            0
8           A B C D E F G H   q r s t u v w x   7      8            0
9           B C D E F G H p   q r s t u v w x   6      9            0
10          C D E F G H o p   q r s t u v w x   5      10           0
11          D E F G H n o p   q r s t u v w x   4      11           0
12          E F G H m n o p   q r s t u v w x   3      12           0
13          F G H l m n o p   q r s t u v w x   2      13           0
14          G H k l m n o p   q r s t u v w x   1      14           0
15          H j k l m n o p   q r s t u v w x   0      15           0


SDR

Register (bits 63..0, MSB to LSB): A B C D E F G H
Memory (bits 127..0, MSB to LSB): i j k l m n o p q r s t u v w x

Big-endian byte ordering (BigEndianCPU = 1)

Destination memory contents after instruction (lowercase bytes are unchanged)

vAddr3..0   bits 127..64      bits 63..0        Type   Offset LEM   Offset BEM
0           H j k l m n o p   q r s t u v w x   0      15           0
1           G H k l m n o p   q r s t u v w x   1      14           0
2           F G H l m n o p   q r s t u v w x   2      13           0
3           E F G H m n o p   q r s t u v w x   3      12           0
4           D E F G H n o p   q r s t u v w x   4      11           0
5           C D E F G H o p   q r s t u v w x   5      10           0
6           B C D E F G H p   q r s t u v w x   6      9            0
7           A B C D E F G H   q r s t u v w x   7      8            0
8           i j k l m n o p   H r s t u v w x   0      7            0
9           i j k l m n o p   G H s t u v w x   1      6            0
10          i j k l m n o p   F G H t u v w x   2      5            0
11          i j k l m n o p   E F G H u v w x   3      4            0
12          i j k l m n o p   D E F G H v w x   4      3            0
13          i j k l m n o p   C D E F G H w x   5      2            0
14          i j k l m n o p   B C D E F G H x   6      1            0
15          i j k l m n o p   A B C D E F G H   7      0            0

LEM: Little-endian memory (BigEndianMem = 0)
BEM: Big-endian memory (BigEndianMem = 1)
Type: AccessLength sent to memory
Offset: pAddr3..0 sent to memory

Exceptions:

TLB Refill

TLB Invalid

TLB Modified

Address Error

Programming Notes:

None
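The pairing of SDL and SDR to store an unaligned doubleword can be sketched as follows. This Python model works on a byte-addressed buffer and covers only which bytes are written; the function names are hypothetical, and the choice of which instruction receives `addr` and which receives `addr + 7` follows from the byte directions described above (it is an inference, not a listing from this manual).

```python
def sdl(mem: bytearray, addr: int, reg: int, big_endian: bool) -> None:
    """Store the most-significant register bytes, starting at addr."""
    reg_bytes = reg.to_bytes(8, "big")  # MSB first
    if big_endian:
        for k in range(8 - (addr & 7)):
            mem[addr + k] = reg_bytes[k]
    else:
        for k in range((addr & 7) + 1):
            mem[addr - k] = reg_bytes[k]


def sdr(mem: bytearray, addr: int, reg: int, big_endian: bool) -> None:
    """Store the least-significant register bytes, starting at addr."""
    reg_bytes = reg.to_bytes(8, "little")  # LSB first
    if big_endian:
        for k in range((addr & 7) + 1):
            mem[addr - k] = reg_bytes[k]
    else:
        for k in range(8 - (addr & 7)):
            mem[addr + k] = reg_bytes[k]


def store_unaligned(mem: bytearray, addr: int, reg: int, big_endian: bool) -> None:
    """Store 8 bytes at an arbitrary address with one SDL and one SDR."""
    if big_endian:
        sdl(mem, addr, reg, True)       # MSB lives at the lowest address
        sdr(mem, addr + 7, reg, True)
    else:
        sdr(mem, addr, reg, False)      # LSB lives at the lowest address
        sdl(mem, addr + 7, reg, False)
```

In the little-endian case with `addr = 3`, the pair corresponds to the SDR at offset 3 and the SDL at offset 10 used in the figures for these two instructions.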

SH

Store Halfword

Bits     31..26    25..21    20..16    15..0
Field    SH        base      rt        offset
Value    101001
Width    6         5         5         16

MIPS I

Format: SH rt, offset (base)

Purpose: To store a halfword to memory.

Description: memory [base + offset] ← rt

The least-significant 16-bit halfword of register rt is stored in memory at the location specified by the aligned effective address. The 16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

The effective address must be naturally aligned. If the least-significant bit of the address

is non-zero, an Address Error exception occurs.

Operation: (128-bit bus)

vAddr ← sign_extend (offset) + GPR[base]31..0
if (vAddr0) ≠ 0 then SignalException (AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor (BigEndian^3 || 0))
byte ← vAddr3..0 xor (BigEndian^3 || 0)
dataquad ← GPR[rt](127-8*byte)..0 || 0^(8*byte)
StoreMemory (uncached, HALFWORD, dataquad, pAddr, vAddr, DATA)

Exceptions:

TLB Refill

TLB Invalid

TLB Modified

Address Error

Programming Notes:

None
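The natural-alignment rules stated in the Restrictions of SB, SH, SW and SD follow one pattern: the low address bits covered by the access size must be zero. A minimal Python sketch of that check, with a hypothetical function name; it models only the alignment condition, not other causes of an Address Error exception.

```python
ACCESS_SIZE = {"SB": 1, "SH": 2, "SW": 4, "SD": 8}  # bytes stored


def misaligned(mnemonic: str, vaddr: int) -> bool:
    """True when vaddr violates natural alignment for the given store."""
    size = ACCESS_SIZE[mnemonic]
    return (vaddr & (size - 1)) != 0
```

SB can never be misaligned under this rule, which matches its Restrictions entry of "None".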

SLL

Shift Word Left Logical

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    SPECIAL   0         rt        rd        sa        SLL
Value    000000    00000                                   000000
Width    6         5         5         5         5         6

MIPS I

Format: SLL rd, rt, sa

Purpose: To left shift a word by a fixed number of bits.

Description: rd ← rt << sa

The contents of the low-order 32-bit word of GPR rt are shifted left, inserting zeroes into the emptied bits; the word result is placed in GPR rd. The bit shift count is specified by sa. The result word is sign-extended.

Restrictions:

None

Operation:

s ← sa
temp ← GPR[rt](31-s)..0 || 0^s
GPR[rd]63..0 ← sign_extend (temp31..0)

Exceptions:

None

Programming Notes:

Unlike nearly all other word operations the input operand does not have to be a properly

sign-extended word value to produce a valid sign-extended 32-bit result. The result word

is always sign extended into a 64-bit destination register; this instruction with a zero shift

amount truncates a 64-bit value to 32 bits and sign extends it and stores it in the

destination register.
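The truncating behavior described in the note above can be checked with a short model. This Python sketch uses hypothetical function names and represents registers as 64-bit integers.

```python
def sext32(x: int) -> int:
    """Sign-extend the low 32 bits of x to a 64-bit register image."""
    x &= 0xFFFFFFFF
    return x | 0xFFFFFFFF00000000 if x & 0x80000000 else x


def sll(rt: int, sa: int) -> int:
    """Model SLL: shift the low word left by sa (0..31), then sign-extend."""
    return sext32((rt & 0xFFFFFFFF) << (sa & 0x1F))
```

With a zero shift amount, the input 0x0000000180000000 (not a valid sign-extended word) becomes 0xFFFFFFFF80000000: the upper 32 bits are discarded and replaced by copies of bit 31.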

SLLV

Shift Word Left Logical Variable

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    SPECIAL   rs        rt        rd        0         SLLV
Value    000000                                  00000     000100
Width    6         5         5         5         5         6

MIPS I

Format: SLLV rd, rt, rs

Purpose: To left shift a word by a variable number of bits.

Description: rd ← rt << rs

The contents of the low-order 32-bit word of GPR rt are shifted left, inserting zeroes into the emptied bits; the result word is placed in GPR rd. The bit shift count is specified by the low-order five bits of GPR rs. The result word is sign-extended.

Restrictions:

None

Operation:

s ← GPR[rs]4..0
temp ← GPR[rt](31-s)..0 || 0^s
GPR[rd]63..0 ← sign_extend (temp31..0)

Exceptions:

None

Programming Notes:

None

SLT

Set on Less Than

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    SPECIAL   rs        rt        rd        0         SLT
Value    000000                                  00000     101010
Width    6         5         5         5         5         6

MIPS I

Format: SLT rd, rs, rt

Purpose: To record the result of a less-than comparison.

Description: rd ← (rs < rt)

Compare the contents of GPR rs and GPR rt as signed integers and record the Boolean result of the comparison in GPR rd. If GPR rs is less than GPR rt the result is 1 (true), otherwise 0 (false).

The arithmetic comparison does not cause an Integer Overflow exception.

Restrictions:

None

Operation:

if GPR[rs]63..0 < GPR[rt]63..0 then
    GPR[rd]63..0 ← 0^(GPRLEN-1) || 1
else
    GPR[rd]63..0 ← 0^GPRLEN
endif

Exceptions:

None

Programming Notes:

None

SLTI

Set on Less Than Immediate

Bits     31..26    25..21    20..16    15..0
Field    SLTI      rs        rt        immediate
Value    001010
Width    6         5         5         16

MIPS I

Format: SLTI rt, rs, immediate

Purpose: To record the result of a less-than comparison with a constant.

Description: rt ← (rs < immediate)

Compare the contents of GPR rs and the 16-bit signed immediate as signed integers and record the Boolean result of the comparison in GPR rt. If GPR rs is less than immediate the result is 1 (true), otherwise 0 (false).

The arithmetic comparison does not cause an Integer Overflow exception.

Restrictions:

None

Operation:

if GPR[rs]63..0 < sign_extend (immediate) then
    GPR[rt]63..0 ← 0^(GPRLEN-1) || 1
else
    GPR[rt]63..0 ← 0^GPRLEN
endif

Exceptions:

None

Programming Notes:

None

SLTIU

Set on Less Than Immediate Unsigned

Bits     31..26    25..21    20..16    15..0
Field    SLTIU     rs        rt        immediate
Value    001011
Width    6         5         5         16

MIPS I

Format: SLTIU rt, rs, immediate

Purpose: To record the result of an unsigned less-than comparison with a constant.

Description: rt ← (rs < immediate)

Compare the contents of GPR rs and the sign-extended 16-bit immediate as unsigned integers and record the Boolean result of the comparison in GPR rt. If GPR rs is less than immediate the result is 1 (true), otherwise 0 (false).

Because the 16-bit immediate is sign-extended before comparison, the instruction is able to represent the smallest or largest unsigned numbers. The representable values are at the minimum [0, 32767] or maximum [max_unsigned-32767, max_unsigned] end of the unsigned range.

The arithmetic comparison does not cause an Integer Overflow exception.

Restrictions:

None

Operation:

if (0 || GPR[rs]63..0) < (0 || sign_extend (immediate)) then
    GPR[rt]63..0 ← 0^(GPRLEN-1) || 1
else
    GPR[rt]63..0 ← 0^GPRLEN
endif

Exceptions:

None

Programming Notes:

None
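The effect of sign-extending the immediate before an unsigned comparison can be illustrated with a short Python model. The function names are hypothetical; registers are represented as 64-bit integers.

```python
MASK64 = (1 << 64) - 1


def sext16(immediate: int) -> int:
    """Sign-extend a 16-bit immediate to a 64-bit value."""
    immediate &= 0xFFFF
    return immediate | 0xFFFFFFFFFFFF0000 if immediate & 0x8000 else immediate


def sltiu(rs: int, immediate: int) -> int:
    """Model SLTIU: unsigned compare against the sign-extended immediate."""
    return 1 if (rs & MASK64) < sext16(immediate) else 0
```

An immediate of 0x8000 maps to max_unsigned-32767 and 0xFFFF maps to max_unsigned, which is the upper range quoted in the description. An immediate of 1 gives a result of 1 only when rs is zero.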

SLTU

Set on Less Than Unsigned

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    SPECIAL   rs        rt        rd        0         SLTU
Value    000000                                  00000     101011
Width    6         5         5         5         5         6

MIPS I

Format: SLTU rd, rs, rt

Purpose: To record the result of an unsigned less-than comparison.

Description: rd ← (rs < rt)

Compare the contents of GPR rs and GPR rt as unsigned integers and record the Boolean result of the comparison in GPR rd. If GPR rs is less than GPR rt the result is 1 (true), otherwise 0 (false).

The arithmetic comparison does not cause an Integer Overflow exception.

Restrictions:

None

Operation:

if (0 || GPR[rs]63..0) < (0 || GPR[rt]63..0) then
    GPR[rd]63..0 ← 0^(GPRLEN-1) || 1
else
    GPR[rd]63..0 ← 0^GPRLEN
endif

Exceptions:

None

Programming Notes:

None
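The difference between SLT and SLTU shows up only when the operands differ in bit 63. The Python sketch below models both on 64-bit register images and adds `add_with_carry`, a hypothetical helper illustrating a common use of an unsigned compare to recover the carry out of a 64-bit addition; none of these names come from this manual.

```python
MASK64 = (1 << 64) - 1


def to_signed(x: int) -> int:
    """Interpret a 64-bit register image as a signed integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def slt(rs: int, rt: int) -> int:
    return 1 if to_signed(rs) < to_signed(rt) else 0


def sltu(rs: int, rt: int) -> int:
    return 1 if (rs & MASK64) < (rt & MASK64) else 0


def add_with_carry(a: int, b: int):
    """Return (sum mod 2**64, carry); carry is 1 when the sum wrapped."""
    total = (a + b) & MASK64
    return total, sltu(total, a)
```

The all-ones pattern compares as -1 under SLT and as the largest value under SLTU.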

SRA

Shift Word Right Arithmetic

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    SPECIAL   0         rt        rd        sa        SRA
Value    000000    00000                                   000011
Width    6         5         5         5         5         6

MIPS I

Format: SRA rd, rt, sa

Purpose: To arithmetic right shift a word by a fixed number of bits.

Description: rd ← rt >> sa (arithmetic)

The contents of the low-order 32-bit word of GPR rt are shifted right, duplicating the sign-bit (bit 31) in the emptied bits; the word result is placed in GPR rd. The bit shift count is specified by sa. The result word is sign-extended.

Restrictions:

If GPR rt does not contain a sign-extended 32-bit value (bits 63..31 equal) then the result of the operation is undefined.

Operation:

if (NotWordValue (GPR[rt]63..0)) then UndefinedResult () endif
s ← sa
temp ← (GPR[rt]31)^s || GPR[rt]31..s
GPR[rd]63..0 ← sign_extend (temp31..0)

Exceptions:

None

Programming Notes:

None

SRAV

Shift Word Right Arithmetic Variable

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    SPECIAL   rs        rt        rd        0         SRAV
Value    000000                                  00000     000111
Width    6         5         5         5         5         6

MIPS I

Format: SRAV rd, rt, rs

Purpose: To arithmetic right shift a word by a variable number of bits.

Description: rd ← rt >> rs (arithmetic)

The contents of the low-order 32-bit word of GPR rt are shifted right, duplicating the sign-bit (bit 31) in the emptied bits; the word result is placed in GPR rd. The bit shift count is specified by the low-order five bits of GPR rs. The result word is sign-extended.

Restrictions:

If GPR rt does not contain a sign-extended 32-bit value (bits 63..31 equal) then the result of the operation is undefined.

Operation:

if (NotWordValue (GPR[rt]63..0)) then UndefinedResult () endif
s ← GPR[rs]4..0
temp ← (GPR[rt]31)^s || GPR[rt]31..s
GPR[rd]63..0 ← sign_extend (temp31..0)

Exceptions:

None

Programming Notes:

None

SRL

Shift Word Right Logical

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    SPECIAL   0         rt        rd        sa        SRL
Value    000000    00000                                   000010
Width    6         5         5         5         5         6

MIPS I

Format: SRL rd, rt, sa

Purpose: To logical right shift a word by a fixed number of bits.

Description: rd ← rt >> sa (logical)

The contents of the low-order 32-bit word of GPR rt are shifted right, inserting zeros into the emptied bits; the word result is placed in GPR rd. The bit shift count is specified by sa. The result word is sign-extended.

Restrictions:

If GPR rt does not contain a sign-extended 32-bit value (bits 63..31 equal) then the result of the operation is undefined.

Operation:

if (NotWordValue (GPR[rt]63..0)) then UndefinedResult () endif
s ← sa
temp ← 0^s || GPR[rt]31..s
GPR[rd]63..0 ← sign_extend (temp31..0)

Exceptions:

None

Programming Notes:

None
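The two right shifts differ only in what fills the vacated bits of the 32-bit word; both sign-extend the word result to 64 bits. The Python sketch below models SRA and SRL together with the NotWordValue condition from the Restrictions. All names are hypothetical, and the model simply returns a value where the manual says the result is undefined.

```python
MASK64 = (1 << 64) - 1


def sext32(x: int) -> int:
    """Sign-extend the low 32 bits of x to a 64-bit register image."""
    x &= 0xFFFFFFFF
    return x | 0xFFFFFFFF00000000 if x & 0x80000000 else x


def not_word_value(x: int) -> bool:
    """True when bits 63..31 are not all equal."""
    top = (x & MASK64) >> 31          # the 33 bits 63..31
    return top not in (0, (1 << 33) - 1)


def srl(rt: int, sa: int) -> int:
    """Logical right shift of the low word, zero fill, then sign-extend."""
    return sext32((rt & 0xFFFFFFFF) >> (sa & 0x1F))


def sra(rt: int, sa: int) -> int:
    """Arithmetic right shift of the low word, sign fill, then sign-extend."""
    word = rt & 0xFFFFFFFF
    if word & 0x80000000:
        word -= 1 << 32               # make the word negative so >> sign-fills
    return sext32(word >> (sa & 0x1F))
```

Note that SRL with a zero shift amount returns a negative word unchanged, because the 32-bit result is still sign-extended.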

SRLV

Shift Word Right Logical Variable

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    SPECIAL   rs        rt        rd        0         SRLV
Value    000000                                  00000     000110
Width    6         5         5         5         5         6

MIPS I

Format: SRLV rd, rt, rs

Purpose: To logical right shift a word by a variable number of bits.

Description: rd ← rt >> rs (logical)

The contents of the low-order 32-bit word of GPR rt are shifted right, inserting zeros into the emptied bits; the word result is placed in GPR rd. The bit shift count is specified by the low-order five bits of GPR rs. The result word is sign-extended.

Restrictions:

If GPR rt does not contain a sign-extended 32-bit value (bits 63..31 equal) then the result of the operation is undefined.

Operation:

if (NotWordValue (GPR[rt]63..0)) then UndefinedResult () endif
s ← GPR[rs]4..0
temp ← 0^s || GPR[rt]31..s
GPR[rd]63..0 ← sign_extend (temp31..0)

Exceptions:

None

Programming Notes:

None

SUB

Subtract Word

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    SPECIAL   rs        rt        rd        0         SUB
Value    000000                                  00000     100010
Width    6         5         5         5         5         6

MIPS I

Format: SUB rd, rs, rt

Purpose: To subtract 32-bit integers. If overflow occurs, then trap.

Description: rd ← rs - rt

The 32-bit word value in GPR rt is subtracted from the 32-bit value in GPR rs to produce a 32-bit result. If the subtraction results in 32-bit 2's complement arithmetic overflow then the destination register is not modified and an Integer Overflow exception occurs. If it does not overflow, the 32-bit result is placed into GPR rd.

Restrictions:

If either GPR rt or GPR rs do not contain sign-extended 32-bit values (bits 63..31 equal), then the result of the operation is undefined.

Operation:

if (NotWordValue (GPR[rs]63..0) or NotWordValue (GPR[rt]63..0)) then UndefinedResult () endif
temp ← GPR[rs]63..0 - GPR[rt]63..0
if (32_bit_arithmetic_overflow) then
    SignalException (IntegerOverflow)
else
    GPR[rd]63..0 ← sign_extend (temp31..0)
endif

Exceptions:

Integer Overflow

Programming Notes:

SUBU performs the same arithmetic operation, but does not trap on overflow.

Appendix A CPU Instruct ion Set Details

A-115

SUBU SUBU

Subtract Unsigned Wor d

SPECIAL

000000 SUBU

100011

rt rd 0

00000

rs

31 26 25 21 20 16 15 11 10 6 5 0

6 5 5 5 5 6

MIPS I

Format: SUBU rd, rs, rt

Purpose: To subtract 32-bit integers.

Description: rd ← rs - rt

The 32-bit word value in GPR rt is subtracted from the 32-bit value in GPR rs and the
32-bit arithmetic result is placed into GPR rd.

No integer overflow exception occurs under any circumstances.

Restrictions:

If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal),
then the result of the operation is undefined.

Operation:

if (NotWordValue (GPR[rs] 63..0) or NotWordValue (GPR[rt] 63..0)) then UndefinedResult () endif

temp ← GPR [rs] 63..0 - GPR [rt] 63..0

GPR [rd] 63..0 ← sign_extend (temp31..0)

Exceptions:

None

Programming Notes:

The term “unsigned” in the instruction name is a misnomer; this operation is 32-bit

modulo arithmetic that does not trap on overflow. It is appropriate for arithmetic which is

not signed, such as address arithmetic, or integer arithmetic environments that ignore

overflow, such as C language arithmetic.
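
The modulo behaviour described in the note can be sketched in C. The name subu_model is hypothetical; the sketch assumes both inputs already satisfy the sign-extended word restriction and simply wraps the 32-bit difference before sign-extending it.

```c
#include <stdint.h>

/* Hypothetical model of SUBU: 32-bit modulo subtraction, never traps.
 * The low words are subtracted modulo 2^32 and the result is sign-extended. */
static uint64_t subu_model(uint64_t rs, uint64_t rt) {
    uint32_t temp = (uint32_t)((uint32_t)rs - (uint32_t)rt);
    return (temp & 0x80000000u) ? (0xFFFFFFFF00000000ULL | temp)
                                : (uint64_t)temp;
}
```

Subtracting 1 from the most negative word wraps to the most positive word instead of trapping, which is the case where SUB would raise Integer Overflow.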

SW

Store Word

Bits:   31..26    25..21    20..16    15..0
Field:  SW        base      rt        offset
Value:  101011
Width:  6         5         5         16

MIPS I

Format: SW rt, offset (base)

Purpose: To store a word to memory.

Description: memory [base + offset] ← rt

The least-significant 32-bit word of register rt is stored in memory at the location specified
by the aligned effective address. The 16-bit signed offset is added to the contents of GPR
base to form the effective address.

Restrictions:

The effective address must be naturally aligned. If either of the two least-significant bits
of the address is non-zero, an Address Error exception occurs.

Operation: (128-bit bus)

vAddr ← sign_extend (offset) + GPR[base]31..0
if (vAddr1..0) ≠ 0^2 then SignalException (AddressError) endif
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor (BigEndian^2 || 0^2))
byte ← vAddr3..0 xor (BigEndian^2 || 0^2)
dataquad ← GPR[rt](127-8*byte)..0 || 0^(8*byte)
StoreMemory (uncached, WORD, dataquad, pAddr, vAddr, DATA)

Exceptions:

TLB Refill

TLB Invalid

TLB Modified

Address Error

Programming Notes:

None
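
The address formation and alignment check for SW can be sketched as below. The type sw_addr_t and the function sw_effective_address are illustrative names, not part of any C790 software interface; address translation and the store itself are left out.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint32_t vaddr;          /* effective virtual address (32 bits)        */
    bool     address_error;  /* true when vaddr is not word aligned        */
} sw_addr_t;

/* Sign-extend the 16-bit offset, add it to the low 32 bits of the base
 * register, and flag an Address Error when either low address bit is set. */
static sw_addr_t sw_effective_address(uint32_t base, uint16_t offset) {
    sw_addr_t r;
    int32_t soff = (offset & 0x8000u) ? (int32_t)offset - 0x10000
                                      : (int32_t)offset;
    r.vaddr = base + (uint32_t)soff;             /* wraps modulo 2^32 */
    r.address_error = (r.vaddr & 0x3u) != 0;
    return r;
}
```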

SWL

Store Word Left

Bits:   31..26    25..21    20..16    15..0
Field:  SWL       base      rt        offset
Value:  101010
Width:  6         5         5         16

MIPS I

Format: SWL rt, offset (base)

Purpose: To store the more-significant part of a word to an unaligned memory address.

Description: memory [base + offset] ← rt

Paired SWL and SWR instructions are used to store a word from a register into four

consecutive bytes in memory starting at an arbitrary byte address. SWL stores the left

(most-significant) bytes and SWR stores the right (least-significant) bytes.

The SWL instruction adds its sign-extended 16-bit offset to the contents of GPR base to
form an effective address which may specify an arbitrary byte. It alters only the word in
memory which contains that byte. From one to four bytes will be stored, depending on the
starting byte specified.

Conceptually, it starts at the most-significant byte of the register and copies it to the

specified byte in memory; then it copies bytes from register to memory until it reaches the

low-order byte of the word in memory.

No address exceptions due to alignment are possible.

[Figure: SWL $24,6 ($0) with little-endian memory, showing the words at address 0 and
address 4 and register $24 before and after the store.]

[Figure: SWL $24,1 ($0) with big-endian memory, showing the words at address 0 and
address 4 and register $24 before and after the store.]

Restrictions:

None

Operation:

vAddr ← sign_extend (offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor BigEndian^4)
if (BigEndian = 0) then
    pAddr ← pAddr(PSIZE-1)..2 || 0^2
endif
byte ← vAddr1..0 xor BigEndian^2
if (vAddr3..2 xor BigEndian^2) = 00₂ then
    dataquad ← 0^96 || 0^(24-8*byte) || GPR[rt]31..(24-8*byte)
elseif (vAddr3..2 xor BigEndian^2) = 01₂ then
    dataquad ← 0^64 || 0^(24-8*byte) || GPR[rt]31..(24-8*byte) || 0^32
elseif (vAddr3..2 xor BigEndian^2) = 10₂ then
    dataquad ← 0^32 || 0^(24-8*byte) || GPR[rt]31..(24-8*byte) || 0^64
elseif (vAddr3..2 xor BigEndian^2) = 11₂ then
    dataquad ← 0^(24-8*byte) || GPR[rt]31..(24-8*byte) || 0^96
endif
StoreMemory (uncached, byte, dataquad, pAddr, vAddr, DATA)

Given a doubleword in a register and a doubleword in memory, the operation of SWL is as

follows:

SWL

[Figure: register contents A B C D E F G H (bit 63 at left, bit 0 at right); memory
quadword i j k l m n o p q r s t u v w x (MSB at left, LSB at right), little-endian
byte numbers 15 down to 0.]

Little-endian byte ordering (BigEndianCPU = 0)

Destination memory contents after instruction
(uppercase = stored register bytes, lowercase = unchanged memory bytes)

                                                               Offset
vAddr3..0   (127 ....... 64)    (63 ........ 0)      Type    LEM   BEM
    0       i j k l m n o p     q r s t u v w E       0        0    15
    1       i j k l m n o p     q r s t u v E F       1        0    14
    2       i j k l m n o p     q r s t u E F G       2        0    13
    3       i j k l m n o p     q r s t E F G H       3        0    12
    4       i j k l m n o p     q r s E u v w x       0        4    11
    5       i j k l m n o p     q r E F u v w x       1        4    10
    6       i j k l m n o p     q E F G u v w x       2        4     9
    7       i j k l m n o p     E F G H u v w x       3        4     8
    8       i j k l m n o E     q r s t u v w x       0        8     7
    9       i j k l m n E F     q r s t u v w x       1        8     6
   10       i j k l m E F G     q r s t u v w x       2        8     5
   11       i j k l E F G H     q r s t u v w x       3        8     4
   12       i j k E m n o p     q r s t u v w x       0       12     3
   13       i j E F m n o p     q r s t u v w x       1       12     2
   14       i E F G m n o p     q r s t u v w x       2       12     1
   15       E F G H m n o p     q r s t u v w x       3       12     0

SWL

[Figure: register contents A B C D E F G H (bit 63 at left, bit 0 at right); memory
quadword i j k l m n o p q r s t u v w x (MSB at left, LSB at right), with byte
numbering shown for big-endian and little-endian ordering.]

Big-endian byte ordering (BigEndianCPU = 1)

Destination memory contents after instruction
(uppercase = stored register bytes, lowercase = unchanged memory bytes)

                                                               Offset
vAddr3..0   (127 ....... 64)    (63 ........ 0)      Type    LEM   BEM
    0       E F G H m n o p     q r s t u v w x       3       12     0
    1       i E F G m n o p     q r s t u v w x       2       12     1
    2       i j E F m n o p     q r s t u v w x       1       12     2
    3       i j k E m n o p     q r s t u v w x       0       12     3
    4       i j k l E F G H     q r s t u v w x       3        8     4
    5       i j k l m E F G     q r s t u v w x       2        8     5
    6       i j k l m n E F     q r s t u v w x       1        8     6
    7       i j k l m n o E     q r s t u v w x       0        8     7
    8       i j k l m n o p     E F G H u v w x       3        4     8
    9       i j k l m n o p     q E F G u v w x       2        4     9
   10       i j k l m n o p     q r E F u v w x       1        4    10
   11       i j k l m n o p     q r s E u v w x       0        4    11
   12       i j k l m n o p     q r s t E F G H       3        0    12
   13       i j k l m n o p     q r s t u E F G       2        0    13
   14       i j k l m n o p     q r s t u v E F       1        0    14
   15       i j k l m n o p     q r s t u v w E       0        0    15

LEM     Little-endian memory (BigEndianMem = 0)
BEM     Big-endian memory (BigEndianMem = 1)
Type    AccessLength sent to memory
Offset  pAddr3..0 sent to memory

Exceptions:

TLB Refill

TLB Invalid

TLB Modified

Address Error

Programming Notes:

None
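
The byte selection performed by SWL can be modelled in a few lines of C. This is a sketch under stated assumptions: swl_model is a hypothetical name, mem is a plain byte array indexed by byte address, and address translation, bus width and exceptions are ignored. In little-endian mode the most-significant register byte goes to the addressed byte and the following bytes go to descending addresses down to the start of the word; in big-endian mode they go to ascending addresses up to the end of the word.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical byte-level model of SWL.
 * mem        : memory image indexed by byte address
 * vaddr      : effective address (any alignment)
 * rt         : low 32 bits of the source register
 * big_endian : true for BigEndianCPU = 1 */
static void swl_model(uint8_t *mem, uint32_t vaddr, uint32_t rt, bool big_endian) {
    uint32_t byte  = vaddr & 3u;                          /* position in the word */
    uint32_t count = big_endian ? 4u - byte : byte + 1u;  /* 1 to 4 bytes stored  */
    for (uint32_t i = 0; i < count; i++) {
        uint8_t  b    = (uint8_t)(rt >> (24u - 8u * i));  /* MSB first            */
        uint32_t addr = big_endian ? vaddr + i : vaddr - i;
        mem[addr] = b;
    }
}
```

Filling a 16-byte array with the letters i to x and storing the word EFGH with this function gives the same byte patterns as the corresponding rows of the SWL tables.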

SWR

Store Word Right

Bits:   31..26    25..21    20..16    15..0
Field:  SWR       base      rt        offset
Value:  101110
Width:  6         5         5         16

MIPS I

Format: SWR rt, offset (base)

Purpose: To store the less-significant part of a word to an unaligned memory address.

Description: memory [base + offset] ← rt

Paired SWL and SWR instructions are used to store a word from a register into four

consecutive bytes in memory starting at an arbitrary byte address. SWL stores the left

(most-significant) bytes and SWR stores the right (least-significant) bytes.

The SWR instruction adds its sign-extended 16-bit offset to the contents of GPR base to
form an effective address which may specify an arbitrary byte. It alters only the word in
memory which contains that byte. From one to four bytes will be stored, depending on the
starting byte specified.

Conceptually, it starts at the least-significant (rightmost) byte of the register and copies it

to the specified byte in memory; then copies bytes from register to memory until it reaches

the high-order byte of the word in memory.

No address exceptions due to alignment are possible.

[Figure: SWR $24,3 ($0) with little-endian memory, showing the words at address 0 and
address 4 and register $24 before and after the store.]

[Figure: SWR $24,4 ($0) with big-endian memory, showing the words at address 0 and
address 4 and register $24 before and after the store.]

Restrictions:

None

Operation:

vAddr ← sign_extend (offset) + GPR[base]31..0
(pAddr, uncached) ← AddressTranslation (vAddr, DATA, STORE)
pAddr ← pAddr(PSIZE-1)..4 || (pAddr3..0 xor BigEndian^4)
if (BigEndian = 1) then
    pAddr ← pAddr(PSIZE-1)..2 || 0^2
endif
byte ← vAddr1..0 xor BigEndian^2
if (vAddr3..2 xor BigEndian^2) = 00₂ then
    dataquad ← 0^96 || GPR[rt](31-8*byte)..0 || 0^(8*byte)
elseif (vAddr3..2 xor BigEndian^2) = 01₂ then
    dataquad ← 0^64 || GPR[rt](31-8*byte)..0 || 0^(8*byte) || 0^32
elseif (vAddr3..2 xor BigEndian^2) = 10₂ then
    dataquad ← 0^32 || GPR[rt](31-8*byte)..0 || 0^(8*byte) || 0^64
elseif (vAddr3..2 xor BigEndian^2) = 11₂ then
    dataquad ← GPR[rt](31-8*byte)..0 || 0^(8*byte) || 0^96
endif
StoreMemory (uncached, WORD-byte, dataquad, pAddr, vAddr, DATA)

Given a doubleword in a register and a doubleword in memory, the operation of SWR is as

follows:

SWR

[Figure: register contents A B C D E F G H (bit 63 at left, bit 0 at right); memory
quadword i j k l m n o p q r s t u v w x (MSB at left, LSB at right), little-endian
byte numbers 15 down to 0.]

Little-endian byte ordering (BigEndianCPU = 0)

Destination memory contents after instruction
(uppercase = stored register bytes, lowercase = unchanged memory bytes)

                                                               Offset
vAddr3..0   (127 ....... 64)    (63 ........ 0)      Type    LEM   BEM
    0       i j k l m n o p     q r s t E F G H       3        0    12
    1       i j k l m n o p     q r s t F G H x       2        1    12
    2       i j k l m n o p     q r s t G H w x       1        2    12
    3       i j k l m n o p     q r s t H v w x       0        3    12
    4       i j k l m n o p     E F G H u v w x       3        4     8
    5       i j k l m n o p     F G H t u v w x       2        5     8
    6       i j k l m n o p     G H s t u v w x       1        6     8
    7       i j k l m n o p     H r s t u v w x       0        7     8
    8       i j k l E F G H     q r s t u v w x       3        8     4
    9       i j k l F G H p     q r s t u v w x       2        9     4
   10       i j k l G H o p     q r s t u v w x       1       10     4
   11       i j k l H n o p     q r s t u v w x       0       11     4
   12       E F G H m n o p     q r s t u v w x       3       12     0
   13       F G H l m n o p     q r s t u v w x       2       13     0
   14       G H k l m n o p     q r s t u v w x       1       14     0
   15       H j k l m n o p     q r s t u v w x       0       15     0

SWR

[Figure: register contents A B C D E F G H (bit 63 at left, bit 0 at right); memory
quadword i j k l m n o p q r s t u v w x (MSB at left, LSB at right), with byte
numbering shown for big-endian and little-endian ordering.]

Big-endian byte ordering (BigEndianCPU = 1)

Destination memory contents after instruction
(uppercase = stored register bytes, lowercase = unchanged memory bytes)

                                                               Offset
vAddr3..0   (127 ....... 64)    (63 ........ 0)      Type    LEM   BEM
    0       H j k l m n o p     q r s t u v w x       0       15     0
    1       G H k l m n o p     q r s t u v w x       1       14     0
    2       F G H l m n o p     q r s t u v w x       2       13     0
    3       E F G H m n o p     q r s t u v w x       3       12     0
    4       i j k l H n o p     q r s t u v w x       0       11     4
    5       i j k l G H o p     q r s t u v w x       1       10     4
    6       i j k l F G H p     q r s t u v w x       2        9     4
    7       i j k l E F G H     q r s t u v w x       3        8     4
    8       i j k l m n o p     H r s t u v w x       0        7     8
    9       i j k l m n o p     G H s t u v w x       1        6     8
   10       i j k l m n o p     F G H t u v w x       2        5     8
   11       i j k l m n o p     E F G H u v w x       3        4     8
   12       i j k l m n o p     q r s t H v w x       0        3    12
   13       i j k l m n o p     q r s t G H w x       1        2    12
   14       i j k l m n o p     q r s t F G H x       2        1    12
   15       i j k l m n o p     q r s t E F G H       3        0    12

LEM     Little-endian memory (BigEndianMem = 0)
BEM     Big-endian memory (BigEndianMem = 1)
Type    AccessLength sent to memory
Offset  pAddr3..0 sent to memory

Exceptions:

TLB Refill

TLB Invalid

TLB Modified

Address Error

Programming Notes:

None
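
The complementary byte selection of SWR can be sketched the same way. As before this is only an illustrative model: swr_model is a hypothetical name, mem is a byte array indexed by byte address, and translation and exceptions are not modelled. The least-significant register byte goes to the addressed byte; in little-endian mode the following bytes go to ascending addresses up to the end of the word, and in big-endian mode to descending addresses down to the start of the word.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical byte-level model of SWR.
 * mem        : memory image indexed by byte address
 * vaddr      : effective address (any alignment)
 * rt         : low 32 bits of the source register
 * big_endian : true for BigEndianCPU = 1 */
static void swr_model(uint8_t *mem, uint32_t vaddr, uint32_t rt, bool big_endian) {
    uint32_t byte  = vaddr & 3u;                          /* position in the word */
    uint32_t count = big_endian ? byte + 1u : 4u - byte;  /* 1 to 4 bytes stored  */
    for (uint32_t i = 0; i < count; i++) {
        uint8_t  b    = (uint8_t)(rt >> (8u * i));        /* LSB first            */
        uint32_t addr = big_endian ? vaddr - i : vaddr + i;
        mem[addr] = b;
    }
}
```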

SYNC.stype

Synchronize Shared Memory

Bits:   31..26    25..11                10..6     5..0
Field:  SPECIAL   0                     stype     SYNC
Value:  000000    000 0000 0000 0000              001111
Width:  6         15                    5         6

MIPS II

Format: SYNC (stype = 0xxxx)

SYNC.L (stype = 0xxxx)

SYNC.P (stype = 1xxxx)

Purpose: To perform either a memory barrier operation or a pipeline barrier operation.

Description:

This instruction interlocks the pipeline until either all pending loads and stores are
completed (memory barrier) or all earlier issued instructions are completed (pipeline
barrier).

For the SYNC and SYNC.L instructions (memory barrier), all pending loads and stores are
retired. Loads are retired when the destination register is written. Stores are retired
when the stored data (in store buffers or write buffers) is either stored in the data
cache, or sent on the processor bus and SYSDACK* has been asserted. Any uncached
accelerated data gathering operation is terminated, and the uncached accelerated buffer is
invalidated. All bus read processes due to load/store/pref/cache instructions are completed.
All pending bus write processes in the write back buffer are completed.

For the SYNC.P instruction (pipeline barrier), all instructions prior to the barrier are
completed before the instructions following the barrier operation are fetched. Note that
the barrier operation does not wait for any instruction which was issued prior to the
barrier operation but not retired (e.g., multiply, divide, multicycle COP1 operations or a
pending load which were issued prior to the barrier operation).

Operation:

SyncOperation (stype)

Exceptions:

None

Programming Notes:

The SYNC instruction (SYNC.P or SYNC.L) is not allowed in the branch delay slot of

instructions which have branch delay slots.

SYSCALL

System Call

Bits:   31..26    25..6     5..0
Field:  SPECIAL   code      SYSCALL
Value:  000000              001100
Width:  6         20        6

MIPS I

Format: SYSCALL

Purpose: To cause a System Call exception.

Description:

A system call exception occurs, immediately and unconditionally transferring control to

the exception handler.

The code field is available for use as software parameters, but is retrieved by the exception

handler only by loading the contents of the memory word containing the instruction.

Restrictions:

None

Operation:

SignalException (SystemCall)

Exceptions:

System Call

Programming Notes:

None
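
Since the code field is only available by reading the instruction word back from memory, an exception handler typically extracts it with a shift and mask. The sketch below assumes the handler has already loaded the 32-bit instruction word; the names is_syscall and syscall_code are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* True when the word is a SYSCALL: OpCode = SPECIAL (000000), function = 001100. */
static bool is_syscall(uint32_t insn) {
    return (insn >> 26) == 0x00u && (insn & 0x3Fu) == 0x0Cu;
}

/* Extract the 20-bit code field held in bits 25..6 of the instruction word. */
static uint32_t syscall_code(uint32_t insn) {
    return (insn >> 6) & 0xFFFFFu;
}
```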

TEQ

Trap if Equal

Bits:   31..26    25..21    20..16    15..6     5..0
Field:  SPECIAL   rs        rt        code      TEQ
Value:  000000                                  110100
Width:  6         5         5         10        6

MIPS II

Format: TEQ rs, rt

Purpose: To compare GPRs and do a conditional Trap.

Description: if (rs = rt) then Trap

Compare the contents of GPR rs and GPR rt as signed integers; if GPR rs is equal to GPR rt
then take a Trap exception.

The contents of the code field are ignored by hardware and may be used to encode
information for system software. To retrieve the information, system software must load
the instruction word from memory.

Restrictions:

None

Operation:

if GPR[rs]63..0 = GPR[rt] 63..0 then

SignalException (Trap)

endif

Exceptions:

Trap

Programming Notes:

None

TEQI

Trap if Equal Immediate

Bits:   31..26    25..21    20..16    15..0
Field:  REGIMM    rs        TEQI      immediate
Value:  000001              01100
Width:  6         5         5         16

MIPS II

Format: TEQI rs, immediate

Purpose: To compare a GPR to a constant and do a conditional Trap.

Description: if (rs = immediate) then Trap

Compare the contents of GPR rs and the 16-bit signed immediate as signed integers; if
GPR rs is equal to immediate then take a Trap exception.

Restrictions:

None

Operation:

if GPR [rs] 63..0 = sign_extend (immediate) then

SignalException (Trap)

endif

Exceptions:

Trap

Programming Notes:

None

TGE

Trap if Greater or Equal

Bits:   31..26    25..21    20..16    15..6     5..0
Field:  SPECIAL   rs        rt        code      TGE
Value:  000000                                  110000
Width:  6         5         5         10        6

MIPS II

Format: TGE rs, rt

Purpose: To compare GPRs and do a conditional Trap.

Description: if (rs ≥ rt) then Trap

Compare the contents of GPR rs and GPR rt as signed integers; if GPR rs is greater than
or equal to GPR rt then take a Trap exception.

The contents of the code field are ignored by hardware and may be used to encode
information for system software. To retrieve the information, system software must load
the instruction word from memory.

Restrictions:

None

Operation:

if GPR [rs] 63..0 ≥ GPR [rt] 63..0 then

SignalException (Trap)

endif

Exceptions:

Trap

Programming Notes:

None

TGEI

Trap if Greater or Equal Immediate

Bits:   31..26    25..21    20..16    15..0
Field:  REGIMM    rs        TGEI      immediate
Value:  000001              01000
Width:  6         5         5         16

MIPS II

Format: TGEI rs, immediate

Purpose: To compare a GPR to a constant and do a conditional Trap.

Description: if (rs ≥ immediate) then Trap

Compare the contents of GPR rs and the 16-bit signed immediate as signed integers; if
GPR rs is greater than or equal to immediate then take a Trap exception.

Restrictions:

None

Operation:

if GPR [rs] 63..0 ≥ sign_extend (immediate) then

SignalException (Trap)

endif

Exceptions:

Trap

Programming Notes:

None

TGEIU

Trap if Greater or Equal Immediate Unsigned

Bits:   31..26    25..21    20..16    15..0
Field:  REGIMM    rs        TGEIU     immediate
Value:  000001              01001
Width:  6         5         5         16

MIPS II

Format: TGEIU rs, immediate

Purpose: To compare a GPR to a constant and do a conditional Trap.

Description: if (rs ≥ immediate) then Trap

Compare the contents of GPR rs and the 16-bit sign-extended immediate as unsigned
integers; if GPR rs is greater than or equal to immediate then take a Trap exception.

Because the 16-bit immediate is sign-extended before comparison, the instruction is able
to represent the smallest or largest unsigned numbers. The representable values are at
the minimum [0, 32767] or maximum [max_unsigned-32767, max_unsigned] end of the
unsigned range.

Restrictions:

None

Operation:

if (0 || GPR[rs] 63..0) ≥ (0 || sign_extend (immediate)) then

SignalException (Trap)

endif

Exceptions:

Trap

Programming Notes:

None
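
The effect of sign-extending the immediate before an unsigned comparison can be seen in a small C sketch. The helper names sign_extend16 and tgeiu_traps are hypothetical, and a true return stands in for taking the Trap exception.

```c
#include <stdint.h>
#include <stdbool.h>

/* Sign-extend a 16-bit immediate to 64 bits. */
static uint64_t sign_extend16(uint16_t imm) {
    return (imm & 0x8000u) ? (0xFFFFFFFFFFFF0000ULL | imm) : (uint64_t)imm;
}

/* Hypothetical model of the TGEIU condition: unsigned compare of the
 * register with the sign-extended immediate. */
static bool tgeiu_traps(uint64_t rs, uint16_t imm) {
    return rs >= sign_extend16(imm);
}
```

An immediate of 0x8000 therefore compares as max_unsigned-32767 rather than as 32768, which is why the reachable constants sit at the two ends of the unsigned range.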

TGEU

Trap if Greater or Equal Unsigned

Bits:   31..26    25..21    20..16    15..6     5..0
Field:  SPECIAL   rs        rt        code      TGEU
Value:  000000                                  110001
Width:  6         5         5         10        6

MIPS II

Format: TGEU rs, rt

Purpose: To compare GPRs and do a conditional Trap.

Description: if (rs ≥ rt) then Trap

Compare the contents of GPR rs and GPR rt as unsigned integers; if GPR rs is greater
than or equal to GPR rt then take a Trap exception.

The contents of the code field are ignored by hardware and may be used to encode
information for system software. To retrieve the information, system software must load
the instruction word from memory.

Restrictions:

None

Operation:

if (0 || GPR[rs] 63..0) ≥ (0 || GPR[rt] 63..0) then

SignalException (Trap)

endif

Exceptions:

Trap

Programming Notes:

None

TLT

Trap if Less Than

Bits:   31..26    25..21    20..16    15..6     5..0
Field:  SPECIAL   rs        rt        code      TLT
Value:  000000                                  110010
Width:  6         5         5         10        6

MIPS II

Format: TLT rs, rt

Purpose: To compare GPRs and do a conditional Trap.

Description: if (rs < rt) then Trap

Compare the contents of GPR rs and GPR rt as signed integers; if GPR rs is less than
GPR rt then take a Trap exception.

The contents of the code field are ignored by hardware and may be used to encode
information for system software. To retrieve the information, system software must load
the instruction word from memory.

Restrictions:

None

Operation:

if GPR [rs] 63..0 < GPR [rt] 63..0 then

SignalException (Trap)

endif

Exceptions:

Trap

Programming Notes:

None

TLTI

Trap if Less Than Immediate

Bits:   31..26    25..21    20..16    15..0
Field:  REGIMM    rs        TLTI      immediate
Value:  000001              01010
Width:  6         5         5         16

MIPS II

Format: TLTI rs, immediate

Purpose: To compare a GPR to a constant and do a conditional Trap.

Description: if (rs < immediate) then Trap

Compare the contents of GPR rs and the 16-bit signed immediate as signed integers; if
GPR rs is less than immediate then take a Trap exception.

Restrictions:

None

Operation:

if GPR[rs] 63..0 < sign_extend (immediate) then

SignalException (Trap)

endif

Exceptions:

Trap

Programming Notes:

None

TLTIU

Trap if Less Than Immediate Unsigned

Bits:   31..26    25..21    20..16    15..0
Field:  REGIMM    rs        TLTIU     immediate
Value:  000001              01011
Width:  6         5         5         16

MIPS II

Format: TLTIU rs, immediate

Purpose: To compare a GPR to a constant and do a conditional Trap.

Description: if (rs < immediate) then Trap

Compare the contents of GPR rs and the 16-bit sign-extended immediate as unsigned
integers; if GPR rs is less than immediate then take a Trap exception.

Because the 16-bit immediate is sign-extended before comparison, the instruction is able
to represent the smallest or largest unsigned numbers. The representable values are at
the minimum [0, 32767] or maximum [max_unsigned-32767, max_unsigned] end of the
unsigned range.

Restrictions:

None

Operation:

if (0 || GPR[rs] 63..0) < (0 || sign_extend (immediate)) then

SignalException (Trap)

endif

Exceptions:

Trap

Programming Notes:

None

TLTU

Trap if Less Than Unsigned

Bits:   31..26    25..21    20..16    15..6     5..0
Field:  SPECIAL   rs        rt        code      TLTU
Value:  000000                                  110011
Width:  6         5         5         10        6

MIPS II

Format: TLTU rs, rt

Purpose: To compare GPRs and do a conditional Trap.

Description: if (rs < rt) then Trap

Compare the contents of GPR rs and GPR rt as unsigned integers; if GPR rs is less than
GPR rt then take a Trap exception.

The contents of the code field are ignored by hardware and may be used to encode
information for system software. To retrieve the information, system software must load
the instruction word from memory.

Restrictions:

None

Operation:

if (0 || GPR[rs] 63..0) < (0 || GPR[rt] 63..0) then

SignalException (Trap)

endif

Exceptions:

Trap

Programming Notes:

None
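
The difference between the signed condition of TLT and the unsigned condition of TLTU shows up only when the operands differ in bit 63. The following C sketch uses hypothetical helper names and returns true where the instruction would take the Trap exception.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of the TLT condition: signed 64-bit compare. */
static bool tlt_traps(int64_t rs, int64_t rt) {
    return rs < rt;
}

/* Hypothetical model of the TLTU condition: the same bit patterns
 * compared as unsigned 64-bit values. */
static bool tltu_traps(int64_t rs, int64_t rt) {
    return (uint64_t)rs < (uint64_t)rt;
}
```

With rs holding all ones and rt holding zero, the signed comparison sees -1 < 0 and traps, while the unsigned comparison sees the largest unsigned value and does not.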

TNE

Trap if Not Equal

Bits:   31..26    25..21    20..16    15..6     5..0
Field:  SPECIAL   rs        rt        code      TNE
Value:  000000                                  110110
Width:  6         5         5         10        6

MIPS II

Format: TNE rs, rt

Purpose: To compare GPRs and do a conditional Trap.

Description: if (rs ≠ rt) then Trap

Compare the contents of GPR rs and GPR rt as signed integers; if GPR rs is not equal to
GPR rt then take a Trap exception.

The contents of the code field are ignored by hardware and may be used to encode
information for system software. To retrieve the information, system software must load
the instruction word from memory.

Restrictions:

None

Operation:

if GPR[rs] 63..0 ≠ GPR[rt] 63..0 then

SignalException (Trap)

endif

Exceptions:

Trap

Programming Notes:

None

TNEI

Trap if Not Equal Immediate

Bits:   31..26    25..21    20..16    15..0
Field:  REGIMM    rs        TNEI      immediate
Value:  000001              01110
Width:  6         5         5         16

MIPS II

Format: TNEI rs, immediate

Purpose: To compare a GPR to a constant and do a conditional Trap.

Description: if (rs ≠ immediate) then Trap

Compare the contents of GPR rs and the 16-bit signed immediate as signed integers; if
GPR rs is not equal to immediate then take a Trap exception.

Restrictions:

None

Operation:

if GPR[rs] 63..0 ≠ sign_extend (immediate) then

SignalException (Trap)

endif

Exceptions:

Trap

Programming Notes:

None

XOR

Exclusive OR

Bits:   31..26    25..21    20..16    15..11    10..6     5..0
Field:  SPECIAL   rs        rt        rd        0         XOR
Value:  000000                                  00000     100110
Width:  6         5         5         5         5         6

MIPS I

Format: XOR rd, rs, rt

Purpose: To do a bitwise logical EXCLUSIVE OR.

Description: rd ← rs XOR rt

Combine the contents of GPR rs and GPR rt in a bitwise logical exclusive OR operation
and place the result into GPR rd.

Restrictions:

None

Operation:

GPR[rd] 63..0 ← GPR[rs] 63..0 xor GPR[rt] 63..0

Exceptions:

None

Programming Notes:

None

XORI

Exclusive OR Immediate

Bits:   31..26    25..21    20..16    15..0
Field:  XORI      rs        rt        immediate
Value:  001110
Width:  6         5         5         16

MIPS I

Format: XORI rt, rs, immediate

Purpose: To do a bitwise logical EXCLUSIVE OR with a constant.

Description: rt ← rs XOR immediate

Combine the contents of GPR rs and the 16-bit zero-extended immediate in a bitwise
logical exclusive OR operation and place the result into GPR rt.

Restrictions:

None

Operation:

GPR[rt] 63..0 ← GPR[rs] 63..0 xor zero_extend (immediate)

Exceptions:

None

Programming Notes:

None
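
Because the immediate is zero-extended rather than sign-extended, XORI can only change bits 15..0 of the register. A minimal C sketch, with the hypothetical name xori_model:

```c
#include <stdint.h>

/* Hypothetical model of XORI: the 16-bit immediate is zero-extended,
 * so bits 63..16 of rs pass through unchanged. */
static uint64_t xori_model(uint64_t rs, uint16_t imm) {
    return rs ^ (uint64_t)imm;
}
```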


A.5 CPU Instruction Encoding

The following table shows the OpCode encoding of CPU instructions for the MIPS IV
architecture. This architecture level includes all MIPS I, MIPS II, MIPS III and some
MIPS IV instructions. Even though the OpCodes for MTSAB, MTSAH, MFSA, MTSA, LQ,
and SQ are shown in this OpCode table, these instructions are described in Appendix B
since they are C790-specific instructions.

Coprocessor 0 (COP0 - System Control Processor), Coprocessor 1 (COP1 - Floating-point
Processor) and C790-specific instructions are described in separate sections.

Instructions encoded by OpCode field (instruction bits 31..26)

Rows: OpCode bits 31..29. Columns: OpCode bits 28..26.

bits     0           1           2           3           4           5           6           7
31..29   000         001         010         011         100         101         110         111
0 000    SPECIAL δ   REGIMM δ    J           JAL         BEQ         BNE         BLEZ        BGTZ
1 001    ADDI        ADDIU       SLTI        SLTIU       ANDI        ORI         XORI        LUI
2 010    COP0 α,λ    COP1 α,π    ∗           ∗           BEQL        BNEL        BLEZL       BGTZL
3 011    DADDI       DADDIU      LDL         LDR         MMI δ,µ     ∗           LQ µ        SQ µ
4 100    LB          LH          LWL         LW          LBU         LHU         LWR         LWU
5 101    SB          SH          SWL         SW          SDL         SDR         SWR         CACHE
6 110    η           LWC1        η           PREF        η           LDC1        η           LD
7 111    η           SWC1        η           ∗           η           SDC1        η           SD

Instructions encoded by function field (instruction bits 5..0) when OpCode field = SPECIAL

Rows: function bits 5..3. Columns: function bits 2..0.

bits     0           1           2           3           4           5           6           7
5..3     000         001         010         011         100         101         110         111
0 000    SLL         ∗           SRL         SRA         SLLV        ∗           SRLV        SRAV
1 001    JR          JALR        MOVZ        MOVN        SYSCALL     BREAK       ∗           SYNC
2 010    MFHI        MTHI        MFLO        MTLO        DSLLV       ∗           DSRLV       DSRAV
3 011    MULT        MULTU       DIV         DIVU        η           η           η           η
4 100    ADD         ADDU        SUB         SUBU        AND         OR          XOR         NOR
5 101    MFSA µ      MTSA µ      SLT         SLTU        DADD        DADDU       DSUB        DSUBU
6 110    TGE         TGEU        TLT         TLTU        TEQ         ∗           TNE         ∗
7 111    DSLL        ∗           DSRL        DSRA        DSLL32      ∗           DSRL32      DSRA32

Instructions encoded by rt field (instruction bits 20..16) when OpCode field = REGIMM

Rows: rt bits 20..19. Columns: rt bits 18..16.

bits     0           1           2           3           4           5           6           7
20..19   000         001         010         011         100         101         110         111
0 00     BLTZ        BGEZ        BLTZL       BGEZL       ∗           ∗           ∗           ∗
1 01     TGEI        TGEIU       TLTI        TLTIU       TEQI        ∗           TNEI        ∗
2 10     BLTZAL      BGEZAL      BLTZALL     BGEZALL     ∗           ∗           ∗           ∗
3 11     MTSAB µ     MTSAH µ     ∗           ∗           ∗           ∗           ∗           ∗

∗  This OpCode is reserved for future use. An attempt to execute it causes a
   Reserved Instruction exception.

η  This OpCode is reserved for one of the following instructions which are
   currently not supported: DMULT, DMULTU, DDIV, DDIVU, LL, LLD, SC,
   SCD, LWC2, SWC2. An attempt to execute it causes a Reserved Instruction
   exception.

δ  This OpCode indicates an instruction class. The instruction word must be
   further decoded by examining additional tables that show the values for
   another instruction field.

µ  This OpCode indicates C790-specific instructions. It is included in the table
   because it uses a primary OpCode in the instruction encoding map.

α  This OpCode is a coprocessor operation, not a CPU operation. If the
   processor state does not allow access to the specified coprocessor, the
   instruction causes a Coprocessor Unusable exception. It is included in the
   table because it uses a primary OpCode in the instruction encoding map.

λ  This OpCode indicates the class of Coprocessor 0 (System Control Processor)
   instructions. If the processor state does not allow access to coprocessor 0,
   the instruction causes a Coprocessor Unusable exception. Further encoding
   information for this instruction class is in the COP0 Instruction Encoding
   tables.

π  This OpCode indicates the class of Coprocessor 1 (Floating-Point Processor)
   instructions. If the processor state does not allow access to coprocessor 1,
   the instruction causes a Coprocessor Unusable exception. Further encoding
   information for this instruction class is in the COP1 Instruction Encoding
   tables.
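
The two-level decoding described by the δ note can be sketched in C: the primary OpCode is examined first, and for the SPECIAL and REGIMM classes a second field selects the instruction. The sketch is illustrative only; the type insn_fields_t and the functions split_fields and mnemonic are hypothetical names, and only a handful of encodings from the tables above are included.

```c
#include <stdint.h>

typedef struct {
    unsigned opcode;    /* bits 31..26 */
    unsigned rs;        /* bits 25..21 */
    unsigned rt;        /* bits 20..16 */
    unsigned rd;        /* bits 15..11 */
    unsigned function;  /* bits 5..0   */
} insn_fields_t;

/* Split a 32-bit instruction word into the fields used by the tables. */
static insn_fields_t split_fields(uint32_t insn) {
    insn_fields_t f;
    f.opcode   = (insn >> 26) & 0x3Fu;
    f.rs       = (insn >> 21) & 0x1Fu;
    f.rt       = (insn >> 16) & 0x1Fu;
    f.rd       = (insn >> 11) & 0x1Fu;
    f.function = insn & 0x3Fu;
    return f;
}

/* Return the mnemonic for a small subset of encodings, or "?" otherwise. */
static const char *mnemonic(uint32_t insn) {
    insn_fields_t f = split_fields(insn);
    if (f.opcode == 0x00u) {              /* SPECIAL: decode the function field */
        switch (f.function) {
        case 0x0Cu: return "SYSCALL";
        case 0x22u: return "SUB";
        case 0x23u: return "SUBU";
        case 0x26u: return "XOR";
        case 0x34u: return "TEQ";
        default:    return "?";
        }
    }
    if (f.opcode == 0x01u) {              /* REGIMM: decode the rt field */
        switch (f.rt) {
        case 0x08u: return "TGEI";
        case 0x0Cu: return "TEQI";
        case 0x0Eu: return "TNEI";
        default:    return "?";
        }
    }
    if (f.opcode == 0x0Eu) return "XORI";
    if (f.opcode == 0x2Bu) return "SW";
    return "?";
}
```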


B. C790-Specific Instruction Set Details

This appendix provides a detailed description of the operation of each C790-specific
instruction. The C790’s instruction set is extended from the original MIPS ISA in order to
support embedded applications. There are three classes of C790-specific instructions:

• Three-operand Multiply and Multiply-Add instructions

• Multiply and Multiply-Add instructions for pipeline 1

• Multimedia instructions


B.1 Conventions Used in This Chapter

The HI and LO registers are 128 bits wide. Some instructions operate on either the lower
or the upper doublewords of these registers, and there are also instructions which operate
on the complete registers.

The following terminology is used for these registers.

• Strictly speaking, a reference to the least-significant doubleword of the HI and LO
registers should use the names HI0 and LO0. However, to be consistent with
existing MIPS terminology, these registers are just called HI and LO.

• Reference to the upper doublewords of the HI and LO registers is made by using
the names HI1 and LO1.

• Occasionally, based on context, the complete 128-bit registers are referred to as
HI and LO.

• Any portion of these registers can use the names HI and LO with the appropriate
bit width specifications. Thus HI1 can be referred to as HI 127..64 and LO1 can be
referred to as LO 127..64, etc.
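
The naming convention can be pictured with a small C structure. This is only an illustration of the terminology; the type reg128_t and its accessors are hypothetical and do not correspond to any C790 programming interface.

```c
#include <stdint.h>

/* A 128-bit register viewed as two doublewords.
 * dw[0] holds bits 63..0   (HI0 / LO0, usually just called HI / LO).
 * dw[1] holds bits 127..64 (HI1 / LO1). */
typedef struct {
    uint64_t dw[2];
} reg128_t;

/* Lower doubleword, bits 63..0. */
static uint64_t reg128_low(const reg128_t *r) {
    return r->dw[0];
}

/* Upper doubleword, bits 127..64. */
static uint64_t reg128_high(const reg128_t *r) {
    return r->dw[1];
}
```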

B.1.1 Instruction Description Notation and Functions

The Operation sections of the instruction descriptions describe the operation performed by
each instruction using a high-level language notation, or pseudocode. Symbols, functions,
and structures used in the Operation sections are described here.

B.1.2 Pseudocode Language Statement Execution

Each of the high-level language statements in an operation description is executed in

sequential order (as modified by conditional and loop constructs).

B.1.3 Pseudocode Symbols

Special symbols used in the notation are described in Appendix A.

B.2 Definitions for Pseudocode Functions Used in Operation

Descriptions

A variety of functions are used in the pseudocode descriptions to make the pseudocode

more readable and also to abstract implementation-specific behavior. These functions are

defined in Appendix A.


B.3 Summary of C790-Specific Instructions

B.3.1 Multiply and Multiply-Add Instructions

• Three-Operand Multiply and Multiply-Add (4 instructions)

MADD Multiply/Add

MADDU Multiply/Add Unsigned

MULT Multiply (3-operand)

MULTU Multiply Unsigned (3-operand)

• Multiply Instructions for Pipeline 1 (10 instructions)

MULT1 Multiply Pipeline 1

MULTU1 Multiply Unsigned Pipeline 1

DIV1 Divide Pipeline 1

DIVU1 Divide Unsigned Pipeline 1

MADD1 Multiply-Add Pipeline 1

MADDU1 Multiply-Add Unsigned Pipeline 1

MFHI1 Move From HI1 Register

MFLO1 Move From LO1 Register

MTHI1 Move To HI1 Register

MTLO1 Move To LO1 Register

B.3.2 Multimedia Instructions

• Arithmetic (19 instructions)

PADDB Parallel Add Byte

PSUBB Parallel Subtract Byte

PADDH Parallel Add Halfword

PSUBH Parallel Subtract Halfword

PADDW Parallel Add Word

PSUBW Parallel Subtract Word

PADSBH Parallel Add/Subtract Halfword

PADDSB Parallel Add with Signed Saturation Byte

PSUBSB Parallel Subtract with Signed Saturation Byte

PADDSH Parallel Add with Signed Saturation Halfword

PSUBSH Parallel Subtract with Signed Saturation Halfword

PADDSW Parallel Add with Signed Saturation Word

PSUBSW Parallel Subtract with Signed Saturation Word

PADDUB Parallel Add with Unsigned saturation Byte

PSUBUB Parallel Subtract with Unsigned saturation Byte

PADDUH Parallel Add with Unsigned saturation Halfword

PSUBUH Parallel Subtract with Unsigned saturation Halfword

PADDUW Parallel Add with Unsigned saturation Word

PSUBUW Parallel Subtract with Unsigned saturation Word


• Min/Max (4 instructions)

PMAXH Parallel Maximum Halfword

PMINH Parallel Minimum Halfword

PMAXW Parallel Maximum Word

PMINW Parallel Minimum Word

• Absolute (2 instructions)

PABSH Parallel Absolute Halfword

PABSW Parallel Absolute Word

• Logical (4 instructions)

PAND Parallel AND

POR Parallel OR

PXOR Parallel XOR

PNOR Parallel NOR

• Shift (9 instructions)

PSLLH Parallel Shift Left Logical Halfword

PSRLH Parallel Shift Right Logical Halfword

PSRAH Parallel Shift Right Arithmetic Halfword

PSLLW Parallel Shift Left Logical Word

PSRLW Parallel Shift Right Logical Word

PSRAW Parallel Shift Right Arithmetic Word

PSLLVW Parallel Shift Left Logical Variable Word

PSRLVW Parallel Shift Right Logical Variable Word

PSRAVW Parallel Shift Right Arithmetic Variable Word

• Compare (6 instructions)

PCGTB Parallel Compare for Greater Than Byte

PCEQB Parallel Compare for Equal Byte

PCGTH Parallel Compare for Greater Than Halfword

PCEQH Parallel Compare for Equal Halfword

PCGTW Parallel Compare for Greater Than Word

PCEQW Parallel Compare for Equal Word

• LZC (1 instruction)

PLZCW Parallel Leading Zero or One Count Word

• Quadword Load and Store (2 instructions)

LQ Load Quadword

SQ Store Quadword

• Multiply and Divide (19 instructions)

PMULTW Parallel Multiply Word

PMULTUW Parallel Multiply Unsigned Word

PDIVW Parallel Divide Word

PDIVUW Parallel Divide Unsigned Word

PMADDW Parallel Multiply-Add Word

PMADDUW Parallel Multiply-Add Unsigned Word

PMSUBW Parallel Multiply-Subtract Word

PMULTH Parallel Multiply Halfword

PMADDH Parallel Multiply-Add Halfword

PMSUBH Parallel Multiply-Subtract Halfword

PHMADH Parallel Horizontal Multiply-Add Halfword

PHMSBH Parallel Horizontal Multiply-Subtract Halfword

PDIVBW Parallel Divide Broadcast Word

PMFHI Parallel Move From HI Register

PMFLO Parallel Move From LO Register

PMTHI Parallel Move To HI Register

PMTLO Parallel Move To LO Register

PMFHL Parallel Move From HI/LO Register

PMTHL Parallel Move To HI/LO Register

• Pack/Extend (11 instructions)

PPAC5 Parallel Pack to 5 bits

PPACB Parallel Pack to Byte

PPACH Parallel Pack to Halfword

PPACW Parallel Pack to Word

PEXT5 Parallel Extend Upper from 5 bits

PEXTUB Parallel Extend Upper from Byte

PEXTLB Parallel Extend Lower from Byte

PEXTUH Parallel Extend Upper from Halfword

PEXTLH Parallel Extend Lower from Halfword

PEXTUW Parallel Extend Upper from Word

PEXTLW Parallel Extend Lower from Word

• Others (16 instructions)

PCPYH Parallel Copy Halfword

PCPYLD Parallel Copy Lower Doubleword

PCPYUD Parallel Copy Upper Doubleword

PREVH Parallel Reverse Halfword

PINTH Parallel Interleave Halfword

PINTEH Parallel Interleave Even Halfword

PEXEH Parallel Exchange Even Halfword

PEXCH Parallel Exchange Center Halfword

PEXEW Parallel Exchange Even Word

PEXCW Parallel Exchange Center Word

QFSRV Quadword Funnel Shift Right Variable

MFSA Move from Shift Amount Register

MTSA Move to Shift Amount Register

MTSAB Move Byte Count to Shift Amount Register

MTSAH Move Halfword Count to Shift Amount Register

PROT3W Parallel Rotate 3 Words

B.4 Instruction Set Details

In the following sections, details are provided for each of the C790-specific instructions.

Exceptions that may occur due to the execution of each instruction are listed after the

description of each instruction. Descriptions of the immediate cause and manner of

handling exceptions are omitted from the instruction descriptions in this appendix.

DIV1

Divide Word Pipeline 1

Bits    31..26   25..21   20..16   15..6        5..0
Field   MMI      rs       rt       0            DIV1
Value   011100                     0000000000   011010
Width   6        5        5        10           6

C790

Format: DIV1 rs, rt

Purpose: To divide 32-bit signed integers using pipeline 1.

Description: (LO1, HI1) ← rs / rt

The 32-bit value in GPR rs is divided by the 32-bit value in GPR rt, treating both operands as signed values. The 32-bit quotient is placed into special register LO1 (= LO_{127..64}) and the 32-bit remainder is placed into special register HI1 (= HI_{127..64}).

No arithmetic exception occurs under any circumstances.

Restrictions:

If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation will be undefined.

If the divisor in GPR rt is zero, the arithmetic result value will be undefined.

Operation:

if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif

q ← GPR[rs]_{31..0} div GPR[rt]_{31..0}

r ← GPR[rs]_{31..0} mod GPR[rt]_{31..0}

LO_{127..64} ← (q_{31})^{32} || q_{31..0}

HI_{127..64} ← (r_{31})^{32} || r_{31..0}

Supplementary Explanation:

Normally, dividing the signed minimum value 0x80000000 (-2147483648) by 0xFFFFFFFF (-1) results in an overflow. With this instruction, however, no overflow exception occurs and the result is as follows:

Quotient is 0x80000000 (-2147483648), and remainder is 0x00000000 (0).

The signs of the quotient and the remainder depend on the signs of the dividend and the divisor, as shown in the table below:

Table B-1. Quotient and Remainder Signs

Dividend   Divisor    Quotient   Remainder
Positive   Positive   Positive   Positive
Positive   Negative   Negative   Positive
Negative   Positive   Negative   Negative
Negative   Negative   Positive   Negative
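
As an illustration only, the following Python sketch models the arithmetic described for DIV1: division that truncates toward zero, the overflow case, and sign extension of the 32-bit results into LO1 and HI1. The names `to_signed32`, `sign_extend_32` and `div1_model` are hypothetical and do not appear in the C790 documentation. Raising an error for a zero divisor is an assumption of the sketch; the hardware result is simply undefined.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement signed value."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x

def sign_extend_32(x):
    """Return the 64-bit pattern obtained by sign-extending the low 32 bits of x."""
    return to_signed32(x) & MASK64

def div1_model(rs, rt):
    """Hypothetical model of DIV1; returns (LO1, HI1) as 64-bit patterns."""
    a, b = to_signed32(rs), to_signed32(rt)
    if b == 0:
        # Assumption of this sketch: the real result is undefined, not an error.
        raise ValueError("DIV1 result is undefined for a zero divisor")
    # Truncate toward zero (Python's // rounds toward negative infinity).
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    r = a - q * b  # the remainder takes the sign of the dividend
    return sign_extend_32(q), sign_extend_32(r)

# Overflow case: 0x80000000 / 0xFFFFFFFF keeps the quotient bits 0x80000000.
lo1, hi1 = div1_model(0x80000000, 0xFFFFFFFF)
print(hex(lo1), hex(hi1))
```

The last two lines exercise the overflow case: the quotient bits stay 0x80000000 (sign-extended in LO1) and the remainder is zero.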

Exceptions:

None

Programming Notes:

In the C790, the integer divide operation proceeds asynchronously and allows other CPU instructions to execute before it is retired. An attempt to read the LO1 or HI1 register before the results are written will cause an interlock until the results are ready. Out-of-order execution does not affect the program result, but offers an opportunity for performance improvement by scheduling the divide so that other instructions can execute in parallel.

No arithmetic exception occurs under any circumstances. Divide-by-zero or overflow

conditions should be detected by instructions preceding the divide instruction. If the

divide is asynchronous then the zero-divisor check can execute in parallel with the divide.

The action taken on either divide-by-zero or overflow is either a convention within the

program itself or more typically, the system software; one possibility is to take a BREAK

exception with a code field value to signal the problem to the system software.

As an example, the C programming language in a UNIX environment expects division by

zero to either terminate the program or execute a program-specified signal handler. C

does not expect overflow to cause any exceptional condition. If the C compiler uses a divide

instruction, it also emits code to test for a zero divisor and execute a BREAK instruction to

inform the operating system if one is detected.

DIVU1

Divide Unsigned Word Pipeline 1

Bits    31..26   25..21   20..16   15..6        5..0
Field   MMI      rs       rt       0            DIVU1
Value   011100                     0000000000   011011
Width   6        5        5        10           6

C790

Format: DIVU1 rs, rt

Purpose: To divide 32-bit unsigned integers using pipeline 1.

Description: (LO1, HI1) ← rs / rt

The 32-bit value in GPR rs is divided by the 32-bit value in GPR rt, treating both operands as unsigned values. The 32-bit quotient is placed into special register LO1 (= LO_{127..64}) and the 32-bit remainder is placed into special register HI1 (= HI_{127..64}).

No arithmetic exception occurs under any circumstances.

Restrictions:

If either GPR rt or GPR rs does not contain a zero-extended 32-bit value (bits 63..32 equal zero), then the result of the operation is undefined.

If the divisor in GPR rt is zero, the arithmetic result will be undefined.

Operation:

if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif

q ← (0 || GPR[rs]_{31..0}) div (0 || GPR[rt]_{31..0})

r ← (0 || GPR[rs]_{31..0}) mod (0 || GPR[rt]_{31..0})

LO_{127..64} ← (q_{31})^{32} || q_{31..0}

HI_{127..64} ← (r_{31})^{32} || r_{31..0}

Exceptions:

None

Programming Notes:

See the Programming Notes for the DIV1 instruction.

LQ

Load Quadword

Bits    31..26   25..21   20..16   15..0
Field   LQ       base     rt       offset
Value   011110
Width   6        5        5        16

C790

Format: LQ rt, offset (base)

Purpose: To load a quadword from memory.

Description: rt ← memory [base + offset]

The contents of the 128-bit quadword at the memory location specified by the effective address are fetched and placed in the 128-bit GPR rt. The 16-bit signed offset is added to the contents of GPR base to form the effective address. The least-significant four bits of the effective address are masked to zero (effectively creating an aligned address) before being used to access memory. No address exceptions due to alignment are possible.

Restriction:

The effective address doesn't have to be naturally aligned. The least-significant four bits of the effective address are ignored.

Operations:

vAddr ← sign_extend(offset) + GPR[base]_{31..0}

vAddr_{3..0} ← 0^{4}

(pAddr, uncached) ← AddressTranslation(vAddr, DATA, LOAD)

memquad ← LoadMemory(uncached, QUADWORD, pAddr, vAddr, DATA)

GPR[rt]_{127..0} ← memquad
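
The address computation in the first two lines of the operation can be sketched in Python as follows. This is only a model of the arithmetic, assuming a 32-bit virtual address as the pseudocode suggests. The helper name `lq_effective_address` is hypothetical, and address translation and the memory access itself are not modeled.

```python
def lq_effective_address(base, offset16):
    """Hypothetical helper: aligned virtual address that LQ would access.

    base     -- contents of the base GPR (only the low 32 bits are used)
    offset16 -- the 16-bit offset field taken from the instruction word
    """
    offset16 &= 0xFFFF
    # Sign-extend the 16-bit offset.
    signed_offset = offset16 - 0x10000 if offset16 & 0x8000 else offset16
    vaddr = ((base & 0xFFFFFFFF) + signed_offset) & 0xFFFFFFFF
    # Force bits 3..0 to zero so that the access is quadword aligned.
    return vaddr & ~0xF

# An unaligned effective address is silently rounded down to a 16-byte boundary.
print(hex(lq_effective_address(0x1000, 0x0007)))
```

Because the low four bits are dropped rather than checked, any two effective addresses within the same 16-byte block load the same quadword.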

Exceptions:

TLB Refill

TLB Invalid

Address Error

MADD

Multiply-Add Word

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       0        MADD
Value   011100                              00000    000000
Width   6        5        5        5        5        6

C790

Format: MADD rs, rt

MADD rd, rs, rt

Purpose: To multiply 32-bit signed integers and add.

Description: (rd, HI, LO) ← (HI, LO) + rs × rt

The 32-bit word value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both operands as signed values, to produce a 64-bit multiply result. The 64-bit multiply result is added to the contents of special registers HI and LO. The low-order 32-bit word of the result is placed into special register LO and GPR rd, and the high-order 32-bit word of the result is placed into special register HI.

No arithmetic exception occurs under any circumstances.

If GPR rd is omitted in assembly language, 0 is used as the default value.

Restrictions:

If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation will be undefined.

Operation:

if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif

prod ← (HI_{31..0} || LO_{31..0}) + GPR[rs]_{31..0} * GPR[rt]_{31..0}

LO_{63..0} ← (prod_{31})^{32} || prod_{31..0}

HI_{63..0} ← (prod_{63})^{32} || prod_{63..32}

GPR[rd]_{63..0} ← (prod_{31})^{32} || prod_{31..0}
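
A minimal Python sketch of the accumulate step is given below, assuming the pseudocode above is read literally: the accumulator is formed from the low words of HI and LO, and each 32-bit half of the 64-bit sum is sign-extended on write-back. The names `to_signed32`, `sext32` and `madd_model` are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement signed value."""
    x &= MASK32
    return x - 0x100000000 if x & 0x80000000 else x

def sext32(x):
    """Sign-extend the low 32 bits of x to a 64-bit pattern."""
    return to_signed32(x) & MASK64

def madd_model(hi, lo, rs, rt):
    """Hypothetical model of MADD; returns the new (HI, LO, GPR[rd]) patterns."""
    acc = ((hi & MASK32) << 32) | (lo & MASK32)         # HI_{31..0} || LO_{31..0}
    prod = (acc + to_signed32(rs) * to_signed32(rt)) & MASK64
    new_lo = sext32(prod)                               # low word, sign-extended
    new_hi = sext32(prod >> 32)                         # high word, sign-extended
    return new_hi, new_lo, new_lo                       # rd receives the same value as LO

# A carry out of the low word propagates into HI.
print(madd_model(0, 0xFFFFFFFF, 1, 1))
```

Masking the sum to 64 bits keeps the two's-complement wraparound that a 64-bit hardware adder would produce.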

Exceptions:

None

Programming Notes:

In the C790, the integer multiply-accumulate operation proceeds asynchronously and allows other CPU instructions to execute before it is retired. An attempt to read the LO or HI register before the results are written will cause an interlock until the results are ready. Asynchronous execution does not affect the program result, but offers an opportunity for performance improvement by scheduling the multiply so that other instructions can execute in parallel.

MADD1

Multiply-Add Word Pipeline 1

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       0        MADD1
Value   011100                              00000    100000
Width   6        5        5        5        5        6

C790

Format: MADD1 rs, rt

MADD1 rd, rs, rt

Purpose: To multiply 32-bit signed integers and add in Pipeline 1.

Description: (rd, HI1, LO1) ← (HI1, LO1) + rs × rt

The 32-bit word value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both operands as signed values, to produce a 64-bit multiply result. The 64-bit multiply result is added to the contents of special registers HI1 (= HI_{127..64}) and LO1 (= LO_{127..64}). The low-order 32-bit word of the result is placed into special register LO1 and GPR rd, and the high-order 32-bit word of the result is placed into special register HI1.

No arithmetic exception occurs under any circumstances.

If GPR rd is omitted in assembly language, 0 is used as the default value.

Restrictions:

If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation will be undefined.

Operation:

if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif

prod ← (HI_{95..64} || LO_{95..64}) + GPR[rs]_{31..0} * GPR[rt]_{31..0}

LO_{127..64} ← (prod_{31})^{32} || prod_{31..0}

HI_{127..64} ← (prod_{63})^{32} || prod_{63..32}

GPR[rd]_{63..0} ← (prod_{31})^{32} || prod_{31..0}

Exceptions:

None

Programming Notes:

In the C790, the integer multiply-accumulate operation proceeds asynchronously and allows other CPU instructions to execute before it is retired. An attempt to read the LO1 or HI1 register before the results are written will cause an interlock until the results are ready. Asynchronous execution does not affect the program result, but offers an opportunity for performance improvement by scheduling the multiply so that other instructions can execute in parallel.

MADDU

Multiply-Add Unsigned Word

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       0        MADDU
Value   011100                              00000    000001
Width   6        5        5        5        5        6

C790

Format: MADDU rs, rt

MADDU rd, rs, rt

Purpose: To multiply 32-bit unsigned integers and add.

Description: (rd, HI, LO) ← (HI, LO) + rs × rt

The 32-bit word value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both operands as unsigned values, to produce a 64-bit multiply result. The 64-bit multiply result is added to the contents of special registers HI and LO. The low-order 32-bit word of the result is placed into special register LO and GPR rd, and the high-order 32-bit word of the result is placed into special register HI.

No arithmetic exception occurs under any circumstances.

If GPR rd is omitted in assembly language, 0 is used as the default value.

Restrictions:

If either GPR rt or GPR rs does not contain a zero-extended 32-bit value (bits 63..32 equal zero), then the result of the operation will be undefined.

Operation:

if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif

prod ← (HI_{31..0} || LO_{31..0}) + (0 || GPR[rs]_{31..0}) * (0 || GPR[rt]_{31..0})

LO_{63..0} ← (prod_{31})^{32} || prod_{31..0}

HI_{63..0} ← (prod_{63})^{32} || prod_{63..32}

GPR[rd]_{63..0} ← (prod_{31})^{32} || prod_{31..0}

Exceptions:

None

Programming Notes:

See the Programming Notes for the MADD instruction.

MADDU1

Multiply-Add Unsigned Word Pipeline 1

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       0        MADDU1
Value   011100                              00000    100001
Width   6        5        5        5        5        6

C790

Format: MADDU1 rs, rt

MADDU1 rd, rs, rt

Purpose: To multiply 32-bit unsigned integers and add in Pipeline 1.

Description: (rd, HI1, LO1) ← (HI1, LO1) + rs × rt

The 32-bit value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both operands as unsigned values, to produce a 64-bit multiply result. The 64-bit multiply result is added to the contents of special registers HI1 (= HI_{127..64}) and LO1 (= LO_{127..64}). The low-order 32-bit word of the result is placed into special register LO1 and GPR rd, and the high-order 32-bit word of the result is placed into special register HI1.

No arithmetic exception occurs under any circumstances.

If GPR rd is omitted in assembly language, 0 is used as the default value.

Restrictions:

If either GPR rt or GPR rs does not contain a zero-extended 32-bit value (bits 63..32 equal zero), then the result of the operation will be undefined.

Operation:

if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif

prod ← (HI_{95..64} || LO_{95..64}) + (0 || GPR[rs]_{31..0}) * (0 || GPR[rt]_{31..0})

LO_{127..64} ← (prod_{31})^{32} || prod_{31..0}

HI_{127..64} ← (prod_{63})^{32} || prod_{63..32}

GPR[rd]_{63..0} ← (prod_{31})^{32} || prod_{31..0}

Exceptions:

None

Programming Notes:

See the Programming Notes for the MADD1 instruction.

MFHI1

Move From HI1 Register

Bits    31..26   25..16       15..11   10..6    5..0
Field   MMI      0            rd       0        MFHI1
Value   011100   0000000000            00000    010000
Width   6        10           5        5        6

C790

Format: MFHI1 rd

Purpose: To copy the special purpose register HI1 to a GPR.

Description: rd ← HI1

The contents of special register HI1 (= HI_{127..64}) are loaded into GPR rd.

Restrictions:

None

Operation:

GPR[rd]_{63..0} ← HI_{127..64}

Exceptions:

None

MFLO1

Move From LO1 Register

Bits    31..26   25..16       15..11   10..6    5..0
Field   MMI      0            rd       0        MFLO1
Value   011100   0000000000            00000    010010
Width   6        10           5        5        6

C790

Format: MFLO1 rd

Purpose: To copy the special purpose LO1 register to a GPR.

Description: rd ← LO1

The contents of special register LO1 (= LO_{127..64}) are loaded into GPR rd.

Restrictions:

None

Operation:

GPR[rd]_{63..0} ← LO_{127..64}

Exceptions:

None

MFSA

Move from Shift Amount Register

Bits    31..26    25..16       15..11   10..6    5..0
Field   SPECIAL   0            rd       0        MFSA
Value   000000    0000000000            00000    101000
Width   6         10           5        5        6

C790

Format: MFSA rd

Purpose: To copy the shift amount register SA to a GPR.

Description: rd ← SA

The contents of SA, the special register storing the funnel shift amount, are loaded into GPR rd. Note that the shift amount is encoded in SA in an implementation-defined manner. Therefore, it is not meaningful for software to operate on the value returned in rd.

The sole purpose of this instruction is to permit the shift amount to be saved during a context switch. The MTSA instruction should be used to restore the state of SA.

Restrictions:

None

Operation:

GPR[rd]_{63..0} ← SA

Exceptions:

None

Implementation Note:

This instruction executes only in pipeline 0.

MTHI1

Move To HI1 Register

Bits    31..26   25..21   20..6             5..0
Field   MMI      rs       0                 MTHI1
Value   011100            000000000000000   010001
Width   6        5        15                6

C790

Format: MTHI1 rs

Purpose: To copy a GPR to the special purpose register HI1.

Description: HI1 ← rs

The contents of GPR rs are loaded into special register HI1 (= HI_{127..64}).

Restrictions:

None

Operation:

HI_{127..64} ← GPR[rs]_{63..0}
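
The relationship between HI1 and the 128-bit HI register can be sketched in Python as follows, treating HI as a single 128-bit integer. The function names are hypothetical. The sketch also assumes that the lower 64 bits of HI are left unchanged by MTHI1, which is how the pseudocode reads, since it assigns only bits 127..64.

```python
MASK64 = (1 << 64) - 1

def mthi1_model(hi128, rs):
    """Return the 128-bit HI value after MTHI1: bits 127..64 take GPR[rs] bits 63..0.

    Assumption: bits 63..0 of HI keep their previous contents.
    """
    return ((rs & MASK64) << 64) | (hi128 & MASK64)

def mfhi1_model(hi128):
    """Return HI1, that is, bits 127..64 of the 128-bit HI register."""
    return (hi128 >> 64) & MASK64

# Writing HI1 and reading it back returns the value written.
hi = mthi1_model(0x1234, 0xABCD)
print(hex(mfhi1_model(hi)))
```

The same layout applies to LO1 and the 128-bit LO register.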

Exceptions:

None

Programming Notes:

None

MTLO1

Move To LO1 Register

Bits    31..26   25..21   20..6             5..0
Field   MMI      rs       0                 MTLO1
Value   011100            000000000000000   010011
Width   6        5        15                6

C790

Format: MTLO1 rs

Purpose: To copy a GPR to the special purpose register LO1.

Description: LO1 ← rs

The contents of GPR rs are loaded into special register LO1 (= LO_{127..64}).

Restrictions:

None

Operation:

LO_{127..64} ← GPR[rs]_{63..0}

Exceptions:

None

Programming Notes:

None

MTSA

Move to Shift Amount Register

Bits    31..26    25..21   20..6             5..0
Field   SPECIAL   rs       0                 MTSA
Value   000000             000000000000000   101001
Width   6         5        15                6

C790

Format: MTSA rs

Purpose: To copy a GPR to the shift amount register SA.

Description: SA ← rs

The contents of GPR rs are loaded into SA, the special register storing the funnel shift amount. Note that rs must contain a value that was originally generated by MFSA. If some other user-generated value is in rs, the shifting action performed by the funnel shifter is not defined; that is, MTSA cannot be used by a program to set a new funnel shift amount. This is because the shift amount is encoded in SA in an implementation-defined manner. The sole purpose of this instruction is to permit the shift amount to be restored during a context switch.

Restrictions:

Note that the three instructions statically preceding an MTSA instruction must not read or write the SA register; that is, they cannot be any of the instructions MFSA, QFSRV, or MTSAx.

Use the MTSAB and MTSAH instructions to set a new funnel shift amount.

Operation:

SA ← GPR[rs]_{63..0}

Exceptions:

None

Implementation Note:

1. MTSA updates the SA register in the A Stage. To keep exception processing simple,

this requires that the cycle prior to MTSA not read the SA register. Also, when

single stepping, making sure that SA always contains the value of the SA write

instruction, just single stepped, requires that the cycle after MTSA not write the

SA register. Both these rules are enforced by the architectural requirement that

the three instructions prior to MTSA not read SA.

2. The MTSA instruction executes only in pipeline 0.

MTSAB

Move Byte Count to Shift Amount Register

Bits    31..26   25..21   20..16   15..0
Field   REGIMM   rs       MTSAB    immediate
Value   000001            11000
Width   6        5        5        16

C790

Format: MTSAB rs, immediate

Purpose: To set a byte shift amount in the shift amount register SA.

Description: SA ← (rs xor immediate) × 8

The least-significant four bits of GPR rs are XORed with the least-significant four bits of the immediate value. The resulting four bits are interpreted as a byte shift amount and stored into SA, the special register storing the funnel shift amount.

Restrictions:

The three instructions statically preceding an MTSAB instruction must not read the SA register; that is, they cannot be either of the instructions MFSA or QFSRV.

Operation:

SA ← (GPR[rs]_{3..0} xor immediate_{3..0}) * 8

Exceptions:

None

Implementation Note:

1. MTSAB updates the SA register in the A Stage. To keep exception processing

simple, this requires that the cycle prior to MTSAB not read the SA register. Also,

when single stepping, making sure that SA always contains the value of the SA

write instruction, just single stepped, requires that the cycle after the MTSAB not

write the SA register. Both these rules are enforced by the architectural

requirement that the three instructions prior to MTSAB not read SA.

2. The MTSAB instruction executes only in pipeline 0.

Programming Note:

MTSAB allows the user to load either a variable shift amount or a fixed shift amount, as follows:

mtsab 0, 5 // Set shift amount to “5 bytes”

mtsab 10, 0 // Set byte shift amount to contents of GPR10
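
The two usages can be checked against a small Python model of the operation. The sketch returns the logical shift amount in bits. The function name `mtsab_shift_bits` is hypothetical, and the value actually held in SA is encoded in an implementation-defined manner, so this is not a model of the register contents.

```python
def mtsab_shift_bits(rs, immediate):
    """Logical funnel shift amount, in bits, selected by MTSAB.

    Only the low four bits of each operand take part in the XOR.
    """
    return ((rs & 0xF) ^ (immediate & 0xF)) * 8

# GPR0 always reads as zero, so "mtsab 0, 5" selects a fixed 5-byte shift.
print(mtsab_shift_bits(0, 5))

# With a zero immediate the low four bits of the register select the shift.
print(mtsab_shift_bits(0x13, 0))
```

Since XOR with zero leaves the other operand unchanged, one instruction covers both the fixed and the variable case.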

MTSAH

Move Halfword Count to Shift Amount Register

Bits    31..26   25..21   20..16   15..0
Field   REGIMM   rs       MTSAH    immediate
Value   000001            11001
Width   6        5        5        16

C790

Format: MTSAH rs, immediate

Purpose: To set a halfword shift amount in the shift amount register SA.

Description: SA ← (rs xor immediate) × 16

The least-significant three bits of GPR rs are XORed with the least-significant three bits of the immediate value. The resulting three bits are interpreted as a halfword shift amount and stored into SA, the special register storing the funnel shift amount.

Restrictions:

The three instructions statically preceding an MTSAH instruction must not read the SA register; that is, they cannot be either of the instructions MFSA or QFSRV.

Operation:

SA ← (GPR[rs]_{2..0} xor immediate_{2..0}) * 16

Exceptions:

None

Implementation Note:

1. MTSAH updates the SA register in the A Stage. To keep exception processing

simple, this requires that the cycle prior to MTSAH not read the SA register. Also,

when single stepping, making sure that SA always contains the value of the SA

write instruction, just single stepped, requires that the cycle after MTSAH not

write the SA register. Both these rules are enforced by the architectural

requirement that the three instructions prior to MTSAH not read SA.

2. The MTSAH instruction executes only in pipeline 0.

Programming Note:

MTSAH allows the user to load either a variable shift amount or a fixed shift amount, as follows:

mtsah 0, 5 // Set shift amount to “5 halfwords”

mtsah 10, 0 // Set halfword shift amount to value of GPR10

MULT

Multiply Word

Bits    31..26    25..21   20..16   15..11   10..6    5..0
Field   SPECIAL   rs       rt       rd       0        MULT
Value   000000                               00000    011000
Width   6         5        5        5        5        6

C790

Format: MULT rd, rs, rt

MULT rs, rt

Purpose: To multiply 32-bit signed integers.

Description: (rd, HI, LO) ← rs × rt

The 32-bit value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both operands as signed values, to produce a 64-bit result. The low-order 32 bits of the result are placed into special register LO and GPR rd, and the high-order 32 bits of the result are placed into special register HI.

No arithmetic exception occurs under any circumstances.

If GPR rd is omitted in assembly language, 0 is used as the default value.

Restrictions:

If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation will be undefined.

Operation:

if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif

prod ← GPR[rs]_{31..0} * GPR[rt]_{31..0}

LO_{63..0} ← (prod_{31})^{32} || prod_{31..0}

HI_{63..0} ← (prod_{63})^{32} || prod_{63..32}

GPR[rd]_{63..0} ← (prod_{31})^{32} || prod_{31..0}

Exceptions:

None

Programming Notes:

In the C790, the integer multiply operation allows other CPU instructions to execute out of order. An attempt to read the LO or HI register before the results are written will cause an interlock until the results are ready. Asynchronous execution does not affect the program result, but offers an opportunity for performance improvement by scheduling the multiply so that other instructions can execute in parallel.

Programs that require overflow detection must check for it explicitly.

MULT1

Multiply Word Pipeline 1

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       0        MULT1
Value   011100                              00000    011000
Width   6        5        5        5        5        6

C790

Format: MULT1 rd, rs, rt

MULT1 rs, rt

Purpose: To multiply 32-bit signed integers in Pipeline 1.

Description: (rd, HI1, LO1) ← rs × rt

The 32-bit value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both operands as signed values, to produce a 64-bit result. The low-order 32 bits of the result are placed into special register LO1 (= LO_{127..64}) and GPR rd, and the high-order 32 bits of the result are placed into special register HI1 (= HI_{127..64}).

No arithmetic exception occurs under any circumstances.

If GPR rd is omitted in assembly language, 0 is used as the default value.

Restrictions:

If either GPR rt or GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation will be undefined.

Operation:

if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif

prod ← GPR[rs]_{31..0} * GPR[rt]_{31..0}

LO_{127..64} ← (prod_{31})^{32} || prod_{31..0}

HI_{127..64} ← (prod_{63})^{32} || prod_{63..32}

GPR[rd]_{63..0} ← (prod_{31})^{32} || prod_{31..0}

Exceptions:

None

Programming Notes:

In the C790, the integer multiply operation allows other CPU instructions to execute out of order. An attempt to read LO1 or HI1 before the results are written will cause an interlock until the results are ready. Asynchronous execution does not affect the program result, but offers an opportunity for performance improvement by scheduling the multiply so that other instructions can execute in parallel.

Programs that require overflow detection must check for it explicitly.

MULTU

Multiply Unsigned Word

Bits    31..26    25..21   20..16   15..11   10..6    5..0
Field   SPECIAL   rs       rt       rd       0        MULTU
Value   000000                               00000    011001
Width   6         5        5        5        5        6

C790

Format: MULTU rd, rs, rt

MULTU rs, rt

Purpose: To multiply 32-bit unsigned integers.

Description: (rd, HI, LO) ← rs × rt

The 32-bit value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both operands as unsigned values, to produce a 64-bit result. The low-order 32 bits of the result are placed into special register LO and GPR rd, and the high-order 32 bits of the result are placed into special register HI.

No arithmetic exception occurs under any circumstances.

If GPR rd is omitted in assembly language, 0 is used as the default value.

Restrictions:

If either GPR rt or GPR rs does not contain a zero-extended 32-bit value (bits 63..32 equal zero), then the result of the operation will be undefined.

Operation:

if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif

prod ← (0 || GPR[rs]_{31..0}) * (0 || GPR[rt]_{31..0})

LO_{63..0} ← (prod_{31})^{32} || prod_{31..0}

HI_{63..0} ← (prod_{63})^{32} || prod_{63..32}

GPR[rd]_{63..0} ← (prod_{31})^{32} || prod_{31..0}
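
One consequence of the operation above is that each 32-bit half of the unsigned product is still sign-extended when written back. The following Python sketch, which uses the hypothetical names `sext32` and `multu_model`, makes this visible. It models only the arithmetic and ignores the NotWordValue check.

```python
MASK32 = 0xFFFFFFFF

def sext32(x):
    """Sign-extend the low 32 bits of x to a 64-bit pattern."""
    x &= MASK32
    return (x | 0xFFFFFFFF00000000) if (x & 0x80000000) else x

def multu_model(rs, rt):
    """Hypothetical model of MULTU; returns the new (HI, LO, GPR[rd]) patterns."""
    prod = (rs & MASK32) * (rt & MASK32)   # unsigned 32 x 32 -> 64-bit product
    lo = sext32(prod)                      # (prod_{31})^{32} || prod_{31..0}
    hi = sext32(prod >> 32)                # (prod_{63})^{32} || prod_{63..32}
    return hi, lo, lo                      # rd receives the same value as LO

# 0x10000 * 0x8000 = 0x80000000: bit 31 of the low word is set, so LO and rd
# hold the sign-extended pattern even though the multiply was unsigned.
hi, lo, rd = multu_model(0x10000, 0x8000)
print(hex(hi), hex(lo), hex(rd))
```

Software that needs the 64-bit unsigned product therefore has to combine the low 32 bits of HI and LO rather than use either register as a full 64-bit value.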

Exceptions:

None

Programming Notes:

See the Programming Notes for the MULT instruction.

MULTU1

Multiply Unsigned Word Pipeline 1

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       0        MULTU1
Value   011100                              00000    011001
Width   6        5        5        5        5        6

C790

Format: MULTU1 rd, rs, rt

MULTU1 rs, rt

Purpose: To multiply 32-bit unsigned integers in Pipeline 1.

Description: (rd, HI1, LO1) ← rs × rt

The 32-bit value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both operands as unsigned values, to produce a 64-bit result. The low-order 32 bits of the result are placed into special register LO1 (= LO_{127..64}) and GPR rd, and the high-order 32 bits of the result are placed into special register HI1 (= HI_{127..64}).

No arithmetic exception occurs under any circumstances.

If GPR rd is omitted in assembly language, 0 is used as the default value.

Restrictions:

If either GPR rt or GPR rs does not contain a zero-extended 32-bit value (bits 63..32 equal zero), then the result of the operation will be undefined.

Operation:

if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then UndefinedResult() endif

prod ← (0 || GPR[rs]_{31..0}) * (0 || GPR[rt]_{31..0})

LO_{127..64} ← (prod_{31})^{32} || prod_{31..0}

HI_{127..64} ← (prod_{63})^{32} || prod_{63..32}

GPR[rd]_{63..0} ← (prod_{31})^{32} || prod_{31..0}

Exceptions:

None

Programming Notes:

See the Programming Notes for the MULT1 instruction.

PABSH

Parallel Absolute Halfword

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      0        rt       rd       PABSH    MMI1
Value   011100   00000                      00101    101000
Width   6        5        5        5        5        6

C790

Format: PABSH rd, rt

Purpose: To calculate the absolute value of 8 16-bit integers in parallel.

Description: rd ← rt

The absolute value of the eight signed halfword values in GPR

rt

are placed into the

corresponding eight halfwords in GPR

rd

.

This instruction operates on 128-bit regis t ers .

Operation:

GPR[rd]15..0 ← GPR[rt]15..0 

GPR[rd]31..16 ← GPR[rt]31..16

GPR[rd]47..32 ← GPR[rt]47..32

GPR[rd]63..48 ← GPR[rt]63..48

GPR[rd]79..64 ← GPR[rt]79..64

GPR[rd]95..80 ← GPR[rt]95..80

GPR[rd]111..96 ← GPR[rt]111..96

GPR[rd]127..112 ← GPR[rt]127..112

rt A7 A6 A5 A4 A3 A2 A1 A0

127 112 111 96 95 80 79 64 63 48 47 32 31 16 15 0

rd  A7   A6   A5   A4   A3   A2   A1   A0 

Supplementary explanation:

When the halfword value in GPR rt is 0x8000 (-32768), the smallest negative value, the operation will result in an overflow. However, no overflow exception occurs; the result is truncated to the largest positive number, 0x7FFF (+32767).
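
The lane-wise behavior, including the 0x8000 case, can be sketched in Python by treating the 128-bit register as one integer. The function name `pabsh_model` is hypothetical, and the sketch models only the arithmetic.

```python
def pabsh_model(rt):
    """Hypothetical model of PABSH on a 128-bit value held in a Python int."""
    rd = 0
    for lane in range(8):
        h = (rt >> (16 * lane)) & 0xFFFF
        signed = h - 0x10000 if h & 0x8000 else h   # two's-complement halfword
        magnitude = min(abs(signed), 0x7FFF)        # |-32768| is limited to +32767
        rd |= magnitude << (16 * lane)
    return rd

# Lanes from low to high: 0x0001, 0xFFFF (-1) and 0x8000 (-32768).
print(hex(pabsh_model(0x8000_FFFF_0001)))
```

PABSW follows the same pattern with four 32-bit lanes and a limit of 0x7FFFFFFF.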

Exceptions:

None

PABSW

Parallel Absolute Word

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      0        rt       rd       PABSW    MMI1
Value   011100   00000                      00001    101000
Width   6        5        5        5        5        6

C790

Format: PABSW rd, rt

Purpose: To calculate the absolute value of 4 32-bit integers in parallel.

Description: rd ← rt

The absolute value of the four signed word values in GPR

rt

are placed into the

corresponding four words in GPR

rd

.

This instruction operates on 128-bit regis t ers .

Operation:

GPR[rd]31..0 ←  GPR[rt]31..0 

GPR[rd]63..32 ←  GPR[rt]63..32 

GPR[rd]95..64 ←  GPR[rt]95..64 

GPR[rd]127..96 ←  GPR[rt]127..96 

rt A3 A2 A1 A0

127 96 95 64 63 32 31 0

rd  A3   A2   A1   A0 

Supplementary explanation:

When the word value in GPR rt is equal to 0x80000000 (-2147483648), the smallest negative number, the operation will result in an overflow. However, no overflow exception occurs; the result is truncated to the largest positive value, 0x7FFFFFFF (+2147483647).

Exceptions:

None
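The PABSW encoding fields listed above (MMI = 011100 in bits 31..26, zero in bits 25..21, rt, rd, PABSW = 00001 in bits 10..6, MMI1 = 101000 in bits 5..0) can be packed into a 32-bit instruction word as sketched below. The function and constant names are hypothetical helpers for illustration, not part of an official assembler.

```python
MMI_OPCODE = 0b011100    # bits 31..26
PABSW_SUBOP = 0b00001    # bits 10..6
MMI1_FUNCT = 0b101000    # bits 5..0


def encode_pabsw(rd: int, rt: int) -> int:
    """Pack the PABSW fields into a 32-bit instruction word (illustrative)."""
    if not (0 <= rd < 32 and 0 <= rt < 32):
        raise ValueError("register numbers must be in 0..31")
    return (
        (MMI_OPCODE << 26)
        | (0 << 21)              # bits 25..21 are zero for PABSW
        | (rt << 16)
        | (rd << 11)
        | (PABSW_SUBOP << 6)
        | MMI1_FUNCT
    )


print(hex(encode_pabsw(rd=2, rt=3)))
```

The three-operand instructions in this appendix use the same layout, with rs placed in bits 25..21 instead of zero.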


PADDB

Parallel Add Byte

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       PADDB    MMI0
Value   011100                              01000    001000
Width   6        5        5        5        5        6

C790

Format: PADDB rd, rs, rt

Purpose: To add 16 pairs of 8-bit integers in parallel.

Description: rd ← rs + rt

The sixteen byte values in GPR rs are added to the corresponding sixteen byte values in GPR rt in parallel. The results are placed into the corresponding sixteen bytes in GPR rd.

No overflow or underflow exceptions are generated under any circumstances.

This instruction operates on 128-bit registers.

Operation:

GPR[rd]7..0 ← (GPR[rs]7..0 + GPR[rt]7..0)7..0

GPR[rd]15..8 ← (GPR[rs]15..8 + GPR[rt]15..8)7..0

GPR[rd]23..16 ← (GPR[rs]23..16 + GPR[rt]23..16)7..0

GPR[rd]31..24 ← (GPR[rs]31..24 + GPR[rt]31..24)7..0

GPR[rd]39..32 ← (GPR[rs]39..32 + GPR[rt]39..32)7..0

GPR[rd]47..40 ← (GPR[rs]47..40 + GPR[rt]47..40)7..0

GPR[rd]55..48 ← (GPR[rs]55..48 + GPR[rt]55..48)7..0

GPR[rd]63..56 ← (GPR[rs]63..56 + GPR[rt]63..56)7..0

GPR[rd]71..64 ← (GPR[rs]71..64 + GPR[rt]71..64)7..0

GPR[rd]79..72 ← (GPR[rs]79..72 + GPR[rt]79..72)7..0

GPR[rd]87..80 ← (GPR[rs]87..80 + GPR[rt]87..80)7..0

GPR[rd]95..88 ← (GPR[rs]95..88 + GPR[rt]95..88)7..0

GPR[rd]103..96 ← (GPR[rs]103..96 + GPR[rt]103..96)7..0

GPR[rd]111..104 ← (GPR[rs]111..104 + GPR[rt]111..104)7..0

GPR[rd]119..112 ← (GPR[rs]119..112 + GPR[rt]119..112)7..0

GPR[rd]127..120 ← (GPR[rs]127..120 + GPR[rt]127..120)7..0

Byte lane i (i = 0..15) occupies bits 8i+7..8i.

Bits   127..120  119..112  111..104  ...  15..8   7..0
rs     A15       A14       A13       ...  A1      A0
rt     B15       B14       B13       ...  B1      B0
rd     A15+B15   A14+B14   A13+B13   ...  A1+B1   A0+B0

Exceptions:

None
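The lane-wise behavior of PADDB differs from an ordinary 128-bit addition because a carry out of one byte never reaches the next. The sketch below models this under the assumption that a register is a Python integer; paddb is an illustrative name only.

```python
def paddb(rs: int, rt: int) -> int:
    """Illustrative model of PADDB: sixteen independent 8-bit additions."""
    rd = 0
    for lane in range(16):
        shift = 8 * lane
        a = (rs >> shift) & 0xFF
        b = (rt >> shift) & 0xFF
        # Keep bits 7..0 of the sum; the carry out of the lane is discarded.
        rd |= ((a + b) & 0xFF) << shift
    return rd


rs, rt = 0x01FF, 0x0101
print(hex(paddb(rs, rt)))   # lane-wise sum
print(hex(rs + rt))         # ordinary integer sum, for comparison
```

In this example lane 0 wraps from 0xFF to 0x00, while lane 1 is computed as 0x01 + 0x01 without receiving a carry.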


PADDH

Parallel Add Halfword

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       PADDH    MMI0
Value   011100                              00100    001000
Width   6        5        5        5        5        6

C790

Format: PADDH rd, rs, rt

Purpose: To add 8 pairs of 16-bit integers in parallel.

Description: rd ← rs + rt

The eight halfword values in GPR rs are added to the corresponding eight halfword values in GPR rt in parallel. The results are placed into the corresponding eight halfwords in GPR rd.

No overflow or underflow exceptions are generated under any circumstances.

This instruction operates on 128-bit registers.

Operation:

GPR[rd]15..0 ← (GPR[rs]15..0 + GPR[rt]15..0)15..0

GPR[rd]31..16 ← (GPR[rs]31..16 + GPR[rt]31..16)15..0

GPR[rd]47..32 ← (GPR[rs]47..32 + GPR[rt]47..32)15..0

GPR[rd]63..48 ← (GPR[rs]63..48 + GPR[rt]63..48)15..0

GPR[rd]79..64 ← (GPR[rs]79..64 + GPR[rt]79..64)15..0

GPR[rd]95..80 ← (GPR[rs]95..80 + GPR[rt]95..80)15..0

GPR[rd]111..96 ← (GPR[rs]111..96 + GPR[rt]111..96)15..0

GPR[rd]127..112 ← (GPR[rs]127..112 + GPR[rt]127..112)15..0

Bits   127..112  111..96  95..80  79..64  63..48  47..32  31..16  15..0
rs     A7        A6       A5      A4      A3      A2      A1      A0
rt     B7        B6       B5      B4      B3      B2      B1      B0
rd     A7+B7     A6+B6    A5+B5   A4+B4   A3+B3   A2+B2   A1+B1   A0+B0

Exceptions:

None


PADDSB

Parallel Add with Signed saturation Byte

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       PADDSB   MMI0
Value   011100                              11000    001000
Width   6        5        5        5        5        6

C790

Format: PADDSB rd, rs, rt

Purpose: To add 16 pairs of 8-bit signed integers with saturation in parallel.

Description: rd ← rs + rt

The sixteen signed byte values in GPR rs are added to the corresponding sixteen signed byte values in GPR rt in parallel. The results are placed into the corresponding sixteen bytes in GPR rd.

No overflow or underflow exceptions are generated under any circumstances. Results beyond the range of a signed byte value are saturated according to the following:

Overflow: 0x7F

Underflow: 0x80

This instruction operates on 128-bit registers.

Operation:

if ((GPR[rs]7..0 + GPR[rt]7..0) > 0x7F) then

GPR[rd]7..0 ← 0x7F

else if (0x100 <= (GPR[rs]7..0 + GPR[rt]7..0) < 0x180) then

GPR[rd]7..0 ← 0x80

else

GPR[rd]7..0 ← (GPR[rs]7..0 + GPR[rt]7..0)7..0

endif

if ((GPR[rs]15..8 + GPR[rt]15..8) > 0x7F) then

GPR[rd]15..8 ← 0x7F

else if (0x100 <= (GPR[rs]15..8 + GPR[rt]15..8) < 0x180) then

GPR[rd]15..8 ← 0x80

else

GPR[rd]15..8 ← (GPR[rs]15..8 + GPR[rt]15..8)7..0

endif

if ((GPR[rs]23..16 + GPR[rt]23..16) > 0x7F) then

GPR[rd]23..16 ← 0x7F

else if (0x100 <= (GPR[rs]23..16 + GPR[rt]23..16) < 0x180) then

GPR[rd]23..16 ← 0x80

else

GPR[rd]23..16 ← (GPR[rs]23..16 + GPR[rt]23..16)7..0

endif

if ((GPR[rs]31..24 + GPR[rt]31..24) > 0x7F) then

GPR[rd]31..24 ← 0x7F

else if (0x100 <= (GPR[rs]31..24 + GPR[rt]31..24) < 0x180) then


GPR[rd]31..24 ← 0x80

else

GPR[rd]31..24 ← (GPR[rs]31..24 + GPR[rt]31..24)7..0

endif

if ((GPR[rs]39..32 + GPR[rt]39..32) > 0x7F) then

GPR[rd]39..32 ← 0x7F

else if (0x100 <= (GPR[rs]39..32 + GPR[rt]39..32) < 0x180) then

GPR[rd]39..32 ← 0x80

else

GPR[rd]39..32 ← (GPR[rs]39..32 + GPR[rt]39..32)7..0

endif

if ((GPR[rs]47..40 + GPR[rt]47..40) > 0x7F) then

GPR[rd]47..40 ← 0x7F

else if (0x100 <= (GPR[rs]47..40 + GPR[rt]47..40) < 0x180) then

GPR[rd]47..40 ← 0x80

else

GPR[rd]47..40 ← (GPR[rs]47..40 + GPR[rt]47..40)7..0

endif

if ((GPR[rs]55..48 + GPR[rt]55..48) > 0x7F) then

GPR[rd]55..48 ← 0x7F

else if (0x100 <= (GPR[rs]55..48 + GPR[rt]55..48) < 0x180) then

GPR[rd]55..48 ← 0x80

else

GPR[rd]55..48 ← (GPR[rs]55..48 + GPR[rt]55..48)7..0

endif

if ((GPR[rs]63..56 + GPR[rt]63..56) > 0x7F) then

GPR[rd]63..56 ← 0x7F

else if (0x100 <= (GPR[rs]63..56 + GPR[rt]63..56) < 0x180) then

GPR[rd]63..56 ← 0x80

else

GPR[rd]63..56 ← (GPR[rs]63..56 + GPR[rt]63..56)7..0

endif

if ((GPR[rs]71..64 + GPR[rt]71..64) > 0x7F) then

GPR[rd]71..64 ← 0x7F

else if (0x100 <= (GPR[rs]71..64 + GPR[rt]71..64) < 0x180) then

GPR[rd]71..64 ← 0x80

else

GPR[rd]71..64 ← (GPR[rs]71..64 + GPR[rt]71..64)7..0

endif

if ((GPR[rs]79..72 + GPR[rt]79..72) > 0x7F) then

GPR[rd]79..72 ← 0x7F

else if (0x100 <= (GPR[rs]79..72 + GPR[rt]79..72) < 0x180) then

GPR[rd]79..72 ← 0x80

else

GPR[rd]79..72 ← (GPR[rs]79..72 + GPR[rt]79..72)7..0

endif

if ((GPR[rs]87..80 + GPR[rt]87..80) > 0x7F) then

GPR[rd]87..80 ← 0x7F


else if (0x100 <= (GPR[rs]87..80 + GPR[rt]87..80) < 0x180) then

GPR[rd]87..80 ← 0x80

else

GPR[rd]87..80 ← (GPR[rs]87..80 + GPR[rt]87..80)7..0

endif

if ((GPR[rs]95..88 + GPR[rt]95..88) > 0x7F) then

GPR[rd]95..88 ← 0x7F

else if (0x100 <= (GPR[rs]95..88 + GPR[rt]95..88) < 0x180) then

GPR[rd]95..88 ← 0x80

else

GPR[rd]95..88 ← (GPR[rs]95..88 + GPR[rt]95..88)7..0

endif

if ((GPR[rs]103..96 + GPR[rt]103..96) > 0x7F) then

GPR[rd]103..96 ← 0x7F

else if (0x100 <= (GPR[rs]103..96 + GPR[rt]103..96) < 0x180) then

GPR[rd]103..96 ← 0x80

else

GPR[rd]103..96 ← (GPR[rs]103..96 + GPR[rt]103..96)7..0

endif

if ((GPR[rs]111..104 + GPR[rt]111..104) > 0x7F) then

GPR[rd]111..104 ← 0x7F

else if (0x100 <= (GPR[rs]111..104 + GPR[rt]111..104) < 0x180) then

GPR[rd]111..104 ← 0x80

else

GPR[rd]111..104 ← (GPR[rs]111..104 + GPR[rt]111..104)7..0

endif

if ((GPR[rs]119..112 + GPR[rt]119..112) > 0x7F) then

GPR[rd]119..112 ← 0x7F

else if (0x100 <= (GPR[rs]119..112 + GPR[rt]119..112) < 0x180) then

GPR[rd]119..112 ← 0x80

else

GPR[rd]119..112 ← (GPR[rs]119..112 + GPR[rt]119..112)7..0

endif

if ((GPR[rs]127..120 + GPR[rt]127..120) > 0x7F) then

GPR[rd]127..120 ← 0x7F

else if (0x100 <= (GPR[rs]127..120 + GPR[rt]127..120) < 0x180) then

GPR[rd]127..120 ← 0x80

else

GPR[rd]127..120 ← (GPR[rs]127..120 + GPR[rt]127..120)7..0

endif


Byte lane i (i = 0..15) occupies bits 8i+7..8i.

Bits   127..120  119..112  111..104  ...  15..8   7..0
rs     A15       A14       A13       ...  A1      A0
rt     B15       B14       B13       ...  B1      B0
rd*    A15+B15   A14+B14   A13+B13   ...  A1+B1   A0+B0

* Saturate to signed byte

Exceptions:

None


PADDSH

Parallel Add with Signed saturation Halfword

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       PADDSH   MMI0
Value   011100                              10100    001000
Width   6        5        5        5        5        6

C790

Format: PADDSH rd, rs, rt

Purpose: To add 8 pairs of 16-bit signed integers with saturation in parallel.

Description: rd ← rs + rt

The eight signed halfword values in GPR rs are added to the corresponding eight signed halfword values in GPR rt in parallel. The results are placed into the corresponding eight halfwords in GPR rd.

No overflow or underflow exceptions are generated under any circumstances. Results beyond the range of a signed halfword value are saturated according to the following:

Overflow: 0x7FFF

Underflow: 0x8000

This instruction operates on 128-bit registers.

Operation:

if ((GPR[rs]15..0 + GPR[rt]15..0) > 0x7FFF) then

GPR[rd]15..0 ← 0x7FFF

else if (0x10000 <= (GPR[rs]15..0 + GPR[rt]15..0) < 0x18000) then

GPR[rd]15..0 ← 0x8000

else

GPR[rd]15..0 ← (GPR[rs]15..0 + GPR[rt]15..0)15..0

endif

if ((GPR[rs]31..16 + GPR[rt]31..16) > 0x7FFF) then

GPR[rd]31..16 ← 0x7FFF

else if (0x10000 <= (GPR[rs]31..16 + GPR[rt]31..16) < 0x18000) then

GPR[rd]31..16 ← 0x8000

else

GPR[rd]31..16 ← (GPR[rs]31..16 + GPR[rt]31..16)15..0

endif

if ((GPR[rs]47..32 + GPR[rt]47..32) > 0x7FFF) then

GPR[rd]47..32 ← 0x7FFF

else if (0x10000 <= (GPR[rs]47..32 + GPR[rt]47..32) < 0x18000) then

GPR[rd]47..32 ← 0x8000

else

GPR[rd]47..32 ← (GPR[rs]47..32 + GPR[rt]47..32)15..0

endif


if ((GPR[rs]63..48 + GPR[rt]63..48) > 0x7FFF) then

GPR[rd]63..48 ← 0x7FFF

else if (0x10000 <= (GPR[rs]63..48 + GPR[rt]63..48) < 0x18000) then

GPR[rd]63..48 ← 0x8000

else

GPR[rd]63..48 ← (GPR[rs]63..48 + GPR[rt]63..48)15..0

endif

if ((GPR[rs]79..64 + GPR[rt]79..64) > 0x7FFF) then

GPR[rd]79..64 ← 0x7FFF

else if (0x10000 <= (GPR[rs]79..64 + GPR[rt]79..64) < 0x18000) then

GPR[rd]79..64 ← 0x8000

else

GPR[rd]79..64 ← (GPR[rs]79..64 + GPR[rt]79..64)15..0

endif

if ((GPR[rs]95..80 + GPR[rt]95..80) > 0x7FFF) then

GPR[rd]95..80 ← 0x7FFF

else if (0x10000 <= (GPR[rs]95..80 + GPR[rt]95..80) < 0x18000) then

GPR[rd]95..80 ← 0x8000

else

GPR[rd]95..80 ← (GPR[rs]95..80 + GPR[rt]95..80)15..0

endif

if ((GPR[rs]111..96 + GPR[rt]111..96) > 0x7FFF) then

GPR[rd]111..96 ← 0x7FFF

else if (0x10000 <= (GPR[rs]111..96 + GPR[rt]111..96) < 0x18000) then

GPR[rd]111..96 ← 0x8000

else

GPR[rd]111..96 ← (GPR[rs]111..96 + GPR[rt]111..96)15..0

endif

if ((GPR[rs]127..112 + GPR[rt]127..112) > 0x7FFF) then

GPR[rd]127..112 ← 0x7FFF

else if (0x10000 <= (GPR[rs]127..112 + GPR[rt]127..112) < 0x18000) then

GPR[rd]127..112 ← 0x8000

else

GPR[rd]127..112 ← (GPR[rs]127..112 + GPR[rt]127..112)15..0

endif

Bits   127..112  111..96  95..80  79..64  63..48  47..32  31..16  15..0
rs     A7        A6       A5      A4      A3      A2      A1      A0
rt     B7        B6       B5      B4      B3      B2      B1      B0
rd*    A7+B7     A6+B6    A5+B5   A4+B4   A3+B3   A2+B2   A1+B1   A0+B0

* Saturate to signed halfword

Exceptions:

None


PADDSW

Parallel Add with Signed saturation Word

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       PADDSW   MMI0
Value   011100                              10000    001000
Width   6        5        5        5        5        6

C790

Format: PADDSW rd, rs, rt

Purpose: To add 4 pairs of 32-bit signed integers with saturation in parallel.

Description: rd ← rs + rt

The four signed word values in GPR rs are added to the corresponding four signed word values in GPR rt in parallel. The results are placed into the corresponding four words in GPR rd.

No overflow or underflow exceptions are generated under any circumstances. Results beyond the range of a signed word value are saturated according to the following:

Overflow: 0x7FFFFFFF

Underflow: 0x80000000

This instruction operates on 128-bit registers.

Operation:

if ((GPR[rs]31..0 + GPR[rt]31..0) > 0x7FFFFFFF) then

GPR[rd]31..0 ← 0x7FFFFFFF

else if (0x100000000 <= (GPR[rs]31..0 + GPR[rt]31..0) < 0x180000000) then

GPR[rd]31..0 ← 0x80000000

else

GPR[rd]31..0 ← (GPR[rs]31..0 + GPR[rt]31..0)31..0

endif

if ((GPR[rs]63..32 + GPR[rt]63..32) > 0x7FFFFFFF) then

GPR[rd]63..32 ← 0x7FFFFFFF

else if (0x100000000 <= (GPR[rs]63..32 + GPR[rt]63..32) < 0x180000000) then

GPR[rd]63..32 ← 0x80000000

else

GPR[rd]63..32 ← (GPR[rs]63..32 + GPR[rt]63..32)31..0

endif

if ((GPR[rs]95..64 + GPR[rt]95..64) > 0x7FFFFFFF) then

GPR[rd]95..64 ← 0x7FFFFFFF

else if (0x100000000 <= (GPR[rs]95..64 + GPR[rt]95..64) < 0x180000000) then

GPR[rd]95..64 ← 0x80000000

else

GPR[rd]95..64 ← (GPR[rs]95..64 + GPR[rt]95..64)31..0

endif


if ((GPR[rs]127..96 + GPR[rt]127..96) > 0x7FFFFFFF) then

GPR[rd]127..96 ← 0x7FFFFFFF

else if (0x100000000 <= (GPR[rs]127..96 + GPR[rt]127..96) < 0x180000000) then

GPR[rd]127..96 ← 0x80000000

else

GPR[rd]127..96 ← (GPR[rs]127..96 + GPR[rt]127..96)31..0

endif

Bits   127..96  95..64  63..32  31..0
rs     A3       A2      A1      A0
rt     B3       B2      B1      B0
rd*    A3+B3    A2+B2   A1+B1   A0+B0

* Saturate to signed word

Exceptions:

None
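The saturation rule shared by PADDSB, PADDSH and PADDSW (overflow gives the largest positive value, underflow gives the most negative value) can be modeled in one routine parameterized by lane width. This is a sketch that assumes the lanes are two's-complement signed values, as the descriptions state; padd_signed_sat and its width parameter are illustrative names, not part of the instruction set.

```python
def padd_signed_sat(rs: int, rt: int, width: int) -> int:
    """Illustrative model: width = 8 (PADDSB), 16 (PADDSH) or 32 (PADDSW)."""
    lane_mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    hi, lo = sign_bit - 1, -sign_bit      # e.g. +127 and -128 for bytes
    rd = 0
    for shift in range(0, 128, width):
        a = (rs >> shift) & lane_mask
        b = (rt >> shift) & lane_mask
        # Convert the raw lane bits to signed integers.
        a -= (a & sign_bit) << 1
        b -= (b & sign_bit) << 1
        total = max(lo, min(hi, a + b))   # clamp instead of wrapping
        rd |= (total & lane_mask) << shift
    return rd


print(hex(padd_signed_sat(0x7F, 0x01, 8)))        # 127 + 1 in a byte lane
print(hex(padd_signed_sat(0x8000, 0xFFFF, 16)))   # -32768 + (-1) in a halfword lane
```

Working on signed Python integers and clamping afterwards avoids having to reason about the raw unsigned sum of the lane bits.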


PADDUB

Parallel Add with Unsigned saturation Byte

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       PADDUB   MMI1
Value   011100                              11000    101000
Width   6        5        5        5        5        6

C790

Format: PADDUB rd, rs, rt

Purpose: To add 16 pairs of 8-bit unsigned integers with saturation in parallel.

Description: rd ← rs + rt

The sixteen unsigned byte values in GPR rs are added to the corresponding sixteen unsigned byte values in GPR rt in parallel. The results are placed into the corresponding sixteen bytes in GPR rd.

No overflow exceptions are generated under any circumstances. Results beyond the range of an unsigned byte value are saturated according to the following:

Overflow: 0xFF

This instruction operates on 128-bit registers.

Operation:

if ((GPR[rs]7..0 + GPR[rt]7..0) > 0xFF) then

GPR[rd]7..0 ← 0xFF

else

GPR[rd]7..0 ← (GPR[rs]7..0 + GPR[rt]7..0)7..0

endif

if ((GPR[rs]15..8 + GPR[rt]15..8) > 0xFF) then

GPR[rd]15..8 ← 0xFF

else

GPR[rd]15..8 ← (GPR[rs]15..8 + GPR[rt]15..8)7..0

endif

if ((GPR[rs]23..16 + GPR[rt]23..16) > 0xFF) then

GPR[rd]23..16 ← 0xFF

else

GPR[rd]23..16 ← (GPR[rs]23..16 + GPR[rt]23..16)7..0

endif

if ((GPR[rs]31..24 + GPR[rt]31..24) > 0xFF) then

GPR[rd]31..24 ← 0xFF

else

GPR[rd]31..24 ← (GPR[rs]31..24 + GPR[rt]31..24)7..0

endif

if ((GPR[rs]39..32 + GPR[rt]39..32) > 0xFF) then

GPR[rd]39..32 ← 0xFF

else

GPR[rd]39..32 ← (GPR[rs]39..32 + GPR[rt]39..32)7..0

endif


if ((GPR[rs]47..40 + GPR[rt]47..40) > 0xFF) then

GPR[rd]47..40 ← 0xFF

else

GPR[rd]47..40 ← (GPR[rs]47..40 + GPR[rt]47..40)7..0

endif

if ((GPR[rs]55..48 + GPR[rt]55..48) > 0xFF) then

GPR[rd]55..48 ← 0xFF

else

GPR[rd]55..48 ← (GPR[rs]55..48 + GPR[rt]55..48)7..0

endif

if ((GPR[rs]63..56 + GPR[rt]63..56) > 0xFF) then

GPR[rd]63..56 ← 0xFF

else

GPR[rd]63..56 ← (GPR[rs]63..56 + GPR[rt]63..56)7..0

endif

if ((GPR[rs]71..64 + GPR[rt]71..64) > 0xFF) then

GPR[rd]71..64 ← 0xFF

else

GPR[rd]71..64 ← (GPR[rs]71..64 + GPR[rt]71..64)7..0

endif

if ((GPR[rs]79..72 + GPR[rt]79..72) > 0xFF) then

GPR[rd]79..72 ← 0xFF

else

GPR[rd]79..72 ← (GPR[rs]79..72 + GPR[rt]79..72)7..0

endif

if ((GPR[rs]87..80 + GPR[rt]87..80) > 0xFF) then

GPR[rd]87..80 ← 0xFF

else

GPR[rd]87..80 ← (GPR[rs]87..80 + GPR[rt]87..80)7..0

endif

if ((GPR[rs]95..88 + GPR[rt]95..88) > 0xFF) then

GPR[rd]95..88 ← 0xFF

else

GPR[rd]95..88 ← (GPR[rs]95..88 + GPR[rt]95..88)7..0

endif

if ((GPR[rs]103..96 + GPR[rt]103..96) > 0xFF) then

GPR[rd]103..96 ← 0xFF

else

GPR[rd]103..96 ← (GPR[rs]103..96 + GPR[rt]103..96)7..0

endif

if ((GPR[rs]111..104 + GPR[rt]111..104) > 0xFF) then

GPR[rd]111..104 ← 0xFF

else

GPR[rd]111..104 ← (GPR[rs]111..104 + GPR[rt]111..104)7..0

endif

if ((GPR[rs]119..112 + GPR[rt]119..112) > 0xFF) then


GPR[rd]119..112 ← 0xFF

else

GPR[rd]119..112 ← (GPR[rs]119..112 + GPR[rt]119..112)7..0

endif

if ((GPR[rs]127..120 + GPR[rt]127..120) > 0xFF) then

GPR[rd]127..120 ← 0xFF

else

GPR[rd]127..120 ← (GPR[rs]127..120 + GPR[rt]127..120)7..0

endif

Byte lane i (i = 0..15) occupies bits 8i+7..8i.

Bits   127..120  119..112  111..104  ...  15..8   7..0
rs     A15       A14       A13       ...  A1      A0
rt     B15       B14       B13       ...  B1      B0
rd*    A15+B15   A14+B14   A13+B13   ...  A1+B1   A0+B0

* Saturate to unsigned byte

Exceptions:

None


PADDUH

Parallel Add with Unsigned saturation Halfword

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       PADDUH   MMI1
Value   011100                              10100    101000
Width   6        5        5        5        5        6

C790

Format: PADDUH rd, rs, rt

Purpose: To add 8 pairs of 16-bit unsigned integers with saturation in parallel.

Description: rd ← rs + rt

The eight unsigned halfword values in GPR rs are added to the corresponding eight unsigned halfword values in GPR rt in parallel. The results are placed into the corresponding eight halfwords in GPR rd.

No overflow exceptions are generated under any circumstances. Results beyond the range of an unsigned halfword value are saturated according to the following:

Overflow: 0xFFFF

This instruction operates on 128-bit registers.

Operation:

if ((GPR[rs]15..0 + GPR[rt]15..0) > 0xFFFF) then

GPR[rd]15..0 ← 0xFFFF

else

GPR[rd]15..0 ← (GPR[rs]15..0 + GPR[rt]15..0)15..0

endif

if ((GPR[rs]31..16 + GPR[rt]31..16) > 0xFFFF) then

GPR[rd]31..16 ← 0xFFFF

else

GPR[rd]31..16 ← (GPR[rs]31..16 + GPR[rt]31..16)15..0

endif

if ((GPR[rs]47..32 + GPR[rt]47..32) > 0xFFFF) then

GPR[rd]47..32 ← 0xFFFF

else

GPR[rd]47..32 ← (GPR[rs]47..32 + GPR[rt]47..32)15..0

endif

if ((GPR[rs]63..48 + GPR[rt]63..48) > 0xFFFF) then

GPR[rd]63..48 ← 0xFFFF

else

GPR[rd]63..48 ← (GPR[rs]63..48 + GPR[rt]63..48)15..0

endif


if ((GPR[rs]79..64 + GPR[rt]79..64) > 0xFFFF) then

GPR[rd]79..64 ← 0xFFFF

else

GPR[rd]79..64 ← (GPR[rs]79..64 + GPR[rt]79..64)15..0

endif

if ((GPR[rs]95..80 + GPR[rt]95..80) > 0xFFFF) then

GPR[rd]95..80 ← 0xFFFF

else

GPR[rd]95..80 ← (GPR[rs]95..80 + GPR[rt]95..80)15..0

endif

if ((GPR[rs]111..96 + GPR[rt]111..96) > 0xFFFF) then

GPR[rd]111..96 ← 0xFFFF

else

GPR[rd]111..96 ← (GPR[rs]111..96 + GPR[rt]111..96)15..0

endif

if ((GPR[rs]127..112 + GPR[rt]127..112) > 0xFFFF) then

GPR[rd]127..112 ← 0xFFFF

else

GPR[rd]127..112 ← (GPR[rs]127..112 + GPR[rt]127..112)15..0

endif

Bits   127..112  111..96  95..80  79..64  63..48  47..32  31..16  15..0
rs     A7        A6       A5      A4      A3      A2      A1      A0
rt     B7        B6       B5      B4      B3      B2      B1      B0
rd*    A7+B7     A6+B6    A5+B5   A4+B4   A3+B3   A2+B2   A1+B1   A0+B0

* Saturate to unsigned halfword

Exceptions:

None


PADDUW

Parallel Add with Unsigned saturation Word

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       PADDUW   MMI1
Value   011100                              10000    101000
Width   6        5        5        5        5        6

C790

Format: PADDUW rd, rs, rt

Purpose: To add 4 pairs of 32-bit unsigned integers with saturation in parallel.

Description: rd ← rs + rt

The four unsigned word values in GPR rs are added to the corresponding four unsigned word values in GPR rt in parallel. The results are placed into the corresponding four words in GPR rd.

No overflow exceptions are generated under any circumstances. Results beyond the range of an unsigned word value are saturated according to the following:

Overflow: 0xFFFFFFFF

This instruction operates on 128-bit registers.

Operation:

if ((GPR[rs]31..0 + GPR[rt]31..0) > 0xFFFFFFFF) then

GPR[rd]31..0 ← 0xFFFFFFFF

else

GPR[rd]31..0 ← (GPR[rs]31..0 + GPR[rt]31..0)31..0

endif

if ((GPR[rs]63..32 + GPR[rt]63..32) > 0xFFFFFFFF) then

GPR[rd]63..32 ← 0xFFFFFFFF

else

GPR[rd]63..32 ← (GPR[rs]63..32 + GPR[rt]63..32)31..0

endif

if ((GPR[rs]95..64 + GPR[rt]95..64) > 0xFFFFFFFF) then

GPR[rd]95..64 ← 0xFFFFFFFF

else

GPR[rd]95..64 ← (GPR[rs]95..64 + GPR[rt]95..64)31..0

endif

if ((GPR[rs]127..96 + GPR[rt]127..96) > 0xFFFFFFFF) then

GPR[rd]127..96 ← 0xFFFFFFFF

else

GPR[rd]127..96 ← (GPR[rs]127..96 + GPR[rt]127..96)31..0

endif


Bits   127..96  95..64  63..32  31..0
rs     A3       A2      A1      A0
rt     B3       B2      B1      B0
rd*    A3+B3    A2+B2   A1+B1   A0+B0

* Saturate to unsigned word

Exceptions:

None
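The unsigned saturating adds (PADDUB, PADDUH, PADDUW) differ from the signed forms in that only an upper limit exists: a lane sum that exceeds the lane's maximum is replaced by all ones. A minimal sketch follows, again assuming registers are Python integers; padd_unsigned_sat is an illustrative name.

```python
def padd_unsigned_sat(rs: int, rt: int, width: int) -> int:
    """Illustrative model: width = 8 (PADDUB), 16 (PADDUH) or 32 (PADDUW)."""
    lane_mask = (1 << width) - 1          # also the largest unsigned lane value
    rd = 0
    for shift in range(0, 128, width):
        total = ((rs >> shift) & lane_mask) + ((rt >> shift) & lane_mask)
        # Clamp to 0xFF / 0xFFFF / 0xFFFFFFFF instead of wrapping.
        rd |= min(total, lane_mask) << shift
    return rd


print(hex(padd_unsigned_sat(0xFF, 0x01, 8)))
print(hex(padd_unsigned_sat(0x10, 0x20, 8)))
```

Saturation of this kind is commonly used for pixel or sample data, where wrapping from the maximum value back to zero would produce a visible or audible artifact.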


PADDW

Parallel Add Word

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       PADDW    MMI0
Value   011100                              00000    001000
Width   6        5        5        5        5        6

C790

Format: PADDW rd, rs, rt

Purpose: To add 4 pairs of 32-bit integers in parallel.

Description: rd ← rs + rt

The four word values in GPR rs are added to the corresponding four word values in GPR rt in parallel. The results are placed into the corresponding four words in GPR rd.

No overflow or underflow exceptions are generated under any circumstances.

This instruction operates on 128-bit registers.

Operation:

GPR[rd]31..0 ← (GPR[rs]31..0 + GPR[rt]31..0)31..0

GPR[rd]63..32 ← (GPR[rs]63..32 + GPR[rt]63..32)31..0

GPR[rd]95..64 ← (GPR[rs]95..64 + GPR[rt]95..64)31..0

GPR[rd]127..96 ← (GPR[rs]127..96 + GPR[rt]127..96)31..0

Bits   127..96  95..64  63..32  31..0
rs     A3       A2      A1      A0
rt     B3       B2      B1      B0
rd     A3+B3    A2+B2   A1+B1   A0+B0

Exceptions:

None


PADSBH

Parallel Add/Subtract Halfword

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       PADSBH   MMI1
Value   011100                              00100    101000
Width   6        5        5        5        5        6

C790

Format: PADSBH rd, rs, rt

Purpose: To add/subtract 8 pairs of 16-bit integers in parallel.

Description: rd ← rs +/− rt

The high-order four halfword values in GPR rs are added to the corresponding four halfword values in GPR rt, and the low-order four halfword values in GPR rt are subtracted from the corresponding four halfword values in GPR rs, in parallel. The results are placed into the corresponding eight halfwords in GPR rd.

No overflow or underflow exceptions are generated under any circumstances.

This instruction operates on 128-bit registers.

Operation:

GPR[rd]15..0 ← (GPR[rs]15..0 − GPR[rt]15..0)15..0

GPR[rd]31..16 ← (GPR[rs]31..16 − GPR[rt]31..16)15..0

GPR[rd]47..32 ← (GPR[rs]47..32 − GPR[rt]47..32)15..0

GPR[rd]63..48 ← (GPR[rs]63..48 − GPR[rt]63..48)15..0

GPR[rd]79..64 ← (GPR[rs]79..64 + GPR[rt]79..64)15..0

GPR[rd]95..80 ← (GPR[rs]95..80 + GPR[rt]95..80)15..0

GPR[rd]111..96 ← (GPR[rs]111..96 + GPR[rt]111..96)15..0

GPR[rd]127..112 ← (GPR[rs]127..112 + GPR[rt]127..112)15..0

Bits   127..112  111..96  95..80  79..64  63..48  47..32  31..16  15..0
rs     A7        A6       A5      A4      A3      A2      A1      A0
rt     B7        B6       B5      B4      B3      B2      B1      B0
rd     A7+B7     A6+B6    A5+B5   A4+B4   A3−B3   A2−B2   A1−B1   A0−B0

Exceptions:

None
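The split between the subtracting low half and the adding high half of PADSBH can be expressed compactly in a software model. The sketch below assumes a register is a Python integer and uses the illustrative name padsbh.

```python
def padsbh(rs: int, rt: int) -> int:
    """Illustrative model of PADSBH on eight 16-bit lanes."""
    rd = 0
    for lane in range(8):
        shift = 16 * lane
        a = (rs >> shift) & 0xFFFF
        b = (rt >> shift) & 0xFFFF
        # Lanes 0..3 (bits 63..0) subtract; lanes 4..7 (bits 127..64) add.
        result = (a - b) if lane < 4 else (a + b)
        # Keep bits 15..0 only, so both directions wrap without an exception.
        rd |= (result & 0xFFFF) << shift
    return rd


rs = (3 << 64) | 3   # lane 4 = 3, lane 0 = 3
rt = (1 << 64) | 1   # lane 4 = 1, lane 0 = 1
print(hex(padsbh(rs, rt)))
```

With these inputs lane 4 is computed as 3 + 1 and lane 0 as 3 − 1, which makes the two halves easy to tell apart in the result.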


PAND

Parallel And

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       PAND     MMI2
Value   011100                              10010    001001
Width   6        5        5        5        5        6

C790

Format: PAND rd, rs, rt

Purpose: To perform a bitwise logical AND.

Description: rd ← rs AND rt

The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical AND operation. The result is placed into GPR rd.

This instruction operates on 128-bit registers.

Operation:

GPR[rd]127..0 ← GPR[rs]127..0 and GPR[rt]127..0

Bits   127..64      63..0
rs     A1           A0
rt     B1           B0
rd     A1 AND B1    A0 AND B0

Exceptions:

None
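Because PAND is purely bitwise, its result does not depend on any lane width. A short sketch, assuming a register is modeled as a Python integer masked to 128 bits; pand and MASK128 are illustrative names.

```python
MASK128 = (1 << 128) - 1


def pand(rs: int, rt: int) -> int:
    """Illustrative model of PAND: bitwise AND across all 128 bits."""
    return (rs & rt) & MASK128


# Keep only the low byte of each 16-bit field in a 48-bit sample value.
print(hex(pand(0xFFFF_0000_FFFF, 0x00FF_00FF_00FF)))
```

A typical use is masking off unwanted bits in every lane at once by placing the same mask pattern in each lane of rt.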


PCEQB

Parallel Compare for Equal Byte

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       PCEQB    MMI1
Value   011100                              01010    101000
Width   6        5        5        5        5        6

C790

Format: PCEQB rd, rs, rt

Purpose: To record the result of 16 equality comparisons in parallel.

Description: rd ← (rs = rt)

The sixteen signed byte values in GPR rs are compared to the corresponding sixteen signed byte values in GPR rt in parallel. The results of the comparison are placed into GPR rd as follows:

If the signed byte value in GPR rs is equal to the corresponding signed byte value in GPR rt, then the corresponding byte in GPR rd is set to 0xFF; otherwise it is set to 0x00.

This instruction operates on 128-bit registers.

Operation:

In the following, 1^8 denotes eight 1 bits (0xFF) and 0^8 denotes eight 0 bits (0x00).

if (GPR[rs]7..0 = GPR[rt]7..0) then
    GPR[rd]7..0 ← 1^8
else
    GPR[rd]7..0 ← 0^8
endif
if (GPR[rs]15..8 = GPR[rt]15..8) then
    GPR[rd]15..8 ← 1^8
else
    GPR[rd]15..8 ← 0^8
endif
if (GPR[rs]23..16 = GPR[rt]23..16) then
    GPR[rd]23..16 ← 1^8
else
    GPR[rd]23..16 ← 0^8
endif
if (GPR[rs]31..24 = GPR[rt]31..24) then
    GPR[rd]31..24 ← 1^8
else
    GPR[rd]31..24 ← 0^8
endif
if (GPR[rs]39..32 = GPR[rt]39..32) then
    GPR[rd]39..32 ← 1^8
else
    GPR[rd]39..32 ← 0^8
endif
if (GPR[rs]47..40 = GPR[rt]47..40) then
    GPR[rd]47..40 ← 1^8
else
    GPR[rd]47..40 ← 0^8
endif
if (GPR[rs]55..48 = GPR[rt]55..48) then
    GPR[rd]55..48 ← 1^8
else
    GPR[rd]55..48 ← 0^8
endif
if (GPR[rs]63..56 = GPR[rt]63..56) then
    GPR[rd]63..56 ← 1^8
else
    GPR[rd]63..56 ← 0^8
endif
if (GPR[rs]71..64 = GPR[rt]71..64) then
    GPR[rd]71..64 ← 1^8
else
    GPR[rd]71..64 ← 0^8
endif
if (GPR[rs]79..72 = GPR[rt]79..72) then
    GPR[rd]79..72 ← 1^8
else
    GPR[rd]79..72 ← 0^8
endif
if (GPR[rs]87..80 = GPR[rt]87..80) then
    GPR[rd]87..80 ← 1^8
else
    GPR[rd]87..80 ← 0^8
endif
if (GPR[rs]95..88 = GPR[rt]95..88) then
    GPR[rd]95..88 ← 1^8
else
    GPR[rd]95..88 ← 0^8
endif
if (GPR[rs]103..96 = GPR[rt]103..96) then
    GPR[rd]103..96 ← 1^8
else
    GPR[rd]103..96 ← 0^8
endif
if (GPR[rs]111..104 = GPR[rt]111..104) then
    GPR[rd]111..104 ← 1^8
else
    GPR[rd]111..104 ← 0^8
endif
if (GPR[rs]119..112 = GPR[rt]119..112) then
    GPR[rd]119..112 ← 1^8
else
    GPR[rd]119..112 ← 0^8
endif
if (GPR[rs]127..120 = GPR[rt]127..120) then
    GPR[rd]127..120 ← 1^8
else
    GPR[rd]127..120 ← 0^8
endif

Example (byte lane i compares Ai = Bi; lane i occupies bits 8i+7..8i):

Lane     15     14     13     12     11     10     9      8      7      6      5      4      3      2      1      0
Result   False  True   True   True   True   False  False  True   False  True   True   True   True   False  False  True
rd       0^8    1^8    1^8    1^8    1^8    0^8    0^8    1^8    0^8    1^8    1^8    1^8    1^8    0^8    0^8    1^8

Exceptions:

None


PCEQH

Parallel Compare for Equal Halfword

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       PCEQH    MMI1
Value   011100                              00110    101000
Width   6        5        5        5        5        6

C790

Format: PCEQH rd, rs, rt

Purpose: To record the results of 8 equality comparisons in parallel.

Description: rd ← (rs = rt)

The eight signed halfword values in GPR rs are compared to the corresponding eight signed halfword values in GPR rt in parallel. The results of the comparison are placed into GPR rd as follows:

If the signed halfword value in GPR rs is equal to the corresponding signed halfword value in GPR rt, then the corresponding halfword in GPR rd is set to 0xFFFF; otherwise it is set to 0x0000.

This instruction operates on 128-bit registers.

Operation:

In the following, 1^16 denotes sixteen 1 bits (0xFFFF) and 0^16 denotes sixteen 0 bits (0x0000).

if (GPR[rs]15..0 = GPR[rt]15..0) then
    GPR[rd]15..0 ← 1^16
else
    GPR[rd]15..0 ← 0^16
endif
if (GPR[rs]31..16 = GPR[rt]31..16) then
    GPR[rd]31..16 ← 1^16
else
    GPR[rd]31..16 ← 0^16
endif
if (GPR[rs]47..32 = GPR[rt]47..32) then
    GPR[rd]47..32 ← 1^16
else
    GPR[rd]47..32 ← 0^16
endif
if (GPR[rs]63..48 = GPR[rt]63..48) then
    GPR[rd]63..48 ← 1^16
else
    GPR[rd]63..48 ← 0^16
endif
if (GPR[rs]79..64 = GPR[rt]79..64) then
    GPR[rd]79..64 ← 1^16
else
    GPR[rd]79..64 ← 0^16
endif
if (GPR[rs]95..80 = GPR[rt]95..80) then
    GPR[rd]95..80 ← 1^16
else
    GPR[rd]95..80 ← 0^16
endif
if (GPR[rs]111..96 = GPR[rt]111..96) then
    GPR[rd]111..96 ← 1^16
else
    GPR[rd]111..96 ← 0^16
endif
if (GPR[rs]127..112 = GPR[rt]127..112) then
    GPR[rd]127..112 ← 1^16
else
    GPR[rd]127..112 ← 0^16
endif

Example (each halfword lane compares Ai = Bi):

Bits     127..112  111..96  95..80  79..64  63..48  47..32  31..16  15..0
rs       A7        A6       A5      A4      A3      A2      A1      A0
rt       B7        B6       B5      B4      B3      B2      B1      B0
Result   False     True     False   True    False   True    False   True
rd       0^16      1^16     0^16    1^16    0^16    1^16    0^16    1^16

Exceptions:

None


PCEQW

Parallel Compare for Equal Word

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       PCEQW    MMI1
Value   011100                              00010    101000
Width   6        5        5        5        5        6

C790

Format: PCEQW rd, rs, rt

Purpose: To record the result of 4 equality comparisons in parallel.

Description: rd ← (rs = rt)

The four signed word values in GPR rs are compared to the corresponding four signed word values in GPR rt in parallel. The results of the comparison are placed into GPR rd as follows:

If the signed word value in GPR rs is equal to the corresponding signed word value in GPR rt, then the corresponding word in GPR rd is set to 0xFFFFFFFF; otherwise it is set to 0x00000000.

This instruction operates on 128-bit registers.

Operation:

In the following, 1^32 denotes thirty-two 1 bits (0xFFFFFFFF) and 0^32 denotes thirty-two 0 bits (0x00000000).

if (GPR[rs]31..0 = GPR[rt]31..0) then
    GPR[rd]31..0 ← 1^32
else
    GPR[rd]31..0 ← 0^32
endif
if (GPR[rs]63..32 = GPR[rt]63..32) then
    GPR[rd]63..32 ← 1^32
else
    GPR[rd]63..32 ← 0^32
endif
if (GPR[rs]95..64 = GPR[rt]95..64) then
    GPR[rd]95..64 ← 1^32
else
    GPR[rd]95..64 ← 0^32
endif
if (GPR[rs]127..96 = GPR[rt]127..96) then
    GPR[rd]127..96 ← 1^32
else
    GPR[rd]127..96 ← 0^32
endif


Example (each word lane compares Ai = Bi):

Bits     127..96  95..64  63..32  31..0
rs       A3       A2      A1      A0
rt       B3       B2      B1      B0
Result   False    True    False   True
rd       0^32     1^32    0^32    1^32

Exceptions:

None
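PCEQB, PCEQH and PCEQW all produce a per-lane mask of all ones or all zeros, differing only in lane width. The sketch below models the three with one routine; pceq and its width parameter are illustrative names, and registers are assumed to be Python integers.

```python
def pceq(rs: int, rt: int, width: int) -> int:
    """Illustrative model: width = 8 (PCEQB), 16 (PCEQH) or 32 (PCEQW)."""
    lane_mask = (1 << width) - 1
    rd = 0
    for shift in range(0, 128, width):
        a = (rs >> shift) & lane_mask
        b = (rt >> shift) & lane_mask
        if a == b:
            rd |= lane_mask << shift   # equal lanes become all ones
    return rd


ALL_ONES = (1 << 128) - 1
# Only the lowest byte differs, so only that lane is cleared in the mask.
print(hex(pceq(0x1234, 0x1235, 8)))
print(pceq(0x1234, 0x1234, 8) == ALL_ONES)
```

For an equality test the signed or unsigned interpretation of a lane makes no difference, which is why the model compares the raw lane bits.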


PCGTB

Parallel Compare for Greater Than Byte

Bits    31..26   25..21   20..16   15..11   10..6    5..0
Field   MMI      rs       rt       rd       PCGTB    MMI0
Value   011100                              01010    001000
Width   6        5        5        5        5        6

C790

Format: PCGTB rd, rs, rt

Purpose: To record the result of 16 greater-than comparisons in parallel.

Description: rd ← (rs > rt)

The sixteen signed byte values in GPR rs are compared to the corresponding sixteen signed byte values in GPR rt in parallel. The results of the comparison are placed into GPR rd as follows:

If the signed byte value in GPR rs is greater than the corresponding signed byte value in GPR rt, then the corresponding byte in GPR rd is set to 0xFF; otherwise it is set to 0x00.

This instruction operates on 128-bit registers.

Operation:

if (GPR[rs]7..0 > GPR[rt]7..0) then

GPR[rd]7..0 ← 18

else

GPR[rd]7..0 ← 08

endif

if (GPR[rs]15..8 > GPR[rt]15..8) then

GPR[rd]15..8 ← 18

else

GPR[rd]15..8 ← 08

endif

if (GPR[rs]23..16 > GPR[rt]23..16) then

GPR[rd]23..16 ← 18

else

GPR[rd]23..16 ← 08

endif

if (GPR[rs]31..24 > GPR[rt]31..24) then

GPR[rd]31..24 ← 18

else

GPR[rd]31..24 ← 08

endif

Appendix B C790-Specific I nst ruction Set Details

B-57

if (GPR[rs]39..32 > GPR[rt]39..32) then
    GPR[rd]39..32 ← 1^8
else
    GPR[rd]39..32 ← 0^8
endif
if (GPR[rs]47..40 > GPR[rt]47..40) then
    GPR[rd]47..40 ← 1^8
else
    GPR[rd]47..40 ← 0^8
endif
if (GPR[rs]55..48 > GPR[rt]55..48) then
    GPR[rd]55..48 ← 1^8
else
    GPR[rd]55..48 ← 0^8
endif
if (GPR[rs]63..56 > GPR[rt]63..56) then
    GPR[rd]63..56 ← 1^8
else
    GPR[rd]63..56 ← 0^8
endif
if (GPR[rs]71..64 > GPR[rt]71..64) then
    GPR[rd]71..64 ← 1^8
else
    GPR[rd]71..64 ← 0^8
endif
if (GPR[rs]79..72 > GPR[rt]79..72) then
    GPR[rd]79..72 ← 1^8
else
    GPR[rd]79..72 ← 0^8
endif
if (GPR[rs]87..80 > GPR[rt]87..80) then
    GPR[rd]87..80 ← 1^8
else
    GPR[rd]87..80 ← 0^8
endif
if (GPR[rs]95..88 > GPR[rt]95..88) then
    GPR[rd]95..88 ← 1^8
else
    GPR[rd]95..88 ← 0^8
endif

if (GPR[rs]103..96 > GPR[rt]103..96) then
    GPR[rd]103..96 ← 1^8
else
    GPR[rd]103..96 ← 0^8
endif
if (GPR[rs]111..104 > GPR[rt]111..104) then
    GPR[rd]111..104 ← 1^8
else
    GPR[rd]111..104 ← 0^8
endif
if (GPR[rs]119..112 > GPR[rt]119..112) then
    GPR[rd]119..112 ← 1^8
else
    GPR[rd]119..112 ← 0^8
endif
if (GPR[rs]127..120 > GPR[rt]127..120) then
    GPR[rd]127..120 ← 1^8
else
    GPR[rd]127..120 ← 0^8
endif

Bits       127..120  119..112  111..104  103..96   95..88    87..80    79..72    71..64
rs         A15       A14       A13       A12       A11       A10       A9        A8
rt         B15       B14       B13       B12       B11       B10       B9        B8
rs > rt    True      False     False     False     False     True      False     False
rd         1^8       0^8       0^8       0^8       0^8       1^8       0^8       0^8

Bits       63..56    55..48    47..40    39..32    31..24    23..16    15..8     7..0
rs         A7        A6        A5        A4        A3        A2        A1        A0
rt         B7        B6        B5        B4        B3        B2        B1        B0
rs > rt    True      False     False     False     False     True      False     False
rd         1^8       0^8       0^8       0^8       0^8       1^8       0^8       0^8
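Because the comparison is signed, a byte such as 0x80 is treated as -128 rather than 128. The following Python sketch models the sixteen byte lanes under that interpretation; the names `to_signed` and `pcgtb` and the sample operands are illustrative assumptions, not part of the architecture definition.

```python
def to_signed(value: int, width: int) -> int:
    """Reinterpret an unsigned lane value as a two's-complement signed integer."""
    return value - (1 << width) if value & (1 << (width - 1)) else value


def pcgtb(rs: int, rt: int) -> int:
    """Illustrative model of PCGTB: signed greater-than on sixteen byte lanes."""
    rd = 0
    for shift in range(0, 128, 8):
        a = to_signed((rs >> shift) & 0xFF, 8)
        b = to_signed((rt >> shift) & 0xFF, 8)
        if a > b:
            rd |= 0xFF << shift            # 1^8 for this lane
    return rd


# Hypothetical operands: byte 0 holds 0x7F (127), byte 1 holds 0x80 (-128).
mask = pcgtb(0x807F, 0x0101)
print(f"{mask:#x}")
```

With these operands only byte 0 compares greater (127 > 1), while byte 1 does not (-128 > 1 is false), so an unsigned comparison would have given a different mask.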

Exceptions:

None

PCGTH

Parallel Compare for Greater Than Halfword

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    MMI       rs        rt        rd        PCGTH     MMI0
Value    011100    -         -         -         00110     001000
Width    6         5         5         5         5         6

C790

Format: PCGTH rd, rs, rt

Purpose: To record the results of 8 greater-than comparisons in parallel.

Description: rd ← (rs > rt)

The eight signed halfword values in GPR rs are compared to the corresponding eight signed halfword values in GPR rt in parallel. The results of the comparison are placed into GPR rd as follows:

If the signed halfword value in GPR rs is greater than the corresponding signed halfword value in GPR rt, then the corresponding halfword in GPR rd is set to 0xFFFF; otherwise it is set to 0x0000.

This instruction operates on 128-bit registers.

Operation:

if (GPR[rs]15..0 > GPR[rt]15..0) then
    GPR[rd]15..0 ← 1^16
else
    GPR[rd]15..0 ← 0^16
endif
if (GPR[rs]31..16 > GPR[rt]31..16) then
    GPR[rd]31..16 ← 1^16
else
    GPR[rd]31..16 ← 0^16
endif
if (GPR[rs]47..32 > GPR[rt]47..32) then
    GPR[rd]47..32 ← 1^16
else
    GPR[rd]47..32 ← 0^16
endif
if (GPR[rs]63..48 > GPR[rt]63..48) then
    GPR[rd]63..48 ← 1^16
else
    GPR[rd]63..48 ← 0^16
endif

if (GPR[rs]79..64 > GPR[rt]79..64) then
    GPR[rd]79..64 ← 1^16
else
    GPR[rd]79..64 ← 0^16
endif
if (GPR[rs]95..80 > GPR[rt]95..80) then
    GPR[rd]95..80 ← 1^16
else
    GPR[rd]95..80 ← 0^16
endif
if (GPR[rs]111..96 > GPR[rt]111..96) then
    GPR[rd]111..96 ← 1^16
else
    GPR[rd]111..96 ← 0^16
endif
if (GPR[rs]127..112 > GPR[rt]127..112) then
    GPR[rd]127..112 ← 1^16
else
    GPR[rd]127..112 ← 0^16
endif

Bits       127..112  111..96   95..80    79..64    63..48    47..32    31..16    15..0
rs         A7        A6        A5        A4        A3        A2        A1        A0
rt         B7        B6        B5        B4        B3        B2        B1        B0
rs > rt    True      False     False     False     True      False     False     False
rd         1^16      0^16      0^16      0^16      1^16      0^16      0^16      0^16
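Since each result lane is either all ones or all zeros, the value in rd can serve as a bit mask. The sketch below models PCGTH in Python and then, purely as an illustration that this section does not itself describe, uses the mask with bitwise AND/OR to pick the larger halfword in each lane. The names `pcgth` and `select_greater` and the operand values are assumptions.

```python
MASK128 = (1 << 128) - 1


def to_signed16(value: int) -> int:
    """Reinterpret a 16-bit lane as a two's-complement signed integer."""
    return value - 0x10000 if value & 0x8000 else value


def pcgth(rs: int, rt: int) -> int:
    """Illustrative model of PCGTH: signed greater-than on eight halfword lanes."""
    rd = 0
    for shift in range(0, 128, 16):
        a = to_signed16((rs >> shift) & 0xFFFF)
        b = to_signed16((rt >> shift) & 0xFFFF)
        if a > b:
            rd |= 0xFFFF << shift          # 1^16 for this lane
    return rd


def select_greater(rs: int, rt: int) -> int:
    """Hypothetical use of the mask: keep rs where rs > rt, otherwise keep rt."""
    mask = pcgth(rs, rt)
    return (rs & mask) | (rt & ~mask & MASK128)


# Hypothetical operands: halfword 1 is 5 vs 3, halfword 0 is -1 (0xFFFF) vs 1.
rs = 0x0005_FFFF
rt = 0x0003_0001
print(f"{pcgth(rs, rt):#x}", f"{select_greater(rs, rt):#x}")
```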

Exceptions:

None

PCGTW

Parallel Compare for Greater Than Word

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    MMI       rs        rt        rd        PCGTW     MMI0
Value    011100    -         -         -         00010     001000
Width    6         5         5         5         5         6

C790

Format: PCGTW rd, rs, rt

Purpose: To record the results of 4 greater-than comparisons in parallel.

Description: rd ← (rs > rt)

The four signed word values in GPR rs are compared to the corresponding four signed word values in GPR rt in parallel. The results of the comparison are placed into GPR rd as follows:

If the signed word value in GPR rs is greater than the corresponding signed word value in GPR rt, then the corresponding word in GPR rd is set to 0xFFFFFFFF; otherwise it is set to 0x00000000.

This instruction operates on 128-bit registers.

Operation:

if (GPR[rs]31..0 > GPR[rt]31..0) then
    GPR[rd]31..0 ← 1^32
else
    GPR[rd]31..0 ← 0^32
endif
if (GPR[rs]63..32 > GPR[rt]63..32) then
    GPR[rd]63..32 ← 1^32
else
    GPR[rd]63..32 ← 0^32
endif
if (GPR[rs]95..64 > GPR[rt]95..64) then
    GPR[rd]95..64 ← 1^32
else
    GPR[rd]95..64 ← 0^32
endif
if (GPR[rs]127..96 > GPR[rt]127..96) then
    GPR[rd]127..96 ← 1^32
else
    GPR[rd]127..96 ← 0^32
endif

Bits       127..96   95..64    63..32    31..0
rs         A3        A2        A1        A0
rt         B3        B2        B1        B0
rs > rt    False     True      False     True
rd         0^32      1^32      0^32      1^32
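To make the figure concrete, the Python sketch below packs four signed words into a 128-bit value and applies the word-wise signed comparison. The helper names (`pack_words`, `signed_word`, `pcgtw`) and the operand values are hypothetical, picked so that the lanes evaluate to False, True, False, True as in the figure.

```python
def pack_words(words):
    """Pack four signed 32-bit integers, most significant lane first, into 128 bits."""
    value = 0
    for w in words:
        value = (value << 32) | (w & 0xFFFFFFFF)
    return value


def signed_word(value: int, shift: int) -> int:
    """Extract the 32-bit lane starting at bit `shift` as a signed integer."""
    w = (value >> shift) & 0xFFFFFFFF
    return w - (1 << 32) if w & 0x80000000 else w


def pcgtw(rs: int, rt: int) -> int:
    """Illustrative model of PCGTW: signed greater-than on four word lanes."""
    rd = 0
    for shift in range(0, 128, 32):
        if signed_word(rs, shift) > signed_word(rt, shift):
            rd |= 0xFFFFFFFF << shift      # 1^32 for this lane
    return rd


rs = pack_words([-5, 7, 0, 1])    # A3, A2, A1, A0
rt = pack_words([2, -7, 0, -1])   # B3, B2, B1, B0
result = pcgtw(rs, rt)
print(f"{result:032x}")
```

Lane A1/B1 holds equal values, so it yields 0^32: the comparison is strictly greater-than.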

Exceptions:

None

PCPYH

Parallel Copy Halfword

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    MMI       0         rt        rd        PCPYH     MMI3
Value    011100    00000     -         -         11011     101001
Width    6         5         5         5         5         6

C790

Format: PCPYH rd, rt

Purpose: To copy halfword.

Description: rd ← copy (rt)

The low-order halfword of each of the two doublewords in GPR rt is copied to every halfword of the corresponding doubleword in GPR rd.

This instruction operates on 128-bit registers.

Operation:

GPR[rd]15..0 ← GPR[rt]15..0

GPR[rd]31..16 ← GPR[rt]15..0

GPR[rd]47..32 ← GPR[rt]15..0

GPR[rd]63..48 ← GPR[rt]15..0

GPR[rd]79..64 ← GPR[rt]79..64

GPR[rd]95..80 ← GPR[rt]79..64

GPR[rd]111..96 ← GPR[rt]79..64

GPR[rd]127..112 ← GPR[rt]79..64

Bits   127..80   79..64    63..16    15..0
rt     -         A1        -         A0

Bits   127..112  111..96   95..80    79..64    63..48    47..32    31..16    15..0
rd     A1        A1        A1        A1        A0        A0        A0        A0
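The broadcast can be sketched in Python as follows. Multiplying a 16-bit value by 0x0001000100010001 replicates it into the four halfwords of a doubleword; the function name `pcpyh` and the operand value are illustrative assumptions.

```python
REPLICATE4 = 0x0001_0001_0001_0001   # copies a 16-bit value into four halfwords


def pcpyh(rt: int) -> int:
    """Illustrative model of PCPYH on a 128-bit value held as a Python int."""
    a0 = rt & 0xFFFF                  # low halfword of the lower doubleword
    a1 = (rt >> 64) & 0xFFFF          # low halfword of the upper doubleword
    return ((a1 * REPLICATE4) << 64) | (a0 * REPLICATE4)


# Hypothetical operand: A1 = 0x1234, A0 = 0x5678; the other halfwords are ignored.
rt = (0xAAAA_BBBB_CCCC_1234 << 64) | 0x1111_2222_3333_5678
rd = pcpyh(rt)
print(f"{rd:032x}")
```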

Exceptions:

None

PCPYLD

Parallel Copy Lower Doubleword

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    MMI       rs        rt        rd        PCPYLD    MMI2
Value    011100    -         -         -         01110     001001
Width    6         5         5         5         5         6

C790

Format: PCPYLD rd, rs, rt

Purpose: To copy doubleword.

Description: rd ← copy (rs, rt)

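The contents of the low-order doubleword in GPR rs are combined with the contents of the low-order doubleword in GPR rt: the doubleword from rs becomes the high-order doubleword of the result and the doubleword from rt becomes the low-order doubleword. The quadword result is placed into GPR rd.

This instruction operates on 128-bit registers.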

Operation:

GPR[rd]63..0 ← GPR[rt]63..0

GPR[rd]127..64 ← GPR[rs]63..0

Bits   127..64   63..0
rs     -         A0
rt     -         B0
rd     A0        B0
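A minimal Python sketch of this merge, with 128-bit registers modeled as integers, is shown below. The name `pcpyld` and the operand values are illustrative assumptions.

```python
MASK64 = (1 << 64) - 1


def pcpyld(rs: int, rt: int) -> int:
    """Illustrative model of PCPYLD: rd = (low doubleword of rs) || (low doubleword of rt)."""
    return ((rs & MASK64) << 64) | (rt & MASK64)


# Hypothetical operands; the upper doublewords (0xDEAD, 0xBEEF) do not reach rd.
rs = (0xDEAD << 64) | 0xA0
rt = (0xBEEF << 64) | 0xB0
rd = pcpyld(rs, rt)
print(f"{rd:032x}")
```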

Exceptions:

None

PCPYUD

Parallel Copy Upper Doubleword

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    MMI       rs        rt        rd        PCPYUD    MMI3
Value    011100    -         -         -         01110     101001
Width    6         5         5         5         5         6

C790

Format: PCPYUD rd, rs, rt

Purpose: To copy doubleword.

Description: rd ← copy (rs, rt)

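The contents of the high-order doubleword in GPR rs are combined with the contents of the high-order doubleword in GPR rt: the doubleword from rs becomes the low-order doubleword of the result and the doubleword from rt becomes the high-order doubleword. The quadword result is placed into GPR rd.

This instruction operates on 128-bit registers.

Operation: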

GPR[rd]63..0 ← GPR[rs]127..64

GPR[rd]127..64 ← GPR[rt]127..64

Bits   127..64   63..0
rs     A0        -
rt     B0        -
rd     B0        A0
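The upper-doubleword variant can be sketched the same way in Python. The name `pcpyud` and the operand values are illustrative assumptions; note that the doubleword taken from rs lands in the low half of rd.

```python
MASK64 = (1 << 64) - 1


def pcpyud(rs: int, rt: int) -> int:
    """Illustrative model of PCPYUD: rd = (high doubleword of rt) || (high doubleword of rs)."""
    high_rs = (rs >> 64) & MASK64
    high_rt = (rt >> 64) & MASK64
    return (high_rt << 64) | high_rs


# Hypothetical operands; the low doublewords (1 and 2) do not reach rd.
rs = (0xA0 << 64) | 0x1
rt = (0xB0 << 64) | 0x2
rd = pcpyud(rs, rt)
print(f"{rd:032x}")
```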

Exceptions:

None

PDIVBW

Parallel Divide Broadcast Word

Bits     31..26    25..21    20..16    15..11    10..6     5..0
Field    MMI       rs        rt        0         PDIVBW    MMI2
Value    011100    -         -         00000     11101     001001
Width    6         5         5         5         5         6

C790

Format: PDIVBW rs, rt

Purpose: To divide four 32-bit signed integers by one 16-bit signed integer in parallel.

Description: (LO, HI) ← rs / rt

The four signed words in GPR rs are divided by the low-order signed halfword in GPR rt, in parallel. The four 32-bit quotients are placed into special register LO. The four 16-bit remainders, each sign-extended to 32 bits, are placed into special register HI.

No arithmetic exception occurs under any circumstances.

This instruction operates on 128-bit registers.

Restrictions:

If the divisor in GPR rt is zero, the arithmetic result value is undefined.

Operation:

q0 ← GPR[rs]31..0 div GPR[rt]15..0
r0 ← GPR[rs]31..0 mod GPR[rt]15..0
q1 ← GPR[rs]63..32 div GPR[rt]15..0
r1 ← GPR[rs]63..32 mod GPR[rt]15..0
q2 ← GPR[rs]95..64 div GPR[rt]15..0
r2 ← GPR[rs]95..64 mod GPR[rt]15..0
q3 ← GPR[rs]127..96 div GPR[rt]15..0
r3 ← GPR[rs]127..96 mod GPR[rt]15..0
LO31..0 ← q0_31..0
HI31..0 ← (r0_15)^16 || r0_15..0
LO63..32 ← q1_31..0
HI63..32 ← (r1_15)^16 || r1_15..0
LO95..64 ← q2_31..0
HI95..64 ← (r2_15)^16 || r2_15..0
LO127..96 ← q3_31..0
HI127..96 ← (r3_15)^16 || r3_15..0

Bits   127..96                95..64                 63..32                 31..0
rs     A3                     A2                     A1                     A0
LO     A3 div B0              A2 div B0              A1 div B0              A0 div B0
HI     sign ext (A3 mod B0)   sign ext (A2 mod B0)   sign ext (A1 mod B0)   sign ext (A0 mod B0)

B0 is the signed halfword in bits 15..0 of rt; bits 127..16 of rt are not used.

Supplementary explanation:

When 0x80000000 (-2147483648), the most negative value, is divided by 0xFFFF (-1), the operation results in an overflow. However, no overflow exception occurs, and the operation produces the following result:

Quotient is 0x80000000 (-2147483648), and remainder is 0x00000000 (0).
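The broadcast divide, including the overflow case just described, can be modeled with the Python sketch below. It assumes that div truncates toward zero and that the remainder takes the sign of the dividend, which this section does not state explicitly; the names `pdivbw`, `trunc_divmod` and `to_signed` are hypothetical. A zero divisor raises an error in the model because the architectural result is undefined.

```python
def to_signed(value: int, width: int) -> int:
    """Reinterpret an unsigned field as a two's-complement signed integer."""
    return value - (1 << width) if value & (1 << (width - 1)) else value


def trunc_divmod(a: int, b: int):
    """Signed division truncating toward zero (assumed rounding behavior)."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q, a - q * b


def pdivbw(rs: int, rt: int):
    """Illustrative model of PDIVBW; returns the 128-bit (LO, HI) pair."""
    divisor = to_signed(rt & 0xFFFF, 16)           # only bits 15..0 of rt are used
    if divisor == 0:
        raise ValueError("result is undefined when the divisor is zero")
    lo = hi = 0
    for shift in range(0, 128, 32):
        dividend = to_signed((rs >> shift) & 0xFFFFFFFF, 32)
        q, r = trunc_divmod(dividend, divisor)
        lo |= (q & 0xFFFFFFFF) << shift            # quotient wraps to 32 bits
        hi |= (r & 0xFFFFFFFF) << shift            # remainder sign-extended to 32 bits
    return lo, hi


# Hypothetical operands: word 1 = -7, word 0 = 7, divisor = 2.
lo, hi = pdivbw((0xFFFFFFF9 << 32) | 7, 2)
print(f"{lo:032x}", f"{hi:032x}")

# Overflow case from the text: 0x80000000 divided by 0xFFFF (-1).
print(pdivbw(0x80000000, 0xFFFF) == (0x80000000, 0))
```

In this model the overflow case needs no special handling: the mathematical quotient 2147483648 wraps to 0x80000000 when truncated to 32 bits, matching the result given above.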

Exceptions:

None

Programming Notes:

In the C790 the integer divide operation proceeds asynchronously and allows other CPU instructions to execute before it is retired. An attempt to read LO or HI before the results are written will cause an interlock until the results are ready. Asynchronous execution does not affect the program result, but offers an opportunity for performance improvement by scheduling the divide so that other instructions can execute in parallel.