SPARC JPS1 Implementation Supplement: Fujitsu SPARC64 V
Fujitsu Limited
Release 1.0, 1 July 2002
Fujitsu Limited, 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki, 211-8588 Japan
Part No. 806-6755-1.
Copyright 2002 Sun Microsystems, Inc., 901 San Antonio Road, Palo Alto, California 94303 U.S.A. All rights reserved. Portions of this document are protected by copyright 1994 SPARC International, Inc. This product or document is protected by copyright and distributed under licenses restricting its use, copying, distribution, and decompilation. No part of this product or document may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any.
F. C H A P T E R Contents 1. Overview 1 Navigating the SPARC64 V Implementation Supplement 1 Fonts and Notational Conventions 1 The SPARC64 V processor 2 Component Overview 4 Instruction Control Unit (IU) 6 Execution Unit (EU) 6 Storage Unit (SU) 7 Secondary Cache and External Access Unit (SXU) 8 2. Definitions 9 3. Architectural Overview 13 4. Data Formats 15 5.
Floating-Point Deferred-Trap Queue (FQ) 24 IU Deferred-Trap Queue 24 6. Instructions 25 Instruction Execution 25 Data Prefetch 25 Instruction Prefetch 26 Syncing Instructions 27 Instruction Formats and Fields 28 Instruction Categories 29 Control-Transfer Instructions (CTIs) 29 Floating-Point Operate (FPop) Instructions 30 Implementation-Dependent Instructions 30 Processor Pipeline 31 Instruction Fetch Stages 31 Issue Stages 33 Execution Stages 33 Completion Stages 34 7.
SPARC JPS1 Implementation-Dependent Traps 39 8. Memory Models 41 Overview 42 SPARC V9 Memory Model 42 Mode Control 42 Synchronizing Instruction and Data Memory 42 A.
D. Formal Specification of the Memory Models 81 E. Opcode Maps 83 F.
Level-1 Data Cache (L1D Cache) 127 Level-2 Unified Cache (L2 Cache) 127 Cache Coherency Protocols 128 Cache Control/Status Instructions 128 Flush Level-1 Instruction Cache (ASI_FLUSH_L1I) 129 Level-2 Cache Control Register (ASI_L2_CTRL) 130 L2 Diagnostics Tag Read (ASI_L2_DIAG_TAG_READ) 130 L2 Diagnostics Tag Read Registers (ASI_L2_DIAG_TAG_READ_REG) 131 N.
error_state Transition Error 150 Urgent Error 150 Restrainable Error 152 Action and Error Control 153 Registers Related to Error Handling 153 Summary of Actions Upon Error Detection 154 Extent of Automatic Source Data Correction for Correctable Error 157 Error Marking for Cacheable Data Error 157 ASI_EIDR 161 Control of Error Action (ASI_ERROR_CONTROL) 161 Fatal Error and error_state Transition Error 163 ASI_STCHG_ERROR_INFO 163 Fatal Error Types 164 Types of error_state Transition Errors 164 Urgent Error 1
TLB Error Handling 195 Handling of TLB Entry Errors 195 Automatic Way Reduction of sTLB 196 Handling of Extended UPA Bus Interface Error 197 Handling of Extended UPA Address Bus Error 197 Handling of Extended UPA Data Bus Error 197 Q.
CHAPTER 1 Overview

1.1 Navigating the SPARC64 V Implementation Supplement

We suggest that you approach this Implementation Supplement to the SPARC Joint Programming Specification as follows.

1. Familiarize yourself with the SPARC64 V processor and its components by reading these sections:
■ The SPARC64 V processor on page 2
■ Component Overview on page 4
■ Processor Pipeline on page 31
2. Study the terminology in Chapter 2, Definitions:
3.
1.3 The SPARC64 V processor The SPARC64 V processor is a high-performance, high-reliability, and high-integrity processor that fully implements the instruction set architecture that conforms to SPARC V9, as described in JPS1 Commonality.
1. Advanced RAS features for caches
■ Strong cache error protection:
  ■ ECC protection for D1 (Data level 1) cache data, U2 (unified level 2) cache data, and the U2 cache tag.
  ■ Parity protection for I1 (Instruction level 1) cache data.
  ■ Parity protection and duplication for the I1 cache tag and the D1 cache tag.
■ Automatic correction of all types of single-bit error:
  ■ Automatic single-bit error correction for the ECC protected data.
■ Asynchronous data error (ADE) trap for additional errors:
  ■ Relaxed instruction end method (precise, retryable, not retryable) for the async_data_error exception to indicate how the instruction should end; depends on the executing instruction and the detected error.
  ■ Some ADE traps that are deferred but retryable.
  ■ Simultaneous reporting of all detected ADE errors at the error barrier for correct handling of retryability.

1.3.1 Component Overview

The SPARC64 V processor contains these components.
[Figure: SPARC64 V block diagram, not reproduced. Labels recoverable from the figure: Extended UPA Bus; E-Unit; SX-Unit; S-Unit; I-Unit; UPA interface logic; MoveIn buffer; MoveOut buffer; U2$ tag; U2$ data (2M, 4-way); S-Unit interface; SX interface; ALUs; ALU input registers and output registers; EXA; EXB; FLA; FLB; EAGA; EAGB; SX order queue; store queue; I-TLB (tag, data, 2048 + 32 entry); level-1 I cache (128 KB, 2-way); D-TLB (tag, data, 2048 + 32 entry); level-1 D cache (128 KB, 2-way); GUB; FUB; GPR; FPR; instruction fetch pipeline; instruction buffer; commit stack entry; reservation stations.]
1.3.2 Instruction Control Unit (IU) The IU predicts the instruction execution path, fetches instructions on the predicted path, distributes the fetched instructions to appropriate reservation stations, and dispatches the instructions to the execution pipeline. The instructions are executed out of order, and the IU commits the instructions in order. Major blocks are defined in TABLE 1-1. TABLE 1-1 1.3.
TABLE 1-2 Execution Unit Major Blocks (Continued)
Name | Description
Interface registers | Input/output registers to other units.
Two integer execution pipelines (EXA, EXB) | 64-bit ALU and shifters.
Two floating-point and graphics execution pipelines (FLA, FLB) | Each floating-point execution pipeline can execute floating-point multiply, floating-point add/sub, floating-point multiply and add, floating-point div/sqrt, and floating-point graphics instruction.
1.3.5 Secondary Cache and External Access Unit (SXU)

The SXU controls the operation of unified level-2 caches and the external data access interface (extended UPA interface). TABLE 1-4 describes the major blocks of the SXU.

TABLE 1-4 Secondary Cache and External Access Unit Major Blocks
Name | Description
Unified level-2 cache | 2-Mbyte, 4-way associative, 64-byte line, writeback; provides low latency data source for both instruction level-1 cache and data level-1 cache.
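As a rough illustration of the level-2 cache geometry quoted in TABLE 1-4 (2 Mbyte, 4-way, 64-byte line), the sketch below derives the number of sets and the resulting address-bit split. The function name and the assumption of a plain power-of-two set index are illustrative only; this supplement does not describe how the hardware actually forms the index.

```python
def cache_geometry(size_bytes, ways, line_bytes):
    """Return (sets, offset_bits, index_bits) for a set-associative cache.

    Assumes power-of-two sizes and a simple modulo index (an assumption,
    not something stated in the supplement).
    """
    sets = size_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1  # log2(line size)
    index_bits = sets.bit_length() - 1         # log2(number of sets)
    return sets, offset_bits, index_bits


# Unified level-2 cache figures from TABLE 1-4.
l2_sets, l2_offset_bits, l2_index_bits = cache_geometry(2 * 1024 * 1024, 4, 64)
```

Under these assumptions the level-2 cache has 8192 sets, with 6 line-offset bits and 13 index bits.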
CHAPTER 2 Definitions

This chapter defines concepts unique to the SPARC64 V, the Fujitsu implementation of SPARC JPS1. For definition of terms that are common to all implementations, please refer to Chapter 2 of Commonality.

committed
    Term applied to an instruction when it has completed without error and all prior instructions have completed without error and have been committed.
instruction retired instruction stall issue-stalling instruction machine sync Memory Management Unit (MMU) Term applied to an instruction that is not allowed to be issued. Not every instruction can be issued in a given cycle. The SPARC64 V implementation imposes certain issue constraints based on resource availability and program requirements. An instruction that prevents new instructions from being issued until it has committed.
in parallel. When instructions are committed, results in renamed registers are posted to the architected registers in the proper sequence to produce the correct program results. scan A method used to initialize all of the machine state within a chip. In a chip that has been designed to be scannable, all of the machine state is connected in one or several loops called “scan rings.” Initialization data can be scanned into the chip through the scan rings.
CHAPTER 3 Architectural Overview

Please refer to Chapter 3 in the Commonality section of SPARC Joint Programming Specification.
CHAPTER 4 Data Formats

Please refer to Chapter 4, Data Formats in Commonality.
CHAPTER 5 Registers

The SPARC64 V processor includes two types of registers: general-purpose (that is, working, data, control/status) and ASI registers. The SPARC V9 architecture also defines two implementation-dependent registers: the IU Deferred-Trap Queue and the Floating-Point Deferred-Trap Queue (FQ); SPARC64 V does not need or contain either queue.
5.1.7 Floating-Point State Register (FSR)

Please refer to Section 5.1.7 of Commonality for the description of FSR. The sections below describe SPARC64 V-specific features of the FSR register.

FSR_nonstandard_fp (NS)

SPARC V9 defines the FSR.NS bit which, when set to 1, causes the FPU to produce implementation-dependent results that may not conform to IEEE Std 754-1985. SPARC64 V implements this bit. When FSR.
else if (; else if (; else ; with IEEE_754_exception>) CEXC field as supplied by FPU>; with unfinished_FPop error>) with unimplemented_FPop error>) FSR Conformance SPARC V9 allows the TEM, cexc, and aexc fields to be implemented in hardware in either of two ways (both of which comply with IEEE Std 754-1985).
Note – Privileged software should not set the PSTATE.RED bit spuriously, since doing so takes the SPARC64 V into RED_state without the required sequencing.

5.2.9 Version (VER) Register

TABLE 5-1 shows the values for the VER register for SPARC64 V.

TABLE 5-1 VER Register Encodings
Bits | Field | Value
63:48 | manuf | 0004₁₆ (impl. dep. #104)
47:32 | impl | 5 (impl. dep.
The Performance Control Register in SPARC64 V is illustrated in FIGURE 5-1 and described in TABLE 5-2.

Bits  | 63:48 | 47:32 | 31:27 | 26   | 25 | 24:22 | 21 | 20:18 | 17 | 16:11 | 10 | 9:4 | 3    | 2  | 1  | 0
Field | 0     | OVF   | 0     | OVRO | 0  | NC    | 0  | SC    | 0  | SU    | 0  | SL  | ULRO | UT | ST | PRIV

FIGURE 5-1 SPARC64 V Performance Control Register (PCR) (ASR 16)

TABLE 5-2 PCR Bit Description
Bit | Field | Description
47:32 | OVF | Overflow Clear/Set/Status. Used to read counter overflow status (via RDPCR) and clear or set counter overflow status bits (via WRPCR).
TABLE 5-2 PCR Bit Description (Continued)
Bit | Field | Description
0 | PRIV | Defined in SPARC JPS1 Commonality, with the additional function of controlling PCR accessibility as described above (impl. dep. #250).

Performance Instrumentation Counter (PIC) Register (ASR 17)

The PIC register is implemented as described in SPARC JPS1 Commonality. Four PICs are implemented in SPARC64 V. Each is accessed through ASR 17, using PCR.SC as a select field.
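The PCR field positions given in this supplement (OVF in bits 47:32, OVRO in bit 26, NC in bits 24:22, SC in bits 20:18, PRIV in bit 0) can be exercised with a small bit-field helper. The sketch below is illustrative only: the helper names are hypothetical, only the fields listed here are covered, and it says nothing about how RDPCR/WRPCR behave in hardware.

```python
# (high bit, low bit) for the PCR fields named in this supplement.
PCR_FIELDS = {
    "OVF": (47, 32),
    "OVRO": (26, 26),
    "NC": (24, 22),
    "SC": (20, 18),
    "PRIV": (0, 0),
}


def pcr_field(pcr, name):
    """Extract one named field from a 64-bit PCR value."""
    hi, lo = PCR_FIELDS[name]
    return (pcr >> lo) & ((1 << (hi - lo + 1)) - 1)


def pcr_with_sc(pcr, sc):
    """Return a copy of pcr with the SC (counter select) field replaced."""
    hi, lo = PCR_FIELDS["SC"]
    mask = ((1 << (hi - lo + 1)) - 1) << lo
    return (pcr & ~mask) | ((sc << lo) & mask)


# Select counter 3 in an otherwise zero PCR image.
pcr_image = pcr_with_sc(0, 3)
```

Software that reads a particular PIC would, under this reading, first write the desired selector into PCR.SC and then read ASR 17.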
After a power-on reset (POR), all fields of DCUCR, including implementation-dependent fields, are set to 0. After a WDR, XIR, or SIR reset, all fields of DCUCR, including implementation-dependent fields, are set to 0. The Data Cache Unit Control Register is illustrated in FIGURE 5-2 and described in TABLE 5-3. In the table, bits are grouped by function rather than by strict bit sequence.
TABLE 5-3 DCUCR Description (Continued)
Bits | Field | Type | Use / Description
1 | DC | RW | Not implemented in SPARC64 V (impl. dep. #252). It reads as 0 and writes to it are ignored.
0 | IC | RW | Not implemented in SPARC64 V (impl. dep. #253). It reads as 0 and writes to it are ignored.

Data Watchpoint Registers

No implementation-dependent feature of SPARC64 V reduces the reliability of data watchpoints (impl. dep. #244).
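The DCUCR behavior stated in this supplement (DC in bit 1 and IC in bit 0 read as 0 and ignore writes; bit 41 holds WEAK_SPCA per impl. dep. #240) can be modeled with a few masks. This is a minimal sketch with hypothetical helper names; it models only these three bits and makes no claim about the remaining DCUCR fields.

```python
DCUCR_WEAK_SPCA = 1 << 41  # impl. dep. #240
DCUCR_DC = 1 << 1          # impl. dep. #252, not implemented
DCUCR_IC = 1 << 0          # impl. dep. #253, not implemented
MASK_64 = (1 << 64) - 1


def dcucr_after_write(value):
    """Value read back after a write, modeling only the ignored DC/IC bits."""
    return value & MASK_64 & ~(DCUCR_DC | DCUCR_IC)


def weak_spca_set(dcucr):
    """True when the WEAK_SPCA bit is set in a DCUCR image."""
    return bool(dcucr & DCUCR_WEAK_SPCA)
```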
CHAPTER 6 Instructions

This chapter presents SPARC64 V implementation-specific instruction details and the processor pipeline information in these subsections:

■ Instruction Execution on page 25
■ Instruction Formats and Fields on page 28
■ Instruction Categories on page 29
■ Processor Pipeline on page 31

For additional, general information, please see parallel subsections of Chapter 6 in Commonality. For easy referencing, we follow the organization of Chapter 6 in Commonality.

6.
1. If a memory operation y resolves to a volatile memory address (location[y]), SPARC64 V will not speculatively prefetch location[y] for any reason; location[y] will be fetched or stored to only when operation y is committable.
2. If a memory operation y resolves to a nonvolatile memory address (location[y]), SPARC64 V may speculatively prefetch location[y], subject to the following subrules:
a.
6.1.3 Syncing Instructions SPARC64 V has instructions, called syncing instructions, that stop execution for the number of cycles it takes to clear the pipeline and to synchronize the processor. There are two types of synchronization, pre and post. A presyncing instruction waits for all previous instructions to commit, commits by itself, and then issues successive instructions. A postsyncing instruction issues by itself and prevents the successive instructions from issuing until it is committed.
TABLE 6-1 SPARC64 V Syncing Instructions (Continued) Presyncing Opcode Sync? Postsyncing Wait for store global visibility? Sync? Discard prefetched instructions? Yes Yes STDA STDFA Yes STFSR, STXFSR Yes Tcc Yes Yes WRASR Yes2 Yes 1. When #cmask ! = 0. 2. WRGSR only. 6.2 Instruction Formats and Fields Instructions are encoded in five major 32-bit formats and several minor formats. Please refer to Section 6.2 of Commonality for illustrations of four major formats.
Since size = 00 is not IMPDEP2B and since size = 11 assumed quad operations but is not implemented in SPARC64 V, the instruction with size = 00 or 11 generates an illegal_instruction exception in SPARC64 V. 6.3 Instruction Categories SPARC V9 instructions comprise the categories listed below. All categories are described in Section 6.3 of Commonality. Subsections in bold face are SPARC64 V implementation dependencies. ■ ■ ■ ■ ■ ■ ■ ■ ■ ■ 6.3.
SPARC64 V implements JMPL and CALL return prediction hardware in the form of a special stack, called the Return Address Stack (RAS). Whenever a CALL or JMPL that writes to %o7 (r[15]) occurs, SPARC64 V “pushes” the return address (PC+8) onto the RAS. When either of the synthetic instructions retl (JMPL [%o7+8]) and ret (JMPL [%i7+8]) is subsequently executed, the return address is predicted to be the address stored on the top of the RAS and the RAS is “popped.”
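The push and pop behavior described above can be sketched as a small software model. The class name, the stack depth, and the overflow and underflow policies below are assumptions made for illustration; this section does not state them.

```python
class ReturnAddressStack:
    """Toy model of return-address prediction (depth is hypothetical)."""

    def __init__(self, depth=4):
        self.depth = depth
        self.entries = []

    def on_call(self, pc):
        """CALL, or JMPL writing %o7: push the return address PC+8."""
        if len(self.entries) == self.depth:
            self.entries.pop(0)  # assumed policy: discard the oldest entry
        self.entries.append(pc + 8)

    def on_return(self):
        """ret/retl: predict the top entry and pop it (None if empty)."""
        return self.entries.pop() if self.entries else None


ras = ReturnAddressStack()
ras.on_call(0x1000)               # outer call
ras.on_call(0x2000)               # nested call
inner_target = ras.on_return()    # predicted return for the nested call
outer_target = ras.on_return()    # predicted return for the outer call
```

Because the model is a strict last-in, first-out stack, properly nested call/return pairs are predicted correctly, while code that uses CALL without a matching return leaves a stale entry behind.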
6.4 Processor Pipeline

The pipeline of SPARC64 V consists of fifteen stages, shown in FIGURE 6-2. Each stage is referenced by one or two letters as follows:

IA IT IM IB IR E D P B X Ps U Ts Ms Bs W Rs

6.4.1 Instruction Fetch Stages

■ IA (Instruction Address generation): Calculate fetch target address.
■ IT (Instruction TLB Tag access): Instruction TLB tag search. Search of BRHIS and RAS is also started.
■ IM (Instruction TLB tag Match): Check TLB tag is matched.
FIGURE 6-2 SPARC64 V Pipeline
[Figure not reproduced. Labels recoverable from the figure: IF EAG, BRHIS, iTLB, L1I, Instruction Buffer, IWR, RSFA, RSFB, RSEA, RSEB, RSA, RSBR, CSE, FXA, FXB, EXA, EXB, EAGA, EAGB, RR, FUB, GUB, dTLB, L1D, LB, LR, FPR, GPR, ccr, fsr, PC, nPC, and the stage letters IA, IT, IM, IB, IR, E, D, P, B, X, U, W, Ps, Ts, Ms, Bs, Rs.]
6.4.2 Issue Stages

■ E (Entry): Instructions are passed from fetch stages.
■ D (Decode): Assign resources and dispatch to reservation station (RS).

SPARC64 V is an out-of-order execution CPU. It has six execution units (two arithmetic and logic units, two floating-point units, and two load/store units). Each unit except the load/store unit has its own reservation station. E and D stages are issue stages that decode instructions and dispatch them to the target RS.
Execution Stages for Cache Access Memory access requests are passed to the cache access pipeline after the target address is calculated. Cache access stages work the same way as instruction fetch stages, except for the handling of branch prediction. See Section 6.4.1, Instruction Fetch Stages, for details.
F. C H A P T E R 7 Traps Please refer to Chapter 7 of Commonality. Section numbers in this chapter correspond to those in Chapter 7 of Commonality. This chapter adds SPARC64 V-specific information in the following sections: ■ ■ ■ ■ ■ 7.
7.1.1 RED_state

RED_state Trap Table

The RED_state trap vector is located at an implementation-dependent address referred to as RSTVaddr. The value of RSTVaddr is a constant within each implementation; in SPARC64 V this virtual address is FFFF FFFF F000 0000₁₆, which translates to physical address 0000 07FF F000 0000₁₆ in RED_state (impl. dep. #114).
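The two RSTVaddr values quoted above are consistent with keeping only the low 43 bits of the virtual address. The sketch below checks that arithmetic; the 43-bit width is an inference from the two quoted addresses, not a statement made in this section, and the function name is hypothetical.

```python
RSTV_VADDR = 0xFFFF_FFFF_F000_0000
PA_BITS = 43  # assumption inferred from the quoted virtual/physical pair


def truncate_to_physical(vaddr, pa_bits=PA_BITS):
    """Keep only the low pa_bits of a virtual address."""
    return vaddr & ((1 << pa_bits) - 1)


rstv_paddr = truncate_to_physical(RSTV_VADDR)
```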
Although the standard behavior of the CPU upon an entry into error_state is to internally generate a watchdog_reset (WDR), the CPU optionally stays halted upon an entry to error_state depending on a setting in the OPSR register (impl. dep. #40, #254).

7.2 Trap Categories

Please refer to Section 7.2 of Commonality. An exception or interrupt request can cause any of the following trap types:

■ Precise trap
■ Deferred trap
■ Disrupting trap
■ Reset trap

7.2.2 Deferred Traps

Please refer to Section 7.2.
7.3 Trap Control

Please refer to Section 7.3 of Commonality.

7.3.1 PIL Control

SPARC64 V receives external interrupts from the UPA interconnect. They cause an interrupt_vector_trap (TT = 60₁₆). The interrupt vector trap handler reads the interrupt information and then schedules SPARC V9-compatible interrupts by writing bits in the SOFTINT register. Please refer to Section 5.2.11 of Commonality for details. During handling of SPARC V9-compatible interrupts by SPARC64 V, the PIL register is checked.
7.4.4 Details of Supported Traps

Please refer to Section 7.4.4 in Commonality.

SPARC64 V Implementation-Specific Traps

SPARC64 V supports the following implementation-specific trap type:

■ async_data_error

7.5 Trap Processing

Please refer to Section 7.5 of Commonality.

7.6 Exception and Interrupt Descriptions

Please refer to Section 7.6 of Commonality.

7.6.4 SPARC V9 Implementation-Dependent, Optional Traps That Are Mandatory in SPARC JPS1

Please refer to Section 7.6.4 of Commonality.
■ Uncorrectable errors in the internal architecture registers (general registers–gr, floating-point registers–fr, ASR, ASI registers)
■ Uncorrectable errors in the core pipeline
■ System data corruption
■ Watch dog timeout first time
■ TLB access error upon access by an ldxa or stxa instruction

Multiple errors may be reported in a single generation of the async_data_error exception.
CHAPTER 8 Memory Models

The SPARC V9 architecture is a model that specifies the behavior observable by software on SPARC V9 systems. Therefore, access to memory can be implemented in any manner, as long as the behavior observed by software conforms to that of the models described in Chapter 8 of Commonality and defined in Appendix D, Formal Specification of the Memory Models, also in Commonality.
8.1 Overview Note – The words “hardware memory model” denote the underlying hardware memory models as differentiated from the “SPARC V9 memory model,” which is the memory model the programmer selects in PSTATE.MM. SPARC64 V supports only one mode of memory handling to guarantee correct operation under any of the three SPARC V9 memory ordering models (impl. dep. #113): ■ 8.4 Total Store Order — All loads are ordered with respect to loads, and all stores are ordered with respect to loads and stores.
corresponding locations in all instruction caches; references to any instruction cache cause corresponding modified data to be flushed and corresponding unmodified data to be invalidated from all data caches. The flush operation is still operative in SPARC64 V, however. Since the FLUSH instruction synchronizes the processor, the total latency varies depending on the situation in SPARC64 V. Assuming all prior instructions are completed, the latency of FLUSH is 18 CPU cycles.
APPENDIX A Instruction Definitions: SPARC64 V Extensions

This appendix describes the SPARC64 V-specific implementation of the instructions in Appendix A of Commonality. If an instruction is not described in this appendix, then no SPARC64 V implementation-dependency applies.

■ See TABLE A-1 of Commonality for the location at which general information about the instruction can be found.
■ Section numbers refer to the parallel section numbers in Appendix A of Commonality.
4. A description of the features, restrictions, and exception-causing conditions. 5. A list of exceptions that can occur as a consequence of attempting to execute the instruction(s). Exceptions due to an instruction_access_error, instruction_access_exception, fast_instruction_access_MMU_miss, async_data_error, ECC_error, and interrupts are not listed because they can occur on any instruction.
A.4 Block Load and Store Instructions (VIS I) The following notes summarize behavior of block load/store instructions in SPARC64 V. 1. Block load and store operations are not atomic, in that they are internally decomposed into eight independent, 8-byte load/store operations in SPARC64 V. Each load/store is always issued and performed in the RMO memory model and obeys all prior MEMBAR and atomic instruction-imposed ordering constraints. 2.
4. The block store with commit instruction always stores the operand in main storage and invalidates the line in the L1D cache if it is present. The invalidation is performed through an S_INV_REQ transaction through UPA by the system controller. 5. The block store instruction stores the operand into main storage if it is not present in the operand cache and the status of the line is invalid, shared, or owned.
A.12 Call and Link SPARC64 V clears the upper 32 bits of the PC value in r[15] when PSTATE.AM is set (impl. dep. #125). The value written into r[15] is visible to the instruction in the delay slot. SPARC64 V has a special hardware table, called the return address stack, to predict the return address from a subroutine. Though the return prediction stack achieves better performance in normal cases, there is a special use of the CALL instruction (call.
A.24.1 Floating-Point Multiply-Add/Subtract SPARC64 V uses IMPDEP2B opcode space to encode the Floating-Point Multiply Add/Subtract instructions.
Description The Floating-point Multiply-Add instructions multiply the registers specified by the rs1 field times the registers specified by the rs2 field, add that product to the registers specified by the rs3 field, then write the result into the registers specified by the rd field.
detects any conditions for an unfinished_FPop trap, the Floating-point Multiply-Add/ Subtract instruction generates the unfinished_FPop exception. In this case, none of rd, cexc, or aexc are modified.
Programming Note – The Multiply Add/Subtract instructions are encoded in the SPARC V9 IMPDEP2 opcode space, and they are specific to the SPARC64 V implementation. They cannot be used in any programs that will be executed on any other SPARC V9 processor, unless that implementation exactly matches the SPARC64 V use for the IMPDEP2 opcode.
A.30 Load Quadword, Atomic [Physical] The Load Quadword ASIs in this section are specific to SPARC64 V, as an extension to SPARC JPS1.
■ TTE.NFO = 0
■ TTE.CP = 1
■ TTE.CV = 0
■ TTE.E = 0
■ TTE.P = 1
■ TTE.W = 0

Note – TTE.IE depends on the endianness of the ASI. When the ASI is 034₁₆, TTE.IE = 0; TTE.IE = 1 when the ASI is 03C₁₆.

Therefore, the atomic quad load physical instruction can only be applied to a cacheable memory area. Semantically, ASI_QUAD_LDD_PHYS{_L} (034₁₆ and 03C₁₆) is a combination of ASI_NUCLEUS_QUAD_LDD and ASI_PHYS_USE_EC.
Description The memory barrier instruction, MEMBAR, has two complementary functions: to express order constraints between memory references and to provide explicit control of memory-reference completion. The membar_mask field in the suggested assembly language is the concatenation of the cmask and mmask instruction fields. The mmask field is encoded in bits 3 through 0 of the instruction.
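The relationship between membar_mask, cmask, and mmask can be shown with a short helper. The mmask position (bits 3 through 0) comes from the text above; the cmask position (bits 6 through 4) is taken from the SPARC V9 definition of MEMBAR rather than from this section, and the function name is hypothetical.

```python
def split_membar_mask(membar_mask):
    """Split a 7-bit membar_mask into (cmask, mmask)."""
    mmask = membar_mask & 0xF         # bits 3:0, as stated above
    cmask = (membar_mask >> 4) & 0x7  # bits 6:4, per SPARC V9
    return cmask, mmask


# A mask with cmask bit 6 set and all four mmask bits set.
cmask, mmask = split_membar_mask(0x4F)
```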
A.42 Partial Store (VIS I)

Please refer to A.42 in Commonality for general details. Watchpoint exceptions on partial store instructions occur conservatively on SPARC64 V. The DCUCR Data Watchpoint masks are only checked for nonzero value (watchpoint enabled). The byte store mask (r[rs2]) in the partial store instruction is ignored, and a watchpoint exception can occur even if the mask is zero (that is, no store will take place) (impl. dep. #249).
TABLE A-7 describes prefetch variants implemented in SPARC64 V. TABLE A-7 Prefetch Variants fcn Fetch to: 0 1 2 3 4 5-15 16-19 L1D S L2 S L1D M L2 M — — reserved (SPARC V9) implementation dependent. L1D S 20 Status Description NOP illegal_instruction exception is signalled. NOP If an access causes an mTLB miss, fast_data_access_MMU_miss exception is signalled. 21 L2 S If an access causes an mTLB miss, fast_data_access_MMU_miss exception is signalled.
A.70 Write State Register

In SPARC64 V, a WRPCR instruction will cause a privileged_action exception if PSTATE.PRIV = 0 and PCR.PRIV = 1. If PSTATE.PRIV = 0 and PCR.PRIV = 0, WRPCR causes a privileged_action exception only when an attempt is made to change (that is, write 1 to) PCR.PRIV (impl. dep. #250).

A.71 Deprecated Instructions

The deprecated instructions in A.71 of Commonality are provided only for compatibility with previous versions of the architecture. They should not be used in new software.
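The WRPCR rule in A.70 can be restated as a small predicate. This is a sketch of the two nonprivileged cases given in A.70; the function name is hypothetical, and the privileged case (PSTATE.PRIV = 1) is assumed here never to raise privileged_action, which A.70 does not state explicitly.

```python
def wrpcr_raises_privileged_action(pstate_priv, pcr_priv, new_pcr_priv):
    """Return True if WRPCR would raise privileged_action.

    pstate_priv  -- current PSTATE.PRIV (0 or 1)
    pcr_priv     -- current PCR.PRIV (0 or 1)
    new_pcr_priv -- PRIV bit in the value being written (0 or 1)
    """
    if pstate_priv == 1:
        return False  # assumption: privileged writes are not restricted
    if pcr_priv == 1:
        return True   # nonprivileged access while PCR.PRIV = 1
    # Nonprivileged, PCR.PRIV = 0: trap only on an attempt to write 1 to PRIV.
    return new_pcr_priv == 1
```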
APPENDIX B IEEE Std 754-1985 Requirements for SPARC V9

The IEEE Std 754-1985 floating-point standard contains a number of implementation dependencies. Please see Appendix B of Commonality for choices for these implementation dependencies, to ensure that SPARC V9 implementations are as consistent as possible. Following is information specific to the SPARC64 V implementation of SPARC V9 in these sections:

■ ■ B.
SPARC64 V floating-point hardware has its specific range of computation. If either the values of input operands or the value of the intermediate result shows that the computation may not fall in the range that hardware provides, SPARC64 V generates an fp_exception_other exception (tt = 022₁₆) with FSR.ftt = 02₁₆ (unfinished_FPop) and the operation is taken over by software.
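A trap handler can recognize this case by checking the trap type and the FSR.ftt field. The sketch below assumes the SPARC V9 layout, in which ftt occupies FSR bits 16:14 and the values 1, 2, and 3 denote IEEE_754_exception, unfinished_FPop, and unimplemented_FPop; those details come from SPARC V9 rather than from this appendix, and the function names are hypothetical.

```python
FP_EXCEPTION_OTHER_TT = 0x022
FTT_SHIFT = 14  # FSR.ftt is bits 16:14 in SPARC V9
FTT_MASK = 0x7
FTT_UNFINISHED_FPOP = 2


def fsr_ftt(fsr):
    """Extract the floating-point trap type field from an FSR image."""
    return (fsr >> FTT_SHIFT) & FTT_MASK


def needs_software_completion(tt, fsr):
    """True when the trap is fp_exception_other with ftt = unfinished_FPop."""
    return tt == FP_EXCEPTION_OTHER_TT and fsr_ftt(fsr) == FTT_UNFINISHED_FPOP
```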
Implementation Note – Detecting the exact boundary conditions requires a large amount of hardware. SPARC64 V detects approximate boundary conditions by calculating the exponent intermediate result (the exponent before rounding) from input operands, to avoid the hardware cost. Since the computation of the boundary conditions is approximate, the detection of a zero result or an overflow result shall be pessimistic. SPARC64 V generates an unfinished_FPop exception pessimistically.
TABLE B-2 unfinished_FPop Boundary Conditions (Continued) Operation Boundary Conditions FMULs, FMULd 1. One of the operands is a denormalized number, the other operand is a normal, nonzero floating-point number (except for a NaN and an infinity), and single precision: -25 < Er double precision: -54 < Er 2. Both operands are normal, nonzero floating-point numbers (except for a NaN and an infinity), TEM.UFM = 0, and single precision: −25 < eres < 1 double precision: −54 < eres < 1 FsMULd 1.
TABLE B-3 Conditions for a Pessimistic Zero
Operations | One operand is denormalized¹ | Both are denormalized | Both are normal fp-number²
FdTOs | always | — | eres ≤ −25
FMULs, FMULd | single precision: Er ≤ −25; double precision: Er ≤ −54 | Always | single precision: eres ≤ −25; double precision: eres ≤ −54
FDIVs, FDIVd | single precision: Er ≤ −25; double precision: Er ≤ −54 | Never | single precision: eres ≤ −25; double precision: eres ≤ −54
1.
summarizes the behavior of SPARC64 V floating-point hardware depending on FSR.NS.

Note – The results and behavior of SPARC64 V in the shaded columns of Table B-5 and Table B-6 conform to IEEE Std 754-1985.

Note – Throughout Table B-5 and Table B-6, lowercase exception conditions such as nx, uf, of, dv and nv are nontrapping IEEE 754 exceptions. Uppercase exception conditions such as NX, UF, OF, DZ and NV are trapping IEEE 754 exceptions.
TABLE B-6 describes how SPARC64 V behaves when FSR.NS = 1 (nonstandard mode). TABLE B-6 Nonarithmetic Operations Under FSR.
APPENDIX C Implementation Dependencies

This appendix summarizes implementation dependencies. In SPARC V9 and SPARC JPS1, the notation “IMPL. DEP. #nn:” identifies the definition of an implementation dependency; the notation “(impl. dep. #nn)” identifies a reference to an implementation dependency. These dependencies are described by their number nn in TABLE C-1 on page 70. These numbers have been removed from the body of this document for SPARC64 V to make the document more readable.
C.2 Hardware Characteristics Please refer to Section C.2 of Commonality. C.3 Implementation Dependency Categories Please refer to Section C.3 of Commonality. C.4 List of Implementation Dependencies TABLE C-1 provides a complete list of how each implementation dependency is treated in the SPARC64 V implementation.
TABLE C-1 SPARC64 V Implementation Dependencies (2 of 11) Nbr SPARC64 V Implementation Notes 9 RDASR/WRASR privileged status See A.50 and A.70 in Commonality for details of implementation-dependent RDASR/WRASR instructions. 10–12 Reserved. 13 VER.impl VER.impl = 5 for the SPARC64 V processor. 20 14–15 Reserved. — IU deferred-trap queue 24 16 Page — SPARC64 V neither has nor needs an IU deferred-trap queue. 17 Reserved.
TABLE C-1 SPARC64 V Implementation Dependencies (3 of 11) Nbr SPARC64 V Implementation Notes 32 Deferred traps SPARC64 V signals a deferred trap in a few of its severe error conditions. SPARC64 V does not contain a deferred trap queue. 33 Trap precision There are no deferred traps in SPARC64 V other than the trap caused by a few severe error conditions. All traps that occur as the result of program execution are precise.
TABLE C-1 SPARC64 V Implementation Dependencies (4 of 11) Nbr SPARC64 V Implementation Notes Page 42 FLUSH instruction SPARC64 V implements the FLUSH instruction in hardware. 43 Reserved. 44 Data access FPU trap The destination register(s) are unchanged if an access error occurs. 45–46 Reserved. 47 RDASR See A.50, Read State Register, in Commonality for details. — 48 WRASR See A.70, Write State Register, in Commonality for details. — 49–54 Reserved.
TABLE C-1 SPARC64 V Implementation Dependencies (5 of 11) Nbr SPARC64 V Implementation Notes 106 IMPDEPn instructions SPARC64 V uses the IMPDEP2 opcode for the Multiply Add/Subtract instructions. SPARC64 V also conforms to Sun’s specification for VIS-1 and VIS-2. 107 Unimplemented LDD trap SPARC64 V implements LDD in hardware. — Unimplemented STD trap — 108 Page 49 SPARC64 V implements STD in hardware.
TABLE C-1 SPARC64 V Implementation Dependencies (6 of 11)
Nbr | SPARC64 V Implementation Notes | Page
119 | Unimplemented values for PSTATE.MM: Writing 11₂ into PSTATE.MM causes the machine to use the TSO memory model. However, the encoding 11₂ should not be used, since future versions of SPARC64 V may use this encoding for a new memory model. |
TABLE C-1 SPARC64 V Implementation Dependencies (7 of 11) Nbr SPARC64 V Implementation Notes 206 SHUTDOWN instruction In privileged mode the SHUTDOWN instruction executes as a NOP in SPARC64 V. 207 PCR register bits 47:32, 26:17, and bit 3 SPARC64 V uses these bits for the following purposes: Page 58 20, 21, 201 • Bits 47:32 for set/clear/show status of overflow (OVF). • Bit 26 for validity of OVF field (OVRO). • Bits 24:22 for number of counter pair (NC). • Bits 20:18 for counter selector (SC).
TABLE C-1 SPARC64 V Implementation Dependencies (8 of 11)
Nbr | SPARC64 V Implementation Notes | Page
218 | async_data_error: async_data_error trap is implemented in SPARC64 V, using tt = 40₁₆. See Appendix P for details. | 39
219 | Asynchronous Fault Address Register (AFAR) allocation: SPARC64 V implements two AFARs: • VA = 00₁₆ for an error occurring in D1 cache. • VA = 08₁₆ for an error occurring in U2 cache. |
TABLE C-1 SPARC64 V Implementation Dependencies (9 of 11) Nbr SPARC64 V Implementation Notes 227 TSB number of entries SPARC64 V supports a maximum of 16 million entries in the common TSB and a maximum of 32 million lines in the Split TSB. 88 228 TSB_Hash supplied from TSB or context-ID register TSB_Hash is generated from the context-ID register in SPARC64 V. 88 229 TSB_Base address generation SPARC64 V generates the TSB_Base address directly from the TLB Extension Registers.
TABLE C-1 SPARC64 V Implementation Dependencies (10 of 11)
Nbr | SPARC64 V Implementation Notes | Page
240 | DCU Control Register bits 47:41: SPARC64 V uses bit 41 for WEAK_SPCA, which enables/disables memory access in speculative paths. | 23
241 | Address Masking and DSFAR: SPARC64 V writes zeroes to the more significant 32 bits of DSFAR. | —
242 | TLB lock bit: In SPARC64 V, only the fITLB and the fDTLB support the lock bit. The lock bit in sITLB and sDTLB is read as 0 and writes to it are ignored.
TABLE C-1 SPARC64 V Implementation Dependencies (11 of 11)
Nbr | SPARC64 V Implementation Notes | Page
252 | DCUCR.DC (Data Cache Enable): SPARC64 V does not implement DCUCR.DC. | 24
253 | DCUCR.IC (Instruction Cache Enable): SPARC64 V does not implement DCUCR.IC. | 24
254 | Means of exiting error_state: The standard behavior of a SPARC64 V CPU upon entry into error_state is to reset itself by internally generating a watchdog_reset (WDR).
APPENDIX D Formal Specification of the Memory Models
Please refer to Appendix D of Commonality.
APPENDIX E Opcode Maps
Please refer to Appendix E in Commonality. TABLE E-1 lists the opcode map for the SPARC64 V IMPDEP2 instruction.
APPENDIX F Memory Management Unit
The Memory Management Unit (MMU) architecture of SPARC64 V conforms to the MMU architecture defined in Appendix F of Commonality but with some model dependency. See Appendix F in Commonality for the basic definitions of the SPARC64 V MMU. Section numbers in this appendix correspond to those in Appendix F of Commonality. Figures and tables, however, are numbered consecutively.
The micro-TLBs are coherent with the main TLBs and are not visible to software, with the exception of TLB multiple hit detection. Hardware maintains the consistency between micro-TLBs and main TLBs. No further details of the micro-TLBs are provided, because software cannot operate on a micro-TLB directly and its configuration is invisible to software.
The physical address length to be passed to the UPA interface is 41 bits or 43 bits, as designated in the ASI_UPA_CONFIG.AM field. When the 41-bit PA is specified in ASI_UPA_CONFIG.AM, the most significant 2 bits of the CPU internal physical address are discarded and only the remaining least significant 41 bits are passed to the UPA address bus. If the discarded most significant 2 bits are not 0, the urgent error ASI_UGESR.SDC is detected after the invalid address transfer to the UPA interface.
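The truncation rule above can be illustrated with a minimal sketch. It is not taken from the SPARC64 V documentation: the function name `upa_address` and the boolean `use_41_bit` (standing in for the ASI_UPA_CONFIG.AM setting, whose encoding is not given here) are illustrative assumptions, and the model assumes a 43-bit CPU internal physical address, as implied by the 2 discarded bits.

```python
# Illustrative model only; names and the 43-bit internal width are assumptions.
INTERNAL_PA_BITS = 43
NARROW_PA_BITS = 41


def upa_address(internal_pa, use_41_bit):
    """Return (address passed to the UPA bus, True if ASI_UGESR.SDC would be flagged)."""
    if not 0 <= internal_pa < (1 << INTERNAL_PA_BITS):
        raise ValueError("internal PA must fit in 43 bits")
    if not use_41_bit:
        # 43-bit mode: the address is passed through unchanged.
        return internal_pa, False
    # 41-bit mode: keep the least significant 41 bits, discard the top 2.
    low = internal_pa & ((1 << NARROW_PA_BITS) - 1)
    discarded = internal_pa >> NARROW_PA_BITS
    # Nonzero discarded bits correspond to the urgent error described above.
    return low, discarded != 0
```

Note that, per the text, the error is detected after the invalid address has already been transferred, so the sketch returns the truncated address together with the error flag rather than raising.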
F.3.3 TSB Organization
IMPL. DEP. #227: The maximum number of entries in a TSB is implementation dependent in JPS1. See impl. dep. #228 for the limitation of TSB_size in TSB registers.
SPARC64 V supports a maximum of 16 million lines in the common TSB and a maximum of 32 million lines in the split TSB. The maximum number N in FIGURE F-4 of Commonality is 16 million (16 × 2^20).
F.4.2 TSB Pointer Formation
IMPL. DEP.
(∥ denotes bit-field concatenation.)
8K_POINTER = TSB_Extension[63:14+N] ∥ 0 ∥ (VA[21+N:13] ⊕ TSB_Hash) ∥ 0000
64K_POINTER = TSB_Extension[63:14+N] ∥ 1 ∥ (VA[24+N:16] ⊕ TSB_Hash) ∥ 0000
Value of TSB_Hash for both a shared TSB and a split TSB
When 0 <= N <= 4, TSB_Hash = context_register[N+8:0]
Otherwise, when 5 <= N <= 15, TSB_Hash[12:0] = context_register[12:0] and TSB_Hash[N+8:13] = 0 (N-4 bits zero)
F.5 Faults and Traps
IMPL. DEP.
TABLE F-2 MMU Trap Types, Causes, and Stored State Register Update Policy Registers Updated (Stored State in MMU) Ref #Trap Name Trap Cause I-SFSR I-MMU Tag Access 2. instruction_access_exception Several (see below) X2 X 3. fast_data_access_MMU_miss D-TLB miss X3 X 6816–6B16 4. data_access_exception Several (see below) X3 X1 3016 5. fast_data_access_protection Protection violation X3 X 6C16-6F16 6. privileged_action Use of privileged ASI X3 7.
■ An fDTLB entry parity error is detected in a fDTLB lookup for an instruction operand access.
F.8 Reset, Disable, and RED_state Behavior
IMPL. DEP. #231: The variability of the width of physical address is implementation dependent in JPS1, and if variable, the initial width of the physical address after reset is also implementation dependent in JPS1. See impl. dep. #224 on page 86 for the variability of the width of physical address.
F.10 Internal Registers and ASI operations
F.10.1 Accessing MMU Registers
IMPL. DEP. #233: Whether the TSB_Hash field is implemented in I/D Primary/Secondary/Nucleus TSB Extension Register is implementation dependent in JPS1.
On SPARC64 V, the TSB_Hash field is not implemented in the I/D Primary/Secondary/Nucleus TSB Extension Register. See TSB Pointer Formation on page 88 for details.
IMPL. DEP.
TABLE F-3 MCNTL Field Description Bits Field Name RW Description Data <16> NC_Cache R/W Force instruction caching. When set, the instruction lines fetched from a noncacheable area are cached in the instruction cache. The NC_Cache has no effect on operand references. If MCNTL.NC_Cache = 1, the CPU fetches a noncacheable line in four consecutive 16-byte fetches and stores the entire 64 bytes in the I-Cache. NC_Cache is provided for use by OBP, and OBP should clear the bit before exiting.
For fTLB, SPARC64 V implements a pseudo-LRU. For sTLB, LRU is used. IMPL. DEP. #235: The MMU TLB data access address assignment and the purpose of the address are implementation dependent in JPS1.
The MMU TLB data access address assignment and the purpose of the address on SPARC64 V are shown in TABLE F-4.
TABLE F-4 MMU TLB Data Access Address Assignment
VA Bit | Field | Description
17:16 | TLB# | TLB to be accessed: fTLB or sTLB is designated as follows.
    00: fTLB (32 entries)
    01: reserved
    10: sTLB (2048 entries of 8-Kbyte page and 4-Mbyte page)
    11: reserved
15 | ER | Error insertion into mTLB: When set on a write, an entry with parity error is inserted into a selected TLB location.
FIGURE F-2 Index number of set associative TLBs [figure not reproduced: for each RMD setting (00, 01, 10, 11) it shows the index ranges, within 0 to 2047, assigned to 8-Kbyte page entries and 4-Mbyte page entries in way0 and way1, with some ranges reserved]
I/D MMU TLB Tag Access Register
On an ASI store to the TLB Data Access or Data In Regist
I/D TSB Base Registers IMPL. DEP. #236: The width of the TSB_Size field in the TSB Base Register is implementation dependent; the permitted range is from 2 to 6 bits. The least significant bit of TSB_Size is always at bit 0 of the TSB Base Register. Any bits unimplemented at the most significant end of TSB_Size read as 0, and writes to them are ignored. On SPARC64 V, the width of the TSB_Size field in the TSB Base Register is 4 bits.
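As a worked example, the 4-bit TSB_Size width can be related to the 16 million and 32 million limits quoted for impl. dep. #227. The sketch below assumes the usual convention that a TSB holds 512 × 2^TSB_Size entries and that a split TSB consists of two halves of that size; neither assumption is stated in this supplement, and the function name is illustrative.

```python
TSB_SIZE_BITS = 4      # width of TSB_Size on SPARC64 V, from the text above
BASE_ENTRIES = 512     # assumption: entry count when TSB_Size = 0


def tsb_entries(tsb_size, split=False):
    """Number of TSB entries for a given TSB_Size value (illustrative model)."""
    if not 0 <= tsb_size < (1 << TSB_SIZE_BITS):
        raise ValueError("TSB_Size must fit in 4 bits")
    entries = BASE_ENTRIES << tsb_size
    # Assumption: a split TSB has separate 8K and 64K halves of this size.
    return 2 * entries if split else entries


largest = (1 << TSB_SIZE_BITS) - 1           # TSB_Size = 15
print(tsb_entries(largest))                  # common TSB
print(tsb_entries(largest, split=True))      # split TSB
```

Under these assumptions the largest encodable size gives 16 × 2^20 entries for the common TSB and twice that for the split TSB, which agrees with the limits stated earlier in this appendix.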
The specification of bits 24:0 in the SPARC64 V SFSR conforms to the specification defined in Section F.10.9 in Commonality. Bits 63:25 in SPARC64 V SFSR are implementation dependent. TABLE F-5 describes the I-SFSR bits, and TABLE F-8 describes the D-SFSR bits.
TABLE F-5 I-SFSR Bit Description
Bits | Field Name | RW | Description
Data<63:62> | TLB# | R/W | Faulty TLB# log. Recorded upon an mITLB error to identify the faulty TLB (fITLB: 00₂ or sITLB: 10₂).
TABLE F-5 I-SFSR Bit Description Bits Field Name RW Description Data <15> TM R/W Translation miss. When TM = 1, it signifies an occurrence of a mITLB miss upon an instruction reference. Data <13:7> FT<6:0> R/W Fault type. Saves and indicates an exact condition that caused the recorded exception. See TABLE F-6 for the field encoding. In the IMMU, FT is valid only for an instruction_access_exception. The ISFSR.
ISFSR is updated upon the occurrence of a fast_instruction_access_MMU_miss, an instruction_access_exception, or an instruction_access_error trap. TABLE F-7 shows the detailed update policy of each field, and TABLE F-8 describes the fields.
TABLE F-8 D-SFSR Bit Description (2 of 3) Bits Field Name RW Description Data <46> MK R/W Marked UE. On SPARC64 V, all uncorrectable errors are reported as marked, so this bit is always set whenever DSFSR.UE = 1. See Section P.2.4 for details. Data <45:32> EID R/W Error-mark ID. Valid for a marked UE. See Section P.2.4 for details about ERROR_MARK_ID. Data <31> UE R/W Operand access error status. Uncorrectable error.
TABLE F-8 D-SFSR Bit Description (3 of 3)
Bits | Field Name | RW | Description
Data | CT<1:0> | R/W | Context type. Saves the context attribute for the reference that invokes an exception. For nontranslating ASI or invalid ASI, DSFSR.CT = 11₂.
    00₂: Primary
    01₂: Secondary
    10₂: Nucleus
    11₂: Reserved
When a data_access_exception trap is caused by an invalid combination of an ASI and an opcode (e.g.
TABLE F-9 MMU Synchronous Fault Status Register FT (Fault Type) Field (Continued)
FT<6:0> | Error Description
08₁₆ | An attempt was made to access an alternate address space with an illegal ASI value, an illegal VA, an invalid read/write attribute, or an illegally sized operand. If the quad load ASI is used with an opcode other than LDDA, this bit is set. Note: Since an illegal ASI check is done prior to a TTE unmatch check, DSFSR.FT<3> = 1 causes the value of other bits of DSFSR.
TABLE F-10 DSFSR Update Policy TLB#, index FV OW W, PR, NF, CT1 FT TM ASI UE, UPA, mDTLB, NC2, E2 DSFAR Miss on fault/exception K 1 K K K 1 K K K Miss on miss K K K U K 1 K K K Field 1. 2. 3. 4. 5. 6. 7. The value of DSFSR.CT is 11 when the ASI is not a translating ASI. The value 11 is recorded in DSFSR.CT for an illegal value in ASI (0016–0316, 1216–1316, 1616–1716, 1A16–1B16, 1E16–2316, 2D16–2F16, or 3516–3B16). Valid only for the data_access_error caused by DSFSR.
F.11.10 TLB Replacement Policy
Automatic TLB Replacement Rule
On an automatic replacement write to the TLB, the MMU picks the entry to write according to the following rules:
1. If the following conditions are satisfied:
■ the new entry maps to an 8-Kbyte or a 4-Mbyte unlocked page, and
■ ASI_MCNTRL.fw_fITLB = 0 for IMMU automatic replacement, and
■ ASI_MCNTRL.fw_fDTLB = 0 for DMMU automatic replacement,
then the replacement is directed to the sTLB (2-way TLB).
■ sTLB entry update data:
    ■ New sTLB entry data is designated in stxa data.
    ■ New sTLB entry tag is designated in the I/D TLB Tag Access Register.
■ Restriction between the stxa address and ASI TLB Tag Access Register contents:
    ■ The relation stxa_VA<11:3> = ASI_TAG_ACCESS_REGISTER<21:13> and stxa_VA<13> = 0 should be satisfied. Only if this condition is satisfied can the 8-Kbyte sTLB entry be replaced as designated.
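The restriction on an 8-Kbyte sTLB entry write can be checked with a few lines of bit arithmetic. The following is a minimal sketch under the stated relation only; the helper names `bits` and `stlb_8k_write_is_consistent` are hypothetical and do not correspond to any documented interface.

```python
def bits(value, hi, lo):
    """Extract the bit field value<hi:lo> as an integer."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def stlb_8k_write_is_consistent(stxa_va, tag_access):
    """True if stxa_VA<11:3> == TAG_ACCESS<21:13> and stxa_VA<13> == 0."""
    index_matches = bits(stxa_va, 11, 3) == bits(tag_access, 21, 13)
    bit13_clear = bits(stxa_va, 13, 13) == 0
    return index_matches and bit13_clear


# Example: the tag access register holds a VA whose bits <21:13> equal 5.
tag_access = (5 << 13) | 0x7
print(stlb_8k_write_is_consistent(5 << 3, tag_access))
```

Software preparing such a write could run a check of this kind before issuing the stxa, since the text indicates that the designated entry is replaced only when the relation holds.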
APPENDIX G Assembly Language Syntax
Please refer to Appendix G of Commonality.
APPENDIX H Software Considerations
Please refer to Appendix H of Commonality.
APPENDIX I Extending the SPARC V9 Architecture
Please refer to Appendix I of Commonality.
APPENDIX J Changes from SPARC V8 to SPARC V9
Please refer to Appendix K of Commonality.
APPENDIX K Programming with the Memory Models
Please refer to Appendix J of Commonality.
APPENDIX L Address Space Identifiers
Every load or store address in a SPARC V9 processor has an 8-bit Address Space Identifier (ASI) appended to the VA. The VA plus the ASI fully specifies the address. For instruction loads and for data loads or stores that do not use the load or store alternate instructions, the ASI is an implicit ASI generated by the hardware.
TABLE L-1 SPARC64 V ASI Assignments (2 of 3)
Value | ASI Name (Suggested Macro Syntax) | Type | VA | Description | Page
45₁₆ | ASI_DCU_CONTROL_REG (ASI_DCUCR) | RW | 00 | | 22
45₁₆ | ASI_MEMORY_CONTROL_REG | RW | 08 | | 92
46₁₆–49₁₆ | (JPS1)
4A₁₆ | ASI_UPA_CONFIG_REGISTER | R | — | | 215
4B₁₆ | (JPS1)
4C₁₆ | ASI_ASYNC_FAULT_STATUS | RW | 00 | | 174
4C₁₆ | ASI_URGENT_ERROR_STATUS (ASI_UGESR) | R | 08 | | 165
4C₁₆ | ASI_ERROR_CONTROL | RW | 10 | | 161
4C₁₆ | ASI_STCHG_ERROR_INFO | RW | 18 | | 163
4D₁₆ | ASI_ASYNC_FAULT_ADDR_D1 | RW | 00 | | 177
4D₁₆
TABLE L-1 SPARC64 V ASI Assignments (3 of 3)
Value | ASI Name (Suggested Macro Syntax) | Type | VA | Description | Page
6F₁₆ | ASI_C_BSTWBUSY | RW | C0 | | 123
70₁₆–EE₁₆ | (JPS1)
EF₁₆ | ASI_LBSYR0 | RW | 00 | | 124
EF₁₆ | ASI_LBSYR1 | RW | 08 | | 124
EF₁₆ | ASI_BSTW0 | RW | 80 | | 124
EF₁₆ | ASI_BSTW1 | RW | 88 | | 124
F0₁₆–FF₁₆ | (JPS1)
L.3.2 Special Memory Access ASIs
Please refer to Section L.3.3 in Commonality. In addition to the ASIs described in Commonality, SPARC64 V supports the ASIs described below.
ASI 4F₁₆ (ASI_SCRATCH_REGx)
SPARC64 V provides eight 64-bit registers that can be used as temporary storage for supervisor software.
[1] Register Name: ASI_SCRATCH_REGx (x = 0–7)
[2] ASI: 4F₁₆
[3] VA: VA<5:3> = register number. The other VA bits must be zero.
[4] RW: Supervisor read/write
Block Load and Store ASIs
ASIs E0₁₆ and E1₁₆ exist only for use with STDFA instructions as Block Store with Commit operations (see Block Load and Store Instructions (VIS I) on page 47).
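For the ASI_SCRATCH_REGx addressing described earlier in this section (register number in VA<5:3>, all other VA bits zero), the mapping between register number and VA can be sketched as follows. The function names are hypothetical helpers for illustration, not part of any SPARC64 V software interface.

```python
ASI_SCRATCH_REG = 0x4F   # ASI value given in the text
REG_FIELD_MASK = 0x38    # VA<5:3>


def scratch_reg_va(x):
    """VA to use with ASI 0x4F for scratch register x (0-7)."""
    if not 0 <= x <= 7:
        raise ValueError("scratch register number must be 0-7")
    return x << 3


def scratch_reg_number(va):
    """Register number selected by a VA; rejects VAs with other bits set."""
    if va & ~REG_FIELD_MASK:
        raise ValueError("VA bits outside <5:3> must be zero")
    return (va & REG_FIELD_MASK) >> 3


print([hex(scratch_reg_va(x)) for x in range(8)])
```

The eight registers therefore sit at consecutive 8-byte offsets, which matches their 64-bit width.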
n = 2 (4-byte alignment): LDDF_mem_address_not_aligned exception is generated.
n ≤ 1 (≤ 2-byte alignment): mem_address_not_aligned exception is generated.
2. If the memory address is correctly aligned, SPARC64 V generates a data_access_exception with AFSR.FTYPE = “invalid ASI.”
L.4 Barrier Assist for Parallel Processing
SPARC64 V has a barrier-assist feature that works in concert with the barrier mechanism in the memory system to enable high-speed synchronization among CPUs in the system.
3. When the LBSY on the SB is changed, LBSY change information is broadcast to all CPUs in the SB. Each CPU receives the change information and updates its copy.
4. On a read from an application, the copy value of LBSY, which is designated by supervisor software, is returned.
High-Speed BST Write Mechanism
1. An application writes a value, designated by supervisor software, to a BST.
2. The CPU sends BST write information to the system controller.
3. The system controller writes the BST.
BSTW Control Register (ASI_C_BSTW0, ASI_C_BSTW1)
[1] Register Name: ASI_C_BSTW0, ASI_C_BSTW1
[2] ASI: 6F₁₆
[3] VA: 80₁₆ (ASI_C_BSTW0), 88₁₆ (ASI_C_BSTW1)
[4] RW: Supervisor read/write
The BSTW control register designates which bit in LBSY is written through ASI_BSTWx.
Bit | Name | RW | Description
63 | V | RW | Valid. When V = 0, BL_num and SB_BPU_num are ignored and a write to ASI_BSTWx is discarded. When V = 1, data in the ASI_BSTWx is written to the selected bit in SB_BPU.
Last Barrier Synchronization Status Read (ASI_LBSYR0, ASI_LBSYR1)
[1] Register Name: ASI_LBSYR0, ASI_LBSYR1
[2] ASI: EF₁₆
[3] VA: 00₁₆ (ASI_LBSYR0), 08₁₆ (ASI_LBSYR1)
[4] RW: Read (Write is ignored)
ASI_LBSYRx is a read interface to the copy of LBSY. A write to ASI_LBSYRx is ignored.
Bit | Name | RW | Description
0 | RV | R | Read value. The bit designated by ASI_C_LBSYRx is shown.
APPENDIX M Cache Organization
This appendix describes SPARC64 V cache organization in the following sections:
■ Cache Types on page 125
■ Cache Coherency Protocols on page 128
■ Cache Control/Status Instructions on page 128
M.1 Cache Types
SPARC64 V has two levels of on-chip caches, with these characteristics:
■ Level-1 cache is split for instruction and data; level-2 cache is unified.
M.1.1 Level-1 Instruction Cache (L1I Cache)
TABLE M-1 shows the characteristics of a level-1 instruction cache.
TABLE M-1 L1I Cache Characteristics
Feature | Value
Size | 128 Kbytes
Associativity | 2-way
Line Size | 64-byte
Indexing | Virtually indexed, physically tagged (VIPT)
Tag Protection | Parity and duplicate
Data Protection | Parity
Although an L1I cache is VIPT, TTE.CV is ineffective since SPARC64 V has unaliasing features in hardware.
M.1.2 Level-1 Data Cache (L1D Cache)
The level-1 data cache is a writeback cache. Its characteristics are shown in TABLE M-2.
TABLE M-2 L1D Cache Characteristics
Feature | Value
Size | 128 Kbytes
Associativity | 2-way
Line Size | 64-byte
Indexing | Virtually indexed, physically tagged (VIPT)
Tag Protection | Parity and duplicate
Data Protection | ECC
Although L1D cache is VIPT, TTE.CV is ineffective since SPARC64 V has unaliasing features in hardware.
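As a worked example of the figures in TABLE M-1 and TABLE M-2, the number of sets and the width of the cache index follow directly from size, associativity, and line size. The sketch below is generic set-associative cache arithmetic, not a description of the SPARC64 V hardware; the function name and the use of the 8-Kbyte page size as the reference page are assumptions made for illustration.

```python
def cache_geometry(size_bytes, ways, line_bytes, page_bytes):
    """Derive set count and index width for a set-associative cache.

    All arguments are assumed to be powers of two.
    """
    sets = size_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1       # log2(line size)
    index_bits = sets.bit_length() - 1              # log2(number of sets)
    page_offset_bits = page_bytes.bit_length() - 1  # log2(page size)
    # Index bits above the page offset come from the virtual page number.
    virtual_index_bits = max(0, offset_bits + index_bits - page_offset_bits)
    return {
        "sets": sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "virtual_index_bits": virtual_index_bits,
    }


l1 = cache_geometry(128 * 1024, ways=2, line_bytes=64, page_bytes=8 * 1024)
print(l1)
```

For a 128-Kbyte, 2-way cache with 64-byte lines this gives 1024 sets and a 10-bit index (VA<15:6>), of which three bits lie above an 8-Kbyte page offset. That is the usual source of virtual aliasing in a VIPT cache, which is presumably what the hardware unaliasing feature mentioned above handles.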
M.2 Cache Coherency Protocols
The CPU uses the UPA MOESI cache-coherence protocol; the letters stand for cache line states as follows:
M | Exclusive modified
O | Shared modified (owned)
E | Exclusive clean
S | Shared clean
I | Invalid
A subset of the MOESI protocol is used in the on-chip caches as well as the D-Tags in the system controller. TABLE M-4 shows the relationships between the protocols.
1. The opcode of the instructions should be ldda, ldxa, lddfa, stda, stxa, or stdfa. Otherwise, a data_access_exception with D-SFSR.FT = 08₁₆ (Invalid ASI) is generated.
2. No operand address translation is performed for these instructions.
3. VA<2:0> of every operand address should be 0. Otherwise, a mem_address_not_aligned exception is generated.
4. The don’t-care bits (designated “—” in the format) in the VA of the load or store alternate can be of any value.
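The opcode and alignment rules above can be summarized in a small checker. This is a hedged sketch: the function name is hypothetical, the returned strings are labels rather than real trap objects, and checking the opcode before the alignment is an assumption, since the relative priority of the two exceptions is not stated here.

```python
ALLOWED_OPCODES = {"ldda", "ldxa", "lddfa", "stda", "stxa", "stdfa"}


def check_cache_asi_access(opcode, va):
    """Return the exception name an access would raise, or None if it is legal."""
    # Rule 1: only these load/store alternate opcodes are accepted.
    if opcode.lower() not in ALLOWED_OPCODES:
        return "data_access_exception (D-SFSR.FT = 0x08, Invalid ASI)"
    # Rule 3: VA<2:0> must be zero (8-byte alignment).
    if va & 0x7:
        return "mem_address_not_aligned"
    # Rule 4: don't-care VA bits are not checked.
    return None


print(check_cache_asi_access("ldxa", 0x10))
print(check_cache_asi_access("lduwa", 0x10))
print(check_cache_asi_access("stxa", 0x14))
```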
M.3.2 Level-2 Cache Control Register (ASI_L2_CTRL)
[1] Register Name: ASI_L2_CTRL
[2] ASI: 6A₁₆
[3] VA: 10₁₆
[4] RW: Supervisor read/write
[5] Data: ASI_L2_CTRL is a control register for L2 training, interface, and size configuration. It is illustrated below and described in TABLE M-6.
ASI_L2_DIAG_TAG_READ works in concert with ASI_L2_DIAG_TAG_READ_REG. A read to ASI_L2_DIAG_TAG_READ returns 0, with the side effect of setting the tag to ASI_L2_DIAG_TAG_READ_REG0-6.
[1] Register Name: ASI_L2_DIAG_TAG
[2] ASI: 6B₁₆
[3] VA: VA<18:6>: Index number of the tag. 0000₁₆–7FFC0₁₆
[4] RW: Supervisor read
[5] Data: 0 is read.
M.3.4 L2 Diagnostics Tag Read Registers (ASI_L2_DIAG_TAG_READ_REG)
ASI_L2_DIAG_TAG_READ_REG0-6 holds the tag that is specified by the read of ASI_L2_DIAG_TAG_READ.
APPENDIX N Interrupt Handling
Interrupt handling in SPARC64 V is described in these sections:
■ Interrupt Dispatch on page 133
■ Interrupt Receive on page 135
■ Interrupt-Related ASR Registers on page 136
N.1 Interrupt Dispatch
When a processor wants to dispatch an interrupt to another UPA port, it first sets up the interrupt data registers (ASI_INTR_W data 0-7) with the outgoing interrupt packet data by using ASI instructions.
FIGURE N-1 Dispatching an Interrupt [flowchart not reproduced; it shows the following steps]
1. Read ASI_INTR_DISPATCH_STATUS. If Busy is set, the result is an error.
2. PSTATE.IE ← 0 (begin atomic sequence).
3. Write ASI_INTR_W (data 0) through ASI_INTR_W (data 7).
4. Write ASI_INTR_W (interrupt dispatch).
5. MEMBAR.
6. Read ASI_INTR_DISPATCH_STATUS; repeat while Busy is set.
7. PSTATE.IE ← 1 (end atomic sequence).
8. Check Nack; if it is not set, the dispatch is complete.
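The dispatch flow of FIGURE N-1 can be sketched as a small simulation. Everything here is illustrative: `FakeDispatchUnit` is a stand-in for the ASI accesses and is not a real interface, and retrying the whole sequence when Nack is set is an assumption about the flowchart branch that cannot be confirmed from this text.

```python
class FakeDispatchUnit:
    """Stand-in for the interrupt dispatch ASIs (hypothetical, for illustration)."""

    def __init__(self, nacks_before_ack=0, busy_polls=1):
        self.nacks_left = nacks_before_ack
        self.busy_polls = busy_polls
        self._pending_polls = 0
        self._nack = False
        self.interrupts_enabled = True   # models PSTATE.IE
        self.log = []

    def read_status(self):
        """Return (busy, nack), as a read of ASI_INTR_DISPATCH_STATUS would."""
        if self._pending_polls > 0:
            self._pending_polls -= 1
            return True, False
        return False, self._nack

    def write_data(self, index, value):
        self.log.append(("data", index, value))

    def write_dispatch(self):
        self.log.append(("dispatch",))
        self._pending_polls = self.busy_polls
        self._nack = self.nacks_left > 0
        if self._nack:
            self.nacks_left -= 1

    def membar(self):
        self.log.append(("membar",))


def dispatch_interrupt(unit, packet, max_attempts=8):
    """Follow the FIGURE N-1 steps; return the number of attempts used."""
    if len(packet) != 8:
        raise ValueError("an interrupt packet has eight data words")
    for attempt in range(1, max_attempts + 1):
        busy, _ = unit.read_status()
        if busy:
            raise RuntimeError("a dispatch is already in progress")
        unit.interrupts_enabled = False          # PSTATE.IE <- 0
        for index, word in enumerate(packet):
            unit.write_data(index, word)
        unit.write_dispatch()
        unit.membar()
        busy, nack = unit.read_status()
        while busy:                              # poll until Busy clears
            busy, nack = unit.read_status()
        unit.interrupts_enabled = True           # PSTATE.IE <- 1
        if not nack:
            return attempt
        # Assumption: on Nack the whole sequence is retried.
    raise RuntimeError("target kept returning Nack")
```

A call such as `dispatch_interrupt(FakeDispatchUnit(nacks_before_ack=2), list(range(8)))` exercises the retry path. The bounded `max_attempts` is a design choice of the sketch so that a permanently busy target cannot hang the caller.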
N.2 Interrupt Receive When an interrupt packet is received, eight interrupt data registers are updated with the associated incoming data and the BUSY bit in the ASI_INTR_RECEIVE register is set. If interrupts are enabled (PSTATE.IE = 1), then the processor takes a trap and the interrupt data registers are read by the software to determine the appropriate trap handler. The handler may reprioritize this interrupt packet to a lower priority. FIGURE N-2 is an example of the interrupt receive flow.
N.3 Interrupt Global Registers
Please refer to Section N.3 of Commonality.
N.4 Interrupt-Related ASR Registers
Please refer to Section N.4 of Commonality for details of these registers.
N.4.2 Interrupt Vector Dispatch Register
SPARC64 V ignores all 10 bits of VA<38:29> when the Interrupt Vector Dispatch Register is written (impl. dep. #246).
N.4.3 Interrupt Vector Dispatch Status Register
In SPARC64 V, 32 BUSY/NACK pairs are implemented in the Interrupt Vector Dispatch Status Register (impl. dep.
APPENDIX O Reset, RED_state, and error_state
This appendix contains these sections:
■ Reset Types on page 137
■ RED_state and error_state on page 139
■ Processor State after Reset and in RED_state on page 141
O.1 Reset Types
This section describes the four reset types: power-on reset, watchdog reset, externally initiated reset, and software-initiated reset. O.1.
3. The UPA_RESET_L pin is deasserted. The processor enters RED_state with TT = 1 trap to RSTVaddr + 20₁₆ and starts the instruction execution.
O.1.2 Watchdog Reset (WDR)
The watchdog reset trap is generated internally in the following cases:
■ Second watchdog timeout detection while TL < MAXTL
■ First watchdog timeout detection while TL = MAXTL
■ When a trap occurs while TL = MAXTL
When triggered by a watchdog timeout, a WDR trap has TT = 2 and control transfers to RSTVaddr + 40₁₆.
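The WDR conditions and the two reset entry points given above can be expressed compactly. This is a minimal sketch under the rules as listed: the function names are hypothetical, MAXTL and RSTVaddr are passed in because their values are not given in this section, and only the POR and timeout-triggered WDR vectors are modeled.

```python
# (TT, offset from RSTVaddr), taken from the text for POR and timeout WDR only.
RESET_ENTRY = {"POR": (1, 0x20), "WDR": (2, 0x40)}


def reset_entry(rstvaddr, kind):
    """Return (TT, target address) for a POR or a timeout-triggered WDR."""
    tt, offset = RESET_ENTRY[kind]
    return tt, rstvaddr + offset


def watchdog_reset_generated(tl, maxtl, watchdog_timeouts, trap_taken=False):
    """Apply the three WDR conditions listed above."""
    if not 0 <= tl <= maxtl:
        raise ValueError("TL must be between 0 and MAXTL")
    if tl == maxtl:
        # First timeout, or any trap, while TL = MAXTL.
        return trap_taken or watchdog_timeouts >= 1
    # While TL < MAXTL only the second timeout causes a WDR.
    return watchdog_timeouts >= 2


print(reset_entry(0x1000, "WDR"))
print(watchdog_reset_generated(tl=2, maxtl=5, watchdog_timeouts=1))
```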
O.2 RED_state and error_state
FIGURE O-1 illustrates the processor state transition diagram.
[FIGURE O-1: processor state transition diagram, including the CPU Fatal Error state; not reproduced]
O.2.1 RED_state Once the processor enters RED_state for any reason except when a power-on reset (POR) is performed, the software should not attempt to return to execute_state; if software attempts a return, then the state of the processor is unpredictable.
O.2.3 CPU Fatal Error state
The processor enters CPU fatal error state when a fatal error is detected on the processor. A fatal error is one that breaks the cache coherency or the system data integrity and is not reported as the SDC (small data corruption) error. See Appendix P, Error Handling, for details of the SDC error. The processor reports the fatal error detection to the system, and the system causes the fatal reset. Soft POR will be applied to all CPUs in the system at the fatal reset. O.
TABLE O-1 Nonprivileged and Privileged Register State after Reset and in RED_state (Continued) Name TLE CLE POR1 WDR2 0/ Copied from CLE 0/ Unchanged Copied from CLE Unchanged XIR TBA<63:15> Unknown/Unchanged Unchanged PIL Unknown/Unchanged Unchanged CWP Unknown/Unchanged Unchanged except for register window traps FPRS Unknown/Unchanged Unchanged TL MAXTL min (TL + 1, MAXTL) TPC[TL] TNPC[TL] Unknown/Unchanged Unknown/Unchanged PC nPC Unknown/Unchanged CCR ASI PSTATE CWP PC nPC T
TABLE O-2 ASR State after Reset and in RED_state A S R Name POR1 WDR2 0 Y Unknown/Unchanged Unchanged 2 CCR Unknown/Unchanged Unchanged 3 ASI Unknown/Unchanged Unchanged TICK 1 Restart at 0 Unchanged Unchanged 0 Unchanged 0 0 Unknown/Unchanged Unchanged Unknown/Unchanged Unchanged 4 NPT Counter 6 FSR 16 PCR 17 PIC 18 DCR 19 GSR 22 23 UT ST Others RED_state XIR SIR Unchanged Restart at 0 Unchanged Unchanged Always 0 0 0 Unknown/Unchanged Unchanged Unchanged Unch
TABLE O-3 ASI Register State After Reset and in RED_state (2 of 3) A S I VA Name POR1 WDR2 4A 00 UPA_CONFIG WB_S WRI_S INT_S UC_S AM MCAP CLK_MODE SCIQ1 SCIQ0 UPC_CAP2 MID UPC_CAP 000/Unchanged 00/Unchanged 00/Unchanged 010/Unchanged OPSR value/Unchanged OPSR value (read-only) Pin 000/Unchanged 0000/Unchanged 1 (Read-only) Module ID (read-only) 01_000000_0001_11011 Unchanged Unchanged Unchanged Unchanged Unchanged Unchanged Unchanged Unchanged Unchanged Unchanged Unchanged Unchanged AFSR Unknow
TABLE O-3 ASI Register State After Reset and in RED_state (3 of 3) A S I VA Name POR1 WDR2 58 20 DMMU_SFAR Unknown/Unchanged Unchanged 58 28 DMMU_TSB_BASE Unknown/Unchanged Unchanged 58 30 DMMU_TAG_ACCESS Unknown/Unchanged Unchanged 58 38 DMMU_VA_WATCHPOINT Unknown/Unchanged Unchanged 58 40 DMMU_PA_WATCHPOINT Unknown/Unchanged Unchanged 58 48 DMMU_TSB_PEXT Unknown/Unchanged Unchanged 58 58 DMMU_TSB_NEXT Unknown/Unchanged Unchanged 59 — DMMU_TSB_8KB_PTR Unknown/Un
TABLE O-4 UPA slave register State after Reset and in RED_state PA Name POR1(binary) WDR2 00 UPA_PORTID Cookie SREQ_S ECCnotValid One_Read PRINT_RDQ PREQ_DQ PREQ_RQ UPACAP FC16 1 0 0 01 000000 0001 11011 Unchanged Unchanged Unchanged Unchanged Unchanged Unchanged Unchanged Unchanged XIR SIR RED_state 1.Hard POR occurs when power is cycled. Values are unknown following hard POR. Soft POR occurs when UPA_RESET_L is asserted. Values are unchanged following soft POR. 2.
O.3.2 Hardware Power-On Reset Sequence
To be defined later.
O.3.3 Firmware Initialization Sequence
To be defined later.
APPENDIX P Error Handling
This appendix describes processor behavior for programmers writing operating system, firmware, and recovery code for SPARC64 V. Section headings differ from those of Appendix P of Commonality.
P.1 Error Classification
On SPARC64 V, an error is classified into one of the following four categories, depending on the degree to which it obstructs program execution:
1. Fatal error
2. Error state transition error
3. Urgent error
4.
When the CPU detects the fatal error, the CPU enters FATAL error_state and reports the fatal error occurrence to the system controller. The system controller transfers the entire system state to the FATAL state and stops the system. After the system stops, a FATAL reset, which is a type of power-on reset, will be issued to the whole system. P.1.2 error_state Transition Error An error_state transition error is a serious error that prevents the CPU from reporting the error by generating a trap.
■ Otherwise, an error exception is generated and the damaged instruction is executed as when ASI_ERROR_CONTROL.WEAK_ED = 0 is set. The three types of instruction-obstructing errors are described below. ■ I_UGE (instruction urgent error) — All of the instruction-obstructing errors except IAE (instruction access error) and DAE (data access error). There are two categories of I_UGEs. ■ ■ An uncorrectable error in an internal program-visible register that obstructs instruction execution.
When the resource with the error is used, the program cannot continue execution, or the error_state transition error or the fatal error is detected. ■ The error in an important resource that is expected to invoke the operating system “panic” process The operating system panic process is expected when this error is detected because the normal processing cannot be expected to continue when this error occurs. The A_UGE is a disrupting error with the following deviations.
■ Degradation SPARC64 V can isolate an internal hardware resource that generates frequent errors and continue processing without deleterious effect on software during program execution. However, performance is degraded by the resource isolation. This degradation is reported as a restrainable error. The restrainable error can be reported to privileged software by the ECC_error trap. When PSTATE.
P.2.2 Summary of Actions Upon Error Detection TABLE P-2 summarizes what happens when an error is detected. TABLE P-2 Action Upon Detection of an Error (1 of 4) Fatal Error (FE) Error detection mask (the condition to suppress error detection) None Error State Transition Error (EE) When ASI_ECR.WEAK_ED = 1, the error detection is suppressed incompletely. Urgent Error (UGE) Restrainable Error (RE) None I_UGE, IAE, DAE When ASI_ECR.WEAK_ED = 1, error detection is suppressed incompletely.
TABLE P-2 Action Upon Detection of an Error (2 of 4) Fatal Error (FE) Error State Transition Error (EE) Action upon the 1. CPU enters 1. CPU enters error detection CPU fatal error_state. state. 2. Watchdog reset 2. CPU informs (WDR) is caused the system of on the CPU. fatal error occurrence. 3. The FATAL reset (which is a form of POR reset) is issued to the whole system. 4. POR reset is caused to all CPUs in the system.
TABLE P-2 Action Upon Detection of an Error (3 of 4) Fatal Error (FE) Error State Transition Error (EE) Urgent Error (UGE) Restrainable Error (RE) tt (trap type) 1 (RED_state) 2 (RED_state) ADE: 4016 DAE: 3216 IAE: 0A16 6316 Trap priority 1 1 ADE — 2 DAE — 12 IAE — 3 32 End-method of trapped instruction Abandoned Abandoned. ADE trap Precise, retryable or nonretryable. See P.4.3. Precise IAE trap, DAE trap Precise.
TABLE P-2 Action Upon Detection of an Error (4 of 4) Fatal Error (FE) Number of errors indicated at trap All FEs are detected and accumulated in ASI_STCHG_ ERROR_INFO Error State Transition Error (EE) All EEs are detected and accumulated in ASI_STCHG_ ERROR_INFO Urgent Error (UGE) Single-ADE trap All I_UGEs and A_UGEs detected at trap. Restrainable Error (RE) All restrainable errors detected and accumulated in ASI_AFSR. Multiple-ADE trap The multiple-ADE indication + UGEs at first ADE trap.
■ When a hardware unit first detects an uncorrected error in the cacheable data, the hardware unit replaces the data and ECC of the cacheable data with a special pattern that identifies the original error source and signifies that the data is already marked. The error marking helps identify the error source and prevent multiple error reports by a single error even after several cache lines transfer with uncorrected data.
TABLE P-4 Format of Error-Marked Data
Data/ECC Bit | Value
41:36 | 0 (6 bits).
35 | Error bit. The value is unpredictable.
34:23 | 0 (12 bits).
22 | Error bit. The value is unpredictable.
21:14 | 0 (8 bits).
13:0 | ERROR_MARK_ID (14 bits).
ECC | The pattern indicates 3-bit error in bits 63, 35, and 22, that is, the pattern causing the 7F₁₆ syndrome.
The ERROR_MARK_ID (14 bits wide) identifies the error source.
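The data layout in TABLE P-4 can be decoded with simple masks. The sketch below covers only the bit fields listed in this excerpt (bits 41:0 plus the error bits named in the ECC row); it does not model the ECC syndrome check that actually identifies marked data, and the function names are illustrative.

```python
ERROR_MARK_ID_MASK = (1 << 14) - 1                 # bits 13:0
ZERO_FIELDS = ((41, 36), (34, 23), (21, 14))       # fields specified as 0
ERROR_BITS = (63, 35, 22)                          # values are unpredictable


def error_mark_id(data):
    """Extract ERROR_MARK_ID (bits 13:0) from error-marked data."""
    return data & ERROR_MARK_ID_MASK


def zero_fields_ok(data):
    """True if every field that TABLE P-4 lists as 0 is actually 0."""
    for hi, lo in ZERO_FIELDS:
        width = hi - lo + 1
        if (data >> lo) & ((1 << width) - 1):
            return False
    return True


marked = (1 << 35) | (1 << 22) | 0x1234   # error bits happen to read as 1
print(hex(error_mark_id(marked)), zero_fields_ok(marked))
```

Because the error bits are unpredictable, the check deliberately ignores them and inspects only the fields that are defined to be zero.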
Difference Between Error Marking on SPARC64 IV and SPARC64 V TABLE P-7 lists the differences between error marking on SPARC64 IV and SPARC64 V. TABLE P-7 Error Marking on SPARC64 IV and SPARC64 V ECC for cacheable data SPARC64 IV SPARC64 V ECC for UPA ECC for UPA Trigger of error marking The detection of a raw UE The detection of a raw UE ERROR_MARK_ID value Value specified in TABLE P-6. Value specified in TABLE P-6.
P.2.5 ASI_EIDR
The ASI_EIDR register designates the source ID in the ERROR_MARK_ID of the CPU.
[1] Register name: ASI_EIDR
[2] ASI: 6E₁₆
[3] VA: 00₁₆
[4] Error checking: Parity.
[5] Format & function: See TABLE P-8.
TABLE P-8 ASI_EIDR Bit Description
Bit | Name | RW | Description
63:14 | Reserved | R | Always 0.
13:0 | ERROR_MARK_ID | RW | ERROR_MARK_ID for the error caused by the CPU.
P.2.
TABLE P-9 ASI_ERROR_CONTROL Bit Description (Continued)
Bit | Name | RW | Description
1 | WEAK_ED | RW | Weak Error Detection. Controls whether the detection of I_UGE and DAE is suppressed: When WEAK_ED = 0, error detection is not suppressed. When WEAK_ED = 1, error detection is suppressed if the CPU can continue processing. When I_UGE or DAE is detected during instruction execution while WEAK_ED = 1, the value of the output register or the store target memory location becomes unpredictable.
P.3 Fatal Error and error_state Transition Error
P.3.1 ASI_STCHG_ERROR_INFO
The ASI_STCHG_ERROR_INFO register stores detected FATAL error and error_state transition error information, for access by OBP (Open Boot PROM) software.
[1] Register name: ASI_STCHG_ERROR_INFO
[2] ASI: 4C₁₆
[3] VA: 18₁₆
[4] Error checking: None
[5] Format & function: See TABLE P-10
[6] Initial value at reset: Hard POR: All fields are set to 0. Other resets: Values are unchanged.
[7] Update policy:
TABLE P-10 Format of ASI_STCHG_ERROR_INFO Bit Description (Continued) Bit Name RW Description 9 EE_SIR_IN_MAXTL R Upon detection of the corresponding error, set to 1. 8 EE_TRAP_IN_MAXTL R Upon detection of the corresponding error, set to 1. 7:3 Reserved R Always 0. 2 FE_OTHER R Upon detection of the corresponding error, set to 1. 1 FE_U2TAG_UNCORRECTED_ERROR R Upon detection of the corresponding error, set to 1.
■ Ideal specification (not implemented)
The EE_OTHER bit is specified in ASI_STCHG_ERROR_INFO bit 14. When hardware detects error_state transition errors other than those described above, it sets ASI_STCHG_ERROR_INFO.EE_OTHER = 1.
P.4 Urgent Error
This section presents details about urgent errors: status monitoring, actions, and end-methods.
P.4.
TABLE P-11 ASI_UGESR Bit Description (2 of 4) Bit Name RW Description 22 IAUG_CRE R Uncorrectable error in any of the following: (IA) ASI_EIDR (IA) ASI_PA_WATCH_POINT when enabled (IA) ASI_VA_WATCH_POINT when enabled (I) ASI_AFAR_D1 (I) ASI_AFAR_U2 (I) ASI_INTR_R (SPARC64 V deviation from the ideal specification: the uncorrectable error in ASI_INTR_R at load instruction access is detected but reported as ASI_UGESR.COREERR instead of ASI_UGESR.IAUG_CRE; the reported ASI_UGESR.
TABLE P-11 ASI_UGESR Bit Description (3 of 4) Bit Name RW Description 15 AUG_SDC R System data corruption. Indicates the occurrence of the following system data corruption: Small data corruption: Data in the cacheable area with an unpredictable address is destroyed. The destroyed area is some number of 64-byte blocks.
TABLE P-11 ASI_UGESR Bit Description (4 of 4)
Bit Name RW Description
5:4 INSTEND R Trapped instruction end-method. Upon a single async_data_error trap without watchdog timeout detection, INSTEND indicates the instruction end-method of the trapped instruction pointed to by TPC as follows:
00₂: Precise
01₂: Retryable but not precise
10₂: Reserved
11₂: Not retryable
See Section P.4.3 for the instruction end-method for the async_data_error trap.
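The INSTEND encoding above can be decoded in software once the ASI_UGESR value has been read. The following C fragment is a minimal sketch under stated assumptions: the enum and function names are hypothetical, the 64-bit value is assumed to have been read already (for example, by an LDXA in the trap handler), and treating the precise and retryable end-methods as resumable is an interpretation of their names, not a rule stated in this table.

```c
#include <assert.h>
#include <stdint.h>

/* INSTEND encodings, ASI_UGESR bits 5:4 (TABLE P-11). */
enum instend {
    INSTEND_PRECISE       = 0, /* 00: precise */
    INSTEND_RETRYABLE     = 1, /* 01: retryable but not precise */
    INSTEND_RESERVED      = 2, /* 10: reserved */
    INSTEND_NOT_RETRYABLE = 3  /* 11: not retryable */
};

/* Extract the INSTEND field from an ASI_UGESR value already read by software. */
static inline enum instend ugesr_instend(uint64_t ugesr)
{
    return (enum instend)((ugesr >> 4) & 0x3u);
}

/* Hypothetical helper (assumption): nonzero when the end-method suggests
 * that the instruction referenced by TPC may be executed again. */
static inline int instend_can_retry(enum instend m)
{
    return m == INSTEND_PRECISE || m == INSTEND_RETRYABLE;
}
```

A handler would normally combine this check with the error indication bits in the rest of ASI_UGESR before deciding whether to resume.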
The following actions are executed in this order:
a. State transition
if (TL = MAXTL), the CPU enters error_state and abandons the ADE trap;
else if (CPU is in execution state && (TL = MAXTL − 1)), then the CPU enters RED_state.
b. Trap target address calculation
When the CPU is in execution state, the trap target address is calculated from %tba, %tt, and %tl. Otherwise, the CPU is in RED_state and the trap target address is set to RSTVaddr + A0₁₆.
c. TL is incremented: TL ← TL + 1.
3.
Errors in registers other than those listed above and any errors in the TLB entry remain. b. Update of ASI_UGESR, as shown in TABLE P-13. TABLE P-13 ASI_UGESR Update for Single and Multiple-ADE Exceptions Bit Field 63:6 Error indication All bits in this field are updated. All I_UGEs and A_UGEs detected at the trap are indicated simultaneously. Update upon a Single-ADE Trap Update upon a Multiple-ADE Traps 5:4 INSTEND The instruction end-method of the instruction referenced by TPC is set.
TABLE P-14 defines each instruction end-method after an ADE trap. TABLE P-14 Instruction End-Method After async_data_error Exception Precise Retryable But Not Precise Not Retryable Instructions executed after the last ADE, IAE, or DAE trap and before the trapped instruction referenced by TPC. Ended (Committed). The instructions without UGE complete as defined in the architecture.
void expected_software_handling_of_ADE_trap() { /* Only %r0-%r7 can be used from here to Point#1 because the register window control registers may not have valid value until Point#1. It is recommended that only %r0-%r7 are used as general-purpose registers (GPR) in the whole single-ADE trap handler, if possible.
causes the data_access_error trap when its tag matches at the DTLB reference for address translation. */
}
if (ASI_UGESR.IUG_ITLB == 1) {
    execute demap_all for ITLB;
    /* A locked fITLB entry with uncorrectable error is not removed by this operation. A locked fITLB entry with UE never detects its tag match or causes the data access error trap when its tag matches at the ITLB reference for address translation. */
}
if ((ASI_UGESR.bits22:14 == 0) && ((ASI_UGESR.INSTEND == 0) || (ASI_UGESR.
P.7 Restrainable Errors
This section describes the registers (ASI_ASYNC_FAULT_STATUS, ASI_ASYNC_FAULT_ADDR_D1, and ASI_ASYNC_FAULT_ADDR_U2) that define the restrainable errors and explains how software handles these errors.
P.7.1 ASI_ASYNC_FAULT_STATUS (ASI_AFSR)
[1] Register name: ASI_ASYNC_FAULT_STATUS (ASI_AFSR)
[2] ASI: 4C₁₆
[3] VA: 00₁₆
[4] Error checking: None
[5] Format & function: See TABLE P-15
[6] Initial value at reset: Hard POR: All fields in ASI_AFSR are set to 0.
■ If the Prio_U2 column for the error shown in the table row is blank, the error is never recorded into ASI_AFAR_U2.
■ Otherwise, the Prio_U2 column for the error shown in the table row indicates the ASI_AFAR_U2 recording priority, as follows. Let P_U2 be the Prio_U2 column value for the error E2. Then:
Upon detection of the error E2, if P_U2 > ASI_AFAR_U2.CONTENTS, the error E2 is recorded into ASI_AFAR_U2 and ASI_AFAR_U2.CONTENTS is set to P_U2.
Upon detection of the error E2, if P_U2 ≤ ASI_AFAR_U2.
TABLE P-15 ASI_ASYNC_FAULT_STATUS Bit Description (Continued) Bit Name R/W 3 UE_DST_BETO RW1C 2 UE_RAW_L2$FILL RW1C 8016 Raw UE in incoming data at L2 cache fill. Indicates a raw (unmarked) uncorrectable error in incoming data from UPA bus at the level 2 cache fill. The doubleword containing the raw UE in the L2 cache was marked with the ERROR_MARK_ID = 0. 1 UE_RAW_L2$INSD RW1C C016 Raw UE in L2 cache inside data.
P.7.2 ASI_ASYNC_FAULT_ADDR_D1
[1] Register name: ASI_ASYNC_FAULT_ADDR_D1 (ASI_AFAR_D1)
[2] ASI: 4D₁₆
[3] VA: 00₁₆
[4] Error checking: Parity
[5] Format & function: See TABLE P-16.
[6] Initial value at reset: Hard POR: All fields in ASI_AFAR_D1 are set to 0. Other reset: Value in ASI_AFAR_D1 is unchanged.
[7] Update: When a new restrainable error is detected, ASI_AFAR_D1 is updated as defined in Section P.7.1 in the notes on the AFSR Prio_D1 column of TABLE P-15.
[8] Software access:
P.7.3 ASI_ASYNC_FAULT_ADDR_U2
[1] Register name: ASI_ASYNC_FAULT_ADDR_U2 (ASI_AFAR_U2)
[2] ASI: 4D₁₆
[3] VA: 08₁₆
[4] Error checking: Parity
[5] Format & function: See TABLE P-17.
[6] Initial value at reset: Hard POR: All fields are set to 0. Other reset: Values are unchanged.
[7] Update: When a new restrainable error is detected, ASI_AFAR_U2 is updated as defined in Section P.7.1 in the notes on the AFSR Prio_U2 column of TABLE P-15.
[8] Software access:
TABLE P-17 ASI_ASYNC_FAULT_ADDR_U2 (ASI_AFAR_U2) Register Bit Description (Continued)
Bit Name R/W Description
42:3 PA_BIT42_3 R Physical address bit 42:3. Contains the value indicated by ASI_AFAR_U2.CONTENTS, as shown below:
ASI_AFAR_U2.CONTENTS Error Name Contents of PA_BIT42_3
40₁₆ CE_INCOMED The physical address of the doubleword with the error.
80₁₆ UE_RAW_L2$FILL The physical address of the doubleword with the error.
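The priority-based recording rule for ASI_AFAR_U2 (Section P.7.1, Prio_U2 column) can be modeled in a few lines of C. This is only an illustrative software model of behavior that the hardware performs: the struct, macro, and function names are hypothetical, the priority codes 40₁₆, 80₁₆, and C0₁₆ are the Prio_U2 values given in TABLE P-15, and leaving the register unchanged when the new priority is lower or equal is an assumption about that case.

```c
#include <assert.h>
#include <stdint.h>

/* Prio_U2 codes as listed in TABLE P-15 / TABLE P-17. */
#define PRIO_U2_CE_INCOMED     0x40u
#define PRIO_U2_UE_RAW_L2_FILL 0x80u
#define PRIO_U2_UE_RAW_L2_INSD 0xC0u

/* Hypothetical software model of ASI_AFAR_U2 (not the hardware bit layout). */
struct afar_u2_model {
    uint8_t  contents; /* CONTENTS: priority code of the recorded error, 0 if none */
    uint64_t pa;       /* recorded physical address, bits 42:3 only */
};

/* Apply the recording rule for a newly detected error with priority prio.
 * Returns 1 if the error was recorded, 0 if the register was left unchanged. */
static inline int afar_u2_record(struct afar_u2_model *r, uint8_t prio, uint64_t pa)
{
    assert(r != 0);
    if (prio > r->contents) {
        r->contents = prio;
        r->pa = pa & 0x7FFFFFFFFF8ull; /* keep PA bits 42:3 (doubleword address) */
        return 1;
    }
    return 0; /* assumption: lower or equal priority does not overwrite */
}
```

Under this model, a raw UE recorded from inside the L2 cache (C0₁₆) is not displaced by a later CE_INCOMED (40₁₆), which matches the intent of keeping the most severe address.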
b. Write the U2 cache line with the CE detection to memory either by using the ASI_L2_CTRL.U2_FLUSH facility or by displacement flush. c. Clear ASI_AFSR.CE_INCOMED and reload the memory block to U2 cache, using load instructions. Check whether the CE in memory has been corrected by inspecting ASI_AFSR.CE_INCOMED and ASI_AFAR_U2. d. If the CE in memory block is not corrected, a permanent error may be detected. Avoid using the memory block with the permanent correctable error as much as possible. ■ ASI_AFSR.
P.8 Handling of Internal Register Errors
This section describes error handling for the following:
■ Most registers
■ ASR registers
■ ASI registers
P.8.1 Register Error Handling (Excluding ASRs and ASI Registers)
The terminology used in TABLE P-18 is defined as follows:
Column Term Meaning
Error Detect Condition InstAccess The error is detected when the instruction accesses the register.
TABLE P-18 P.8.
TABLE P-19 ASR Error Handling (Continued) ASR Number Register Name RW Error Protect Error Detect Condition Error Type Correction 5 PC R Parity Always IUG_PSTATE ADE trap 6 FPRS RW Parity Always IUG_%F ADE trap, W 7 — 8-15 — 16 PCR RW None — — — 17 PIC RW None — — — — 18 DCR R None — — 19 GSR RW Parity Always IUG_%F ADE trap, W 20 SET_SOFTINT W None — — — 21 CLEAR_SOFTINT W None — — — 22 SOFTINT RW Parity AUG always (I)AUG_CRE I(A)UG_CRE
(2 of 3) Column Term Meaning Error Detect Condition Always Error is always checked. AUG always Error is checked when (ASI_ERROR_CONTROL.UGE_HANDLER = 0) && (ASI_ERROR_CONTROL.WEAK_ED = 0). LDXA Error is checked when the register is read by LDXA instruction. LDXA #I Error is checked when the register is read by LDXA instruction. Also, the register is used for the calculation of IMMU_TSB_8KB_PTR and IMMU_TSB_64KB_PTR.
(3 of 3) Column Error Type Correction Term Meaning error_state error_state transition error. (I)AUG_xxxx The error is indicated by ASI_UGESR.IAUG_xxxx = 1, and the error class is autonomous urgent error. I(A)UG_xxxx The error is indicated by ASI_UGESR.IAUG_xxxx = 1, and the error class is instruction urgent error. Not detected (#dv) In SPARC64 V, the error is not detected. In the ideal specification, some errors should be detected but this behavior is not implemented.
TABLE P-20 shows the handling of ASI register errors.
TABLE P-20 ASI Handling of ASI Register Errors (Continued) VA Register Name RW Error Protect Error Detect Condition Error Type Correction 5816 3016 DMMU_TAG_ACCESS RW Parity LDXA #D IUG_TSBP W (WotherD) 5816 3816 DMMU_VA_WATCHPOINT RW Parity Enabled LDXA (I)AUG_CRE I(A)UG_CRE W W 5816 4016 DMMU_PA_WATCHPOINT RW Parity Enabled LDXA (I)AUG_CRE I(A)UG_CRE W W 5816 4816 DMMU_TSB_PEXT RW Parity = DTSB_BASE I(A)UG_TSBCTXT W 5816 5016 DMMU_TSB_SEXT RW Parity = DTSB_BASE
SPARC64 V Implementation and the Ideal Specification In the table on page 183 (defining terminology in TABLE P-20), the rows (ASIs 6F16, 7F16, and EF16) with error type of “Not detected (#dv)” or “COREERROR (#dv)” indicate that the SPARC64 V implementation deviates from the ideal specification, which is described in TABLE P-21 but is not implemented in SPARC64 V.
When a parity error is detected in a D1 cache tag entry or in a D1 cache tag copy entry, hardware automatically corrects the error by copying the correct tag entry from the other copy of the tag entry. If the error can be corrected in this way, program execution is unaffected. Similarly, when a parity error is detected in an I1 cache tag entry or in an I1 cache tag copy entry, hardware automatically corrects the error by copying the correct tag entry from the other copy of the tag entry.
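The tag and tag-copy correction described above is performed by hardware, but its logic can be illustrated with a small software model. The sketch below is hypothetical throughout: the entry layout, the single even-parity bit over a 64-bit tag word, and all names are assumptions chosen for illustration and do not describe the actual SPARC64 V tag format.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one tag entry: a tag word plus one even-parity bit. */
struct tag_entry {
    uint64_t tag;
    unsigned parity; /* 0 or 1 */
};

/* Parity of a 64-bit word: 1 if the number of set bits is odd. */
static inline unsigned parity64(uint64_t v)
{
    unsigned p = 0;
    while (v) {
        p ^= 1u;    /* toggles once per set bit */
        v &= v - 1; /* clears the lowest set bit */
    }
    return p;
}

/* Nonzero when the stored parity matches the tag word. */
static inline int tag_ok(const struct tag_entry *e)
{
    return parity64(e->tag) == e->parity;
}

/* Repair whichever of the two copies fails its parity check.
 * Returns 0 if both were good, 1 if one copy was repaired from the other,
 * and -1 if both are bad, so no correct source exists. */
static inline int tag_pair_correct(struct tag_entry *tag, struct tag_entry *copy)
{
    int a = tag_ok(tag), b = tag_ok(copy);
    if (a && b) return 0;
    if (a) { *copy = *tag; return 1; }
    if (b) { *tag = *copy; return 1; }
    return -1;
}
```

The -1 case corresponds to an error that cannot be corrected by copying; how the processor treats that case is outside the scope of this sketch.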
P.9.2 Handling of an I1 Cache Data Error I1 cache data is protected by parity attached to every doubleword. When a parity error is detected in I1 cache data during an instruction fetch, hardware executes the following sequence: 1. Reread the I1 cache line containing the parity error from the U2 cache. The read data from U2 cache must contain only the doubleword without error or the doubleword with the marked UE, because error marking is applied to U2 cache outgoing data. 2.
Marked Uncorrectable Error in D1 Cache Data When a marked uncorrectable error (UE) in D1 cache data is detected during the D1 cache line writeback to the U2 cache, the D1 cache data and its ECC are written to the target U2 cache data and its ECC without modification. That is, a marked UE in D1 cache is propagated into the U2 cache. Such an error is not reported to software.
P.9.4 Handling of a U2 Cache Data Error U2 cache data is protected by 2-bit error detection and 1-bit error correction ECC, attached to every doubleword. Correctable Error in U2 Cache Data When a correctable error is detected in the incoming U2 cache fill data from UPA, the data is corrected by hardware, stored into U2 cache, and the restrainable error ASI_AFSR.CE_INCOMED is detected.
doubleword and its ECC in the read data and those in the source U2 cache line are changed to marked UE data. The restrainable error ASI_AFSR.UE_RAW_L2$INSD is detected.
Implementation Note – SPARC64 V detects ASI_AFSR.UE_RAW_L2$INSD only on writeback.
P.9.5 Automatic Way Reduction of I1 Cache, D1 Cache, and U2 Cache
When frequent errors occur in the I1, D1, or U2 cache, hardware automatically detects that condition and reduces the way, maintaining cache consistency.
2. Otherwise:
■ All entries in I1 cache way W are invalidated and the way W will never be refilled.
■ The restrainable error ASI_AFSR.DG_L1$U2$STLB is reported to software.
D1 Cache Way Reduction
When a way reduction condition is recognized for the D1 cache way W (W = 0 or 1), the following way reduction procedure is executed:
1. When only one way in D1 cache is active because of previous way reduction:
■ The CPU enters error_state.
2.
2. Otherwise:
■ All entries in available U2 cache ways, including way W, are invalidated to retain system consistency.
■ Way W becomes unavailable and is never refilled.
■ The restrainable error ASI_AFSR.DG_L1$U2$STLB is reported to software.
P.10 TLB Error Handling
This section describes how TLB entry errors and sTLB way reduction are handled.
P.10.1 Handling of TLB Entry Errors
Error protection and error detection in TLB entries are described in TABLE P-22.
When a parity error is detected in an ITLB entry when an LDXA instruction attempts to read ASI_ITLB_DATA_ACCESS or ASI_ITLB_TAG_ACCESS, hardware automatically demaps the entry and an instruction urgent error is indicated in ASI_UGESR.IUG_ITLB. Error in sTLB Entry Detected During Virtual Address Translation When a parity error is detected in the sTLB entry during a virtual address translation, hardware automatically demaps the entry and does not report the error to software.
sTLB Way Reduction When a way reduction condition is recognized for the sTLB way W (W = 0 or 1), hardware executes the following way reduction procedures: 1. When only one way in sTLB is active because of previous way reductions: ■ The previously reduced way is reactivated. 2. Regardless of how many ways were previously active, way reduction occurs: ■ ■ P.11 Hardware reduces the way and invalidates all entries in sTLB way W. Way W will never be refilled. The restrainable error ASI_AFSR.
■ Raw (unmarked) uncorrectable error (multibit error)
■ Marked uncorrectable error
Correctable Error on Extended UPA Data Bus
When the SPARC64 V processor detects a correctable error in the extended UPA incoming data, the processor corrects the data and uses it. The restrainable error ASI_AFSR.CE_INCOMED is indicated.
■ Incoming noncacheable data fetched by an instruction fetch. When a UE is detected in such data, an instruction_access_error with marked UE is detected at the time the fetched instruction is executed. ■ Incoming noncacheable data loaded by a load instruction. When the UE is detected in such data, a data_access_error with marked UE is detected at the time the load instruction is executed. ■ Incoming cacheable data fetched by an instruction fetch.
APPENDIX Q
Performance Instrumentation
This appendix describes and specifies performance monitors that have been implemented in the SPARC64 V processor. The appendix contains these sections: ■ ■ Q.
/* clear pics without altering sl/su values */
pic_init = 0x0;
pcr = rd_pcr();
pcr.ulro = 0x1;  /* don't change su/sl on write */
pcr.ovf = 0x0;   /* clear overflow bits also */
pcr.ut = 0x0;
pcr.st = 0x0;    /* disable counts for good measure */
for (i = 0; i <= pcr.nc; i++) {
    /* select the pic to be written */
    pcr.sc = i;
    wr_pcr(pcr);
    wr_pic(pic_init);  /* clear pic i */
}
Counter Event Selection and Start
Counter events are selected through PCR.SC and PCR.SU/PCR.SL fields.
for (i = 0; i <= pcr.nc; i++) {
    /* assume rest of pcr data has been preserved */
    pcr.sc = i;      /* select the pic pair to be read */
    wr_pcr(pcr);
    pic = rd_pic();
    picl[i] = pic.picl;
    picu[i] = pic.picu;
}
Q.2 Performance Monitor Description
The performance monitors can be divided into the following groups:
1. Instruction statistics
2. Trap statistics
3. MMU event counters
4. Cache event counters
5. UPA transaction event counters
6. Miscellaneous counters
Events in Group 1 are counted on commit of the instructions.
TABLE Q-1 Events and Encoding of Performance Monitor (Continued) Counter Encoding picu0 picl0 picu1 001101 Reserved 001110 Reserved 001111 Reserved 010000 Reserved 010001 Reserved 010010 Reserved 010011 Reserved 010100 Reserved 010101 Reserved 010110 trap_all 010111 Reserved 100000 Reserved 100001 Reserved 100010 Reserved 100011 Reserved 110000 sx_miss _wait_dm sx_miss_wait _pf 110001 sreq_bi _count sreq_cpi_count sreq_cpb _count 110010 Reserved 110011 Reserved 1
● Instruction Count (instruction_counts)
Counter Any Encoding 000001₂
Counts the number of committed instructions. For user or system mode counts, this counter is exact. Combined with cycle_counts, it provides instructions per cycle:
IPC = instruction_counts / cycle_counts
If instruction_counts and cycle_counts are both collected for user or system mode, IPC in user or system mode can be derived.
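As a small illustration of the IPC formula, the following C sketch divides two counter readings that are assumed to have been collected over the same interval and in the same mode. The function name and the guard against a zero cycle count are assumptions added for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Instructions per cycle from two counter readings taken over the same
 * interval. Returns 0.0 when no cycles were counted (guard is an assumption). */
static inline double ipc(uint64_t instruction_counts, uint64_t cycle_counts)
{
    if (cycle_counts == 0)
        return 0.0;
    return (double)instruction_counts / (double)cycle_counts;
}
```

For example, 300 committed instructions over 200 cycles give an IPC of 1.5.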
● Prefetch Instruction Count (prefetch_instructions)
Counter Any Encoding 001100₂
Counts the committed prefetch instructions.
Q.2.2 Trap-Related Statistics
● All Traps Count (trap_all)
Counter picu0 Encoding 010110₂
Counts all trap events. The value is equivalent to the sum of the type-specific trap counters.
● Interrupt Vector Trap Count (trap_int_vector)
Counter picl0 Encoding 010110₂
Counts the occurrences of interrupt_vector_trap.
● Software Instruction Trap (trap_trap_inst)
Counter picl2 Encoding 010110₂
Counts the occurrences of Tcc instructions.
● Instruction MMU Miss Trap (trap_IMMU_miss)
Counter picu3 Encoding 010110₂
Counts the occurrences of fast_instruction_access_MMU_miss.
● Data MMU Miss Trap (trap_DMMU_miss)
Counter picl3 Encoding 010110₂
Counts the occurrences of fast_data_access_MMU_miss.
Q.2.
Q.2.4 Cache Event Counters
● I1 Cache Miss Count (if_r_iu_req_mi_go)
Counter picu2 Encoding 100000₂
Counts the occurrences of I1 cache misses.
● D1 Cache Miss Count (op_r_iu_req_mi_go)
Counter picl2 Encoding 100000₂
Counts the occurrences of D1 cache misses.
● I1 Cache Miss Latency (if_wait_all)
Counter picu3 Encoding 100000₂
Counts the total latency of I1 cache misses.
● D1 Cache Miss Latency (op_wait_all)
Counter picl3 Encoding 100000₂
Counts the total latency of D1 cache misses.
● L2 Cache Miss Count by Demand Access (sx_miss_count_dm)
Counter picu1 Encoding 110000₂
Counts the occurrences of L2 cache miss by demand access.
● L2 Cache Miss Count by Prefetch (sx_miss_count_pf)
Counter picl1 Encoding 110000₂
Counts the occurrences of L2 cache miss by both software prefetch and hardware prefetch access.
● L2 Cache Reference by Demand Access (sx_read_count_dm)
Counter picu2 Encoding 110000₂
Counts L2 cache references by demand read access.
Q.2.5 UPA Event Counters
UPA event counters count the number of S_REQ_xxx requests received by a CPU in a given time.
● INV Receive Count (sreq_bi_count)
Counter picu0 Encoding 110001₂
Counts the number of S_INV_REQ packets received.
● CPI Receive Count (sreq_cpi_count)
Counter picl0 Encoding 110001₂
Counts the number of S_CPI_REQ packets received.
● CPB Receive Count (sreq_cpb_count)
Counter picu1 Encoding 110001₂
Counts the number of S_CPB_REQ packets received.
Q.2.6 Miscellaneous Counters
● Barrier-Assist ASI Read Count (asi_rd_bar)
Counter picu3 Encoding 110001₂
Counts the number of read accesses to the barrier-assist ASI registers.
● Barrier-Assist ASI Write Count (asi_wr_bar)
Counter picl3 Encoding 110001₂
Counts the number of write accesses to the barrier-assist ASI registers.
APPENDIX R
UPA Programmer's Model
This appendix describes the programmer's model of the UPA interface of the SPARC64 V. The registers for the UPA interface and the access method for those registers are described. The appendix contains the following sections:
■ Mapping of the CPU's UPA Port Slave Area
■ UPA PortID Register
■ UPA Config Register
R.1 Mapping of the CPU's UPA Port Slave Area
TABLE R-1 shows the mapping of the CPU's UPA port slave area.
R.2 UPA PortID Register
The UPA PortID Register is a standard read-only register that is accessible by a slave read from another UPA port. This register is located at word address 00₁₆ in the slave physical address of the UPA port. This register cannot be read or written by ASI instructions. The UPA PortID Register is illustrated below and described in TABLE R-2.
TABLE R-2 UPA PortID Register Fields (Continued)
Bit Field Description
20:16 UPACAP UPACAP<4:0>. Indicates the UPA module capability type, as follows:
UPACAP<4> Set; CPU is an interrupt handler.
UPACAP<3> Set; CPU is an interrupter.
UPACAP<2> Clear; CPU does not use UPA Slave_Int_L signal.
UPACAP<1> Set; CPU is a cache master.
UPACAP<0> Set; CPU has a master interface.
R.3 UPA Config Register
The UPA Config Register is an implementation-specific ASI read-only register.
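For the UPACAP field of the UPA PortID Register (TABLE R-2), the bit assignments can be expressed as C masks. This is a minimal sketch: the macro and function names are hypothetical, and the PortID value is assumed to have been obtained already by a slave read from another UPA port, since the register is not reachable through ASI instructions.

```c
#include <assert.h>
#include <stdint.h>

/* UPACAP occupies bits 20:16 of the UPA PortID Register (TABLE R-2). */
#define UPACAP_SHIFT 16
#define UPACAP_MASK  0x1Fu

#define UPACAP_INT_HANDLER  (1u << 4) /* CPU is an interrupt handler */
#define UPACAP_INTERRUPTER  (1u << 3) /* CPU is an interrupter */
#define UPACAP_SLAVE_INT_L  (1u << 2) /* CPU uses the UPA Slave_Int_L signal */
#define UPACAP_CACHE_MASTER (1u << 1) /* CPU is a cache master */
#define UPACAP_MASTER_IF    (1u << 0) /* CPU has a master interface */

/* Value implied by TABLE R-2: bits 4, 3, 1, and 0 set, bit 2 clear. */
#define UPACAP_SPARC64V_EXPECTED \
    (UPACAP_INT_HANDLER | UPACAP_INTERRUPTER | UPACAP_CACHE_MASTER | UPACAP_MASTER_IF)

/* Extract UPACAP<4:0> from a PortID value that has already been read. */
static inline unsigned upa_portid_upacap(uint64_t portid)
{
    return (unsigned)((portid >> UPACAP_SHIFT) & UPACAP_MASK);
}
```

Comparing the extracted field against UPACAP_SPARC64V_EXPECTED gives a quick consistency check on a value captured from the bus.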
TABLE R-3 UPA Config Register Description (Continued)
Bits Field Description
58:57 WRI_S Specify the size of maximum outstanding WRI packet as follows. 00₂: 1; 01₂: 2; 10₂: 4; 11₂: 8
56:55 INT_S Specify the size 00₂: 01₂: 10₂ – 11₂:
54:46 — Reserved. Read as 0.
45:43 UC_S U2 cache size: 010₂:
42:41 — Reserved. Read as 0.
40:39 AM Address Mode. 00₂: 01₂: 10₂ – 11₂:
38:35 MCAP The value set by OPSR is indicated. Consult the system document for the meaning and encoding of this field.
TABLE R-3 UPA Config Register Description (Continued) Bits Field Description 29:23 PCON Processor Configuration. Separated into PCON<6:4> and PCON<3:0>. PCON<6:4> (UPA_CONFIG<29:27>) represents the size of class 1 request queue in the System Controller (SC).
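The split of PCON into its two subfields can be sketched in C as follows. The function names are hypothetical, the UPA_CONFIG value is assumed to have been read already, and the position of PCON<3:0> at UPA_CONFIG<26:23> is derived from the 29:23 field range rather than quoted from the table. The functions return raw field encodings only; mapping an encoding to an actual queue size is not attempted here.

```c
#include <assert.h>
#include <stdint.h>

/* PCON occupies UPA_CONFIG<29:23> (7 bits). */
static inline unsigned upa_config_pcon(uint64_t upa_config)
{
    return (unsigned)((upa_config >> 23) & 0x7Fu);
}

/* PCON<6:4> = UPA_CONFIG<29:27>: class 1 request queue size field (raw encoding). */
static inline unsigned pcon_class1_field(unsigned pcon)
{
    return (pcon >> 4) & 0x7u;
}

/* PCON<3:0> = UPA_CONFIG<26:23> (derived position): raw encoding, not interpreted. */
static inline unsigned pcon_low_field(unsigned pcon)
{
    return pcon & 0xFu;
}
```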
APPENDIX S
Summary of Differences between SPARC64 V and UltraSPARC-III
The following table summarizes differences between SPARC64 V and UltraSPARC-III ISAs. This list is a summary, not an exhaustive list.
TABLE T-1 SPARC64 V and UltraSPARC-III Differences (1 of 3)
Feature: MMU architecture
SPARC64 V: SPARC64 V supports an UltraSPARC II-based MMU model. TLBs are split between instruction and data. Each side has a 2-level TLB hierarchy. (SPARC64 V page 85)
TABLE T-1 SPARC64 V and UltraSPARC-III Differences (2 of 3)
Feature: Floating-point subnormal handling
SPARC64 V: In general, SPARC64 V does not handle most subnormal operands and results in hardware. However, its handling differs from that of UltraSPARC-III. (SPARC64 V page 65)
UltraSPARC-III: In general, UltraSPARC-III does not handle most subnormal operands and results in hardware. However, its handling differs from that of SPARC64 V. (UltraSPARC-III section B.6.1)
TABLE T-1 SPARC64 V and UltraSPARC-III Differences (3 of 3)
Feature: Error status
SPARC64 V: ASI 4C₁₆/08₁₆ (ASI_UGESR): SPARC64 V implements an error status register to indicate where an error was detected. (SPARC64 V page 165)
UltraSPARC-III: Not implemented. (UltraSPARC-III section: —)
Feature: Error Control Register
SPARC64 V: ASI 4C₁₆/10₁₆ (ASI_ECR): SPARC64 V implements a control register to signal/suppress a trap when an error was detected. (SPARC64 V page 161)
UltraSPARC-III: Not implemented.
Bibliography
General References
Please refer to Bibliography in Commonality.
ASI_C_BSTW0123 ASI_C_BSTW1123 ASI_C_LBSTWBUSY123 ASI_C_LBSYR0122 ASI_C_LBSYR1122 ASI_DCU_CONTROL_REGISTER118 ASI_DCUCR118 ASI_DMMU_SFAR153 ASI_DMMU_SFSR153 ASI_DMMU_TAG_ACCESS166 ASI_DMMU_TAG_TARGET166 ASI_DMMU_TSB_64KB_PTR166 ASI_DMMU_TSB_8KB_PTR166 ASI_DMMU_TSB_BASE166 ASI_DMMU_TSB_DIRECT_PTR166 ASI_DMMU_TSB_NEXT166 ASI_DMMU_TSB_PEXT166 ASI_DMMU_TSB_PTR184 ASI_DMMU_TSB_SEXT166 ASI_DTLB_DATA_ACCESS195 ASI_DTLB_TAG_ACCESS195 ASI_ECR161 UGE_HANDLER155 ASI_EIDR153, 161, 166, 187, 191, 221 ASI_ERROR_CONTROL153
ASI_INTR_W133, 134 ASI_ITLB_DATA_ACCESS196 ASI_ITLB_TAG_ACCESS196 ASI_L2_CTRL130 ASI_L2_DIAG_TAG131 ASI_L2_DIAG_TAG_READ_REG131 ASI_L3_DIAG_DATA0_REG118 ASI_L3_DIAG_DATA1_REG118 ASI_LBSYR0124 ASI_LBSYR1124 ASI_MCNTL92 JPS1_TSBP88 ASI_MEMORY_CONTROL_REG118 ASI_NUCLEUS57, 98, 101 ASI_NUCLEUS_LITTLE57, 101 ASI_PA_WATCH_POINT166 ASI_PARALLEL_BARRIER166 ASI_PHYS_BYPASS_EC_WITH_E_BIT127 ASI_PHYS_BYPASS_EC_WITH_E_BIT_LITTLE127 ASI_PHYS_BYPASS_WITH_EBIT26 ASI_PRIMARY57, 98, 101 ASI_PRIMARY_AS_IF_USER57 ASI_PRIMARY_
compare and swap37 B barrier assist121 ASI read/write accesses, counting211 parallel187, 188 block block store with commit120 load instructions120, 220 store instructions120, 220 blocked instructions10 branch history buffer2 branch instructions24 BSTW busy status register123 BSTW control register123 bus-busy cycle count210 bypass attribute bits104 C cache coherence128, 140 data cache tag error handling188–189 characteristics127 data error detection190 description7 flushing220 modification125 protection190
level-2 characteristics125 control register130 tag read130 unified127 use2 snooping140 synchronizing42 unified characteristics127 description8 CALL instruction24, 29, 30, 53 CANRESTORE register166 CANSAVE register166 CASA instruction37, 102 CASXA instruction37, 102 catastrophic_error exception37 CE correction157 counting in D1 cache data193 in D1 cache data190 detection175, 197 effect on CPU152 permanent180 in U2 cache tag189 CLEANWIN register75, 166 CLEAR_SOFTINT register183 cmask field56 committed, defini
error detection mask154 reporting151 data cacheable doubleword error marking158 error marking157 error protection158 corruption167 prefetch25 data_access_error exception55, 90, 101, 103, 130, 152, 199 data_access_exception exception54, 90, 102, 103, 120, 129 data_access_MMU_miss exception46 data_access_protection exception46, 55 data_breakpoint exception72 DCR differences from UltraSPARC III221 error handling183 nonprivileged access22 DCU_CONTROL register186 DCUCR access data format23 CP (cacheability) fiel
dispatch (instruction)9 disrupting traps17, 37 distribution nonspeculative10 speculative11 DMMU access bypassing104 disabled91 internal register (ASI_MCNTL)92 registers accessed92 Synchronous Fault Status Register97 Tag Access Register90 DMMU_DEMAP register187 DMMU_PA_WATCHPOINT register187 DMMU_SFAR register186 DMMU_SFSR register186 DMMU_TAG_ACCESS register187 DMMU_TAG_TARGET register186 DMMU_TSB_64KB_PTR register187 DMMU_TSB_8KB_PTR register187 DMMU_TSB_BASE register186 DMMU_TSB_DIRECT_PTR register187 DMM
ECC_error exception46, 153, 155, 180 ee_opsr164 ee_second_watch_dog_timeout164 ee_sir_in_maxtl164 ee_trap_addr_uncorrected_error164 ee_trap_in_maxtl164 ee_watch_dog_timeout_in_maxtl164 error asynchronous17 categories149 classification3 correctable152, 189 correction, for single-bit errors3 D1 cache data190 error_state transition164 fatal149 handling ASI errors186 ASR errors182 most registers181 isolation3 marking differences between SPARC64 IV and SPARC64 V160 restrainable152 source identification159 transi
privileged_action79 statistics monitoring206–207 unfinished_FPop62, 65 execute_state140 executed, definition9 execution EU (execution unit)6 out-of-order25 speculative25 externally_initiated_reset (XIR)138 F fast_data_access_MMU_miss exception90 fast_data_access_protection exception90, 102 fast_data_instruction_access_MMU_miss exception207 fast_instruction_access_MMU_miss exception46, 89, 99, 100, 207 fatal error behavior of CPU150 cache tag189 definition149 detection163 types164 U2 cache tag189 fDTLB77,
FMADDs instruction50 FMSUB instruction30, 45 FMSUBd instruction50 FMSUBs instruction50 FNMADD instruction45 FNMADDd instruction50 FNMADDs instruction50 FNMSUB instruction45 FNMSUBd instruction50 FNMSUBs instruction50 formats, instruction28 fp_disabled exception30, 48, 53, 57, 74 fp_exception_ieee_754 exception53, 65 fp_exception_other exception46, 62, 79 FQ17, 24 FSR aexc field19 cexc field18, 19 conformance19 NS field62 TEM field19 VER field18 fTLB78, 87, 94 G GSR register183 H high-speed synchronization
IMMU internal register (ASI_MCNTL)92 registers accessed92 Synchronous Fault Status Register97 IMMU_DEMAP register186 IMMU_SFSR register186 IMMU_TAG_ACCESS register186 IMMU_TAG_TARGET register186 IMMU_TSB_64KB_PTR register186 IMMU_TSB_8KB_PTR register186 IMMU_TSB_BASE register186 IMMU_TSB_NEXT register186 IMMU_TSB_PEXT register186 IMPDEP1 instruction30, 49 IMPDEP2 instruction30, 49, 53, 74, 83 IMPDEP2B instruction28, 50 IMPDEPn instructions49, 50 impl field of VER register18 implementation number (impl) fiel
implementation-dependent (IMPDEP2)30 implementation-dependent (IMPDEPn)49, 50 initiated, definition9 issued, definition9 LDDFA80 prefetch91 reserved fields45 stall10 statistics counters204 timing46 integer unit (IU) deferred-trap queue11, 17, 24, 71 internal ASI, reference to103 interrupt causing trap17 dispatch133 level 1522 Interrupt Vector Dispatch Register136 Interrupt Vector Receive Register136 interrupt_level_n exception206 interrupt_vector_trap exception38, 206 INTR_DATA0:7_R register, error handling
JMPL instruction29, 53 JPS1_TSBP mode93 JTAG command91, 164, 189 L LBSY control register122 LDD instruction37 LDDA instruction37, 54, 102, 103 LDDF_mem_address_not_aligned exception80, 120 LDDFA instruction80, 120 LDQF_mem_address_not_aligned exception46 LDSTUB instruction37, 102 LDSTUBA instruction102 LDXA instruction178, 185, 195 load quadword atomic54 LoadLoad MEMBAR relationship56 load-store instructions compare and swap37 D1 cache data errors191 memory model47 LoadStore MEMBAR relationship56 Lookasi
store order (STO)75 TSO41, 42 MEMORY_CONTROL register186 mmask field56 MMU disabled91 event counting207 exceptions recorded89 Memory Control Register92 physical address width86 registers accessed92 TLB data access address assignment94 TLB organization85 MOESI cache-coherence protocol128 Multiply Add/Subtract instructions53 N noncacheable access54, 126 nonleaf routine53 nonspeculative distribution10 nonstandard floating-point (NS) field of FSR register18, 71 nonstandard floating-point mode18, 62 O OBP faci
partial ordering, specification56 partial store instruction UPA transaction57 watchpoint exceptions57 partial store instructions120 partial store order (PSO) memory model41 PC register169 PCR accessibility20 counter events, selection202 error handling183 NC field21 OVF field21 OVRO field21 PRIV field20, 58, 59 SC field21, 202 SL field202 ST field204 SU field202 UT field204 performance monitor events/encoding203 groups203 pessimistic overflow65 pessimistic zero64 PIC register clearing201 counter overflow22 e
PRIMARY_CONTEXT register186 privileged registers19 privileged_action exception20, 79, 90, 103, 117 PCR access58, 59 privileged_opcode exception22 processor states after reset141 error_state36, 72, 140 execute_state140 RED_state36, 140 program counter (PC) register75 program order26 PSTATE register AM field29, 49, 53, 75 IE field134, 135 MM field42 PRIV field20, 58, 59 RED field20, 126, 140, 141 PTE E field26 Q quadword-load ASI54 queues11 R RDPCR instruction20, 58 RDTICK instruction19 reclaimed status10 R
clock-tick (TICK)73 current window pointer (CWP)75 Data Cache Unit Control (DCUCR)23 LBSY control122 other windows (OTHERWIN)75 privileged19 renaming10 restorable windows (CANRESTORE)75 savable windows (CANSAVE)75 relaxed memory order (RMO) memory model41 reservation station11 reserved fields in instructions45 reset externally_initiated_reset (XIR)138 power_on_reset (POR)72 software_initiated_reset (SIR)138 WDR146 resets POR155, 161, 163, 174 WDR155, 163 restorable windows (CANRESTORE) register75 restrainab
scan definition11 ring11 sDTLB77, 85, 90 SECONDARY_CONTEXT register186 SERIAL_ID register186 SET_SOFTINT register183 SHUTDOWN instruction58 SIR instruction138 sITLB77, 85, 90 size field of instructions28 SOFTINT register38, 135, 166, 183 speculative distribution11 execution25 spill_n_normal exception206 spill_n_other exception206 stall (instruction)10 STBAR instruction59 STCHG_ERROR_INFO register186 STD instruction37 STDA instruction37 STDFA instruction120 STICK register166, 183 STICK_COMP register166 STICK
T Tag Access Register96 Tcc instruction, counting207 TICK register19, 73 TICK_COMPARE register183 TL register138, 140 TLB CP field126 data characteristics77 in TLB organization85 data access address95 Data Access/Data In Register96 index95 instruction characteristics77 in TLB organization85 main10, 36 multiple hit detection86 replacement algorithm93 TNP register166 total store order (TSO) memory model41, 42 TPC register166 transition error150 traps deferred37 disrupting17, 37 precise17 TSB Base Register97 E
way reduction194 uDTLB10, 85, 90 UE_RAW_D1$INSD error191 UE_RAW_L2$FILL error192 uITLB10, 85, 90 uncorrectable error152, 167 unfinished_FPop exception62, 65 unimplemented_FPop floating-point trap type70 unimplemented_LDD exception46 unimplemented_STD exception46 UPA bus error176 Config Register215 port slave area213 PortID register214 UPA_CONFIGUATION register error handling186 UPA_XIR_L pin138 urgent error definition150 types A_UGE150 DAE150 IAE150 instruction-obstructing150 URGENT_ERROR_STATUS register186