Processor having an arithmetic extension of an instruction set architecture

US 6,714,197 B1
Filed: 07/30/1999
Issued: 03/30/2004
Est. Priority Date: 07/30/1999
Status: Expired due to Term

Abstract

A processor having an arithmetic extension of an instruction set architecture which incorporates a set of high performance floating point operations. The instruction set architecture incorporates a variety of data formats including single precision and double precision data formats, as well as the paired-single data format that allows two simultaneous operations on a pair of operands. The extension includes instructions directed to reduction add, reduction multiply, reciprocal, and reciprocal square root.

61 Claims

1. In a processor, a method for performing computer graphics calculations, said method comprising:
- representing a vertex in a computer graphics image with a plurality of world coordinates;
  
  transforming said plurality of world coordinates into a plurality of transformed world coordinates using a floating point reduction add instruction; and
  
  performing perspective division on said plurality of transformed world coordinates using a floating point reciprocal instruction, wherein said floating point reciprocal instruction processes a plurality of operands to generate a plurality of reduced precision reciprocal values.
2. The method for performing computer graphics calculations as recited in claim 1 wherein said floating point reduction add instruction comprises:
3. The method for performing computer graphics calculations as recited in claim 2 wherein said floating point reduction add instruction is an ADDR instruction.
4. The method for performing computer graphics calculations as recited in claim 1 wherein said floating point reciprocal instruction comprises:
- calculating said plurality of reduced precision reciprocal values with values from a plurality of lookup tables configured in parallel.
5. The method for performing computer graphics calculations as recited in claim 4 wherein said floating point reciprocal instruction is a RECIP1 instruction.
6. The method for performing computer graphics calculations as recited in claim 4 further comprising:
- calculating a plurality of full precision reciprocal values using said plurality of reduced precision reciprocal values.
7. The method for performing computer graphics calculations as recited in claim 6 wherein said plurality of full precision reciprocal values is determined using a Newton-Raphson approximation.
8. The method for performing computer graphics calculations as recited in claim 1 wherein said plurality of world coordinates and said plurality of transformed world coordinates are in a paired-single data format.
9. The method for performing computer graphics calculations as recited in claim 1 further comprising a floating point reduction multiply instruction comprising:
- reading a first value from a first position of a first data set and a second value from a second position of said first data set;
  
  multiplying said first value and said second value, and placing the result in a first position of a destination data set;
  
  reading a third value from a first position of a second data set and a fourth value from a second position of said second data set; and
  
  multiplying said third value and said fourth value, and placing the result in a second position of said destination data set.
10. The method for performing computer graphics calculations as recited in claim 9 wherein said floating point reduction multiply instruction is a MULR instruction.
11. The method for performing computer graphics calculations as recited in claim 1 wherein said floating point reciprocal instruction comprises:
- calculating said plurality of reduced precision reciprocal values with values from at least one lookup table.
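
The method of claims 1 to 11 can be illustrated with a minimal software sketch. It is not the patented hardware. The function names (`mul_pairs`, `add_pairs`, `addr`, `recip1`, `refine`, `transform`, `perspective_divide`), the 8-bit seed width, and the use of a 4x4 matrix with a homogeneous `w` coordinate are assumptions made for illustration. The reduction add follows the read, add, and place steps recited for that instruction in claim 14, and the refinement uses the standard Newton-Raphson reciprocal step `r * (2 - x * r)`.

```python
import math

def mul_pairs(a, b):
    """Ordinary paired multiply: the two lanes are multiplied independently."""
    return (a[0] * b[0], a[1] * b[1])

def add_pairs(a, b):
    """Ordinary paired add: the two lanes are added independently."""
    return (a[0] + b[0], a[1] + b[1])

def addr(first, second):
    """Reduction add: sum within each source pair, one sum per destination lane."""
    return (first[0] + first[1], second[0] + second[1])

def transform(matrix, v):
    """Multiply a 4x4 matrix (four rows of four values) by a 4-element vertex."""
    partial = []
    for row in matrix:
        lo = mul_pairs(row[0:2], v[0:2])
        hi = mul_pairs(row[2:4], v[2:4])
        partial.append(add_pairs(lo, hi))  # (r0*v0 + r2*v2, r1*v1 + r3*v3)
    xy = addr(partial[0], partial[1])      # finishes two dot products at once
    zw = addr(partial[2], partial[3])
    return xy + zw                         # tuple concatenation: (x, y, z, w)

def recip1(pair, bits=8):
    """Reduced precision reciprocal of both lanes (assumed 8-bit seed, nonzero inputs)."""
    out = []
    for x in pair:
        m, e = math.frexp(x)               # x = m * 2**e with 0.5 <= |m| < 1
        q = round((1.0 / m) * (1 << bits)) / (1 << bits)
        out.append(math.ldexp(q, -e))
    return tuple(out)

def refine(pair, seed):
    """One Newton-Raphson step per lane toward a fuller precision reciprocal."""
    return tuple(r * (2.0 - x * r) for x, r in zip(pair, seed))

def perspective_divide(va, vb):
    """Divide x, y, z of two transformed vertices by their w values in one pass."""
    w = (va[3], vb[3])
    r = refine(w, recip1(w))
    return (tuple(c * r[0] for c in va[:3]), tuple(c * r[1] for c in vb[:3]))
```

Each Newton-Raphson step roughly doubles the number of correct bits, so a sketch like this would repeat `refine` until the precision needed by the caller is reached.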

12. In a processor, a method for performing computer graphics calculations, said method comprising:
- representing a vertex in a computer graphics image with a plurality of surface normal coordinates;
  
  transforming said plurality of surface normal coordinates into a plurality of transformed surface normal coordinates using a floating point reduction add instruction; and
  
  normalizing said plurality of transformed surface normal coordinates using a floating point reciprocal square root instruction, wherein said reciprocal square root instruction processes a plurality of operands to generate a plurality of reduced precision reciprocal square root values.
13. The method for performing computer graphics calculations as recited in claim 12 further comprising:
14. The method for performing computer graphics calculations as recited in claim 12 wherein said floating point reduction add instruction comprises:
- reading a first value from a first position of a first data set and a second value from a second position of said first data set;
  
  adding said first value and said second value, and placing the result in a first position of a destination data set;
  
  reading a third value from a first position of a second data set and a fourth value from a second position of said second data set; and
  
  adding said third value and said fourth value, and placing the result in a second position of said destination data set.
15. The method for performing computer graphics calculations as recited in claim 14 wherein said floating point reduction add instruction is an ADDR instruction.
16. The method for performing computer graphics calculations as recited in claim 12 wherein said floating point reciprocal square root instruction comprises:
- calculating said plurality of reduced precision reciprocal square root values with values from a plurality of lookup tables configured in parallel.
17. The method for performing computer graphics calculations as recited in claim 16 wherein said floating point reciprocal square root instruction is a RSQRT1 instruction.
18. The method for performing computer graphics calculations as recited in claim 16 further comprising:
- calculating a plurality of full precision reciprocal square root values using said plurality of reduced precision reciprocal square root values.
19. The method for performing computer graphics calculations as recited in claim 18 wherein said plurality of full precision reciprocal square root values is determined using a Newton-Raphson approximation.
20. The method for performing computer graphics calculations as recited in claim 12 wherein said plurality of surface normal coordinates and said plurality of transformed surface normal coordinates are in a paired-single data format.
21. The method for performing computer graphics calculations as recited in claim 12 further comprising a floating point reduction multiply instruction comprising:
- reading a first value from a first position of a first data set and a second value from a second position of said first data set;
  
  multiplying said first value and said second value, and placing the result in a first position of a destination data set;
  
  reading a third value from a first position of a second data set and a fourth value from a second position of said second data set; and
  
  multiplying said third value and said fourth value, and placing the result in a second position of a destination data set.
22. The method for performing computer graphics calculations as recited in claim 21 wherein said floating point reduction multiply instruction is a MULR instruction.
23. The method for performing computer graphics calculations as recited in claim 12 further comprising:
- computing a dot product between said plurality of transformed surface normal coordinates and a vector using said floating point reduction add instruction.
24. The method for performing computer graphics calculations as recited in claim 12 wherein said floating point reciprocal square root instruction comprises:
- calculating said plurality of reduced precision reciprocal square root values with values from at least one lookup table.
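
The normalization and dot product steps of claims 12 to 24 can be sketched in the same hedged way. The names `addr`, `dot3`, `rsqrt1`, `refine_rsqrt`, and `normalize`, the 8-bit seed width, and the zero padding of the unused fourth lane are illustrative assumptions, and the reciprocal square root is shown one lane at a time for brevity even though the claimed instruction processes a plurality of operands. The refinement is the standard Newton-Raphson step for a reciprocal square root, `r * (3 - x * r * r) / 2`.

```python
import math

def addr(first, second):
    """Reduction add: sum within each source pair, one sum per destination lane."""
    return (first[0] + first[1], second[0] + second[1])

def dot3(a, b):
    """3-element dot product: two paired multiplies, then a reduction add."""
    lo = (a[0] * b[0], a[1] * b[1])
    hi = (a[2] * b[2], 0.0)              # unused fourth lane padded with zero
    sums = addr(lo, hi)                  # (a0*b0 + a1*b1, a2*b2)
    return sums[0] + sums[1]

def rsqrt1(x, bits=8):
    """Reduced precision 1/sqrt(x) for x > 0 (assumed 8-bit seed)."""
    m, e = math.frexp(x)                 # x = m * 2**e with 0.5 <= m < 1
    if e % 2:                            # make the exponent even so it halves exactly
        m, e = m * 2.0, e - 1
    q = round((1.0 / math.sqrt(m)) * (1 << bits)) / (1 << bits)
    return math.ldexp(q, -(e // 2))

def refine_rsqrt(x, r):
    """One Newton-Raphson step toward a fuller precision reciprocal square root."""
    return r * (3.0 - x * r * r) * 0.5

def normalize(n):
    """Scale a 3-element normal to unit length without a divide or a square root."""
    len2 = dot3(n, n)
    r = refine_rsqrt(len2, rsqrt1(len2))
    return tuple(c * r for c in n)
```

A zero-length normal is not handled here. A real pipeline would need to decide what to return in that case.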

25. A processor for computer graphics calculations, said processor comprising:
- a bus;
  
  an instruction dispatch unit coupled to said bus, said instruction dispatch unit for dispatching instructions to a floating point unit; and
  
  said floating point unit coupled to said bus, said floating point unit for executing said instructions to implement a method for performing computer graphics calculations, said method comprising;
  
  representing a vertex in a computer graphics image with a plurality of world coordinates;
  
  transforming said plurality of world coordinates into a plurality of transformed world coordinates using a floating point reduction add instruction; and
  
  performing perspective division on said plurality of transformed world coordinates using a floating point reciprocal instruction, wherein said floating point reciprocal instruction processes a plurality of operands to generate a plurality of reduced precision reciprocal values.
26. The processor of claim 25 wherein said method for performing computer graphics calculations comprises:
27. The processor of claim 26 wherein said method for performing computer graphics calculations further comprises:
- calculating a plurality of full precision reciprocal values using said plurality of reduced precision reciprocal values.
28. The processor of claim 27 wherein said plurality of full precision reciprocal values is determined using a Newton-Raphson approximation.
29. The processor of claim 26 wherein said floating point reciprocal instruction is a RECIP1 instruction.
30. The processor of claim 25 wherein said floating point reduction add instruction is an ADDR instruction.
31. The processor of claim 25 wherein said method for performing computer graphics calculations comprises:
- calculating said plurality of reduced precision reciprocal values with values from at least one lookup table.
32. The processor of claim 28 wherein said plurality of world coordinates and said plurality of transformed world coordinates are in a paired-single data format.
33. The processor of claim 25 wherein said method for performing computer graphics calculations further comprises a floating point reduction multiply instruction, wherein said multiply instruction is a MULR instruction.

34. A processor for computer graphics calculations, said processor comprising:
- a bus;
  
  an instruction dispatch unit coupled to said bus, said instruction dispatch unit for dispatching instructions to a floating point unit; and
  
  said floating point unit coupled to said bus, said floating point unit for executing said instructions to implement a method for performing computer graphics calculations, said method comprising;
  
  representing a vertex in a computer graphics image with a plurality of surface normal coordinates;
  
  transforming said plurality of surface normal coordinates into a plurality of transformed surface normal coordinates using a floating point reduction add instruction;
  
  normalizing said plurality of transformed surface normal coordinates using a floating point reciprocal square root instruction, wherein said reciprocal square root instruction processes a plurality of operands to generate a plurality of reduced precision reciprocal square root values;
  
  computing a dot product between said plurality of transformed surface normal coordinates and a vector using said floating point reduction add instruction;
  
  computing a halfway vector in a lighting calculation; and
  
  normalizing said halfway vector using said floating point reciprocal square root instruction.
35. The processor of claim 34, wherein said method for performing computer graphics calculations comprises:
36. The processor of claim 35 wherein said method for performing computer graphics calculations further comprises:
- calculating a plurality of full precision reciprocal square root values using said plurality of reduced precision reciprocal square root values.
37. The processor of claim 36 wherein said plurality of full precision reciprocal square root values is determined using a Newton-Raphson approximation.
38. The processor of claim 35 wherein said floating point reciprocal square root instruction is a RSQRT1 instruction.
39. The processor of claim 34 wherein said floating point reduction add instruction is an ADDR instruction.
40. The processor of claim 34 wherein said method for performing computer graphics calculations comprises:
- calculating said plurality of reduced precision reciprocal square root values with values from at least one lookup table.
41. The processor of claim 34 wherein said plurality of surface normal coordinates and said plurality of transformed surface normal coordinates are in a paired-single data format.
42. The processor of claim 34 wherein said method for performing computer graphics calculations further comprises a floating point reduction multiply instruction, wherein said multiply instruction is a MULR instruction.
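
Claim 34 adds a halfway vector to the lighting calculation. The claims do not give a formula, so the following sketch assumes the common Blinn-style definition, in which the halfway vector is the normalized sum of the light direction and the view direction. The helper names and the `shininess` parameter are hypothetical, and `rsqrt` is a plain software stand-in for the reciprocal square root instruction followed by refinement.

```python
import math

def rsqrt(x):
    """Stand-in for a refined reciprocal square root result (x must be > 0)."""
    return 1.0 / math.sqrt(x)

def normalize(v):
    """Scale a 3-element vector to unit length using a reciprocal square root."""
    r = rsqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2])
    return (v[0] * r, v[1] * r, v[2] * r)

def halfway(light_dir, view_dir):
    """Assumed Blinn-style halfway vector: normalize(light_dir + view_dir)."""
    summed = tuple(l + v for l, v in zip(light_dir, view_dir))
    return normalize(summed)

def specular_term(normal, light_dir, view_dir, shininess):
    """Dot the unit normal with the halfway vector and raise it to a power."""
    h = halfway(light_dir, view_dir)
    n_dot_h = sum(n * c for n, c in zip(normal, h))
    return max(0.0, n_dot_h) ** shininess
```

The sketch does not guard against the degenerate case where the light and view directions are exactly opposite, which would make the summed vector zero.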

43. In a processor including a memory and an execution unit (“EU”), a method for determining a plurality of reduced precision values from a plurality of operands comprising;
- storing a first instruction in said memory, said first instruction being formatted to operate on said plurality of operands in parallel;
  
  dispatching said first instruction for execution by said EU; and
  
  executing said first instruction in said EU, wherein said executing includes;
  
  accessing in parallel a plurality of lookup tables in said EU to obtain a plurality of first intermediate results, wherein each lookup table is accessed with a first portion of a corresponding operand;
  
  modifying in parallel a second portion of each of said plurality of operands to obtain a plurality of second intermediate results; and
  
  arithmetically combining in parallel said plurality of first intermediate results with said plurality of second intermediate results to obtain said plurality of reduced precision values.
44. The method of claim 43 wherein modifying includes complementing part of said second portion.
45. The method of claim 43 wherein said reduced precision values are reciprocal values.
46. The method of claim 43 wherein said reduced precision values are reciprocal square root values.
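
One plausible software reading of the table-driven scheme in claims 43 to 46 is sketched below for the reciprocal case. It is purely illustrative and makes no claim to match the patented circuit. The table size (`INDEX_BITS`), the width of the low part (`LOW_BITS`), the table contents, and all function names are assumptions. The high mantissa bits index a table, the low mantissa bits are complemented, and the two intermediate results are combined arithmetically.

```python
import math

INDEX_BITS = 4   # hypothetical: 16 table entries, indexed by the high mantissa bits
LOW_BITS = 4     # hypothetical: width of the low mantissa part that gets complemented

def _build_table():
    """For each interval of the mantissa, store 1/x at its right edge and a step size."""
    n = 1 << INDEX_BITS
    table = []
    for i in range(n):
        left = 1.0 / (1.0 + i / n)
        right = 1.0 / (1.0 + (i + 1) / n)
        table.append((right, (left - right) / (1 << LOW_BITS)))
    return table

TABLE = _build_table()

def recip_estimate(x):
    """Reduced precision 1/x for x > 0 from a table lookup plus a small correction."""
    m, e = math.frexp(x)                       # x = m * 2**e with 0.5 <= m < 1
    frac = m * 2.0 - 1.0                       # x = (1 + frac) * 2**(e - 1)
    bits = int(frac * (1 << (INDEX_BITS + LOW_BITS)))
    hi = bits >> LOW_BITS                      # first portion: table index
    lo = bits & ((1 << LOW_BITS) - 1)          # second portion: low mantissa bits
    lo_comp = ((1 << LOW_BITS) - 1) - lo       # complement of the low bits
    base, step = TABLE[hi]                     # first intermediate result
    mantissa = base + lo_comp * step           # combine with second intermediate result
    return math.ldexp(mantissa, 1 - e)

def recip_estimate_pair(pair):
    """Apply the estimate to both lanes; hardware would do this in parallel."""
    return (recip_estimate(pair[0]), recip_estimate(pair[1]))
```

Because 1/x decreases as x grows, complementing the low bits lets the correction be added to the right-edge table value rather than subtracted from the left-edge one. With these assumed widths the estimate should land within about one percent of the true reciprocal.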

47. A system comprising:
- a processor;
  
  a memory, coupled to said processor, for storing a first plurality of instructions that enables said processor to perform select operations, wherein said first plurality of instructions includes;
  
  a first instruction that enables said processor to combine a first plurality of operands in accordance with a first method, said first method comprising;
  
  reading a first operand from a first position of a first data set and a second operand from a second position of said first data set;
  
  combining said first operand and said second operand of said first data set, and placing the result in a first position of a first destination data set;
  
  reading a third operand from a first position of a second data set and a fourth operand from a second position of said second data set; and
  
  combining said third operand and said fourth operand of said second data set, and placing the result in a second position of said first destination data set; and
  
  a second instruction that enables said processor to determine a plurality of reduced precision values from a second plurality of operands.
48. The system of claim 47 wherein said combining is an add function.
49. The system of claim 47 wherein said combining is a multiply function.
50. The system of claim 47 wherein said combining is a subtract function.
51. The system of claim 47 wherein said combining is a divide function.
52. The system of claim 47 wherein said second instruction enables said processor to combine said second plurality of operands in accordance with a second method, said second method comprising:
53. The system of claim 52 wherein said combining of said first and second data sets is a function being selected from the group consisting essentially of add, subtract, multiply and divide, and said combining of said third and fourth data sets is a function being selected from the group consisting essentially of add, subtract, multiply and divide.
54. The system of claim 47 wherein said reduced precision values are reciprocal values.
55. The system of claim 54 wherein said reciprocal values are combined with a plurality of three-dimensional world coordinates representing at least one vertex of a primitive in a graphics system to project said coordinates into a two-dimensional space.
56. The system of claim 55 wherein said second instruction is a RECIP1 instruction.
57. The system of claim 56 wherein said reciprocal square root values are combined with a plurality of surface normal coordinates representing at least one vertex of a primitive in a graphics system to renormalize said coordinates.
58. The system of claim 57 wherein said second instruction is a RSQRT1 instruction.
59. The system of claim 47 wherein said reduced precision values are reciprocal square root values.
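
The first instruction of claim 47 leaves the combining function open, and claims 48 to 51 name add, multiply, subtract, and divide. A minimal sketch of that generalization follows. The function name `reduce_pairs` and the choice to pass the combining function as a parameter are assumptions for illustration only.

```python
import operator

def reduce_pairs(combine, first, second):
    """Combine the two positions of each source pair into one destination pair.

    Position 0 of the result comes from the first data set and position 1
    from the second data set, following the steps recited in claim 47.
    """
    return (combine(first[0], first[1]), combine(second[0], second[1]))

# The four combining functions named in claims 48 to 51.
sums = reduce_pairs(operator.add, (1.0, 2.0), (3.0, 4.0))
products = reduce_pairs(operator.mul, (1.0, 2.0), (3.0, 4.0))
differences = reduce_pairs(operator.sub, (1.0, 2.0), (3.0, 4.0))
quotients = reduce_pairs(operator.truediv, (1.0, 2.0), (3.0, 4.0))
```

With add and multiply this corresponds to the reduction add and reduction multiply that other claims identify as the ADDR and MULR instructions.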

60. A system comprising:
- a memory for holding a plurality of instructions, said plurality of instructions including a first, second and third instruction; and
  
  a processor, coupled to said memory, for executing said plurality of instructions, wherein said processor includes;
  
  a first means for executing said first instruction to produce a plurality of reduced precision reciprocal values;
  
  a second means for executing said second instruction to produce a plurality of reduced precision reciprocal square root values; and
  
  a third means for executing said third instruction for performing a floating point reduction computation, wherein said computation is a function selected from the group consisting essentially of add and multiply.

61. A computer program product comprising a computer readable medium having a plurality of instructions stored thereon, the plurality of instructions for enabling a processor to perform certain operations, wherein the plurality of instructions include:
- a first instruction that enables said processor to combine a first plurality of operands in accordance with a method, said method comprising;
  
  reading a first operand from a first position of a first data set and a second operand from a second position of said first data set;
  
  combining said first operand and said second operand, and placing the result in a first position of a destination data set;
  
  reading a third operand from a first position of a second data set and a fourth operand from a second position of said second data set; and
  
  combining said third operand and said fourth operand, and placing the result in a second position of said destination data set; and
  
  a second instruction that enables said processor to determine a plurality of reduced precision values from a second plurality of operands.

Current Assignee: ARM Finance Overseas Limited (SoftBank Group Corp.)
Original Assignee: MIPS Technologies Incorporated (Wave Computing, Inc.)
Inventors: Ho, Ying-wai; Harrell, Chandlee B.; Thekkath, Radhika; Uhler, G. Michael
Primary Examiner(s): Zimmerman, Mark
Assistant Examiner(s): Sealey, Lance W.
Application Number: US09/364,787
Time in Patent Office: 1,705 Days
Field of Search: 712/220, 712/222, 708/495, 708/502, 708/505, 345/427, 345/522, 345/419
US Class Current: 345/427
CPC Class Codes:
G06F 9/30014   with variable precision
G06F 9/30025   Format conversion instructi...
G06F 9/345   of multiple operands or res...
G06T 15/00   3D [Three Dimensional] imag...
G06T 15/20   Perspective computation
