Processor for Realizing at least Two Categories of Functions

Abstract
The present invention discloses a first preferred processor comprising a fixed lookup table circuit (LTC) and a writable LTC. The fixed LTC realizes at least a common function while the writable LTC realizes at least a non-common function. The present invention further discloses a second preferred processor comprising a two-dimensional (2D) LTC and a three-dimensional (3D) LTC. The 2D LTC realizes at least a fast function while the 3D LTC realizes at least a non-fast function.
1 Citation
Configurable processor with in-package lookup table
Patent #: US 10,445,067 B2
Filed: 11/28/2018
Current Assignee: Hangzhou Haicun Information Technology Co. Ltd.
Sponsoring Entity: Hangzhou Haicun Information Technology Co. Ltd.
No References
20 Claims
1. A processor for realizing at least two categories of functions, comprising:
a fixed lookup table circuit (LTC) comprising a printed memory array for storing at least a first portion of a first lookup table (LUT) related to a first function, wherein said first LUT is written during the manufacturing process of said processor;
a writable LTC comprising a writable memory array for storing at least a second portion of a second LUT related to a second function, wherein said second LUT is written after the manufacturing process of said processor is complete.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10
11. A processor for realizing at least two categories of functions, comprising:
a semiconductor substrate;
a two-dimensional (2D) lookup table circuit (LTC) comprising a 2D memory array for storing at least a third portion of a third lookup table (LUT) related to a third function, wherein said 2D memory array is formed on said semiconductor substrate;
a three-dimensional (3D) LTC comprising a 3D memory array for storing at least a fourth portion of a fourth LUT related to a fourth function, wherein said fourth memory array is formed above said semiconductor substrate.
Dependent claims: 12, 13, 14, 15, 16, 17, 18, 19, 20
1 Specification
This application claims priority from Chinese Patent Application 201610307350.1, filed on May 10, 2016, and Chinese Patent Application 201710315871.6, filed on May 8, 2017, in the State Intellectual Property Office of the People's Republic of China (CN), the disclosures of which are incorporated herein by reference in their entireties.
The present invention relates to the field of integrated circuits, and more particularly to processors.
Conventional processors use logic-based computation (LBC), which realizes mathematical functions primarily with logic circuits (e.g. XOR circuits). Logic circuits are suitable for arithmetic operations (i.e. addition, subtraction and multiplication), but not for non-arithmetic functions (e.g. elementary functions, special functions).
Conventional processors support a small set of basic non-arithmetic functions (e.g. basic algebraic functions and basic transcendental functions), which are realized by a combination of arithmetic operations and lookup tables (LUT). These functions are referred to as built-in functions. For example, U.S. Pat. No. 5,954,787 issued to Eun on Sep. 21, 1999 taught a method for generating sine/cosine functions using LUTs; U.S. Pat. No. 9,207,910 issued to Azadet et al. on Dec. 8, 2015 taught a method for calculating a power function using LUTs.
Conventional processors suffer from a drawback: they use a single type of memory (e.g. mask-ROM) to store the LUTs for different functions. Because some functions are commonly used while others (the non-common functions) are less commonly used, using mask-ROM to store the LUTs for the non-common functions is wasteful. Likewise, because some functions require high-speed implementation while others (the non-fast functions) do not, using high-speed memory to store the LUTs for the non-fast functions is also wasteful.
It is a principal object of the present invention to optimize the realization of mathematical functions based on the reusability, cost and speed requirements.
It is a further object of the present invention to realize common and non-common functions in a single processor.
It is a further object of the present invention to realize fast functions and non-fast functions in a single processor.
In accordance with these and other objects of the present invention, the present invention discloses a processor for realizing at least two categories of functions.
The present invention discloses a processor for realizing at least two categories of functions. The preferred processor uses memory-based computation (MBC), which realizes a mathematical function primarily with a memory that stores the lookup table (LUT) related to the mathematical function. Although arithmetic operations are still performed, the MBC only needs to calculate a polynomial to a lower order because it uses a larger LUT than the LBC. For the MBC, the fraction of computation done by the LUT could be more than that done by the arithmetic operations.
To increase the reusability, lower the cost and improve the performance, the preferred processor realizes different categories of mathematical functions by different types of memories. There are two methods to categorize the mathematical functions, each of which is associated with a preferred processor.
For the first method of categorization, the mathematical functions are categorized into common functions and non-common functions. The common functions are commonly used functions. Examples of common functions include basic algebraic functions and basic transcendental functions. The non-common functions are less commonly used functions. Examples of non-common functions include elementary functions and special functions. The first method of categorization is associated with a first preferred processor, which comprises a fixed lookup table circuit (LTC) and a writable LTC. The fixed LTC comprises at least a printed memory array storing at least a portion of an LUT related to at least a common function, whereas the writable LTC comprises at least a writable memory array storing at least a portion of an LUT related to at least a non-common function. Note that the LUT related to the common function is written into the fixed LTC during the manufacturing process of the first preferred processor, while the LUT related to the non-common function is written into the writable LTC after the manufacturing process of the first preferred processor is complete. Because the function-related LUT can be written into the writable LTC in the field of use, and can even be erased and rewritten afterwards, the first preferred processor can realize different functions based on the customer's needs after shipping.
Accordingly, the present invention discloses a processor for realizing at least two categories of functions (i.e. common functions and non-common functions), comprising: a fixed LTC comprising a printed memory array for storing at least a first portion of a first LUT related to a first function, wherein said first LUT is written during the manufacturing process of said processor; a writable LTC comprising a writable memory array for storing at least a second portion of a second LUT related to a second function, wherein said second LUT is written after the manufacturing process of said processor is complete.
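The division of labor between the fixed LTC and the writable LTC can be illustrated with a minimal software model. This is only a sketch under stated assumptions: the class name Processor, the methods write_lut, erase_lut and lookup, and the rule that a field-written LUT may not reuse the name of a factory LUT are all hypothetical and are not taken from the disclosure. The model shows only that one set of tables is frozen at construction ("manufacturing") while the other can be written, erased and rewritten afterwards.

```python
from types import MappingProxyType


class Processor:
    """Toy model of a processor with a fixed LTC and a writable LTC (hypothetical API)."""

    def __init__(self, factory_luts):
        # Fixed LTC: LUTs for common functions, written once at "manufacture".
        # MappingProxyType gives a read-only view, tuples make the tables immutable.
        self._fixed = MappingProxyType(
            {name: tuple(lut) for name, lut in factory_luts.items()}
        )
        # Writable LTC: starts empty, loaded in the field of use.
        self._writable = {}

    def write_lut(self, name, lut):
        """Write (or rewrite) a LUT for a non-common function after shipping."""
        if name in self._fixed:
            # Assumed policy for this sketch: factory LUTs cannot be shadowed.
            raise ValueError(f"{name!r} is already realized by the fixed LTC")
        self._writable[name] = list(lut)

    def erase_lut(self, name):
        """Erase a field-written LUT, freeing the writable LTC for another function."""
        self._writable.pop(name, None)

    def lookup(self, name, index):
        """Read one LUT entry, checking the fixed LTC first, then the writable LTC."""
        table = self._fixed.get(name)
        if table is None:
            table = self._writable[name]  # raises KeyError if the function is absent
        return table[index]
```

In this model, a caller would construct the processor with the common-function tables and later call write_lut to add a non-common function; the table contents themselves would be placeholders chosen by the caller.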
For the second method of categorization, the mathematical functions are categorized into fast functions and non-fast functions. The fast functions are the functions that require fast implementation, whereas the non-fast functions are the functions which do not require fast implementation. The second method of categorization is associated with a second preferred processor, which comprises a two-dimensional (2D) LTC and a three-dimensional (3D) LTC. The 2D LTC comprises at least a 2D memory array storing at least a portion of an LUT related to at least a fast function, whereas the 3D LTC comprises at least a 3D memory array storing at least a portion of an LUT related to at least a non-fast function. Note that all memory cells of the 2D memory array are located on a 2D plane, i.e. they are formed on the surface of a semiconductor substrate. On the other hand, the memory cells of the 3D memory array are located in a 3D space, i.e. they are vertically stacked above each other. Based on single-crystalline semiconductor material, the 2D memory array is faster and more suitable for fast functions. On the other hand, occupying no substrate area, the 3D memory array has a lower storage cost. Storing the LUTs related to the fast functions into the 2D memory array while storing the LUTs related to the non-fast functions into the 3D memory array can lower the overall cost of the second preferred processor without sacrificing its performance.
Accordingly, the present invention discloses another processor for realizing at least two categories of functions (i.e. fast functions and non-fast functions), comprising: a semiconductor substrate; a 2D LTC comprising a 2D memory array for storing at least a third portion of a third LUT related to a third function, wherein said 2D memory array is formed on said semiconductor substrate; a 3D LTC comprising a 3D memory array for storing at least a fourth portion of a fourth LUT related to a fourth function, wherein said fourth memory array is formed above said semiconductor substrate.
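The placement rule described above (fast functions to the 2D memory array, non-fast functions to the 3D memory array) can be sketched as a simple allocation routine. The names LutSpec and place_luts, and the idea of tallying the bits that land on the substrate, are assumptions made for illustration; the disclosure states only that the 3D memory array occupies no substrate area.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LutSpec:
    """Hypothetical description of one function's LUT."""
    name: str
    size_bits: int
    fast: bool  # True if the function requires fast implementation


def place_luts(specs):
    """Assign each LUT to the 2D or 3D memory array and total the substrate usage."""
    placement = {"2D": [], "3D": []}
    for spec in specs:
        placement["2D" if spec.fast else "3D"].append(spec.name)
    # Only LUTs in the 2D array consume substrate area in this model.
    substrate_bits = sum(spec.size_bits for spec in specs if spec.fast)
    return placement, substrate_bits
```

A caller would pass a list of LutSpec objects, with sizes and the fast flag chosen for the application, and read back which array each LUT was assigned to.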
It should be noted that all the drawings are schematic and not drawn to scale. Relative dimensions and proportions of parts of the device structures in the figures have been shown exaggerated or reduced in size for the sake of clarity and convenience in the drawings. The same reference symbols are generally used to refer to corresponding or similar features in the different embodiments. The symbol “/” means a relationship of “and” or “or”.
Those of ordinary skill in the art will realize that the following description of the present invention is illustrative only and is not intended to be in any way limiting. Other embodiments of the invention will readily suggest themselves to such skilled persons from an examination of the within disclosure.
The present invention discloses a processor for realizing at least two categories of functions. The preferred processor uses memory-based computation (MBC), which realizes a mathematical function primarily with a memory that stores the lookup table (LUT) related to the mathematical function. Although arithmetic operations are still performed, the MBC only needs to calculate a polynomial to a lower order because it uses a larger LUT than the LBC. For the MBC, the fraction of computation done by the LUT could be more than that done by the arithmetic operations.
The lookup table circuits (LTC) may comprise various types of memory arrays. Based on their programming mechanisms, the memory arrays can be categorized into printed memory arrays and writable memory arrays. For the printed memory array, the data can be recorded thereto using a printing method during the manufacturing process. Note that the data are permanently stored and cannot be changed. The printing methods include photolithography (i.e. mask-programming to form mask-ROM), nano-imprint, e-beam lithography, DUV lithography, laser programming and other methods. For the writable memory array, the data can be recorded thereto using an electrical programming method. The writable memory includes OTP, SRAM, DRAM, EPROM, EEPROM, and flash memory. Among them, the OTP is one-time programmable, while the SRAM, DRAM, EPROM, EEPROM and flash memory are re-programmable.
Based on their internal placements, the memory arrays can be categorized into 2D memory arrays (or planar memory arrays) and 3D memory arrays. For the 2D memory array, all of its memory cells are located on a 2D plane. They are formed on the surface of a semiconductor substrate, i.e. the transistors and/or diodes of the memory cells are formed on the substrate. For the 3D memory array, its memory cells are located in a 3D space. They are vertically stacked, i.e. the transistors and/or diodes of the memory cells are formed above the substrate, not occupying any substrate area. The 2D memory array includes 2D printed memory array and 2D writable memory array, while the 3D memory array includes 3D printed memory array (3D-P, referring to U.S. patent application Ser. No. 14/875,716) and 3D writable memory array (3D-W, also known as 3D-EPROM, referring to U.S. Pat. No. 5,835,396). Examples of the 3D-W include 3D-OTP, 3D-XPoint, and 3D-NAND.
Referring now to
When realizing a mathematical function, combining the LUT with polynomial interpolation can achieve a high precision without using an excessively large LUT. For example, if only an LUT (without any polynomial interpolation) is used to realize a single-precision function (32-bit input and 32-bit output), the LUT would need a capacity of 2^32 × 32 bits = 128 Gb. By combining the LUT with polynomial interpolation, significantly smaller LUTs can be used. In the above embodiment, a single-precision function can be realized using a total of 4 Mb of LUT (i.e. 2 Mb for the function values, and 2 Mb for the first-derivative values) in conjunction with a first-order Taylor series. This is significantly less than the LUT-only approach (4 Mb vs. 128 Gb).
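The arithmetic above and the LUT-plus-Taylor scheme can be checked with a short sketch. Several details are assumptions not stated in the text: each 2 Mb table is taken to hold 32-bit entries (giving 2^16 entries), the example function is sin on the interval [0, 1), and the names F_LUT, D_LUT and lut_taylor are hypothetical. Python floats are double precision, so this models the method rather than the exact bit widths.

```python
import math

# LUT-only approach: 2^32 input codes, 32 bits of output each.
lut_only_bits = (2 ** 32) * 32          # 2^37 bits = 128 Gb

# Assumption: each 2 Mb table stores 32-bit entries.
ENTRY_BITS = 32
TABLE_BITS = 2 * 2 ** 20                # 2 Mb
N = TABLE_BITS // ENTRY_BITS            # 65536 entries per table
combined_bits = 2 * TABLE_BITS          # function values + first derivatives = 4 Mb

# Hypothetical example: f(x) = sin(x) sampled on [0, 1), with f'(x) = cos(x).
F_LUT = [math.sin(i / N) for i in range(N)]   # function-value LUT
D_LUT = [math.cos(i / N) for i in range(N)]   # first-derivative LUT


def lut_taylor(x: float) -> float:
    """Approximate sin(x) for 0 <= x < 1 as f(x0) + f'(x0) * (x - x0)."""
    i = min(int(x * N), N - 1)   # index of the nearest sample point at or below x
    x0 = i / N                   # the sample point itself
    return F_LUT[i] + D_LUT[i] * (x - x0)
```

With a sample spacing of 2^-16, the neglected second-order term is bounded by about 0.5 × (2^-16)^2 for this example function, which is why a first-order expansion suffices with tables of this size.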
Besides elementary functions, the preferred embodiment of
To increase the reusability, lower the cost and improve the performance, the preferred processor realizes different categories of mathematical functions by different types of memories. There are two methods to categorize the mathematical functions, each of which is associated with a preferred processor.
For the first method of categorization, the mathematical functions are categorized into common functions and non-common functions. The common functions are commonly used functions. Examples of common functions include basic algebraic functions and basic transcendental functions. The non-common functions are less commonly used functions. Examples of non-common functions include elementary functions and special functions.
One example of the first preferred processor 300 comprises a 2D fixed LTC and a 2D writable LTC, both of which are formed on the surface of a semiconductor substrate. Among them, the 2D fixed LTC stores the LUTs related to common functions, while the 2D writable LTC stores the LUTs related to non-common functions. Another example of the first preferred processor 300 comprises a 3D fixed LTC and a 3D writable LTC, both of which comprise vertically stacked memory cells. Among them, the 3D fixed LTC stores the LUTs related to common functions, while the 3D writable LTC stores the LUTs related to non-common functions.
For the second method of categorization, the mathematical functions are categorized into fast functions and non-fast functions. The fast functions are the functions that require fast implementation, whereas the non-fast functions are the functions which do not require fast implementation.
The 3D memory array offers the benefit of 3D integration, i.e. the memory cells of the 3D memory array can be integrated with the 2D memory array and/or the logic circuit on a single die.
While illustrative embodiments have been shown and described, it would be apparent to those skilled in the art that many more modifications than those mentioned above are possible without departing from the inventive concepts set forth therein. For example, the preferred processor could be a micro-controller, a central processing unit (CPU), a digital signal processor (DSP), a graphic processing unit (GPU), a network-security processor, an encryption/decryption processor, an encoding/decoding processor, a neural-network processor, or an artificial intelligence (AI) processor. The preferred processors can be found in consumer electronic devices (e.g. personal computers, video game machines, smart phones) as well as engineering and scientific workstations and server machines. The invention, therefore, is not to be limited except in the spirit of the appended claims.