MAY 26 2020 SUSAN Y. SOONG CLERK, U.S. DISTRICT COURT NORTH DISTRICT OF CALIFORNIA

Xiaohua Huang
P.O. Box 1639, Los Gatos, CA 95031
Tel: 669-273-5650
Email: xiaohua\_huang@hotmail.com
Pro Se Plaintiff

#### UNITED STATES DISTRICT COURT NORTHERN DISTRICT OF CALIFORNIA

Xiaohua Huang, Pro Se Plaintiff(s),
vs.
Fry's Electronics, Defendant(s)

MR. Xiaohua Huang's complaint against Fry's Electronics for patent infringement

Demand for Jury Trial

Plaintiff Xiaohua Huang (hereinafter "Huang" or "Plaintiff") alleges as follows:

#### NATURE OF THE ACTION

1. This is an action for patent infringement arising out of U.S. Patent No. 6,744,653 (hereinafter the "'653 Patent") issued on June 1, 2004, U.S. Patent No. 6,999,331 (hereinafter the "'331 Patent") issued on February 14, 2006, and U.S. Patent No. RE45259 (hereinafter the "'RE259 Patent") issued on Nov. 25, 2014, to Xiaohua Huang. This action is brought to remedy the infringement of the '653 Patent, '331 Patent and 'RE259 Patent (collectively "patent-in-suit") by Defendant Fry's Electronics (hereinafter "Fry's" or "Defendant").

#### THE PARTIES

2. Xiaohua Huang is an individual whose current mailing address is P.O. Box 1639, Los Gatos, CA 95031. Since the year 2000, Huang has developed state of the art, high speed and low power, U.S. patented TCAM designs to build IC chips used inside networking (wired and wireless) routers ("Routers"), Ethernet switches ("Switches"), etc.

3. Fry's is or purports to be a corporation with its main offices at 550 E Brokaw Rd, San Jose, CA 95112, United States, with contact telephone number (408) 487-1000. Fry's has used the Switches and Routers which have infringed US patent No. 6744653, 6999331 and RE45259 to generate its revenues in the United States. Fry's has also generated revenues through selling the products which have infringed US patent No. 6744653, 6999331 and RE45259.

#### JURISDICTION AND VENUE

4. This action arises under the patent laws of the United States, 35 U.S.C. § 101, et seq.
This Court has jurisdiction over the subject matter of this action pursuant to 28 U.S.C. §§ 1331 and 1338(a). Venue is proper in this District pursuant to 28 U.S.C. §§ 1391(b)-(c) and 1400(b) in that Defendant has been generating revenues and profits through using the devices which have infringed US patent No. 6744653, 6999331 and RE45259 in the United States. Fry's has also generated revenues through selling the products which have infringed US patent No. 6744653, 6999331 and RE45259. Fry's has operations in Northern California.

#### **BACKGROUND FACTUAL ALLEGATION**

- 5. A true and correct copy of each of the 'RE259, '653 and '331 Patents is attached hereto as Exhibits A, B and C respectively. The 'RE259 Patent, '653 Patent and '331 Patent are valid and owned by Plaintiff Mr. Huang as the inventor.
- 6. In Nov. 2000 Huang founded CMOS Micro Device Inc ("CMOS") to develop Ternary Content Addressable Memory (TCAM). Huang is the owner of CMOS, a California corporation having its office at 900 East Hamilton Ave, Room 100, Campbell, California. TCAM is used to perform the search function in internet networking routers, switches and cloud data center switches.
- 7. In Oct. 2001 Huang filed the provisional patent application titled "High-speed and low power content addressable memory (CAM) sensing circuits", some content of which was granted as US patent 6744653 "CAM cells and differential sense circuit for content addressable memory (CAM)" on June 1, 2004 and US patent 6999331 "CAM cells and differential sense circuit for content addressable memory (CAM)" on Feb. 14, 2006.
- 8. From November 2000 to April 2002, Huang finished the design of a ternary content addressable memory (TCAM) in 0.18um TSMC technology, which is covered by the 'RE259, '653 and '331 Patents.
The TCAM designed by Huang is more than three to hundreds of times faster in speed and consume much less power than the same products in Market at that time. Then Huang shared his patent application with two Cisco executives, they are GM and VP of Router and Gigbit switches division respectively. They both consider that Huang's patents of TCAM are the best solution among all the vendors and asked Huang to review their next generation TCAM specification and do a feasible design to evaluate the possible product performance. The design data provided by Huang is still better than the other products in market today. 'RE259, '653Patent and '331Patent are the basic fundamentals to design high speed and low power TCAM used in networking Router and Switches as well as Cloud Data Center Switches up to today. The TCAM designed by Huang provide the example design using those three patents ('RE259, '653 and '331Patent). By using the 'RE259, '653Patent and '331Patent the TCAM used in Routers and Switches helps Internet transfer information many time faster. - 9. The patented TCAM IP developed by Huang has been recognized by the industry. In 2003 Huang was an invited speaker to present his TCAM design at networking symposium at Boston organized by the Industry Authority Linley Group. In 2015 Huang was also a presenter of MEMCON 2015 in Santa Clara convention center to present his patented TCAM design. - 10. The ternary content addressable memory component are used as table look up function and used in networking router and switches as well as cloud data center switches to perform table look up to realize access control list(ACL), Quality of Service(QoS), VPN, VLAN, fire wall, LPM and other parallel searching. - 11. Based on information and belief that the devices used in Fry's networking have infringed US patent No. RE45259, 6744653 and 6999331. - 12. 
Based on information and believe Huang found that the "TCAM" in the networking devices used and sold by Fry's have infringed US patents 6744653, 6999331 and RE45259 including but not limited to the claim 1, 5, 8, 12,15 of US patent 6744653 and claim 1 and 9 of US patent 6999331 as well as claim 1,7,13 of RE45259. - 13. Fry's has sold the "switches" and "Routers" to directly infringe the "patent-in-suit". - 14. Fry's has used the "switches" and "Routers" which have infringed US patent RE45259, 6744653 and 6999331. - 15. The most function, such as ACL, QoS, VLAN, firewall and LPM, of "Router" and "Switches" use TCAM lookup. Through selling "Switches" and "Routers" Fry's has conducted the act of direct infringement. # THE INFRINGING PRODUCTS WHICH FRY'S MAY HAVE USED AND SOLD. 16. Fry's is a company which has used online sale network platform. Fry's has a large network and data center which have used networking Routers, Switches and data center Switches. Those network Routers, Switches and data center Switches have function such as ACL, QoS, VLAN, firewall and LPM, which function have used TCAM infringing US patent RE45259, 6744653 and 6999331, including but not limited to the claim 1,5,8,12 and 15 of patent 6744653, the claim 1 and 9 of patent 6999331 and claim 1,7,13 of RE45259 at least. Fry's has sold some of the devices which have function such as ACL, QoS, VLAN, firewall and LPM, which function have used TCAM infringing US patent RE45259, 6744653 and 6999331, including but not limited to the claim 1,5,8,12 and 15 of patent 6744653, the claim 1 and 9 of patent 6999331 and claim 1,7,13 of RE45259 at least. #### COUNT I: INFRINGEMENT OF U.S. PATENT NO. 6744653 - 17. Plaintiff Mr. Huang refers to and incorporates herein the allegations of Paragraphs 1-16 above. - 18. On June 1, 2004, U.S. Patent No.6744653 (the "653Patent") was duly and legally issued for a "CAM cells and differential sense circuit for content addressable memory(CAM)." 
A true and correct copy of the '653 patent is attached hereto as Exhibit B. Xiaohua Huang as inventor is the owner of all rights, title, and interest in and to the '653 patent. - 19. On information and belief, Defendant Fry's has infringed and continue to infringe directly, indirectly, literally one or more of the claims of the '653patent through using and selling the Networking Routers and Switches including but not limited to TL-SG1048 Switch, Archer AX6000 and Tenda AC1200 Router, using TCAM which have infringed at least claim 5, 8, 12 and 15 of the '653patent under 35 U.S.C. § 271(a),(b) and(c). Defendant Fry's has infringed at least claim 5, 8, 12 and 15 of the '653patent under 35 U.S.C. § 271(a), (b) and (c). - 20. Defendant Fry's acts of infringement have caused damage to Xiaohua Huang, and Xiaohua Huang is entitled to recover from Defendant Fry's for the damages sustained by Xiaohua Huang as a result of Defendant Fry's wrongful acts in an amount subject to proof at trial. Defendant Fry's infringement of Xiaohua Huang exclusive rights under the '653patent will continue to damage Xiaohua Huang, causing irreparable harm for which there is no adequate remedy at law, unless enjoined by this Court. Defendant Fry's infringement entitle Xiaohua Huang to recover damages under 35 U.S.C. § 284 and to attorneys' fees and costs incurred in prosecuting this action under 35 U.S.C. § 285. #### COUNT II: INFRINGEMENT OF U.S. PATENT NO. 6999331 - 21. Plaintiff refers to and incorporates herein the allegations of Paragraphs 1-16 above. - 22. On Feb.14, 2006, U.S. Patent No. 6999331(the "'331Patent") was duly and legally issued for a "CAM cells and differential sense circuit for content addressable memory(CAM)." A true and correct copy of the '331 patent is attached hereto as Exhibit C. Xiaohua Huang as inventor is the owner of all rights, title, and interest in and to the '331 patent. - 23. 
On information and belief, Defendant Fry's has infringed and continue to infringe directly, indirectly, literally one or more of the claims of the '331patent through using and selling the Networking Routers and Switches including but not limited to TL-SG1048 Switch, Archer AX6000 and Tenda AC1200 Router, using TCAM which have infringed at least claim1 and 9 of the '331patent under 35 U.S.C. § 271(a),(b) and(c). Defendant Fry's has infringed at least claim 1 and 9 of the 331patent under 35 U.S.C. § 271(a), (b) and (c). - 24. Defendant Fry's acts of infringement have caused damage to Xiaohua Huang, and Xiaohua Huang is entitled to recover from Defendant Fry's for the damages sustained by Xiaohua Huang as a result of Defendant Fry's wrongful acts in an amount subject to proof at trial. Defendant Fry's infringement of Xiaohua Huang exclusive rights under the '331patent will continue to damage Xiaohua Huang, causing irreparable harm for which there is no adequate remedy at law, unless enjoined by this Court. Defendant Fry's infringement entitle Xiaohua Huang to recover damages under 35 U.S.C. § 284 and to attorneys' fees and costs incurred in prosecuting this action under 35 U.S.C. § 285. COUNT III: INFRINGEMENT OF U.S. PATENT NO. RE45259 25. Plaintiff refers to and incorporates herein the allegations of Paragraphs 1-16 above. - 26. On November 25, 2014 U.S. Patent No. RE45259 (the "RE259Patent") was duly and legally issued for a "Hit ahead hierarchical scalable priority encoding logic and circuits." A true and correct copy of the 'RE259patent is attached hereto as Exhibit A. Xiaohua Huang as inventor is the owner of all rights, title, and interest in and to the 'RE259 patent. - 27. 
On information and belief, Defendant Fry's has infringed and continue to infringe directly, indirectly, literally one or more of the claims of the 'RE259 patent through using and selling the Networking Routers and Switches including but not limited to TL-SG1048 Switch, Archer AX6000 and Tenda AC1200 Router, using TCAM which have infringed at least claim 1, 7 and 13 of the 'RE259 patent under 35 U.S.C. § 271(a), (b) and (c). Defendant Fry's has infringed at least claim 1, 7 and 13 of the 'RE259 patent under 35 U.S.C. § 271(a), (b) and (c).
- 28. Defendant Fry's acts of infringement have caused damage to Xiaohua Huang, and Xiaohua Huang is entitled to recover from Defendant Fry's for the damages sustained by Xiaohua Huang as a result of Defendant Fry's wrongful acts in an amount subject to proof at trial. Defendant Fry's infringement of Xiaohua Huang exclusive rights under the 'RE259 patent will continue to damage Xiaohua Huang, causing irreparable harm for which there is no adequate remedy at law, unless enjoined by this Court. Defendant Fry's infringement entitle Xiaohua Huang to recover damages under 35 U.S.C. § 284 and to attorneys' fees and costs incurred in prosecuting this action under 35 U.S.C. § 285.

#### JURY DEMAND

29. Pursuant to Fed. R. Civ. P. 38(b), Plaintiff Xiaohua Huang requests a trial by jury on all issues.

#### PRAYER FOR RELIEF

WHEREFORE, Xiaohua Huang prays for the following relief:

- (a). A judgment in favor of Xiaohua Huang that Defendant has infringed and is infringing U.S. Patent Nos. 6744653, 6999331 and RE45259;
- (b). A judgment that the 'RE259 patent, '653 patent and '331 patent are valid and enforceable;
- (c).
An order preliminarily and permanently enjoining Defendant and its subsidiaries, parents, officers, directors, agents, servants, employees, affiliates, attorneys and all others in active concert or participation with any of the foregoing, from further acts of infringement of the 'RE259 patent, '653 patent and '331 patent;
- (d). An accounting for damages resulting from Defendant's infringement of the 'RE259 patent, '653 patent and '331 patent under 35 U.S.C. § 284;
- (e). An assessment of interest on damages;
- (f). A judgment awarding damages to Xiaohua Huang for its costs, disbursements, expert witness fees, and attorneys' fees and costs incurred in prosecuting this action, with interest pursuant to 35 U.S.C. § 285 and as otherwise provided by law;
- (g). Such other and further relief as this Court may deem just and equitable.

Dated: May 21, 2020

Respectfully Submitted,

#### Case 5:20-cv-03522-VKD Document 1 Filed 05/26/20 Page 9 of 85

Exhibit A

US patent RE45259

#### US00RE45259E

### (19) United States

## (12) Reissued Patent

#### Huang

(10) Patent Number: US RE45,259 E

(45) Date of Reissued Patent: Nov. 25, 2014

(54) HIT AHEAD HIERARCHICAL SCALABLE PRIORITY ENCODING LOGIC AND CIRCUITS

(76) Inventor: Xiaohua Huang, San Jose, CA (US)

(21) Appl. No.: 13/355,449

(22) Filed: Jan. 20, 2012

#### Related U.S. Patent Documents

Reissue of:

(64) Patent No.: **7,652,903**; Issued: **Jan. 26, 2010**; Appl. No.: **11/073,116**; Filed: Mar. 4, 2005

U.S. Applications:

(60) Provisional application No. 60/550,537, filed on Mar. 4, 2004.

(51) Int. Cl. *G11C 15/00* (2006.01)

(52) U.S. Cl. USPC 365/49.1; 365/230.01; 365/230.05; 365/230.06

#### (56) References Cited

#### U.S.
PATENT DOCUMENTS

| Patent No. | Kind | Date | Inventor | U.S. Class |
|---|---|---|---|---|
| 6,249,449 | B1 * | 6/2001 | Yoneda et al. | 365/49.18 |
| 6,307,767 | | | Fuh | |
| 6,392,910 | | | Podaima et al. | |
| 6,505,271 | | | Lien et al. | 711/108 |
| 7,043,601 | | | McKenzie et al. | 711/108 |
| 7,464,217 | B2 * | 12/2008 | Braceras et al. | 711/108 |

\* cited by examiner

Primary Examiner: Han Yang

#### (57) ABSTRACT

In this invention a hit ahead multi-level hierarchical scalable priority encoding logic and circuits are disclosed. The advantage of hierarchical priority encoding is to improve the speed, simplify the circuit implementation, and make the circuit design flexible and scalable. To reduce the time of waiting for the previous level priority encoding result, the hit signal is generated first in each level to participate in the next level priority encoding, and it is called Hit Ahead Priority Encoding (HAPE) encoding. The hierarchical priority encoding can be applied to the scalable architecture among the different sub-blocks and can also be applied within one sub-block. The priority encoding and hit are processed completely in parallel without correlation, and the priority encoding, hit generation, address encoding and MUX selection of the address to the next level all share the same structure of circuits.

#### 36 Claims, 8 Drawing Sheets

[Drawing sheets 1 to 8 of 8 (U.S. Patent, Nov. 25, 2014), including Figure 2b and Figure 4; images not reproduced.]

#### HIT AHEAD HIERARCHICAL SCALABLE PRIORITY ENCODING LOGIC AND CIRCUITS

Matter enclosed in heavy brackets [] appears in the original patent but forms no part of this reissue specification; matter printed in italics indicates the additions made by reissue.

This application claims the benefit of provisional U.S. Application Ser. No.
60/550,537, entitled "Priority encoding logic and Circuits," filed Mar. 4, 2004, which is incorporated herein by reference in its entirety for all purposes.

#### FIELD OF THE INVENTION

The present invention relates to content addressable memory. In particular, the present invention relates to logic and circuits for priority encoding of the match or hit address.

#### BACKGROUND OF THE INVENTION

In ternary content addressable memory, not every bit in each row is compared in the searching or comparing process, so sometimes in one comparison more than one row matches the input content; this is called multi-hit or multi-match. In the multi-hit case, a protocol is needed to select the highest priority address. The logic of selecting the highest priority address is called priority encoding.

Assume we have $\{A_0, A_1, \dots, A_{n-1}, A_n\}$ hit signals from the corresponding addresses, and define $A_0$ to have the highest priority and $A_n$ the lowest priority. Assume some of $\{A_0, A_1, \dots, A_{n-1}, A_n\}$ are logic "1" and all of the others are logic "0". The priority encoding keeps the highest priority "1" as "1" and converts all the other "1"s into "0". The logic operation of this transform:

$$\{A_0, A_1, \dots, A_{n-1}, A_n\} \Longrightarrow \{h_0, h_1, \dots, h_{n-1}, h_n\} \tag{1}$$

can logically be expressed as:

$$\begin{aligned} h_0 &= A_0 \\ h_1 &= \overline{A}_0 * A_1 \\ h_2 &= \overline{A}_0 * \overline{A}_1 * A_2 \\ &\cdots \\ h_n &= \overline{A}_0 * \overline{A}_1 * \overline{A}_2 \dots \overline{A}_{n-1} * A_n \end{aligned} \tag{2}$$

This means that only when $A_0$ to $A_{i-1}$ are all zero is $h_i = A_i$; otherwise, no matter whether $A_i = 0$ or 1, $h_i = 0$. After the priority encoding, the hit address with the highest priority is encoded to the binary address.

If the number of entries N is large, say 1K to 128K or even 1M, the calculation of priority logic (2) will take a long time if we use serial logic.
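As a reading aid, the transform of equations (1) and (2) can be written as a short software model. This is only a minimal sketch: the function names `priority_encode` and `binary_address` are hypothetical, hit signals are modeled as Python lists of 0/1 with index 0 as the highest priority, and the loop is exactly the serial evaluation that the text says becomes slow for large N.

```python
def priority_encode(hits):
    """Serial model of equation (2): keep only the highest-priority 1.

    hits[0] is taken as the highest priority, matching A_0 in the text.
    """
    out = []
    seen = 0  # becomes 1 once any higher-priority hit has been seen
    for a in hits:
        out.append(a & (1 - seen))  # h_i = A_i AND NOT(A_0 OR ... OR A_{i-1})
        seen |= a
    return out


def binary_address(one_hot):
    """Index of the single 1 in a priority-encoded list, or None on a miss."""
    for i, bit in enumerate(one_hot):
        if bit:
            return i
    return None


# Multi-hit on rows 2, 3 and 5: only row 2 survives priority encoding.
h = priority_encode([0, 0, 1, 1, 0, 1, 0, 0])
print(h, binary_address(h))
```

The single pass over all entries is what makes this form serial; the hierarchical scheme in the specification splits the same computation into small groups.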
So we came up with the inventions described in the following.

#### SUMMARY OF THE INVENTION

In this invention, we propose a multi-level hierarchical scalable priority encoding. For example, we make 8 entries one group as the first level and 8 first-level groups a second level, 64 entries in total. Then we can make 8 second-level groups a third level, 512 entries in total, and so on. The advantage of hierarchical priority encoding is to improve the speed, simplify the circuit implementation, and make the circuit design flexible and scalable.

To reduce the time of waiting for the previous level priority encoding result, we generate the hit signal first in each level to participate in the next level priority encoding, and we call it Hit Ahead Priority Encoding (HAPE) encoding.

The hierarchical priority encoding can be applied to the scalable architecture among the different sub-blocks and can also be applied within one sub-block.

#### BRIEF DESCRIPTION OF THE DRAWINGS

Preferred embodiments of the invention will now be described, by way of example only, with reference to the attached Figures, wherein:

FIG. 1 is a block diagram of the scalable architecture of a CAM with many sub-blocks in accordance with one embodiment of the present invention.

FIG. 2a is a logic block diagram of hierarchical priority encoding and match address binary encoding within one sub-block in accordance with one embodiment of the present invention.

FIG. 2b is the timing diagram in accordance with FIG. 2a of the present invention.

FIG. 3 is a logic block diagram of hierarchical priority encoding and match address binary encoding in a higher level or among the different sub-blocks, and the timing diagram, in accordance with one embodiment of the present invention.

FIG. 4 is the circuit implementation of priority encoding with 8 input addresses in accordance with one embodiment of the present invention.

FIG. 5 is the circuit implementation of the HIT generation logic in accordance with one embodiment of the present invention.
FIG. 6 is the circuit implementation of the binary encoding logic in accordance with one embodiment of the present invention.

FIG. 7 is the circuit implementation of the 8 to 1 mux in accordance with one embodiment of the present invention.

## DETAILED DESCRIPTION OF THE INVENTIONS

To make the priority encoding logic calculation quicker, the entire CAM block can be divided into 256 blocks in four quadruples; each quadruple has 8x8=64 blocks and each block has 8x8=64 entries, as shown in FIG. 1 with embodiment 100. This is just to explain the principle; the entry number of each sub-block and the number of sub-blocks can be different. Assume the data pads 110 are equally distributed on the four sides of the chip. If all of the data pads 110 are on one side or on fewer than four sides, the principle is the same.

First step, route all the data signals on each side (only one side is drawn in FIG. 1) to the middle point of that side, which is shown as route 101a in FIG. 1. Second step, route all the data signals to the center of the chip, shown as route 102a in FIG. 1. Third step, at the center point send the data to be compared to both the left and right sides (only the right side path 103a is shown in FIG. 1). Fourth step, send the data to each one of the 8 columns, both the upper part and the lower part, shown as 104a in FIG. 1. Fifth step, the data to be compared are then sent to each sub-block 120 in each column to perform the comparison with each entry in every sub-block 120.

In embedded applications, the entry number of the TCAM is not very large. In that case, the data path starts from path 104a. If only some selected sub-blocks are searched or compared, the data to be compared will only be sent into those sub-blocks to save power consumption. After comparison with each entry inside each sub-block 120, the first level and second level priority encoding and binary encoding are performed, which will be explained in detail in FIG.
2, then the priority encoding in each column 130 among 8 sub-blocks is performed as the third level priority encoding and the hit address is sent out through path 104b. Next, the fourth level priority encoding is performed among the 8 columns 130 in each quadruple and the hit address is sent out through path 103b. Next, the priority encoding is performed in the center of the chip among the four quadruples and the hit address is sent through path 102b. As the last step, the hit address is sent to the output pad 110 through path 101b. The priority encoding among the upper quadruple and the lower quadruple can be performed together in path 103b.

The priority encoding logic calculation block diagram for each 8×8=64 entry sub-block 120 is shown in FIG. 2a with embodiment 200a. Each 8 entries of the 64 entries are grouped together to do the hit logic function from 2h0 to 2h7 and generate Hit[0] to Hit[7] in block 201. At the same time, each 8 entries of the 64 entries undergo the priority encoding logic calculation in each block from 2p0 to 2p7 of embodiment block 202 to generate P[63:0], then proceed to binary encoding from 2e0 to 2e7 in embodiment block 203 to generate any three bit BA0[2:0] to BA7[2:0] binary address if there is a hit in any 8 bit group. The eight signals Hit[0] to Hit[7] from block 201 perform priority encoding in block 206, which is logically exactly the same as the priority encoding in each 8 entry group from 2p0 to 2p7. The priority hit Ph[7:0] from Hit[0] to Hit[7] drives the select of the 8 to 1 mux 204 and selects one three bit binary address from BA0[2:0] to BA7[2:0], which becomes Add1[2:0]. The priority bit of Hit[0] to Hit[7] is binary encoded in block 207, which is logically the same as the binary encoding blocks from 2e0 to 2e7, to generate the address Add1[5:3]. Add1[5:3] and Add1[2:0] make Add1[5:0].
Hit[0] to Hit[7] further perform the logic function in block 2hh, which is logically the same as any block 2h0 to 2h7, and generate the next level Hit1. Both Add1[5:0] and Hit1 are passed to the next level.

The timing diagram of embodiment 200a is shown in FIG. 2b with embodiment 200b. Assume all the hit or miss signals from the TCAM comparison A[i] (A[63:0]), drawn as signal 240, are available at time $t_0$. The first level hit signals Hit[7:0] generated by blocks 2h0 to 2h7, drawn as 241, are available at time $t_1$. At the same time A[63:0] are divided into eight groups and priority encoded by blocks 2p0 to 2p7, generating P[0] to P[63], which are drawn as 244 and available at time $t_1$. The time delay of generating Ph[7:0], drawn as 246, and the time delay of generating BA0[2:0] to BA7[2:0], drawn as 245, are roughly the same, and both are generated at time $t_2$. So the binary address Add1[2:0], drawn as 248, is selected by Ph[7:0] from the 8 group addresses BA0[2:0] to BA7[2:0] through an eight to one MUX 204 without any further delay except the delay of the MUX itself, which is $(t_3-t_2)$; the address Add1[5:3], drawn as 247, Add1[2:0], and Add1[5:0], drawn as 249, are available at time $t_3$.

So the total delay from A[63:0] being available to the output of the binary hit address Add1[5:0] is about three stage delays (priority 2p0, binary encoding 2e0 and 8 to 1 MUX 204), where we call each block (2p0, 2e0, 204, etc.) one stage. The delay of Hit1 243 is two stage delays. So the output of Hit1 is available at $t_2$, one stage earlier than the output of the binary hit address Add1[5:0] 249, which is available at $t_3$. Only Hit1 and Add1[5:0] are sent to the next level priority encoding. The entire sub-block is abstracted as symbol 208.
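The data flow of the 64-entry sub-block described above (blocks 201, 202, 203, 204, 206, 207 and 2hh) can be traced with a small behavioral model. This is a sketch under stated assumptions, not the patented circuit: the names `sub_block`, `next_level`, `encode` and `mux` are hypothetical, index 0 is taken as the highest priority, and software evaluates the steps one after another, so it shows only which signals feed which, not the hit-ahead timing.

```python
def priority_encode(bits):
    """Keep only the highest-priority 1 (index 0 = highest priority)."""
    out, seen = [], 0
    for b in bits:
        out.append(b & (1 - seen))
        seen |= b
    return out


def encode(one_hot):
    """Binary-encode a one-hot list; returns 0 when there is no hit."""
    for i, bit in enumerate(one_hot):
        if bit:
            return i
    return 0


def mux(select, values):
    """8-to-1 mux driven by a one-hot select (AND each input, then OR)."""
    out = 0
    for sel, v in zip(select, values):
        if sel:
            out |= v
    return out


def sub_block(a):
    """Model of one 64-entry sub-block; returns (Hit1, Add1[5:0])."""
    groups = [a[8 * g:8 * g + 8] for g in range(8)]
    hit = [int(any(g)) for g in groups]        # block 201: Hit[0..7]
    p = [priority_encode(g) for g in groups]   # block 202: P[63:0]
    ba = [encode(x) for x in p]                # block 203: BA0..BA7
    ph = priority_encode(hit)                  # block 206: Ph[7:0]
    add_lo = mux(ph, ba)                       # mux 204:   Add1[2:0]
    add_hi = encode(ph)                        # block 207: Add1[5:3]
    hit1 = int(any(hit))                       # block 2hh: Hit1
    return hit1, (add_hi << 3) | add_lo


def next_level(results, width):
    """Combine eight (hit, address) pairs; width = address bits per child."""
    hits = [h for h, _ in results]
    ph = priority_encode(hits)
    addr = (encode(ph) << width) | mux(ph, [a for _, a in results])
    return int(any(hits)), addr


# 512 entries = eight sub-blocks; hits at entries 100 and 300.
a = [0] * 512
a[100] = a[300] = 1
blocks = [sub_block(a[64 * k:64 * k + 64]) for k in range(8)]
print(next_level(blocks, 6))
```

Note that `hit` and `ph` depend only on the raw match bits, never on the encoded addresses, which is the property the specification relies on to have the mux select ready before the addresses arrive.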
The timing delay of the hit, priority encoding, binary encoding and 8 to 1 mux will be analyzed in detail.

FIG. 3 is the logic block diagram of priority encoding at a higher level, among the eight groups of 64 entry sub-blocks or among the 8 sub-blocks in every column 130 in FIG. 1. The hit signal Hit1[7:0], marked as 313 in FIG. 3, is one stage earlier than the binary hit addresses Add10[5:0] to Add17[5:0], which are marked as 314. The eight bit hit signal Hit1[7:0] undergoes priority encoding in block 309, then the priority hit signal Ph1[7:0] selects Add2[5:0] from the eight input MUX 311. At the same time Ph1[7:0] is encoded into the binary address Add2[8:6] in block 310. Add2[8:6] and Add2[5:0] make Add2[8:0]. In block 308 the eight inputs Hit1[7:0] generate Hit2 at time $t_3$, which is one stage earlier than the binary hit address Add2[8:0]. From the timing diagram 340 in FIG. 3, the delay from the binary hit address Add1i[5:0], which is signal 314, to Add2[8:0], which is marked as 319, is an 8 to 1 MUX delay, which is $(t_4-t_3)$, where i=0 to 7. In this hierarchical priority design, the delay on each level is an 8 to 1 MUX delay because the selection signal from the priority encoding among the hit signals is available one stage earlier and there is no extra delay to wait for the selection signal.

Another advantage of this hierarchical priority encoding is the simplicity of the circuit design. We have already seen that each level shares the same logic and circuit design. For instance, the priority function blocks 206 and 309 in each level are the same in logic and circuit, which is shown in FIG. 4, embodiment 400.

Embodiment 400 in FIG. 4 is a sample implementation of the priority logic equation (2), which can be deduced to equation (3), where n=7.
$$\begin{aligned} h_0 &= A_0 \\ h_1 &= \overline{A_0} * A_1 = \overline{A_0 + \overline{A_1}} \\ h_2 &= \overline{A_0} * \overline{A_1} * A_2 = \overline{A_0 + A_1 + \overline{A_2}} \\ &\cdots \\ h_n &= \overline{A_0} * \overline{A_1} * \overline{A_2} \dots \overline{A_{n-1}} * A_n = \overline{A_0 + A_1 + \dots + A_{n-1} + \overline{A_n}} \end{aligned} \tag{3}$$

Equation (3) is implemented as embodiment 400 in FIG. 4. Each line from 4y0 to 4y7 connects the drains of a few N transistors, and each line 4y0 to 4y7 is the output of the dynamic NOR logic of the N transistors connected to that line. At the beginning of each cycle, the gate input signals $\overline{A_0}$ to $\overline{A_7}$ and $A_0$ to $A_7$ of all the N transistors from 401 to 436 are set to logic zero, which turns off all the N transistors, and the enable signal en is set to logic zero, which drives all the outputs of NAND gates 445 to 452 to logic one and then all the outputs of inverters 453 to 460 to logic zero. The input pch of the P transistors 437 to 444 is set to logic zero and the P transistors 437 to 444 are turned on, which connects the lines 4y0 to 4y7 to Vdd with low impedance and pre-charges the potential level of lines 4y0 to 4y7 up to Vdd. Then the signal pch is turned to Vdd, turning off the P transistors 437 to 444 before the TCAM comparison results $A_0$ to $A_7$ and their complements $\overline{A_0}$ to $\overline{A_7}$ arrive. The hit signals among A0 to A7 will be logical "one" at potential Vdd and the missed signals among A0 to A7 will be logical zero at potential ground. Only for the highest priority hit is the output of the NOR gate logically high. For example, with $A_0=0$, $A_1=0$, $A_2=$Vdd and $A_3=$Vdd, the highest priority hit is $A_2$. The input of N transistor 401 is Vdd, so N transistor 401 is turned on and the node 4y0 is discharged to ground.
The input of transistor 402, which is the complement of $A_1$, is also Vdd, so transistor 402 is ON and the node 4y1 is also discharged to ground. Since $A_0=0$, $A_1=0$, $A_2=$Vdd and $\overline{A}_2=0$, the inputs of transistors 404, 405 and 406 are all zero, the transistors 404, 405 and 406 are all OFF, and the node 4y2 is not discharged and is kept logically "one" at potential Vdd. Since $A_2=$Vdd, the inputs of transistors 408, 413, 419, 426 and 434 will be Vdd and all the nodes 4y3, 4y4, 4y5, 4y6 and 4y7 will be pulled to ground no matter whether $A_3$, $A_4$, $A_5$, $A_6$ and $A_7$ are logically one or zero. The slowest path, or worst case, is when only one input among the eight N transistors 429, 430, 431, 432, 433, 434, 435 and 436 connected to node 4y7 is Vdd and all the others are zero; in that case one transistor needs to discharge the drain parasitic capacitance of eight transistors and the metal wire capacitance connected to node 4y7. The signal en is characterized to turn to Vdd later than node 4y7 is discharged in the worst case. The worst case delay of the eight input priority encoding is that of one N transistor discharging the drain parasitic capacitance of eight same size N transistors down to ground, plus the delay of one NAND gate and one inverter.

The logic of the hit function blocks 2h0, 2h1, ... 2hh and 308 in each level is also the same, and its logic and circuit are shown in FIG. 5. The embodiment 510 is the circuit implementation of one block 2h0, and the embodiment 520 is the circuit implementation of both block 201 and block 2hh in FIG. 2a together. The operation principle of 510 is: 1) all the inputs A0 to A7 are set to zero as in embodiment 400 in FIG. 4. 2) Set the gate input 522 of P transistor 501 to zero to pre-charge the node 503 to Vdd, then turn 522 to Vdd and turn off the P transistor 501 before the signals A0 to A7 arrive.
If all the inputs A0 to A7 are zero, the inputs of the N transistors are zero, all the N transistors 502 are OFF, the node 503 is kept at Vdd and the output signal of inverter 504 is zero. If only one input among A0 to A7 is Vdd and all the others are zero, which is the worst case, the delay of 510 is that of one N transistor discharging the drain parasitic capacitance of the eight same size N transistors down to ground, plus the delay of one inverter.

The binary encoding logic and circuit are shown as embodiment 600 in FIG. 6. The operation principle of 600 is: 1) all the inputs $h_0$, $h_2$, $h_4$ and $h_6$ are set to zero. 2) Set the gate input 611 of P transistor 601 to zero to pre-charge the node 603 to Vdd, then turn 611 to Vdd and turn off the P transistor 601 before the signals $h_0$, $h_2$, $h_4$ and $h_6$ arrive. If all the input signals $h_0$, $h_2$, $h_4$ and $h_6$ are zero, the inputs of the N transistors are zero, all the N transistors 602 are OFF, the node 603 is kept at Vdd and the output signal of inverter 604 is zero. If only one input among $h_0$, $h_2$, $h_4$ and $h_6$ is Vdd and all the others are zero, which is the worst case, the delay of 600 is that of one N transistor discharging the drain parasitic capacitance of the four same size N transistors down to ground, plus the delay of one inverter.

The MUX logic and circuit are shown in FIG. 7 as embodiment 700. The operation principle of 700 is: 1) the input signals $Ph_0$, $Ph_1$, $Ph_2$, $Ph_3$, $Ph_4$, $Ph_5$, $Ph_6$ and $Ph_7$ are set to zero. 2) Set the gate input 705 of P transistor 701 to zero to pre-charge the node 703 to Vdd, then turn 705 to Vdd and turn off the P transistor 701 before the signals $Ph_0$, $Ph_1$, $Ph_2$, $Ph_3$, $Ph_4$, $Ph_5$, $Ph_6$ and $Ph_7$ arrive.
Since Ph<sub>0</sub>, Ph<sub>1</sub>, Ph<sub>2</sub>, Ph<sub>3</sub>, Ph<sub>4</sub>, Ph<sub>5</sub>, Ph<sub>6</sub> and Ph<sub>7</sub> come from Priority encoding, only one signal among them is Vdd and all the others are zero if there is a hit. After the AND logic, only one output of the seven AND gate 708 is equal to the input value, which is the selected bit from Ba<sub>0</sub> to Ba<sub>7</sub>. If the selected bit from Ba<sub>0</sub> to Ba<sub>7</sub> is zero, the node 703 is kept at Vdd, the output of inverter 704 is zero and the selected bit value zero is passed out. If the selected bit from Ba<sub>0</sub> to Ba<sub>7</sub> is Vdd, one N transistor among the eight N transistors 702 is turned ON, the node 703 is discharged down to ground, the output of inverter 704 is Vdd (logical one) and the selected bit value Vdd is passed out. This is the worst case: the delay of 700 is that of one N transistor discharging the drain parasitic capacitance of the eight same-size N transistors down to ground, plus the delay of one inverter and one AND gate. Usually one AND gate includes one inverter and one NAND gate, so the delay of 700 is that of one N transistor discharging the drain parasitic capacitance of the eight same-size N transistors down to ground, plus the delay of two inverters and one NAND gate.

The entire Priority encoding logic and circuit are reduced to four basic building blocks, 400, 510, 600 and 700, in FIGS. 4, 5, 6 and 7. The delays of the blocks 400, 510, 600 and 700 are comparable, and we call the delay of each block 400, 510, 600 and 700 one stage. We define the delay of hit logic block 510 as $T_h$, one inverter delay as $T_i$ and one NAND gate delay as $T_n$. The delay of priority encoding block 400 is $(T_h + T_n)$, since the delay of block 400 is one more NAND gate delay compared with block 510. The delay of block 600 is roughly $T_h$. The delay of MUX block 700 is $(T_h + T_n + T_i)$.
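The behavior and delay of the four building blocks described above can be illustrated with a small behavioral sketch. It is only a logic-level model under stated assumptions, not the patented circuit: the function names are hypothetical, input index 0 is assumed to carry the highest priority (as in the example where $A_2$ at Vdd pulls nodes 4y3 to 4y7 to ground), and the delay of block 400 is read as $T_h + T_n$ because the text states that it is one NAND gate delay more than block 510.

```python
def dynamic_nor(pull_down_inputs):
    """Pre-charged node: stays at 1 unless at least one N transistor input is 1."""
    return 0 if any(pull_down_inputs) else 1


def hit(a):
    """Block 510 (assumed model): dynamic NOR node followed by one inverter."""
    return 1 - dynamic_nor(a)


def priority_one_hot(a):
    """Block 400 (assumed model): node k stays high only if a[k] is 1
    and every higher-priority input a[j], j < k, is 0."""
    outputs = []
    for k in range(len(a)):
        # Node k is pulled down by any higher-priority input
        # or by the complement of a[k].
        pull_downs = list(a[:k]) + [1 - a[k]]
        outputs.append(dynamic_nor(pull_downs))
    return outputs


def mux(ph, ba):
    """Block 700 (assumed model): AND each select with its data bit,
    then a dynamic NOR node and one inverter."""
    return 1 - dynamic_nor([p & b for p, b in zip(ph, ba)])


def stage_delays(t_h, t_i, t_n):
    """Per-block delays expressed in T_h, T_i and T_n as stated in the text."""
    return {"400": t_h + t_n, "510": t_h, "600": t_h, "700": t_h + t_n + t_i}
```

With inputs `[0, 0, 1, 0, 1, 0, 0, 1]`, `priority_one_hot` keeps only index 2 high, which mirrors the node 4y2 example in the description.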
The extra delay of each higher level priority encoding is a MUX 700 selection delay because that the Hit signal in each priority encoding level is generated one stage earlier than the binary hit address and the selection signal of the MUX is already available when the binary address to be selected arrive and will not suffer extra delay. The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein. What is claimed is: 1. A content address able memory(CAM) and hit ahead priority encoding(HAPE) logic, comprising: a group of blocks which is arranged in column and row, each block has equal number of CAM match signals which are the input signals of priority encoding logic, each block has same priority encoding logic of CAM match signals within the block, the CAM match signals or input signals are arranged from lower priority to higher priority or from higher priority to lower priority, each CAM match signals or input signal has either high logic level "one" which is called hit or low logic level "zero" which is called miss, each block generates block hit when there is at least one CAM match signal is high logic "one" within the block or block miss signal when all the CAM match signals are in low logic level "zero" within the block and block binary address signal corresponding to the CAM match signals of highest priority within the block, a priority encoding logic of block hit or miss signals of each column, each column generates a column hit signal when there is at 
least one block hit signal within the column or column miss signal when there is only block miss signals within the column and column binary address corresponding to the CAM match signals of highest priority within the column, a priority encoding logic of column hit or miss signals of a group column, a group of column generates a hit signal when there is at least one column hit signal within the group column or a miss signal when there is only column miss signals within the group column and a group column binary address corresponding to the CAM match signals of highest priority within the group column. - 2. A content address able memory(CAM) and hit ahead priority encoding(HAPE) logic of claim 1, further comprising: - a block multiplexer to select the binary address from the block of highest priority hit within the column as less significant portion of the column binary address; and - a priority encoding logic of block hit signals to generate the block multiplexer control signal which select the block of highest priority hit within the column, and a binary address encoding logic of block hit signals to generate the more significant portion of the column highest priority binary address. - 3.
A content address able memory(CAM) and hit ahead priority encoding(HAPE) logic of claim 1, wherein each block comprises: - a group of sub-blocks, each sub-block has equal number of input signals, each sub-block has priority encoding and binary address encoding logic to generate sub-block highest priority binary address as well as hit or miss generating logic to generate sub-block hit or miss signal, and the sub-block hit or miss signal is generated independently before sub-block binary address; - a block hit or miss generating logic to generate block hit or miss signal and block hit or miss signal is generated independently before the block binary address is generated; - a sub-block multiplexer to select the binary address from the highest priority sub-block within the block as less significant portion of block binary address; and - a priority encoding logic of each sub-block hit signals to generate the control signal of sub-block multiplexer, and a binary address encoding logic of each sub-block hit signals to generate the more significant portion of block binary address. - 4. A content addressable memory(CAM) and hit ahead priority encoding(HAPE) logic of claim 3, wherein priority encoding logic, address encoding logic and multiplexer have the logic circuit of same structure. - 5. A content address able memory(CAM) and hit ahead priority encoding(HAPE) logic of claim 4, wherein the hit generating logic, priority encoding logic, address encoding logic and multiplexer have dynamic NOR logic. - 6. A content address able memory(CAM) and hit ahead priority encoding(HAPE) logic of claim 2, wherein the signal of controlling the multiplexer is generated before or in the same time that the less significant portion of the highest priority local address is generated. - 7.
A content addressable memory (CAM) and hit ahead priority encoding (HAPE) logic, comprising: - a group of blocks which are arranged in columns and rows, each block having an equal number of CAM match signals which are the input signals of priority encoding logic, each block having a same priority encoding logic of CAM match signals within the block, the CAM match signals or input signals arranged from lower priority to higher priority or from higher priority to lower priority, each CAM match signal or input signal being either a high logic level "one" which is called hit or a low logic level "zero" which is called miss, each block configured to generate a block hit signal when there is at least one CAM match signal that is a high logic level "one" within the block or a block miss signal when all the CAM match signals are a low logic level "zero" within the block and a block binary address signal corresponding to the CAM match signals of highest priority within the block; - a priority encoding logic of block hit or miss signals of each column, each column configured to generate a column hit signal when there is at least one block hit signal within the column or a column miss signal when there are only block miss signals within the column and a column binary address corresponding to the CAM match signals of highest priority within the column; and - a priority encoding logic of column hit or miss signals of a group column, the group column configured to generate a hit signal when there is at least one column hit signal within the group column or a miss signal when there are only column miss signals within the group column and a group column binary address corresponding to the CAM match signals of highest priority within the group column. - 8.
The content addressable memory (CAM) and hit ahead priority encoding (HAPE) logic of claim 7, further comprising: - a block multiplexer configured to select a binary address from the block having the highest priority hit within the column as a less significant portion of the column binary address. - the priority encoding logic of block hit signals being configured to generate a block multiplexer control signal for selecting the block having the highest priority hit within the column; and - a binary address encoding logic of block hit signals configured to generate a more significant portion of the column binary address. - 9. The content addressable memory (CAM) and hit ahead priority encoding (HAPE) logic of claim 7, wherein each block comprises: - a group of sub-blocks, each sub-block having an equal number of input signals, each sub-block having priority encoding and binary address encoding logic configured to generate a sub-block highest priority binary address as well as hit or miss generating logic configured to generate a sub-block hit or miss signal, the sub-block hit or miss signal being generated independently before the sub-block binary address; - a block hit or miss generating logic configured to generate a block hit or miss signal, the block hit or miss signal being generated independently before the block binary address is generated; - a sub-block multiplexer configured to select a binary address from a highest priority sub-block within the block as a less significant portion of the block binary address; and - a priority encoding logic of each sub-block hit signals configured to generate a control signal of the sub-block multiplexer; and - a binary address encoding logic of the sub-block hit signals configured to generate a more significant portion of the block binary address. - 10. 
The content addressable memory (CAM) and hit ahead priority encoding (HAPE) logic of claim 9, wherein the priority encoding logic, the address encoding logic, and the multiplexer have logic circuitry of the same structure. - 11. The content addressable memory (CAM) and hit ahead priority encoding (HAPE) logic of claim 10, wherein the hit generating logic, the priority encoding logic, the address encoding logic, and the multiplexer have dynamic NOR logic. - 12. The content addressable memory (CAM) and hit ahead priority encoding (HAPE) logic of claim 8, wherein a signal for controlling the multiplexer is generated before or at the same time that the less significant portion of the highest priority local address is generated. - 13. A content addressable memory (CAM) system, comprising: - one or more columns comprising a plurality of circuit segments, at least one of the circuit segments configured to generate a first circuit segment output based on whether at least one of a plurality of circuit segment inputs received by the at least one of the circuit segments corresponds to a first logic level, - at least one of the one or more columns configured to generate first address information based on a selected one of the first circuit segment outputs that corresponds to a second logic level, to set a node to a third logic level in response to a first input signal, and to subsequently change the node to a fourth logic level in response to one or more of a plurality of second input signals. - 14. The CAM system of claim 13, wherein the first circuit segment output represents circuit segment hit information. - 15. The CAM system of claim 13, wherein the at least one of the plurality of circuit segment inputs represents match information. - 16. The CAM system of claim 13, wherein the selected one of the first circuit segment outputs is a highest priority one of the first circuit segment outputs that corresponds to the second logic level. - 17.
The CAM system of claim 13, wherein: - the one or more columns are a plurality of columns, and the plurality of circuit segments are arranged in the plurality of columns and a plurality of rows. - 18. The CAM system of claim 13, wherein: - the one or more columns are a group of columns; - each column in the group configured to generate a column output based on the first circuit segment output of the at least one of the circuit segments; and - the group configured to generate second address information based on a selected one of the column outputs that corresponds to a fifth logic level. - 19. The CAM system of claim 13, wherein: - the at least one of the one or more columns is configured to pre-charge the node in response to the first input signal; and - the at least one of the one or more columns is configured to subsequently discharge the node in response to the one or more of the plurality of second input signals. - 20. The CAM system of claim 13, wherein the first input signal is configurable independently of the one or more of the plurality of second input signals. - 21. The CAM system of claim 13, wherein the first logic level and the second logic level are the same logic level. - 22. The CAM system of claim 13, wherein the one or more columns comprise: - a first logic circuit configured to generate a first logic circuit output based on the selected one of the first circuit segment outputs that corresponds to the second logic level; - a second logic circuit configured to generate a second logic circuit output based on whether the first circuit segment output corresponds to the second logic level; and - a third logic circuit configured to generate the first address information based on the selected one of the first circuit segment outputs that corresponds to the second logic level. - 23.
The CAM system of claim 22, wherein at least one of the first logic circuit, the second logic circuit, and the third logic circuit is configured to set the node to the third logic level in response to the first input signal, and to subsequently change the node to the fourth logic level in response to the one or more of the plurality of second input signals. - 24. The CAM system of claim 22, wherein: - the at least one of the circuit segments is configured to generate a second circuit segment output representing second address information; and - the one or more columns further comprise: - a fourth logic circuit configured to select one of the second circuit segment outputs as a less significant portion of the first address information; and - a fifth logic circuit configured to generate a more significant portion of the first address information. - 25. The CAM system of claim 24, wherein at least one of the fourth logic circuit and the fifth logic circuit is configured to set the node to the third logic level in response to the first input signal, and to subsequently change the node to the fourth logic level in response to the one or more of the plurality of second input signals. - 26. The content addressable memory (CAM) system of claim 24, wherein the one or more columns are each configured to generate a control input for the third logic circuit before or at the same time when the second circuit segment output is generated. - 27.
The content addressable memory (CAM) system of claim 22, wherein: - the plurality of circuit segment inputs is divided into a plurality of subsets of the circuit segment inputs; and the first logic circuit comprises: - a plurality of fourth logic circuits each configured to generate a fourth logic circuit output based on whether at least one of a corresponding subset of the circuit segment inputs corresponds to the first logic level; and - a fifth logic circuit configured to generate the first circuit segment output based on whether at least one of the fourth logic circuit outputs corresponds to the first logic level. - 28. The CAM system of claim 27, wherein: - at least one of the fourth logic circuit and the fifth logic circuit is configured to set the node to the third logic level in response to the first input signal, and to subsequently change the node to the fourth logic level in response to the one or more of the plurality of second input signals; and - the fourth logic circuit output is an input to the fifth logic circuit. - 29. A content addressable memory (CAM) system, comprising: - a circuit segment configured to generate a circuit segment output based on whether at least one of a plurality of circuit segment inputs received by the circuit segment corresponds to a first logic level, - the circuit segment configured to set a node to a second logic level in response to an input signal, and to subsequently change the node to a third logic level in response to the plurality of circuit segment inputs, - the circuit segment output corresponding to said third logic level. - 30. The CAM system of claim 29, wherein at least one of the plurality of circuit segment inputs corresponds to a match line output. - 31. The CAM system of claim 29, wherein the circuit segment output represents circuit segment hit information. - 32. The CAM system of claim 29, wherein at least one of the plurality of circuit segment inputs represents match information. - 33.
The CAM system of claim 29, wherein: the circuit segment is configured to pre-charge the node in response to the input signal; and the circuit segment is configured to subsequently discharge the node in response to the plurality of circuit segment inputs. 34. The CAM system of claim 29, wherein the input signal is configurable independently of the plurality of circuit segment inputs. 35. The CAM system of claim 29, wherein the first logic level and the third logic level are the same logic level. 36. The CAM system of claim 29, wherein the circuit segment is a first circuit segment, and further comprising a second circuit segment configured to generate address information based on the circuit segment output.

## Exhibit B

US006744653B1

## (12) United States Patent

Huang

(10) Patent No.: US 6,744,653 B1

(45) Date of Patent: Jun. 1, 2004

#### (54) CAM CELLS AND DIFFERENTIAL SENSE CIRCUITS FOR CONTENT ADDRESSABLE MEMORY (CAM)

(76) Inventor: Xiaohua Huang, 12897 Regan La., Saratoga, CA (US) 95070

(\*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 22 days.

(21) Appl. No.: 10/202,621

(22) Filed: Jul. 24, 2002

#### Related U.S. Application Data

(60) Provisional application No. 60/327,049, filed on Oct. 4, 2001.

(51) Int. Cl.<sup>7</sup>: G11C 15/00

#### (56) References Cited

#### U.S. PATENT DOCUMENTS

| Patent No. | Kind | Cited by examiner | Date | Inventor | Class |
|---|---|---|---|---|---|
| 5,162,681 | A | \* | 11/1992 | Lee | 327/53 |
| 5,446,686 | A | \* | 8/1995 | Bosnyak et al. | 365/49 |
| 5,598,115 | A | \* | 1/1997 | Holst | 326/119 |
| 6,195,277 | B1 | \* | 2/2001 | Sywyk et al. | 365/49 |
| 6,307,798 | B1 | \* | 10/2001 | Ahmed et al. | 365/207 |
| 6,442,054 | B1 | \* | 8/2002 | Evens et al. | 365/49 |

\* cited by examiner

Primary Examiner—Richard Elms

Assistant Examiner—J. H.
Hur

(74) Attorney, Agent, or Firm-Dinh & Associates

#### (57) ABSTRACT

A dummy content-addressable memory (CAM) cell and a dummy ternary CAM cell are connected to each row in a CAM and a ternary CAM array, respectively, to enable a differential match line sensing based on the content stored. The ternary CAM cell is for a differential match line sensing in low power applications. A method includes generating a voltage difference between a match line signal and a reference line signal, and then detecting and amplifying the voltage difference to determine a match or a mismatch.

#### 17 Claims, 18 Drawing Sheets

[Drawing Sheets 1 to 18 of U.S. Patent US 6,744,653 B1, dated Jun. 1, 2004, are not reproduced here. Figure labels recoverable from the sheets: FIG. 2E (Sheet 5), FIG. 5A (Sheet 9), FIG. 5B (Sheet 10) and FIG. 7 (Sheet 12).]

# CAM CELLS AND DIFFERENTIAL SENSE CIRCUITS FOR CONTENT ADDRESSABLE MEMORY (CAM)

This application claims the benefit of provisional U.S. Application Serial No. 60/327,049, entitled "High-Speed and Low Power Content Addressable Memory (CAM) Sensing Circuits," filed Oct. 4, 2001, which is incorporated herein by reference in its entirety for all purposes.

#### BACKGROUND OF THE INVENTION

The present invention relates generally to semiconductor circuits, and more specifically to CAM cells and high speed and low power sense circuits for content addressable memory.
A content addressable memory (CAM) is a memory having an array of memory cells that can be commanded to compare all or a subset of the "entries" in the array against an input address. Each entry in the CAM array corresponds to the content of the cells in a particular row of the array. Each row of the array is further associated with a respective match line, which is used as a status line for the row. All or a portion of the CAM array may be compared in parallel to determine whether or not the input address matches any of the entries in the portion selected for comparison. If there is a match to an entry, then the match line for the corresponding row is asserted to indicate the match. Otherwise, the match line is de-asserted to indicate a mismatch (which may also be referred to as a "miss"). Typically, any number of match lines may be asserted, depending on the entries in the array and the input address.

In a typical CAM design, the comparison between a bit of the input address and the content of a CAM cell is performed by a comparison circuit included in the cell. The comparison circuits for all cells in each row may then be coupled to the match line for the row. For simplicity, the comparison circuits may be designed such that a wired-OR operation is implemented for the outputs from all comparison circuits coupled to any given match line. In one common design, the output for each comparison circuit is formed by the drain of an N-channel output transistor. This output transistor is turned ON if there is a mismatch between the input address bit and the memory cell content and is turned OFF otherwise. The match line may be pre-charged to a logic high prior to each comparison operation, and would thereafter remain at logic high only if all output transistors for the row are turned OFF, which would be the case if there is a match between all bits of the entry for the row and the input address.
Otherwise, if at least one output transistor is turned ON due to a mismatch, then the match line would be pulled low by these transistors. The signal (or voltage) on the match line may thereafter be sensed or detected to determine whether or not there was a match for that row.

The conventional CAM cell and CAM sensing mechanism described above, though simple in design, have several drawbacks that affect performance. First, speed may be limited by the wired-OR design of the match line, if some speed-enhancing techniques are not employed. Each row may include a large number of cells (e.g., possibly 100 or more cells). In this case, if only one bit in the entire row does not match, then only one output transistor will be turned ON and this transistor will need to pull the entire match line low (e.g., from $V_{DD}$ to $V_{SS}$). A long time (i.e., $t=C \cdot V_{DD}^2/I$, where C is the capacitance of each entire match line and I is the current of each transistor) may then be required to discharge the line, which would then limit the speed at which the CAM array may be operated.

Second, excessive power may be consumed by the CAM design described above. Typically, only one row will match the input address, and all other rows will not match. In this case, all but one match line will be pulled to logic low (e.g., to $V_{SS}$) by the output transistors that are turned ON due to mismatches. The power consumed may then be computed as $(M-1) \cdot C \cdot V_{DD}^2$, where (M-1) is the number of mismatched rows, C is the capacitance of each match line, and $V_{DD}$ is the voltage swing of the match line during discharge.

As can be seen, there is a need for CAM cells and sense circuits that can ameliorate the shortcomings related to speed and power in the conventional design.

#### SUMMARY OF THE INVENTION

The invention provides CAM cell designs having improved performance over a conventional design.
The invention further provides techniques to detect the signal (or voltage) on a match line coupled to a number of CAM cells and having faster speed of operation and possibly lower power consumption.

In an aspect, a content addressable memory (CAM) cell is provided having improved performance. The CAM cell includes a memory cell operable to store a bit value and a comparison circuit configured to detect the bit value stored in the memory cell. The comparison circuit includes (1) an output transistor coupled to a match line and configured to provide a drive for the match line based on the detected bit value, and (2) a dummy transistor coupled to a dummy line. The match line and dummy line are used to detect an output value provided by the CAM cell. In an embodiment, the dummy transistor (1) has similar dimension as the output transistor, (2) is located in close proximity to the output transistor, and (3) is turned OFF during sensing operation. The dummy transistor is used to achieve low voltage swing (small signal) sensing and provides for low power and high-speed operation.

In another aspect, a sense circuit is provided for sensing a logic state of a match line in a content addressable memory (CAM). The sense circuit includes a pair of amplifiers cross-coupled in a positive feedback configuration. The first amplifier has one input operatively coupled to the match line, and the second amplifier has one input operative to receive a reference signal. The match line is driven by a number of output transistors for a row of CAM cells. The reference signal is generated based on a row of dummy transistors that are similarly arranged as the output transistors. When enabled, the amplifiers detect the difference between the signals on the match line and the reference signal and further amplify the detected difference such that the logic value on the match line may be ascertained.
The sense circuit may further include (1) a pair of pass transistors operatively coupled to the pair of amplifiers and used to enable the sense circuit, and (2) a switch coupled between outputs/inputs of the cross-coupled amplifiers and used to reset the amplifiers prior to each match line sense cycle. In a specific implementation, the first and second amplifiers may each be implemented as an inverter with gain (e.g., a P-channel transistor coupled in series with an N-channel transistor).

The match line is coupled to the output transistors for the row of CAM cells and may further be coupled directly to one input of the first amplifier. The dummy transistors couple to a dummy line that may further be coupled directly to one input of the second amplifier. Alternatively, the output transistors may also couple to a first common line that is coupled to the input of the first amplifier. In this case, the dummy transistors would similarly couple to a second common line that is coupled to the input of the second amplifier.

Various other aspects, embodiments, and features of the invention are also provided, as described in further detail below.

The foregoing, together with other aspects of this invention, will become more apparent when referring to the following specification, claims, and accompanying drawings.

#### BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A is a block diagram of a conventional content addressable memory (CAM) unit;

FIG. 1B is a block diagram of a CAM unit wherein certain aspects and embodiments of the invention may be implemented;

FIGS. 2A, 2B, and 2C are respectively a block diagram, a schematic diagram, and a logic diagram for an embodiment of a conventional CAM cell;

FIG. 2D is a schematic diagram of a binary CAM cell having improved performance;

FIG. 2E is a schematic diagram of a dummy binary CAM cell;

FIG. 3A is a schematic diagram of the driving circuits associated with a single match line;

FIG. 3B is a block diagram of a sense circuit;

FIG.
3C is a schematic diagram of an embodiment of a sense circuit that may be used to detect the signal on a match line;

FIGS. 4A and 4B are schematic diagrams of an embodiment of two match line detection mechanisms;

FIGS. 5A and 5B are timing diagrams for the match line detection mechanisms shown in FIGS. 4A and 4B, respectively;

FIG. 6 is a schematic diagram of another embodiment of a match line detection mechanism;

FIG. 7 is a timing diagram for the match line detection mechanism shown in FIG. 6;

FIG. 8A is a schematic diagram of an embodiment of a conventional ternary CAM cell;

FIG. 8B is a schematic diagram of a ternary CAM cell having improved performance;

FIG. 8C is a schematic diagram of a dummy ternary CAM cell; and

FIGS. 9A, 9B, and 10 are schematic diagrams of three match line detection mechanisms for ternary CAM cells.

#### DESCRIPTION OF THE SPECIFIC EMBODIMENTS

FIG. 1A is a block diagram of a conventional content addressable memory (CAM) unit 100a. CAM unit 100a includes a CAM array 110a coupled to sense circuits 150a. CAM array 110a is a two-dimensional array of M rows by N columns of CAM cells 120. Each row of the CAM array includes N cells that collectively store data for an entry in the array. Each row is further associated with a respective match line 130 that couples to all CAM cells in the row and further couples to sense circuits 150a.

Each of the N columns of the CAM array is associated with a specific bit position of an N-bit input address. A differential address line 132 is provided for each address bit and couples to all cells in the corresponding column of the CAM array. In this way, each bit of the N-bit input address may be compared with each of the M bits stored in the M cells in the corresponding column. The N-bit input address may thus be provided to all M rows of the CAM array and simultaneously compared against all entries in the array.
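The parallel comparison described above, combined with the pre-charged wired-OR match line and the $(M-1) \cdot C \cdot V_{DD}^2$ expression from the background section, can be sketched as a behavioral model. This is an illustrative sketch only: the function names and the list-of-bits representation are assumptions, and the loop runs sequentially where the hardware compares all rows at once.

```python
def match_lines(entries, address):
    """Model one pre-charged, wired-OR match line per row.

    entries: list of M rows, each a list of N stored bits.
    address: list of N input address bits.
    Returns one value per row: 1 (line stays high, match) or 0 (mismatch).
    """
    lines = []
    for entry in entries:
        line = 1  # match line pre-charged to logic high
        for stored_bit, address_bit in zip(entry, address):
            if stored_bit != address_bit:
                line = 0  # an output transistor turns ON and pulls the line low
        lines.append(line)
    return lines


def mismatch_power_term(lines, c_match, v_dd):
    """(number of mismatched rows) * C * V_DD^2, as given in the background section."""
    mismatched_rows = lines.count(0)
    return mismatched_rows * c_match * v_dd ** 2
```

For example, with three stored entries and one matching row, two match lines are discharged, so the term evaluates to $2 \cdot C \cdot V_{DD}^2$.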
Typically, before performing the comparison between the input address and the entries in the CAM array, the M match lines for the M rows of the array are pre-charged to logic high (e.g., $V_{DD}$ ). For each row, if any cell in the row is not matched to the corresponding address bit, then the output transistor for that cell is turned ON and the match line is pulled to logic low (e.g., $V_{SS}$ ). Thus, for any given row, the match line remains at logic high (i.e., not pulled to $V_{SS}$ ) only if the output transistors for all N cells in the row are turned OFF, which only occurs if each bit for the input address matches the bit in the corresponding cell of the row. The match line for each row is thus at logic high for a match between the entry in that row and the input address, and is at logic low if there is no match (i.e., a mismatch) between the entry and the input address. FIG. 1B is a block diagram of a CAM unit 100b having improved performance. CAM unit 100b includes a CAM array 110b coupled to sense circuits 150b. CAM array 110b is a two-dimensional array of M rows by N columns of CAM cells 122. Each row of the CAM array includes N cells that collectively store data for an entry in the array. Each row is further associated with a respective match line 130 and a dummy line 131 that couple to all CAM cells in the row and further couples to sense circuits 150. CAM array 110b further includes a column of M dummy CAM cells 124, one dummy CAM cell for each row. Dummy CAM cells 124 allow for differential detection of the values stored in CAM cells 122, which are provided on match lines 130 and dummy lines 131, as described in further detail below. FIG. 2A is a simple representation for a CAM cell 120x, which is one of many CAM cells 120 in FIG. 1A. CAM cell 120x receives a differential address line, mbl and $\overline{mbl}$ , for a single bit of the input address and further couples to a single match line for one row of the CAM array. FIG. 
2B is schematic diagram of a specific design of CAM cell 120x, which may be used for each of the CAM cells 120 in FIG. 1A. CAM cell 120x includes a memory cell 45 210x coupled to a comparison circuit 230x. Memory cell 210x (which may also be referred to as a storage element or storage cell) is used to store a single bit value. Comparison circuit 230x is used to compare the stored bit value against an address bit. As shown in FIGS. 2A and 2B, memory cell 210x comprises a pair of cross-coupled inverters 212a and 212b. Each inverter 212 is formed by a P-channel transistor 214 coupled to an N-channel transistor 216, as shown in FIG. 2B. The gates of transistors 214 and 216 couple together and 55 form the input of the inverter, and the drains of these transistors similarly couple together and form the output of the inverter. The output of inverter 212a couples to the input of inverter 212b, the drain of an N-channel transistor 218a, and a complementary data line (d) 220a. Similarly, the output of inverter 212b couples to the input of inverter 212a, the drain of an N-channel transistor 218b, and a data line (d) 220b. The gates of transistors 218a and 218b couple to a word line (wl), the source of transistor 218a couples to a complementary bit line (bl) 224a, and the source of transis- A data bit may be stored to memory cell 210x as follows. Initially, word line 222 is pulled to logic high to turn ON 5 either transistor 218a or 218b. The logic value on the differential bit line ( $\overline{b1}$ and bl) is then stored to the memory cell and maintained by inverters 212a and 212b. For example, if the complementary bit line ( $\overline{b1}$ ) is at logic low and the bit line (bl) is at logic high, then transistor 218a is 5 turned ON and transistor 218b is turned OFF. The complementary data line ( $\overline{d}$ ) is then pulled to logic low, which then causes the output of inverter 212b to transition to logic high. 
This then turns ON transistor 216a and causes the output of inverter 212a to transition to logic low. After the bit value has been written to memory cell 210x, the word line is brought to logic low and the value is maintained by inverters 212a and 212b via a positive feedback mechanism. The process to store a bit of the opposite logic value proceeds in a complementary manner.

Comparison circuit 230x comprises a pair of N-channel transistors 232a and 232b and an N-channel output transistor 240. Transistors 232a and 232b have gates that couple to data lines 220a and 220b, respectively, sources that couple to an address line (mbl) 132xa and a complementary address line ($\overline{mbl}$) 132xb, respectively, and drains that couple together and to the gate of transistor 240. The source of transistor 240 couples to circuit ground (e.g., $V_{SS}$) and the drain of transistor 240 couples to a match line 130x for the row to which CAM cell 120x belongs.

Comparison circuit 230x operates as follows. If the address bit is not the same as the stored bit in memory cell 210x, then the value on address line (mbl) 132xa is the same as the value on complementary data line ($\overline{d}$) 220a, and the value on complementary address line ($\overline{mbl}$) 132xb is the same as the value on data line (d) 220b. In this case, node C will be at logic high (i.e., a high voltage level), and transistor 240 will be turned ON to indicate a mismatch. Alternatively, if the address bit is the same as the stored bit in memory cell 210x, then node C will be pulled to logic low by either transistor 232a or 232b, and output transistor 240 will be turned OFF to indicate a match. The ON state for output transistor 240 thus indicates a mismatch and the OFF state indicates a match.

FIG. 2C is a logical representation for memory cell 210x. Inverters 212a and 212b are cross-coupled so that the output of one inverter drives the input of the other inverter.
Inverters 212a and 212b are thus coupled in a positive feedback circuit configuration. Transistors 218a and 218b act as switches that can be selectively turned ON to store a data value, which is then maintained by inverters 212a and 212b.

FIG. 2D is a schematic diagram of a specific design of a CAM cell 122x, which may be used for each of the CAM cells 122 in FIG. 1B. CAM cell 122x includes a memory cell 210x coupled to a comparison circuit 231x. Memory cell 210x is used to store a single data bit value, and is described above with reference to FIG. 2B.

Comparison circuit 231x comprises a pair of N-channel transistors 232a and 232b and an N-channel output transistor 240 used to drive match line 130x. These transistors are described above with reference to FIG. 2B. Comparison circuit 231x further comprises a dummy N-channel output transistor 242 used to provide the proper loading for dummy line 131x. The gate of dummy transistor 242 is coupled to logic low, and the dummy transistor is turned OFF. Dummy transistor 242 has a physical dimension that is the same as output transistor 240. In an embodiment, dummy transistor 242 is located near output transistor 240 and is oriented in the same direction.

FIG. 2E is a schematic diagram of a specific design of a dummy CAM cell 124x, which may be used for each of the dummy CAM cells 124 in FIG. 1B. Dummy CAM cell 124x includes a memory cell 210x coupled to a comparison circuit 233x. Memory cell 210x is used to store a single data bit value, and is described above with reference to FIG. 2B.

Comparison circuit 233x includes circuitry used to drive match line 130x and dummy line 131x. In particular, comparison circuit 233x comprises transistors 232a, 232b, and 240x coupled in the manner described above with reference to FIG. 2B and used to drive match line 130x. Comparison circuit 233x further comprises a pair of N-channel transistors 234a and 234b and an N-channel output transistor 242x used to drive dummy line 131x.
Transistors 234a, 234b, and 242x are coupled in similar manner as transistors 232a, 232b, and 240x for the match line, except that the gates of transistors 234a and 234b couple to the data line (d) and the complementary data line ($\overline{d}$), respectively. Thus, if transistor 242x is turned ON, then transistor 240x will be turned OFF. Otherwise, transistor 242x is turned OFF and transistor 240x will be turned ON. When transistor 240x is turned ON, the CAM row is disabled and the match line is asserted to mismatch status.

In an embodiment, transistor 242x has a physical dimension that is different from that of the other output transistors for the CAM cells within the same row. If the ratio of the width over the length of transistor 240x is normalized to be equal to 1 (i.e., $\frac{W_0}{\alpha L_0} = 1$), then the ratio of the width over the length of transistor 242x may be expressed as being equal to x, where $x = \frac{W}{\alpha L}$. In an embodiment, x = 0.5, which may be obtained by doubling the length of transistor 242x relative to that of transistor 240x (i.e., $x = \frac{W_0}{2\alpha L_0}$) or by halving the width of transistor 242x relative to that of transistor 240x (i.e., $x = \frac{W_0/2}{\alpha L_0}$). The function performed by dummy CAM cell 124 is described in further detail below.

FIG. 3A is a schematic diagram of the driving circuits associated with a single match line 130x. As shown in FIG. 1A, each match line 130 traverses the entire row of CAM array 110a and couples to output transistor 240 of each CAM cell 120 in the row. In FIG. 3A, transistors 240a through 240n thus represent the N output transistors for N CAM cells 120xa through 120xn in the row with which match line 130x is associated. Each match line is further associated with a P-channel pre-charge transistor 310 and an output buffer 320.
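The logical behavior of the conventional match line described so far (output transistor ON for a mismatched bit, match line held high only when every output transistor in the row is OFF) can be sketched in software. This is only a behavioral illustration, not a circuit model; the function names `cell_output_on`, `match_line_high`, and `search` are hypothetical and do not appear in the specification.

```python
from typing import List, Sequence


def cell_output_on(stored_bit: int, address_bit: int) -> bool:
    """Output transistor 240 is ON for a mismatch and OFF for a match."""
    return stored_bit != address_bit


def match_line_high(row: Sequence[int], address: Sequence[int]) -> bool:
    """The pre-charged match line stays high only if all N output
    transistors in the row are OFF (every bit matches)."""
    if len(row) != len(address):
        raise ValueError("row and address must both be N bits wide")
    return not any(cell_output_on(s, a) for s, a in zip(row, address))


def search(cam: Sequence[Sequence[int]], address: Sequence[int]) -> List[int]:
    """Compare the address against all M rows and return the indices of
    the rows whose match line remains high."""
    return [i for i, row in enumerate(cam) if match_line_high(row, address)]


if __name__ == "__main__":
    # Hypothetical 3-row by 4-column array contents, for illustration only.
    cam = [
        [1, 0, 1, 1],
        [0, 0, 1, 0],
        [1, 1, 1, 1],
    ]
    print(search(cam, [0, 0, 1, 0]))  # prints [1]
```

In hardware all M rows are evaluated simultaneously; the loop in `search` only stands in for that parallel comparison.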
The comparison of an entry for a row of CAM cells against the input address is performed as follows. Initially, the gate voltages of output transistors 240a through 240n are pre-set to logic low to turn OFF these transistors, and pre-charge transistor 310 is turned ON (by bringing the Pch control signal to logic low) to pre-charge match line 130x to a high level (e.g., $V_{DD}$). Pre-charge transistor 310 is then turned OFF, and the input address is written to address lines 132a through 132n (see FIG. 1A). The comparison circuit in each CAM cell in the row then operates to compare the stored bit in the CAM cell against the input address bit for that CAM cell. Depending on the stored value in each CAM cell and its input address bit, the output transistor for the CAM cell may be turned OFF for a match or turned ON for a mismatch, as described above.

If all N bits for the row are matched, then all N output transistors 240a through 240n are turned OFF, and match line 130x remains at the pre-charged level (e.g., of $V_{DD}$). Otherwise, if one or more bits are not matched, then each mismatched bit causes the corresponding output transistor to turn ON. If any of the N output transistors is turned ON, then those transistors discharge the match line (i.e., pull the match line to logic low or $V_{SS}$). Thus, the match line remains at logic high if the input address matches the stored content of the CAM cells in the row, and transitions to logic low if the input address does not match the stored content. Output buffer 320 buffers the match line and drives the subsequent circuitry.

As noted above, the match line configuration shown in FIG. 3A has several disadvantages related to speed and power. First, speed may be limited by the wired-OR design of the match line. Each row may include a large number of cells.
If only one bit in the entire row mismatches, then only one output transistor will be turned ON and this transistor would need to pull the entire match line toward $V_{SS}$. In this case, a long time may be required to discharge the match line, which would then limit the speed at which the CAM array may be operated. Second, excessive power may be consumed by discharging all match lines that mismatch (which is typically all but one match line) toward $V_{SS}$. These disadvantages are ameliorated by the match line configurations described below.

FIG. 3B is a block diagram of a differential sense circuit 410 that may be used to detect a signal (or voltage) on a match line. One sense circuit 410 may be coupled to each of the M match lines for the CAM array in FIG. 1A. Sense circuits 150 may thus include M sense circuits 410. Sense circuit 410 may be implemented with a current mirror type, a cross-coupled latch type, or some other design. A reference generator 411 provides a reference voltage for one input of sense circuit 410, and the match line couples to the other input of the sense circuit. Reference generator 411 may be implemented with dummy transistors (as described below), a voltage divider that can provide a constant voltage, or some other design.

FIG. 3C is a schematic diagram of an embodiment of a sense circuit 410a that may be used to detect a signal (or voltage) on a match line. In the embodiment shown in FIG. 3C, sense circuit 410a includes a pair of inverting amplifiers 412a and 412b cross-coupled so that the output of one amplifier drives the input of the other amplifier. Amplifiers 412a and 412b are thus coupled in a positive feedback circuit configuration. Transistor 418a couples to one input of amplifier 412a and to the match line at node M, and transistor 418b couples to one input of amplifier 412b and to an output from reference generator 411 at node D.
Nodes M and D effectively provide a differential drive for the pair of cross-coupled amplifiers 412a and 412b. Inverting buffers 424a and 424b provide buffering for the detected data bit from inverters 412a and 412b, respectively, and further drive the Out A and Out B outputs. The operation of sense circuit 410a is described below.

FIG. 4A is a schematic diagram of a match line detection mechanism 400, which may be used in conjunction with the inventive CAM cells 122 and dummy CAM cells 124 in CAM unit 100b in FIG. 1B, in accordance with an embodiment of the invention. Similar to FIG. 3A, match line 130x couples to N output transistors 240a through 240n for N CAM cells 122xa through 122xn and to output transistor 240x for dummy CAM cell 124x in a specific row of the CAM array. Match line 130x further couples to a P-channel transistor 310a, which is used to pre-charge the match line (e.g., to $V_{DD}$) at the start of each detection cycle. Match line 130x further couples to a first input (node M) of a sense circuit 410x, which is used to sense the signal or voltage on the match line. Sense circuit 410x is a specific embodiment of sense circuit 410 in FIG. 3B.

Dummy line 131x couples to N dummy transistors 242a through 242n for N CAM cells 122xa through 122xn and to dummy transistor 242x for dummy CAM cell 124x in the same row of the CAM array as the associated match line 130x. Dummy transistors 242x and 242a through 242n are used to generate a reference signal for sense circuit 410x, and may thus be viewed as one implementation of reference generator 411 in FIG. 3B. Dummy transistors 242a through 242n mimic the loading observed on match line 130x. Dummy line 131x also couples to a P-channel transistor 310b, which is used to pre-charge the dummy line at the start of each detection cycle. Dummy line 131x further couples to a second input (node D) of sense circuit 410x.

As shown in FIG.
4A, dummy transistors 242a through 242n for CAM cells 122xa through 122xn are each dimensioned with a normalized size of 1 (i.e., W/L = 1, where W is the width and L is the channel length of the transistor). Output transistors 240a through 240n for the CAM cells and output transistor 240x for dummy CAM cell 124x are each also dimensioned with the normalized size of 1. However, dummy transistor 242x for dummy CAM cell 124x is dimensioned with a normalized size of less than 1 (i.e., x < 1) and thus has reduced drive capability in comparison to each output transistor 240. In one specific embodiment, x = 0.5.

As also shown in FIG. 4A, all dummy transistors 242a through 242n in the CAM cells are turned OFF by grounding the gates of these N-channel dummy transistors. However, dummy transistor 242x for dummy CAM cell 124x may be turned ON and has a size that is only a fraction (e.g., half) of the size of the other output and dummy transistors. In the match situation, all of the transistors coupled to the match line (i.e., transistors 240a through 240n and 240x) will be turned OFF, and the match line will not be discharged. However, the dummy line will be discharged through dummy transistor 242x (which has a size that is a fraction x) and the dummy line voltage will be lower than the match line voltage. Conversely, in the mismatch situation, even if only one bit is mismatched, the match line will be discharged through the one or more transistors 240 for the mismatched CAM cells (which have a size of 1) at a speed faster than the dummy line. In this case, the match line voltage will be lower than the dummy line voltage.

In the specific embodiment of sense circuit 410x shown in FIG. 4A, N-channel transistors 418a and 418b have gates that couple together and to an En1 control signal and sources that couple to ground (e.g., $V_{SS}$). In an embodiment, amplifiers 412a and 412b are designed as inverters with gains, and are thus referred to as simply inverters.
Inverters 412a and 412b couple to transistors 418a and 418b, respectively, and further to inverting buffers 424a and 424b, respectively. Each inverter 412 comprises a P-channel transistor 414 coupled to an N-channel transistor 416. The gates of transistors 414a and 416a couple together and form one input of inverter 412a (node F). The source of transistor 414a couples to the drain of transistor 416a and forms the output of inverter 412a, which couples to the gates of transistors 414b and 416b and to the input of inverting buffer 424b. Similarly, the gates of transistors 414b and 416b couple together and form one input of inverter 412b (node G). The source of transistor 414b couples to the drain of transistor 416b and forms the output of inverter 412b, which couples to the gates of transistors 414a and 416a and to the input of inverting buffer 424a. The sources of N-channel transistors 416a and 416b couple to the drains of transistors 418a and 418b, respectively. The drains of transistors 414a and 414b couple together.

A P-channel transistor 422 has a gate that couples to an En2 control signal, a source that couples to the drains of transistors 414a and 414b, and a drain that couples to the upper voltage supply (e.g., $V_{DD}$). The inputs of inverting buffers 424a and 424b couple to the outputs of inverters 412b and 412a, respectively, and the outputs of buffers 424a and 424b drive the Out A and Out B outputs, respectively.

The voltage on node M represents the signal on the match line 130x to be detected. The voltage on node D represents the reference signal to which the voltage on node M is compared. Inverters 412a and 412b amplify the voltage difference between nodes M and D.

The reference signal at node D is generated by dummy transistors 242x and 242a through 242n. The reference signal may be determined, in part, by selecting the proper sizes for dummy transistor 242x and pre-charge transistor 310b, which is usually equal in size to transistor 310a.

FIG.
5A is a timing diagram for match line detection mechanism 400 in FIG. 4A. This timing diagram shows various control signals for sense circuit 410x to detect the signal (or voltage) on match line 130x, the voltages at nodes M and D, and the sense circuit outputs. The control signals are generated based on a clock signal, which is shown at the top of FIG. 5A for reference. The operation of the sense circuit is now described in reference to both FIGS. 4A and 5A.

Initially, prior to time $T_1$, the Pch and En2 control signals are at logic high, the En1 control signal is at logic low, and the voltages at nodes M and D are pre-set to $V_{SS}$. At time $T_1$, which may correspond to the rising (or leading) edge of the clock signal, the Pch control signal is brought to logic low, which then turns ON transistors 310a and 310b. At approximately the same time $T_1$, the address to be compared is written in through the address line (mbl) and its complementary address line ($\overline{mbl}$), and the comparison circuits for the CAM cells coupled to the match line are enabled. Each of the N output transistors 240 for these comparison circuits may thereafter be turned ON or OFF depending on its comparison result. In a typical design, the comparison circuits could be enabled either before or after time $T_1$ when the pre-charge is finished.

Upon being turned ON at time $T_1$, transistor 310a starts pre-charging match line 130x toward $V_{DD}$, and transistor 310b similarly starts pre-charging dummy line 131x toward $V_{DD}$. If there is a match between the input address and the contents of the CAM cells in the row corresponding to the match line, then all N output transistors 240a through 240n will be turned OFF, and transistor 310a is able to pre-charge the match line faster and to a higher voltage, as shown by plot 512 in FIG. 5A.
In comparison, since transistor 242x coupled to dummy line 131x is turned ON, transistor 310b is able to pre-charge the dummy line only at a slower rate, as shown by plot 514 in FIG. 5A. Thus, if there is a match, then the voltage on match line 130x is higher than the voltage on dummy line 131x.

Conversely, if there is a mismatch between the input address and the CAM cell contents, then at least one output transistor 240 coupled to match line 130x will be turned ON, and the match line will be pre-charged more slowly, as shown by plot 522 in FIG. 5A. Although transistor 242x coupled to dummy line 131x is also turned ON, it is only a fraction of the size of the output transistors 240 coupled to the match line and discharges at a fraction of the rate of transistor 240. As a result, transistor 310b is able to pre-charge the dummy line at a faster rate than the match line, as shown by plot 524 in FIG. 5A. Thus, if there is a mismatch, then the voltage on dummy line 131x is higher than the voltage on match line 130x.

At time $T_2$, the Pch control signal is brought to logic high, which then turns OFF transistors 310a and 310b. The pre-charge is stopped at this point. If there is a match, then all N output transistors 240a through 240n are turned OFF, and the voltage on the match line is maintained at the same level, as shown by plot 512 in FIG. 5A. In contrast, the voltage on the dummy line is continuously discharged (i.e., pulled toward $V_{SS}$) by the one dummy transistor 242x that is turned ON, and the voltage at node D is pulled lower as shown by plot 514 in FIG. 5A.

Conversely, if there is a mismatch, then at least one output transistor 240 coupled to the match line will be turned ON, and the voltage on the match line is discharged by the output transistor(s) that are turned ON, as shown by plot 522 in FIG. 5A.
Since the output transistor coupled to the match line is larger than the ON dummy transistor 242x coupled to the dummy line, the match line is pulled toward $V_{SS}$ at a faster rate. Moreover, since the voltage on the match line is already lower than that on the dummy line for a mismatch, the voltage on the match line will continue to fall further below that on the dummy line as both the match and dummy lines are pulled toward $V_{SS}$ starting at time $T_2$.

At time $T_3$, the En1 control signal is brought to logic high and the En2 control signal is brought to logic low. The logic high on the En1 control signal turns ON transistors 418a and 418b, and the logic low on the En2 control signal turns ON transistor 422. These control signals enable sense circuit 410x by turning ON transistors 418a, 418b, and 422.

With sense circuit 410x enabled, the voltages at nodes M and D are detected and the voltage difference is amplified by the pair of inverters 412a and 412b cross-coupled to provide positive feedback. Inverters 412a and 412b then drive their outputs to opposite rails, with the polarity being dependent on the sign of the detected voltage difference.

If there was a match, then the voltage on node M is higher than the voltage on node D, as shown by plots 512 and 514 in FIG. 5A. This then turns ON transistor 416b more (i.e., sinks more current), which then pulls node F lower. The lower voltage on node F turns ON transistor 414a more and turns OFF transistor 416a more. In this way, the voltage at node G is pulled high toward $V_{DD}$ and the voltage at node F is pulled low toward $V_{SS}$ (i.e., the voltages at nodes F and G are pulled apart and toward their respective rail voltages).

Conversely, if there was a mismatch, then the voltage on node D is higher than the voltage on node M, as shown by plots 522 and 524 in FIG. 5A. This then turns ON transistor 416a more, which then pulls node G lower. Transistor 414b is then turned ON more, which then pulls node F higher. The voltage at node F is thus pulled toward $V_{DD}$, and the voltage at node G is pulled toward $V_{SS}$. In a typical implementation, before sensing of the voltages on nodes D and M starts, nodes F and G are equalized as shown in FIG. 5A.

Thus, shortly after sense circuit 410x is enabled by the En1 and En2 control signals, inverters 412a and 412b sense the voltage on node M relative to the voltage on node D, and the sensed difference is provided via buffers 424a and 424b to the Out A and Out B outputs. At time $T_4$, Out A is at logic high if there was a match and at logic low if there was a mismatch, and Out B is at logic low if there was a match and at logic high if there was a mismatch, as shown by the plots for these outputs in FIG. 5A.

After time $T_3$, transistors 418a and 418b are turned ON and respectively pull the voltages at nodes M and D slowly toward $V_{SS}$, slowly because of the large capacitance from the large number of transistors coupled to these nodes.

If there was a match, then transistors 414a and 416b are both turned ON, and transistors 414b and 416a are both turned OFF. Transistor 414a pulls node G high toward $V_{DD}$. Since transistor 416a is turned OFF, no current conducts through inverter 412a after node G has been pulled high. Conversely, transistor 416b pulls node F low toward $V_{SS}$.
Since transistor 414b is turned OFF, no current conducts through inverter 412b after node F has been pulled low. Thus, once node F has been pulled low and node G has been pulled high, transistors 418a and 418b are able to discharge nodes M and D, respectively, and pull these nodes to $V_{SS}$, as shown in FIG. 5A. Nodes M and D are now ready for the next sense operation in the next clock cycle. The complementary actions occur if there was a mismatch, but the voltages at nodes M and D are also pulled to $V_{SS}$.

Match line detection mechanism 400 has several advantages over the conventional detection mechanism. Detection mechanism 400 may be operated at higher speed and lower power than conventional designs. First, as shown in FIG. 5A, the voltage on the match line is compared against the voltage on the dummy line. The voltages on both the match line and dummy line may be charged to only a fraction of $V_{DD}$ (instead of $V_{DD}$) for reliable detection of the signal on the match line. This may be achieved by (1) properly designing sense circuit 410x, (2) selecting the proper sizes for transistors 240, 242, and 242x and pre-charge transistors 310a and 310b, and (3) providing the proper control signals that determine the times $T_2$, $T_3$, and $T_4$. Second, sense circuit 410x is able to detect and amplify a small voltage difference between nodes M and D. And third, power consumption is reduced by limiting the signal swing to a fraction of $V_{DD}$ instead of the full $V_{DD}$, as shown in FIG. 5A. Power consumption is proportional to the square of the voltage swing, and a smaller signal swing results in lower power consumption.

FIG. 4B is a schematic diagram of a match line detection mechanism 401, which may also be used in conjunction with the inventive CAM cells 122 and dummy CAM cells 124 in CAM unit 100b in FIG. 1B, in accordance with an embodiment of the invention. Similar to FIG.
4A, match line 130x couples to N output transistors 240a through 240n for N CAM cells 122xa through 122xn, output transistor 240x for dummy CAM cell 124x, and pre-charge transistor 310a. Match line 130x further couples to a first P-channel pass transistor 426b, which couples the match line to sense circuit 410y. Sense circuit 410y is a specific embodiment of sense circuit 410 in FIG. 3B.

Dummy line 131x couples to N dummy transistors 242a through 242n for N CAM cells 122xa through 122xn, dummy transistor 242x for dummy CAM cell 124x, and pre-charge transistor 310b. Dummy line 131x further couples to a second P-channel pass transistor 426a, which couples the dummy line to sense circuit 410y.

In the specific embodiment of sense circuit 410y shown in FIG. 4B, an N-channel transistor 418c has a gate that couples to a Saen control signal, a source that couples to ground, and a drain that couples to the sources of transistors 416a and 416b. Transistors 416a and 416b and 418a and 418b are coupled as shown in FIG. 4A. However, the drains of transistors 418a and 418b couple directly to the upper voltage supply (e.g., $V_{DD}$).

Pass transistors 426a and 426b are used to respectively isolate the capacitance on the dummy and match lines from nodes D and M within sense circuit 410y. The capacitance on each of these lines is relatively high because a number of output or dummy transistors are coupled to the line. The isolation provided by pass transistors 426a and 426b allows sense circuit 410y to operate at a higher speed for sensing operation, since the internal nodes may be charged and discharged at a faster rate with reduced capacitance loading on the internal nodes.

FIG. 5B is a timing diagram for match line detection mechanism 401 in FIG. 4B.
This timing diagram shows various control signals for sense circuit 410y to detect the signal on match line 130x, the voltages at nodes M and D and nodes F and G, and the sense circuit outputs. The control signals are generated based on a clock signal, which is shown at the top of FIG. 5B for reference. Initially, prior to time $T_1$ , the Pch control signal is at logic low, and the voltages at nodes M and D are pre-charged to $V_{DD}$ . Nodes G and F are also pre-charged to $V_{DD}$ via pass transistors 426a and 426b, which are turned ON at this time. Near time $T_1$ , the Pch control signal is brought to logic high, which then turns OFF transistors 310a and 310b. At approximately the same time $T_1$ , the address to be compared is written to the address line, and the comparison circuits for the CAM cells are enabled. Each of the N output transistors 240 for these comparison circuits may thereafter be turned ON or OFF depending on its comparison result. If there is a match between the input address and the contents of the CAM cells, then all N output transistors 240a through 240n will be turned OFF, and the match line remains at its pre-charged level, as shown by plot 532 in FIG. 5B. In comparison, since transistor 242x coupled to dummy line 131x is turned ON, this transistor pulls the dummy line to a lower voltage, as shown by plot 534 in FIG. 5B. Thus, if there is a match, then the voltage on match line 130x is higher than the voltage on dummy line 131x. The Iso control signal is at logic low during this time, pass transistors 426a and 426b are turned ON, and the dummy and match lines are respectively coupled to nodes G and F of sense circuit 410y. At time $T_2$ , the Saen control signal is brought to logic high, which then turns ON transistor 418c and enables sense circuit 410y. The Iso control signal is also brought to logic high, which then turns OFF pass transistors 426a and 426b. 
The differential voltage between nodes G and F is then amplified by sense circuit 410y and Outputs A and B are provided as shown in FIG. 5B. At time $T_3$, the Pch control signal is brought to logic low, the pre-charge transistors 310a and 310b are turned ON, and the dummy and match lines are pulled toward $V_{DD}$. At time $T_4$, the Saen and Iso control signals are brought to logic low, the dummy and match lines are coupled to nodes G and F, and these nodes are pulled toward $V_{DD}$ by pre-charge transistors 310a and 310b to get ready for the next sensing cycle. The signal swing for the mismatch situation is also shown in FIG. 5B.

FIG. 6 is a schematic diagram of a match line detection mechanism 600, which may be used in conjunction with CAM cells 122 and 124 in CAM unit 100b in FIG. 1B, in accordance with another embodiment of the invention. Similar to FIG. 4A, match line 130x couples to N output transistors 240a through 240n for the N CAM cells in a specific row of the CAM array and further couples to P-channel transistor 310a. However, the sources of output transistors 240a through 240n are coupled to node M of sense circuit 410x via a first common line 610a, which may be implemented with a metal track in the circuit layout. A row of N dummy transistors 242a through 242n and 242x couples to dummy line 131x, which further couples to P-channel transistor 310b. The sources of dummy transistors 242a through 242n and 242x are coupled to node D of sense circuit 410x via a second common line 610b.

FIG. 7 is a timing diagram for match line detection mechanism 600 in FIG. 6. Similar to FIG. 5, FIG. 7 shows the control signals, the voltages at nodes M and D, and the sense amplifier outputs for the match line detection. The operation of detection mechanism 600 is now described in reference to both FIGS. 6 and 7.

The operation of sense circuit 410x in FIG. 6 is similar to that described above for detection mechanism 400 in FIG. 4A. Initially, prior to time $T_1$, the Pch and En2 control signals are at logic high, the En1 control signal is at logic low, and the voltages at nodes M and D are pre-set to $V_{SS}$. At time $T_1$, the Pch control signal is brought to logic low, which then turns ON transistors 310a and 310b. Near time $T_1$, each of the N output transistors 240 for the CAM cells coupled to the match line is turned ON or OFF based on its comparison result.

If there is a match, then all N output transistors 240 are turned OFF, and the voltage on common line 610a is maintained at $V_{SS}$, as shown by plot 712 in FIG. 7, even though match line 130x is pulled toward $V_{DD}$. In contrast, the voltage on common line 610b is pulled toward $V_{DD}$ by the one dummy transistor 242x that is turned ON, as shown by plot 714 in FIG. 7. Thus, the voltage on common line 610b for the dummy transistors is higher than the voltage on common line 610a for the output transistors for a match.

Conversely, if there is a mismatch, then at least one output transistor 240 is turned ON, and common line 610a is pulled toward $V_{DD}$ by the ON transistor(s), as shown by plot 722 in FIG. 7. Since the output transistors 240 coupled to the match line are larger than the ON dummy transistor 242x coupled to the dummy line, the match line is pulled toward $V_{DD}$ at a faster rate. Thus, the voltage on common line 610a for the output transistors is higher than the voltage on common line 610b for the dummy transistors for a mismatch.

At time $T_2$, the Pch control signal is brought to logic high, transistors 310a and 310b are both turned OFF, and the voltages on the match line, dummy line, and common lines 610a and 610b are maintained for both the match and mismatch cases. If there was a match, then the voltage on node D is higher than the voltage on node M when transistors 310a and 310b are turned OFF, as shown by plots 712 and 714 in FIG. 7. Conversely, if there was a mismatch, then the voltage on node M is higher than the voltage on node D when transistors 310a and 310b are turned OFF, as shown by plots 722 and 724 in FIG. 7.

At time $T_3$, the En1 control signal is brought to logic high, the En2 control signal is brought to logic low, and transistors 418a, 418b, and 422 are turned ON. Inverters 412a and 412b within sense circuit 410x are then enabled. Inverters 412a and 412b then detect the voltage difference between nodes M and D and further amplify the detected voltage difference. If there was a match, then the voltage on node D will be higher than the voltage on node M (as shown by plots 712 and 714 in FIG. 7), the outputs of inverters 412b (node F) and 412a (node G) will be driven to logic high and logic low, respectively, and the Out A and Out B outputs will be driven to logic low and logic high, respectively. Conversely, if there was a mismatch, then the voltage on node M will be higher than the voltage on node D (as shown by plots 722 and 724 in FIG. 7), the outputs of inverters 412b (node F) and 412a (node G) will be driven to logic low and logic high, respectively, and the Out A and Out B outputs will be driven to logic high and logic low, respectively.

Starting at time $T_3$, transistors 418a and 418b respectively pull common lines 610a and 610b toward $V_{SS}$. Transistors 418a and 418b should be turned ON long enough to pull the voltage on these common lines to near $V_{SS}$, to prepare for the next sensing cycle.

Match line detection mechanism 600 is a different approach in comparison to match line detection mechanism 400 in FIG. 4A. Detection mechanisms 400 and 600 may be operated at a higher clock speed since it is not necessary to completely pre-charge the match line to $V_{DD}$ and also not necessary to pull the match line to $V_{DD}$ or $V_{SS}$ after the pre-charge period (after the Pch signal has transitioned to logic high). This is because the differential sensing mechanism 410x can detect a small voltage difference between nodes D and M. Match line detection mechanisms 400 and 600 also achieve low power operation since the match line and dummy line operate with a small voltage swing rather than a full swing from $V_{SS}$ to $V_{DD}$.

The sense circuits described herein may be used to detect the signal on a match line coupled to a row of "ternary" CAM cells. A ternary CAM cell is one that includes two memory cells or storage elements, with one cell being used to store a data bit and the other cell being used to store a control bit to indicate whether or not a comparison is to be performed for that CAM cell. The additional (or secondary) cell may thus be used to selectively enable or disable the ternary CAM cell from being used in the comparison. If the ternary CAM cell is disabled, then its output does not affect the logic level on the match line to which it is coupled.

FIG. 8A is a schematic diagram of an embodiment of a conventional ternary CAM cell 120y, which may be used for each of the CAM cells 120 in FIG. 1A. CAM cell 120y includes a memory cell 210y, a secondary cell 250y, and a comparison circuit 230y. Memory cell 210y operates in similar manner as that described above for memory cell 210x in FIG. 2B and is used to store a single data bit. Secondary cell 250y is similar in design to memory cell 210y and is used to store a single control bit. Secondary cell 250y may be programmed in similar manner as for memory cell 210y, and may further utilize the same bit line (bl and $\overline{bl}$).

Comparison circuit 230y comprises a pair of N-channel transistors 232a and 232b and a pair of N-channel output transistors 240 and 241. Transistors 232a and 232b are coupled to memory cell 210y in similar manner as shown in FIG. 2B for CAM cell 120x. Output transistors 240 and 241 are coupled in series and to cells 210y and 250y.
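The masking behavior of a ternary CAM cell described above can be sketched in a few lines of Python. This is a behavioral illustration only, not circuitry taken from the patent: the function names are hypothetical, and the sketch assumes that a control bit of 1 means the cell is enabled for comparison.

```python
def cell_pulls_match_line(data_bit, control_bit, address_bit):
    """Return True when this cell would pull the match line low.

    Assumption: control_bit == 1 means the cell is enabled for comparison.
    A disabled cell never affects the match line.
    """
    mismatch = data_bit != address_bit
    enabled = control_bit == 1
    return enabled and mismatch


def row_matches(data_bits, control_bits, address_bits):
    """A row matches when no enabled cell reports a mismatch."""
    return not any(
        cell_pulls_match_line(d, k, a)
        for d, k, a in zip(data_bits, control_bits, address_bits)
    )


if __name__ == "__main__":
    stored = [1, 0, 1]
    # All cells enabled: the second bit differs, so the row mismatches.
    print(row_matches(stored, [1, 1, 1], [1, 1, 1]))
    # Second cell disabled: its mismatch is ignored and the row matches.
    print(row_matches(stored, [1, 0, 1], [1, 1, 1]))
```

Disabling a cell through its control bit turns that bit position into a "don't care", which is what lets one stored entry match a range of input addresses.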
In particular, output transistor 241 has its drain coupled to a match line 130y for the row to which CAM cell 120y belongs, its source coupled to the drain of transistor 240, and its gate (labeled as node "K") coupled to the mask line from secondary cell 250y. Output transistor 240 has its source coupled to circuit ground (e.g., $V_{SS}$) and its gate (labeled as node "C") coupled to the drains of transistors 232a and 232b. Output transistors 240 and 241 effectively implement a NAND gate.

Comparison circuit 230y operates as follows. If the address bit is not the same as the stored data bit in memory cell 210y, then node C will be at logic high to indicate a mismatch. If the control bit on the mask line is at logic high, indicating that the ternary CAM cell is enabled, then node K will also be at logic high. If nodes C and K are both at logic high, then output transistors 240 and 241 are both turned ON, and match line 130y is pulled to logic low (e.g., toward $V_{SS}$). Otherwise, if node C is at logic low because of a match or node K is at logic low because the ternary CAM cell is disabled, then one or both of the output transistors will be turned OFF and these transistors will not actively operate on match line 130y. Thus, comparison circuit 230y of ternary CAM cell 120y only pulls the match line to logic low if the CAM cell is enabled for comparison and there was a mismatch between its data bit and the address bit.

FIG. 8B is a schematic diagram of an embodiment of a ternary CAM cell 122y, which may be used for each of the CAM cells 122 in FIG. 1B. CAM cell 120y includes a memory cell 210y, a secondary cell 250y, and a comparison circuit 231y. Memory cell 210y and secondary cell 250y operate in similar manner as that described above for ternary CAM cell 120y in FIG. 8A, and are used to store a single data bit and a single control bit, respectively.

Comparison circuit 231y comprises the pair of N-channel transistors 232a and 232b and the pair of N-channel output transistors 240 and 241, which are coupled in similar manner as described above in FIG. 8A. Comparison circuit 231y further comprises a pair of N-channel dummy transistors 242 and 243, which are coupled in series and to dummy line 131y. In particular, dummy transistor 243 has its drain coupled to dummy line 131y for the row to which CAM cell 120y belongs, its source coupled to the drain of transistor 242, and its gate (labeled as node "Ki") coupled to the inverted mask output of secondary cell 250y. Dummy transistor 242 has its source coupled to circuit ground (e.g., $V_{SS}$) and its gate (labeled as node "Ki") coupled to the mask output of secondary cell 250y.

Dummy transistors 242 and 243 provide the proper loading for dummy line 131y. Dummy transistors 242 and 243 have similar physical dimension as output transistors 240 and 241. In an embodiment, dummy transistors 242 and 243 are located near output transistors 240 and 241 and are oriented in the same direction. The output of the pair of dummy transistors 242 and 243 is always OFF since the gate inputs are complementary.

FIG. 8C is a schematic diagram of an embodiment of a dummy ternary CAM cell 124y, which may be used for each of the dummy CAM cells 124 in FIG. 1B. Dummy CAM cell 124y includes a memory cell 210y, a secondary cell 250y, and a comparison circuit 233y. Memory cell 210y and secondary cell 250y operate in similar manner as that described above for ternary CAM cell 120y in FIG. 8A, and are used to store a single data bit and a single control bit, respectively.

Comparison circuit 233y includes circuitry used to drive match line 130y and dummy line 131y. In particular, comparison circuit 233y comprises transistors 232a, 232b, and output transistors 240x and 241x coupled in the manner described above with reference to FIG. 8A and used to drive match line 130y. Comparison circuit 233y further comprises a second pair of N-channel transistors 234a and 234b and a second pair of output transistors 242x and 243x used to drive dummy line 131y. Transistors 234a and 234b and output transistors 242x and 243x are coupled in similar manner as transistors 232a and 232b and output transistors 240x and 241x for the match line, except that the gates of transistors 234a and 234b couple to the data line (d) and the complementary data line ($\overline{d}$), respectively.

The output of the pair of transistors 240x and 241x and the output of the pair of transistors 242x and 243x are complementary. When the output of transistor pair 240x and 241x is OFF, the output of transistor pair 242x and 243x is ON and pulls down the dummy line at a fraction of the speed of the match line if there is at least one bit mismatch. Conversely, when the output of transistor pair 242x and 243x is OFF, the dummy line will not be pulled down. But the output of transistor pair 240x and 241x will be ON and the match line will be pulled down. This would then indicate a mismatch and this row is disabled.

FIG. 9A is a schematic diagram of a match line detection mechanism 900, which may be used in conjunction with ternary CAM cells 122y and 124y in CAM unit 100b in FIG. 1B, in accordance with yet another embodiment of the invention. Similar to FIG. 4A, a match line 130y couples to N pairs of output transistors 240a and 241a through 240n and 241n for the N ternary CAM cells 124ya through 124yn and also to transistors 240x and 241x for dummy CAM cell 124y in a specific row of the CAM array. The gates of output transistors 240a through 240n couple to the comparison circuit outputs (labeled as C1 through CN) for the N ternary CAM cells, and the gates of output transistors 241a through 241n couple to the mask outputs (labeled as K1 through KN) of the secondary cells for the N ternary CAM cells. The gates of output transistors 240x and 241x respectively couple to the comparison circuit outputs (labeled as Cd) and the secondary cell inverted mask output (labeled as Kd) for dummy ternary CAM cell 124y. Match line 130y further couples to P-channel transistor 310a and a first input of a sense circuit 410y, which is used to sense the signal on the match line.

Dummy line 131y couples to N pairs of dummy transistors 242a and 243a through 242n and 243n for the N ternary CAM cells 124ya through 124yn and also to transistors 242x and 243x for dummy CAM cell 124y within the same row as the associated match line 130y. The gates of dummy transistors 242a through 242n couple to the inverted mask outputs of the secondary cells, and the gates of dummy transistors 243a through 243n couple to the mask outputs of the secondary cells. With this connection, the N pairs of dummy transistors 242a and 243a through 242n and 243n are always turned OFF. The gates of dummy transistors 242x and 243x are respectively coupled to the comparison circuit complementary output (labeled as Cd) and the mask output (labeled as Kd) for dummy ternary CAM cell 124y. This dummy transistor pair is turned ON. Again, transistors 242x and 243x are dimensioned to be a fraction (e.g., half) of the size of the other output transistors. Dummy line 131y further couples to P-channel transistor 310b and the second input (node D) of a sense circuit 410y.

In the specific embodiment shown in FIG. 9A, sense circuit 410x includes inverters 412a and 412b, N-channel transistors 418a and 418b, P-channel transistor 422, and inverting buffers 424a and 424b, which are coupled together as described above for sense circuit 410x in FIG. 4A. Sense circuit 410x may be used to detect the signal on match line 130y in similar manner as that described above for detection mechanism 400 in FIG.
4A and shown by the timing diagram in FIG. 5.

FIG. 9B is a schematic diagram of a match line detection mechanism 901, which may also be used in conjunction with ternary CAM cells 122y and 124y in CAM unit 100b. Match line detection mechanism 901 is similar to match line detection mechanism 900 in FIG. 9A. However, match line 130y further couples to P-channel pass transistor 426b and dummy line 131y further couples to P-channel pass transistor 426a. Pass transistors 426a and 426b respectively couple the dummy and match lines to sense circuit 410y, similar to the embodiment shown in FIG. 4B. The operation of match line detection mechanism 901 is as described above for FIGS. 4B and 9A.

FIG. 10 is a schematic diagram of a match line detection mechanism 1000, which may be used in conjunction with ternary CAM cells 122y and 124y in CAM unit 100b in FIG. 1B, in accordance with yet another embodiment of the invention. Similar to FIGS. 6 and 9, match line 130y couples to N pairs of output transistors 240a and 241a through 240n and 241n for the N ternary CAM cells 122 and also to output transistors 240x and 241x for the dummy ternary CAM cell 124 in a specific row of the CAM array. However, the sources of output transistors 241a through 241n are coupled to node M of sense circuit 410y via first common line 610x. Similarly, the sources of dummy transistors 242a through 242n are coupled to node D of sense circuit 410y via second common line 610y.

FIG. 10 also shows an embodiment of a sense circuit 410y. Sense circuit 410y includes inverters 412a and 412b, N-channel transistors 418a and 418b, P-channel transistor 422, and inverting buffers 424a and 424b, which are coupled together as described above for sense circuit 410x in FIG. 4A. Sense circuit 410y further includes an N-channel transistor 420, a P-channel transistor 430, and an inverter 432. P-channel transistor 430 is coupled in parallel with N-channel transistor 420. The sources of transistors 420 and 430 couple to node F, the drains of transistors 420 and 430 couple to node G, the gate of transistor 420 couples to the input of inverter 432, and the gate of transistor 430 couples to the output of inverter 432. The input of inverter 432 couples to an En3 control signal. Transistors 420 and 430 form a switch that shorts out nodes F and G when enabled by the En3 control signal. The transistors 420 and 430 are used to equalize nodes G and F in each cycle before a match comparison. In a typical implementation of all the above embodiments, these two transistors will be provided to equalize nodes F and G before each match comparison.

Sense circuit 410y may be used to detect the signal on common line 610x in similar manner as that described above for detection mechanism 600 in FIG. 6 and shown by the timing diagram in FIG. 7. Sense circuit 410y may also be used for match line detection mechanisms 400, 600, and 900.

For clarity, specific designs of the sense circuit have been described herein. Various modifications to these circuit designs may also be made, and this is within the scope of the invention. For example, for sense circuit 410x, inverters 412a and 412b may be coupled to match line 130x or common line 610x via some other configuration, and so on.

The specific timing diagrams shown in FIGS. 5 and 7 are also provided to illustrate the operation of the sense circuit and the match line detection. Variations to the timing shown in FIGS. 5 and 7 may also be made, and this is within the scope of the invention. For example, the En1 control signal may be brought to logic high at time $T_2$ when the Pch control signal is brought to logic high.

The sense circuits and match line detection mechanisms described herein may be used to provide a CAM array having faster speed of operation and lower power consumption.
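The differential sensing idea common to the pre-charged embodiments above can be illustrated with a small numerical sketch. It is not taken from the patent. It assumes a simple linear discharge of lines pre-charged to $V_{DD}$, a dummy transistor with half the drive of one output transistor (the "e.g., half" sizing mentioned above), and arbitrary values for supply voltage, line capacitance, transistor current, and sense time. All names are hypothetical.

```python
# Illustrative model only; every constant below is an assumed value.
VDD = 1.8          # supply voltage in volts
C_LINE = 1.0       # line capacitance, arbitrary units
I_UNIT = 1.0       # pull-down current of one ON output transistor, arbitrary units
DUMMY_RATIO = 0.5  # dummy transistor drive relative to one output transistor


def line_voltage(pull_strength, t):
    """Voltage of a line pre-charged to VDD after discharging for time t."""
    drop = pull_strength * I_UNIT * t / C_LINE
    return max(0.0, VDD - drop)


def sense_row(num_mismatched_bits, t_sense=0.2):
    """Return True (match) when the match line sits above the dummy line."""
    v_match = line_voltage(num_mismatched_bits, t_sense)  # one ON transistor per mismatched bit
    v_dummy = line_voltage(DUMMY_RATIO, t_sense)          # single half-size dummy transistor
    return v_match > v_dummy


if __name__ == "__main__":
    for mismatches in (0, 1, 3):
        print(mismatches, sense_row(mismatches))
```

In this model the decision is taken after both lines have moved only a fraction of the supply voltage, which is the property the text credits for the speed and power benefits.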
These circuits may also be used for other types of memory (e.g., dynamic random access memory or DRAM), and other integrated circuits (e.g., microprocessors, controllers, and so on).

The circuits described herein may also be implemented in various semiconductor technologies, such as CMOS, bipolar, bi-CMOS, GaAs, and so on.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

What is claimed is:

- 1. A content addressable memory (CAM) cell comprising:
  - a memory cell operable to store a bit value; and
  - a comparison circuit coupled to the memory cell and configured to detect the bit value stored in the memory cell, the comparison circuit including
    - an output transistor coupled to a match line and configured to provide a drive for the match line based on the detected bit value, and
    - a dummy transistor coupled to a dummy line and configured to provide a drive for the dummy line based on an inverted detected bit value,
  - wherein the match line and dummy line are used to detect output values provided by other CAM cells also coupled to the match and dummy lines.
- 2. The CAM cell of claim 1, wherein the comparison circuit further includes
  - a first pair of transistors configured to receive the detected bit value and provide a drive for the output transistor,
  - a second pair of transistors configured to receive the inverted detected bit value and provide a drive for the dummy transistor.
- 3. The CAM cell of claim 1, wherein the dummy transistor has a smaller dimension and less current flowing through than the output transistor and is located in close proximity to the output transistor.
- 4. The CAM cell of claim 3, wherein the dummy transistor is approximately half the dimension of the output transistor and is turned ON during sensing operation.
- 5. A content addressable memory (CAM) cell comprising:
  - a memory cell operable to store a data bit value;
  - a secondary cell operable to store a control bit value and a complementary control bit value; and
  - a comparison circuit coupled to the memory cell and the secondary cell and configured to detect the data bit value stored in the memory cell and the control bit value and the complementary control bit value stored in the secondary cell, the comparison circuit including
    - a pair of output transistors coupled to a match line and configured to provide a drive for the match line based on the detected data bit value and the detected control bit value, and
    - a pair of dummy transistors coupled to a dummy line and configured to provide a drive for the dummy line based on the detected control bit value and the detected complementary control bit value,
  - wherein the match line and the dummy line are used to detect an output value provided by the CAM cell.
- 6. The CAM cell of claim 5, wherein the dummy transistors have similar dimensions and orientation as the output transistors and located in close proximity to the output transistors.
- 7. The CAM cell of claim 5, wherein an output of the pair of dummy transistors are OFF during sensing operation.
- 8. A content addressable memory (CAM) cell comprising:
  - a memory cell operable to store a data bit value;
  - a secondary cell operable to store a control bit value; and
  - a comparison circuit coupled to the memory cell and the secondary cell and configured to detect the data bit value stored in the memory cell and the control bit value stored in the secondary cell, the comparison circuit including
    - a pair of output transistors coupled to a match line and configured to provide a drive for the match line based on the detected data bit value and the detected control bit value, and
    - a pair of dummy transistors coupled to a dummy line and configured to provide a drive for the dummy line based on an inverted detected data bit value and the detected control bit value.
- 9. The CAM cell of claim 8, wherein the comparison circuit further includes
  - a first pair of transistors configured to receive the detected data bit value and provide a drive for a first output transistor, and
  - a second pair of transistors configured to receive the inverted detected bit value and provide a drive for a first dummy transistor.
- 10. The CAM cell of claim 8, wherein the dummy transistors have smaller dimension and less current flowing through than the output transistors, are located in close proximity to the output transistors, and are turned ON during sensing operation.
- 11. The CAM cell of claim 10, wherein the dummy transistors are approximately half the dimension of the output transistors.
- 12. A sense circuit for sensing a logic state of a match line in a content addressable memory (CAM), comprising:
  - a plurality of dummy transistors operative to provide a reference signal;
  - a first amplifier having an input operatively coupled to the match line, wherein the match line is coupled to a plurality of output transistors for a plurality of CAM cells, each output transistor providing an output value indicative of a comparison result for a respective CAM cell; and
  - a second amplifier having an input configured to receive the reference signal, wherein the first and second amplifiers are coupled in a positive feedback configuration and operative to amplify a difference between a signal on the match line and the reference signal,
  - wherein the plurality of output transistors are N-channel transistors having drains that couple to the match line and sources that couple to a first common line, and wherein the input of the first amplifier is coupled to the first common line.
- 13. The sense circuit of claim 12, wherein the plurality of dummy transistors are N-channel transistors having drains that couple to a dummy line and sources that couple to a second common line, and wherein the input of the second amplifier is coupled to the second common line.
- 14. The sense circuit of claim 13, wherein the match line and the dummy line are pre-charged prior to being sensed by the sense circuit.
- 15. A sense circuit for sensing a logic state of a match line in a content addressable memory (CAM), comprising:
  - a plurality of dummy transistors operative to provide a reference signal;
  - a first amplifier having an input operatively coupled to the match line, wherein the match line is coupled to a plurality of pairs of output transistors for a plurality of CAM cells, each pair of output transistors providing an output value indicative of a comparison result for a respective CAM cell; and
  - a second amplifier having an input configured to receive the reference signal, wherein the first and second amplifiers are coupled in a positive feedback configuration and operative to amplify a difference between a signal on the match line and the reference signal,
  - wherein each pair of output transistors comprises a pair of series-coupled N-channel transistors having one drain coupled to the match line and one source coupled to a first common line, and wherein the input of the first amplifier is coupled to the first common line.
- 16. The sense circuit of claim 15, wherein the plurality of dummy transistors comprise a plurality of pairs of series-coupled N-channel dummy transistors, each pair of dummy transistors having one drain coupled to a dummy line and one source coupled to a second common line, and wherein the input of the second amplifier is coupled to the second common line.
- 17.
A method for sensing a logic state of a match line in a content addressable memory (CAM), comprising:
  - sensing a signal on a first common line, wherein the signal on the first common line is related to a signal on the match line;
  - providing a reference signal on a second common line based on a plurality of dummy transistors;
  - determining a difference between the sensed signal on the first common line and the reference signal on the second common line;
  - amplifying the determined difference with a positive feedback amplifier; and
  - providing an output value indicative of the logic state of the match line based on the amplified difference.

\* \* \* \* \*

Exhibit C

US006999331B2

# (12) United States Patent

Huang

(10) Patent No.: US 6,999,331 B2

(45) Date of Patent: Feb. 14, 2006

# CAM CELLS AND DIFFERENTIAL SENSE CIRCUITS FOR CONTENT ADDRESSABLE MEMORY (CAM)

Inventor: Xiaohua Huang, 12897 Regan La., Saratoga, CA (US) 95070

(\*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 86 days.

(21) Appl. No.: 10/789,661

(22) Filed: Feb. 27, 2004

(65) **Prior Publication Data**

US 2004/0228156 A1, Nov. 18, 2004

# Related U.S. Application Data

- Continuation of application No. 10/202,621, filed on Jul. 24, 2002, now Pat. No. 6,744,653.
- Provisional application No. 60/327,049, filed on Oct. 4, 2001.
- (51) Int. Cl.: G11C 15/00 (2006.01)
- **U.S. Cl.**: **365/49**; 365/207; 365/210; 711/108
- Field of Classification Search: 365/49, 365/207, 210; 711/108. See application file for complete search history.

#### (56) References Cited

#### U.S. PATENT DOCUMENTS

| Patent No. | Kind | Cited by examiner | Date | Inventor | U.S. Class |
|-----------|------|---|---------|----------------|---------|
| 5,162,681 | A | | 11/1992 | Lee | 327/53 |
| 5,446,686 | A | | 8/1995 | Bosnyak et al. | 365/49 |
| 5,598,115 | A | | 1/1997 | Holst | 326/119 |
| 6,195,277 | B1 | | 2/2001 | Sywyk et al. | 365/49 |
| 6,307,798 | B1 | | 10/2001 | Ahmed et al. | 365/207 |
| 6,373,738 | B1 | \* | 4/2002 | Towler et al. | 365/49 |
| 6,442,054 | B1 | | 8/2002 | Evans et al. | 365/49 |
| 6,744,653 | B1 | | 6/2004 | Huang | 365/49 |

\* cited by examiner

Primary Examiner: Anh Phung

Assistant Examiner: J. H. Hur

#### (57) ABSTRACT

A dummy Content-addressable memory (CAM) cell and a dummy Ternary Content-addressable memory (TCAM) cell are connected to each row in a CAM and a ternary CAM array, respectively, to enable a differential match line sensing based on the content stored. The ternary content-addressable memory (TCAM) cell is for a differential match line sensing in low power applications. A method includes generating a voltage difference between match line signal and a reference line signal, and then detecting and amplifying the voltage difference to determine a match or a mismatch.

9 Claims, 18 Drawing Sheets

[Drawing Sheets 1 through 18 of U.S. Patent US 6,999,331 B2, dated Feb. 14, 2006, are not reproduced here. The surviving labels identify FIG. 2E on Sheet 5, FIG. 5A on Sheet 9, and FIG. 5B on Sheet 10.]

## CAM CELLS AND DIFFERENTIAL SENSE CIRCUITS FOR CONTENT ADDRESSABLE MEMORY (CAM)

This application is a continuation of Ser. No.
10/202,621 filed Jul. 24, 2002, now U.S. Pat. No. 6,744,653, which claims the benefit of provisional U.S. Application Ser. No. 60/327,049, entitled "High-Speed and Low Power Content Addressable Memory (CAM) Sensing Circuits," filed Oct. 4, 2001, which is incorporated herein by reference in its entirety for all purposes.

## BACKGROUND OF THE INVENTION

The present invention relates generally to semiconductor circuits, and more specifically to CAM cells and high speed and low power sense circuits for content addressable memory.

A content addressable memory (CAM) is a memory having an array of memory cells that can be commanded to compare all or a subset of the "entries" in the array against an input address. Each entry in the CAM array corresponds to the content of the cells in a particular row of the array. Each row of the array is further associated with a respective match line, which is used as a status line for the row. All or a portion of the CAM array may be compared in parallel to determine whether or not the input address matches any of the entries in the portion selected for comparison. If there is a match to an entry, then the match line for the corresponding row is asserted to indicate the match. Otherwise, the match line is de-asserted to indicate a mismatch (which may also be referred to as a "miss"). Typically, any number of match lines may be asserted, depending on the entries in the array and the input address.

In a typical CAM design, the comparison between a bit of the input address and the content of a CAM cell is performed by a comparison circuit included in the cell. The comparison circuits for all cells in each row may then be coupled to the match line for the row. For simplicity, the comparison circuits may be designed such that a wired-OR operation is implemented for the outputs from all comparison circuits coupled to any given match line.
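The row-parallel search and wired-OR match line described above can be sketched behaviorally in Python. This is an illustration only, not a description of any particular circuit: the function name `search` is hypothetical, entries are modeled as lists of bits, and the sketch assumes that each cell's comparison output signals a mismatch, so a row's match line stays asserted only when no cell in that row signals one.

```python
def search(entries, address):
    """Return one match-line state per row (True means asserted, i.e. a match)."""
    match_lines = []
    for entry in entries:
        # Per-cell comparison outputs: True where the stored bit differs from the address bit.
        mismatch_outputs = [stored != bit for stored, bit in zip(entry, address)]
        # Wired-OR of the mismatch outputs: any single mismatch de-asserts the line.
        match_lines.append(not any(mismatch_outputs))
    return match_lines


if __name__ == "__main__":
    cam = [
        [1, 0, 1, 1],
        [0, 0, 1, 1],
        [1, 0, 1, 1],
    ]
    # Rows 0 and 2 hold the same entry, so more than one match line may be asserted.
    print(search(cam, [1, 0, 1, 1]))
```

In hardware all rows are compared at once; the loop here only stands in for that parallelism.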
In one common design, the output for each comparison circuit is formed by the drain of an N-channel output transistor. This output transistor is turned ON if there is a mismatch between the input address 45 bit and the memory cell content and is turned OFF otherwise. The match line may be pre-charged to a logic high prior to each comparison operation, and would thereafter remains at logic high only if all output transistors for the row are turned OFF, which would be the case if there is a match 50 between all bits of the entry for the row and the input address. Otherwise, if at least one output transistor is turned ON due to a mismatch, then the match line would be pulled low by these transistors. The signal (or voltage) on the match line may thereafter be sensed or detected to determine 55 whether or not there was a match for that row. The conventional CAM cell and CAM sensing mechanism described above, though simple in design, have several drawbacks that affect performance. First, speed may be limited by the wired-OR design of the match line, if some 60 speed-enhancing techniques are not employed. Each row may include a large number of cells (e.g., possibly 100 or more cells). In this case, if only one bit in the entire row does not match, then only one output transistor will be turned ON (e.g., from $V_{DD}$ to $V_{SS}$ ). A long time (i.e., $t=C \cdot V_{DD}^2/I$ , where C is the capacitance of each entire match line and I is the current of each transistor) may then be required to discharge the line, which would then limit the speed at which the CAM array may be operated. Second, excessive power may be consumed by the CAM design described above. Typically, only one row will match the input address, and all other rows will not match. In this case, all but one match line will be pulled to logic low (e.g., to $V_{SS}$ ) by the output transistors that are turned ON due to mismatches. 
The power consumed may then be computed as $(M-1) \cdot C \cdot V_{DD}^2$, where (M-1) is the number of mismatched rows, C is the capacitance of each match line, and $V_{DD}$ is the voltage swing of the match line during discharge.

As can be seen, there is a need for CAM cells and sense circuits that can ameliorate the shortcomings related to speed and power in the conventional design.

## SUMMARY OF THE INVENTION

The invention provides CAM cell designs having improved performance over a conventional design. The invention further provides techniques to detect the signal (or voltage) on a match line coupled to a number of CAM cells and having faster speed of operation and possibly lower power consumption.

In an aspect, a content addressable memory (CAM) cell is provided having improved performance. The CAM cell includes a memory cell operable to store a bit value and a comparison circuit configured to detect the bit value stored in the memory cell. The comparison circuit includes (1) an output transistor coupled to a match line and configured to provide a drive for the match line based on the detected bit value, and (2) a dummy transistor coupled to a dummy line. The match line and dummy line are used to detect an output value provided by the CAM cell. In an embodiment, the dummy transistor (1) has similar dimension as the output transistor, (2) is located in close proximity to the output transistor, and (3) is turned OFF during sensing operation. The dummy transistor is used to achieve low voltage swing (small signal) sensing and provides for low power and high-speed operation.

In another aspect, a sense circuit is provided for sensing a logic state of a match line in a content addressable memory (CAM). The sense circuit includes a pair of amplifiers cross-coupled in a positive feedback configuration. The first amplifier has one input operatively coupled to the match line, and the second amplifier has one input operative to receive a reference signal.
The match line is driven by a number of output transistors for a row of CAM cells. The reference signal is generated based on a row of dummy transistors that are similarly arranged as the output transistors. When enabled, the amplifiers detect the difference between the signals on the match line and the reference signal and further amplify the detected difference such that the logic value on the match line may be ascertained. The sense circuit may further include (1) a pair of pass transistors operatively coupled to the pair of amplifiers and used to enable the sense circuit, and (2) a switch coupled between outputs/inputs of the cross-coupled amplifiers and used to reset the amplifiers prior to each match line sense cycle.

In a specific implementation, the first and second amplifiers may each be implemented as an inverter with gain (e.g., a P-channel transistor coupled in series with an N-channel transistor). The match line is coupled to the output transistors for the row of CAM cells and may further be coupled directly to one input of the first amplifier. The dummy transistors couple to a dummy line that may further be coupled directly to one input of the second amplifier. Alternatively, the output transistors may also couple to a first common line that is coupled to the input of the first amplifier. In this case, the dummy transistors would similarly couple to a second common line that is coupled to the input of the second amplifier.

Various other aspects, embodiments, and features of the invention are also provided, as described in further detail below.

The foregoing, together with other aspects of this invention, will become more apparent when referring to the following specification, claims, and accompanying drawings.

#### BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A is a block diagram of a conventional content addressable memory (CAM) unit;

FIG.
1B is a block diagram of a CAM unit wherein certain aspects and embodiments of the invention may be implemented;

FIGS. 2A, 2B, and 2C are respectively a block diagram, a schematic diagram, and a logic diagram for an embodiment of a conventional CAM cell;

FIG. 2D is a schematic diagram of a binary CAM cell having improved performance;

FIG. 2E is a schematic diagram of a dummy binary CAM cell;

FIG. 3A is a schematic diagram of the driving circuits associated with a single match line;

FIG. 3B is a block diagram of a sense circuit;

FIG. 3C is a schematic diagram of an embodiment of a sense circuit that may be used to detect the signal on a match line;

FIGS. 4A and 4B are schematic diagrams of an embodiment of two match line detection mechanisms;

FIGS. 5A and 5B are timing diagrams for the match line detection mechanisms shown in FIGS. 4A and 4B, respectively;

FIG. 6 is a schematic diagram of another embodiment of a match line detection mechanism;

FIG. 7 is a timing diagram for the match line detection mechanism shown in FIG. 6;

FIG. 8A is a schematic diagram of an embodiment of a conventional ternary CAM cell;

FIG. 8B is a schematic diagram of a ternary CAM cell having improved performance;

FIG. 8C is a schematic diagram of a dummy ternary CAM cell; and

FIGS. 9A, 9B, and 10 are schematic diagrams of three match line detection mechanisms for ternary CAM cells.

# DESCRIPTION OF THE SPECIFIC EMBODIMENTS

FIG. 1A is a block diagram of a conventional content addressable memory (CAM) unit 100a. CAM unit 100a includes a CAM array 110a coupled to sense circuits 150a. CAM array 110a is a two-dimensional array of M rows by N columns of CAM cells 120. Each row of the CAM array includes N cells that collectively store data for an entry in the array. Each row is further associated with a respective match line 130 that couples to all CAM cells in the row and further couples to sense circuits 150a.
Each of the N columns of the CAM array is associated with a specific bit position of an N-bit input address. A differential address line 132 is provided for each address bit and couples to all cells in the corresponding column of the CAM array. In this way, each bit of the N-bit input address may be compared with each of the M bits stored in the M cells in the corresponding column. The N-bit input address may thus be provided to all M rows of the CAM array and simultaneously compared against all entries in the array.

Typically, before performing the comparison between the input address and the entries in the CAM array, the M match lines for the M rows of the array are pre-charged to logic high (e.g., $V_{DD}$). For each row, if any cell in the row is not matched to the corresponding address bit, then the output transistor for that cell is turned ON and the match line is pulled to logic low (e.g., $V_{SS}$). Thus, for any given row, the match line remains at logic high (i.e., not pulled to $V_{SS}$) only if the output transistors for all N cells in the row are turned OFF, which only occurs if each bit for the input address matches the bit in the corresponding cell of the row. The match line for each row is thus at logic high for a match between the entry in that row and the input address, and is at logic low if there is no match (i.e., a mismatch) between the entry and the input address.

FIG. 1B is a block diagram of a CAM unit 100b having improved performance. CAM unit 100b includes a CAM array 110b coupled to sense circuits 150b. CAM array 110b is a two-dimensional array of M rows by N columns of CAM cells 122. Each row of the CAM array includes N cells that collectively store data for an entry in the array. Each row is further associated with a respective match line 130 and a dummy line 131 that couple to all CAM cells in the row and further couple to sense circuits 150b.
CAM array 110b further includes a column of M dummy CAM cells 124, one dummy CAM cell for each row. Dummy CAM cells 124 allow for differential detection of the values stored in CAM cells 122, which are provided on match lines 130 and dummy lines 131, as described in further detail below.

FIG. 2A is a simple representation for a CAM cell 120x, which is one of many CAM cells 120 in FIG. 1A. CAM cell 120x receives a differential address line, mbl and $\overline{\text{mbl}}$, for a single bit of the input address and further couples to a single match line for one row of the CAM array.

FIG. 2B is a schematic diagram of a specific design of CAM cell 120x, which may be used for each of the CAM cells 120 in FIG. 1A. CAM cell 120x includes a memory cell 210x coupled to a comparison circuit 230x. Memory cell 210x (which may also be referred to as a storage element or storage cell) is used to store a single bit value. Comparison circuit 230x is used to compare the stored bit value against an address bit.

As shown in FIGS. 2A and 2B, memory cell 210x comprises a pair of cross-coupled inverters 212a and 212b. Each inverter 212 is formed by a P-channel transistor 214 coupled to an N-channel transistor 216, as shown in FIG. 2B. The gates of transistors 214 and 216 couple together and form the input of the inverter, and the drains of these transistors similarly couple together and form the output of the inverter. The output of inverter 212a couples to the input of inverter 212b, the drain of an N-channel transistor 218a, and a complementary data line ($\overline{\text{d}}$) 220a. Similarly, the output of inverter 212b couples to the input of inverter 212a, the drain of an N-channel transistor 218b, and a data line (d) 220b. The gates of transistors 218a and 218b couple to a word line (wl), the source of transistor 218a couples to a complementary bit line ($\overline{\text{bl}}$) 224a, and the source of transistor 218b couples to a bit line (bl) 224b.

A data bit may be stored to memory cell 210x as follows.
Initially, word line 222 is pulled to logic high to turn ON either transistor 218a or 218b. The logic value on the differential bit line (bl and $\overline{\text{bl}}$) is then stored to the memory cell and maintained by inverters 212a and 212b. For example, if the complementary bit line ($\overline{\text{bl}}$) is at logic low and the bit line (bl) is at logic high, then transistor 218a is turned ON and transistor 218b is turned OFF. The complementary data line ($\overline{\text{d}}$) is then pulled to logic low, which then causes the output of inverter 212b to transition to logic high. This then turns ON transistor 216a and causes the output of inverter 212a to transition to logic low. After the bit value has been written to memory cell 210x, the word line is brought to logic low and the value is maintained by inverters 212a and 212b via a positive feedback mechanism. The process to store a bit of the opposite logic value proceeds in a complementary manner.

Comparison circuit 230x comprises a pair of N-channel transistors 232a and 232b and an N-channel output transistor 240. Transistors 232a and 232b have gates that couple to data lines 220a and 220b, respectively, sources that couple to an address line (mbl) 132xa and a complementary address line ($\overline{\text{mbl}}$) 132xb, respectively, and drains that couple together and to the gate of transistor 240. The source of transistor 240 couples to circuit ground (e.g., $V_{SS}$) and the drain of transistor 240 couples to a match line 130x for the row to which CAM cell 120x belongs.

Comparison circuit 230x operates as follows. If the address bit is not the same as the stored bit in memory cell 210x, then the value on address line (mbl) 132xa is the same as the value on complementary data line ($\overline{\text{d}}$) 220a, and the value on complementary address line ($\overline{\text{mbl}}$) 132xb is the same as the value on data line (d) 220b. In this case, node C will be at logic high (i.e., a high voltage level), and transistor 240 will be turned ON to indicate a mismatch.
Alternatively, if the input address bit is the same as the stored bit in memory cell 210x, then node C will be pulled to logic low by either transistor 232a or 232b, and output transistor 240 will be turned OFF to indicate a match. The ON state for output transistor 240 thus indicates a mismatch and the OFF state indicates a match.

FIG. 2C is a logical representation for memory cell 210x. Inverters 212a and 212b are cross-coupled so that the output of one inverter drives the input of the other inverter. Inverters 212a and 212b are thus coupled in a positive feedback circuit configuration. Transistors 218a and 218b act as switches that can be selectively turned ON to store a data value, which is then maintained by inverters 212a and 212b.

FIG. 2D is a schematic diagram of a specific design of a CAM cell 122x, which may be used for each of the CAM cells 122 in FIG. 1B. CAM cell 122x includes a memory cell 210x coupled to a comparison circuit 231x. Memory cell 210x is used to store a single data bit value, and is described above with reference to FIG. 2B.

Comparison circuit 231x comprises a pair of N-channel transistors 232a and 232b and an N-channel output transistor 240 used to drive match line 130x. These transistors are described above with reference to FIG. 2B. Comparison circuit 231x further comprises a dummy N-channel output transistor 242 used to provide the proper loading for dummy line 131x. The gate of dummy transistor 242 is coupled to logic low, and the dummy transistor is turned OFF. Dummy transistor 242 has similar dimensions to output transistor 240. In an embodiment, dummy transistor 242 is located near output transistor 240 and is oriented in the same direction.

FIG. 2E is a schematic diagram of a specific design of a dummy CAM cell 124x, which may be used for each of the dummy CAM cells 124 in FIG. 1B. Dummy CAM cell 124x includes a memory cell 210x coupled to a comparison circuit 233x.
Memory cell 210x is used to store a single data bit value, and is described above with reference to FIG. 2B.

Comparison circuit 233x includes circuitry used to drive match line 130x and dummy line 131x. In particular, comparison circuit 233x comprises transistors 232a, 232b, and 240x coupled in the manner described above with reference to FIG. 2B and used to drive match line 130x. Comparison circuit 233x further comprises a pair of N-channel transistors 234a and 234b and an N-channel output transistor 242x used to drive dummy line 131x. Transistors 234a, 234b, and 242x are coupled in similar manner as transistors 232a, 232b, and 240x for the match line, except that the gates of transistors 234a and 234b couple to the data line (d) and the complementary data line ($\overline{\text{d}}$), respectively. Thus, if transistor 242x is turned ON, then transistor 240x will be turned OFF. Otherwise, transistor 242x is turned OFF and transistor 240x will be turned ON. When transistor 240x is turned ON, the CAM row is disabled and the match line is asserted to mismatch status.

In an embodiment, transistor 242x has a physical dimension that is different from that of the other output transistors for the CAM cells within the same row. If the ratio of the width over the length of transistor 240x is normalized to be equal to 1

$$\left(\text{i.e., } \frac{W_0}{\alpha L_0} = 1\right),$$

then the ratio of the width over the length of transistor 242x may be expressed as being equal to x, where

$$x = \frac{W}{\alpha L}.$$

In an embodiment, x=0.5, which may be obtained by doubling the length of transistor 242x relative to that of transistor 240x

$$\left(\text{i.e., } x = \frac{W_0}{2\alpha L_0}\right)$$

or by halving the width of transistor 242x relative to that of transistor 240x

$$\left(\text{i.e., } x = \frac{W_0/2}{\alpha L_0}\right).$$

The function performed by dummy CAM cell 124 is described in further detail below.

FIG.
3A is a schematic diagram of the driving circuits associated with a single match line 130x. As shown in FIG. 1A, each match line 130 traverses the entire row of CAM array 110a and couples to output transistor 240 of each CAM cell 120 in the row. In FIG. 3A, transistors 240a through 240n thus represent the N output transistors for N CAM cells 120xa through 120xn in the row with which match line 130x is associated. Each match line is further associated with a P-channel pre-charge transistor 310 and an output buffer 320.

The comparison of an entry for a row of CAM cells against the input address is performed as follows. Initially, the gate voltages of output transistors 240a through 240n are pre-set to logic low to turn OFF these transistors, and pre-charge transistor 310 is turned ON (by bringing the Pch control signal to logic low) to pre-charge match line 130x to a high level (e.g., $V_{DD}$). Pre-charge transistor 310 is then turned OFF, and the input address is written to address lines 132a through 132n (see FIG. 1A). The comparison circuit in each CAM cell in the row then operates to compare the stored bit in the CAM cell against the input address bit for that CAM cell. Depending on the stored value in each CAM cell and its input address bit, the output transistor for the CAM cell may be turned OFF for a match or turned ON for a mismatch, as described above.

If all N bits for the row are matched, then all N output transistors 240a through 240n are turned OFF, and match line 130x remains at the pre-charged level (e.g., of $V_{DD}$). Otherwise, if one or more bits are not matched, then each mismatched bit causes the corresponding output transistor to turn ON. If any of the N output transistors is turned ON, then those transistors would then discharge the match line (i.e., pull the match line to logic low or $V_{SS}$).
Thus, the match line remains at logic high if the input address matches the stored content of the CAM cells in the row, and transitions to logic low if the input address does not match the stored content. Output buffer 320 buffers the match line and drives the subsequent circuitry.

As noted above, the match line configuration shown in FIG. 3A has several disadvantages related to speed and power. First, speed may be limited by the wired-OR design of the match line. Each row may include a large number of cells. If only one bit in the entire row mismatches, then only one output transistor will be turned ON and this transistor would need to pull the entire match line toward $V_{SS}$. In this case, a long time may be required to discharge the match line, which would then limit the speed at which the CAM array may be operated. Second, excessive power may be consumed by discharging all match lines that mismatch (which is typically all but one match line) toward $V_{SS}$. These disadvantages are ameliorated by the match line configurations described below.

FIG. 3B is a block diagram of a differential sense circuit 410 that may be used to detect a signal (or voltage) on a match line. One sense circuit 410 may be coupled to each of the M match lines for the CAM array in FIG. 1A. Sense circuits 150 may thus include M sense circuits 410. Sense circuit 410 may be implemented with a current mirror type, a cross-coupled latch type, or some other design. A reference generator 411 provides a reference voltage for one input of sense circuit 410, and the match line couples to the other input of the sense circuit. Reference generator 411 may be implemented with dummy transistors (as described below), a voltage divider that can provide a constant voltage, or some other design.

FIG. 3C is a schematic diagram of an embodiment of a sense circuit 410a that may be used to detect a signal (or voltage) on a match line. In the embodiment shown in FIG.
3C, sense circuit 410a includes a pair of inverting amplifiers 412a and 412b cross-coupled so that the output of one amplifier drives the input of the other amplifier. Amplifiers 412a and 412b are thus coupled in a positive feedback circuit configuration. Transistor 418a couples to one input of amplifier 412a and to the match line at node M, and transistor 418b couples to one input of amplifier 412b and to an output from reference generator 411 at node D. Nodes M and D effectively provide a differential drive for the pair of cross-coupled amplifiers 412a and 412b. Inverting buffers 424a and 424b provide buffering for the detected data bit from inverters 412a and 412b, respectively, and further drive the Out A and Out B outputs. The operation of sense circuit 410a is described below.

FIG. 4A is a schematic diagram of a match line detection mechanism 400, which may be used in conjunction with the inventive CAM cells 122 and dummy CAM cells 124 in CAM unit 100b in FIG. 1B, in accordance with an embodiment of the invention. Similar to FIG. 3A, match line 130x couples to N output transistors 240a through 240n for N CAM cells 122xa through 122xn and to output transistor 240x for dummy CAM cell 124x in a specific row of the CAM array. Match line 130x further couples to a P-channel transistor 310a, which is used to pre-charge the match line (e.g., to $V_{DD}$) at the start of each detection cycle. Match line 130x further couples to a first input (node M) of a sense circuit 410x, which is used to sense the signal or voltage on the match line. Sense circuit 410x is a specific embodiment of sense circuit 410 in FIG. 3B.

Dummy line 131x couples to N dummy transistors 242a through 242n for N CAM cells 122xa through 122xn and to dummy transistor 242x for dummy CAM cell 124x in the same row of the CAM array as the associated match line 130x.
Dummy transistors 242x and 242a through 242n are used to generate a reference signal for sense circuit 410x, and may thus be viewed as one implementation of reference generator 411 in FIG. 3B. Dummy transistors 242a through 242n mimic the loading observed on match line 130x. Dummy line 131x also couples to a P-channel transistor 310b, which is used to pre-charge the dummy line at the start of each detection cycle. Dummy line 131x further couples to a second input (node D) of sense circuit 410x.

As shown in FIG. 4A, dummy transistors 242a through 242n for CAM cells 122xa through 122xn are each dimensioned with a normalized size of 1 (i.e., W/L=1, where W is the width and L is the channel length of the transistor). Output transistors 240a through 240n for the CAM cells and output transistor 240x for dummy CAM cell 124x are each also dimensioned with the normalized size of 1. However, dummy transistor 242x for dummy CAM cell 124x is dimensioned with a normalized size of less than 1 (i.e., x<1) and thus has reduced drive capability in comparison to each output transistor 240. In one specific embodiment, $x \approx 0.5$.

As also shown in FIG. 4A, all dummy transistors 242a through 242n in the CAM cells are turned OFF by grounding the gates of these N-channel dummy transistors. However, dummy transistor 242x for dummy CAM cell 124x may be turned ON and has a size that is only a fraction (e.g., half) of the size of the other output and dummy transistors. In the match situation, all of the transistors coupled to the match line (i.e., transistors 240a through 240n and 240x) will be turned OFF, and the match line will not be discharged. However, the dummy line will be discharged through dummy transistor 242x (which has a size that is a fraction x) and the dummy line voltage will be lower than the match line voltage.
Conversely, in the mismatch situation, even if only one bit is mismatched, the match line will be discharged through the one or more transistors 240 for the mismatched CAM cells (which have a size of 1) at a speed faster than the dummy line. In this case, the match line voltage will be lower than the dummy line voltage.

In the specific embodiment of sense circuit 410x shown in FIG. 4A, N-channel transistors 418a and 418b have gates that couple together and to an En1 control signal and sources that couple to ground (e.g., $V_{SS}$). In an embodiment, amplifiers 412a and 412b are designed as inverters with gains, and are thus referred to as simply inverters. Inverters 412a and 412b couple to transistors 418a and 418b, respectively, and further to inverters 424a and 424b, respectively. Each inverter 412 comprises a P-channel transistor 414 coupled to an N-channel transistor 416. The gates of transistors 414a and 416a couple together and form one input of inverter 412a (node F). The source of transistor 414a couples to the drain of transistor 416a and forms the output of inverter 412a, which couples to the gates of transistors 414b and 416b and to the input of inverting buffer 424b. Similarly, the gates of transistors 414b and 416b couple together and form one input of inverter 412b (node G). The source of transistor 414b couples to the drain of transistor 416b and forms the output of inverter 412b, which couples to the gates of transistors 414a and 416a and to the input of inverting buffer 424a. The sources of N-channel transistors 416a and 416b couple to the drains of transistors 418a and 418b, respectively. The drains of transistors 414a and 414b couple together.

A P-channel transistor 422 has a gate that couples to an En2 control signal, a source that couples to the drains of transistors 414a and 414b, and a drain that couples to the upper voltage supply (e.g., $V_{DD}$).
The inputs of inverting buffers 424a and 424b couple to the outputs of inverters 412b and 412a, respectively, and the outputs of buffers 424a and 424b drive the Out A and Out B outputs, respectively.

The voltage on node M represents the signal on the match line 130x to be detected. The voltage on node D represents the reference signal against which the voltage on node M is compared. Inverters 412a and 412b amplify the voltage difference between nodes M and D.

The reference signal at node D is generated by dummy transistors 242x and 242a through 242n. The reference signal may be determined, in part, by selecting the proper sizes for dummy transistor 242x and pre-charge transistor 310b, which is usually sized equal to transistor 310a.

FIG. 5A is a timing diagram for match line detection mechanism 400 in FIG. 4A. This timing diagram shows various control signals for sense circuit 410x to detect the signal (or voltage) on match line 130x, the voltages at nodes M and D, and the sense circuit outputs. The control signals are generated based on a clock signal, which is shown at the top of FIG. 5A for reference. The operation of the sense circuit is now described in reference to both FIGS. 4A and 5A.

Initially, prior to time $T_1$, the Pch and En2 control signals are at logic high, the En1 control signal is at logic low, and the voltages at nodes M and D are pre-set to $V_{SS}$. At time $T_1$, which may correspond to the rising (or leading) edge of the clock signal, the Pch control signal is brought to logic low, which then turns ON transistors 310a and 310b. At approximately the same time $T_1$, the address to be compared is written in through the address line (mbl) and its complementary address line ($\overline{\text{mbl}}$), and the comparison circuits for the CAM cells coupled to the match line are enabled. Each of the N output transistors 240 for these comparison circuits may thereafter be turned ON or OFF depending on its comparison result.
In a typical design, the comparison circuits could be enabled either before or after time $T_1$ when the pre-charge is finished.

Upon being turned ON at time $T_1$, transistor 310a starts pre-charging match line 130x toward $V_{DD}$, and transistor 310b similarly starts pre-charging dummy line 131x toward $V_{DD}$. If there is a match between the input address and the contents of the CAM cells in the row corresponding to the match line, then all N output transistors 240a through 240n will be turned OFF, and transistor 310a is able to pre-charge the match line to a higher voltage and faster, as shown by plot 512 in FIG. 5A. In comparison, since transistor 242x coupled to dummy line 131x is turned ON, transistor 310b is able to pre-charge the dummy line at a slower rate, as shown by plot 514 in FIG. 5A. Thus, if there is a match, then the voltage on match line 130x is higher than the voltage on dummy line 131x.

Conversely, if there is a mismatch between the input address and the CAM cell contents, then at least one output transistor 240 coupled to match line 130x will be turned ON, and the match line will be pre-charged more slowly, as shown by plot 522 in FIG. 5A. Although transistor 242x coupled to dummy line 131x is also turned ON, it is only a fraction of the size of the output transistors 240 coupled to the match line and discharges at a fraction of the rate of transistor 240. As a result, transistor 310b is able to pre-charge the dummy line at a faster rate than for the match line, as shown by plot 524 in FIG. 5A. Thus, if there is a mismatch, then the voltage on dummy line 131x is higher than the voltage on match line 130x.

At time $T_2$, the Pch control signal is brought to logic high, which then turns OFF transistors 310a and 310b. The pre-charge is stopped at this point.
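The ordering of the line voltages at the end of the pre-charge interval can be illustrated numerically. The following is only a rough sketch under loudly stated assumptions: it treats the pre-charge transistor and each ON pull-down transistor as simple linear conductances on a lumped capacitance, and every numeric value (supply, conductances, capacitance, interval length) is hypothetical and not taken from the patent. Only the relative sizes (0 for a matching row, x = 0.5 for dummy transistor 242x, 1 for a single mismatching output transistor) follow the description.

```python
def line_voltage_after_precharge(pull_down_size: float,
                                 t_end: float = 1.0,
                                 steps: int = 1000,
                                 vdd: float = 1.0,
                                 g_pre: float = 1.0,
                                 g_unit: float = 1.0,
                                 cap: float = 1.0) -> float:
    """Voltage on a line at the end of pre-charge (forward-Euler integration).

    pull_down_size is the total normalized size of the ON pull-down
    transistors on the line. All electrical values are hypothetical,
    normalized units chosen only for illustration.
    """
    v = 0.0  # lines start at V_SS, as in the description before T1
    dt = t_end / steps
    for _ in range(steps):
        i_charge = g_pre * (vdd - v)               # pre-charge transistor
        i_discharge = g_unit * pull_down_size * v  # ON pull-down transistors
        v += (i_charge - i_discharge) * dt / cap
    return v


def sensed_as_match(v_match_line: float, v_dummy_line: float) -> bool:
    """The row is read as a match when node M sits above node D."""
    return v_match_line > v_dummy_line


if __name__ == "__main__":
    v_dummy = line_voltage_after_precharge(0.5)     # dummy transistor 242x ON
    v_match_row = line_voltage_after_precharge(0.0)  # no output transistor ON
    v_miss_row = line_voltage_after_precharge(1.0)   # one output transistor ON
    print(sensed_as_match(v_match_row, v_dummy))
    print(sensed_as_match(v_miss_row, v_dummy))
```

Under these assumptions the dummy line ends up between the two match-line cases, which is the property the differential sense circuit relies on: the half-size dummy transistor places the reference below a matching row and above a row with even a single mismatched bit.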
If there is a match, then all N output transistors 240a through 240n are turned OFF, and the voltage on the match line is maintained at the same level, as shown by plot 512 in FIG. 5A. In contrast, the voltage on the dummy line is continuously discharged (i.e., pulled toward $V_{SS}$) by the one dummy transistor 242x that is turned ON, and the voltage at node D is pulled lower as shown by plot 514 in FIG. 5A.

Conversely, if there is a mismatch, then at least one output transistor 240 coupled to the match line will be turned ON, and the voltage on the match line is discharged by the output transistor(s) that are turned ON, as shown by plot 522 in FIG. 5A. Since the output transistor coupled to the match line is larger than the ON dummy transistor 242x coupled to the dummy line, the match line is pulled toward $V_{SS}$ at a faster rate. Moreover, since the voltage on the match line is lower than that on the dummy line for a mismatch, the voltage on the match line will continue to be even much lower than that on the dummy line as both the match and dummy lines are pulled toward $V_{SS}$ starting at time $T_2$.

At time $T_3$, the En1 control signal is brought to logic high and the En2 control signal is brought to logic low. The logic high on the En1 control signal turns ON transistors 418a and 418b, and the logic low on the En2 control signal turns ON transistor 422. These control signals enable sense circuit 410x by turning ON transistors 418a, 418b, and 422. With sense circuit 410x enabled, the voltages at nodes M and D are detected and the voltage difference is amplified by the pair of inverters 412a and 412b cross-coupled to provide positive feedback. Inverters 412a and 412b then drive their outputs to opposite rails, with the polarity being dependent on the sign of the detected voltage difference.
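The regenerative action of the cross-coupled inverters can be sketched with a toy discrete-time model. This is a hedged illustration, not the circuit's actual dynamics: the linear update rule, the `gain` and `coupling` values, the step count, and the way the M-D difference seeds nodes F and G are all assumptions made for this example. The only facts taken from the description are the direction of the imbalance (node M above node D tips node F low and node G high) and that inverting buffers 424a and 424b are driven from the outputs of inverters 412b and 412a, respectively.

```python
from typing import Tuple


def regenerate(v_m: float, v_d: float,
               vdd: float = 1.0,
               gain: float = 0.2,
               coupling: float = 0.1,
               steps: int = 200) -> Tuple[bool, bool]:
    """Toy model of a cross-coupled latch resolving a small input difference.

    Returns (out_a, out_b) as logic levels. All parameters are hypothetical.
    """
    mid = vdd / 2.0
    # Assumed seeding: nodes F and G start equalized at mid-rail, offset
    # slightly by the M-D difference (M > D nudges F down and G up).
    seed = coupling * (v_m - v_d)
    f = mid - seed / 2.0
    g = mid + seed / 2.0
    for _ in range(steps):
        # Each inverter pushes its output opposite to its input's deviation,
        # so the F-G difference grows every step (positive feedback).
        f_next = f - gain * (g - mid)
        g_next = g - gain * (f - mid)
        # Clamp to the supply rails.
        f = min(vdd, max(0.0, f_next))
        g = min(vdd, max(0.0, g_next))
    out_a = f < mid  # inverting buffer 424a on node F
    out_b = g < mid  # inverting buffer 424b on node G
    return out_a, out_b


if __name__ == "__main__":
    print(regenerate(0.63, 0.52))  # node M above node D
    print(regenerate(0.43, 0.52))  # node M below node D
```

In this model a difference of a few tens of millivolts (in normalized units) is enough to drive F and G to opposite rails, which is the sense in which the latch "amplifies" the detected difference; the sign of the difference alone decides which output ends up high.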
In particular, if there was a match, then the voltage on node M is higher than the voltage on node D, as shown by plots 512 and 514 in FIG. 5A. This then turns ON transistor 416b more (i.e., sinks more current), which then pulls node F lower. The lower voltage on node F turns ON transistor 414a more and turns OFF transistor 416a more, which then pulls node G higher. The higher voltage on node G turns OFF transistor 414b more and turns ON transistor 416b more. In this way, the voltage at node F is pulled low toward $V_{SS}$, and the voltage at node G is pulled high toward $V_{DD}$ (i.e., the voltages at these two nodes are pulled apart and toward their respective rail voltages).

Conversely, if there was a mismatch, then the voltage on node D is higher than the voltage on node M, as shown by plots 522 and 524 in FIG. 5A. This then turns ON transistor 416a more, which then pulls node G lower. Transistor 414b is then turned ON more, which then pulls node F higher. The voltage at node F is thus pulled toward $V_{DD}$, and the voltage at node G is pulled toward $V_{SS}$. In a typical implementation, before sensing of the voltages of nodes D and M starts, nodes F and G are equalized as shown in FIG. 5A.

Thus, shortly after sense circuit 410x is enabled by the En1 and En2 control signals, inverters 412a and 412b sense the voltage on node M relative to the voltage on node D, and the sensed difference is provided via buffers 424a and 424b to the Out A and Out B outputs. At time $T_4$, Out A is at logic high if there was a match and at logic low if there was a mismatch, and Out B is at logic low if there was a match and at logic high if there was a mismatch, as shown by the plots for these outputs in FIG. 5A.

After time $T_3$, transistors 418a and 418b are turned ON and respectively pull the voltages at nodes M and D slowly toward $V_{SS}$ because of the big capacitance from a large number of transistors coupled to these nodes.
If there was a match, then transistors 414a and 416b are both turned ON, and transistors 414b and 416a are both turned OFF. Transistor 414a pulls node G high toward $V_{DD}$. Since transistor 416a is turned OFF, no current conducts through inverter 412a after node G has been pulled high. Conversely, transistor 416b pulls node F low toward $V_{SS}$. Since transistor 414b is turned OFF, no current conducts through inverter 412b after node F has been pulled low. Thus, once node F has been pulled low and node G has been pulled high, transistors 418a and 418b are able to discharge nodes M and D, respectively, and pull these nodes to $V_{SS}$, as shown in FIG. 5A. Nodes M and D are now ready for the next sense operation in the next clock cycle. The complementary actions occur if there was a mismatch, and the voltages at nodes M and D are likewise pulled to $V_{SS}$.

Match line detection mechanism 400 has several advantages over the conventional detection mechanism. Detection mechanism 400 may be operated at higher speed and lower power than conventional designs. First, as shown in FIG. 5A, the voltage on the match line is compared against the voltage on the dummy line. The voltages on both the match line and dummy line may be charged to only a fraction of $V_{DD}$ (instead of $V_{DD}$) for reliable detection of the signal on the match line. This may be achieved by (1) properly designing sense circuit 410x, (2) selecting the proper sizes for transistors 240, 242, and 242x and for pre-charge transistors 310a and 310b, and (3) providing the proper control signals that determine the times $T_2$, $T_3$, and $T_4$. Second, sense circuit 410x is able to detect and amplify a small voltage difference between nodes M and D. Third, power consumption is reduced by limiting the signal swing to a fraction of $V_{DD}$ instead of the full $V_{DD}$, as shown in FIG. 5A.
Power consumption is proportional to the square of the voltage swing, and a smaller signal swing results in lower power consumption.

FIG. 4B is a schematic diagram of a match line detection mechanism 401, which may also be used in conjunction with the inventive CAM cells 122 and dummy CAM cells 124 in CAM unit 100b in FIG. 1B, in accordance with an embodiment of the invention. Similar to FIG. 4A, match line 130x couples to N output transistors 240a through 240n for N CAM cells 122xa through 122xn, output transistor 240x for dummy CAM cell 124x, and pre-charge transistor 310a. Match line 130x further couples to a first P-channel pass transistor 426b, which couples the match line to sense circuit 410y. Sense circuit 410y is a specific embodiment of sense circuit 410 in FIG. 3B.

Dummy line 131x couples to N dummy transistors 242a through 242n for N CAM cells 122xa through 122xn, dummy transistor 242x for dummy CAM cell 124x, and pre-charge transistor 310b. Dummy line 131x further couples to a second P-channel pass transistor 426a, which couples the dummy line to sense circuit 410y.

In the specific embodiment of sense circuit 410y shown in FIG. 4B, an N-channel transistor 418c has a gate that couples to a Saen control signal, a source that couples to ground, and a drain that couples to the sources of transistors 416a and 416b. Transistors 416a and 416b and 418a and 418b are coupled as shown in FIG. 4A. However, the drains of transistors 418a and 418b couple directly to the upper voltage supply (e.g., $V_{DD}$).

Pass transistors 426a and 426b are used to respectively isolate the capacitance on the dummy and match lines from nodes D and M within sense circuit 410y. The capacitance on each of these lines is relatively high because a number of output or dummy transistors are coupled to the line.
The isolation provided by pass transistors 426a and 426b allows sense circuit 410y to operate at a higher speed for sensing operation, since the internal nodes may be charged and discharged at a faster rate with reduced capacitance loading on the internal nodes.

FIG. 5B is a timing diagram for match line detection mechanism 401 in FIG. 4B. This timing diagram shows various control signals for sense circuit 410y to detect the signal on match line 130x, the voltages at nodes M and D and nodes F and G, and the sense circuit outputs. The control signals are generated based on a clock signal, which is shown at the top of FIG. 5B for reference.

Initially, prior to time $T_1$, the Pch control signal is at logic low, and the voltages at nodes M and D are pre-charged to $V_{DD}$. Nodes G and F are also pre-charged to $V_{DD}$ via pass transistors 426a and 426b, which are turned ON at this time. Near time $T_1$, the Pch control signal is brought to logic high, which then turns OFF transistors 310a and 310b. At approximately the same time $T_1$, the address to be compared is written to the address line, and the comparison circuits for the CAM cells are enabled. Each of the N output transistors 240 for these comparison circuits may thereafter be turned ON or OFF depending on its comparison result.

If there is a match between the input address and the contents of the CAM cells, then all N output transistors 240a through 240n will be turned OFF, and the match line remains at its pre-charged level, as shown by plot 532 in FIG. 5B. In comparison, since transistor 242x coupled to dummy line 131x is turned ON, this transistor pulls the dummy line to a lower voltage, as shown by plot 534 in FIG. 5B. Thus, if there is a match, then the voltage on match line 130x is higher than the voltage on dummy line 131x.
The Iso control signal is at logic low during this time, pass transistors 426a and 426b are turned ON, and the dummy and match lines are respectively coupled to nodes G and F of sense circuit 410y.

At time $T_2$, the Saen control signal is brought to logic high, which then turns ON transistor 418c and enables sense circuit 410y. The Iso control signal is also brought to logic high, which then turns OFF pass transistors 426a and 426b. The differential voltage between nodes G and F is then amplified by sense circuit 410y, and Outputs A and B are provided as shown in FIG. 5B. At time $T_3$, the Pch control signal is brought to logic low, the pre-charge transistors 310a and 310b are turned ON, and the dummy and match lines are pulled toward $V_{DD}$. At time $T_4$, the Saen and Iso control signals are brought to logic low, the dummy and match lines are coupled to nodes G and F, and these nodes are pulled toward $V_{DD}$ by pre-charge transistors 310a and 310b to get ready for the next sensing cycle. The signal swing for the mismatch situation is also shown in FIG. 5B.

FIG. 6 is a schematic diagram of a match line detection mechanism 600, which may be used in conjunction with CAM cells 122 and 124 in CAM unit 100b in FIG. 1B, in accordance with another embodiment of the invention. Similar to FIG. 4A, match line 130x couples to N output transistors 240a through 240n for the N CAM cells in a specific row of the CAM array and further couples to P-channel transistor 310a. However, the sources of output transistors 240a through 240n are coupled to node M of sense circuit 410x via a first common line 610a, which may be implemented with a metal track in the circuit layout.

A row of N dummy transistors 242a through 242n and 242x couples to dummy line 131x, which further couples to P-channel transistor 310b. The sources of dummy transistors 242a through 242n and 242x are coupled to node D of sense circuit 410x via a second common line 610b.

FIG.
7 is a timing diagram for match line detection mechanism 600 in FIG. 6. Similar to FIG. 5, FIG. 7 shows the control signals, the voltages at nodes M and D, and the sense amplifier outputs for the match line detection. The operation of detection mechanism 600 is now described in reference to both FIGS. 6 and 7.

The operation of sense circuit 410x in FIG. 6 is similar to that described above for detection mechanism 400 in FIG. 4A. Initially, prior to time $T_1$, the Pch and En2 control signals are at logic high, the En1 control signal is at logic low, and the voltages at nodes M and D are pre-set to $V_{SS}$. At time $T_1$, the Pch control signal is brought to logic low, which then turns ON transistors 310a and 310b. Near time $T_1$, each of the N output transistors 240 for the CAM cells coupled to the match line is turned ON or OFF based on its comparison result.

If there is a match, then all N output transistors 240 are turned OFF, and the voltage on common line 610a is maintained at $V_{SS}$, as shown by plot 712 in FIG. 7, even though match line 130x is pulled toward $V_{DD}$. In contrast, the voltage on common line 610b is pulled toward $V_{DD}$ by the one dummy transistor 242x that is turned ON, as shown by plot 714 in FIG. 7. Thus, the voltage on common line 610b for the dummy transistors is higher than the voltage on common line 610a for the output transistors for a match.

Conversely, if there is a mismatch, then at least one output transistor 240 is turned ON, and common line 610a is pulled toward $V_{DD}$ by the ON transistor(s), as shown by plot 722 in FIG. 7. Since the output transistors 240 coupled to the match line are larger than the ON dummy transistor 242x coupled to the dummy line, the match line is pulled toward $V_{DD}$ at a faster rate. Thus, the voltage on common line 610a for the output transistors is higher than the voltage on common line 610b for the dummy transistors for a mismatch.
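Because the common lines in detection mechanism 600 start at $V_{SS}$ and are charged through the ON transistors, the polarity of the comparison is the reverse of that in detection mechanism 400. The following sketch illustrates only that polarity. The linear charging model and the names `common_line_voltages`, `row_matched`, `OUTPUT_RATE`, and `DUMMY_RATE` are assumptions made for illustration and do not come from the figures.

```python
# Illustrative assumptions only: linear charging and arbitrary rate values.
V_DD = 1.0         # assumed supply voltage in volts
OUTPUT_RATE = 1.0  # assumed charging rate through one ON output transistor
DUMMY_RATE = 0.5   # assumed: ON dummy transistor 242x is about half the size


def common_line_voltages(mismatched_bits: int, t_precharge: float = 0.2):
    """Return (v_m, v_d) on common lines 610a and 610b when pre-charge ends."""
    # Common line 610a rises only if at least one output transistor is ON.
    v_m = min(V_DD, mismatched_bits * OUTPUT_RATE * t_precharge)
    # Common line 610b always rises, through the single ON dummy transistor.
    v_d = min(V_DD, DUMMY_RATE * t_precharge)
    return v_m, v_d


def row_matched(v_m: float, v_d: float) -> bool:
    """In this scheme a match is indicated by node D sitting above node M."""
    return v_d > v_m
```

With these assumed rates, a fully matching row leaves node M at $V_{SS}$ while node D rises slightly, and any mismatch drives node M above node D.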
At time $T_2$, the Pch control signal is brought to logic high, transistors 310a and 310b are both turned OFF, and the voltages on the match line, dummy line, and common lines 610a and 610b are maintained for both the match and mismatch cases. If there was a match, then the voltage on node D is higher than the voltage on node M when transistors 310a and 310b are turned OFF, as shown by plots 712 and 714 in FIG. 7. Conversely, if there was a mismatch, then the voltage on node M is higher than the voltage on node D when transistors 310a and 310b are turned OFF, as shown by plots 722 and 724 in FIG. 7.

At time $T_3$, the En1 control signal is brought to logic high, the En2 control signal is brought to logic low, and transistors 418a, 418b, and 422 are turned ON. Inverters 412a and 412b within sense circuit 410x are then enabled. Inverters 412a and 412b then detect the voltage difference between nodes M and D and further amplify the detected voltage difference. If there was a match, then the voltage on node D will be higher than the voltage on node M (as shown by plots 712 and 714 in FIG. 7), the outputs of inverters 412b (node F) and 412a (node G) will be driven to logic high and logic low, respectively, and the Out A and Out B outputs will be driven to logic low and logic high, respectively. Conversely, if there was a mismatch, then the voltage on node M will be higher than the voltage on node D (as shown by plots 722 and 724 in FIG. 7), the outputs of inverters 412b (node F) and 412a (node G) will be driven to logic low and logic high, respectively, and the Out A and Out B outputs will be driven to logic high and logic low, respectively.

Starting at time $T_3$, transistors 418a and 418b respectively pull common lines 610a and 610b toward $V_{SS}$. Transistors 418a and 418b should be turned ON long enough to pull the voltage on these common lines to near $V_{SS}$, to prepare for the next sensing cycle.

Match line detection mechanism 600 is a different approach in comparison to match line detection mechanism 400 in FIG.
4A. Detection mechanisms 400 and 600 may be operated at a higher clock speed since it is not necessary to completely pre-charge the match line to $V_{DD}$ and also not necessary to pull the match line to $V_{DD}$ or $V_{SS}$ after the pre-charge period (after the Pch signal has transitioned to logic high). This is because the differential sensing mechanism 410x can detect a small voltage difference between nodes D and M. Match line detection mechanisms 400 and 600 also achieve low power operation since the match line and dummy line operate with a small voltage swing rather than a full swing from $V_{SS}$ to $V_{DD}$.

The sense circuits described herein may be used to detect the signal on a match line coupled to a row of "ternary" CAM cells. A ternary CAM cell is one that includes two memory cells or storage elements, with one cell being used to store a data bit and the other cell being used to store a control bit to indicate whether or not a comparison is to be performed for that CAM cell. The additional (or secondary) cell may thus be used to selectively enable or disable the ternary CAM cell from being used in the comparison. If the ternary CAM cell is disabled, then its output does not affect the logic level on the match line to which it is coupled.

FIG. 8A is a schematic diagram of an embodiment of a conventional ternary CAM cell 120y, which may be used for each of the CAM cells 120 in FIG. 1A. CAM cell 120y includes a memory cell 210y, a secondary cell 250y, and a comparison circuit 230y. Memory cell 210y operates in similar manner as that described above for memory cell 210x in FIG. 2B and is used to store a single data bit. Secondary cell 250y is similar in design to memory cell 210y and is used to store a single control bit. Secondary cell 250y may be programmed in similar manner as for memory cell 210y, and may further utilize the same bit lines (bl and $\overline{bl}$).
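The role of the control bit in a ternary CAM cell can be illustrated at the logic level. The sketch below is a functional model only and says nothing about the transistor implementation. The function names `cell_pulls_down` and `row_matches` are hypothetical, and a control bit of 1 is taken to mean that the cell is enabled for comparison.

```python
from typing import Sequence


def cell_pulls_down(stored_bit: int, control_bit: int, address_bit: int) -> bool:
    """A cell acts on the match line only if it is enabled and its bit mismatches."""
    mismatch = stored_bit != address_bit
    enabled = control_bit == 1  # assumption: logic high on the control bit enables the cell
    return enabled and mismatch


def row_matches(stored: Sequence[int], control: Sequence[int],
                address: Sequence[int]) -> bool:
    """A row matches when no enabled cell in the row reports a mismatch."""
    return not any(
        cell_pulls_down(s, c, a) for s, c, a in zip(stored, control, address)
    )
```

For example, a row storing 1, 0, 1 with its third cell disabled matches the address 1, 0, 0, whereas the same row with all three cells enabled does not.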
Comparison circuit 230y comprises a pair of N-channel transistors 232a and 232b and a pair of N-channel output transistors 240 and 241. Transistors 232a and 232b are coupled to memory cell 210y in similar manner as shown in FIG. 2B for CAM cell 120x. Output transistors 240 and 241 are coupled in series and to cells 210y and 250y. In particular, output transistor 241 has its drain coupled to a match line 130y for the row to which CAM cell 120y belongs, its source coupled to the drain of transistor 240, and its gate (labeled as node "K") coupled to the mask line from secondary cell 250y. Output transistor 240 has its source coupled to circuit ground (e.g., $V_{SS}$) and its gate (labeled as node "C") coupled to the drains of transistors 232a and 232b. Output transistors 240 and 241 effectively implement a NAND gate.

Comparison circuit 230y operates as follows. If the address bit is not the same as the stored data bit in memory cell 210y, then node C will be at logic high to indicate a mismatch. If the control bit on the mask line is at logic high, indicating that the ternary CAM cell is enabled, then node K will also be at logic high. If nodes C and K are both at logic high, then output transistors 240 and 241 are both turned ON, and match line 130y is pulled to logic low (e.g., toward $V_{SS}$). Otherwise, if node C is at logic low because of a match or node K is at logic low because the ternary CAM cell is disabled, then one or both of the output transistors will be turned OFF and these transistors will not actively operate on match line 130y. Thus, comparison circuit 230y of ternary CAM cell 120y only pulls the match line to logic low if the CAM cell is enabled for comparison and there was a mismatch between its data bit and the address bit.

FIG.
8B is a schematic diagram of an embodiment of a ternary CAM cell 122y, which may be used for each of the CAM cells 122 in FIG. 1B. CAM cell 122y includes a memory cell 210y, a secondary cell 250y, and a comparison circuit 231y. Memory cell 210y and secondary cell 250y operate in similar manner as that described above for ternary CAM cell 120y in FIG. 8A, and are used to store a single data bit and a single control bit, respectively.

Comparison circuit 231y comprises the pair of N-channel transistors 232a and 232b and the pair of N-channel output transistors 240 and 241, which are coupled in similar manner as described above in FIG. 8A. Comparison circuit 231y further comprises a pair of N-channel dummy transistors 242 and 243, which are coupled in series and to dummy line 131y. In particular, dummy transistor 243 has its drain coupled to dummy line 131y for the row to which CAM cell 122y belongs, its source coupled to the drain of transistor 242, and its gate (labeled as node "Ki") coupled to the inverted mask output of secondary cell 250y. Dummy transistor 242 has its source coupled to circuit ground (e.g., $V_{SS}$) and its gate (labeled as node "K") coupled to the mask output of secondary cell 250y.

Dummy transistors 242 and 243 provide the proper loading for dummy line 131y. Dummy transistors 242 and 243 have similar physical dimension as output transistors 240 and 241. In an embodiment, dummy transistors 242 and 243 are located near output transistors 240 and 241 and are oriented in the same direction. The output of the pair of dummy transistors 242 and 243 is always OFF since the gate inputs are complementary.

FIG. 8C is a schematic diagram of an embodiment of a dummy ternary CAM cell 124y, which may be used for each of the dummy CAM cells 124 in FIG. 1B. Dummy CAM cell 124y includes a memory cell 210y, a secondary cell 250y, and a comparison circuit 233y.
Memory cell 210y and secondary cell 250y operate in similar manner as that described above for ternary CAM cell 120y in FIG. 8A, and are used to store a single data bit and a single control bit, respectively. Comparison circuit 233y includes circuitry used to drive match line 130y and dummy line 131y. In particular, comparison circuit 233y comprises transistors 232a, 232b, and output transistors 240x and 241x coupled in the manner described above with reference to FIG. 8A and used to drive match line 130y. Comparison circuit 233y further comprises a second pair of N-channel transistors 234a and 234b and a second pair of output transistors 242x and 243x used to drive dummy line 131y. Transistors 234a and 234b and output transistors 242x and 243x are coupled in similar manner as transistors 232a and 232b and output transistors 240x and 241x for the match line, except that the gates of transistors 234a and 234b couple to the data line (d) and the complementary data line ($\overline{d}$), respectively.

The output of the pair of transistors 240x and 241x and the output of the pair of transistors 242x and 243x are complementary. When the output of transistor pair 240x and 241x is OFF, the output of transistor pair 242x and 243x is ON and pulls down the dummy line at a fraction of the speed at which the match line is pulled down when there is at least one bit mismatch. Conversely, when the output of transistor pair 242x and 243x is OFF, the dummy line will not be pulled down, but the output of transistor pair 240x and 241x will be ON and the match line will be pulled down. This would then indicate a mismatch, and this row is disabled.

FIG. 9A is a schematic diagram of a match line detection mechanism 900, which may be used in conjunction with ternary CAM cells 122y and 124y in CAM unit 100b in FIG. 1B, in accordance with yet another embodiment of the invention. Similar to FIG.
4A, a match line 130y couples to N pairs of output transistors 240a and 241a through 240n and 241n for the N ternary CAM cells 122ya through 122yn and also to transistors 240x and 241x for dummy CAM cell 124y in a specific row of the CAM array. The gates of output transistors 240a through 240n couple to the comparison circuit outputs (labeled as C1 through CN) for the N ternary CAM cells, and the gates of output transistors 241a through 241n couple to the mask outputs (labeled as K1 through KN) of the secondary cells for the N ternary CAM cells. The gates of output transistors 240x and 241x respectively couple to the comparison circuit output (labeled as Cd) and the secondary cell inverted mask output (labeled as Kd) for dummy ternary CAM cell 124y. Match line 130y further couples to P-channel transistor 310a and a first input of a sense circuit 410y, which is used to sense the signal on the match line.

Dummy line 131y couples to N pairs of dummy transistors 242a and 243a through 242n and 243n for the N ternary CAM cells 122ya through 122yn and also to transistors 242x and 243x for dummy CAM cell 124y within the same row as the associated match line 130y. The gates of dummy transistors 242a through 242n couple to the inverted mask outputs of the secondary cells, and the gates of dummy transistors 243a through 243n couple to the mask outputs of the secondary cells. With this connection, the N pairs of dummy transistors 242a and 243a through 242n and 243n are always turned OFF. The gates of dummy transistors 242x and 243x are respectively coupled to the comparison circuit complementary output (labeled as Cd) and the mask output (labeled as Kd) for dummy ternary CAM cell 124y. This dummy transistor pair is turned ON. Again, transistors 242x and 243x are dimensioned to be a fraction (e.g., half) of the size of the other output transistors. Dummy line 131y further couples to P-channel transistor 310b and the second input (node D) of a sense circuit 410y.
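The reason the N regular dummy transistor pairs only load the dummy line, while the match line sees one conducting path per enabled mismatch, can be checked with a short logic-level sketch. The helper names `series_pair_conducts`, `dummy_line_paths`, and `match_line_paths` are hypothetical, and the dummy cell's own pair is simply treated as ON, as the description above states.

```python
from typing import Sequence


def series_pair_conducts(gate_upper: int, gate_lower: int) -> bool:
    """Two series N-channel transistors conduct only when both gates are high."""
    return gate_upper == 1 and gate_lower == 1


def dummy_line_paths(mask_bits: Sequence[int], dummy_cell_on: bool = True) -> int:
    """Count conducting pull-down paths on the dummy line of one row."""
    # Regular cells: the two gates receive the mask and its inverse, so the
    # pair can never conduct, whatever the mask value is.
    regular = sum(series_pair_conducts(1 - k, k) for k in mask_bits)
    # Dummy cell: its pair is assumed ON for an enabled row.
    return regular + (1 if dummy_cell_on else 0)


def match_line_paths(compare_outputs: Sequence[int], mask_bits: Sequence[int]) -> int:
    """Count conducting pull-down paths on the match line (C and K both high)."""
    return sum(
        series_pair_conducts(k, c) for c, k in zip(compare_outputs, mask_bits)
    )
```

In this model the dummy line always has exactly one conducting path for an enabled row, regardless of the mask pattern, which is what lets it serve as a fixed reference for the match line.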
In the specific embodiment shown in FIG. 9A, sense circuit 410x includes inverters 412a and 412b, N-channel transistors 418a and 418b, P-channel transistor 422, and inverting buffers 424a and 424b, which are coupled together as described above for sense circuit 410x in FIG. 4A. Sense circuit 410x may be used to detect the signal on match line 130y in similar manner as that described above for detection mechanism 400 in FIG. 4A and shown by the timing diagram in FIG. 5.

FIG. 9B is a schematic diagram of a match line detection mechanism 901, which may also be used in conjunction with ternary CAM cells 122y and 124y in CAM unit 100b. Match line detection mechanism 901 is similar to match line detection mechanism 900 in FIG. 9A. However, match line 130y further couples to P-channel pass transistor 426b and dummy line 131y further couples to P-channel pass transistor 426a. Pass transistors 426a and 426b respectively couple the dummy and match lines to sense circuit 410y, similar to the embodiment shown in FIG. 4B. The operation of match line detection mechanism 901 is as described above for FIGS. 4B and 9A.

FIG. 10 is a schematic diagram of a match line detection mechanism 1000, which may be used in conjunction with ternary CAM cells 122y and 124y in CAM unit 100b in FIG. 1B, in accordance with yet another embodiment of the invention. Similar to FIGS. 6 and 9, match line 130y couples to N pairs of output transistors 240a and 241a through 240n and 241n for the N ternary CAM cells 122 and also to output transistors 240x and 241x for the dummy ternary CAM cell 124 in a specific row of the CAM array. However, the sources of output transistors 241a through 241n are coupled to node M of sense circuit 410y via first common line 610x. Similarly, the sources of dummy transistors 242a through 242n are coupled to node D of sense circuit 410y via second common line 610y.

FIG. 10 also shows an embodiment of a sense circuit 410y.
Sense circuit 410y includes inverters 412a and 412b, N-channel transistors 418a and 418b, P-channel transistor 422, and inverting buffers 424a and 424b, which are coupled together as described above for sense circuit 410x in FIG. 4A. Sense circuit 410y further includes an N-channel transistor 420, a P-channel transistor 430, and an inverter 432. P-channel transistor 430 is coupled in parallel with N-channel transistor 420. The sources of transistors 420 and 430 couple to node F, the drains of transistors 420 and 430 couple to node G, the gate of transistor 420 couples to the input of inverter 432, and the gate of transistor 430 couples to the output of inverter 432. The input of inverter 432 couples to an En3 control signal. Transistors 420 and 430 form a switch that shorts out nodes F and G when enabled by the En3 control signal. The transistors 420 and 430 are used to equalize nodes G and F in each cycle before a match comparison. In a typical implementation of all the above embodiments, these two transistors will be provided to equalize nodes F and G before each match comparison.

Sense circuit 410y may be used to detect the signal on common line 610x in similar manner as that described above for detection mechanism 600 in FIG. 6 and shown by the timing diagram in FIG. 7. Sense circuit 410y may also be used for match line detection mechanisms 400, 600, and 900.

For clarity, specific designs of the sense circuit have been described herein. Various modifications to these circuit designs may also be made, and this is within the scope of the invention. For example, for sense circuit 410x, inverters 412a and 412b may be coupled to match line 130x or common line 610x via some other configuration, and so on.

The specific timing diagrams shown in FIGS. 5 and 7 are also provided to illustrate the operation of the sense circuit and the match line detection. Variations to the timing shown in FIGS.
5 and 7 may also be made, and this is within the scope of the invention. For example, the En1 control signal may be brought to logic high at time $T_2$ when the Pch control signal is brought to logic high.

The sense circuits and match line detection mechanisms described herein may be used to provide a CAM array having faster speed of operation and lower power consumption. These circuits may also be used for other types of memory (e.g., dynamic random access memory or DRAM), and other integrated circuits (e.g., microprocessors, controllers, and so on).

The circuits described herein may also be implemented in various semiconductor technologies, such as CMOS, bipolar, bi-CMOS, GaAs, and so on.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

What is claimed is:

1.
A ternary content addressable memory (TCAM) comprising:
- an array of TCAM cells arranged in a plurality of rows and a plurality of columns;
- a plurality of match lines, one match line for each row of TCAM cells and operatively coupled to a plurality of output transistors for the TCAM cells in each row;
- a plurality of dummy lines, one dummy line for each row of TCAM cells and operatively coupled to a plurality of dummy transistors for the TCAM cells in each row;
- a plurality of match data bit lines and their complements, one pair of match data bit line and its complement for each column of TCAM cells to provide a match data and its complement to compare with the content stored in each TCAM cell of that column;
- a column of dummy TCAM (DTCAM) cells, each connected to the match line and the dummy line in each row;
- a pair of dummy match data bit line and its complement for the column of DTCAM cells to provide a dummy match data and its complement to compare with the content stored in each DTCAM cell;
- a sense amplifier connected to the match line and the dummy line in each row; and
- current sources connected to each of the match line and the dummy line in each row.

2.
The TCAM of claim 1, wherein each TCAM cell comprises:
- a memory cell operable to store a data bit value;
- a secondary cell operable to store a control bit value; and
- a comparison circuit coupled to the memory cell and the secondary cell and configured to detect the data bit value stored in the memory cell and the control bit value stored in the secondary cell, the comparison circuit including:
  - a pair of output transistors coupled to the corresponding match line and configured to provide a drive for the match line based on the detected data bit value and the detected control bit value; and
  - a pair of dummy transistors coupled to the corresponding dummy line to provide a drive for the dummy line based on the detected control bit value, wherein the match line and the dummy line are used to detect an output value provided by the TCAM cell.

3. The TCAM of claim 1, wherein each DTCAM cell comprises:
- a memory cell operable to store a data bit value;
- a secondary cell operable to store a control bit value; and
- a comparison circuit coupled to the memory cell and the secondary cell and configured to detect the data bit value stored in the memory cell and the control bit value stored in the secondary cell, the comparison circuit including:
  - a pair of output transistors coupled to the corresponding match line and configured to provide a drive for the match line based on the detected data bit value and the detected control bit value; and
  - a pair of dummy transistors coupled to the corresponding dummy line and configured to provide a drive for the dummy line based on the detected inverted data bit value and the detected control bit value.

4. The TCAM of claim 3, wherein the dummy transistors have smaller dimension and less driving ability than the output transistors, are located in close proximity to the output transistors, and are turned ON during sensing operation to enable the comparison of the corresponding row.

5. The TCAM of claim 3, wherein the dummy transistors are turned OFF and the output transistors are turned ON during sensing operation to disable the comparison of the corresponding row.

6. The TCAM of claim 1, wherein the sense amplifier connected to the match line and the dummy line in each row comprises:
- two inverters connected to each other in a way of positive feedback; and
- a P type transistor serially connected to both inverters and Vdd.

7. The TCAM of claim 1, wherein the current sources connected to each of the match line and the dummy line in each row are P type transistors to provide currents from Vdd to the match line and the dummy line.

8. The TCAM of claim 1, wherein the match line and the dummy line in each row are connected to ground through respective N type transistors.

9. A method of detecting a match or a mismatch state of a comparison result in each row of a ternary content addressable memory (TCAM) having an array of TCAM cells arranged in rows and columns, a plurality of match lines and dummy lines, one match line and one dummy line coupled to the TCAM cells in each row, a plurality of match data bit lines and their complements, one pair of match data bit line and its complement coupled to the TCAM cells in each column, a column of dummy TCAM (DTCAM) cells, each coupled to the match line and the dummy line in each row, a pair of dummy match data bit line and its complement coupled to the DTCAM cells, a sense amplifier coupled to the match line and the dummy line in each row, current sources connected between Vdd and each of the match line and the dummy line in each row, and switches connected between ground and each of the match line and the dummy line, the method comprising:
- disabling the current sources such that there is no current flowing from Vdd to the match line and the dummy line;
- disabling the sense amplifier;
- enabling the switches to establish conducting paths from the match line and the dummy line to ground to make the potential of the match line and the dummy line equal to the ground voltage potential and discharge the match line and the dummy line to ground;
- disabling the switch to shut off conducting paths from the match line and the dummy line to ground after the voltage potentials of both the match line and the dummy line are equal to the ground voltage potential;
- sending a plurality of match data and their complements to the TCAM cells through the corresponding match bit lines and their complements to compare with the content stored in the corresponding TCAM cells;
- sending a dummy match data and its complement to the DTCAM cell through the dummy match data bit line and its complement to compare with the content stored in the DTCAM cell;
- enabling the current sources to establish conducting paths from Vdd to the match line and the dummy line and pull the potential of the match line and the dummy line to a level less than half Vdd;
- disabling the current sources to shut off the conducting paths from Vdd to the match line and the dummy line;
- enabling the sense amplifier to sense the voltage difference between the match line and the dummy line and determine the match or the mismatch state, finishing one comparison cycle.
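The ordering of the steps recited in the method of claim 9 can be traced with a small behavioral sketch. It is illustrative only and does not construe the claim. The function name `comparison_cycle`, the linear charging model, and all numeric rates are assumptions, as is the convention that a match leaves the match line above the dummy line once the current sources are disabled.

```python
# Illustrative assumptions only: linear charging and arbitrary rate values.
VDD = 1.0          # assumed supply voltage in volts
SOURCE_RATE = 1.0  # assumed charging rate of each current source (V per unit time)
OUTPUT_PULL = 0.8  # assumed pull-down rate of one conducting output transistor pair
DUMMY_PULL = 0.4   # assumed: the dummy pair has about half the drive


def comparison_cycle(mismatched_bits: int, t_charge: float = 0.4):
    """Walk through one comparison cycle; return (ordered steps, matched)."""
    steps = []
    steps.append("disable current sources")
    steps.append("disable sense amplifier")
    steps.append("enable switches")        # both lines are discharged to ground
    v_match, v_dummy = 0.0, 0.0
    steps.append("disable switches")
    steps.append("send match data")        # TCAM cells compare their contents
    steps.append("send dummy match data")  # DTCAM cell sets the reference drive
    steps.append("enable current sources")
    # Net rise of each line: source current minus the pull-down of ON pairs.
    v_match = max(0.0, (SOURCE_RATE - mismatched_bits * OUTPUT_PULL) * t_charge)
    v_dummy = max(0.0, (SOURCE_RATE - DUMMY_PULL) * t_charge)
    if max(v_match, v_dummy) >= VDD / 2:
        raise ValueError("lines must stay below half Vdd; shorten t_charge")
    steps.append("disable current sources")
    steps.append("enable sense amplifier")  # resolves the sign of the difference
    return steps, v_match > v_dummy
```

With the assumed values, `comparison_cycle(0)` reports a match and `comparison_cycle(1)` reports a mismatch, and in both cases the lines stay below half Vdd as the method requires.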