QUINN EMANUEL URQUHART & SULLIVAN, LLP
James R. Asperger (Bar No. 083188)
jimasperger@quinnemanuel.com
865 S. Figueroa St., 10th Floor
Los Angeles, California 90017
Telephone: (213) 443-3000
Facsimile: (213) 443-3100
Kevin P.B. Johnson (Bar No. 177129)
kevinjohnson@quinnemanuel.com
555 Twin Dolphin Drive, 5th Floor
Redwood Shores, California 94065
Telephone: (650) 801-5000
Facsimile: (650) 801-5100
Attorneys for Plaintiff the California Institute of Technology

UNITED STATES DISTRICT COURT CENTRAL DISTRICT OF CALIFORNIA

The CALIFORNIA INSTITUTE OF TECHNOLOGY, a California corporation,

Plaintiff,
vs.
HUGHES COMMUNICATIONS, INC., a Delaware corporation, HUGHES NETWORK SYSTEMS, LLC, a Delaware limited liability company, DISH NETWORK CORPORATION, a Nevada corporation, DISH NETWORK L.L.C., a Colorado limited liability company, and DISHNET SATELLITE BROADBAND L.L.C., a Colorado limited liability company,

Defendants.

CASE NO. 2:13-cv-07245-MRP-JEM

## AMENDED COMPLAINT FOR PATENT INFRINGEMENT

## JURY TRIAL DEMANDED



Plaintiff the California Institute of Technology ("Caltech" or "Plaintiff"), by and through its undersigned counsel, complains and alleges as follows against Hughes Communications, Inc., Hughes Network Systems, LLC, DISH Network Corporation, DISH Network L.L.C., and dishNET Satellite Broadband L.L.C. (collectively, "Defendants"):

## NATURE OF THE ACTION

1. This is a civil action for patent infringement arising under the patent laws of the United States, 35 U.S.C. §§ 1 et seq.
2. Defendants have infringed and continue to infringe, contributed to and continue to contribute to the infringement of, and/or actively induced and continue to induce others to infringe Caltech's U.S. Patent No. 7,116,710, U.S. Patent No. 7,421,032, U.S. Patent No. 7,916,781, and U.S. Patent No. 8,284,833 (collectively, "the Asserted Patents"). Caltech is the legal owner by assignment of the Asserted Patents, which were duly and legally issued by the United States Patent and Trademark Office. Caltech seeks injunctive relief and monetary damages.

## THE PARTIES

3. Caltech is a non-profit private university organized under the laws of the State of California, with its principal place of business at 1200 East California Boulevard, Pasadena, California 91125.
4. On information and belief, Hughes Communications, Inc. ("Hughes Communications") is a corporation organized under the laws of the State of Delaware, with its principal place of business located at 11717 Exploration Lane, Germantown, Maryland 20876. On information and belief, Hughes Communications is a wholly-owned subsidiary of Hughes Satellite Systems Corporation, which is a wholly-owned subsidiary of EchoStar Corporation ("EchoStar").
5. On information and belief, Hughes Network Systems, LLC ("Hughes Network") is a limited liability company organized under the laws of the State of Delaware, with its principal place of business located at 11717 Exploration Lane, Germantown, Maryland 20876. On information and belief, Hughes Network is a wholly owned subsidiary of Hughes Communications. Hughes Communications and Hughes Network, collectively, are referred to as "Hughes Defendants."
6. On information and belief, DISH Network Corporation ("DISH Corp.") is a corporation organized under the laws of the State of Nevada with its principal place of business located at 9601 South Meridian Boulevard, Englewood, Colorado 80112.
7. On information and belief, DISH Network L.L.C. ("DISH L.L.C.") is a limited liability company organized under the laws of the State of Colorado with its principal place of business located at 9601 South Meridian Boulevard, Englewood, Colorado 80112. On information and belief, DISH L.L.C. is a wholly owned subsidiary of DISH Corp.
8. On information and belief, dishNET Satellite Broadband L.L.C. ("dishNET") is a limited liability company organized under the laws of the State of Colorado with its principal place of business located at 9601 South Meridian Boulevard, Englewood, Colorado 80112. On information and belief, dishNET is a wholly owned subsidiary of DISH Corp. On information and belief, dishNET and DISH L.L.C. are related entities. DISH Corp., DISH L.L.C., and dishNET, collectively, are referred to as "Dish Defendants."
9. On information and belief, Hughes Defendants' parent company, EchoStar, and Dish Defendants were previously one company. On information and belief, around January 2008, EchoStar and Dish Defendants became two separate companies (the "spin-off").
10. On information and belief, the business relationship among Dish Defendants, EchoStar and Hughes Defendants remains extremely integrated. The same individual serves as the Chairman of both Dish Defendants and EchoStar. Further, since the spin-off, a substantial majority of the voting power of the shares of both Dish Defendants and EchoStar is owned beneficially by the Chairman, or by certain trusts established by the Chairman. Additionally, on information and belief, in addition to the Chairman, an individual responsible for the development and implementation of advanced technologies that are of potential utility and importance to both Dish Defendants and EchoStar serves on the board of both companies. On information and belief, in 2010, Dish Defendants accounted for 82.5% of EchoStar's total revenue and in 2012, Dish Defendants accounted for 49.5% of EchoStar's total revenue. Additionally, on information and belief, in October 2012, Dish Defendants and Hughes Defendants entered into a distribution agreement relating to Hughes Defendants' satellite internet service.

## JURISDICTION AND VENUE

11. This Court has jurisdiction over the subject matter of this action under 28 U.S.C. §§ 1331 and 1338(a).
12. Hughes Defendants are subject to this Court's personal jurisdiction. On information and belief, Hughes Defendants regularly conduct business in the State of California, including in the Central District of California, and have committed acts of patent infringement and/or contributed to or induced acts of patent infringement by others in this District and elsewhere in California and the United States. As such, Hughes Defendants have purposefully availed themselves of the privilege of conducting business within this District; have established sufficient minimum contacts with this District such that they should reasonably and fairly anticipate being haled into court in this District; have purposefully directed activities at residents of this State; and at least a portion of the patent infringement claims alleged herein arise out of or are related to one or more of the foregoing activities.
13. Dish Defendants are subject to this Court's personal jurisdiction. On information and belief, Dish Defendants regularly conduct business in the State of California, including in the Central District of California, maintain employees in this District and elsewhere in California, and have committed acts of patent infringement and/or contributed to or induced acts of patent infringement by others in this District and elsewhere in California and the United States. As such, Dish Defendants have purposefully availed themselves of the privilege of conducting business within this District; have established sufficient minimum contacts with this District such that they should reasonably and fairly anticipate being haled into court in this District; have purposefully directed activities at residents of this State; and at least a portion of the patent infringement claims alleged herein arise out of or are related to one or more of the foregoing activities.
14. Venue is proper in this judicial district pursuant to 28 U.S.C. §§ 1391 and 1400 because Defendants regularly conduct business in this District, and certain of the acts complained of herein occurred in this District.

## CALTECH'S ASSERTED PATENTS

15. On October 3, 2006, the United States Patent Office issued U.S. Patent No. 7,116,710, titled "Serial Concatenation of Interleaved Convolutional Codes Forming Turbo-Like Codes" (the "'710 patent"). A true and correct copy of the '710 patent is attached hereto as Exhibit A.
16. On September 2, 2008, the United States Patent Office issued U.S. Patent No. 7,421,032, titled "Serial Concatenation of Interleaved Convolutional Codes Forming Turbo-Like Codes" (the "'032 patent"). A true and correct copy of the '032 patent is attached hereto as Exhibit B. The '032 patent is a continuation of the application that led to the '710 patent.
17. On March 29, 2011, the United States Patent Office issued U.S. Patent No. 7,916,781, titled "Serial Concatenation of Interleaved Convolutional Codes Forming Turbo-Like Codes" (the "'781 patent"). A true and correct copy of the '781 patent is attached hereto as Exhibit C. The '781 patent is a continuation of the application that led to the '032 patent, which is a continuation of the application that led to the '710 patent.
18. On October 9, 2012, the United States Patent Office issued U.S. Patent No. 8,284,833, titled "Serial Concatenation of Interleaved Convolutional Codes Forming Turbo-Like Codes" (the "'833 patent"). A true and correct copy of the '833 patent is attached hereto as Exhibit D. The '833 patent is a continuation of the application that led to the '781 patent, which is a continuation of the application that led to the '032 patent, which is a continuation of the application that led to the '710 patent.
19. The Asserted Patents identify Hui Jin, Aamod Khandekar, and Robert J. McEliece as the inventors (the "Named Inventors").
20. Caltech is the owner of all right, title, and interest in and to each of the Asserted Patents with full and exclusive right to bring suit to enforce the Asserted Patents, including the right to recover for past damages and/or royalties.
21. The Asserted Patents are valid and enforceable.

## BACKGROUND TO THIS ACTION

22. The Asserted Patents disclose a seminal improvement to coding systems and methods used for digital satellite transmission. The Asserted Patents disclose an ensemble of codes called irregular repeat-accumulate (IRA) codes, which are specific types of low-density parity check (LDPC) codes. The IRA codes disclosed in the Asserted Patents enable a transmission rate close to the theoretical limit, while also providing the advantage of a low encoding complexity.
23. In September 2000, the Named Inventors of the Asserted Patents published a paper regarding their invention, titled "Irregular Repeat-Accumulate Codes" for the Second International Conference on Turbo Codes. (Exhibit E.) This paper has been widely cited by experts in the industry.
24. Experts recognize the importance and usefulness of the IRA codes disclosed in the September 2000 paper by the Named Inventors of the Asserted Patents. For example, a paper praising these IRA codes was published in August 2004 by Aline Roumy, Souad Guemghar, Giuseppe Caire, and Sergio Verdú in the IEEE Transactions on Information Theory. This paper, titled "Design Methods for Irregular Repeat-Accumulate Codes," states:
IRA codes are, in fact, special subclasses of both irregular LDPCs and irregular turbo codes. . . . IRA codes are an appealing choice because the encoder is extremely simple, their performance is quite competitive with that of turbo codes and LDPCs, and they can be decoded with a very-low-complexity iterative decoding scheme.
(Exhibit F, at 1.) This paper also notes that, four years after the September 2000 paper, the Named Inventors were the only ones to propose a method to design IRA codes. (Id.)
25. The current standard for digital satellite transmissions embodies the invention of the Asserted Patents by using channel codes that are IRA codes. This digital satellite transmission standard is titled "Digital Video Broadcasting (DVB); Second generation framing structure, channel coding and modulation systems for Broadcasting, Interactive Services, News Gathering and other broadband satellite applications" (the "DVB-S2 standard").
26. Experts in the industry recognize that the DVB-S2 standard uses the IRA codes initially disclosed by the Named Inventors of the Asserted Patents. For example, a 2005 paper published by the highly regarded Institute of Electrical and Electronics Engineers (IEEE), titled "A Synthesizable IP Core for DVB-S2 LDPC Code Decoding," and authored by Frank Kienle, Torben Brack, and Norbert Wehn recognizes:

The LDPC codes as defined in the DVB-S2 standard are IRA codes, thus the encoder realization is straight forward. Furthermore, the DVB-S2 code shows regularities which can be exploited for an efficient hardware realization.
(Exhibit G, at 1.)
27. Moreover, this paper provides credit to the September 2000 paper authored by the Named Inventors of the Asserted Patents for the origination of the IRA codes that are defined in the DVB-S2 standard. (Id. at 1 & n.8.)
28. Similarly, on information and belief, a 2007 paper titled "Factorizable Modulo M Parallel Architecture for DVB-S2 LDPC Decoding," and published in the Proceedings of the 6th Conference on Telecommunications, recognizes that the DVB-S2 standard uses the IRA codes initially disclosed by the Named Inventors of the Asserted Patents. This paper, authored by Marco Gomes, Gabriel Falcão, Vitor Silva, Vitor Ferreira, Alexandre Sengo, and Miguel Falcão, states:

The new DVB-S2 [] standard adopted a special class of LDPC codes known by IRA codes [] as the main solution for the FEC system.
(Exhibit H, at 1.)
29. Moreover, this paper also credits the September 2000 paper authored by the Named Inventors of the Asserted Patents for the origination of the IRA codes that are defined in the DVB-S2 standard. (Id. at 1 & n.8.)
30. As even further support, on information and belief, a 2006 industry paper published in the Journal of Communications Software and Systems, titled "Design of LDPC Codes: A Survey and New Results" and authored by Gianluigi Liva, Shumei Song, Lan Lan, Yifei Zhang, Shu Lin, and William E. Ryan, confirms that the DVB-S2 standard uses the IRA codes, stating:

The ETSI DVB S2 [] standard for digital video broadcast specifies two IRA code families with block lengths 64800 and 16200.
(Exhibit I, at 10-11.)
31. As such, products, methods, equipment, and/or services that implement the DVB-S2 standard practice one or more claims of each of the Asserted Patents because the DVB-S2 standard embodies the invention of the Asserted Patents by using IRA codes.
32. On information and belief, Hughes Defendants manufacture, use, import, offer for sale, or sell products, methods, equipment, and/or services that implement the DVB-S2 standard. For example, Hughes Defendants provide satellite broadband internet access to consumers and broadband network services to the enterprise markets, among other activities, including through their HN System and HX System product lines. Hughes Defendants have extensively publicized that their flagship HN System and HX System satellite broadband internet product lines implement the DVB-S2 standard. On information and belief, Hughes Defendants market and sell, among other activities, certain broadband equipment and services that implement the DVB-S2 standard through the HughesNet brand. On information and belief, Hughes Defendants further sell or provide certain broadband equipment and services that implement the DVB-S2 standard to Dish Defendants. On information and belief, Hughes Defendants use their broadband equipment that implements the DVB-S2 standard for testing, consulting, and/or support services, among other activities.
33. On information and belief, Dish Defendants manufacture, use, import, offer for sale, or sell products, methods, equipment, and/or services that implement the DVB-S2 standard. For example, on information and belief, Dish Defendants manufacture, market, offer for sale, sell, distribute, and/or use, among other activities, the Hopper set-top box that implements the DVB-S2 standard. Additionally, for example, on information and belief, Dish Defendants market, offer for sale, sell, and distribute, among other activities, Hughes Defendants' satellite internet service, among other products and services, under the dishNET brand pursuant to a distribution agreement entered into with Hughes Defendants in October 2012. On information and belief, Dish Defendants purchase certain broadband equipment and services that implement the DVB-S2 standard from Hughes Defendants and offer for sale, sell, provide, and/or distribute this equipment and service to their customers. On information and belief, Dish Defendants use this broadband equipment and service that implement the DVB-S2 standard for testing, consulting and/or support services, among other activities. On information and belief, the dishNET services are primarily bundled with other services offered by Dish Defendants.
34. Hughes Defendants admit that their broadband satellite systems are compliant with "high-speed DVB-S2." (Exhibit J.) Additionally, Hughes Defendants have touted that implementation of this DVB-S2 standard "provides for higher throughputs, better coding efficiency, and improved satellite resource utilization for the outbound channel." (Exhibit K.)
35. Further, Hughes Defendants' website advertises its HX System and provides a link to a brochure titled "High-Performance IP Satellite Broadband System." (Exhibit L.) This brochure similarly highlights Hughes Defendants' implementation of the DVB-S2 standard, stating that the core component of the HX System, the HX Gateway, "uses a DVB-S2 carrier . . . for the outbound channel received by all HX System remote terminals." (Id.)
36. Hughes Defendants' website also advertises its HN System and states that it is compliant with DVB-S2. (Exhibit M.)

## COUNT I

## Infringement of the '710 Patent

37. Plaintiff re-alleges and incorporates by reference the allegations of the preceding paragraphs of this Complaint as if fully set forth herein.
38. On information and belief, in violation of 35 U.S.C. § 271, Defendants have infringed and are currently infringing, directly and/or through intermediaries, the '710 patent by making, using, selling, offering for sale, and/or importing into the United States, without authority, products, methods, equipment, and/or services that practice one or more claims of the '710 patent. These products, methods, equipment, and/or services include products that implement the DVB-S2 standard, including without limitation products in the HN System and HX System product lines, satellite internet product lines distributed under the dishNET brand, the Hopper set-top box, network and network services that employ these products, and/or marketing, consulting, and/or support services provided for these products and services (collectively, the "Accused Services and Products"). For example, at least Paragraphs 32 and 33 illustrate a limited number of examples of Defendants' direct infringement of the '710 patent. Defendants have infringed and are currently infringing literally and/or under the doctrine of equivalents.
39. On information and belief, in violation of 35 U.S.C. § 271, Defendants have infringed and are continuing to infringe the '710 patent by contributing to and/or actively inducing the infringement by others of the '710 patent by making, using, selling, offering for sale, and/or importing into the United States, without authority, products, methods, equipment, and/or services, including the Accused Services and Products, that practice one or more claims of the '710 patent.
40. Hughes Defendants have had actual knowledge of their infringement of the '710 patent before the filing date of this Complaint through letters alleging such infringement, or at least have had actual knowledge of their infringement of the '710 patent since no later than the filing date of this Complaint.
41. On information and belief, Dish Defendants have had actual knowledge of their infringement of the '710 patent before the filing date of this Complaint based on their marketing, sale, and distribution, among other activities, of Hughes Defendants' satellite internet service and their relationship with Hughes Defendants (see Paragraphs 9, 10, 33). Dish Defendants at least have had actual knowledge of their infringement of the '710 patent since no later than the filing date of this Complaint.
42. Notwithstanding Defendants' actual notice of infringement, Defendants have continued, directly and/or through intermediaries, to manufacture, use, import, offer for sale, or sell the Accused Services and Products with knowledge of or willful blindness to the fact that their actions will induce others, including but not limited to their customers, partners, and/or end users, to infringe the '710 patent. Defendants have induced and continue to induce others to infringe the '710 patent in violation of 35 U.S.C. § 271 by encouraging and facilitating others to perform actions that Defendants know to be acts of infringement of the '710 patent with intent that those performing the acts infringe the '710 patent. Upon information and belief, Defendants, directly and/or through intermediaries, advertise and distribute the Accused Services and Products, publish instruction materials, specifications and/or promotional literature describing the operation of the Accused Services and Products, and/or offer training and/or consulting services regarding the Accused Services and Products to their customers, partners, and/or end users. At least consumers, partners, and/or end users of these Accused Services and Products then directly or jointly infringe the '710 patent by making, using, selling, offering for sale, and/or importing into the United States, without authority, the Accused Services and Products.
43. Upon information and belief, Defendants know that the Accused Services and Products are especially made or especially adapted for use in the infringement of the '710 patent. The infringing components of these products are not staple articles or commodities of commerce suitable for substantial noninfringing use, and the infringing components of these products are a material part of the invention of the '710 patent. Accordingly, in violation of 35 U.S.C. § 271, Defendants are also contributing, directly and/or through intermediaries, to the direct infringement of the '710 patent by at least the customers, partners, and/or end users of these Accused Services and Products. The customers, partners, and/or end users of these Accused Services and Products directly infringe the '710 patent by making, using, selling, offering for sale, and/or importing into the United States, without authority, the Accused Services and Products.
44. As but one example of Hughes Defendants' contributory and/or induced infringement, Hughes Defendants explicitly encourage their customers to practice the methods disclosed and claimed in the '710 patent by using the Accused Services and Products. As detailed in Paragraphs 34 through 36, Hughes Defendants' website advertises its HN System and HX System, and provides information and brochures regarding these systems. (See Exhibits J, K, L, M.) These webpages and brochures highlight Hughes Defendants' implementation of the DVB-S2 standard. On information and belief, through materials such as these, the Hughes Defendants actively encourage their consumers, partners, and/or end users to infringe the '710 patent through at least use of the HN System and HX System product lines, knowing those acts to be infringement of the '710 patent with intent that those performing the acts infringe the '710 patent.
45. As but one example of Dish Defendants' contributory and/or induced infringement, Dish Defendants explicitly encourage their customers to practice the methods disclosed and claimed in the '710 patent by using the Accused Services and Products. According to Dish Defendants' 2012 Annual Report (10-K), Dish Defendants lease to dishNET satellite internet subscribers the customer premise equipment. Dish Defendants also advertise, market, offer for sale, and sell to customers the Hopper set-top box on their website. On information and belief, the dishNET customer premise equipment and the Hopper set-top box implement the DVB-S2 standard. On information and belief, through providing this equipment, Dish Defendants actively encourage their consumers and end users to infringe the '710 patent through at least use of the equipment, knowing those acts to be infringement of the '710 patent with intent that those performing the acts infringe the '710 patent.
46. Defendants are not licensed or otherwise authorized to practice, contributorily practice and/or induce third parties to practice the claims of the '710 patent.
47. By reason of Defendants' infringing activities, Caltech has suffered, and will continue to suffer, substantial damages.
48. Caltech is entitled to recover from Defendants the damages sustained as a result of Defendants' wrongful acts in an amount subject to proof at trial.
49. Defendants' continuing acts of infringement are irreparably harming and causing damage to Caltech, for which Caltech has no adequate remedy at law, and will continue to suffer such irreparable injury unless Defendants' continuing acts of infringement are enjoined by the Court. The hardships that an injunction would impose are less than those faced by Caltech should an injunction not issue. The public interest would be served by issuance of an injunction. Thus, Caltech is entitled to a preliminary and a permanent injunction against further infringement.
50. Hughes Defendants' infringement of the '710 patent has been and continues to be willful and deliberate, justifying a trebling of damages under 35 U.S.C. § 284. Among other facts, Hughes Defendants have had knowledge of their infringement of the '710 patent before the filing date of this Complaint through letters alleging such infringement. Upon information and belief, Hughes Defendants' accused actions continued despite an objectively high likelihood that they constituted infringement of the '710 patent. Hughes Defendants either knew or should have known about their risk of infringing the '710 patent. Hughes Defendants' conduct despite this knowledge was made with both objective and subjective reckless disregard for the infringing nature of their activities as demonstrated by Hughes Defendants' knowledge regarding the claims of the '710 patent.
51. Defendants' infringement of the '710 patent is exceptional and entitles Caltech to attorneys' fees and costs incurred in prosecuting this action under 35 U.S.C. § 285.

## COUNT II

## Infringement of the '032 Patent

52. Plaintiff re-alleges and incorporates by reference the allegations of the preceding paragraphs of this Complaint as if fully set forth herein.
53. On information and belief, in violation of 35 U.S.C. § 271, Defendants have infringed and are currently infringing, directly and/or through intermediaries, the '032 patent by making, using, selling, offering for sale, and/or importing into the United States, without authority, products, methods, equipment, and/or services that practice one or more claims of the '032 patent. These products, methods, equipment, and/or services include products that implement the DVB-S2 standard, including without limitation products in the HN System and HX System product lines, satellite internet product lines distributed under the dishNET brand, the Hopper set-top box, network and network services that employ these products, and/or marketing, consulting, and/or support services provided for these products and services (collectively, the "Accused Services and Products"). For example, at least Paragraphs 32 and 33 illustrate a limited number of examples of Defendants' direct infringement of the '032 patent. Defendants have infringed and are currently infringing literally and/or under the doctrine of equivalents.
54. On information and belief, in violation of 35 U.S.C. § 271, Defendants have infringed and are continuing to infringe the '032 patent by contributing to and/or actively inducing the infringement by others of the '032 patent by making, using, selling, offering for sale, and/or importing into the United States, without authority, products, methods, equipment, and/or services, including the Accused Services and Products, that practice one or more claims of the '032 patent.
55. Hughes Defendants have had actual knowledge of their infringement of the '032 patent before the filing date of this Complaint through letters alleging such infringement, or at least have had actual knowledge of their infringement of the '032 patent since no later than the filing date of this Complaint.
56. On information and belief, Dish Defendants have had actual knowledge of their infringement of the '032 patent before the filing date of this Complaint based on their marketing, sale, and distribution, among other activities, of Hughes Defendants' satellite internet service and their relationship with Hughes Defendants (see Paragraphs 9, 10, 33). Dish Defendants at least have had actual knowledge of their infringement of the '032 patent since no later than the filing date of this Complaint.
57. Notwithstanding Defendants' actual notice of infringement, Defendants have continued, directly and/or through intermediaries, to manufacture, use, import, offer for sale, or sell the Accused Services and Products with knowledge of or willful blindness to the fact that their actions will induce others, including but not limited to their customers, partners, and/or end users, to infringe the '032 patent. Defendants have induced and continue to induce others to infringe the '032 patent in violation of 35 U.S.C. § 271 by encouraging and facilitating others to perform actions that Defendants know to be acts of infringement of the '032 patent with intent that those performing the acts infringe the '032 patent. Upon information and belief, Defendants, directly and/or through intermediaries, advertise and distribute the Accused Services and Products, publish instruction materials, specifications and/or promotional literature describing the operation of the Accused Services and Products, and/or offer training and/or consulting services regarding the Accused Services and Products to their customers, partners, and/or end users. At least consumers, partners, and/or end users of these Accused Services and Products then directly or jointly infringe the '032 patent by making, using, selling, offering for sale, and/or importing into the United States, without authority, the Accused Services and Products.
58. Upon information and belief, Defendants know that the Accused Services and Products are especially made or especially adapted for use in the infringement of the '032 patent. The infringing components of these products are not staple articles or commodities of commerce suitable for substantial noninfringing use, and the infringing components of these products are a material part of the invention of the '032 patent. Accordingly, in violation of 35 U.S.C. § 271, Defendants are also contributing, directly and/or through intermediaries, to the direct infringement of the '032 patent by at least the customers, partners, and/or end users of these Accused Services and Products. The customers, partners, and/or end users of these Accused Services and Products directly infringe the '032 patent by making, using, selling, offering for sale, and/or importing into the United States, without authority, the Accused Services and Products.
59. As but one example of Hughes Defendants' contributory and/or induced infringement, Hughes Defendants explicitly encourage their customers to practice the methods disclosed and claimed in the '032 patent by using the Accused Services and Products. As detailed in Paragraphs 34 through 36, Hughes Defendants' website advertises its HN System and HX System, and provides information and brochures regarding these systems. (See Exhibits J, K, L, M.) These webpages and brochures highlight Hughes Defendants' implementation of the DVB-S2 standard. On information and belief, through materials such as these, the Hughes Defendants actively encourage their consumers, partners, and/or end users to infringe the '032 patent through at least use of the HN System and HX System product lines, knowing those acts to be infringement of the '032 patent with intent that those performing the acts infringe the '032 patent.
60. As but one example of Dish Defendants' contributory and/or induced infringement, Dish Defendants explicitly encourage their customers to practice the methods disclosed and claimed in the '032 patent by using the Accused Services and Products. According to Dish Defendants' 2012 Annual Report (10-K), Dish Defendants lease to dishNET satellite internet subscribers the customer premise equipment. Dish Defendants also advertise, market, offer for sale, and sell to customers the Hopper set-top box on their website. On information and belief, the dishNET customer premise equipment and the Hopper set-top box implement the DVB-S2 standard. On information and belief, through providing this equipment, Dish Defendants actively encourage their consumers and end users to infringe the '032 patent through at least use of the equipment, knowing those acts to be infringement of the '032 patent with intent that those performing the acts infringe the '032 patent.
61. Defendants are not licensed or otherwise authorized to practice, contributorily practice and/or induce third parties to practice the claims of the '032 patent.
62. By reason of Defendants' infringing activities, Caltech has suffered, and will continue to suffer, substantial damages.
63. Caltech is entitled to recover from Defendants the damages sustained as a result of Defendants' wrongful acts in an amount subject to proof at trial.
64. Defendants' continuing acts of infringement are irreparably harming and causing damage to Caltech, for which Caltech has no adequate remedy at law, and will continue to suffer such irreparable injury unless Defendants' continuing acts of infringement are enjoined by the Court. The hardships that an injunction would impose are less than those faced by Caltech should an injunction not issue. The public interest would be served by issuance of an injunction. Thus, Caltech is entitled to a preliminary and a permanent injunction against further infringement.
65. Hughes Defendants' infringement of the '032 patent has been and continues to be willful and deliberate, justifying a trebling of damages under 35 U.S.C. § 284. Among other facts, Hughes Defendants have had knowledge of their infringement of the '032 patent before the filing date of this Complaint through letters alleging such infringement. Upon information and belief, Hughes Defendants' accused actions continued despite an objectively high likelihood that they constituted infringement of the '032 patent. Hughes Defendants either knew or should have known about their risk of infringing the '032 patent. Hughes Defendants' conduct despite this knowledge was made with both objective and subjective reckless disregard for the infringing nature of their activities as demonstrated by Hughes Defendants' knowledge regarding the claims of the '032 patent.
66. Defendants' infringement of the '032 patent is exceptional and entitles Caltech to attorneys' fees and costs incurred in prosecuting this action under 35 U.S.C. § 285.

## COUNT III

## Infringement of the '781 Patent

67. Plaintiff re-alleges and incorporates by reference the allegations of the preceding paragraphs of this Complaint as if fully set forth herein.
68. On information and belief, in violation of 35 U.S.C. § 271, Defendants have infringed and are currently infringing, directly and/or through intermediaries, the '781 patent by using, without authority, products, methods, equipment, and/or services that practice one or more claims of the '781 patent. These products, methods, equipment, and/or services include products that implement the DVB-S2 standard, including without limitation products in the HN System and HX System product lines, satellite internet product lines distributed under the dishNET brand, the Hopper set-top box, network and network services that employ these products, and/or marketing, consulting, and/or support services provided for these products and services (collectively, the "Accused Services and Products"). For example, at least Paragraphs 32 and 33 illustrate a limited number of examples of Defendants' direct infringement of the '781 patent. Defendants have infringed and are currently infringing literally and/or under the doctrine of equivalents.
69. On information and belief, in violation of 35 U.S.C. § 271, Defendants have infringed and are continuing to infringe the '781 patent by contributing to and/or actively inducing the infringement by others of the '781 patent by making, using, selling, offering for sale, and/or importing into the United States, without authority, products, methods, equipment, and/or services, including the Accused Services and Products, that practice one or more claims of the '781 patent.
70. On information and belief, Hughes Defendants have had actual knowledge of their infringement of the '781 patent, the subject matter of the '781 patent, and/or the invention of the '781 patent before the filing date of this Complaint. On information and belief, Hughes Defendants also had knowledge of the application that led to the '781 patent before the filing date of this Complaint. Hughes Defendants at least have had actual knowledge of their infringement of the '781 patent since no later than the filing date of this Complaint.
71. On information and belief, Dish Defendants have had actual knowledge of their infringement of the '781 patent before the filing date of this Complaint based on their marketing, sale, and distribution, among other activities, of Hughes Defendants' satellite internet service and their relationship with Hughes Defendants (see Paragraphs 9, 10, 33). Dish Defendants at least have had actual knowledge of their infringement of the '781 patent since no later than the filing date of this Complaint.
72. Notwithstanding Defendants' actual notice of infringement, Defendants have continued, directly and/or through intermediaries, to manufacture, use, import, offer for sale, or sell the Accused Services and Products with knowledge of or willful blindness to the fact that their actions will induce others, including but not limited to their customers, partners, and/or end users, to infringe the '781 patent. Defendants have induced and continue to induce others to infringe the '781 patent in violation of 35 U.S.C. § 271 by encouraging and facilitating others to perform actions that Defendants know to be acts of infringement of the '781 patent with intent that those performing the acts infringe the '781 patent. Upon information and belief, Defendants, directly and/or through intermediaries, advertise and distribute the Accused Services and Products, publish instruction materials, specifications and/or promotional literature describing the operation of the Accused Services and Products, and/or offer training and/or consulting services regarding the Accused Services and Products to their customers, partners, and/or end users. At least consumers, partners, and/or end users of these Accused Services and Products then directly or jointly infringe the '781 patent by making, using, selling, offering for sale, and/or importing into the United States, without authority, the Accused Services and Products.
73. Upon information and belief, Defendants know that the Accused Services and Products are especially made or especially adapted for use in the infringement of the '781 patent. The infringing components of these products are not staple articles or commodities of commerce suitable for substantial noninfringing use, and the infringing components of these products are a material part of the invention of the '781 patent. Accordingly, in violation of 35 U.S.C. § 271, Defendants are also contributing, directly and/or through intermediaries, to the direct infringement of the '781 patent by at least the customers, partners, and/or end users of these Accused Services and Products. The customers, partners, and/or end users of these Accused Services and Products directly infringe the '781 patent by making, using, selling, offering for sale, and/or importing into the United States, without authority, the Accused Services and Products.
74. As but one example of Hughes Defendants' contributory and/or induced infringement, Hughes Defendants explicitly encourage their customers to practice the methods disclosed and claimed in the '781 patent by using the Accused Services and Products. As detailed in Paragraphs 34 through 36, Hughes Defendants' website advertises its HN System and HX System, and provides information and brochures regarding these systems. (See Exhibits J, K, L, M.) These webpages and brochures highlight Hughes Defendants' implementation of the DVB-S2 standard. On information and belief, through materials such as these, the Hughes Defendants actively encourage their consumers, partners, and/or end users to infringe the '781 patent through at least use of the HN System and HX System product lines, knowing those acts to be infringement of the '781 patent with intent that those performing the acts infringe the '781 patent.
75. As but one example of Dish Defendants' contributory and/or induced infringement, Dish Defendants explicitly encourage their customers to practice the methods disclosed and claimed in the '781 patent by using the Accused Services and Products. According to Dish Defendants' 2012 Annual Report (10-K), Dish Defendants lease to dishNET satellite internet subscribers the customer premise equipment. Dish Defendants also advertise, market, offer for sale, and sell to customers the Hopper set-top box on their website. On information and belief, the dishNET customer premise equipment and the Hopper set-top box implement the DVB-S2 standard. On information and belief, through providing this equipment, Dish Defendants actively encourage their consumers and end users to infringe the '781 patent through at least use of the equipment, knowing those acts to be infringement of the '781 patent with intent that those performing the acts infringe the '781 patent.
76. Defendants are not licensed or otherwise authorized to practice, contributorily practice and/or induce third parties to practice the claims of the '781 patent.
77. By reason of Defendants' infringing activities, Caltech has suffered, and will continue to suffer, substantial damages.
78. Caltech is entitled to recover from Defendants the damages sustained as a result of Defendants' wrongful acts in an amount subject to proof at trial.
79. Defendants' continuing acts of infringement are irreparably harming and causing damage to Caltech, for which Caltech has no adequate remedy at law, and will continue to suffer such irreparable injury unless Defendants' continuing acts of infringement are enjoined by the Court. The hardships that an injunction would impose are less than those faced by Caltech should an injunction not issue. The public interest would be served by issuance of an injunction. Thus, Caltech is entitled to a preliminary and a permanent injunction against further infringement.
80. Hughes Defendants' infringement of the '781 patent has been and continues to be willful and deliberate, justifying a trebling of damages under 35 U.S.C. § 284. Among other facts, on information and belief, Hughes Defendants have had knowledge of their infringement of the '781 patent, the subject matter of the '781 patent, and/or the invention of the '781 patent before the filing date of this Complaint. Upon information and belief, Hughes Defendants' accused actions continued despite an objectively high likelihood that they constituted infringement of the '781 patent. Hughes Defendants either knew or should have known about their risk of infringing the '781 patent. Hughes Defendants' conduct despite this knowledge was made with both objective and subjective reckless disregard for the infringing nature of their activities as demonstrated by Hughes Defendants' knowledge regarding the claims of the '781 patent.
81. Defendants' infringement of the '781 patent is exceptional and entitles Caltech to attorneys' fees and costs incurred in prosecuting this action under 35 U.S.C. § 285.

## COUNT IV

## Infringement of the '833 Patent

82. Plaintiff re-alleges and incorporates by reference the allegations of the preceding paragraphs of this Complaint as if fully set forth herein.
83. On information and belief, in violation of 35 U.S.C. § 271, Defendants have infringed and are currently infringing, directly and/or through intermediaries, the '833 patent by making, using, selling, offering for sale, and/or importing into the United States, without authority, products, methods, equipment, and/or services that practice one or more claims of the '833 patent. These products, methods, equipment, and/or services include products that implement the DVB-S2 standard, including without limitation products in the HN System and HX System product lines, satellite internet product lines distributed under the dishNET brand, the Hopper set-top box, network and network services that employ these products, and/or marketing, consulting, and/or support services provided for these products and services (collectively, the "Accused Services and Products"). For example, at least Paragraphs 32 and 33 illustrate a limited number of examples of Defendants' direct infringement of the '833 patent. Defendants have infringed and are currently infringing literally and/or under the doctrine of equivalents.
84. On information and belief, in violation of 35 U.S.C. § 271, Defendants have infringed and are continuing to infringe the '833 patent by contributing to and/or actively inducing the infringement by others of the '833 patent by making, using, selling, offering for sale, and/or importing into the United States, without authority, products, methods, equipment, and/or services, including the Accused Services and Products, that practice one or more claims of the '833 patent.
85. On information and belief, Hughes Defendants have had actual knowledge of their infringement of the '833 patent, the subject matter of the '833 patent, and/or the invention of the '833 patent before the filing date of this Complaint. On information and belief, Hughes Defendants also had knowledge of the application that led to the '833 patent before the filing date of this Complaint. Hughes Defendants at least have had actual knowledge of their infringement of the '833 patent since no later than the filing date of this Complaint.
86. On information and belief, Dish Defendants have had actual knowledge of their infringement of the '833 patent before the filing date of this Complaint based on their marketing, sale, and distribution, among other activities, of Hughes Defendants' satellite internet service and their relationship with Hughes Defendants (see Paragraphs 9, 10, 33). Dish Defendants at least have had actual knowledge of their infringement of the '833 patent since no later than the filing date of this Complaint.
87. Notwithstanding Defendants' actual notice of infringement, Defendants have continued, directly and/or through intermediaries, to manufacture, use, import, offer for sale, or sell the Accused Services and Products with knowledge of or willful blindness to the fact that their actions will induce others, including but not limited to their customers, partners, and/or end users, to infringe the '833 patent. Defendants have induced and continue to induce others to infringe the '833 patent in violation of 35 U.S.C. § 271 by encouraging and facilitating others to perform actions that Defendants know to be acts of infringement of the '833 patent with intent that those performing the acts infringe the '833 patent. Upon information and belief, Defendants, directly and/or through intermediaries, advertise and distribute the Accused Services and Products, publish instruction materials, specifications and/or promotional literature describing the operation of the Accused Services and Products, and/or offer training and/or consulting services regarding the Accused Services and Products to their customers, partners, and/or end users. At least consumers, partners, and/or end users of these Accused Services and Products then directly or jointly infringe the '833 patent by making, using, selling, offering for sale, and/or importing into the United States, without authority, the Accused Services and Products.
88. Upon information and belief, Defendants know that the Accused Services and Products are especially made or especially adapted for use in the infringement of the '833 patent. The infringing components of these products are not staple articles or commodities of commerce suitable for substantial noninfringing use, and the infringing components of these products are a material part of the invention of the '833 patent. Accordingly, in violation of 35 U.S.C. § 271, Defendants are also contributing, directly and/or through intermediaries, to the direct infringement of the '833 patent by at least the customers, partners, and/or end users of these Accused Services and Products. The customers, partners, and/or end users of these Accused Services and Products directly infringe the '833 patent by making, using, selling, offering for sale, and/or importing into the United States, without authority, the Accused Services and Products.
89. As but one example of Hughes Defendants' contributory and/or induced infringement, Hughes Defendants explicitly encourage their customers to practice the methods disclosed and claimed in the '833 patent by using the Accused Services and Products. As detailed in Paragraphs 34 through 36, Hughes Defendants' website advertises its HN System and HX System, and provides information and brochures regarding these systems. (See Exhibits J, K, L, M.) These webpages and brochures highlight Hughes Defendants' implementation of the DVB-S2 standard. On information and belief, through materials such as these, the Hughes Defendants actively encourage their consumers, partners, and/or end users to infringe the '833 patent through at least use of the HN System and HX System product lines, knowing those acts to be infringement of the '833 patent with intent that those performing the acts infringe the '833 patent.
90. As but one example of Dish Defendants' contributory and/or induced infringement, Dish Defendants explicitly encourage their customers to practice the methods disclosed and claimed in the '833 patent by using the Accused Services and Products. According to Dish Defendants' 2012 Annual Report (10-K), Dish Defendants lease to dishNET satellite internet subscribers the customer premise equipment. Dish Defendants also advertise, market, offer for sale, and sell to customers the Hopper set-top box on their website. On information and belief, the dishNET customer premise equipment and the Hopper set-top box implement the DVB-S2 standard. On information and belief, through providing this equipment, Dish Defendants actively encourage their consumers and end users to infringe the '833 patent through at least use of the equipment, knowing those acts to be infringement of the '833 patent with intent that those performing the acts infringe the '833 patent.
91. Defendants are not licensed or otherwise authorized to practice, contributorily practice and/or induce third parties to practice the claims of the '833 patent.
92. By reason of Defendants' infringing activities, Caltech has suffered, and will continue to suffer, substantial damages.
93. Caltech is entitled to recover from Defendants the damages sustained as a result of Defendants' wrongful acts in an amount subject to proof at trial.
94. Defendants' continuing acts of infringement are irreparably harming and causing damage to Caltech, for which Caltech has no adequate remedy at law, and will continue to suffer such irreparable injury unless Defendants' continuing acts of infringement are enjoined by the Court. The hardships that an injunction would impose are less than those faced by Caltech should an injunction not issue. The public interest would be served by issuance of an injunction. Thus, Caltech is entitled to a preliminary and a permanent injunction against further infringement.
95. Hughes Defendants' infringement of the '833 patent has been and continues to be willful and deliberate, justifying a trebling of damages under 35 U.S.C. § 284. Among other facts, on information and belief, Hughes Defendants have had knowledge of their infringement of the '833 patent, the subject matter of the '833 patent, and/or the invention of the '833 patent before the filing date of this Complaint. Upon information and belief, Hughes Defendants' accused actions continued despite an objectively high likelihood that they constituted infringement of the '833 patent. Hughes Defendants either knew or should have known about their risk of infringing the '833 patent. Hughes Defendants' conduct despite this knowledge was made with both objective and subjective reckless disregard for the infringing nature of their activities as demonstrated by Hughes Defendants' knowledge regarding the claims of the '833 patent.
96. Defendants' infringement of the '833 patent is exceptional and entitles Caltech to attorneys' fees and costs incurred in prosecuting this action under 35 U.S.C. § 285.

## PRAYER FOR RELIEF

WHEREFORE, Plaintiff respectfully prays for the following relief:
(a) A judgment that Defendants have infringed each and every one of the Asserted Patents;
(b) A preliminary and permanent injunction against Defendants, their respective officers, agents, servants, employees, attorneys, parent and subsidiary corporations, assigns and successors in interest, and those persons in active concert or participation with them, enjoining them from infringement, inducement of infringement, and contributory infringement of each and every one of the Asserted Patents, including but not limited to an injunction against making, using, selling, and/or offering for sale within the United States, and/or importing into the United States, any products, methods, equipment and/or services that infringe the Asserted Patents;
(c) Damages adequate to compensate Caltech for Defendants' infringement of the Asserted Patents pursuant to 35 U.S.C. § 284;
(d) Prejudgment interest;
(e) Post-judgment interest;
(f) A judgment holding Hughes Defendants' infringement of the Asserted Patents to be willful, and a trebling of damages pursuant to 35 U.S.C. § 284;
(g) A declaration that this Action is exceptional pursuant to 35 U.S.C. § 285, and an award to Caltech of its attorneys' fees, costs and expenses incurred in connection with this Action; and
(h) Such other relief as the Court deems just and equitable.

DATED: March 6, 2014
Respectfully submitted,
QUINN EMANUEL URQUHART \& SULLIVAN, LLP

By /s/ James R. Asperger
James R. Asperger
Attorneys for Plaintiff California Institute of Technology

## DEMAND FOR JURY TRIAL

Pursuant to Rule 38 of the Federal Rules of Civil Procedure and Local Rule 38-1 of this Court, Plaintiff hereby demands a trial by jury as to all issues so triable.

DATED: March 6, 2014
Respectfully submitted,
QUINN EMANUEL URQUHART \& SULLIVAN, LLP

By /s/ James R. Asperger
James R. Asperger
Attorneys for Plaintiff California Institute of Technology

## PROOF OF SERVICE

I am employed in the County of Los Angeles, State of California. I am over the age of eighteen years and not a party to the within action; my business address is 865 South Figueroa Street, 10th Floor, Los Angeles, California 90017-2543.

On March 6, 2014, I served true copies of the following document(s) described as AMENDED COMPLAINT FOR PATENT INFRINGEMENT on the interested parties in this action as follows:

Attorneys for Defendants and Counterclaim-Plaintiffs

David C. Marcus
david.marcus@wilmerhale.com
Matthew J. Hawkinson
Matthew.hawkinson@wilmerhale.com
Aaron Thompson
aaron.thompson@wilmerhale.com
WILMER CUTLER PICKERING HALE AND DORR LLP
350 South Grand Avenue, Suite 2100
Los Angeles, CA 90071

William G. McElwain (pro hac vice)
william.mcelwain@wilmerhale.com
WILMER CUTLER PICKERING HALE AND DORR LLP
1875 Pennsylvania Avenue NW
Washington, DC 20006

William F. Lee (pro hac vice)
william.lee@wilmerhale.com
WILMER CUTLER PICKERING HALE AND DORR LLP
60 State Street
Boston, Massachusetts 02109

BY ELECTRONIC MAIL TRANSMISSION: By electronic mail transmission from angeldelira@quinnemanuel.com on March 6, 2014, by transmitting a PDF format copy of such document(s) to each such person at the e-mail address listed below their address(es). The document(s) was/were transmitted by electronic transmission and such transmission was reported as complete and without error.

BY FEDEX: I deposited such document(s) in a box or other facility regularly maintained by FedEx, or delivered such document(s) to a courier or driver authorized by FedEx to receive documents, in sealed envelope(s) or package(s) designated by FedEx with delivery fees paid or provided for, addressed to the person(s) being served.

I declare that I am employed in the office of a member of the bar of this Court at whose direction the service was made.

Executed on March 6, 2014, at Los Angeles, California.


## (12) United States Patent

Jin et al.
(10) Patent No.: US 7,116,710 B1
(45) Date of Patent: Oct. 3, 2006
(54) SERIAL CONCATENATION OF INTERLEAVED CONVOLUTIONAL CODES FORMING TURBO-LIKE CODES
(75) Inventors: Hui Jin, Glen Gardner, NJ (US); Aamod Khandekar, Pasadena, CA (US); Robert J. McEliece, Pasadena, CA (US)
(73) Assignee: California Institute of Technology, Pasadena, CA (US)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 735 days.
(21) Appl. No.: 09/861,102
(22) Filed: May 18, 2001

## Related U.S. Application Data

(60) Provisional application No. 60/205,095, filed on May 18, 2000.
(51) Int. Cl. H04B 1/66 (2006.01)
(52) U.S. Cl. ..................... 375/240; 375/262; 375/265; 375/341; 341/51; 341/102; 714/752
(58) Field of Classification Search ..................... 375/259, 375/262, 265, 285, 296, 341, 346, 348; 714/746, 714/752, 755, 756, 786, 792, 794, 795, 796; 341/51, 52, 56, 102, 103
See application file for complete search history.

## References Cited

U.S. PATENT DOCUMENTS

| Document No. | Date | Patentee | Class |
| :---: | :---: | :---: | :---: |
| 5,392,299 A | 2/1995 | Rhines et al. | |
| 5,751,739 A * | 5/1998 | Seshadri et al. | 714/746 |
| 5,881,093 A | 3/1999 | Wang et al. | |
| 6,014,411 A * | 1/2000 | Wang | 375/259 |
| 6,023,783 A | 2/2000 | Divsalar et al. | |
| 6,031,874 A | 2/2000 | Chennakeshu et al. | |
| 6,032,284 A | 2/2000 | Bliss | |
| 6,044,116 A | 3/2000 | Wang | |
| 6,396,423 B1 * | 5/2002 | Laumen et al. | 341/95 |
| 6,437,714 B1 * | 8/2002 | Kim et al. | 341/81 |
| 2001/0025358 A1 | 9/2001 | Eidson et al. | |

OTHER PUBLICATIONS

Wiberg et al., "Codes and Iterative Decoding on General Graphs", 1995 Intl. Symposium on Information Theory, Sep. 1995, p. 506.*
Appendix A.1 "Structure of Parity Check Matrices of Standardized LDPC Codes," Digital Video Broadcasting (DVB) User guidelines for the second generation system for Broadcasting, Interactive Services, News Gathering and other broadband satellite applications (DVB-S2) ETSI TR 102376 V1.1.1. (2005-02) Technical Report. pp. 64.
Benedetto et al., "Bandwidth efficient parallel concatenated coding schemes," Electronics Letters 31(24):2067-2069 (Nov. 23, 1995).
Benedetto et al., "Soft-output decoding algorithms in iterative decoding of turbo codes," The Telecommunications and Data Acquisition (TDA) Progress Report 42-124 for NASA and California Institute of Technology Jet Propulsion Laboratory, Joseph H. Yuen, Ed., pp. 63-87 (Feb. 15, 1996).
(Continued)
Primary Examiner-Dac V. Ha
(74) Attorney, Agent, or Firm-Fish \& Richardson P.C.

## ABSTRACT

A serial concatenated coder includes an outer coder and an inner coder. The outer coder irregularly repeats bits in a data block according to a degree profile and scrambles the repeated bits. The scrambled and repeated bits are input to an inner coder, which has a rate substantially close to one.
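
The data flow described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the patented implementation: the function names, the seeded pseudo-random permutation used for scrambling, and the choice of a running XOR (an accumulator, which has rate exactly one) as the inner coder are all assumptions made for the example.

```python
import random
from itertools import accumulate
from operator import xor


def outer_repeat(bits, degrees):
    """Irregular repetition: bit i is repeated degrees[i] times.

    `degrees` stands in for the degree profile (hypothetical representation).
    """
    if len(bits) != len(degrees):
        raise ValueError("need one degree per input bit")
    repeated = []
    for bit, degree in zip(bits, degrees):
        repeated.extend([bit] * degree)
    return repeated


def scramble(bits, seed=0):
    """Scramble the repeated bits with a fixed permutation (assumed interleaver)."""
    order = list(range(len(bits)))
    random.Random(seed).shuffle(order)
    return [bits[i] for i in order]


def inner_accumulate(bits):
    """Assumed inner coder: running XOR, one output bit per input bit (rate 1)."""
    return list(accumulate(bits, xor))


def encode(bits, degrees, seed=0):
    """Outer coder (repeat, then scramble) followed by the inner coder."""
    return inner_accumulate(scramble(outer_repeat(bits, degrees), seed))


if __name__ == "__main__":
    data = [1, 0, 1]
    profile = [2, 3, 4]  # hypothetical degree profile
    print(encode(data, profile))
```

In this sketch the output length equals the sum of the degrees, because the assumed inner coder neither adds nor drops bits.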

33 Claims, 5 Drawing Sheets


## OTHER PUBLICATIONS

Benedetto et al., "Serial Concatenation of Interleaved Codes: Performance Analysis, Design, and Iterative Decoding," The Telecommunications and Data Acquisition (TDA) Progress Report 42-126 for NASA and California Institute of Technology Jet Propulsion Laboratory, Joseph H. Yuen, Ed., pp. 1-26 (Aug. 15, 1996).
Benedetto et al., "A Soft-Input Soft-Output Maximum A Posteriori (MAP) Module to Decode Parallel and Serial Concatenated Codes," The Telecommunications and Data Acquisition (TDA) Progress Report 42-127 for NASA and California Institute of Technology Jet Propulsion Laboratory, Joseph H. Yuen, Ed., pp. 1-20 (Nov. 15, 1996).

Benedetto et al., "Parallel Concatenated Trellis Coded Modulation," ICC '96, IEEE, pp. 974-978, (Jun. 1996).
Benedetto, S. et al., "A Soft-Input Soft-Output APP Module for Iterative Decoding of Concatenated Codes," IEEE Communications Letters 1(1):22-24 (Jan. 1997).
Benedetto et al., "Serial Concatenation of interleaved codes: performance analysis, design, and iterative decoding," Proceedings from the IEEE 1997 International Symposium on Information Theory (ISIT), Ulm, Germany, p. 106, Jun. 29-Jul. 4, 1997.
Benedetto et al., "Serial Concatenated Trellis Coded Modulation with Iterative Decoding," Proceedings from IEEE 1997 International Symposium on Information Theory (ISIT), Ulm, Germany, p. 8, Jun. 29-Jul. 4, 1997.
Benedetto et al., "Design of Serially Concatenated Interleaved Codes," ICC 97, Montreal, Canada, pp. 710-714, (Jun. 1997).
Berrou et al., "Near Shannon Limit Error-Correcting Coding and Decoding: Turbo Codes," ICC pp. 1064-1070 (1993).
Digital Video Broadcasting (DVB) User guidelines for the second generation system for Broadcasting, Interactive Services, News Gathering and other broadband satellite applications (DVB-S2) ETSI TR 102376 V1.1.1. (Feb. 2005) Technical Report, pp. 1-104 (Feb. 15, 2005).
Divsalar et al., "Coding Theorems for 'Turbo-Like' Codes," Proceedings of the 36th Annual Allerton Conference on Communication, Control, and Computing, Sep. 23-25 1998, Allerton House, Monticello, Illinois, pp. 201-210 (1998).
Divsalar, D. et al., "Multiple Turbo Codes for Deep-Space Communications," The Telecommunications and Data Acquisition (TDA) Progress Report 42-121 for NASA and California Institute of Technology Jet Propulsion Laboratory, Joseph H. Yuen, Ed., pp. 60-77 (May 15, 1995).
Divsalar, D. et al., "On the Design of Turbo Codes," The Telecommunications and Data Acquisition (TDA) Progress Report 42-123 for NASA and California Institute of Technology Jet Propulsion Laboratory, Joseph H. Yuen, Ed., pp. 99-131 (Nov. 15, 1995).
Divsalar, D. et al., "Low-rate turbo codes for Deep Space Communications," Proceedings from the 1995 IEEE International Symposium on Information Theory, Sep. 17-22, 1995, Whistler, British Columbia, Canada, p. 35.
Divsalar, D. et al., "Turbo Codes for PCS Applications," ICC 95, IEEE, Seattle, WA, pp. 54-59 (Jun. 1995).
Divsalar, D. et al., "Multiple Turbo Codes," MILCOM 95, San Diego, CA pp. 279-285 (Nov. 5-6, 1995).
Divsalar et al., "Effective free distance of turbo codes," Electronics Letters 32(5): 445-446 (Feb. 29, 1996).
Divsalar, D. et al., "Hybrid concatenated codes and Iterative Decoding," Proceedings from the IEEE 1997 International Symposium on Information Theory (ISIT), Ulm, Germany, p. 10 (Jun. 29-Jul. 4, 1997).

Divsalar, D. et al., "Serial Turbo Trellis Coded Modulation with Rate-1 Inner Code," Proceedings from the IEEE 2000 International Symposium on Information Theory (ISIT), Italy, pp. 1-14 (Jun. 2000).

Jin et al., "Irregular Repeat - Accumulate Codes," 2nd International Symposium on Turbo Codes \& Related Topics, Sep. 4-7, 2000, Brest, France, 25 slides, (presented on Sep. 4, 2000).
Jin et al., "Irregular Repeat-Accumulate Codes," 2nd International Symposium on Turbo Codes \& Related Topics, Sep. 4-7, 2000, Brest, France, pp. 1-8 (2000).
Richardson, et al., "Design of capacity approaching irregular low density parity check codes," IEEE Trans. Inform. Theory 47: 619-637 (Feb. 2001).
Richardson, T. and R. Urbanke, "Efficient encoding of low-density parity check codes," IEEE Trans. Inform. Theory 47: 638-656 (Feb. 2001).

* cited by examiner

FIG. 1 (Prior Art)

FIG. 3

FIG. 5A

FIG. 5B

## SERIAL CONCATENATION OF INTERLEAVED CONVOLUTIONAL CODES FORMING TURBO-LIKE CODES

## CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Application Ser. No. 60/205,095, filed on May 18, 2000, and to U.S. application Ser. No. 09/922,852, filed on Aug. 18, 2000 and entitled Interleaved Serial Concatenation Forming Turbo-Like Codes.

## GOVERNMENT LICENSE RIGHTS

The U.S. Government has a paid-up license in this invention and the right in limited circumstances to require the patent owner to license others on reasonable terms as provided for by the terms of Grant No. CCR-9804793 awarded by the National Science Foundation.

## BACKGROUND

Properties of a channel affect the amount of data that can be handled by the channel. The so-called "Shannon limit" defines the theoretical limit of the amount of data that a channel can carry.

Different techniques have been used to increase the data rate that can be handled by a channel. "Near Shannon Limit Error-Correcting Coding and Decoding: Turbo Codes," by Berrou et al. ICC, pp 1064-1070, (1993), described a new "turbo code" technique that has revolutionized the field of error correcting codes. Turbo codes have sufficient randomness to allow reliable communication over the channel at a high data rate near capacity. However, they still retain sufficient structure to allow practical encoding and decoding algorithms. Still, the technique for encoding and decoding turbo codes can be relatively complex.
A standard turbo coder 100 is shown in FIG. 1. A block of $k$ information bits is input directly to a first coder 102. A $k$-bit interleaver 106 also receives the $k$ bits and interleaves them prior to applying them to a second coder 104. The second coder produces an output that has more bits than its input, that is, it is a coder with a rate that is less than 1. The coders 102, 104 are typically recursive convolutional coders.
Three different items are sent over the channel 150: the original k bits, first encoded bits 110, and second encoded bits 112. At the decoding end, two decoders are used: a first constituent decoder 160 and a second constituent decoder 162. Each receives both the original $k$ bits, and one of the encoded portions 110, 112. Each decoder sends likelihood estimates of the decoded bits to the other decoders. The estimates are used to decode the uncoded information bits as corrupted by the noisy channel.

## SUMMARY

A coding system according to an embodiment is configured to receive a portion of a signal to be encoded, for example, a data block including a fixed number of bits. The coding system includes an outer coder, which repeats and scrambles bits in the data block. The data block is apportioned into two or more sub-blocks, and bits in different sub-blocks are repeated a different number of times according to a selected degree profile. The outer coder may include a repeater with a variable rate and an interleaver. Alternatively, the outer coder may be a low-density generator matrix (LDGM) coder.

The repeated and scrambled bits are input to an inner coder that has a rate substantially close to one. The inner coder may include one or more accumulators that perform recursive modulo two addition operations on the input bit stream.

The encoded data output from the inner coder may be transmitted on a channel and decoded in linear time at a destination using iterative decoding techniques. The decoding techniques may be based on a Tanner graph representation of the code.

## BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of a prior "turbo code" system.

FIG. 2 is a schematic diagram of a coder according to an embodiment.

FIG. 3 is a Tanner graph for an irregular repeat and accumulate (IRA) coder.
FIG. 4 is a schematic diagram of an IRA coder according to an embodiment.

FIG. 5A illustrates a message from a variable node to a check node on the Tanner graph of FIG. 3.

FIG. 5B illustrates a message from a check node to a variable node on the Tanner graph of FIG. 3.

FIG. 2 illustrates a coder 200 according to an embodiment. The coder 200 may include an outer coder 202, an interleaver 204, and an inner coder 206. The coder may be used to format blocks of data for transmission, introducing redundancy into the stream of data to protect the data from loss due to transmission errors. The encoded data may then be decoded at a destination in linear time at rates that may approach the channel capacity.

The outer coder 202 receives the uncoded data. The data may be partitioned into blocks of fixed size, say $k$ bits. The outer coder may be an ( $\mathrm{n}, \mathrm{k}$ ) binary linear block coder, where $\mathrm{n}>\mathrm{k}$. The coder accepts as input a block $u$ of $k$ data bits and produces an output block $v$ of $n$ data bits. The mathematical relationship between $u$ and $v$ is $v=T_{0} u$, where $T_{0}$ is an $n \times k$ matrix, and the rate of the coder is $\mathrm{k} / \mathrm{n}$.

The rate of the coder may be irregular, that is, the value of $T_{0}$ is not constant, and may differ for sub-blocks of bits in the data block. In an embodiment, the outer coder 202 is a repeater that repeats the k bits in a block a number of times q to produce a block with n bits, where $\mathrm{n}=\mathrm{qk}$. Since the repeater has an irregular output, different bits in the block may be repeated a different number of times. For example, a fraction of the bits in the block may be repeated two times, a fraction of bits may be repeated three times, and the remainder of bits may be repeated four times. These fractions define a degree sequence, or degree profile, of the code.
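The irregular repetition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `irregular_repeat`, the representation of the degree profile as `(fraction, repeats)` pairs, and the example fractions are hypothetical and are not taken from the patent.

```python
def irregular_repeat(bits, profile):
    """Repeat each bit of a block according to a degree profile.

    profile is a list of (fraction, repeats) pairs (an assumed encoding of
    the degree profile). The fractions are assumed to sum to 1 and to split
    the block into whole sub-blocks.
    """
    k = len(bits)
    out = []
    start = 0
    for fraction, repeats in profile:
        size = round(fraction * k)           # length of this sub-block
        for bit in bits[start:start + size]:
            out.extend([bit] * repeats)      # repeat the bit `repeats` times
        start += size
    if start != k:
        raise ValueError("profile does not cover the block exactly")
    return out


block = [1, 0, 1, 1, 0, 0, 1, 0]
# Illustrative profile: half the bits twice, a quarter three times,
# and the remaining quarter four times.
profile = [(0.5, 2), (0.25, 3), (0.25, 4)]
repeated = irregular_repeat(block, profile)
print(len(repeated))  # 4*2 + 2*3 + 2*4 = 22
```

With this profile the average repeat count is 22/8 = 2.75, so the block length grows from 8 to 22 bits.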

The inner coder 206 may be a linear rate-1 coder, which means that the $n$-bit output block $x$ can be written as $x=T_{I} w$, where $T_{I}$ is a nonsingular $n \times n$ matrix. The inner coder 206 can have a rate that is close to 1, e.g., within $50\%$, more preferably $10\%$, and perhaps even more preferably within $1\%$ of 1.
In an embodiment, the inner coder 206 is an accumulator, which produces outputs that are the modulo two (mod-2) partial sums of its inputs. The accumulator may be a truncated rate-1 recursive convolutional coder with the transfer function $1 /(1+D)$. Such an accumulator may be considered a block coder whose input block $\left[x_{1}, \ldots, x_{n}\right]$ and output block $\left[y_{1}, \ldots, y_{n}\right]$ are related by the formula

$$
\begin{aligned}
& y_{1}=x_{1} \\
& y_{2}=x_{1} \oplus x_{2} \\
& y_{3}=x_{1} \oplus x_{2} \oplus x_{3} \\
& \;\;\vdots \\
& y_{n}=x_{1} \oplus x_{2} \oplus x_{3} \oplus \ldots \oplus x_{n}
\end{aligned}
$$

where " $\oplus$ " denotes mod-2, or exclusive-OR (XOR), addition. An advantage of this system is that only mod-2 addition is necessary for the accumulator. The accumulator may be embodied using only XOR gates, which may simplify the design.
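As a minimal sketch of the accumulator, the running XOR can be computed with the Python standard library. The function name `accumulate_mod2` is a hypothetical label chosen for this illustration.

```python
from itertools import accumulate
from operator import xor


def accumulate_mod2(x):
    """Return y with y[i] = x[0] ^ x[1] ^ ... ^ x[i] (mod-2 partial sums)."""
    return list(accumulate(x, xor))


x = [1, 0, 1, 1, 0]
y = accumulate_mod2(x)
print(y)  # [1, 1, 0, 1, 1]
```

Each output also satisfies the recursion $y_{i}=y_{i-1} \oplus x_{i}$, which is the time-domain form of the transfer function $1/(1+D)$.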

The bits output from the outer coder 202 are scrambled before they are input to the inner coder 206. This scrambling may be performed by the interleaver 204, which performs a pseudo-random permutation of an input block $v$, yielding an output block $w$ having the same length as $v$.
The serial concatenation of the interleaved irregular repeat code and the accumulate code produces an irregular repeat and accumulate (IRA) code. An IRA code is a linear code, and as such, may be represented as a set of parity checks. The set of parity checks may be represented in a bipartite graph, called the Tanner graph, of the code. FIG. 3 shows a Tanner graph 300 of an IRA code with parameters $(f_{1}, \ldots, f_{j}; a)$, where $f_{i} \geq 0$, $\sum_{i} f_{i}=1$ and "a" is a positive integer. The Tanner graph includes two kinds of nodes: variable nodes (open circles) and check nodes (filled circles). There are $k$ variable nodes 302 on the left, called information nodes. There are $r$ variable nodes 306 on the right, called parity nodes. There are $r=\left(k \sum_{i} i f_{i}\right) / a$ check nodes 304 connected between the information nodes and the parity nodes. Each information node 302 is connected to a number of check nodes 304. The fraction of information nodes connected to exactly $i$ check nodes is $f_{i}$. For example, in the Tanner graph 300, each of the $f_{2}$ information nodes is connected to two check nodes, corresponding to a repeat of $q=2$, and each of the $f_{3}$ information nodes is connected to three check nodes, corresponding to $q=3$.
Each check node 304 is connected to exactly "a" information nodes 302. In FIG. 3, $a=3$. These connections can be made in many ways, as indicated by the arbitrary permutation of the ra edges joining information nodes 302 and check nodes 304 in permutation block 310. These connections correspond to the scrambling performed by the interleaver 204.

In an alternate embodiment, the outer coder 202 may be a low-density generator matrix (LDGM) coder that performs an irregular repeat of the k bits in the block, as shown in FIG. 4. As the name implies, an LDGM code has a sparse (low-density) generator matrix. The IRA code produced by the coder 400 is a serial concatenation of the LDGM code and the accumulator code. The interleaver 204 in FIG. 2 may be excluded due to the randomness already present in the structure of the LDGM code.
If the permutation performed in permutation block 310 is fixed, the Tanner graph represents a binary linear block code with $k$ information bits $(u_{1}, \ldots, u_{k})$ and $r$ parity bits $(x_{1}, \ldots, x_{r})$, as follows. Each of the information bits is associated with one of the information nodes 302, and each of the parity bits is associated with one of the parity nodes 306. The value of a parity bit is determined uniquely by the condition that the mod-2 sum of the values of the variable nodes connected to each of the check nodes 304 is zero. To see this, set $x_{0}=0$. Then if the values of the bits on the $ra$ edges coming out of the permutation box are $(v_{1}, \ldots, v_{ra})$, then we have the recursive formula

$$
x_{j}=x_{j-1}+\sum_{i=1}^{a} v_{(j-1) a+i}
$$

for $j=1,2, \ldots, r$. This is in effect the encoding algorithm. Two types of IRA codes are represented in FIG. 3, a nonsystematic version and a systematic version. The nonsystematic version is an $(r, k)$ code, in which the codeword corresponding to the information bits $(u_{1}, \ldots, u_{k})$ is $(x_{1}, \ldots, x_{r})$. The systematic version is a $(k+r, k)$ code, in which the codeword is $(u_{1}, \ldots, u_{k}; x_{1}, \ldots, x_{r})$.
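The recursion above can be turned into a short encoder sketch. The function name `ira_encode`, its argument layout, and the particular permutation are assumptions made for illustration; the patent specifies only that the permutation is fixed, with each check node summing "a" permuted bits and the previous parity bit.

```python
def ira_encode(u, degrees, a, perm, systematic=True):
    """Encode information bits u with the parity recursion x_j = x_{j-1} + sum.

    degrees[i] is the repeat count of u[i]; perm is a permutation of
    range(sum(degrees)); a is the number of permuted bits per check node.
    All names here are hypothetical.
    """
    repeated = [bit for bit, q in zip(u, degrees) for _ in range(q)]
    if len(repeated) % a != 0:
        raise ValueError("number of edges must be a multiple of a")
    v = [repeated[p] for p in perm]        # bits leaving the permutation block
    r = len(v) // a                        # number of parity bits
    x = []
    prev = 0                               # x_0 = 0
    for j in range(r):
        s = 0
        for bit in v[j * a:(j + 1) * a]:   # the a bits feeding check node j
            s ^= bit
        prev ^= s                          # x_j = x_{j-1} + sum (mod 2)
        x.append(prev)
    return list(u) + x if systematic else x


u = [1, 0, 1, 1]
degrees = [2, 2, 3, 3]                     # 10 edges in total
perm = [3, 0, 7, 9, 1, 5, 2, 8, 4, 6]      # an arbitrary fixed permutation
codeword = ira_encode(u, degrees, a=2, perm=perm)
print(codeword)  # [1, 0, 1, 1, 1, 1, 1, 0, 0]
```

Here $k=4$ and $r=10/2=5$, so the systematic codeword has $k+r=9$ bits, and passing `systematic=False` returns only the 5 parity bits.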
The rate of the nonsystematic code is

$$
R_{nsys}=\frac{a}{\sum_{i} i f_{i}}
$$

The rate of the systematic code is

$$
R_{sys}=\frac{a}{a+\sum_{i} i f_{i}}
$$

For example, regular repeat and accumulate (RA) codes can be considered nonsystematic IRA codes with $\mathrm{a}=1$ and exactly one $f_{i}$ equal to 1 , say $f_{q}=1$, and the rest zero, in which case $\mathrm{R}_{n s y s}$ simplifies to $\mathrm{R}=1 / \mathrm{q}$.
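The two rate expressions follow from $r = (k \sum_i i f_i)/a$, since the nonsystematic rate is $k/r$ and the systematic rate is $k/(k+r)$. A small sketch with exact fractions, assuming a hypothetical helper `ira_rates` and an illustrative profile, shows the regular RA case reducing to $1/q$:

```python
from fractions import Fraction


def ira_rates(f, a):
    """Return (R_nsys, R_sys) for node fractions f = {degree i: f_i}."""
    avg_degree = sum(Fraction(i) * Fraction(fi) for i, fi in f.items())
    return Fraction(a) / avg_degree, Fraction(a) / (a + avg_degree)


# Regular RA code: a = 1 and every information bit repeated q = 3 times.
ra_nsys, ra_sys = ira_rates({3: 1}, a=1)
print(ra_nsys)  # 1/3

# Illustrative irregular profile (not from the patent) with a = 2.
profile = {2: Fraction(1, 2), 3: Fraction(1, 4), 4: Fraction(1, 4)}
ir_nsys, ir_sys = ira_rates(profile, a=2)
print(ir_nsys, ir_sys)  # 8/11 8/19
```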

The IRA code may be represented using an alternate notation. Let $\lambda_{i}$ be the fraction of edges between the information nodes 302 and the check nodes 304 that are adjacent to an information node of degree $i$, and let $\rho_{i}$ be the fraction of such edges that are adjacent to a check node of degree $i+2$ (i.e., one that is adjacent to $i$ information nodes). These edge fractions may be used to represent the IRA code rather than the corresponding node fractions. Define $\lambda(x)=\sum_{i} \lambda_{i} x^{i-1}$ and $\rho(x)=\sum_{i} \rho_{i} x^{i-1}$ to be the generating functions of these sequences. The pair $(\lambda, \rho)$ is called a degree distribution. For $L(x)=\sum_{i} f_{i} x^{i}$,

$$
f_{i}=\frac{\lambda_{i} / i}{\sum_{j} \lambda_{j} / j}
$$

$$
L(x)=\int_{0}^{x} \lambda(t)\, d t \Big/ \int_{0}^{1} \lambda(t)\, d t
$$

The rate of the systematic IRA code given by the degree distribution is given by

$$
\text { Rate }=\left(1+\frac{\sum_{j} \rho_{j} / j}{\sum_{j} \lambda_{j} / j}\right)^{-1}
$$
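The conversion from edge fractions to node fractions, and the rate formula above, can be checked numerically. The sketch below uses the $a=2$ column of Table 1 for $\lambda$ and assumes that every check node is adjacent to exactly $a=2$ information nodes, so that $\rho_{2}=1$. The function names are hypothetical.

```python
def node_fractions(lam):
    """Convert edge fractions {i: lambda_i} into node fractions {i: f_i}."""
    norm = sum(l / i for i, l in lam.items())
    return {i: (l / i) / norm for i, l in lam.items()}


def systematic_rate(lam, rho):
    """Rate = (1 + (sum_j rho_j / j) / (sum_j lambda_j / j)) ** -1."""
    s_lam = sum(l / j for j, l in lam.items())
    s_rho = sum(r / j for j, r in rho.items())
    return 1.0 / (1.0 + s_rho / s_lam)


lam = {2: 0.139025, 3: 0.2221555, 6: 0.638820}  # a = 2 column of Table 1
rho = {2: 1.0}                                  # assumption: all checks see a = 2
f = node_fractions(lam)
rate = systematic_rate(lam, rho)
print(round(rate, 6))
```

The computed rate agrees with the value listed for $a=2$ in Table 1 to six decimal places.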

"Belief propagation" on the Tanner graph realization may be used to decode IRA codes. Roughly speaking, the belief propagation decoding technique allows the messages passed on an edge to represent posterior densities on the bit associated with the variable node. A probability density on a bit is a pair of non-negative real numbers $p(0), p(1)$ satisfying $p(0)+p(1)=1$, where $p(0)$ denotes the probability of the bit being 0 and $p(1)$ the probability of it being 1. Such a pair can be represented by its log-likelihood ratio, $m=\log (p(0) / p(1))$. The outgoing message from a variable node $u$ to a check node $v$ represents information about $u$, and a message from a check node $u$ to a variable node $v$ represents information about $v$, as shown in FIGS. 5A and 5B, respectively.

The outgoing message from a node $u$ to a node $v$ depends on the incoming messages from all neighbors $w$ of $u$ except $v$. If $u$ is a variable message node, this outgoing message is

$$
m(u \rightarrow v)=\sum_{w \neq v} m(w \rightarrow u)+m_{0}(u)
$$

where $m_{0}(u)$ is the log-likelihood message associated with $u$. If $u$ is a check node, the corresponding formula is

$$
\tanh \frac{m(u \rightarrow v)}{2}=\prod_{w \neq v} \tanh \frac{m(w \rightarrow u)}{2}
$$

Before decoding, the messages $m(w \rightarrow u)$ and $m(u \rightarrow v)$ are initialized to be zero, and $m_{0}(u)$ is initialized to be the log-likelihood ratio based on the channel received information. If the channel is memoryless, i.e., each channel output only relies on its input, and $y$ is the output of the channel code bit $u$, then $m_{0}(u)=\log (p(u=0 \mid y) / p(u=1 \mid y))$. After this initialization, the decoding process may run in a fully parallel and local manner. In each iteration, every variable/check node receives messages from its neighbors, and sends back updated messages. Decoding is terminated after a fixed number of iterations or upon detecting that all the constraints are satisfied. Upon termination, the decoder outputs a decoded sequence based on the messages $m(u)=\sum_{w} m(w \rightarrow u)$.
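The two update rules can be written as small functions operating on log-likelihood ratios. This is a sketch of the message computations only, not a full decoder; the function names are hypothetical, and the clamping constant is an implementation assumption added so that infinite messages (as on an erasure channel) do not cause a domain error.

```python
import math


def variable_to_check(incoming, m0):
    """Variable-node rule: sum of incoming messages (excluding the target) plus m0."""
    return m0 + sum(incoming)


def check_to_variable(incoming):
    """Check-node rule: tanh(m/2) equals the product of tanh(m_w/2) over incoming messages."""
    prod = 1.0
    for m in incoming:
        prod *= math.tanh(m / 2.0)
    # Clamp so that atanh stays finite when a message is +/- infinity.
    limit = 1.0 - 1e-12
    prod = max(-limit, min(limit, prod))
    return 2.0 * math.atanh(prod)


print(variable_to_check([1.0, -0.5], m0=2.0))  # 2.5
print(check_to_variable([0.0, 3.0]))           # 0.0
```

A zero incoming message at a check node forces the outgoing message to zero, which reflects the fact that one completely unknown bit makes the parity constraint uninformative about the others.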
Thus, on various channels, iterative decoding only differs in the initial messages $m_{0}(u)$. For example, consider three memoryless channel models: a binary erasure channel (BEC); a binary symmetric channel (BSC); and an additive white Gaussian noise (AWGN) channel.
In the BEC, there are two inputs and three outputs. When 0 is transmitted, the receiver can receive either 0 or an erasure $E$. An erasure $E$ output means that the receiver does not know how to demodulate the output. Similarly, when 1 is transmitted, the receiver can receive either 1 or $E$. Thus, for the BEC, $y \in\{0, E, 1\}$, and

$$
m_{0}(u)=\left\{\begin{array}{cc}
+\infty & \text { if } y=0 \\
0 & \text { if } y=E \\
-\infty & \text { if } y=1
\end{array}\right.
$$

In the BSC, there are two possible inputs $(0,1)$ and two possible outputs $(0,1)$. The BSC is characterized by a set of conditional probabilities relating all possible outputs to possible inputs. Thus, for the BSC, $y \in\{0,1\}$,

$$
m_{0}(u)=\left\{\begin{array}{cc}
\log \frac{1-p}{p} & \text { if } y=0 \\
-\log \frac{1-p}{p} & \text { if } y=1
\end{array}\right.
$$

where $p$ is the crossover probability of the channel.

In the AWGN channel, the discrete-time input symbols $X$ take their values in a finite alphabet while channel output symbols $Y$ can take any values along the real line. There is assumed to be no distortion or other effects other than the addition of white Gaussian noise. In an AWGN channel with Binary Phase Shift Keying (BPSK) signaling, which maps 0 to the symbol with amplitude $\sqrt{E_{s}}$ and 1 to the symbol with amplitude $-\sqrt{E_{s}}$, the output $y \in \mathbb{R}$, and

$$
m_{0}(u)=4 y \sqrt{E_{s}} / N_{0}
$$

where $\mathrm{N}_{0} / 2$ is the noise power spectral density.
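The three initial-message formulas can be collected into one sketch. The function names and the use of the string `'E'` for an erasure are assumptions for illustration, and `p` is taken to be the BSC crossover probability.

```python
import math


def llr_bec(y):
    """Initial message on the BEC; y is 0, 1, or 'E' (erasure)."""
    if y == 'E':
        return 0.0
    return math.inf if y == 0 else -math.inf


def llr_bsc(y, p):
    """Initial message on the BSC with crossover probability 0 < p < 1."""
    m = math.log((1.0 - p) / p)
    return m if y == 0 else -m


def llr_awgn(y, es, n0):
    """Initial message on the BPSK/AWGN channel: 4 * y * sqrt(Es) / N0."""
    return 4.0 * y * math.sqrt(es) / n0


print(llr_bec('E'))             # 0.0
print(llr_awgn(0.5, 1.0, 2.0))  # 1.0
```

Only these initial values change from channel to channel; the message-passing rules themselves stay the same.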
The selection of a degree profile for use in a particular transmission channel is a design parameter, which may be affected by various attributes of the channel. The criteria for selecting a particular degree profile may include, for example, the type of channel and the data rate on the channel. For example, Table 1 shows degree profiles that have been found to produce good results for an AWGN channel model.

TABLE 1

| a | 2 | 3 | 4 |
| :---: | :---: | :---: | :---: |
| $\lambda_{2}$ | 0.139025 | 0.078194 | 0.054485 |
| $\lambda_{3}$ | 0.2221555 | 0.128085 | 0.104315 |
| $\lambda_{5}$ |  | 0.160813 |  |
| $\lambda_{6}$ | 0.638820 | 0.036178 | 0.126755 |
| $\lambda_{10}$ |  |  | 0.229816 |
| $\lambda_{11}$ |  |  | 0.016484 |
| $\lambda_{12}$ |  | 0.108828 |  |
| $\lambda_{13}$ |  | 0.487902 |  |
| $\lambda_{14}$ |  |  |  |
| $\lambda_{16}$ |  |  |  |
| $\lambda_{27}$ |  |  | 0.450302 |
| $\lambda_{28}$ |  |  | 0.017842 |
| Rate | 0.333364 | 0.333223 | 0.333218 |
| $\sigma_{GA}$ | 1.1840 | 1.2415 | 1.2615 |
| $\sigma^{*}$ | 1.1981 | 1.2607 | 1.2780 |
| $(E_{b}/N_{0})^{*}$ (dB) | 0.190 | -0.250 | -0.371 |
| S.L. (dB) | -0.4953 | -0.4958 | -0.4958 |

Table 1 shows degree profiles yielding codes of rate approximately $1 / 3$ for the AWGN channel and with $\mathrm{a}=2,3,4$. For each sequence, the Gaussian approximation noise threshold, the actual sum-product decoding threshold and the corresponding energy per bit $\left(\mathrm{E}_{b}\right)$-noise power $\left(\mathrm{N}_{0}\right)$ ratio in dB are given. Also listed is the Shannon limit (S.L.).

As the parameter "a" is increased, the performance improves. For example, for $a=4$, the best code found has an iterative decoding threshold of $\mathrm{E}_{b} / \mathrm{N}_{0}=-0.371 \mathrm{~dB}$, which is only 0.12 dB above the Shannon limit.
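The relationship between the threshold $\sigma^{*}$ and the $E_{b}/N_{0}$ row of Table 1 can be sketched as follows. The conversion assumes unit symbol energy ($E_{s}=1$), noise variance $\sigma^{2}=N_{0}/2$, and $E_{b}=E_{s}/R$; this normalization is an assumption, not something stated in the text, and the function name is hypothetical. Under it, the computed values match the tabulated ones to within about 0.01 dB.

```python
import math


def ebn0_db(sigma, rate):
    """Eb/N0 in dB for noise standard deviation sigma and code rate `rate`.

    Assumes Es = 1 and sigma^2 = N0 / 2, so Eb/N0 = 1 / (2 * rate * sigma^2).
    """
    return 10.0 * math.log10(1.0 / (2.0 * rate * sigma ** 2))


# (sigma*, rate, tabulated Eb/N0 in dB) for a = 2, 3, 4 from Table 1.
rows = [(1.1981, 0.333364, 0.190),
        (1.2607, 0.333223, -0.250),
        (1.2780, 0.333218, -0.371)]
for sigma, rate, listed in rows:
    print(f"computed {ebn0_db(sigma, rate):+.3f} dB, listed {listed:+.3f} dB")

# Gap to the Shannon limit for a = 4, using the tabulated values directly.
gap = -0.371 - (-0.4958)
print(f"gap = {gap:.2f} dB")  # gap = 0.12 dB
```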

The accumulator component of the coder may be replaced by a "double accumulator" 600 as shown in FIG. 6. The double accumulator can be viewed as a truncated rate 1 convolutional coder with transfer function $1 /\left(1+D+D^{2}\right)$.
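Read as a recursion, the transfer function $1/(1+D+D^{2})$ gives $y_{n}=x_{n} \oplus y_{n-1} \oplus y_{n-2}$. The sketch below implements that recursion with a zero initial state; the function name and the zero-state assumption are illustrative, and the code does not attempt to reproduce the circuit of FIG. 6.

```python
def double_accumulate(x):
    """Apply y[n] = x[n] ^ y[n-1] ^ y[n-2], starting from an all-zero state."""
    y = []
    prev1 = prev2 = 0          # y[n-1] and y[n-2]
    for bit in x:
        out = bit ^ prev1 ^ prev2
        y.append(out)
        prev1, prev2 = out, prev1
    return y


impulse = [1, 0, 0, 0, 0, 0]
response = double_accumulate(impulse)
print(response)  # [1, 1, 0, 1, 1, 0]
```

The impulse response repeats with period 3, as expected for a recursive coder whose feedback polynomial is $1+D+D^{2}$.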
Alternatively, a pair of accumulators may be added, as shown in FIG. 7. There are three component codes: the "outer" code 700, the "middle" code 702, and the "inner" code 704. The outer code is an irregular repetition code, and the middle and inner codes are both accumulators.
IRA codes may be implemented in a variety of channels, including memoryless channels, such as the BEC, BSC, and AWGN, as well as channels having non-binary input, nonsymmetric and fading channels, and/or channels with memory.
A number of embodiments have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of the following claims.

## The invention claimed is:

1. A method of encoding a signal, comprising:
obtaining a block of data in the signal to be encoded;
partitioning said data block into a plurality of sub-blocks, each sub-block including a plurality of data elements;
first encoding the data block to form a first encoded data block, said first encoding including repeating the data elements in different sub-blocks a different number of times;
interleaving the repeated data elements in the first encoded data block; and
second encoding said first encoded data block using an encoder that has a rate close to one.
2. The method of claim 1, wherein said second encoding is via a rate 1 linear transformation.
3. The method of claim 1, wherein said first encoding is carried out by a first coder with a variable rate less than one, and said second encoding is carried out by a second coder with a rate substantially close to one.
4. The method of claim 3, wherein the second coder comprises an accumulator.
5. The method of claim 4, wherein the data elements comprise bits.
6. The method of claim 5, wherein the first coder comprises a repeater operable to repeat different sub-blocks a different number of times in response to a selected degree profile.
7. The method of claim 4, wherein the first coder comprises a low-density generator matrix coder and the second coder comprises an accumulator.
8. The method of claim 1, wherein the second encoding uses a transfer function of $1 /(1+D)$.
9. The method of claim 1, wherein the second encoding uses a transfer function of $1 /\left(1+D+D^{2}\right)$.
10. The method of claim 1, wherein said second encoding utilizes two accumulators.
11. A method of encoding a signal, comprising:
receiving a block of data in the signal to be encoded, the data block including a plurality of bits;
first encoding the data block such that each bit in the data block is repeated and two or more of said plurality of bits are repeated a different number of times in order to form a first encoded data block; and
second encoding the first encoded data block in such a way that bits in the first encoded data block are accumulated.
12. The method of claim 11, wherein the said second encoding is via a rate 1 linear transformation.
13. The method of claim 11, wherein the first encoding is via a low-density generator matrix transformation.
14. The method of claim 11, wherein the signal to be encoded comprises a plurality of data blocks of fixed size.

UNITED STATES PATENT AND TRADEMARK OFFICE CERTIFICATE OF CORRECTION

PATENT NO. : 7,116,710 B1<br>APPLICATION NO. : 09/861102<br>DATED : October 3, 2006<br>INVENTOR(S) : Hui Jin, Aamod Khandekar and Robert J. McEliece

It is certified that error appears in the above-identified patent and that said Letters Patent is hereby corrected as shown below:

At column 1, line 8, please amend the paragraph as follows:
This application claims the priority [[to]] of U.S. Provisional
Application Ser. No. 60/205,095, filed on May 18, 2000, and [[to]]
is a continuation-in-part of U.S. application Ser. No. 09/922,852, filed on Aug.
18, 2000 and entitled Interleaved Serial Concatenation Forming Turbo-Like
Codes.

## Signed and Sealed this

Twenty-second Day of July, 2008


JON W. DUDAS
Director of the United States Patent and Trademark Office

## (12) United States Patent

Jin et al.
(10) Patent No.: US 7,421,032 B2
(45) Date of Patent: Sep. 2, 2008
(54) SERIAL CONCATENATION OF INTERLEAVED CONVOLUTIONAL CODES FORMING TURBO-LIKE CODES
(75) Inventors: Hui Jin, Glen Gardner, NJ (US); Aamod Khandekar, Pasadena, CA (US); Robert J. McEliece, Pasadena, CA (US)
(73) Assignee: California Institute of Technology, Pasadena, CA (US)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.
(21) Appl. No.: 11/542,950
(22) Filed: Oct. 3, 2006
(65) Prior Publication Data
US 2007/0025450 A1 Feb. 1, 2007
Related U.S. Application Data
(63) Continuation of application No. 09/861,102, filed on May 18, 2001, now Pat. No. 7,116,710, and a continuation-in-part of application No. 09/922,852, filed on Aug. 18, 2000, now Pat. No. 7,089,477.
(60) Provisional application No. 60/205,095, filed on May 18, 2000.
(51) Int. Cl.: H04L 5/12
(52) U.S. Cl.: 375/262; 375/265; 375/348; 714/755; 714/786; 714/792; 341/52; 341/102
(58) Field of Classification Search: 375/259, 375/262, 265, 285, 296, 341, 346, 348; 714/746, 714/752, 755, 756, 786, 792, 794-796; 341/51, 341/52, 56, 102, 103
See application file for complete search history.
(56)

## References Cited

U.S. PATENT DOCUMENTS

| Patent No. | Date | Patentee | U.S. Class |
| :---: | :---: | :---: | :---: |
| 5,392,299 A | 2/1995 | Rhines et al. |  |
| 5,530,707 A * | 6/1996 | Lin | 714/792 |
| 5,751,739 A | 5/1998 | Seshadri et al. |  |
| 5,802,115 A * | 9/1998 | Meyer | 375/341 |
| 5,881,093 A | 3/1999 | Wang et al. |  |
| 6,014,411 A | 1/2000 | Wang |  |
| 6,023,783 A | 2/2000 | Divsalar et al. |  |
| 6,031,874 A | 2/2000 | Chennakeshu et al. |  |
| 6,032,284 A | 2/2000 | Bliss |  |
| 6,044,116 A | 3/2000 | Wang |  |
| 6,094,739 A * | 7/2000 | Miller et al. | 714/792 |

(Continued)

## OTHER PUBLICATIONS

Appendix A.1 "Structure of Parity Check Matrices of Standardized LDPC Codes," Digital Video Broadcasting (DVB) User guidelines for the second generation system for Broadcasting, Interactive Services, News Gathering and other broadband satellite applications (DVB-S2) ETSI TR 102376 V1.1.1. (Feb. 2005) Technical Report, p. 64.
(Continued)
Primary Examiner-Dac V. Ha
(74) Attorney, Agent, or Firm-Fish \& Richardson P.C.

ABSTRACT
A serial concatenated coder includes an outer coder and an inner coder. The outer coder irregularly repeats bits in a data block according to a degree profile and scrambles the repeated bits. The scrambled and repeated bits are input to an inner coder, which has a rate substantially close to one.

23 Claims, 5 Drawing Sheets


## Exhibit B

Page 42

## U.S. PATENT DOCUMENTS



Benedetto et al., "A Soft-Input Soft-Output Maximum A Posteriori (MAP) Module to Decode Parallel and Serial Concatenated Codes," The Telecommunications and Data Acquisition (TDA) Progress Report 42-127 for NASA and California Institute of Technology Jet Propulsion Laboratory, Joseph H. Yuen, Ed., pp. 1-20 (Nov. 15, 1996).
Benedetto et al., "Bandwidth efficient parallel concatenated coding schemes," Electronics Letters 31(24): 2067-2069 (Nov. 23, 1995).
Benedetto et al., "Design of Serially Concatenated Interleaved Codes," ICC 97, Montreal, Canada, pp. 710-714 (Jun. 1997).
Benedetto et al., "Parallel Concatenated Trellis Coded Modulation," ICC '96, IEEE, pp. 974-978 (Jun. 1996).
Benedetto et al., "Serial Concatenated Trellis Coded Modulation with Iterative Decoding," Proceedings from the IEEE 1997 International Symposium on Information Theory (ISIT), Ulm, Germany, p. 8 (Jun. 29-Jul. 4, 1997).
Benedetto et al., "Serial Concatenation of Interleaved Codes: Performance Analysis, Design, and Iterative Decoding," The Telecommunications and Data Acquisition (TDA) Progress Report 42-126 for NASA and California Institute of Technology Jet Propulsion Laboratory, Joseph H. Yuen, Ed., pp. 1-26 (Aug. 15, 1996).
Benedetto et al., "Serial Concatenation of interleaved codes: performance analysis, design, and iterative decoding," Proceedings from the IEEE 1997 International Symposium on Information Theory (ISIT), Ulm, Germany, p. 106 (Jun. 29-Jul. 4, 1997).
Benedetto et al., "Soft-output decoding algorithms in iterative decoding of turbo codes," The Telecommunications and Data Acquisition (TDA) Progress Report 42-124 for NASA and California Institute of Technology Jet Propulsion Laboratory, Joseph H. Yuen, Ed., pp. 63-87 (Feb. 15, 1996).
Benedetto, S. et al., "A Soft-Input Soft-Output APP Module for Iterative Decoding of Concatenated Codes," IEEE Communications Letters 1(1): 22-24 (Jan. 1997).
Berrou et al., "Near Shannon Limit Error-Correcting Coding and Decoding: Turbo Codes," ICC pp. 1064-1070 (1993).
Digital Video Broadcasting (DVB) User guidelines for the second generation system for Broadcasting, Interactive Services, News Gathering and other broadband satellite applications (DVB-S2) ETSI TR 102376 V1.1.1. (Feb. 2005) Technical Report, pp. 1-104 (Feb. 15, 2005).
Divsalar et al., "Coding Theorems for 'Turbo-Like' Codes," Proceedings of the 36th Annual Allerton Conference on Communication, Control, and Computing, Sep. 23-25, 1998, Allerton House, Monticello, Illinois, pp. 201-210 (1998).
Divsalar et al., "Effective free distance of turbo codes," Electronics Letters 32(5): 445-446 (Feb. 29, 1996).
Divsalar, D. et al., "Hybrid Concatenated Codes and Iterative Decoding," Proceedings from the IEEE 1997 International Symposium on Information Theory (ISIT), Ulm, Germany, p. 10 (Jun. 29-Jul. 4, 1997).
Divsalar, D. et al., "Low-rate turbo codes for Deep Space Communications," Proceedings from the 1995 IEEE International Symposium on Information Theory, Sep. 17-22, 1995, Whistler, British Columbia, Canada, p. 35.
Divsalar, D. et al., "Multiple Turbo Codes for Deep-Space Communications," The Telecommunications and Data Acquisition (TDA) Progress Report 42-121 for NASA and California Institute of Technology Jet Propulsion Laboratory, Joseph H. Yuen, Ed., pp. 60-77 (May 15, 1995).
Divsalar, D. et al., "Multiple Turbo Codes," MILCOM 95, San Diego, CA, pp. 279-285 (Nov. 5-6, 1995).
Divsalar, D. et al., "On the Design of Turbo Codes," The Telecommunications and Data Acquisition (TDA) Progress Report 42-123 for NASA and California Institute of Technology Jet Propulsion Laboratory, Joseph H. Yuen, Ed., pp. 99-131 (Nov. 15, 1995).
Divsalar, D. et al., "Serial Turbo Trellis Coded Modulation with Rate-1 Inner Code," Proceedings from the IEEE 2000 International Symposium on Information Theory (ISIT), Italy, pp. 1-14 (Jun. 2000).
Divsalar, D. et al., "Turbo Codes for PCS Applications," ICC 95, IEEE, Seattle, WA, pp. 54-59 (Jun. 1995).
Jin et al., "Irregular Repeat-Accumulate Codes," 2nd International Symposium on Turbo Codes \& Related Topics, Sep. 4-7, 2000, Brest, France, 25 slides (presented on Sep. 4, 2000).
Jin et al., "Irregular Repeat-Accumulate Codes," 2nd International Symposium on Turbo Codes \& Related Topics, Sep. 4-7, 2000, Brest, France, pp. 1-8 (2000).
Richardson et al., "Design of capacity approaching irregular low density parity check codes," IEEE Trans. Inform. Theory 47: 619-637 (Feb. 2001).
Richardson, T. and R. Urbanke, "Efficient encoding of low-density parity check codes," IEEE Trans. Inform. Theory 47: 638-656 (Feb. 2001).
Wiberg et al., "Codes and Iterative Decoding on General Graphs," 1995 Intl. Symposium on Information Theory, Sep. 1995, p. 468.

* cited by examiner
FIG. 1 (Prior Art)

FIG. 4

FIG. 3 (legend: variable node; fraction of nodes of degree i)

FIG. 5A

FIG. 5B

FIG. 6

## SERIAL CONCATENATION OF INTERLEAVED CONVOLUTIONAL CODES FORMING TURBO-LIKE CODES

## CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 09/861,102, filed May 18, 2001, now U.S. Pat. No. 7,116,710, which claims the priority of U.S. provisional application Ser. No. 60/205,095, filed May 18, 2000, and is a continuation-in-part of U.S. application Ser. No. 09/922,852, filed Aug. 18, 2000, now U.S. Pat. No. 7,089,477.

## GOVERNMENT LICENSE RIGHTS

The U.S. Government has a paid-up license in this invention and the right in limited circumstances to require the patent owner to license others on reasonable terms as provided for by the terms of Grant No. CCR-9804793 awarded by the National Science Foundation.

## BACKGROUND

Properties of a channel affect the amount of data that can be handled by the channel. The so-called "Shannon limit" defines the theoretical limit of the amount of data that a channel can carry.

Different techniques have been used to increase the data rate that can be handled by a channel. "Near Shannon Limit Error-Correcting Coding and Decoding: Turbo Codes," by Berrou et al. ICC, pp 1064-1070, (1993), described a new "turbo code" technique that has revolutionized the field of error correcting codes. Turbo codes have sufficient randomness to allow reliable communication over the channel at a high data rate near capacity. However, they still retain sufficient structure to allow practical encoding and decoding algorithms. Still, the technique for encoding and decoding turbo codes can be relatively complex.
A standard turbo coder 100 is shown in FIG. 1. A block of $k$ information bits is input directly to a first coder 102. A $k$-bit interleaver 106 also receives the $k$ bits and interleaves them prior to applying them to a second coder 104. The second coder produces an output that has more bits than its input, that is, it is a coder with a rate that is less than 1. The coders 102, 104 are typically recursive convolutional coders.
Three different items are sent over the channel 150: the original $k$ bits, first encoded bits 110, and second encoded bits 112. At the decoding end, two decoders are used: a first constituent decoder 160 and a second constituent decoder 162. Each receives both the original $k$ bits, and one of the encoded portions 110, 112. Each decoder sends likelihood estimates of the decoded bits to the other decoders. The estimates are used to decode the uncoded information bits as corrupted by the noisy channel.

## SUMMARY

A coding system according to an embodiment is configured to receive a portion of a signal to be encoded, for example, a data block including a fixed number of bits. The coding system includes an outer coder, which repeats and scrambles bits in the data block. The data block is apportioned into two or more sub-blocks, and bits in different sub-blocks are repeated a different number of times according to a selected degree profile. The outer coder may include a
partial sums of its inputs. The accumulator may be a truncated rate-1 recursive convolutional coder with the transfer function $1 /(1+D)$. Such an accumulator may be considered a block coder whose input block $\left[\mathrm{x}_{1}, \ldots, \mathrm{x}_{n}\right]$ and output block $\left[y_{1}, \ldots, y_{n}\right]$ are related by the formula

$$
\begin{aligned}
& y_{1}=x_{1} \\
& y_{2}=x_{1} \oplus x_{2} \\
& y_{3}=x_{1} \oplus x_{2} \oplus x_{3} \\
& \;\vdots \\
& y_{n}=x_{1} \oplus x_{2} \oplus x_{3} \oplus \ldots \oplus x_{n}
\end{aligned}
$$

where " $\oplus$ " denotes mod-2, or exclusive-OR (XOR), addition. An advantage of this system is that only mod-2 addition is necessary for the accumulator. The accumulator may be embodied using only XOR gates, which may simplify the design.
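
The partial-sum relation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `accumulate` is hypothetical, and bits are represented as the integers 0 and 1.

```python
from typing import List


def accumulate(x: List[int]) -> List[int]:
    """Return the mod-2 partial sums y_1..y_n of the input bits x_1..x_n."""
    y = []
    running = 0  # accumulator state, initially zero
    for bit in x:
        running ^= bit  # XOR is addition mod 2
        y.append(running)
    return y


print(accumulate([1, 0, 1, 1]))  # [1, 1, 0, 1]
```

Each output bit depends only on the previous output and the current input, which is why a single XOR gate with one bit of state suffices.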

The bits output from the outer coder 202 are scrambled before they are input to the inner coder 206. This scrambling may be performed by the interleaver 204, which performs a pseudo-random permutation of an input block $v$, yielding an output block w having the same length as $v$.

The serial concatenation of the interleaved irregular repeat code and the accumulate code produces an irregular repeat and accumulate (IRA) code. An IRA code is a linear code, and as such, may be represented as a set of parity checks. The set of parity checks may be represented in a bipartite graph, called the Tanner graph, of the code. FIG. 3 shows a Tanner graph 300 of an IRA code with parameters $\left(f_{1}, \ldots, f_{j} ; a\right)$, where $f_{i} \geqq 0$, $\Sigma_{i} f_{i}=1$ and "a" is a positive integer. The Tanner graph includes two kinds of nodes: variable nodes (open circles) and check nodes (filled circles). There are $k$ variable nodes 302 on the left, called information nodes. There are $r$ variable nodes 306 on the right, called parity nodes. There are $r=\left(k \sum_{i} i f_{i}\right) / a$ check nodes 304 connected between the information nodes and the parity nodes. Each information node 302 is connected to a number of check nodes 304. The fraction of information nodes connected to exactly $i$ check nodes is $f_{i}$. For example, in the Tanner graph 300, each of the $f_{2}$ information nodes is connected to two check nodes, corresponding to a repeat of $q=2$, and each of the $f_{3}$ information nodes is connected to three check nodes, corresponding to $q=3$.

Each check node 304 is connected to exactly "a" information nodes 302. In FIG. 3, $a=3$. These connections can be made in many ways, as indicated by the arbitrary permutation of the $ra$ edges joining information nodes 302 and check nodes 304 in permutation block 310. These connections correspond to the scrambling performed by the interleaver 204.

In an alternate embodiment, the outer coder 202 may be a low-density generator matrix (LDGM) coder that performs an irregular repeat of the $k$ bits in the block, as shown in FIG. 4. As the name implies, an LDGM code has a sparse (low-density) generator matrix. The IRA code produced by the coder 400 is a serial concatenation of the LDGM code and the accumulator code. The interleaver 204 in FIG. 2 may be excluded due to the randomness already present in the structure of the LDGM code.

If the permutation performed in permutation block 310 is fixed, the Tanner graph represents a binary linear block code with $k$ information bits $\left(u_{1}, \ldots, u_{k}\right)$ and $r$ parity bits $\left(x_{1}, \ldots, x_{r}\right)$, as follows. Each of the information bits is associated with one of the information nodes 302, and each of the parity bits is associated with one of the parity nodes 306. The value of a parity bit is determined uniquely by the condition that the mod-2 sum of the values of the variable nodes connected to each of the check nodes 304 is zero. To see this, set $x_{0}=0$. Then if the values of the bits on the $ra$ edges coming out of the permutation box are $\left(v_{1}, \ldots, v_{ra}\right)$, then we have the recursive formula

$$
x_{j}=x_{j-1}+\sum_{i=1}^{a} v_{(j-1) a+i}
$$

for $\mathrm{j}=1,2, \ldots, \mathrm{r}$. This is in effect the encoding algorithm.
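
As a hedged sketch of this recursion, the Python below consumes the permuted bits in groups of `a` (the number of information-node edges per check node) and XORs each group into the previous parity bit, starting from $x_0 = 0$. The function name `ira_parity_bits` and the sample input are assumptions made for illustration.

```python
from typing import List


def ira_parity_bits(v: List[int], a: int) -> List[int]:
    """Compute parity bits x_1..x_r from permuted bits v_1..v_{ra}, with x_0 = 0."""
    if a <= 0 or len(v) % a != 0:
        raise ValueError("len(v) must be a multiple of a positive integer a")
    x = []
    prev = 0  # x_0 = 0
    for j in range(len(v) // a):
        # The a bits on the edges entering check node j+1.
        for bit in v[j * a:(j + 1) * a]:
            prev ^= bit  # mod-2 addition
        x.append(prev)
    return x


v = [1, 0, 0, 1, 1, 1]  # hypothetical interleaver output, r * a = 6
print(ira_parity_bits(v, a=3))  # [1, 0]
```

By construction, every check node sees an even number of ones among its `a` information edges and its two adjacent parity bits, which is the zero mod-2 sum condition stated above.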
Two types of IRA codes are represented in FIG. 3, a nonsystematic version and a systematic version. The nonsystematic version is an (r,k) code, in which the codeword corresponding to the information bits $\left(\mathrm{u}_{1}, \ldots, \mathrm{u}_{k}\right)$ is ( $\mathrm{x}_{1}, \ldots, \mathrm{x}_{r}$ ). The systematic version is a $(\mathrm{k}+\mathrm{r}, \mathrm{k})$ code, in which the codeword is ( $u_{1}, \ldots, u_{k} ; x_{1}, \ldots, x_{r}$ ).

The rate of the nonsystematic code is

$$
R_{nsys}=\frac{a}{\sum_{i} i f_{i}}
$$

The rate of the systematic code is

$$
R_{sys}=\frac{a}{a+\sum_{i} i f_{i}}
$$

For example, regular repeat and accumulate (RA) codes can be considered nonsystematic IRA codes with $a=1$ and exactly one $\mathrm{f}_{i}$ equal to 1 , say $\mathrm{f}_{q}=1$, and the rest zero, in which case $\mathrm{R}_{\text {nsys }}$ simplifies to $\mathrm{R}=1 / \mathrm{q}$.
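
The two rate expressions can be evaluated directly from a node-degree profile. The sketch below is illustrative only: `ira_rates` is a hypothetical helper, and the profile is passed as a dictionary mapping a repeat degree $i$ to its fraction $f_i$.

```python
from typing import Dict, Tuple


def ira_rates(f: Dict[int, float], a: int) -> Tuple[float, float]:
    """Return (nonsystematic rate, systematic rate) for node fractions f_i and parameter a."""
    if abs(sum(f.values()) - 1.0) > 1e-9:
        raise ValueError("node fractions f_i must sum to 1")
    avg_repeat = sum(i * fi for i, fi in f.items())  # sum_i i * f_i
    return a / avg_repeat, a / (a + avg_repeat)


# Regular RA code with q = 3: a = 1 and f_3 = 1, so the nonsystematic rate is 1/3.
r_nsys, r_sys = ira_rates({3: 1.0}, a=1)
print(r_nsys, r_sys)
```

For this regular example the nonsystematic rate is $1/q = 1/3$, and the systematic rate is $1/(1+3) = 1/4$, since the $k$ information bits are sent along with the $3k$ parity bits.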

The IRA code may be represented using an alternate notation. Let $\lambda_{i}$ be the fraction of edges between the information nodes 302 and the check nodes 304 that are adjacent to an information node of degree $i$, and let $\rho_{i}$ be the fraction of such edges that are adjacent to a check node of degree $i+2$ (i.e., one that is adjacent to $i$ information nodes). These edge fractions may be used to represent the IRA code rather than the corresponding node fractions. Define $\lambda(x)=\Sigma_{i} \lambda_{i} x^{i-1}$ and $\rho(x)=\Sigma_{i} \rho_{i} x^{i-1}$ to be the generating functions of these sequences. The pair $(\lambda, \rho)$ is called a degree distribution. For $L(x)=\Sigma_{i} f_{i} x^{i}$,

$$
f_{i}=\frac{\lambda_{i} / i}{\sum_{j} \lambda_{j} / j}
$$

$$
L(x)=\int_{0}^{x} \lambda(t) d t / \int_{0}^{1} \lambda(t) d t
$$
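
The conversion between edge fractions $\lambda_i$ and node fractions $f_i$ can be sketched as follows. The function names and the sample profile are hypothetical, chosen only to show that the two representations carry the same information.

```python
from typing import Dict


def node_fractions(lam: Dict[int, float]) -> Dict[int, float]:
    """Convert edge fractions lambda_i into node fractions f_i."""
    norm = sum(l / i for i, l in lam.items())  # sum_j lambda_j / j
    return {i: (l / i) / norm for i, l in lam.items()}


def edge_fractions(f: Dict[int, float]) -> Dict[int, float]:
    """Inverse conversion: node fractions f_i back to edge fractions lambda_i."""
    norm = sum(i * fi for i, fi in f.items())  # sum_j j * f_j
    return {i: i * fi / norm for i, fi in f.items()}


def L(f: Dict[int, float], x: float) -> float:
    """Evaluate L(x) = sum_i f_i x^i."""
    return sum(fi * x ** i for i, fi in f.items())


lam = {2: 0.5, 4: 0.5}  # hypothetical edge-degree profile
f = node_fractions(lam)
print(f)        # roughly two thirds of the nodes have degree 2, one third degree 4
print(L(f, 1.0))  # L(1) equals 1 up to rounding, since the f_i sum to 1
```

In this example half of the edges attach to degree-2 nodes and half to degree-4 nodes, so there must be twice as many degree-2 nodes as degree-4 nodes, which is what the conversion returns.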

The rate of the systematic IRA code given by the degree distribution is given by

$$
\text { Rate }=\left(1+\frac{\sum_{j} \rho_{j} / j}{\sum_{j} \lambda_{j} / j}\right)^{-1}
$$

If $u$ is a variable node, the outgoing message from $u$ to a neighboring node $v$, summed over the neighbors $w$ of $u$ other than $v$, is

$$
m(u \rightarrow v)=\sum_{w \neq v} m(w \rightarrow u)+m_{0}(u)
$$

where $m_{0}(u)$ is the log-likelihood message associated with $u$. If $u$ is a check node, the corresponding formula is

$$
\tanh \frac{m(u \rightarrow v)}{2}=\prod_{w \neq v} \tanh \frac{m(w \rightarrow u)}{2}
$$

Before decoding, the messages $m(w \rightarrow u)$ and $m(u \rightarrow v)$ are initialized to be zero, and $m_{0}(u)$ is initialized to be the log-likelihood ratio based on the channel received information. If the channel is memoryless, i.e., each channel output only relies on its input, and $y$ is the output of the channel code bit $u$, then $m_{0}(u)=\log (p(u=0 \mid y) / p(u=1 \mid y))$. After this initialization, the decoding process may run in a fully parallel and local manner. In each iteration, every variable/check node receives messages from its neighbors, and sends back updated messages. Decoding is terminated after a fixed number of iterations or detecting that all the constraints are satisfied. Upon termination, the decoder outputs a decoded sequence based on the messages $m(u)=\sum_{w} m(w \rightarrow u)$.

Thus, on various channels, iterative decoding only differs in the initial messages $m_{0}(u)$. For example, consider three memoryless channel models: a binary erasure channel (BEC); a binary symmetric channel (BSC); and an additive white Gaussian noise (AWGN) channel.

In the BEC, there are two inputs and three outputs. When 0 is transmitted, the receiver can receive either 0 or an erasure $E$. An erasure $E$ output means that the receiver does not know how to demodulate the output. Similarly, when 1 is transmitted, the receiver can receive either 1 or $E$. Thus, for the BEC, $y \in\{0, E, 1\}$, and

$$
m_{0}(u)=\left\{\begin{array}{cc}
+\infty & \text { if } y=0 \\
0 & \text { if } y=E \\
-\infty & \text { if } y=1
\end{array}\right.
$$

In the BSC, there are two possible inputs $(0,1)$ and two possible outputs $(0,1)$. The BSC is characterized by a set of conditional probabilities relating all possible outputs to possible inputs. Thus, for the BSC, $y \in\{0,1\}$.

In the AWGN, the discrete-time input symbols $X$ take their values in a finite alphabet while channel output symbols $Y$ can take any values along the real line. There is assumed to be no distortion or other effects other than the addition of white Gaussian noise. For an AWGN with Binary Phase Shift Keying (BPSK) signaling, which maps 0 to the symbol with amplitude $\sqrt{E_{s}}$ and 1 to the symbol with amplitude $-\sqrt{E_{s}}$, the output is $y \in \mathbb{R}$, and

$$
m_{0}(u)=4 y \sqrt{E_{s}} / N_{0}
$$

where $\mathrm{N}_{0} / 2$ is the noise power spectral density.
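
The initial messages and the two update rules can be sketched together in Python. This is a sketch under stated assumptions, not the decoder of any particular product: all function names are hypothetical; the BSC case assumes equiprobable inputs and a crossover probability `p` (a parameter not named in the text above) and simply applies the general log-likelihood definition; and the clamp in `check_update` is a numerical safeguard added here so that infinite BEC messages do not push `atanh` outside its domain.

```python
import math
from typing import Iterable


def llr_bec(y: str) -> float:
    """Initial message for the BEC; y is '0', '1', or 'E' (erasure)."""
    return {"0": math.inf, "E": 0.0, "1": -math.inf}[y]


def llr_bsc(y: int, p: float) -> float:
    """Initial message for a BSC with assumed crossover probability 0 < p < 1."""
    magnitude = math.log((1.0 - p) / p)  # log(p(u=0|y=0) / p(u=1|y=0))
    return magnitude if y == 0 else -magnitude


def llr_awgn(y: float, es: float, n0: float) -> float:
    """Initial message for BPSK over AWGN: 4 * y * sqrt(Es) / N0."""
    return 4.0 * y * math.sqrt(es) / n0


def variable_update(m0: float, incoming: Iterable[float]) -> float:
    """Message from a variable node: channel message plus the other incoming messages."""
    return m0 + sum(incoming)


def check_update(incoming: Iterable[float]) -> float:
    """Message from a check node, using the tanh product rule."""
    prod = 1.0
    for m in incoming:
        prod *= math.tanh(m / 2.0)
    limit = 1.0 - 1e-12  # keep atanh finite (safeguard, not part of the formula)
    prod = max(-limit, min(limit, prod))
    return 2.0 * math.atanh(prod)


print(llr_awgn(0.5, es=1.0, n0=2.0))   # 1.0
print(variable_update(0.5, [1.0, -0.25]))  # 1.25
print(check_update([0.0, 5.0]))        # 0.0, an erased input erases the check output
```

In both update functions, `incoming` is expected to exclude the message from the node being written to, matching the $w \neq v$ condition in the formulas.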

The selection of a degree profile for use in a particular transmission channel is a design parameter, which may be affected by various attributes of the channel. The criteria for selecting a particular degree profile may include, for example, the type of channel and the data rate on the channel. For example, Table 1 shows degree profiles that have been found to produce good results for an AWGN channel model.

TABLE 1

| $a$ | 2 | 3 | 4 |
| :--- | :---: | :---: | :---: |
| $\lambda_{2}$ | 0.139025 | 0.078194 | 0.054485 |
| $\lambda_{3}$ | 0.2221555 | 0.128085 | 0.104315 |
| $\lambda_{5}$ |  |  |  |
| $\lambda_{6}$ | 0.638820 | 0.03613 | 0.126755 |
| $\lambda_{10}$ |  |  | 0.229816 |
| $\lambda_{11}$ |  |  | 0.016484 |
| $\lambda_{12}$ |  | 0.108828 |  |
| $\lambda_{13}$ |  |  |  |
| $\lambda_{14}$ |  |  |  |
| $\lambda_{16}$ |  |  |  |
| $\lambda_{27}$ |  |  | 0.450302 |
| $\lambda_{28}$ |  |  | 0.017842 |
| Rate | 0.333364 | 0.333223 |  |
| $\sigma_{GA}$ | 1.1840 | 1.2415 | 1.26015 |
| $\sigma^{*}$ | 1.1981 |  | 1.2780 |
| $(E_{b}/N_{0})^{*}$ (dB) | 0.190 | -0.250 | -0.371 |
| S.L. (dB) | -0.4953 | -0.4958 | -0.4958 |

Table 1 shows degree profiles yielding codes of rate approximately $1 / 3$ for the AWGN channel and with $a=2,3,4$. For each sequence, the Gaussian approximation noise threshold, the actual sum-product decoding threshold and the corresponding energy per bit $\left(\mathrm{E}_{b}\right)$-noise power $\left(\mathrm{N}_{0}\right)$ ratio in dB are given. Also listed is the Shannon limit (S.L.).
As the parameter "a" is increased, the performance improves. For example, for $\mathrm{a}=4$, the best code found has an iterative decoding threshold of $\mathrm{E}_{b} / \mathrm{N}_{0}=-0.371 \mathrm{~dB}$, which is only 0.12 dB above the Shannon limit.

The accumulator component of the coder may be replaced by a "double accumulator" 600 as shown in FIG. 6. The double accumulator can be viewed as a truncated rate 1 convolutional coder with transfer function $1 /\left(1+D+D^{2}\right)$.
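
Read as a truncated rate-1 recursive convolutional coder, the transfer function $1/(1+D+D^{2})$ corresponds to the recursion $y_{n}=x_{n} \oplus y_{n-1} \oplus y_{n-2}$ with zero initial state. The Python below is a minimal sketch of that recursion only; the name `double_accumulate` is hypothetical, and the sketch does not claim to reproduce the circuit drawn in FIG. 6.

```python
from typing import List


def double_accumulate(x: List[int]) -> List[int]:
    """Apply y_n = x_n XOR y_{n-1} XOR y_{n-2}, starting from an all-zero state."""
    y = []
    y1 = 0  # y_{n-1}
    y2 = 0  # y_{n-2}
    for bit in x:
        out = bit ^ y1 ^ y2
        y.append(out)
        y1, y2 = out, y1  # shift the two-bit state
    return y


# Impulse response: a single 1 followed by zeros.
print(double_accumulate([1, 0, 0, 0, 0, 0]))  # [1, 1, 0, 1, 1, 0]
```

The impulse response repeats with period three, as expected for the polynomial $1+D+D^{2}$ over the binary field.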
Alternatively, a pair of accumulators may be added, as shown in FIG. 7. There are three component codes: the "outer" code 700, the "middle" code 702, and the "inner" code 704. The outer code is an irregular repetition code, and the middle and inner codes are both accumulators.
IRA codes may be implemented in a variety of channels, including memoryless channels, such as the BEC, BSC, and AWGN, as well as channels having non-binary input, nonsymmetric and fading channels, and/or channels with memory.
A number of embodiments have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of the following claims.

## The invention claimed is:

1. A method comprising:
receiving a collection of message bits having a first sequence in a source data stream;
generating a sequence of parity bits, wherein each parity bit "$x$" in the sequence is in accordance with the formula

12. The device of claim 11, wherein the encoder is configured to generate the collection of parity bits as if a number of inputs into nodes $\mathrm{v}_{i}$ was not constant.
13. The device of claim 11, wherein the encoder comprises: a low-density generator matrix (LDGM) coder configured to perform an irregular repeat on message bits having a first sequence in a source data stream to output a random sequence of repeats of the message bits; and
an accumulator configured to XOR sum in linear sequential fashion a predecessor parity bit and "a" bits of the random sequence of repeats of the message bits.
14. The device of claim 12, wherein the accumulator comprises a recursive convolutional coder.
15. The device of claim 14, wherein the recursive convolutional coder comprises a truncated rate-1 recursive convolutional coder.
16. The device of claim 14, wherein the recursive convolutional coder has a transfer function of $1 /(1+D)$.
17. The device of claim 12, further comprising a second accumulator configured to determine a second sequence of parity bits that defines a second condition that constrains the random sequence of repeats of the message bits.

18. A device comprising:
a message passing decoder configured to decode a received data stream that includes a collection of parity bits, the message passing decoder comprising two or more check/variable nodes operating in parallel to receive messages from neighboring check/variable nodes and send updated messages to the neighboring variable/check nodes, wherein the message passing decoder is configured to decode the received data stream that has been encoded in accordance with the following Tanner graph:

[Tanner graph figure not reproduced]

19. The device of claim 18, wherein the message passing decoder is configured to decode the received data stream that includes the message bits.
20. The device of claim 18, wherein the message passing decoder is configured to decode the received data stream as if a number of inputs into nodes $\mathrm{v}_{i}$ was not constant.
21. The device of claim 18, wherein the message passing decoder is configured to decode in linear time at rates that approach a capacity of a channel.
22. The device of claim 18, wherein the message passing decoder comprises a belief propagation decoder.
23. The device of claim 18, wherein the message passing decoder is configured to decode the received data stream without the message bits.

## Exhibit B Page 53

## UNITED STATES PATENT AND TRADEMARK OFFICE CERTIFICATE OF CORRECTION

PATENT NO. : 7,421,032 B2

It is certified that error appears in the above-identified patent and that said Letters Patent is hereby corrected as shown below:

Title Page, item [73] (Assignee), line 1, please delete "California" and insert --California--, therefor.

Claim 11, Column 9, line 28, delete "$\mathrm{V}_{1}$" and insert --$\mathrm{V}_{\mathrm{r}}$--, therefor.
Claim 11, Column 9, line 29, delete "$\mathrm{U}_{1}$" and insert --$\mathrm{U}_{\mathrm{k}}$--, therefor.
Claim 11, Column 9, line 29, delete "$\mathrm{X}_{1}$" and insert --$\mathrm{X}_{\mathrm{r}}$--, therefor.
Claim 18, Column 10, line 35, delete "$\mathrm{V}_{1}$" and insert --$\mathrm{V}_{\mathrm{r}}$--, therefor.
Claim 18, Column 10, line 36, delete "$\mathrm{U}_{1}$" and insert --$\mathrm{U}_{\mathrm{k}}$--, therefor.
Claim 18, Column 10, line 37, delete "$\mathrm{X}_{1}$" and insert --$\mathrm{X}_{\mathrm{r}}$--, therefor.

Signed and Sealed this
Seventeenth Day of February, 2009


JOHN DOLL
Acting Director of the United States Patent and Trademark Office


It is certified that error appears in the above-identified patent and that said Letters Patent is hereby corrected as shown below:

In claim 1, column 8, line 4, please delete

$$
x_{j}=x_{j-1}+\sum_{i=1}^{\lambda} v_{(j-1) \lambda+i}
$$

and insert

$$
x_{j}=x_{j-1}+\sum_{i=1}^{a} v_{(j-1) a+i}
$$

In claim 1, column 8, line 13, please delete

$$
\sum_{i=1}^{a} v_{(j-1) a+1}
$$

and insert

$$
\sum_{i=1}^{a} v_{(j-1) a+i}
$$

Signed and Sealed this
Twenty-seventh Day of July, 2010



## (12) United States Patent <br> Jin et al.

(10) Patent No.: US 7,916,781 B2

(45) Date of Patent:

(54) SERIAL CONCATENATION OF INTERLEAVED CONVOLUTIONAL CODES FORMING TURBO-LIKE CODES

(75) Inventors: Hui Jin, Glen Gardner, NJ (US); Aamod Khandekar, Pasadena, CA (US); Robert J. McEliece, Pasadena, CA (US)

(73) Assignee: California Institute of Technology, Pasadena, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 424 days.

(21) Appl. No.: 12/165,606

(22) Filed: Jun. 30, 2008

## (65) Prior Publication Data

US 2008/0294964 A1, Nov. 27, 2008

## Related U.S. Application Data

(63) Continuation of application No. 11/542,950, filed on Oct. 3, 2006, now Pat. No. 7,421,032, which is a continuation of application No. 09/861,102, filed on May 18, 2001, now Pat. No. 7,116,710, which is a continuation-in-part of application No. 09/922,852, filed on Aug. 18, 2000, now Pat. No. 7,089,477.

(60) Provisional application No. 60/205,095, filed on May 18, 2000.

(51) Int. Cl. H04B 1/66 (2006.01)

U.S. Cl. 375/240; 375/285; 375/296; 714/801; 714/804

(58) Field of Classification Search: 375/240, 375/240.24, 254, 285, 295, 296, 260; 714/755, 714/758, 800, 801, 804, 805. See application file for complete search history.

References Cited
U.S. PATENT DOCUMENTS

| Patent No. | Date | Patentee | Classification |
| :--- | :--- | :--- | :--- |
| 5,181,207 A * | 1/1993 | Chapman | 714/755 |
| 5,392,299 A | 2/1995 | Rhines et al. |  |
| 5,530,707 A | 6/1996 | Lin |  |
| 5,751,739 A | 5/1998 | Seshadri et al. |  |
| 5,802,115 A | 9/1998 | Meyer |  |
| 5,881,093 A | 3/1999 | Wang et al. |  |
| 6,014,411 A | 1/2000 | Wang |  |
| 6,023,783 A | 2/2000 | Divsalar et al. |  |
| 6,031,874 A | 2/2000 | Chennakeshu et al. |  |
| 6,032,284 A | 2/2000 | Bliss |  |
| 6,044,116 A | 3/2000 | Wang |  |
| 6,094,739 A | 7/2000 | Miller et al. |  |
| 6,195,396 B1 * | 2/2001 | Fang et al. | 375/261 |
| 6,396,423 B1 | 5/2002 | Laumen et al. |  |
| 6,437,714 B1 | 8/2002 | Kim et al. |  |
| 6,732,328 B1 * | 5/2004 | McEwen et al. | 714/795 |
| 6,859,906 B2 | 2/2005 | Hammons et al. |  |
| 7,089,477 B1 | 8/2006 | Divsalar et al. |  |

(Continued)

## OTHER PUBLICATIONS

Benedetto, S., et al., "A Soft-Input Soft-Output APP Module for Iterative Decoding of Concatenated Codes," IEEE Communications Letters, 1(1):22-24, Jan. 1997.
(Continued)
Primary Examiner - Dac V Ha
(74) Attorney, Agent, or Firm - Perkins Coie LLP

ABSTRACT
A serial concatenated coder includes an outer coder and an inner coder. The outer coder irregularly repeats bits in a data block according to a degree profile and scrambles the repeated bits. The scrambled and repeated bits are input to an inner coder, which has a rate substantially close to one.

22 Claims, 5 Drawing Sheets


## Exhibit C <br> Page 56

## U.S. PATENT DOCUMENTS

2001/0025358 A1, 9/2001, Eidson et al.

## OTHER PUBLICATIONS

Benedetto, S., et al., "A Soft-Input Soft-Output Maximum A Posteriori (MAP) Module to Decode Parallel and Serial Concatenated Codes," The Telecommunications and Data Acquisition Progress Report (TDA PR 42-127), pp. 1-20, Nov. 1996.
Benedetto, S., et al., "Bandwidth efficient parallel concatenated coding schemes," Electronics Letters, 31(24):2067-2069, Nov. 1995.
Benedetto, S., et al., "Design of Serially Concatenated Interleaved Codes," ICC 97, vol. 2, pp. 710-714, Jun. 1997.
Benedetto, S., et al., "Parallel Concatenated Trellis Coded Modulation," ICC 96, vol. 2, pp. 974-978, Jun. 1996.
Benedetto, S., et al., "Serial Concatenated Trellis Coded Modulation with Iterative Decoding," Proceedings 1997 IEEE International Symposium on Information Theory (ISIT), Ulm, Germany, p. 8, Jun. 29-Jul. 4, 1997.
Benedetto, S., et al., "Serial Concatenation of Interleaved Codes: Performance Analysis, Design, and Iterative Decoding," The Telecommunications and Data Acquisition Progress Report (TDA PR 42-126), pp. 1-26, Aug. 1996.
Benedetto, S., et al., "Serial concatenation of interleaved codes: performance analysis, design, and iterative decoding," Proceedings 1997 IEEE International Symposium on Information Theory (ISIT), Ulm, Germany, p. 106, Jun. 29-Jul. 4, 1997.
Benedetto, S., et al., "Soft-Output Decoding Algorithms in Iterative Decoding of Turbo Codes," The Telecommunications and Data Acquisition Progress Report (TDA PR 42-124), pp. 63-87, Feb. 1996.
Berrou, C., et al., "Near Shannon Limit Error-Correcting Coding and Decoding: Turbo Codes," ICC 93, vol. 2, pp. 1064-1070, May 1993.

Digital Video Broadcasting (DVB)-User guidelines for the second generation system for Broadcasting, Interactive Services, News Gathering and other broadband satellite applications (DVB-S2), ETSI TR 102376 V1.1.1 Technical Report, pp. 1-104 (p. 64), Feb. 2005.

Divsalar, D., et al., "Coding Theorems for 'Turbo-Like' Codes," Proceedings of the 36th Annual Allerton Conference on Communication, Control, and Computing, Monticello, Illinois, pp. 201-210, Sep. 1998.

Divsalar, D., et al., "Effective free distance of turbo codes," Electronics Letters, 32(5):445-446, Feb. 1996.
Divsalar, D., et al., "Hybrid Concatenated Codes and Iterative Decoding," Proceedings 1997 IEEE International Symposium on Information Theory (ISIT), Ulm, Germany, p. 10, Jun. 29-Jul. 4, 1997.
Divsalar, D., et al., "Low-Rate Turbo Codes for Deep-Space Communications," Proceedings 1995 IEEE International Symposium on Information Theory (ISIT), Whistler, BC, Canada, p. 35, Sep. 1995.
Divsalar, D., et al., "Multiple Turbo Codes for Deep-Space Communications," The Telecommunications and Data Acquisition Progress Report (TDA PR 42-121), pp. 66-77, May 1995.
Divsalar, D., et al., "Multiple Turbo Codes," MILCOM '95, vol. 1, pp. 279-285, Nov. 1995.
Divsalar, D., et al., "On the Design of Turbo Codes," The Telecommunications and Data Acquisition Progress Report (TDA PR 42-123), pp. 99-121, Nov. 1995.
Divsalar, D., et al., "Serial Turbo Trellis Coded Modulation with Rate-1 Inner Code," Proceedings 2000 IEEE International Symposium on Information Theory (ISIT), Sorrento, Italy, p. 194, Jun. 2000.

Divsalar, D., et al., "Turbo Codes for PCS Applications," IEEE ICC '95, Seattle, WA, USA, vol. 1, pp. 54-59, Jun. 1995.
Jin, H., et al., "Irregular Repeat-Accumulate Codes," 2nd International Symposium on Turbo Codes, Brest, France, 25 pages, Sep. 2000.

Jin, H., et al., "Irregular Repeat-Accumulate Codes," 2nd International Symposium on Turbo Codes & Related Topics, Brest, France, pp. 1-8, Sep. 2000.
Richardson, T.J., et al., "Design of Capacity-Approaching Irregular Low-Density Parity-Check Codes," IEEE Transactions on Information Theory, 47(2):619-637, Feb. 2001.
Richardson, T.J., et al., "Efficient Encoding of Low-Density Parity-Check Codes," IEEE Transactions on Information Theory, 47(2):638-656, Feb. 2001.
Wiberg, N., et al., "Codes and Iterative Decoding on General Graphs," Proceedings 1995 IEEE International Symposium on Information Theory (ISIT), Whistler, BC, Canada, p. 468, Sep. 1995.
Aji, S.M., et al., "The Generalized Distributive Law," IEEE Transactions on Information Theory, 46(2):325-343, Mar. 2000.
Tanner, R.M., "A Recursive Approach to Low Complexity Codes," IEEE Transactions on Information Theory, 27(5):533-547, Sep. 1981.

* cited by examiner

[Drawing sheets, images not reproduced: FIG. 1 (Prior Art); FIG. 4; FIG. 3; FIG. 5A; FIG. 5B; FIG. 6]


## SERIAL CONCATENATION OF INTERLEAVED CONVOLUTIONAL CODES FORMING TURBO-LIKE CODES

## CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 11/542,950, filed Oct. 3, 2006, now U.S. Pat. No. 7,421,032, which is a continuation of U.S. application Ser. No. 09/861,102, filed May 18, 2001, now U.S. Pat. No. 7,116,710, which claims the priority of U.S. Provisional Application Ser. No. 60/205,095, filed May 18, 2000, and is a continuation-in-part of U.S. application Ser. No. 09/922,852, filed Aug. 18, 2000, now U.S. Pat. No. 7,089,477. The disclosures of the prior applications are considered part of (and are incorporated by reference in) the disclosure of this application.

## GOVERNMENT LICENSE RIGHTS

The U.S. Government has a paid-up license in this invention and the right in limited circumstances to require the patent owner to license others on reasonable terms as provided for by the terms of Grant No. CCR-9804793 awarded by the National Science Foundation.

## BACKGROUND

Properties of a channel affect the amount of data that can be handled by the channel. The so-called "Shannon limit" defines the theoretical limit of the amount of data that a channel can carry.

Different techniques have been used to increase the data rate that can be handled by a channel. "Near Shannon Limit Error-Correcting Coding and Decoding: Turbo Codes," by Berrou et al. ICC, pp 1064-1070, (1993), described a new "turbo code" technique that has revolutionized the field of error correcting codes. Turbo codes have sufficient randomness to allow reliable communication over the channel at a high data rate near capacity. However, they still retain sufficient structure to allow practical encoding and decoding algorithms. Still, the technique for encoding and decoding turbo codes can be relatively complex.

A standard turbo coder 100 is shown in FIG. 1. A block of $k$ information bits is input directly to a first coder 102. A $k$ bit interleaver 106 also receives the $k$ bits and interleaves them prior to applying them to a second coder 104. The second coder produces an output that has more bits than its input, that is, it is a coder with rate that is less than 1. The coders 102, 104 are typically recursive convolutional coders.

Three different items are sent over the channel 150: the original $k$ bits, first encoded bits 110, and second encoded bits 112. At the decoding end, two decoders are used: a first constituent decoder 160 and a second constituent decoder 162. Each receives both the original $k$ bits, and one of the encoded portions 110, 112. Each decoder sends likelihood estimates of the decoded bits to the other decoders. The estimates are used to decode the uncoded information bits as corrupted by the noisy channel.

## SUMMARY

A coding system according to an embodiment is configured to receive a portion of a signal to be encoded, for example, a data block including a fixed number of bits. The coding system includes an outer coder, which repeats and scrambles bits in the data block. The data block is apportioned into two or more sub-blocks, and bits in different sub-blocks are repeated a different number of times according to a selected degree profile. The outer coder may include a repeater with a variable rate and an interleaver. Alternatively, the outer coder may be a low-density generator matrix (LDGM) coder.

The repeated and scrambled bits are input to an inner coder that has a rate substantially close to one. The inner coder may include one or more accumulators that perform recursive modulo two addition operations on the input bit stream.

The encoded data output from the inner coder may be transmitted on a channel and decoded in linear time at a destination using iterative decoding techniques. The decoding techniques may be based on a Tanner graph representation of the code.

## BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of a prior "turbo code" system.

FIG. 2 is a schematic diagram of a coder according to an embodiment.

FIG. 3 is a Tanner graph for an irregular repeat and accumulate (IRA) coder.

FIG. 4 is a schematic diagram of an IRA coder according to an embodiment.

FIG. 5A illustrates a message from a variable node to a check node on the Tanner graph of FIG. 3.

FIG. 5B illustrates a message from a check node to a variable node on the Tanner graph of FIG. 3.

FIG. 6 is a schematic diagram of a coder according to an alternate embodiment.

FIG. 7 is a schematic diagram of a coder according to another alternate embodiment.

## DETAILED DESCRIPTION

FIG. 2 illustrates a coder 200 according to an embodiment. The coder 200 may include an outer coder 202, an interleaver 204, and inner coder 206. The coder may be used to format blocks of data for transmission, introducing redundancy into the stream of data to protect the data from loss due to transmission errors. The encoded data may then be decoded at a destination in linear time at rates that may approach the channel capacity.

The outer coder 202 receives the uncoded data. The data may be partitioned into blocks of fixed size, say $k$ bits. The outer coder may be an ( $n, k$ ) binary linear block coder, where $n>k$. The coder accepts as input a block $u$ of $k$ data bits and produces an output block v of n data bits. The mathematical relationship between $u$ and $v$ is $v=T_{0} u$, where $T_{0}$ is an $n \times k$ matrix, and the rate of the coder is $\mathrm{k} / \mathrm{n}$.

The rate of the coder may be irregular, that is, the value of $T_{0}$ is not constant, and may differ for sub-blocks of bits in the data block. In an embodiment, the outer coder 202 is a repeater that repeats the $k$ bits in a block a number of times $q$ to produce a block with $n$ bits, where $n=qk$. Since the repeater has an irregular output, different bits in the block may be repeated a different number of times. For example, a fraction of the bits in the block may be repeated two times, a fraction of bits may be repeated three times, and the remainder of bits may be repeated four times. These fractions define a degree sequence, or degree profile, of the code.
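
An irregular repeater of this kind can be sketched as follows. The code is illustrative only: the function names and the per-bit repeat counts are assumptions, and a real design would pick the counts from a chosen degree profile.

```python
from collections import Counter
from typing import Dict, List


def irregular_repeat(u: List[int], degrees: List[int]) -> List[int]:
    """Repeat bit u[i] exactly degrees[i] times, preserving bit order."""
    if len(u) != len(degrees):
        raise ValueError("one repeat count is required per input bit")
    v = []
    for bit, q in zip(u, degrees):
        v.extend([bit] * q)
    return v


def degree_profile(degrees: List[int]) -> Dict[int, float]:
    """Fraction of input bits repeated each number of times."""
    counts = Counter(degrees)
    return {q: c / len(degrees) for q, c in sorted(counts.items())}


u = [1, 0, 1, 1]
degrees = [2, 2, 3, 4]  # hypothetical repeat counts
print(irregular_repeat(u, degrees))  # [1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1]
print(degree_profile(degrees))       # {2: 0.5, 3: 0.25, 4: 0.25}
```

Here half of the bits are repeated twice, a quarter three times, and a quarter four times, so the four input bits expand to eleven output bits.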

The inner coder 206 may be a linear rate-1 coder, which means that the n-bit output block $x$ can be written as $x=T_{I} w$, where $T_{I}$ is a nonsingular $n \times n$ matrix. The inner coder 206 can have a rate that is close to 1, e.g., within $50 \%$, more preferably $10 \%$ and perhaps even more preferably within $1 \%$ of 1.

In an embodiment, the inner coder 206 is an accumulator, which produces outputs that are the modulo two (mod-2) partial sums of its inputs. The accumulator may be a truncated rate-1 recursive convolutional coder with the transfer function $1 /(1+D)$. Such an accumulator may be considered a block coder whose input block $\left[x_{1}, \ldots, x_{n}\right]$ and output block $\left[y_{1}, \ldots, y_{n}\right]$ are related by the formula

$$
\begin{aligned}
& y_{1}=x_{1} \\
& y_{2}=x_{1} \oplus x_{2} \\
& y_{3}=x_{1} \oplus x_{2} \oplus x_{3} \\
& \;\vdots \\
& y_{n}=x_{1} \oplus x_{2} \oplus x_{3} \oplus \ldots \oplus x_{n}
\end{aligned}
$$

where " $\oplus$ " denotes mod-2, or exclusive-OR (XOR), addition. An advantage of this system is that only mod-2 addition is necessary for the accumulator. The accumulator may be embodied using only XOR gates, which may simplify the design.

The bits output from the outer coder 202 are scrambled before they are input to the inner coder 206. This scrambling may be performed by the interleaver 204, which performs a pseudo-random permutation of an input block $v$, yielding an output block $w$ having the same length as $v$.
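One way to picture the interleaver is as a fixed, seeded permutation shared by the encoder and decoder. The sketch below is an assumption for illustration only: the helper names and the use of Python's `random.Random` as the pseudo-random source are not specified by the patent.

```python
import random
from typing import List


def make_permutation(n: int, seed: int) -> List[int]:
    """Build a reproducible pseudo-random permutation of 0..n-1."""
    rng = random.Random(seed)  # seeded so both ends derive the same permutation
    perm = list(range(n))
    rng.shuffle(perm)
    return perm


def interleave(v: List[int], perm: List[int]) -> List[int]:
    """Output block w has the same length as v: w[i] = v[perm[i]]."""
    return [v[p] for p in perm]


def deinterleave(w: List[int], perm: List[int]) -> List[int]:
    """Invert interleave() for the same permutation."""
    v = [0] * len(w)
    for i, p in enumerate(perm):
        v[p] = w[i]
    return v


perm = make_permutation(8, seed=1)
w = interleave([1, 0, 0, 1, 1, 1, 0, 0], perm)
```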

The serial concatenation of the interleaved irregular repeat code and the accumulate code produces an irregular repeat and accumulate (IRA) code. An IRA code is a linear code, and as such, may be represented as a set of parity checks. The set of parity checks may be represented in a bipartite graph, called the Tanner graph, of the code. FIG. 3 shows a Tanner graph 300 of an IRA code with parameters $(f_{1}, \ldots, f_{j}; a)$, where $f_{i} \geqq 0$, $\Sigma_{i} f_{i}=1$ and "a" is a positive integer. The Tanner graph includes two kinds of nodes: variable nodes (open circles) and check nodes (filled circles). There are $k$ variable nodes 302 on the left, called information nodes. There are $r$ variable nodes 306 on the right, called parity nodes. There are $r=\left(k \Sigma_{i} i f_{i}\right) / a$ check nodes 304 connected between the information nodes and the parity nodes. Each information node 302 is connected to a number of check nodes 304. The fraction of information nodes connected to exactly $i$ check nodes is $f_{i}$. For example, in the Tanner graph 300, each of the $f_{2}$ information nodes are connected to two check nodes, corresponding to a repeat of $q=2$, and each of the $f_{3}$ information nodes are connected to three check nodes, corresponding to $q=3$.
Each check node 304 is connected to exactly "a" information nodes 302. In FIG. 3, $a=3$. These connections can be made in many ways, as indicated by the arbitrary permutation of the ra edges joining information nodes 302 and check nodes 304 in permutation block 310. These connections correspond to the scrambling performed by the interleaver 204.
In an alternate embodiment, the outer coder 202 may be a low-density generator matrix (LDGM) coder that performs an irregular repeat of the $k$ bits in the block, as shown in FIG. 4. As the name implies, an LDGM code has a sparse (low-density) generator matrix. The IRA code produced by the coder 400 is a serial concatenation of the LDGM code and the accumulator code. The interleaver 204 in FIG. 2 may be excluded due to the randomness already present in the structure of the LDGM code.

If the permutation performed in permutation block 310 is fixed, the Tanner graph represents a binary linear block code with $k$ information bits $(u_{1}, \ldots, u_{k})$ and $r$ parity bits $(x_{1}, \ldots, x_{r})$, as follows. Each of the information bits is associated with one of the information nodes 302, and each of the parity bits is associated with one of the parity nodes 306. The value of a parity bit is determined uniquely by the condition that the mod-2 sum of the values of the variable nodes connected to each of the check nodes 304 is zero. To see this, set $x_{0}=0$. Then if the values of the bits on the $ra$ edges coming out the permutation box are $(v_{1}, \ldots, v_{ra})$, then we have the recursive formula

$$
x_{j}=x_{j-1}+\sum_{i=1}^{a} v_{(j-1) a+i}
$$

for $j=1,2, \ldots, r$. This is in effect the encoding algorithm.

Two types of IRA codes are represented in FIG. 3, a nonsystematic version and a systematic version. The nonsystematic version is an $(r, k)$ code, in which the codeword corresponding to the information bits $(u_{1}, \ldots, u_{k})$ is $(x_{1}, \ldots, x_{r})$. The systematic version is a $(k+r, k)$ code, in which the codeword is $(u_{1}, \ldots, u_{k}; x_{1}, \ldots, x_{r})$.

The rate of the nonsystematic code is

$$
R_{\mathrm{nsys}}=\frac{a}{\sum_{i} i f_{i}}
$$

The rate of the systematic code is

$$
R_{\mathrm{sys}}=\frac{a}{a+\sum_{i} i f_{i}}
$$

For example, regular repeat and accumulate (RA) codes can be considered nonsystematic IRA codes with $a=1$ and exactly one $f_{i}$ equal to 1, say $f_{q}=1$, and the rest zero, in which case $R_{\mathrm{nsys}}$ simplifies to $R=1 / q$.

The IRA code may be represented using an alternate notation. Let $\lambda_{i}$ be the fraction of edges between the information nodes 302 and the check nodes 304 that are adjacent to an information node of degree $i$, and let $\rho_{i}$ be the fraction of such edges that are adjacent to a check node of degree $i+2$ (i.e., one that is adjacent to $i$ information nodes). These edge fractions may be used to represent the IRA code rather than the corresponding node fractions. Define $\lambda(x)=\Sigma_{i} \lambda_{i} x^{i-1}$ and $\rho(x)=\Sigma_{i} \rho_{i} x^{i-1}$ to be the generating functions of these sequences. The pair $(\lambda, \rho)$ is called a degree distribution. For $L(x)=\Sigma_{i} f_{i} x^{i}$,

$$
f_{i}=\frac{\lambda_{i} / i}{\sum_{j} \lambda_{j} / j}
$$

$$
L(x)=\int_{0}^{x} \lambda(t) d t / \int_{0}^{1} \lambda(t) d t
$$

The rate of the systematic IRA code given by the degree distribution is given by

$$
\text { Rate }=\left(1+\frac{\sum_{j} \rho_{j} / j}{\sum_{j} \lambda_{j} / j}\right)^{-1}
$$
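The recursive encoding formula given earlier in this passage, with $x_{0}=0$ and each parity bit formed from the previous parity bit plus the next group of $a$ permuted bits, can be sketched in Python as follows. The function names and the small example values are hypothetical; the input `v` is assumed to be the already repeated and permuted bit sequence of length $ra$.

```python
from typing import List


def ira_parity(v: List[int], a: int) -> List[int]:
    """Compute x_j = x_{j-1} XOR v_{(j-1)a+1} XOR ... XOR v_{ja}, with x_0 = 0."""
    if a <= 0 or len(v) % a != 0:
        raise ValueError("len(v) must be a positive multiple of a")
    x: List[int] = []
    prev = 0  # x_0
    for j in range(len(v) // a):
        for bit in v[j * a:(j + 1) * a]:
            prev ^= bit
        x.append(prev)
    return x


def systematic_codeword(u: List[int], v: List[int], a: int) -> List[int]:
    """Information bits followed by parity bits: (u_1..u_k; x_1..x_r)."""
    return u + ira_parity(v, a)


# k = 2 information bits, each repeated q = 3 times and then permuted (assumed).
u = [1, 0]
v = [1, 0, 0, 1, 1, 0]
x = ira_parity(v, a=2)                      # nonsystematic codeword, r = 3
codeword = systematic_codeword(u, v, a=2)   # systematic codeword, k + r = 5
```

In this example the nonsystematic rate is $k/r = 2/3$ and the systematic rate is $k/(k+r) = 2/5$, which agree with $a/q$ and $a/(a+q)$ for $a=2$ and $q=3$.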

"Belief propagation" on the Tanner Graph realization may be used to decode IRA codes. Roughly speaking, the belief propagation decoding technique allows the messages passed on an edge to represent posterior densities on the bit associated with the variable node. A probability density on a bit is a pair of non-negative real numbers $p(0), p(1)$ satisfying $p(0)+p(1)=1$, where $p(0)$ denotes the probability of the bit being 0, $p(1)$ the probability of it being 1. Such a pair can be represented by its log-likelihood ratio, $m=\log (p(0) / p(1))$. The outgoing message from a variable node $u$ to a check node $v$ represents information about $u$, and a message from a check node $u$ to a variable node $v$ represents information about $u$, as shown in FIGS. 5A and 5B, respectively.

The outgoing message from a node $u$ to a node $v$ depends on the incoming messages from all neighbors $w$ of $u$ except $v$. If $u$ is a variable message node, this outgoing message is

$$
m(u \rightarrow v)=\sum_{w \neq v} m(w \rightarrow u)+m_{0}(u)
$$

where $m_{0}(u)$ is the log-likelihood message associated with $u$. If $u$ is a check node, the corresponding formula is

$$
\tanh \frac{m(u \rightarrow v)}{2}=\prod_{w \neq v} \tanh \frac{m(w \rightarrow u)}{2}
$$
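The two update rules above can be sketched as small Python functions. The names are hypothetical, `incoming` is assumed to hold the messages from all neighbors except the target node, and the clipping step is an added numerical safeguard that is not part of the formulas in the text.

```python
import math
from typing import List


def variable_update(incoming: List[float], m0: float) -> float:
    """m(u -> v) = sum over w != v of m(w -> u), plus the channel message m0(u)."""
    return m0 + sum(incoming)


def check_update(incoming: List[float]) -> float:
    """Solve tanh(m(u -> v) / 2) = product over w != v of tanh(m(w -> u) / 2)."""
    prod = 1.0
    for m in incoming:
        prod *= math.tanh(m / 2.0)
    # Assumed safeguard: keep atanh finite when a message is +/- infinity.
    eps = 1e-12
    prod = max(min(prod, 1.0 - eps), -1.0 + eps)
    return 2.0 * math.atanh(prod)
```

A check node with one completely uncertain input (message 0) sends out 0, reflecting that a parity constraint carries no information about one bit when another participating bit is unknown.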

Before decoding, the messages $m(w \rightarrow u)$ and $m(u \rightarrow v)$ are initialized to be zero, and $m_{0}(u)$ is initialized to be the log-likelihood ratio based on the channel received information. If the channel is memoryless, i.e., each channel output only relies on its input, and $y$ is the output of the channel code bit $u$, then $m_{0}(u)=\log (p(u=0 \mid y) / p(u=1 \mid y))$. After this initialization, the decoding process may run in a fully parallel and local manner. In each iteration, every variable/check node receives messages from its neighbors, and sends back updated messages. Decoding is terminated after a fixed number of iterations or upon detecting that all the constraints are satisfied. Upon termination, the decoder outputs a decoded sequence based on the messages

$$
m(u)=\sum_{w} m(w \rightarrow u) .
$$

Thus, on various channels, iterative decoding only differs in the initial messages $m_{0}(u)$. For example, consider three memoryless channel models: a binary erasure channel (BEC); a binary symmetric channel (BSC); and an additive white Gaussian noise (AWGN) channel.

In the BEC, there are two inputs and three outputs. When 0 is transmitted, the receiver can receive either 0 or an erasure $E$. An erasure $E$ output means that the receiver does not know how to demodulate the output. Similarly, when 1 is transmitted, the receiver can receive either 1 or $E$. Thus, for the BEC, $y \in\{0, E, 1\}$, and

$$
m_{0}(u)=\left\{\begin{array}{ccc}
+\infty & \text { if } y=0 \\
0 & \text { if } y=E \\
-\infty & \text { if } y=1
\end{array}\right.
$$

In the BSC, there are two possible inputs $(0,1)$ and two possible outputs $(0,1)$. The BSC is characterized by a set of conditional probabilities relating all possible outputs to possible inputs. Thus, for the BSC $y \in\{0,1\}$,

$$
m_{0}(u)=\left\{\begin{array}{lll}
\log \frac{1-p}{p} & \text { if } & y=0 \\
-\log \frac{1-p}{p} & \text { if } & y=1
\end{array}\right.
$$

In the AWGN, the discrete-time input symbols $X$ take their values in a finite alphabet while channel output symbols $Y$ can take any values along the real line. There is assumed to be no distortion or other effects other than the addition of white Gaussian noise. In an AWGN with a Binary Phase Shift Keying (BPSK) signaling which maps 0 to the symbol with amplitude $\sqrt{E_{s}}$ and 1 to the symbol with amplitude $-\sqrt{E_{s}}$, output $y \in R$, then

$$
m_{0}(u)=4 y \sqrt{E_{s}} / N_{0}
$$
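The three initial-message rules above differ only in how the channel output is mapped to a log-likelihood ratio. A minimal Python sketch follows; the function names, the string `"E"` for an erasure, and the parameter names `p`, `es`, and `n0` (crossover probability, symbol energy, noise density) are hypothetical choices for illustration.

```python
import math
from typing import Union


def llr_bec(y: Union[int, str]) -> float:
    """BEC: certain 0, certain 1, or no information on an erasure 'E'."""
    if y == "E":
        return 0.0
    if y == 0:
        return math.inf
    if y == 1:
        return -math.inf
    raise ValueError("BEC output must be 0, 1, or 'E'")


def llr_bsc(y: int, p: float) -> float:
    """BSC with crossover probability p: +/- log((1 - p) / p)."""
    magnitude = math.log((1.0 - p) / p)
    return magnitude if y == 0 else -magnitude


def llr_awgn_bpsk(y: float, es: float, n0: float) -> float:
    """AWGN with BPSK (0 -> +sqrt(Es), 1 -> -sqrt(Es)): 4 y sqrt(Es) / N0."""
    return 4.0 * y * math.sqrt(es) / n0
```

Once these values are computed, the same iterative message-passing schedule can be used regardless of which channel produced them.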

Table 1 shows degree profiles yielding codes of rate approximately $1 / 3$ for the AWGN channel and with $a=2,3,4$. For each sequence, the Gaussian approximation noise threshold, the actual sum-product decoding threshold and the corresponding energy per bit $\left(\mathrm{E}_{b}\right)$-noise power $\left(\mathrm{N}_{0}\right)$ ratio in dB are given. Also listed is the Shannon limit (S.L.).

As the parameter "$a$" is increased, the performance improves. For example, for $a=4$, the best code found has an iterative decoding threshold of $E_{b} / N_{0}=-0.371 \mathrm{~dB}$, which is only 0.12 dB above the Shannon limit.
The accumulator component of the coder may be replaced by a "double accumulator" 600 as shown in FIG. 6. The double accumulator can be viewed as a truncated rate 1 convolutional coder with transfer function $1 /\left(1+D+D^{2}\right)$.
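Reading the transfer function $1 /\left(1+D+D^{2}\right)$ as the recursion $y_{t}=x_{t} \oplus y_{t-1} \oplus y_{t-2}$ gives the following Python sketch. The function name is hypothetical, and the sketch models only the stated transfer function, not the specific circuit of FIG. 6.

```python
from typing import List


def double_accumulate(x: List[int]) -> List[int]:
    """Rate-1 recursive coder: y_t = x_t XOR y_{t-1} XOR y_{t-2}."""
    y: List[int] = []
    s1 = 0  # y_{t-1}
    s2 = 0  # y_{t-2}
    for bit in x:
        out = bit ^ s1 ^ s2
        y.append(out)
        s1, s2 = out, s1
    return y


impulse_response = double_accumulate([1, 0, 0, 0, 0, 0])
```

The response to a single 1 repeats with period three (1, 1, 0, ...), in contrast to the single accumulator, whose impulse response is all ones.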

Alternatively, a pair of accumulators may be added, as shown in FIG. 7. There are three component codes: the "outer" code 700, the "middle" code 702, and the "inner" code 704. The outer code is an irregular repetition code, and the middle and inner codes are both accumulators.
IRA codes may be implemented in a variety of channels, including memoryless channels, such as the BEC, BSC, and AWGN, as well as channels having non-binary input, nonsymmetric and fading channels, and/or channels with memory.

A number of embodiments have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of the following claims.

## What is claimed is:

1. A method of encoding a signal, comprising:
receiving a block of data in the signal to be encoded, the block of data including information bits;
performing a first encoding operation on at least some of the information bits, the first encoding operation being a linear transform operation that generates L transformed bits; and
performing a second encoding operation using the $L$ transformed bits as an input, the second encoding operation including an accumulation operation in which the L transformed bits generated by the first encoding operation are accumulated, said second encoding operation producing at least a portion of a codeword, wherein $L$ is two or more.
2. The method of claim 1, further comprising:
outputting the codeword, wherein the codeword comprises parity bits.
3. The method of claim 2, wherein outputting the codeword comprises:
outputting the parity bits; and
outputting at least some of the information bits.
4. The method of claim 3, wherein outputting the codeword comprises:
outputting the parity bits following the information bits.
5. The method of claim 2, wherein performing the first encoding operation comprises transforming the at least some of the information bits via a low density generator matrix transformation.
6. The method of claim 5, wherein generating each of the $L$ transformed bits comprises mod-2 or exclusive-OR summing of bits in a subset of the information bits.
7. The method of claim 6, wherein each of the subsets of the information bits includes a same number of the information bits.
8. The method of claim 6, wherein at least two of the information bits appear in three subsets of the information bits.
9. The method of claim 6, wherein the information bits appear in a variable number of subsets.
10. The method of claim 2, wherein performing the second encoding operation comprises using a first of the parity bits in the accumulation operation to produce a second of the parity bits.
11. The method of claim 10, wherein outputting the codeword comprises outputting the second of the parity bits immediately following the first of the parity bits.
12. The method of claim 2, wherein performing the second encoding operation comprises performing one of a mod-2 addition and an exclusive-OR operation.
13. A method of encoding a signal, comprising:
receiving a block of data in the signal to be encoded, the block of data including information bits; and
performing an encoding operation using the information bits as an input, the encoding operation including an accumulation of mod-2 or exclusive-OR sums of bits in subsets of the information bits, the encoding operation generating at least a portion of a codeword,
wherein the information bits appear in a variable number of subsets.
14. The method of claim 13, further comprising:
outputting the codeword, wherein the codeword comprises parity bits.
15. The method of claim 14, wherein outputting the codeword comprises:
outputting the parity bits; and
outputting at least some of the information bits.
16. The method of claim 15, wherein the parity bits follow the information bits in the codeword.
17. The method of claim 13, wherein each of the subsets of the information bits includes a constant number of the information bits.
18. The method of claim 13, wherein performing the encoding operation further comprises:
performing one of the mod-2 addition and the exclusive-OR summing of the bits in the subsets.
19. A method of encoding a signal, comprising:
receiving a block of data in the signal to be encoded, the block of data including information bits; and
performing an encoding operation using the information bits as an input, the encoding operation including an accumulation of mod-2 or exclusive-OR sums of bits in subsets of the information bits, the encoding operation generating at least a portion of a codeword, wherein at least two of the information bits appear in three subsets of the information bits.
20. A method of encoding a signal, comprising:
receiving a block of data in the signal to be encoded, the block of data including information bits; and
performing an encoding operation using the information bits as an input, the encoding operation including an accumulation of mod-2 or exclusive-OR sums of bits in subsets of the information bits, the encoding operation generating at least a portion of a codeword, wherein performing the encoding operation comprises:
mod-2 or exclusive-OR adding a first subset of information bits in the collection to yield a first sum;
mod-2 or exclusive-OR adding a second subset of information bits in the collection and the first sum to yield a second sum.
21. A method comprising:
receiving a collection of information bits;
mod-2 or exclusive-OR adding a first subset of information bits in the collection to yield a first parity bit;
mod-2 or exclusive-OR adding a second subset of information bits in the collection and the first parity bit to yield a second parity bit; and
outputting a codeword that includes the first parity bit and the second parity bit.

22. The method of claim 21, wherein:
the method further comprises mod-2 or exclusive-OR adding additional subsets of information bits in the collection and parity bits to yield additional parity bits; and
the information bits in the collection appear in a variable number of subsets.

(12) United States Patent
Jin et al.

(10) Patent No.: US 8,284,833 B2
(45) Date of Patent: Oct. 9, 2012

(54) SERIAL CONCATENATION OF INTERLEAVED CONVOLUTIONAL CODES FORMING TURBO-LIKE CODES
(75) Inventors: Hui Jin, Glen Gardner, NJ (US); Aamod Khandekar, Pasadena, CA (US); Robert J. McEliece, Pasadena, CA (US)
(73) Assignee: California Institute of Technology, Pasadena, CA (US)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.
(21) Appl. No.: 13/073,947
(22) Filed: Mar. 28, 2011
(65) Prior Publication Data
US 2011/0264985 A1 Oct. 27, 2011

## Related U.S. Application Data

(63) Continuation of application No. 12/165,606, filed on Jun. 30, 2008, now Pat. No. 7,916,781, which is a continuation of application No. 11/542,950, filed on Oct. 3, 2006, now Pat. No. 7,421,032, which is a continuation of application No. 09/861,102, filed on May 18, 2001, now Pat. No. 7,116,710, which is a continuation-in-part of application No. 09/922,852, filed on Aug. 18, 2000, now Pat. No. 7,089,477.
(60) Provisional application No. 60/205,095, filed on May 18, 2000.
(51) Int. Cl. H04B 1/66 (2006.01)
(52) U.S. Cl. 375/240; 375/285; 375/296; 714/801; 714/804
(58) Field of Classification Search 375/240, 375/240.24, 254, 285, 295, 296, 260; 714/755, 714/758, 800, 801, 804, 805
See application file for complete search history.
(56)

## References Cited

U.S. PATENT DOCUMENTS


Aji, S.M., et al., "The Generalized Distributive Law," IEEE Transactions on Information Theory, 46(2):325-343, Mar. 2000.
(Continued)
Primary Examiner - Dac Ha
(74) Attorney, Agent, or Firm - Perkins Coie LLP

## (57)

ABSTRACT
A serial concatenated coder includes an outer coder and an inner coder. The outer coder irregularly repeats bits in a data block according to a degree profile and scrambles the repeated bits. The scrambled and repeated bits are input to an inner coder, which has a rate substantially close to one.

14 Claims, 5 Drawing Sheets


U.S. PATENT DOCUMENTS

| Patent No. | Kind | Date | Inventor | U.S. Class |
| :---: | :---: | :---: | :---: | :---: |
| 6,859,906 | B2 | 2/2005 | Hammons et al. |  |
| 7,089,477 | B1 | 8/2006 | Divsalar et al. |  |
| 7,116,710 | B1 | 10/2006 | Jin et al. |  |
| 7,421,032 | B2 | 9/2008 | Jin et al. |  |
| 7,916,781 | B2 | 3/2011 | Jin et al. |  |
| 7,934,146 | B2* | 4/2011 | Stolpman | 714/800 |
| 2001/0025358 | A1 | 9/2001 | Eidson et al. |  |
| 2007/0025450 | A1 | 2/2007 | Jin et al. |  |
| 2008/0263425 | A1* | 10/2008 | Lakkis | 714/752 |
| 2008/0294964 | A1 | 11/2008 | Jin et al. |  |

OTHER PUBLICATIONS

Benedetto, S., et al., "A Soft-Input Soft-Output APP Module for Iterative Decoding of Concatenated Codes," IEEE Communications Letters, 1(1):22-24, Jan. 1997.
Benedetto, S., et al., "A Soft-Input Soft-Output Maximum a Posteriori (MAP) Module to Decode Parallel and Serial Concatenated Codes," The Telecommunications and Data Acquisition Progress Report (TDA PR 42-127), pp. 1-20, Nov. 1996.
Benedetto, S., et al., "Bandwidth efficient parallel concatenated coding schemes," Electronics Letters, 31(24):2067-2069, Nov. 1995.
Benedetto, S., et al., "Design of Serially Concatenated Interleaved Codes," ICC 97, vol. 2, pp. 710-714, Jun. 1997.
Benedetto, S., et al., "Parallel Concatenated Trellis Coded Modulation," ICC 96, vol. 2, pp. 974-978, Jun. 1996.
Benedetto, S., et al., "Serial Concatenated Trellis Coded Modulation with Iterative Decoding," Proceedings 1997 IEEE International Symposium on Information Theory (ISIT), Ulm, Germany, p. 8, Jun. 29-Jul. 4, 1997.
Benedetto, S., et al., "Serial Concatenation of Interleaved Codes: Performance Analysis, Design, and Iterative Decoding," The Telecommunications and Data Acquisition Progress Report (TDA PR 42-126), pp. 1-26, Aug. 1996.
Benedetto, S., et al., "Serial concatenation of interleaved codes: performance analysis, design, and iterative decoding," Proceedings 1997 IEEE International Symposium on Information Theory (ISIT), Ulm, Germany, p. 106, Jun. 29-Jul. 4, 1997.
Benedetto, S., et al., "Soft-Output Decoding Algorithms in Iterative Decoding of Turbo Codes," The Telecommunications and Data Acquisition Progress Report (TDA PR 42-124), pp. 63-87, Feb. 1996.
Berrou, C., et al., "Near Shannon Limit Error-Correcting Coding and Decoding: Turbo Codes," ICC 93, vol. 2, pp. 1064-1070, May 1993.
Digital Video Broadcasting (DVB)-User guidelines for the second generation system for Broadcasting, Interactive Services, News Gathering and other broadband satellite applications (DVB-S2), ETSI TR 102376 V1.1.1 Technical Report, pp. 1-104 (p. 64), Feb. 2005.
Divsalar, D., et al., "Coding Theorems for 'Turbo-Like' Codes," Proceedings of the 36th Annual Allerton Conference on Communication, Control, and Computing, Monticello, Illinois, pp. 201-210, Sep. 1998.
Divsalar, D., et al., "Effective free distance of turbo codes," Electronics Letters, 32(5):445-446, Feb. 1996.
Divsalar, D., et al., "Hybrid Concatenated Codes and Iterative Decoding," Proceedings 1997 IEEE International Symposium on Information Theory (ISIT), Ulm, Germany, p. 10, Jun. 29-Jul. 4, 1997.
Divsalar, D., et al., "Low-Rate Turbo Codes for Deep-Space Communications," Proceedings 1995 IEEE International Symposium on Information Theory (ISIT), Whistler, BC, Canada, p. 35, Sep. 1995.
Divsalar, D., et al., "Multiple Turbo Codes for Deep-Space Communications," The Telecommunications and Data Acquisition Progress Report (TDA PR 42-121), pp. 66-77, May 1995.
Divsalar, D., et al., "Multiple Turbo Codes," MILCOM '95, vol. 1, pp. 279-285, Nov. 1995.
Divsalar, D., et al., "On the Design of Turbo Codes," The Telecommunications and Data Acquisition Progress Report (TDA PR 42-123), pp. 99-121, Nov. 1995.
Divsalar, D., et al., "Serial Turbo Trellis Coded Modulation with Rate-1 Inner Code," Proceedings 2000 IEEE International Symposium on Information Theory (ISIT), Sorrento, Italy, p. 194, Jun. 2000.
Divsalar, D., et al., "Turbo Codes for PCS Applications," IEEE ICC '95, Seattle, WA, USA, vol. 1, pp. 54-59, Jun. 1995.
Jin, H., et al., "Irregular Repeat-Accumulate Codes," 2nd International Symposium on Turbo Codes, Brest, France, 25 pages, Sep. 2000.
Jin, H., et al., "Irregular Repeat-Accumulate Codes," 2nd International Symposium on Turbo Codes \& Related Topics, Brest, France, pp. 1-8, Sep. 2000.
Richardson, T.J., et al., "Design of Capacity-Approaching Irregular Low-Density Parity-Check Codes," IEEE Transactions on Information Theory, 47(2):619-637, Feb. 2001.
Richardson, T.J., et al., "Efficient Encoding of Low-Density Parity-Check Codes," IEEE Transactions on Information Theory, 47(2):638-656, Feb. 2001.
Tanner, R.M., "A Recursive Approach to Low Complexity Codes," IEEE Transactions on Information Theory, 27(5):533-547, Sep. 1981.
Wiberg, N., et al., "Codes and Iterative Decoding on General Graphs," Proceedings 1995 IEEE International Symposium on Information Theory (ISIT), Whistler, BC, Canada, p. 468, Sep. 1995.

* cited by examiner

FIG. 1
(Prior Art)

Exhibit D
Page 70

FIG. 4



FIG. 3


FIG. 5A


FIG. 5B


FIG. 6



## SERIAL CONCATENATION OF INTERLEAVED CONVOLUTIONAL CODES FORMING TURBO-LIKE CODES

## CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 12/165,606, filed Jun. 30, 2008, now U.S. Pat. No. 7,916,781, which is a continuation of U.S. application Ser. No. 11/542,950, filed Oct. 3, 2006, now U.S. Pat. No. 7,421,032, which is a continuation of U.S. application Ser. No. 09/861,102, filed May 18, 2001, now U.S. Pat. No. 7,116,710, which claims the priority of U.S. Provisional Application Ser. No. 60/205,095, filed May 18, 2000, and is a continuation-in-part of U.S. application Ser. No. 09/922,852, filed Aug. 18, 2000, now U.S. Pat. No. 7,089,477. The disclosures of the prior applications are considered part of (and are incorporated by reference in) the disclosure of this application.

## GOVERNMENT LICENSE RIGHTS

The U.S. Government has a paid-up license in this invention and the right in limited circumstances to require the patent owner to license others on reasonable terms as provided for by the terms of Grant No. CCR-9804793 awarded by the National Science Foundation.

## BACKGROUND

Properties of a channel affect the amount of data that can be handled by the channel. The so-called "Shannon limit" defines the theoretical limit of the amount of data that a channel can carry.
Different techniques have been used to increase the data rate that can be handled by a channel. "Near Shannon Limit Error-Correcting Coding and Decoding: Turbo Codes," by Berrou et al. ICC, pp 1064-1070, (1993), described a new "turbo code" technique that has revolutionized the field of error correcting codes. Turbo codes have sufficient randomness to allow reliable communication over the channel at a high data rate near capacity. However, they still retain sufficient structure to allow practical encoding and decoding algorithms. Still, the technique for encoding and decoding turbo codes can be relatively complex.
A standard turbo coder 100 is shown in FIG. 1. A block of $k$ information bits is input directly to a first coder 102. A $k$ bit interleaver 106 also receives the $k$ bits and interleaves them prior to applying them to a second coder 104. The second coder produces an output that has more bits than its input, that is, it is a coder with rate that is less than 1. The coders 102, 104 are typically recursive convolutional coders.
Three different items are sent over the channel 150: the original k bits, first encoded bits 110, and second encoded bits 112. At the decoding end, two decoders are used: a first constituent decoder 160 and a second constituent decoder 162. Each receives both the original k bits, and one of the encoded portions 110, 112. Each decoder sends likelihood estimates of the decoded bits to the other decoders. The estimates are used to decode the uncoded information bits as corrupted by the noisy channel.

## SUMMARY

A coding system according to an embodiment is configured to receive a portion of a signal to be encoded, for example, a data block including a fixed number of bits. The coding system includes an outer coder, which repeats and scrambles bits in the data block. The data block is apportioned into two or more sub-blocks, and bits in different sub-blocks are repeated a different number of times according to a selected degree profile. The outer coder may include a repeater with a variable rate and an interleaver. Alternatively, the outer coder may be a low-density generator matrix (LDGM) coder.

The repeated and scrambled bits are input to an inner coder that has a rate substantially close to one. The inner coder may include one or more accumulators that perform recursive modulo two addition operations on the input bit stream.

The encoded data output from the inner coder may be transmitted on a channel and decoded in linear time at a destination using iterative decoding techniques. The decoding techniques may be based on a Tanner graph representation of the code.

## BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of a prior "turbo code" system.

FIG. 2 is a schematic diagram of a coder according to an embodiment.
FIG. 3 is a Tanner graph for an irregular repeat and accumulate (IRA) coder.

FIG. 4 is a schematic diagram of an IRA coder according to an embodiment.

FIG. 5A illustrates a message from a variable node to a check node on the Tanner graph of FIG. 3.

FIG. 5B illustrates a message from a check node to a variable node on the Tanner graph of FIG. 3.

FIG. 6 is a schematic diagram of a coder according to an alternate embodiment.

FIG. 7 is a schematic diagram of a coder according to another alternate embodiment.

## DETAILED DESCRIPTION

FIG. 2 illustrates a coder 200 according to an embodiment. The coder 200 may include an outer coder 202, an interleaver 204, and inner coder 206. The coder may be used to format blocks of data for transmission, introducing redundancy into the stream of data to protect the data from loss due to transmission errors. The encoded data may then be decoded at a destination in linear time at rates that may approach the channel capacity.

The outer coder 202 receives the uncoded data. The data may be partitioned into blocks of fixed size, say k bits. The outer coder may be an ( $\mathrm{n}, \mathrm{k}$ ) binary linear block coder, where $\mathrm{n}>\mathrm{k}$. The coder accepts as input a block $u$ of k data bits and produces an output block v of n data bits. The mathematical relationship between $u$ and $v$ is $v=T_{0} u$, where $T_{0}$ is an $n \times k$ matrix, and the rate of the coder is $\mathrm{k} / \mathrm{n}$.

The rate of the coder may be irregular, that is, the value of $T_{0}$ is not constant, and may differ for sub-blocks of bits in the data block. In an embodiment, the outer coder 202 is a repeater that repeats the $k$ bits in a block a number of times $q$ to produce a block with $n$ bits, where $n=q k$. Since the repeater has an irregular output, different bits in the block may be repeated a different number of times. For example, a fraction of the bits in the block may be repeated two times, a fraction of bits may be repeated three times, and the remainder of bits may be repeated four times. These fractions define a degree sequence, or degree profile, of the code.

The inner coder 206 may be a linear rate-1 coder, which means that the $n$-bit output block $x$ can be written as $x=T_{I} w$, where $T_{I}$ is a nonsingular $n \times n$ matrix. The inner coder 210 can have a rate that is close to 1, e.g., within $50 \%$, more preferably $10 \%$ and perhaps even more preferably within $1 \%$ of 1.

In an embodiment, the inner coder 206 is an accumulator, which produces outputs that are the modulo two (mod-2) partial sums of its inputs. The accumulator may be a truncated rate-1 recursive convolutional coder with the transfer function $1 /(1+D)$. Such an accumulator may be considered a block coder whose input block $\left[x_{1}, \ldots, x_{n}\right]$ and output block $\left[y_{1}, \ldots, y_{n}\right]$ are related by the formula

$$
\begin{aligned}
& y_{1}=x_{1} \\
& y_{2}=x_{1} \oplus x_{2} \\
& y_{3}=x_{1} \oplus x_{2} \oplus x_{3} \\
& \;\vdots \\
& y_{n}=x_{1} \oplus x_{2} \oplus x_{3} \oplus \ldots \oplus x_{n}
\end{aligned}
$$

where " $\oplus$ " denotes mod-2, or exclusive-OR (XOR), addition. An advantage of this system is that only mod-2 addition is necessary for the accumulator. The accumulator may be embodied using only XOR gates, which may simplify the design.
The bits output from the outer coder 202 are scrambled before they are input to the inner coder 206. This scrambling may be performed by the interleaver 204, which performs a pseudo-random permutation of an input block $v$, yielding an output block $w$ having the same length as $v$.

The serial concatenation of the interleaved irregular repeat code and the accumulate code produces an irregular repeat and accumulate (IRA) code. An IRA code is a linear code, and as such, may be represented as a set of parity checks. The set of parity checks may be represented in a bipartite graph, called the Tanner graph, of the code. FIG. 3 shows a Tanner graph 300 of an IRA code with parameters $(f_{1}, \ldots, f_{j}; a)$, where $f_{i} \geqq 0$, $\Sigma_{i} f_{i}=1$ and "a" is a positive integer. The Tanner graph includes two kinds of nodes: variable nodes (open circles) and check nodes (filled circles). There are $k$ variable nodes 302 on the left, called information nodes. There are $r$ variable nodes 306 on the right, called parity nodes. There are $r=\left(k \Sigma_{i} i f_{i}\right) / a$ check nodes 304 connected between the information nodes and the parity nodes. Each information node 302 is connected to a number of check nodes 304. The fraction of information nodes connected to exactly $i$ check nodes is $f_{i}$. For example, in the Tanner graph 300, each of the $f_{2}$ information nodes are connected to two check nodes, corresponding to a repeat of $q=2$, and each of the $f_{3}$ information nodes are connected to three check nodes, corresponding to $q=3$.

Each check node 304 is connected to exactly "a" information nodes 302. In FIG. 3, $a=3$. These connections can be made in many ways, as indicated by the arbitrary permutation of the $ra$ edges joining information nodes 302 and check nodes 304 in permutation block 310. These connections correspond to the scrambling performed by the interleaver 204.

In an alternate embodiment, the outer coder 202 may be a low-density generator matrix (LDGM) coder that performs an irregular repeat of the $k$ bits in the block, as shown in FIG. 4. As the name implies, an LDGM code has a sparse (low-density) generator matrix. The IRA code produced by the coder 400 is a serial concatenation of the LDGM code and the accumulator code. The interleaver 204 in FIG. 2 may be excluded due to the randomness already present in the structure of the LDGM code.
If the permutation performed in permutation block 310 is fixed, the Tanner graph represents a binary linear block code with $k$ information bits $\left(u_{1}, \ldots, u_{k}\right)$ and $r$ parity bits $\left(x_{1}, \ldots, x_{r}\right)$, as follows. Each of the information bits is associated with one of the information nodes 302, and each of the parity bits is associated with one of the parity nodes 306. The value of a parity bit is determined uniquely by the condition that the mod-2 sum of the values of the variable nodes connected to each of the check nodes 304 is zero. To see this, set $x_{0}=0$. Then if the values of the bits on the $ra$ edges coming out the permutation box are $\left(v_{1}, \ldots, v_{r a}\right)$, then we have the recursive formula

$$
x_{j}=x_{j-1}+\sum_{i=1}^{a} v_{(j-1) a+i}
$$

for $j=1,2, \ldots, r$. This is in effect the encoding algorithm.

Two types of IRA codes are represented in FIG. 3, a nonsystematic version and a systematic version. The nonsystematic version is an $(r, k)$ code, in which the codeword corresponding to the information bits $\left(u_{1}, \ldots, u_{k}\right)$ is $\left(x_{1}, \ldots, x_{r}\right)$. The systematic version is a $(k+r, k)$ code, in which the codeword is $\left(u_{1}, \ldots, u_{k} ; x_{1}, \ldots, x_{r}\right)$.

The rate of the nonsystematic code is

$$
R_{\mathrm{nsys}}=\frac{a}{\sum_{i} i f_{i}}
$$

The rate of the systematic code is

$$
R_{\mathrm{sys}}=\frac{a}{a+\sum_{i} i f_{i}}
$$

For example, regular repeat and accumulate (RA) codes can be considered nonsystematic IRA codes with $a=1$ and exactly one $f_{i}$ equal to 1, say $f_{q}=1$, and the rest zero, in which case $R_{\mathrm{nsys}}$ simplifies to $R=1 / q$.

The IRA code may be represented using an alternate notation. Let $\lambda_{i}$ be the fraction of edges between the information nodes 302 and the check nodes 304 that are adjacent to an information node of degree $i$, and let $\rho_{i}$ be the fraction of such edges that are adjacent to a check node of degree $i+2$ (i.e., one that is adjacent to $i$ information nodes). These edge fractions may be used to represent the IRA code rather than the corresponding node fractions. Define $\lambda(x)=\sum_{i} \lambda_{i} x^{i-1}$ and $\rho(x)=\sum_{i} \rho_{i} x^{i-1}$ to be the generating functions of these sequences. The pair $(\lambda, \rho)$ is called a degree distribution.

$$
f_{i}=\frac{\lambda_{i} / i}{\sum_{j} \lambda_{j} / j}
$$

$$
\text { Rate }=\left(1+\frac{\sum_{j} \rho_{j} / j}{\sum_{j} \lambda_{j} / j}\right)^{-1}
$$

"Belief propagation" on the Tanner Graph realization may be used to decode IRA codes. Roughly speaking, the belief propagation decoding technique allows the messages passed on an edge to represent posterior densities on the bit associated with the variable node. A probability density on a bit is a pair of non-negative real numbers $p(0), p(1)$ satisfying $p(0)+$ $p(1)=1$, where $p(0)$ denotes the probability of the bit being 0 , $p(1)$ the probability of it being 1 . Such a pair can be represented by its $\log$ likelihood ratio, $m=\log (p(0) / p(1))$. The outgoing message from a variable node $u$ to a check node $v$ represents information about $u$, and a message from a check node $u$ to a variable node $v$ represents information about $u$, as shown in FIGS. 5A and 5B; respectively.

The outgoing message from a node $u$ to a node $v$ depends on the incoming messages from all neighbors $w$ of $u$ except $v$. If $u$ is a variable message node, this outgoing message is

$$
m(u \rightarrow v)=\sum_{w \neq v} m(w \rightarrow u)+m_{0}(u)
$$

where $m_{0}(u)$ is the log-likelihood message associated with $u$. If $u$ is a check node, the corresponding formula is

$$
\tanh \frac{m(u \rightarrow v)}{2}=\prod_{w \neq v} \tanh \frac{m(w \rightarrow u)}{2}
$$

Before decoding, the messages $m(w \rightarrow u)$ and $m(u \rightarrow v)$ are initialized to be zero, and $m_{0}(u)$ is initialized to be the log-likelihood ratio based on the channel received information. If the channel is memoryless, i.e., each channel output only relies on its input, and $y$ is the output of the channel code bit $u$, then $m_{0}(u)=\log (p(u=0 \mid y) / p(u=1 \mid y))$. After this initialization, the decoding process may run in a fully parallel and local manner. In each iteration, every variable/check node receives messages from its neighbors, and sends back updated messages. Decoding is terminated after a fixed number of iterations or upon detecting that all the constraints are satisfied. Upon termination, the decoder outputs a decoded sequence based on the messages $m(u)=\sum_{w} m(w \rightarrow u)$.
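The two update rules can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, `incoming` is assumed to hold the messages from every neighbor $w$ of $u$ other than the destination $v$, and the clamping constant is an implementation convenience that does not come from the text.

```python
import math

def variable_to_check(incoming, m0):
    """m(u->v) = m0(u) + sum of m(w->u) over neighbors w other than v."""
    return m0 + sum(incoming)

def check_to_variable(incoming):
    """Solve tanh(m(u->v)/2) = prod tanh(m(w->u)/2) for m(u->v)."""
    prod = 1.0
    for m in incoming:
        prod *= math.tanh(m / 2.0)
    # Clamp so atanh stays finite when the product rounds to +/-1.
    eps = 1e-12
    prod = max(-1.0 + eps, min(1.0 - eps, prod))
    return 2.0 * math.atanh(prod)

# A zero (uninformative) input forces a zero check-node output.
print(check_to_variable([0.0, 5.0]))  # 0.0
```

A check-node output is never more confident than its least confident input, which is the behavior the tanh product encodes.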

Thus, on various channels, iterative decoding only differs in the initial messages $m_{0}(u)$. For example, consider three memoryless channel models: a binary erasure channel (BEC); a binary symmetric channel (BSC); and an additive white Gaussian noise (AWGN) channel.

In the BEC, there are two inputs and three outputs. When 0 is transmitted, the receiver can receive either 0 or an erasure $E$. An erasure $E$ output means that the receiver does not know how to demodulate the output. Similarly, when 1 is transmitted, the receiver can receive either 1 or $E$. Thus, for the BEC, $y \in\{0, E, 1\}$, and

$$
m_{0}(u)= \begin{cases}+\infty & \text { if } y=0 \\ 0 & \text { if } y=E \\ -\infty & \text { if } y=1\end{cases}
$$

In the BSC, there are two possible inputs $(0,1)$ and two possible outputs $(0,1)$. The BSC is characterized by a set of conditional probabilities relating all possible outputs to possible inputs. Thus, for the BSC, $y \in\{0,1\}$, and

$$
m_{0}(u)= \begin{cases}\log \frac{1-p}{p} & \text { if } y=0 \\ -\log \frac{1-p}{p} & \text { if } y=1\end{cases}
$$

In the AWGN, the discrete-time input symbols $X$ take their values in a finite alphabet while channel output symbols $Y$ can take any values along the real line. There is assumed to be no distortion or other effects other than the addition of white Gaussian noise. In an AWGN with a Binary Phase Shift Keying (BPSK) signaling which maps 0 to the symbol with amplitude $\sqrt{E_{s}}$ and 1 to the symbol with amplitude $-\sqrt{E_{s}}$, output $y \in R$, then

$$
m_{0}(u)=4 y \sqrt{E_{s}} / N_{0}
$$

where $\mathrm{N}_{0} / 2$ is the noise power spectral density.
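The three initial-message rules may be sketched as small Python functions. The function names are hypothetical, the erasure is represented by the string `"E"` for illustration, and $p$ in the BSC rule is assumed here to be the crossover probability with $0<p<1$.

```python
import math

def llr_bec(y):
    """Initial message for the BEC; y is 0, 1, or "E" (erasure)."""
    return {0: math.inf, "E": 0.0, 1: -math.inf}[y]

def llr_bsc(y, p):
    """Initial message for the BSC; p is assumed to be the crossover probability."""
    magnitude = math.log((1.0 - p) / p)
    return magnitude if y == 0 else -magnitude

def llr_awgn_bpsk(y, es, n0):
    """Initial message for BPSK on the AWGN: 0 -> +sqrt(es), 1 -> -sqrt(es)."""
    return 4.0 * y * math.sqrt(es) / n0

print(llr_awgn_bpsk(0.5, 1.0, 2.0))  # 1.0
```

Only these functions change from channel to channel; the message-passing iterations themselves are the same.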
The selection of a degree profile for use in a particular transmission channel is a design parameter, which may be affected by various attributes of the channel. The criteria for selecting a particular degree profile may include, for example, the type of channel and the data rate on the channel. For example, Table 1 shows degree profiles that have been found to produce good results for an AWGN channel model.

TABLE 1

| a | 2 | 3 | 4 |
| :---: | :---: | :---: | :---: |
| $\lambda_{2}$ | 0.139025 | 0.078194 | 0.054485 |
| $\lambda_{3}$ | 0.2221555 | 0.128085 | 0.104315 |
| $\lambda_{5}$ |  | 0.160813 |  |
| $\lambda_{6}$ | 0.638820 | 0.036178 | 0.126755 |
| $\lambda_{10}$ |  |  | 0.229816 |
| $\lambda_{11}$ |  |  | 0.016484 |
| $\lambda_{12}$ |  | 0.108828 |  |
| $\lambda_{13}$ |  | 0.487902 |  |
| $\lambda_{14}$ |  |  |  |
| $\lambda_{16}$ |  |  |  |
| $\lambda_{27}$ |  |  | 0.450302 |
| $\lambda_{28}$ |  |  | 0.017842 |
| Rate | 0.333364 | 0.333223 | 0.333218 |
| $\sigma_{GA}$ | 1.1840 | 1.2415 | 1.2615 |
| $\sigma^{*}$ | 1.1981 | 1.2607 | 1.2780 |
| $E_{b} / N_{0}$ (dB) | 0.190 | -0.250 | -0.371 |
| S.L. (dB) | -0.4953 | -0.4958 | -0.4958 |

Table 1 shows degree profiles yielding codes of rate approximately $1 / 3$ for the AWGN channel and with $a=2,3,4$. For each sequence, the Gaussian approximation noise threshold, the actual sum-product decoding threshold and the corresponding energy per bit $\left(E_{b}\right)$-noise power $\left(N_{0}\right)$ ratio in dB are given. Also listed is the Shannon limit (S.L.).

As the parameter " $a$ " is increased, the performance improves. For example, for $\mathrm{a}=4$, the best code found has an iterative decoding threshold of $\mathrm{E}_{b} / \mathrm{N}_{0}=-0.371 \mathrm{~dB}$, which is only 0.12 dB above the Shannon limit.
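As a consistency check, the "Rate" row of Table 1 can be recomputed from the listed edge fractions using the rate formula for a degree distribution. The sketch below makes two assumptions: every check node is adjacent to exactly $a$ information nodes, so that $\sum_{j} \rho_{j} / j=1 / a$, and the two row labels that are not legible as $\lambda$ entries in this copy are read as $\lambda_{3}$ and $\lambda_{27}$. The function name `rate_from_profile` is hypothetical.

```python
def rate_from_profile(lam, a):
    """Rate = (1 + (sum_j rho_j/j) / (sum_j lam_j/j))^-1, assuming rho_a = 1."""
    s_lambda = sum(frac / degree for degree, frac in lam.items())
    s_rho = 1.0 / a  # each check node touches exactly a information nodes
    return 1.0 / (1.0 + s_rho / s_lambda)

# Edge fractions keyed by information-node degree, one profile per value of a.
profiles = {
    2: {2: 0.139025, 3: 0.2221555, 6: 0.638820},
    3: {2: 0.078194, 3: 0.128085, 5: 0.160813, 6: 0.036178,
        12: 0.108828, 13: 0.487902},
    4: {2: 0.054485, 3: 0.104315, 6: 0.126755, 10: 0.229816,
        11: 0.016484, 27: 0.450302, 28: 0.017842},
}
for a, lam in profiles.items():
    print(a, round(rate_from_profile(lam, a), 6))
# 2 0.333364
# 3 0.333223
# 4 0.333218
```

Under these assumptions the computed values agree with the tabulated rates to six decimal places.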

The accumulator component of the coder may be replaced by a "double accumulator" 600 as shown in FIG. 6. The double accumulator can be viewed as a truncated rate 1 convolutional coder with transfer function $1 /\left(1+D+D^{2}\right)$.

Alternatively, a pair of accumulators may be added, as shown in FIG. 7. There are three component codes: the "outer" code 700, the "middle" code 702, and the "inner" code 704. The outer code is an irregular repetition code, and the middle and inner codes are both accumulators.
IRA codes may be implemented in a variety of channels, including memoryless channels, such as the BEC, BSC, and AWGN, as well as channels having non-binary input, nonsymmetric and fading channels, and/or channels with memory.
A number of embodiments have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of the following claims.

## What is claimed is:

1. An apparatus for performing encoding operations, the apparatus comprising:
a first set of memory locations to store information bits; a second set of memory locations to store parity bits;
a permutation module to read a bit from the first set of memory locations and combine the read bit to a bit in the second set of memory locations based on a corresponding index of the first set of memory locations and a corresponding index of the second set of memory locations; and
an accumulator to perform accumulation operations on the bits stored in the second set of memory locations,
wherein two or more memory locations of the first set of memory locations are read by the permutation module different times from one another.
2. The apparatus of claim 1, wherein the permutation module is configured to perform the combine operation to include performing mod-2 or exclusive-OR sum.
3. The apparatus of claim 2 , wherein the permutation module is configured to perform the combining operation to further include writing the sum to the second set of memory locations based on a corresponding index.
4. The apparatus of claim 1 , wherein the accumulator is configured to perform the accumulation operation to include a mod-2 or exclusive-OR sum of the bit stored in a prior index to a bit stored in a current index based on a corresponding index of the second set of memory locations.
5. The apparatus of claim 4, wherein the accumulator is configured to perform the accumulation operation to at least 2 consecutive indices of the second set of memory locations.
6. The apparatus of claim 1, wherein the permutation module further comprises a permutation information module to generate pairs of an index of the first set of memory locations and an index of the second set of memory locations.
7. The apparatus of claim 6 , wherein at least one index of the second set of memory locations is used twice.
8. A method of performing encoding operations, the method comprising:
receiving a sequence of information bits from a first set of memory locations;
performing an encoding operation using the received sequence of information bits as an input, said encoding operation comprising:
reading a bit from the received sequence of information bits, and
combining the read bit to a bit in a second set of memory locations based on a corresponding index of the first set of memory locations for the received sequence of information bits and a corresponding index of the second set of memory locations; and
accumulating the bits in the second set of memory locations,
wherein two or more memory locations of the first set of memory locations are read by the permutation module different times from one another.
9. The method of claim 8 , wherein performing the combine operation comprises performing mod-2 or exclusive-OR sum.
10. The method of claim 9 , wherein performing the combine operation comprises writing the sum to the second set of memory locations based on a corresponding index.
11. The method of claim 8, wherein performing the accumulation operation comprises performing a mod-2 or exclusive-OR sum of the bit stored in a prior index to a bit stored in a current index based on a corresponding index of the second set of memory locations.
12. The method of claim 8, wherein the accumulation operation is performed to at least 2 consecutive indices of the second set of memory locations.
13. The method of claim 8 , wherein the combining operation comprises generating pairs of an index of the first set of memory locations and an index of the second set of memory locations.
14. The method of claim 13, wherein at least one index of the second set of memory locations is used twice.

## Exhibit D <br> Page 78

## UNITED STATES PATENT AND TRADEMARK OFFICE <br> CERTIFICATE OF CORRECTION

| PATENT NO. | : 8,284,833 B2 | Page 1 of 1 |
| :--- | :--- | :--- |
| APPLICATION NO. | : 13/073947 |  |
| DATED | : October 9, 2012 |  |
| INVENTOR(S) | : Hui Jin et al. |  |

It is certified that error appears in the above-identified patent and that said Letters Patent is hereby corrected as shown below:

On the Title Page, in the Figures, insert Referral Tag -- 300 --.
On Title Page 2, Item (56), under "OTHER PUBLICATIONS", Line 19, delete "Performace" and insert -- Performance --, therefor.

In Fig. 3, Sheet 3 of 5, insert Referral Tag -- 300 --.
In Column 1, Line 38, delete "Bcrrou" and insert -- Berrou --, therefor.

In Column 3, Line 3, delete " $1 \%$ of I." and insert -- $1 \%$ of 1. --, therefor.
In Column 4, Line 31, delete "The rate of the nonsystematic code is" and insert the same at Line 25 as a new line.

In Column 4, Line 61, delete "The" and insert -- the --, therefor.
In Column 5, Line 39, delete "corresponding formula is" and insert the same in Line 32, after "node, the".

In Column 5, Line 59, delete "(AGWN)" and insert -- (AWGN) --, therefor.

Signed and Sealed this
Eighth Day of January, 2013


# Irregular Repeat-Accumulate Codes ${ }^{1}$ 

Hui Jin, Aamod Khandekar, and Robert McEliece<br>Department of Electrical Engineering, California Institute of Technology<br>Pasadena, CA 91125 USA<br>E-mail: \{hui, aamod, rjm\}@systems.caltech.edu


#### Abstract

In this paper we will introduce an ensemble of codes called irregular repeat-accumulate (IRA) codes. IRA codes are a generalization of the repeat-accumulate codes introduced in [1], and as such have a natural linear-time encoding algorithm. We shall prove that on the binary erasure channel, IRA codes can be decoded reliably in linear time, using iterative sum-product decoding, at rates arbitrarily close to channel capacity. A similar result appears to be true on the AWGN channel, although we have no proof of this. We illustrate our results with numerical and experimental examples.


Keywords: repeat-accumulate codes, turbo-codes, low-density parity-check codes, iterative decoding.

## 1. INTRODUCTION

With the hindsight provided by the past seven years of research in turbo-codes and low-density parity-check codes, one is tempted to propose the following problem as the final problem for channel coding researchers: For a given channel, find an ensemble of codes with (1) a linear-time encoding algorithm, and (2) which can be decoded reliably in linear time at rates arbitrarily close to channel capacity. For turbo-codes, both parallel and serial, (1) holds, but according to the recent work by Divsalar, Dolinar, and Pollara [7], on the AWGN channel there appears to be a gap, albeit usually not a large one, between channel capacity and the iterative decoding thresholds for any turbo ensemble. For LDPC codes, the natural encoding algorithm is quadratic in the block length, and from the work of Richardson and Urbanke [2] we know that for regular LDPC codes, on the binary symmetric and AWGN channels there is a gap between capacity and the iterative decoding thresholds. On the positive side, however, Luby, Shokrollahi et al. [3], [4], [8], have established the remarkable fact that on the binary erasure channel irregular LDPC codes satisfy (2). Recent work by Richardson, Shokrollahi and Urbanke [5] shows that on the AWGN channel, irregular LDPC codes are markedly better than regular ones, but whether or not they can reach capacity is not yet known. In summary, as yet there is no known noisy channel for which the final problem has been solved, although researchers are very close on the AWGN channel and extremely close on the binary erasure channel.

In this paper, we will introduce a promising class of codes called irregular repeat-accumulate codes, which generalizes the repeat-accumulate codes of [1]. After defining the codes in Section 2, and observing that they have a simple linear-time encoding algorithm, in Section 3, using the powerful Richardson-Urbanke method [2], we will prove rigorously that IRA codes solve the final problem for the binary erasure channel. In Section 4, we will discuss, less rigorously, the performance of IRA codes on the AWGN channel, and show that their performance is remarkably good.

## 2. DEFINITION OF IRA CODES

Figure 1 shows a Tanner graph of an IRA code with parameters $\left(f_{1}, \ldots, f_{J} ; a\right)$, where $f_{i} \geq 0$, $\sum_{i} f_{i}=1$ and $a$ is a positive integer. The Tanner graph is a bipartite graph with two kinds of nodes: variable nodes (open circles) and check nodes (filled circles). There are $k$ variable nodes on the left, called information nodes; there are $r=\left(k \sum_{i} i f_{i}\right) / a$ check nodes; and there are $r$ variable nodes on the right, called parity nodes. Each information node is connected to a number of check nodes: the fraction of information nodes connected to exactly $i$ check nodes is $f_{i}$. Each check node is connected to exactly $a$ information nodes. These connections can be made in many ways, as indicated in Figure 1 by the "arbitrary permutation" of the $ra$ edges joining information nodes and check nodes. The check nodes are connected to the parity nodes in the simple zigzag pattern shown in the figure.

Figure 1: Tanner graph for IRA code with parameters $\left(f_{1}, \ldots, f_{J} ; a\right)$.

If the "arbitrary permutation" in Figure 1 is fixed, the Tanner graph represents a binary linear code with $k$ information bits $\left(u_{1}, \ldots, u_{k}\right)$ and $r$ parity bits $\left(x_{1}, \ldots, x_{r}\right)$, as follows. Each of the information bits is associated with one of the information nodes; and each of the parity bits is associated with one of the parity nodes. The value of a parity bit is determined uniquely by the condition that the mod 2 sum of the values of the variable nodes connected to each of the check nodes is zero. To see this, let us conventionally set $x_{0}=0$. Then if the values of the bits on the $r a$ edges coming out of the permutation box are $\left(v_{1}, \ldots, v_{r a}\right)$, we have the recursive formula

$$
\begin{equation*}
x_{j}=x_{j-1}+\sum_{i=1}^{a} v_{(j-1) a+i} \tag{1}
\end{equation*}
$$

for $j=1,2, \ldots, r$. This is in effect the encoding algorithm, and so if $a$ is fixed and $n \rightarrow \infty$, the encoding complexity is $O(n)$.
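The recursion in eq. (1) can be sketched directly in Python. This is only an illustration under stated assumptions: the name `ira_encode`, the per-bit `degrees` list (how many times each information bit is repeated), and the explicit `perm` list standing in for the fixed "arbitrary permutation" are all hypothetical.

```python
import random

def ira_encode(u, degrees, a, perm):
    """Parity bits via x_j = x_{j-1} + sum_{i=1..a} v_{(j-1)a+i} (mod 2)."""
    # Irregular repeat: information bit u[t] is copied degrees[t] times.
    repeated = [bit for bit, d in zip(u, degrees) for _ in range(d)]
    if len(repeated) % a != 0:
        raise ValueError("total number of edges must be a multiple of a")
    if sorted(perm) != list(range(len(repeated))):
        raise ValueError("perm must be a permutation of the edge indices")
    # Apply the fixed permutation of the r*a edges.
    v = [repeated[p] for p in perm]
    parity, prev = [], 0  # prev plays the role of x_0 = 0
    for j in range(len(v) // a):
        prev ^= sum(v[j * a:(j + 1) * a]) % 2
        parity.append(prev)
    return parity

u = [1, 0, 1, 1]
degrees = [2, 2, 3, 2]           # 9 edges in total
a = 3                            # r = 9 / 3 = 3 parity bits
perm = list(range(9))
random.Random(0).shuffle(perm)   # any fixed permutation may be used
x = ira_encode(u, degrees, a, perm)
codeword = u + x                 # systematic version: (u; x)
```

The loop visits each of the $ra$ edge values once, which matches the linear encoding complexity noted above.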

There are two versions of the IRA code in Figure 1: the nonsystematic and the systematic versions. The nonsystematic version is an $(r, k)$ code, in which the codeword corresponding to the information bits $\left(u_{1}, \ldots, u_{k}\right)$ is $\left(x_{1}, \ldots, x_{r}\right)$. The systematic version is a $(k+r, k)$ code, in which the codeword is

$$
\left(u_{1}, \ldots, u_{k} ; x_{1}, \ldots, x_{r}\right)
$$

The rate of the nonsystematic code is easily seen to be

$$
\begin{equation*}
R_{\mathrm{nsys}}=\frac{a}{\sum_{i} i f_{i}} \tag{2}
\end{equation*}
$$

whereas for the systematic code the rate is

$$
\begin{equation*}
R_{\mathrm{sys}}=\frac{a}{a+\sum_{i} i f_{i}} \tag{3}
\end{equation*}
$$

For example, the original RA codes are nonsystematic IRA codes with $a=1$ and exactly one $f_{i}$ equal to 1 , say $f_{q}=1$, and the rest zero, in which case (2) simplifies to $R=1 / q$. (However, in this paper we will be concerned almost exclusively with systematic IRA codes.)

In an iterative sum-product message-passing decoding algorithm, all messages are assumed to be log-likelihood ratios, i.e., of the form $m=\log (p(0) / p(1))$. The outgoing message from a variable node $u$ to a check node $v$ represents information about $u$, and a message from a check node $u$ to a variable node $v$ represents information about $u$. Initially, messages are sent from variable nodes which represent transmitted symbols.

The outgoing message from a node $u$ to a node $v$ depends on the incoming messages from all neighbors $w$ of $u$ except $v$. If $u$ is a variable message node, this outgoing message is

$$
\begin{equation*}
m(u \rightarrow v)=\sum_{w \neq v} m(w \rightarrow u)+m_{0}(u) \tag{4}
\end{equation*}
$$

where $m_{0}(u)$ is the log-likelihood message associated with $u$. (If $u$ is not a codeword node, this term is absent.) If $u$ is a check node the corresponding formula is [10]

$$
\begin{equation*}
\tanh \frac{m(u \rightarrow v)}{2}=\prod_{w \neq v} \tanh \frac{m(w \rightarrow u)}{2} \tag{5}
\end{equation*}
$$

## 3. IRA CODES ON THE BINARY ERASURE CHANNEL

The sum-product algorithm defined in equations (4) and (5) simplifies considerably on the binary erasure channel (BEC). The BEC is a binary input channel with three output symbols, a 0 , a 1 and "erasure." The input symbol is received as an erasure with probability $p$ and is received correctly with probability $1-p$. It is important to note that no errors are ever made on this channel.

It is not difficult to see that the messages defined in (4) and (5) can assume only three values on the BEC, viz. $+\infty,-\infty$ or 0, corresponding to a variable value 0, 1, or "unknown." No errors can occur during the running of the algorithm; if a message is $\pm \infty$, the corresponding variable is guaranteed to be 0 or 1, respectively. The operations at the nodes in the graph given by eqns (4) and (5) can be stated much more simply and intuitively in this case. At a variable node, the outgoing message is equal to any non-erasure incoming message, or an erasure if all incoming messages are erasures. At a check node, the outgoing message is an erasure if any incoming message is an erasure, and otherwise is the binary sum of all incoming messages.
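The simplified node operations on the BEC can be sketched as follows. The representation is an assumption made for illustration: known bits are the integers 0 and 1, an erasure is the hypothetical marker `ERASED`, and `incoming` holds every message entering the node except the one from the destination (for a variable node it also includes the channel value).

```python
ERASED = None  # hypothetical marker for an erasure ("unknown")

def variable_node_out(incoming):
    """Any non-erased incoming message, or an erasure if all are erased."""
    for m in incoming:
        if m is not ERASED:
            return m
    return ERASED

def check_node_out(incoming):
    """An erasure if any input is erased; otherwise the binary sum."""
    if any(m is ERASED for m in incoming):
        return ERASED
    return sum(incoming) % 2

print(check_node_out([1, 1, 0]))          # 0
print(variable_node_out([ERASED, 1]))     # 1
```

Because no errors occur on this channel, all non-erased inputs to a variable node agree, so returning the first one is sufficient.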

## Exhibit E

### 3.1. Notation

In this section and the next, it will be convenient to use a slightly different representation for an IRA code than the one used in Section 2. Firstly, we will begin with the assumption that the degrees of both the information nodes and the check nodes are nonconstant, though we will soon restrict attention to the "right-regular" case, in which the check nodes have constant degree.

Secondly, let $\lambda_{i}$ be the fraction of edges between the information and the check nodes that are adjacent to an information node of degree $i$, and let $\rho_{i}$ be the fraction of such edges that are adjacent to a check node of degree $i+2$ (i.e. one which is adjacent to $i$ information nodes). We will use these edge fractions $\lambda_{i}$ and $\rho_{i}$ to represent the IRA code rather than the corresponding node fractions. We define $\lambda(x)=\sum_{i} \lambda_{i} x^{i-1}$ and $\rho(x)=\sum_{i} \rho_{i} x^{i-1}$ to be the generating functions of these sequences. The pair $(\lambda, \rho)$ is called a degree distribution. It is quite easy to convert between the two representations. We demonstrate the conversion with the information node degrees. Let the $f_{i}$ 's be as defined in Section 2 and let $L(x)=\sum_{i} f_{i} x^{i}$. Then we have

$$
\begin{align*}
f_{i} & =\frac{\lambda_{i} / i}{\sum_{j} \lambda_{j} / j}  \tag{6}\\
L(x) & =\int_{0}^{x} \lambda(t) d t / \int_{0}^{1} \lambda(t) d t \tag{7}
\end{align*}
$$

The rate of the systematic IRA code (we shall be dealing only with these) given by this degree distribution is given by

$$
\begin{equation*}
\text { Rate }=\left(1+\frac{\sum_{j} \rho_{j} / j}{\sum_{j} \lambda_{j} / j}\right)^{-1} \tag{8}
\end{equation*}
$$

(This is an easy exercise. For a proof, see [8].)
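A small numerical sketch of eq. (6), together with a cross-check of eq. (8) against eq. (3), is given below. The edge fractions are hypothetical, the check nodes are assumed to all be adjacent to exactly $a$ information nodes (so $\sum_{j} \rho_{j} / j=1 / a$), and the function names are chosen only for this example.

```python
def node_fractions(lam):
    """Eq. (6): f_i = (lam_i / i) / sum_j (lam_j / j)."""
    norm = sum(frac / deg for deg, frac in lam.items())
    return {deg: (frac / deg) / norm for deg, frac in lam.items()}

def rate_eq3(f, a):
    """Eq. (3): R_sys = a / (a + sum_i i * f_i)."""
    return a / (a + sum(deg * frac for deg, frac in f.items()))

def rate_eq8(lam, a):
    """Eq. (8) with rho(x) = x^(a-1), i.e. sum_j rho_j / j = 1 / a."""
    s_lambda = sum(frac / deg for deg, frac in lam.items())
    return 1.0 / (1.0 + (1.0 / a) / s_lambda)

lam = {2: 0.25, 3: 0.75}   # hypothetical edge fractions
f = node_fractions(lam)    # f[2] = 1/3 and f[3] = 2/3, up to rounding
print(rate_eq3(f, 2), rate_eq8(lam, 2))  # both approximately 3/7
```

The two rate expressions agree because $\sum_{i} i f_{i}=1 / \sum_{j}\left(\lambda_{j} / j\right)$, which follows from eq. (6) and $\sum_{i} \lambda_{i}=1$.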

### 3.2. Fixed point analysis of iterative decoding

In [2], it was shown that if for a code ensemble, the probability of the depth-$l$ neighborhood of an edge (in the Tanner graph) being cycle-free goes to 1 as the length of the code goes to infinity (we will call this condition the cycle-free condition), then density evolution gives an accurate estimate of the bit error rate after $l$ iterations, again as the length of the codes goes to infinity. In density evolution, we evolve the probability density of the messages being passed according to the operations being performed on them, assuming that all incoming messages are independent (which is true if the depth-$l$ neighborhood is tree-like). The cycle-free condition does indeed hold for IRA codes. The proof of this fact is almost exactly the same as in the irregular LDPC codes case, which was done in [2].

Now, in the case of the erasure channel, we have seen that the messages are only of three types, so in effect we have a discrete density function, and the probability of error is merely the probability of erasure. With this in mind, we will now study the evolution of the erasure probability, and derive conditions which guarantee that it goes to zero as the number of iterations goes to infinity. Under these conditions iterative decoding will be successful in the sense of [2], i.e., it will achieve arbitrarily small BERs, given enough iterations and long enough codes.

Let $p$ be the channel probability of erasure. We will iterate the probability of erasure along the edges of the graph during the course of the algorithm. Let $x_{0}$ be the probability of erasure on an edge from an information node to a check node, $x_{1}$ the probability of erasure on an edge from a check node to a parity node, $x_{2}$ the probability of erasure on an edge from a parity node to a check node, and $x_{3}$ the probability of erasure on an edge from a check node to an information node. The initial probability of erasure on the message bits is $p$.

We now assume that we are at a fixed point of the decoding algorithm and solve for $x_{0}$. We get the following equations:

$$
\begin{align*}
& x_{1}=1-\left(1-x_{2}\right) R\left(1-x_{0}\right)  \tag{9}\\
& x_{2}=p x_{1}  \tag{10}\\
& x_{3}=1-\left(1-x_{2}\right)^{2} \rho\left(1-x_{0}\right)  \tag{11}\\
& x_{0}=p \lambda\left(x_{3}\right) \tag{12}
\end{align*}
$$

where $R(x)$ is the polynomial in which the coefficient of $x^{i}$ denotes the fraction of check nodes of degree $i$. $R(x)$ is given by (cf. eq. (7))

$$
\begin{equation*}
R(x)=\frac{\int_{0}^{x} \rho(t) d t}{\int_{0}^{1} \rho(t) d t} \tag{13}
\end{equation*}
$$

We eliminate $x_{1}$ from the first two of these equations to get $x_{2}$ in terms of $x_{0}$ and then keep substituting forwards to get an equation purely in $x_{0}$, henceforth denoted by $x$. We thereby obtain the following equation for a fixed point of iterative decoding:

$$
\begin{equation*}
p \lambda\left(1-\left[\frac{1-p}{1-p R(1-x)}\right]^{2} \rho(1-x)\right)=x \tag{14}
\end{equation*}
$$

If this equation has no solution in the interval $(0,1]$, then iterative decoding must converge to probability of erasure zero. Therefore, if we have

$$
\begin{equation*}
p \lambda\left(1-\left[\frac{1-p}{1-p R(1-x)}\right]^{2} \rho(1-x)\right)<x, \quad \forall x \neq 0 \tag{15}
\end{equation*}
$$

then in the sense of [2], iterative decoding is successful.

### 3.3. Capacity-achieving sequences of degree distributions

We will now derive sequences of degree distributions that can be shown to achieve channel capacity. First, we restrict attention to the case $\rho(x)=x^{a-1}$ for some $a \geq 1$, since it turns out that we can achieve capacity even with this restriction. In this case, $R(x)=x^{a}$, and the condition for convergence to zero BER now becomes

$$
\begin{equation*}
p \lambda\left(1-\left[\frac{1-p}{1-p(1-x)^{a}}\right]^{2}(1-x)^{a-1}\right)<x, \quad \forall x \neq 0 \tag{16}
\end{equation*}
$$
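The right-regular convergence condition can be probed numerically. The sketch below evaluates it on a finite grid in $(0,1]$, which is a sanity check rather than a proof; the degree distribution used (all information nodes of degree 3) and all function names are hypothetical.

```python
def f_p(x, p, a):
    """Argument of lambda in the right-regular convergence condition."""
    ratio = (1.0 - p) / (1.0 - p * (1.0 - x) ** a)
    return 1.0 - ratio ** 2 * (1.0 - x) ** (a - 1)

def lam_poly(lam, x):
    """lambda(x) = sum_i lam_i x^(i-1), with lam given as {degree: fraction}."""
    return sum(frac * x ** (deg - 1) for deg, frac in lam.items())

def condition_holds(lam, p, a, grid=2000):
    """Check p * lambda(f_p(x)) < x at the grid points k/grid, k = 1..grid."""
    return all(
        p * lam_poly(lam, f_p(k / grid, p, a)) < k / grid
        for k in range(1, grid + 1)
    )

lam = {3: 1.0}  # hypothetical: lambda(x) = x^2
print(condition_holds(lam, 0.1, 3))  # True
print(condition_holds(lam, 0.9, 3))  # False
```

With $a=3$ this hypothetical code has rate $1/2$, so the condition cannot hold at $p=0.9$, where the capacity of the BEC is only $0.1$.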
We now make the following new definitions

$$
\begin{align*}
f_{p}(x) & \triangleq 1-\left[\frac{1-p}{1-p(1-x)^{a}}\right]^{2}(1-x)^{a-1}  \tag{17}\\
h_{p}(x) & \triangleq 1-\left[\frac{1-p}{1-p(1-x)^{a}}\right]^{2}(1-x)^{a}  \tag{18}\\
g_{p}(x) & \triangleq h_{p}^{-1}(x) \tag{19}
\end{align*}
$$

Notice that $f_{p}(x), h_{p}(x)$ and $g_{p}(x)$ are all monotonic functions in $[0,1]$ and attain the values 0 at 0 and 1 at 1 . In addition, $h_{p}(x)$ can be inverted by hand (by making the substitution $(1-x)^{a}=y$ ) and it can be shown that $g_{p}(x)$ has a power series expansion around 0 with non-negative coefficients. Let this expansion be $g_{p}(x)=\sum_{i} g_{p, i} x^{i}$.

The condition (16) can now be rewritten as

$$
\begin{equation*}
p \lambda\left(f_{p}(x)\right)<x, \quad \forall x \neq 0 \tag{20}
\end{equation*}
$$

which can be rewritten as

$$
\begin{equation*}
\lambda(x)<\frac{f_{p}^{-1}(x)}{p} \tag{21}
\end{equation*}
$$

We make the following choice of $\lambda(x)$ :

$$
\begin{equation*}
\lambda(x)=\frac{1}{p}\left(\sum_{i=1}^{N-1} g_{p, i} x^{i}+\epsilon x^{N}\right) \tag{22}
\end{equation*}
$$

where $0<\epsilon<g_{p, N}$ and $\sum_{i=1}^{N-1} g_{p, i}+\epsilon=p$. Such a choice of $N$ and $\epsilon$ exists and is unique since the $g_{p, i}$ 's are non-negative and $\sum_{i=1}^{\infty} g_{p, i}=g_{p}(1)=1$. For this choice of $\lambda(x)$, we have

$$
\begin{equation*}
p \lambda(x)<g_{p}(x)=h_{p}^{-1}(x)<f_{p}^{-1}(x) \quad \forall x \neq 0 \tag{23}
\end{equation*}
$$

where the last inequality follows because $f_{p}(x)<$ $h_{p}(x) \quad \forall x \neq 0$.

Thus, the condition (21) for BER going to zero is satisfied and the degree distributions we have thus defined yield codes with thresholds that are greater than or equal to $p$. We now wish to compute the rate of these codes in the limit as $a \rightarrow \infty$ to show that they achieve channel capacity. The rate of the code is given by eq. (8) which simplifies to $\left(1+\left(a \sum_{i} \lambda_{i} / i\right)^{-1}\right)^{-1}$ in the right-regular case. Now,

$$
\begin{equation*}
\lim _{a \rightarrow \infty} a \sum_{i} \frac{\lambda_{i}}{i}=\lim _{a \rightarrow \infty} a\left(\sum_{i=1}^{N-1} \frac{g_{p, i}}{i}+\frac{\epsilon}{N}\right) \tag{24}
\end{equation*}
$$

We also have

$$
\begin{equation*}
\lim _{a \rightarrow \infty} a \sum_{i=N}^{\infty} \frac{g_{p, i}}{i} \leq \lim _{a \rightarrow \infty} \frac{a}{N} \sum_{i=N}^{\infty} g_{p, i} \leq \lim _{a \rightarrow \infty} \frac{a}{N}=0 \tag{25}
\end{equation*}
$$

where the last equality is a property of the function $g_{p}(x)$ and is also proved by manual inversion of $h_{p}(x)$. We therefore have

$$
\begin{aligned}
\lim _{a \rightarrow \infty} a \sum_{i} \frac{\lambda_{i}}{i} & =\lim _{a \rightarrow \infty} a \sum_{i=1}^{\infty} \frac{g_{p, i}}{i} \\
& =\lim _{a \rightarrow \infty} a \int_{0}^{1} g_{p}(x) d x \\
& =a\left(1-\int_{0}^{1} h_{p}(x) d x\right) \\
& =a \int_{0}^{1}\left(\frac{1-p}{1-p x^{a}}\right)^{2} x^{a} d x
\end{aligned}
$$

The integrand on the right can be expanded in a power series with non-negative coefficients, with the first non-zero coefficient being that of $x^{a}$. Keeping in mind that we are integrating this power series, it is easy to see that

$$
\begin{align*}
& \frac{a}{a+1} \int_{0}^{1}\left(\frac{1-p}{1-p x^{a}}\right)^{2} x^{a-1} d x \\
< & 1-\int_{0}^{1} h_{p}(x) d x  \tag{26}\\
< & \int_{0}^{1}\left(\frac{1-p}{1-p x^{a}}\right)^{2} x^{a-1} d x
\end{align*}
$$

Both bounds in the above equation can be computed easily and both tend to $(1-p) / p$ in the limit of large $a$. Plugging this result into the formula for the rate, we finally get that the rate tends to $1-p$ in the limit of large $a$, which is indeed the capacity of the BEC.

Thus the sequence of degree distributions given in eq. (22) does indeed achieve channel capacity.

### 3.4. Some numerical results

We have seen that the condition for BER going to zero at a channel erasure probability of $p$ is $p \lambda(x)<f_{p}^{-1}(x) \forall x \neq 0$. We later enforced a stronger condition, namely $p \lambda(x)<h_{p}^{-1}(x)=g_{p}(x) \forall x \neq 0$, and derived capacity-achieving degree sequences satisfying this condition. The reason we needed to enforce the stronger condition was that $h_{p}^{-1}(x)=g_{p}(x)$ has non-negative power-series coefficients, while the same cannot be said for $f_{p}^{-1}(x)$. However, from (26) we see that enforcing this stronger condition costs us a factor of $1-a /(a+1)=1 /(a+1)$ in the rate, which is very large for values of $a$ that are of interest, and therefore the resulting codes are not very good.

If, however, $f_{p}^{-1}(x)$ were to have non-negative power series coefficients, then we could use it to define a degree distribution and we would no longer lose this factor of $1 /(a+1)$. We have found through direct numerical computation in all cases that we tried, that enough terms in the beginning of this power series are non-negative to enable us to define $\lambda(x)$ by an equation analogous to eq. (22), replacing $g_{p}(x)$ by $f_{p}^{-1}(x)$. Of course, the resulting code is not theoretically guaranteed to have a threshold $\geq p$, but numerical computation shows that the threshold is either equal to or very marginally less than $p$.

This design turns out to yield very powerful codes, in particular codes whose performance is in every way comparable to the irregular LDPC codes listed in [8] as far as decoding performance is concerned. The performance of some of these distributions is listed in Table 1. The threshold values $p$ are the same as those in [8] for corresponding values of $a$ (IRA codes with right degree $a+2$ should be compared to irregular LDPC codes with right degree $a$, so that the decoding complexity is about the same), so as to make comparison easy. The codes listed in [8] were shown to have certain optimality properties with respect to the tradeoff between $1-\delta /(1-R)$ (distance from capacity) and $a$ (decoding complexity), so it is very heartening to note that the codes we have designed are comparable to these.

We end this section with a brief discussion of the case $a=1$. In this case, it turns out that $f_{p}^{-1}(x)$ does indeed have non-negative power-series coefficients. The resulting degree sequences yield codes that are better than conventional RA codes at small rates. An entirely similar exercise can be carried out for the case of non-systematic RA codes with $a=1$ and the codes resulting in this case are significantly better than conventional RA codes for most rates. However, non-systematic RA codes turn out to be useless for higher values of $a$, as can be seen by manually following the decoding algorithm for one iteration, which shows that decoding does not proceed at all. For this reason all the preceding analysis was performed for systematic RA codes.

Table 1: Performance of some codes designed using the procedure described in Section 3.4, at rates close to $2 / 3$ and $1 / 2$. $\delta$ is the code threshold (maximum allowable value of $p$), $N$ the number of terms in $\lambda(x)$, and $R$ the rate of the code.

| $a$ | $\delta$ | $N$ | $1-R$ | $\delta /(1-R)$ |
| :---: | :---: | :---: | :---: | :---: |
| 4 | 0.20000 | 1 | 0.333333 | 0.6000 |
| 5 | 0.23611 | 3 | 0.317101 | 0.7448 |
| 6 | 0.28994 | 6 | 0.329412 | 0.8802 |
| 7 | 0.31551 | 11 | 0.336876 | 0.9366 |
| 8 | 0.32024 | 16 | 0.333850 | 0.9592 |
| 9 | 0.32558 | 26 | 0.334074 | 0.9744 |
| 4 | 0.48090 | 13 | 0.502141 | 0.9577 |
| 5 | 0.49287 | 28 | 0.502225 | 0.9814 |
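As a quick sanity check on Table 1, the last column can be recomputed from the $\delta$ and $1-R$ columns. The sketch below simply transcribes the table rows; the variable names are illustrative. The recomputed ratios agree with the listed ones to within a few units in the fourth decimal place, which is presumably a rounding effect in the printed inputs.

```python
# Rows of Table 1: (a, delta, N, 1 - R, listed delta / (1 - R)).
rows = [
    (4, 0.20000, 1, 0.333333, 0.6000),
    (5, 0.23611, 3, 0.317101, 0.7448),
    (6, 0.28994, 6, 0.329412, 0.8802),
    (7, 0.31551, 11, 0.336876, 0.9366),
    (8, 0.32024, 16, 0.333850, 0.9592),
    (9, 0.32558, 26, 0.334074, 0.9744),
    (4, 0.48090, 13, 0.502141, 0.9577),
    (5, 0.49287, 28, 0.502225, 0.9814),
]

# On the BEC the capacity at erasure probability delta is 1 - delta, so a
# ratio delta / (1 - R) close to 1 means the code operates close to capacity.
ratios = [delta / one_minus_r for _, delta, _, one_minus_r, _ in rows]
for (a, delta, n_terms, one_minus_r, listed), ratio in zip(rows, ratios):
    print(f"a={a}  N={n_terms:2d}  recomputed={ratio:.4f}  listed={listed:.4f}")
```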

## 4. IRA CODES ON THE AWGN CHANNEL

In this section, we will consider the behavior of IRA codes on the AWGN channel. Here there are only two possible inputs, 0 and 1, but the output alphabet is the set of real numbers: if $x$ is the input, then the output is $y=(-1)^{x}+z$, where $z$ is a mean zero, variance $\sigma^{2}$ Gaussian random variable. For a given noise variance $\sigma^{2}$, our objective will be to find a left degree sequence $\lambda(x)$ such that the ensemble message error probability approaches zero, while the rate is as large as possible. Unlike the BEC, where we deal only with probabilities, in the case of the AWGN we must deal with probability densities. This complicates the analysis, and forces us to resort to approximate design methods.

### 4.1. Gaussian Approximation

Wiberg [9] has shown that the messages passed in iterative decoding on the AWGN channel can be well approximated by Gaussian random variables, provided the messages are in log-likelihood ratio form. In [6], this approximation was used to design good LDPC codes for the AWGN channel.

In this subsection, we use this Gaussian approximation to design good IRA codes for the AWGN channel. Specifically, we approximate the messages from check nodes to variable nodes (both information and parity) as Gaussian at every iteration. For a variable node, if all the incoming messages are Gaussian, then all the outgoing messages are also Gaussian because of (4). A Gaussian distribution $f(x)$ is called consistent [5] if $f(x)=f(-x) e^{x}$ for all $x \leq 0$. The consistency condition implies that the mean and variance satisfy $\sigma^{2}=2 \mu$. For the sum-product algorithm, it has been shown [2] that consistency is preserved at message updates of both the variable and check nodes. Thus if we assume Gaussian messages, and require consistency, we only need to keep track of the means. To this end, we define a consistent Gaussian density with mean $\mu$ to be

$$
\begin{equation*}
G_{\mu}(z)=\frac{1}{\sqrt{4 \pi \mu}} e^{-(z-\mu)^{2} / 4 \mu} \tag{27}
\end{equation*}
$$

The expected value of $\tanh \frac{z}{2}$ for a consistent Gaussian distributed random variable $z$ with mean $\mu$ is then

$$
\begin{equation*}
E\left[\tanh \frac{z}{2}\right]=\int_{-\infty}^{+\infty} G_{\mu}(z) \tanh \frac{z}{2} d z \triangleq \phi(\mu) \tag{28}
\end{equation*}
$$
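The expectation in (28) has no simple closed form, but it is straightforward to evaluate numerically. The following is a minimal sketch using the trapezoidal rule and a bisection-based inverse; the function names, the integration window, and the step counts are assumptions chosen for illustration, not values taken from the paper.

```python
import math


def phi(mu, steps=2000, width=12.0):
    """Estimate E[tanh(z/2)] for z ~ N(mu, 2*mu), i.e. eq. (28)."""
    if mu <= 0.0:
        return 0.0  # degenerate density at z = 0, where tanh(0) = 0
    sigma = math.sqrt(2.0 * mu)           # consistency: variance = 2 * mean
    lo = mu - width * sigma
    h = 2.0 * width * sigma / steps
    norm = 1.0 / math.sqrt(4.0 * math.pi * mu)
    total = 0.0
    for k in range(steps + 1):
        z = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        density = norm * math.exp(-((z - mu) ** 2) / (4.0 * mu))  # eq. (27)
        total += weight * density * math.tanh(z / 2.0)
    return total * h


def phi_inv(y, hi=100.0, iters=60):
    """Invert phi on [0, hi] by bisection, relying on phi being increasing."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if phi(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(phi(1.0), phi(4.0), phi_inv(phi(2.0)))
```

Bisection is used for the inverse because it needs only monotonicity, at the cost of many evaluations of `phi`.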

It is easy to see that $\phi(u)$ is a monotonic increasing function of $u$; we denote its inverse function by $\phi^{(-1)}(y)$. Let $\mu_{L}^{(l)}$ and $\mu_{R}^{(l)}$ be the means of the message from check nodes to variable nodes on the left (i.e., information nodes) and on the right (i.e., parity nodes) at the $l$ th iteration. We want to obtain expressions for $\mu_{L}^{(l+1)}$ and $\mu_{R}^{(l+1)}$ in terms of $\mu_{L}^{(l)}$ and $\mu_{R}^{(l)}$. A message from a degree- $i$ information node to a check node at the $l$ th iteration, is Gaussian with mean $(i-1) \mu_{L}^{(l)}+\mu_{o}$, where $\mu_{o}$ is the mean of message $m_{o}$ in (4). Hence if $v_{L}$ denotes the message on a randomly selected edge from an information node to a check node, the density of $v_{L}$ is

$$
\begin{equation*}
\sum_{i=1}^{J} \lambda_{i} G_{(i-1) \mu_{L}^{(l)}+\mu_{o}}(z) \tag{29}
\end{equation*}
$$

From (29) and (28) we obtain:

$$
\begin{equation*}
E\left[\tanh \frac{v_{L}}{2}\right]=\sum_{i=1}^{J} \lambda_{i} \phi\left((i-1) \mu_{L}^{(l)}+\mu_{o}\right) \tag{30}
\end{equation*}
$$

Similarly, if $v_{R}$ denotes the message on a randomly selected edge from a parity node to a check node,

$$
\begin{equation*}
E\left[\tanh \frac{v_{R}}{2}\right]=\phi\left(\mu_{R}^{(l)}+\mu_{o}\right) . \tag{31}
\end{equation*}
$$

Because of (5) we have

$$
\begin{equation*}
E\left[\tanh \frac{m(u \rightarrow v)}{2}\right]=\prod_{w \neq v} E\left[\tanh \frac{m(w \rightarrow u)}{2}\right] . \tag{32}
\end{equation*}
$$

Denote a message from a check node to an information node, resp. parity node, by $u_{L}$, resp. $u_{R}$. Replacing $E\left[\tanh \frac{m(w \rightarrow u)}{2}\right]$ with the right side of (30) or (31) depending upon whether the message comes from the left or right, (32) implies:

$$
\begin{gathered}
E\left[\tanh \frac{u_{L}}{2}\right]=E\left[\tanh \frac{v_{L}}{2}\right]^{a-1} E\left[\tanh \frac{v_{R}}{2}\right]^{2} \\
=\left(\sum_{i=1}^{J} \lambda_{i} \phi\left((i-1) \mu_{L}^{(l)}+\mu_{o}\right)\right)^{a-1}\left(\phi\left(\mu_{R}^{(l)}+\mu_{o}\right)\right)^{2},
\end{gathered}
$$

$$
\begin{aligned}
& E\left[\tanh \frac{u_{R}}{2}\right]=E\left[\tanh \frac{v_{L}}{2}\right]^{a} E\left[\tanh \frac{v_{R}}{2}\right] \\
= & \left(\sum_{i=1}^{J} \lambda_{i} \phi\left((i-1) \mu_{L}^{(l)}+\mu_{o}\right)\right)^{a} \phi\left(\mu_{R}^{(l)}+\mu_{o}\right) .
\end{aligned}
$$

Using the definition of $\phi(\mu)$ in (28), we thus have the following recursion for $\mu_{L}^{(l)}$ and $\mu_{R}^{(l)}$ :

$$
\begin{align*}
\phi\left(\mu_{L}^{(l+1)}\right)= & \left(\sum_{i=1}^{J} \lambda_{i} \phi\left((i-1) \mu_{L}^{(l)}+\mu_{o}\right)\right)^{a-1} \times \\
& \left(\phi\left(\mu_{R}^{(l)}+\mu_{o}\right)\right)^{2},  \tag{33}\\
\phi\left(\mu_{R}^{(l+1)}\right)= & \left(\sum_{i=1}^{J} \lambda_{i} \phi\left((i-1) \mu_{L}^{(l)}+\mu_{o}\right)\right)^{a} \times \\
& \phi\left(\mu_{R}^{(l)}+\mu_{o}\right) . \tag{34}
\end{align*}
$$
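The recursion (33)-(34) can be iterated numerically. The sketch below is illustrative only. It assumes that the channel message mean is $\mu_{o}=2 / \sigma^{2}$ (the standard log-likelihood-ratio mean for antipodal signalling in Gaussian noise, which is not stated explicitly in this section), it evaluates $\phi$ by a simple trapezoidal rule, and it uses the $a=2$ degree sequence from Table 2 at a noise level of $\sigma=1.0$ chosen arbitrarily for the demonstration. All function names are hypothetical.

```python
import math


def phi(mu, steps=400, width=10.0):
    """Trapezoidal estimate of E[tanh(z/2)], z ~ N(mu, 2*mu), as in (28)."""
    if mu <= 0.0:
        return 0.0
    sigma = math.sqrt(2.0 * mu)
    lo = mu - width * sigma
    h = 2.0 * width * sigma / steps
    norm = 1.0 / math.sqrt(4.0 * math.pi * mu)
    total = 0.0
    for k in range(steps + 1):
        z = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * norm * math.exp(-((z - mu) ** 2) / (4.0 * mu)) * math.tanh(z / 2.0)
    return total * h


def phi_inv(y, hi=200.0, iters=50):
    """Bisection inverse of phi on [0, hi]; results near hi only signal saturation."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if phi(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ga_step(mu_l, mu_r, lam, a, mu_o):
    """One update of (33) and (34); lam maps degree i -> lambda_i."""
    x = sum(w * phi((i - 1) * mu_l + mu_o) for i, w in lam.items())
    r = phi(mu_r + mu_o)
    return phi_inv(x ** (a - 1) * r ** 2), phi_inv(x ** a * r)


a = 2
lam = {2: 0.139025, 3: 0.222155, 6: 0.638820}   # Table 2, column a = 2
sigma = 1.0                                      # illustrative noise level
mu_o = 2.0 / sigma ** 2                          # assumed channel LLR mean

history = [(0.0, 0.0)]                           # check-node outputs start at zero
for _ in range(10):
    history.append(ga_step(*history[-1], lam, a, mu_o))
for step, (mu_l, mu_r) in enumerate(history):
    print(step, round(mu_l, 4), round(mu_r, 4))
```

Starting from zero means, the first update reduces to $\phi(\mu_{L}^{(1)})=\phi(\mu_{o})^{a+1}$ because the $\lambda_{i}$ sum to 1, which gives a convenient check on an implementation.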

In order to have arbitrarily small bit error probability, the means $\mu_{L}^{(l)}$ and $\mu_{R}^{(l)}$ should approach infinity as $l$ approaches infinity. In the next subsection, we derive a sufficient condition for this.

### 4.2. Fixed point analysis

We now assume that iterative decoding has reached a fixed point of (33) and (34), i.e., $\mu_{L}^{(l+1)}=\mu_{L}^{(l)}=\mu_{L}$ and $\mu_{R}^{(l+1)}=\mu_{R}^{(l)}=\mu_{R}$. Denote $\sum_{i=1}^{J} \lambda_{i} \phi\left((i-1) \mu_{L}+\mu_{o}\right)$ by $x$. From (30) we can see that $0<x<1$ and $x \rightarrow 1$ if and only if $\mu_{L} \rightarrow \infty$. From (34) it is easy to show that $\mu_{R}$ is a function of $x$, denoted by $f$, i.e., $\mu_{R}=f(x)$. Then, dividing (33) by the square of (34) gives us:

$$
\begin{equation*}
\phi\left(\mu_{L}\right)=\phi^{2}\left(\mu_{R}\right) / x^{a+1}=\phi^{2}(f(x)) / x^{a+1} \tag{35}
\end{equation*}
$$

Now replacing $\mu_{L}$ with $\phi^{(-1)}\left(\phi^{2}(f(x)) / x^{a+1}\right)$ into the definition of $x$, we obtain the following equation for the fixed point $x$ :

$$
\begin{equation*}
x=\sum_{i=1}^{J} \lambda_{i} \phi\left(\mu_{o}+(i-1) \phi^{(-1)}\left(\frac{\phi^{2}(f(x))}{x^{a+1}}\right)\right) \tag{36}
\end{equation*}
$$

If this equation doesn't have a solution in the interval $[0,1]$, then the decoding bit error probability converges to zero. Therefore, if we have

$$
\begin{equation*}
F(x) \triangleq \sum_{i=1}^{J} \lambda_{i} \phi\left(\mu_{o}+(i-1) \phi^{(-1)}\left(\frac{\phi^{2}(f(x))}{x^{a+1}}\right)\right)>x \tag{37}
\end{equation*}
$$

for any $x \in\left[x_{0}, 1\right)$, where $x_{0}$ is the value of $x$ at the first iteration, then (the Gaussian approximation to) iterative decoding is successful.

Since the rate of the code is given by (cf. (8)):

$$
\begin{equation*}
\frac{\sum_{i} \lambda_{i} / i}{1 / a+\sum_{i} \lambda_{i} / i} \tag{38}
\end{equation*}
$$

to maximize the rate, we should maximize $\sum_{i} \lambda_{i} / i$. Thus, under the Gaussian approximation, the problem of finding a good degree sequence for IRA codes is converted to the following linear programming problem:

Linear Programming Problem. Maximize

$$
\begin{equation*}
\sum_{i=1}^{J} \lambda_{i} / i \tag{39}
\end{equation*}
$$

under the condition

$$
\begin{equation*}
F(x)>x, \quad \forall x \in\left[x_{0}, 1\right] . \tag{40}
\end{equation*}
$$

We have designed some degree sequences for IRA codes using this linear programming methodology. The results are presented in Tables 2 (code rate $\approx 1 / 3$) and 3 (code rate $\approx 1 / 2$). After using the heuristic Gaussian approximation method to design the degree sequences, we used exact density evolution to determine the actual noise threshold. (In every case, the true iterative decoding threshold was better than the one predicted by the Gaussian approximation.)

| $a$ | 2 | 3 | 4 |
| :---: | :---: | :---: | :---: |
| $\lambda_{2}$ | 0.139025 | 0.078194 | 0.054485 |
| $\lambda_{3}$ | 0.222155 | 0.128085 | 0.104315 |
| $\lambda_{5}$ |  | 0.160813 |  |
| $\lambda_{6}$ | 0.638820 | 0.036178 | 0.126755 |
| $\lambda_{10}$ |  |  | 0.229816 |
| $\lambda_{11}$ |  |  | 0.016484 |
| $\lambda_{12}$ |  | 0.108828 |  |
| $\lambda_{13}$ |  | 0.487902 |  |
| $\lambda_{14}$ |  |  |  |
| $\lambda_{16}$ |  |  |  |
| $\lambda_{27}$ |  |  | 0.450302 |
| $\lambda_{28}$ |  |  | 0.017842 |
| rate | 0.333364 | 0.333223 | 0.333218 |
| $\sigma_{G A}$ | 1.1840 | 1.2415 | 1.2615 |
| $\sigma^{*}$ | 1.1981 | 1.2607 | 1.2780 |
| $\left(\frac{E_{b}}{N_{0}}\right)^{*}(d B)$ | 0.190 | -0.250 | -0.371 |
| S.L. $(d B)$ | -0.4953 | -0.4958 | -0.4958 |

Table 2: Good degree sequences yielding codes of rate approximately $1 / 3$ for the AWGN channel and with $a=2,3,4$. For each sequence the Gaussian approximation noise threshold, the actual sum-product decoding threshold, and the corresponding $\left(\frac{E_{b}}{N_{0}}\right)^{*}$ in dB are given. Also listed is the Shannon limit (S.L.).

For example, consider the "$a=3$" column in Table 2. We adjust the Gaussian approximation noise threshold $\sigma_{G A}$ to 1.2415 so that the returned optimal sequence has rate 0.333223. Applying the exact density evolution program to this code, we obtain the actual sum-product decoding threshold $\sigma^{*}=1.2607$, which corresponds to $E_{b} / N_{0}=-0.250$ dB. This should be compared to the Shannon limit for the ensemble of all linear codes of the same rate, which is $-0.4958$ dB. As we increase the parameter $a$, the ensemble improves. For $a=4$, the best code we have found has iterative decoding threshold $E_{b} / N_{0}=-0.371 \mathrm{~dB}$, which is only 0.12 dB above the Shannon limit.
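The rate and $E_{b} / N_{0}$ entries of Table 2 can be reproduced from the listed degree sequences. The sketch below uses the rate expression (38) directly. The conversion from $\sigma^{*}$ to $E_{b} / N_{0}$ is an assumption not spelled out in the text: it takes unit-energy antipodal signalling with noise variance $\sigma^{2}=N_{0} / 2$, so that $E_{b} / N_{0}=1 /\left(2 R \sigma^{2}\right)$. Function and variable names are illustrative.

```python
import math


def ira_rate(lam, a):
    """Rate from eq. (38); lam maps degree i -> lambda_i."""
    s = sum(w / i for i, w in lam.items())
    return s / (1.0 / a + s)


def ebn0_db(sigma, rate):
    """Eb/N0 in dB, assuming unit-energy BPSK and sigma^2 = N0 / 2."""
    return 10.0 * math.log10(1.0 / (2.0 * rate * sigma ** 2))


# Degree sequences and exact thresholds transcribed from Table 2.
table2 = {
    2: {2: 0.139025, 3: 0.222155, 6: 0.638820},
    3: {2: 0.078194, 3: 0.128085, 5: 0.160813, 6: 0.036178,
        12: 0.108828, 13: 0.487902},
    4: {2: 0.054485, 3: 0.104315, 6: 0.126755, 10: 0.229816,
        11: 0.016484, 27: 0.450302, 28: 0.017842},
}
sigma_star = {2: 1.1981, 3: 1.2607, 4: 1.2780}

for a, lam in table2.items():
    rate = ira_rate(lam, a)
    print(f"a={a}  rate={rate:.6f}  Eb/N0={ebn0_db(sigma_star[a], rate):.3f} dB")
```

Under that assumption the recomputed thresholds match the tabulated values to within a few thousandths of a dB.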

The above analysis is for bit error probability. In order to have zero word error probability, it is necessary to have $\lambda_{2}=0$. (This can be proved by the following argument: if $\lambda_{2}>0$, then in the ensemble, as $n \rightarrow \infty$, the average number of weight 2 codewords is bounded away from zero. Hence even a maximum-likelihood decoder would have non-zero decoding error probability.) In Table 3, we compare the noise thresholds of codes with and without $\lambda_{2}=0$.

| $a$ | 8 | 8 |
| :---: | :---: | :---: |
| $\lambda_{2}$ |  | 0.0577128 |
| $\lambda_{3}$ | 0.252744 | 0.117057 |
| $\lambda_{7}$ |  | 0.2189922 |
| $\lambda_{8}$ |  | 0.0333844 |
| $\lambda_{11}$ | 0.081476 |  |
| $\lambda_{12}$ | 0.327162 |  |
| $\lambda_{18}$ |  | 0.2147221 |
| $\lambda_{20}$ |  | 0.0752259 |
| $\lambda_{46}$ | 0.184589 |  |
| $\lambda_{48}$ | 0.154029 |  |
| $\lambda_{55}$ |  | 0.0808676 |
| $\lambda_{58}$ |  | 0.202038 |
| rate | 0.50227 | 0.497946 |
| $\sigma^{*}$ | 0.9589 | 0.972 |
| $\left(\frac{E_{b}}{N_{0}}\right)^{*}(d B)$ | 0.344 | 0.266 |
| Shannon limit | 0.197 | 0.178 |

Table 3: Two degree sequences yielding codes of rate $\approx 1 / 2$ with $a=8$. For each sequence, the actual sum-product decoding threshold, and the corresponding $\left(\frac{E_{b}}{N_{0}}\right)^{*}$ in dB are given. Also listed is the Shannon limit.

We chose rate one-half because we wanted to compare our results with the best irregular LDPC codes obtained in [5]. Our best IRA code has threshold 0.266 dB, while the best rate one-half irregular LDPC code found in [5] has threshold 0.25 dB. These two codes have roughly the same decoding complexity, but unlike LDPC codes, IRA codes have a simple linear encoding algorithm.


### 4.3. Simulation Results

We simulated the rate one-half code with $\lambda_{2}=0$ in Table 3. Figure 2 shows the performance of that particular code, with information block lengths $10^{3}, 10^{4}$, and $10^{5}$. For comparison, we also show the performance of the best known rate $1 / 2$ turbo code for the same block length.


Figure 2: Comparison between turbo codes (dashed curves) and IRA codes (solid curves) of lengths $n=10^{3}, 10^{4}, 10^{5}$. All codes are of rate one-half.

## 5. CONCLUSIONS

We have introduced a class of codes, the IRA codes, that combines many of the favorable attributes of turbo codes and LDPC codes. Like turbo codes (and unlike LDPC codes), they can be encoded in linear time. Like LDPC codes (and unlike turbo codes), they are amenable to an exact Richardson-Urbanke style analysis. In simulated performance they appear to be slightly superior to turbo codes of comparable complexity, and just as good as the best known irregular LDPC codes. In our opinion, the important open problem is to prove (or disprove) that IRA codes can be decoded reliably in linear time at rates arbitrarily close to channel capacity. We know this to be true for the binary erasure channel, but for no other channel model. If this should turn out to be true, we would argue that IRA codes definitively solve the problem posed implicitly by Shannon in 1948. If it is not true, then researchers should search for an even better class of code ensembles.

## REFERENCES

[1] D. Divsalar, H. Jin, and R. J. McEliece, "Coding theorems for 'turbo-like' codes," pp. 201-210 in Proc. 36th Allerton Conf. on Communication, Control, and Computing (Allerton, Illinois, Sept. 1998).
[2] T. J. Richardson and R. Urbanke, "The capacity of low-density parity-check codes under message passing decoding," submitted to IEEE Trans. Inform. Theory.
[3] M. Luby, M. Mitzenmacher, A. Shokrollahi, D. Spielman, and V. Stemann, "Practical loss-resilient codes," Proc. 29th ACM Symp. on the Theory of Computing (1997), pp. 150-159.
[4] M. Luby, M. Mitzenmacher, A. Shokrollahi, and D. Spielman, "Analysis of low-density codes and improved designs using irregular graphs," Proc. 30th ACM Symp. on the Theory of Computing (1998), pp. 249-258.
[5] T. J. Richardson, A. Shokrollahi, and R. Urbanke, "Design of provably good low-density parity-check codes," submitted to IEEE Trans. Inform. Theory.
[6] S.-Y. Chung, R. Urbanke, and T. J. Richardson, "Analysis of sum-product decoding of low-density parity-check codes using a Gaussian approximation," submitted to IEEE Trans. Inform. Theory.
[7] D. Divsalar, S. Dolinar, and F. Pollara, "Iterative turbo decoder analysis based on Gaussian density evolution," submitted to IEEE J. Selected Areas in Comm.
[8] M. A. Shokrollahi, "New sequences of linear time erasure codes approaching channel capacity," Proc. 1999 ISITA (Honolulu, Hawaii, November 1999), pp. 65-76.
[9] N. Wiberg, "Codes and decoding on general graphs," dissertation no. 440, Linköping Studies in Science and Technology, Linköping, Sweden, 1996.
[10] J. Hagenauer, E. Offer, and L. Papke, "Iterative decoding of binary block and convolutional codes," IEEE Trans. Inform. Theory, vol. IT-42, no. 2 (March 1996), pp. 429-445.


# Design Methods for Irregular Repeat-Accumulate Codes 

Aline Roumy, Member, IEEE, Souad Guemghar, Student Member, IEEE, Giuseppe Caire, Senior Member, IEEE, and Sergio Verdú, Fellow, IEEE


#### Abstract

We optimize the random-like ensemble of irregular repeat-accumulate (IRA) codes for binary-input symmetric channels in the large block-length limit. Our optimization technique is based on approximating the evolution of the densities (DE) of the messages exchanged by the belief-propagation (BP) message-passing decoder by a one-dimensional dynamical system. In this way, the code ensemble optimization can be solved by linear programming. We propose four such DE approximation methods, and compare the performance of the obtained code ensembles over the binary-symmetric channel (BSC) and the binary-antipodal input additive white Gaussian noise channel (BIAWGNC). Our results clearly identify the best among the proposed methods and show that the IRA codes obtained by these methods are competitive with respect to the best known irregular low-density parity-check (LDPC) codes. In view of this and the very simple encoding structure of IRA codes, they emerge as attractive design choices.


Index Terms: Belief propagation (BP), channel capacity, density evolution, low-density parity-check (LDPC) codes, stability, threshold, turbo codes.

## I. Introduction

Since the discovery of turbo codes [1], there have been several notable inventions in the field of random-like codes, in particular the rediscovery of the low-density parity-check (LDPC) codes, originally proposed in [2], the introduction of irregular LDPCs [3], [4], and the introduction of the repeat-accumulate (RA) codes [5].

In [3], [4], irregular LDPCs were shown to asymptotically achieve the capacity of the binary erasure channel (BEC) under iterative message-passing decoding. Although the BEC is the only channel for which such a result currently exists, irregular LDPC codes have been designed for other binary-input channels (e.g., the binary-symmetric channel (BSC), the binary-antipodal input additive white Gaussian noise channel (BIAWGNC) [6], and the binary-input intersymbol interference (ISI) channel [7]-[9]) and have been shown to achieve very good performance.

First attempts to optimize irregular LDPC codes ([10] for the BEC and [11] for other channels) used the density evolution (DE) technique, which computes the expected performance for a random-like code ensemble in the limit of infinite code block length. In order to reduce the computational burden of ensemble optimization based on the DE, faster techniques have been proposed, based on the approximation of the DE by a one-dimensional dynamical system (recursion). These techniques are exact only for the BEC (for which DE is one-dimensional). The most popular techniques proposed so far are based on the Gaussian approximation (GA) of messages exchanged in the message-passing decoder. GA in addition to the symmetry condition of message densities implies that the Gaussian density of messages is expressed by a single parameter. Techniques differ in the parameter to be tracked and in the mapping functions defining the dynamical system [12]-[18].

The introduction of irregular LDPCs motivated other schemes such as irregular RA (IRA) [19], for which similar results exist (achievability of the BEC capacity) and irregular turbo codes [20]. IRA codes are, in fact, special subclasses of both irregular LDPCs and irregular turbo codes. In IRA codes, a fraction $f_{i}$ of information bits is repeated $i$ times, for $i=2,3, \ldots$. The distribution
$$
\left\{f_{i} \geq 0, i=2,3, \ldots: \sum_{i=2}^{\infty} f_{i}=1\right\}
$$
is referred to as the repetition profile, and it is kept as a degree of freedom in the optimization of the IRA ensemble. After the repetition stage, the resulting sequence is interleaved and input to a recursive finite-state machine (called accumulator) which outputs one bit for every $a$ input symbols, where $a$ is referred to as grouping factor and is also a design parameter.

IRA codes are an appealing choice because the encoder is extremely simple, their performance is quite competitive with that of turbo codes and LDPCs, and they can be decoded with a very-low-complexity iterative decoding scheme.

The only other work that has proposed a method to design IRA codes is [19], [21] where the design focuses on the choice of the grouping factor and the repetition profile. The recursive finite-state machine is the simplest one which gives full freedom to choose any rational number between 0 and 1 as the coding rate. We will also restrict our study to IRAs that use the same simple recursion of [19], although it might be expected that better codes can be obtained by including the finite-state machine as a degree of freedom in the overall ensemble optimization. The method used in [19] to choose the repetition profile was based on the infinite-block-length GA of message-passing decoding proposed in [14]. In this work, we propose and compare four low-complexity ensemble optimization methods. Our approach to design IRAs is based on several tools that have been noticed recently: the EXtrinsic mutual Information Transfer (EXIT) function and its analytical properties [12], [22], [23], reciprocal channel (duality) approximation [22], [24], and the nonstrict convexity of mutual information.

Fig. 1. IRA encoder.

The rest of the paper is organized as follows. Section II presents the systematic IRA encoder and its related decoder: the belief-propagation (BP) message-passing algorithm. Existing results on the analysis of the decoder (i.e., DE technique) are summarized and applied to the IRA code ensemble. This leads to a two-dimensional dynamical system whose state is defined on the space of symmetric distributions, for which we derive a local stability condition. In Section III, we propose a general framework in order to approximate the DE (defined on the space of distributions) by a standard dynamical system defined on the reals. We propose four low-complexity ensemble optimization methods as special cases of our general framework. These methods differ by the way the message densities and the BP transformations are approximated:

1) GA, with reciprocal channel (duality) approximation;
2) BEC approximation, with reciprocal channel approximation;
3) GA, with EXIT function of the inner decoder;
4) BEC approximation, with EXIT function of the inner decoder.

All four methods lead to optimization problems solvable by linear programming. In Section IV, we show that the first proposed method yields a one-dimensional DE approximation with the same stability condition as the exact DE, whereas the exact stability condition must be added to the ensemble optimization as an explicit additional constraint for the second method. Then, we show that, in general, the GA methods are optimistic, in the sense that there is no guarantee that the optimized rate is below capacity. On the contrary, we show that for the BEC approximation methods rates below capacity are guaranteed. In Section V, we compare our code optimization methods by evaluating their iterative decoding threshold (evaluated by the exact DE) over the BIAWGNC and the BSC.

## II. Encoding, Decoding, and Density Evolution

Fig. 1 shows the block diagram of a systematic IRA encoder. A block of information bits $\boldsymbol{b}=\left(b_{1}, \ldots, b_{k}\right) \in \mathbb{F}_{2}^{k}$ is encoded by an (irregular) repetition code of rate $k / n$. Each bit $b_{j}$ is repeated $r_{j}$ times, where $\left(r_{1}, \ldots, r_{k}\right)$ is a sequence of integers such that $2 \leq r_{j} \leq d$ and $\sum_{j=1}^{k} r_{j}=n$ ($d$ is the maximum repetition factor). The block of repeated symbols is interleaved, and the resulting block $\boldsymbol{x}_{1}=\left(x_{1,1}, \ldots, x_{1, n}\right) \in \mathbb{F}_{2}^{n}$ is encoded by an accumulator, defined by the recursion

$$
\begin{equation*}
x_{2, j+1}=x_{2, j}+\sum_{i=0}^{a-1} x_{1, a j+i}, \quad j=0, \ldots, m-1 \tag{1}
\end{equation*}
$$

with initial condition $x_{2,0}=0$, where $\boldsymbol{x}_{2}=\left(x_{2,1}, \ldots, x_{2, m}\right) \in \mathbb{F}_{2}^{m}$ is the accumulator output block corresponding to the input $\boldsymbol{x}_{1}$, $a \geq 1$ is a given integer (referred to as grouping factor), and we assume that $m=n / a$ is an integer. Finally, the codeword corresponding to the information block $\boldsymbol{b}$ is given by $\boldsymbol{x}=\left(\boldsymbol{b}, \boldsymbol{x}_{2}\right)$.

Fig. 2. Tanner graph of an IRA code.
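The encoding steps just described (repeat, interleave, accumulate in groups of $a$) can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: the function name `ira_encode`, the 0-based indexing of the interleaved block, and the use of a seeded pseudo-random permutation as the interleaver are all assumptions made for the example.

```python
import random


def ira_encode(bits, repeats, a, perm=None, seed=0):
    """Systematic IRA encoding: returns the codeword (b, x2) as a list of bits.

    bits    -- information bits b_1..b_k (0 or 1)
    repeats -- repetition factors r_1..r_k
    a       -- grouping factor; must divide n = sum(repeats)
    perm    -- interleaver as a permutation of range(n); random if omitted
    """
    repeated = [b for b, r in zip(bits, repeats) for _ in range(r)]
    n = len(repeated)
    if n % a != 0:
        raise ValueError("grouping factor a must divide n = sum(repeats)")
    if perm is None:
        perm = list(range(n))
        random.Random(seed).shuffle(perm)
    x1 = [repeated[p] for p in perm]          # interleaved block
    x2, state = [], 0                         # initial condition x_{2,0} = 0
    for j in range(n // a):
        # Recursion (1): add the next a interleaved bits modulo 2.
        state ^= sum(x1[a * j: a * j + a]) & 1
        x2.append(state)
    return list(bits) + x2


codeword = ira_encode([1, 0, 1], [2, 2, 2], a=2, perm=[0, 2, 4, 1, 3, 5])
print(codeword)
```

For this small input the interleaved block is `[1, 0, 1, 1, 0, 1]`, the group parities are `1, 0, 1`, and the accumulator output is `[1, 1, 0]`.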

The transmission channel is memoryless, binary-input, and symmetric-output, i.e., its transition probability $p_{Y \mid X}(y \mid x)$ satisfies

$$
\begin{equation*}
p_{Y \mid X}(y \mid 0)=p_{Y \mid X}(-y \mid 1) \tag{2}
\end{equation*}
$$

where $y \mapsto-y$ indicates a reflection of the output alphabet. ${ }^{1}$

IRA codes are best represented by their Tanner graph [25] (see Fig. 2). In general, the Tanner graph of a linear code is a bipartite graph whose node set is partitioned into two subsets: the bitnodes, corresponding to the coded symbols, and the checknodes, corresponding to the parity-check equations that codewords must satisfy. The graph has an edge between bitnode $\alpha$ and checknode $\beta$ if the symbol corresponding to $\alpha$ participates in the parity-check equation corresponding to $\beta$.

Since the IRA encoder is systematic (see Fig. 1), it is useful to further classify the bitnodes into two subclasses: the information bitnodes, corresponding to information bits, and the parity bitnodes, corresponding to the symbols output by the accumulator. Those information bits that are repeated $i$ times are represented by bitnodes with degree $i$, as they participate in $i$ parity-check equations. Each checknode is connected to $a$ information bitnodes and to two parity bitnodes and represents one of the equations (1) (for a particular $j$). The connections between checknodes and information bitnodes are determined by the interleaver and are highly randomized. On the contrary, the connections between checknodes and parity bitnodes are arranged in a regular zig-zag pattern since, according to (1), every pair of consecutive parity bits is involved in one parity-check equation.

[^4] tion with respect to the origin. Generalizations to other alphabets are immediate.

## Exhibit F Page 89

A random IRA code ensemble with parameters $\left(\left\{\lambda_{i}\right\}, a\right)$ and (information) block length $k$ is formed by all graphs of the form of Fig. 2 with $k$ information bitnodes, grouping factor $a$, and $\lambda_{i} n$ edges connected to information bitnodes of degree $i$, for $i=2, \ldots, d$. The sequence of nonnegative coefficients $\left\{\lambda_{i}\right\}$ such that $\sum_{i=2}^{d} \lambda_{i}=1$ is referred to as the degree distribution of the ensemble. The probability distribution over the code ensemble is induced by the uniform probability over all interleavers (permutations) of $n$ elements.

The information bitnodes average degree is given by $\bar{d} \triangleq 1 /\left(\sum_{i=2}^{d} \lambda_{i} / i\right)$. The number of edges connecting information bitnodes to checknodes is $n=k /\left(\sum_{i=2}^{d} \lambda_{i} / i\right)$. The number of parity bitnodes is $m=k /\left(a \sum_{i=2}^{d} \lambda_{i} / i\right)$. Finally, the code rate is given by

$$
\begin{equation*}
R=\frac{k}{k+m}=\frac{a \sum_{i=2}^{d} \lambda_{i} / i}{1+a \sum_{i=2}^{d} \lambda_{i} / i}=\frac{a}{a+\bar{d}} \tag{3}
\end{equation*}
$$

Under the constraints $0 \leq \lambda_{i} \leq 1$ and $\sum_{i \geq 2} \lambda_{i}=1$, we get $\bar{d} \geq 2$. Therefore, the highest rate with parameter $a$ set to 1 is $1 / 3$. This motivates the use of $a \geq 2$ in order to get higher rates.
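The relations between $\bar{d}$, $n$, $m$, and the rate in (3) are easy to check numerically. The sketch below uses hypothetical helper names, and the mixed degree distribution in the last example is made up purely for illustration.

```python
def average_degree(lam):
    """Average information-bitnode degree; lam maps degree i -> lambda_i."""
    return 1.0 / sum(w / i for i, w in lam.items())


def ensemble_sizes(lam, a, k):
    """Number of edges n and number of parity bitnodes m for block length k."""
    n = k * average_degree(lam)
    return n, n / a


def code_rate(lam, a):
    """Rate from eq. (3): R = a / (a + d_bar)."""
    return a / (a + average_degree(lam))


# All information bits repeated twice with a = 1 gives the rate-1/3 limit.
print(code_rate({2: 1.0}, a=1))

# Illustrative mixed distribution (not from the paper).
example = {2: 0.5, 4: 0.5}
n, m = ensemble_sizes(example, a=2, k=300)
print(n, m, code_rate(example, a=2))
```

Since every $\lambda_{i}$ has $i \geq 2$, `average_degree` never returns less than 2, which is where the bound $R \leq 1 / 3$ for $a=1$ comes from.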

## A. Belief Propagation Decoding of IRA Codes

In this work, we consider BP message-passing decoding [26]-[28]. In message-passing decoding algorithms, the graph nodes receive messages from their neighbors, compute new messages, and forward them to their neighbors. The algorithm is defined by the code Tanner graph, by the set on which messages take on values, by the node computation rules, and by the node activation scheduling.

In BP decoding, messages take on values in the extended real line $\mathbb{R} \cup\{-\infty, \infty\}$. The BP decoder is initialized by setting all messages output by the checknodes equal to zero. Each bitnode $\alpha$ is associated with the channel observation message (log-likelihood ratio)

$$
\begin{equation*}
u_{\alpha}=\log \frac{p_{Y \mid X}\left(y_{\alpha} \mid x_{\alpha}=0\right)}{p_{Y \mid X}\left(y_{\alpha} \mid x_{\alpha}=1\right)} \tag{4}
\end{equation*}
$$

where $y_{\alpha}$ is the channel output corresponding to the transmission of the code symbol $x_{\alpha}$.

The BP node computation rules are given as follows. For a given node, we identify an adjacent edge as outgoing and all other adjacent edges as incoming. Consider a bitnode $\alpha$ of degree $i$ and let $m_{1}, \ldots, m_{i-1}$ denote the messages received from the $i-1$ incoming edges and $u_{\alpha}$ the associated channel observation message. The message $m_{o, \alpha}$ passed along the outgoing edge is given by

$$
\begin{equation*}
m_{o, \alpha}=m_{1}+\cdots+m_{i-1}+u_{\alpha} \tag{5}
\end{equation*}
$$

Consider a checknode $\beta$ of degree $i$ and let $m_{1}, \ldots, m_{i-1}$ denote the messages received from the $i-1$ incoming edges. The message $m_{o, \beta}$ passed along the outgoing edge is given by

$$
\begin{equation*}
m_{o, \beta}=\gamma^{-1}\left(\gamma\left(m_{1}\right)+\cdots+\gamma\left(m_{i-1}\right)\right) \tag{6}
\end{equation*}
$$

where the mapping $\gamma: \mathbb{R} \rightarrow \mathbb{F}_{2} \times \mathbb{R}_{+}$ is defined by [11]

$$
\begin{equation*}
\gamma(z)=\left(\operatorname{sign}(z),-\log \tanh \frac{|z|}{2}\right) \tag{7}
\end{equation*}
$$

and where the sign function is defined as [11]

$$
\operatorname{sign}(z)= \begin{cases}0, & \text { if } z>0 \\ 0, & \text { with probability } 1 / 2 \text { if } z=0 \\ 1, & \text { with probability } 1 / 2 \text { if } z=0 \\ 1, & \text { if } z<0\end{cases}
$$
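The two node rules can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, messages are plain floats (with `math.inf` for $\pm\infty$), and the random tie-break of the sign function at $z=0$ is omitted because a zero input forces a zero-magnitude output. The checknode rule uses the fact that $x \mapsto-\log \tanh (x / 2)$ is its own inverse on $(0, \infty)$.

```python
import math


def bitnode_update(incoming, u):
    """Bitnode rule (5): sum of the incoming messages plus the channel LLR u."""
    return sum(incoming) + u


def _phi(x):
    """x -> -log tanh(x/2) on [0, inf]; this map is its own inverse."""
    if x <= 0.0:
        return math.inf
    t = math.tanh(x / 2.0)
    return 0.0 if t >= 1.0 else -math.log(t)


def checknode_update(incoming):
    """Checknode rule: add the sign bits modulo 2 and add the magnitudes
    -log tanh(|m|/2), then map the pair back to a real-valued message."""
    sign_bit = sum(1 for m in incoming if m < 0) % 2
    total = sum(_phi(abs(m)) for m in incoming)
    magnitude = _phi(total)
    return -magnitude if sign_bit else magnitude


# The result agrees with the equivalent "tanh rule" 2*atanh(prod tanh(m/2)).
print(checknode_update([1.0, 2.0]))
print(2.0 * math.atanh(math.tanh(0.5) * math.tanh(1.0)))
```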

Since the code Tanner graph has cycles, different schedulings yield in general nonequivalent BP algorithms. In this work, we shall consider the following "classical" schedulings.

- LDPC-like scheduling [19]. In this case, all bitnodes and all checknodes are activated alternately and in parallel. Every time a node is activated, it sends outgoing messages to all its neighbors. A decoding iteration (or "round" [31]) consists of the activation of all bitnodes and all checknodes.
- Turbo-like scheduling. Following [29], a good decoding scheduling consists of isolating large trellis-like subgraphs (or, more generally, normal realizations in Forney's terminology) and applying locally the forward-backward Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm [30] (that implements efficiently the BP algorithm on normal cycle-free graphs), as done for turbo codes [1]. A decoding iteration consists of activating all the information bitnodes in parallel (according to (5)) and of running the BCJR algorithm over the entire accumulator trellis. In particular, the checknodes do not send messages to the information bitnodes until the BCJR iteration is completed.

Notice that for both of the above schedulings one decoder iteration corresponds to the activation of all information bitnodes in the graph exactly once.


## B. Density Evolution and Stability

The bit-error rate (BER) performance of BP decoding averaged over the IRA code ensemble and over the noise observations can be analyzed, for any finite number $\ell$ of iterations and in the limit of $k \rightarrow \infty$, by the DE technique [11]. The usefulness of the DE method stems from the Concentration Theorem [31], [10] which guarantees that, with high probability, the BER after $\ell$ iterations of the BP decoder applied to a randomly selected code in the ensemble and to a randomly generated channel noise sequence is close to the BER computed by DE, for sufficiently large block length.

Next, we formulate the DE for IRA codes and we study the stability condition of the fixed-point corresponding to zero BER. As in [11, Sec. III-B], we introduce the space of distributions whose elements are nonnegative nondecreasing right-continuous functions with range in $[0,1]$ and domain the extended real line.

It can be shown that, for a binary-input symmetric-output channel, the distributions of messages at any iteration of the DE satisfy the symmetry condition

$$
\begin{equation*}
\int h(x) d F(x)=\int e^{-x} h(-x) d F(x) \tag{8}
\end{equation*}
$$

for any function $h$ for which the integral exists. If $F$ has density $f$, (8) is equivalent to

$$
\begin{equation*}
f(x)=e^{x} f(-x) \tag{9}
\end{equation*}
$$

With some abuse of terminology, distributions satisfying (8) are said to be symmetric. The space of symmetric distributions will be denoted by $\mathcal{F}_{\text {sym }}$.

The BER operator $\mathrm{Pe}: \mathcal{F}_{\text {sym }} \rightarrow[0,1 / 2]$ is defined by

$$
\operatorname{Pe}(F)=\frac{1}{2}\left(F^{-}(0)+F(0)\right)
$$

where $F^{-}(z)$ is the left-continuous version of $F(z)$. We introduce the "delta at zero" distribution, denoted by $\Delta_{0}$, for which $\operatorname{Pe}\left(\Delta_{0}\right)=1 / 2$, and the "delta at infinity" distribution, denoted by $\Delta_{\infty}$, for which $\operatorname{Pe}\left(\Delta_{\infty}\right)=0$.

The symmetry property (8) implies that a sequence of symmetric distributions $\left\{F^{(\ell)}\right\}_{\ell=0}^{\infty}$ converges to $\Delta_{\infty}$ if and only if $\lim _{\ell \rightarrow \infty} \operatorname{Pe}\left(F^{(\ell)}\right)=0$, where convergence of distributions is in the sense given in [11, Sec. III-F].

The DE for IRA code ensembles is given by the following proposition whose derivation is omitted as it is completely analogous to the derivation of DE in [11] for irregular LDPC codes.

Proposition 1: Let $P_{\ell}$ (respectively, $\widetilde{P}_{\ell}$ ) denote the average distribution of messages passed from an information bitnode (respectively, parity bitnode) to a checknode, at iteration $\ell$. Let $Q_{\ell}$ (respectively, $\widetilde{Q}_{\ell}$ ) denote the average distribution of messages passed from a checknode to an information bitnode (respectively, parity bitnode), at iteration $\ell$.

Under the cycle-free condition, $P_{\ell}, \widetilde{P}_{\ell}, Q_{\ell}, \widetilde{Q}_{\ell}$ satisfy the following recursion:

$$
\begin{align*}
& P_{\ell}=F_{u} \otimes \lambda\left(Q_{\ell}\right)  \tag{10}\\
& \widetilde{P}_{\ell}=F_{u} \otimes \widetilde{Q}_{\ell}  \tag{11}\\
& Q_{\ell}=\Gamma^{-1}\left(\Gamma\left(\widetilde{P}_{\ell-1}\right)^{\otimes 2} \otimes \Gamma\left(P_{\ell-1}\right)^{\otimes(a-1)}\right)  \tag{12}\\
& \tilde{Q}_{\ell}=\Gamma^{-1}\left(\Gamma\left(\widetilde{P}_{\ell-1}\right) \otimes \Gamma\left(P_{\ell-1}\right)^{\otimes a}\right) \tag{13}
\end{align*}
$$

for $\ell=1,2, \ldots$, with initial condition $P_{0}=\widetilde{P}_{0}=\Delta_{0}$, where $F_{u}$ denotes the distribution of the channel observation messages (4), $\otimes$ denotes convolution of distributions, defined by

$$
\begin{equation*}
(F \otimes G)(z)=\int F(z-t) d G(t) \tag{14}
\end{equation*}
$$

where ${ }^{\otimes m}$ denotes $m$-fold convolution,

$$
\lambda(F) \triangleq \sum_{i=2}^{d} \lambda_{i} F^{\otimes(i-1)}
$$

$\Gamma\left(F_{x}\right)$ is the distribution of $y=\gamma(x)$ (defined on $\mathbb{F}_{2} \times \mathbb{R}_{+}$), when $x \sim F_{x}$, and $\Gamma^{-1}$ denotes the inverse mapping of $\Gamma$, i.e., $\Gamma^{-1}\left(G_{y}\right)$ is the distribution of $x=\gamma^{-1}(y)$ when $y \sim G_{y}$.

The DE recursion (10)-(13) is a two-dimensional nonlinear dynamical system with state space $\mathcal{F}_{\text {sym }}^{2}$ (i.e., the state trajectories of (10)-(13) are sequences of pairs of symmetric distributions $\left(P_{\ell}, \widetilde{P}_{\ell}\right)$). For this system, the BER at iteration $\ell$ is given by $\operatorname{Pe}\left(P_{\ell}\right)$.

It is easy to see that $\left(\Delta_{\infty}, \Delta_{\infty}\right)$ is a fixed point of (10)-(13). The local stability of this fixed point is given by the following result.

Theorem 1: The fixed point $\left(\Delta_{\infty}, \Delta_{\infty}\right)$ for the DE is locally stable if and only if

$$
\begin{equation*}
\lambda_{2}<\frac{e^{r}\left(e^{r}-1\right)}{a+1+e^{r}(a-1)} \tag{15}
\end{equation*}
$$

where $r=-\log \left(\int e^{-z / 2} d F_{u}(z)\right)$.
Proof: See Appendix I.
Here necessity and sufficiency are used in the sense of [11]. By following steps analogous to [11], it can be shown that if (15) holds, then there exists $\xi>0$ such that if for some $\ell \in \mathbb{N}$

$$
\operatorname{Pe}\left(R P_{\ell}\left(P_{0}, \widetilde{P}_{0}\right)+(1-R) \widetilde{P}_{\ell}\left(P_{0}, \widetilde{P}_{0}\right)\right)<\xi
$$

then $\operatorname{Pe}\left(R P_{\ell}+(1-R) \widetilde{P}_{\ell}\right)$ converges to zero as $\ell$ tends to infinity. On the contrary, if $\lambda_{2}$ is strictly larger than the right-hand side (RHS) of (15), then there exists $\xi>0$ such that for all $\ell \in \mathbb{N}$

$$
\operatorname{Pe}\left(R P_{\ell}\left(P_{0}, \widetilde{P}_{0}\right)+(1-R) \widetilde{P}_{\ell}\left(P_{0}, \widetilde{P}_{0}\right)\right)>\xi
$$
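The stability bound (15) can be evaluated directly once $r$ is known. The sketch below is illustrative only. The closed forms used for $r$ are standard evaluations of $\int e^{-z / 2} d F_{u}(z)$ that are not stated in this section: $e^{-1 /\left(2 \sigma^{2}\right)}$ for a symmetric Gaussian $F_{u}$ with mean $2 / \sigma^{2}$ and variance $4 / \sigma^{2}$, $2 \sqrt{p(1-p)}$ for a BSC with crossover probability $p$, and $\epsilon$ for a BEC with erasure probability $\epsilon$. All function names are hypothetical.

```python
import math


def r_biawgnc(sigma2):
    """r for a binary-input AWGN channel with noise variance sigma2."""
    return 1.0 / (2.0 * sigma2)


def r_bsc(p):
    """r for a BSC with crossover probability p, 0 < p <= 1/2."""
    return -math.log(2.0 * math.sqrt(p * (1.0 - p)))


def r_bec(eps):
    """r for a BEC with erasure probability eps, 0 < eps <= 1."""
    return -math.log(eps)


def lambda2_bound(r, a):
    """Right-hand side of the stability condition (15)."""
    er = math.exp(r)
    return er * (er - 1.0) / (a + 1.0 + er * (a - 1.0))


# Largest admissible fraction of edges on degree-2 information bitnodes
# for grouping factor a = 4 on a BIAWGNC with sigma^2 = 0.6.
print(lambda2_bound(r_biawgnc(0.6), 4))
```

A useless channel (for example a BSC with $p=1 / 2$) gives $r=0$ and a bound of zero, so no degree-2 information bitnodes are admissible in that limit.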

## III. IRA Ensemble Optimization

In this section, we tackle the problem of optimizing the IRA code ensemble parameters for a broad class of binary-input symmetric-output channels.

A property of DE given in Proposition 1 is that $\mathrm{Pe}\left(P_{\ell}\right)$ for $\ell=1,2, \ldots$ is a nonincreasing nonnegative sequence. Hence, the limit $\lim _{\ell \rightarrow \infty} \mathrm{Pe}\left(P_{\ell}\right)$ exists. Consider a family of channels

$$
\mathcal{C}(\nu)=\left\{p_{Y \mid X}^{\nu}: \nu \in \mathbb{R}_{+}\right\}
$$

where the channel parameter $\nu$ is, for example, an indicator of the noise level in the channel. Following [31], we say that $\mathcal{C}(\nu)$ is monotone with respect to the IRA code ensemble ( $\left.\left\{\lambda_{i}\right\}, a\right)$ under BP decoding if, for any finite $\ell$

$$
\nu \leq \nu^{\prime} \Leftrightarrow \operatorname{Pe}\left(P_{\ell}\right) \leq \operatorname{Pe}\left(P_{\ell}^{\prime}\right)
$$

where $P_{\ell}$ and $P_{\ell}^{\prime}$ are the message distributions at iteration $\ell$ of DE applied to channels $p_{Y \mid X}^{\nu}$ and $p_{Y \mid X}^{\nu^{\prime}}$, respectively.
Let $\operatorname{BER}(\nu)=\lim _{\ell \rightarrow \infty} \operatorname{Pe}\left(P_{\ell}\right)$, where $\left\{P_{\ell}\right\}$ is the trajectory of DE applied to the channel $p_{Y \mid X}^{\nu}$. The threshold $\nu^{\star}$ of the ensemble ( $\left\{\lambda_{i}\right\}, a$ ) over the monotone family $\mathcal{C}(\nu)$ is the worst case channel parameter for which the limiting BER is zero, i.e.,

$$
\begin{equation*}
\nu^{\star}=\sup \{\nu \geq 0: \operatorname{BER}(\nu)=0\} \tag{16}
\end{equation*}
$$

Thus, for every value of $\nu$, the optimal IRA ensemble parameters $a$ and $\left\{\lambda_{i}\right\}$ maximize $R$ subject to $\operatorname{BER}(\nu)=0$, i.e., they solve the optimization problem

$$
\begin{cases}\operatorname{maximize} & a \sum_{i=2}^{d} \lambda_{i} / i  \tag{17}\\ \text { subject to } & \sum_{i=2}^{d} \lambda_{i}=1, \lambda_{i} \geq 0 \quad \forall i \\ \text { and to } & \operatorname{BER}(\nu)=0\end{cases}
$$

the solution of which can be found by some numerical techniques, as in [11]. However, the constraint $\operatorname{BER}(\nu)=0$ is given directly in terms of the fixed point of the DE recursion, and makes optimization very computationally intensive.

A variety of methods have been developed in order to simplify the code ensemble optimization [19], [24], [14], [32]. They consist of replacing the DE with a dynamical system defined over the reals (rather than over the space of distributions), whose trajectories and fixed points are related in some way to the trajectories and the fixed point of the DE. Essentially, all proposed approximated DE methods can be formalized as follows. Let $\Phi: \mathcal{F}_{\text {sym }} \rightarrow \mathbb{R}$ and $\Psi: \mathbb{R} \rightarrow \mathcal{F}_{\text {sym }}$ be mappings of the set of symmetric distributions to the real numbers and vice versa. Then, a dynamical system with state space $\mathbb{R}^{2}$ can be derived from (10)-(13) as

$$
\begin{align*}
& x_{\ell}=\Phi\left(F_{u} \otimes \lambda\left(\mathrm{Q}_{\ell}\right)\right)  \tag{18}\\
& \widetilde{x}_{\ell}=\Phi\left(F_{u} \otimes \widetilde{\mathrm{Q}}_{\ell}\right)  \tag{19}\\
& \mathrm{Q}_{\ell}=\Gamma^{-1}\left(\Gamma\left(\Psi\left(\widetilde{x}_{\ell-1}\right)\right)^{\otimes 2} \otimes \Gamma\left(\Psi\left(x_{\ell-1}\right)\right)^{\otimes(a-1)}\right)  \tag{20}\\
& \widetilde{Q}_{\ell}=\Gamma^{-1}\left(\Gamma\left(\Psi\left(\widetilde{x}_{\ell-1}\right)\right) \otimes \Gamma\left(\Psi\left(x_{\ell-1}\right)\right)^{\otimes a}\right) \tag{21}
\end{align*}
$$

for $\ell=1,2, \ldots$, with initial condition $x_{0}=\widetilde{x}_{0}=\Phi\left(\Delta_{0}\right)$, and where ( $x_{\ell}, \widetilde{x}_{\ell}$ ) are the system state variables.

By eliminating the intermediate distributions $Q_{\ell}$ and $\tilde{Q}_{\ell}$, we can put (18)-(21) in the form

$$
\begin{align*}
& x_{\ell}=\phi\left(x_{\ell-1}, \widetilde{x}_{\ell-1}\right) \\
& \widetilde{x}_{\ell}=\widetilde{\phi}\left(x_{\ell-1}, \widetilde{x}_{\ell-1}\right) . \tag{22}
\end{align*}
$$

For all DE approximations considered in this work, the mappings $\Phi$ and $\Psi$ and the functions $\phi$ and $\widetilde{\phi}$ satisfy the following desirable properties.

1) $\Phi\left(\Delta_{0}\right)=0, \Phi\left(\Delta_{\infty}\right)=1$.
2) $\Psi(0)=\Delta_{0}, \Psi(1)=\Delta_{\infty}$.
3) $\phi$ and $\widetilde{\phi}$ are defined on $[0,1] \times[0,1]$ and have range in $[0,1]$.
4) $\phi(0,0)>0$ and $\widetilde{\phi}(0,0)>0$.
5) $\phi(1,1)=\widetilde{\phi}(1,1)=1$, i.e., $(1,1)$ is a fixed point of the recursion (22). Moreover, this fixed point corresponds to the zero-BER fixed point $\left(\Delta_{\infty}, \Delta_{\infty}\right)$ of the exact DE.
6) If $F_{u} \neq \Delta_{0}$, the function $\widetilde{\phi}(x, \widetilde{x})-\widetilde{x}$ is strictly decreasing in $\widetilde{x}$ for all $x \in[0,1]$. Therefore, the equation

$$
\widetilde{x}=\widetilde{\phi}(x, \widetilde{x})
$$

has a unique solution in $[0,1]$ for all $x \in[0,1]$. This solution will be denoted by $\widetilde{x}(x)$.
It follows that all fixed points of (22) must satisfy

$$
\begin{equation*}
x=\phi(x, \widetilde{x}(x)) \tag{23}
\end{equation*}
$$

and that in order to avoid fixed points other than $(1,1)$, (23) must not have solutions in the interval $[0,1)$, i.e., it must satisfy

$$
\begin{equation*}
x<\phi(x, \widetilde{x}(x)), \quad \forall x \in[0,1) \tag{24}
\end{equation*}
$$



Fig. 3. EXIT model.

Notice that, in general, (24) is neither a necessary nor a sufficient condition for the uniqueness of the zero-BER fixed point of the exact DE. However, if the quality of the DE approximation is good, this provides a heuristic for the code ensemble optimization.

By replacing the constraint $\operatorname{BER}(\nu)=0$ by (24) in (17), we obtain the approximated IRA ensemble optimization method as

$$
\begin{cases}\operatorname{maximize} & a \sum_{i=2}^{d} \lambda_{i} / i  \tag{25}\\ \text { subject to } & \sum_{i=2}^{d} \lambda_{i}=1, \lambda_{i} \geq 0, \quad \forall i \\ \text { and to } & x<\phi(x, \widetilde{x}(x)), \quad \forall x \in[0,1)\end{cases}
$$

Approximations of the DE recursion differ essentially in the choice of $\Phi$ and $\Psi$, and in the way the intermediate distributions $Q_{\ell}$ and $\widetilde{Q}_{\ell}$ and the channel message distribution $F_{u}$ are approximated. Next, we illustrate the approximation methods considered in this work.

## A. EXIT Functions

Several recent works show that DE can be accurately described in terms of the evolution of the mutual information between the variables associated with the bitnodes and their messages (see [12], [33]-[35], [13], [23], [18]).

The key idea in order to approximate DE by mutual information evolution is to describe each computation node in BP decoding by a mutual information transfer function. For historical reasons, this function is usually referred to as the EXtrinsic mutual Information Transfer (EXIT) function.
EXIT functions are generally defined as follows. Consider the model of Fig. 3, where the box represents a generalized computation node of the BP algorithm (i.e., it might contain a subgraph formed by several nodes and edges, and might depend on some other random variables such as channel observations, not shown in Fig. 3). Let $m_{1}, \ldots, m_{i-1}$ denote the input messages, assumed independent and identically distributed (i.i.d.) $\sim F_{\text {in }}$, and let $m_{o} \sim F_{\text {out }}$ denote the output message. Let $X_{j}$ denote the binary code symbol associated with message $m_{j}$, for $j=1, \ldots, i-1$, and let $X$ denote the binary code symbol associated with message $m_{o}$. Since $F_{\text {in }}, F_{\text {out }} \in \mathcal{F}_{\text {sym }}$, we can think of $m_{j}$ and $m_{o}$ as the outputs of binary-input symmetric-output channels with inputs $X_{j}$ and $X$ and transition probabilities

$$
\begin{align*}
P\left(m_{j} \leq z \mid X_{j}=0\right) & =F_{\mathrm{in}}(z)  \tag{26}\\
P\left(m_{o} \leq z \mid X=0\right) & =F_{\mathrm{out}}(z) \tag{27}
\end{align*}
$$

respectively.
The channel (26) models the a priori information that the node receives about the symbols $X_{j}$, and the channel (27) models the extrinsic information [1] that the node generates about the symbol $X$.

We define the binary-input symmetric-output capacity functional $\mathcal{I}: \mathcal{F}_{\text {sym }} \rightarrow[0,1]$, such that

$$
\begin{equation*}
\mathcal{I}(F)=1-\int_{-\infty}^{\infty} \log _{2}\left(1+e^{-z}\right) d F(z) \tag{28}
\end{equation*}
$$

Namely, $\mathcal{I}$ maps any symmetric distribution $F$ into the capacity ${ }^{2}$ of the binary-input symmetric-output channel with transition probability $p_{Y \mid X}(y \mid 0)=F(y)$.
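For a distribution made of finitely many mass points, the integral in (28) reduces to a finite sum. The sketch below is a minimal illustration, assuming a list-of-pairs representation and hypothetical function names. It evaluates $\mathcal{I}$ for the two-point log-likelihood-ratio distribution of a BSC and compares it with the familiar expression $1-h_{2}(p)$.

```python
import math


def capacity_discrete(points):
    """I(F) from Eq. (28) for F = sum_j p_j * Delta_{v_j}.

    points: list of (p_j, v_j) pairs; v_j may be math.inf.
    """
    total = 0.0
    for p, v in points:
        if p > 0.0:
            # log2(1 + e^{-v}); a mass point at +infinity contributes zero
            total += p * math.log1p(math.exp(-v)) / math.log(2.0)
    return 1.0 - total


def bsc_llr_distribution(p):
    """LLR distribution of a BSC with crossover probability p, given x = 0."""
    llr = math.log((1.0 - p) / p)
    return [(p, -llr), (1.0 - p, llr)]


p = 0.11
h2 = -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))
print(capacity_discrete(bsc_llr_distribution(p)), 1.0 - h2)
```

The two printed values coincide up to rounding, and the extreme cases $\Delta_{0}$ and $\Delta_{\infty}$ map to capacities 0 and 1, respectively.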

Then, we let

$$
\begin{aligned}
& I_{A}=I\left(X_{j} ; m_{j}\right)=\mathcal{I}\left(F_{\text {in }}\right) \\
& I_{E}=I\left(X ; m_{o}\right)=\mathcal{I}\left(F_{\text {out }}\right)
\end{aligned}
$$

denote the capacities of the channels (26) and (27), respectively. The EXIT function of the node of Fig. 3 is the set of pairs ( $I_{A}, I_{E}$ ), for all $I_{A} \in[0,1]$ and for some (arbitrary) choice of the input distribution $F_{\text {in }}$ such that $\mathcal{I}\left(F_{\text {in }}\right)=I_{A}$. Notice that the EXIT function of a node is not uniquely defined, since it depends on the choice of $F_{\text {in }}$. In general, different choices yield different transfer functions.

The approximations of the DE considered in this work are based on EXIT functions, and track the evolution of the mutual information between the messages output by the bitnodes and the associated code symbols.

Remark. Two properties of binary-input symmetric-output channels: Before concluding this section, we take a brief detour in order to point out two properties of binary-input symmetric-output channels. Consider a binary-input symmetric-output channel with $p_{Y \mid X}(y \mid 0)=G(y)$, where $G$ is not necessarily symmetric (in the sense of (8)). Its capacity can be written as

$$
\begin{equation*}
C=1-\int_{-\infty}^{\infty} \log _{2}\left(1+\frac{d G(-z)}{d G(z)}\right) d G(z) \tag{29}
\end{equation*}
$$

By concatenating the transformation $y \mapsto u=\log \frac{p_{Y \mid X}(y \mid 0)}{p_{Y \mid X}(y \mid 1)}$ to the channel output, we obtain a new binary-input symmetric-output channel with $p_{U \mid X}^{\prime}(u \mid 0)=F(u)$ such that $F \in \mathcal{F}_{\text {sym }}$. Moreover, since $U$ is a sufficient statistic for $Y$, the original channel has the same capacity as the new channel, given by $C=\mathcal{I}(F)$. Therefore, by defining appropriately the channel output, the capacity of any binary-input symmetric-output channel can always be put in the form (28).

Another interesting property is the following.
Proposition 2: The mutual information functional is not strictly convex on the set of binary-input symmetric-output channels with transition probability $p_{Y \mid X}(y \mid 0) \in \mathcal{F}_{\text {sym }}$.

Proof: See Appendix II.

## B. Method 1

The first approximation of the DE considered in this work assumes that the distributions at any iteration are Gaussian. A Gaussian distribution satisfies the symmetry condition (9) if and only if its variance is equal to twice the absolute value of its mean. We introduce the shorthand notation $\mathcal{N}_{\text {sym }}(\mu)$ to denote the symmetric Gaussian distribution (or density, depending on the context) with mean $\mu$, i.e., $\mathcal{N}_{\text {sym }}(\mu) \triangleq \mathcal{N}(\mu, 2|\mu|)$.
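The variance condition can be checked directly from (9). As a short worked step (not in the original text), let $f$ be the density of a Gaussian with mean $m>0$ and variance $s^{2}$. Then

$$
\frac{f(x)}{f(-x)}=\exp \left(\frac{(x+m)^{2}-(x-m)^{2}}{2 s^{2}}\right)=\exp \left(\frac{2 m x}{s^{2}}\right)
$$

so that $f(x)=e^{x} f(-x)$ holds for all $x$ exactly when $s^{2}=2 m$.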

[^5]

Fig. 4. Reciprocal channel approximation.

For a distribution $F \in \mathcal{F}_{\text {sym }}$, we let the mapping $\Phi$ be equal to $\mathcal{I}$ defined in (28), and for all $x \in[0,1]$ we define the mapping

$$
\begin{equation*}
\Psi: x \mapsto \mathcal{N}_{\mathrm{sym}}\left(J^{-1}(x)\right) \tag{30}
\end{equation*}
$$

where

$$
\begin{align*}
J(\mu) & \triangleq \mathcal{I}\left(\mathcal{N}_{\mathrm{sym}}(\mu)\right) \\
& =1-\frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-z^{2}} \log _{2}\left(1+e^{-2 \sqrt{\mu} z-\mu}\right) d z \tag{31}
\end{align*}
$$

Namely, $\Psi$ maps $x \in[0,1]$ into the symmetric Gaussian distribution $\mathcal{N}_{\text {sym }}(\mu)$ such that the BIAWGNC with transition probability $p_{Y \mid X}(y \mid 0)=\mathcal{N}_{\text {sym }}(\mu)$ has capacity $x$.

The first key approximation in Method 1 is

$$
\begin{align*}
& \mathrm{Q}_{\ell} \approx \mathcal{N}_{\mathrm{sym}}\left(\mu_{\ell}\right) \\
& \widetilde{\mathrm{Q}}_{\ell} \approx \mathcal{N}_{\mathrm{sym}}\left(\tilde{\mu}_{\ell}\right) \tag{32}
\end{align*}
$$

for some $\mu_{\ell}, \widetilde{\mu}_{\ell} \geq 0$.
In order to compute $\mu_{\ell}$ and $\tilde{\mu}_{\ell}$, we make use of the reciprocal channel approximation [24], also called the approximate duality property of EXIT functions in [22]. This states that the EXIT function of a checknode is accurately approximated by the EXIT function of a bitnode with the same degree after the change of variables $I_{A} \mapsto 1-I_{A}$ and $I_{E} \mapsto 1-I_{E}$ (see Fig. 4). Using approximate duality, we replace the checknode by a bitnode and change $\left(x_{\ell-1}, \widetilde{x}_{\ell-1}\right)$ into $\left(1-x_{\ell-1}, 1-\widetilde{x}_{\ell-1}\right)$. Since for a bitnode the output message is the sum of the input messages (see (5)), and since the input distributions $\Psi\left(1-x_{\ell-1}\right)$ and $\Psi\left(1-\widetilde{x}_{\ell-1}\right)$ are Gaussian, the output distribution is also Gaussian, with mean

$$
(a-1) J^{-1}\left(1-x_{\ell-1}\right)+2 J^{-1}\left(1-\widetilde{x}_{\ell-1}\right)
$$

for messages sent to information bitnodes and

$$
a J^{-1}\left(1-x_{\ell-1}\right)+J^{-1}\left(1-\widetilde{x}_{\ell-1}\right)
$$

for messages sent to parity bitnodes. Finally, $\mu_{\ell}$ and $\widetilde{\mu}_{\ell}$ are given by

$$
\begin{align*}
& \mu_{\ell}=J^{-1}\left(1-J\left((a-1) J^{-1}\left(1-x_{\ell-1}\right)+2 J^{-1}\left(1-\tilde{x}_{\ell-1}\right)\right)\right) \\
& \tilde{\mu}_{\ell}=J^{-1}\left(1-J\left(a J^{-1}\left(1-x_{\ell-1}\right)+J^{-1}\left(1-\widetilde{x}_{\ell-1}\right)\right)\right) . \tag{33}
\end{align*}
$$

The second key approximation in Method 1 is to replace $F_{u}$ with a discrete (symmetric) distribution such that

$$
\begin{equation*}
F_{u} \approx \sum_{j=1}^{D} p_{j} \Delta_{v_{j}} \tag{34}
\end{equation*}
$$

for some integer $D \geq 2, v_{j} \in \mathbb{R}$, and $p_{j} \in \mathbb{R}_{+}$such that $\sum_{j=1}^{D} p_{j}=1$.

With this assumption, from the definition (28) of the operator $\mathcal{I}$ and since [11]: a) the convolution of symmetric distributions is symmetric, and b) the convex combination of symmetric distributions is symmetric; it is immediate to write (18) and (19) as (35) at the bottom of the page. The desired DE approximation in the form (22) is obtained (implicitly) by combining (33) and (35). Notice that (35) is linear in the repetition profile and the optimization problem (25) can be solved by linear programming.

Example 1. Discrete-output channels: In general, when the channel output is discrete then the approximation (34) holds exactly. For example, for the BSC with transition probability $p$ we have

$$
F_{u}=p \Delta_{-\log \frac{1-p}{p}}+(1-p) \Delta_{\log \frac{1-p}{p}}
$$

Example 2: The BIAWGNC defined by $y=(-1)^{x}+z$, where $z \sim \mathcal{N}\left(0, \sigma^{2}\right)$, is a channel such that

$$
\begin{equation*}
F_{u}=\mathcal{N}_{\mathrm{sym}}\left(2 / \sigma^{2}\right) \tag{36}
\end{equation*}
$$

In this case, since convolving symmetric Gaussian distributions yields a symmetric Gaussian distribution whose mean is the sum of the means, the discretization approximation (34) is not necessary and we have

$$
\begin{align*}
F_{u} \otimes \lambda\left(Q_{\ell}\right) & =\sum_{i=2}^{d} \lambda_{i} \mathcal{N}_{\text {sym }}\left(2 / \sigma^{2}+(i-1) \mu_{\ell}\right) \\
F_{u} \otimes \widetilde{\mathrm{Q}}_{\ell} & =\mathcal{N}_{\mathrm{sym}}\left(2 / \sigma^{2}+\widetilde{\mu}_{\ell}\right) \tag{37}
\end{align*}
$$

By applying the operator $\mathcal{I}$ and using (31) we obtain the DE approximation for the BIAWGNC as (38) at the bottom of the page.
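One step of the BIAWGNC recursion (38) can be sketched numerically as follows. This is an illustration under stated assumptions rather than the authors' code: $J$ in (31) is evaluated with a plain Riemann sum on $[-6,6]$, $J^{-1}$ is obtained by bisection and is clipped to a finite upper limit `hi` (so $J^{-1}(1)$ is represented by `hi` instead of $+\infty$), and all names are hypothetical.

```python
import math

_H = 0.1                                     # quadrature step (assumption)
_Z = [-6.0 + _H * k for k in range(121)]     # grid on [-6, 6]
_W = [math.exp(-z * z) * _H / math.sqrt(math.pi) for z in _Z]


def _softplus(t):
    """log(1 + e^t), written to avoid overflow for large t."""
    return t if t > 30.0 else math.log1p(math.exp(t))


def J(mu):
    """Capacity of the symmetric Gaussian N_sym(mu), Eq. (31), in bits."""
    if mu <= 0.0:
        return 0.0
    s = 2.0 * math.sqrt(mu)
    acc = sum(w * _softplus(-s * z - mu) for z, w in zip(_Z, _W))
    return min(1.0, max(0.0, 1.0 - acc / math.log(2.0)))


def J_inv(y, hi=200.0):
    """Inverse of J by bisection on [0, hi]; values at or above J(hi) map to hi."""
    if y <= 0.0:
        return 0.0
    if y >= J(hi):
        return hi
    lo = 0.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if J(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def method1_step(x, xt, lam, a, sigma2):
    """One iteration of (38): maps (x_{l-1}, xt_{l-1}) to (x_l, xt_l)."""
    u, ut = J_inv(1.0 - x), J_inv(1.0 - xt)
    mu = J_inv(1.0 - J((a - 1) * u + 2.0 * ut))   # first line of (33)
    mut = J_inv(1.0 - J(a * u + ut))              # second line of (33)
    c = 2.0 / sigma2                              # mean of F_u, Eq. (36)
    x_new = sum(l * J(c + (i - 1) * mu) for i, l in lam.items())
    return x_new, J(c + mut)


x, xt = 0.0, 0.0
for _ in range(5):
    x, xt = method1_step(x, xt, {3: 1.0}, 2, 1.0)
    print(x, xt)
```

Starting from $x_{0}=\widetilde{x}_{0}=0$, the first step returns $J\left(2 / \sigma^{2}\right)$ for both state variables, since at that point only the channel observation carries information.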

## C. Method 2

The second approximation of the DE considered in this work assumes that the distributions of messages at any iteration consist of two mass points, one at zero and the other at $+\infty$. For such distributions, we introduce the shorthand notation $\mathcal{E}_{\text {sym }}(\epsilon) \triangleq \epsilon \Delta_{0}+(1-\epsilon) \Delta_{\infty}$.

We let the mapping $\Phi$ be equal to $\mathcal{I}$ defined in (28) and the mapping $\Psi$ be

$$
\begin{equation*}
\Psi: x \mapsto \mathcal{E}_{\text {sym }}(1-x) \tag{39}
\end{equation*}
$$

for all $x \in[0,1]$.

With these mappings, (20) and (21) can be put in the form

$$
\begin{align*}
& \mathrm{Q}_{\ell}=\mathcal{E}_{\mathrm{sym}}\left(1-x_{\ell-1}^{a-1} \widetilde{x}_{\ell-1}^{2}\right) \\
& \widetilde{\mathrm{Q}}_{\ell}=\mathcal{E}_{\mathrm{sym}}\left(1-x_{\ell-1}^{a} \widetilde{x}_{\ell-1}\right) \tag{40}
\end{align*}
$$

where we used the fact that, as it can be easily seen from the definitions of $\Gamma$ and $\Gamma^{-1}$ in (46)-(48)

$$
\begin{aligned}
& \Gamma^{-1}\left(\Gamma\left(\mathcal{E}_{\text {sym }}\left(\epsilon_{1}\right)\right) \otimes \Gamma\left(\mathcal{E}_{\text {sym }}\left(\epsilon_{2}\right)\right)\right) \\
& \quad=\mathcal{E}_{\text {sym }}\left(1-\left(1-\epsilon_{1}\right)\left(1-\epsilon_{2}\right)\right)
\end{aligned}
$$

Notice that, while in Method 1 we assumed $Q_{\ell}$ and $\widetilde{Q}_{\ell}$ to be symmetric Gaussian (see (32)), here (40) holds exactly.

As a consequence of these mappings, the communication channel of the parity bits, with distribution $F_{u}$, is replaced by a BEC with erasure probability $\epsilon=1-\mathcal{I}\left(F_{u}\right)$.

Furthermore, for any $F \in \mathcal{F}_{\text {sym }}$ we have

$$
\mathcal{I}\left(F \otimes \mathcal{E}_{\mathrm{sym}}(\epsilon)\right)=1-(1-\mathcal{I}(F)) \epsilon
$$
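As a brief justification (a worked step added here, not part of the original text): convolving with $\Delta_{0}$ leaves $F$ unchanged, convolving with $\Delta_{\infty}$ yields $\Delta_{\infty}$, and $\mathcal{I}$ in (28) is linear in $F$ with $\mathcal{I}\left(\Delta_{\infty}\right)=1$. Hence

$$
\mathcal{I}\left(F \otimes \mathcal{E}_{\mathrm{sym}}(\epsilon)\right)=\mathcal{I}\left(\epsilon F+(1-\epsilon) \Delta_{\infty}\right)=\epsilon \mathcal{I}(F)+(1-\epsilon)=1-(1-\mathcal{I}(F)) \epsilon
$$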

From this result, it is immediate to obtain the approximated DE recursion as

$$
\begin{align*}
& x_{\ell}=1-\left(1-\mathcal{I}\left(F_{u}\right)\right) \sum_{i=2}^{d} \lambda_{i}\left(1-x_{\ell-1}^{a-1} \widetilde{x}_{\ell-1}^{2}\right)^{i-1} \\
& \widetilde{x}_{\ell}=1-\left(1-\mathcal{I}\left(F_{u}\right)\right)\left(1-x_{\ell-1}^{a} \widetilde{x}_{\ell-1}\right) \tag{41}
\end{align*}
$$

Notice that (41) is the standard (exact) DE for the IRA ensemble ( $\left\{\lambda_{i}\right\}, a$ ) over a BEC (see [19]) with the same capacity as the actual binary-input symmetric-output channel, given by $\mathcal{I}\left(F_{u}\right)$. We point out here that this method, consisting of replacing the actual channel with a BEC with equal capacity and optimizing the code ensemble for the BEC, was proposed in [24] for the optimization of LDPC ensembles. Interestingly, this method follows as a special case of our general approach for DE approximation, for a particular choice of the mappings $\Phi$ and $\Psi$.

In this case, the fixed-point equation corresponding to (23) is obtained in closed form as

$$
\begin{align*}
x=1-(1 & \left.-\mathcal{I}\left(F_{u}\right)\right) \\
& \times \sum_{i=2}^{d} \lambda_{i}\left(1-\frac{x^{a-1} \mathcal{I}\left(F_{u}\right)^{2}}{\left(1-\left(1-\mathcal{I}\left(F_{u}\right)\right) x^{a}\right)^{2}}\right)^{i-1} \tag{42}
\end{align*}
$$

(for details, see [19]).
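The recursion (41) and the closed-form fixed-point map (42) are simple enough to run directly. The sketch below is illustrative: the function names, the dictionary form of $\left\{\lambda_{i}\right\}$, and the use of a finite grid to test the condition $x<\phi(x, \widetilde{x}(x))$ on $[0,1)$ are all assumptions made here, and a grid test can only approximate a condition that must hold for every $x$.

```python
def method2_step(x, xt, lam, a, cap):
    """One iteration of (41); cap stands for I(F_u), the channel capacity."""
    eps = 1.0 - cap
    q = 1.0 - x ** (a - 1) * xt ** 2
    x_new = 1.0 - eps * sum(l * q ** (i - 1) for i, l in lam.items())
    xt_new = 1.0 - eps * (1.0 - x ** a * xt)
    return x_new, xt_new


def fixed_point_rhs(x, lam, a, cap):
    """Right-hand side of (42), with xt eliminated in closed form."""
    eps = 1.0 - cap
    xt = cap / (1.0 - eps * x ** a)
    q = 1.0 - x ** (a - 1) * xt ** 2
    return 1.0 - eps * sum(l * q ** (i - 1) for i, l in lam.items())


def satisfies_fixed_point_condition(lam, a, cap, grid=1000):
    """Check x < rhs(x) on the grid {0, 1/grid, ..., (grid-1)/grid}."""
    return all(k / grid < fixed_point_rhs(k / grid, lam, a, cap)
               for k in range(grid))


lam, a, cap = {3: 1.0}, 1, 0.9   # rate-1/4 ensemble, BEC-equivalent capacity 0.9
x, xt = 0.0, 0.0
for _ in range(200):
    x, xt = method2_step(x, xt, lam, a, cap)
print(x, xt, satisfies_fixed_point_condition(lam, a, cap))
```

For this low-rate example the state converges to $(1,1)$ and the grid test passes; with `cap = 0.0` the state never leaves $(0,0)$ and the test fails at $x=0$.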

$$
\left\{\begin{array}{l}
x_{\ell}=1-\sum_{i=2}^{d} \sum_{j=1}^{D} \lambda_{i} p_{j} \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-z^{2}} \log _{2}\left(1+e^{-2 \sqrt{(i-1) \mu_{\ell}} z-(i-1) \mu_{\ell}-v_{j}}\right) d z  \tag{35}\\
\widetilde{x}_{\ell}=1-\sum_{j=1}^{D} p_{j} \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-z^{2}} \log _{2}\left(1+e^{-2 \sqrt{\widetilde{\mu}_{\ell}} z-\widetilde{\mu}_{\ell}-v_{j}}\right) d z
\end{array}\right.
$$

$$
\begin{align*}
& x_{\ell}=\sum_{i=2}^{d} \lambda_{i} J\left(\frac{2}{\sigma^{2}}+(i-1) J^{-1}\left(1-J\left((a-1) J^{-1}\left(1-x_{\ell-1}\right)+2 J^{-1}\left(1-\widetilde{x}_{\ell-1}\right)\right)\right)\right) \\
& \widetilde{x}_{\ell}=J\left(\frac{2}{\sigma^{2}}+J^{-1}\left(1-J\left(a J^{-1}\left(1-x_{\ell-1}\right)+J^{-1}\left(1-\widetilde{x}_{\ell-1}\right)\right)\right)\right) \tag{38}
\end{align*}
$$

Fig. 5. Turbo-like IRA decoder.

## D. Methods 3 and 4

Methods 1 and 2 yield (almost) closed-form DE approximations at the price of some approximations of the message distributions and, above all, of the checknodes output distributions $Q_{\ell}$ and $\widetilde{Q}_{\ell}$.

In much of the current literature on random-like code ensemble optimization, the EXIT function of a decoding block is obtained by Monte Carlo simulation, by generating i.i.d. input messages, estimating the distribution of the output messages, and computing a one-dimensional quantity [12]-[18]. Following this approach, we shall consider the IRA decoder with turbo-like scheduling (see Fig. 5) and obtain the EXIT functions of the inner and outer decoders.

The inner (accumulator) and outer (repetition) decoders are characterized by an EXIT function as defined in Section III-A, for some guess of the (symmetric) distribution $F_{\text {in }}$. In general, the EXIT function of the decoders can be obtained as follows.

1) Let the channel observation messages be i.i.d., $\sim F_{u}$.
2) Assume the decoder input messages are i.i.d., $\sim F_{\text {in }}$.
3) Obtain either in closed form or by Monte Carlo simulation the corresponding marginal distribution $F_{\text {out }}$ of the decoder output messages.
4) Let $I_{A}=\mathcal{I}\left(F_{\text {in }}\right), I_{E}=\mathcal{I}\left(F_{\text {out }}\right)$ be a point on the EXIT function curve.

Our Methods 3 and 4 consist of applying the above approach under the assumptions $F_{\text {in }}=\mathcal{N}_{\text {sym }}\left(J^{-1}\left(I_{A}\right)\right)$ and $F_{\text {in }}=\mathcal{E}_{\text {sym }}\left(1-I_{A}\right)$, respectively.

Let the resulting EXIT functions of the inner and outer decoders be denoted by $I_{E}=g\left(I_{A}\right)$ and by $I_{E}=h\left(I_{A}\right)$, respectively, and let $x$ denote the mutual information between the messages at the output of the outer decoder (repetition code) and the corresponding symbols (information bitnodes).

The resulting approximated DE is given by

$$
\begin{equation*}
x_{\ell}=h\left(g\left(x_{\ell-1}\right)\right) \tag{43}
\end{equation*}
$$

The corresponding fixed-point equation is given by $x=h(g(x))$, and the condition for the uniqueness of the fixed point at $x=1$, corresponding to (24), is $x<h(g(x))$ for all $x \in[0,1)$. The resulting IRA optimization methods are obtained by using this condition in (25).

While for the inner decoder (accumulator) we are forced to resort to Monte Carlo simulation, it is interesting to notice that, due to the simplicity of the repetition code, for both Methods 3 and 4 the EXIT function of the outer decoder $\left(I_{E}=h\left(I_{A}\right)\right)$ can be obtained in closed form.

For Method 3, by discretizing the channel observation distribution as in (34), we have ${ }^{3}$

$$
\begin{align*}
& h\left(I_{A}\right)=1-\sum_{i=2}^{d} \sum_{j=1}^{D} \lambda_{i} p_{j} \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-z^{2}} \\
& \quad \times \log _{2}\left(1+e^{-2 \sqrt{(i-1) J^{-1}\left(I_{A}\right)} z-(i-1) J^{-1}\left(I_{A}\right)-v_{j}}\right) d z \tag{44}
\end{align*}
$$
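A numerical evaluation of (44) might look like the following sketch. It assumes that the discretized channel distribution is supplied as hypothetical pairs $(v_j, p_j)$ with $\sum_j p_j=1$, that $\sum_i \lambda_i=1$, and that $J(\mu)$ is the integral in (44) with $v_j=0$; under these assumptions (44) equals $\sum_{i, j} \lambda_i p_j J_{v_j}\left((i-1) J^{-1}(I_A)\right)$. The trapezoid rule and the bisection bounds are arbitrary choices made for illustration.

```python
import math

def _log2_1p_exp(t):
    """Numerically stable log2(1 + exp(t))."""
    if t > 30.0:
        return t / math.log(2.0)
    return math.log1p(math.exp(t)) / math.log(2.0)

def J_v(mu, v=0.0, half_width=8.0, steps=1600):
    """1 - (1/sqrt(pi)) * int exp(-z^2) log2(1 + exp(-2 sqrt(mu) z - mu - v)) dz."""
    dz = 2.0 * half_width / steps
    root = math.sqrt(mu)
    total = 0.0
    for k in range(steps + 1):
        z = -half_width + k * dz
        weight = 0.5 if k in (0, steps) else 1.0   # trapezoid end-point weights
        total += weight * math.exp(-z * z) * _log2_1p_exp(-2.0 * root * z - mu - v)
    return 1.0 - total * dz / math.sqrt(math.pi)

def J_inv(target, lo=0.0, hi=200.0, iters=60):
    """Invert J(mu) = J_v(mu, 0) by bisection, assuming J is increasing in mu."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if J_v(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def h_method3(i_a, lambdas, channel):
    """Outer EXIT function (44); lambdas maps degree i to lambda_i, channel lists (v_j, p_j)."""
    mu = J_inv(i_a)
    return sum(lam * p * J_v((i - 1) * mu, v)
               for i, lam in lambdas.items()
               for v, p in channel)
```

A Gauss-Hermite rule would be a natural replacement for the trapezoid rule, since the weight $e^{-z^2}$ already appears in the integrand.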

For Method 4 we have

$$
\begin{equation*}
h\left(I_{A}\right)=1-\left(1-\mathcal{I}\left(F_{u}\right)\right) \sum_{i=2}^{d} \lambda_{i}\left(1-I_{A}\right)^{i-1} \tag{45}
\end{equation*}
$$
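Equation (45) is simple enough to evaluate directly. The sketch below does so for the Method 4 degree sequence listed in Table I; the capacity value 0.5 is a hypothetical input chosen only for illustration, and the function name is not from the paper.

```python
def h_method4(i_a, lambdas, capacity):
    """Outer EXIT function (45): 1 - (1 - I(F_u)) * sum_i lambda_i (1 - I_A)^(i-1)."""
    return 1.0 - (1.0 - capacity) * sum(
        lam * (1.0 - i_a) ** (i - 1) for i, lam in lambdas.items()
    )

# Degree sequence of Method 4 in Table I (degree i -> lambda_i).
LAMBDAS_TABLE_I_METHOD_4 = {
    2: 0.05554, 3: 0.14480, 7: 0.18991, 8: 0.00996,
    19: 0.03721, 20: 0.25894, 100: 0.30366,
}

if __name__ == "__main__":
    for i_a in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(i_a, h_method4(i_a, LAMBDAS_TABLE_I_METHOD_4, capacity=0.5))
```

When the $\lambda_i$ sum to one, the curve starts at $h(0)=\mathcal{I}(F_u)$ and ends at $h(1)=1$, which follows directly from (45).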

## IV. Properties of the Approximated DE

In this section, we show some properties of the approximated DE derived in Section III.

## A. Stability Condition

Consider the DE approximation of Method 1. As indicated in Section III-B, $(x, \widetilde{x})=(1,1)$ is a fixed-point of the system (33)-(35). We have the following result.

Theorem 2: The fixed point at $(1,1)$ of the system (33)-(35) is stable if and only if the fixed point $\left(\Delta_{\infty}, \Delta_{\infty}\right)$ of the exact DE (10)-(13) is stable.

Proof: See Appendix III.

For other DE approximations, stability does not generally imply stability of the corresponding exact DE. Consider the DE approximation of Method 2. The point $(1,1)$ is a fixed point of the system (41). We have the following result.

Proposition 3: The local stability condition of the approximated DE with Method 2 is less stringent than that of the exact DE.

Proof: See Appendix IV.

If an approximated DE has a less stringent stability condition, then the exact stability condition must be added to the ensemble optimization as an explicit additional constraint. It should be noted that the DE approximations used in [24], [14], [19] require the additional stability constraint. For example, the codes presented in [19] for the BIAWGNC with $\lambda_{2}>0$ are not stable. Therefore, their BER does not vanish even for an arbitrarily large number of iterations.

## B. Fixed-Points, Coding Rate, and Channel Capacity

An interesting property of optimization Methods 2 and 4 is that the optimized ensemble for a given channel with channel observation distribution $F_{u}$ and capacity $C=\mathcal{I}\left(F_{u}\right)$ has coding rate not larger than $C$. In fact, as a corollary of a general result of [23] (see Appendix V), we have the following.

Theorem 3: The DE approximations of Methods 2 and 4 have unique fixed point $(1,1)$ only if the IRA ensemble coding rate $R$ satisfies $R<C=\mathcal{I}\left(F_{u}\right)$.

Proof: See Appendix V.

[^6]

Fig. 6. Mutual information EXIT functions for BIAWGNC and Method 1.

TABLE I
Optimization for the BIAWGNC

|  | Method 1 |  | Method 2 |  | Method 3 |  | Method 4 |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  | $i$ | $\lambda_{i}$ | $i$ | $\lambda_{i}$ | $i$ | $\lambda_{i}$ | $i$ | $\lambda_{i}$ |
|  | 2 | 0.04227 | 2 | 0.05554 | 2 | 0.05266 | 2 | 0.05554 |
|  | 3 | 0.16242 | 3 | 0.16330 | 3 | 0.11786 | 3 | 0.14480 |
|  | 7 | 0.06529 | 8 | 0.06133 | 5 | 0.05906 | 7 | 0.18991 |
|  | 8 | 0.06489 | 9 | 0.19357 | 6 | 0.06517 | 8 | 0.00996 |
|  | 9 | 0.06207 | 25 | 0.14460 | 8 | 0.03615 | 19 | 0.03721 |
|  | 10 | 0.01273 | 26 | 0.08842 | 9 | 0.11288 | 20 | 0.25894 |
|  | 11 | 0.13072 | 100 | 0.29323 | 13 | 0.06068 | 100 | 0.30366 |
|  | 14 | 0.04027 |  |  | 14 | 0.04650 |  |  |
|  | 25 | 0.00013 |  |  | 22 | 0.08606 |  |  |
|  | 26 | 0.05410 |  |  | 23 | 0.01610 |  |  |
|  | 36 | 0.13031 |  |  | 34 | 0.11019 |  |  |
|  | 37 | 0.13071 |  |  | 35 | 0.11919 |  |  |
|  | 100 | 0.10402 |  |  | 100 | 0.11751 |  |  |
| Rate |  | 0183 |  | 9697 |  | 50154 |  | 9465 |
| $a$ |  | 8 |  | 8 |  | 8 |  | 8 |
| d |  | 4153 |  | 9755 |  | 55087 |  | 7305 |
| SNR(DE) |  | 739 |  | . 457 |  | .727 |  | 588 |
| SNR $_{\text {gap }}(\mathrm{DE})$ |  | 0.059 |  | 0.406 |  | 0.075 |  | 0.306 |
| SNR $_{\text {gap }}$ (approx.) |  | 025 |  | 040 |  | . 021 |  | 071 |

We show in Section V-A through some examples that this property does not hold in general for other code ensemble optimization methods, for which the ensemble rate $R$ may turn out to be larger than the (nominal) capacity $\mathcal{I}\left(F_{u}\right)$. This means that the threshold $\nu^{\star}$, evaluated by exact DE, is worse than the channel parameter $\nu$ used for the ensemble design.

## V. Numerical Results

## A. Design Example for Rate-1/2 Codes

In this subsection we present the result of optimization for codes of rate $1 / 2$ and give examples for the BSC with crossover probability $p$ and the BIAWGNC with signal-to-noise ratio (SNR)

$$
\mathrm{SNR} \triangleq \frac{E_{s}}{N_{0}}=\frac{1}{2 \sigma^{2}}
$$

In Fig. 6, the curve is the fixed-point equation used for the optimization in Method 1, i.e., the function $\phi(x, \widetilde{x}(x))$. The fixed-point equation curves for the other three methods are very similar.
In Fig. 6, the curve (solid line) shows $\phi(x, \tilde{x}(x))$ as a function of $x \in[0,1]$ for Method 1. The solutions of the fixed-point equation (23) correspond to the intersections of this curve with the main diagonal (dotted line). Tables I and II give the degree sequences, the grouping factors, and the information bitnode average degrees for the four methods, for codes of rate $1 / 2$ over the BIAWGNC and the BSC, respectively. We compute the true iterative decoding thresholds (by using the exact DE) for all the ensembles (denoted by SNR(DE) or $p$(DE) in the tables) and also report the gap of these thresholds with respect to the Shannon limit (denoted by $\mathrm{SNR}_{\text {gap }}$(DE) or $p_{\text {gap }}$(DE) in the tables). Then, we compare it to the threshold of the approximated DE ($\mathrm{SNR}_{\text {gap }}$ (approx.) and $p_{\text {gap }}$ (approx.)). We observe that the codes designed by using Methods 2 or 4 have rate below capacity, which is consistent with Theorem 3. On the other hand, the codes designed by using Methods 1 or 3 have rate possibly larger than the capacity corresponding to the channel parameter used for design. It can easily be checked that all the designed codes are stable.

TABLE II
Optimization for the BSC

|  | Method 1 |  | Method 2 |  | Method 3 |  | Method 4 |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  | $i$ | $\lambda_{i}$ | $i$ | $\lambda_{i}$ | $i$ | $\lambda_{i}$ | $i$ | $\lambda_{i}$ |
|  | 2 | 0.03545 | 2 | 0.04732 | 2 | 0.03115 | 2 | 0.04657 |
|  | 3 | 0.14375 | 3 | 0.17984 | 3 | 0.14991 | 3 | 0.14932 |
|  | 6 | 0.03057 | 9 | 0.19715 | 6 | 0.04630 | 7 | 0.07693 |
|  | 7 | 0.10963 | 10 | 0.06259 | 7 | 0.06217 | 8 | 0.16249 |
|  | 9 | 0.10654 | 26 | 0.16429 | 8 | 0.08666 | 20 | 0.07001 |
|  | 10 | 0.02388 | 27 | 0.05676 | 10 | 0.12644 | 21 | 0.20550 |
|  | 11 | 0.04856 | 100 | 0.29205 | 17 | 0.03430 | 100 | 0.28919 |
|  | 12 | 0.00461 |  |  | 18 | 0.01506 |  |  |
|  | 21 | 0.03035 |  |  | 26 | 0.00228 |  |  |
|  | 28 | 0.22576 |  |  | 27 | 0.02258 |  |  |
|  | 29 | 0.09453 |  |  | 28 | 0.21774 |  |  |
|  | 100 | 0.14635 |  |  | 29 | 0.08021 |  |  |
|  |  |  |  |  | 100 | 0.12521 |  |  |
| Rate |  | 0.48908 |  | 0.49620 |  | 0.49226 |  | 0.49091 |
| $a$ |  | 8 |  | 8 |  | 8 |  | 8 |
| $\bar{d}$ |  | 8.35724 |  | 8.12253 |  | 8.25157 |  | 8.29627 |
| $p$ (DE) |  | 0.1091 |  | 0.0938 |  | 0.1091 |  | 0.1009 |
| $p_{\text {gap }}(\mathrm{DE})$ |  | 0.0046 |  | 0.0175 |  | 0.0035 |  | 0.0122 |
| $p_{\text {gap }}$ (approx.) |  | 0.0037 |  | 0.0013 |  | 0.0026 |  | 0.0018 |

## B. Thresholds of IRA Ensembles

In this subsection, we present results for codes designed according to the four methods, for rates from 0.1 to 0.9, and we compare the methods on the basis of the true thresholds obtained by DE. We present the code rate, the grouping factor, the average repetition factor, and the gap to Shannon limit, for both BSC and BIAWGNC.

Tables III and IV show the performance of IRA codes on the BIAWGNC. Tables V and VI show the performance of IRA codes on the BSC.

For all rates, and for both channels, IRA codes designed assuming GA (Methods 1 and 3) perform much better than those designed assuming BEC a priori (Methods 2 and 4). Nevertheless, Method 4 yields better codes than Method 2, especially at low rates. This is due to the fact that, in Method 2, the communication channel is replaced with a BEC with the same capacity, while this is not the case in Method 4. This difference in performance decreases as the rate increases.
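To make the gap columns of the BSC tables concrete, the sketch below computes the Shannon-limit crossover probability for a given rate by bisection on the BSC capacity $1-h_2(p)$, and then a gap. It assumes the gap is defined as the Shannon-limit crossover probability minus the DE threshold, which is how we read the tables; the function names are hypothetical.

```python
import math

def binary_entropy(p):
    """Binary entropy function h2(p) in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def bsc_shannon_limit(rate, iters=60):
    """Largest crossover probability p in [0, 1/2] with BSC capacity 1 - h2(p) >= rate."""
    lo, hi = 0.0, 0.5
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if 1.0 - binary_entropy(mid) > rate:
            lo = mid          # capacity still exceeds the rate, so the limit is larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bsc_gap(rate, p_threshold):
    """Gap between the Shannon-limit crossover probability and a DE threshold."""
    return bsc_shannon_limit(rate) - p_threshold

if __name__ == "__main__":
    print(bsc_shannon_limit(0.5))
    print(bsc_gap(0.49620, 0.0938))
```

For rate $1/2$ the limit is close to $p=0.110$, so a threshold a few thousandths below that value corresponds to the small gaps reported for Methods 1 and 3.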

Fig. 7 compares the performance of IRA ensembles with the best known LDPC ensembles [6] on the BIAWGNC. As expected, the performance of IRA ensembles is inferior to that of LDPC ensembles. However, in view of the simplicity of their encoding and decoding, IRA codes, optimized using Methods 1 or 3, emerge as a very attractive design alternative.

Fig. 8 compares the performance of IRA ensembles obtained via the proposed methods for the BSC. The best codes are those designed with Method 3.

TABLE III
IRA Codes, Designed With Methods 1 and 3, Evaluated With DE, for BIAWGNC

| Method 1 |  |  |  | Method 3 |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Rate | $a$ | $\bar{d}$ | SNR $_{\text {gap }}$ | Rate | $a$ | $\bar{d}$ | SNR $_{\text {gap }}$ |
| 0.10109 | 2 | 17.78 | 0.151 | 0.10133 | 2 | 17.74 | 0.163 |
| 0.20191 | 3 | 11.86 | 0.096 | 0.20199 | 3 | 11.85 | 0.126 |
| 0.30153 | 4 | 9.27 | 0.081 | 0.30175 | 4 | 9.26 | 0.111 |
| 0.40196 | 6 | 8.93 | 0.057 | 0.40201 | 6 | 8.93 | 0.067 |
| 0.50184 | 8 | 7.94 | 0.059 | 0.50154 | 8 | 7.95 | 0.075 |
| 0.60188 | 11 | 7.28 | 0.065 | 0.60147 | 11 | 7.29 | 0.065 |
| 0.70154 | 16 | 6.81 | 0.067 | 0.70093 | 16 | 6.83 | 0.068 |
| 0.79904 | 29 | 7.29 | 0.066 | 0.79912 | 29 | 7.29 | 0.062 |
| 0.89677 | 61 | 7.02 | 0.088 | 0.89712 | 61 | 7.00 | 0.083 |

TABLE IV
IRA Codes, Designed With Methods 2 and 4, Evaluated With DE, for BIAWGNC

| Method 2 |  |  |  | Method 4 |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Rate | $a$ | $\bar{d}$ | SNR $_{\text {gap }}$ | Rate | $a$ | $\bar{d}$ | SNR $_{\text {gap }}$ |
| 0.09407 | 2 | 19.26 | 0.906 | 0.09752 | 2 | 18.51 | 0.316 |
| 0.19842 | 3 | 12.12 | 0.573 | 0.19725 | 3 | 12.21 | 0.293 |
| 0.29767 | 4 | 9.44 | 0.529 | 0.29671 | 4 | 9.48 | 0.336 |
| 0.39703 | 6 | 9.11 | 0.466 | 0.39445 | 6 | 9.21 | 0.343 |
| 0.49697 | 8 | 8.10 | 0.406 | 0.49465 | 8 | 8.17 | 0.306 |
| 0.59689 | 11 | 7.43 | 0.362 | 0.59577 | 11 | 7.46 | 0.338 |
| 0.69580 | 16 | 7.00 | 0.323 | 0.69584 | 16 | 6.99 | 0.296 |
| 0.79737 | 26 | 6.61 | 0.272 | 0.79678 | 26 | 6.63 | 0.271 |
| 0.89827 | 56 | 6.34 | 0.212 | 0.89826 | 56 | 6.34 | 0.214 |

## VI. Conclusion

This paper has tackled the optimization of IRA codes in the limit of large code block length. This assumption allows us to consider a cycle-free graph and to evaluate the threshold of the code by iteratively calculating message densities (DE).


Fig. 7. Gap to Shannon limit (obtained by DE) versus rate for BIAWGNC.

TABLE V
IRA Codes, Designed With Methods 1 and 3, Evaluated With DE, for BSC

| Method 1 |  |  |  | Method 3 |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Rate | $a$ | $\bar{d}$ | $p_{\text {gap }}$ | Rate | $a$ | $\bar{d}$ | $p_{\text {gap }}$ |
| 0.10042 | 2 | 17.92 | 0.0032 | 0.10137 | 2 | 17.73 | 0.0036 |
| 0.19910 | 3 | 12.07 | 0.0037 | 0.20086 | 3 | 11.94 | 0.0041 |
| 0.29573 | 4 | 9.53 | 0.0044 | 0.29897 | 4 | 9.38 | 0.0031 |
| 0.39298 | 6 | 9.27 | 0.0044 | 0.39621 | 6 | 9.14 | 0.0032 |
| 0.48908 | 8 | 8.36 | 0.0046 | 0.49226 | 8 | 8.25 | 0.0035 |
| 0.58590 | 12 | 8.48 | 0.0044 | 0.58815 | 11 | 7.70 | 0.0040 |
| 0.6827 | 17 | 7.90 | 0.0044 | 0.68409 | 16 | 7.39 | 0.0039 |
| 0.78155 | 28 | 7.83 | 0.0038 | 0.78235 |  |  |  |
| 0.88437 | 59 | 7.71 | 0.0026 | 0.88457 |  |  |  |

TABLE VI
IRA Codes, Designed With Methods 2 and 4, Evaluated With DE, for BSC

| Method 2 |  |  |  | Method 4 |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Rate | $a$ | $\bar{d}$ | $p_{\text {gap }}$ | Rate | $a$ | $\bar{d}$ | $p_{\text {gap }}$ |
| 0.09406 | 2 | 19.26 | 0.0194 | 0.09952 | 2 | 18.10 | 0.0121 |
| 0.19833 | 3 | 12.13 | 0.0175 | 0.19842 | 3 | 12.12 | 0.0101 |
| 0.29743 | 4 | 9.45 | 0.0190 | 0.28836 | 4 | 9.87 | 0.0114 |
| 0.39650 | 6 | 9.13 | 0.0187 | 0.38865 | 6 | 9.44 | 0.0149 |
| 0.49620 | 8 | 8.12 | 0.0175 | 0.49091 | 8 | 8.30 | 0.0122 |
| 0.59580 | 11 | 7.46 | 0.0155 | 0.59349 | 11 | 7.53 | 0.0124 |
| 0.69559 | 16 | 7.00 | 0.0126 | 0.69107 | 16 | 7.15 | 0.0116 |
| 0.79583 | 26 | 6.67 | 0.0091 | 0.79283 | 26 | 6.79 | 0.0090 |
| 0.89692 | 56 | 6.44 | 0.0049 | 0.89337 | 57 | 6.80 | 0.0051 |

in general, the Gaussian a priori methods are optimistic, in the sense that there is no guarantee that the optimized rate is below capacity. By contrast, the BEC a priori methods always yield rates below capacity.

Our numerical results show that, for the BIAWGNC and BSC, the Gaussian a priori approximation is more attractive since the codes designed under this assumption have the smallest gap to Shannon limit. Depending on the desired rate, the EXIT function of the inner decoder has to be computed either with Monte Carlo simulation (Method 3) or with the reciprocal channel approximation (Method 1). At least in the BIAWGNC there is some evidence that the best LDPC codes [6] designed with DE slightly outperform our designed codes. In view of this and the very simple encoding structure of IRA codes, they emerge as attractive design choices.

## Exhibit F

Page 98


Fig. 8. Gap to Shannon limit (obtained by DE) versus rate for BSC.

APPENDIX I

## Proof of Theorem 1

We follow in the footsteps of [11] and analyze the local stability of the zero-BER fixed point by using a small perturbation approach. In order to do this, we need more details on the mapping $\Gamma$ and its inverse.

Given a random variable $x$ with distribution $F_{x}(z)$, the distribution of $\gamma(x)$ is given by

$$
\begin{equation*}
\Gamma\left(F_{x}\right)(s, z)=\chi_{\{s=0\}} \Gamma_{0}\left(F_{x}\right)(z)+\chi_{\{s=1\}} \Gamma_{1}\left(F_{x}\right)(z) \tag{46}
\end{equation*}
$$

where

$$
\begin{aligned}
& \Gamma_{0}\left(F_{x}\right)(z)=1-F_{x}^{-}\left(-\log \tanh \frac{z}{2}\right) \\
& \Gamma_{1}\left(F_{x}\right)(z)=F_{x}\left(\log \tanh \frac{z}{2}\right)
\end{aligned}
$$

and where $\chi_{\mathcal{A}}$ denotes the indicator function of the event $\mathcal{A}$.
In particular, the mapping $\Gamma$ applied to $\Delta_{0}$ and $\Delta_{\infty}$ yields

$$
\begin{align*}
\Gamma\left(\Delta_{0}\right)(s, z) & =\frac{1}{2} \chi_{\{s=0\}} \Delta_{\infty}(z)+\frac{1}{2} \chi_{\{s=1\}} \Delta_{\infty}(z) \\
\Gamma\left(\Delta_{\infty}\right)(s, z) & =\chi_{\{s=0\}} \Delta_{0}(z) \tag{47}
\end{align*}
$$

Given

$$
G(s, z)=\chi_{\{s=0\}} G_{0}(z)+\chi_{\{s=1\}} G_{1}(z)
$$

applying $\Gamma^{-1}$ yields

$$
\begin{align*}
\Gamma^{-1}(G)(z)=\chi_{\{z>0\}} & \left(1-G_{0}\left(-\log \tanh \frac{z}{2}\right)\right) \\
& +\chi_{\{z<0\}} G_{1}\left(-\log \tanh \frac{-z}{2}\right) \tag{48}
\end{align*}
$$


By applying $\Gamma^{-1}$ we get

$$
\left\{\begin{array}{l}
Q_{1}=\Gamma^{-1}\left(\Gamma\left(P_{0}\right)^{\otimes(a-1)} \otimes \Gamma\left(\widetilde{P}_{0}\right)^{\otimes 2}\right) \\
\widetilde{Q}_{1}=\Gamma^{-1}\left(\Gamma\left(P_{0}\right)^{\otimes a} \otimes \Gamma\left(\widetilde{P}_{0}\right)\right)
\end{array}\right.
$$

and

$$
\begin{aligned}
Q_{1}= & (1-2(a-1) \epsilon-4 \delta) \Delta_{\infty} \\
& +(2(a-1) \epsilon+4 \delta) \Delta_{0}+O\left(\epsilon^{2}, \delta^{2}\right) \\
\widetilde{Q}_{1}= & (1-2 a \epsilon-2 \delta) \Delta_{\infty}+(2 a \epsilon+2 \delta) \Delta_{0}+O\left(\epsilon^{2}, \delta^{2}\right) .
\end{aligned}
$$

Hence, by noticing (50) at the bottom of the page we have

$$
\begin{aligned}
\lambda\left(Q_{1}\right)=(1-2(a-1) & \left.\lambda_{2} \epsilon-4 \lambda_{2} \delta\right) \Delta_{\infty} \\
& +\left(2(a-1) \lambda_{2} \epsilon+4 \lambda_{2} \delta\right) \Delta_{0}+O\left(\epsilon^{2}, \delta^{2}\right)
\end{aligned}
$$

Finally, by using the fact that $P_{1}=F_{u} \otimes \lambda\left(Q_{1}\right)$ and that $\widetilde{P}_{1}=F_{u} \otimes \widetilde{Q}_{1}$, the message distributions after one DE iteration are given by

$$
\left[\begin{array}{l}
P_{1} \\
\widetilde{P}_{1}
\end{array}\right]=A\left[\begin{array}{l}
2 \epsilon \\
2 \delta
\end{array}\right] F_{u}+\left(\left[\begin{array}{l}
1 \\
1
\end{array}\right]-A\left[\begin{array}{l}
2 \epsilon \\
2 \delta
\end{array}\right]\right) \Delta_{\infty}+\left[\begin{array}{l}
O\left(\epsilon^{2}\right) \\
O\left(\delta^{2}\right)
\end{array}\right]
$$

where

$$
A=\left[\begin{array}{cc}
(a-1) \lambda_{2} & 2 \lambda_{2}  \tag{51}\\
a & 1
\end{array}\right] .
$$

After $\ell$ iterations we obtain

$$
\begin{align*}
{\left[\begin{array}{c}
P_{\ell} \\
\widetilde{P}_{\ell}
\end{array}\right]=A^{\ell} } & {\left[\begin{array}{c}
2 \epsilon \\
2 \delta
\end{array}\right] F_{u}^{\otimes \ell} } \\
& +\left(\left[\begin{array}{l}
1 \\
1
\end{array}\right]-A^{\ell}\left[\begin{array}{c}
2 \epsilon \\
2 \delta
\end{array}\right]\right) \Delta_{\infty}+\left[\begin{array}{c}
O\left(\epsilon^{2}\right) \\
O\left(\delta^{2}\right)
\end{array}\right] \tag{52}
\end{align*}
$$

From the large deviation theory we get that [11]

$$
\begin{align*}
r & =-\lim _{\ell \rightarrow \infty} \frac{1}{\ell} \log \operatorname{Pe}\left(F_{u}^{\otimes \ell}\right) \\
& =-\log \left(\inf _{s>0} \int e^{-s z} d F_{u}(z)\right) \\
& =-\log \left(\int e^{-z / 2} d F_{u}(z)\right) \tag{53}
\end{align*}
$$

where the last equality follows from the fact that $F_{u}(z) \in \mathcal{F}_{\text {sym }}$.
Then, by applying $\mathrm{Pe}(\cdot)$ to $P_{\ell}$ in (52) we obtain that $\lim _{\ell \rightarrow \infty} \operatorname{Pe}\left(P_{\ell}\right)=0$ (implying that $\lim _{\ell \rightarrow \infty} P_{\ell}=\Delta_{\infty}$ ) if the eigenvalues of the matrix $A e^{-r}$ are inside the unit circle.

The stability condition is obtained by computing explicitly the largest (in magnitude) eigenvalue. We obtain

$$
\begin{equation*}
\frac{1}{2}\left(1+\lambda_{2}(a-1)+\sqrt{1+(2+6 a) \lambda_{2}+(a-1)^{2} \lambda_{2}^{2}}\right)<e^{r} \tag{54}
\end{equation*}
$$

Since the left-hand side (LHS) of (54) is increasing in $\lambda_{2}$, condition (54) is indeed an upper bound on $\lambda_{2}$, given explicitly by (15).
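Condition (54) is easy to check numerically. The sketch below evaluates its left-hand side, which is the largest eigenvalue of the matrix $A$ in (51), and compares it with $e^{r}$. The BSC expression for $e^{r}$ is our own evaluation of (53), assuming channel messages equal to $\pm \log ((1-p) / p)$, which gives $e^{-r}=2 \sqrt{p(1-p)}$; the function names and the crossover probability used below are hypothetical.

```python
import math

def largest_eigenvalue(lambda2, a):
    """Left-hand side of (54): largest eigenvalue of A = [[(a-1)*l2, 2*l2], [a, 1]]."""
    s = lambda2 * (a - 1)
    return 0.5 * (1.0 + s + math.sqrt(1.0 + (2.0 + 6.0 * a) * lambda2 + s * s))

def exp_r_bsc(p):
    """e^r for a BSC with crossover probability p (assumes e^{-r} = 2 sqrt(p (1 - p)))."""
    return 1.0 / (2.0 * math.sqrt(p * (1.0 - p)))

def is_stable(lambda2, a, exp_r):
    """True when condition (54) holds."""
    return largest_eigenvalue(lambda2, a) < exp_r

if __name__ == "__main__":
    # lambda_2 and a of Method 2 in Table II, with an illustrative p = 0.1.
    print(is_stable(0.04732, 8, exp_r_bsc(0.1)))
```

Because the left-hand side grows with $\lambda_2$, scanning `lambda2` upward until `is_stable` first fails gives a numerical estimate of the bound on $\lambda_2$.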

## Appendix II

## Proof of Proposition 2

Proposition 2 is a particular case of a more general result that we state in the following.

Proposition 4: Let $X$ be binary with $P[X=0]=p$ and $P[X=1]=1-p$. Let $S$ be independent of $X$ and take $M$ (finite) values with $P[S=i]=q_{i}$. Conditioned on $S=j$, $Y$ is a continuous random variable whose conditional density functions satisfy

$$
f_{Y \mid X=1}^{(j)}(y)=e^{-y} f_{Y \mid X=0}^{(j)}(y)
$$

Then

$$
I(X ; Y \mid S)=I(X ; Y)
$$

Proof of Proposition 4: First, notice that

$$
\begin{aligned}
f_{Y \mid X=0}(y) & =\sum_{i}^{M} q_{i} f_{Y \mid X=0}^{(i)}(y)=\sum_{i}^{M} q_{i} e^{y} f_{Y \mid X=1}^{(i)}(y) \\
& =e^{y} f_{Y \mid X=1}(y)
\end{aligned}
$$

Hence, we have (55) at the top of the following page.

Proof of Proposition 2: The assertion of Proposition 2 follows from Proposition 4 since for a collection of binary-input symmetric-output channels with symmetric transition probability we have that $\forall i, \forall y$

$$
\begin{aligned}
p_{Y \mid X, S}(y \mid X=1, S=i) & =p_{Y \mid X, S}(-y \mid X=0, S=i) \\
& =e^{-y} p_{Y \mid X, S}(y \mid X=0, S=i)
\end{aligned}
$$

## Appendix III

## Proof of Theorem 2

The local stability condition for the system (33)-(35) is given by the eigenvalues of the Jacobian matrix of the functions $(\phi, \tilde{\phi})$ at the fixed point $(x, \tilde{x})=(1,1)$. The partial derivatives of $\phi$ and $\tilde{\phi}$ are

$$
\begin{aligned}
& \frac{\partial \phi}{\partial x}(1,1)=\sum_{i=2}^{d} \sum_{j=1}^{D} \lambda_{i} p_{j}(i-1)(a-1) \lim _{\mu \rightarrow+\infty} \frac{J_{v_{j}}^{\prime}((i-1) \mu)}{J^{\prime}(\mu)} \\
& \frac{\partial \phi}{\partial \tilde{x}}(1,1)=\sum_{i=2}^{d} \sum_{j=1}^{D} \lambda_{i} p_{j}(i-1) 2 \lim _{\mu \rightarrow+\infty} \frac{J_{v_{j}}^{\prime}((i-1) \mu)}{J^{\prime}(\mu)} \\
& \frac{\partial \tilde{\phi}}{\partial x}(1,1)=\sum_{j=1}^{D} p_{j} a \lim _{\mu \rightarrow+\infty} \frac{J_{v_{j}}^{\prime}(\mu)}{J^{\prime}(\mu)} \\
& \frac{\partial \tilde{\phi}}{\partial \tilde{x}}(1,1)=\sum_{j=1}^{D} p_{j} \lim _{\mu \rightarrow+\infty} \frac{J_{v_{j}}^{\prime}(\mu)}{J^{\prime}(\mu)}
\end{aligned}
$$

where

$$
\begin{equation*}
J_{v_{j}}(\mu) \triangleq 1-\frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-z^{2}} \log _{2}\left(1+e^{-2 \sqrt{\mu} z-\mu-v_{j}}\right) d z \tag{56}
\end{equation*}
$$

$$
\begin{array}{rlr}
Q_{1}^{\otimes n} & =\sum_{j=0}^{n}\binom{n}{j}(1-2(a-1) \epsilon-4 \delta)^{n-j}(2(a-1) \epsilon+4 \delta)^{j} \Delta_{\infty}^{\otimes n-j} \otimes \Delta_{0}^{\otimes j}+O\left(\epsilon^{2}, \delta^{2}\right) \\
& = \begin{cases}\Delta_{\infty}+O\left(\epsilon^{2}, \delta^{2}\right), & \text { for } n \geq 2 \\
(1-2(a-1) \epsilon-4 \delta) \Delta_{\infty}+(2(a-1) \epsilon+4 \delta) \Delta_{0}+O\left(\epsilon^{2}, \delta^{2}\right), & \text { for } n=1\end{cases} \tag{50}
\end{array}
$$


$$
\begin{align*}
I(X ; Y)= & p \int f_{Y \mid X=0}(y) \log _{2} \frac{f_{Y \mid X=0}(y)}{p f_{Y \mid X=0}(y)+(1-p) f_{Y \mid X=1}(y)} d y \\
& +(1-p) \int f_{Y \mid X=1}(y) \log _{2} \frac{f_{Y \mid X=1}(y)}{p f_{Y \mid X=0}(y)+(1-p) f_{Y \mid X=1}(y)} d y \\
= & p \int f_{Y \mid X=0}(y) \log _{2} \frac{1}{p+(1-p) e^{-y}} d y+(1-p) \int f_{Y \mid X=1}(y) \log _{2} \frac{1}{p e^{y}+(1-p)} d y \\
= & p \int \sum_{i}^{M} q_{i} f_{Y \mid X=0}^{(i)}(y) \log _{2} \frac{1}{p+(1-p) e^{-y}} d y+(1-p) \int \sum_{i}^{M} q_{i} f_{Y \mid X=1}^{(i)}(y) \log _{2} \frac{1}{p e^{y}+(1-p)} d y \\
= & \sum_{i}^{M} q_{i}\left(p \int f_{Y \mid X=0}^{(i)}(y) \log _{2} \frac{1}{p+(1-p) e^{-y}} d y+(1-p) \int f_{Y \mid X=1}^{(i)}(y) \log _{2} \frac{1}{p e^{y}+(1-p)} d y\right) \\
= & \sum_{i}^{M} q_{i}\left(p \int f_{Y \mid X=0}^{(i)}(y) \log _{2} \frac{f_{Y \mid X=0}^{(i)}(y)}{p f_{Y \mid X=0}^{(i)}(y)+(1-p) f_{Y \mid X=1}^{(i)}(y)} d y\right. \\
& \left.+(1-p) \int f_{Y \mid X=1}^{(i)}(y) \log _{2} \frac{f_{Y \mid X=1}^{(i)}(y)}{p f_{Y \mid X=0}^{(i)}(y)+(1-p) f_{Y \mid X=1}^{(i)}(y)} d y\right) \\
= & I(X ; Y \mid S) . \tag{55}
\end{align*}
$$

Note that $J_{0}(\mu)=J(\mu)$. Since both limits tend to 0 , we derive an asymptotic expansion for $J_{v_{j}}^{\prime}(\mu)$ and $J^{\prime}(\mu)$.

The derivative of $J_{v_{j}}$ is given by

$$
J_{v_{j}}^{\prime}(\mu)=\frac{\log _{2}(e)}{\sqrt{\mu}} \frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty}(z+\sqrt{\mu}) e^{-v_{j}} \frac{e^{-(z+\sqrt{\mu})^{2}}}{1+e^{-2 \sqrt{\mu} z-\mu-v_{j}}} d z
$$

Since $F_{u}$ is symmetric, the sum over $j$ can be rewritten as

$$
\sum_{j=1}^{D} p_{j} J_{v_{j}}^{\prime}(\mu)=p_{0} J_{0}^{\prime}(\mu)+\sum_{j=1}^{D^{\prime}} p_{j}\left(J_{v_{j}}^{\prime}(\mu)+e^{-v_{j}} J_{-v_{j}}^{\prime}(\mu)\right) .
$$

Let us define

$$
\begin{align*}
f_{0}(\mu) & =\frac{1}{\log _{2}(e)} J_{0}^{\prime}(\mu) \\
f_{v_{j}}(\mu) & =\frac{1}{\log _{2}(e)}\left(J_{v_{j}}^{\prime}(\mu)+e^{-v_{j}} J_{-v_{j}}^{\prime}(\mu)\right) \tag{57}
\end{align*}
$$

Following [38], (57) can be rewritten as

$$
\begin{align*}
f_{v_{j}}(\mu)= & \frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty}\left(1+\frac{z}{\sqrt{\mu}}\right) e^{-(z+\sqrt{\mu})^{2}}\left(\frac{e^{-v_{j}}}{1+e^{-2 \sqrt{\mu} z-\mu-v_{j}}}+\frac{1}{1+e^{-2 \sqrt{\mu} z-\mu+v_{j}}}\right) d z \\
= & \frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} \frac{1}{\sqrt{\mu}}\left(z+\frac{\sqrt{\mu}}{2}\right) e^{-\left(z+\frac{\sqrt{\mu}}{2}\right)^{2}}\left(\frac{e^{-v_{j}}}{1+e^{-2 \sqrt{\mu} z-v_{j}}}+\frac{1}{1+e^{-2 \sqrt{\mu} z+v_{j}}}\right) d z \\
= & \frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} \frac{z}{\sqrt{\mu}} e^{-z^{2}-\frac{\mu}{4}-\frac{v_{j}}{2}}\left(\frac{1}{e^{\sqrt{\mu} z+\frac{v_{j}}{2}}+e^{-\sqrt{\mu} z-\frac{v_{j}}{2}}}+\frac{1}{e^{\sqrt{\mu} z-\frac{v_{j}}{2}}+e^{-\sqrt{\mu} z+\frac{v_{j}}{2}}}\right) d z \\
& +\frac{1}{\sqrt{\pi}} \int_{-\infty}^{+\infty} \frac{1}{2} e^{-z^{2}-\frac{\mu}{4}-\frac{v_{j}}{2}}\left(\frac{1}{e^{\sqrt{\mu} z+\frac{v_{j}}{2}}+e^{-\sqrt{\mu} z-\frac{v_{j}}{2}}}+\frac{1}{e^{\sqrt{\mu} z-\frac{v_{j}}{2}}+e^{-\sqrt{\mu} z+\frac{v_{j}}{2}}}\right) d z \\
= & \frac{e^{-\frac{\mu}{4}-\frac{v_{j}}{2}}}{4 \sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-z^{2}}\left(\frac{1}{\cosh \left(\sqrt{\mu} z+\frac{v_{j}}{2}\right)}+\frac{1}{\cosh \left(\sqrt{\mu} z-\frac{v_{j}}{2}\right)}\right) d z \\
= & \frac{e^{-\frac{\mu}{4}-\frac{v_{j}}{2}}}{4 \sqrt{\pi \mu}} \int_{-\infty}^{+\infty} \frac{e^{-\frac{\left(z-\frac{v_{j}}{2}\right)^{2}}{\mu}}+e^{-\frac{\left(z+\frac{v_{j}}{2}\right)^{2}}{\mu}}}{\cosh (z)} d z \tag{58}
\end{align*}
$$

The second equality in (58) is obtained by the change of variable $z^{\prime}=z+\sqrt{\mu} / 2$. The fourth equality is due to the fact that the first and second integrands in the third line of (58) are odd and even functions of $z$, respectively. Then we use the changes of variable $z^{\prime}=\sqrt{\mu} z+\frac{v_{j}}{2}$ and $z^{\prime}=\sqrt{\mu} z-\frac{v_{j}}{2}$.

Lebesgue's dominated convergence theorem completes the proof. The sequence of measurable functions satisfies

$$
\forall z \in \mathbb{R}, \quad \frac{e^{-\frac{z^{2}}{\mu}}}{\cosh (z)} \xrightarrow[\mu \rightarrow+\infty]{ } \frac{1}{\cosh (z)}
$$

and these functions are bounded by an integrable function independent of $\mu$:

$$
\forall \mu>0, \forall z \in \mathbb{R}, \quad\left|\frac{e^{-\frac{z^{2}}{\mu}}}{\cosh (z)}\right| \leq \frac{1}{\cosh (z)} \in L^{1}(\mathbb{R})
$$

Thus, Lebesgue's dominated convergence theorem [37] applies and

$$
\int_{-\infty}^{+\infty} \frac{e^{-\frac{z^{2}}{\mu}}}{\cosh (z)} d z \xrightarrow[\mu \rightarrow+\infty]{ } \int_{-\infty}^{+\infty} \frac{1}{\cosh (z)} d z=\left[2 \arctan \left(e^{z}\right)\right]_{-\infty}^{+\infty}=\pi
$$

Therefore, for large $\mu$

$$
f_{v_{j}}(\mu) \sim \frac{\sqrt{\pi}}{2} \frac{e^{-\frac{\mu}{4}} e^{-\frac{v_{j}}{2}}}{\sqrt{\mu}}
$$

Similarly, we get

$$
f_{0}(\mu) \sim \frac{\sqrt{\pi}}{4} \frac{e^{-\frac{\mu}{4}}}{\sqrt{\mu}}
$$

And thus, for $n \geq 1$

$$
\lim _{\mu \rightarrow+\infty} \frac{f_{v_{j}}(n \mu)}{f_{0}(\mu)}= \begin{cases}2 e^{-\frac{v_{j}}{2}}, & \text { if } n=1 \\ 0, & \text { if } n>1\end{cases}
$$

and

$$
\lim _{\mu \rightarrow+\infty} \frac{f_{0}(n \mu)}{f_{0}(\mu)}= \begin{cases}1, & \text { if } n=1 \\ 0, & \text { if } n>1\end{cases}
$$

The partial derivatives of $\phi$ and $\tilde{\phi}$ are

$$
\begin{align*}
\frac{\partial \phi}{\partial x}(1,1) & =\lambda_{2}(a-1) \cdot\left(p_{0}+\sum_{j=1}^{D^{\prime}} 2 p_{j} e^{-\frac{v_{j}}{2}}\right) \\
& =\lambda_{2}(a-1) \sum_{j=1}^{D} p_{j} e^{-\frac{v_{j}}{2}} \\
& =\lambda_{2}(a-1) e^{-r} \tag{59}
\end{align*}
$$

where $r$ is defined in (53). Similarly

$$
\begin{align*}
& \frac{\partial \phi}{\partial \tilde{x}}(1,1)=\lambda_{2} 2 e^{-r}  \tag{60}\\
& \frac{\partial \tilde{\phi}}{\partial x}(1,1)=a e^{-r}  \tag{61}\\
& \frac{\partial \tilde{\phi}}{\partial \tilde{x}}(1,1)=e^{-r} \tag{62}
\end{align*}
$$

We get the Jacobian matrix as

$$
J=\left[\begin{array}{cc}
(a-1) \lambda_{2} & 2 \lambda_{2} \\
a & 1
\end{array}\right] e^{-r}
$$

In order to be stable, the eigenvalues of $J$ should be inside the unit circle. Therefore, the stability condition reduces to

$$
\begin{equation*}
\frac{1}{2}\left(1+\lambda_{2}(a-1)+\sqrt{1+(2+6 a) \lambda_{2}+(a-1)^{2} \lambda_{2}^{2}}\right)<e^{r} \tag{63}
\end{equation*}
$$

Notice from (54) and (63) that the stability conditions under DE and approximated DE are the same.

## Appendix IV

## Proof of Proposition 3

The Jacobian matrix of the approximated DE (41) about the fixed point $(x, \tilde{x})=(1,1)$, for a given input channel distribution $F_{u}$, is

$$
\boldsymbol{J}=\left[\begin{array}{cc}(a-1) \lambda^{\prime}(0) & 2 \lambda^{\prime}(0) \\ a & 1\end{array}\right]\left(1-\mathcal{I}\left(F_{u}\right)\right)=A\left(1-\mathcal{I}\left(F_{u}\right)\right)
$$

where $A$ was already defined in (51). The stability of the exact DE is given by the eigenvalues of $A e^{-r}$ (where $r$ is defined in (53)) while it is given by those of $A\left(1-\mathcal{I}\left(F_{u}\right)\right)$ for the approximated DE (where $\mathcal{I}(F)$ is given in (28)).

Under the assumption that $F_{u}$ is symmetric, we get

$$
\begin{aligned}
\int_{-\infty}^{0} e^{-z / 2} d F_{u}(z) & =\int_{0}^{+\infty} e^{-z / 2} d F_{u}(z) \\
\int_{-\infty}^{0} \log _{2}\left(1+e^{-z}\right) d F_{u}(z) & =\int_{0}^{+\infty} e^{-z} \log _{2}\left(1+e^{z}\right) d F_{u}(z)
\end{aligned}
$$

It follows that

$$
e^{-r}=\int_{0}^{+\infty} 2 e^{-z / 2} d F_{u}(z)
$$

and that

$$
1-\mathcal{I}\left(F_{u}\right)=\int_{0}^{+\infty}\left(\left(1+e^{-z}\right) \log _{2}\left(1+e^{-z}\right)+\frac{z}{\log 2} e^{-z}\right) d F_{u}(z)
$$

From the inequality

$$
\begin{equation*}
\forall z \geq 0, \quad\left(1+e^{-z}\right) \log \left(1+e^{-z}\right)+z e^{-z} \leq 2(\log 2) e^{-z / 2} \tag{64}
\end{equation*}
$$

we get

$$
\forall F_{u} \in \mathcal{F}_{\text {sym }}, \quad 1-\mathcal{I}\left(F_{u}\right) \leq e^{-r}
$$

and the conclusion follows.
In the following, we show inequality (64). Letting $x=e^{-z}$, (64) becomes equivalent to

$$
\forall x \in[0,1], \quad f(x) \leq 0
$$

where

$$
\begin{equation*}
f(x) \triangleq(1+x) \log (1+x)-x \log x-2(\log 2) \sqrt{x} \tag{65}
\end{equation*}
$$

It can be shown that $f(x)$ has a single minimum in the open interval $(0,1)$. Hence, by noticing that

$$
\lim _{x \rightarrow 0} f(x)=0 \quad \text { and } \quad f(1)=0
$$

we get inequality (64).
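Inequality (64), in the equivalent form $f(x) \leq 0$ with $f$ as in (65), can be spot-checked numerically. The grid evaluation below is only a sanity check, not a substitute for the argument above; the function names and the grid size are arbitrary choices.

```python
import math

def f(x):
    """f(x) of (65), defined for 0 < x <= 1 (natural logarithms)."""
    return (1.0 + x) * math.log(1.0 + x) - x * math.log(x) - 2.0 * math.log(2.0) * math.sqrt(x)

def max_f_on_grid(n=10000):
    """Largest value of f on the grid {1/n, 2/n, ..., 1}."""
    return max(f(k / n) for k in range(1, n + 1))

if __name__ == "__main__":
    print(max_f_on_grid())
```

The maximum over the grid is attained at $x=1$, where $f(1)=0$, in agreement with the boundary values used in the proof.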

## APPENDIX V

## Proof of Theorem 3

Theorem 3 follows as a corollary of a result of [23] that we state here for the sake of completeness as Lemma 1. In order to introduce this result, we consider the model of Fig. 9, where $b, x_{1}$, and $x$ are binary sequences and where Channel 1 is the communication channel with output $y$ and Channel 2 is a BEC with output $z$. Let the decoder be a maximum a posteriori (MAP) symbol-by-symbol decoder, producing for all $i=1, \ldots, n$ output messages of the form

$$
\begin{equation*}
m_{o, i}=\log \frac{P\left(x_{1, i}=0 \mid \boldsymbol{y}, \boldsymbol{z}_{[i]}\right)}{P\left(x_{1, i}=1 \mid \boldsymbol{y}, \boldsymbol{z}_{[i]}\right)} \tag{66}
\end{equation*}
$$

where $z_{[i]} \triangleq\left(z_{1}, \ldots, z_{i-1}, z_{i+1}, \ldots, z_{n}\right)$. Following [23], we generalize the definition of $I_{A}$ and $I_{E}$ given in Section III-A to the case of sequences as

$$
\begin{align*}
I_{A} & =\frac{1}{n} \sum_{i=1}^{n} I\left(x_{1, i} ; z_{i}\right) \\
I_{E} & =\frac{1}{n} \sum_{i=1}^{n} I\left(x_{1, i} ; m_{o, i}\right) \\
& \stackrel{(\mathrm{a})}{=} \frac{1}{n} \sum_{i=1}^{n} I\left(x_{1, i} ; \boldsymbol{y}, z_{[i]}\right) \tag{67}
\end{align*}
$$

where (a) follows from the fact that the decoder is MAP. Again, the decoder EXIT function is the set of points $\left(I_{A}, I_{E}\right)$ for all $I_{A} \in[0,1]$.

Fig. 9. General decoding model.

For the setup of Fig. 9 with the above assumptions, the following result applies.

Lemma 1: [23] Let $b$ be uniformly distributed and i.i.d. If Encoder 2 is linear with generator matrix having no all-zero columns, then the area under the EXIT characteristic satisfies

$$
\begin{equation*}
\mathcal{A} \triangleq \int_{0}^{1} I_{E}(z) d z=1-\frac{1}{n} H\left(x_{1} \mid y\right) \tag{68}
\end{equation*}
$$

We start by proving Theorem 3 for the approximated DE of Method 4. Consider the IRA encoder of Fig. 1 and the turbo-like decoder of Fig. 5.

The inner MAP decoder receives channel observations $u_{p}$ for the parity bits and input messages for the symbols of $x_{1}$, and produces output messages for the symbols of $x_{1}$. The general decoding model of Fig. 9, applied to the inner decoder, yields the model of Fig. 10(a).

The outer MAP decoder receives channel observations $u_{s}$ for the information bits and input messages for the symbols of $x_{1}$, and produces output messages for the symbols of $x_{1}$. The general decoding model of Fig. 9, applied to the outer decoder, yields the model of Fig. 10(b).

The upper channel is the communication channel with capacity $\mathcal{I}\left(F_{u}\right)$. Since we consider approximation Method 4, we let the lower channel be a BEC in both Fig. 10(a) and (b). Let $k, n$, and $m$ denote the number of information bits (length of $b$ and of $u_{s}$ ), the number of repeated information bits (length of $x_{1}$ ), and the number of parity bits (length of $x_{2}$ and of $u_{p}$ ), respectively. The inner and outer coding rates are $R_{\text {in }}=n / m$ and $R_{\text {out }}=k / n$, and the overall IRA coding rate (3) is given by

$$
R=\frac{k}{k+m}=\frac{R_{\mathrm{in}} R_{\mathrm{out}}}{1+R_{\mathrm{in}} R_{\mathrm{out}}}
$$

By applying Lemma 1 to the inner code model (Fig. 10(a)), we obtain

$$
\begin{align*}
\mathcal{A}_{\text {in }} & =1-\frac{1}{n} H\left(x_{1} \mid u_{p}\right) \\
& =1-\frac{1}{n}\left(H\left(x_{1}\right)-I\left(x_{1} ; u_{p}\right)\right) \\
& \stackrel{(a)}{=} \frac{1}{n} I\left(x_{1} ; u_{p}\right) \\
& \stackrel{(b)}{=} \frac{m}{n} I\left(x_{2, i} ; u_{p, i}\right)=\mathcal{I}\left(F_{u}\right) / R_{\mathrm{in}} \tag{69}
\end{align*}
$$


Fig. 10. Model of inner (a) and outer (b) decoders for Method 4.

where (a) follows from the fact that, by the model assumption, $x_{1}$ is an i.i.d. uniformly distributed binary sequence, and (b) follows from the fact that the accumulator (inner code) generates $x_{2}$ with uniform probability (and uniform marginals) if driven by the i.i.d. uniform input sequence $x_{1}$.

By applying Lemma 1 to the outer code model (Fig. 10(b)), we obtain

$$
\begin{align*}
\mathcal{A}_{\text {out }} & =1-\frac{1}{n} H\left(x_{1} \mid u_{s}\right) \\
& =1-\frac{1}{n}\left(H\left(x_{1}\right)-I\left(x_{1} ; u_{s}\right)\right) \\
& \stackrel{(a)}{=} 1-\frac{k}{n}+\frac{1}{n} I\left(x_{1} ; u_{s}\right) \\
& \stackrel{(b)}{=} 1-\frac{k}{n}+\frac{k}{n} I\left(b_{i} ; u_{s, i}\right) \\
& =1-R_{\mathrm{out}}+R_{\mathrm{out}} \mathcal{I}\left(F_{u}\right) \tag{70}
\end{align*}
$$

where both (a) and (b) follow from the fact that the repetition code is an invertible mapping, so the entropy $H\left(x_{1}\right)$ is equal to the entropy of the information sequence $b$ (equal to $k$ bits) and $I\left(\boldsymbol{x}_{1} ; \boldsymbol{u}_{s}\right)=I\left(\boldsymbol{b} ; \boldsymbol{u}_{s}\right)=k I\left(b_{i} ; u_{s, i}\right)=k \mathcal{I}\left(F_{u}\right)$.

As seen in Section III-D, the approximated DE has no fixed points other than $(1,1)$ if and only if $g(x)>h^{-1}(x)$ for all $x \in[0,1)$, where $g(x)$ and $h(x)$ denote the inner and outer decoder EXIT functions. This implies that

$$
\mathcal{A}_{\mathrm{in}}=\int_{0}^{1} g(x) d x>\int_{0}^{1} h^{-1}(x) d x=1-\mathcal{A}_{\text {out }}
$$

By using (69) and (70), we obtain

$$
\begin{align*}
\mathcal{I}\left(F_{u}\right) / R_{\text {in }} & >R_{\mathrm{out}}-R_{\mathrm{out}} \mathcal{I}\left(F_{u}\right) \\
& \Downarrow \\
\mathcal{I}\left(F_{u}\right) & >\frac{R_{\mathrm{in}} R_{\mathrm{out}}}{1+R_{\mathrm{in}} R_{\mathrm{out}}}=R . \tag{71}
\end{align*}
$$
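For completeness, the implication in (71) can be expanded step by step. The following is only a restatement of the algebra, using $\int_{0}^{1} h^{-1}(x) d x=1-\mathcal{A}_{\text {out }}$ together with (69) and (70), and assuming $R_{\mathrm{in}}>0$:

$$
\begin{aligned}
\frac{\mathcal{I}\left(F_{u}\right)}{R_{\mathrm{in}}} & >1-\left(1-R_{\mathrm{out}}+R_{\mathrm{out}} \mathcal{I}\left(F_{u}\right)\right)=R_{\mathrm{out}}-R_{\mathrm{out}} \mathcal{I}\left(F_{u}\right) \\
\mathcal{I}\left(F_{u}\right) & >R_{\mathrm{in}} R_{\mathrm{out}}-R_{\mathrm{in}} R_{\mathrm{out}} \mathcal{I}\left(F_{u}\right) \\
\mathcal{I}\left(F_{u}\right)\left(1+R_{\mathrm{in}} R_{\mathrm{out}}\right) & >R_{\mathrm{in}} R_{\mathrm{out}} \\
\mathcal{I}\left(F_{u}\right) & >\frac{R_{\mathrm{in}} R_{\mathrm{out}}}{1+R_{\mathrm{in}} R_{\mathrm{out}}}=R
\end{aligned}
$$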

For Method 2, the above derivation still holds, since the communication channel in Fig. 9 is replaced by a BEC with erasure probability $\epsilon=1-\mathcal{I}\left(F_{u}\right)$. In fact, the inner and outer decoder EXIT functions can be rewritten as

$$
\begin{aligned}
& h(x)=1-\left(1-\mathcal{I}\left(F_{u}\right)\right) \sum_{i=2}^{d} \lambda_{i}(1-x)^{i-1} \\
& g(x)=\frac{x^{a-1} \mathcal{I}\left(F_{u}\right)^{2}}{\left(1-\left(1-\mathcal{I}\left(F_{u}\right)\right) x^{a}\right)^{2}}
\end{aligned}
$$

and the areas under these functions are again

$$
\begin{aligned}
\mathcal{A}_{\mathrm{out}} & =\int_{0}^{1} h(x) d x=1-\left(1-\mathcal{I}\left(F_{u}\right)\right) \sum_{i=2}^{d} \lambda_{i} / i \\
& =1-R_{\mathrm{out}}+R_{\mathrm{out}} \mathcal{I}\left(F_{u}\right) \\
\mathcal{A}_{\text {in }} & =\int_{0}^{1} g(x) d x=\mathcal{I}\left(F_{u}\right) / a=\mathcal{I}\left(F_{u}\right) / R_{\text {in }}
\end{aligned}
$$

Therefore, the final result (71) holds also for Method 2.
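The two areas for Method 2 can be checked by direct integration. The sketch below writes $\mathcal{I}$ for $\mathcal{I}\left(F_{u}\right)$, assumes $0<\mathcal{I}<1$, and uses the substitution $u=x^{a}$ (so $d u=a x^{a-1} d x$) for the inner code; the identification of $\sum_{i} \lambda_{i} / i$ with $R_{\mathrm{out}}$ is taken from the expression for $\mathcal{A}_{\mathrm{out}}$ above.

$$
\begin{aligned}
\int_{0}^{1}(1-x)^{i-1} d x & =\frac{1}{i} \quad \Longrightarrow \quad \int_{0}^{1} h(x) d x=1-(1-\mathcal{I}) \sum_{i=2}^{d} \frac{\lambda_{i}}{i} \\
\int_{0}^{1} g(x) d x & =\frac{\mathcal{I}^{2}}{a} \int_{0}^{1} \frac{d u}{(1-(1-\mathcal{I}) u)^{2}} \\
& =\frac{\mathcal{I}^{2}}{a}\left[\frac{1}{(1-\mathcal{I})(1-(1-\mathcal{I}) u)}\right]_{0}^{1} \\
& =\frac{\mathcal{I}^{2}}{a} \cdot \frac{1}{1-\mathcal{I}}\left(\frac{1}{\mathcal{I}}-1\right)=\frac{\mathcal{I}}{a}
\end{aligned}
$$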

## ACKNOWLEDGMENT

The authors wish to thank Dr. Alex Ashikhmin for the helpful discussion concerning the results in [23].

## References

[1] C. Berrou, A. Glavieux, and P. Thitimajshima, "Near Shannon limit error-correcting and decoding: Turbo codes," in Proc. IEEE Int. Conf. Communications, Geneva, Switzerland, May 1993, pp. 1064-1070.
[2] R. G. Gallager, Low-Density Parity-Check Codes. Cambridge, MA: MIT Press, 1963.
[3] M. Luby, M. Mitzenmacher, A. Shokrollahi, D. Spielman, and V. Stemann, "Practical loss-resilient codes," in Proc. 29th ACM Symp. Theory of Computing (STOC), 1997, pp. 150-159.
[4] M. Luby, M. Mitzenmacher, A. Shokrollahi, and D. Spielman, "Efficient erasure correcting codes," IEEE Trans. Inform. Theory, vol. 47, pp. 569-584, Feb. 2001.
[5] D. Divsalar, H. Jin, and R. McEliece, "Coding theorems for 'Turbo-like' codes," in Proc. 36th Annu. Allerton Conf. Communication, Control, and Computing, Sept. 1998, pp. 201-210.
[6] R. Urbanke et al. (2002) Web page. [Online]. Available: http://thcwww.epfl.ch/research/ldpcopt/
[7] N. Varnica and A. Kavčić, "Optimized LDPC codes for partial response channels," in Proc. IEEE Int. Symp. Information Theory (ISIT 2002), Lausanne, Switzerland, June/July 2002, p. 197.
[8] X. Ma, N. Varnica, and A. Kavčić, "Matched information rate codes for binary ISI channels," in Proc. IEEE Int. Symp. Information Theory (ISIT 2002), Lausanne, Switzerland, June/July 2002, p. 269.
[9] B. M. Kurkoski, P. H. Siegel, and J. K. Wolf, "Joint message-passing decoding of LDPC codes and partial-response channels," IEEE Trans. Inform. Theory, vol. 48, pp. 1410-1422, June 2002.
[10] M. Luby, M. Mitzenmacher, A. Shokrollahi, and D. Spielman, "Analysis of low-density codes and improved designs using irregular graphs," in Proc. 30th ACM Symp. Theory of Computing, 1998, pp. 249-258.
[11] T. J. Richardson, M. A. Shokrollahi, and R. L. Urbanke, "Design of capacity-approaching irregular low-density parity-check codes," IEEE Trans. Inform. Theory, vol. 47, pp. 619-637, Feb. 2001.
[12] S. ten Brink, "Designing iterative decoding schemes with the extrinsic information transfer chart," AEÜ Int. J. Electron. Commun., vol. 54, no. 6, pp. 389-398, Dec. 2000.
[13] S. ten Brink, "Convergence behavior of iteratively decoded parallel concatenated codes," IEEE Trans. Commun., vol. 49, pp. 1727-1737, Oct. 2001.
[14] S.-Y. Chung, T. J. Richardson, and R. Urbanke, "Analysis of sum-product decoding of low-density parity-check codes using a Gaussian approximation," IEEE Trans. Inform. Theory, vol. 47, pp. 657-670, Feb. 2001.
[15] H. El Gamal and A. R. Hammons, Jr., "Analyzing the turbo decoder using the Gaussian approximation," IEEE Trans. Inform. Theory, vol. 47, pp. 671-686, Feb. 2001.
[16] J. Boutros and G. Caire, "Iterative multiuser joint decoding: Unified framework and asymptotic analysis," IEEE Trans. Inform. Theory, vol. 48, pp. 1772-1793, July 2002.
[17] F. Lehmann and G.M. Maggio, "An approximate analytical model of the message passing decoder of LDPC codes," in Proc. IEEE Int. Symp. Information Theory (ISIT 2002), Lausanne, Switzerland, June/July 2002, p. 31.
[18] M. Ardakani and F. R. Kschischang, "Designing irregular LDPC codes using EXIT charts based on message error rate," in Proc. IEEE Int. Symp. Information Theory (ISIT 2002), Lausanne, Switzerland, June/July 2002, p. 454.
[19] H. Jin, A. Khandekar, and R. McEliece, "Irregular repeat-accumulate codes," in Proc. 2nd Int. Symp. Turbo Codes and Related Topics, Brest, France, Sept. 2000, pp. 1-8.
[20] J. Boutros, G. Caire, E. Viterbo, H. Sawaya, and S. Vialle, "Turbo code at 0.03 dB from capacity limit," in Proc. IEEE Int. Symp. Information Theory (ISIT 2002), Lausanne, Switzerland, June/July 2002, p. 56.
[21] H. Jin, "Analysis and design of turbo-like codes," Ph.D. dissertation, Calif. Inst. Technol., Pasadena, 2001.
[22] A. Ashikhmin, G. Kramer, and S. ten Brink, "Extrinsic information transfer functions: A model and two properties," in Proc. 36th Annu. Conf. Information Sciences and Systems (CISS 2002), Princeton, NJ, Mar. 2002.
[23] A. Ashikhmin, G. Kramer, and S. ten Brink, "Code rate and the area under extrinsic information transfer curves," in Proc. IEEE Int. Symp. Information Theory (ISIT 2002), Lausanne, Switzerland, July 2002, p. 115.
[24] S. Y. Chung, "On the construction of some capacity-approaching coding schemes," Ph.D. dissertation, MIT, Cambridge, MA, 2000.
[25] R. M. Tanner, "A recursive approach to low complexity codes," IEEE Trans. Inform. Theory, vol. IT-27, pp. 533-547, Sept. 1981.
[26] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann, 1988.
[27] R. J. McEliece, D. J. C. MacKay, and J.-F. Cheng, "Turbo decoding as an instance of Pearl's belief propagation algorithm," IEEE J. Select. Areas Commun., vol. 16, pp. 140-152, Feb. 1998.
[28] F. R. Kschischang and B. J. Frey, "Iterative decoding of compound codes by probability propagation in graphical models," IEEE J. Select. Areas Commun., vol. 16, pp. 219-230, Feb. 1998.
[29] D. Forney, "Codes on graphs: Normal realizations," IEEE Trans. Inform. Theory, vol. 47, pp. 520-548, Feb. 2001.
[30] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, "Optimal decoding of linear codes for minimizing symbol error rate," IEEE Trans. Inform. Theory, vol. IT-20, pp. 284-287, Mar. 1974.
[31] T. J. Richardson and R. L. Urbanke, "The capacity of low-density parity-check codes under message-passing decoding," IEEE Trans. Inform. Theory, vol. 47, pp. 599-618, Feb. 2001.
[32] S. Y. Chung, G. D. Forney, Jr., T. J. Richardson, and R. Urbanke, "On the design of low-density parity-check codes within 0.0045 dB of the Shannon limit," IEEE Commun. Lett., vol. 5, pp. 58-60, Feb. 2001.
[33] S. ten Brink, "Exploiting the chain rule of mutual information for the design of iterative decoding schemes," in Proc. 39th Annu. Allerton Conf. Communication, Control, and Computing, Oct. 2001, pp. 293-300.
[34] M. Tüchler, S. ten Brink, and J. Hagenauer, "Measures for tracing convergence of iterative decoding algorithms," in Proc. 4th Int. ITG Conf. Source and Channel Coding, Berlin, Germany, Jan. 2002, pp. 53-60.
[35] S. Huettinger and J. Huber, "Extrinsic and intrinsic information in systematic coding," in Proc. IEEE Int. Symp. Information Theory (ISIT 2002), Lausanne, Switzerland, June/July 2002, p. 116.
[36] S. ten Brink and G. Kramer, "Turbo processing for scalar and vector channels," in Proc. 3rd Int. Symp. Turbo Codes and Related Topics, Brest, France, Sept. 2003, pp. 23-30.
[37] A. Browder, Mathematical Analysis: An Introduction. New York: Springer-Verlag, 1996.
[38] T. F. Wong, "Numerical calculation of symmetric capacity of Rayleigh fading channel with BPSK/QPSK," IEEE Commun. Lett., vol. 5, pp. 328-330, Aug. 2001.

# A synthesizable IP Core for DVB-S2 LDPC Code Decoding 

Frank Kienle, Torben Brack, Norbert Wehn<br>Microelectronic System Design Research Group<br>University of Kaiserslautern<br>Erwin-Schrödinger-Straße<br>67663 Kaiserslautern, Germany<br>\{kienle, brack, wehn\}@eit.uni-kl.de


#### Abstract

The new standard for digital video broadcast DVB-S2 features Low-Density Parity-Check (LDPC) codes as its channel coding scheme. The codes are defined for various code rates with a block size of 64800, which allows transmission close to the theoretical limits.

The decoding of LDPC codes is an iterative process. For DVB-S2 about 300000 messages are processed and reordered in each of the 30 iterations. These huge data processing and storage requirements are a real challenge for the decoder hardware realization, which has to fulfill the specified throughput of $255 \mathrm{Mbit} / \mathrm{s}$ for base station applications.

In this paper we will show, to the best of our knowledge, the first published IP LDPC decoder core for the DVB-S2 standard. We present a synthesizable IP block based on ST Microelectronics 0.13 $\mu \mathrm{m}$ CMOS technology.


## 1 Introduction

The new DVB-S2 standard [1] features a powerful forward error correction (FEC) system which enables transmission close to the theoretical limit (Shannon limit). This is enabled by using Low-Density Parity-Check (LDPC) codes [2], one of the most powerful classes of codes known today, which can even outperform Turbo-Codes [3]. To provide flexibility, 11 different code rates ranging from $R=1 / 4$ up to $R=9 / 10$ are specified with a codeword length of up to 64800 bits. This huge maximum codeword length is the reason for the outstanding communications performance ( $\sim 0.7 \mathrm{~dB}$ to Shannon) of this DVB-S2 LDPC code proposal, so in this paper we focus only on the codeword length of 64800 bits. To yield this performance, the decoder has to iterate 30 times. At each iteration up to 300000 data are scrambled and calculated. These huge data processing, storage and network/interconnect requirements are a real challenge for the decoder realization.

An LDPC code can be represented by a bipartite graph. For the DVB-S2 code 64800 so-called variable nodes (VN) and $64800 \cdot(1-R)$ check nodes (CN) exist. The connectivity of these two types of nodes is specified in the standard [1]. For decoding the LDPC code, messages are exchanged iteratively between these two types of nodes, while the node processing is of low complexity. Within one iteration first the variable nodes are processed, then the check nodes.

For a fully parallel hardware realization each node is instantiated and the connections between them are hardwired. This was shown in [4] for a 1024 bit LDPC code. But even for this relatively short block length severe routing congestion problems exist. Therefore a partly parallel architecture, in which only a subset of the nodes is instantiated, becomes mandatory for larger block lengths. A network has to provide the required connectivity between VN and CN nodes, but realizing arbitrary permutation patterns is very costly in terms of area, delay and power.

To avoid this problem a decoder-first design approach was presented in [5]. First an architecture is specified and afterwards a code is designed which fits this architecture. This approach is only suitable for regular LDPC codes, where each VN has the same number of incident edges, and likewise each CN. But for an improved communications performance so-called irregular LDPC codes are mandatory [6], where the VN nodes are of varying degrees. This is the case for the DVB-S2 code. In [7] we have presented a design method for irregular LDPC codes which can be efficiently processed by the decoder hardware. We used so-called irregular repeat accumulate (IRA) codes [8], which are within the class of LDPC codes and have the advantage of a very simple (linear) encoding complexity. In general, LDPC encoders are very difficult to implement due to the inherently complex encoding scheme.

The LDPC codes as defined in the DVB-S2 standard are IRA codes, thus the encoder realization is straightforward. Furthermore, the DVB-S2 code shows regularities which can be exploited for an efficient hardware realization.


Figure 1. Tanner graph for the DVB-S2 LDPC code

These regularities are also the basis for our methodology introduced in [7].

In this paper we show how to exploit these regularities and present an efficient mapping of VN and CN nodes to hardware instances. Memory area and access conflicts are most critical in this mapping process. Thus we used simulated annealing to minimize memory requirements and to avoid RAM access conflicts.

We present, to the best of our knowledge, the first IP core capable of processing all specified code rates in the DVB-S2 standard. We show synthesis results using a $0.13 \mu \mathrm{~m}$ technology.

The paper is structured as follows: the DVB-S2 LDPC codes and the decoding algorithm are presented in Section 2. In Section 3 the mapping of nodes to hardware instances is explained. The overall decoder architecture is shown in Section 4. Section 5 gives synthesis results and Section 6 concludes the paper.

## 2 DVB-S2 LDPC Codes

LDPC codes are linear block codes defined by a sparse binary matrix (parity check matrix) $H$. Every valid codeword $x \in C$ has to satisfy

$$
\begin{equation*}
H x^{T}=0, \quad \forall x \in C \tag{1}
\end{equation*}
$$

A column in $H$ is associated to a bit of the codeword and each row corresponds to a parity check. A nonzero element in a row means that the corresponding bit contributes to this parity check. The code can best be described by a Tanner graph [6], a graphical representation of the associations between code bits and parity checks. Code bits are shown as variable nodes (circles), and parity checks as check nodes (squares), with edges connecting them. The number of edges on each node is called the node degree. If the node degree is identical for all variable nodes, the corresponding parity check matrix is called regular, otherwise it's irregular.
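As a minimal sketch of these definitions, the following Python fragment checks Equation 1 and computes the node degrees of the Tanner graph. The matrix `H` is a hypothetical toy example, not a DVB-S2 matrix, and the function names are chosen here for illustration only.

```python
# Toy parity-check matrix (hypothetical, not taken from the DVB-S2 standard).
# Rows are parity checks, columns are code bits.
H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]

def is_codeword(H, x):
    """True if every parity check (row of H) sums to 0 modulo 2 (Equation 1)."""
    return all(sum(h * b for h, b in zip(row, x)) % 2 == 0 for row in H)

def node_degrees(H):
    """Return (variable-node degrees, check-node degrees) of the Tanner graph."""
    vn = [sum(col) for col in zip(*H)]   # column sums
    cn = [sum(row) for row in H]         # row sums
    return vn, cn

vn_deg, cn_deg = node_degrees(H)
# The code is called regular if all variable nodes share the same degree.
regular = len(set(vn_deg)) == 1

print(is_codeword(H, [1, 1, 0, 0, 1, 1]))
print(vn_deg, cn_deg, regular)
```

In this toy matrix the variable nodes have degrees 2 and 1, so the code is irregular in the sense defined above.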

| Rate | $\mathbf{j}$ | $\mathbf{f}_{\mathbf{j}}$ | $\mathbf{f}_{\mathbf{3}}$ | $\mathbf{k}$ | $\mathbf{N}$ | $\mathbf{K}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| $1 / 4$ | 12 | 5400 | 10800 | 4 | 48600 | 16200 |
| $1 / 3$ | 12 | 7200 | 14400 | 5 | 43200 | 21600 |
| $2 / 5$ | 12 | 8640 | 17280 | 6 | 38880 | 25920 |
| $1 / 2$ | 8 | 12960 | 19440 | 7 | 32400 | 32400 |
| $3 / 5$ | 12 | 12960 | 25920 | 11 | 25920 | 38880 |
| $2 / 3$ | 13 | 4320 | 38880 | 10 | 21600 | 43200 |
| $3 / 4$ | 12 | 5400 | 43200 | 14 | 16200 | 48600 |
| $4 / 5$ | 11 | 6480 | 45360 | 18 | 12960 | 51840 |
| $5 / 6$ | 13 | 5400 | 48600 | 22 | 10800 | 54000 |
| $8 / 9$ | 4 | 7200 | 50400 | 27 | 7200 | 57600 |
| $9 / 10$ | 4 | 6480 | 51840 | 30 | 6480 | 58320 |

Table 1. Parameters describing the DVB-S2 LDPC Tanner graph for different code rates

By careful inspection of the construction rules, the DVB-S2 parity check matrix is seen to consist of two distinctive parts: a random part dedicated to the systematic information, and a fixed part that belongs to the parity information. The Tanner graph for DVB-S2 is shown in Figure 1. There exist two types of variable nodes, the information nodes (IN) and parity nodes (PN), corresponding to the systematic and parity bits respectively. The permutation $\Pi$ represents the random matrix part of the connectivity between IN and CN nodes, while the PN nodes are all of degree two and are connected in a fixed zigzag pattern to the CN nodes. The $N$ check nodes have a constant degree $k$. The $K$ information nodes consist of two subsets, with $f_{j}$ and $f_{3}$ the number of IN nodes of degree $j$ and 3, respectively. Table 1 summarizes the code rate dependent parameters as defined in the standard [1].
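The parameters of Table 1 are tied together by simple counting arguments, which the following sketch checks for three of the eleven rates. The relation `N * (k - 2)` for the number of IN-CN edges relies on the statement in Section 3 that $k-2$ edges of each check node go to information nodes; the variable and function names are illustrative.

```python
CODEWORD_LENGTH = 64800

# (rate, j, f_j, f_3, k, N, K) copied from Table 1 for three rates.
TABLE_1 = [
    ("1/2", 8, 12960, 19440, 7, 32400, 32400),
    ("3/5", 12, 12960, 25920, 11, 25920, 38880),
    ("9/10", 4, 6480, 51840, 30, 6480, 58320),
]

def edge_counts(j, f_j, f_3, k, N):
    """Count the IN-CN edges once from the IN side and once from the CN side."""
    from_in = j * f_j + 3 * f_3   # every IN contributes as many edges as its degree
    from_cn = N * (k - 2)         # k - 2 edges of each CN lead to information nodes
    return from_in, from_cn

for rate, j, f_j, f_3, k, N, K in TABLE_1:
    from_in, from_cn = edge_counts(j, f_j, f_3, k, N)
    # Both subsets together form the K information nodes, and N + K is the block length.
    print(rate, f_j + f_3 == K, N + K == CODEWORD_LENGTH, from_in, from_cn)
```

For each listed rate both edge counts agree, and they match the value $E_{IN}$ used later for the hardware mapping.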

The connectivity of the IN and CN nodes is defined by the DVB-S2 encoding rule

$$
\begin{equation*}
p_{j}=p_{j} \oplus i_{m}, \quad j=(x+q \cdot(m \bmod 360)) \bmod N \tag{2}
\end{equation*}
$$

$p_{j}$ is the $j$th parity bit, $i_{m}$ the $m$th information code bit, and $x, q$, and $N$ are code rate dependent parameters specified by the DVB-S2 standard. Equation 2 determines the entries of the parity check matrix. The $m$th column has a nonzero element in each row $j$ given by Equation 2, thus the permutation $\Pi$ generates one edge between IN $m$ and every such CN $j$.
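A minimal sketch of this encoding rule is given below, assuming the parity address is computed as $j=(x+q \cdot(m \bmod 360)) \bmod N$. The list `x_values` is a hypothetical stand-in for the table entries of the standard that apply to information bit $m$, and the toy parameters (group size 4, $q=2$, $N=8$) are chosen only to keep the example small.

```python
def accumulate_information_bit(parity, info_bit, m, x_values, q, group=360):
    """XOR information bit m into every parity bit addressed by Equation 2."""
    N = len(parity)
    for x in x_values:
        j = (x + q * (m % group)) % N
        parity[j] ^= info_bit
    return parity

# Toy example with hypothetical values: N = 8 parity bits, q = 2, groups of 4 bits.
parity = accumulate_information_bit([0] * 8, info_bit=1, m=3,
                                    x_values=[0, 5], q=2, group=4)
print(parity)  # bits 6 and 3 are set: [0, 0, 0, 1, 0, 0, 1, 0]
```

Each entry of `x_values` corresponds to one nonzero element in column $m$ of the parity check matrix, that is, to one edge of the Tanner graph.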

The fixed zigzag connectivity of the PN and CN nodes is defined by the encoding scheme:

$$
\begin{equation*}
p_{j}=p_{j} \oplus p_{j-1}, \quad j=1,2, \ldots, N-1 . \tag{3}
\end{equation*}
$$

This is a simple accumulator. The corresponding part of the parity check matrix has two nonzero elements in each column, forming a square banded matrix. LDPC codes of this type, with this simple encoding procedure, are also called irregular repeat accumulate (IRA) codes [8].
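The accumulator of Equation 3 amounts to a running XOR over the parity bits. The following sketch (function name chosen here for illustration) applies it in place and shows the zigzag property: after accumulation, each pair of neighbouring parity bits XORs to the value the check had before accumulation.

```python
def zigzag_accumulate(parity):
    """Apply p_j = p_j XOR p_{j-1} for j = 1, ..., N-1 (Equation 3), in place."""
    for j in range(1, len(parity)):
        parity[j] ^= parity[j - 1]
    return parity

before = [1, 0, 1, 1]
after = zigzag_accumulate(list(before))
print(after)  # [1, 1, 0, 1]

# Zigzag check: p_j XOR p_{j-1} recovers the pre-accumulation value of p_j.
print([after[j] ^ after[j - 1] for j in range(1, len(after))] == before[1:])
```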

### 2.1 Decoding Algorithm

LDPC codes can be decoded using the message passing algorithm [2]. It exchanges soft information iteratively between the variable and check nodes. The update of the nodes can be done with a canonical scheduling [2]: in the first step all variable nodes are updated, in the second step all check nodes. The processing of individual nodes within one step is independent, and can thus be parallelized.

Figure 2. a) conventional message update scheme b) optimized message update scheme

The exchanged messages are assumed to be log-likelihood ratios (LLR). Each variable node of degree $i$ calculates an update of message $k$ according to:

$$
\begin{equation*}
\lambda_{k}=\lambda_{c h}+\sum_{l=0, l \neq k}^{i-1} \lambda_{l}, \tag{4}
\end{equation*}
$$

with $\lambda_{c h}$ the corresponding channel LLR of the VN and $\lambda_{l}$ the LLRs of the incident edges. The check node LLR updates are calculated according to

$$
\begin{equation*}
\tanh \left(\lambda_{k} / 2\right)=\prod_{l=0, l \neq k}^{i-1} \tanh \left(\lambda_{l} / 2\right) \tag{5}
\end{equation*}
$$
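The two update rules can be sketched in floating point as follows. This is an illustration of Equations 4 and 5 only, not the fixed-point datapath of the decoder; the function names and the clipping constant are assumptions made here.

```python
import math

def variable_node_update(channel_llr, incoming):
    """Equation 4: outgoing LLR k is the channel LLR plus all incoming LLRs except k."""
    total = channel_llr + sum(incoming)
    return [total - llr for llr in incoming]

def check_node_update(incoming):
    """Equation 5: tanh rule over all incoming LLRs except the one being updated."""
    outgoing = []
    for k in range(len(incoming)):
        prod = 1.0
        for l, llr in enumerate(incoming):
            if l != k:
                prod *= math.tanh(llr / 2.0)
        # Clip so that atanh stays finite if the product rounds to +/-1.
        prod = max(min(prod, 1.0 - 1e-12), -1.0 + 1e-12)
        outgoing.append(2.0 * math.atanh(prod))
    return outgoing

print(variable_node_update(0.5, [1.0, -2.0, 3.0]))  # [1.5, 4.5, -0.5]
print(check_node_update([1.0, 2.0, -3.0]))
```

For a check node, the sign of each outgoing LLR is the product of the signs of the other incoming LLRs, and its magnitude does not exceed the smallest of their magnitudes.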

For fixed-point implementations it was shown in [9] that the total quantization loss is $\leq 0.1 \mathrm{~dB}$ when using a 6 bit message quantization compared to infinite precision. For a 5 bit message quantization the loss is $0.1-0.2 \mathrm{~dB}$ [6].

### 2.2 Optimized update of degree 2 Parity Nodes

The DVB standard supports LDPC codes ranging from code rate $R=1 / 4$ to $R=9 / 10$. Each code has one common property: the connectivity of the check nodes caused by the accumulator of the encoder. $C N_{0}$ is always connected to $C N_{1}$ by a variable node of degree 2 and so on for all CN nodes. A variable node of degree 2 has the property that the input of the first incident edge is the output of the second incident edge (plus the received channel value) and vice versa. For a sequential processing of the check nodes (e.g. from left to right in Figure 1) an already updated message can be passed directly to the next check node due to the simple zigzag connectivity. This immediate message update changes the conventional update scheme between CN and VN nodes (Equation 4).

| Rate | $\mathbf{q}$ | $\mathbf{E}_{\mathbf{PN}}$ | $\mathbf{E}_{\mathbf{IN}}$ | Addr |
| :---: | :---: | :---: | :---: | :---: |
| $1 / 4$ | 135 | 97199 | 97200 | 270 |
| $1 / 3$ | 120 | 86399 | 129600 | 360 |
| $2 / 5$ | 108 | 77759 | 155520 | 432 |
| $1 / 2$ | 90 | 64799 | 162000 | 450 |
| $3 / 5$ | 72 | 51839 | 233280 | 648 |
| $2 / 3$ | 60 | 43199 | 172800 | 480 |
| $3 / 4$ | 45 | 32399 | 194400 | 540 |
| $4 / 5$ | 36 | 25919 | 207360 | 576 |
| $5 / 6$ | 30 | 21599 | 216000 | 600 |
| $8 / 9$ | 20 | 14399 | 180000 | 500 |
| $9 / 10$ | 18 | 12959 | 181440 | 504 |

Table 2. Code rate dependent parameters, with $E$ the number of incident edges of IN and PN nodes and Addr the number of values required to store the code structure

The difference in the update scheme is presented in Figure 2. Only the connectivity between check nodes and parity nodes is depicted; the incident edges from the information nodes are omitted. Figure 2a) shows the standard belief propagation with the two-phase update. In the first phase all messages from the PN to CN nodes are updated, in the second phase the messages from CN to PN nodes. The message update within one phase is commutative and can be fully parallelized. Figure 2b) shows our new message update scheme in which the new CN message is directly passed to the next CN node. This data flow is denoted as a forward update and corresponds to a sequential message update. The backwards update from the PN to CN nodes is again a parallel update. Note that a sequential backwards update would result in a maximum a posteriori (MAP) algorithm.

This new update scheme improves the communications performance. For the same communications performance 10 iterations can be saved, i.e. 30 iterations instead of 40 are sufficient. Furthermore we need to store only one message instead of two messages for the next iteration, which is explained in more detail in Section 4.

## 3 Hardware mapping

As already mentioned, only partly parallel architectures are feasible. Hence only a subset $P$ of the nodes is instantiated. The variable and check nodes have to be mapped on these $P$ functional units. All messages have to be stored during the iterative process, while taking care of RAM access conflicts. Furthermore we need a permutation network which provides the connectivity of the Tanner graph.


Figure 3. Message and functional unit mapping for $R=1 / 2$

We can split the set of edges $E$ connecting the check nodes in two subsets $E_{I N}$ and $E_{P N}$, indicating the connections between $\mathrm{CN} / \mathrm{IN}$ nodes and $\mathrm{CN} / \mathrm{PN}$ nodes respectively. The subsets are shown in Table 2 for each code rate. Furthermore the $q$ factor of Equation 2 is listed. The implementation of $E_{I N}$ is the challenging part, since this connectivity ($\Pi$) changes for each code rate. The realization of $E_{P N}$ is straightforward, thus we focus on the mapping of the IN and CN nodes.

Due to the varying node degrees the functional nodes process all incoming messages in a serial manner. Thus a functional node can accept one message per clock cycle and produces at most one updated message per clock cycle.

A careful analysis of Equation 2 shows that the connectivity of 360 edges of distinct information nodes is determined by just one value $x$, while $q$ is a code rate dependent constant, see Table 2.

These 360 edges can be processed simultaneously by $P=360$ functional units. Within a half iteration a functional unit has to process $q *(k-2)$ edges. ( $k-2$ ) is the number of edges between one check node and information nodes. For each code rate $q$ was chosen to satisfy the constraint

$$
\begin{equation*}
E_{I N} / 360=q *(k-2) \tag{6}
\end{equation*}
$$

It guarantees that each functional unit has to process the same number of nodes, which simplifies the node mapping. Figure 3 shows the mapping of the IN and CN nodes for the LDPC code of rate $R=1 / 2$. Always 360 consecutive VN nodes are mapped to 360 functional units. To each functional unit a RAM is associated to hold the corresponding messages (edges). Please note that for each IN of degree 8, 8 storage places are allocated to this VN, because each incident edge has to be stored.
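The constraint of Equation 6 can be checked numerically against the tabulated parameters. The sketch below does this for three rates, taking $q$, $E_{IN}$ and Addr from Table 2 and the check node degree $k$ from Table 1; the names are illustrative.

```python
P = 360  # number of functional units

# (rate, q, E_IN, Addr, k): q, E_IN and Addr from Table 2, k from Table 1.
ROWS = [
    ("1/4", 135, 97200, 270, 4),
    ("1/2", 90, 162000, 450, 7),
    ("3/5", 72, 233280, 648, 11),
]

def edges_per_unit(q, k):
    """Right-hand side of Equation 6: IN edges handled by one functional unit."""
    return q * (k - 2)

for rate, q, e_in, addr, k in ROWS:
    # Left-hand side, right-hand side and the Addr column should all coincide.
    print(rate, e_in // P, edges_per_unit(q, k), addr)
```

For these rates the number of edges per functional unit equals the Addr entry of Table 2, which is consistent with one stored value per group of 360 edges.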

The check node mapping depends on the rate dependent factor $q$. For $R=1 / 2$ the first $q=90$ CN nodes are mapped to the first functional unit. The next 90 CN nodes are mapped to the next functional unit and so on. Again, each CN is allocated as many storage locations as its degree.

This node mapping is the key for an efficient hardware realization, since it enables the use of a simple shuffling network to provide the connectivity of the Tanner graph. The shuffling network ensures that at each cycle 360 input messages are shuffled to 360 distinct target memories. Thus we have to store $E_{I N} / 360=450$ shuffling and addressing values for the $R=1 / 2$ code, see Table 2 for the other code rates. The shuffling offsets and addresses can be extracted from the $x$ tables provided by [1].
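One way to see why a cyclic shuffling network suffices is sketched below. It assumes the mapping described above (IN $m$ on functional unit $m \bmod 360$, CN $j$ on functional unit $\lfloor j / q\rfloor$), the address rule $j=(x+q \cdot(m \bmod 360)) \bmod N$, and $N=360 q$ as in Tables 1 and 2. Under these assumptions a single value $x$ yields a cyclic shift of $\lfloor x / q\rfloor \bmod 360$ and a local CN index $x \bmod q$ for all 360 edges. The function names are hypothetical and this is our reading of how offsets and addresses can be derived, not a description of the actual implementation.

```python
P = 360  # number of functional units

def connected_cn(x, m, q):
    """Check node reached from information node m through table value x (Equation 2)."""
    N = P * q
    return (x + q * (m % P)) % N

def shuffle_offset_and_address(x, q):
    """Cyclic shift and local CN index shared by all 360 edges generated by x."""
    return (x // q) % P, x % q

def check_group(x, q, first_in):
    """Verify the shift property for the 360 consecutive INs starting at first_in."""
    shift, local = shuffle_offset_and_address(x, q)
    for m in range(first_in, first_in + P):
        j = connected_cn(x, m, q)
        # CN j lives on unit j // q; IN m lives on unit m % P.
        if j // q != (m % P + shift) % P or j % q != local:
            return False
    return True

print(check_group(x=1234, q=90, first_in=720))  # True
```

The check holds for any non-negative $x$, because adding $q \cdot(m \bmod 360)$ to $x$ only advances the functional unit index and leaves the position within the unit unchanged.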

## 4 Decoder Architecture

Based on the message mapping described in the previous section, the basic architecture of the DVB-S2 LDPC decoder is shown in Figure 4. It consists of functional units which can process the functionality of variable and check nodes. This is possible, since only one type of node is processed during one half iteration. The IN message memory banks hold the messages which are exchanged between information and check nodes. Furthermore we have memories for storing the exchanged messages for the parity nodes (PN message memories), which are all of degree two. The address and shuffling RAM together with the shuffling network provide the connectivity of the Tanner graph.

As mentioned, the decoder processes 360 nodes in parallel, so 360 messages have to be provided per cycle. All 360 messages are read from the same address from the IN message memory bank. Thus, for the information node processing we just increment the reading address. The functional unit can accept new data each clock cycle, while a control flag labels the last message belonging to a node and starts the output processing. The newly produced 360 messages are then written back to the same address location but with a cyclic shift, caused by the shuffling network. To process the check nodes we have to read from dedicated addresses, provided by the address RAM. These addresses were extracted from the node mapping as described in the previous section. Again 360 messages are read per clock cycle and written back to the same address after the processing via the shuffling network. This ensures that the messages are shuffled back to their original position.

The processing of the parity nodes can be done concurrently during the check node processing, by using the update scheme described in Section 2.2. Each functional unit processes consecutive check nodes (Figure 3). The message which is passed during the forward update of the check nodes is kept in the functional unit. Only the messages of the backward update have to be stored, which reduces the memory requirements for the zigzag connectivity to $E_{P N} / 2$ messages. The PN message memories are only read and written during the check node phase, while the channel (CH) RAMs deliver the corresponding received channel values.

Figure 4. Basic architecture of our LDPC decoder

We use single port SRAMs due to area and power efficiency. Hence we have to take care of read/write conflicts during the iterative process. Read/write conflicts occur, since data are continuously read from the 360 RAMs and provided to the functional units, while newly processed messages have to be written back.

The check node processing is the most critical part. We have to read from dedicated addresses extracted during the mapping process. Therefore, the IN message memory block is partitioned into 4 RAMs, as shown in Figure 5. Even if the commutativity of the message processing within a check node is exploited, not all write conflicts can be avoided. Therefore a buffer is required to hold a message if writing is not possible due to a conflict. We use simulated annealing to find the best addressing scheme to reduce RAM access conflicts and hence to minimize the buffer overhead. This optimization step ensures that only one buffer is required, which holds for all code rates. Per clock cycle we read data from one RAM, and write at most 2 data back to two distinct RAMs, coming from the buffers or the shuffling network. The two least significant bits of the addresses determine the assignment to a partition. This allows a simple control flow, which just has to compare the reading and the writing addresses of the current clock cycle.
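The address comparison described above can be sketched as follows. The function names are hypothetical, and the sketch covers only the read/write comparison within one clock cycle, not the buffer management or the ordering produced by simulated annealing.

```python
NUM_PARTITIONS = 4  # the IN message memory block is split into 4 single-port RAMs

def partition(address):
    """RAM partition selected by the two least significant address bits."""
    return address & 0b11

def can_write_now(read_address, write_address):
    """A single-port RAM cannot serve a read and a write in the same cycle,
    so a write is possible only if it targets a different partition than the read."""
    return partition(read_address) != partition(write_address)

print(partition(13))           # 1
print(can_write_now(8, 13))    # True: partitions 0 and 1 differ
print(can_write_now(4, 8))     # False: both addresses fall into partition 0
```

When `can_write_now` returns `False`, the message would have to be held in the buffer until a later cycle, as described in the text.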

The resulting decoder throughput $T$ is

$$
\begin{equation*}
T=\frac{I}{\# \mathrm{cyc}} \cdot f_{\mathrm{cyc}} \tag{7}
\end{equation*}
$$

with $I$ the number of information bits to be decoded, \#cyc the number of cycles to decode one block including the input/output (I/O) processing, and $f_{\mathrm{cyc}}$ the clock frequency.

The number of cycles is calculated as $\frac{C}{P_{I O}}+I t \cdot\left(2 \cdot\left(\frac{E_{I N}}{P}+T_{\text {latency }}\right)\right)$, with $I t$ the number of iterations. Thus Equation 7 yields:

$$
\begin{equation*}
T=\frac{I}{\frac{C}{P_{I O}}+I t \cdot\left(2 \cdot\left(\frac{E_{I N}}{P}+T_{\text {latency }}\right)\right)} \cdot f_{\mathrm{cyc}} \tag{8}
\end{equation*}
$$

The part $\frac{C}{P_{I O}}$ is the number of cycles for input/output (I/O) processing. The decoder is capable of receiving 10 channel values per clock cycle. Reading a new codeword of size $C$ and writing the result of the previously processed block can be done in parallel, with $P_{I O}$ data read and written concurrently. The latency $T_{\text {latency }}$ for each iteration depends on the processing units and the shuffling network.
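Equation 8 can be evaluated directly, as in the following sketch for the rate $R=1 / 2$ code. The values $C=64800$, $P_{I O}=10$, $I t=30$, $E_{I N}=162000$, $P=360$ and the 270 MHz clock are taken from the text; the latency is not stated numerically, so `t_latency=0` is an assumption made here and the result is only an upper bound for this rate.

```python
def decoding_cycles(C, P_io, iterations, E_in, P, t_latency):
    """Denominator of Equation 8: I/O cycles plus two half iterations per iteration."""
    return C / P_io + iterations * (2 * (E_in / P + t_latency))

def throughput(I, cycles, f_cyc_hz):
    """Equation 7: decoded information bits per second."""
    return I / cycles * f_cyc_hz

# Rate 1/2: I = K = 32400 information bits; t_latency = 0 is an illustrative assumption.
cycles = decoding_cycles(C=64800, P_io=10, iterations=30,
                         E_in=162000, P=360, t_latency=0)
print(cycles)                                   # 33480.0
print(throughput(32400, cycles, 270e6) / 1e6)   # roughly 261.3 (Mbit/s)
```

Any nonzero latency per half iteration increases the cycle count and lowers this figure.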

## 5 Results

The LDPC decoder is implemented as a synthesizable VHDL model. Results are obtained with the Synopsys Design Compiler based on an ST Microelectronics $0.13 \mu \mathrm{~m}$ CMOS technology. The maximum clock frequency is 270 MHz under worst-case conditions. The decoder can process all specified code rates of the DVB standard with the required throughput of $255 \mathrm{Mbit} / \mathrm{s}$. 30 iterations are assumed.

Figure 5. Hierarchical RAM structure

Table 3 shows the synthesis results for a 6 bit message quantization of the channel values and the exchanged messages. The overall area is $22.74 \mathrm{~mm}^{2}$. The area is determined by different code rates. $R=1 / 4$ has the largest set of parity nodes and defines the size of the PN message memories, while the rate $R=3 / 5$ has the most edges to the information nodes and hence determines the size of the IN message memory banks. The size of a functional node depends on the maximum IN and PN degree, respectively ($R=2 / 3$ and $R=9 / 10$).

The area is split into three parts: RAMs, logic and the shuffling network. Storing the messages accounts for the major part of the RAM area with $9.12 \mathrm{~mm}^{2}$. It is important to note that only an area of $0.075 \mathrm{~mm}^{2}$ is required to store the connectivity of the Tanner graph. This shows the efficiency of our architectural approach. The logic area of the functional nodes, with $10.8 \mathrm{~mm}^{2}$, is a major part of the overall area. This is due to the flexibility required by the different code rates. We also placed and routed the shuffling network to test for routing congestion. Due to its regularity no congestion resulted; its area is dominated by the logic cells.

## 6 Conclusion

Low-Density Parity-Check codes are part of the new DVB-S2 standard. In this paper we presented to the best of our knowledge the first published IP core for DVB-S2 LDPC decoding. We explained how to explore the code structure for an efficient hardware mapping and presented a decoder architecture which can process all specified code rates ranging from $R=1 / 4$ to $R=9 / 10$.

| $0.13 \mu \mathrm{~m}$ technology |  | Area $\left[\mathrm{mm}^{2}\right]$ |
| :---: | :---: | :---: |
| RAMs | Channel LLRs | 1.997 |
|  | Messages | 9.117 |
|  | Address/Shuffling | 0.075 |
| Logic | Functional nodes | 10.8 |
|  | Control logic | 0.2 |
| Shuffling network |  | 0.55 |
| Total area |  | 22.739 |

Table 3. Synthesis Results for the DVB-S2 LDPC code decoder

## 7 Acknowledgments

The work presented in this paper was supported by the European IST project 4More 4G MC-CDMA multiple antenna system On chip for Radio Enhancements [10].

Our special thanks go to Friedbert Berens from the Advanced System Technology Group of STM, Geneva, Switzerland, for many valuable discussions.

## References

[1] European Telecommunications Standards Institute (ETSI). Digital Video Broadcasting (DVB) Second generation framing structure for broadband satellite applications; EN 302 307 V1.1.1. www.dvb.org.
[2] R. G. Gallager. Low-Density Parity-Check Codes. M.I.T. Press, Cambridge, Massachusetts, 1963.
[3] C. Berrou. The Ten-Year-Old Turbo Codes are Entering into Service. IEEE Communications Magazine, 41:110-116, Aug. 2003.
[4] A. Blanksby and C. J. Howland. A 690-mW 1-Gb/s, Rate-1/2 Low-Density Parity-Check Code Decoder. IEEE Journal of Solid-State Circuits, 37(3):404-412, Mar. 2002.
[5] E. Boutillon, J. Castura, and F. Kschischang. Decoder-first code design. In Proc. 2nd International Symposium on Turbo Codes \& Related Topics, pages 459-462, Brest, France, Sept. 2000.
[6] T. Richardson and R. Urbanke. The Renaissance of Gallager's Low-Density Parity-Check Codes. IEEE Communications Magazine, 41:126-131, Aug. 2003.
[7] F. Kienle and N. Wehn. Design Methodology for IRA Codes. In Proc. 2004 Asia South Pacific Design Automation Conference (ASP-DAC '04), Yokohama, Japan, Jan. 2004.
[8] H. Jin, A. Khandekar, and R. McEliece. Irregular Repeat-Accumulate Codes. In Proc. 2nd International Symposium on Turbo Codes \& Related Topics, pages 1-8, Brest, France, Sept. 2000.
[9] T. Zhang, Z. Wang, and K. Parhi. On finite precision implementation of low-density parity-check codes decoder. In Proc. International Symposium on Circuits and Systems (ISCAS '01), Antwerp, Belgium, May 2001.
[10] http://ist-4more.org.

# Factorizable modulo $M$ parallel architecture for DVB-S2 LDPC decoding 

Marco Gomes, Gabriel Falcão, Vitor Silva, Vitor Ferreira, Alexandre Sengo and Miguel Falcão*

Instituto de Telecomunicações, Pólo II da Universidade de Coimbra, 3030-290 Coimbra, Portugal

*Chipidea Microelectrónica S.A., Rua Frederico Ulrich, n. 2650, 4470-605 Moreira da Maia, Portugal

e-mail: marco@co.it.pt, gff@co.it.pt, vitor@co.it.pt, vitorhugo@co.it.pt, sengo@co.it.pt, mfalcao@chipidea.com


#### Abstract

State-of-the-art decoders for DVB-S2 low-density parity-check (LDPC) codes exploit semi-parallel architectures based on the periodicity factor $M=360$ of the special type of LDPC-IRA codes adopted. This paper addresses the generalization of a well-known hardware $M$-kernel parallel structure and proposes an efficient partitioning by any factor of $M$, without addressing overhead and keeping unchanged the efficient message memory mapping scheme. Our method provides a simple and efficient way to reduce the decoder complexity. Synthesizing the decoder for a Xilinx FPGA shows a throughput above the mandatory minimum of 90 Mbps.


## I. INTRODUCTION

The recent Digital Video Satellite Broadcast Standard (DVB-S2) [1] [2] has adopted a powerful FEC scheme based on the serial concatenation of BCH and Low Density Parity Check (LDPC) codes. This new FEC structure, combined with the adoption of high order modulations (QPSK, 8PSK, 16APSK and 32APSK), is able to provide capacity gains of about $30 \%$ over the previous DVB-S standard [2], with the LDPC codes playing a fundamental role in this performance gain.
LDPC codes are linear block codes defined by sparse parity-check matrices, $\mathbf{H}$ [3] [4] [5], and are usually represented by Tanner graphs [6]. A Tanner graph is a bipartite graph formed by two types of nodes: check nodes ($v^{\mathrm{C}}$), one per code constraint, and bit nodes, one per codeword bit (information and parity, respectively $v^{\mathrm{I}}$ and $v^{\mathrm{P}}$), with the connecting edges between them given by $\mathbf{H}$.
They are decoded using low complexity iterative belief propagation algorithms operating over the Tanner graph description [7]. However, a major drawback is their high encoding complexity, caused by the fact that the generator matrix, $\mathbf{G}$, is, in general, not sparse. In order to overcome this problem, the DVB-S2 standard has adopted a special class of LDPC codes with linear encoding complexity, known as Irregular Repeat-Accumulate (IRA) codes [8] [9].
An important issue in the design of LDPC encoder and decoder architectures for DVB-S2 is the fact that the standard supports two different frame lengths (16200 bits for low delay applications and 64800 bits otherwise) and a set of different code rates ($1 / 4$, $1 / 3$, $2 / 5$, $1 / 2$, $3 / 5$, $2 / 3$, $3 / 4$, $4 / 5$, $5 / 6$, $8 / 9$ and $9 / 10$) for both frame lengths and different modulation schemes [1] [9]. A different LDPC code is defined for each mode of operation and, although these codes share a similar structure and properties, this still poses an enormous challenge for the development of an encoder and a decoder fully compliant with all operating modes.
The state-of-the-art decoder is based on a flexible partial parallel architecture that exploits the $M=360$ periodicity of DVB-S2 LDPC codes [10]. Although capable of providing a throughput far above the minimum mandatory rate of 90 Mbps, this architecture requires a huge ASIC area of $22.74 \mathrm{~mm}^{2}$ in an ST Microelectronics $0.13 \mu \mathrm{~m}$ technology, mainly due to the high number (360) of computation kernels or functional units (FU) and the large width of the barrel shifter. In order to decrease the number of computation kernels to only 45 FUs and to reduce the width of the barrel shifter, an alternative solution was proposed [11] which uses a re-structured version of $\mathbf{H}$. As a consequence, this approach increases the complexity of the DVB-S2 deinterleaver and almost doubles the input memory relative to [10].
In this paper we generalize the architecture [10] and overcome its disadvantages. We will show that it is possible to reduce the number of computation kernels to any integer factor of $M=360$, without addressing overhead and keeping unchanged the efficient message memory mapping scheme [10]. Our strategy also reduces the width of the barrel shifter by the same factor and considerably simplifies the routing problem. The throughput is reduced by the same factor, but this does not represent a real problem since the architecture [10] is able to provide a throughput far above the mandatory minimum rate. Thus, we provide a simple and efficient method to reduce the decoder complexity without losing the throughput goals.
The next section briefly describes DVB-S2 LDPC-IRA codes. Section III addresses LDPC decoding for DVB-S2 using a partial parallel architecture and its generalization by sub-sampling by a factor of $M$. Synthesis results are presented in Section IV and final conclusions are drawn in Section V.

## II. DVB-S2 LDPC-IRA CODES

The new DVB-S2 [1] [9] standard adopted a special class of LDPC codes known by IRA codes [8] as the main solution for the FEC system. An IRA code is characterized by a parity check matrix, $\mathbf{H}$, of the form,

$$
\begin{equation*}
\mathbf{H}_{(n-k) \times n}=\left[\begin{array}{c|c}
\mathbf{A}_{(n-k) \times k} & \mathbf{B}_{(n-k) \times(n-k)}
\end{array}\right] \tag{1}
\end{equation*}
$$

where $\mathbf{B}$ is a staircase lower triangular matrix. Restricting $\mathbf{A}$ to be sparse yields an LDPC-IRA code [9].
The $\mathbf{H}$ matrices of the DVB-S2 LDPC codes have other properties beyond being of IRA type. Some periodicity constraints were put on the pseudo-random design of the $\mathbf{A}$ matrices, which allow a significant reduction of the storage requirement without code performance loss.
The construction of matrix $\mathbf{A}$ is based on dividing the $v^{\mathrm{I}}$ nodes into disjoint groups of $M$ consecutive ones. All the $v^{\mathrm{I}}$ nodes of a group $l$ should have the same weight, $w_{l}$, and it is only necessary to choose the $v^{\mathrm{C}}$ nodes that connect to the first $v^{\mathrm{I}}$ of the group in order to specify the $v^{\mathrm{C}}$ nodes that connect to each of the remaining $M-1$ nodes. The connection choice for the first element of group $l$ is pseudo-random, with the restrictions that the resulting LDPC code is cycle-4 free, the number of length-6 cycles is as small as possible, and all the $v^{\mathrm{C}}$ nodes connect to the same number of $v^{\mathrm{I}}$ nodes.
Denoting by $r_{1}, r_{2}, \ldots, r_{w_{l}}$ the indices of the $v^{\mathrm{C}}$ nodes that connect to the first $v^{\mathrm{I}}$ of group $l$, the indices of the $v^{\mathrm{C}}$ nodes that connect to $v_{i}^{\mathrm{I}}$, with $0 \leq i \leq M-1$, of group $l$ can be obtained by

$$
\begin{align*}
& \left\{\left(r_{1}+i \times q\right) \bmod (n-k),\left(r_{2}+i \times q\right) \bmod (n-k), \ldots,\right.  \tag{2}\\
& \left.\left(r_{w_{l}}+i \times q\right) \bmod (n-k)\right\}
\end{align*}
$$

with $q=(n-k) / M$ and $M=360$ (a common factor for all DVB-S2 supported codes).
Another property of matrix $\mathbf{A}$ is that, for each supported code, there is a set of groups of $v^{\mathrm{I}}$ nodes of constant weight $w>3$ ($w$ is code dependent), and the remaining groups all have weight 3.
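The periodic construction in (2) can be illustrated with a short sketch. The function name and the toy parameters ($n=20$, $k=8$, $M=4$) are hypothetical and chosen only to keep the numbers small; the DVB-S2 codes use $M=360$.

```python
def check_indices(first_node_checks, i, n, k, M=360):
    """Check-node indices connected to the i-th information node of a group,
    given the indices r_1..r_w of the group's first node (Equation 2)."""
    q, remainder = divmod(n - k, M)
    if remainder != 0:
        raise ValueError("n - k must be a multiple of M")
    if not 0 <= i < M:
        raise ValueError("i must satisfy 0 <= i <= M - 1")
    return [(r + i * q) % (n - k) for r in first_node_checks]


# Toy example (assumed values): n - k = 12 check nodes, M = 4, hence q = 3.
first = [0, 5, 10]
for i in range(4):
    print(i, check_indices(first, i, n=20, k=8, M=4))
```

Only the connections of the first node of each group need to be stored; the remaining $M-1$ nodes follow by adding multiples of $q$ modulo $n-k$.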

## III. DVB-S2 LDPC DECODING

The huge dimensions of the LDPC-IRA codes adopted by the DVB-S2 standard make a fully parallel architecture that maps the Tanner graph structure impractical [12]. Besides, such a solution is code dependent, which means that a different fully parallel decoder is required for each code defined in the standard.
The best known solutions are based on highly vectorized partial parallel architectures [10] [11] that exploit the particular characteristics of the DVB-S2 LDPC-IRA codes, namely the periodic nature ($M=360$) shared by all the codes. One solution was proposed in [10], whose architecture uses $M$ functional units working in parallel. In this paper we will show that it is possible to reduce the number of functional units by any integer factor of $M$, without addressing overhead and keeping unchanged its efficient memory mapping scheme. Our approach not only overcomes the disadvantages of the architecture [10], but also makes the architecture flexible and easily reconfigurable according to the decoder constraints.

## A. Modulo $M$ parallel architecture

As previously described, DVB-S2 adopted a special class of structured LDPC-IRA codes with the properties stated in (2). This makes possible the simultaneous processing of $v^{\mathrm{I}}$ and $v^{\mathrm{C}}$ node sets, whose indices are given by

$$
\begin{align*}
& \mathrm{C}^{(c)}=\{c, c+1, \cdots, c+M-1\}, \text { with, } c \bmod M=0, \text { and } \\
& \mathrm{R}^{(r)}=\{r, r+q, r+2 q, \cdots, r+(M-1) q\}, \text { with }, 0 \leq r \leq q-1, \tag{3}
\end{align*}
$$

respectively (the superscript is the index of the first element of the set, and $r$ and $c$ denote row and column of $\mathbf{H}$), which significantly simplifies the decoder control. In fact, according to (2), if $v_{\tilde{c}}^{\mathrm{I}}$ is connected to $v_{r}^{\mathrm{C}}$, then $v_{r+i \times q}^{\mathrm{C}}$, with $0 \leq i \leq M-1$, will be connected to $v_{c+((\tilde{c}-c+i) \bmod M)}^{\mathrm{I}}$, where $c=M \times(\tilde{c} \operatorname{div} M)$ is the index of the first $v^{\mathrm{I}}$ of the group $\mathrm{C}^{(c)}$ to which $v_{\tilde{c}}^{\mathrm{I}}$ belongs.
The architecture shown in Fig. 1 is based on $M$ functional units (FU) working in parallel with shared control signals [12], which process both $v^{\mathrm{C}}$ nodes (in check mode) and $v^{\mathrm{I}}$ nodes (in bit mode) in a flooding schedule manner [13] [14]. Given the zigzag connectivity between $v^{\mathrm{P}}$ and $v^{\mathrm{C}}$ nodes, they are updated jointly in check mode following a horizontal schedule approach [15]. A detailed description of the FU operation can be found in [12].


Figure 1. Modulo $M$ parallel architecture for DVB-S2 LDPC decoding.

## Memory mapping and shuffling mechanism

As mentioned before, a single FU is shared by a constant number of $v^{\mathrm{I}}$, $v^{\mathrm{C}}$ and $v^{\mathrm{P}}$ nodes (the last two are processed jointly), depending on the code length and rate. More precisely, for an ($n, k$) DVB-S2 LDPC-IRA code, $\mathrm{FU}_{i}$, with $0 \leq i \leq M-1$, sequentially updates in bit mode the $v_{\{i, i+M, i+2 \times M, \cdots, i+(\alpha-1) \times M\}}^{\mathrm{I}}$ nodes, with $\alpha=k / M$. In check mode, the same FU updates the $v_{\{j, j+1, \cdots, j+q-1\}}^{\mathrm{C}}$ and $v_{\{j, j+1, \cdots, j+q-1\}}^{\mathrm{P}}$ nodes, with $j=i \times q$. This guarantees that when the group $\mathrm{C}^{(c)}$ is processed simultaneously, the computed messages have as destination a set $\mathrm{R}^{(r)}$, each element of which is processed by a different FU. Considering (2), the newly computed messages only need to be right rotated to be handled by the correct $v^{\mathrm{C}}$ nodes. The same happens when processing each $\mathrm{R}^{(r)}$ set, where, according to (2), the right rotation must be reversed so that the newly computed messages reach the exact $v^{\mathrm{I}}$ nodes. The shuffling network (barrel shifter) is responsible for the correct message exchange between $v^{\mathrm{C}}$ and $v^{\mathrm{I}}$ nodes, emulating the code Tanner graph. The shift values stored in ROM (Fig. 1) can be easily obtained from the tables in annexes B and C of the DVB-S2 standard [1].
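The rotation property can be checked numerically with a minimal sketch. It assumes the FU assignment stated above (check node $x$ is handled by $\mathrm{FU}_{x \operatorname{div} q}$, and the $i$-th node of a group by $\mathrm{FU}_{i}$); the function name and the toy parameters are hypothetical. Under these assumptions, a message leaving $\mathrm{FU}_{i}$ arrives at $\mathrm{FU}_{(i+s) \bmod M}$ with $s = r \operatorname{div} q$, at the same local offset $r \bmod q$ in every FU. The derivation of $s$ is ours and is not quoted from the paper or the standard.

```python
def route_group_messages(r, q, M):
    """For one edge of a group's first node (check index r), list where each
    FU's message goes: (source FU, destination FU, local check offset)."""
    n_minus_k = q * M
    routes = []
    for i in range(M):                      # FU_i handles the i-th node of the group
        x = (r + i * q) % n_minus_k         # destination check node, Equation 2
        routes.append((i, x // q, x % q))   # FU owning x, and its offset inside that FU
    return routes


# Toy example (assumed values): M = 4 functional units, q = 3 checks per FU.
r, q, M = 5, 3, 4
shift = r // q
for src, dst, offset in route_group_messages(r, q, M):
    assert dst == (src + shift) % M        # a single cyclic rotation serves all FUs
    assert offset == r % q                 # same local offset in every FU
    print(f"FU{src} -> FU{dst}, offset {offset}")
```

Because one shift value and one local offset serve all $M$ messages of a group, a barrel shifter with shared control is sufficient to emulate the Tanner graph connectivity.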
The messages sent along the Tanner graph edges are stored in RAM (see Fig. 1). If we adopt sequential RAM access in bit mode, then the access in check mode must be indexed, or vice-versa. Both options are valid, so, without loss of generality, we assume sequential access in bit mode. Denoting by $\mathbf{r}_{i}=\left[r_{i 1}\; r_{i 2} \cdots r_{i w_{i}}\right]^{T}$ the vector of $v^{\mathrm{C}}$ node indices connected to the $v_{i}^{\mathrm{I}}$ node of weight $w_{i}$, the message memory mapping can be obtained using the following matrix,

$$
\mathbf{R}=\left[\begin{array}{cccc}
\mathbf{r}_{0} & \mathbf{r}_{1} & \cdots & \mathbf{r}_{M-1} \\
\mathbf{r}_{M} & \mathbf{r}_{M+1} & \cdots & \mathbf{r}_{2 M-1} \\
\vdots & \vdots & \vdots & \vdots \\
\mathbf{r}_{(\alpha-1) \times M} & \mathbf{r}_{(\alpha-1) \times M+1} & \cdots & \mathbf{r}_{\alpha \times M-1}
\end{array}\right]_{\left(q \times w_{c}\right) \times M} \tag{4}
$$

where $w_{c}$ is a code constant (the weight of each $v^{\mathrm{C}}$ node is $w_{c}+2$, except for the first one; see (1)).
In order to process each $\mathrm{R}^{(r)}$ set in check mode, the required memory addresses can be obtained by finding the matrix $\mathbf{R}$ rows where the index $r$ appears.
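The mapping matrix (4) and the row lookup used in check mode can be sketched as follows. The toy code is hypothetical ($M=4$, $q=3$, two groups of information nodes, all of weight 3, so that $w_{c}=2$), and the helper names are illustrative only.

```python
def build_mapping_matrix(group_first_checks, q, M):
    """Build R of Equation 4 as a list of rows. Each row holds, for one edge
    position of one group, the check index reached from each of the M nodes."""
    n_minus_k = q * M
    rows = []
    for first_checks in group_first_checks:      # one block of rows per group
        for r_t in first_checks:                 # one row per edge of the first node
            rows.append([(r_t + i * q) % n_minus_k for i in range(M)])
    return rows


def addresses_for_check_set(R, r):
    """Memory addresses (row numbers of R) needed to process the set R^(r)."""
    return [address for address, row in enumerate(R) if r in row]


# Toy code (assumed values): every check node has w_c = 2 information edges.
R = build_mapping_matrix([[0, 5, 10], [1, 6, 11]], q=3, M=4)
for r in range(3):
    print(r, addresses_for_check_set(R, r))
```

In this toy example $\mathbf{R}$ has $q \times w_{c}=6$ rows and $M=4$ columns, and each set $\mathrm{R}^{(r)}$ needs exactly $w_{c}=2$ addresses, which are read in indexed order during check mode.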

## B. Sub-sampling by a factor of $M$

The simplicity of the shuffling mechanism and the efficient memory mapping scheme constitute the major strengths of the architecture just described [10]. However, the high number of FUs and the large width of the barrel shifter require a huge silicon area. Since this architecture is able to provide a throughput far above the minimum mandatory rate of 90 Mbps, we may reduce the number of FUs. In fact, we will show that this can be done by any factor of $M$.
Let $L, N \in \mathbb{N}$ be factors of $M$, with $M=L \times N$, and consider a $\mathrm{C}^{(c)}$ set (3). This group can be decomposed by down-sampling into $L$ subgroups as:

$$
\begin{align*}
\mathrm{C}_{0}^{(c)} & =\{c, c+L, c+2 L, \cdots, c+(N-1) \times L\} \\
\mathrm{C}_{1}^{(c)} & =\{c+1, c+1+L, c+1+2 L, \cdots, c+1+(N-1) \times L\} \\
& \vdots  \tag{5}\\
\mathrm{C}_{L-1}^{(c)} & =\{c+L-1, c+2 L-1, c+3 L-1, \cdots, c+N \times L-1\}
\end{align*}
$$

Each subgroup, $\mathrm{C}_{\gamma}^{(c)}$, with $0 \leq \gamma \leq L-1$, can be described in terms of the first node of the subgroup, $v_{c+\gamma}^{\mathrm{I}}$, using (2). If $v_{r}^{\mathrm{C}}$ is connected to the first information node of the subgroup $\mathrm{C}_{\gamma}^{(c)}$, then $v_{(r+i \times L \times q) \bmod (n-k)}^{\mathrm{C}}$ is connected to the $i$-th $v^{\mathrm{I}}$ node, with $0 \leq i \leq N-1$, of that subgroup.
Equally, the same down-sampling process by $L$ can be done on each $\mathrm{R}^{(r)}$ group as:

$$
\begin{align*}
\mathrm{R}_{0}^{(r)} & =\{r, r+L \times q, r+2 L \times q, \cdots, r+(N-1) \times L \times q\} \\
\mathrm{R}_{1}^{(r)} & =\{r+q, r+(L+1) \times q, r+(2 L+1) \times q, \cdots, r+((N-1) \times L+1) \times q\} \\
& \vdots  \tag{6}\\
\mathrm{R}_{L-1}^{(r)} & =\{r+(L-1) \times q, r+(2 L-1) \times q, r+(3 L-1) \times q, \cdots, r+(N \times L-1) \times q\}
\end{align*}
$$

and, in a similar way, each subgroup, $\mathrm{R}_{\beta}^{(r)}$, with $0 \leq \beta \leq L-1$, can be described just in terms of its first element, $v_{r+\beta \times q}^{\mathrm{C}}$. If $v_{\tilde{c}}^{\mathrm{I}}$ is connected to the first node of subset $\mathrm{R}_{\beta}^{(r)}$, then $v_{c+((\tilde{c}-c+i \times L) \bmod M)}^{\mathrm{I}}$, with $c=M \times(\tilde{c} \operatorname{div} M)$, is connected to the $i$-th $v^{\mathrm{C}}$, with $0 \leq i \leq N-1$, of the considered subgroup.
From the framework just described in (5) and (6), we conclude that the down-sampling approach preserves the key modulo $M$ properties. Thus, we can process each $\mathrm{C}_{\gamma}^{(c)}$ and $\mathrm{R}_{\beta}^{(r)}$ subgroup individually, and the same architecture [10] can be used with only $N$ processing units, as shown in Fig. 2. In fact, when a group $\mathrm{C}_{\gamma}^{(c)}$ is processed simultaneously, the computed messages have as destination a set $\mathrm{R}_{\beta}^{(r)}$, and vice-versa.
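The claim that each down-sampled subgroup of information nodes targets a single down-sampled subgroup of check nodes can be verified on a toy example. The sketch below uses hypothetical parameters ($L=3$, $N=4$, $q=3$) and illustrative function names; `r_first` denotes a check index connected to the first node of the full group $\mathrm{C}^{(c)}$.

```python
def subgroup_destinations(r_first, gamma, q, L, N):
    """Check nodes reached from the N nodes of subgroup C_gamma, for the edge
    whose endpoint at the first node of the full group is r_first."""
    n_minus_k = q * L * N
    r_sub = (r_first + gamma * q) % n_minus_k        # edge of node c + gamma, Eq. 2
    return [(r_sub + i * L * q) % n_minus_k for i in range(N)]


def classify(x, q, L):
    """Write x = r + (beta + position * L) * q and return (r, beta, position),
    i.e. the set R_beta^(r) of Equation 6 and the position of x inside it."""
    a, r = divmod(x, q)
    position, beta = divmod(a, L)
    return r, beta, position


# Toy example (assumed values): M = L * N = 12, q = 3, so n - k = 36.
q, L, N = 3, 3, 4
for gamma in range(L):
    tags = [classify(x, q, L) for x in subgroup_destinations(17, gamma, q, L, N)]
    targets = {(r, beta) for r, beta, _ in tags}
    positions = [p for _, _, p in tags]
    assert len(targets) == 1                 # one destination subgroup only
    assert sorted(positions) == list(range(N))
    print(gamma, targets, positions)
```

In every case the positions form a cyclic rotation of $0, \ldots, N-1$, so an $N$-wide barrel shifter is enough to deliver the messages of one subgroup.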

## Memory mapping and shuffling mechanism

The down-sampling strategy allows a linear reduction (by a factor of $L$) of the hardware resources occupied by the FU blocks, significantly reduces the complexity of the barrel shifter ($O\left(N \log _{2} N\right)$) and simplifies the routing problem. Yet, at first glance, it may seem that this strategy implies an increase by a factor of $L$ in the size of the system ROM (shifts and addresses). Fortunately, if we know the properties of the subgroups $\mathrm{C}_{0}^{(c)}$ and $\mathrm{R}_{0}^{(r)}$, we automatically know the properties of the remaining subgroups, $\mathrm{C}_{\gamma}^{(c)}$ and $\mathrm{R}_{\beta}^{(r)}$ respectively, with $0 \leq \gamma, \beta \leq L-1$. By a proper message memory mapping based on a convenient reshape by $L$ of the matrix $\mathbf{R}$ (4), we can keep the size of the system ROM unchanged and compute on the fly the new shift and address values as functions of the ones stored in the ROM of Fig. 2, i.e., for all $\mathrm{C}_{0}^{(c)}$ and $\mathrm{R}_{0}^{(r)}$ groups.
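As a rough worked example of the $O\left(N \log _{2} N\right)$ barrel shifter complexity, the sketch below compares the growth term for the full architecture ($N=360$) and a down-sampled one ($N=45$). The cost function is only the asymptotic term with the constant factor dropped, so the ratio is indicative and not a synthesis estimate.

```python
import math


def barrel_shifter_cost(width):
    """Asymptotic growth term N * log2(N) for an N-wide barrel shifter."""
    return width * math.log2(width)


full = barrel_shifter_cost(360)
reduced = barrel_shifter_cost(45)
ratio = full / reduced
print(f"cost ratio 360 vs 45 units: {ratio:.2f}")
```

The ratio exceeds the down-sampling factor $L=8$ because the logarithmic term shrinks together with the width.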


Figure 2. Factorizable modulo $M$ parallel architecture for DVB-S2 LDPC decoding.

For the configuration shown in Fig. 2, each $\mathrm{FU}_{i}$, with $0 \leq i \leq N-1$, is now responsible for processing $L \times \alpha$ information nodes in the following order

$$
\begin{align*}
& \{i, i+M, i+2 M, \cdots, i+(\alpha-1) M ; \\
& i+1, i+1+M, \cdots, i+1+(\alpha-1) M ; \cdots ;  \tag{7}\\
& i+L-1, i+L-1+M, \cdots, i+L-1+(\alpha-1) M\}
\end{align*}
$$

and $L \times q$ check and parity nodes, $\{j, j+1, \cdots, j+L \times q-1\}$, with $j=i \times L \times q$.

## IV. SYNTHESIS RESULTS

The architecture of Fig. 2 was synthesized on Virtex-II Pro FPGAs (XC2VP) from Xilinx. For the XC2VPxx family it is necessary to use a factor $L=8$ (45 FUs) due to internal memory limitations. In fact, synthesis results show that at least the XC2VP50 FPGA is required in order to guarantee the minimum memory resources needed to implement all code rates and lengths. However, this particular choice uses less than $50 \%$ of the available FPGA slices. Using external memory, it would be possible to choose the lower cost XC2VP30 FPGA.
The XC2VP100 FPGA allows the implementation of the architecture of Fig. 2 with 90 FUs, which doubles the throughput.
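The available design points can be enumerated with a short sketch. It lists every factorization $M=L \times N$ and the throughput relative to the full $M$-unit architecture, under the stated assumption that throughput scales with the number of functional units $N$; the function name is illustrative.

```python
def design_points(M=360):
    """All (L, N) pairs with M = L * N, where N is the number of functional units."""
    return [(L, M // L) for L in range(1, M + 1) if M % L == 0]


points = design_points()
for L, N in points:
    if N in (45, 90, 360):
        print(f"L={L:3d}  N={N:3d}  relative throughput={N / 360:.3f}")
```

Moving from $L=8$ (45 FUs) to $L=4$ (90 FUs) doubles the relative throughput, consistent with the XC2VP100 result above.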

## V. CONCLUSIONS

This paper addresses the generalization of a state-of-the-art $M$-kernel parallel structure for LDPC-IRA DVB-S2 decoding to any integer factor of $M=360$ by means of sub-sampling, keeping unchanged the efficient message memory mapping structure without addressing overhead. This architecture proves to be flexible and easily reconfigurable according to the decoder constraints and represents a trade-off between silicon area and decoder throughput.
Synthesis results show that the implementation of a complete LDPC-IRA DVB-S2 decoder is possible with 45 functional units for Xilinx XC2VP FPGAs family.

## REFERENCES

[1] ETSI, Digital video broadcasting (DVB); Second generation framing structure, channel coding and modulation systems for broadcasting, interactive services, news gathering and other broad-band satellite applications: EN 302 307 V1.1.1, 2005.
[2] A. Morello and V. Mignone, "DVB-S2: The second generation standard for satellite broad-band services," Proceedings of the IEEE, vol. 94, pp. 210-227, 2006.
[3] R. G. Gallager, "Low-Density Parity-Check Codes," IRE Transactions on Information Theory, vol. 8, pp. 21-28, 1962.
[4] D. J. C. MacKay, "Good error-correcting codes based on very sparse matrices," IEEE Transactions on Information Theory, vol. 45, pp. 399-431, 1999.
[5] S. Y. Chung, G. D. Forney, T. J. Richardson, and R. Urbanke, "On the design of low-density parity-check codes within 0.0045 dB of the Shannon limit," IEEE Communications Letters, vol. 5, pp. 58-60, 2001.
[6] R. M. Tanner, "A Recursive Approach to Low Complexity Codes," IEEE Transactions on Information Theory, vol. 27, pp. 533-547, 1981.
[7] J. H. Chen and M. P. C. Fossorier, "Near optimum universal belief propagation based decoding of low-density parity check codes," IEEE Transactions on Communications, vol. 50, pp. 406-414, 2002.
[8] H. Jin, A. Khandekar, and R. McEliece, "Irregular repeat-accumulate codes," in Proc. 2nd International Symposium on Turbo Codes \& Related Topics, Brest, France, Sept. 2000.
[9] M. Eroz, F. W. Sun, and L. N. Lee, "DVB-S2 low density parity check codes with near Shannon limit performance," International Journal of Satellite Communications and Networking, vol. 22, pp. 269-279, 2004.
[10] F. Kienle, T. Brack, and N. Wehn, "A Synthesizable IP Core for DVB-S2 LDPC Code Decoding," in Proc. Design, Automation and Test in Europe (DATE'05), Munich, Germany, Mar. 2005.
[11] J. Dielissen, A. Hekstra, and V. Berg, "Low cost LDPC decoder for DVB-S2," in Proc. Design, Automation and Test in Europe: Designers' Forum (DATE'06), Munich, Germany, Mar. 2006.
[12] M. Gomes, G. Falcão, J. Gonçalves, V. Silva, M. Falcão, and P. Faia, "HDL Library of Processing Units for Generic and DVB-S2 LDPC Decoding," in Proc. International Conference on Signal Processing and Multimédia Applications (SIGMAP2006), Setúbal, Portugal, Aug. 2006.
[13] J. T. Zhang and M. P. C. Fossorier, "Shuffled iterative decoding," IEEE Transactions on Communications, vol. 53, pp. 209-213, 2005.
[14] H. Xiao and A. H. Banihashemi, "Graph-based message-passing schedules for decoding LDPC codes," IEEE Transactions on Communications, vol. 52, pp. 2098-2105, 2004.
[15] E. Sharon, S. Litsyn, and J. Goldberger, "An efficient message-passing schedule for LDPC decoding," in Proc. 23rd IEEE Convention of Electrical and Electronics Engineers in Israel, pp. 223-226, 2004.

# Design of LDPC Codes: A Survey and New Results 

Gianluigi Liva, Shumei Song, Lan Lan, Yifei Zhang, Shu Lin, and William E. Ryan


#### Abstract

This survey paper provides fundamentals in the design of LDPC codes. To provide a target for the code designer, we first summarize the EXIT chart technique for determining (near-)optimal degree distributions for LDPC code ensembles. We also demonstrate the simplicity of representing codes by protographs and how this naturally leads to quasi-cyclic LDPC codes. The EXIT chart technique is then extended to the special case of protograph-based LDPC codes. Next, we present several design approaches for LDPC codes which incorporate one or more accumulators, including quasi-cyclic accumulator-based codes. The second half of the paper then surveys several algebraic LDPC code design techniques. First, codes based on finite geometries are discussed and then codes whose designs are based on Reed-Solomon codes are covered. The algebraic designs lead to cyclic, quasi-cyclic, and structured codes. The masking technique for converting regular quasi-cyclic LDPC codes to irregular codes is also presented. Some of these results and codes have not been presented elsewhere. The paper focuses on the binary-input AWGN channel (BI-AWGNC). However, as discussed in the paper, good BI-AWGNC codes tend to be universally good across many channels. Alternatively, the reader may treat this paper as a starting point for extensions to more advanced channels. The paper concludes with a brief discussion of open problems.


## I. Introduction

The class of low-density parity-check (LDPC) codes represents the leading edge in modern channel coding. They have held the attention of coding theorists and practitioners in the past decade because of their near-capacity performance on a large variety of data transmission and storage channels and because their decoders can be implemented with manageable complexity. They were invented by Gallager in his 1960 doctoral dissertation [1] and were scarcely considered in the 35 years that followed. One notable exception is Tanner, who wrote an important paper in 1981 [2] which generalized LDPC codes and introduced a graphical representation of LDPC codes, now called Tanner graphs. Apparently independent of Gallager's work, LDPC codes were re-invented in the mid-1990's by MacKay, Luby, and others [3][4][5][6] who noticed the advantages of linear block codes which possess sparse (low-density) parity-check matrices.

This paper surveys the state of the art in LDPC code design for binary-input channels while including a few new results as well. While it is tutorial in some aspects, it is not entirely a tutorial paper, and the reader is expected to be fairly versed on the topic of LDPC codes. Tutorial coverage of LDPC codes can be found in [7][8]. The purpose of this paper is to give the reader a detailed overview of various LDPC code design approaches and also to point the reader to the literature. While our emphasis is on code design for the binary-input AWGN channel (BI-AWGNC), the results in [9][10][11][12] demonstrate that an LDPC code that is good on the BI-AWGNC tends to be universally good and can be expected to be good on most wireless, optical, and storage channels.

We favor code designs which are most appropriate for applications, by which we mean codes which have low-complexity encoding, good waterfall regions, and low error floors. Thus, we discuss quasi-cyclic (QC) codes because their encoders may be implemented by shift-register circuits [13]. We also discuss accumulator-based codes because low-complexity encoding is possible from their parity-check matrices, whether they are quasi-cyclic or not. The code classes discussed tend to be the ones (or related to the ones) used in applications or adopted for standards. Due to time and space limitations, we cannot provide a complete survey. The present survey is biased toward the expertise and interests of the authors.

Before a code can be designed, the code designer needs to know the design target. For this reason, Section II first briefly reviews the belief propagation decoder for LDPC codes and then presents the so-called extrinsic information transfer (EXIT) chart technique for this decoder. The EXIT chart technique allows one to obtain near-optimal parameters for LDPC code ensembles which guide the code designer. The EXIT technique is extended in Section III to the case of codes based on protographs. Section IV considers LDPC codes based on accumulators. The code types treated in that section are: repeat-accumulate, irregular repeat-accumulate, irregular repeat-accumulate-accumulate, generalized irregular repeat-accumulate, and accumulate-repeat-accumulate. That section also gives examples of quasi-cyclic code design using protograph (or base matrix) representations. Section V surveys the literature on cyclic and quasi-cyclic LDPC code design based on finite geometries. Section VI presents several LDPC code design techniques based on Reed-Solomon codes. Section VII presents the masking technique for converting regular QC codes to irregular QC codes to conform to prescribed code parameters. Section VIII contains some concluding remarks and some open problems.

## II. Design via EXIT Charts

We start with an $m \times n$ low-density parity-check matrix $\mathbf{H}$, which corresponds to a code with design rate $(n-m) / n$, which could be less than the actual rate, $R=k / n$, where $k$ is the number of information bits per codeword. $\mathbf{H}$ gives rise to a Tanner graph which has $m$ check nodes, one for each row of $\mathbf{H}$, and $n$ variable nodes, one for each column of $\mathbf{H}$. Considering the general case in which $\mathbf{H}$ has non-uniform row and column weight, the Tanner graph can be characterized by degree assignments $\left\{d_{v}(i)\right\}_{i=1}^{n}$ and $\left\{d_{c}(j)\right\}_{j=1}^{m}$, where $d_{v}(i)$ is the degree of the $i$-th variable node and $d_{c}(j)$ is the degree of the $j$-th check node. Such a graph, depicted in Fig. 1, is representative of the iterative decoder, with each node representing a soft-in/soft-out processor (or node decoder).

Fig. 1. Tanner graph representation of LDPC codes.

We shall assume the BI-AWGNC in our description of the LDPC iterative decoder. In this model, a received channel sample $y$ is given by $y=x+w$, where $x=(-1)^{c} \in\{ \pm 1\}$ is the bipolar representation of the transmitted code bit $c \in\{0,1\}$ and $w$ is a white Gaussian noise sample distributed as $\eta\left(0, \sigma_{w}^{2}\right)$, where $\sigma_{w}^{2}=N_{0} / 2$, following convention. The channel bit log-likelihood ratios (LLRs) are computed as

$$
\begin{equation*}
L_{c h}=\log \left(\frac{p(x=+1 \mid y)}{p(x=-1 \mid y)}\right)=\frac{2 y}{\sigma_{w}^{2}} . \tag{1}
\end{equation*}
$$
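As a minimal sketch of the channel model and of (1), the following Python fragment maps code bits to bipolar symbols, adds Gaussian noise, and forms the channel LLRs. The function names are illustrative only, and the noise variance is derived from $E_{b} / N_{0}$ under the assumption of unit-energy symbols, consistent with the relation $\sigma_{c h}^{2}=4 / \sigma_{w}^{2}=8 R\left(E_{b} / N_{0}\right)$ used in this section.

```python
import math
import random

def noise_variance(ebn0_db: float, rate: float) -> float:
    """sigma_w^2 = N0/2 = 1 / (2 R Eb/N0), assuming unit-energy bipolar symbols."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return 1.0 / (2.0 * rate * ebn0)

def transmit(bits, sigma_w2: float, rng: random.Random):
    """Map c -> x = (-1)^c and add zero-mean Gaussian noise of variance sigma_w2."""
    sigma_w = math.sqrt(sigma_w2)
    return [(1.0 if c == 0 else -1.0) + rng.gauss(0.0, sigma_w) for c in bits]

def channel_llr(y: float, sigma_w2: float) -> float:
    """Channel LLR of eq. (1): L_ch = 2 y / sigma_w^2."""
    return 2.0 * y / sigma_w2

# Example use: all-zeros codeword of length 8 at Eb/N0 = 1.1 dB, rate 1/2.
rng = random.Random(0)
sigma_w2 = noise_variance(1.1, 0.5)
llrs = [channel_llr(y, sigma_w2) for y in transmit([0] * 8, sigma_w2, rng)]
```

A positive LLR favors $x=+1$ (code bit 0), so a hard decision is simply the sign of the LLR.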

In one iteration of the conventional, flooding-schedule iterative decoder, the variable node decoders (VNDs) first process their input LLRs and send the computed outputs (messages) to each of their neighboring check node decoders (CNDs); then the CNDs process their input LLRs and send the computed outputs (messages) to each of their neighboring VNDs. More specifically, the message from the $i$-th VND to the $j$-th CND is

$$
\begin{equation*}
L_{i \rightarrow j}=L_{c h, i}+\sum_{j^{\prime} \neq j} L_{j^{\prime} \rightarrow i} \tag{2}
\end{equation*}
$$

where $L_{j^{\prime} \rightarrow i}$ is the incoming message from CND $j^{\prime}$ to VND $i$ and where the summation is over the $d_{v}(i)-1$ check node neighbors of variable node $i$, excluding check node $j$. The message from CND $j$ to VND $i$ is given by

$$
\begin{equation*}
L_{j \rightarrow i}=2 \tanh ^{-1}\left(\prod_{i^{\prime} \neq i} \tanh \left(\frac{L_{i^{\prime} \rightarrow j}}{2}\right)\right) \tag{3}
\end{equation*}
$$

where $L_{i^{\prime} \rightarrow j}$ is the incoming message from VND $i^{\prime}$ to CND $j$ and where the product is over the $d_{c}(j)-1$ variable node neighbors of check node $j$, excluding variable node $i$. This decoding algorithm is called the sum-product algorithm (SPA).
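The two node updates can be sketched in a few lines of Python. This is an illustration rather than a full decoder: the function names are hypothetical, each function handles a single node, and the check-node rule is written in the standard $\tanh (L / 2)$ form of the sum-product algorithm.

```python
import math

def vn_update(l_ch: float, incoming: list) -> list:
    """Eq. (2): the message on each edge is the channel LLR plus all
    incoming CN messages except the one that arrived on that edge."""
    total = l_ch + sum(incoming)
    return [total - m for m in incoming]

def cn_update(incoming: list) -> list:
    """Eq. (3), tanh rule: the message on each edge combines all incoming
    VN messages except the one that arrived on that edge."""
    out = []
    for i in range(len(incoming)):
        prod = 1.0
        for k, m in enumerate(incoming):
            if k != i:
                prod *= math.tanh(m / 2.0)
        # Clip so that atanh stays finite when the inputs are very reliable.
        prod = max(min(prod, 1.0 - 1e-12), -(1.0 - 1e-12))
        out.append(2.0 * math.atanh(prod))
    return out
```

A flooding-schedule iteration would call `vn_update` for every variable node and then `cn_update` for every check node, routing the outputs along the edges of the Tanner graph.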

We now discuss the EXIT chart technique [14][15][11] for this decoder and channel model. The idea is that the VNDs and the CNDs work cooperatively and iteratively to make bit decisions, with the metric of interest generally improving with each half-iteration. A transfer curve which plots the input metric versus the output metric can be obtained for both the VNDs and the CNDs, where the transfer curve for the VNDs depends on the channel SNR. Further, since the output metric for one processor is the input metric for its companion processor, one can plot both transfer curves on the same axes, but with the abscissa and ordinate reversed for one processor. Such a chart aids in the prediction of the decoding threshold of the ensemble of codes characterized by given VN and CN degree distributions: the decoding threshold is the SNR at which the two transfer curves just touch, precluding convergence of the two processors. EXIT chart computations are thus integral to the optimization of Tanner graph node degree distributions for LDPC codes and are the main computation in the optimization process. We emphasize that decoding threshold prediction techniques such as EXIT charts or density evolution [16] assume a graph with no cycles, an infinite codeword length, and an infinite number of decoding iterations.

An EXIT chart example is depicted in Fig. 2 for the ensemble of regular LDPC codes on the BI-AWGNC with $d_{v}(i)=d_{v}=3$ for $i=1, \ldots, n$, and $d_{c}(j)=d_{c}=6$ for $j=1, \ldots, m$. In the figure, the metric used for the transfer curves is extrinsic mutual information, giving rise to the name extrinsic information transfer (EXIT) chart. (The notation used in the figure is explained below.) Also shown in the figure is the decoding trajectory corresponding to these EXIT curves. As the SNR increases, the top curve shifts upwards, increasing the "tunnel" between the two curves and thus the decoder convergence rate. The SNR for this figure is just above the decoding threshold for codes with $\left(d_{v}, d_{c}\right)=(3,6)$, $\left(E_{b} / N_{0}\right)_{\text {thres }}=1.1 \mathrm{~dB}$. Other metrics, such as SNR and mean [17][18] and error probability [19], are possible, but mutual information generally gives the most accurate prediction of the decoding threshold [14][20] and is a universally good metric across many channels [9][10][11][12].

To facilitate EXIT chart computations, the following Gaussian assumption is made. First, we note that the LLR $L_{c h}$ in (1) corresponding to the BI-AWGNC is Gaussian with mean $\mu_{c h}=2 x / \sigma_{w}^{2}$ and variance $\sigma_{c h}^{2}=4 / \sigma_{w}^{2}$. From this and the usual assumption that the all-zeros codeword was transmitted (thus, $x_{i}=+1$ for $i=1, \ldots, n$), $\sigma_{c h}^{2}=2 \mu_{c h}$. This is equivalent to the symmetry condition of [16], which states that the conditional pdf of an LLR value $L$ must satisfy $p_{L}(l \mid x)=p_{L}(-l \mid x) e^{x l}$. Now, it has been observed that under normal operating conditions and after a few iterations, the LLRs $L_{i \rightarrow j}$ and $L_{j \rightarrow i}$ are approximately Gaussian and, further, if they are assumed to be symmetric-Gaussian, as is the case for $L_{c h}$, the decoding threshold predictions are very accurate (e.g., when compared to the more accurate, but more computationally intensive density evolution results [16]). Moreover, the symmetric-Gaussian assumption vastly simplifies EXIT chart analyses.

Fig. 2. EXIT chart example for $\left(d_{v}, d_{c}\right)=(3,6)$ regular LDPC code.

We now consider the computation of EXIT transfer curves for both VNDs and the CNDs, first for regular LDPC codes and then for irregular codes. Following [14][15], excluding the inputs from the channel, we consider VND and CND inputs to be a priori information, designated by '$A$', and their outputs to be extrinsic information, designated by '$E$'. Thus, an extrinsic information transfer curve for the VNDs plots the extrinsic information $I_{E}$ as a function of its input a priori information, $I_{A}$, and similarly for the CNDs.

The VND EXIT curve, $I_{E, V}$ versus $I_{A, V}$, under the symmetric-Gaussian assumption for VND inputs, $L_{c h, i}$ and $\left\{L_{j^{\prime} \rightarrow i}\right\}$, and outputs, $L_{i \rightarrow j}$, can be obtained as follows. From (2) and an independent-message assumption, $L_{i \rightarrow j}$ is Gaussian with variance $\sigma^{2}=\sigma_{c h}^{2}+\left(d_{v}-1\right) \sigma_{A}^{2}$ (hence, mean $\sigma^{2} / 2$ ). The mutual information between the random variable $X$ (corresponding to the realization $x_{i}$ ) and the extrinsic LLR $L_{i \rightarrow j}$ is therefore (for simplicity, we write $L$ for $L_{i \rightarrow j}, x$ for $x_{i}$, and $p_{L}(l \mid \pm)$ for $\left.p_{L}(l \mid x= \pm 1)\right)$

$$
\begin{aligned}
I_{E, V}= & H(X)-H(X \mid L) \\
= & 1-E\left[\log _{2}\left(1 / p_{X \mid L}(x \mid l)\right)\right] \\
= & 1-\sum_{x= \pm 1} \frac{1}{2} \int_{-\infty}^{\infty} p_{L}(l \mid x) \\
& \cdot \log _{2}\left(\frac{p_{L}(l \mid+)+p_{L}(l \mid-)}{p_{L}(l \mid x)}\right) d l \\
= & 1-\int_{-\infty}^{\infty} p_{L}(l \mid+) \log _{2}\left(1+\frac{p_{L}(l \mid-)}{p_{L}(l \mid+)}\right) d l \\
= & 1-\int_{-\infty}^{\infty} p_{L}(l \mid+) \log _{2}\left(1+e^{-l}\right) d l
\end{aligned}
$$

where the last line follows from the symmetry condition and because $p_{L}(l \mid x=-1)=p_{L}(-l \mid x=+1)$ for Gaussian densities.
Since $L_{i \rightarrow j} \sim \eta\left(\sigma^{2} / 2, \sigma^{2}\right)$ (when conditioned on $x_{i}=+1$), we have

$$
\begin{equation*}
I_{E, V}=1-\int_{-\infty}^{\infty} \frac{1}{\sqrt{2 \pi} \sigma} e^{-\left(l-\sigma^{2} / 2\right)^{2} / 2 \sigma^{2}} \log _{2}\left(1+e^{-l}\right) d l \tag{4}
\end{equation*}
$$

For convenience we write this as

$$
\begin{equation*}
I_{E, V}=J(\sigma)=J\left(\sqrt{\left(d_{v}-1\right) \sigma_{A}^{2}+\sigma_{c h}^{2}}\right) \tag{5}
\end{equation*}
$$

following [15]. To plot $I_{E, V}$ versus $I_{A, V}$, where $I_{A, V}$ is the mutual information between the VND inputs $L_{j \rightarrow i}$ and the channel bits $x_{i}$, we apply the symmetric-Gaussian assumption to these inputs so that

$$
\begin{equation*}
I_{A, V}=J\left(\sigma_{A}\right) \tag{6}
\end{equation*}
$$

and

$$
\begin{equation*}
I_{E, V}=J(\sigma)=J\left(\sqrt{\left(d_{v}-1\right)\left[J^{-1}\left(I_{A, V}\right)\right]^{2}+\sigma_{c h}^{2}}\right) . \tag{7}
\end{equation*}
$$

The inverse function $J^{-1}(\cdot)$ exists since $J\left(\sigma_{A}\right)$ is monotonic in $\sigma_{A}$. Lastly, $I_{E, V}$ can be parameterized by $E_{b} / N_{0}$ for a given code rate $R$ since $\sigma_{c h}^{2}=4 / \sigma_{w}^{2}=8 R\left(E_{b} / N_{0}\right)$. Approximations of the functions $J(\cdot)$ and $J^{-1}(\cdot)$ are given in [15].
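For readers who want to reproduce a VND curve without the approximations of [15], the sketch below evaluates $J(\cdot)$ by direct numerical integration of (4), inverts it by bisection (which relies only on the monotonicity noted above), and then applies (7). The grid width, the number of grid points, and the bisection bracket are arbitrary choices made for this illustration, and the function names are not from the source.

```python
import numpy as np

def J(sigma: float) -> float:
    """Eq. (4): mutual information for an LLR distributed as N(sigma^2/2, sigma^2)."""
    if sigma < 1e-6:
        return 0.0
    mean = sigma**2 / 2.0
    # Integrate over mean +/- 10 standard deviations (assumed wide enough).
    l = np.linspace(mean - 10.0 * sigma, mean + 10.0 * sigma, 20001)
    pdf = np.exp(-((l - mean) ** 2) / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)
    # log2(1 + e^{-l}), computed stably with logaddexp.
    integrand = pdf * np.logaddexp(0.0, -l) / np.log(2.0)
    dl = l[1] - l[0]
    integral = dl * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return float(min(1.0, max(0.0, 1.0 - integral)))

def J_inv(info: float, hi: float = 40.0) -> float:
    """Invert J by bisection on [0, hi]; valid because J is monotonic in sigma."""
    lo = 0.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if J(mid) < info:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def vnd_exit(i_a: float, d_v: int, ebn0_db: float, rate: float) -> float:
    """Eq. (7), with sigma_ch^2 = 8 R (Eb/N0)."""
    sigma_ch2 = 8.0 * rate * 10.0 ** (ebn0_db / 10.0)
    return J(float(np.sqrt((d_v - 1) * J_inv(i_a) ** 2 + sigma_ch2)))
```

Sweeping `i_a` from 0 to 1 for fixed `d_v`, `ebn0_db`, and `rate` traces one VND transfer curve; at `i_a = 0` the curve starts at $J\left(\sigma_{c h}\right)$, the information supplied by the channel alone.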

To obtain the CND EXIT curve, $I_{E, C}$ versus $I_{A, C}$, we can proceed as we did in the VND case, e.g., begin with the symmetric-Gaussian assumption. However, this assumption is not sufficient because determining the mean and variance for a CND output $L_{j \rightarrow i}$ is not straightforward, as is evident from the computation for CNDs in (3). Closed-form expressions have been derived for the check node EXIT curves [21][22]. Computer-based numerical techniques can also be used to obtain these curves. However, the simplest technique exploits the following duality relationship (proven to be exact for the binary erasure channel [11]): the EXIT curve for a degree-$d_{c}$ check node (i.e., rate-$\left(d_{c}-1\right) / d_{c}$ single-parity check (SPC) code) and that of a degree-$d_{c}$ variable node (i.e., rate-$1 / d_{c}$ repetition code) are related as

$$
I_{E, S P C}\left(d_{c}, I_{A}\right)=1-I_{E, R E P}\left(d_{c}, 1-I_{A}\right)
$$

This relationship was shown to be very accurate for the BI-AWGNC in [21][22]. Thus,

$$
\begin{align*}
I_{E, C} & =1-I_{E, V}\left(\sigma_{c h}=0, d_{v} \leftarrow d_{c}, I_{A, V} \leftarrow 1-I_{A, C}\right) \\
& =1-J\left(\sqrt{\left(d_{c}-1\right)\left[J^{-1}\left(1-I_{A, C}\right)\right]^{2}}\right) . \tag{8}
\end{align*}
$$
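The following sketch puts (7) and (8) together for a regular ensemble and steps through the predicted decoding trajectory. To keep it short, $J(\cdot)$ and $J^{-1}(\cdot)$ are replaced by a closed-form curve fit of the form $J(\sigma) \approx\left(1-2^{-H_{1} \sigma^{2 H_{2}}}\right)^{H_{3}}$; this fit and its constants are an assumption of the example (they are not the approximations given in [15]), so thresholds obtained with it are only approximate.

```python
import math

# Assumed curve-fit constants for J; not taken from the source.
H1, H2, H3 = 0.3073, 0.8935, 1.1064

def J(sigma: float) -> float:
    if sigma <= 0.0:
        return 0.0
    return (1.0 - 2.0 ** (-H1 * sigma ** (2.0 * H2))) ** H3

def J_inv(info: float) -> float:
    if info <= 0.0:
        return 0.0
    info = min(info, 1.0 - 1e-12)  # keep the logarithm finite
    return max(0.0, -math.log2(1.0 - info ** (1.0 / H3)) / H1) ** (1.0 / (2.0 * H2))

def vnd_exit(i_a: float, d_v: int, sigma_ch2: float) -> float:
    """Eq. (7)."""
    return J(math.sqrt((d_v - 1) * J_inv(i_a) ** 2 + sigma_ch2))

def cnd_exit(i_a: float, d_c: int) -> float:
    """Eq. (8): CND curve obtained from the repetition-code curve by duality."""
    return 1.0 - J(math.sqrt(d_c - 1) * J_inv(1.0 - i_a))

def trajectory_converges(d_v: int, d_c: int, ebn0_db: float,
                         max_iter: int = 5000, tol: float = 1e-6) -> bool:
    """Alternate the two curves; True if the mutual information reaches 1."""
    rate = 1.0 - d_v / d_c  # design rate of the regular ensemble
    sigma_ch2 = 8.0 * rate * 10.0 ** (ebn0_db / 10.0)
    i_ec = 0.0
    for _ in range(max_iter):
        i_ev = vnd_exit(i_ec, d_v, sigma_ch2)  # VND half-iteration
        i_ec = cnd_exit(i_ev, d_c)             # CND half-iteration
        if i_ev > 1.0 - tol:
            return True
    return False
```

A threshold estimate follows by bisecting on `ebn0_db` for the smallest value at which `trajectory_converges` returns `True`, which corresponds to the SNR at which the two curves just touch.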

For irregular LDPC codes, $I_{E, V}$ and $I_{E, C}$ are computed as weighted averages. The weighting is given by the coefficients of the "edge perspective" degree distribution polynomials $\lambda(z)=\sum_{d=1}^{d_{v}} \lambda_{d} z^{d-1}$ and $\rho(z)=\sum_{d=1}^{d_{c}} \rho_{d} z^{d-1}$, where $\lambda_{d}$ is the fraction of edges in the Tanner graph connected to degree-$d$ variable nodes, $\rho_{d}$ is the fraction of edges connected to degree-$d$ check nodes, and $\lambda(1)=\rho(1)=1$. Then, for irregular LDPC codes,

$$
\begin{equation*}
I_{E, V}=\sum_{d=1}^{d_{v}} \lambda_{d} I_{E, V}\left(d, I_{A, V}\right) \tag{9}
\end{equation*}
$$

where $I_{E, V}(d)$ is given by (7) with $d_{v}$ replaced by $d$, and

$$
\begin{equation*}
I_{E, C}=\sum_{d=1}^{d_{c}} \rho_{d} I_{E, C}\left(d, I_{A, C}\right) \tag{10}
\end{equation*}
$$

where $I_{E, C}(d)$ is given by (8) with $d_{c}$ replaced by $d$.

Fig. 3. EXIT chart for rate-1/2 irregular LDPC code. (Ack: S. AbuSurra)

It has been shown [11] that to optimize the decoding threshold on the binary erasure channel, the shapes of the VND and CND transfer curves must be well matched in the sense that the CND curve fits inside the VND curve (an example will follow). This situation has also been observed on the BI-AWGNC [15]. Further, to achieve a good match, the number of different VN degrees need only be about 3 or 4 and the number of different CN degrees need only be 1 or 2.

Example 1: We consider the design of a rate-$1 / 2$ irregular LDPC code with four possible VN degrees and two possible CN degrees. Given that $\lambda(1)=\rho(1)=1$ and $R=1-\int_{0}^{1} \rho(z) d z / \int_{0}^{1} \lambda(z) d z$ [16][4], only two of the four coefficients for $\lambda(z)$ need be specified and only one of the two for $\rho(z)$ need be specified. A non-exhaustive search yielded $\lambda(z)=0.267 z+0.176 z^{2}+0.127 z^{3}+0.430 z^{9}$ and $\rho(z)=0.113 z^{4}+0.887 z^{7}$ with a decoding threshold of $\left(E_{b} / N_{0}\right)_{\text {thres }}=0.414 \mathrm{~dB}$. The EXIT chart for $E_{b} / N_{0}=0.55$ dB is presented in Fig. 3. The figure also gives the "node perspective" degree distribution information.
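The rate constraint in Example 1 is easy to check numerically. The sketch below stores each edge-perspective polynomial as a mapping from node degree $d$ to the coefficient of $z^{d-1}$, uses $\int_{0}^{1} z^{d-1} d z=1 / d$, and also converts to the node perspective; the helper names are illustrative only.

```python
# Edge-perspective coefficients from Example 1, keyed by node degree d
# (the coefficient of z^(d-1) in lambda(z) or rho(z)).
lam = {2: 0.267, 3: 0.176, 4: 0.127, 10: 0.430}
rho = {5: 0.113, 8: 0.887}

def integral(edge_dist: dict) -> float:
    """Integral over [0, 1] of sum_d c_d z^(d-1), i.e. sum_d c_d / d."""
    return sum(c / d for d, c in edge_dist.items())

def design_rate(lam: dict, rho: dict) -> float:
    """R = 1 - (integral of rho) / (integral of lambda)."""
    return 1.0 - integral(rho) / integral(lam)

def node_perspective(edge_dist: dict) -> dict:
    """Fraction of nodes of each degree implied by the edge fractions."""
    norm = integral(edge_dist)
    return {d: (c / d) / norm for d, c in edge_dist.items()}

rate = design_rate(lam, rho)      # close to 1/2 for the Example 1 polynomials
vn_nodes = node_perspective(lam)  # fractions of VNs of degree 2, 3, 4, and 10
```

Because the published coefficients are rounded to three decimals, the computed rate matches $1 / 2$ only to within that rounding.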

The references contain additional information on EXIT charts, including the so-called area property, EXIT charts for the Rayleigh channel, for higher-order modulation, and for multi-input/multi-output channels [14][15][11][23].

## III. Design of Protograph-Based Codes

## A. Definition and Problem Statement

A protograph [24][25][26][27] is a relatively small bipartite graph from which a larger graph can be obtained by a copy-and-permute procedure: the protograph is copied $Q$ times, and then the edges of the individual replicas are permuted among the replicas (under restrictions described below) to obtain a single, large graph. An example is presented in Fig. 4. The permuted edge connections are specified by the parity-check matrix $\mathbf{H}$. Note that the edge permutations cannot be arbitrary. In particular, the nodes of the protograph are labeled so that if variable node V is connected to check node C in the protograph, then variable node V in a replica can only connect to one of the $Q$ replicated C check nodes. Doing so preserves the decoding threshold properties of the protograph. A protograph can possess parallel edges, i.e., two nodes can be connected by more than one edge. For LDPC codes, the copy-and-permute procedure must eliminate such parallel connections in order to obtain a derived graph appropriate for a parity-check matrix.

Fig. 4. Illustration of the protograph copy-and-permute procedure with $Q=4$ copies.
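One simple way to realize the copy-and-permute procedure is sketched below: each protograph edge is replaced by a $Q \times Q$ circulant permutation matrix, and parallel edges between the same pair of nodes receive distinct cyclic shifts so that no parallel connection survives in the derived graph. The random choice of shifts is an assumption made for illustration; a practical design would select the shifts to control cycles and minimum distance. The protograph used is one with two check nodes and three variable nodes, including one pair of parallel edges.

```python
import numpy as np

def circulant_perm(Q: int, shift: int) -> np.ndarray:
    """Q x Q identity matrix with its columns cyclically shifted by `shift`."""
    return np.roll(np.eye(Q, dtype=np.uint8), shift, axis=1)

def lift(B, Q: int, rng: np.random.Generator) -> np.ndarray:
    """Expand an M x N protograph (base) matrix into an MQ x NQ binary matrix.

    An entry b of B becomes the sum of b circulant permutation matrices with
    distinct shifts, which removes parallel edges (requires b <= Q).
    """
    B = np.asarray(B)
    M, N = B.shape
    H = np.zeros((M * Q, N * Q), dtype=np.uint8)
    for j in range(M):
        for i in range(N):
            shifts = rng.choice(Q, size=int(B[j, i]), replace=False)
            for s in shifts:
                H[j * Q:(j + 1) * Q, i * Q:(i + 1) * Q] += circulant_perm(Q, int(s))
    return H

# Two CNs, three VNs, with a double edge between the first CN and first VN.
H = lift([[2, 1, 1], [1, 1, 1]], Q=4, rng=np.random.default_rng(0))
```

Every replica of a node keeps the degree of the corresponding protograph node, which is why the row and column sums of `H` repeat those of the base matrix.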

It is convenient to choose the parity-check matrix $\mathbf{H}$ as an $M \times N$ array of $Q \times Q$ (weight-one) circulant permutation matrices (some of which may be the $Q \times Q$ zero matrix). When $\mathbf{H}$ is an array of circulants, the LDPC code will be quasi-cyclic. Such a structure has a favorable impact on both the encoder and the decoder. The encoder for QC codes can be implemented with shift-register circuits with complexity linearly proportional to $m$ for serial encoding and to $n$ for parallel encoding [13]. By contrast, encoders for unstructured LDPC codes require much more work. The decoder for QC LDPC codes can be implemented in a modular fashion by exploiting the circulant-array structure of $\mathbf{H}$ [28][29].

Below we present an extension of the EXIT approach to codes defined by protographs. This extension is a multidimensional numerical technique and as such does not have a two-dimensional EXIT chart representation of the iterative decoding procedure. Still, the technique yields decoding thresholds for LDPC code ensembles specified by protographs. This multidimensional technique is facilitated by the relatively small size of protographs and permits the analysis of protograph code ensembles characterized by the presence of critical node types, i.e., node types which can lead to failed EXIT-based convergence of code ensembles. Examples of critical node types are degree-1 variable nodes and punctured variable nodes.

A code ensemble specified by a protograph is a refinement (sub-ensemble) of a code ensemble specified simply by the protograph's (hence, LDPC code's) degree distributions. To demonstrate this, we introduce the adjacency matrix $\mathbf{B}=\left[b_{j i}\right]$ for a protograph, also called a base matrix [25], where $b_{j i}$ is the number of edges between $\mathrm{CN} j$ and $\mathrm{VN} i$. As an example, for the protograph at the top of Fig. 4,

$$
\mathbf{B}=\left(\begin{array}{lll}
2 & 1 & 1 \\
1 & 1 & 1
\end{array}\right) .
$$

Consider also an alternative protograph and base matrix specified by

$$
\mathbf{B}^{\prime}=\left(\begin{array}{lll}
2 & 0 & 2 \\
1 & 2 & 0
\end{array}\right) .
$$

The degree distributions of both of these protographs are identical and are easily seen to be

$$
\begin{aligned}
\lambda(z) & =\frac{4}{7} z+\frac{3}{7} z^{2} \\
\rho(z) & =\frac{3}{7} z^{2}+\frac{4}{7} z^{3} .
\end{aligned}
$$

However, the ensemble corresponding to $\mathbf{B}$ has a threshold of $E_{b} / N_{0}=0.78 \mathrm{~dB}$ and that corresponding to $\mathbf{B}^{\prime}$ has a threshold at 0.83 dB. (For reference, density evolution [16] applied to the above degree distributions gives 0.817 dB.)
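The claim that the two base matrices share their degree distributions can be verified mechanically. In the sketch below (the function name is illustrative), column sums give the VN degrees, row sums give the CN degrees, and each node of degree $d$ contributes $d$ of the total edges to the edge-perspective fraction for degree $d$.

```python
from fractions import Fraction

def edge_degree_distributions(B):
    """Return (lambda, rho) as dicts mapping node degree to edge fraction."""
    rows, cols = len(B), len(B[0])
    total_edges = sum(sum(row) for row in B)
    vn_degrees = [sum(B[j][i] for j in range(rows)) for i in range(cols)]
    cn_degrees = [sum(row) for row in B]
    lam, rho = {}, {}
    for d in vn_degrees:
        lam[d] = lam.get(d, 0) + Fraction(d, total_edges)
    for d in cn_degrees:
        rho[d] = rho.get(d, 0) + Fraction(d, total_edges)
    return lam, rho

B = [[2, 1, 1], [1, 1, 1]]
B_prime = [[2, 0, 2], [1, 2, 0]]
same = edge_degree_distributions(B) == edge_degree_distributions(B_prime)
```

Here `lam[2]` is the coefficient of $z$ in $\lambda(z)$ and `lam[3]` the coefficient of $z^{2}$, so the result can be compared directly with the polynomials above.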

As another example, let

$$
\mathbf{B}=\left(\begin{array}{lllll}
1 & 2 & 1 & 1 & 0 \\
2 & 1 & 1 & 1 & 0 \\
1 & 2 & 0 & 0 & 1
\end{array}\right)
$$

and

$$
\mathbf{B}^{\prime}=\left(\begin{array}{lllll}
1 & 3 & 1 & 0 & 0 \\
2 & 1 & 1 & 1 & 0 \\
1 & 1 & 0 & 1 & 1
\end{array}\right)
$$

noting that they have identical degree distributions. We also puncture the bits corresponding to the second column in each base matrix. Using the multidimensional EXIT algorithm described below, the thresholds for $\mathbf{B}$ and $\mathbf{B}^{\prime}$ in this case were computed to be 0.48 dB and $+\infty$, respectively.

Thus, standard EXIT analysis based on degree distributions is inadequate for protograph-based LDPC code design. In fact, the presence of degree-1 variable nodes as in our second example implies that there is a term in the summation in (9) of the form

$$
\lambda_{1} I_{E, V}\left(1, I_{A, V}\right)=\lambda_{1} J\left(\sigma_{c h}\right) .
$$

Since $J\left(\sigma_{c h}\right)$ is always less than one for $0<\sigma_{c h}<\infty$ and since $\sum_{d=1}^{d_{v}} \lambda_{d}=1$, the summation in (9), that is, $I_{E, V}$, will be strictly less than one. Again, standard EXIT analysis implies failed convergence for codes with the same degree distributions as $\mathbf{B}$ and $\mathbf{B}^{\prime}$. This is in contrast with the fact that codes in the $\mathbf{B}$ ensemble do converge when the SNR exceeds the threshold of 0.48 dB .
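The bound behind this argument can be written out explicitly. Assuming $\lambda_{1}>0$ and using only $I_{E, V}\left(d, I_{A, V}\right) \leq 1$ for the terms with $d \geq 2$, the average in (9) satisfies

$$
\begin{aligned}
I_{E, V} & =\lambda_{1} J\left(\sigma_{c h}\right)+\sum_{d=2}^{d_{v}} \lambda_{d} I_{E, V}\left(d, I_{A, V}\right) \\
& \leq \lambda_{1} J\left(\sigma_{c h}\right)+\left(1-\lambda_{1}\right) \\
& =1-\lambda_{1}\left(1-J\left(\sigma_{c h}\right)\right)<1,
\end{aligned}
$$

so the averaged VND curve stays a fixed distance below 1 for every $I_{A, V}$, however large the (finite) SNR.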

In the following, a multidimensional EXIT technique [30][31] will be presented which overcomes this issue and allows the determination of the decoding threshold for codes based on protographs (possibly with punctured nodes).

## B. Multidimensional EXIT Analysis

The algorithm presented in [30][31] eliminates the average in (9) and considers the propagation of the messages on a decoding tree which is specified by the protograph of the ensemble. Let $\mathbf{B}=\left[b_{j i}\right]$ be the $M \times N$ base matrix for the protograph under analysis. Let $I_{E, V}^{i \rightarrow j}$ be the extrinsic mutual information between code bits associated with "type $i$" VNs and the LLRs $L_{i \rightarrow j}$ sent from these VNs to "type $j$" CNs. Similarly, let $I_{E, C}^{j \rightarrow i}$ be the extrinsic mutual information between code bits associated with "type $i$" VNs and the LLRs $L_{j \rightarrow i}$ sent from "type $j$" CNs to these VNs. Then, because $I_{E, C}^{j \rightarrow i}$ acts as a priori mutual information in the calculation of $I_{E, V}^{i \rightarrow j}$, following (7) we have (given an edge exists between CN $j$ and VN $i$, i.e., given $b_{j i} \neq 0$)

$$
\begin{equation*}
I_{E, V}^{i \rightarrow j}=J\left(\sqrt{\sum_{c=1}^{M}\left(b_{c i}-\delta_{c j}\right)\left(J^{-1}\left(I_{E, C}^{c \rightarrow i}\right)\right)^{2}+\sigma_{c h, i}^{2}}\right) \tag{11}
\end{equation*}
$$

where $\delta_{c j}=1$ when $c=j$ and $\delta_{c j}=0$ when $c \neq j$. The variance $\sigma_{c h, i}^{2}$ is set to zero if code bit $i$ is punctured. Similarly, because $I_{E, V}^{i \rightarrow j}$ acts as a priori mutual information in the calculation of $I_{E, C}^{j \rightarrow i}$, following (8) we have (when $b_{j i} \neq 0$)

$$
\begin{equation*}
I_{E, C}^{j \rightarrow i}=1-J\left(\sqrt{\sum_{v=1}^{N}\left(b_{j v}-\delta_{v i}\right)\left(J^{-1}\left(1-I_{E, V}^{v \rightarrow j}\right)\right)^{2}}\right) \tag{12}
\end{equation*}
$$

The multidimensional EXIT algorithm can now be presented as follows.

1) Initialization. Select $E_{b} / N_{0}$. Initialize a vector $\sigma_{c h}=\left(\sigma_{c h, 0}, \ldots, \sigma_{c h, N-1}\right)$ such that

$$
\sigma_{c h, i}^{2}=8 R\left(\frac{E_{b}}{N_{0}}\right)_{i}
$$

where $\left(E_{b} / N_{0}\right)_{i}$ equals zero when $x_{i}$ is punctured and equals the selected $E_{b} / N_{0}$ otherwise.
2) VN to CN. For $i=0, \ldots, N-1$ and $j=0, \ldots, M-1$, compute (11).
3) CN to VN. For $i=0, \ldots, N-1$ and $j=0, \ldots, M-1$, compute (12).
4) Cumulative mutual information. For $i=0, \ldots, N-1$, compute

$$
I_{C M I}^{i}=J\left(\sqrt{\sum_{c=1}^{M}\left(J^{-1}\left(I_{E, C}^{c \rightarrow i}\right)\right)^{2}+\sigma_{c h, i}^{2}}\right)
$$

5) If $I_{C M I}^{i}=1$ (up to desired precision) for all $i$, then stop; otherwise, go to step 2.

This algorithm converges only when the selected $E_{b} / N_{0}$ is above the threshold. Thus, the threshold is the lowest value of $E_{b} / N_{0}$ for which all $I_{C M I}^{i}$ converge to 1. As shown in [30][31], the thresholds computed by this algorithm are typically within 0.05 dB of those computed by density evolution. Recalling that many classes of multi-edge type (MET) [26] LDPC codes rely on simple protographs, the above algorithm provides an accurate threshold estimation for MET ensembles, with a remarkable reduction in computational complexity relative to the density evolution analysis proposed in [26].
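The five steps above translate almost line for line into code. The sketch below is one possible rendering, not the implementation of [30][31]: it uses an assumed closed-form curve fit for $J(\cdot)$ (so computed thresholds will differ slightly from the quoted values), takes the design rate as $(N-M)$ divided by the number of unpunctured VNs (assuming a full-rank parity-check matrix), and reports convergence at a single $E_{b} / N_{0}$ rather than searching for the threshold.

```python
import math

# Assumed curve-fit constants for J; not taken from the source.
H1, H2, H3 = 0.3073, 0.8935, 1.1064

def J(sigma: float) -> float:
    if sigma <= 0.0:
        return 0.0
    return (1.0 - 2.0 ** (-H1 * sigma ** (2.0 * H2))) ** H3

def J_inv(info: float) -> float:
    if info <= 0.0:
        return 0.0
    info = min(info, 1.0 - 1e-12)  # keep the logarithm finite
    return max(0.0, -math.log2(1.0 - info ** (1.0 / H3)) / H1) ** (1.0 / (2.0 * H2))

def protograph_converges(B, ebn0_db, punctured=(), max_iter=2000, tol=1e-6):
    """True if every cumulative mutual information reaches 1 within tol."""
    M, N = len(B), len(B[0])
    rate = (N - M) / (N - len(punctured))  # design rate (assumes full rank)
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    # Step 1: sigma_ch,i^2, set to zero for punctured VNs.
    s2 = [0.0 if i in punctured else 8.0 * rate * ebn0 for i in range(N)]
    i_ev = [[0.0] * N for _ in range(M)]  # i_ev[j][i]: VN i -> CN j
    i_ec = [[0.0] * N for _ in range(M)]  # i_ec[j][i]: CN j -> VN i
    for _ in range(max_iter):
        # Step 2: VN to CN, eq. (11).
        for j in range(M):
            for i in range(N):
                if B[j][i]:
                    acc = sum((B[c][i] - (1 if c == j else 0)) * J_inv(i_ec[c][i]) ** 2
                              for c in range(M))
                    i_ev[j][i] = J(math.sqrt(acc + s2[i]))
        # Step 3: CN to VN, eq. (12).
        for j in range(M):
            for i in range(N):
                if B[j][i]:
                    acc = sum((B[j][v] - (1 if v == i else 0)) * J_inv(1.0 - i_ev[j][v]) ** 2
                              for v in range(N))
                    i_ec[j][i] = 1.0 - J(math.sqrt(acc))
        # Steps 4 and 5: cumulative mutual information, as written in step 4.
        cmi = [J(math.sqrt(sum(J_inv(i_ec[c][i]) ** 2 for c in range(M)) + s2[i]))
               for i in range(N)]
        if all(x > 1.0 - tol for x in cmi):
            return True
    return False
```

A threshold estimate is obtained by bisecting on `ebn0_db` between a value where the function returns `False` and one where it returns `True`. For a protograph whose threshold is infinite, the function returns `False` at every finite `ebn0_db` for which the bounded mutual information stays below `1 - tol`.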

## IV. Accumulator-Based Code Designs

## A. Repeat-Accumulate Codes

This section provides an overview of the design of LDPC codes that can be considered to be a concatenation of a set of repetition codes with one or more accumulators, through an interleaver. The first examples of accumulator-based codes were the so-called repeat-accumulate (RA) codes [32]. Despite their simple structure, they were shown to provide good performance and, more importantly, they paved a path toward the design of efficiently encodable LDPC codes. RA codes and other accumulator-based codes are LDPC codes that can be decoded as serial turbo codes or as LDPC codes.

An RA code consists of a serial concatenation of a single rate-$1 / q$ repetition code through an interleaver with an accumulator having transfer function $1 /(1 \oplus D)$. RA codes can be either non-systematic or systematic. In the first case, the accumulator output, $\mathbf{p}$, is the codeword and the code rate is $1 / q$. For systematic RA codes, the information word, $\mathbf{u}$, is combined with $\mathbf{p}$ to yield the codeword $\mathbf{c}=[\mathbf{u} \; \mathbf{p}]$, so that the code rate is $1 /(1+q)$. RA codes perform reasonably well on the AWGN channel, and they tend to approach the channel capacity as their rate $R \rightarrow 0$ and their block length $n \rightarrow \infty$. Their main limitations are the code rate, which cannot be higher than $1 / 2$, and the performance of short and medium-length RA codes. The following subsections will present a brief overview of the major enhancements to RA codes which permit operation closer to capacity for both high and low rates.

## B. Irregular Repeat-Accumulate codes

The systematic irregular repeat-accumulate (IRA) codes generalize the systematic RA codes in that the repetition rate may differ across the $k$ information bits and that a variable number of bits in the repeated word are combined (modulo 2) prior to sending them through the accumulator. Irregular repeat-accumulate [33] codes provide several advantages over RA codes. They allow flexibility in the choice of the repetition rate for each information bit, so that high-rate codes may be designed and capacity is more easily approached.

The Tanner graph for IRA codes is presented in Fig. 5(a) and the encoder structure (to be discussed further later) is depicted in Fig. 5(b). The variable repetition rate is accounted for in the graph by letting $d_{b, i}$ vary with $i$. The accumulator is represented by the right-most part of the graph, where the dashed edge is added to include the possibility of a tail-biting trellis. Also, we see that $d_{c, j}$ interleaver output bits are added (modulo 2) to produce the $j$-th accumulator input. Fig. 5 also includes the representation for RA codes. As indicated in the table in the figure, for an RA code, each information bit node connects to exactly $q$ check nodes ($d_{b, i}=q$) and each check node connects to exactly one information bit node ($d_{c, j}=1$). We remark that $\left\{d_{b, i}\right\}$ and $\left\{d_{c, j}\right\}$ can be related to our earlier notation, $\left\{d_{v}(i)\right\}$ and $\left\{d_{c}(j)\right\}$, as follows: $d_{v}(i)=d_{b, i}$ for $i=1, \ldots, k$, $d_{v}(i)=2$ for $i=k+1, \ldots, n$, and $d_{c}(j)=d_{c, j}+2$ for $j=1, \ldots, m$.

| Code | $d_{b, i}$ | $d_{c, j}$ | $m=n-k$ |
| :---: | :---: | :---: | :---: |
| RA | $q$ | 1 | $m>k, m=q k$ |
| IRA | variable | variable | $m \geq 1, k \geq 1$ |

Fig. 5. Tanner graph (a) and encoder (b) for irregular repeat-accumulate codes.

To determine the code rate for an IRA code, define $\bar{q}$ to be the average repetition rate of the information bits

$$
\bar{q}=\frac{1}{k} \sum_{i=1}^{k} d_{b, i},
$$

and $\bar{a}$ as the average of the degrees $\left\{d_{c, j}\right\}$,

$$
\bar{a}=\frac{1}{m} \sum_{j=1}^{m} d_{c, j}
$$

Then the code rate for systematic IRA codes is

$$
R=\frac{1}{1+\bar{q} / \bar{a}}
$$

For non-systematic IRA codes, $R=\bar{a} / \bar{q}$.
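As a quick numerical check of these rate expressions, the sketch below computes $\bar{q}$, $\bar{a}$, and both rates from the degree lists. The function name and the example degree lists are illustrative; the only consistency requirement used is that the edges leaving the information bit nodes equal the edges entering the check nodes through the interleaver.

```python
def ira_rates(d_b, d_c):
    """Return (systematic rate, non-systematic rate) of an IRA code.

    d_b: repetition degrees of the k information bits.
    d_c: number of interleaver outputs combined at each of the m check nodes.
    """
    if sum(d_b) != sum(d_c):
        raise ValueError("edge counts on both sides of the interleaver must match")
    q_bar = sum(d_b) / len(d_b)  # average repetition rate
    a_bar = sum(d_c) / len(d_c)  # average check-node combining degree
    return 1.0 / (1.0 + q_bar / a_bar), a_bar / q_bar

# RA code with q = 3 and k = 4: every d_b is 3 and every d_c is 1 (m = q k = 12).
ra_sys, ra_nonsys = ira_rates([3] * 4, [1] * 12)

# A small irregular example: k = 4, m = 4, 16 interleaver edges.
ira_sys, ira_nonsys = ira_rates([3, 3, 5, 5], [4, 4, 4, 4])
```

For the RA case the two values reduce to $1 /(1+q)$ and $1 / q$, and in general the systematic rate equals $k /(k+m)$.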
The parity-check matrix for systematic RA and IRA codes has the form

$$
\begin{equation*}
\mathbf{H}=\left[\begin{array}{ll}
\mathbf{H}_{u} & \mathbf{H}_{p}
\end{array}\right] \tag{13}
\end{equation*}
$$

where $\mathbf{H}_{p}$ is an $m \times m$ "dual-diagonal" square matrix,

$$
\begin{equation*}
\mathbf{H}_{p}=\left[\begin{array}{ccccc}
1 & & & & (1) \\
1 & 1 & & & \\
& \ddots & \ddots & & \\
& & 1 & 1 & \\
& & & 1 & 1
\end{array}\right] \tag{14}
\end{equation*}
$$

where the upper-right 1 is included for tail-biting accumulators. For RA codes, $\mathbf{H}_{u}$ is a regular matrix having column weight $q$ and row weight 1. For IRA codes, $\mathbf{H}_{u}$ has column weights $\left\{d_{b, i}\right\}$ and row weights $\left\{d_{c, j}\right\}$. The encoder of Fig. 5(b) is obtained by noting that the generator matrix corresponding to $\mathbf{H}$ in (13) is $\mathbf{G}=\left[\begin{array}{ll}\mathbf{I} & \mathbf{H}_{u}^{T} \mathbf{H}_{p}^{-T}\end{array}\right]$ and writing $\mathbf{H}_{u}$ as $\boldsymbol{\Pi}^{T} \mathbf{A}^{T}$, where $\boldsymbol{\Pi}$ is a permutation matrix. Note also that $\mathbf{H}_{p}^{-T}$ performs the same computation as $1 /(1 \oplus D)$ (and $\mathbf{H}_{p}^{-T}$ exists only when the "tail-biting 1" is absent). Two encoding alternatives exist: (1) When the accumulator is not tail-biting, one may use $\mathbf{H}$ to encode since one may solve for the parity bits sequentially from the equation $\mathbf{c H}^{T}=\mathbf{0}$ starting with the top row of $\mathbf{H}$ and moving on downward. (2) As discussed in the next section, quasi-cyclic IRA code designs are possible, in which case the techniques of [13] may be used.
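The first encoding alternative amounts to a running modulo-2 sum. In the sketch below (an illustration with hypothetical names, valid only when the tail-biting 1 is absent), row $j$ of $\mathbf{c H}^{T}=\mathbf{0}$ reads $s_{j}+p_{j-1}+p_{j}=0$ with $s_{j}$ the modulo-2 sum of the information bits checked by row $j$ of $\mathbf{H}_{u}$, so $p_{j}=p_{j-1}+s_{j}$, which is exactly the accumulator $1 /(1 \oplus D)$.

```python
import numpy as np

def ira_encode(u: np.ndarray, H_u: np.ndarray) -> np.ndarray:
    """Systematic IRA encoding for H = [H_u  H_p] with a non-tail-biting
    dual-diagonal H_p: returns the codeword c = [u  p]."""
    s = H_u.dot(u) % 2    # per-check modulo-2 sums of information bits
    p = np.cumsum(s) % 2  # accumulator: p_j = p_{j-1} + s_j (mod 2)
    return np.concatenate([u, p])

# Example with a small random H_u (m = 6 checks, k = 4 information bits).
rng = np.random.default_rng(1)
H_u = rng.integers(0, 2, size=(6, 4))
u = rng.integers(0, 2, size=4)
c = ira_encode(u, H_u)
```

The cost is one sparse matrix-vector product plus $m$ additions, which is why this structure is described as efficiently encodable.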

We remark that the choice of the degree distributions of the variable nodes for an IRA code is constrained by the presence of (at least) $n-k-1$ degree-2 variable nodes. Although such a constraint ostensibly limits the code designer, for rates $R \geq 1 / 2$, EXIT analysis leads to optimized degree distributions having approximately $n-k-1$ degree-2 variable nodes. Moreover, when the number of degree-2 variable nodes is exactly $n-k-1$, the edge connections involving the degree-2 variable nodes induced by the IRA structure are optimal in the sense of avoiding low-weight codewords [34][35].

IRA codes and a generalization will be discussed in the next two sections. Additional information may be found in the following references: [33][35][36][24][40][41][42][43].

## C. Structured IRA and IRAA Codes

Given the code rate, length, and degree distributions, an IRA code is defined entirely by the matrix $\mathbf{H}_{u}$ (equivalently, by $\mathbf{A}$ and $\boldsymbol{\Pi}$). While a random-like $\mathbf{H}_{u}$ would generally give good performance, it is problematic for both encoder and decoder implementations. For, in this case, a substantial amount of memory would be required to store the connection information implicit in $\mathbf{H}_{u}$. In addition, although standard message-passing decoding algorithms for LDPC codes are inherently parallel, the physical interconnections required to realize a code's bipartite graph become an implementation bottleneck and prohibit a fully parallel decoder [29]. Using a structured $\mathbf{H}_{u}$ matrix mitigates these problems.

Tanner [24] was the first to consider structured RA codes, more specifically, quasi-cyclic RA codes, which require tail-biting in the accumulator. Simulation results in [24] demonstrate that the QC-RA codes compete well with random-like RA codes and surpass their performance at high SNR values. Similar ideas were applied to IRA codes in [29][44][36].

In [36], IRA codes with quasi-cyclic structure are called structured IRA (S-IRA) codes.

Toward the goal of attaining structure in $\mathbf{H}$, one cannot simply choose $\mathbf{H}_{u}$ to be an array of circulant permutation matrices. For, it is easy to show that doing so will produce a poor LDPC code in the sense of minimum distance (consider weight-2 encoder inputs with adjacent ones). Instead, the following strategy is proposed in [36]. Let $\mathbf{P}$ be an $L \times J$ array of $Q \times Q$ circulant permutation matrices (for some convenient $Q$). (Conditions for designing $\mathbf{P}$ to avoid 4-cycles, etc., are described in [36].) Then set $\mathbf{A}^{T}=\mathbf{P}$ so that $\mathbf{H}_{u}=\boldsymbol{\Pi}^{T} \mathbf{P}$ and

$$
\begin{equation*}
\mathbf{H}_{a}=\left[\begin{array}{ll}
\boldsymbol{\Pi}^{T} \mathbf{P} & \mathbf{H}_{p}
\end{array}\right], \tag{15}
\end{equation*}
$$

where $\mathbf{H}_{p}$ represents the tail-biting accumulator. Note that $m=L \times Q$ and $k=J \times Q$.

We now choose $\boldsymbol{\Pi}$ to be a standard deterministic "row-column" interleaver so that row $l Q+q$ in $\mathbf{P}$ becomes row $q L+l$ in $\boldsymbol{\Pi}^{T} \mathbf{P}$, for all $0 \leq l<L$ and $0 \leq q<Q$. Next, we permute the rows of $\mathbf{H}_{a}$ by $\boldsymbol{\Pi}^{-T}$ to obtain

$$
\begin{equation*}
\mathbf{H}_{b}=\boldsymbol{\Pi}^{-T} \mathbf{H}_{a}=\left[\begin{array}{ll}
\mathbf{P} & \boldsymbol{\Pi} \mathbf{H}_{p}
\end{array}\right], \tag{16}
\end{equation*}
$$

where we have used the fact that $\boldsymbol{\Pi}^{-T}=\boldsymbol{\Pi}$. Finally, we permute only the columns corresponding to the parity part of $\mathbf{H}_{b}$, which gives

$$
\begin{equation*}
\mathbf{H}_{\text {S-IRA }}=\left[\begin{array}{ll}
\mathbf{P} & \boldsymbol{\Pi} \mathbf{H}_{p} \boldsymbol{\Pi}^{T}
\end{array}\right] . \tag{17}
\end{equation*}
$$

It is easily shown that the parity part of $\mathbf{H}_{\text {S-IRA }}$, that is, $\boldsymbol{\Pi} \mathbf{H}_{p} \boldsymbol{\Pi}^{T}$, is exactly in QC form,

$$
\begin{equation*}
\left[\begin{array}{ccccc}
I_{0} & & & & I_{1} \\
I_{0} & I_{0} & & & \\
& \ddots & \ddots & & \\
& & I_{0} & I_{0} & \\
& & & I_{0} & I_{0}
\end{array}\right] \tag{18}
\end{equation*}
$$

where $I_{0}$ is the $Q \times Q$ identity matrix and $I_{1}$ is obtained from $I_{0}$ by cyclically shifting all of its rows leftward. Therefore, $\mathbf{H}_{\text {S-IRA }}$ corresponds to a quasi-cyclic IRA code since $\mathbf{P}$ is also an array of $Q \times Q$ circulant permutation matrices. Observe that, except for a re-ordering of the parity bits, $\mathbf{H}_{\text {S-IRA }}$ describes the same code as $\mathbf{H}_{a}$ and $\mathbf{H}_{b}$.
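The QC form of the permuted accumulator can be confirmed numerically for small parameters. The sketch below builds the tail-biting dual-diagonal matrix, builds a row-column interleaver $\boldsymbol{\Pi}$ with the property that row $l Q+q$ of a matrix becomes row $q L+l$ after multiplication by $\boldsymbol{\Pi}^{T}$, and compares $\boldsymbol{\Pi} \mathbf{H}_{p} \boldsymbol{\Pi}^{T}$ with the block pattern of (18). The function names and the values $L=3$, $Q=4$ are arbitrary choices for this illustration.

```python
import numpy as np

def tailbiting_accumulator(m: int) -> np.ndarray:
    """m x m dual-diagonal matrix with ones at (r, r) and (r, r-1 mod m)."""
    eye = np.eye(m, dtype=int)
    return eye + np.roll(eye, -1, axis=1)

def row_column_interleaver(L: int, Q: int) -> np.ndarray:
    """Permutation matrix Pi with Pi[l*Q + q, q*L + l] = 1."""
    Pi = np.zeros((L * Q, L * Q), dtype=int)
    for l in range(L):
        for q in range(Q):
            Pi[l * Q + q, q * L + l] = 1
    return Pi

def qc_dual_diagonal(L: int, Q: int) -> np.ndarray:
    """Block pattern of (18): I0 on the block diagonal and sub-diagonal,
    and I1 (I0 with rows cyclically shifted left) in the upper-right block."""
    I0 = np.eye(Q, dtype=int)
    I1 = np.roll(I0, -1, axis=1)
    E = np.zeros((L * Q, L * Q), dtype=int)
    for l in range(L):
        E[l * Q:(l + 1) * Q, l * Q:(l + 1) * Q] = I0
        if l > 0:
            E[l * Q:(l + 1) * Q, (l - 1) * Q:l * Q] = I0
    E[0:Q, (L - 1) * Q:L * Q] = I1
    return E

L, Q = 3, 4
Pi = row_column_interleaver(L, Q)
parity_part = Pi @ tailbiting_accumulator(L * Q) @ Pi.T
matches_qc_form = np.array_equal(parity_part, qc_dual_diagonal(L, Q))
```

Only the block wrapping around the tail-biting 1 differs from an identity, which is what keeps the parity part an array of circulants.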

As described in [36], in addition to simplifying encoder and decoder implementations, the QC structure simplifies the code design process. Simulation results for the example codes, which are produced by the design algorithms proposed in [36][37][38][39], show that the S-IRA codes perform as well as IRA codes in the waterfall region and possess very low error floors. As an example, Fig. 6 depicts the performance of a rate-1/2 $(2044,1024)$ S-IRA code simulated in software and hardware. ${ }^{1}$ It is seen that the floors, both bit error rate (BER) and frame error rate (FER), are quite low (they can be lower or higher depending on the decoder implementation). Lastly, S-IRA codes are suitable for rate-compatible code family design [36].

[^8]


Fig. 6. Performance of a $(2044,1024)$ S-IRA code on the BI-AWGNC. HW = hardware simulator. SW = software simulator.


Fig. 7. IRAA encoder.

We now consider irregular repeat-accumulate-accumulate (IRAA) codes which are obtained by concatenating the parity arm of the IRA encoder of Fig. 5(b) with another accumulator, through a permuter, as shown in Fig. 7. (ARAA codes were considered in [49].) The IRAA codeword can be either $\mathbf{c}=[\mathbf{u}\ \mathbf{p}]$ or $\mathbf{c}=[\mathbf{u}\ \mathbf{b}\ \mathbf{p}]$, depending on whether the intermediate parity bits $\mathbf{b}$ are punctured or not. The parity-check matrix of the general IRAA code corresponding to Fig. 7 is

$$
\mathbf{H}_{\mathrm{IRAA}}=\left[\begin{array}{ccc}
\mathbf{H}_{u} & \mathbf{H}_{p} & 0  \tag{19}\\
0 & \mathbf{\Pi}_{1}^{T} & \mathbf{H}_{p}
\end{array}\right],
$$

where $\Pi_{1}$ is the interleaver between the two accumulators. When the parity bits $\mathbf{b}$ are not transmitted, they are considered to be punctured, that is, the log-likelihood ratios for these bits are initialized by zeros before decoding. When an IRAA code is structured, we use the notation S-IRAA.
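
The block structure of (19) can be checked with a short sketch. The code below assembles $\mathbf{H}_{\mathrm{IRAA}}$ from a toy random $\mathbf{H}_{u}$ (not a designed code), a random permutation matrix standing in for $\Pi_{1}$, and a non-tail-biting accumulator; these choices and the function names are assumptions made only for illustration.

```python
import numpy as np

def accumulator_matrix(m):
    """Dual-diagonal m x m matrix H_p of the accumulator 1/(1 + D)."""
    return np.eye(m, dtype=int) + np.eye(m, k=-1, dtype=int)

def iraa_parity_check(H_u, Pi1):
    """Assemble H_IRAA = [[H_u, H_p, 0], [0, Pi1^T, H_p]] as in (19)."""
    m, k = H_u.shape
    H_p = accumulator_matrix(m)
    top = np.hstack([H_u, H_p, np.zeros((m, m), dtype=int)])
    bottom = np.hstack([np.zeros((m, k), dtype=int), Pi1.T, H_p])
    return np.vstack([top, bottom])

def iraa_encode(u, H_u, Pi1):
    """Return c = [u b p]; each accumulator is a running XOR."""
    b = np.cumsum(H_u @ u) % 2        # solves H_u u + H_p b = 0 (mod 2)
    p = np.cumsum(Pi1.T @ b) % 2      # solves Pi1^T b + H_p p = 0 (mod 2)
    return np.concatenate([u, b, p])

rng = np.random.default_rng(0)
m, k = 8, 8
H_u = rng.integers(0, 2, size=(m, k))             # toy H_u for illustration only
Pi1 = np.eye(m, dtype=int)[rng.permutation(m)]    # random permutation matrix
u = rng.integers(0, 2, size=k)
c = iraa_encode(u, H_u, Pi1)
syndrome = (iraa_parity_check(H_u, Pi1) @ c) % 2
```

The all-zero `syndrome` confirms that the two-accumulator encoder output satisfies every check in (19). Puncturing $\mathbf{b}$ would simply drop the middle third of `c` before transmission.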

Example 2: We compare the performance of rate-1/2 (2048, 1024) S-IRA and S-IRAA codes in this example. For the S-IRA code, $d_{b, i}=5$ for all $i$ and for the S-IRAA code, $d_{b, i}=3$ for all $i$, and the intermediate parity vector $\mathbf{b}$ is not transmitted to maintain the code rate at $1 / 2$. The protographs for these codes are given in Fig. 8. Because decoder complexity is proportional to the number of edges in a code's parity-check matrix, the complexity of the S-IRAA decoder is slightly greater than the complexity of the S-IRA decoder, even though the column weight in $\mathbf{H}_{u}$ is 3 for the former versus 5 for the latter. We observe in Fig. 9 that, for both codes, there are no error floors in the BER curves down to $\mathrm{BER}=5 \times 10^{-8}$ and in the FER curves down to $\mathrm{FER}=10^{-6}$. While the S-IRAA code is 0.2 dB inferior to the S-IRA code in the waterfall region, we conjecture that it has a lower floor (which is difficult to measure), which would be due to the second accumulator whose function is to increase minimum distance.

Fig. 8. Rate-1/2 S-IRA and S-IRAA protographs for the codes in Fig. 9. The shaded node in the S-IRAA protograph represents punctured bits. S-IRA: $\left(E_{b} / N_{0}\right)_{\text {thres }}=0.97 \mathrm{~dB}$. S-IRAA: $\left(E_{b} / N_{0}\right)_{\text {thres }}=1.1 \mathrm{~dB}$.

Fig. 9. Performance comparison between rate-1/2 S-IRA and S-IRAA codes on the BI-AWGNC, $n=2048$ and $k=1024$.

Example 3: This second example is a comparison of rate-$1/3$ $(3072,1024)$ S-IRA and S-IRAA codes, with $d_{b, i}=4$ for the S-IRA code and $d_{b, i}=3$ for the S-IRAA code. The protographs for these codes are given in Fig. 10. In this case, $\mathbf{b}$ is part of the transmitted S-IRAA codeword and the decoder complexities are the same. We see in Fig. 11 that, in the low SNR region, the performance of the S-IRA code is 0.4 dB better than the S-IRAA code. However, for high SNRs, the S-IRAA code will outperform the S-IRA code due to its lower error floor.

## D. Generalized IRA codes

Fig. 10. Rate-1/3 S-IRA and S-IRAA protographs for the codes in Fig. 11. S-IRA: $\left(E_{b} / N_{0}\right)_{\text {thres }}=0.40 \mathrm{~dB}$. S-IRAA: $\left(E_{b} / N_{0}\right)_{\text {thres }}=0.83 \mathrm{~dB}$.

Fig. 11. Performance comparison between rate-1/3 S-IRA and S-IRAA codes on the BI-AWGNC, $n=3072$ and $k=1024$.

Generalized IRA (G-IRA) codes [40][41] increase the flexibility in choosing degree distributions relative to IRA codes, allowing, for example, the design of near-regular efficiently encodable codes. The encoding algorithms for G-IRA codes are similar to those of IRA codes. For G-IRA codes, the accumulator $1 /(1 \oplus D)$ in Fig. 5(b) is replaced by a generalized accumulator with transfer function $1 / g(D)$ where $g(D)=\sum_{j=0}^{t} g_{j} D^{j}$ and $g_{j} \in\{0,1\}$, except $g_{0}=1$. The systematic encoder therefore has the same generator matrix format, $\mathbf{G}=\left[\begin{array}{ll}\mathbf{I} & \mathbf{H}_{u}^{T} \mathbf{H}_{p}^{-T}\end{array}\right]$, but now

$$
\mathbf{H}_{p}=\left[\begin{array}{cccccccc}
1 & & & & & & & \\
g_{1} & 1 & & & & & & \\
g_{2} & g_{1} & \ddots & & & & & \\
\vdots & g_{2} & \ddots & \ddots & & & & \\
g_{t} & \vdots & \ddots & \ddots & \ddots & & & \\
& g_{t} & \ddots & \ddots & \ddots & \ddots & & \\
& & \ddots & \ddots & \ddots & \ddots & \ddots & \\
& & & g_{t} & \cdots & g_{2} & g_{1} & 1
\end{array}\right] .
$$

Further, the parity-check matrix format is unchanged, $\mathbf{H}=$ $\left[\mathrm{H}_{u} \mathrm{H}_{p}\right]$.
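
To make the role of $g(D)$ concrete, the sketch below implements the generalized accumulator $1/g(D)$ as a feedback recursion and builds the banded $\mathbf{H}_{p}$ displayed above. The polynomial $g(D)=1+D+D^{3}$ and the input vector are hypothetical examples, and the function names are our own.

```python
import numpy as np

def generalized_accumulate(s, g):
    """Filter s by 1/g(D) over GF(2); g = [g_0, ..., g_t] with g_0 = 1."""
    assert g[0] == 1
    p = np.zeros(len(s), dtype=int)
    for j in range(len(s)):
        acc = int(s[j])
        for i in range(1, len(g)):
            if j - i >= 0:
                acc ^= g[i] & int(p[j - i])   # feedback taps g_1, ..., g_t
        p[j] = acc
    return p

def gira_Hp(m, g):
    """Banded lower-triangular Toeplitz H_p whose first column starts 1, g_1, ..., g_t."""
    H = np.zeros((m, m), dtype=int)
    for i, gi in enumerate(g):
        H += gi * np.eye(m, k=-i, dtype=int)
    return H

g = [1, 1, 0, 1]                               # hypothetical g(D) = 1 + D + D^3
s = np.array([1, 0, 1, 1, 0, 0, 1, 0])         # stands in for H_u u
p = generalized_accumulate(s, g)
H_p = gira_Hp(len(s), g)
```

By construction $\mathbf{H}_{p} \mathbf{p}=\mathbf{s}$ over GF(2), and setting `g = [1, 1]` recovers the ordinary accumulator (a running XOR).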

To design a G-IRA code, one must choose $g(D)$ so that the bipartite graph for $\mathbf{H}_{p}$ contains no length-4 cycles [40]. Once $g(D)$ has been chosen, $\mathbf{H}$ can be completed by constructing the sub-matrix $\mathbf{H}_{u}$, according to some prescribed degree distribution, again avoiding short cycles, this time in all of $\mathbf{H}$.

Fig. 12. Generic bipartite graph for ARA codes.

G-IRA codes are highly reconfigurable in the sense that an encoder and decoder can be designed for a set of different polynomials $g(D)$. This could be useful when faced with different channel conditions.

## E. Accumulate-Repeat-Accumulate Codes

For accumulate-repeat-accumulate (ARA) codes, introduced in [45], an accumulator is added to precode a subset of the information bits of an IRA code. The primary role of this second accumulator is to improve the decoding threshold of a code, that is, to shift the BER waterfall region leftward. ARA codes are a subclass of LDPC codes and Fig. 12 presents a generic ARA Tanner graph in which punctured variable nodes are highlighted. The sparseness of the ARA graph is achieved at the price of these punctured variable nodes which act as auxiliary nodes that enlarge the $\mathbf{H}$ used by the decoder. The iterative graph-based ARA decoder thus has to deal with a redundant representation of the code, implying a larger $\mathbf{H}$ matrix than the nominal $(n-k) \times n$. This issue, together with the presence of a large number of degree-1 and degree-2 variable nodes, results in slow decoding convergence.

The ARA codes presented in [45] rely on very simple protographs. Several modified ARA protographs have been introduced in [46][47], leading to ARA and ARA-like code families with excellent performance in both the waterfall and floor regions of the codes' performance curves. The protograph of a rate-$1/2$ ARA code ensemble with repetition rate 4, denoted AR4A, is depicted in Fig. 13(a). The dark circle corresponds to a state-variable node, and it is associated with the precoded fraction of the information bits. As emphasized in the figure, such a protograph is the serial concatenation of an accumulator protograph and an IRA protograph. Half (node 2) of the information bits are sent directly to the IRA encoder, while the other half (node 1) is first precoded by the outer accumulator. This encoding procedure corresponds to a systematic code.

A different code structure is represented by the protograph in Fig. 13(b), which has a parallel-concatenated form. In this case, half (node 2) of the information bits are encoded by the IRA encoder and the other half (node 3) are encoded by both the IRA encoder and a $(3,2)$ single-parity-check encoder. The node-3 information bits (corresponding to the dark circle in the protograph) are punctured and so codes corresponding to this protograph are non-systematic. While the codes (actually, code ensembles) specified by the protographs in Fig. 13(a) and Fig. 13(b) are the same in the sense that the same set of codewords is implied, the $\mathbf{u} \rightarrow \mathbf{c}$ mappings are different. The advantage of the non-systematic protograph is that, although the node-3 information bits in Fig. 13(b) are punctured, the node degree is 6, in contrast with the node-1 information bits in Fig. 13(a), in which the node degree is only 1. Given that ARA code decoders converge so slowly, the faster-converging degree-6 node is to be preferred over the slowly converging degree-1 node.

Fig. 13. AR4A protographs in (a) serial-concatenated form and (b) parallel-concatenated form. $\left(E_{b} / N_{0}\right)_{\text {thres }}=0.55 \mathrm{~dB}$.

To demonstrate this, we designed a $(2048,1024)$ QC AR4A code whose $\mathbf{H}$ matrix is depicted in Fig. 14. The first group of 512 columns (of weight 6) corresponds to variable node type 1 (Fig. 13) whose bits are punctured, and the subsequent four groups of 512 columns correspond, respectively, to node types 2, 3, 4, and 5. The first group of 512 rows corresponds to check node type A, and the two subsequent groups of rows correspond to node types B and C, respectively. The performance of the code, with a maximum of $I_{\max }=50$ iterations, is shown in Fig. 15. We note that the $(2048,1024)$ AR4A code reported in [47] achieves $\mathrm{BER}=10^{-7}$ at $E_{b} / N_{0}=2 \mathrm{~dB}$ with 200 iterations, whereas in the simulation here, $\mathrm{BER}=10^{-7}$ is achieved at $E_{b} / N_{0}=2.2 \mathrm{~dB}$ with 50 iterations. In Fig. 16, we present the BER performance at $E_{b} / N_{0}=2.25 \mathrm{~dB}$ for the five node types that appear in Fig. 13 for $I_{\max }$ ranging from 5 to 20. With 20 iterations, we collected 400 error events, while with fewer iterations, the numbers of collected error events were larger. From the figure, we see that the high-degree variable nodes (node types 2 and 3) converge the fastest. We note also that, while type 3 nodes have degree 6 and type 2 nodes have degree 4, type 3 nodes initially converge slower because the bits corresponding to those nodes are punctured so that the decoder receives no channel LLRs for those bits. However, by 20 iterations, the type 3 bits become more reliable than the type 2 bits.

Fig. 14. $\mathbf{H}$ matrix for the $(2048,1024)$ AR4A code.

Fig. 15. BER and FER performance for an AR4A code.

## F. Accumulator-Based Codes in Standards

Fig. 16. Node convergence analysis for a $(2048,1024)$ AR4A code at $E_{b} / N_{0}=2.25 \mathrm{~dB}$.

IRA codes and IRA-influenced codes are being considered for several communication standards. The ETSI DVB S2 [48] standard for digital video broadcast specifies two IRA code families with block lengths 64800 and 16200. The code rates supported by this standard range from $1 / 4$ to $9 / 10$, and a wide range of spectral efficiencies is achieved by coupling these LDPC codes with QPSK, 8-PSK, 16-APSK, and 32-APSK modulation formats. A further level of protection is afforded by an outer BCH code.

The IEEE standards bodies are also considering IRA-influenced QC LDPC codes for 802.11n (wireless local-area networks) and 802.16e (wireless metropolitan-area networks). Rather than employing a tail-biting accumulator (which avoids weight-one columns), these standards have replaced the last block-column in (18) with a weight-three block-column and moved it to the first column, as displayed below. Encoding is facilitated by this matrix since the sum of all block-rows gives the block-row $\left(\begin{array}{cccc}I_{0} & 0 & \cdots & 0\end{array}\right)$, so that encoding is initialized by summing all of the block-rows of $\mathbf{H}$ and solving for the first $Q$ parity bits using the resulting block-row.

$$
\left[\begin{array}{ccccccc}
I_{0} & I_{0} & & & & & \\
& I_{0} & I_{0} & & & & \\
& & I_{0} & \ddots & & & \\
I_{0} & & & \ddots & \ddots & & \\
& & & & \ddots & I_{0} & \\
& & & & & I_{0} & I_{0} \\
I_{0} & & & & & & I_{0}
\end{array}\right]
$$
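
The encoding shortcut described above can be sketched as follows. The code follows the simplified matrix displayed above (all blocks equal to $I_{0}$), not the exact matrices of either standard; the position `mid` of the middle $I_{0}$ in the first block-column, the vector `s` standing in for $\mathbf{H}_{u} \mathbf{u}$, and the function names are assumptions for illustration.

```python
import numpy as np

def standard_style_Hp(Q, t, mid):
    """Weight-three first block-column, dual diagonal elsewhere (0 < mid < t - 1)."""
    I0 = np.eye(Q, dtype=int)
    H = np.zeros((t * Q, t * Q), dtype=int)
    for r in (0, mid, t - 1):                     # three copies of I0 in block-column 0
        H[r*Q:(r+1)*Q, 0:Q] = I0
    for c in range(1, t):                         # block-column c: I0 in block-rows c-1 and c
        H[(c-1)*Q:c*Q, c*Q:(c+1)*Q] = I0
        H[c*Q:(c+1)*Q, c*Q:(c+1)*Q] = I0
    return H

def encode_parity(s, Q, t, mid):
    """Solve H_p p = s (mod 2), where s is split into t blocks of length Q."""
    S = s.reshape(t, Q)
    P = np.zeros((t, Q), dtype=int)
    P[0] = S.sum(axis=0) % 2                      # sum of all block-rows isolates p_0
    P[1] = (S[0] + P[0]) % 2                      # block-row 0: p_0 + p_1 = s_0
    for r in range(1, t - 1):                     # block-row r: p_r + p_{r+1} (+ p_0 if r == mid) = s_r
        P[r + 1] = (S[r] + P[r] + (P[0] if r == mid else 0)) % 2
    return P.reshape(-1)

Q, t, mid = 4, 6, 2
rng = np.random.default_rng(1)
s = rng.integers(0, 2, size=Q * t)                # stands in for H_u u
H_p = standard_style_Hp(Q, t, mid)
p = encode_parity(s, Q, t, mid)
block_row_sum = sum(H_p[r*Q:(r+1)*Q] for r in range(t)) % 2
```

Here `block_row_sum` equals $\left(\begin{array}{cccc}I_{0} & 0 & \cdots & 0\end{array}\right)$, so the first $Q$ parity bits follow directly and the remaining blocks are obtained by back-substitution.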

ARA codes are being considered by the Consultative Committee for Space Data Systems (CCSDS) for high data-rate bandwidth-efficient space links. Very low floors are required for these applications because the scientific data (e.g., images) being transmitted from space to the ground are typically in a compressed format.

## V. LDPC Codes Based on Finite Geometries

In [50], it is shown that structured LDPC codes can be constructed based on the lines and points of geometries over finite fields, namely Euclidean and projective geometries. These codes are known as finite-geometry (FG) LDPC codes. Among the FG-LDPC codes, an important subclass is the subclass of cyclic FG-LDPC codes. A cyclic LDPC code is completely characterized by its generator polynomial and its encoding can be implemented with a shift-register with feedback connections based on its generator polynomial [7]. The systematic-form generator matrix of a cyclic LDPC code can be constructed easily based on its generator polynomial [7]. Another important subclass of FG-LDPC codes is the subclass of quasi-cyclic FG-LDPC codes. As pointed out earlier, QC-LDPC codes can also be encoded easily with simple shift-registers. In this section, we give a brief survey of constructions of cyclic and quasi-cyclic FG-LDPC codes.

## A. Cyclic Euclidean Geometry LDPC Codes

The $m$-dimensional Euclidean geometry over the finite field $\mathrm{GF}(q)$ [7][51][52], denoted by $\mathrm{EG}(m, q)$, consists of $q^{m}$ points, and each point is represented by an $m$-tuple over $\mathrm{GF}(q)$. The point represented by the all-zero $m$-tuple $\mathbf{0}=(0,0, \ldots, 0)$ is called the origin of the geometry. A line in $\mathrm{EG}(m, q)$ is either a one-dimensional subspace of the vector space of all the $m$-tuples over $\mathrm{GF}(q)$, or a coset of it. There are $q^{m-1}\left(q^{m}-1\right) /(q-1)$ lines in total. Each line consists of $q$ points. Two points are connected by one and only one line. If $\mathbf{a}$ is a point on the line $\mathcal{L}$, we say that the line $\mathcal{L}$ passes through the point $\mathbf{a}$. Two lines either do not have any point in common or they have one and only one point in common. If two lines have a common point $\mathbf{a}$, we say that they intersect at $\mathbf{a}$. For any point $\mathbf{a}$ in $\mathrm{EG}(m, q)$, there are exactly $\left(q^{m}-1\right) /(q-1)$ lines passing through (or intersecting at) $\mathbf{a}$. In particular, if $\mathbf{a}$ is not the origin, then it lies on $q\left(q^{m-1}-1\right) /(q-1)$ lines not passing through the origin. Furthermore, there are in total $\left(q^{m-1}-1\right)\left(q^{m}-1\right) /(q-1)$ lines not passing through the origin.
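
The counting formulas above can be checked by brute force on a small geometry. The sketch below enumerates the lines of $\mathrm{EG}(m, q)$ directly from the definition; it assumes $q$ is prime so that arithmetic modulo $q$ realizes $\mathrm{GF}(q)$, and the function names are hypothetical.

```python
from itertools import product

def eg_lines(m, q):
    """Enumerate all lines of EG(m, q) by brute force; q must be prime (arithmetic mod q)."""
    points = list(product(range(q), repeat=m))
    origin = tuple([0] * m)
    lines = set()
    for a in points:
        for d in points:
            if d == origin:
                continue
            # Line through a with direction d: {a + beta * d : beta in GF(q)}.
            line = frozenset(
                tuple((ai + beta * di) % q for ai, di in zip(a, d)) for beta in range(q)
            )
            lines.add(line)
    return points, lines

def eg_counts(m, q):
    """Closed-form counts quoted in the text: total, through a point, missing the origin."""
    total = q ** (m - 1) * (q ** m - 1) // (q - 1)
    through_point = (q ** m - 1) // (q - 1)
    off_origin = (q ** (m - 1) - 1) * (q ** m - 1) // (q - 1)
    return total, through_point, off_origin

points, lines = eg_lines(2, 3)
origin = (0, 0)
n_through_origin = sum(origin in line for line in lines)
n_off_origin = sum(origin not in line for line in lines)
```

For $\mathrm{EG}(2,3)$ the enumeration and the formulas agree on 12 lines in total, 4 through the origin, and 8 not passing through the origin.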

The extension field $\mathrm{GF}\left(q^{m}\right)$ of $\mathrm{GF}(q)$ is a realization of $\mathrm{EG}(m, q)$ [7][51]. Let $\alpha$ be a primitive element of $\mathrm{GF}\left(q^{m}\right)$. Then, the elements $0,1, \alpha, \alpha^{2}, \ldots, \alpha^{q^{m}-2}$ of $\mathrm{GF}\left(q^{m}\right)$ represent the $q^{m}$ points of $\mathrm{EG}(m, q)$, and 0 represents the origin of the geometry. A line is a set of points of the form $\left\{\mathbf{a}+\beta \mathbf{a}^{\prime}: \beta \in \mathrm{GF}(q)\right\}$, where $\mathbf{a}$ and $\mathbf{a}^{\prime}$ are linearly independent over $\mathrm{GF}(q)$.

Let $n_{\mathrm{EG}}=q^{m}-1$ be the number of non-origin points in the geometry. Let $\mathcal{L}$ be a line not passing through the origin. Define the $n_{\text {EG }}$-tuple over GF(2),

$$
\mathbf{v}_{\mathcal{L}}=\left(v_{0}, v_{1}, \ldots, v_{n_{\mathrm{EG}}-1}\right)
$$

whose components correspond to the $q^{m}-1$ non-origin points, $\alpha^{0}, \alpha, \cdots, \alpha^{q^{m}-2}$, of $\mathrm{EG}(m, q)$, where $v_{i}=1$ if the point $\alpha^{i}$ lies on $\mathcal{L}$, otherwise $v_{i}=0$. The vector $\mathbf{v}_{\mathcal{L}}$ is called the incidence vector of $\mathcal{L}$. Clearly, $\alpha \mathcal{L}$ is also a line in the geometry whose incidence vector $\mathbf{v}_{\alpha \mathcal{L}}$ is the right cyclic-shift of $\mathbf{v}_{\mathcal{L}}$. The lines $\mathcal{L}, \alpha \mathcal{L}, \cdots, \alpha^{n_{\mathrm{EG}}-1} \mathcal{L}$ are all different [7] and they do not pass through the origin. Since $\alpha^{q^{m}-1}=1$, $\alpha^{n_{\mathrm{EG}}} \mathcal{L}=\mathcal{L}$. These $n_{\mathrm{EG}}$ lines form a cyclic class. The $\left(q^{m-1}-1\right)\left(q^{m}-1\right) /(q-1)$ lines in $\mathrm{EG}(m, q)$ not passing through the origin can be partitioned into $K=\left(q^{m-1}-1\right) /(q-1)$ cyclic classes, denoted $\mathcal{Q}_{1}, \mathcal{Q}_{2}, \cdots, \mathcal{Q}_{K}$, where $\mathcal{Q}_{i}=\left\{\mathcal{L}_{i}, \alpha \mathcal{L}_{i}, \cdots, \alpha^{n_{\mathrm{EG}}-1} \mathcal{L}_{i}\right\}$ with $1 \leq i \leq K$. For each cyclic class $\mathcal{Q}_{i}$, we form an $n_{\mathrm{EG}} \times n_{\mathrm{EG}}$ matrix $\mathbf{H}_{\mathrm{EG}, i}$ over $\mathrm{GF}(2)$ with the incidence vectors of $\mathcal{L}_{i}, \alpha \mathcal{L}_{i}, \cdots, \alpha^{n_{\mathrm{EG}}-1} \mathcal{L}_{i}$ as rows. $\mathbf{H}_{\mathrm{EG}, i}$ is a circulant matrix with column and row weights equal to $q$. For $1 \leq k \leq K$, let

$$
\mathbf{H}_{\mathrm{EG}(m, q), k}=\left[\begin{array}{c}
\mathbf{H}_{\mathrm{EG}, 1}  \tag{20}\\
\mathbf{H}_{\mathrm{EG}, 2} \\
\vdots \\
\mathbf{H}_{\mathrm{EG}, k}
\end{array}\right] .
$$

Then $\mathbf{H}_{\mathrm{EG}(m, q), k}$ consists of a column of $k$ circulants of the same size $n_{\mathrm{EG}} \times n_{\mathrm{EG}}$, and it has column and row weights, $k q$ and $q$, respectively. Since no two lines in $\operatorname{EG}(m, q)$ have more than one point in common, it follows that no two rows or two columns in $\mathbf{H}_{\mathrm{EG}(m, q), k}$ have more than a single 1-element in common. We say that $\mathrm{H}_{\mathrm{EG}(m, q), k}$ satisfies the $R C$-constraint. The null space of $\mathrm{H}_{\mathrm{EG}(m, q), k}$ gives a cyclic EG-LDPC code of length $n_{\mathrm{EG}}=q^{m}-1$ and minimum distance at least $k q+1$ [50][7], whose Tanner graph has a girth of at least 6 .

Of particular interest is the two-dimensional Euclidean geometry, $\mathrm{EG}(2, q)$, which is also called an affine plane over $\mathrm{GF}(q)$ [52]. This geometry has $q^{2}$ points and $q(q+1)$ lines, of which $q^{2}-1$ do not pass through the origin. Each line has $q$ points and each point lies on $q+1$ lines. Each non-origin point lies on $q$ lines that do not pass through the origin. If $\mathcal{L}$ is a line in $\mathrm{EG}(2, q)$ not passing through the origin, then $\mathcal{L}, \alpha \mathcal{L}, \ldots, \alpha^{q^{2}-2} \mathcal{L}$, where $\alpha$ is a primitive element in $\mathrm{GF}\left(q^{2}\right)$, are all the lines in the geometry not passing through the origin. Hence, all the lines in $\mathrm{EG}(2, q)$ not passing through the origin form a single cyclic class $\mathcal{Q}$ (i.e., $K=1$). Let $\mathbf{H}_{\mathrm{EG}(2, q)}$ denote the $\left(q^{2}-1\right) \times\left(q^{2}-1\right)$ circulant formed by the incidence vectors of lines in $\mathcal{Q}$. It is a $\left(q^{2}-1\right) \times\left(q^{2}-1\right)$ matrix over $\mathrm{GF}(2)$ with both column and row weights equal to $q$. The null space of $\mathbf{H}_{\mathrm{EG}(2, q)}$ gives a cyclic EG-LDPC code of length $q^{2}-1$ and minimum distance at least $q+1$. For $q=2^{s}$, the parameters of the code with parity-check matrix $\mathbf{H}_{\mathrm{EG}(2, q)}$ are as follows [7]:

| Parameter | Value |
| :--- | :--- |
| Length | $n=2^{2 s}-1$ |
| Number of parity bits | $n-k=3^{s}-1$ |
| Dimension | $k=2^{2 s}-3^{s}$ |
| Minimum distance | $d_{\min } \geq 2^{s}+1$ |
| Size of the LDPC matrix | $\left(2^{2 s}-1\right) \times\left(2^{2 s}-1\right)$ |
| Row weight | $2^{s}$ |
| Column weight | $2^{s}$ |

Generator polynomials for these codes can be readily obtained from [7].
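
The construction can be carried out by hand for the smallest non-trivial case, $s=2$. The sketch below builds the $15 \times 15$ circulant $\mathbf{H}_{\mathrm{EG}(2,4)}$ from one line not passing through the origin. The primitive polynomial $x^{4}+x+1$, the particular line, and the helper names are illustrative assumptions rather than choices made in the text.

```python
import numpy as np

# GF(16) built from the primitive polynomial x^4 + x + 1 (an illustrative choice).
EXP = [1]
for _ in range(14):
    x = EXP[-1] << 1
    EXP.append(x ^ 0b10011 if x & 0b10000 else x)
LOG = {v: i for i, v in enumerate(EXP)}

def gf_mul(a, b):
    """Multiply two GF(16) elements via log/antilog tables."""
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]

# GF(4) sits inside GF(16) as {0, 1, alpha^5, alpha^10}.
GF4 = [0, 1, EXP[5], EXP[10]]

# Line {a + beta * a'} with a = 1 and a' = alpha; it does not contain the origin.
line = [1 ^ gf_mul(beta, EXP[1]) for beta in GF4]

v = np.zeros(15, dtype=int)
for point in line:
    v[LOG[point]] = 1                      # incidence vector over the 15 non-origin points

# Rows are the incidence vectors of L, alpha*L, ..., alpha^14*L (right cyclic shifts).
H = np.array([np.roll(v, i) for i in range(15)])

def gf2_rank(M):
    """Rank over GF(2) by Gaussian elimination."""
    M = M.copy() % 2
    rank = 0
    for col in range(M.shape[1]):
        pivots = [r for r in range(rank, M.shape[0]) if M[r, col]]
        if not pivots:
            continue
        M[[rank, pivots[0]]] = M[[pivots[0], rank]]
        for r in range(M.shape[0]):
            if r != rank and M[r, col]:
                M[r] ^= M[rank]
        rank += 1
    return rank

overlaps = H @ H.T - 4 * np.eye(15, dtype=int)   # off-diagonal entries: shared 1s between rows
```

With these choices the circulant has row and column weight 4, no two rows share more than one 1, and its GF(2) rank is $3^{2}-1=8$, consistent with the $(15,7)$ entry of the parameter table for $s=2$.
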
Example 4: The cyclic LDPC code constructed based on the two-dimensional Euclidean geometry $\mathrm{EG}\left(2,2^{6}\right)$ over $\mathrm{GF}\left(2^{6}\right)$ is a $(4095,3367)$ LDPC code with rate 0.822 and minimum distance 65. The performance of this code with iterative decoding using the SPA is shown in Fig. 17. At a BER of $10^{-6}$, it performs 1.65 dB from the Shannon limit. Since it has a very large minimum distance, it has a very low error-floor.

Fig. 17. Performance of the binary $(4095,3367)$ cyclic EG-LDPC code given in Example 4 over the BI-AWGNC.

## B. Cyclic Projective Geometry LDPC Codes

The $m$-dimensional projective geometry over $\mathrm{GF}(q)$, denoted by $\mathrm{PG}(m, q)$, consists of $n_{\mathrm{PG}}=\left(q^{m+1}-1\right) /(q-1)$ points. Each point is represented by a non-zero $(m+1)$-tuple $\mathbf{a}$ over $\mathrm{GF}(q)$ such that all $q-1$ non-zero multiples $\beta \mathbf{a}$, where $\beta$ is a non-zero element in $\mathrm{GF}(q)$, represent the same point. A line in $\mathrm{PG}(m, q)$ consists of all points of the form $\beta_{1} \mathbf{a}_{1}+\beta_{2} \mathbf{a}_{2}$, where $\mathbf{a}_{1}$ and $\mathbf{a}_{2}$ are two $(m+1)$-tuples that are linearly independent over $\mathrm{GF}(q)$ and $\beta_{1}$ and $\beta_{2}$ are elements in $\mathrm{GF}(q)$, with $\beta_{1}$ and $\beta_{2}$ not simultaneously equal to zero. There are $\left(q^{m+1}-1\right)\left(q^{m}-1\right) /\left[\left(q^{2}-1\right)(q-1)\right]$ lines in $\mathrm{PG}(m, q)$ and each line consists of $q+1$ points. Two points are connected by one and only one line and each point lies on $\left(q^{m}-1\right) /(q-1)$ lines.

The extension field $\mathrm{GF}\left(q^{m+1}\right)$ of $\mathrm{GF}(q)$ is a realization of $\operatorname{PG}(m, q)$ [7]. Let $\alpha$ be a primitive element of $\mathrm{GF}\left(q^{m+1}\right)$. A point in $\operatorname{PG}(m, q)$ is represented by a non-zero element $\alpha^{i}$. Every nonzero element in the base field $\mathrm{GF}(q)$ can be written as $\alpha^{l}$ for some $l$ which is divisible by $\left(q^{m+1}-1\right) /(q-1)$. Hence, the elements $\alpha^{i}$ and $\alpha^{j}$ represent the same point in $\mathrm{PG}(m, q)$ if and only if $i \equiv j\left(\bmod \left(q^{m+1}-1\right) /(q-1)\right)$. Therefore, we can take the elements $1, \alpha, \ldots, \alpha^{n_{P G}-1}$ to represent all the points in $\operatorname{PG}(m, q)$.

Let $\mathcal{L}$ be a line in $\mathrm{PG}(m, q)$. Define the $n_{\mathrm{PG}}$-tuple over $\mathrm{GF}(2)$, $\mathbf{v}_{\mathcal{L}}=\left(v_{0}, v_{1}, \ldots, v_{n_{\mathrm{PG}}-1}\right)$, whose components correspond to the $n_{\mathrm{PG}}=\left(q^{m+1}-1\right) /(q-1)$ points of $\mathrm{PG}(m, q)$, where $v_{i}=1$ if the point represented by $\alpha^{i}$ lies on $\mathcal{L}$, otherwise $v_{i}=0$. The vector $\mathbf{v}_{\mathcal{L}}$ is called the incidence vector of $\mathcal{L}$. Clearly, $\alpha \mathcal{L}$ is also a line in the geometry whose incidence vector $\mathbf{v}_{\alpha \mathcal{L}}$ is the cyclic-shift of $\mathbf{v}_{\mathcal{L}}$.

For even $m$, the lines in $\operatorname{PG}(m, q)$ can be partitioned into $K_{1}=\left(q^{m}-1\right) /\left(q^{2}-1\right)$ cyclic classes $\mathcal{Q}_{1}, \mathcal{Q}_{2}, \cdots, \mathcal{Q}_{K_{1}}$, each class consisting of $n_{\mathrm{PG}}$ lines. For each cyclic class $\mathcal{Q}_{i}$, we can form an $n_{\mathrm{PG}} \times n_{\mathrm{PG}}$ circulant $\mathbf{H}_{\mathrm{PG}, i}$ with both column and row weights equal to $q+1$. For $1 \leq k \leq K_{1}$, form the following matrix:

$$
\mathbf{H}_{\mathrm{PG}(m, q), k}^{(1)}=\left[\begin{array}{c}
\mathbf{H}_{\mathrm{PG}, \mathbf{1}}  \tag{21}\\
\mathbf{H}_{\mathrm{PG}, 2} \\
\vdots \\
\mathbf{H}_{\mathrm{PG}, k}
\end{array}\right]
$$

which has column and row weights $k(q+1)$ and $q+1$, respectively. The null space of $\mathbf{H}_{\mathrm{PG}(m, q), k}^{(1)}$ gives a cyclic PG-LDPC code of length $n_{\mathrm{PG}}=\left(q^{m+1}-1\right) /(q-1)$ and minimum distance at least $k(q+1)+1$ whose Tanner graph has a girth of at least 6. For odd $m$, the lines in $\mathrm{PG}(m, q)$ can be partitioned into $K_{2}+1$ cyclic classes, $\mathcal{Q}_{0}, \mathcal{Q}_{1}, \mathcal{Q}_{2}, \cdots, \mathcal{Q}_{K_{2}}$, where $K_{2}=q\left(q^{m-1}-1\right) /\left(q^{2}-1\right)$. Except for $\mathcal{Q}_{0}$, each cyclic class consists of $n_{\mathrm{PG}}$ lines. The cyclic class $\mathcal{Q}_{0}$ consists of only $\lambda=\left(q^{m+1}-1\right) /\left(q^{2}-1\right)$ lines. For each cyclic class $\mathcal{Q}_{i}$ with $i \neq 0$, we can form an $n_{\mathrm{PG}} \times n_{\mathrm{PG}}$ circulant $\mathbf{H}_{\mathrm{PG}, i}$ with the incidence vectors of the lines in $\mathcal{Q}_{i}$ as rows. For $1 \leq k \leq K_{2}$, we can form a matrix $\mathbf{H}_{\mathrm{PG}(m, q), k}^{(2)}$ of the form given by (21). The null space of $\mathbf{H}_{\mathrm{PG}(m, q), k}^{(2)}$ gives a cyclic PG-LDPC code of length $n_{\mathrm{PG}}$ and minimum distance at least $k(q+1)+1$ whose Tanner graph has a girth of at least 6.

As in the case of Euclidean geometries, the two-dimensional projective geometry, $\mathrm{PG}(2, q)$, which is also called a projective plane over $\mathrm{GF}(q)$ [52], is of particular interest. This geometry has $q^{2}+q+1$ points and $q^{2}+q+1$ lines. Each line has $q+1$ points and each point lies on $q+1$ lines. If $\mathcal{L}$ is a line in $\mathrm{PG}(2, q)$, then $\mathcal{L}, \alpha \mathcal{L}, \ldots, \alpha^{q^{2}+q} \mathcal{L}$, where $\alpha$ is a primitive element in $\mathrm{GF}\left(q^{3}\right)$, are all the lines in the geometry. Hence, all the lines in $\mathrm{PG}(2, q)$ form a single cyclic class $\mathcal{Q}$ (i.e., $K_{1}=1$). Let $\mathbf{H}_{\mathrm{PG}(2, q)}$ denote the $n_{\mathrm{PG}} \times n_{\mathrm{PG}}$ circulant formed by the incidence vectors of the lines in $\mathcal{Q}$. It is a $\left(q^{2}+q+1\right) \times\left(q^{2}+q+1\right)$ matrix over $\mathrm{GF}(2)$ with both column and row weights equal to $q+1$. The null space of $\mathbf{H}_{\mathrm{PG}(2, q)}$ gives a cyclic PG-LDPC code of length $q^{2}+q+1$ and minimum distance at least $q+2$. For $q=2^{s}$, the parameters of the cyclic PG-LDPC code given by the null space of $\mathbf{H}_{\mathrm{PG}(2, q)}$ are as follows [7]:

| Parameter | Value |
| :--- | :--- |
| Length | $n=2^{2 s}+2^{s}+1$ |
| Number of parity bits | $n-k=3^{s}+1$ |
| Dimension | $k=2^{2 s}+2^{s}-3^{s}$ |
| Minimum distance | $d_{\min } \geq 2^{s}+2$ |
| Size of the LDPC matrix | $\left(2^{2 s}+2^{s}+1\right) \times\left(2^{2 s}+2^{s}+1\right)$ |
| Row weight | $2^{s}+1$ |
| Column weight | $2^{s}+1$ |

Generator polynomials for these codes can also be readily obtained from [7].
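
For the smallest projective plane, $\mathrm{PG}(2,2)$, the circulant can be written down directly. The sketch below takes $q=2$ (so every non-zero element of $\mathrm{GF}(2^{3})$ is its own point), the primitive polynomial $x^{3}+x+1$, and the line through the points $1$ and $\alpha$; these choices and the helper names are illustrative assumptions.

```python
import numpy as np

# GF(8) from the primitive polynomial x^3 + x + 1 (an illustrative choice).
EXP = [1]
for _ in range(6):
    x = EXP[-1] << 1
    EXP.append(x ^ 0b1011 if x & 0b1000 else x)
LOG = {v: i for i, v in enumerate(EXP)}

# With q = 2 the only non-zero scalar is 1, so the line through a1 and a2
# consists of the q + 1 = 3 points a1, a2 and a1 + a2.
a1, a2 = EXP[0], EXP[1]
line = [a1, a2, a1 ^ a2]

n_pg = 7
v = np.zeros(n_pg, dtype=int)
for point in line:
    v[LOG[point]] = 1                       # incidence vector of the line

# Single cyclic class: rows are the incidence vectors of L, alpha*L, ..., alpha^6*L.
H = np.array([np.roll(v, i) for i in range(n_pg)])

def gf2_rank(M):
    """Rank over GF(2) by Gaussian elimination."""
    M = M.copy() % 2
    rank = 0
    for col in range(M.shape[1]):
        pivots = [r for r in range(rank, M.shape[0]) if M[r, col]]
        if not pivots:
            continue
        M[[rank, pivots[0]]] = M[[pivots[0], rank]]
        for r in range(M.shape[0]):
            if r != rank and M[r, col]:
                M[r] ^= M[rank]
        rank += 1
    return rank

shared = H @ H.T                            # entry (i, j): number of points common to lines i and j
```

Every pair of distinct rows shares exactly one 1 (two lines of a projective plane meet in one point), and the GF(2) rank is $3^{1}+1=4$, matching the $n=7$, $k=3$ entry of the parameter table for $s=1$.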


Fig. 18. Performance of the binary $(3510,3109)$ quasi-cyclic PG-LDPC code given in Example 5 over the BI-AWGNC.

## C. Quasi-Cyclic Finite Geometry LDPC Codes

Let $\mathbf{R}_{\mathrm{EG}(m, q), k}$ be the transpose of the parity-check matrix $\mathbf{H}_{\mathrm{EG}(m, q), k}$ of a cyclic EG-LDPC code given by (20), i.e.,

$$
\begin{equation*}
\mathbf{R}_{\mathrm{EG}(m, q), k} \triangleq \mathbf{H}_{\mathrm{EG}(m, q), k}^{T}=\left[\mathbf{H}_{\mathrm{EG}, 1}^{T}\ \mathbf{H}_{\mathrm{EG}, 2}^{T} \cdots \mathbf{H}_{\mathrm{EG}, k}^{T}\right] \tag{22}
\end{equation*}
$$

which consists of a row of $k$ circulants of size $n_{\mathrm{EG}} \times n_{\mathrm{EG}}$. It is a $\left(q^{m}-1\right) \times k\left(q^{m}-1\right)$ matrix with column and row weights $q$ and $k q$, respectively. The null space of $\mathbf{R}_{\mathrm{EG}(m, q), k}$ gives a quasi-cyclic EG-LDPC code of length $k\left(q^{m}-1\right)$ and minimum distance at least $q+1$ whose Tanner graph has a girth of at least 6.

Similarly, let $\mathbf{R}_{\mathrm{PG}(m, q), k}^{(e)}$ be the transpose of $\mathbf{H}_{\mathrm{PG}(m, q), k}^{(e)}$ with $e=1$ or 2. Then the null space of $\mathbf{R}_{\mathrm{PG}(m, q), k}^{(e)}$ gives a quasi-cyclic PG-LDPC code of length $k\left(q^{m+1}-1\right) /(q-1)$ and minimum distance at least $q+2$.

Example 5: Consider the 3-dimensional projective geometry $\mathrm{PG}\left(3,2^{3}\right)$ over $\mathrm{GF}\left(2^{3}\right)$. This geometry consists of 585 points and 4745 lines, and each line consists of 9 points. The lines in this geometry can be partitioned into 9 cyclic classes, $\mathcal{Q}_{0}, \mathcal{Q}_{1}, \cdots, \mathcal{Q}_{8}$, where $\mathcal{Q}_{0}$ consists of 65 lines and each of the other 8 cyclic classes consists of 585 lines. For each $\mathcal{Q}_{i}$ with $1 \leq i \leq 8$, we can form a $585 \times 585$ circulant $\mathbf{H}_{\mathrm{PG}, i}$ over $\mathrm{GF}(2)$ with the incidence vectors of the lines in $\mathcal{Q}_{i}$ as the rows. Set $k=6$. Form the following $585 \times 3510$ matrix: $\mathbf{R}_{\mathrm{PG}\left(3,2^{3}\right), 6}^{(2)}=\left[\mathbf{H}_{\mathrm{PG}, 1}^{T}\ \mathbf{H}_{\mathrm{PG}, 2}^{T} \cdots \mathbf{H}_{\mathrm{PG}, 6}^{T}\right]$, which has column and row weights 9 and 54, respectively. The null space of this matrix gives a $(3510,3109)$ quasi-cyclic PG-LDPC code with rate 0.8858 and minimum distance at least 10. The performance of this code decoded with iterative decoding using the SPA is shown in Fig. 18. At a BER of $10^{-6}$, it performs 1.3 dB from the Shannon limit.
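
The figures quoted in Example 5 follow from the counting formulas of Section V-B, as the short sketch below illustrates. The function name `pg_parameters` is hypothetical, and the code dimension 3109 is taken from the text rather than computed.

```python
from fractions import Fraction

def pg_parameters(m, q):
    """Return (points, lines, number of full cyclic classes, size of the short class)."""
    n_pg = (q ** (m + 1) - 1) // (q - 1)
    n_lines = (q ** (m + 1) - 1) * (q ** m - 1) // ((q ** 2 - 1) * (q - 1))
    if m % 2 == 0:
        full_classes = (q ** m - 1) // (q ** 2 - 1)          # K_1; no short class
        short_class = 0
    else:
        full_classes = q * (q ** (m - 1) - 1) // (q ** 2 - 1)  # K_2
        short_class = (q ** (m + 1) - 1) // (q ** 2 - 1)       # lambda, size of Q_0
    return n_pg, n_lines, full_classes, short_class

q = 2 ** 3
n_pg, n_lines, K2, lam = pg_parameters(3, q)
k_circ = 6                                  # number of circulants used in Example 5
length = k_circ * n_pg                      # code length of the quasi-cyclic code
row_weight = k_circ * (q + 1)               # each circulant contributes q + 1 ones per row
rate = Fraction(3109, length)               # dimension 3109 as stated in the text
```

The full classes and the short class together account for all lines, since $8 \times 585+65=4745$.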

Other LDPC codes constructed based on finite geometries can be found in [53][54][55][56][57]. Finite geometry LDPC codes can also be effectively decoded with one-step majority-logic decoding [7], hard-decision bit-flipping (BF) decoding [1][50][7] and weighted BF decoding [50][58][59][60]. These decoding methods together with the soft-input and soft-output (SISO) iterative decoding based on belief propagation offer various trade-offs between performance and decoding complexity. The one-step majority-logic decoding requires the least decoding complexity while the (SISO) iterative decoding based on belief propagation requires the most decoding complexity and the other two decoding methods are in between. Fig. 19 shows the performances of the $(4095,3367)$ cyclic EG-LDPC code given in Example 4 with various decoding methods.

Fig. 19. Performance of the binary $(4095,3367)$ EG-LDPC code given in Example 4 with various decoding techniques over the BI-AWGNC. MLD = majority-logic decoding. BF = bit-flipping. SPA = sum-product algorithm.

## VI. REGULAR RS-BASED LDPC CODES

This section first gives a brief survey of a class of structured LDPC codes that are constructed from the codewords of Reed-Solomon (RS) codes with two information symbols. Then two new classes of Reed-Solomon-based quasi-cyclic LDPC codes are presented. Experimental results show that constructed codes perform very well over the AWGN channel with iterative decoding.

In [61], a class of structured regular LDPC codes was presented which were constructed from the codewords of RS codes with two information symbols. These codes are referred to as RS-based LDPC codes and their parity-check matrices are arrays of permutation matrices. RS-based LDPC codes perform well with iterative decoding over the AWGN channel. Most importantly, they have low error-floors and their decoding converges very fast. These features are important in high-speed communication systems where very low error rates are required, such as the 10GBase-T Ethernet. In this section, we first give a more general form of the RS-based LDPC codes presented in [61] and then we present two classes of RS-based QC LDPC codes.

Let $\alpha$ be a primitive element of the finite field $\mathrm{GF}(q)$. Then the following powers of $\alpha$, $\alpha^{-\infty} \triangleq 0, \alpha^{0}=1, \alpha, \ldots, \alpha^{q-2}$, form the $q$ elements of $\mathrm{GF}(q)$ and $\alpha^{q-1}=1$. For $i=-\infty, 0,1, \cdots, q-2$, represent each element $\alpha^{i}$ of $\mathrm{GF}(q)$ by a $q$-tuple over $\mathrm{GF}(2)$,

$$
\begin{equation*}
\mathbf{z}\left(\alpha^{i}\right)=\left(z_{-\infty}, z_{0}, z_{1}, z_{2}, \ldots, z_{q-2}\right) \tag{23}
\end{equation*}
$$

with components corresponding to the $q$ elements, $\alpha^{-\infty}, \alpha^{0}, \cdots, \alpha^{q-2}$, of $\mathrm{GF}(q)$, where the $i$-th component $z_{i}=1$ and all the other components are equal to zero. This binary $q$-tuple $\mathbf{z}\left(\alpha^{i}\right)$ is a unit vector with one and only one 1-component and is called the location vector of $\alpha^{i}$. It is clear that the location vectors of two different elements in $\mathrm{GF}(q)$ have their 1-components at two different locations. Suppose we form a $q \times q$ matrix $\mathbf{A}$ over $\mathrm{GF}(2)$ with the location vectors of the $q$ elements of $\mathrm{GF}(q)$ as rows arranged in any order. Then $\mathbf{A}$ is a $q \times q$ permutation matrix.

Consider an extended $(q, 2, q-1)$ RS code $\mathcal{C}_{b}$ over $\mathrm{GF}(q)$ [7] of length $q$ with two information symbols and minimum distance $q-1$. The nonzero codewords of $\mathcal{C}_{b}$ have two different weights, $q-1$ and $q$. Because the minimum distance of $\mathcal{C}_{b}$ is $q-1$, two codewords in $\mathcal{C}_{b}$ differ in at least $q-1$ places, i.e., they have at most one place where they have the same code symbols. Let $\mathbf{v}$ be a nonzero codeword in $\mathcal{C}_{b}$ with weight $q$. Then, the set $\mathcal{C}_{b}^{(0)}=\{c \mathbf{v}: c \in \mathrm{GF}(q)\}$ of $q$ codewords in $\mathcal{C}_{b}$ of weight $q$ forms a one-dimensional subcode of $\mathcal{C}_{b}$ with minimum distance $q$ and is a $(q, 1, q)$ extended RS code over $\mathrm{GF}(q)$. Any two codewords in $\mathcal{C}_{b}^{(0)}$ differ at every location. Partition $\mathcal{C}_{b}$ into $q$ cosets, $\mathcal{C}_{b}^{(0)}, \mathcal{C}_{b}^{(1)}, \cdots, \mathcal{C}_{b}^{(q-1)}$, based on the subcode $\mathcal{C}_{b}^{(0)}$. Then two codewords in any coset $\mathcal{C}_{b}^{(i)}$ differ at every location and two codewords from two different cosets $\mathcal{C}_{b}^{(i)}$ and $\mathcal{C}_{b}^{(j)}$ with $i \neq j$ differ in at least $q-1$ locations. For $0 \leq i<q$, form a $q \times q$ matrix $\mathbf{G}_{i}$ over $\mathrm{GF}(q)$ with the codewords in $\mathcal{C}_{b}^{(i)}$ as rows. Then all the $q$ entries in a column of $\mathbf{G}_{i}$ are different and they form all the $q$ elements of $\mathrm{GF}(q)$. It follows from the structural properties of the cosets of $\mathcal{C}_{b}^{(0)}$ that any two rows from any matrix $\mathbf{G}_{i}$ differ at every position and any two rows from two different matrices $\mathbf{G}_{i}$ and $\mathbf{G}_{j}$ with $i \neq j$ can have at most one location where they have identical symbols.

For $0 \leq i<q$, replacing each entry in $\mathbf{G}_{i}$ by its location vector, we obtain a $q \times q^{2}$ matrix $\mathbf{B}_{i}$ over $\operatorname{GF}(2)$ which consists of a row of $q$ permutation matrices of size $q \times q$,

$$
\mathbf{B}_{i}=\left[\begin{array}{llll}
\mathbf{A}_{i, 0} & \mathbf{A}_{i, 1} & \cdots & \mathbf{A}_{i, q-1}
\end{array}\right], \tag{24}
$$

where $\mathbf{A}_{i, j}$ has the location vectors of the $q$ entries of the $j$-th column of $\mathbf{G}_{i}$ as rows. Next, we form the following $q \times q$ array of $q \times q$ permutation matrices with $\mathbf{B}_{0}, \mathbf{B}_{1}, \cdots, \mathbf{B}_{q-1}$ as submatrices arranged in a column:

$$
\begin{align*}
\mathbf{H}_{r s, 1} & =\left[\begin{array}{c}
\mathbf{B}_{0} \\
\mathbf{B}_{1} \\
\vdots \\
\mathbf{B}_{q-1}
\end{array}\right]  \tag{25}\\
& =\left[\begin{array}{cccc}
\mathbf{A}_{0,0} & \mathbf{A}_{0,1} & \cdots & \mathbf{A}_{0, q-1} \\
\mathbf{A}_{1,0} & \mathbf{A}_{1,1} & \cdots & \mathbf{A}_{1, q-1} \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{A}_{q-1,0} & \mathbf{A}_{q-1,1} & \cdots & \mathbf{A}_{q-1, q-1}
\end{array}\right] .
\end{align*}
$$

$\mathbf{H}_{r s, 1}$ is a $q^{2} \times q^{2}$ matrix over $\mathrm{GF}(2)$ with both column and row weights $q$. For $q>7$, each permutation matrix $\mathbf{A}_{i, j}$ is a sparse matrix and hence $\mathbf{H}_{r s, 1}$ is also a sparse matrix. It follows from the structural properties of the matrices $\mathbf{G}_{i}$ that no two rows (or two columns) of $\mathbf{H}_{r s, 1}$ can have more than one 1-component in common. This implies that there are no four 1-components at the four corners of a rectangle in $\mathbf{H}_{r s, 1}$; that is, $\mathbf{H}_{r s, 1}$ satisfies the RC-constraint and, hence, has a girth of at least 6 [50], [7].
For any pair of integers, $\left(d_{v}, d_{c}\right)$, with $1 \leq d_{v}, d_{c} \leq q$, let $\mathbf{H}_{r s, 1}\left(d_{v}, d_{c}\right)$ be a $d_{v} \times d_{c}$ subarray of $\mathbf{H}_{r s, 1}$. Then $\mathbf{H}_{r s, 1}\left(d_{v}, d_{c}\right)$ is a $d_{v} q \times d_{c} q$ matrix over $\mathrm{GF}(2)$ with column and row weights $d_{v}$ and $d_{c}$, respectively. It is a $\left(d_{v}, d_{c}\right)$-regular matrix which also satisfies the RC-constraint. The null space of $\mathbf{H}_{r s, 1}\left(d_{v}, d_{c}\right)$ gives a $\left(d_{v}, d_{c}\right)$-regular RS-based LDPC code $\mathcal{C}_{r s, 1}$ of length $d_{c} q$ with rate at least $\left(d_{c}-d_{v}\right) / d_{c}$ and minimum distance at least $d_{v}+1$ [50], [7], whose Tanner graph has a girth of at least 6. Since $\mathbf{H}_{r s, 1}$ consists of an array of permutation matrices, no odd number of columns of $\mathbf{H}_{r s, 1}$ can be added to zero. This implies that the RS-based regular LDPC code $\mathcal{C}_{r s, 1}$ has only even-weight codewords. Consequently, its minimum distance is even, at least $d_{v}+2$ for even $d_{v}$ and $d_{v}+1$ for odd $d_{v}$. The above construction gives a class of regular LDPC codes whose Tanner graphs have girth at least 6. For each $(q, 2, q-1)$ extended RS code $\mathcal{C}_{b}$ over $\mathrm{GF}(q)$, we can construct a family of regular RS-based LDPC codes with various lengths, rates and minimum distances. $\mathcal{C}_{b}$ is referred to as the base code.
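The subarray construction can be sketched in a few lines under a simplifying assumption: $q$ is taken to be a prime $p$, so that field arithmetic is plain integer arithmetic modulo $p$ and the base code consists of the evaluations of the polynomials $a_{1} X+a_{0}$. The function names and the choice of the upper-left subarray are illustrative only. The sketch checks the column weight $d_{v}$, the row weight $d_{c}$, and the RC-constraint.

```python
from itertools import combinations


def build_parity_check(p, dv, dc):
    """Upper-left dv x dc subarray of the p x p array of p x p permutation
    matrices, assuming p is prime (integer arithmetic mod p stands in for GF(q))."""
    rows = []
    for i in range(dv):              # block row i: coset with leading coefficient i
        for a0 in range(p):          # one codeword (row of G_i) per constant term
            row = [0] * (dc * p)
            for j in range(dc):      # block column j: code symbol at position j
                symbol = (i * j + a0) % p
                row[j * p + symbol] = 1   # location vector of that symbol
            rows.append(row)
    return rows


def satisfies_rc(h):
    """RC-constraint: no two rows share more than one 1-component position."""
    supports = [{c for c, bit in enumerate(row) if bit} for row in h]
    return all(len(a & b) <= 1 for a, b in combinations(supports, 2))


H = build_parity_check(p=7, dv=3, dc=5)
row_weights = {sum(row) for row in H}
col_weights = {sum(col) for col in zip(*H)}
print(len(H), len(H[0]), row_weights, col_weights, satisfies_rc(H))
```

Checking rows pairwise is enough here, because four 1-components at the corners of a rectangle would violate the condition for rows and for columns at the same time.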

Example 6: Consider the $(64,2,63)$ extended RS code $\mathcal{C}_{b}$ over $\mathrm{GF}\left(2^{6}\right)$. Based on the codewords of this RS code $\mathcal{C}_{b}$, we can construct a $64 \times 64$ array $\mathbf{H}_{r s, 1}$ of $64 \times 64$ permutation matrices. Suppose we choose $d_{v}=6$ and $d_{c}=32$. Take a $6 \times 32$ subarray $\mathbf{H}_{r s, 1}(6,32)$ from $\mathbf{H}_{r s, 1}$, say the $6 \times 32$ subarray at the upper left corner of $\mathbf{H}_{r s, 1}$. $\mathbf{H}_{r s, 1}(6,32)$ is a $384 \times 2048$ matrix over $\mathrm{GF}(2)$ with column and row weights 6 and 32, respectively. The null space of this matrix gives a $(2048,1723)$ regular RS-based LDPC code with rate 0.841 and minimum distance at least 8. Assume transmission over the AWGN channel with BPSK signaling. The performance of this code with iterative decoding using the SPA (50 iterations) is shown in Fig. 20. At a BER of $10^{-6}$, the code performs 1.55 dB from the Shannon limit. The standard code for the IEEE 802.3 10G Base-T Ethernet is a $(2048,1723)$ regular RS-based LDPC code given by the null space of a $6 \times 32$ subarray of the array $\mathbf{H}_{r s, 1}$ constructed above.
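The figures quoted in Example 6 can be cross-checked with a short calculation. The sketch below only reproduces the arithmetic from the parameters stated in the example; the variable names are illustrative, and the code dimension 1723 is taken from the text rather than computed from the matrix.

```python
q, dv, dc = 64, 6, 32          # field size and subarray dimensions from Example 6
k = 1723                       # code dimension as stated in the example

n_rows, n_cols = dv * q, dc * q            # size of H_rs,1(6, 32)
rate = k / n_cols                          # actual code rate
rate_lower_bound = (dc - dv) / dc          # guaranteed rate if all rows were independent
implied_rank = n_cols - k                  # rank of H implied by the stated dimension
min_distance_bound = dv + 2 if dv % 2 == 0 else dv + 1   # even-weight argument

print(n_rows, n_cols)                      # 384 2048
print(round(rate, 3), rate_lower_bound)    # 0.841 0.8125
print(implied_rank, min_distance_bound)    # 325 8
```

The implied rank of 325 is smaller than the 384 rows of the matrix, so some rows are linearly dependent, which is why the actual rate exceeds the lower bound $\left(d_{c}-d_{v}\right) / d_{c}$.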


Fig. 20. Performance of the binary $(2048,1723)$ regular RS-based LDPC code given in Example 6 over the BI-AWGNC.

## A. Class-I RS-Based QC-LDPC Codes

RS codes were originally defined in polynomial form in the frequency domain [63]. Using the polynomial form, arrays of circulant permutation matrices that satisfy the RC-constraint can be constructed from all the codewords of an RS code over a prime field $\mathrm{GF}(p)$ with two information symbols. Based on these arrays of circulant permutation matrices, a class of QC-LDPC codes can be constructed.
Let $p$ be a prime. Consider the prime field $\mathrm{GF}(p)=\{0,1, \cdots, p-1\}$ under modulo-$p$ addition and multiplication. Let $\mathcal{P}=\left\{\mathbf{a}(X)=a_{1} X+a_{0}: a_{1}, a_{0} \in \mathrm{GF}(p)\right\}$ be the set of $p^{2}$ polynomials of degree one or less with coefficients from $\mathrm{GF}(p)$. For each polynomial $\mathbf{a}(X)$ in $\mathcal{P}$, define the following $p$-tuple over $\mathrm{GF}(p)$: $\mathbf{v}=(\mathbf{a}(0), \mathbf{a}(1), \cdots, \mathbf{a}(p-1))$, where $\mathbf{a}(j)=a_{1} \cdot j+a_{0}$ with $j \in \mathrm{GF}(p)$. Then the set of $p^{2}$ $p$-tuples,

$$
\begin{equation*}
\mathcal{C}_{b}=\{\mathbf{v}=(\mathbf{a}(0), \mathbf{a}(1), \cdots, \mathbf{a}(p-1)): \mathbf{a}(X) \in \mathcal{P}\} \tag{26}
\end{equation*}
$$

gives a ( $p, 2, p-1$ ) RS code over GF( $p$ ) with two information symbols. The RS code $\mathcal{C}_{b}$ given by (26) is not cyclic.

Consider the subset $\mathcal{P}_{0}=\left\{\mathbf{a}(X)=a_{0}: a_{0} \in G F(p)\right\}$ of zero-degree polynomials in $\mathcal{P}$. Then the set of $p$-tuples,

$$
\begin{align*}
\mathcal{C}_{b}^{(0)} & =\left\{(\mathbf{a}(0), \mathbf{a}(1), \cdots, \mathbf{a}(p-1)): \mathbf{a}(X) \in \mathcal{P}_{0}\right\}  \tag{27}\\
& =\left\{\left(a_{0}, a_{0}, \cdots, a_{0}\right): a_{0} \in G F(p)\right\}
\end{align*}
$$

constructed from the zero-degree polynomials in $\mathcal{P}_{0}$ forms a one-dimensional subcode of $\mathcal{C}_{b}$ and is a $(p, 1, p)$ RS code over $\mathrm{GF}(p)$ with minimum distance $p$. Partition $\mathcal{C}_{b}$ with respect to $\mathcal{C}_{b}^{(0)}$ into $p$ cosets, $\mathcal{C}_{b}^{(0)}, \mathcal{C}_{b}^{(1)}, \cdots, \mathcal{C}_{b}^{(p-1)}$, where

$$
\begin{equation*}
\mathcal{C}_{b}^{(i)}=\left\{(\mathbf{a}(0), \mathbf{a}(1), \cdots, \mathbf{a}(p-1)): \mathbf{a}(X)=i X+a_{0}, a_{0} \in \mathrm{GF}(p)\right\} . \tag{28}
\end{equation*}
$$

For $0 \leq i<p, \mathcal{C}_{b}^{(i)}$ contains $p$ codewords in $\mathcal{C}_{b}$ of the following form:

$$
\begin{equation*}
\left(i \cdot 0+a_{0}, i \cdot 1+a_{0}, \cdots, i \cdot(p-1)+a_{0}\right) \tag{29}
\end{equation*}
$$



[^0]:    Exhibit C

[^1]:    

[^2]:    ${ }^{1}$ This paper is to be presented at the Second International Conference on Turbo Codes, Brest, France, September 2000. This research was supported by NSF grant no. CCR-9804793, and grants from Sony, Qualcomm, and Caltech's Lee Center for Advanced Networking.

[^3]:    Manuscript received October 22, 2002; revised April 1, 2004.
    A. Roumy is with IRISA-INRIA, 35042 Rennes, France (e-mail: aline.roumy@irisa.fr).
    S. Guemghar and G. Caire are with the Eurecom Institute, 06904 Sophia-Antipolis, France (e-mail: Souad.Guemghar@eurecom.fr; Giuseppe.Caire@eurecom.fr).
    S. Verdú is with the Department of Electrical Engineering, Princeton University, Princeton, NJ 08544 USA (e-mail: verdu@princeton.edu).

    Communicated by R. Urbanke, Associate Editor for Coding Techniques.
    Digital Object Identifier 10.1109/TIT.2004.831778

[^4]:    ${ }^{1}$ If the output alphabet is the real line, then $-y$ coincides with ordinary reflec-

[^5]:    ${ }^{2}$ Recall that the capacity of a binary-input symmetric-output memoryless channel is achieved by uniform i.i.d. inputs.

[^6]:    ${ }^{3}$ Just prior to the submission of the final revised version of this work we became aware of [36] which proposes essentially the same method as Method 3.

[^7]:    Manuscript received July 04, 2006; revised August 25, 2006. This work was supported by the University of Bologna, NASA-Goddard, and NSF.

    This paper has been approved by F. Chiaraluce.
    Gianluigi Liva is with the University of Bologna (email: gliva@deis.unibo.it).

    Shumei Song, Lan Lan, and Shu Lin are with the University of California at Davis (e-mail: ssmsong@ece.ucdavis.edu, squashlan@gmail.com, shulin@ece.ucdavis.edu).
    Yifei Zhang and William E. Ryan are with the University of Arizona, U.S.A. (e-mail: \{yifeiz, ryan\}@ece.arizona.edu).

[^8]:    ${ }^{1}$ Acknowledgment to C. Jones of JPL for simulating this code for us on an FPGA decoder.

