Genomic messaging system
Abstract
A computer-based method is provided for transferring data that includes a genomic sequence. The method includes identifying at least one genomic base in an input data stream comprising said genomic sequence; assigning a base-specific binary code to the at least one genomic base; grouping the base-specific binary code to form a genomic data stream representative of the genomic sequence; assigning a command binary code to at least one command for selectively processing said genomic data stream; and integrating said genomic data stream and said command binary code to form an output binary data stream.
41 Claims
-
1. A computer-based method for processing data that includes a genomic sequence, said method comprising:
-
identifying at least one genomic base in an input data stream comprising said genomic sequence;
assigning a base-specific binary code to the at least one genomic base;
grouping the base-specific binary code to form a genomic data stream representative of the genomic sequence;
assigning a command binary code to at least one command for selectively processing said genomic data stream; and
integrating said genomic data stream and said command binary code to form an output binary data stream.
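The five steps of claim 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the 2-bit code table, the command value `CMD_SEQUENCE`, and the stream layout (one command byte, a 4-byte base count, then the packed bases) are hypothetical choices for this example and are not specified by the claims.

```python
# Hypothetical 2-bit codes; the claims do not fix a particular mapping.
BASE_CODES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}

# Hypothetical one-byte command code announcing a packed genomic run.
CMD_SEQUENCE = 0x01


def encode(input_stream: str) -> bytes:
    """Turn a text stream containing DNA bases into a binary stream."""
    # Step 1: identify genomic bases, skipping any other characters.
    bases = [ch for ch in input_stream.upper() if ch in BASE_CODES]

    # Step 2: assign a base-specific binary code to each base.
    codes = [BASE_CODES[b] for b in bases]

    # Step 3: group the codes, four per byte, zero-padding the last byte.
    packed = bytearray()
    for i in range(0, len(codes), 4):
        chunk = codes[i:i + 4]
        byte = 0
        for code in chunk:
            byte = (byte << 2) | code
        byte <<= 2 * (4 - len(chunk))
        packed.append(byte)

    # Steps 4 and 5: prepend the command code and the base count,
    # then integrate everything into one output stream.
    header = bytes([CMD_SEQUENCE]) + len(codes).to_bytes(4, "big")
    return header + bytes(packed)


print(encode("ACGT").hex())  # 01000000041b
```

The base count in the header is one way to let a receiver discard the padding bits of the final byte.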
-
-
2. The method of claim 1, wherein said input data stream further comprises clinical data.
-
3. The method of claim 2, wherein said output binary data stream comprises said clinical data.
-
4. The method of claim 1, wherein said input data stream is read from an input data file.
-
5. The method of claim 1, further comprising transmitting said output binary stream to a receiving data processing system.
-
6. The method of claim 5, further comprising writing said output binary data stream to a binary data file before said transmitting step.
-
7. The method of claim 5, wherein said receiving data processing system performs the steps of:
-
parsing the genomic data stream from the output binary data stream;
unpacking the base-specific binary code within the genomic data stream;
reassigning said genomic bases to said base-specific binary code; and
arranging the genomic bases to form an output data sequence that includes said genomic sequence.
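The receiving-side steps of claim 7 can be sketched in the same spirit. The sketch assumes a hypothetical stream layout that the claims do not prescribe: one command byte (`CMD_SEQUENCE`), a 4-byte big-endian base count, then bases packed four per byte with 2-bit codes.

```python
# Hypothetical inverse of an assumed 2-bit code table.
CODE_TO_BASE = {0b00: "A", 0b01: "C", 0b10: "G", 0b11: "T"}

# Hypothetical command byte announcing a packed genomic run.
CMD_SEQUENCE = 0x01


def decode(stream: bytes) -> str:
    """Recover the base sequence from the assumed stream layout."""
    if not stream or stream[0] != CMD_SEQUENCE:
        raise ValueError("expected a sequence command byte")

    # Parse the genomic data stream out of the output stream.
    n_bases = int.from_bytes(stream[1:5], "big")
    genomic = stream[5:]

    # Unpack each byte into four 2-bit codes and map codes back to bases.
    bases = []
    for byte in genomic:
        for shift in (6, 4, 2, 0):
            bases.append(CODE_TO_BASE[(byte >> shift) & 0b11])

    # Arrange the bases into the output sequence, dropping padding.
    return "".join(bases[:n_bases])


print(decode(bytes.fromhex("01000000051b00")))  # ACGTA
```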
-
-
8. The method of claim 7, further comprising writing said output data sequence to an output data file.
-
9. The method of claim 1, wherein said genomic sequence is a DNA sequence and wherein said genomic base is one of adenine, guanine, cytosine, and thymine.
-
10. The method of claim 1, wherein said genomic sequence is an RNA sequence, and wherein said genomic base is one of adenine, guanine, cytosine, and uracil.
-
11. The method of claim 1, wherein the base-specific binary code is an n-bit binary code, wherein 2 ≤ n ≤ 6.
-
12. The method of claim 1, wherein the base-specific binary code is a 2-bit binary code.
-
13. The method of claim 12, wherein the 2-bit base-specific binary code is one of 00, 01, 10, or 11.
-
14. The method of claim 1, wherein the base-specific binary code comprises a code group of genomic bases, wherein 2^n is greater than or equal to the number of permutations possible for the code group of genomic bases, and wherein n equals a number of bits necessary to code said code group of genomic bases.
-
15. The method of claim 14, wherein the code group comprises two genomic bases thereby forming 16 possible permutations of the two genomic bases, and wherein the number of bits necessary to code the code group comprising the two genomic bases is 4.
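The bit-count arithmetic in claims 14 and 15 can be checked with a short Python sketch. The helper names and the ordering of the generated code table are assumptions made for illustration; the claims only require that n bits suffice to cover every permutation of the code group.

```python
from itertools import product

BASES = "ACGT"


def bits_needed(group_size: int) -> int:
    """Smallest n such that 2**n covers all groups of this size."""
    permutations = len(BASES) ** group_size
    return (permutations - 1).bit_length()


def build_group_codes(group_size: int) -> dict:
    """Assign an n-bit code string to every group (ordering is arbitrary)."""
    n = bits_needed(group_size)
    groups = product(BASES, repeat=group_size)
    return {"".join(g): format(i, f"0{n}b") for i, g in enumerate(groups)}


codes = build_group_codes(2)
print(len(codes), bits_needed(2))  # 16 4
print(codes["AC"])                 # 0001
```

For a code group of three bases the same helper gives 64 permutations and 6 bits, the upper end of the 2 to 6 bit range in claim 11.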
-
16. The method of claim 1, wherein said grouping step comprises grouping said base-specific binary code into at least one byte.
-
17. The method of claim 16, wherein said byte is an 8-bit byte.
-
18. The method of claim 17, wherein said byte comprises a genomic base portion coding for at least one genomic base and a command portion coding for at least one command.
-
19. The method of claim 18, wherein said command portion is a 6-bit command portion.
-
20. The method of claim 18, wherein said genomic base portion comprises a 6-bit base portion and wherein said command portion comprises a 2-bit command portion.
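A byte split into a base portion and a command portion, as in claims 18 to 20, might look like the Python sketch below. It assumes three 2-bit bases in the high six bits and a 2-bit command in the low two bits; the bit order, the code table, and the function names are hypothetical and not taken from the claims.

```python
# Hypothetical 2-bit code table.
BASE_CODES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
CODE_TO_BASE = "ACGT"  # index = 2-bit code


def pack_byte(three_bases: str, command: int) -> int:
    """Pack three bases (6 bits) and a 2-bit command into one byte."""
    if len(three_bases) != 3:
        raise ValueError("exactly three bases are required")
    if not 0 <= command <= 0b11:
        raise ValueError("command must fit in 2 bits")
    base_portion = 0
    for base in three_bases:
        base_portion = (base_portion << 2) | BASE_CODES[base]
    return (base_portion << 2) | command


def unpack_byte(byte: int):
    """Split a byte back into its three bases and its 2-bit command."""
    command = byte & 0b11
    base_portion = byte >> 2
    bases = "".join(
        CODE_TO_BASE[(base_portion >> shift) & 0b11] for shift in (4, 2, 0)
    )
    return bases, command


packed = pack_byte("GAT", 0b10)
print(format(packed, "08b"))  # 10001110
print(unpack_byte(packed))    # ('GAT', 2)
```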
-
21. The method of claim 1, wherein the binary code is a 2-bit binary code, and wherein the 2-bit binary code is packed into a binary stream of at least one 8-bit byte.
-
22. The method of claim 21, wherein X number of bases represented by said 2-bit binary code are grouped into said 8-bit byte wherein X=1, 2, or 3; and wherein any remaining bits of said 8-bit byte are used to specify a multiplicity of the X number of bases represented by said 2-bit binary code.
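For the X=1 case of claim 22, one base occupies 2 bits and the remaining 6 bits can carry a multiplicity, which amounts to a run-length encoding. The Python sketch below is one possible reading: placing the base in the high bits and storing the run length minus one (so a byte covers runs of 1 to 64) are assumptions, not details given in the claim.

```python
# Hypothetical 2-bit code table.
BASE_CODES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
CODE_TO_BASE = "ACGT"  # index = 2-bit code

MAX_RUN = 64  # 6 multiplicity bits, storing (run length - 1)


def rle_encode(seq: str) -> bytes:
    """Encode each run of identical bases as base (2 bits) + count (6 bits)."""
    out = bytearray()
    i = 0
    while i < len(seq):
        base = seq[i]
        run = 1
        while i + run < len(seq) and seq[i + run] == base and run < MAX_RUN:
            run += 1
        out.append((BASE_CODES[base] << 6) | (run - 1))
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> str:
    """Expand each byte back into its run of bases."""
    return "".join(CODE_TO_BASE[b >> 6] * ((b & 0x3F) + 1) for b in data)


print(list(rle_encode("AAAACG")))  # [3, 64, 128]
```

Runs longer than 64 bases simply spill into additional bytes.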
-
23. The method of claim 21, wherein four genomic bases represented by said 2-bit binary code are grouped into said 8-bit byte, and wherein a multiplicity of the four bases is specified elsewhere in said output binary data stream.
-
24. The method of claim 1, wherein said assigning a base-specific binary code comprises assigning a first bit to said genomic base such that the first bit corresponds to a purine or a pyrimidine base.
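Claim 24 ties the first bit of the code to the chemical class of the base. A small Python sketch, assuming a hypothetical table in which a leading 0 marks a purine (A, G) and a leading 1 marks a pyrimidine (C, T); the meaning of the second bit is an arbitrary choice here.

```python
# Hypothetical table: first bit 0 = purine, first bit 1 = pyrimidine.
PURINE_FIRST_CODES = {"A": 0b00, "G": 0b01, "C": 0b10, "T": 0b11}


def is_purine(code: int) -> bool:
    """The class of the base is read from the first (high) bit alone."""
    return (code >> 1) == 0


def is_transition(base_a: str, base_b: str) -> bool:
    """A substitution within one class (A<->G or C<->T) flips only the low bit."""
    return (PURINE_FIRST_CODES[base_a] ^ PURINE_FIRST_CODES[base_b]) == 0b01


print(is_purine(PURINE_FIRST_CODES["G"]))  # True
print(is_transition("C", "T"))             # True
print(is_transition("A", "C"))             # False
```

With this particular table, a transition differs from a transversion by a single XOR, which is one practical consequence of reserving the first bit for the purine/pyrimidine distinction.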
-
25. The method of claim 4, further comprising encrypting said output binary data stream.
-
26. The method of claim 25, further comprising decrypting said binary stream after said transmitting step.
-
27. The method of claim 1, wherein said command comprises annotation text annotating said one or more genomic bases.
-
28. The method of claim 27, wherein said annotation text is embedded in said output binary data stream so as to preserve a relationship of said annotation text to said genomic bases.
-
29. The method of claim 28, further comprising transmitting said output binary data stream to a receiving data processing system and extracting said annotation text from said output binary data stream after said transmitting step so as to preserve the relationship of said annotation text to said genomic bases.
-
30. The method of claim 5, wherein said command is operable to add a text identifier to said genomic data stream.
-
31. The method of claim 30, further comprising providing a corresponding text identifier to a user of said receiving data processing system.
-
32. The method of claim 1, wherein said command is operable to provide validation of integrity of said genomic data stream.
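One way a command could provide integrity validation, as in claim 32, is to follow the genomic data with a checksum. The Python sketch below uses CRC-32 from the standard `zlib` module; the choice of CRC-32, the command value `CMD_CHECKSUM`, and the trailer layout are assumptions for illustration only.

```python
import zlib

# Hypothetical command byte announcing a 4-byte CRC-32 trailer.
CMD_CHECKSUM = 0x02


def append_checksum(genomic: bytes) -> bytes:
    """Append the checksum command and the CRC-32 of the genomic data."""
    crc = zlib.crc32(genomic)
    return genomic + bytes([CMD_CHECKSUM]) + crc.to_bytes(4, "big")


def verify(stream: bytes) -> bytes:
    """Return the genomic data if the trailer matches, else raise."""
    if len(stream) < 5 or stream[-5] != CMD_CHECKSUM:
        raise ValueError("checksum command not found")
    genomic = stream[:-5]
    expected = int.from_bytes(stream[-4:], "big")
    if zlib.crc32(genomic) != expected:
        raise ValueError("genomic data stream failed integrity check")
    return genomic


stream = append_checksum(b"\x1b\x00")
print(verify(stream) == b"\x1b\x00")  # True
```

A CRC detects accidental corruption only; it offers no protection against deliberate tampering.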
-
33. The method of claim 1, wherein said command is operable to exclude identifying information pertaining to a person whose genomic sequence is contained in said genomic data stream from being revealed in said output binary data stream.
-
34. The method of claim 1, wherein said command is operable to control a level of encryption of the output binary data stream.
-
35. The method of claim 34, wherein said command is recognized by a receiving data processing system to permit decryption of the output binary data stream.
-
36. The method of claim 34, wherein said command is operable to seed an algorithm used for encryption of the output binary data stream.
-
37. The method of claim 34, wherein said command is operable to specify a block size of a shuffling algorithm used for encryption of the output binary data stream.
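Claims 36 and 37 describe commands that carry a seed and a shuffling block size. The Python sketch below shows how those two parameters could drive a reversible block shuffle. It is an illustration of the parameterization only: the function names are hypothetical, the claims do not describe this particular algorithm, and Python's `random.Random` is not a cryptographically secure generator.

```python
import random


def shuffle_blocks(data: bytes, seed: int, block_size: int) -> bytes:
    """Permute the bytes inside each block using a seeded generator."""
    rng = random.Random(seed)
    out = bytearray()
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        perm = list(range(len(block)))
        rng.shuffle(perm)
        out.extend(block[i] for i in perm)
    return bytes(out)


def unshuffle_blocks(data: bytes, seed: int, block_size: int) -> bytes:
    """Regenerate the same permutations from the seed and invert them."""
    rng = random.Random(seed)
    out = bytearray()
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        perm = list(range(len(block)))
        rng.shuffle(perm)
        plain = bytearray(len(block))
        for dst, src in enumerate(perm):
            plain[src] = block[dst]
        out.extend(plain)
    return bytes(out)


message = bytes(range(20))
scrambled = shuffle_blocks(message, seed=1234, block_size=8)
print(unshuffle_blocks(scrambled, seed=1234, block_size=8) == message)  # True
```

The receiver needs both the seed and the block size to invert the shuffle, which is why it is natural for them to travel as commands in the stream.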
-
38. The method of claim 1, wherein said command is operable to embed program code for selectively processing said genomic data stream.
-
39. The method of claim 1, wherein said command is operable to bracket at least one portion of said genomic data stream thereby selecting said portion for processing.
-
40. An apparatus in a data processing system for transferring data comprising a genomic sequence, said apparatus comprising:
at least one processor operative to:
(i) identify at least one genomic base in an input data stream comprising said genomic sequence;
(ii) assign a base-specific binary code to the at least one genomic base;
(iii) group the base-specific binary code to form a genomic data stream representative of the genomic sequence;
(iv) assign a command binary code to at least one command for selectively processing said genomic data stream; and
(v) integrate said genomic data stream and said command binary code to form an output binary data stream.
-
41. An article of manufacture in a data processing system for transferring data comprising a genomic sequence, said article of manufacture comprising a machine readable medium containing one or more programs which when executed implement the steps of:
-
identifying at least one genomic base in an input data stream comprising said genomic sequence;
assigning a base-specific binary code to the at least one genomic base;
grouping the base-specific binary code to form a genomic data stream representative of the genomic sequence;
assigning a command binary code to at least one command for selectively processing said genomic data stream; and
integrating said genomic data stream and said command binary code to form an output binary data stream.
-
Specification