Method and apparatus for parallel high speed data transfer
First Claim
1. A data bus interface for fast parallel I/O comprising:
a core interface configured to accept output data from a core for transmission and configured to provide to said core input data which has been received;
a data transmission unit coupled to said core interface to receive said output data for transmission and configured to drive a plurality of output signals representing said output data across a data bus in a fixed-phase relationship with a bus clock signal; and
a data reception unit coupled to said data bus to receive a plurality of input signals representing said input data and configured to synchronize said plurality of input signals to each other in a fixed-phase relationship with said bus clock signal, thereby reducing clock skew of said plurality of input signals, said data reception unit comprising;
a plurality of delay elements each coupled to receive a corresponding one of said plurality of input signals;
a first-in first-out buffer for synchronizing said input data to a core clock signal;
a data delay adjust unit coupled to an output of said first-in first-out buffer and configured to set the individual delays of said plurality of delay elements;
a demultiplexer coupled to receive a delayed input data signal from each of said plurality of delay elements, coupled to receive a bus clock signal, and configured to time-demultiplex the delayed input data signal from each of said plurality of delay elements in response to changes in said bus clock signal; and
a plurality of data latches coupled to receive time-demultiplexed output data from said demultiplexer, and thereafter synchronously provide said time-demultiplexed output data as an input signal to said first-in first-out buffer.
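The demultiplexer and data latch elements recited above can be illustrated with a small software model. This is only a sketch under stated assumptions: the claim does not fix the demultiplexing ratio or the sample ordering, so the round-robin ordering, the `ways` parameter, and the function names `time_demultiplex` and `latch_words` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence, Tuple


def time_demultiplex(line_samples: Sequence[int], ways: int) -> List[List[int]]:
    """Split one line's serial samples into `ways` slower parallel streams.

    Assumption (not stated in the claim): samples are distributed
    round-robin, one per bus clock change, starting at phase 0.
    """
    if ways <= 0:
        raise ValueError("ways must be positive")
    return [list(line_samples[phase::ways]) for phase in range(ways)]


def latch_words(phases: Sequence[Sequence[int]]) -> List[Tuple[int, ...]]:
    """Model the data latches: present one sample from every phase at once.

    Only complete groups are produced, so a trailing partial group is dropped.
    """
    return list(zip(*phases))


# Eight samples arriving on a single delayed input line, demultiplexed 4 ways.
phases = time_demultiplex([1, 0, 1, 1, 0, 0, 1, 0], ways=4)
words = latch_words(phases)
```

In this model each tuple in `words` stands for the group of bits handed synchronously to the first-in first-out buffer, which in the claim then resynchronizes the data to the core clock.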
Abstract
The present invention concerns a method for eliminating or reducing clock skew introduced by differing signal propagation delays across a data bus. At high bus clock frequencies the time delay differences caused by path length differences can be catastrophic and must be eliminated by expensive layout techniques. An input/output (I/O) architecture is proposed here which tailors a delay to each individual data line, and thereby aligns all the incoming data. Furthermore, a clock signal is provided to indicate the optimal data sampling time. In the described embodiment, this circuit enables the transmission of four 32 bit words in parallel in one clock cycle of a 250 MHz processor.
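The idea of tailoring a delay to each data line, and the transfer figures quoted in the abstract, can be worked through numerically. The sketch below is illustrative only: it assumes that alignment is achieved by delaying every line to match the latest-arriving one, which the abstract does not state, and it assumes one 4 x 32 bit transfer is sustained on every cycle. The function names and the picosecond arrival times are hypothetical.

```python
from typing import List, Sequence


def alignment_delays(arrival_ps: Sequence[int]) -> List[int]:
    """Return the extra delay (in ps) to add to each line so all lines align.

    Assumption: every line is padded up to the slowest line's arrival time.
    """
    latest = max(arrival_ps)
    return [latest - t for t in arrival_ps]


def payload_bits_per_second(words: int, bits_per_word: int, clock_hz: int) -> int:
    """Peak payload rate if `words` words move on every clock cycle."""
    return words * bits_per_word * clock_hz


# Hypothetical arrival times for three data lines with different path lengths.
delays = alignment_delays([120, 80, 200])

# Figures from the abstract: four 32 bit words per cycle at 250 MHz.
rate = payload_bits_per_second(4, 32, 250_000_000)
```

With these inputs the slowest line receives no added delay, and the abstract's figures correspond to 128 bits every 4 ns, a peak of 32 Gbit/s (4 GB/s) under the sustained-transfer assumption.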
12 Claims
7. A computer system which comprises:
a data bus which is coupled between elements configured to transmit and receive data across said data bus, each of said elements comprising;
a core; and
a data bus interface coupled to said core, said data bus interface including;
a core interface configured to accept output data from said core for transmission and configured to provide to said core input data which has been received;
a data transmission unit coupled to said core interface to receive said output data for transmission and configured to drive a plurality of output signals representing said output data across said data bus in a fixed-phase relationship with a bus clock signal; and
a data reception unit coupled to said data bus to receive a plurality of input signals representing said input data and configured to synchronize said plurality of input signals to each other in a fixed-phase relationship with said bus clock signal, thereby reducing clock skew of said plurality of input signals, said data reception unit including;
a plurality of delay elements each coupled to receive a corresponding one of said plurality of input signals;
a first-in first-out buffer for synchronizing said input data to a core clock signal;
a data delay adjust unit coupled to an output of said first-in first-out buffer and configured to set the individual delays of said plurality of delay elements;
a demultiplexer coupled to receive a delayed input data signal from each of said plurality of delay elements, coupled to receive a bus clock signal, and configured to time-demultiplex the delayed input data signal from each of said plurality of delay elements in response to changes in said bus clock signal; and
a plurality of data latches coupled to receive time-demultiplexed output data from said demultiplexer, and thereafter synchronously provide said time-demultiplexed output data as an input signal to said first-in first-out buffer.
Specification