
www.latticesemi.com

ip1039_01.0

April 2005 IP Data Sheet

or product names are trademarks or registered trademarks of their respective holders. The specifications and information herein are subject to change without notice.

The product described herein is subject to continuing development, and applicable specifications and information are subject to change without notice. Such specifications and information are provided in good faith; actual performance is not guaranteed, as it is dependent on many factors, including the user's system design.

Correlator IP Core

Features

■ Supports 1- to 8-Bit Input Data Width
■ Supports 1 to 256 Channels
■ Supports a Correlation Window from 8 to 2048 Taps
■ Supports Oversampled Input Data from 2x to 8x
■ Supports Real Correlations for Either Signed or Unsigned Data
■ Supports Complex Correlations for Signed Data
■ Allows the User to Tune the Performance of the Design by Specifying the Values of Several Parameters
■ Provides a Selectable Input FIFO for Maximum Data Throughput
■ Allows the User to Specify the Number of Coefficient Sequences Desired, from 1 to 256

Introduction

The function of this core is to correlate an incoming data stream to a stored binary pattern called a code sequence or coefficient sequence. The data stream may be binary or multi-valued, either signed or unsigned, and is provided to the core one sample at a time. The core can be configured to perform either a real correlation with a single data input stream and a single coefficient sequence, or a complex correlation with two input data streams representing the real and imaginary input terms, and two coefficient sequences representing the real and imaginary coefficients.

The core provides multiple channel capability and can support up to 256 channels. Correlations for each channel operate independently from each other. Also, up to 256 different coefficient sequences may be stored in the core, and each channel can select which coefficient sequence is correlated to that channel, so one coefficient sequence could be used for all 256 channels if desired.

Figure 1. Correlator IP Core External Interface Diagram

[Figure: Correlator block with the following interface signals: rst, clk, din[n:0], din_im[n:0], chan_in[n:0], code_sel_in[n:0], block_start_in, irdy, crdy, numtaps_r[n:0], decim_r[n:0], coeffaddr[n:0], coeffwdat[n:0], coeffwdat_im[n:0], coeffwr, dout[n:0], dout_im[n:0], chan_out[n:0], block_start_out, ordy, seq_err]

Lattice Semiconductor Correlator IP Core

General Description

A correlation function determines how closely a data sequence matches a reference, or “coefficient” sequence. A high correlation value means that the data sequence closely matches the coefficient sequence. A low correlation value means that the data sequence is dissimilar to the coefficient sequence. The basic correlator equation is given by:

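$$ r_k = \sum_{i=0}^{\text{corr\_win}-1} d_{i+k}\, c_i, \qquad k = 0, 1, \ldots, \text{num\_lags}-1 \tag{1} $$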

The terms of the equation are:

• d – Input data sequence. The Correlator IP core allows the input sequence to be from 1 to 8 bits wide, and either signed (two’s complement) or unsigned data.

• c – Coefficient (or code) sequence. In the Correlator IP core, this sequence must be loaded into internal memory prior to a correlation operation. This sequence is always binary {1,0}; however, the coefficient sequence may represent either an unsigned sequence {1,0} or a signed sequence {+1,-1}. In the case of a signed coefficient sequence, a 1 in the sequence represents a value of +1 and a 0 in the sequence represents a value of -1. The d and c inputs must both be of the same type, either signed or unsigned.

• r – Correlation result output sequence (correlation between the d and c inputs).

• corr_win – Correlation window. This is the number of elements in the input data sequence over which the correlation function is calculated. This is also referred to as the number of “taps.” For the Correlator IP core this number is determined by the user when configuring the core. Once selected, the number of data elements (and coefficient terms) is then fixed at this number for all correlation operations. The number of terms in the coefficient sequence is always equal to the number of taps (corr_win) specified.

• num_lags – Total number of lags for which the correlation is performed. This is also the length of the correlation result sequence, r.

From Equation 1, a correlation operation takes an input data sequence d of length “corr_win” and multiplies each term in the sequence against the corresponding term of the reference coefficient sequence c, summing the results of all of the multiplications to produce the result r. The input data sequence is then shifted by one element and the operation is repeated to produce the next term in the r sequence. This is done “num_lags” times.

In the Correlator IP core, each time a new data term is input to the core, one correlation operation is performed across “corr_win” data and coefficient terms, producing one result, r. The “k” index in Equation 1 does not apply since the Correlator always produces one new result when it receives one new data value. Old data beyond the defined correlation window size is not held in memory.
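As an illustration only, Equation 1 can be evaluated directly in software. The sketch below is a reference model, not the core's implementation: the function name `correlate`, the list-based data layout, and the example code sequence are all assumptions made for this example. It computes every lag at once, whereas the core itself produces one result per input sample.

```python
def correlate(data, coeff_bits, num_lags, signed=True):
    """Evaluate r_k = sum_i d[i+k] * c[i] for k = 0..num_lags-1 (Equation 1)."""
    corr_win = len(coeff_bits)
    if len(data) < corr_win + num_lags - 1:
        raise ValueError("not enough data samples for the requested lags")
    # Map stored bits to coefficient values: {1,0} -> {+1,-1} if signed, else {1,0}.
    coeffs = [(1 if b else -1) if signed else b for b in coeff_bits]
    return [
        sum(data[i + k] * coeffs[i] for i in range(corr_win))
        for k in range(num_lags)
    ]


# Hypothetical 8-tap code and a data stream that contains it at offset 2.
code = [1, 0, 1, 1, 0, 0, 1, 0]
clean = [3 if b else -3 for b in code]   # samples that match the code exactly
data = [0, 0] + clean + [0, 0]

result = correlate(data, code, num_lags=5)
peak_lag = max(range(len(result)), key=result.__getitem__)
```

In this made-up example the largest value in `result` appears at the lag where the data lines up with the code, which is the behavior the text describes as a "high correlation value".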

In addition to the basic correlation function described above, the Correlator IP core can be configured to perform complex correlations, defined by the equation:

$$ r_k = \sum_{i=0}^{\text{corr\_win}-1} d_{i+k}\, c_i^{*}, \qquad k = 0, 1, \ldots, \text{num\_lags}-1 \tag{2} $$

In this case, the data and coefficient input sequences are both complex and each contains a real input sequence and an imaginary input sequence. For the Correlator IP core, the input data sequence is a sequence of signed (two’s complement) numbers from 1 to 8 bits wide, and the coefficient sequence is a binary sequence where a coefficient value of 1 represents +1 and a coefficient value of 0 represents -1. In Equation 2, $c^{*}$ represents the complex conjugate of c. The complex conjugate multiplication expressed in Equation 2 is given as:

$$ d\,c^{*} = (d_{re}\,c_{re} + d_{im}\,c_{im}) + j\,(d_{im}\,c_{re} - d_{re}\,c_{im}) \tag{3} $$


Since the coefficients in a complex correlation are restricted to the values {+1,-1}, the multiplications in Equation 3 simplify to inversions of the d_re and d_im terms, and the whole equation reduces to a series of additions and subtractions. The Correlator IP core performs these inversions and sums the results to produce a new result value r for each new pair of d_re and d_im terms input to the core. In this case, the result sequence will have two terms, a real term and an imaginary term.
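The reduction of Equation 3 to additions and subtractions can be sketched as follows. This is a software illustration under stated assumptions, not the core's logic: the function names, the bit encoding helper `apply`, and the sample values are hypothetical. Coefficient bit 1 stands for +1 and bit 0 for -1, as in the text.

```python
def complex_term(d_re, d_im, c_re_bit, c_im_bit):
    """One product d * conj(c) from Equation 3, using only negation and addition."""
    def apply(value, bit):
        # Multiplying by +1 passes the value, multiplying by -1 negates it.
        return value if bit else -value

    real = apply(d_re, c_re_bit) + apply(d_im, c_im_bit)
    imag = apply(d_im, c_re_bit) - apply(d_re, c_im_bit)
    return real, imag


def complex_correlate(d_re, d_im, c_re_bits, c_im_bits):
    """Accumulate the real and imaginary parts over one correlation window."""
    acc_re = acc_im = 0
    for dr, di, cr, ci in zip(d_re, d_im, c_re_bits, c_im_bits):
        re, im = complex_term(dr, di, cr, ci)
        acc_re += re
        acc_im += im
    return acc_re, acc_im


# Made-up 4-tap example values.
d_re = [3, -2, 5, -7]
d_im = [1, 4, -6, 2]
c_re_bits = [1, 0, 1, 1]
c_im_bits = [0, 1, 1, 0]
result = complex_correlate(d_re, d_im, c_re_bits, c_im_bits)
```

The result can be cross-checked against ordinary complex arithmetic (for example Python's built-in `complex` type with `.conjugate()`), which is a convenient way to validate a testbench model.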

Functional Description

The Correlator IP core is composed of the following functional blocks:

• State Machine – Controls the flow of data received from the user. Generates the starting pointer values necessary to read/write the Tap Memory and stores the pointer values in the Channel Memory. Stores the starting pointer values for the next correlation operation in the input FIFO. Generates the “shiftby” value for the aligner.

• Channel Memory – Stores the pointer to the location in Tap Memory to write the next data value.

• Input FIFO – Stores the pointer to the starting point in Tap and Coefficient Memories for the next correlation.

• Tap Memory – Stores the data terms (d).

• Coefficient Memory – Stores the coefficient terms (c).

• Aligner – Aligns data and coefficients read from memory for the correlation operation.

• Correlator – Performs the equivalent of the multiplication operations in Equations 1 and 2.

• Adder/Accumulator – Performs the addition and subtraction operations in Equations 1 and 2.

Correlator Input and Output Data

The Correlator IP core accepts a new input data value for a channel and writes that value into Tap Memory. When it is ready to perform the next correlation operation for that channel, the new data value will be included in the correlation, along with enough “old” data already in memory to completely fill the correlation window. The Tap Memory is a circular buffer which contains a correlation window’s worth of data. As each new value is added to Tap Memory for a particular channel, a correlation operation needs to be run and completed before the next new value is added to memory for that channel. New data can be written into Tap Memory for channels other than the one the Correlator is operating on; however, it is the user’s responsibility to ensure that a correlation operation is done for a particular channel before new data is written into memory for that same channel. This is easy to do for a large number of channels where new data is written to channels in a round-robin sequence, or if the Correlator throughput is not stressed to its limit (i.e., unused cycles appear between correlations), but it can be difficult to manage for small numbers of channels. The Correlator IP core will automatically prevent new data from being written into Tap Memory and corrupting a correlation as long as the input FIFO depth (parameter 9) is set to 1.

Figures 2 and 3 show the timing of the user interface. The state machine accepts one new data value from the user interface at a time. When the state machine is ready to accept a new input data word from the user interface it asserts the crdy signal. The user interface then inputs din, chan_in, code_sel_in, block_start_in, and asserts the irdy signal. When the state machine sees irdy go active, it will take the new data value from the user interface. If the design has been configured for multiple channels, the state machine reads the pointer for that channel from the Channel Memory. This pointer value tells the state machine where in Tap Memory to write the data value just received from the user interface. This pointer value will also be the starting point for the next correlation operation, so this pointer value is stored in the Input FIFO until the next correlation operation is ready to start. Once the present correlation operation finishes, or if no correlation operation was in progress when a new data value was received, then the Input FIFO is read to determine the starting pointer for the next correlation. The state machine begins reading the Tap and Coefficient Memories at the starting pointer location, and it reads until it has read an entire correlation window’s worth of data and coefficients.

Figure 2 shows that crdy went active at time 173.5µs. The user provided a value of 0x3 for channel 0, and set code_sel_in to 1, which selects the coefficient sequence to be used for the correlation of channel 0 data. In this example, two channels and two different coefficient sequences have been configured. Each channel can be correlated to either of the two coefficient sequences.

Figure 2. User Interface Timing Diagram for Two-channel Correlator

Figure 3. User Interface Timing Diagram for Two-channel Correlator

The user drives the irdy signal high for one clock cycle, indicating that the input values are valid. Along with the input data, the user also sets the block_start_in signal. This signal will be taken as a marker by the Correlator and aligned with the input data as it passes through the Correlator IP core. The next time the block_start_out signal is set and chan_out = 0, it will indicate that the output data is associated with this input data value. The block_start signals act as markers for the user to do frame alignments between the inputs and outputs of the Correlator IP core. This is necessary since the core operates on one input data sample at a time. It does not perform multiple correlations over “num_lags” values as expressed in Equation 1. This allows the simplest and most versatile Correlator IP core design. If it is necessary to operate for multiple “lags,” then the user application will need to add input and output FIFOs around the core to feed data values one sample at a time.

At time 178.5µs, crdy again goes active, indicating that the Correlator IP core is ready to accept the next input value, and in the example of Figure 2 the user inputs data for channel 1. At time 196.5µs, the correlation result for channel 0 is ready at the dout outputs, and the core outputs a value of 0x9 on dout, sets chan_out to 0, and asserts the ordy signal. It also asserts the block_start_out signal to indicate that this output value was associated with the din value from time 175.5µs.


Figure 4. Tap and Coefﬁcient Memories

Tap and Coefﬁcient Memories

While the Tap and Coefficient Memories are being read, the values read are passed to the Aligner. Under the control of the state machine, the Aligner shifts the tap data and coefficients to be passed to the Correlator block. The state machine also generates strobe signals to the Aligner which indicate, in any given clock cycle, which tap and coefficient values are valid for the Correlator block to work on.

The Tap and Coefficient Memories are implemented with EBR blocks as shown in Figure 4. The Correlator IP core will automatically configure and instantiate the proper number of EBR blocks in the design based on the parameters selected by the user. In the case of the Tap Memory, the number of correlator cells, number of taps, number of channels, and the oversampling rate all determine how many EBR memories are needed. The number of correlator cells (parameter MWIDTH) determines how many words of data can be operated on during a single clock cycle. The more correlator cells that are configured, the more multiplication operations can occur in a clock cycle, and the overall data throughput goes up. At least one EBR memory is required to feed each correlator cell. All Tap Memory EBR blocks in the design will be configured to be at least the word width of the input data (DWIDTH) wide. The EBR blocks can be sized 1, 2, 4, or 9 bits wide, and must be equal to or greater than DWIDTH. Since each EBR block can store 8192 bits, if the value of [TAP_EBR_WIDTH * (NUM_TAP / MWIDTH) * NUM_CHAN * OS_FACTOR] exceeds 8192 bits, then multiple EBR blocks will be stacked in columns to feed the correlator cells, as shown in Figure 4. TAP_EBR_WIDTH is the minimum allowed EBR width which is at least DWIDTH wide. The Tap Memory EBRs will be configured automatically for the user; however, the user is responsible for determining the total number of EBR blocks needed for the design and for ensuring that the target LatticeEC™ device contains enough memories.

[Figure 4 labels: EBR blocks form the Tap Memory and the Coefficient Memory; both feed the Aligner, which feeds the Corr Cells of the Correlator. Number of Correlator Cells = MWIDTH. Number of Data Words (Taps) per Row of Memory = MWIDTH. Number of Coefficients per Row of Memory = MWIDTH.]

The Coefficient Memories are also implemented in EBR blocks. Since each coefficient is constrained to be 1 bit, the total amount of memory required for coefficients is generally less than that required for tap data. Each EBR block can be a maximum of 36 bits wide; therefore, if the number of correlator cells (MWIDTH) is not greater than 36, only one column of EBR memories is required for the Coefficient Memories. If MWIDTH > 36, then multiple columns will be configured. As in the case of the Tap Memories, if the total number of coefficients which needs to be stored exceeds one row of EBR memories, then multiple rows will be configured in a stacked arrangement as shown in Figure 4. For MWIDTH ≤ 36, the number of coefficients required is [NUM_TAP * NUM_COEF_SEQ]. If this number is less than 8192 then only one EBR is needed for the Coefficient Memory.
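Because the user is responsible for budgeting EBR blocks, a rough estimate based on the expressions above may be useful. The sketch below is an approximation under explicit assumptions: the function names are hypothetical, the number of stacked blocks is assumed to be the ceiling of the required bits divided by 8192, only real (not complex) correlation is considered, and the multi-column Coefficient Memory case (MWIDTH > 36) is not modelled. The actual count produced by the tools may differ.

```python
import math

EBR_BITS = 8192                  # capacity per EBR block, as stated in the text
TAP_EBR_WIDTHS = (1, 2, 4, 9)    # EBR widths listed for the Tap Memory


def tap_ebr_width(dwidth):
    """Smallest listed EBR width that is at least DWIDTH wide (DWIDTH is 1 to 8)."""
    return min(w for w in TAP_EBR_WIDTHS if w >= dwidth)


def estimate_ebrs(dwidth, num_tap, mwidth, num_chan, os_factor, num_coef_seq):
    """Return (tap_ebrs, coef_ebrs); the ceil-based stacking rule is an assumption."""
    bits_per_cell = (tap_ebr_width(dwidth) * math.ceil(num_tap / mwidth)
                     * num_chan * os_factor)
    # At least one EBR feeds each correlator cell; stack more if 8192 bits is exceeded.
    tap_ebrs = mwidth * max(1, math.ceil(bits_per_cell / EBR_BITS))
    if mwidth > 36:
        raise ValueError("multi-column Coefficient Memory is not modelled here")
    coef_bits = num_tap * num_coef_seq
    coef_ebrs = max(1, math.ceil(coef_bits / EBR_BITS))
    return tap_ebrs, coef_ebrs


# Evaluation-package settings: DWIDTH=4, NUM_TAP=16, MWIDTH=4, NUM_CHAN=2,
# OS_FACTOR=1, NUM_COEF_SEQ=2.
eval_estimate = estimate_ebrs(4, 16, 4, 2, 1, 2)
large_estimate = estimate_ebrs(8, 2048, 4, 16, 2, 16)
```

For the evaluation-package settings this estimate gives four Tap Memory EBRs plus one Coefficient Memory EBR, which agrees with the five sysMEM EBRs reported in Table 2; that agreement is a sanity check on the assumptions rather than proof that the rule holds for every configuration.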

Unlike the Tap Memories, which are written with new user data under the control of the state machine, the Coefficient Memories must be written with the coefficient sequences before any correlation operations can be done. This is done via the Coefficient Memory Configuration interface, which consists of the input signals coeffaddr, coeffwdat, coeffwdat_im, and coeffwr. Figure 5 shows the timing for this interface for a two-channel design with MWIDTH=4, NUM_TAP=16 and NUM_COEF_SEQ=2.

Figure 5. Coefﬁcient Memory Conﬁguration Interface Timing

In this case, the Coefficient Memory is implemented in one EBR block. Each row of Coefficient Memory is required to store MWIDTH=4 coefficients, so each write to the memory writes four bits. Each coefficient sequence is NUM_TAP=16 bits long, and it will occupy (NUM_TAP / MWIDTH) = 4 rows in the Coefficient Memory. In addition, in this example there are two separate coefficient sequences, so the coefficients will occupy a total of eight rows in the Coefficient Memory.

Writes to the Coefficient Memory are enabled by asserting the coeffwr input. The coeffaddr input selects the row of memory to be written, and coeffwdat (and coeffwdat_im for complex correlations) is set to the desired value. This is a very simple interface; however, it is essential to make sure that the coefficient sequence is written in the correct order. In the example above, the first four values written are for coefficient sequence 0. The values written are 0xa6fc (or in binary: 1010 0110 1111 1100), with the LSB being the first bit in the correlation sequence. This bit will be multiplied against the newest data value received by the Correlator. The MSB in this string will be multiplied against the oldest data read from Tap Memory. This is explained further in the Correlator Evaluation Package section of this document.

The second coefficient sequence written into the Coefficient Memory is 0x0180, and is written into rows 7, 6, 5 and 4. This will be selected as coefficient sequence 1 by setting code_sel_in to 1 when a data value is input to the Correlator. Figure 6 shows how the coefficient values from this example would appear in the Coefficient Memory:


Figure 6. Example Coefﬁcient Sequences Written in Coefﬁcient Memory

In configurations where the number of taps is not a power of 2, the coefficient sequences will need to be padded with zeros so that the length of every coefficient sequence written to the Coefficient Memory is a power of 2. This is because the Coefficient and Tap Memories must be divided up evenly into sections representing the individual channels and coefficient sequences. In the example above, if the number of taps were less than 16, the coefficients would still be written into memory the same way except that padding zeros would be added before the MSB. The padding zeros would be written into memory starting at the MSB of row 3 for coefficient sequence 0.
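The row ordering and zero padding described above can be sketched in a few lines. This is an illustrative helper, not part of the core or its tools: the function name `coefficient_rows` is hypothetical, and it assumes that sequences are stored back to back, that the first coefficient (the LSB) lands in the least significant bit of a sequence's lowest row, and that the padded length is a multiple of MWIDTH. These assumptions match the 0xa6fc / 0x0180 example in the text.

```python
def coefficient_rows(sequences, num_tap, mwidth):
    """Return {row_address: row_value} for a list of coefficient sequences.

    Each sequence is an integer whose LSB is the first coefficient.
    """
    # Pad the sequence length up to the next power of two.
    padded = 1
    while padded < num_tap:
        padded *= 2
    if padded % mwidth:
        raise ValueError("padded length must be a multiple of MWIDTH in this sketch")
    rows_per_seq = padded // mwidth
    mask = (1 << mwidth) - 1
    rows = {}
    for seq_index, value in enumerate(sequences):
        if value >> num_tap:
            raise ValueError("sequence has more than NUM_TAP bits")
        base = seq_index * rows_per_seq   # assumed layout: sequences stored back to back
        for r in range(rows_per_seq):
            # Bits above NUM_TAP are zero, so padding appears at the MSB end.
            rows[base + r] = (value >> (r * mwidth)) & mask
    return rows


rows = coefficient_rows([0xA6FC, 0x0180], num_tap=16, mwidth=4)
padded_rows = coefficient_rows([0x6FC], num_tap=12, mwidth=4)
```

With these inputs, `rows` places 0xC, 0xF, 0x6, 0xA in rows 0 to 3 and 0x0, 0x8, 0x1, 0x0 in rows 4 to 7, and the 12-tap case leaves row 3 filled with padding zeros.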

Correlator and Adder/Accumulator Blocks

The Correlator block performs the multiplication operations in Equations 1 and 2. The coefficients are configured by the user to be either unsigned or signed. If unsigned, then the binary coefficient values simply represent {1,0} and the multiplications reduce to either passing the tap values read from memory to the Adder/Accumulator, or passing a zero value. If the coefficients are signed, then the binary coefficients {1,0} represent values of {+1,-1}. If a tap value is multiplied by 1, then the Correlator block does nothing other than pass the tap value read from memory to the Adder/Accumulator. If a tap value is multiplied by -1, then the Correlator block does a two’s complement conversion of the tap value read from memory and passes the result to the Adder/Accumulator, which in turn completes the summation of the correlation sequence to generate the final result.
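The pass, zero, and negate behavior just described can be modelled in software as shown below. This is a behavioral sketch only; the function names and sample values are assumptions for illustration, and Python integer negation stands in for the two’s complement conversion performed in hardware.

```python
def correlator_cell(tap, coeff_bit, signed_coeffs):
    """Multiplier-free product of one tap value and one binary coefficient."""
    if coeff_bit:
        return tap                 # coefficient value 1 (or +1): pass the tap through
    if signed_coeffs:
        return -tap                # coefficient value -1: negate the tap
    return 0                       # unsigned coefficient value 0: pass a zero


def accumulate(taps, coeff_bits, signed_coeffs):
    """Adder/Accumulator: sum the cell outputs over the correlation window."""
    return sum(correlator_cell(t, b, signed_coeffs) for t, b in zip(taps, coeff_bits))


bits = [1, 0, 0, 1]
signed_result = accumulate([5, -3, 7, 2], bits, signed_coeffs=True)     # 5 + 3 - 7 + 2
unsigned_result = accumulate([5, 3, 7, 2], bits, signed_coeffs=False)   # 5 + 0 + 0 + 2
```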

Decimation

The Correlator IP core allows the input data to be oversampled from two to eight times the normal sampling rate. The OS_FACTOR parameter should be set to the correct oversampling rate. When this is done, the core will automatically decimate the amount of data which is included in the correlation operations by the correct amount. For example, if the number of taps is eight and an oversampling rate of two is chosen, then the circuit will correlate the eight coefficient values with the newest input tap data value and the odd numbered tap data values from the past 15 “old” data values. The correlation will look like this:

$$ r = d_1 c_1 + d_3 c_2 + d_5 c_3 + d_7 c_4 + d_9 c_5 + d_{11} c_6 + d_{13} c_7 + d_{15} c_8 \tag{4} $$

The number of data values stored in Tap Memory for a given channel becomes [OS_FACTOR*NUM_TAP], or in this case 16. The number of coefficients per channel is still equal to NUM_TAP.
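The sample selection in Equation 4 can be expressed compactly in software. The sketch below is a reference model under stated assumptions: the function name is hypothetical, `history[0]` is taken to be the newest sample (d1 in Equation 4), and the coefficients are given as already-decoded values rather than stored bits.

```python
def decimated_correlation(history, coeffs, os_factor):
    """Correlate NUM_TAP coefficients against every OS_FACTOR-th stored sample.

    history[0] is assumed to be the newest sample; history must hold
    OS_FACTOR * NUM_TAP values, as described for the Tap Memory.
    """
    num_tap = len(coeffs)
    if len(history) != os_factor * num_tap:
        raise ValueError("history must hold OS_FACTOR * NUM_TAP samples")
    # Keep the newest sample, then every OS_FACTOR-th older sample.
    selected = history[::os_factor]
    return sum(d * c for d, c in zip(selected, coeffs))


# Samples numbered 1..16 so that each value equals its index in Equation 4.
history = list(range(1, 17))
r = decimated_correlation(history, [1] * 8, os_factor=2)   # d1 + d3 + ... + d15
```

With all coefficients set to 1, the result is simply the sum of the odd-numbered samples, which makes it easy to confirm that the even-numbered samples are skipped.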

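[Figure 6 contents: Coefficient Memory, MWIDTH wide]

| Row | Contents | Usage |
|---|---|---|
| Upper locations of EBR | | Unused in this example |
| Row 7 | 0 0 0 0 | Coefficient Sequence 1 |
| Row 6 | 0 0 0 1 | Coefficient Sequence 1 |
| Row 5 | 1 0 0 0 | Coefficient Sequence 1 |
| Row 4 | 0 0 0 0 | Coefficient Sequence 1 |
| Row 3 | 1 0 1 0 | Coefficient Sequence 0 |
| Row 2 | 0 1 1 0 | Coefficient Sequence 0 |
| Row 1 | 1 1 1 1 | Coefficient Sequence 0 |
| Row 0 | 1 1 0 0 | Coefficient Sequence 0 |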


Parameter Descriptions

The parameters used for configuring the Correlator IP core are listed below. The values of these parameters must be set prior to synthesis or functional verification.

Table 1. User Conﬁgurable Parameters

The basic configuration parameters should be selected based on the type of correlation desired. These include parameters 1, 2, 4, 5, 6, 7 and 10. The remaining parameters 3, 8 and 9 are selected based on the desired performance of the circuit.

For parameter 3, a higher fMAX can be achieved by generating a much smaller circuit (smaller number of correlator cells). However, for long data sequences (number of taps, or “corr_win”), this will mean that many clock cycles are needed for each correlation result to be calculated, resulting in very poor overall data throughput and long latency times. For higher data throughput, and at the expense of a larger and therefore more complicated circuit, a higher number of correlators should be chosen. The Correlator IP core is architected to be highly pipelined, so even for large numbers of correlator cells, the penalty in fMAX is small; however, as the design becomes more complicated, it will eventually reach a point where the fMAX is impacted by routing in the FPGA.

Parameter 8 should be set to 1 for the highest performance circuit. A value of 2 or 3 will result in a smaller, but significantly lower performance design.

Parameter 9 sets the depth of the input FIFO. This improves throughput performance by allowing the next input data sample to be presented to the device while the present correlation result is being calculated. However, care must be used when changing this parameter. If the FIFO depth is set above 1, then the user must ensure that a new data sample will not be presented to the Correlator IP core for the same channel as is presently being serviced; otherwise the new data sample will be written into the core’s internal tap memory and will corrupt the correlation which is already in progress for that channel. If the core has been configured for multiple channels, and input data values for the same channel are never presented to the core adjacent to each other in time, then the FIFO depth can be safely increased beyond 1. For example, if the core is configured for eight channels, and data for each of the eight channels is always presented in sequence, then the FIFO depth may be increased to 2 or 3. However, if the core is configured for one or two channels, or the input data sequences through channels at random, then the FIFO depth should never be increased beyond 1.

| Parameter Number | Parameter | Parameter Description | Input Range | Default Input Value | Parameter Values |
|---|---|---|---|---|---|
| 1 | DWIDTH | Input data width | 1-8 | 4 | - |
| 2 | NUM_TAP | Number of taps | 8-2048 | 16 | - |
| 3 | MWIDTH | Number of correlator cells | Minimum = 1; Maximum = the number of EBR blocks in the target LatticeEC device | 4 | - |
| 4 | NUM_CHAN | Number of channels | 1-256 | 2 | - |
| 5 | DTYPE | Input data type | Signed, unsigned | Unsigned | “UNSIGNED”, “SIGNED” |
| 6 | COMPLEX | Correlation type | Real, complex | Real | Real = 0, Complex = 1 |
| 7 | OS_FACTOR | Oversampling rate | 1-8 | 1 | - |
| 8 | PERFORMANCE | Performance | 1, 2, 3 | 1 | - |
| 9 | FIFO_DEPTH | Input FIFO depth | 1, 2, 3 | 1 | - |
| 10 | NUM_COEF_SEQ | Number of coefficient sequences | 1-256 | NUM_CHAN | - |


Custom Core Configurations

For core configurations that are not available in the Evaluation Packages, please contact your Lattice sales representative to request a custom configuration.

Related Information

For more information regarding core usage and design verification, refer to the Parallel RapidIO Physical Layer Interface IP Core User’s Guide, available on the Lattice web site at www.latticesemi.com.


Appendix for LatticeECP™ and LatticeEC™ Devices

Table 2. Performance and Resource Utilization¹

| Parameter Filename | Parameter Settings | SLICEs | LUTs | Registers | External Pins | sysMEM™ EBRs | fMAX |
|---|---|---|---|---|---|---|---|
| Corr_8bit_e2_1_001.lpc | See Table 3 | 212 | 140 | 310 | 41 | 5 | 212.54 MHz |

1. Performance and utilization characteristics are generated using LFEC20E-5F672C in Lattice's ispLEVER® 4.2 software. When using this IP core in a different density, speed, or grade within the LatticeECP/EC family, performance may vary.

Supplied Netlist Configurations

The Ordering Part Number (OPN) for the Correlator IP Core on LatticeECP/EC devices is CORR-8BIT-E2-N1 (for all configurations of the netlist package). Table 3 lists the evaluation netlists that can be downloaded from the Lattice web site at www.latticesemi.com. To load the preset parameters for this core, click on the “Load Parameters” button inside the IP Manager tool. Make sure that you are looking for a file inside this core's directory location. The Lattice Parameter Configuration files (.lpc) are located inside this directory.

Table 3. Parameter Settings of the Evaluation Packages

| Input Data Width | # of Taps | # of Correlators | # of Channels | Input Data Type | Correlation Type | Over-sampling Rate | Performance | Input FIFO Depth | # of Coefficient Sequences |
|---|---|---|---|---|---|---|---|---|---|
| 4 | 16 | 4 | 2 | Unsigned | Real | 1 | 1 | 1 | 2 |
