Downsample Filter (DF)

(Firmware coder: Jon Wilson)

Structure

The Downsample Filter (DF) module takes the raw, full-sampling-rate input samples from the DCRC's Analog-to-Digital Converters (ADCs), downsamples them, and sends the results to the Synchronizer module. The inputs are received from 5 separate streams, which are in a one-to-one correspondence with the 5 ADC chips. There are 3 phonon ADC chips (each of which processes 4 phonon channels) and 2 charge ADC chips (each of which processes 2 charge channels). The processing is done in 5 DF submodules, one for each of the 5 inputs. Each submodule sends its output directly as a streaming output to the Synchronizer module. While the phonon and charge DF submodules are slightly different, they both employ a Cascaded Integrator-Comb (CIC) filter to reduce aliased noise.

Data Inputs

There are 5 separate inputs to the DF module, one from each of the 3 phonon ADC chips and from each of the 2 charge ADC chips. Each ADC input is processed in a separate DF submodule. Each phonon ADC output includes 4 parallel channels, while each charge ADC output includes 2 parallel channels.

  • The 5 inputs contain the following information:
    • phonon channels 1-4
    • phonon channels 5-8
    • phonon channels 9-12
    • charge channels 1-2
    • charge channels 3-4
  • Unlike the other streaming interfaces used in the L1 trigger firmware, the input is not formatted as an Avalon-ST stream. The format is described in more detail on the ADC module page.
    • Each input includes the following signals:
      • Data signal:
        • 64 bits wide for each phonon input (4 phonon channels with 16 bits each, in parallel)
        • 32 bits wide for each charge input (2 charge channels with 16 bits each, in parallel)
      • 1 bit wide "data ready" signal
    • Note that the input samples from the ADCs arrive in groups: 3 groups of 4 phonon channels each at a rate of 625 kHz, and 2 groups of 2 charge channels each at a rate of 2.5 MHz. The input samples are presented on parallel signals, and are accompanied by a "data ready" signal. When the "data ready" signal for each input is asserted, the data for that input is available on the next rising edge of the 100 MHz clock. The data may not be available at any later time, so this module must store it in an internal register until it is used.
    • Note that the inputs of the phonon DF submodules and the charge DF submodules arrive at different rates, so they are inherently not synchronized. The individual inputs of the phonon DF submodules are approximately synchronized, but they may arrive as much as a few clock cycles apart. The same is true for the inputs of the charge DF submodules. The ADC samples to any one DF submodule arrive simultaneously. The alignment of all the channels will be done in the Synchronizer module.
Configuration and Control Inputs
  • 100 MHz clock input
  • 1 reset input
Outputs
There are 5 separate streaming outputs, which go to the Synchronizer module. Each DF submodule pushes out the outputs from the corresponding input channels.
  • The 5 outputs contain the following information:
    • phonon channels 1-4
    • phonon channels 5-8
    • phonon channels 9-12
    • charge channels 1-2
    • charge channels 3-4
  • The outputs are formatted as 5 separate multiple-channel packetized Avalon-ST streams.
    • Each output stream includes the following signals (see table 5-1 of the Avalon-ST specification for detailed descriptions of the signal roles):
      • Data signal:
        • 28 bit wide data signal for each phonon output (there are 3 separate phonon streams total, each of which is used to carry the 4 28-bit words per stream in serial)
        • 34 bit wide data signal for each charge output (there are 2 separate charge streams total, each of which is used to carry the 2 34-bit words per stream in serial)
      • Channel-number signal:
        • 2 bit wide channel-number signal for each phonon output (to specify which of the 4 channels each datum belongs to)
        • 1 bit wide channel-number signal for each charge output (to specify which of the 2 channels each datum belongs to)
      • 1 bit wide valid signal
      • 1 bit wide start-of-packet signal
      • 1 bit wide end-of-packet signal
    • Packets will be sent at a rate of 39.0625 kHz.
    • Since each output carries multiple channels, exactly one datum from each submodule will be sent in series as a single Avalon-ST packet.
Internal Registers
This module requires some information to be retained from one time step to the next. It will be stored in internal registers which are not visible or accessible from outside the module. They are mentioned and named here in order to simplify the detailed description of the module's functionality given below.
  • Current input: 64 bits (for phonon submodules) or 32 bits (for charge submodules)
    (This shift register stores all the input bits from the ADC on an asserted "data ready" signal. This allows us to serialize the data before passing it to the CIC IP Core.)
Tuned parameters
There are a few parameters which must be specified to define our CIC filters. More detail about some of the parameter choices can be found below. These parameters have been chosen in a qualitative, heuristic fashion. They may need to be re-tuned once early noise and pulse data from SNOLAB is available.
  • R (CIC filter decimation ratio) =
    • 16 (for phonon channels)
    • 64 (for charge channels)
  • N (number of stages in CIC filter) = 3
  • M (number of samples per CIC filter comb stage) = 1
Note that these are the only parameters necessary to specify a generic CIC filter. Since we are using Altera's implementation of a CIC filter, some other parameters must also be specified. The full list of parameters is given in the CIC IP Core Parameters section at the end of this page.
Functionality

Brief functional description:

The full sampling rate of the phonon channels (625 kHz) and charge channels (2.5 MHz) needs to be reduced in order to maximize the sensitivity of the trigger. This sensitivity optimization involves a tradeoff between a high sampling rate (which allows access to higher-frequency information) and a longer baseline for the Finite Impulse Response (FIR) module (which allows access to lower-frequency information). Downsampling by a larger factor reduces the sampling rate and lengthens the baseline. Downsampling by a smaller factor increases the sampling rate and shortens the baseline. This module downsamples both the phonon and charge channels to 39.0625 kHz. The choice of 39.0625 kHz seems, judging from Soudan detector pulses, to provide a good balance between low-frequency and high-frequency information, leading to a sensitive trigger while conserving FPGA resources.

Any downsampling procedure will result in aliasing of high-frequency components down to lower frequencies. Our signals mostly occupy low frequencies, and the higher frequencies are dominated by noise. Therefore we want to filter out the high-frequency components before downsampling so that our signals are not swamped by aliased noise.

To do this we have chosen a Cascaded Integrator-Comb (CIC) filter. For more details on how CIC filters work and are implemented, we recommend starting with the Wikipedia article on the cascaded integrator-comb filter. The CIC filter is chosen for certain desirable properties:

  • No multiplication is needed, only addition, which conserves FPGA resources.
  • The zeros of the frequency response are located at multiples of the output (downsampled) sampling rate, which are exactly the frequencies that alias onto zero frequency, so the amount of aliased noise at frequencies close to zero falls rapidly to nothing.

A CIC filter is a cascade of moving-average filters (a moving average of a moving average of a moving average) in the time domain. In the case of one of our phonon submodules, we have $R=16$, $N=3$, $M=1$. Thus each value of the output is a triple sum of recent input samples: \[y[n] = \sum_{j=0}^{15} \sum_{k=0}^{15} \sum_{l=0}^{15} x[n - j - k - l] \text{, where}\]

  • $j, k, l=0, 1, 2, \ldots, RM-1$,
  • $x[n]$ is the newest input sample in time, and $x[n-(RM-1)-(RM-1)-(RM-1)]$ is the oldest input sample used.
So, the CIC filter uses the $N(RM-1)+1$ most recent inputs (46 in this case) to compute each output. Note: The output is not fully valid until there have been at least 46 input samples. The first two outputs after initialization or reset are not fully valid, since they occur after only 16 and 32 input samples, respectively. The third output is valid, since it comes after 48 input samples. In the case of the charge submodules, the output becomes valid after 190 input samples ($N(RM-1) + 1 = 3(64 \cdot 1 - 1) + 1 = 190$). Just as for the phonon submodules, this requirement is satisfied by the third output, which comes after 192 input samples.
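As a cross-check on the triple-sum formula and the startup behavior described above, here is a minimal Python sketch. It is not the firmware and not Altera's implementation. It evaluates the sum directly and also in the textbook integrator-comb form, assuming all samples before the first input are zero. The function names and the constant test input are hypothetical.

```python
from itertools import product

R, N, M = 16, 3, 1  # phonon values from this page


def cic_direct(x, n):
    """Evaluate y[n] as the N-fold moving sum; samples before x[0] count as 0."""
    total = 0
    for lags in product(range(R * M), repeat=N):
        idx = n - sum(lags)
        if idx >= 0:
            total += x[idx]
    return total


def cic_integrator_comb(x):
    """Same filter in integrator-comb form: N integrators at the input rate,
    keep every R-th sample, then N combs (delay M) at the output rate."""
    stage = list(x)
    for _ in range(N):
        acc, out = 0, []
        for v in stage:
            acc += v
            out.append(acc)
        stage = out
    stage = stage[R - 1::R]  # outputs after 16, 32, 48, ... input samples
    for _ in range(N):
        stage = [v - (stage[i - M] if i >= M else 0) for i, v in enumerate(stage)]
    return stage


# Constant input of 1: the first two outputs are only partially filled.
startup = cic_integrator_comb([1] * 64)  # -> [816, 3536, 4096, 4096]
```

From the third output onward the result equals $(RM)^N = 4096$, the gain of the filter for a constant input, while the first two outputs are smaller because fewer than 46 samples have arrived.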

To understand the performance of this cascaded moving average as an anti-aliasing filter, we should look at the frequency response function. The frequency response function tells us whether this filter adequately attenuates noisy high frequencies while transmitting the desired low frequencies. In the frequency domain, each value of the output $\tilde{y}(\omega)$ is derived from multiplying the input $\tilde{x}(\omega)$ by the filter's frequency response function \(H(\omega)\): \[\tilde{y}(\omega) = H(\omega) \tilde{x}(\omega), \text{where}\]

  • $\tilde{x}(\omega)$ is the Fourier transform of the input sequence $x[n]$ at angular frequency \(\omega\) in radians per second,
  • $\tilde{y}(\omega)$ is the Fourier transform of the output sequence $y[n]$ at angular frequency \(\omega\) in radians per second, and
  • $H(\omega)$ is the frequency response function at angular frequency \(\omega\), \[H(\omega) = \left(\frac{1 - z^{-RM}}{1 - z^{-1}}\right)^N, \text{where}\]
    • \(z = e^{i \omega T}\), where \(\omega\) is the angular frequency in radians per second and \(T\) is the time period between input samples,
    • \(R\) is the downsample fraction,
    • \(M\) is the comb (differential) delay, in samples, of each integrator-comb stage, and
    • \(N\) is the number of integrator-comb stages in the cascade.
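The frequency response defined above can be evaluated numerically. The following is a small sketch using only the Python standard library. The function name `cic_response` and the normalization to unit gain at zero frequency are choices made here for illustration. It checks that the response vanishes at multiples of the output sampling rate for a phonon submodule.

```python
import cmath
import math


def cic_response(f_hz, fs_hz, R, N=3, M=1):
    """|H| of the CIC filter at frequency f_hz, normalized so that |H(0)| = 1."""
    z = cmath.exp(2j * math.pi * f_hz / fs_hz)  # z = exp(i*omega*T), T = 1/fs
    if abs(z - 1) < 1e-12:
        return 1.0  # limit of the ratio at zero frequency (and multiples of fs)
    h = ((1 - z ** (-R * M)) / (1 - z ** (-1))) ** N
    return abs(h) / (R * M) ** N


FS_PHONON = 625e3          # phonon input sampling rate in Hz
f_out = FS_PHONON / 16     # 39.0625 kHz output rate

# Frequencies that alias onto zero after downsampling: response is ~0 there.
zeros = [cic_response(k * f_out, FS_PHONON, R=16) for k in (1, 2, 3)]

# Passband droop at the highest output frequency (f_out / 2): roughly 0.26.
edge = cic_response(f_out / 2, FS_PHONON, R=16)
```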

This module is implemented as an Altera CIC IP Core, which requires certain parameters to be set at design time in order to specify the exact functionality that is needed from a fairly general-purpose piece of code. The list of parameters and the values we use are given later in this document (CIC IP Core parameters).

This module also includes a simple wrapper to adapt the inputs and outputs of the CIC IP Core to match our needs. The design of this wrapper is detailed later on this page.

Our choice of the parameters \(R\), \(M\), and \(N\) is somewhat qualitative and heuristic, since we do not yet know the exact characteristics of the pulses we will record in the SNOLAB detectors. We anticipate re-visiting these parameter choices once we have early data from SNOLAB. At that point it will be simple to take random-triggered data and pulse data and re-optimize the CIC parameters based on the noise and signal. Currently, we envision the following values:

Downsample fraction \(R\)
Since the phonon rate (625 kHz) and charge rate (2.5 MHz) are different, we have two different downsample fractions \(R\) to bring them both down to 39.0625 kHz. The phonon submodules use \(R = 16\) (\(\frac{625~\text{kHz}}{16} = 39.0625~\text{kHz}\)), and the charge submodules use \(R = 64\) (\(\frac{2.5~\text{MHz}}{64} = 39.0625~\text{kHz}\)).
Number of stages \(N\)
In order to balance suppression of aliasing against passband droop (attenuation of frequencies we want to keep), we choose \(N = 3\). Larger \(N\) results in more passband droop, while smaller \(N\) results in more aliased noise.
Comb delay \(M\)
We choose \(M = 1\). If we chose \(M = 2\), then there would be a zero in the frequency response at the maximum frequency after downsampling, which would effectively halve the passband. Since there is still useful signal information at the maximum frequencies after downsampling, putting a zero at this spot in the frequency response would be undesirable and would negatively impact our sensitivity and thresholds.
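To make the trade-offs behind \(N\) and \(M\) concrete, the sketch below uses the closed-form magnitude \(|H| = \left|\frac{\sin(\pi f R M / f_s)}{R M \sin(\pi f / f_s)}\right|^N\), which follows from the frequency response given earlier, normalized to unit gain at zero frequency. The function name and the probe frequencies (one quarter and three quarters of the output rate) are hypothetical choices for illustration, not values used in tuning.

```python
import math


def cic_gain(f_over_fout, R, N, M):
    """Normalized |H| at a frequency given in units of the output sample rate."""
    x = math.pi * f_over_fout / R  # equals pi * f / fs_in
    num = math.sin(x * R * M)
    den = R * M * math.sin(x)
    return abs(num / den) ** N if den != 0 else 1.0


R = 16  # phonon decimation ratio

# Effect of N (with M = 1): gain at f_out/4 (wanted signal) versus gain at
# 3*f_out/4, a frequency that aliases onto f_out/4 after downsampling.
for N in range(1, 6):
    wanted = cic_gain(0.25, R, N, 1)
    aliased = cic_gain(0.75, R, N, 1)
    print(f"N={N}: gain at f_out/4 = {wanted:.3f}, at 3*f_out/4 = {aliased:.4f}")

# Effect of M (with N = 3): gain at the highest output frequency, f_out/2.
nyquist_m1 = cic_gain(0.5, R, 3, 1)  # reduced but nonzero
nyquist_m2 = cic_gain(0.5, R, 3, 2)  # numerically zero: M = 2 puts a zero here
```

Both printed columns shrink as \(N\) grows, which is the droop-versus-aliasing balance described above.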

The output of a CIC filter necessarily has more bits than the input. The number of additional bits is \(N \log_2 (RM)\). For each phonon channel, this is 12 additional bits for a total of 28 bits. For the charge channels, this is 18 additional bits for a total of 34 bits.
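The output widths and output rates quoted above can be reproduced with a few lines of arithmetic. In this sketch the function name is hypothetical, and the `ceil` is an assumption that only matters when \(RM\) is not a power of two, which is not the case here.

```python
import math


def cic_output_bits(input_bits, R, N=3, M=1):
    """Input width plus N*log2(R*M) bits of growth (rounded up, by assumption)."""
    return input_bits + math.ceil(N * math.log2(R * M))


phonon_bits = cic_output_bits(16, R=16)  # 16 + 3*4 = 28
charge_bits = cic_output_bits(16, R=64)  # 16 + 3*6 = 34

phonon_rate_hz = 625e3 / 16  # 39062.5 Hz
charge_rate_hz = 2.5e6 / 64  # 39062.5 Hz
```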

More detailed description (including data transfer information):

We will instantiate five copies of the CIC IP Core, one for each of the 5 DF submodules: three will handle the three phonon ADC inputs, and the remaining two will handle the two charge ADC inputs. Each copy of the CIC IP Core will be wrapped by a single copy of the wrapper code.

The CIC IP Core is configured in such a way that it expects its multi-channel inputs to arrive in serial (e.g. phonon channel 0 on clock cycle 1, phonon channel 1 on clock cycle 2, phonon channel 2 on clock cycle 3, and phonon channel 3 on clock cycle 4). This configuration was chosen to conserve FPGA resources. The inputs from the ADCs, on the other hand, arrive in parallel (e.g. all four phonon channels on clock cycle 1). The wrapper's primary role is to serialize the ADC input data.

Although the inputs from the ADCs are not formally Avalon-ST streams, they may be treated as such by treating the "data ready" signal as the Avalon-ST "valid" signal and storing the input data in a register. This translation is accomplished by the wrapper. The CIC IP Core input requires several signals:

  • 16 bit wide data signal (for both phonon and charge submodules),
  • 1 bit wide valid signal,
  • 1 bit wide start-of-packet signal,
  • 1 bit wide end-of-packet signal, and
  • 2 bit wide error signal.
The CIC IP Core also provides a "ready" bit to tell its input source whether or not it is ready to accept input, which is ignored by the wrapper because the CIC IP Core will always be ready by the time another input arrives. This "ready" bit should not be confused with the "data ready" bit in the ADC input.
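The spacing between inputs can be worked out from the rates given on this page. The sketch below counts 100 MHz clock cycles between successive ADC inputs and compares them with the cycles needed to serialize one input. It does not model the internal latency of the CIC IP Core, so it only illustrates how much slack is available. The function name is hypothetical.

```python
CLOCK_HZ = 100_000_000  # 100 MHz firmware clock


def cycle_budget(sample_rate_hz, n_channels):
    """Return (cycles between inputs, cycles spent serializing, idle cycles)."""
    between = CLOCK_HZ // sample_rate_hz
    return between, n_channels, between - n_channels


phonon = cycle_budget(625_000, 4)    # (160, 4, 156)
charge = cycle_budget(2_500_000, 2)  # (40, 2, 38)
```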

Note that the CIC IP Core does not have a "channel" input, even though it accepts multi-channel input data. We do not know the reason for this aspect of the design, but we have ascertained by reading the documentation and by trial and error that the CIC IP Core expects the data for its input channels to always arrive in order, never skipping nor duplicating any input data. The wrapper accommodates this requirement.

The wrapper uses a shift register to serialize the parallel inputs. The shift register has a depth of 4 steps for phonon submodules (2 steps for charge submodules) and a width of 19 bits: 16 data bits followed by one bit each for the valid, start-of-packet, and end-of-packet signals. When the wrapper sees the "data ready" signal set on a rising clock edge, the data for the first input channel is put into the first 16 bits of the first step of the shift register, the data for the second input channel is put into the first 16 bits of the second step, and so on. The 17th bit of every step is set to 1 to supply the valid signal. The 18th bit is set to 1 in the first step and to 0 in all other steps to supply the start-of-packet signal. The 19th bit is set to 1 in the last step and to 0 in all other steps to supply the end-of-packet signal.

On each clock cycle, the shift register is shifted by one step. The current contents of the "first step" are discarded, all the other steps are shifted forward by one, and the last step is filled with all zeros.

The output of the first step of the shift register is connected directly to the data, valid, start-of-packet, and end-of-packet signals input to the CIC IP Core. In this way, a single, four-datum, parallel input from the ADCs is serialized and delivered to the CIC IP Core along with the required Avalon-ST control signals over the course of four clock cycles (or two clock cycles for charge submodules).
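The serializing shift register described above can be modeled in a few lines of Python. This is a behavioral sketch, not the firmware. The class name is hypothetical, each 19-bit step is represented as a `(data, valid, sop, eop)` tuple, and it assumes that the first channel occupies the least significant 16 bits of the parallel ADC word (the actual bit ordering is defined on the ADC module page). Cycle-level timing is simplified: the data is loaded on the same edge on which "data ready" is seen.

```python
from collections import deque

EMPTY = (0, 0, 0, 0)  # (data, valid, sop, eop): an all-zero step


class Serializer:
    """Behavioral model of the wrapper's input shift register."""

    def __init__(self, n_channels):
        self.n = n_channels  # 4 for phonon, 2 for charge
        self.steps = deque([EMPTY] * n_channels)

    def clock(self, data_ready=False, parallel_word=0):
        """Advance one rising clock edge; return the step seen by the CIC core."""
        if data_ready:
            # Load one step per channel, with valid set on every step,
            # sop on the first step only, and eop on the last step only.
            # ASSUMPTION: channel 0 sits in the least significant 16 bits.
            self.steps = deque(
                ((parallel_word >> (16 * ch)) & 0xFFFF, 1,
                 int(ch == 0), int(ch == self.n - 1))
                for ch in range(self.n)
            )
        else:
            # Discard the first step and shift zeros into the last step.
            self.steps.popleft()
            self.steps.append(EMPTY)
        return self.steps[0]


phonon = Serializer(4)
beats = [phonon.clock(True, 0x4444_3333_2222_1111)]
beats += [phonon.clock() for _ in range(4)]
# beats holds four valid steps followed by one all-zero (invalid) step
```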

The 2 bit wide error signal input to the CIC IP Core is always set to "00", because we do not need to tell the CIC IP Core about any error conditions.

The output of the CIC IP Core is also adapted by the wrapper, although the output part of the wrapper is purely passive. The data, valid, channel, start-of-packet, and end-of-packet signals are passed straight through the wrapper. The 2 bit wide error signal output by the CIC IP Core is ignored, as it does not give us any useful error information. Lastly, the output interface of the CIC IP Core takes a "ready" signal as an input from the downstream module. We do not use this backpressure feature, so the wrapper always sets this signal to 1.

The following description is at a very high level because we are not implementing the CIC filter ourselves. Instead, we are using the CIC IP Core from Altera. Since it is closed source, we only describe in general terms what is done.

Step-by-step

On each rising clock edge, do the following:
  1. Check the 1 bit wide "data ready" signal that indicates whether the ADC input is ready.
    • If it is not set, then do nothing, and wait for the next rising clock edge.
    • If it is set, then continue to step 2.
  2. Store the data from the ADC input samples, along with the required Avalon-ST control bits as described above, into the "current input" internal shift register, the output of which is connected to the input lines of the CIC IP Core.
  3. On each clock cycle, advance the "current input" shift register, shifting in zeros to fill the back end of the shift register.
  4. The CIC IP Core will then receive and process the data. On every \(R\)th input (16th input for phonon DF submodules and 64th input for charge DF submodules), it will produce an output packet. The output packet contains all channels (4 for phonon DF submodules and 2 for charge DF submodules), but they are output in serial rather than parallel. In more detail, it goes as follows:
    • For each phonon DF submodule do the following on successive clock cycles (a total of 5 clock cycles):
      1. Set the 28 data bits to the output data for phonon channel 1. Concurrently, set the 2 channel-number bits to 00, set the valid bit, and set the start-of-packet bit.
      2. Set the 28 data bits to represent phonon channel 2. Concurrently, set the 2 channel-number bits to 01, and unset the start-of-packet bit.
      3. Set the 28 data bits to represent phonon channel 3. Concurrently, set the 2 channel-number bits to 10.
      4. Set the 28 data bits to represent phonon channel 4. Concurrently, set the 2 channel-number bits to 11, and set the end-of-packet bit.
      5. Unset the valid bit, and unset the end-of-packet bit.
    • For each charge DF submodule do the following on successive clock cycles (a total of 3 clock cycles):
      1. Set the 34 data bits to the output data for charge channel 1. Concurrently, set the 1 channel-number bit to 0, set the valid bit, and set the start-of-packet bit.
      2. Set the 34 data bits to represent charge channel 2. Concurrently, set the 1 channel-number bit to 1, unset the start-of-packet bit, and set the end-of-packet bit.
      3. Unset the valid bit, and unset the end-of-packet bit.
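The output sequence in step 4 can be summarized as a small Python model of one packet as seen by the Synchronizer. This is an illustrative sketch of the signal pattern only, since the actual sequencing is done inside the CIC IP Core. The function names are hypothetical, and the data and channel values during the final idle cycle are not specified on this page, so they are set to 0 here as placeholders.

```python
def output_packet(samples):
    """Yield (data, channel, valid, sop, eop) for each clock cycle of one packet."""
    last = len(samples) - 1
    for channel, value in enumerate(samples):
        # valid on every datum, sop on the first, eop on the last
        yield (value, channel, 1, int(channel == 0), int(channel == last))
    # Final cycle: valid and end-of-packet are cleared.
    # ASSUMPTION: data and channel are don't-care here; 0 is a placeholder.
    yield (0, 0, 0, 0, 0)


def collect_packet(beats):
    """Receiver-side view: gather the valid cycles into {channel: data}."""
    return {channel: data for data, channel, valid, _sop, _eop in beats if valid}


phonon_beats = list(output_packet([10, 20, 30, 40]))  # 5 clock cycles
charge_beats = list(output_packet([7, 8]))            # 3 clock cycles
phonon_words = collect_packet(phonon_beats)
```

Channel number 0 in this model corresponds to the first channel of the submodule (for example, phonon channel 1 with channel-number bits 00).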

Reset Signal:

When the reset signal is asserted,
  • Set the "current input" internal register to zero.
When the reset signal is deasserted,
  • Go back to Step 1 of the detailed description.
Notes
Testing Plan
  • See here for the overall testing plan.
  • See here for the testing plan for this module.

CIC IP Core Parameters

Since we are using Altera's CIC IP Core for this module, we need to specify which of its parameters we set and the values we use. They are listed here:

| CIC Filter Specification Parameter | Value |
| --- | --- |
| Filter type | Decimator |
| Number of stages (\(N\)) | 3 |
| Differential delay (\(M\)) | 1 |
| Enable variable rate change factor | Off |
| Rate change factor (\(R\)) | 16 (phonon) or 64 (charge) |
| Number of interfaces | 1 |
| Number of channels per interface | 4 (phonon) or 2 (charge) |
| Input data width | 16 bits |
| Output rounding options | None |