Analysis of Four Parallel and Five Parallel Linear Phase FIR Digital Filter Using FFA Algorithm

¹H. Joseph Prabhakar Williams and ²K.R. Shankar Kumar
¹Anna University, Chennai, Tamil Nadu, India
²Department of ECE, Ranganathan Engineering College, REC Kalvi Nagar, Viraliyur (Post), Coimbatore, Tamil Nadu, India

Abstract: This study presents an architectural approach to the design of low-area, high-speed linear-phase Finite Impulse Response (FIR) digital filters. FIR digital filters are used in DSP by virtue of their linear phase, lower finite-precision error, stability and efficient implementation. In the proposed architecture, we introduce a five-parallel linear-phase FIR filter that achieves higher speed and lower area than the four-parallel FIR filter. The proposed architecture offers lower area and higher speed than the best existing linear-phase FIR filter implementations in the literature and has been synthesized, implemented and tested on a Spartan-3 xc3s200-5pq208 Field-Programmable Gate Array (FPGA).

Keywords: Fast FIR Algorithms (FFAs), high speed filter, linear-phase FIR filter, parallel FIR filter

INTRODUCTION

With the explosive growth of multimedia applications, the demand for high-performance and low-area Digital Signal Processing (DSP) keeps increasing. The FIR digital filter is one of the most widely used fundamental building blocks in DSP systems, ranging from wireless communications to video and image processing. Some applications, such as video processing, need the FIR filter to operate at high frequencies, whereas others, such as the multiple-input-multiple-output systems used in cellular wireless communication, require high throughput with a low-area circuit. Furthermore, when narrow transition band characteristics are required, a much higher filter order is unavoidable (Cheng and Parhi, 2005). This study discusses parallel processing in the digital FIR filter. Because the hardware implementation cost increases linearly with the block size L, the parallel processing technique loses its advantage in practice. A few papers have proposed ways to reduce the complexity of the parallel FIR filter (Acha, 1989; Parker and Parhi, 1997). In Mou and Duhamel (1991), polyphase decomposition is the main tool: small-sized parallel FIR filter structures are derived first and larger block-sized ones are then constructed by cascading or iterating the small-sized parallel FIR filtering blocks.

Fast FIR Algorithms (FFAs), introduced in Parhi (1999), can implement an L-parallel filter using approximately (2L-1) sub filter blocks, each of length N/L. This reduces the required number of multipliers from L×N to (2N-N/L). In Chung and Parhi (2002), fast linear convolution is used to develop the small-sized filtering structures and a long convolution is then decomposed into several short convolutions, i.e., larger block-sized filtering structures can be constructed through iterations of the small-sized filtering structures. However, neither category of methods takes the symmetry of coefficients into account for symmetric convolutions, although doing so can lead to a significant saving in hardware cost. Previously, the design for symmetric convolutions of even length has been investigated (Tsao and Choi, 2010). In this study, we discuss symmetric convolutions of odd length and provide new parallel FIR digital filter architectures based on an advantageous polyphase decomposition, which further reduces the number of multipliers required in the sub filter section, compared with the existing FFA fast parallel FIR filter structures, by exploiting the inherent symmetry of the coefficients. The study is organized as follows. FFAs are briefly reviewed and the proposed parallel FIR filter architectures are presented. The complexity comparison and the conclusions follow.
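The multiplier counts quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, N is assumed to be a multiple of L, and the (2L-1) figure is the approximation quoted in the text, so actual structures (such as the six-block three-parallel structure discussed later) may use a different number of sub filters.

```python
def multipliers_traditional(n_taps: int, block_size: int) -> int:
    """Multipliers in a traditional L-parallel FIR filter: L x N."""
    return block_size * n_taps


def multipliers_ffa(n_taps: int, block_size: int) -> int:
    """Approximate FFA count: (2L - 1) sub filters of length N/L, i.e. 2N - N/L."""
    if n_taps % block_size != 0:
        raise ValueError("this sketch assumes N is a multiple of L")
    return (2 * block_size - 1) * (n_taps // block_size)


# Example with an assumed filter length of N = 24 taps
N = 24
for L in (2, 3, 4):
    print(f"L={L}: traditional={multipliers_traditional(N, L)}, "
          f"FFA (approx.)={multipliers_ffa(N, L)}")
```

The saving grows with the block size because the traditional count scales with L while the approximate FFA count stays below 2N.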

METHODOLOGY

Fast FIR Algorithms (FFA): Consider an N-tap FIR filter that can be expressed in the general form as:

$$y(n) = \sum_{i=0}^{N-1} h(i)x(n-i), \ n = 0, 1, 2, \ldots \infty$$
where \( x(n) \) is an infinite-length input sequence and \( h(i) \) are the length-\( N \) FIR filter coefficients. The traditional \( L \)-parallel FIR filter can then be derived using polyphase decomposition:

\[
X_q(z) = \sum_{k=0}^{\infty} z^{-k} x(Lk + q)
\]

\[
H_r(z) = \sum_{k=0}^{N/L-1} z^{-k} h(Lk + r)
\]

\[
Y_p(z) = \sum_{k=0}^{\infty} z^{-k} y(Lk + p)
\]

for \( p, q \) and \( r = 0, 1, 2, \ldots, L-1 \).
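The polyphase decomposition above can be illustrated with a small numerical sketch. It assumes finite input sequences and pure-Python lists; the helper names `convolve`, `polyphase` and `parallel_fir` are hypothetical and not taken from the paper. Each product of an input phase \( X_q \) and a filter phase \( H_r \) contributes to output phase \( (q + r) \bmod L \), with one block delay when \( q + r \geq L \).

```python
def convolve(a, b):
    """Plain linear convolution of two finite sequences."""
    out = [0.0] * max(len(a) + len(b) - 1, 0)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def polyphase(seq, L):
    """Split seq into L phases: phase q holds seq[L*k + q]."""
    return [list(seq[q::L]) for q in range(L)]


def parallel_fir(x, h, L):
    """Traditional L-parallel FIR filter using all L*L sub filter products."""
    X, H = polyphase(x, L), polyphase(h, L)
    n_out = len(x) + len(h) - 1
    n_blocks = -(-n_out // L)  # ceiling division
    Y = [[0.0] * n_blocks for _ in range(L)]
    for q in range(L):
        for r in range(L):
            # Output phase and block delay (the z^{-1} term) of this product
            p, delay = (q + r) % L, (q + r) // L
            for k, value in enumerate(convolve(X[q], H[r])):
                Y[p][k + delay] += value
    # Interleave the L output phases back into y(n)
    return [Y[n % L][n // L] for n in range(n_out)]


x = [1, 2, 3, 4, 5, 6, 7]
h = [1, -2, 3, 0.5, 2]
reference = convolve(x, h)
for L in (2, 3, 4):
    y = parallel_fir(x, h, L)
    print(L, all(abs(a - b) < 1e-9 for a, b in zip(y, reference)))
```

This direct form needs \( L^2 \) sub filter products per block, which is the cost that the FFA structures reduce.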

To exploit the symmetry of coefficients, the main idea is to manipulate the polyphase decomposition to obtain as many sub filter blocks with symmetric coefficients as possible, so that half the number of multipliers in a single sub filter block can serve the multiplications of all taps (Cheng and Parhi, 2004).

The two-parallel FFA structure naturally benefits symmetric convolutions of odd length. For a set of odd-length symmetric coefficients, two out of three sub filters, i.e., \( H_0 \) and \( H_1 \), contain symmetric coefficients, as shown in Fig. 1 and 2 (Cheng and Parhi, 2007).
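As a hedged illustration, the sketch below implements the standard textbook form of the two-parallel FFA, \( Y_0 = H_0X_0 + z^{-1}H_1X_1 \) and \( Y_1 = (H_0+H_1)(X_0+X_1) - H_0X_0 - H_1X_1 \), where \( z^{-1} \) is one block delay. It is assumed, not confirmed by the text, that Fig. 1 follows this form; the helper names and the 7-tap coefficient set are hypothetical.

```python
def convolve(a, b):
    """Plain linear convolution of two finite sequences."""
    out = [0.0] * max(len(a) + len(b) - 1, 0)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def add(a, b, sign=1.0):
    """Return a + sign * b, zero-padding the shorter sequence."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0.0) + sign * (b[i] if i < len(b) else 0.0)
            for i in range(n)]


def is_symmetric(c):
    return list(c) == list(c)[::-1]


def two_parallel_ffa(x, h):
    """Two-parallel FFA: three sub filter products instead of four."""
    X0, X1 = x[0::2], x[1::2]
    H0, H1 = h[0::2], h[1::2]
    u = convolve(H0, X0)                    # H0 * X0
    v = convolve(H1, X1)                    # H1 * X1
    w = convolve(add(H0, H1), add(X0, X1))  # (H0 + H1) * (X0 + X1)
    y0 = add(u, [0.0] + v)                  # H0 X0 + z^{-1} H1 X1
    y1 = add(add(w, u, -1.0), v, -1.0)      # w - u - v
    phases = (y0, y1)
    n_out = len(x) + len(h) - 1
    return [phases[n % 2][n // 2] if n // 2 < len(phases[n % 2]) else 0.0
            for n in range(n_out)]


h = [1, 2, 3, 4, 3, 2, 1]   # assumed odd-length symmetric example
x = [1, -1, 2, 0, 3, 5, -2, 4]
print(is_symmetric(h[0::2]), is_symmetric(h[1::2]))  # H0 and H1
print(is_symmetric(add(h[0::2], h[1::2])))           # H0 + H1
```

For this example, \( H_0 \) and \( H_1 \) are symmetric while \( H_0 + H_1 \) is not, which matches the "two out of three" statement above.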

As the given example shows, after applying the existing structure 3A in Fig. 3, four out of six sub filter blocks, including \( H_1, H_0+H_2 \) and \( H_0+H_1+H_2 \), have symmetric coefficients, which means that a single sub filter block can be realized as in Fig. 4 with only half the number of multipliers. Each multiplier output corresponds to two taps, except the middle one. Note that the transposed direct-form FIR filter is employed. Compared with the existing FFA three-parallel FIR filter structure, the proposed structure leads to two more sub filter blocks that contain symmetric coefficients.
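The multiplier sharing described above can be sketched in software. The following is a minimal model, not the circuit of Fig. 4: it assumes a transposed direct-form filter in which each product of the current input sample with a stored coefficient feeds both mirrored taps, so an N-tap symmetric sub filter needs only ceil(N/2) multiplications per sample. The function names and the 5-tap coefficient set are hypothetical.

```python
def symmetric_fir_transposed(x, half_coeffs, n_taps):
    """Transposed direct-form FIR with symmetric taps.

    half_coeffs holds h[0] .. h[ceil(N/2) - 1]; tap i and tap N-1-i
    share one product, so ceil(N/2) multiplications are made per sample.
    """
    n_mult = (n_taps + 1) // 2
    if len(half_coeffs) != n_mult:
        raise ValueError("expected ceil(N/2) coefficients")
    state = [0.0] * n_taps  # state[i] is the partial sum at tap i
    y, mult_count = [], 0
    for sample in x:
        products = [c * sample for c in half_coeffs]
        mult_count += n_mult
        new_state = [0.0] * n_taps
        for i in range(n_taps):
            carry = state[i + 1] if i + 1 < n_taps else 0.0
            # Tap i reuses the product of its mirror tap N-1-i
            new_state[i] = products[min(i, n_taps - 1 - i)] + carry
        state = new_state
        y.append(state[0])
    return y, mult_count


def direct_fir(x, h):
    """Reference: y(n) = sum_i h(i) x(n - i), one output per input sample."""
    return [sum(h[i] * x[n - i] for i in range(len(h)) if n - i >= 0)
            for n in range(len(x))]


h = [1, 2, 3, 2, 1]          # assumed 5-tap symmetric example
x = [1, 0, 2, -1, 3]
y, mults = symmetric_fir_transposed(x, h[:3], len(h))
print(y == direct_fir(x, h), mults, len(h) * len(x))
```

For odd N the middle product serves a single tap, which is the exception noted in the text.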

The cascading process for the larger block-sized parallel FIR filter is similar to that introduced in Lin and Mitra (1996), but instead of applying the existing small-sized FFAs at every stage, we interleave various small-sized structures in each stage to fully exploit the symmetry of coefficients.

Table 1: Comparison of delay and area of four parallel and five parallel methods.

<table>
<thead>
<tr>
<th>Methods</th>
<th>Existing (4 parallel)</th>
<th>Proposed (5 parallel)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Delay (ns)</td>
<td>28.798</td>
<td>27.292</td>
</tr>
<tr>
<td>Slices</td>
<td>171</td>
<td>146</td>
</tr>
<tr>
<td>LUT</td>
<td>304</td>
<td>252</td>
</tr>
</tbody>
</table>

**Architecture of existing four parallel FIR filter:** For two-parallel-based cascading with a set of odd-length symmetric coefficients, it is possible to have even-length symmetric coefficients in a sub filter block after applying the two-parallel FFA structure. For example, for a set of 23-tap symmetric coefficients, after applying the two-parallel FFA, the sub filter block \( H_0 \) has 12 symmetric coefficients (Yu and Ken, 2012), to which the existing FFA is not beneficial; therefore, in this case, the two-parallel structure that is advantageous to even-length symmetric coefficients is employed. The resulting four-parallel filter realization leads to four sub filter blocks containing symmetric coefficients (Fig. 5).
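The 23-tap case in this section can be checked numerically. The sketch below uses a hypothetical coefficient set (1, 2, ..., 12, ..., 2, 1) and shows why a second plain two-parallel split of the even-length block \( H_0 \) does not help: its two phases are mirror images of each other rather than symmetric themselves, whereas their sum is symmetric and their difference antisymmetric. Whether Fig. 5 uses exactly these sum and difference branches is an assumption here, not something the text states.

```python
def is_symmetric(c):
    c = list(c)
    return c == c[::-1]


# Hypothetical 23-tap symmetric coefficients: 1, 2, ..., 11, 12, 11, ..., 2, 1
rising = list(range(1, 12))
h = rising + [12] + rising[::-1]

# First two-parallel split of the odd-length filter
H0, H1 = h[0::2], h[1::2]
print(len(H0), is_symmetric(H0))   # even-length sub filter block
print(len(H1), is_symmetric(H1))   # odd-length sub filter block

# Second split applied to the even-length symmetric block H0
H00, H01 = H0[0::2], H0[1::2]
print(is_symmetric(H00), is_symmetric(H01))   # neither phase is symmetric
print(H01 == H00[::-1])                       # the phases mirror each other

sum_branch = [a + b for a, b in zip(H00, H01)]
diff_branch = [a - b for a, b in zip(H00, H01)]
print(is_symmetric(sum_branch))
print(diff_branch == [-d for d in reversed(diff_branch)])  # antisymmetric
```

Both the symmetric and the antisymmetric branch allow a multiplier to be shared between mirrored taps, which is the motivation for a structure tailored to even-length symmetric coefficients.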

**Architecture of proposed five parallel FIR filters:** A comparison between the proposed and the existing FFA structures with various lengths under different levels of

![Fig. 5: Existing four-parallel FIR filter implementation](image)

![Fig. 6: Proposed five-parallel FIR filter implementation](image)
SIMULATION RESULTS

The results establish a speed advantage of the proposed FIR architecture over the prior architecture, together with a lower area, for typical filter parameters. As Table 1 shows, the proposed five-parallel filter has a delay of 27.292 ns against 28.798 ns for the existing four-parallel filter and occupies 146 slices and 252 LUTs against 171 slices and 304 LUTs. The proposed architecture achieved a higher clock frequency than the direct-form architecture. We validated our techniques on Spartan-3 devices, where we observed higher speed and lower area than with traditional Distributed Arithmetic based techniques. The results for the proposed method are shown in Fig. 7 and 8.
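The relative improvements implied by Table 1 can be derived with a short calculation. The sketch below only restates the table values; the dictionary layout and the function name are illustrative choices.

```python
# Values copied from Table 1: (existing four-parallel, proposed five-parallel)
table1 = {
    "Delay (ns)": (28.798, 27.292),
    "Slices": (171, 146),
    "LUT": (304, 252),
}


def reduction_percent(existing, proposed):
    """Relative reduction of the proposed design with respect to the existing one."""
    return 100.0 * (existing - proposed) / existing


for metric, (existing, proposed) in table1.items():
    print(f"{metric}: {reduction_percent(existing, proposed):.2f}% lower")
```

The calculation gives reductions of roughly 5% in delay, 15% in slices and 17% in LUTs for the proposed five-parallel filter.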

CONCLUSION

In this study, we have presented new five-parallel FIR filter structures, which benefit symmetric convolutions of odd length and offer low area and delay. The proposed structures exploit the symmetry of odd-length coefficients and further reduce the number of multipliers required, at the expense of additional adders. We have synthesized and implemented the architectures on a Spartan-3 XC3S200-5PQ208 FPGA. The proposed linear-phase five-parallel FIR filter architectures provide lower delay, higher throughput and lower area than the existing four-parallel FIR filters. This architecture is therefore suited to high-performance and area-efficient DSP applications.

REFERENCES