Laboratory of Telecommunications Technology
Subproject goals
Utilization of efficient transformation techniques for implementation of DSP algorithms
in CDMA receivers, and mobile user tracking systems.
Background
This subproject addresses the methodologies needed to design VLSI circuits for systems
that require higher throughput or lower power consumption. Many of the techniques that
will be presented in this subproject will be applied to DSP algorithms needed in Wideband
Code Division Multiple Access (WCDMA) receivers, and Mobile User Tracking systems.
The first part of this subproject addresses several high-level architectural
transformations which can be used to design families of architectures for a given
algorithm. These transformations include pipelining, retiming, unfolding, and array
processor design. In the area of digital mobile communications, there is a growing need
for high speed circuits having a low power consumption. Two popular approaches for
achieving high processing speed are pipelining and parallel processing. From a single chip
implementation point of view, the pipeline approach could be an advantage due to its lower
hardware cost. Recently, the utility of pipelined systems in low-power applications in
mobile communications has been observed. Pipelined DSP algorithms allow us to tradeoff
speed, power and area during the course of VLSI implementation.
The second part of this subproject, deals with high-level algorithm transformation
techniques such as look-ahead (LA), relaxed look-ahead (RLA), and strength reduction (SR).
Algorithm transformation techniques have been employed to introduce concurrency in serial
algorithms. In general, pipelining is simply accomplished by placing latches at any
feed-forward cutsets of the data flow graph representation of the algorithm. However,
pipelining of DSP algorithms having a feedback loop is not a trivial task. LA and RLA
transformations are applied to design pipelined systems having a feedback loop. Usually,
the LA technique results in a large hardware overhead as it transforms a serial algorithm
into an equivalent pipelined algorithm. RLA technique, however, involves approximating the
algorithms obtained via the look-ahead technique. Through these approximations, the RLA
technique maintains functionality of the algorithm rather than the input-output behavior.
Approximations can be quite crude and yet result in minimal performance loss. Furthermore,
the resulting pipelined requires minimal hardware overhead and achieves a higher
throughput. This increased of throughput as a result of pipelining can be exchanged
partially or fully for lower power consumption. On the other hand, the strength reduction
transformation is applied to reduce the number of multiplications. This will result in
remarkable savings in consumed power and silicon area.
Therefore, as a result of transformation techniques, higher throughput and lower power
consumption can be obtained. This is of great importance when implementing mobile
communication systems.
Key results
Pipelined implementation of a DS-CDMA receiver was proposed when multiple antennas are utilized in mobile receivers. The RLA technique was utilized in order to introduce pipelining and to achieve higher throughput as compared to the receiver using the serial RLS and LMS algorithms. Simulations were carried out to illustrate the SIR versus the relative interfering power for different number of antennas and different levels of pipelining. It was shown that as the the number of pipelining stage, M increases the SIR will decrease. This was due to the slower convergence speed as a result of the approximations. However, this decrease of SIR was considerably less than that obtained with the LMS algorithm in [18]. Also, the hardware overhead was only 2(M-1) latches. It is important to note that the level of pipelining should be carefully selected when more antennas are introduced.
· The SR transformation technique has been studied and applied to the CDMA receivers.
[6, 12, 14]
· Low power rake receivers were proposed.
In this work, we have studied the application of efficient transformation techniques in Code Division Multiple Access receivers (CDMA). This was achieved by looking at some of the computational blocks that constitute the CDMA receivers. We focused on the Look-Ahead, Relaxed Look-ahead and Strength Reduction transformation techniques and demonstrate how CDMA receivers can benefit when utilizing these techniques during the course of implementation.
In Code Division Multiple Access (CDMA) receivers, in order to suppress the Multiple Access Interference (MAI) different multiuser detectors such as the decorrelating and the linear minimum mean square error (LMMSE) detectors can be utilized. In these receivers, the correlation matrix depends on the number of users, the signature waveform and the delays of the users. A change in one of these parameters changes the correlation values and thus, an update of the multiuser detector is required. However, the updating process is a computationally complex task and requires the matrix inversion operation with the computational complexity of O(K3), where K is the number of users. To alleviate the implementation complexity, iterative implementation of the decorrelating and LMMSE detectors have been reported. The advantage of such detectors is the significant savings in computational complexity since there is no need for matrix inversion. This is accomplished by utilizing different iterative algorithms such as the Steepest Descent (SD) and the CG methods to solve linear matrix equations. As a result, the following research has been done:
· Fixed-point implementation of the CG and GSCD algorithms [9, 11]
· Utilization of SPW for the implementation of CG algorithm
· Mapping the CG algorithm into an array processor
In this work, practical implementations of the Gram-Schmidt Conjugate Direction (GSCD) and CG algorithms were studied and compared. We illustrated that although theoretically the CG algorithm must converge in at most N steps, in practice for large R it may not happen. As a result, due to round-off errors different resetting schemes and more iterations were required. On the other hand, we demonstrated that the GSCD method always converges at most in N steps. The fixed-point implementation of the above mentioned algorithms were also studied and presented. For more realistic results, the algorithms were applied to a multiuser detection scheme in a DS-CDMA system. In these simulations, for different number of users, different number of iterations and different wordlengths the MSE errors were calculated.
· Comparison between the performance of the SD and the CG algorithms in CDMA receivers
[1]
· Mapping the Steepest Descent algorithm into an array processor
· Fixed-point implementation of the algorithms
With the aid of simulations, we illustrated that in applications such as multiuser detection, after a moderate number of iterations, the SD and the CG algorithms have a similar fixed-point performance. However, from the VLSI implementation point of view, the SD algorithm is a more suitable choice. Furthermore, for the implementation of the SD algorithm a fully pipelined systolic array was proposed.
Currently, in many applications such as beamforming based communication for mobile users, there is a large demand for tracking the location of the mobile user. Mobile users can be located when Direction-of-Arrival (DOA) estimates are established at the base stations. The user tracking problem can be solved by the methods of the trigonometric geometry from the intersection of the two or more Lines-of-Bearing (LOB). There are numerous techniques that can be applied for such tracking problems. In the signal subspace based user tracking system of [12], for a step-by-step update an efficient conjugate gradient based algorithm was developed. Also, implementation of two different mobile user tracking systems were discussed. It was shown that from both performance and complexity point of views, the signal subspace method is a better choice and its implementation should be considered. Thus, for a more realistic implementation, the signal subspace based tracking system was partitioned into two parts: HW and SW. For the HW implementation of the tracking unit, a systolic architecture was proposed. With the aid of this array, the time complexity of the most computationally intensive unit was reduced to O(M). Furthermore, for the implementation of the complex multipliers needed in the PEs, the SR transformation technique was utilized. As a result, remarkable savings in consumed power and silicon area were achieved.
· Error-free computation of real-valued discrete transforms [2, 4, 5, 8]
· Developing two approximation techniques to reduce complexity
· Developing parallel structures for the implementation of such transforms
We proposed a novel approach that is aimed at efficient implementation of real-valued discrete transforms such as the DCT and the DHT. One of the main advantages of the proposed approach is an error-free implementation of these transformations until the final reconstruction. As a result of utilizing this approach, multiplications have been replaced by maximum two shift operations and one addition. The proposed method can be combined with any other DCT or DHT algorithms to achieve further hardware reductions. Furthermore, by introducing two approximation methods hardware complexity was drastically reduced. Finally, for the implementation of the algorithms, a fully pipelined systolic architecture with O(N) throughput was proposed.
[1] R. Baghaie, "Systolic implementation of the steepest descent algorithm and its performance in CDMA receivers," to appear in the Proceedings of the European Signal Processing Conference, Eusipco'00, Tampere, Finland, September 2000.
[2] R. Baghaie and V. Dimitrov, "Implementation of real-valued discrete transforms via encoding algebraic integers," to appear in the Proceedings of the European Signal Processing Conference, Eusipco'00, Tampere, Finland, September 2000.
[5] R. Baghaie and V. Dimitrov, "Systolic implementation of real-valued discrete transforms via algebraic integers quantization," Submitted to An International Journal Computers & Mathematics with applications, September 1999.
[11] R. Baghaie, "Systolic implementation of sample-by-sample conjugate gradient algorithm," in Proceedings IEEE Finnish Signal Processing Symposium, FINSIG'99, Oulu, Finland, pp. 252-256, May 1999.
[12] R. Baghaie, "Transformation techniques in CDMA receivers," in Proceedings IEEE Finnish Signal Processing Symposium, FINSIG'99, Oulu, Finland, pp. 10-15, May 1999.
[16] R. Baghaie, S. Werner, and T. Laakso, "Pipelined implementation of adaptive multiple-antenna CDMA receivers," in Proceedings IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP'98, Seattle, Washington, vol. 6, pp. 3229-3232, May 1998.