# A SDR Platform for Mobile Wi-Fi/3G UMTS System on a Dynamic Reconfigurable Architecture

Zong Wang, Ahmet T. Erdogan, and Tughrul Arslan

School of Engineering, University of Edinburgh, The King's Buildings, Mayfield Road, Edinburgh EH9 3JL, United Kingdom. Phone: +(44) 01316505619, fax: +(44) 01316506554, email: {z.wang, ahmet.erdogan, t.arslan}@ed.ac.uk

#### Abstract

As wireless communication standards evolve, Wi-Fi and W-CDMA based 3G UMTS are still the major technologies widely deployed to provide high speed wireless internet access and mobile communication services. Therefore, the integration of these two protocols onto a single SDR platform is a major research task. In this paper we propose a flexible SDR platform targeting the above standards. Our SDR platform is based on a recently developed novel Reconfigurable Instruction Cell Array (RICA) along with an ARM controller. We describe the proposed platform architecture, the associated software design flow, and the mapping methodology. The simulation results demonstrate that our architecture is able to meet the typical throughput requirements of these two protocols while running at a lower core power consumption than other architectures.

## **1** INTRODUCTION

As new wireless communication standards emerge and evolve, and research continues on advanced hardware architectures, the software defined radio (SDR) is becoming an attractive solution for next generation multi-mode, multi-standard communication systems [1] [2]. To provide a seamless shift between existing wireless protocols, an SDR must offer high flexibility and capability as well as durability.

*Flexibility* comes by means of reconfiguring the communication system by using only software modifications. Thus, SDR can play a key role when new communication technologies appear in the future. The work on SDR may include high performance modem signal processing, downloading necessary standard specifications from the network and installation. All these steps, ideally, do not require any special skills and efforts from end users. Gaining better flexibility involves both system hardware architecture and software flow design.

*Capability* is, in some sense, easier to achieve, due to 'Moore's law'. Gigabytes/second processing capacity is no longer a dream. Hardware architecture design and fabrication techniques will be the dominant factors in improving the capability [3].

*Durability*, however, has seen no significant increase during recent decades. As communication standards are moving forward, more complex algorithms and management schemes are being introduced. These consume much higher power because of rising processing requirements and memory usage. But with smart architectural and domain specific design, acceptable durability could be achieved.

In today's wireless communication world, Wi-Fi (IEEE 802.11x) is the most popular wireless communication standard, providing a high speed connection over the air as an alternative to the wired cable network. UMTS, which utilizes the W-CDMA air interface, has been widely adopted as the 3G successor to the old GSM voice communication infrastructure. Although new standards, such as WiMAX (IEEE 802.16x) and WiBro, have kept emerging in recent years to provide very fast mobile wireless connections, the widely adopted Wi-Fi and 3G UMTS will still dominate wireless communications (see Table 1).

Table 1: Popular wireless communication standards

| Standard  | Maximum data rate | Coverage |
|-----------|-------------------|----------|
| Wi-Fi     | 54 Mbps (802.11g) | Medium   |
| WCDMA     | 7.2 Mbps (HSDPA)  | Long     |
| WiMAX     | 70 Mbps           | Long     |
| Bluetooth | 3 Mbps            | Short    |

In this paper we propose a flexible SDR platform targeting the Wi-Fi and W-CDMA standards. It is based on a recently developed novel Reconfigurable Instruction Cell Array (RICA) processor along with an ARM controller. We describe the proposed platform architecture, the associated software design flow, and the mapping methodology. The simulation results demonstrate that our architecture can meet the typical throughput requirements of the targeted protocols at a lower power consumption.

The rest of the paper is organized as follows. Section 2 describes the system architecture as well as providing an introduction to RICA architecture. Section 3 provides an overview of the two targeted standards, more specifically, 802.11g and W-CDMA. Section 4 presents the mapping methodology and simulation results. Finally, conclusions are drawn and future work directions are given in Section 5.

## **2** SYSTEM ARCHITECTURE

The two major challenges in SDR are the design of an efficient hardware system and of the associated software development environment. The recent development of the RICA architecture [5] provides new approaches for the efficient realization of hardware platforms targeting SDR systems. The physical hardware component of the proposed SDR is a platform that integrates an ARM controller and a specially tailored RICA processor core, see Figure 1. Such a platform enables a software design flow that resembles the process of application development on general purpose processors.



Figure 1 - System architecture overview

### 2.1 RICA architecture overview

The RICA architecture consists of a highly reconfigurable fabric of interconnected instruction cells. The functionality of these cells is derived from common machine code instructions, and the cells can be dynamically reconfigured to provide highly parallel, FPGA-like representations of typical software operations. The fabric is controlled through a series of VLIW instructions which are extracted from a C-based high level algorithm description by a dedicated scheduling algorithm [9]. The functional units support primitive operations such as addition/subtraction, multiplication, logic, multiplexing, shifting and register operations. Additional functional units (cells) are provided to handle control/branch operations. This enables the architecture to exploit much greater instruction-level parallelism than existing processors executing the same code, since it is not constrained by dependencies between operations or by small branches. As a result, it exhibits significantly improved performance and flexibility over traditional solutions. The resources available to the system are fully parameterisable, with a standard allocation defined for general purpose applications.

As shown in Figure 2, applications such as wireless communication standards are described in ANSI-C and compiled through a tailored C compiler. Then, from the assembly code, a step-based net-list file is generated through a SIMD pipelined scheduler. The instruction cells (ICs) are connected through a network of programmable switches, which allows data paths to be created by reading the configuration bits from the net-list files stored in the program memory. The interconnect configuration is dynamically reprogrammed at run-time based on the schedule information.



Figure 2 - High-level design tool flow for RICA

Through the study of the Wi-Fi, Bluetooth and W-CDMA protocols, we found that most DSP algorithms, such as FFTs, FIR filters, Viterbi decoders, and cipher engines, rely on a high degree of SIMD parallelism [3]. Their main operations are based on long vector variables with short data widths. Though the ICs in RICA are 32-bit integer based, some of them can be configured into 4QI or 2HI vector SIMD operation modes, which makes the RICA a more efficient baseband processing element than conventional DSPs [2].
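The packed operation modes mentioned above can be illustrated with a small software model. The sketch below assumes that 4QI denotes four independent 8-bit lanes held in one 32-bit word; the function names are hypothetical and the code models only the arithmetic, not how a RICA instruction cell is actually built.

```python
MASK32 = 0xFFFFFFFF
LOW7 = 0x7F7F7F7F   # low 7 bits of every 8-bit lane
HIGH1 = 0x80808080  # top bit of every 8-bit lane


def pack4(lanes):
    """Pack four 8-bit values into one 32-bit word (lane 0 in the low byte)."""
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & 0xFF) << (8 * i)
    return word


def unpack4(word):
    """Split a 32-bit word back into its four 8-bit lanes."""
    return [(word >> (8 * i)) & 0xFF for i in range(4)]


def add4x8(a, b):
    """Lane-wise modulo-256 addition with no carry leaking between lanes."""
    # Adding only the low 7 bits keeps every carry inside its own lane.
    low = (a & LOW7) + (b & LOW7)
    # The top bit of each lane is then fixed up with an XOR.
    return (low ^ ((a ^ b) & HIGH1)) & MASK32
```

A single call to `add4x8` therefore performs four short additions at once, which is the kind of saving the text attributes to long vectors with short data widths.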

### 2.2 System design environment

Mapping of communication protocols commences with simulation and verification of functionality through MATLAB, Simulink or floating-point C implementations. Unlike traditional SoC designs, the development flow then separates into system-level design and RICA processor profile design [6]. Both of these are performed in fixed-point precision using a C environment, with the main difference that the RICA processor uses an Eclipse C extension IDE whereas ARM utilizes an ARM C development toolkit, as depicted in Figure 3.



Figure 3 - Software design flow

## 3 CASE STUDY

Wi-Fi and 3G cellular mobile networks are two significant strides in the wireless communication revolution that have changed the way we live today. As an upgrade of the traditional GSM infrastructure, 3G UMTS employs W-CDMA as the underlying air interface to provide a high speed connection, typically up to 7.2 Mbps with HSDPA technology. This enables a number of wideband services, such as video calling, web browsing, file download and mobile TV, in addition to voice calls. Wi-Fi technology, on the other hand, has been expanding rapidly to replace cable connections in the Local Area Network (LAN), promising a backbone for convenient high-speed internet access.

### 3.1 W-CDMA based 3G UMTS

The overall data processing chain of a W-CDMA based UMTS system is illustrated in Figure 4. First, a block of raw data bits from the upper layer is fed into the CRC attachment engine to provide error detection. Before the data blocks are encoded, they are serially concatenated and segmented to meet the requirements of the different FEC encoders. They are then delivered to the channel encoder module. After this, the encoded blocks are equalized and interleaved with inter-column permutation to avoid cell interference. The spreading process consists of two steps. The first step is the channelization operation, which transforms the binary data symbols into high-rate chips. The second step applies the scrambling code to the spread signal. Finally, the complex-valued chip sequence from the spreading process is mapped onto the physical channel using QPSK modulation.

Because the W-CDMA downlink and uplink are asymmetric, the receiver largely resembles the inverse of the transmitter but adds several features, such as support for 16QAM and 64QAM in the modulation mapper to provide higher data rates, and a rake receiver to mitigate multipath fading.
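The two-step spreading process described in this section (channelization followed by scrambling) can be sketched as follows. This is a simplified illustration under stated assumptions: symbols and scrambling chips are real-valued +1/-1, whereas the actual standard uses complex-valued scrambling, and the function names are hypothetical. The OVSF codes are built with the usual code-tree recursion, in which each code `c` spawns the children `[c, c]` and `[c, -c]`.

```python
def ovsf_code(sf, k):
    """Return OVSF code k (0 <= k < sf) for spreading factor sf as +1/-1 chips."""
    if sf < 1 or sf & (sf - 1) or not 0 <= k < sf:
        raise ValueError("sf must be a power of two and 0 <= k < sf")
    code = [1]
    # Walk down the code tree from the root, one bit of k per level, MSB first.
    for level in reversed(range(sf.bit_length() - 1)):
        negate = (k >> level) & 1
        code = code + [(-c if negate else c) for c in code]
    return code


def spread(symbols, code, scrambling):
    """Channelize each symbol with the OVSF code, then apply the scrambling chips."""
    chips = [s * c for s in symbols for c in code]
    if len(scrambling) < len(chips):
        raise ValueError("scrambling sequence is too short")
    return [chip * scr for chip, scr in zip(chips, scrambling)]


def despread(chips, code, scrambling):
    """Undo the scrambling, then correlate each chip block against the OVSF code."""
    sf = len(code)
    descrambled = [chip * scr for chip, scr in zip(chips, scrambling)]
    return [
        sum(descrambled[i + n] * code[n] for n in range(sf)) // sf
        for i in range(0, len(descrambled), sf)
    ]
```

Because codes of the same spreading factor are mutually orthogonal, correlating against a different code index returns zero, which is what allows several channels to share one carrier.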



Figure 4 - Mapping of the targeted protocols

### 3.2 Wi-Fi 802.11g

Wi-Fi covers a series of IEEE 802.11 standards, including 802.11a/b/g/n. It uses both single-carrier DSSS and multi-carrier OFDM technology. The DSSS scheme is similar to that of W-CDMA. A high level block diagram of a typical 802.11g OFDM physical layer is depicted in Figure 5. The main functional blocks of an OFDM system include FFT / IFFT, channel coding / decoding, interleaving / de-interleaving, guard interval insertion / removal, etc. [8].



Figure 5 - A high level block diagram of a typical 802.11g OFDM physical layer [8].

### 3.3 Multi standard SDR implementation analysis

Table 3 lists the main parameters of the W-CDMA and 802.11g OFDM physical layers. In the channel coding stage, both 802.11g OFDM and W-CDMA adopt convolutional encoding with a Viterbi decoder as the main FEC unit, differing only in constraint length. In addition, W-CDMA based UMTS utilizes turbo coding to increase the data rate. The two standards use different approaches to achieve higher rates; W-CDMA has a rate matching stage that either punctures or repeats bits to meet the required channel bit rate. A single reconfigurable convolutional coding unit, and likewise a single reconfigurable Viterbi decoder, can therefore be used to implement both standards.
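As a minimal sketch of what such a reconfigurable coding unit might look like in software, the encoder below takes the constraint length and generator polynomials as parameters, so the same routine serves both standards. The octal generators shown are quoted from memory of the two specifications ([7], [8]) and should be checked against them; puncturing, rate matching and tail bits are not modelled, and all names are hypothetical.

```python
def parity(x):
    """Return 1 if x has an odd number of set bits, else 0."""
    return bin(x).count("1") & 1


def conv_encode(bits, constraint_length, generators):
    """Feed-forward convolutional encoder.

    The register holds the newest input bit in its most significant position,
    so the most significant bit of each generator taps the current input.
    """
    reg = 0
    out = []
    for b in bits:
        reg = (reg >> 1) | ((b & 1) << (constraint_length - 1))
        for g in generators:
            out.append(parity(reg & g))
    return out


# Assumed configurations (verify against the specifications before use).
WIFI_K7_RATE_HALF = (7, (0o133, 0o171))
WCDMA_K9_RATE_HALF = (9, (0o561, 0o753))
WCDMA_K9_RATE_THIRD = (9, (0o557, 0o663, 0o711))
```

Switching standards then amounts to loading a different `(constraint_length, generators)` pair, for example `conv_encode(bits, *WIFI_K7_RATE_HALF)`.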

The modulation I/Q mappers of both standards follow the same constellation. They only differ in a few final mapping values, which can be loaded from the local memory.
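The idea of a shared mapper whose final values are loaded from memory can be sketched as a table lookup. The tables below use the Gray-coded, unit-average-power values that, to my recollection, the 802.11 OFDM PHY defines for QPSK and 16QAM; they are an assumption for illustration, and a W-CDMA table would be supplied as a different dictionary to the same routine.

```python
import math

# Per-axis Gray mappings from bit groups to amplitude levels (assumed values).
AXIS_1BIT = {(0,): -1, (1,): 1}
AXIS_2BIT = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def build_table(axis, scale):
    """Combine an I-axis and a Q-axis mapping into one constellation table."""
    return {
        bits_i + bits_q: complex(level_i, level_q) * scale
        for bits_i, level_i in axis.items()
        for bits_q, level_q in axis.items()
    }


TABLES = {
    "QPSK": build_table(AXIS_1BIT, 1 / math.sqrt(2)),
    "16QAM": build_table(AXIS_2BIT, 1 / math.sqrt(10)),
}


def map_bits(bits, table):
    """Map a bit list to complex symbols using whichever table is loaded."""
    width = len(next(iter(table)))
    if len(bits) % width:
        raise ValueError("bit count must be a multiple of the symbol width")
    return [table[tuple(bits[i:i + width])] for i in range(0, len(bits), width)]
```

Only the dictionary changes between modulation schemes; the mapping routine itself stays fixed, which mirrors the reuse argument made above.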

The interleaving operation in W-CDMA is realized by a column-by-column permutation scheme. The number of columns corresponds to the Transmission Time Interval (TTI) in the first stage, and is fixed to 30 in the second stage. The permutation is performed using the patterns shown in Table 2.

Table 2: Inter-column permutation pattern in W-CDMA

| Stage                      | TTI   | Columns | Permutation pattern |
|----------------------------|-------|---------|---------------------|
| 1<sup>st</sup> interleave  | 10 ms | 1       | <0>                 |
| 1<sup>st</sup> interleave  | 20 ms | 2       | <0,1>               |
| 1<sup>st</sup> interleave  | 40 ms | 4       | <0,2,1,3>           |
| 1<sup>st</sup> interleave  | 80 ms | 8       | <0,4,2,6,1,5,3,7>   |
| 2<sup>nd</sup> interleave  | All   | 30      | <0,20,10,5,15,25,3,13,23,8,18,28,1,11,21,6,16,26,4,14,24,19,9,29,12,2,7,22,27,17> |
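The inter-column permutation of Table 2 can be sketched as a block interleaver that writes the bits row by row, permutes the columns, and reads them out column by column. The sketch assumes the input length is an exact multiple of the number of columns, so the padding and pruning steps of the real second-stage interleaver are left out; function names are hypothetical.

```python
FIRST_STAGE = {
    10: [0],
    20: [0, 1],
    40: [0, 2, 1, 3],
    80: [0, 4, 2, 6, 1, 5, 3, 7],
}
SECOND_STAGE = [0, 20, 10, 5, 15, 25, 3, 13, 23, 8,
                18, 28, 1, 11, 21, 6, 16, 26, 4, 14,
                24, 19, 9, 29, 12, 2, 7, 22, 27, 17]


def block_interleave(bits, pattern):
    """Write row-wise, then read the columns in the order given by the pattern."""
    cols = len(pattern)
    if len(bits) % cols:
        raise ValueError("length must be a multiple of the column count")
    rows = len(bits) // cols
    return [bits[r * cols + c] for c in pattern for r in range(rows)]


def block_deinterleave(bits, pattern):
    """Inverse of block_interleave for the same pattern."""
    cols = len(pattern)
    rows = len(bits) // cols
    out = [None] * len(bits)
    index = 0
    for c in pattern:
        for r in range(rows):
            out[r * cols + c] = bits[index]
            index += 1
    return out
```

For example, `block_interleave(data, FIRST_STAGE[40])` applies the 40 ms pattern, and `block_interleave(data, SECOND_STAGE)` applies the fixed 30-column pattern.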

For the 802.11g OFDM protocol, data interleaving employs a similar block interleaver. In contrast to the W-CDMA system, this interleaver permutes the indices of the coded bits in two concatenated steps, defined by the following permutations:

First permutation:

$$i = (N_{cbps}/16)\,(k \bmod 16) + \lfloor k/16 \rfloor, \qquad k = 0, 1, \ldots, N_{cbps}-1$$

Second permutation:

$$j = s \lfloor i/s \rfloor + \left(i + N_{cbps} - \lfloor 16\,i/N_{cbps} \rfloor\right) \bmod s, \qquad i = 0, 1, \ldots, N_{cbps}-1$$

$$s = \max(N_{bpsc}/2,\ 1)$$

where $k$ denotes the index of the original coded bits, $i$ is the index after the first permutation, and $j$ is the index after the second permutation. $N_{cbps}$ is the number of coded bits per OFDM symbol and $N_{bpsc}$ is the number of coded bits per subcarrier; both depend on the modulation mapper used.
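The two permutations translate directly into integer arithmetic. The following sketch computes the index map $k \rightarrow j$ for one OFDM symbol and applies it; the function names are hypothetical, and the input is assumed to hold exactly one symbol's worth of coded bits.

```python
def wifi_interleave_map(n_cbps, n_bpsc):
    """Return a list m where m[k] is the output index j of input bit k."""
    s = max(n_bpsc // 2, 1)
    mapping = []
    for k in range(n_cbps):
        # First permutation: adjacent coded bits go to non-adjacent subcarriers.
        i = (n_cbps // 16) * (k % 16) + k // 16
        # Second permutation: rotate bits within each group of s positions.
        j = s * (i // s) + (i + n_cbps - (16 * i) // n_cbps) % s
        mapping.append(j)
    return mapping


def wifi_interleave(bits, n_bpsc):
    """Interleave one OFDM symbol's worth of coded bits."""
    mapping = wifi_interleave_map(len(bits), n_bpsc)
    out = [0] * len(bits)
    for k, j in enumerate(mapping):
        out[j] = bits[k]
    return out


def wifi_deinterleave(bits, n_bpsc):
    """Inverse of wifi_interleave."""
    mapping = wifi_interleave_map(len(bits), n_bpsc)
    return [bits[j] for j in mapping]
```

With BPSK ($N_{cbps}=48$, $N_{bpsc}=1$) the second permutation is the identity because $s=1$; it only takes effect for 16QAM and 64QAM.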

DSSS, adopted by W-CDMA as the symbol modulation scheme, uses an OVSF code generator together with a Gold code generator to spread narrow-bandwidth signals onto a single 5 MHz wide carrier. The only PN sequences required are a set of long Gold codes, up to 512 bits. A 25-stage Linear Feedback Shift Register (LFSR) and an 18-stage LFSR, with distinct polynomials, are defined for the uplink and downlink respectively.
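A Gold code generator of this kind can be modelled as two LFSRs whose outputs are XORed. The sketch below is illustrative only: the 18-stage recurrences and initial states are those I recall for the downlink scrambling code in the 3GPP spreading specification and must be verified before use, and the code offset that selects a particular scrambling code, as well as the mapping to complex chips, is omitted. Function names are hypothetical.

```python
def lfsr_sequence(taps, init, length):
    """Generate bits with x(i+n) = XOR of x(i+t) for t in taps, n = len(init)."""
    seq = list(init)
    n = len(init)
    while len(seq) < length:
        i = len(seq) - n
        bit = 0
        for t in taps:
            bit ^= seq[i + t]
        seq.append(bit)
    return seq[:length]


def downlink_gold_bits(length):
    """XOR of two 18-stage m-sequences (assumed recurrences, offset 0 only)."""
    # Assumed: x(i+18) = x(i+7) + x(i), starting from 1 followed by 17 zeros.
    x = lfsr_sequence((0, 7), [1] + [0] * 17, length)
    # Assumed: y(i+18) = y(i+10) + y(i+7) + y(i+5) + y(i), starting from all ones.
    y = lfsr_sequence((0, 5, 7, 10), [1] * 18, length)
    return [a ^ b for a, b in zip(x, y)]
```

The same `lfsr_sequence` routine would serve a 25-stage uplink generator by passing different taps and a 25-element initial state, which is the sense in which a single configurable LFSR block can cover both link directions.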

The 802.11g OFDM protocol, which transmits signals using 64 subcarriers, requires a 64-point FFT and IFFT. The stream of complex symbols is divided into groups of 48 and mapped to the corresponding frequency offsets. High Rate DSSS (HR/DSSS) is also defined in the 802.11 standard, providing 5.5 Mbps and 11 Mbps payload data rates by using the 8-chip Complementary Code Keying (CCK) modulation scheme when OFDM transmission is not used.
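The mapping of 48 data symbols onto a 64-point transform can be sketched as follows. The subcarrier layout assumed here (indices -26 to 26, with 0 unused and pilots at -21, -7, 7 and 21) is the usual 802.11 OFDM arrangement; the pilot value is a placeholder rather than the standard's pilot sequence, and a direct O(N²) DFT stands in for the FFT so that the example needs only the standard library. All names are hypothetical.

```python
import cmath

PILOTS = (-21, -7, 7, 21)
DATA_SUBCARRIERS = [k for k in range(-26, 27) if k != 0 and k not in PILOTS]


def map_to_bins(symbols, pilot_value=1.0):
    """Place 48 data symbols and 4 placeholder pilots into 64 frequency bins."""
    if len(symbols) != len(DATA_SUBCARRIERS):
        raise ValueError("expected 48 data symbols")
    bins = [0j] * 64
    for k, s in zip(DATA_SUBCARRIERS, symbols):
        bins[k % 64] = complex(s)      # negative offsets wrap to bins 38..63
    for k in PILOTS:
        bins[k % 64] = complex(pilot_value)
    return bins


def idft(bins):
    """Inverse DFT (direct form) producing the time-domain samples."""
    n = len(bins)
    return [
        sum(x * cmath.exp(2j * cmath.pi * k * t / n) for k, x in enumerate(bins)) / n
        for t in range(n)
    ]


def dft(samples):
    """Forward DFT (direct form), as the receiver would apply."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]
```

Applying `dft(idft(bins))` recovers the original bins up to rounding, which is the transmitter/receiver symmetry that lets one FFT kernel be reused in both directions.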

## 4 MAPPING AND SIMULATION RESULTS

Taking advantage of the capability provided by the proposed SDR platform, we have mapped the two targeted protocols onto our architecture as illustrated in Figure 6.

For layer 1, each functional block is coded in ANSI-C using our RICA tool flow. Intra-algorithm optimization for our RICA core has been realized by applying advanced coding techniques, such as multiplexer instantiation and SIMD packed instructions for 8-bit multiplication, addition, and logic operations [6]. Inter-algorithm communication can be reduced by merging dependent function blocks, such as the modulator and interleaver, into a larger kernel to maximize the utilization of the instruction cells. With a larger kernel, temporary data buffering between algorithms through local or global memory is largely avoided, and loop-level parallelism within the kernel yields even better performance at the cost of a small register overhead. Additionally, owing to the programming flexibility of our RICA core, an individual baseband signal processing module can easily be selected and optimized to suit diverse protocols.

Figure 6 - Mapping of the targeted protocols

For layer 2, both the W-CDMA and 802.11g Medium Access Controllers (MACs) are designed to be mapped onto the ARM controller. However, in this work, for simulation purposes, only the transport channel of the MAC has been used, in order to send raw data bits from the upper layer to the physical layer through the AHB bus.

|                 | Channel coding                                                 | I/Q mapper                | Interleaver/Deinterleaver                                     | Symbol<br>modulation         |
|-----------------|----------------------------------------------------------------|---------------------------|---------------------------------------------------------------|------------------------------|
| W-CDMA uplink   | Rate: 1/2, 1/3<br>Convolutional (K=9)<br>Turbo (1/3 only, K=4) | QPSK                      | 2-stage block interleaver<br>with inter-column<br>permutation | Direct sequence<br>spreading |
| W-CDMA downlink | Viterbi/Turbo<br>decoder(K=4)                                  | QPSK/16QAM/<br>64QAM      | Same as uplink                                                | Direct sequence despreading  |
| 802.11g OFDM    | Rate: 1/2, 2/3, 3/4<br>Convolutional (K=7)<br>Viterbi decoder  | BPSK/QPSK/<br>16QAM/64QAM | 2-concatenated block interleaver                              | 64 point FFT/IFFT            |

Table 3: Main parameters of W-CDMA and 802.11g OFDM based PHY layers [7] [8]

To explore the system efficiency for SDR, we mapped the 802.11g OFDM protocol along with the W-CDMA baseband onto our proposed architecture separately to evaluate the corresponding performance.

*Performance results.* Table 4 benchmarks our solution against the SODA architecture [3]. In general, our approach achieves a higher throughput while running at a lower clock frequency. A SODA computation cycle is one execution of its vector units, whereas a RICA step is one execution of all the configured data paths. The step counts show that a large speed-up is possible in terms of reconfiguration count, with low overhead for configuration bits. Notably, the 802.11g receiver does not perform as well as the other modules. This is mainly because the Viterbi decoder, which accounts for about 80% of the computation in the receiver chain, has not yet been fully optimized, whereas the SODA implementation is manually coded to maximize the advantage of its architecture. Even so, our approach is promising in that it provides a higher throughput-to-frequency ratio.
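The throughput-to-frequency ratio mentioned above can be worked out directly from the Table 4 figures. The short script below only re-expresses those published numbers as Mbps per MHz of clock; the dictionary layout and function names are hypothetical.

```python
# (steps or cycles in M/s, throughput in Mbps, clock in MHz), copied from Table 4.
TABLE4 = {
    "802.11g transmitter": {"RICA": (20.2, 54.0, 200.0), "SODA": (806.0, 24.0, 400.0)},
    "802.11g receiver": {"RICA": (21.5, 15.6, 200.0), "SODA": (1194.0, 24.0, 400.0)},
    "W-CDMA turbo decoder": {"RICA": (32.8, 7.2, 200.0), "SODA": (540.0, 2.0, 400.0)},
    "W-CDMA PN code receiver": {"RICA": (2.0, 7.2, 200.0), "SODA": (23.0, 2.0, 400.0)},
}


def throughput_per_mhz(entry):
    """Throughput divided by clock frequency, in Mbps per MHz."""
    _, mbps, mhz = entry
    return mbps / mhz


def ratio_table():
    """Throughput-to-frequency ratio for every module and architecture."""
    return {
        module: {arch: throughput_per_mhz(e) for arch, e in archs.items()}
        for module, archs in TABLE4.items()
    }


if __name__ == "__main__":
    for module, ratios in ratio_table().items():
        print(f"{module}: RICA {ratios['RICA']:.3f}, SODA {ratios['SODA']:.3f} Mbps/MHz")
```

On these figures the 802.11g transmitter reaches 0.27 Mbps/MHz on RICA against 0.06 on SODA, and even the less optimized receiver reaches 0.078 against 0.06.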

*Power results.* Long data-path kernel loops inside the RICA processor reduce the number of steps (RICA cycles) needed to implement the full physical layer, which in turn reduces memory accesses and hence power consumption. As our power estimation models are not yet complete, the interconnection and routing energy results are not yet available. Nevertheless, using RICA as the signal processing unit still shows a significant advantage in core power consumption over the SODA solution, as shown in Table 5. In addition, given the programming flexibility of the C language, further optimization techniques can be applied to the RICA profile design to improve system performance. Customized cells can also be generated to accelerate more complex algorithms [5].

 Table 4: Performance comparison between RICA and SODA

| Protocols | Modules                      | RICA<br>@200MHz            | SODA<br>@400MHz          |
|-----------|------------------------------|----------------------------|--------------------------|
| 802.11g   | Transmitter                  | 20.2 Msteps/s<br>@54Mbps   | 806 Mcycles/s<br>@24Mbps |
|           | Receiver                     | 21.5 Msteps/s<br>@15.6Mbps | 1194Mcycles/s<br>@24Mbps |
| W-CDMA    | Turbo decoder<br>(K=4, SOVA) | 32.8 Msteps/s<br>@7.2Mbps  | 540 Mcycles/s<br>@2Mbps  |
|           | PN code<br>receiver          | 2 Msteps/s<br>@7.2Mbps     | 23Mcycles/s<br>@2Mbps    |

Table 5: Power comparison between SODA PE and RICA for 802.11a

|         | Included parts                        | Power (mW), 802.11a | Power (mW), 802.11g |
|---------|---------------------------------------|---------------------|---------------------|
| SODA PE | Data & Inst. Mem, SIMD, Scalar units  | 1826                | N/A                 |
| RICA    | Data & Inst. Mem, Inst. cells         | N/A                 | 512.3               |

## 5 CONCLUSION

In this paper, we have proposed a flexible SDR platform based on a recently developed novel Reconfigurable Instruction Cell Array (RICA) along with an ARM controller. Two popular wireless communication protocols were analyzed and mapped onto our proposed platform. We discussed the targeted protocols for SDR implementation as well as the mapping methodology onto our RICA architecture. Simulation results indicate that our architecture can meet the demanding throughput requirements of evolving protocols while working within a very restricted power budget. Further optimization of the computation-intensive FEC decoders will further improve the system performance for future communication standards.

## REFERENCES

- [1] Gerard K. Rauwerda, Paul M. Heysters, and Gerard J.M. Smit, "Towards Software Defined Radios Using Coarse-Grained Reconfigurable Hardware," *IEEE Trans. on Very Large Scale Integration (VLSI) Systems*, vol. 16, no. 4, pp. 3-13, Jan. 2008.
- [2] John Glossner, Daniel Iancu, Jin Lu, Erdem Hokenek, and Mayan Moudgill, "A software defined communications baseband design," *IEEE Communications Magazine*, vol. 41, no. 1, pp. 120-128, Jan. 2003.
- [3] Yuan Lin, Hyunseok Lee, Mark Woh, Yoav Harel, Scott Mahlke, Trevor Mudge, Chaitali Chakrabarti, and Krisztián Flautner, "SODA: A low-power architecture for software radio," *Proc. 33rd Intl. Symposium on Computer Architecture (ISCA)*, vol. 1, pp. 89-100, Jun. 2006.
- [4] Rob Pelt and Martin Lee, "Low power software defined radio design using FPGAs," *Proc. of the SDR 2005 Technical Conference and Product Exposition.*
- [5] Sami Khawam, Ioannis Nousias, Mark Milward, Ying Yi, Mark Muir, and Tughrul Arslan, "The reconfigurable instruction cell array," *IEEE Trans. on Very Large Scale Integration (VLSI) Systems*, vol. 16, no. 1, pp. 75-85, Jan. 2008.
- [6] Zong Wang, Tughrul Arslan, and Ahmet T. Erdogan, "Implementation of hardware encryption engine for wireless communication on a reconfigurable instruction cell array," *Proc. of IEEE International Symposium on Electronic Design, Test & Applications (DELTA 2008)*, vol. 1, pp. 148-152, Jan. 2008.
- [7] 3GPP TS 25.212 version 7.8.0 Release 7
- [8] IEEE Std. 802.11-2007, Part 11.
- [9] Ying Yi, Ioannis Nousias, Mark Milward, Sami Khawam, Tughrul Arslan, and Iain Lindsay, "System-level scheduling on instruction cell based reconfigurable systems," *Proc. Design, Automation and Test in Europe Conference (DATE 2006)*, vol. 1, pp. 1-6, Munich, Germany, Mar. 2006.