

**International Journal of Engineering & Technology** 

Website: www.sciencepubco.com/index.php/IJET

**Research Paper** 



# Implementation of data path components of ARM7 microprocessor using sub threshold current mode logic with sleep transistor technique

K. A. Jyotsna<sup>1</sup>\*, P. Satish Kumar<sup>2</sup>, B.K. Madhavi<sup>3</sup>, Sana Bano<sup>3</sup>

<sup>1</sup> Department of Electronics and Communication Engineering, CVR College of Engineering, Hyderabad, India <sup>2</sup> Department of Electronics and Communication Engineering, ACE Engineering College, Hyderabad, India <sup>3</sup> Department of Electronics and Communication Engineering, Sridevi Women's Engineering College, Hyderabad, India <sup>4</sup> PG Student, M. Tech VLSI System Design, CVR College of Engineering, Hyderabad, India \*Corresponding author E-mail: kajyotsna72@gmail.com

#### Abstract

Recent trends in Very Large Scale Integration (VLSI) technology aim to make devices cheaper, more powerful and affordable to everyone, so the primary focus is on reducing the power consumed by these devices. With transistors now being structured in 3D, Moore's law is expected to continue. As the technology is scaled and the number of transistors per chip increases, leakage currents become a major concern and static power dissipation needs to be monitored. Advanced RISC Machine (ARM) processors have given a new definition to smartphones, tablets and other embedded applications. This paper presents a novel low-power technique in which the data path components of an ARM7 microprocessor are developed using sleep transistors with Sub threshold Current Mode Logic (STCML) in 45nm technology using the Cadence Virtuoso Tool. The simulation results have been observed on the Spectre simulator and power has been calculated in the Analog Design Environment (ADE L).

Keywords: ARM Processor; Current Mode Logic (CML); Leakage Currents; Sleep Transistors; Sub Threshold Logic.

# 1. Introduction

Nowadays people look for devices that are portable, battery powered, smaller in size and richer in functionality. In recent times, the goal of a designer has been to obtain lighter devices with a reasonable battery lifetime and a low packaging cost. For high-performance portable devices and for non-battery-powered devices as well, reducing power dissipation has been a major objective in order to ensure device reliability. Power optimization requires a reduction in voltage, physical capacitance and data activity. Voltage scaling is an effective approach to reducing power dissipation in the latest IC fabrication processes.

Scaling down both the operating voltage and the oxide thickness to maintain a constant electric field helps reduce power by 50% with every new technology node, but this in turn requires a reduction in the threshold voltage to meet the performance goals of the technology. A lower threshold voltage increases leakage, so optimizing performance and leakage at the same time is practically not possible [1].

If the circuit uses devices with multiple threshold voltages, leakage power can also be controlled, but this increases the gate delay [1]. Power can be brought down during physical design as well, by employing multiple voltage domains in which a few blocks use lower supply voltages than others or are turned off completely when not in use. The time for which the blocks are temporarily shut down is called the 'Inactive mode' or 'Low power mode' [1].

# 2. Sleep transistor technique

Reducing leakage power is a prerequisite for designing low-power circuits. Technology scaling has led to a large increase in the functionality, performance and power of ICs, and static power dissipation becomes significant with this scaling. It is defined as the power dissipated when the device is in standby mode [2] and is given by equation (1):

$$P_{leak} = I_{leak} \times V_{DD} \tag{1}$$

where $I_{leak}$ is the leakage current that flows when the transistor is in the OFF state and $V_{DD}$ is the power supply voltage. When the transistor is in the weak inversion region, the current due to carrier diffusion between the source and drain terminals is called the sub-threshold leakage current. Among the available techniques, the sleep transistor approach is reported to preserve signal integrity while reducing leakage power when the circuit is in inactive mode [3].

The sleep transistor technique, also called gated $V_{DD}$ and gated GND, is an ultra-low-power technique in which the pull-up network, the pull-down network, or both are disconnected from the power supply or ground using sleep transistors. A PMOS sleep transistor placed above the pull-up network controls the power supply ($V_{DD}$), and an NMOS sleep transistor placed below the pull-down network controls the $V_{SS}$ supply [4].
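As a rough numerical illustration of equation (1), the sketch below evaluates the leakage power at a few supply voltages. The leakage current value is a hypothetical placeholder chosen only for illustration; it is not a measurement from this work, and in a real device the leakage current itself also varies with the supply voltage.

```python
def leakage_power(i_leak_amps: float, vdd_volts: float) -> float:
    """Static power per equation (1): P_leak = I_leak * V_DD (in watts)."""
    return i_leak_amps * vdd_volts


# Hypothetical standby leakage current (assumed, not taken from the paper).
I_LEAK = 50e-9  # 50 nA

if __name__ == "__main__":
    # Same assumed leakage current, evaluated at three supply voltages.
    for vdd in (1.0, 0.5, 0.24):
        p_leak = leakage_power(I_LEAK, vdd)
        print(f"VDD = {vdd:.2f} V -> P_leak = {p_leak * 1e9:.2f} nW")
```

With the leakage current held fixed, the static power scales linearly with $V_{DD}$, which is why supply scaling and leakage-current reduction are treated as complementary measures.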



The sleep transistors should present a considerably high resistance in sleep mode so that a considerable voltage drops across them. However, this technique suffers performance degradation if the transistors are not sized properly.

The sleep transistors have a high threshold voltage, and the logic placed between them should be a low-Vt logic. The input of the PMOS transistor is sleep, and sleep bar is the input of the NMOS transistor. The low-Vt logic to be sandwiched between these transistors must now be chosen.

# 3. Current mode logic

Current Mode Logic (CML) is a logic family that is suitable for mixed signal circuit designs. For implementing the logic, NMOS differential pair transistors can be replaced with any number of NMOS transistors to satisfy the appropriate logic [5].

CML has been preferred over CMOS for single-core processors that operate at high frequencies. This logic was initially introduced as a high-speed, low-noise and analog-friendly logic family and is extensively used in serial-link transmitters, optical communication transceivers, high-speed ring oscillators and phase interpolators [5]. If energy efficiency is to be maintained in a current mode processor, a complexity-effective design that operates at high frequency is required so that the execution time and static energy can be reduced [6].

The current $I_{SS}$ can be regulated by altering the biasing voltage V<sub>BN</sub>, the W/L ratio or the threshold voltage of the NMOS tail transistor. The current is steered to one of the two branches of the circuit depending upon the logic levels of the inputs. The output swing should be large enough to switch the differential pair of the following stage [7].

$$T_{SCL} = R_L \times C_L = V_{SW} \times \frac{C_L}{I} \tag{2}$$

It can be observed from equation (2) that the time constant at the output node limits the operating speed of a CML gate, where $V_{SW}$ represents the voltage swing at the output node.

a) Load device concept

The Load device used in the CML plays a major role in determining the required output swing.

$$R_{\rm L} = \frac{V_{\rm SW}}{I} \tag{3}$$

From equation (3) it can be observed that the load resistance is inversely proportional to the tail bias current. When the tail current is reduced to the sub-threshold range, the load resistance $R_L$ must therefore be in the M$\Omega$ range to maintain the required output swing [7]. This arrangement gives rise to the concept of Sub threshold Current Mode Logic (STCML) [8].

Fig. 1 shows a low-Vt logic. This logic can be incorporated in the sleep transistor technique, which has high-Vt sleep transistors [1], as shown in Fig. 2.
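Equations (2) and (3) can be checked numerically with a short sketch. The swing, load capacitance and tail currents below are assumed values chosen only to show the orders of magnitude involved; they are not taken from the simulations in this paper. The `cml_inverter_outputs` helper is likewise a hypothetical, idealized model of current steering, in which the branch carrying the tail current drops by $V_{SW} = I \times R_L$ below the supply.

```python
def load_resistance(v_sw: float, i_tail: float) -> float:
    """Equation (3): R_L = V_SW / I (ohms)."""
    return v_sw / i_tail


def time_constant(v_sw: float, c_load: float, i_tail: float) -> float:
    """Equation (2): T_SCL = R_L * C_L = V_SW * C_L / I (seconds)."""
    return load_resistance(v_sw, i_tail) * c_load


def cml_inverter_outputs(in_high: bool, vdd: float, v_sw: float):
    """Idealized current steering in a CML inverter.

    Returns (out, out_bar). The branch that carries the tail current sits
    V_SW below VDD; the other branch stays at VDD.
    """
    v_high, v_low = vdd, vdd - v_sw
    return (v_low, v_high) if in_high else (v_high, v_low)


# Assumed illustrative values (not from the paper).
V_SW = 0.1     # 100 mV output swing
C_L = 10e-15   # 10 fF load capacitance

if __name__ == "__main__":
    for i_tail in (100e-6, 1e-6, 1e-9):  # strong inversion down to sub-threshold
        r_l = load_resistance(V_SW, i_tail)
        tau = time_constant(V_SW, C_L, i_tail)
        print(f"I = {i_tail:.1e} A -> R_L = {r_l:.3e} ohm, T_SCL = {tau:.3e} s")
    print(cml_inverter_outputs(True, vdd=0.24, v_sw=V_SW))
```

Under these assumptions a nanoampere tail current calls for a load of the order of 100 M$\Omega$, and the time constant grows in the same proportion, which illustrates the power versus delay trade-off discussed later in the results.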



Fig. 1: Schematic of Sub Threshold Current Mode Logic Inverter.

This paper presents a novel methodology for incorporating Sub threshold Current Mode Logic (STCML) with Sleep Transistor Technique. The sleep transistors used are Low threshold voltage (lvt) transistors. This technique has been adopted to achieve low power consumption.



Fig. 2: Circuit of Sleep Transistor with STCML Low V<sub>T</sub> Logic.

# 4. ARM7 composition

ARM microprocessors are among the most extensively used microprocessor cores in everyday embedded devices. ARM7 is a 32-bit microprocessor, which means that all the blocks associated with it are 32 bits wide and the instructions operate on 32-bit data. The data path of ARM7 typically comprises the Address Register, Data-In Register, Write Data Register, ALU, Register Bank, Barrel Shifter and Multiplier [9]. Fig. 3 shows the architecture of ARM7.

The Register Bank of ARM7 consists of 31 32-bit registers, which are organized as general purpose and special function registers. ARM7 can be operated in 6 modes. The general purpose registers are common to all 6 modes, and the special function registers that are accessible depend upon the mode.

The six modes of ARM7 are User mode, Fast Interrupt (FIQ) mode, Interrupt Request (IRQ) mode, Supervisor mode, Undefined mode and Abort mode. User mode is an unprivileged mode in which most tasks are run and the user has access to all the registers. ARM7 enters FIQ mode when a high-priority interrupt is raised and enters IRQ mode when a low-priority interrupt is raised. ARM7 enters Supervisor mode when it is reset or when a software interrupt is executed. The Abort mode and Undefined mode are entered on memory violations and undefined instructions respectively [9]. Fig. 4 shows the Register Bank of the ARM7 microprocessor.
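The count of 31 registers can be reproduced with a small sketch of register banking. The banking layout used here (FIQ mode with private copies of r8 to r14, the other privileged modes with private copies of r13 and r14) follows the standard ARM programmer's model and is an assumption about how the registers are grouped; it does not describe the schematic of Fig. 4. The names `BANKED` and `physical_register` are hypothetical.

```python
# Registers visible in User mode: r0 to r15.
USER_REGS = [f"r{i}" for i in range(16)]

# Registers that have a private (banked) copy in each privileged mode.
# Assumed layout, following the standard ARM programmer's model.
BANKED = {
    "fiq": [f"r{i}" for i in range(8, 15)],  # r8 to r14
    "irq": ["r13", "r14"],
    "svc": ["r13", "r14"],
    "abt": ["r13", "r14"],
    "und": ["r13", "r14"],
}


def physical_register(mode: str, reg: str) -> str:
    """Map an architectural register name to the physical register for a mode."""
    if reg in BANKED.get(mode, []):
        return f"{reg}_{mode}"
    return reg  # shared with User mode


def total_general_registers() -> int:
    """Shared User-mode registers plus all banked copies."""
    return len(USER_REGS) + sum(len(regs) for regs in BANKED.values())


if __name__ == "__main__":
    print(total_general_registers())
    print(physical_register("fiq", "r10"), physical_register("irq", "r10"))
```

The sixteen User-mode registers, seven FIQ copies and two copies for each of the remaining four privileged modes add up to the 31 registers stated above.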



Fig. 3: Architecture of ARM7 Microprocessor.



Fig. 4: Register Bank of ARM7 Microprocessor.

# 5. Implementation of data path components of ARM7 microprocessor

This paper implements the data path components of the ARM7 microprocessor using the Cadence Virtuoso Tool in 45nm technology. All the components have been implemented in the Sleep Transistor technique using Sub threshold Current Mode Logic (STCML) with a power supply of 0.24V. The 32-bit barrel shifter can be designed using 32 multiplexers. The function of the Barrel Shifter is to shift and rotate the input bits coming from the Data-In Register of ARM7.

The output of the barrel shifter is given to the ALU for performing the arithmetic and logical operations [10]. The Vedic Multiplier is one of the fastest and simplest multipliers in terms of its architecture, and it performs multiplication based on a sutra called Urdhva Tiryakbhyam. The 32x32 Vedic Multiplier has been designed using four 16x16 Vedic Multipliers, two 32-bit Ripple Carry Adders and Half Adders [11].
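A multiplexer-based barrel shifter can be modelled behaviourally as follows. This is a minimal sketch of one common arrangement, a logarithmic shifter with five stages of 2:1 multiplexers, and is an assumption for illustration; the multiplexer organization of the implemented schematic may differ. The helper names are hypothetical.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1


def mux2(sel: int, d0: int, d1: int) -> int:
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def rotate_stage(bits, amount: int, enable: int):
    """One stage of 32 multiplexers: rotate left by `amount` when enabled.

    `bits` is a list of 32 bits with index 0 as the least significant bit.
    """
    return [mux2(enable, bits[i], bits[(i - amount) % WIDTH]) for i in range(WIDTH)]


def barrel_rotate_left(value: int, shift: int) -> int:
    """Rotate a 32-bit value left by 0..31 positions using 5 mux stages."""
    bits = [(value >> i) & 1 for i in range(WIDTH)]
    for stage in range(5):  # stage amounts 1, 2, 4, 8, 16
        enable = (shift >> stage) & 1
        bits = rotate_stage(bits, 1 << stage, enable)
    return sum(bit << i for i, bit in enumerate(bits))


def barrel_shift_left(value: int, shift: int) -> int:
    """Logical shift left: rotate, then clear the bits that wrapped around."""
    return barrel_rotate_left(value, shift) & (MASK << shift) & MASK


if __name__ == "__main__":
    print(hex(barrel_rotate_left(0x12345678, 8)))
    print(hex(barrel_shift_left(0x80000001, 1)))
```

Each stage is controlled by one bit of the shift amount, so any rotation from 0 to 31 positions is obtained in five multiplexer delays.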
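The decomposition of a 32x32 multiplier into four 16x16 multipliers can be sketched as below. This is a behavioural illustration only: the four half-width partial products are combined with ordinary integer addition, which stands in for the ripple carry adders and half adders and does not reproduce their exact arrangement. The function name and the recursion down to single bits are assumptions made for the example.

```python
def vedic_multiply(a: int, b: int, width: int = 32) -> int:
    """Multiply two unsigned `width`-bit numbers from four half-width products.

    `width` must be a power of two. At width 1 the product is a single AND gate.
    """
    if width == 1:
        return a & b
    half = width // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half

    # Four half-width multipliers (vertical and crosswise products).
    lo_lo = vedic_multiply(a_lo, b_lo, half)
    lo_hi = vedic_multiply(a_lo, b_hi, half)
    hi_lo = vedic_multiply(a_hi, b_lo, half)
    hi_hi = vedic_multiply(a_hi, b_hi, half)

    # Adder stage: align the partial products by weight and sum them.
    return lo_lo + ((lo_hi + hi_lo) << half) + (hi_hi << width)


if __name__ == "__main__":
    x, y = 0xFFFFFFFF, 0x12345678
    print(vedic_multiply(x, y) == x * y)
```

The two cross products carry the same weight, which is why they can be added first and then shifted by half the operand width before the final summation.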

a) 32-bit Arithmetic and Logic Unit

The presented 32-bit Arithmetic and Logic Unit is capable of performing 14 arithmetic and logic operations, as shown in Table 1. Seven multiplexers are used in the design, so that 7 of the 14 operations are selected at a time for each logic level (low or high) of the select line. The entire schematic has been implemented using only basic logic gates, half adders and multiplexers.

Table 1: The Operations Performed by 32-Bit ALU of ARM7

| Sl. No. | Select line | Output     | Operation        |
|---------|-------------|------------|------------------|
| 1       | 1           | out1<31:0> | Add/Subtract     |
| 2       | 1           | out2<31:0> | Logical AND      |
| 3       | 1           | out3<31:0> | Bit Clear        |
| 4       | 1           | out4<31:0> | Bit Invert       |
| 5       | 1           | out5<31:0> | Increment        |
| 6       | 1           | out6       | Shift Left       |
| 7       | 1           | out7       | Rotate Left      |
| 8       | 0           | out1<31:0> | Logical EXOR     |
| 9       | 0           | out2<31:0> | carry_borrow     |
| 10      | 0           | out3<31:0> | Logical OR       |
| 11      | 0           | out4<31:0> | Compare if equal |
| 12      | 0           | out5<31:0> | Pass A input     |
| 13      | 0           | out6       | Shift Right      |
| 14      | 0           | out7       | Rotate Right     |
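The operation grouping of Table 1 can be expressed as a behavioural model. The sketch below is an illustration under several assumptions that the table does not specify: a hypothetical `subtract` control input chooses between addition and subtraction, Bit Clear is taken as `a AND (NOT b)`, Bit Invert as `NOT a`, and the shift and rotate operations are taken to move the operand `a` by one position and return the full 32-bit word.

```python
MASK32 = 0xFFFFFFFF


def alu(a: int, b: int, sel: int, subtract: bool = False) -> dict:
    """Behavioural model of the seven outputs for one value of the select line."""
    a &= MASK32
    b &= MASK32
    full = a - b if subtract else a + b  # unbounded result, used for carry/borrow

    if sel == 1:
        return {
            "out1": full & MASK32,                    # Add/Subtract
            "out2": a & b,                            # Logical AND
            "out3": a & ~b & MASK32,                  # Bit Clear (assumed a AND NOT b)
            "out4": ~a & MASK32,                      # Bit Invert (assumed NOT a)
            "out5": (a + 1) & MASK32,                 # Increment
            "out6": (a << 1) & MASK32,                # Shift Left by one (assumed)
            "out7": ((a << 1) | (a >> 31)) & MASK32,  # Rotate Left by one (assumed)
        }
    return {
        "out1": a ^ b,                                # Logical EXOR
        "out2": int(full < 0 or full > MASK32),       # carry_borrow
        "out3": a | b,                                # Logical OR
        "out4": int(a == b),                          # Compare if equal
        "out5": a,                                    # Pass A input
        "out6": a >> 1,                               # Shift Right by one (assumed)
        "out7": ((a >> 1) | (a << 31)) & MASK32,      # Rotate Right by one (assumed)
    }


if __name__ == "__main__":
    for name, value in alu(0x80000001, 0x0000000F, sel=1).items():
        print(name, hex(value))
```

Returning all seven outputs for a given select value mirrors the seven multiplexers of the design, each of which passes one of two candidate results.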

b) Register Bank and 32-bit Data-In Register

The schematic of the Register Bank in User mode is shown in Fig. 5. User mode has all the registers, including the general purpose and special function registers. The registers used in the design are Parallel-In Parallel-Out (PIPO) registers. The 32-bit Address Register is built from D flip-flops, and the 32-bit Data-Out Register has been designed in the same fashion.



Fig. 5: Schematic of Register Bank.

c) Integration of Data path unit of ARM7

The integration of the data path unit of the ARM7 microprocessor is shown in Fig. 6. All the blocks have been implemented in the Sleep Transistor Technique incorporating Sub threshold Current Mode Logic.



Fig. 6: Schematic of Complete Data path Unit of ARM7.

# 6. Simulation results

The Simulation Results of Data path components of ARM7 have been observed in Spectre simulator in the Analog Design Environment at 0.24V supply voltage.

a) Simulation results of 32-bit Arithmetic and Logic Unit and Register Bank

The simulation results of the Register Bank are shown in Fig. 9. These results represent ARM7 in User mode. The register bank has a group of general purpose registers from r0 to r7. The simulation results of the 32-bit ALU are shown in Fig. 7 and Fig. 8 for the two values of the select line. Following the assignment in Table 1, they show out1<31:0> Add/Subtract, out2<31:0> Logical AND, out3<31:0> Bit Clear, out4<31:0> Bit Invert, out5<31:0> Increment and out6 Shift Left for one value of the select line, and out1<31:0> Logical EXOR, Logical OR, out4<31:0> equality comparison, out5<31:0> passing a<31:0>, out6 Shift Right and out7 Rotate Right for the other.

[Waveform plot: b<31:0>, a<31:0>, out1<31:0> to out5<31:0>, sel, rst, carry<3:0> and out7 versus time (ns), 0 to 60 ns.]

Fig. 7: Output Results of ALU when Sel='0'.



Fig. 8: Output Results of ALU when Sel='1'.



b) Comparison of Power and Delay for Data path Blocks

The data path blocks of the ARM7 microprocessor have also been implemented in CML logic at 0.5V and 0.24V power supplies, the latter being STCML without sleep transistors. Table 2 gives the power and delay comparison between the three design techniques.

Table 2: Power and Delay Comparison at Different Power Supplies

| Design Block               | STCML with Sleep Transistor at 0.24V: Delay | STCML with Sleep Transistor at 0.24V: Power | STCML at 0.24V: Delay | STCML at 0.24V: Power | CML at 0.5V: Delay | CML at 0.5V: Power |
|----------------------------|----------|----------|----------|----------|----------|----------|
| 32-bit Barrel Shifter      | 1.211 µs | 17.41 nW | 20.82 µs | 18.46 nW | 20.17 µs | 1.505 µW |
| 32-bit ALU                 | 1.29 µs  | 153.5 nW | 25.68 µs | 172.5 nW | 9.79 µs  | 14.72 µW |
| Register Bank              | 18.57 µs | 479 nW   | 439.1 ns | 525 nW   | 173.4 ns | 763.7 nW |
| 32-bit Data-In Register    | 750.6 µs | 9.56 nW  | 890 µs   | 10.79 nW | 465.2 ns | 942.8 nW |
| 32-bit Data-Out Register   | 748.9 µs | 9.5 nW   | 887.5 µs | 10.68 nW | 459.8 ns | 940.2 nW |
| 32-bit Address Register    | 19.57 µs | 26.4 nW  | 27.9 µs  | 23.56 nW | 347.7 ns | 680.4 nW |
| 32-bit Address Incrementer | 78.6 µs  | 3.6 nW   | 89.2 µs  | 5.39 nW  | 25.04 µs | 575.4 nW |
| 32x32 Vedic Multiplier     | 520.9 µs | 350.4 nW | 658.4 µs | 386.4 nW | 19.29 µs | 78.02 µW |
# 7. Conclusion

The design of power-efficient data path components of the 32-bit ARM7 processor has been successfully accomplished using the Sleep Transistor Technique with STCML in 45nm technology using the Cadence Virtuoso Tool. The objectives discussed in the introduction of the paper have been justified, with the designed blocks consuming less power as desired. This design is suitable for products whose specifications call only for power efficiency, as the delay has increased as a consequence of the voltage scaling. The design can be further modified by implementing it with other low-power logic styles and by further scaling the technology, depending upon present and future needs.

#### References

- [1] Kanika Kaur, Arti Noor, "Strategies & Methodologies for Low Power VLSI Designs: A Review," International Journal of Advances in Engineering & Technology, ISSN: 2231-1963, vol. 1, Issue 2, pp. 159-165, 2011.
- [2] Bipin Gupta, Sangeeta Nakhate, "Transistor Gating: A Technique for Leakage Power Reduction in CMOS Circuits," International Journal of Emerging Technology and Advanced Engineering, vol. 2, Issue 4, 2010.
- [3] Md. Kamaruzzaman, Soumitra Kumar Mandal, "A Novel approach for Leakage Power Reduction Techniques in CMOS VLSI Circuits," International Journal of Industrial Electronics and Electrical Engineering, vol. 4, Issue 8, 2016.
- [4] Vijaylaxmi C Kalal, Ravi Kumar K. I, Chaitrali V. Pawan, "Novel Low Power Logic Gates using Sleepy Techniques," International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, vol. 4, Issue 1, 2015.
- [5] Sherif M. Sharroush, "Performance Optimization of MOS Current-Mode Logic," International Conference on Electrical, Electronics and Optimization Techniques, IEEE 2016, Egypt.
- [6] Yuxin Bai, Yanwei Song, Mahdi Nazm Bojnordi, Alexander Shapiro, Engin Ipek, Eby G. Friedman, "Architecting a MOS Current Mode Logic (MCML) Processor for Fast, Low Noise and Energy-Efficient Computing in the Near-Threshold Regime," IEEE Solid state circuits, 2015.
- [7] Armin Tajalli, Elizabeth J. Brauer, Yusuf Leblebici, Eric Vittoz, "Subthreshold Source Coupled Logic Circuits for Ultra-Low-Power Applications," IEEE Journal of Solid-State Circuits, vol. 43, No. 7, 2008.
- [8] Rajiv Gopal, M Murali Krishna, "Subthreshold Design using SCL for Low Power Applications," International Journal of Recent Advances in Engineering & Technology, vol. 2, Issue 3, 2014.
- [9] Steve Furber, ARM System On Chip Architecture, Addison Wesley (2000).
- [10] Enoch O. Hwang, Digital Logic and Microprocessor Design with VHDL, Brooks/Cole (2005).
- [11] Anu Thomas, Ashly Jacob, Serin Shibu, Swathi Sudhakaran, "Comparison of Vedic Multiplier with Conventional Array and Wallace Tree Multiplier," International Journal of VLSI Design and Communication Systems, ISSN: 2322-0929, vol. 4, Issue 4, pp. 0244-248, 2016.