

**International Journal of Engineering & Technology** 

Website: www.sciencepubco.com/index.php/IJET

Research paper



# Optimize Power in Integrated Circuits by Simple Power-Critical Nets Re-routing

Mohammed Darmi<sup>1\*</sup>, Lekbir Cherif<sup>1</sup>, Jalal Benallal<sup>1</sup>, Rachid Elgouri<sup>2</sup>, Nabil Hmina<sup>1</sup>

<sup>1</sup>Laboratory of Systems Engineering, National School of Applied Sciences, Ibn Tofail University, Kenitra, Morocco

<sup>2</sup>Laboratory of Electrical Engineering & Telecommunication Systems, National School of Applied Sciences, Ibn Tofail University, Kenitra, Morocco

\*Corresponding author E-mail: mohammed.darmi@gmail.com

## Abstract

This work presents a new technique for the integrated circuit (IC) physical design flow that aims to reduce the power consumed by the interconnections. The technique ranks nets by their power consumption, then drives the global router to prioritize the nets that are critical in terms of power consumption. For maximum optimization, the re-routing considered only the 30% of nets that consume 88% of the total power. The technique was implemented and evaluated experimentally on a high-speed circuit (2 GHz) built in the 7-nm technology node. The goal was to achieve a significant power reduction with no degradation of circuit performance while maintaining an acceptable congestion overflow when routing the interconnections. The new power-aware net re-routing improves the switching power of the targeted data nets by 12% and the total power of the entire design by 6%.

Keywords: Dynamic power, Integrated Circuit conception, switching power, low power, power optimization, total power.

# 1. Introduction

The increasing complexity of integrated circuits makes their power consumption a priority concern. It must be taken into account from the architecture stage through all phases of circuit design. Power optimization has multiple benefits, such as increasing device density, increasing clock frequency, extending battery life and reducing packaging costs; higher energy consumption means greater heat dissipation, which requires a more efficient cooling system [1;2;3]. Several approaches can be used to reduce energy and power consumption. The most widely used low-power techniques are supply voltage reduction, clock gating, multiple-VT library cells, multi-voltage design, power gating, and dynamic voltage-frequency scaling. These techniques can be implemented during the circuit design (RTL coding) and/or logic synthesis stages.

At the physical design stage, power optimization can target both types of power consumption, (i) static power and (ii) dynamic power, on the two parts of the design: components (standard cells and macros) and interconnections. In previous research, we implemented a technique called multi-bit flip-flop merging at the physical implementation phase, which helps reduce power in clock-tree elements [4]. Reference [5] presents techniques to reduce power in standard-cell elements, while [6] focuses on technological solutions to decrease the total power consumed in the design.

This research introduces a new technique that optimizes the power consumption of the interconnections by acting on the routing topology without impacting the IC area. Indeed, the dynamic power of the IC interconnections depends on the coupling capacitance. A procedure that ranks nets from the highest to the lowest power consumption was applied prior to the global route. Power-critical nets were routed first, with high priority, and so achieved a short wire length, which reduces the dynamic power. To prove the benefit on a real test case, Mentor Graphics' physical design EDA tool Nitro-SoC<sup>TM</sup> and a high-speed design made in the advanced 7-nm technology node were used. The experiment shows a significant dynamic power improvement on the interconnections.

# 2. Material and methods

## 2.1. Power Consumption and wire capacitance calculation

## 2.1.1. Wire capacitance parameters and calculation

An isolated wire over the substrate can be modeled as a conductor over a ground plane. The wire capacitance has two major components: the parallel plate capacitance of the bottom of the wire to ground and the fringing capacitance arising from fringing fields along the edge of a conductor with finite thickness. In addition, a wire adjacent to a second wire on the same layer can exhibit capacitance to that neighbor. These effects are illustrated in Fig. 1. The classic parallel plate capacitance formula is [7]:

$$C = \frac{\varepsilon}{h} W L \tag{1}$$

Where:

W: is the width of the wire,

L: is the length of the wire,

h: is the thickness of the dielectric layer,

ε: is the permittivity of the dielectric layer.
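As a minimal sketch of Eq. (1), the snippet below evaluates the parallel plate capacitance for two wire lengths. The function name and all numeric values (wire width, dielectric thickness, relative permittivity) are assumptions chosen for illustration; they are not taken from the 7-nm design studied here.

```python
# Permittivity of free space in F/m (physical constant).
EPSILON_0 = 8.854e-12


def parallel_plate_capacitance(width_m, length_m, dielectric_thickness_m,
                               relative_permittivity):
    """Return C = (epsilon / h) * W * L in farads, as in Eq. (1)."""
    epsilon = relative_permittivity * EPSILON_0
    return epsilon / dielectric_thickness_m * width_m * length_m


# Illustrative values only (assumed, not from the paper's design):
# 50 nm wide wire, 50 nm dielectric, relative permittivity 3.0.
c_long = parallel_plate_capacitance(50e-9, 100e-6, 50e-9, 3.0)
c_short = parallel_plate_capacitance(50e-9, 80e-6, 50e-9, 3.0)
print(f"{c_long:.3e} F, {c_short:.3e} F, ratio = {c_short / c_long:.2f}")
```

Because Eq. (1) is linear in L, shortening the wire by a given fraction lowers this capacitance component by the same fraction.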

Copyright © 2018 Authors. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.



A cross-section of the model used for capacitance upper bound calculations is shown in Fig. 2. The total capacitance of the conductor of interest is the sum of its capacitance to the layer above, the layer below, and the two adjacent conductors.

If the layers above and below are not switching, they can be modelled as ground planes and this component of capacitance is called  $C_{gnd}$  [7].



Fig. 2: Total capacitance of the conductor.
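The sum described above can be written compactly. The symbols $C_{top}$, $C_{bot}$ and $C_{adj}$ are notation introduced here for illustration, and the factor of two assumes that both adjacent conductors contribute the same capacitance:

$$C_{total} = C_{top} + C_{bot} + 2\,C_{adj}, \qquad C_{gnd} = C_{top} + C_{bot}$$

Under this assumption, $C_{gnd}$ collects the contributions of the non-switching layers above and below, while $C_{adj}$ captures the coupling to each same-layer neighbor.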

## 2.1.2. Power consumption calculation

At the transistor level, power consumption of an IC can be broken down into two additive terms: (1) static consumption which is due mainly to parasitic currents (leakage currents), and (2) dynamic consumption resulting from the switching activity of circuits [8]. Previous studies on the CMOS IC power consumption related to the hardware aspect have led us to the expression:

$$\boldsymbol{P}_{total} = \boldsymbol{P}_{dyn} + \boldsymbol{P}_{leak} \tag{2}$$

The static power "$P_{leak}$" is due to the transistor leakage current that flows whenever power is applied to the device, independent of the clock frequency or switching activity. The dynamic power "$P_{dyn}$" is consumed while transistors switch and depends mostly on the clock frequency and the total capacitance of the interconnections. It consists of switching power and internal power, distributed over cells and interconnections as follows:

For the cells, the dynamic power model is as follows:

$$P_{dyn} = P_{int} + P_{switch} \tag{3}$$

$$P_{int} = \frac{1}{2} \left( E_{rise} + E_{fall} \right) TR \tag{4}$$

Where:

$E_{rise}$ / $E_{fall}$: are the rise energy and the fall energy,

TR: is the toggle rate, i.e. the number of toggles per time unit.

$$P_{switch} = \frac{1}{2} C V^2 T R \tag{5}$$

Where:

C: is the total wire capacitance,

V: is the power supply voltage.

For the interconnections, the dynamic power is modelled as follows:

$$P_{dyn} = \frac{1}{2} C V^2 T R \tag{6}$$
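The following sketch evaluates Eq. (6) for a single hypothetical net before and after its capacitance is reduced. The function name, the supply voltage, the toggle rate and the capacitance values are all assumptions made for illustration only.

```python
def switching_power(capacitance_f, supply_voltage_v, toggle_rate_hz):
    """Return P = 1/2 * C * V^2 * TR in watts, as in Eq. (6)."""
    return 0.5 * capacitance_f * supply_voltage_v ** 2 * toggle_rate_hz


# Hypothetical net: every value below is assumed for illustration.
V_DD = 0.75          # supply voltage in volts
TOGGLE_RATE = 2.0e8  # toggles per second

p_before = switching_power(10e-15, V_DD, TOGGLE_RATE)  # 10 fF wire
p_after = switching_power(8e-15, V_DD, TOGGLE_RATE)    # 8 fF wire
saving = 1.0 - p_after / p_before
print(f"before = {p_before:.3e} W, after = {p_after:.3e} W, "
      f"saving = {saving:.0%}")
```

With V and TR held constant, the relative power saving equals the relative capacitance reduction, which is why the capacitance is the parameter of interest at this stage.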

The previous sections introduced the wire capacitance and the power calculation for an interconnect wire. The aim was to show the relationship between the dynamic power consumed by an interconnection and its capacitance. One way to reduce the wire capacitance is to decrease the wire length. To avoid deteriorating the routing of the interconnections, only the nets that consume nearly 90% of the total power consumed by all data nets were re-routed in this study.

This work presents a wire optimization technique for power saving on the interconnections at the physical implementation phase of an IC. At this stage, the circuit voltage and the TR are fixed by the circuit function. The remaining parameter is the interconnection capacitance. This represents an opportunity for significant power saving using routing transforms.

The complexity of capacitance variations makes it difficult to determine which combination of layers and via structures should be used for a given net to consume less power while maintaining acceptable timing and good routing. This can be achieved by prioritizing power-critical nets, coupled with a careful setup of the basic Nitro-SoC "route\_global" command, enabling the routing engines to efficiently trade off timing quality of results (QoR) and congestion.

## 2.2. Layer optimization to reduce net power

The EDA tool Nitro-SoC<sup>TM</sup> [9]-[11], which performs both leakage and dynamic power optimization, was used for this research. Power optimization can be performed either as an integrated stage within the full physical implementation flow or as a separate step after the post-route optimization step. Fig. 1 shows a typical flow diagram for IC design from system specification to final chip manufacturing, with a focus on the physical implementation stage [12;13;14]. The proposed flow is designed to run after the final "Timing Closure" step, also called the "post-route opt" step. Previous works [8, 9] have demonstrated that in advanced technology nodes, 28 nm and below, static power represents on average 10% of the total power consumption. In addition, they report that internal power represents 20% to 30% of the dynamic power, and that switching power represents 70% to 80% of the overall consumption of a circuit. This explains why we focus on switching power optimization. Fig. 3 shows the power breakdown of the design used for this study. The target for power optimization was all "Data Nets", which consume 34% of the total power.



Fig. 3: Total power distribution in a digital design block made in the 7-nm node.

In addition to the existing EDA tool, it is necessary to take maximum advantage of the re-routing transforms during power optimization. The proposed routing optimization technique aims to reduce the power consumption of critical nets by re-routing the power-critical nets before the other data nets, without taking timing degradation into account. Then, the native Nitro-SoC "route global" command was used to let the routing engine manage timing and congestion efficiently and deliver the required results. Alongside the EDA tool, there is a TCL-based reference flow script that runs each part of the Place & Route design flow step by step. It consists of a set of organized TCL scripts that cover the stages from placement to post-route. The baseline flow includes all the commands and settings needed to implement a large variety of designs, from low to high complexity. The Nitro-SoC<sup>TM</sup> tool [10] begins with floor-planning and placement, and handles CTS and routing [11, 12], as shown in Fig. 4 (black rectangles). The enhanced solution is a set of TCL scripts built from basic Nitro-SoC<sup>TM</sup> commands. This solution was added immediately after the "Pre-Clock Tree Synthesis" (Pre-CTS) stage within the reference flow. Fig. 4 (green rectangles) shows where the wire optimization was added inside the initial flow (baseline flow).

| Algorithm 1: Layer Optimization |
|---|
| 1: For Target_Nets $\in$ {5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% and 100% of "Data Nets"} $\mathbf{do}$ |
| 2: Read design database |
| 3: Remove existing global routing |
| 4: Procedure: Identify target critical-power nets |
| 5: Run global route on critical-power nets |
| 6: Run global route on data nets, except already routed critical-power nets |
| 7: Run global route with high timing effort for timing recovery on data nets |
| 8: Report QoR (wire length, timing, power, and congestion) |
| 9: End for |

| Algorithm 2: Procedure: Identify target critical-power nets |
|---|
| 1: For Target_Nets $\in$ {5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% and 100% of "Data Nets"} $\mathbf{do}$ |
| 2: Rank nets from the highest to the lowest power consumption |
| 3: Report the number of target nets |
| 4: Report consumed dynamic power |
| 5: Calculate and report the percentage of target nets power relative to the total data nets power |
| 6: Return Target_Nets |
| 7: End for |
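The net selection procedure can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' TCL implementation: the function names, the dictionary input format and the per-net power values are hypothetical, and the actual routing steps performed by the EDA tool are represented only by the order in which nets are returned.

```python
def identify_critical_power_nets(net_power, target_fraction):
    """Rank nets by power and return the top `target_fraction` of them.

    net_power: dict mapping net name -> dynamic power (any consistent unit).
    Returns (target_nets, target_power, share_of_total_power).
    """
    if not 0.0 < target_fraction <= 1.0:
        raise ValueError("target_fraction must be in (0, 1]")
    # Rank nets from the highest to the lowest power consumption.
    ranked = sorted(net_power, key=net_power.get, reverse=True)
    # Number of target nets (at least one).
    count = max(1, round(len(ranked) * target_fraction))
    target_nets = ranked[:count]
    # Power of the target nets and its share of all data-net power.
    target_power = sum(net_power[name] for name in target_nets)
    total_power = sum(net_power.values())
    share = target_power / total_power if total_power else 0.0
    return target_nets, target_power, share


def routing_order(net_power, target_fraction):
    """Return (critical nets, remaining nets): critical nets are routed first."""
    critical, _, _ = identify_critical_power_nets(net_power, target_fraction)
    critical_set = set(critical)
    remaining = [name for name in net_power if name not in critical_set]
    return critical, remaining


# Made-up per-net powers in microwatts, for illustration only.
example = {"n0": 40.0, "n1": 25.0, "n2": 15.0, "n3": 8.0, "n4": 5.0,
           "n5": 3.0, "n6": 2.0, "n7": 1.0, "n8": 0.6, "n9": 0.4}
nets, power, share = identify_critical_power_nets(example, 0.30)
print(nets, power, f"{share:.0%}")
```

In a real flow the two lists returned by `routing_order` would be handed to successive global-route invocations, followed by the timing-recovery pass and the QoR report.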



Fig. 4: Baseline design flow vs. featured flow.

# 3. Results and Discussions

Starting from the existing implementation flow described in the previous section, different tests were performed; for each test, the "Incremental Layer Optimization" function was added, targeting a percentage of the data nets that consume the most power. The Nitro-SoC<sup>TM</sup> EDA tool [11] was used to perform the physical implementation.

To measure the real impact of each trial on circuit performance, this optimization method was applied to a high-speed design in an advanced 7-nm technology node. Its main characteristics are summarized in Table 1.

Table 1: Design specification

| Number of instances | Number of macros | Number of nets | Physical area | Max frequency | Technology node |
|---------------------|------------------|----------------|---------------|---------------|-----------------|
| 313982 | 12 | 335736 | 115737 µm<sup>2</sup> | 2 GHz | 7 nm |

The first metric to analyze is the distribution of power consumption across all data nets. Fig. 5 shows how the dynamic power on data nets is distributed. It is important to point out that by targeting only 30% or 40% of data nets for power reduction, more than 88% of the total dynamic power is covered by the optimization. A significant power reduction can thus be obtained with minimal run-time.



Fig. 5: Data nets dynamic power consumption distribution.

After finding the optimal number of target nets for optimization, it is important to know how much the dynamic power on all data nets is reduced in each case. Fig. 6 shows a significant power reduction of up to 20% on data nets when the number of target nets exceeds 40% of the total data nets.

Beyond 40%, the power gain remains almost the same. This behavior is expected because 40% of data nets consume 93% of the total power. Therefore, as a first result, a significant power reduction was achieved without targeting all data nets for optimization. The reduction in total power (leakage and dynamic) approaches 7% when targeting 40% of nets.
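As a rough consistency check on these figures, assume that data nets keep the 34% share of total power reported with Fig. 3 and that no other power component changes. A reduction of up to 20% on data nets then corresponds to ($\Delta P_{total}$ is notation introduced here for this check):

$$\Delta P_{total} \approx 0.34 \times 0.20 = 0.068 \approx 6.8\%$$

This estimate is in line with the total power reduction of about 7% quoted for the 40% case.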



Fig. 6: Dynamic power gain on data nets.

Table 2 summarizes the timing results, Worst Negative Slack and Total Negative Slack (WNS/TNS), together with the congestion overflow percentage.

Table 2: Timing and overflow analysis

| Targeted nets (% of all data nets) | WNS (ps) | TNS (ns) | Overflow (%) |
|------------------------------------|----------|----------|--------------|
| 100% | -94.8 | -10.41 | 4.400 |
| 90%  | -94.8 | -11.44 | 4.323 |
| 80%  | -94.8 | -11.92 | 4.158 |
| 70%  | -93.8 | -11.45 | 4.090 |
| 60%  | -93.9 | -13.24 | 4.055 |
| 50%  | -87.7 | -13.36 | 4.005 |
| 40%  | -72   | -8.62  | 3.949 |
| 30%  | -74.6 | -9.19  | 3.847 |
| 20%  | -75.2 | -11.11 | 3.658 |
| 10%  | -86.1 | -11.42 | 3.237 |
| 5%   | -81.5 | -18.53 | 2.315 |
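One way to read Table 2 is to look for the targeted percentage with the least negative slack and to check how overflow evolves. The sketch below does this on the values transcribed from Table 2, reading each slack entry as negative; the helper name `best_by` is hypothetical and the selection criterion is an assumption made for illustration, not a criterion stated by the authors.

```python
# Rows transcribed from Table 2:
# (targeted nets %, WNS in ps, TNS in ns, overflow in %).
TABLE_2 = [
    (100, -94.8, -10.41, 4.400),
    (90, -94.8, -11.44, 4.323),
    (80, -94.8, -11.92, 4.158),
    (70, -93.8, -11.45, 4.090),
    (60, -93.9, -13.24, 4.055),
    (50, -87.7, -13.36, 4.005),
    (40, -72.0, -8.62, 3.949),
    (30, -74.6, -9.19, 3.847),
    (20, -75.2, -11.11, 3.658),
    (10, -86.1, -11.42, 3.237),
    (5, -81.5, -18.53, 2.315),
]


def best_by(rows, column):
    """Return the row whose slack in `column` is least negative."""
    return max(rows, key=lambda row: row[column])


best_wns = best_by(TABLE_2, 1)
best_tns = best_by(TABLE_2, 2)

# Check whether overflow grows with the targeted percentage.
by_target = sorted(TABLE_2)
overflow_monotonic = all(a[3] <= b[3]
                         for a, b in zip(by_target, by_target[1:]))
print(best_wns[0], best_tns[0], overflow_monotonic)
```

On these values, the 40% row has the least negative WNS and TNS, while overflow rises steadily as more nets are targeted.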

# 4. Conclusion

This research proposes a new incremental power optimization technique to be applied after the pre-CTS stage of the IC design flow. The technique leads to a significant dynamic power improvement through a simple re-routing of critical nets. The power on all data nets was reduced by up to 20%, and the total power by 6%. This improvement was achieved by targeting only the 40% of data nets that consume 93% of the total power in the interconnections. This new technique for reducing net switching power during physical implementation is still under validation, and its impact on circuit performance and congestion still needs to be minimized through an optimal choice of targeted nets.

## Acknowledgement

This work was conducted with support from Mentor Graphics Corporation. We thank Dr. Hazem El Tahawy (Mentor Graphics, MENA Region Managing Director) for all provided support.

### References

- [1] D. J. Radack and J. C. Zolper, "A Future of Integrated Electronics: Moving Off the Roadmap," in Proceedings of the IEEE, vol. 96, no. 2, pp. 198-200, Feb. 2008. doi: 10.1109/JPROC.2007.911049
- [2] I. Lee and K. Lee, "The Internet of Things (IoT): Applications, investments, and challenges for enterprises," Business Horizons, vol. 58, no. 4, pp. 431–440, 2015.
- [3] D. Flynn, R. Aitken, A. Gibbons, K. Shi, Low power methodology manual: for system-on-chip design, 2nd ed., Springer, 2007, p. 13.
- [4] L. Cherif, M. Chentouf, J. Benallal, M. Darmi, R. Elgouri, and N. Hmina, "Usage and impact of multi-bit flip-flops low power methodology on physical implementation," 2018 4th International Conference on Optimization and Applications (ICOA), 2018.
- [5] M. Rahman, R. Afonso, H. Tennakoon and C. Sechen, "Design automation tools and libraries for low power digital design," 2010 IEEE Dallas Circuits and Systems Workshop, Richardson, TX, 2010, pp. 1-4.
- [6] G. J. Y. Lin, C. B. Hsu and J. B. Kuo, "Critical-path aware power consumption optimization methodology (CAPCOM) using mixed-VTH cells for low-power SOC designs," 2014 IEEE International Symposium on Circuits and Systems (ISCAS), Melbourne VIC, 2014, pp. 1740-1743.
- [7] J. M. Rabaey, A. P. Chandrakasan, and B. Nikolic, Digital Integrated Circuits: A Design Perspective, 2nd ed., Prentice Hall Electronics and VLSI Series, Upper Saddle River, NJ: Pearson Education, 2003.
- [8] J-G. Cousin, D. Chillet and O. Sentieys, "Power Estimation and Optimisation for ASIPs", Submitted to 1997 International Symposium on Low-Power Design, Mont. CA, Aug. 1997.
- [9] E. Macii, "High Level Design and Optimization for Low Power", NATO Advance Study: Low Power in Deep Submicron Electronics, Aug 1996.
- [10] Nitro-SoC<sup>™</sup> and Olympus-SoC<sup>™</sup> Software Version 2017.1. R2, August 2017.
- [11] Nitro-SoC<sup>™</sup> and Olympus-SoC<sup>™</sup> "User's Manual", Software Version 2017, August 2017.
- [12] Nitro-SoC<sup>TM</sup> and Olympus-SoC<sup>TM</sup> "Advanced Design Flows Guide", Software Version 2017, August 2017.
- [13] K. Ashok Kumar and P. Dananjayan, "Reduction of Power Consumption using Joint Low Power Code with Crosstalk Avoidance Code in Case of Crosstalk and Random Burst Errors," International Journal of Engineering & Technology, vol. 7, no. 3.12, pp. 62-68, 2018.
- [14] A. Yadlapati and K. Hari Kishore, "Low power synthesis for asynchronous FIFO using unified power format (UPF)," International Journal of Engineering & Technology, vol. 7, no. 2.8, pp. 7-9, 2018.