

**International Journal of Engineering & Technology** 

Website: www.sciencepubco.com/index.php/IJET

Research paper



# Design a High Speed Parallel Prefix Adder using Generator and Propagator

Gajula Lakshminarayana<sup>1\*</sup>, Anupama A. Deshpande<sup>2</sup>, MoparthyGurunadha Babu<sup>3</sup>

<sup>1</sup>Ph.D.Scholar, Shri Jagdish Prasad Jhabarmal Tibrewala University, Jhunjhunu, Churela, Rajasthan 333001 <sup>2</sup>Professor, Dept. of EEE, Shri Jagdishprasad Jhabarmal Tibrewala University, Jhunjhunu, Churela, Rajasthan 333001 <sup>3</sup>Professor & Dean, Dept. of ECE, CMR Institute of Technology, Hyderabad

#### Abstract

Basically, Adders are used in VLSI designs. The chip packaging techniques are used in low power circuits which increases the design cost. Here a parallel prefix adder is proposed which is a rudimentary purposeful element in greatest computational stages and produces high performance. Parallel prefix adder is mostly classified hooked on3phases which are pre-computation, precede then post subtraction. While operating in the sub-threshold regime the proposed system gives effective results. In parallel prefix adder the addition process is done by adding the bits in parallel. The speed of parallel operation decides the computational operation. The Parallel prefix adder is used to avoid the high power consumption obtained in the system. Parallel prefix adder are derivative after the transmit appearance ahead adders. The delay gets reduced by achieving low logical depth in the system. So the Proposed Adder reduces the delay. Compared to wave carry adding machine besides convey except calculator, similar preface adder gives better results. It is implemented on Xilinx 14.7 and delay measurements are done.

Keywords: Parallel Prefix Adder, Ripple carry adder, carry save adder (CSA), Field Programmable Gate Array (FPGA), Digital Signal Processing (DSP), Look Up Table (LUT).

# 1. Introduction

As we know that Addition is one of the operation done in digital signal processing and control systems. Different types of adders are proposed before. Depend upon the parameters adder is used. But if the speed is main constraint then we supposed to use this parallel prefix adder. In very large scale integrated circuits, parallel prefix adder is mostly used. The parallel prefix structures allow trade-offs to obtain required logic levels. Depending upon the adder performance the digital signal processor produces accurate results.

Basically, swell convey calculator is obtained since the generalpurpose processors and DSP processors. Repetitive additions are performed by using multiplier operand. But using propagation, the quality, Delay, and area is measured. In ripple carry adder first the carry propagation is overlapped and next addition operation is performed. But the ripple carry adder produces large delay in the system. To overcome this delay problem parallel prefix adder is invented. This adder produces faster operations, reduces power consumption and delay. To add more number of numbers together powerful adders should be used.

Basically, parallel prefix adder produces high speed multi operands. The scaling down of device dimensions into the Nano-meter range is likely to result in significantly higher defect rates during the manufacturing process of IC's. With significantly increased defect rates, defect tolerance mechanisms are necessitated to guarantee a reasonable yield. Post manufacturing reconfiguration techniques to bypass defects are already applied in memory systems and FPGA's. However, such low-cost defect tolerance techniques rely heavily on the relative independence of operations of the homogeneous components, such as LUT and memory cells. Logic systems, on the other hand, usually constitute heterogeneous components with strong dependencies among each other. This makes it hard to realize fine-grained, low-cost defect tolerance schemes for a high level of defect rate.

Among the various adders, PPA provides a general form to represent a wide range of adder design choices. Reliable PPA designs have mostly been done on the particular form of ripple carry adder. Structure and hardware redundancy. For performance purposes, the hardware of a typical parallel prefix adder is divided into two disjoint groups of the even bits and the odd bits. This provides a natural way to make the defects isolated: errors caused by the defects in one group will not affect the results produced by the other group. Furthermore, the in-built redundancy in a parallel prefix adder allows each group to be capable of generating the results for the other group, with a small hardware and time overhead.

# 2. Adders

In gadgets snake execution expansion activities. Viper exits in ALU, some Arithmetic rationale units contain different adders. Snake can build dependent on numerical articulations, for example, Excess-3 or paired coded decimal (BCD), the most the adders works on binary numbers. For single piece expansion, two sorts of adders are there. Consider An and B are two contributions of half snake and yields are whole S and Carry Co. S is the XOR task of two data sources An and B, and Co is the AND activity of two information sources An and B. maybe out-put of Half viper is entirety of useless numbers, and Co be the most noteworthy



number. Second kind of viper is full snake; it contains three sources of info A, B and Ci. A full viper is developed dependent on two half-adders.

Multi-bit adders are a few sorts in that swell convey viper is basic, and additionally slowest, since it engender each full snake. Convey look forward viper is work by making and proliferate from a couple of signifi-cannot bit position. At times P is only total of yield of a half viper and G is conveying yield for some snake. P and G make conveys of bit position. Multi bit design breaks the snake into squares. To diminish time squares required the convey length of circuits, this square dependent on convey sidestep snake. Here each piece is dictated by p and g esteems from each square.

#### 2.1. Half Snake

An and B are two single piece numbers, Half snake is utilized to include these two numbers and it produces whole 'S' and Carry out Co. Half snake utilizes just two single digit numbers. For ale including circuits like Full adders it might be utilized as expressing building square. Boolean articulations are

$$S = A \bigoplus B = A^1 B + A B^1 \tag{1}$$

$$Co = AB \tag{2}$$

### 2.2. Full adder

Consider A, B and C are three contributions of Full viper and yields are Sum 'S' and Carry 'Co'. Full snake is developed in arrangement connection by utilizing two half viper. The aggregate of An and Bare encouraged to second half-viper, which at that point adds it to the convey in C to create last yield S. the Carry out , Co, is the aftereffect of an OR activity obtained from the convey yields of both half adders.

#### 2.3. Parallel viper

Parallel adders are built with essential computerized circuits which figures the expansion task of paired proportionate in parallel way.

## **3. RCA**

Essentially, Ripple convey viper is a route which produce math whole of II double numbers. From figure (1) we can see that it is built with full adders which are in arrangement design. The RCA is made out of a course of full viper squares. In spite of the fact that the RCA is made out of a straightforward structure, which empowers simple implementation and it presents poor execution, in light of the fact that the convey is proliferated into every expansion square.

In this to each convey bit is undulated addicted to the following phase. But here delay is obtained in the circuit. The delay of reverse converters will bind the growth of logarithm in Parallel prefix adder. Basically, in RCA large number of adders will be rummage-sale. The bit distance of this calculator is very big.

To perform the first part of addition in the system, a desired structure with particular operands are used. Here first full adder is replaced by half adder as shown in figure (1). High power consumption is obtained due to the recursive effect. Recursive effect is obtained while producing in addition broadcasting the signs at precede near. Tall fan out characteristic also obtained in the system. So to overwhelmed the problematic of postponement occurred in the system, a new system is proposed. However, this problem is eliminated by using additional prefix level which is discussed in below section. In contrast to the proposed system, it can achieve a provisional augmentation depending upon the control indications as exposed in Fig. 1.



## 4. Proposed Parallel Prefix Adder

A similar prefix trip computes N productions commencing N efforts by means of an associative worker. Greatest prefix calculation is pre-calculated since middle variables of the contributions. The intermediate variables are added to the prefix network. In the parallel-prefix, calculators the amount is alienated into III major ladder: i) Pre-computation: the mid-way produce also proliferate two of a kinds (g, p) are collected by BESIDES exclusive- otherwise ii) Prefix: every one the ciphers intended in the primary step are treated by a prefix system, with a prefix compartment machinist; iii) Post-subtraction: by the transitional indications proliferate also the take gestures a sum consequence container be originate by using additional exclusive- or else entrance.

Similar prefix adder is a multi-bit bring circulate adder which is secondhand for similar sum of 2 multi-bit statistics. P.P.A spreads the produced along with broadcasted reason of the transmit appear ahead adder to execute adding smooth earlier. According to the adder's formation, the prefix totaling collections both standards straight from the contribution. The postponement is augmented by the conformation to facilitate have the uppermost dangerous trail similar to the adders which procedure supplementary than 2 efforts. The bear propagation purpose of equivalent prefix adder is assumed in reckoning (3).The final sum conformation is controlled by apprehending the standard spending from the final temperament of the carry. This can be experiential from calculation (4).

$$P[i, i+1] = P[i] \cdot P[i+1]$$
(3)

$$S[i+1] = P[i+1] \bigoplus C[i]$$
(4)

The below figure (2) demonstrations the construction of corresponding prefix adder. Parallel prefix adding machine contains of III phases. They are pre-dispensation stage, take group phase in addition post dispensation period. In pre-processing step, the engender too prolife rate indications are approved out. In carry cohort stage prefix diagrams are castoff to describe the tree assembly. At last in post meting outpoint, quantity moreover transfer is calculated.



In the point of precede totaling, the conveys are gathered rendering to the adder's arrangement concerning inferior cost, authority, besides interruption. Rendering to the adder's outline, the prefix subtraction collections together standard unswervingly from the participation with morals to facilitate be subtracted in the preprocessing period. The augmented delay is found by the arrangement so as to has the maximum dangerous trail similar to the computers which procedure additional than two participations.

In the post-meting out pace the bear principles with the aim of constitute the output, are gathered. They are touched finished the previous adder arrangement with the intention of is decided by a explanation that correct the difficulty of the precise carry broadcast designed intended for every contribution bit. The concluding amount shape is prearranged by an XOR meaning that imprisonments the principles impending commencing the concluding temperament of the bear.

Based on the inputs given the outcome of operation is performed. As we know that the parallel prefix adder performs and executes the operation in parallel. The obtained output will be segmented into smaller pieces. There are different topologies used in parallel prefix adder, but the operator is associative. Based on topology the operation is performed.

By using the associative binary operations, the algorithms will be generalized. This generalized algorithm performs certain operations and computed with particular efficiency. Basically there are two procedures followed in the system, they are in first pass the prefix sum as are calculated from the processing unit. And in second pass known prefix values are computed from processing unit to get initial value. So along with that the system performs two read operations and one write operation.

Here a methodology is employed to design PPA. The experimental results mainly, depend on part, postponement, besides control ingesting. The disbursement of extra zone then extraordinary will upsurge the authority feasting. Compared with VLSI implementations, the Parallel-prefix adders will produce performance differently. The modern FPGAs employ the fast-carry chain process to get faster results.

Our first parallel prefix adder architecture is founded on the adder building is shown in number (3). The 1 bit parallel-adder computes the entire process and the component collected of an n-bit adder aimed at the smallest important minutes besides an exclusive-N.O.R is gate. Carry output obtained at the end of gate. Here the time should be minimized in computation process. Because of this the speed of the operation is increased. By using recursive equation each bit position is derived. In the same way using a single row of prefix operator the addition operation of transmit moment is done in similar.

The adding procedure container be achieved straight by linking the transport contribution of the carry increase stage. Here the project is recursively enhanced to get heal their competence. There is no essential of the additional carry augmentation phase. The parallel-prefix calculator buildings are resulting through one fewer precede equal.



Fig. 3: Architecture of parallel prefix adder

The below figure (4) demonstrations the assembly of planned scheme. Basically equivalent precede adders is also named as carry tree adding machine. From below figure (4) we can observe that the system is in tree structure. Produce and propagate indications are pre-computed by the organization. Lot of families of adders are executed by the prefix system. In the equivalent prefix adders, the bring are all intended in corresponding. Coming to the black cell it produces prearranged pair plus approaching to the grey compartment it produces left indication.



Parallel-prefix assemblies are originating to be shared in elevated performance adders for the reason that of the postponement is logarithmically comparative to the adding machine width. In black cell partaking four inputs besides two productions broadcast means then maneuver also generation resources and-or maneuver. In ashen cell consuming three participations and one output here generation incomes along with before process.In the prefix period, group produce/propagate indications are intended at each bit using the given compilations.

$$\mathbf{G}_{i:k} = \mathbf{G}_{i:j} + \mathbf{P}_{i:j} \mathbf{G}_{j-1:k}$$
(5)

$$P_{i:k} = P_{i:j} P_{j-1:k}$$
(6)

A perfect prefix system determination be collected of: i) log2 (n) phase of logic, ii) fan-out not once beyond 2 at apiece period, and iii) rejection additional than one flat path of cable at each stage. A lot of parallel-prefix networks contain be explained in the nonfiction to work out the hold indication by captivating keen on version the mention individuality. In P.P.A there are some resemblances

between them because all of them are alienated into the same three steps. The alteration between them is the area, the number of computations and the delay. The delay charge is confirmed finished logarithmic function, and thus these adders are called as logarithmic interruption adders. In the pre-processing point, the carry indications obligatory for the arrangement of the indications that enable the subsequent steps are intended. In this part, propagate functions are used, where carry proliferation values are produced.



Fig. 5: Operator for the prefix calculation

The kind of P.P.A adding machine be contingent on the way of establishing the tree of making and spreading workers. To get exact carriers in the system we use produce in addition broadcast indications. Here the signals are calculated to apiece additional by using fundamental carry operator. Depending upon the type of signals different architectures are created.

Here the precede lockups are made by means of static C.M.O.S entrances. To buffer the outputs, it should consists of complex gate and inverters. Basically, complex gates should be split into various forms to reduce the delay occurred in the system. Next Inverters are added to the cell of prefix output. By this process the power consumption is also reduced.

Black cell and gray cells are used in proposed system. By using 4 bit ripple carry adder the carriers are computed. At first the inputs bits are converted into propagate and generate signals. By using this signal the carriers are propagated. After the process of propagation it is added to adder block. Compared to the existed system it produces less delay.

In the figure (4) the bottom left- hand corner of carry tree is input specification. To minimize delay at each row from top to bottom, inverters are used. After the process of construction the adder is verified functionally. Here the end points of carry tree design are marked. Wider adders are obtained by the multiplication of rows. These wider adders are truncated with specific width. When compared to the results of existed system, the proposed system gives efficient results. At last it reduces the delay and provides more long wires to route. Less area is taken to implement the proposed adder and also it has less wiring congestion. The adders are implemented on Verilog hardware description language (vhdl) in xilinx14.7design suite. After that it is verified using Xilinx Virtex FPGA through chip scope analyzer.

# 5. Results

In the previous section the scheme of parallel prefix adder is planned. When execution their diagram application the origin of formulations had previously been labeled for approximation of the hardware limits lengthways with cumulative the bit breadth of contribution operands. The estimated adders comprise half of the calculator's bit-width for all the enterprises under assessment. During mixture, all the combinational logic after enterprises are conserved to assurance no optimizations or vicissitudes in the oldstyle adder topologies. In the contemporary section the comparison of the presented adders is performed by the subsequent parameter of the maximum delay for output results when performing the addition. The below figure (6) shows the comparison of logic delay for RCA, CSA and PPA.





From figure (6) it can observe that the logic delay is minimum in parallel prefix adder and maximum in ripple carry adder. It is because the RCA versions have the lowest frequency and are the starting point in this education. So the parallel prefix computer produces better results.



From figure (7) it can observe that routing delay is minimum in parallel prefix adder and maximum in CSA. OVERALL DELAY



Fig. 8: Overall delay comparison of RCA, CSA and PPA

From figure (8) it can observe that OVERALL DELAY is minimum in parallel prefix adder and maximum in CSA. It is since the CSA forms consume the bottom most in cadence then are the point of departure in this training

# 6. Conclusion

As discussed earlier that Equivalent begin calculator is a multi-bit carry-propagate calculator which is rummage-sale aimed at corresponding adding of 2 multi-bit statistics. In this broadside we contemporary the existed system that is ripple carry adder. Ripple carry adder enables easy implementation but it gives poor performance in delay. So the design of proposed tree structure will reduces the delay occurred in existed system. So from this it can observe that parallel prefix adder gives high performance after designing and implementation of the system. The project is vigorous that permits it to professionally function in the sub-threshold regime. The project practice presented here should be applied well to a higher precision adder. The total circuit delay is better than CSA, RCA as shown in results.

# References

- Pakkiraiah. Chakali, madhu Kumar. Patnala "Design of high speed Brent - Kung based carry select adder" IJSCE, Volume-3, Issue-1, march 2013
- [2] HaridimosT.Vergos, Member, IEEE and Giorgos Dimitrakopoulos, Member, IEEE," On modulo 2<sup>n</sup>+1 adder design" IEEE Trans on computers, vol.61, no.2, Feb 2012
- [3] David h, k hoe, Chris Martinez and Sri Jyothsnavundavalli "Design and characterization of parallel prefix adders using FPGAs", Pages.168-172, march2011 IEEE.
- [4] K. Vitoroulis and A.J. Al-Khalili, "performance of parallel prefix adders implemented with FPGA technology," IEEE Northeast Workshop on circuits and systems, pp.498-501, Aug 2007.
- [5] GiorgosDimitrakopoulos and DimitrisNikolos, "High Speed Parallel Prefix ...Ling adders," IEEE Transactions on Computers, vol.54, no.2, February 2005
- [6] S. Knowles," A family of adders,"proc.15thsymp. Comp. Arith, pp. 277-281, June 2001.
- [7] R.Brent and H.Kung, "Aregularlayoutforparallel adders," IEEETrans.Computers, vol.C-31,no.3, pp. 260-264, March1982.
- [8] R.E. Brent and M.J. Kung, "Parallel Prefix Computation," J. ACM, vol. 27, no. 4, pages 831-838, Oct. 1980.
- [9] R. P Brent and H. T. Kung, "A Regular Layout for Parallel Adders," in IEEE Trans. Computers, Vol. C-31, pp. 260-264,1982.
- [10] B. Ramkumar and H. M. Kittur, "Low-Power and Area Efficient Carry Select Adder," in IEEE Transactions on Very Large Scale Integration (Visi) Systems, Vol. 20, No. 2, February 2012.
- [11] P. Kogge and H. Stone, "A parallel algorithm for the efficient solution of general class Recurrence equations," in IEEE Trans. Computers, Vol.C-22, pp. 786-793, Aug. 1973.
- [12] M. Sunil, R.D. Ankith, G.D. Manjunatha and B.S. Premananda, "Design and implementation of faster parallel prefix kogge stone adder," in International Journal of Electrical and Eletronic engineering and Telecommunications, ISSN. 2319 –2518 Vol. 3, No. 1, January 2014.
- [13] JasbirKaur and LalitSood, "Comparison between various types of adder topologies," in IJCST, ISSN 2229 – 4333 Vol. 6, No. 1, March 2015.
- [14] R. Ladner and M. Fischer, "Parallel prefix computation," in Journal of ACM.La. Jolla CA, Vol.27, pp. 831-838, October1980.
- [15] V. Ionescu, I. Bostan, and L. Ionescu, "Systematic Design for Integrated Digital Circuit Structures," in IEEE Journal of Semiconductor Conference, Vol.2, pp. 467 – 470, 2004.