A Review of the Different Levels of Abstraction for Systems-on-Chip (SoC)

Simulating a system on a chip (SoC) before production begins makes it possible to validate that the system functions correctly and to avoid manufacturing defective chips. However, low-level design and system complexity make verification and simulation more complicated and time consuming. The classification of the different levels of abstraction, from lowest to highest, generally depends on the accuracy of the system performance estimate and on the speed of simulation. The Register Transfer Level (RTL) allows an efficient description close to gate level with good precision. However, RTL programs simulate slowly. At this level, simulation speed usually depends on the size of the platform used, which is not the case for transaction level modeling (TLM), where simulation speed is gained by modeling communication as the exchange of transactions between system modules. This work aims to give a detailed description of the different levels of abstraction, with their main advantages and disadvantages for performance estimation in terms of energy consumption, precision, and speed. It also gives an overview of the most suitable memory architectures and interconnection networks, with the aim of identifying the most suitable virtual simulation platforms for SoC.


Introduction
The hardware/software co-simulation of complex systems on a chip from the early stages of design plays an essential role, since it reduces the time to market of the final product. Co-simulation therefore requires tools that make development powerful, accurate, and rapid. Several factors control this performance, such as the choice of memory architecture [1] and the selection of an adequate abstraction level for co-simulating complex systems.
The optimization of memory architecture remains a major problem. Technological advances in research and development on multiprocessor systems on chip (MPSoC) show that these systems have a high computing capacity. The architecture that is evolving most for this kind of capability is distributed shared memory (DSM), since it combines the advantages of two types of systems: centralized shared memory (CSM), in which a single physical memory is accessible to all processors [2,3], and distributed memory (DM), in which each processor has its own private memory and communication between processors is usually done by message passing [4,5]. A broad overview of the proposed and existing approaches to distributed shared memory [6] shows an improvement in system performance, especially in terms of energy dissipation, as shown in Table 1.
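As an illustration of how DSM combines the two organizations, the following minimal sketch models one global address space whose words are physically spread over per-node memory banks. It is not taken from any of the cited works: the class name `DsmSystem`, the block-wise address mapping, and both latency constants are assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field

# Assumed, purely illustrative access costs (in cycles).
LOCAL_LATENCY = 1    # access served by the requesting node's own bank
REMOTE_LATENCY = 10  # access that must cross the interconnect


@dataclass
class DsmSystem:
    """Hypothetical DSM model: shared addresses, physically distributed banks."""
    num_nodes: int
    words_per_node: int
    banks: list = field(init=False)

    def __post_init__(self):
        # One private bank per node, as in a distributed memory system.
        self.banks = [[0] * self.words_per_node for _ in range(self.num_nodes)]

    def _locate(self, addr):
        # Block-wise mapping (an assumption): consecutive addresses share a bank.
        if not 0 <= addr < self.num_nodes * self.words_per_node:
            raise IndexError(f"address {addr} is outside the shared space")
        return divmod(addr, self.words_per_node)  # (home node, offset)

    def write(self, node, addr, value):
        """Store a word on behalf of `node` and return the access latency."""
        home, offset = self._locate(addr)
        self.banks[home][offset] = value
        return LOCAL_LATENCY if home == node else REMOTE_LATENCY

    def read(self, node, addr):
        """Load a word on behalf of `node`; returns (value, latency)."""
        home, offset = self._locate(addr)
        latency = LOCAL_LATENCY if home == node else REMOTE_LATENCY
        return self.banks[home][offset], latency
```

Every node can address every word, as in CSM, but only accesses to the node's own bank are cheap, as in DM. In this sketch a write by node 0 to address 300 of a 4-node, 256-words-per-node system is remote, while a later read of the same address by node 1 is local.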
* Corresponding author: aamalikaoutar@gmail.com
To build a co-simulation prototype, the choice of the level of abstraction is essential. As system complexity increases, the use of a lower level of abstraction such as RTL [7] becomes very hard for developers and designers. Moving to a high level of abstraction such as TLM therefore becomes mandatory: it increases simulation speed and also reduces the number of lines of code the developer has to write. Co-simulation platforms must follow this evolution. This paper therefore surveys the various levels of abstraction used in co-simulation platforms and distinguishes the performance of each level. The article is structured as follows. Section 2 presents the classification of the different levels of abstraction together with research work that describes each level, which already provides an intuition about the performance of systems-on-chip. Section 3 analyzes power consumption modeling of SoC for a shared memory architecture as well as for other network architectures. Section 4 discusses the different findings. Section 5 concludes the paper.

Abstraction levels classification
The growing complexity of systems and the increase in the number of transistors on a chip, about 50% per year according to Moore's law [8,9], make fast, scalable, and precise simulation necessary in order to explore multi-core systems sufficiently within a limited simulation time.
The design flow is illustrated in Figure 1; each level of abstraction has its own technologies and tools. In practice, the flow is not always linear: certain levels can be skipped or replaced by sub-levels.
The logic level, which allowed the first physical integrated circuits to be drawn, is used in [10], which suggests a methodology that integrates standard building blocks into safe compound gates. The methodology is well suited to incorporation into an FPGA or ASIC design flow. The Register Transfer Level is the abstraction level used by M. Kammoun et al. [11] to evaluate various hardware and software methods in terms of energy consumption and execution time, particularly for video coding systems, which require processing large amounts of data. An FPGA (Field Programmable Gate Array) platform based on Xilinx Zynq was used for that study. [12] introduces the ViPar tool to explore various video processing architectures at a higher level of design. Other work applies Information Flow Tracking (IFT) techniques to design and validate trustworthy hardware. Because tracking at a lower level of abstraction is difficult to apply to large designs, [13] works at the RTL level and develops a Register Transfer Level IFT (RTLIFT) that improves security verification performance. Garibotti et al. [14] consider a multiprocessor system-on-chip (MPSoC) platform implemented at the RTL level to propose a DSM (Distributed Shared Memory) architecture and compare it with a CSM (Centralized Shared Memory) architecture. J. Ax et al. [15] proposed nearly the same principle at a comparable level of abstraction to demonstrate the importance of the distributed shared memory architecture [16].
As microelectronic technology develops and grows in complexity, levels such as the transistor level or RTL become unsuitable for building and exploring system-on-chip architectures because of the long simulation times at these levels. This has led researchers to move towards higher levels of abstraction, since the primary objective of engineers and researchers is to evaluate the performance of the systems. Zhe-Mao Hsu et al. [17] pursue this goal by applying various implementations on a device. Compared with RTL-level systems such as that of G. Guindani et al. [18], they show very encouraging results in terms of speed and error percentage, based on a modeling method that mixes abstraction levels.
PAC (Parallel Architecture Core), a multi-core architecture, is the platform used to validate this approach. The error rate is below 5 percent, and simulation is about 100 times faster than RTL simulation for the H.264 decoder, JPEG decoder, and MP3 decoder applications.

Energy consumption
EL. Hariti et al. [19] present models describing static and dynamic power using an open-source virtual platform (LIBTLMPWT, M. Moy et al. [20]) for a system on a chip (SoC) [21] written in SystemC at the high abstraction level of Transaction Level Modeling (TLM) [22,23]. This level models communication as exchanges of 'transactions' between the different modules of the system, which reduces the number of lines of code the developer has to write. The hardware part of the co-simulation platform comprises a MicroBlaze processor, a VGA controller, a timer, and a shared memory split into a data part and an instruction part; the software is the Game of Life application.
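To make the notion of a transaction concrete, the sketch below models a processor and a memory that communicate through a single function call per access instead of cycle-by-cycle signal activity. It is plain Python, not SystemC and not the LIBTLMPWT API; the method name `b_transport` only echoes the blocking transport style of SystemC TLM-2.0, and the classes `Transaction`, `Memory`, and `Processor`, as well as the 10 ns access time, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """Hypothetical payload exchanged between modules."""
    command: str     # "read" or "write"
    address: int
    data: int = 0
    ok: bool = False  # set by the target when the access succeeds


class Memory:
    """Target module: serves a whole access in one call."""

    def __init__(self, size, access_ns=10):  # access time is an assumed value
        self.cells = [0] * size
        self.access_ns = access_ns

    def b_transport(self, txn, delay_ns):
        """Execute the transaction and return the updated annotated delay."""
        if not 0 <= txn.address < len(self.cells):
            txn.ok = False
            return delay_ns
        if txn.command == "write":
            self.cells[txn.address] = txn.data
        elif txn.command == "read":
            txn.data = self.cells[txn.address]
        else:
            raise ValueError(f"unknown command {txn.command!r}")
        txn.ok = True
        return delay_ns + self.access_ns


class Processor:
    """Initiator module: issues transactions and tracks its local time."""

    def __init__(self, target):
        self.target = target
        self.local_time_ns = 0

    def store(self, address, value):
        txn = Transaction("write", address, value)
        self.local_time_ns = self.target.b_transport(txn, self.local_time_ns)
        return txn.ok

    def load(self, address):
        txn = Transaction("read", address)
        self.local_time_ns = self.target.b_transport(txn, self.local_time_ns)
        return txn.data
```

Each memory access costs one function call and one time annotation, which is where the gain in simulation speed and the reduction in code size at this level come from; the price is that timing is only as accurate as the annotated delays.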
The purpose of the study is to analyze the energy consumption of each element. The model used to calculate the power is composed of three parts, dynamic (Pd), static (Ps), and short-circuit power (Psc), expressed by Equation 1. The comparison between the static and dynamic power estimates for the two technologies, shown in Figures 2 and 3, indicates that static energy becomes more significant with newer technology. M. Baharloo et al. [24] also report the static share of the total NoC power at different CMOS technology nodes: it rises from 42% to 64% when moving from 45 nm down to 22 nm. In general, a smaller technology node means production based on smaller transistors. This reduces chip area and makes the system on a chip faster. The use of 10 nm or 14 nm technology is therefore more attractive in terms of speed, but it leads to an increase in energy consumption.
Other studies aim to reduce the power consumption of systems on chip based on a NoC architecture. L. Chen et al. [25] proposed a power-gating scheme; power gating is a promising technique for curbing the growing static power of on-chip routers. Designed for a minimal performance penalty, this scheme reduces both the performance penalty and the static energy, and it was evaluated using PARSEC benchmarks. In addition, noting that the network architecture affects system performance, M. Baharloo et al. [24] used a multi-network-on-chip instead of a single NoC, which improved performance, especially power consumption. Bouhadiba et al. [26] proposed a study of energy consumption at the TLM level based on a real FPGA system designed with Xilinx EDK.
Studies such as M. Baharloo et al. [24] propose models for estimating the power of systems on chip: the first at the RTL level with a NoC network (R. Garibotti et al. 2015) and the second at the TLM level with a bus network (L. Cai et al. [27]). The results show the accuracy of the two models and good energy results at high levels of abstraction. The study of R. Garibotti et al. [14], for its part, takes into consideration the increase in the number of processors in embedded architectures. For this reason, R. Garibotti et al. [14] proposed a solution based on a distributed memory architecture, using an open-source realistic design framework, to compare the performance and energy consumption of DSM and CSM architectures. The results of this study show an increase in the energy consumption of DSM compared with CSM for 4 KB and 2 KB cache sizes. For the same configuration, we observe a decrease in energy dissipation.
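The three-part power model described earlier in this section can be sketched as follows. The sketch assumes that the total is the plain sum of the three components, and it uses the textbook CMOS approximations for dynamic power (activity factor times switched capacitance times supply voltage squared times frequency) and static power (leakage current times supply voltage). These formulas, the function names, and the parameter names are assumptions for illustration and are not the equations of the cited platform.

```python
def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """Textbook switching power in watts: alpha * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


def static_power(leakage_a, vdd_v):
    """Textbook leakage power in watts: I_leak * Vdd."""
    return leakage_a * vdd_v


def total_power(p_dynamic, p_static, p_short_circuit):
    """Assumed composition: the sum of the three components (Pd + Ps + Psc)."""
    return p_dynamic + p_static + p_short_circuit


def static_share(p_dynamic, p_static, p_short_circuit):
    """Fraction of the total power that is static."""
    total = total_power(p_dynamic, p_static, p_short_circuit)
    if total <= 0:
        raise ValueError("total power must be positive")
    return p_static / total
```

With `static_share`, a reported trend such as a static share rising from 42% to 64% between two technology nodes can be checked directly against per-component estimates produced by a simulation platform.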

Results and discussion
Co-simulation requires selecting a level of abstraction that suits the needs of the system. Each level of abstraction is characterized by its effect on performance, with its own advantages and disadvantages.

Simulation speedup
Simulation gives an idea of system performance even when precision remains weak; a lower level of abstraction remains the most precise, but at a lower simulation speed. Ben Attitalah et al. [28] used an H.263 encoder on an MPSoC with a varying number of processors (4, 8, 12, and 16) to compare the simulation times of two levels of abstraction, CABA and PVT, as Table 2 shows.
The results clearly show a significant speedup at the PVT level: simulation at this TLM sub-level is faster than at the CABA level. The results also show the impact of the number of processors: execution time decreases as the number of processors increases. Alali et al. [29] also study the speed and precision of MPSoC systems, evaluating several levels of abstraction (CABA, ISS, native, and PV+T) on a platform consisting of two MicroBlaze processors [30] and a shared memory, for two types of the Game of Life application. The results show that using high abstraction levels reduces design validation time and thus makes it possible to develop models quickly.
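The comparison between two abstraction levels reduces to a speedup ratio per platform configuration. The helper below is a minimal sketch of that computation; the function names are hypothetical, and any times passed to it are the reader's own measurements, since the values of Table 2 are not reproduced here.

```python
def speedup(reference_time_s, fast_time_s):
    """Ratio of the slower (reference) simulation time to the faster one."""
    if reference_time_s <= 0 or fast_time_s <= 0:
        raise ValueError("simulation times must be positive")
    return reference_time_s / fast_time_s


def speedup_table(reference_times, fast_times):
    """Speedup per processor count, for the counts present in both inputs.

    Both arguments map a processor count to a simulation time in seconds,
    for example CABA times as the reference and PVT times as the fast level.
    """
    return {
        n: speedup(reference_times[n], fast_times[n])
        for n in sorted(reference_times)
        if n in fast_times
    }
```

For instance, `speedup_table({4: 80.0, 8: 60.0}, {4: 4.0, 8: 2.0})` yields a speedup of 20 for 4 processors and 30 for 8; these inputs are placeholders, not measured data.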

Accurate co-simulation
Obtaining a good hardware/software prototype for SoC design remains a challenge. An FPGA-based prototype takes a long time to design; for this reason, I. Bacivarov et al. [31] proposed an approach that ensures the speed and, above all, the accuracy of SoC co-simulation. Generally, a co-simulation is accurate when the error between the trace produced and the correct trace is minimal.
The approach of I. Bacivarov et al. [31] uses HW RTL simulation with a shared memory architecture and a multiprocessor architecture (MPSoC). Across three types of co-simulation, ISS (Instruction Set Simulator), OS, and timed native execution, the comparison shows an improvement in speed and also in precision (17% synchronization error compared with an ISS-based co-simulation).
M.K. Chung [32] uses the TLM level to perform HW-SW co-simulation for multiprocessor SoC design, synchronizing a SW model written in C and communicating with a HW model written in SystemC, using both an ISS and an IPC co-simulator. On the architecture of a JPEG decoder, the approach achieves 95% accuracy in performance estimation.
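One way to quantify this notion of accuracy is to compare the timestamps of corresponding events in the produced trace and in the reference trace. The sketch below uses the mean relative error as the metric; this choice of metric, the function names, and the representation of a trace as a list of timestamps are assumptions for illustration, not the definition used in the cited work.

```python
def trace_error(produced, reference):
    """Mean relative error between two traces of event timestamps.

    Both traces must list the same events in the same order, and the
    reference timestamps must be non-zero.
    """
    if len(produced) != len(reference) or not reference:
        raise ValueError("traces must be non-empty and of equal length")
    total = 0.0
    for got, want in zip(produced, reference):
        if want == 0:
            raise ValueError("reference timestamps must be non-zero")
        total += abs(got - want) / abs(want)
    return total / len(reference)


def trace_accuracy(produced, reference):
    """Accuracy as the complement of the mean relative error."""
    return 1.0 - trace_error(produced, reference)
```

Under this metric, a produced trace whose single event occurs at time 95 against a reference of 100 has an error of 5%, that is, an accuracy of about 95%.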

Conclusion
This paper presents an analysis of System-on-Chip architectures at the various levels of abstraction, with a view to estimating performance at each of these levels. The surveyed results show that choosing a high level of abstraction such as TLM reduces design complexity and increases simulation speed. Moreover, choosing the most suitable memory architecture optimizes system performance in terms of speed and energy consumption. The work also presents a methodology applied to the LIBTLMPWT platform using SystemC/TLM, which allows an accurate estimate of energy consumption.
In future work, we intend to apply high levels of abstraction to a more complex memory architecture, such as the distributed shared memory (DSM) architecture, in order to optimize co-simulation, especially with respect to speed, accuracy, and energy consumption.