System-Level Design of GPU-Based Embedded Systems

Author: Arian Maghazeh
Publisher: Linköping University Electronic Press
ISBN: 9176851753
Category :
Languages : en
Pages : 62

Book Description
Modern embedded systems deploy several hardware accelerators, in a heterogeneous manner, to deliver high-performance computing. Among such devices, graphics processing units (GPUs) have earned a prominent position by virtue of their immense computing power. However, a system design that relies on sheer throughput of GPUs is often incapable of satisfying the strict power- and time-related constraints faced by the embedded systems. This thesis presents several system-level software techniques to optimize the design of GPU-based embedded systems under various graphics and non-graphics applications. As compared to the conventional application-level optimizations, the system-wide view of our proposed techniques brings about several advantages: First, it allows for fully incorporating the limitations and requirements of the various system parts in the design process. Second, it can unveil optimization opportunities through exposing the information flow between the processing components. Third, the techniques are generally applicable to a wide range of applications with similar characteristics. In addition, multiple system-level techniques can be combined together or with application-level techniques to further improve the performance. We begin by studying some of the unique attributes of GPU-based embedded systems and discussing several factors that distinguish the design of these systems from that of the conventional high-end GPU-based systems. We then proceed to develop two techniques that address an important challenge in the design of GPU-based embedded systems from different perspectives. The challenge arises from the fact that GPUs require a large amount of workload to be present at runtime in order to deliver a high throughput. However, for some embedded applications, collecting large batches of input data requires an unacceptable waiting time, prompting a trade-off between throughput and latency. 
We also develop an optimization technique for GPU-based applications to address the memory bottleneck issue by utilizing the GPU L2 cache to shorten data access time. Moreover, in the area of graphics applications, and in particular with a focus on mobile games, we propose a power management scheme to reduce the GPU power consumption by dynamically adjusting the display resolution, while considering the user's visual perception at various resolutions. We also discuss the collective impact of the proposed techniques in tackling the design challenges of emerging complex systems. The proposed techniques are assessed by real-life experimentations on GPU-based hardware platforms, which demonstrate the superior performance of our approaches as compared to the state-of-the-art techniques.

Designing Embedded Processors

Author: Jörg Henkel
Publisher: Springer Science & Business Media
ISBN: 1402058691
Category : Technology & Engineering
Languages : en
Pages : 551

Book Description
To the hard-pressed systems designer this book will come as a godsend. It is a hands-on guide to the many ways in which processor-based systems are designed to enable low-power devices. Covering a huge range of topics, and co-authored by some of the field’s top practitioners, the book provides a good starting point for engineers in the area and for research students embarking upon work on embedded systems and architectures.

System-Level Design Techniques for Energy-Efficient Embedded Systems

Author: Marcus T. Schmitz
Publisher: Springer
ISBN: 0306487365
Category : Computers
Languages : en
Pages : 194

Book Description
System-Level Design Techniques for Energy-Efficient Embedded Systems addresses the development and validation of co-synthesis techniques that allow an effective design of embedded systems with low energy dissipation. The book provides an overview of a system-level co-design flow, illustrating through examples how system performance is influenced at various steps of the flow including allocation, mapping, and scheduling. The book places special emphasis upon system-level co-synthesis techniques for architectures that contain voltage scalable processors, which can dynamically trade off between computational performance and power consumption. Throughout the book, the introduced co-synthesis techniques, which target both single-mode systems and emerging multi-mode applications, are applied to numerous benchmarks and real-life examples including a realistic smart phone.

Designing for Resilience

Author: Vanessa Rodrigues
Publisher: Linköping University Electronic Press
ISBN: 9179298672
Category : Electronic books
Languages : en
Pages : 137

Book Description
Services are prone to change in the form of expected and unexpected variations and disruptions, more so given the increasing interconnectedness and complexity of service systems today. These changes require service systems to be resilient and designed to adapt, to ensure that services continue to work smoothly. This thesis problematises the prevailing view and assumptions underpinning the current understanding of resilience in services. Drawing on literature from service management, service design, systems thinking and social-ecological resilience theory, this work investigates how service design can foster resilience in service systems. Supported by empirical input from three research projects in healthcare, the findings show service design can contribute to the adaptability and transformability of service systems through its holistic, human-centred, participatory and experimental approaches. Through the analysis, this research identifies key intervention points for cultivating service systems resilience through service design, including the design of service interactions, processes, enabling structures and multi-level governance. The study makes two important contributions. First, it extends the understanding of service systems resilience as the collective capacity for intentional action in responding to ongoing change, coordinated across scales in order to create value. This is supported by offering alternative assumptions about resilience in service. Second, it positions service design as an enabler of service resilience by explicitly linking design practice(s) to processes that contribute to resilience. By extending the understanding of service systems resilience, this thesis lays the groundwork for future research at the intersection of service design, systemic change and resilience.

Distributed Moving Base Driving Simulators

Author: Anders Andersson
Publisher: Linköping University Electronic Press
ISBN: 9176850900
Category :
Languages : en
Pages : 42

Book Description
Development of new functionality and smart systems for different types of vehicles is accelerating with the advent of new emerging technologies such as connected and autonomous vehicles. To ensure that these new systems and functions work as intended, flexible and credible evaluation tools are necessary. One example of this type of tool is a driving simulator, which can be used for testing new and existing vehicle concepts and driver support systems. When a driver in a driving simulator operates it in the same way as they would in actual traffic, you get a realistic evaluation of what you want to investigate. Two advantages of a driving simulator are (1.) that you can repeat the same situation several times over a short period of time, and (2.) you can study driver reactions during dangerous situations that could result in serious injuries if they occurred in the real world. An important component of a driving simulator is the vehicle model, i.e., the model that describes how the vehicle reacts to its surroundings and driver inputs. To increase the simulator realism or the computational performance, it is possible to divide the vehicle model into subsystems that run on different computers that are connected in a network. A subsystem can also be replaced with hardware using so-called hardware-in-the-loop simulation, and can then be connected to the rest of the vehicle model using a specified interface. The technique of dividing a model into smaller subsystems running on separate nodes that communicate through a network is called distributed simulation. This thesis investigates if and how a distributed simulator design might facilitate the maintenance and new development required for a driving simulator to be able to keep up with the increasing pace of vehicle development. 
For this purpose, three different distributed simulator solutions have been designed, built, and analyzed with the aim of constructing distributed simulators, including external hardware, where the simulation achieves the same degree of realism as with a traditional driving simulator. One of these simulator solutions has been used to create a parameterized powertrain model that can be configured to represent any of a number of different vehicles. Furthermore, the driver's driving task is combined with the powertrain model to monitor deviations. After the powertrain model was created, subsystems from a simulator solution and the powertrain model have been transferred to a Modelica environment. The goal is to create a framework for requirement testing that guarantees sufficient realism, also for a distributed driving simulation. The results show that the distributed simulators we have developed work well overall with satisfactory performance. It is important to manage the vehicle model and how it is connected to a distributed system. In the distributed driveline simulator setup, the network delays were so small that they could be ignored, i.e., they did not affect the driving experience. However, if one gradually increases the delays, a driver in the distributed simulator will change his/her behavior. The impact of communication latency on a distributed simulator also depends on the simulator application, where different usages of the simulator, i.e., different simulator studies, will have different demands. We believe that many simulator studies could be performed using a distributed setup. One issue is how modifications to the system affect the vehicle model and the desired behavior. This leads to the need for methodology for managing model requirements. In order to detect model deviations in the simulator environment, a monitoring aid has been implemented to help notify test managers when a model behaves strangely or is driven outside of its validated region. 
Since the availability of distributed laboratory equipment can be limited, the possibility of using Modelica (which is an equation-based and object-oriented programming language) for simulating subsystems is also examined. Implementation of the model in Modelica has also been extended with requirements management, and in this work a framework is proposed for automatically evaluating the model in a tool.

Robust Stream Reasoning Under Uncertainty

Author: Daniel de Leng
Publisher: Linköping University Electronic Press
ISBN: 9176850137
Category :
Languages : en
Pages : 234

Book Description
Vast amounts of data are continually being generated by a wide variety of data producers. This data ranges from quantitative sensor observations produced by robot systems to complex unstructured human-generated texts on social media. With data being so abundant, the ability to make sense of these streams of data through reasoning is of great importance. Reasoning over streams is particularly relevant for autonomous robotic systems that operate in physical environments. They commonly observe this environment through incremental observations, gradually refining information about their surroundings. This makes robust management of streaming data and their refinement an important problem. Many contemporary approaches to stream reasoning focus on the issue of querying data streams in order to generate higher-level information by relying on well-known database approaches. Other approaches apply logic-based reasoning techniques, which rarely consider the provenance of their symbolic interpretations. In this work, we integrate techniques for logic-based stream reasoning with the adaptive generation of the state streams needed to do the reasoning over. This combination deals with both the challenge of reasoning over uncertain streaming data and the problem of robustly managing streaming data and their refinement. The main contributions of this work are (1) a logic-based temporal reasoning technique based on path checking under uncertainty that combines temporal reasoning with qualitative spatial reasoning; (2) an adaptive reconfiguration procedure for generating and maintaining a data stream required to perform spatio-temporal stream reasoning over; and (3) integration of these two techniques into a stream reasoning framework. The proposed spatio-temporal stream reasoning technique is able to reason with intertemporal spatial relations by leveraging landmarks. 
Adaptive state stream generation allows the framework to adapt to situations in which the set of available streaming resources changes. Management of streaming resources is formalised in the DyKnow model, which introduces a configuration life-cycle to adaptively generate state streams. The DyKnow-ROS stream reasoning framework is a concrete realisation of this model that extends the Robot Operating System (ROS). DyKnow-ROS has been deployed on the SoftBank Robotics NAO platform to demonstrate the system's capabilities in a case study on run-time adaptive reconfiguration. The results show that the proposed system - by combining reasoning over and reasoning about streams - can robustly perform stream reasoning, even when the availability of streaming resources changes.

Applications of Partial Polymorphisms in (Fine-Grained) Complexity of Constraint Satisfaction Problems

Author: Biman Roy
Publisher: Linköping University Electronic Press
ISBN: 9179298982
Category :
Languages : en
Pages : 57

Book Description
In this thesis we study the worst-case complexity of constraint satisfaction problems and some of its variants. We use methods from universal algebra: in particular, algebras of total functions and partial functions that are respectively known as clones and strong partial clones. The constraint satisfaction problem parameterized by a set of relations Γ (CSP(Γ)) is the following problem: given a set of variables restricted by a set of constraints based on the relations Γ, is there an assignment to the variables that satisfies all constraints? We refer to the set Γ as a constraint language. The inverse CSP problem over Γ (Inv-CSP(Γ)) asks the opposite: given a relation R, does there exist a CSP(Γ) instance with R as its set of models? When Γ is a Boolean language, then we use the term SAT(Γ) instead of CSP(Γ) and Inv-SAT(Γ) instead of Inv-CSP(Γ). Fine-grained complexity is an approach in which we zoom inside a complexity class and classify the problems in it based on their worst-case time complexities. We start by investigating the fine-grained complexity of NP-complete CSP(Γ) problems. An NP-complete CSP(Γ) problem is said to be easier than an NP-complete CSP(Δ) problem if the worst-case time complexity of CSP(Γ) is not higher than the worst-case time complexity of CSP(Δ). We first analyze the NP-complete SAT problems that are easier than monotone 1-in-3-SAT (which can be represented by SAT(R) for a certain relation R), and find out that there exists a continuum of such problems. For this, we use the connection between constraint languages and strong partial clones and exploit the fact that CSP(Γ) is easier than CSP(Δ) when the strong partial clone corresponding to Γ contains the strong partial clone of Δ. An NP-complete CSP(Γ) problem is said to be the easiest with respect to a variable domain D if it is easier than any other NP-complete CSP(Δ) problem of that domain. 
We show that for every finite domain there exists an easiest NP-complete problem for the ultraconservative CSP(Γ) problems. An ultraconservative CSP(Γ) is a special class of CSP problems where the constraint language contains all unary relations. We additionally show that no NP-complete CSP(Γ) problem can be solved in sub-exponential time (i.e. in 2^{o(n)} time where n is the number of variables) given that the exponential time hypothesis is true. Moving to classical complexity, we show that for any Boolean constraint language Γ, Inv-SAT(Γ) is either in P or it is coNP-complete. This is a generalization of an earlier dichotomy result, which was only known to be true for ultraconservative constraint languages. We show that Inv-SAT(Γ) is coNP-complete if and only if the clone corresponding to Γ contains essentially unary functions only. For arbitrary finite domains our results are not conclusive, but we manage to prove that the inverse k-coloring problem is coNP-complete for each k>2. We exploit weak bases to prove many of these results. A weak base of a clone C is a constraint language that corresponds to the largest strong partial clone that contains C. It is known that for many decision problems X(Γ) that are parameterized by a constraint language Γ (such as Inv-SAT), there are strong connections between the complexity of X(Γ) and weak bases. This fact can be exploited to achieve general complexity results. The Boolean domain is well-suited for this approach since we have a fairly good understanding of Boolean weak bases. In the final result of this thesis, we investigate the relationships between the weak bases in the Boolean domain based on their strong partial clones and completely classify them according to the set inclusion. To avoid a tedious case analysis, we introduce a technique that allows us to discard a large number of cases from further investigation.
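
To make the problem definition in this description concrete, here is a minimal Python sketch of a brute-force decision procedure for a constraint satisfaction problem over a finite domain. It is an illustration only and is not taken from the thesis: the names `solve_csp`, `R_1IN3` and `instance` are hypothetical, and the search simply enumerates all |D|^n assignments, which is the naive exponential baseline against which worst-case running times are compared.

```python
from itertools import product


def solve_csp(variables, domain, constraints):
    """Brute-force search over all assignments (hypothetical helper).

    constraints is a list of (relation, scope) pairs: relation is a set of
    allowed value tuples, scope is a tuple of variable names.
    Returns a satisfying assignment as a dict, or None if none exists.
    """
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(tuple(assignment[v] for v in scope) in relation
               for relation, scope in constraints):
            return assignment
    return None


# Monotone 1-in-3-SAT as a single Boolean relation:
# exactly one of the three variables in each constraint is true.
R_1IN3 = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}

# Two constraints sharing the variables x and y.
instance = [(R_1IN3, ("x", "y", "z")), (R_1IN3, ("x", "y", "w"))]
solution = solve_csp(["x", "y", "z", "w"], [0, 1], instance)
print(solution)
```

An instance is a "yes" instance exactly when `solve_csp` returns an assignment rather than `None`; restricting which relations may appear in `constraints` corresponds to fixing the constraint language.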

Embedded Processor Design Challenges

Author: Ed F. Deprettere
Publisher: Springer
ISBN: 3540458743
Category : Computers
Languages : en
Pages : 332

Book Description
This textbook is intended to give an introduction to and an overview of state-of-the-art techniques in the design of complex embedded systems. The book title is SAMOS for two major reasons. First, it tries to focus on the actual distinct, yet important problem fields of System-Level design of embedded systems, including mapping techniques and synthesis, Architectural design, Modeling issues such as specification languages, formal models, and finally Simulation. The second reason is that the volume includes a number of papers presented at a workshop with the same name on the Island of Samos, Greece, in July 2001. In order to receive international attention, a number of reputed researchers were invited to this workshop to present their current work. Participation was by invitation only. For the volume presented here, a number of additional papers were selected based on a call for papers. All contributions were refereed. This volume presents a selection of 18 of the refereed papers, including 2 invited papers. The textbook is organized according to four topics: The first is A) System-Level Design and Simulation. In this section, we present a collection of papers that give an overview of the challenging goal to design and explore alternatives of embedded system implementations at the system-level. One paper gives an overview of models and tools used in system-level design. The other papers present new models to describe applications, provide models for refinement and design space exploration, and for tradeoff analysis between cost and flexibility of an implementation.

Parameterized Verification of Synchronized Concurrent Programs

Author: Zeinab Ganjei
Publisher: Linköping University Electronic Press
ISBN: 9179296971
Category :
Languages : en
Pages : 192

Book Description
There is currently an increasing demand for concurrent programs. Checking the correctness of concurrent programs is a complex task due to the interleavings of processes. Sometimes, violation of the correctness properties in such systems causes human or resource losses; therefore, it is crucial to check the correctness of such systems. Two main approaches to software analysis are testing and formal verification. Testing can help discover many bugs at a low cost. However, it cannot prove the correctness of a program. Formal verification, on the other hand, is the approach for proving program correctness. Model checking is a formal verification technique that is suitable for concurrent programs. It aims to automatically establish the correctness (expressed in terms of temporal properties) of a program through an exhaustive search of the behavior of the system. Model checking was initially introduced for the purpose of verifying finite‐state concurrent programs, and extending it to infinite‐state systems is an active research area. In this thesis, we focus on the formal verification of parameterized systems. That is, systems in which the number of executing processes is not bounded a priori. We provide fully-automatic and parameterized model checking techniques for establishing the correctness of safety properties for certain classes of concurrent programs. We provide an open‐source prototype for every technique and present our experimental results on several benchmarks. First, we address the problem of automatically checking safety properties for bounded as well as parameterized phaser programs. Phaser programs are concurrent programs that make use of the complex synchronization construct of Habanero Java phasers. For the bounded case, we establish the decidability of checking the violation of program assertions and the undecidability of checking deadlock‐freedom. 
For the parameterized case, we study different formulations of the verification problem and propose an exact procedure that is guaranteed to terminate for some reachability problems even in the presence of unbounded phases and arbitrarily many spawned processes. Second, we propose an approach for automatic verification of parameterized concurrent programs in which shared variables are manipulated by atomic transitions to count and synchronize the spawned processes. For this purpose, we introduce counting predicates that relate counters that refer to the number of processes satisfying some given properties to the variables that are directly manipulated by the concurrent processes. We then combine existing works on the counter, predicate, and constrained monotonic abstraction and build a nested counterexample‐based refinement scheme to establish correctness. Third, we introduce Lazy Constrained Monotonic Abstraction for more efficient exploration of well‐structured abstractions of infinite‐state non‐monotonic systems. We propose several heuristics and assess the efficiency of the proposed technique by extensive experiments using our open‐source prototype. Lastly, we propose a sound but (in general) incomplete procedure for automatic verification of safety properties for a class of fault‐tolerant distributed protocols described in the Heard‐Of (HO for short) model. The HO model is a popular model for describing distributed protocols. We propose a verification procedure that is guaranteed to terminate even for an unbounded number of processes that execute the distributed protocol.

Electronic System-Level HW/SW Co-Design of Heterogeneous Multi-Processor Embedded Systems

Author: Luigi Pomante
Publisher: CRC Press
ISBN: 1000795640
Category : Science
Languages : en
Pages : 270

Book Description
Modern electronic systems consist of a fairly heterogeneous set of components. Today, a single system can be constituted by a hardware platform, frequently composed of a mix of analog and digital components, and by several software application layers. The hardware can include several heterogeneous microprocessors (e.g. GPP, DSP, GPU, etc.), dedicated ICs (ASICs and/or FPGAs), memories, a set of local connections between the system components, and some interfaces between the system and the environment (sensors, actuators, etc.). Therefore, on the one hand, multi-processor embedded systems are capable of meeting the demand of processing power and flexibility of complex applications. On the other hand, such systems are very complex to design and optimize, so that the design methodology plays a major role in determining the success of the products. For these reasons, to cope with the increasing system complexity, the approaches typically used today are oriented towards co-design methodologies working at the higher levels of abstraction. Unfortunately, such methodologies are typically customized for the specific application, suffer from a lack of generality and still need a considerable effort when real-size projects are envisioned. Therefore, there is still the need for a general methodology able to support the designer during the high-level steps of a co-design flow, enabling an effective design space exploration before tackling the low-level steps and thus committing to the final technology. This should prevent costly redesign loops. In such a context, the work described in this book, composed of two parts, aims at providing models, methodologies and tools to support each step of the co-design flow of embedded systems implemented by exploiting heterogeneous multi-processor architectures mapped on distributed systems, as well as fully integrated onto a single chip.