Christos Kotselidis

School of Computer Science, University of Manchester, UK

Towards High Performance Java-based Deep Learning Frameworks

Jan 13, 2020
Athanasios Stratikopoulos, Juan Fumero, Zoran Sevarac, Christos Kotselidis

The advent of modern cloud services, along with the huge volume of data produced on a daily basis, has created a demand for fast and efficient data processing. This demand is common among numerous application domains, such as deep learning, data mining, and computer vision. Prior research has focused on employing hardware accelerators as a means to meet this demand. This trend has driven software development to target heterogeneous execution, and several modern computing systems have incorporated a mixture of diverse computing components, including GPUs and FPGAs. However, specializing application code for heterogeneous execution is not a trivial task, as it requires developers to have hardware expertise in order to obtain high performance. The vast majority of existing deep learning frameworks that support heterogeneous acceleration rely on wrapper calls from a high-level programming language to a low-level accelerator backend, such as OpenCL, CUDA, or HLS. In this paper we employ TornadoVM, a state-of-the-art heterogeneous programming framework, to transparently accelerate Deep Netts, a Java-based deep learning framework. Our initial results demonstrate up to 8x speedup when executing the back propagation process of the network's training on AMD GPUs, compared with the sequential execution of the original Deep Netts framework.

Navigating the Landscape for Real-time Localisation and Mapping for Robotics and Virtual and Augmented Reality

Aug 20, 2018
Sajad Saeedi, Bruno Bodin, Harry Wagstaff, Andy Nisbet, Luigi Nardi, John Mawer, Nicolas Melot, Oscar Palomar, Emanuele Vespa, Tom Spink, Cosmin Gorgovan, Andrew Webb, James Clarkson, Erik Tomusk, Thomas Debrunner, Kuba Kaszyk, Pablo Gonzalez-de-Aledo, Andrey Rodchenko, Graham Riley, Christos Kotselidis, Björn Franke, Michael F. P. O'Boyle, Andrew J. Davison, Paul H. J. Kelly, Mikel Luján, Steve Furber

Visual understanding of 3D environments in real time, at low power, is a huge computational challenge. Often referred to as SLAM (Simultaneous Localisation and Mapping), it is central to applications spanning domestic and industrial robotics, autonomous vehicles, and virtual and augmented reality. This paper describes the results of a major research effort to assemble the algorithms, architectures, tools, and systems software needed to enable delivery of SLAM, by supporting application specialists in selecting and configuring the appropriate algorithm, hardware, and compilation pathway to meet their performance, accuracy, and energy consumption goals. The major contributions we present are (1) tools and methodology for systematic quantitative evaluation of SLAM algorithms, (2) automated, machine-learning-guided exploration of the algorithmic and implementation design space with respect to multiple objectives, (3) end-to-end simulation tools to enable optimisation of heterogeneous, accelerated architectures for the specific algorithmic requirements of the various SLAM algorithmic approaches, and (4) tools for delivering, where appropriate, accelerated, adaptive SLAM solutions in a managed, JIT-compiled, adaptive runtime context.

* Proceedings of the IEEE 2018
