The research goal of EAVISE is to apply state-of-the-art computer vision techniques to industry-specific vision problems. To meet stringent requirements on execution speed, energy footprint and cost, the developed algorithms are implemented and optimized on embedded platforms such as GPUs, DSPs and FPGAs. Application areas include industrial automation, product inspection, traffic monitoring, e-health, agriculture, eye-tracking research, microscopic imaging, surveillance and land surveying.
The end goal of this PhD project is a general optimizing compiler for convolutional neural networks targeting embedded hardware.
The first task is to define a good instruction set for our in-house developed custom neural network processor. This instruction set defines the hardware-software interface used to program the devices. It should reflect the flexibility of the underlying accelerator, enabling the mapping of many contemporary as well as future networks, consisting of multi-precision convolutional, fully connected and recurrent layers, skip connections, 2D as well as depthwise and pointwise layers, shift layers, etc. A careful balance has to be struck between fine-grained coding on the one hand and keeping the code size of algorithms and the online decoding overhead small on the other. So far, no instruction set exists with sufficient flexibility towards novel network topologies (e.g. various input strides, output strides, depthwise separable convolutions, depth-first computation, variable computational precision). Commercial accelerators either limit flexibility (e.g. only allowing INT8/16, as in the TPU) or require coding at very fine granularity (as in a GPU).
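As a purely hypothetical sketch of the granularity trade-off described above (the instruction set is yet to be designed; every field name, width and opcode below is invented for illustration only), a coarse-grained layer descriptor might pack layer type, operand precision and strides into a single 32-bit word:

```python
# Hypothetical 32-bit instruction word for a NN accelerator.
# All field names, bit widths and opcodes are invented for illustration;
# they do not describe the actual in-house hardware.
LAYER_TYPES = {"conv2d": 0, "depthwise": 1, "pointwise": 2, "fc": 3, "recurrent": 4}
PRECISIONS = {4: 0, 8: 1, 16: 2}  # multi-precision support: bits per operand

def encode(layer, precision_bits, stride_in, stride_out):
    """Pack one layer descriptor into a 32-bit word:
    [31:28] layer type | [27:24] precision | [23:16] input stride | [15:8] output stride
    """
    word = (LAYER_TYPES[layer] << 28) | (PRECISIONS[precision_bits] << 24)
    word |= (stride_in & 0xFF) << 16
    word |= (stride_out & 0xFF) << 8
    return word

def decode(word):
    """Invert encode(): recover the layer descriptor from the 32-bit word."""
    layer = {v: k for k, v in LAYER_TYPES.items()}[(word >> 28) & 0xF]
    bits = {v: k for k, v in PRECISIONS.items()}[(word >> 24) & 0xF]
    return layer, bits, (word >> 16) & 0xFF, (word >> 8) & 0xFF

word = encode("depthwise", 8, 2, 1)
print(decode(word))  # ('depthwise', 8, 2, 1)
```

One such word per layer keeps the code size small, but leaves the per-MAC scheduling entirely to the hardware; a finer-grained encoding would expose that scheduling at the cost of larger programs and more online decoding, which is exactly the balance this task has to strike.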
The second task in this PhD is to develop an optimizing DNN compiler for this custom processor hardware, making use of the developed instruction set. The compiler can take as its input a description in ONNX, an open format that is emerging for exporting neural network architectures from training tools such as TensorFlow or PyTorch. Very recently, an open tool for this has become available, namely ONNC, a generic compiler that can target both GPUs and so-called DLAs (deep learning accelerators). Our goal in this task is to adapt the ONNC framework to the hardware developed in-house, and to add approximative optimizations (such as weight quantization and layer pruning, as available in TensorRT). Moreover, a big optimization opportunity lies in adding in-loop retraining to such compilers. When a compiler decides how to realize a network architecture on a hardware target, most non-conservative optimizations have a negative impact on the accuracy of the network. Intertwining these optimizations with in-loop retraining of the weights would yield a network implementation that is both efficiently mapped on the target hardware and, at the same time, highly accurate.
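One of the approximative optimizations mentioned above, symmetric per-tensor INT8 weight quantization, can be sketched in a few lines of plain Python. This is a simplified illustration of the idea, not the TensorRT or ONNC implementation:

```python
# Minimal sketch of symmetric per-tensor INT8 weight quantization,
# as a compiler pass might apply it. Simplified illustration only:
# real compilers quantize per-channel and calibrate activations too.

def quantize_int8(weights):
    """Map float weights to INT8 values with a single per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover the approximate float weights the hardware would use."""
    return [v * scale for v in q]

w = [0.51, -1.27, 0.003, 0.89]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
err = max(abs(a - b) for a, b in zip(w, w_hat))
# The per-weight reconstruction error is bounded by half a quantization
# step (scale / 2); it is this residual error that in-loop retraining
# would let the remaining weights compensate for.
```

The lossy rounding step is precisely why intertwining such passes with retraining pays off: the network can adapt its weights to the quantization grid instead of merely absorbing the error.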
- A relevant master's degree and good study results
- Basic knowledge of deep learning, image processing and/or embedded implementation techniques
- A thorough interest in research
- A high level of motivation
We offer a funded PhD position for about 4 years, standard wages apply (Barema 43).
The first contract is for one year and extendable.
For more information please contact Prof. dr. ir. Toon Goedemé, tel.: +32 15 68 82 34, mail: firstname.lastname@example.org.
You can apply for this job no later than July 31, 2019 via the online application tool.
KU Leuven seeks to foster an environment where all talents can flourish, regardless of gender, age, cultural background, nationality or impairments. If you have any questions relating to accessibility or support, please contact us at diversiteit.HR@kuleuven.be.
| Field | Value |
|---|---|
| Title | PhD Position: Compiling and Optimizing Deep Learning Networks on Custom Hardware |
| Job location | Oude Markt 13, 3000 Leuven |
| Published | May 13, 2019 |
| Application deadline | July 31, 2019 |
| Job types | PhD |
| Fields | Algorithms, Artificial Neural Network, Electrical Engineering, Machine Learning, Computer Vision, Image Processing |