Project Description:
Luxoft is looking for an individual to join an ambitious team developing Deep Learning and High-Performance Computing GPU kernels on the AMD Radeon Open Compute (ROCm) platform for MIOpen and Composable Kernel, AMD's Deep Learning primitives libraries, which provide highly optimized implementations of different operators.
• https://github.com/ROCmSoftwarePlatform
• https://github.com/ROCmSoftwarePlatform/composable_kernel
• https://github.com/ROCmSoftwarePlatform/MIOpen
The successful candidate will be an experienced GPU-compute programmer with an eye for hardware-aware performance optimizations.
Responsibilities:
The ideal candidate will be responsible for:
• Writing high-performance GPU kernels for the Machine Learning and Deep Learning libraries MIOpen and Composable Kernel
• Porting and optimizing algorithms for new GPU hardware
• Performing code reviews, building unit tests, authoring detailed documentation related to their work, and working with on-site and off-shore teams to deliver the software solutions on schedule
• Playing a key role in all phases of software development, including system requirements analysis and the coordination of feature design and development across functional and organizational boundaries
Mandatory Skills Description:
• Strong programming skills in modern C++ (templates, compile-time optimizations)
• In-depth knowledge of at least one parallel programming technology (CUDA, HIP, OpenCL, SYCL, etc.)
• Experience in parallel computing on multi-core/multi-node architectures (GPU/DL accelerators, computer clusters)
• Experience with parallel programming techniques and optimizations
• Understanding of Linear Algebra routines on tensors ("general algorithms" knowledge, not necessarily Linear Algebra)
• Good teamwork and interpersonal skills
• Ability to work independently and within complementary teams
Nice-to-Have Skills:
• Demonstrated flexibility, strong motivation, and a proven track record of meeting results-oriented deadlines
• Detailed knowledge of GPU/accelerator hardware architecture from a computational perspective
• Familiarity with deep neural network machine learning technologies, architectures, and modern machine learning programming frameworks
• Experience working with and developing virtualization containers and package managers for code deployment
• Experience working with CPU/GPU assembly
• Basic understanding of the Linux OS
• Basic knowledge of the software development lifecycle and software practices, including debugging, testing, revision control, documentation, and bug tracking
• Experience using version control software such as Git
Languages:
English: C1 Advanced