Project Description:
Luxoft is looking for an individual to join a hardworking team developing Deep Learning and High-Performance Computing GPU kernels on the AMD Radeon Open Compute (ROCm) platform for MIOpen and Composable Kernel, AMD's Deep Learning primitives libraries, which provide highly optimized implementations of a range of operators.
• https://github.com/ROCmSoftwarePlatform
• https://github.com/ROCmSoftwarePlatform/composable_kernel
• https://github.com/ROCmSoftwarePlatform/MIOpen
The successful candidate will be an experienced GPU-compute programmer with an eye for hardware-aware performance optimizations.
Responsibilities:
• Write high-performance GPU kernels for the Machine Learning and Deep Learning libraries MIOpen and Composable Kernel.
• Port and optimize algorithms for new GPU hardware.
• Perform code reviews, build unit tests, author detailed documentation related to your work, and work with on-site and off-shore teams to deliver software solutions on schedule.
• Play a key role in all phases of software development, including system requirements analysis and coordinating feature design and development across functional and organizational boundaries.
Mandatory Skills Description:
• Strong programming skills in modern C++ (templates, compile-time optimizations)
• In-depth knowledge of CUDA/HIP and/or OpenCL
• Experience in parallel computing on GPUs or HW accelerators and/or HPC (High-Performance Computing)
• Detailed knowledge of GPU/accelerator hardware architecture from a computational perspective
• Extensive experience with parallel programming techniques and optimizations
• Understanding of Linear Algebra routines on tensors
• Experience using version control software such as Git
• Basic understanding of Linux internals, servers, and debugging
• Basic knowledge of the software development lifecycle and SW practices, including debug, test, revision control, documentation, and bug tracking
• Good teamwork and interpersonal skills
• Ability to work independently and within complementary teams
Nice-to-Have Skills:
• Demonstrated flexibility, strong motivation, and a proven track record of meeting results-oriented deadlines
• Familiarity with deep neural network machine learning technologies, architectures, and modern machine learning programming frameworks
• Experience working with and developing virtualization containers and package managers for code deployment
Languages:
English: C1 Advanced