The contract for this vacancy is signed for 6 months.
Project Description:
Are you passionate about computer graphics and high-performance computing? Would you like hands-on experience with state-of-the-art hardware, sometimes even before others get a chance?
We are looking for an experienced MLOps or DevOps Engineer to help deploy, maintain, and develop automation and infrastructure systems for a major hardware vendor.
The ideal candidate has a background in ML operations, is proficient in collaborating with Data Engineering teams, and is well versed in automation tools for clustered deployments.
Responsibilities:
You will focus on supporting the local Data Engineering, Software Infrastructure, and Research teams:
- Build tailored automation systems for teams of ML developers
- Facilitate collaboration between the Data Engineering, ML Research, and Software Infrastructure teams
- Implement new helper tools, focusing on practical deployment and cost/resource management
Mandatory Skills Description:
- Good understanding of Unix/Linux
- Good experience with either Slurm or Kubernetes (both preferred; you should be able to set up, configure, manage, and expand clusters as needed)
- Able to work with Bash, JavaScript, and Python, with good experience in at least one of them
- Experience with GraphQL and REST APIs (throttling, caching on the client side using Redis, custom solutions, etc.)
- Experience with Ansible, Terraform or Pulumi, Docker, and Helm
Nice-to-Have Skills Description:
- Experience with ML frameworks
- Technical academic degree in IT or a related field
- General MLOps pipeline expertise