Core Requirements:
3-5 yrs of experience maintaining 24x7, high-volume ML Model Execution / MLOps and/or Data Storage Platforms, covering the following activities: release deployment, capacity management, monitoring and alerting, high availability
4+ yrs of Python development proficiency in a CI/CD environment (core backend / backbone functions rather than front-end, purpose-specific functions; e.g., how to sort, find, split, and reassemble data, how to integrate / interface with other systems, how to create a decision flow / orchestration based on events, etc.)
2-3 yrs of building Docker images from scratch, assembling them into containers, and deploying them into a Kubernetes environment (does not need to be a Kubernetes admin, but must be very proficient in assembling a container, updating the contents of an existing container, and deploying / maintaining the container on a Kubernetes platform)
3+ yrs of GitHub and Jenkins - must be well versed in GitHub (code repository) and Jenkins (e.g., experience creating and managing Jenkins jobs)
2+ yrs of Cloud Platform proficiency (Azure and/or AWS). AWS is preferred; strong Azure experience would also be considered. (Please provide a summary of your candidate's experience with AWS. Good things to look for: experience setting up containers (e.g., AWS EKS and/or ECS), access management (e.g., AWS IAM policies), compute image management (e.g., AWS AMIs), instrumentation (e.g., CloudWatch), and machine learning (e.g., AWS SageMaker).)
Desired (in order of importance):
Terraform experience - cloud automation (What has your candidate done specifically with Terraform / cloud automation? If AWS experience, use of the Boto3 library.)
Machine Learning familiarity - (What is your candidate's experience with ML? List a few specifics. They do not need to be a data scientist who builds models from scratch, but should be able to understand an existing model and especially its dependencies, such as libraries and frameworks.)
Kafka - (What is your candidate’s experience with Kafka if any, list a few specifics?)