Must haves:
• Apache Spark: deep understanding of Spark's core concepts and performance optimization, and the ability to develop efficient data processing jobs.
• Unity Catalog: ability to create tables from data in S3 and make them available in Unity Catalog.
• Programming: Python and SQL; knowledge of R is a plus.
• Databricks platform: hands-on experience, including Databricks SQL, Delta Lake, and Databricks Workspaces.
• AWS: experience with AWS services and an understanding of how to integrate them with Databricks.
• Version control: proficiency with version control systems such as Git for code management and collaboration.

Great to haves:
• DevOps and CI/CD: understanding of DevOps principles and experience with CI/CD pipelines to automate testing and deployment of Databricks jobs.
• Data engineering: experience designing and implementing scalable, reliable data pipelines, an understanding of ETL processes, and familiarity with data modeling techniques.
• Machine learning and data science: knowledge of machine learning algorithms and data science principles, plus experience with MLflow for managing the machine learning lifecycle.
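As a flavor of the Unity Catalog work above, registering existing S3 data as a catalog table can be sketched in Databricks SQL roughly as follows. The catalog, schema, table, and bucket names here are placeholders, and the snippet assumes an external location (backed by a storage credential) has already been configured for the bucket:

```sql
-- Hypothetical names throughout: analytics, raw, events, and the S3 path
-- are illustrative placeholders, not values from this posting.
CREATE CATALOG IF NOT EXISTS analytics;
CREATE SCHEMA IF NOT EXISTS analytics.raw;

-- Register an external table over existing Delta files in S3;
-- Unity Catalog governs access, the data stays in the bucket.
CREATE TABLE IF NOT EXISTS analytics.raw.events
USING DELTA
LOCATION 's3://example-bucket/events/';
```

Once created, the table is queryable from Databricks SQL or notebooks as `analytics.raw.events`, with permissions managed centrally through Unity Catalog grants.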