About the Job
Key Responsibilities
Pipeline Engineering: Build end-to-end data pipelines covering data collection, transformation, quality, and integration.
Solution Design: Collaborate with business teams to identify data requirements and assemble large, complex datasets.
Optimization: Design, implement, and fine-tune analytics solutions to meet technical performance metrics.
Integrity & Security: Work with the Data Architect to maintain data model integrity and data governance.
Operations: Partner with DevOps engineers to support consistent pipeline deployments and stable operations.
AI & API Collaboration: Develop robust APIs and data tools to assist data scientists and machine learning engineers.
Skills & Experience Required
Education & Experience
Degree: Bachelor’s or Master’s degree in Computer Science, IT, or a related field.
Experience: Minimum of 5 years of experience with SQL, data engineering, and Business Intelligence (BI) solutions.
Technical Stack (Mandatory)
Databricks: Hands-on experience delivering data projects using Databricks (Spark, notebooks, pipelines) is mandatory.
Cloud Platform: Proven experience designing and building solutions on the Azure cloud platform.
Languages & Databases
Languages: High proficiency in Python, Scala, and advanced SQL scripting, including development with Spark.
Databases: Experience with relational databases (e.g., PostgreSQL) and NoSQL platforms (e.g., HBase, MongoDB, Cassandra).
Architecture: Strong knowledge of data lakes, Azure Data Factory, data warehousing, and BI dashboards.
Modern AI Tools & Concepts
AI Productivity: Experience using coding assistants (e.g., GitHub Copilot) to increase development speed.
GenAI Awareness: Familiarity with GenAI concepts (RAG, LangChain, LlamaIndex, vector databases) to collaborate effectively on AI use cases.