Role Overview
Work with big data platforms and tools to build event-driven processing workflows and scalable ETL pipelines. Apply data governance, quality checks, and performance monitoring while supporting analytics and data science initiatives, including NLP and large language model (LLM) applications.
Key Responsibilities
- Design, implement, and maintain ETL pipelines and event-driven processing using AWS (SNS, SQS, Lambda), Hive, Cloudera, and Dataiku (see the sketch after this list).
- Manage and optimise large datasets; enforce data governance, quality checks, and monitoring.
- Deliver descriptive analysis, KPI development, and exception reporting to drive business insights.
- Develop and productionise machine learning solutions, including unsupervised learning, predictive analytics, NLP, and LLM applications.
- Create dashboards and visualisations (Spotfire) to communicate findings to stakeholders.
- Collaborate with cross-functional teams and communicate technical results clearly to non-technical stakeholders.
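For illustration only, a minimal sketch of the kind of event-driven processing described above, assuming the common SNS → SQS → Lambda fan-out pattern. The topic ARN, queue wiring, and record fields are hypothetical placeholders, not details from this role:

```python
import json

import boto3

sns = boto3.client("sns")

# Hypothetical topic ARN; a real deployment would inject this via config.
TOPIC_ARN = "arn:aws:sns:eu-west-1:123456789012:data-events"


def publish_event(record: dict) -> None:
    """Publish a data event to SNS; a subscribed SQS queue receives it."""
    sns.publish(TopicArn=TOPIC_ARN, Message=json.dumps(record))


def handler(event, context):
    """Lambda entry point triggered by the SQS queue.

    Each SQS record wraps an SNS envelope whose Message field
    holds the original JSON payload.
    """
    for sqs_record in event["Records"]:
        envelope = json.loads(sqs_record["body"])  # SNS envelope
        payload = json.loads(envelope["Message"])  # original event
        # Placeholder processing step; a real pipeline would validate,
        # enrich, and load the record into the warehouse here.
        print(f"processing event id={payload.get('id')}")
```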
Technical Skills
- Strong programming skills in Python and SQL for data manipulation and analysis (illustrative sketches follow this list).
- Experience working with both structured and unstructured data.
- Familiarity with machine learning frameworks, deep learning, NLP, and large language models.
- Hands-on experience with AWS services (SNS, SQS, Lambda), Hive, Cloudera, and Dataiku.
- Experience creating dashboards and visualisations, preferably using Spotfire.
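As a brief illustration of the Python/SQL data-manipulation and quality-check work above, a minimal sketch using pandas with an in-memory SQLite table standing in for a warehouse extract; the table, column names, and checks are hypothetical:

```python
import sqlite3

import pandas as pd

# Hypothetical sample data standing in for a warehouse extract.
conn = sqlite3.connect(":memory:")
pd.DataFrame(
    {"order_id": [1, 2, 3, 4], "amount": [120.0, None, 87.5, -10.0]}
).to_sql("orders", conn, index=False)

# SQL pulls the slice of interest; pandas applies the quality checks.
df = pd.read_sql("SELECT order_id, amount FROM orders", conn)

checks = {
    "no_null_amounts": df["amount"].notna().all(),
    "amounts_non_negative": (df["amount"].dropna() >= 0).all(),
    "order_ids_unique": df["order_id"].is_unique,
}
failed = [name for name, passed in checks.items() if not passed]
print("failed checks:", failed or "none")
```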
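And a minimal NLP sketch in the same spirit, assuming the Hugging Face transformers library is installed; the default sentiment model is downloaded on first use, and a production LLM application would pin a specific model:

```python
from transformers import pipeline

# Sentiment scoring with the library's default pretrained model.
classifier = pipeline("sentiment-analysis")

for text in [
    "Exception rates dropped sharply after the pipeline fix.",
    "The nightly ETL job failed three times this week.",
]:
    # Each call returns a list with one {label, score} dict per input.
    print(text, "->", classifier(text)[0])
```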
Qualifications & Experience
- Technical degree in Computer Science, Data Analytics, Engineering, Mathematics, or a related field.
- At least 5 years of practical experience in data science or analytics roles.
- Intellectually curious self-starter, able to work independently with minimal supervision.
- Strong stakeholder management and communication skills.