Purpose of the Position: Support clients in their data and analytics journey by designing scalable data pipelines, improving data consistency, and building the infrastructure for cloud-based analytics platforms on Google Cloud Platform (GCP).
Key Result Areas and Activities:
Develop and maintain data pipelines using GCP services like Cloud Functions, Dataflow, and BigQuery.
Assist in designing and implementing ETL/ELT workflows for structured and semi-structured data.
Contribute to building and maintaining data models and schemas using best practices.
Support the development and optimization of Spark jobs using PySpark or Scala (a minimal illustrative sketch follows this list).
Collaborate with senior engineers to perform data quality checks and resolve data inconsistencies.
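To illustrate the kind of day-to-day work these activities describe, the following is a minimal PySpark sketch, not a prescribed implementation: it assumes a Dataproc-style environment where the GCS and spark-bigquery connectors are available, and the bucket, dataset, and column names are hypothetical placeholders.

    # Illustrative sketch only: a small batch ETL job of the kind this role supports.
    # Assumes the GCS connector and spark-bigquery connector are on the classpath
    # (as on Dataproc); all bucket, dataset, and column names are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("orders-etl-sketch").getOrCreate()

    # Extract: semi-structured JSON landed in a GCS bucket (hypothetical path).
    orders = spark.read.json("gs://example-landing-bucket/orders/*.json")

    # Transform: basic cleanup plus a simple data quality rule (non-negative amounts).
    clean = (
        orders
        .withColumn("order_ts", F.to_timestamp("order_ts"))
        .dropDuplicates(["order_id"])
        .filter(F.col("amount") >= 0)
    )

    # Load: append into a BigQuery table via the spark-bigquery connector
    # (temporaryGcsBucket is required for the indirect write method).
    (
        clean.write.format("bigquery")
        .option("table", "example_dataset.orders_clean")
        .option("temporaryGcsBucket", "example-temp-bucket")
        .mode("append")
        .save()
    )

    spark.stop()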
Essential Skills:
Strong experience in Python and Spark.
Solid understanding of data engineering concepts and GCP architecture.