The Data Engineer participates in the design and build of modern data products that consist of raw data stores (data lakes) and cleansed data repositories, populated by batch or streaming data pipelines. The Data Engineer works with a team to create a robust, sustainable, and flexible design, and leads the technical delivery using Agile delivery frameworks such as Scrum or Kanban.
* Developing and managing data processes to ensure that data is available and usable
* Creating and automating data pipelines and platforms
* Managing and monitoring data quality via automated testing frameworks (Data Driven Testing, TDD, etc.)
* Working closely with data architects, data scientists, and data visualization developers to design, build, test, deliver, and maintain sustainable and highly scalable data solutions
* Researching data acquisition options and evaluating their suitability
* Integrating data management solutions into client environments
* Actively managing risks to data and ensuring there is a data recovery plan
* Building data repositories such as data warehouses, data lakes, and data marts
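As a minimal illustration of the automated data-quality checks mentioned among the responsibilities above (all function names, field names, and rules here are hypothetical, not part of any specific framework), a simple rule-based checker might look like:

```python
def check_data_quality(records, required_fields, numeric_fields):
    """Run basic data-quality rules and return a list of violation messages."""
    violations = []
    for i, rec in enumerate(records):
        # Rule 1: required fields must be present and non-empty.
        for field in required_fields:
            if rec.get(field) in (None, ""):
                violations.append(f"row {i}: missing required field '{field}'")
        # Rule 2: numeric fields must parse as numbers.
        for field in numeric_fields:
            value = rec.get(field)
            if value not in (None, ""):
                try:
                    float(value)
                except (TypeError, ValueError):
                    violations.append(f"row {i}: non-numeric value in '{field}'")
    return violations

# Hypothetical sample data: one clean row, one row violating both rules.
records = [
    {"id": "1", "amount": "19.99"},
    {"id": "", "amount": "abc"},
]
print(check_data_quality(records, ["id"], ["amount"]))
```

In practice such rules would run inside a testing framework on every pipeline run, failing the build when violations appear rather than just printing them.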
* Substantial relevant professional work experience.
* Experience and expertise in the following:
  * Creating robust and extensible data pipelines for production systems
  * Use of AWS services such as S3, EC2, RDS, Glue, Redshift, and Kinesis, including ETL tooling
  * Creating secure, performant, and well-modeled data stores
  * Common analytical platform architecture patterns (star schema, data integration patterns, ABAC, data quality frameworks, etc.)
  * Data lake design patterns and technology options (schema-on-read, metadata capture, search frameworks)
  * Use of scripting languages, preferably Python
  * Familiarity with graph databases, preferably Neo4j
  * Source code version control using Git
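As a minimal, hypothetical sketch of the Python pipeline work described above (the function names, fields, and sample data are invented for illustration, not a reference to any specific client system), an extract-transform-load flow from raw CSV input to a cleansed JSON-lines store might look like:

```python
import csv
import io
import json

def extract(csv_text):
    """Parse raw CSV text into dictionaries (the 'raw' zone of the pipeline)."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows):
    """Cleanse records: trim whitespace, coerce types, drop incomplete rows."""
    cleansed = []
    for row in rows:
        name = (row.get("name") or "").strip()
        amount = (row.get("amount") or "").strip()
        if not name or not amount:
            continue  # drop rows missing required fields
        cleansed.append({"name": name, "amount": float(amount)})
    return cleansed

def load(records):
    """Serialize cleansed records as JSON lines (a simple 'cleansed' store)."""
    return "\n".join(json.dumps(r) for r in records)

raw = "name,amount\n Alice ,10.5\nBob,\nCarol,7\n"
print(load(transform(extract(raw))))
```

Keeping extract, transform, and load as separate pure functions is what makes a pipeline like this extensible: each stage can be unit-tested in isolation and swapped for an S3 reader, a Glue job, or a Redshift loader without touching the others.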