• Location: USA, Massachusetts, Boston
• Date Posted: 10th Apr, 2020
• Reference: AWS 31320 PM_1586542370
Working under the Lead Data Engineer, you will write the algorithms behind the most prominent features of their analytics products. Your quantitative mindset will steer you toward solving complex problems in collecting, storing, and accessing all of their data.

Role & Responsibilities

* Create and maintain ETL processes using Spark

* Create new extraction libraries for text, images, and video, and maintain the existing ones

* Maintain data integrity in every facet of the pipeline

* Relentlessly clean, fix, and aggregate the data

Skills & Qualifications

* Proficiency in writing Python

* Experience building with distributed computing and orchestration frameworks such as Apache Spark and Airflow

* Experience writing and productionizing complex data transformations in SQL

* Experience designing and building ETL pipelines from various input sources

* Experience working in cloud infrastructure (e.g. AWS services such as S3, EC2, EMR, Lambda and Redshift)

* Experience with data extraction (e.g. text, images, video) from HTML documents

* Exposure to data science areas and algorithms would be a big plus

Benefits

* Unlimited PTO and flexible work-from-home

* Comprehensive health insurance

* Free lunch while in-office

* Unlimited monthly Metrocard

* Mental Health and Learning & Development Programs
