Senior Data Engineer
Location: 100% Remote (USA-based only)
Salary: Up to $160k/$170k
Sponsorship: Not Available
About the Company
The organization delivers real-world evidence to biopharmaceutical companies through syndicated registry data and analytics services. It is the world's leading source of highly curated data in inflammatory and autoimmune diseases. The company's portfolio includes eight registries: rheumatoid arthritis, psoriasis, atopic dermatitis, psoriatic arthritis and spondylarthritis, inflammatory bowel disease, multiple sclerosis, and neuromyelitis optica.
Role & Responsibilities
As a Senior Data Engineer, you will work as part of a team to design, develop and implement AWS-based solutions that support our clinical registry, biorepository, and precision medicine businesses. Applications include:
* data acquisition and management
* interfacing with LIMS and third-party lab systems
* linking data from our other business units, e.g., retinal scans, and integrating with "omics" data collected from biosamples.
This is an important role that helps biostatisticians, epidemiologists, and data scientists gain critical insights that improve patient outcomes.
You will be:
* Working as part of a small, focused agile team to design, develop, and release production-quality, cloud-native systems and applications that support our real-world evidence (RWE) and biorepository business.
* Designing and implementing new cloud-based solutions (e.g., PaaS, FaaS (serverless), and SaaS applications and pipelines on AWS) to operationalize reporting and analytics use cases with secure, frictionless delivery of and access to registry and biospecimen data.
* Proposing new and alternative solutions to streamline and automate workflows and improve our data and software development lifecycle.
Skills & Qualifications
* Bachelor of Science in computer science, computer engineering, bioinformatics engineering, or related discipline.
* 7+ years of industry experience.
* Experience working with clinical (registry, claims, EMR), epidemiologic (RWD), lab, omics (proteomics, genomics, etc.), and similar data and technologies.
* Development expertise within the data discipline using Linux, Python, SQL, Spark, and Hive/Presto, along with data pipeline and orchestration tools, and a strong understanding of key AWS technologies (S3, AWS Glue, CDK, ECS, EMR, etc.).
* Experience with ETL/ELT tools and REST-oriented APIs required for data integration.
* An understanding of a wide range of core data processing platforms, including DBMSs, data warehouses, data lakes, and lakehouses.
* Solid understanding of modern data stack objectives (sourcing, ingestion including API integration, storage/processing, transformation/modeling, observability/QA, BI/analytics) to move from replicant databases to brand name (e.g., Snowflake), cloud collective (AWS), and/or hybrid solutions with orchestrated pipelines and other diverse workloads/flows (transactional, analytics, and research data science).
Benefits
* Competitive salary up to $160k/$170k
* Fully Remote Opportunity
* Around 5% bonus
* Full benefits package
* 401(k) with 7.5% match
* Opportunity to work on a greenfield project
* Opportunity to dive into new technologies and projects if you're interested
* Great company if you're interested in breaking into healthcare
