Job Description
Develop 15 data ingestion pipelines from heterogeneous sources (REST APIs, SFTP file drops, database extracts) → S3 → Glue ETL → Lake Formation
Responsibilities
- Implement ETL transformations per R2 data mapping specifications
- Build cross-agency data sharing patterns: Agency B data → Central Platform (Lake Formation cross-account grants, resource links)
- Implement data lineage tagging using OpenLineage / AWS-native lineage metadata for governance audit trail
- Configure data quality checks for multi-source ingestion — handle schema drift, late-arriving data, source unavailability
- Write and maintain IaC (CDK/Terraform) for R2 pipeline resources
- Execute unit testing, integration testing, and cross-agency data access validation
- Support UAT with Agency B data owners — validate data accuracy, timeliness, and access controls
- Document pipeline configurations, source connectivity patterns, and data flow ...