Bioinformatics Data Engineer | Utrecht, Netherlands | Genmab

Position:	Bioinformatics Data Engineer
Institution:	Genmab
Location:	Utrecht, Netherlands
Duties:	Design, develop, and deploy reproducible data pipelines using cloud-native tools. All our pipelines use infrastructure as code, have automated tests, and are as reusable and reproducible as possible; Connect with collaborators (scientists, project managers, etc.) to translate their needs and questions into technical requirements, then use those requirements to build data pipelines and visualizations that are meaningful, comprehensible, and practical for them; Lead some projects and make smaller contributions to others, as every data engineer does; Generate comprehensive documentation of the data products developed, for both technical and non-technical users; Promote good coding and data practices and lead by example
Requirements:	MS/PhD or equivalent experience in Computer Science, Bioinformatics, or a related field; 3+ years of demonstrated working experience as a data engineer; Experience with data pipeline design and creation is a must, following good coding practices and choosing the right tool for the job. Experience with ETL jobs (e.g. AWS Glue, Databricks jobs, AWS Lambda) and orchestrators (e.g. AWS Step Functions) is desirable; Solid experience in database design (partitions, schemas, choosing a database type, etc.) and query languages (SQL, PySpark, or similar) is required. Experience with Delta Lake (Delta tables) is a plus
