Internship: Data engineering: Synthetic data rendering for automated tests - Ieper, Belgium
Your future job
The volume and ingestion rate of incoming data flows from various sources into our data analytics and reporting platforms keep increasing. To keep those flows manageable, more data tests are required, preferably automated.
Those tests need to run on a consistent, controlled data set that can easily be expanded or changed when functionality is added.
More and more technology vendors are coming up with tools and platforms to support this.
Besides generating synthetic data, these setups need to be embedded in the data testing landscape so that they can be used in an operational environment.
The goal of the project is to identify a framework for generating synthetic data for several data sets, to 'productize' it, and to use that framework in combination with automated data tests.
The project includes the following steps:
- Tool selection and design
- Generation of synthetic data on different data sets
- Productization of the solution (setting up test pipelines, versioning, pipelines for deploying different versions, ...)
(From 4 weeks up to 6 months)
Your profile
- Bachelor's or Master's student in IT
- IT software analysis, design and development practices
- Minimum: Python & SQL development
- Preferably: Git/GitLab, Docker, Kubernetes
- Data engineering technologies
Main technologies used:
- Data generation solutions
- Great Expectations
- Databricks, Spark
- Python, SQL
- Continuous delivery (Git, GitLab CI, Docker, Kubernetes, ...)
- PostgreSQL database knowledge
Competencies the student could develop
- Working with state-of-the-art enterprise application frameworks used to develop and deploy applications that are used worldwide
- Analysis and development methodologies such as Domain-Driven Design and continuous integration/deployment
- Data engineering tasks like ETL, Python, data profiling, cloud development
- Test Engineering
We offer
- a challenging job in a dynamic high-tech international environment
- the opportunity to take ownership of your professional passion in order to contribute to the success of the company
- an enjoyable, team-oriented and professional atmosphere in a flat-structured organization
- versatile development opportunities