Internship: Data engineering: Synthetic data rendering for automated tests - Ieper, Belgium

Your future job

Data engineering: synthetic data rendering for automated tests

Incoming data flows from various sources towards our data analytics and reporting platforms keep growing, both in volume and in ingestion rate. To keep those flows manageable, more data tests are required, preferably automated ones.

Those tests need to run on a consistent, controlled data set, one that can easily be expanded or changed when functionality is added, and so on.

More and more technology vendors offer tools and platforms to support this.
Beyond generating synthetic data, these setups need to be embedded in the data testing landscape before they can be used in an operational environment.

The goal of the project is to identify a framework for generating synthetic data for several data sets, to 'productize' the solution, and to use that framework in combination with automated data tests.
 
The project includes the following steps:

- Tool selection and design
- Generation of synthetic data on different data sets
- Productization of the solution (setting up test pipelines, versioning, pipelines for deploying different versions, ...)
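
To give a flavour of the second step combined with automated tests, here is a minimal sketch in Python. It is an illustration only, not the framework the project will select: the order schema, the function names `generate_orders` and `check_orders`, and the value ranges are all hypothetical, and it uses only the standard library rather than any of the tools listed further down.

```python
import random
from datetime import date, timedelta


def generate_orders(n, seed=42):
    """Generate n reproducible synthetic order records (hypothetical schema)."""
    rng = random.Random(seed)  # fixed seed: the same data set on every run
    start = date(2024, 1, 1)
    return [
        {
            "order_id": i,
            "customer_id": rng.randint(1, 100),
            "amount": round(rng.uniform(5.0, 500.0), 2),
            "order_date": start + timedelta(days=rng.randint(0, 364)),
        }
        for i in range(1, n + 1)
    ]


def check_orders(rows):
    """Return the names of failed checks; an empty list means all passed."""
    failures = []
    ids = [r["order_id"] for r in rows]
    if len(ids) != len(set(ids)):
        failures.append("order_id_unique")
    if any(not (5.0 <= r["amount"] <= 500.0) for r in rows):
        failures.append("amount_in_range")
    if any(r["order_date"].year != 2024 for r in rows):
        failures.append("order_date_in_2024")
    return failures


if __name__ == "__main__":
    rows = generate_orders(1000)
    print(check_orders(rows))
```

Seeding the generator keeps the data set consistent between runs, and changing the seed or the schema in version control is one simple way to produce a new, traceable version of the test data.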

(Duration: from 4 weeks up to 6 months)

Your profile

  • Bachelor's or Master's student in IT
  • Familiarity with IT software analysis, design and development practices
  • Minimum: Python and SQL development
  • Preferably: Git/GitLab, Docker, Kubernetes
  • Data engineering technologies

 

Main technologies used:

  • Data generation solutions
  • Great Expectations
  • Databricks, Spark
  • Python, SQL
  • Continuous delivery (Git, GitLab CI, Docker, Kubernetes, ...)
  • PostgreSQL

Competencies the student could develop

  • Working with state-of-the-art enterprise application frameworks used to develop and deploy applications that are used worldwide
  • Analysis and development methodologies like Domain-Driven Design and continuous integration/deployment
  • Data engineering skills such as ETL, Python, data profiling and cloud development
  • Test engineering

 

We offer

  • a challenging job in a dynamic high-tech international environment
  • the opportunity to take ownership of your professional passion in order to contribute to the success of the company
  • an enjoyable, team-oriented and professional atmosphere in a flat-structured organization
  • versatile development opportunities