Internship Data engineering
Date: 3 Jul 2026
Location: Ridderkerk, NL
Company: Alstom
Internship data engineering – Fleet Support Center NL
Alstom Netherlands
1. Alstom
Alstom is a global leader in the manufacturing, maintenance and service of rolling stock, infrastructure, signaling and digital mobility. Within the Benelux service organization, the Fleet Support Center provides digital solutions that help transform operational data into actionable insights for maintenance, troubleshooting and service delivery.
As the number of connected fleets and real-time data streams continues to grow, the Fleet Support Center is further professionalizing its data platform to ensure higher throughput, reliable processing and scalable data quality controls across its analytical and operational use cases.
2. Internship assignment
For this internship, Alstom is looking for a student interested in data engineering who will contribute to improving the throughput, scalability and reliability of the Fleet Support Center data platform. The assignment focuses on designing practical, scalable solutions for real-world data flows, with flexibility to align the content with the student’s interests.
Depending on the student’s interests, the internship may focus on data modeling, pipeline architecture, orchestration design, or distributed data processing. The student will work with technologies such as Pandas, Polars or Spark to design efficient data pipelines, selecting the appropriate tooling based on workload characteristics such as data volume, velocity and complexity.
In parallel, the student will design configurable schema-based validation steps to improve data quality and trustworthiness. These validation mechanisms will ensure that data is structurally correct and reliable before being used in downstream applications such as BI dashboards, monitoring solutions and data science use cases.
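As an illustration of what a configurable schema-based validation step might look like, here is a minimal sketch in plain Python. The column names (`train_id`, `timestamp`, `speed_kmh`), the `ColumnRule` class and the `validate_record` function are hypothetical and chosen only for this example; they do not describe the actual Fleet Support Center data or tooling.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ColumnRule:
    """One configurable rule: expected Python type and whether None is allowed."""
    dtype: type
    nullable: bool = False


# Hypothetical schema for illustration only; real column names will differ.
SCHEMA = {
    "train_id": ColumnRule(str),
    "timestamp": ColumnRule(str),
    "speed_kmh": ColumnRule(float, nullable=True),
}


def validate_record(record: dict[str, Any], schema: dict[str, ColumnRule]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the record passed."""
    errors = []
    for name, rule in schema.items():
        if name not in record:
            errors.append(f"missing column: {name}")
            continue
        value = record[name]
        if value is None:
            if not rule.nullable:
                errors.append(f"null not allowed: {name}")
            continue
        if not isinstance(value, rule.dtype):
            errors.append(
                f"wrong type for {name}: expected {rule.dtype.__name__}, "
                f"got {type(value).__name__}"
            )
    return errors


good = {"train_id": "T1", "timestamp": "2026-01-01T00:00:00", "speed_kmh": None}
bad = {"train_id": 7, "speed_kmh": 80.0}
print(validate_record(good, SCHEMA))
print(validate_record(bad, SCHEMA))
```

Because the rules live in a dictionary rather than in the checking logic, the same function could in principle be reused for different data sources by swapping the schema. In practice a dedicated validation library may be a better fit.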
3. Scope
- Analyze existing data flows and identify bottlenecks affecting throughput, latency and maintainability.
- Profile the incoming datasets and support technology selection (Pandas, Polars, Spark) based on workload characteristics.
- Design scalable architectures for ingestion, transformation and storage of operational data.
- Develop prototype pipelines for representative data flows and processing scenarios.
- Design configurable schema-based validation rules and data quality checks.
- Evaluate solutions based on throughput, execution time, data quality and maintainability metrics.
At a later stage, it will be determined which fleet, subsystem or business process will be selected as the primary pilot case. Depending on the data profile relevant at the time of starting the internship, the pilot may focus on large-scale historical data processing, frequent incremental loads or real-time processing of operational data.
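To give an idea of how the throughput and execution-time criteria in the scope could be measured, the sketch below times a processing function over a batch of rows. The `measure_throughput` helper and the toy transformation are assumptions made for this example, not part of the existing platform. The same harness could be pointed at a Pandas, Polars or Spark implementation of the same step to compare them.

```python
import time
from typing import Callable


def measure_throughput(
    process: Callable[[list[dict]], object],
    rows: list[dict],
    repeats: int = 3,
) -> dict:
    """Run `process` several times and report the best wall-clock time and rows per second."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        process(rows)
        timings.append(time.perf_counter() - start)
    best = min(timings)  # best-of-N reduces noise from other processes
    rate = len(rows) / best if best > 0 else float("inf")
    return {"rows": len(rows), "best_seconds": best, "rows_per_second": rate}


# Toy workload for illustration: double one numeric field per row.
sample_rows = [{"v": i} for i in range(10_000)]
result = measure_throughput(lambda rs: [r["v"] * 2 for r in rs], sample_rows)
print(result["rows"], "rows processed")
```

Taking the best of several runs is one common choice. Reporting the median or the full distribution may be more appropriate when latency variation itself matters.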
4. Organization
The project team consists of the following functions:
- Data engineering / data science: the student and the Alstom data specialist(s). This team will be responsible for the technical analysis, architecture design and implementation of the pipeline improvements.
- Fleet Support Center officers: this team will help validate whether the improved data flow supports daily operational needs and service processes.
- Engineering: this team will support the definition of source-system logic, data semantics and validation rules from a domain perspective.
The student will work closely with the Fleet Support Center team and will be involved in regular operational governance, sprint reviews and alignment on architecture, validation approach and measurable business impact.
The compensation is € 350 gross per month for an internship and € 400 per month for a graduation assignment.
5. Expected deliverables
The exact deliverables depend on the final scope of the internship, which will be defined based on the student’s interests and business priorities. Typical outcomes include:
- Analysis of current data flows and identification of bottlenecks.
- A proposed improvement in data architecture, modeling or pipeline design.
- One or more prototypes (e.g. scalable pipelines using Pandas, Polars or Spark).
- A configurable schema-based validation approach.
- An evaluation of results in terms of throughput, data quality and maintainability.
The final deliverables should demonstrate both practical engineering impact and a structured approach to solving real-world data challenges.
6. Scientific Contribution
Depending on the type of internship (e.g. HBO or Master thesis), there may be a requirement for a scientific contribution.
The internship offers the opportunity to investigate how different data engineering approaches can be applied effectively in an industrial environment. This may involve evaluating processing techniques, architectural choices, or validation strategies under varying conditions such as data volume, complexity, and usage requirements.
The scientific contribution lies in the structured analysis and comparison of alternative solutions, focusing on the trade-offs between performance, scalability, maintainability, and data reliability. Rather than implementation alone, the emphasis is on developing a well-founded approach supported by analysis, experimentation, and evaluation.
The assignment is therefore suitable for a thesis-oriented project in which practical engineering work is combined with a systematic research methodology and results in insights that are transferable beyond the specific use case.
7. Competences required
The student requires the following competencies to complete the internship:
- Proficiency in Python.
- Strong interest in data engineering, data pipelines and scalable processing architectures.
- Experience with data manipulation libraries such as Pandas and Polars; familiarity with Spark is a strong advantage.
- Understanding of data modeling, ETL / ELT concepts and performance optimization techniques.
- Knowledge of data validation, schema design and data quality monitoring approaches.
- Experience with SQL and working with structured and semi-structured data.
- Ability to analyze results and communicate technical trade-offs clearly to both technical and business stakeholders.
- Knowledge of Git and notebook-based or code-based development workflows.
Are you enthusiastic? We can well imagine! Apply via the button. For more information, you are welcome to contact Erik Sonneveld on 06 53 47 63 77 (by WhatsApp or phone).
We're looking forward to meeting you!