AI-Driven semiconductor test data analytics

Leuven

Your goal is to create a Python-based framework where engineers can ask natural language questions about their electrical data and receive instant statistical insights and visual diagnostics.

The mission

We are aiming to reduce the "Time to Market" in our semiconductor R&D by revolutionizing how we analyze test data. Currently, our engineers manually parse complex JSON files containing parametric extraction and electrical curves.

Inspired by recent literature on AI-Driven Semiconductor Test Data Analytics, we want to build a system that combines a Data Translator with an Interactive AI Agent. Your goal is to create a Python-based framework where an engineer can ask natural language questions (e.g., "Show me the defect pattern on Wafer X" or "Compare the yield of Lot A vs Lot B") and receive instant statistical insights and visual diagnostics.

Key responsibilities

  • Literature review & strategy: Start by analyzing state-of-the-art papers (such as Wang et al. on IEA-Plot and recent TPOR frameworks). Benchmark the pros and cons of using Knowledge Graphs vs. Vector RAG vs. SQL Agents for our specific data topology. 
  • Universal data translator: Design a pipeline to ingest standardized JSON data files (curves, parameters, flags) and convert them into a unified format (Parquet?) suitable for AI querying.
  • Agentic AI development: Build a "Code Interpreter" agent using Python (LangChain/LlamaIndex) and local LLMs (Llama 3, Mistral) on our GPU servers. The agent must be capable of writing code to perform statistical aggregations (PCA, Mean, Sigma) without hallucination.
  • Advanced visualization: Go beyond basic charts. Implement an automated plotting module capable of generating distribution curves and correlation plots based on the chat context.
  • Prototyping: Wrap this technology in a user-friendly web interface (Streamlit) for immediate feedback from R&D engineers. The student may consider an integration in Azure or Copilot Studio, but a standalone solution is preferred. The decision will be made after the first point (Literature review & strategy).
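
To make the data translator and statistical aggregation responsibilities more concrete, here is a minimal sketch of the kind of deterministic code the agent would be expected to produce. The JSON layout and every field name (`lot`, `wafer`, `die`, `parameters`, `flags`, `vth_V`, `ion_uA`) are assumptions made for illustration only; they do not describe the actual test-data schema. The sketch uses only the Python standard library so that it runs anywhere; a real pipeline would more likely load the flat rows into pandas and persist them in a columnar format such as Parquet.

```python
import json
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical input: the field names below are assumptions for illustration,
# not the actual test-data schema.
RAW_JSON = """
[
  {"lot": "A", "wafer": 1, "die": [0, 0], "parameters": {"vth_V": 0.40, "ion_uA": 510.0}, "flags": []},
  {"lot": "A", "wafer": 1, "die": [0, 1], "parameters": {"vth_V": 0.44, "ion_uA": 530.0}, "flags": []},
  {"lot": "B", "wafer": 2, "die": [0, 0], "parameters": {"vth_V": 0.50, "ion_uA": 470.0}, "flags": ["outlier"]},
  {"lot": "B", "wafer": 2, "die": [0, 1], "parameters": {"vth_V": 0.54, "ion_uA": 490.0}, "flags": []}
]
"""


def flatten(records):
    """Turn nested measurement records into flat rows (one dict per die)."""
    rows = []
    for rec in records:
        row = {
            "lot": rec["lot"],
            "wafer": rec["wafer"],
            "die_x": rec["die"][0],
            "die_y": rec["die"][1],
            "flagged": bool(rec.get("flags")),
        }
        row.update(rec["parameters"])  # one column per extracted parameter
        rows.append(row)
    return rows


def summarize(rows, column, by="lot"):
    """Compute count, mean and sample standard deviation of `column` per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append(row[column])
    return {
        key: {
            "n": len(values),
            "mean": mean(values),
            "sigma": stdev(values) if len(values) > 1 else 0.0,
        }
        for key, values in groups.items()
    }


rows = flatten(json.loads(RAW_JSON))
stats = summarize(rows, "vth_V")
for lot, s in sorted(stats.items()):
    print(f"lot {lot}: n={s['n']} mean={s['mean']:.3f} sigma={s['sigma']:.3f}")
```

Keeping the numbers in executed code rather than in the LLM's free-text answer is one way to approach the "without hallucination" requirement: the model only chooses which function to call and with which column, while the arithmetic itself stays deterministic.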

What you will learn

  • Research-to-Production: How to take concepts from academic papers and implement them in a real industrial environment.
  • Agentic workflows: Mastering the intersection of LLMs and deterministic code execution.
  • Semiconductor domain knowledge: Understanding parametric chip/device (MOSFET) testing.
  • High-Performance computing: Utilizing local GPU infrastructure for secure, private model inference.

Requirements

  • Master’s student in CS, Data Science, AI, or EE, or PhD exchange student.
  • Understanding of LLM architectures (transformers, LSTM) and AI fundamentals is a must. Familiarity with advanced LLM architectures is preferred. Experience running pre-trained LLMs locally is preferred.
  • Strong Python skills: Experience with deep learning frameworks (TensorFlow, PyTorch) and related libraries such as pandas and NumPy is a must.
  • Academic mindset: You are comfortable reading IEEE/research papers and extracting the methodology to apply it to code.
  • GenAI knowledge: Understanding of RAG (Retrieval-Augmented Generation) and LLM limitations.
  • Familiarity with Git and Linux/Unix environments.

Nice to Have

  • Knowledge of Knowledge Graphs (Neo4j, NetworkX).
  • Experience processing spatial data (heatmaps/wafer maps).

How to Apply

Please send your CV. Bonus: In your email, briefly mention one challenge you foresee in applying LLMs to numerical scientific data.

Type of internship: Master internship, PhD internship

Duration: 6

Required educational background: Computer Science

University promotor: Siegfried Mercelis (UAntwerpen)

Supervising scientist(s): For further information or for application, please contact Jerome Mitard (Jerome.Mitard@imec.be)

The reference code for this position is 2026-INT-074. Mention this reference code in your application.

Only for self-supporting students.


Applications should include the following information:

  • resume
  • motivation
  • current study

Incomplete applications will not be considered.