The mission
We aim to reduce "Time to Market" in our semiconductor R&D by revolutionizing how we analyze test data. Currently, our engineers manually parse complex JSON files containing parametric extraction results and electrical curves.
Inspired by recent literature on AI-driven semiconductor test data analytics, we want to build a system that combines a Data Translator with an Interactive AI Agent. Your goal is to create a Python-based framework where an engineer can ask natural-language questions (e.g., "Show me the defect pattern on Wafer X" or "Compare the yield of Lot A vs. Lot B") and receive instant statistical insights and visual diagnostics.
Key responsibilities
- Literature review & strategy: Start by analyzing state-of-the-art papers (such as Wang et al. on IEA-Plot and recent TPOR frameworks). Weigh the pros and cons of Knowledge Graphs vs. Vector RAG vs. SQL Agents for our specific data topology.
- Universal data translator: Design a pipeline that ingests our JSON test-data files (curves, parameters, flags) and standardizes them into a unified format (e.g., Parquet) suitable for AI querying; a minimal sketch follows this list.
- Agentic AI development: Build a "Code Interpreter" agent using Python (LangChain/LlamaIndex) and local LLMs (Llama 3, Mistral) on our GPU servers. The agent must be capable of writing code to perform statistical aggregations (PCA, mean, sigma) without hallucinating results; see the agent sketch after this list.
- Advanced visualization: Go beyond basic charts. Implement an automated plotting module capable of generating distribution curves and correlation plots based on the chat context (the UI sketch after this list renders one such plot).
- Prototyping: Wrap this technology in a user-friendly web interface (Streamlit) for immediate feedback from R&D engineers. The student may consider an integration with Azure or Copilot Studio, but a standalone solution is preferred; the decision will be made after point 1 (literature review & strategy).
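To make the translator step concrete, here is a minimal Python sketch. The JSON schema is assumed for illustration (hypothetical "parameters" and "curves" keys with "points", "device_id", and "curve_type" fields); the real layout has to be mapped during the project.

```python
import json
from pathlib import Path

import pandas as pd  # writing Parquet additionally requires pyarrow


def translate_test_file(json_path: Path, out_dir: Path) -> None:
    """Flatten one raw test-data JSON file into columnar Parquet tables."""
    raw = json.loads(json_path.read_text())

    # Hypothetical layout: one record per device with scalar parametric values.
    params = pd.json_normalize(raw["parameters"])             # assumed key
    params["source_file"] = json_path.name                    # keep provenance
    params.to_parquet(out_dir / f"{json_path.stem}_params.parquet")

    # Hypothetical layout: electrical curves stored as lists of sample points.
    curves = pd.json_normalize(
        raw["curves"],                       # assumed key
        record_path="points",                # assumed per-curve sample list
        meta=["device_id", "curve_type"],    # assumed per-curve metadata
    )
    curves.to_parquet(out_dir / f"{json_path.stem}_curves.parquet")
```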
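For the agent itself, a minimal sketch assuming an Ollama-served Llama 3 model and LangChain's experimental pandas agent; the actual stack (LangChain vs. LlamaIndex, model choice) is an open design decision of the project, and the file and column names are placeholders.

```python
import pandas as pd
from langchain_community.llms import Ollama
from langchain_experimental.agents import create_pandas_dataframe_agent

# Output of the translator step (placeholder file name).
df = pd.read_parquet("wafer_params.parquet")

# Local model served by Ollama; temperature 0 for reproducible answers.
llm = Ollama(model="llama3", temperature=0)

# The agent answers by writing and executing pandas code against df, so the
# numbers come from actual computation rather than token prediction.
agent = create_pandas_dataframe_agent(
    llm, df, verbose=True, allow_dangerous_code=True
)

# "vth" and the lot labels are placeholder names for illustration.
print(agent.invoke("Compare the mean and sigma of vth for lot A vs lot B"))
```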
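Finally, a minimal Streamlit sketch of the prototype UI with the agent call stubbed out; it renders one of the planned plot types (a parameter distribution), and the file and column names are again placeholders.

```python
import matplotlib.pyplot as plt
import pandas as pd
import streamlit as st

st.title("Test-Data Analysis Assistant")

df = pd.read_parquet("wafer_params.parquet")   # placeholder file name

question = st.chat_input("Ask about the test data")
if question:
    st.write(f"Question: {question}")
    # In the real app the question would go to the agent
    # (e.g. agent.invoke(question)); here we only demonstrate one of the
    # planned plot types: a parameter distribution curve.
    fig, ax = plt.subplots()
    df["vth"].plot.hist(bins=50, ax=ax)        # placeholder column name
    ax.set_xlabel("vth")
    ax.set_ylabel("count")
    st.pyplot(fig)
```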
What you will learn
- Research-to-Production: How to take concepts from
academic papers and implement them in
a real industrial environment.
- Agentic workflows: Mastering the intersection of
LLMs and deterministic code execution.
- Semiconductor domain knowledge: Understanding parametric chip/device (MOSFET) testing.
- High-performance computing: Utilizing local GPU infrastructure for secure, private model inference.
Requirements
- Master's student in CS, Data Science, AI, or EE; exchange Ph.D. students are also considered.
- Understanding of model architectures (transformers, LSTMs) and AI fundamentals is a must. Familiarity with advanced LLM architectures is preferred, as is experience running pre-trained LLMs locally.
- Strong Python skills: Experience with deep learning frameworks (TensorFlow, PyTorch) and related libraries such as pandas and NumPy is a must.
- Academic mindset: You are comfortable reading IEEE/research papers and translating their methodology into code.
- GenAI knowledge: Understanding of RAG (Retrieval-Augmented Generation) and LLM limitations.
- Familiarity with Git and
Linux/Unix environments.
Nice to have
- Experience with knowledge graphs (Neo4j, NetworkX).
- Experience processing spatial
data (heatmaps/wafer maps).
How to apply
Please send your application including:
- CV/resume
- motivation letter
- current study
Bonus: in your email, briefly mention one challenge you foresee in applying LLMs to numerical scientific data.
Incomplete applications will not be considered.