How AI and machine learning change the traditional compute architecture
For many decades, dynamic random-access memory (DRAM) has been the main memory in traditional Von Neumann compute architectures. Its role is to temporarily store data and program code and to feed these to the processor’s cache memories through double data rate (DDR) data buses. DRAM is byte-addressable, which means that it can address one or a few bytes at a time. One of its most critical metrics is low latency, i.e., the ability to access the first byte within about 50 ns. Low latency matters most when retrieving program code, whose branching instructions are scattered randomly across the DRAM chip.
For a long time, technology scaling increased DRAM density enough to meet the growing demand for DRAM and to keep pace with the performance improvement of the processor’s logic part. Unfortunately, since about 2015, DRAM cost scaling (expressed as cost per bit) has increasingly struggled to keep up with Moore’s Law.
Parallel to this evolution, data-intensive applications such as AI and machine learning are changing the Von Neumann compute architecture. More processor cores, and more specialized ones (GPUs, TPUs, ...), now operate in parallel to perform the application-specific tasks. As these applications are extremely data-hungry, ever larger streams of data (rather than program code) flow from the memory to the processors, increasing the demand for DRAM. New interconnect standards are being introduced to complement the parallel DDR buses and support these large data transfers. One of these is the compute express link (CXL), an open, high-bandwidth processor-memory interconnect standard that allows for more efficient use of DRAM. CXL supports a variety of use cases, giving rise to different device types, referred to as types 1, 2, and 3. The latter, also called type-3 buffer memory, can be envisioned as an off-chip pool of memories that feeds the various processor cores with large blocks of data through a high-bandwidth CXL switch.
A 3D charge-coupled device: promising alternative to DRAM in CXL type-3 buffer memories
While the industry pursues DRAM in combination with CXL interfaces, imec takes a different turn. The imec research team started from the observation that CXL memories, particularly the type-3 buffer memory, may have characteristics different from DRAM. In particular, the strict requirement on first-bit latency (the reason why it has been so difficult to replace DRAM with another type of memory) can be relaxed in these CXL type-3 architectures. The condition is that the new memory technology is cost-effective and can transfer large blocks of data in a very short time, which compensates for a larger first-bit latency.
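The trade-off between first-bit latency and block size can be made concrete with a little arithmetic. The sketch below is only an illustration: the function name, the 5 µs "slow" latency, and the 32 GB/s streaming bandwidth are hypothetical values chosen for the example, not figures from imec or from the CXL standard. Only the ~50 ns latency is taken from the text above.

```python
def effective_bandwidth(block_bytes, first_access_latency_s, stream_bandwidth_bps):
    """Average bytes per second for one block transfer.

    The first-access latency is paid once per block; the rest of the
    block then streams at the full bandwidth.
    """
    transfer_time_s = first_access_latency_s + block_bytes / stream_bandwidth_bps
    return block_bytes / transfer_time_s


LATENCY_SHORT_S = 50e-9   # the ~50 ns first-byte latency quoted for DRAM
LATENCY_LONG_S = 5e-6     # hypothetical, 100x slower first access
STREAM_BW = 32e9          # hypothetical streaming bandwidth in bytes/s

for block in (64, 4096, 1 << 20):  # cache line, page, 1 MiB block
    for name, latency in (("short", LATENCY_SHORT_S), ("long", LATENCY_LONG_S)):
        eff = effective_bandwidth(block, latency, STREAM_BW) / STREAM_BW
        print(f"block={block:>8} B  latency={name:<5}  link efficiency={eff:6.1%}")
```

For small, random accesses the long latency dominates and the link is almost idle. For megabyte-sized blocks the same latency costs only a modest fraction of the transfer time, which is the reasoning behind relaxing the latency requirement for block-addressable buffer memories.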
Imec recently presented a new memory concept that promises to meet all CXL type-3 block-addressable memory requirements: a charge-coupled device (CCD) with an IGZO-based channel arranged in a 3D NAND-like architecture. [1]
In a CCD, a register is written by loading charges into its stages, which are made up of MOS capacitors that can each store one bit of information. This is essentially a serial operation, similar to a bucket brigade: the charge is fed into the first stage and then moves on to the next stage, controlled by several phase gates per stage (typically three or four). This movement continues until the first charge arrives at the output, where it is read out. The use of CCD as a memory device dates back to 1970, but it was soon overshadowed by the byte-addressable DRAM. The technology was later introduced in the image sensor market, where it matured further. Thus, the basic CCD technology is well-known and reliable. Being charge-based, it is also power efficient.
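The bucket-brigade behavior described above can be modeled as a simple serial shift register. The sketch below is a purely behavioral toy model, with a hypothetical class name and no device physics: each shift moves every stored bit one stage toward the output, so data comes out in the order it was written.

```python
from collections import deque


class CCDRegister:
    """Toy model of a CCD register: a fixed-length serial shift register."""

    def __init__(self, n_stages):
        # Index 0 is the input side, the last index is the output side.
        self.stages = deque([0] * n_stages, maxlen=n_stages)

    def shift(self, bit_in=0):
        """Move all charges one stage forward and return the bit that exits."""
        bit_out = self.stages[-1]
        # With maxlen set, appendleft drops the element at the output end.
        self.stages.appendleft(bit_in)
        return bit_out

    def write(self, bits):
        """Serially load bits; the first bit ends up closest to the output."""
        for bit in bits:
            self.shift(bit)

    def read_all(self):
        """Shift the whole register out, refilling it with empty stages."""
        return [self.shift(0) for _ in range(len(self.stages))]


reg = CCDRegister(8)
data = [1, 0, 1, 1, 0, 0, 1, 0]
reg.write(data)
print(reg.read_all() == data)
```

Note that in this model a read is destructive and always takes as many shifts as there are stages, which is why such a register suits block access rather than random byte access.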
The novelty of imec’s concept is its 3D nature, which makes the CCD technology highly dense and very cost-effective. The proposed 3D architecture is inspired by 3D NAND technology, which has memory cells in all three dimensions. In a 3D NAND architecture, the cells are stacked to form a vertical string and are addressed by horizontal word lines. A ‘punch and plug’ process is used for fabrication: a word-line layer stack is grown, and cylindrical holes are formed by drilling down through the stack using advanced etch processes. NAND-specific layers, including a poly-Si channel, are then deposited along the sidewall of the hole. [2]
Imec’s 3D CCD buffer memory concept follows a similar approach: the CCD registers, each composed of a string of MOS capacitor cells, are integrated into vertically aligned plugs. One key enabler is the use of an oxide semiconductor (such as IGZO) as the channel material instead of poly-Si. IGZO can be deposited by atomic layer deposition (ALD), which allows conformal deposition in such high-aspect-ratio structures. An additional advantage of IGZO is its relatively long retention time. This relaxes the need to frequently refresh the memory, which is a major drawback of DRAM.
A 2D proof-of-concept IGZO-based CCD structure
As a first step towards real implementations, imec demonstrated the memory operation of the CCD with IGZO on a 2D proof-of-concept. This planar CCD structure consists of an input stage, 142 stages (each consisting of four phase gates), which can each store one bit, and a two-transistor-based read-out stage. The CCD register is written by injecting charges through the input stage and sequentially transferring them through all 142 stages by switching the voltages of the phase gates. The CCD offers more than 200 s retention, an endurance of more than 10¹⁰ cycles without degradation, and a charge transfer speed exceeding 6 MHz. Multilevel storage capability of the CCD register was also demonstrated, contributing to a higher bit density.
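To put these figures in perspective, a short back-of-the-envelope calculation helps. The sketch below rests on an assumption the text does not spell out: that the 6 MHz figure corresponds to one stage-to-stage transfer per clock period. Under that assumption, it estimates how long a charge needs to traverse the full register and compares that with the retention time. The function names are hypothetical.

```python
STAGES = 142              # stages in the 2D proof-of-concept
TRANSFER_RATE_HZ = 6e6    # quoted lower bound on charge transfer speed
RETENTION_S = 200.0       # quoted lower bound on retention


def traverse_time_s(stages, transfer_rate_hz):
    """Time to move one charge through all stages.

    Assumes one stage-to-stage transfer per clock period (an assumption,
    not a statement from the source).
    """
    return stages / transfer_rate_hz


def register_capacity_bits(stages, bits_per_stage):
    """Bits held by one register for a given number of bits per stage."""
    return stages * bits_per_stage


t = traverse_time_s(STAGES, TRANSFER_RATE_HZ)
print(f"traverse time: {t * 1e6:.1f} us")
print(f"retention / traverse time: {RETENTION_S / t:.2e}")
print(f"capacity at 1 bit/stage:  {register_capacity_bits(STAGES, 1)} bits")
print(f"capacity at 2 bits/stage: {register_capacity_bits(STAGES, 2)} bits")
```

Under this assumption, a full pass through the register takes tens of microseconds, many orders of magnitude less than the retention time, so refresh would be a rare event compared with data movement.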
Toward high-density and low-cost 3D IGZO-based CCD buffer memories
Due to the 3D NAND-like architecture, the proposed concept can be manufactured much more cost-effectively than DRAM. But can 3D CCD-based buffer memories also beat DRAM in terms of bit density, which is expected to reach 1 Gb/mm² by 2030? To answer that question, the imec researchers estimated the bit density of the new 3D buffer memory by combining the characteristics of the 2D proof-of-concept CCD structure with what NAND Flash can enable today. They assumed two bits per cell and 30% array area overhead, the overhead being determined by the footprint of the metal contacts on the word lines. Also, a three-phase clock operation was adopted. This means three different phases per stage, where the equivalent phase gates of each stage receive the same clock signal.
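Three-phase clocking can be illustrated with a small behavioral model. In the sketch below, every gate with the same position inside its stage shares one clock phase, and activating a phase pulls each charge packet one gate forward from the neighboring gate. Function names and the example charge levels are hypothetical, and the model assumes that all packets sit under the same phase at any moment, as they would in normal operation.

```python
PHASES = 3  # three phase gates per stage, as in the density estimate


def load(levels):
    """Place one charge packet per stage under the first gate of that stage."""
    gates = [0] * (PHASES * len(levels))
    for stage, level in enumerate(levels):
        gates[stage * PHASES] = level
    return gates


def clock_step(gates, active_phase):
    """Activate one phase: packets under the previous phase move one gate on.

    Returns the new gate contents and the charge that left the register.
    """
    source_phase = (active_phase - 1) % PHASES
    moved = [0] * len(gates)
    exited = 0
    for i, charge in enumerate(gates):
        if charge == 0:
            continue
        if i % PHASES != source_phase:
            moved[i] += charge        # not under the source phase: stays put
        elif i + 1 < len(gates):
            moved[i + 1] += charge    # pulled under the neighboring gate
        else:
            exited = charge           # last gate: packet reaches the output
    return moved, exited


def shift_one_stage(gates):
    """One full clock cycle (phases 1, 2, 0) moves every packet one stage."""
    out = 0
    for active_phase in (1, 2, 0):
        gates, exited = clock_step(gates, active_phase)
        out += exited
    return gates, out


# Two bits per cell means four charge levels (0..3) per packet.
gates = load([3, 0, 1, 2])
readout = []
for _ in range(4):
    gates, out = shift_one_stage(gates)
    readout.append(out)
print(readout)  # the stage nearest the output is read first
```

Because equivalent gates share a clock line, the whole string needs only three clock signals regardless of its length, while each stage occupies three word-line layers in a vertical stack.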
From what is possible with NAND Flash today (i.e., the capability of processing (at least) 230 layers), imec estimates that the 3D buffer memory can already provide five times more bit density than what (2D) DRAM is expected to offer by 2030. And 3D NAND Flash scaling hasn’t stopped: some of the memory chip makers promise to provide 1,000 layers by 2030. Hence, regarding bit density, the new block-addressable memory promises to vastly surpass DRAM. The imec researchers are currently investigating 3D implementations of the CCD structure, starting with a limited number of word lines.
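The structure of such a density estimate can be sketched in a few lines. The example below is not imec's model: it uses the stated assumptions (two bits per cell, 30% area overhead, three phase gates per stage, 230 or 1,000 layers) and adds a hypothetical 150 nm pitch between vertical strings, a value picked only for illustration. It also ignores layers that input, read-out, or select structures would consume, and it applies the overhead as a 1.3x multiplier on the string footprint, which is one possible reading of "30% array area overhead".

```python
def bit_density_gb_per_mm2(layers, phases_per_stage, bits_per_cell,
                           string_pitch_nm, area_overhead):
    """Rough bit density of a 3D CCD array in Gb/mm^2.

    string_pitch_nm is a hypothetical parameter (square pitch between
    vertical strings); it is not given in the source.
    """
    stages_per_string = layers // phases_per_stage
    bits_per_string = stages_per_string * bits_per_cell
    footprint_nm2 = string_pitch_nm ** 2 * (1 + area_overhead)
    strings_per_mm2 = 1e12 / footprint_nm2   # 1 mm^2 = 1e12 nm^2
    return bits_per_string * strings_per_mm2 / 1e9


DRAM_2030_GB_PER_MM2 = 1.0  # expected DRAM density quoted in the text

for layers in (230, 1000):
    density = bit_density_gb_per_mm2(
        layers=layers,
        phases_per_stage=3,
        bits_per_cell=2,
        string_pitch_nm=150,   # hypothetical, for illustration only
        area_overhead=0.30,
    )
    print(f"{layers:>4} layers: {density:5.1f} Gb/mm^2 "
          f"({density / DRAM_2030_GB_PER_MM2:.1f}x the 2030 DRAM figure)")
```

With these inputs the result scales almost linearly with the layer count, which is why continued 3D NAND layer scaling translates directly into higher buffer memory density.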
This article was originally published in Nature Reviews Electrical Engineering.
Want to know more?
[1] ‘Novel high density 3D buffer memory enabled by IGZO channel charge coupled device’, R. Kishore et al., 2024 IEEE International Electron Devices Meeting
[2] ‘Imec improves memory window of a 3D trench cell for next-gen NAND Flash’, M. Rosmeulen, imec, June 2023
Maarten Rosmeulen received his M.Sc. degree in physics in 1993 and his M.Sc. degree in physics of micro-electronics and materials science in 1994, both from the KU Leuven, Belgium. In 2005, he received his Ph.D. in electrical engineering from the KU Leuven. Since then, he has been with imec, in Leuven, Belgium, where he has been active as an R&D engineer in process integration, semiconductor device design, and electrical device characterization for multiple internal and external projects. In 2009 he became a project leader in developing GaN-on-Silicon Light Emitting Diodes (LEDs). In 2014 he became the team leader of the Pixel Design and Testing team and has been responsible for the development of CMOS Image Sensor (CIS) technologies. In 2019 he became the program director of the Storage Memory program, the position he holds today.
Published on:
15 January 2025