Building the Future of Robust Computing Systems

IEEE Computer Society Team
Published 09/23/2025

An Interview with Dr. Onur Mutlu – 2025 Harry H. Goode Memorial Award Recipient

Dr. Onur Mutlu, Professor of Computer Science at ETH Zurich, has done research in computer architecture, computing systems, hardware security, memory and storage systems, and bioinformatics and is an innovator whose work has improved microprocessors and memory and storage systems used by billions of people.

Your identification of the RowHammer vulnerability has had a significant impact on hardware security and memory systems. What led you to this discovery, and how has it influenced subsequent research in memory systems?

RowHammer is a fascinating phenomenon that affects all modern main memory (i.e., DRAM) chips used in almost all computers today (a market of more than 120 billion USD). Repeatedly accessing one DRAM row causes bitflips in physically nearby DRAM rows that should not change at all, leading to data corruption. We identified the phenomenon and analyzed it in detail for the first time in 2012. We have continued to study it since then, discovering new properties as well as new read disturbance mechanisms along the way, such as RowPress (in 2020-2023) and Variable Read Disturbance (in 2024-2025). We have come a long way, yet the phenomenon still fascinates me, and there is a lot more to do in fundamentally understanding and very efficiently solving it, especially as memory technology scales to much denser capacities using increasingly smaller cells to store each bit of data. And it is not only us: the phenomenon continues to fascinate researchers across the hardware security, design automation, computer architecture, dependable systems, and device analysis communities. Many works are published each year on understanding, analyzing, and modeling the phenomenon, as well as solving it using new and creative methods across the system stack (spanning both hardware and software). For example, top security and computer architecture conferences have held dedicated RowHammer sessions (sometimes several of them) almost regularly since 2020.

Our stumbling on the RowHammer problem and creation of its first scientific analysis happened as a result of a confluence of multiple factors. First, my group has been working on DRAM technology scaling issues since late 2010. We were very interested in failure mechanisms that appear or worsen due to aggressive technology scaling. To study such issues (e.g., data retention errors), we built an FPGA-based DRAM testing infrastructure between 2011 and 2012, which we later open-sourced as SoftMC and DRAM Bender. This infrastructure serves as the basis of a large “laboratory for analyzing and understanding memory chips” in my group. Second, around the same timeframe, we were investigating similar technology scaling issues in flash memory using real NAND flash chips (e.g., our DATE 2012 and ITJ 2013 papers analyzed fascinating errors in such chips). We knew read disturbance errors were significant in real NAND flash memory chips and were very interested in how prevalent they were in real DRAM chips. Third, we were collaborating with Intel to understand and solve DRAM technology scaling problems and build our DRAM infrastructure. Three of my students and I spent the summer of 2012 at Intel to work closely with our collaborators (two are co-authors of our original RowHammer paper). During this time, we finalized the calibration and stabilization of our infrastructure and had significant technical discussions and experimentation on DRAM scaling problems.

Although there was very limited awareness of the RowHammer problem in industry in 2012 (see Footnote 1 in our original RowHammer paper), there was no comprehensive experimental analysis or detailed real-system demonstration of it. We believed it was critical to provide a rigorous scientific analysis using a wide variety of DRAM chips and scientifically establish major characteristics and prevalence of RowHammer (and also provide solutions to it). Hence, in the summer of 2012, we set out to use our DRAM testing infrastructure to analyze RowHammer. Our initial results showed how widespread the read disturbance problem was across the (at the time) recent DRAM chips we tested, so we studied the problem comprehensively and developed many solutions to it. We submitted the resulting paper to the MICRO conference in May 2013, but it was rejected for interesting yet scientifically invalid reasons. One reviewer, for example, rejected the paper strongly, claiming that this was not an important problem. Another reviewer argued that the problem was not of interest to the computer architecture community and should not be published at MICRO since the problem was a “DRAM manufacturers’ problem” that should be solved by them, not computer architects. We strengthened the results, especially of the mitigation mechanisms and the number of tested chips, and made the analysis more comprehensive before it was accepted to ISCA 2014 (2 of the 6 reviewers still rejected it for yet other interesting reasons, one being that industry had already solved the problem).

Our demonstration that one can easily and predictably induce bitflips in commodity DRAM chips using a real user-level program enabled a major mindset shift in hardware security. It showed that general-purpose hardware is fallible in a very widespread manner and its problems are exploitable. Hundreds of works built directly on our work to exploit RowHammer bitflips to develop many attacks that compromise system integrity and confidentiality, starting from the first RowHammer exploit by Google Project Zero in 2015 to recent works in 2020-2025 (e.g., TRRespass, RAMBleed, Blacksmith, SMASH, RowPress). These attacks showed increasingly sophisticated ways by which an unprivileged attacker can exploit RowHammer bitflips to circumvent memory protection and gain complete control of a system, gain access to confidential data, or maliciously destroy the safety and accuracy of a system, e.g., an otherwise accurate machine learning inference engine. The mindset enabled by RowHammer bitflips caused a renewed interest in hardware security research, enticing many researchers to deeply understand hardware’s inner workings and find new vulnerabilities. Thus, hardware security issues have become mainstream discussion in top security & architecture venues, some having sessions entitled RowHammer.

Fast forward more than a decade since our original investigation, RowHammer is now a well-recognized problem and all existing DRAM chips are fundamentally vulnerable to it, in a way that is much worse than in 2012-2014. The good news is that a lot of progress has also been made, and the DRAM industry now finally speaks openly about RowHammer and writes papers about it. SK Hynix, for example, wrote a paper in ISSCC 2023, acknowledging the problem publicly in print for the first time as a DRAM manufacturer, and describing their solutions to the problem. Similarly, Samsung, Google, and Microsoft have all written papers about the RowHammer problem since 2020, developing both new attacks and solutions. Our original solutions to RowHammer, which we developed in 2012-2014, were adopted by industry early on, but they were not implemented very well. New solutions to RowHammer continue to be developed as the problem is getting worse. Industry finally standardized a solution called Per-Row Activation Counters (PRAC) in April 2024, which will be implemented in all DRAM chips going forward. This solution adds an intelligent controller inside each DRAM chip, a system-memory co-design solution that we have argued in favor of for a very long time (e.g., in our original IMW 2013 and ISCA 2014 papers).
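
For readers who want a concrete picture of the counter-based idea, the following is a minimal toy sketch in Python, not the standardized PRAC mechanism. It assumes a single bank, that only the two immediately adjacent rows are disturbed, and an arbitrary threshold; the class name, method names, and parameter values are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class ToyRowCounterBank:
    """Toy model of one DRAM bank that counts activations per row.

    Illustration only: the threshold, the neighbor distance of 1, and all
    names here are assumptions, not details of the PRAC standard.
    """

    def __init__(self, num_rows, threshold):
        self.num_rows = num_rows
        self.threshold = threshold
        self.counts = defaultdict(int)  # row -> activations since last mitigation
        self.mitigations = []           # log of (aggressor_row, refreshed_neighbors)

    def neighbors(self, row):
        # Assumption: only the immediately adjacent rows are disturbed.
        return [r for r in (row - 1, row + 1) if 0 <= r < self.num_rows]

    def activate(self, row):
        """Record one activation; refresh neighbors once the threshold is reached."""
        self.counts[row] += 1
        if self.counts[row] >= self.threshold:
            victims = self.neighbors(row)
            self.mitigations.append((row, victims))
            self.counts[row] = 0        # the aggressor's counter restarts
            return victims
        return []


# Hammer row 3 nine times with a (hypothetical) threshold of 4 activations.
bank = ToyRowCounterBank(num_rows=8, threshold=4)
for _ in range(9):
    bank.activate(3)
```

In this toy model the neighbors of a heavily activated row are refreshed deterministically after a fixed number of activations, at the cost of keeping one counter per row.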

Clearly, fascinating ideas (attacks, analyses, and solutions) continue in the RowHammer space, with real industrial impact on the 120+ billion USD DRAM market used in essentially all computers. I believe there are a lot more discoveries to be made and much better solutions to be developed. Memory robustness issues are also a moving target as new issues appear and get discovered over generations, so research in this field is critically important and highly exciting!

Interested folks (who might be excited to better analyze, exploit and even better solve the problem) can read some of our overview papers on the problem and also watch the DRAMSec 2025 workshop, which we recently organized on the topic:

Onur Mutlu, Ataberk Olgun, and A. Giray Yaglikci, “Fundamentally Understanding and Solving RowHammer” Invited Special Session Paper at the 28th Asia and South Pacific Design Automation Conference (ASP-DAC), Tokyo, Japan, January 2023.

Onur Mutlu and Jeremie Kim, “RowHammer: A Retrospective” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD) Special Issue on Top Picks in Hardware and Embedded Security, 2019.

Onur Mutlu, “The RowHammer Problem and Other Issues We May Face as Memory Becomes Denser” Invited Paper in Proceedings of the Design, Automation, and Test in Europe Conference (DATE), Lausanne, Switzerland, March 2017.

Fifth Workshop on DRAM Security (DRAMSec), co-located with ISCA 2025: https://www.youtube.com/watch?v=5KmKxFjPopM

Your work often bridges the gap between academia and industry. Can you share an instance where your research directly influenced commercial hardware design?

I enjoy working on research topics that can fundamentally change the way we build computing systems, and enable them to become fundamentally more robust (i.e., reliable, safe, secure, available), more energy-efficient, higher-performance, and more sustainable. I also enjoy interacting with industry to not only discover long-term problems to work on but also enable transfer of the new ideas we develop into new products. As a PhD student, I spent five summers doing internships at cutting-edge microprocessor companies, examining new ideas in their designs. For example, Runahead Execution, which I developed in my PhD research, was adopted by various microprocessor manufacturers, including IBM, Sun Microsystems, and NVIDIA (this work later went on to receive the HPCA Conference Test of Time Award 18 years after its publication). After my PhD, I started the Computer Architecture group at Microsoft Research Redmond, where we worked on many directly industry-relevant problems, including performance and security issues in the then-new multi-core processors. During my sabbaticals, I spent significant time at VMware, Google, and Microsoft, working on cutting-edge issues in memory systems, including emerging technologies, processing-in-memory, and memory robustness. I like going back and forth between academia and industry as well as closely interacting with our industrial partners and any partner in industry who takes interest in our work. I also very much encourage my students to interact with industry as well as spend time there doing internships that would be valuable to their growth and research.

Many of our works in the SAFARI Research Group have enjoyed large influence on industry, both directly and indirectly. A great example of the direct impact we have had on industry is our work on RowHammer (and more recently RowPress), which has had extensive influence on the memory industry as well as the microprocessor industry. Our original ISCA 2014 paper’s analysis and the solutions we developed in it were quickly adopted by industry. Folks developing industrial memory testing programs immediately included RowHammer tests, e.g., in memtest86, citing our work. Industry needed to immediately protect RowHammer-vulnerable chips already in the field, so almost all system vendors increased refresh rates, a solution we examined in our paper and deemed costly for performance and energy, yet it was the only practical lever that could be used in the field. Apple publicly acknowledged our work in their security release that announced higher refresh rates to mitigate RowHammer. Intel designed memory controllers that performed probabilistic activations (i.e., pTRR), a variant of our PARA solution. DRAM vendors modified the DRAM standard to introduce TRR (target row refresh) mechanisms, which used solutions similar to an unsecure version of PARA, and claimed their new DDR4 chips to be RowHammer-free. This bold claim was later refuted by our TRRespass work in IEEE S&P 2020, which introduced the many-sided RowHammer attack to circumvent internal protection mechanisms added to the DRAM chips. Our later work, Uncovering TRR (MICRO 2021), showed that one can almost completely reverse-engineer and thus easily bypass RowHammer mitigations employed in all tested DRAM chips, i.e., RowHammer solutions in DRAM chips are broken. The analysis done by our two major works in 2020 (TRRespass and Revisiting RowHammer) caused the industry to reorganize the RowHammer task group at JEDEC, which produced two white papers on mitigating RowHammer. Nine years after our original paper, in 2023, two major DRAM vendors, SK Hynix and Samsung, finally wrote papers on the RowHammer problem, describing their solutions. Many of these industry solutions build on the probabilistic & access-counter-based solution ideas and approaches our ISCA 2014 paper introduced.
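
To illustrate the probabilistic approach mentioned above, here is a minimal toy sketch in Python of the general idea behind PARA (Probabilistic Adjacent Row Activation): each time a row is accessed, one of its adjacent rows is refreshed with a small probability. This is a simplified model under stated assumptions, not the implementation from the ISCA 2014 paper or any vendor's pTRR or TRR design; the function name, the parameters, the fixed seed, and the choice of exactly one neighbor per triggered refresh are hypothetical.

```python
import random


def para_refreshes(accesses, num_rows, p, seed=0):
    """Return (accessed_row, refreshed_row) pairs a PARA-style policy would issue.

    Toy model: on every access, with probability p, refresh one randomly
    chosen adjacent row. All names and parameters are assumptions made for
    illustration.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    refreshed = []
    for row in accesses:
        if rng.random() < p:
            # Only immediately adjacent rows that exist in the bank.
            candidates = [r for r in (row - 1, row + 1) if 0 <= r < num_rows]
            refreshed.append((row, rng.choice(candidates)))
    return refreshed


# Hammer row 5 many times with a small (hypothetical) refresh probability.
hammer_pattern = [5] * 100_000
issued = para_refreshes(hammer_pattern, num_rows=16, p=0.001)
```

The appeal of this style of defense is that it keeps no per-row state: in this toy model, the chance that N accesses to an aggressor row trigger no neighbor refresh at all is (1 - p)^N, which shrinks quickly as N grows.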

Major Internet and cloud systems companies also took a deep interest in RowHammer as it can greatly impact their system security, dependability, and availability. Multiple works from Google, e.g., by Google Project Zero in 2015 and Half Double in 2021-2022, directly built on our paper to demonstrate attacks in real systems. Researchers from Microsoft have developed deeper analyses of RowHammer, along with new RowHammer attacks and defenses.

Today, all DRAM chips incorporate solutions directly based on our original RowHammer work, as well as more recent works on RowHammer characterization and RowPress. DRAM standards changed to incorporate such solutions. Such solutions are often described as the biggest change that has happened to DRAM chips in decades! And, I believe there is much more to come!

Besides RowHammer, our other works in computer architecture have also enjoyed significant influence on industry. These include our many works on Processing-in-Memory (both near DRAM/storage as well as using DRAM/storage), which was difficult to adopt in 2011 when we first started working on it using 3D-stacked as well as commodity DRAM technologies, and which now has significant traction in industry due to its importance in saving energy and improving performance in modern AI workloads.

Our other industry-influential works include our work on flash memory and SSD (solid-state drive) controllers, QoS-aware and high-performance main memory controllers, data prefetcher designs (including runahead execution, feedback-directed prefetching, reinforcement-learning based prefetching), cache and memory compression, energy-efficient interconnection architectures, and more. Just to give two more examples: our work on memory controller design provided many ideas that influenced the design of modern SoC (system-on-chip) memory controllers, a critical component in essentially all computing systems today. For example, our Parallelism-Aware Batch-Scheduling [ISCA-2008] showed memory scheduling, if unaware of each thread’s memory-level parallelism, can serialize every thread’s otherwise-parallel requests and thus destroy the benefits of modern high-performance techniques like out-of-order execution. PAR-BS inspired many new solutions and its variants are implemented in Samsung SoC memory controllers. Similarly, our work in flash memory, using real flash memory chip characterization, uncovered new error mechanisms, provided precious experimental data available nowhere else, and greatly improved flash memory lifetime and reliability. These works are regarded as prime resources for educating engineers in leading flash-memory and storage companies and have inspired many techniques to be included in SSD controllers to improve flash memory lifetime and performance.

Dr. Onur Mutlu has been given the 2025 Harry H. Goode Memorial Award for seminal contributions to computer architecture research and practice, especially in memory systems, and previously won the 2020 Edward J. McCluskey Technical Achievement Award.