
Nicole Meng
Jul 24, 2025
As NeRF models power everything from autonomous drones to VR platforms, tiny pixel tweaks can now send them off the rails. With the recent rapid advances in 3D reconstruction models, specifically Neural Radiance Fields (NeRF), the field of generative computer vision has progressed dramatically. Generalizable NeRF models, such as IBRNet and GNT, have further improved reconstruction accuracy and generalization across scenes, making them a popular choice for industry deployment and real-life applications. Yet as these systems move into safety-critical domains, we must ask: Can we trust NeRF models when an attacker tampers with the input?
Too often, the focus remains on visual fidelity: sharper renders, smoother novel views, faster training and rendering times. But what happens when imperceptible changes in a single source image can cause a self-driving car’s perception to warp? Or a defense-simulation platform to misjudge an incoming threat? We ask a simple question: Can we trust these generalizable NeRF models? Specifically, can we trust their performance in the presence of adversarial attackers? Our answer, backed by rigorous research published in the CVPR 2025 proceedings, is clear: Not in their current form.
Introducing the IL2-NeRF Attack
We proposed the IL2-NeRF Attack: the first adversarial attack on generalizable NeRF models under the L2 perturbation threat model. Given a source view, we, as attackers, carefully craft a small adversarial perturbation bounded by a perturbation budget ε. The budget ε is kept small, so the changes to the source view are barely visible. The attacked source image is then passed through the GNeRF model and rendering pipeline, and the rendered views collapse into visible artifacts and distortions, as demonstrated in the figure below.

Fig. 1. IL2-NeRF Attack Pipeline Demonstration
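To make the threat model concrete, here is a minimal sketch, not our exact implementation, of what an iterative L2-bounded attack on a source view looks like in PyTorch. The `render_fn`, `clean_render`, step-size heuristic, and default budget are illustrative placeholders; the real pipeline operates on IBRNet/GNT inputs and optimizes the full objective described below.

```python
import torch
import torch.nn.functional as F

def l2_project(delta, eps):
    """Project the perturbation back onto the L2 ball of radius eps (per image)."""
    flat = delta.flatten(start_dim=1)
    norms = flat.norm(p=2, dim=1, keepdim=True).clamp(min=1e-12)
    factor = (eps / norms).clamp(max=1.0)
    return (flat * factor).view_as(delta)

def attack_source_view(source_img, render_fn, clean_render, eps=2.0, steps=20):
    """Illustrative L2-bounded iterative attack on a batch of source views.

    `render_fn` is a stand-in for the generalizable NeRF rendering pipeline
    (e.g., IBRNet or GNT); `clean_render` is the unattacked rendering we try
    to push the output away from.
    """
    alpha = 2.5 * eps / steps                       # common step-size heuristic
    delta = torch.zeros_like(source_img, requires_grad=True)
    for _ in range(steps):
        rendered = render_fn(source_img + delta)    # render from the perturbed source view
        loss = F.mse_loss(rendered, clean_render)   # rendering error we want to maximize
        (grad,) = torch.autograd.grad(loss, delta)
        # Ascend along the L2-normalized gradient, then project back into the budget
        g = grad.flatten(start_dim=1)
        g = g / g.norm(p=2, dim=1, keepdim=True).clamp(min=1e-12)
        delta = (delta + alpha * g.view_as(delta)).detach()
        delta = l2_project(delta, eps).requires_grad_(True)
    return (source_img + delta).detach()
```

Unlike the more familiar L∞ attacks that clip each pixel independently, the L2 threat model rescales the whole perturbation vector, spreading the budget uniformly across the image.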
The IL2-NeRF attack targets both the source image pixels and the scene ray representations of the given images, iteratively adding small, imperceptible perturbations uniformly across the entire image (the L2 threat model), with the optimization goal of maximizing rendering loss. The loss function in IL2-NeRF is custom-designed as a weighted combination of eight loss terms, primarily targeting the RGB and depth values of the rendered views.
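The full eight-term objective is detailed in the paper; as a rough, hypothetical illustration of how RGB and depth terms can be weighted together into a single attack objective, consider the sketch below. The function name, term choices, and weights are placeholders, not the paper's formulation.

```python
import torch.nn.functional as F

def rendering_objective(pred_rgb, clean_rgb, pred_depth, clean_depth,
                        w_rgb=1.0, w_depth=0.5):
    """Weighted combination of color and geometry errors between the adversarial
    rendering and the clean rendering. The attack ascends this objective, so both
    the appearance and the depth of the rendered view degrade."""
    rgb_term = F.mse_loss(pred_rgb, clean_rgb)       # color error of the rendered view
    depth_term = F.l1_loss(pred_depth, clean_depth)  # geometry (depth) error
    return w_rgb * rgb_term + w_depth * depth_term
```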
This work exposes the security vulnerabilities of generalizable NeRF models by demonstrating successful attacks on two popular models: IBRNet and GNT. Since IL2-NeRF lays the foundation for adversarial attacks under a new threat model, we report experimental results under a wide range of parameter settings across both synthetic and real-world object datasets. As the figure below shows, visual artifacts clearly disrupt the final rendered images across different objects and perturbation budgets. These findings establish IL2-NeRF as the first successful L2 adversarial attack on NeRF models.

Why does this matter?
You might think a drop in visual quality is merely cosmetic. In reality, every rendered view can feed downstream tasks, like object classification in autonomous driving. A slightly distorted view can cause a car’s vision system to misidentify a pedestrian as a shadow, or a stop sign as a billboard. With the right tailored optimization objectives, an attacker can twist those misclassifications at will, turning a minor artifact into a major safety hazard.
Our work exposes a critical vulnerability in GNeRF models, producing glaring visual errors in the final 3D scenes from a tiny uniform perturbation (well below perceptible thresholds). This kind of unpredictability is unacceptable in safety-critical domains such as defense simulations, where misrendered targets could compromise training fidelity and lead analysts to draw dangerously incorrect conclusions.
In national security applications, where 3D reconstruction informs mission planning, infrastructure assessments, and threat identification, even minor distortions can cascade into flawed intelligence.
By revealing how easily an attacker can hijack the rendering pipeline, IL2-NeRF makes clear that security cannot be an afterthought. Robust defenses and adversarial testing must be integrated into NeRF workflows from day one, ensuring that tomorrow’s 3D vision systems remain both powerful and trustworthy.

Written by Nicole Meng
Nicole Meng is a PhD candidate in Electrical and Computer Engineering at Tufts University, where her research focuses on advancing the security and privacy of generative vision models. She holds a B.S. in Computer Science, Applied Math, and Economics from Brandeis University and an M.S. in Computer Science from the University of Connecticut. Prior to her graduate studies, Nicole worked as a full-stack software engineer at Pegasystems, where she contributed to their Workforce Intelligence software development and security enhancements.