Image processing with Optical matrix vector multipliers implemented for encoding and decoding tasks
Kim et al. Light: Science & Applications (2025)14:248
https://doi.org/10.1038/s41377-025-01904-z
www.nature.com/lsa
ARTICLE
Open Access
Minjoo Kim1, Yelim Kim1 and Won Il Park1 ✉
Abstract
This study introduces an optical neural network (ONN)-based autoencoder for efficient image processing, utilizing
specialized optical matrix-vector multipliers for both encoding and decoding tasks. To address the challenges in
efficient decoding, we propose a method that optimizes output processing through scalar multiplications, enhancing
performance in generating higher-dimensional outputs. By employing on-system iterative tuning, we mitigate
hardware imperfections and noise, progressively improving image reconstruction accuracy to near-digital quality.
Furthermore, our approach supports noise reduction and optical image generation, enabling models such as
denoising autoencoders, variational autoencoders, and generative adversarial networks. Our results demonstrate that
ONN-based systems have the potential to surpass the energy efficiency of traditional electronic systems, enabling real-time, low-power image processing in applications such as medical imaging, autonomous vehicles, and edge
computing.
Introduction
Optical image processing is fundamental to applications
in autonomous navigation, security, defense, and medical
diagnostics1–4. It begins with capturing images using
cameras and converting these analog visuals into digital
data, followed by image recognition, reconstruction,
interpretation, and decision-making. However, digital
images are typically large and contain extraneous information, such as background patterns and noise, necessitating preprocessing to reduce dimensionality while
retaining essential information. Image encoders, including
autoencoders—a popular deep neural network (DNN)
architecture—transform high-dimensional input into a
compact latent space by compressing the data through a
series of layers5–9.
Correspondence: Won Il Park
1 Division of Materials Science and Engineering, Hanyang University, Seoul, Republic of Korea
These authors contributed equally: Minjoo Kim, Yelim Kim
Autoencoders also employ decoder networks to reverse
this process, reconstructing the compressed data back to its
original form10–12. Decoders are trained to closely reproduce the original input, which, combined with encoder
networks, makes autoencoders valuable for tasks like data
compression, noise reduction, and feature extraction. In
generative models, such as variational autoencoders
(VAEs) and generative adversarial networks (GANs),
decoders generate realistic and contextually appropriate
images, crucial for image enhancement, data augmentation,
and creative applications in art and film13–15.
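The encode-then-decode idea described above can be sketched with a purely linear autoencoder. This is only an illustration, not the network used in this work: it assumes NumPy, uses synthetic data in place of images, and obtains the encoder and decoder from a singular value decomposition (a PCA-style shortcut) rather than by training a deep network. The names `fit_linear_autoencoder` and `reconstruct` are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(data, latent_dim):
    """Return (mean, encoder, decoder) for a PCA-style linear autoencoder."""
    mean = data.mean(axis=0)
    # Rows of vt are orthonormal directions ordered by explained variance.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    encoder = vt[:latent_dim]   # shape (latent_dim, n_pixels)
    decoder = encoder.T         # shape (n_pixels, latent_dim)
    return mean, encoder, decoder

def reconstruct(data, mean, encoder, decoder):
    """Compress each sample to the latent space, then expand it back."""
    latent = (data - mean) @ encoder.T   # encoding: n_pixels -> latent_dim
    return latent @ decoder.T + mean     # decoding: latent_dim -> n_pixels

rng = np.random.default_rng(0)
# Synthetic stand-in for images: 200 samples of 64 "pixels" that lie
# close to a 5-dimensional subspace, plus a little noise.
basis = rng.normal(size=(5, 64))
images = rng.normal(size=(200, 5)) @ basis + 0.01 * rng.normal(size=(200, 64))

errors = {}
for k in (2, 5):
    mean, enc, dec = fit_linear_autoencoder(images, k)
    recon = reconstruct(images, mean, enc, dec)
    errors[k] = float(np.mean((images - recon) ** 2))

print(errors)  # mean squared reconstruction error per latent size
```

In this toy setting the reconstruction error drops sharply once the latent size reaches the intrinsic dimensionality of the data, which is the trade-off between compression and fidelity that an autoencoder is trained to balance.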
Despite significant advancements in computational
efficiency, driven by both software and hardware
improvements16,17, the electronic processing of DNNs
using accelerators faces challenges such as high energy
consumption and limitations in scaling parallelism and
speed efficiency. Optical neural networks (ONNs) offer an
alternative with effective operations such as matrix-vector
multiplication (MVM) and nonlinear activation of optical
signals8,18–23. A fully optical, multilayer, nonlinear preprocessor for 2D image processing has been implemented using components such as spatial light modulators
(SLMs), liquid crystal displays (LCDs), diffractive optical
elements (DOEs), nonlinear optical materials, and saturable amplifiers (e.g., image intensifiers), enabling ONNs to
efficiently compress and process large images with lower
latency5,24–27.
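As a rough numerical picture of the two operations named above, the sketch below models one layer as a matrix-vector multiplication followed by a saturating response. It is a toy model under stated assumptions, not a description of any specific hardware: inputs and weights are taken to be non-negative (as light intensities and transmissions would be), the saturation curve `1 - exp(-v)` is chosen only for illustration, and `optical_layer` and its `saturation` parameter are hypothetical names.

```python
import numpy as np

def optical_layer(intensity, weights, saturation=1.0):
    """One MVM followed by a saturating nonlinearity (illustrative model only)."""
    intensity = np.asarray(intensity, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Assumption: incoherent intensities and transmissions cannot be negative.
    if np.any(intensity < 0) or np.any(weights < 0):
        raise ValueError("intensities and weights must be non-negative")
    summed = weights @ intensity  # matrix-vector multiplication
    # Assumed saturable response: linear for small signals, capped at `saturation`.
    return saturation * (1.0 - np.exp(-summed / saturation))

rng = np.random.default_rng(0)
image = rng.random(64)          # flattened 8x8 image with values in [0, 1)
weights = rng.random((4, 64))   # compress 64 inputs to 4 outputs
compressed = optical_layer(image, weights, saturation=10.0)
print(compressed.shape)
```

The weight matrix here has far fewer rows than columns, which corresponds to the compression (encoding) direction discussed in the text.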
© The Author(s) 2025
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction
in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if
changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If
material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain
permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Current ONN systems have shown superior performance in encoding operations, using optical analog computing to compress image data before digital
conversion19,28–30. However, decoding requires comparable
computational resources, highlighting the need for
advancements in both encoding and decoding. Given that the
ultimate goal of ONNs is to implement multi-layer neural
networks primarily through optical methods, or at least in
combination with analog approaches5,6, developing
effective optical decoders is crucial to fully utilize the
potential of autoencoders in handling complex real-world visual data. Although research has been conducted
on 2D image encoding and decoding using diffractive
deep neural networks (D2NNs)31,32, these systems inherently require coherent light as input. This presents a
fundamental limitation because actual incoherent image
inputs must undergo an optoelectronic conversion process to generate coherent input light. This additional step
introduces complexity and potential latency, undermining
the benefits of optical processing. Direct encoding of
natural images5,33 and subsequent decoding with high
accuracy and low latency are essential for applications like
autonomous navigation, medical diagnostics, and defense,
where fast and reliable image processing directly impacts
safety and performance.
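The remark that decoding needs computational resources comparable to encoding can be made concrete for the simplest case of one linear layer in each direction. The symbols below (N input pixels, k latent values, weight matrices W_e and W_d) are introduced here only for illustration and are not taken from the article.

```latex
\begin{aligned}
\text{encoder:}\quad & z = W_e\,x, & W_e \in \mathbb{R}^{k \times N}, &\quad kN \ \text{multiply-accumulate operations},\\
\text{decoder:}\quad & \hat{x} = W_d\,z, & W_d \in \mathbb{R}^{N \times k}, &\quad Nk \ \text{multiply-accumulate operations}.
\end{aligned}
```

For example, with a hypothetical N = 784 and k = 16, each direction costs 784 × 16 = 12,544 multiply-accumulate operations, so accelerating only the encoder leaves half of the arithmetic in the electronic domain.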
In this work, we present an ONN-based autoencoder that
leverages optical matrix-vector multipliers for both
encoding and decoding tasks. While previous MVM strategies efficiently handle scenarios where input dimensions
exceed output dimensions5,31,33–36, they exhibit limitations
when dealing with larger output sizes. To overcome this,
we propose a method for optimizing output processing via
scalar multiplications, enhancing the efficiency of higher-dimensional output generation. We introduced on-system
iterative tuning to mitigate hardware imperfections and
noise, which gradually improved image reconstruction
accuracy, nearing the quality of digital processors. Additionally, we extended our approach to noise reduction and
optical image generation, enabling functions such as
denoising autoencoder (DAE), VAE, and GAN. Our analysis indicates that, with further optimizations, ONN-based
autoencoders have the potential to significantly exceed the
energy efficiency of electronic systems. These advancemen (...truncated)