Yihui Bao, Xinyi Lu, Yanyan Xia, Zhencheng Ye, Houyang Chen
Abstract Machine learning has emerged as a powerful tool for analyzing scanning transmission electron microscopy (STEM) images, yet its widespread application remains constrained by the scarcity of annotated training data. While deep generative models offer a promising solution, they typically struggle to reproduce the complex high-frequency components that define experimental STEM images. Here, STEMDiff, a conditional diffusion model that transforms simple binary labels derived from crystal structures into realistic STEM images through a physical information embedding strategy, is proposed. By developing a novel Discrete Wavelet Transform (DWT)-based skip-connection architecture, the high-frequency bias inherent in diffusion models is addressed, enabling the preservation of experimental noise characteristics. This approach generates images that are quantitatively nearly indistinguishable from experimental data (a 17-fold improvement over previous methods) while retaining ground-truth structural information. Fully convolutional networks trained exclusively on these synthetic images achieve high-precision atomic column detection in experimental STEM images of WSe₂ and graphene, despite the presence of substantial background noise and contamination. This approach effectively eliminates the need for laborious manual annotation, providing a scalable solution to the data bottleneck in STEM image analysis. The principles underlying STEMDiff can extend to other scientific imaging modalities, accelerating advancements in materials design for water treatment.
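The core idea of a DWT-based skip connection — routing high-frequency wavelet subbands around the inner layers so fine detail such as noise texture survives reconstruction — can be illustrated with a minimal sketch. This is not the paper's implementation; it assumes a single-level orthonormal Haar transform, and the function names (`haar_dwt2`, `dwt_skip_block`) are hypothetical:

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar DWT: split an even-sized image into
    low-frequency (LL) and high-frequency (LH, HL, HH) subbands."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 2
    lh = (a + b - c - d) / 2
    hl = (a - b + c - d) / 2
    hh = (a - b - c + d) / 2
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Exact inverse of haar_dwt2 (the transform is orthonormal)."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w))
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2
    return x

def dwt_skip_block(x, bottleneck):
    """Toy DWT skip connection: only the low-frequency band passes
    through the inner `bottleneck` transform; the three high-frequency
    bands bypass it and are re-injected at reconstruction, so fine
    detail cannot be smoothed away by the inner layers."""
    ll, lh, hl, hh = haar_dwt2(x)
    return haar_idwt2(bottleneck(ll), lh, hl, hh)
```

Even if the bottleneck destroys all low-frequency content, e.g. `dwt_skip_block(img, lambda ll: np.zeros_like(ll))`, the high-frequency subbands of the output match those of the input exactly, which is the property that lets such an architecture counter the low-frequency bias of a denoising network.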