In a significant advancement toward next-generation data storage technologies, a research team led by Professor Wu Huaming at Tianjin University’s Center for Applied Mathematics has published a pioneering study in Nature Computational Science, introducing a novel DNA-based data storage system specifically designed for biomedical images. The paper, titled “DNA data storage for biomedical images using HELIX” (https://www.nature.com/articles/s43588-025-00793-x), presents HELIX, a high-fidelity system that successfully encodes and reconstructs large-scale spatial-temporal omics images using DNA molecules.
As the global demand for data storage surges in the age of big data, traditional storage technologies face escalating challenges in capacity, durability, and sustainability. DNA, with its remarkable density and longevity, has emerged as a compelling medium for long-term data archiving. A single gram of DNA can theoretically store hundreds of exabytes of data and remain stable for thousands of years without a continuous power supply. Among various data types, biomedical imagery stands out as an ideal candidate for DNA storage due to its high resolution, long-term value, and significant pattern redundancy.
The HELIX system proposed by Wu’s team incorporates three key modules (image compression, error-correcting encoding, and image reconstruction), each specifically tailored for DNA-based storage of biomedical images. To address the nucleotide errors that may occur during DNA synthesis and sequencing, the system optimizes conventional compression algorithms, thereby significantly enhancing error tolerance. Moreover, HELIX integrates deep learning techniques to boost the accuracy and robustness of image recovery during the decoding process.
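To give a sense of what encoding data into nucleotides involves, here is a minimal Python sketch of a generic two-bits-per-base mapping. It is not HELIX's encoding scheme, which is not described in this article; the function names and the A/C/G/T ordering are assumptions chosen only for illustration. The sketch also shows why error tolerance matters: a single substituted base changes the decoded byte.

```python
# Toy mapping, NOT the HELIX codec: each pair of bits becomes one base.
# The ordering A=00, C=01, G=10, T=11 is an arbitrary illustrative choice.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode bytes as a DNA string, four bases per byte (high bits first)."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(seq: str) -> bytes:
    """Decode a DNA string produced by bytes_to_dna back into bytes."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


payload = b"\x1b"                    # bit pattern 00 01 10 11
strand = bytes_to_dna(payload)       # one base per bit pair
recovered = dna_to_bytes(strand)     # lossless round trip

# Simulate one substitution error in the last base (T read as A).
corrupted = dna_to_bytes(strand[:-1] + "A")
```

With this naive mapping, any uncorrected nucleotide error corrupts the stored bytes directly, which is the kind of failure that error-correcting encoding and an error-tolerant compression format are meant to absorb.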
In experimental validation, the team successfully encoded two spatial-temporal omics images, each approximately 60 MB in size, into 130,000 synthetic DNA strands, each consisting of 183 nucleotides. Through wet-lab synthesis and sequencing, the original image data were effectively reconstructed. Results demonstrate that HELIX can recover the vast majority of image content at a sequencing depth of just 5.8×, underlining the system’s robustness and efficiency.
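The figures above imply a substantial compression step, which a short back-of-the-envelope calculation makes concrete. The Python sketch below uses only the numbers quoted in this article and rests on stated assumptions: an upper bound of 2 bits per nucleotide, 1 MB taken as 10^6 bytes, every nucleotide counted as payload (real strands also carry primers, indices, and error-correction redundancy), and sequencing depth read as the average number of reads per strand.

```python
# Figures quoted in the article.
num_images = 2
image_size_mb = 60            # approximate size of each image
num_strands = 130_000
strand_length_nt = 183
sequencing_depth = 5.8

# Assumption: four bases give at most 2 bits per nucleotide.
BITS_PER_NT = 2

# Total nucleotides synthesized across all strands.
total_nt = num_strands * strand_length_nt

# Upper bound on raw capacity, in MB (1 MB = 10**6 bytes).
max_capacity_mb = total_nt * BITS_PER_NT / 8 / 1e6

# Original image data and the minimum compression ratio this implies.
raw_mb = num_images * image_size_mb
min_compression_ratio = raw_mb / max_capacity_mb

# Assumption: depth is the average number of reads per strand.
approx_reads = round(num_strands * sequencing_depth)

print(f"total nucleotides:        {total_nt:,}")
print(f"max capacity (MB):        {max_capacity_mb:.2f}")
print(f"min compression ratio:    {min_compression_ratio:.1f}x")
print(f"approx. sequencing reads: {approx_reads:,}")
```

Under these assumptions, the strands can hold just under 6 MB, so roughly 120 MB of image data must have been compressed by a factor of about 20 or more before encoding. The true ratio is higher once addressing and error-correction overhead are taken into account.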
This research represents a critical step toward the practical deployment of DNA storage systems, particularly for high-value scientific and medical data. The findings underscore the potential of customized storage architectures, such as HELIX, to improve both efficiency and reliability by tailoring system design to specific data types. The project was jointly conducted by the Center for Applied Mathematics at Tianjin University and the National Key Laboratory of Synthetic Biotechnology. Doctoral candidate Qu Guanjin is the first author of the paper, with Professor Wu Huaming serving as the corresponding author.
This achievement further positions Tianjin University at the forefront of global efforts to harness the potential of DNA as a next-generation storage medium and paves the way for broader applications in biomedical research and digital preservation.
By: Qin Mian