Abstract
As deep learning (DL) technologies develop rapidly, improving the quality and variety of generated images has become a focus of computer vision research. This paper proposes an image generation approach based on a DL long short-term memory (LSTM) network, which uses the long-term memory capacity of the LSTM to extract richer semantic information and thereby improve generation quality. A generation model that adds an LSTM as the encoder is studied and designed. The experiment is conducted on a computer equipped with a high-speed Graphics Processing Unit (GPU) and large memory, using a dataset of multi-class images. With the random forest network model as the control group, the paper evaluates peak signal-to-noise ratio (PSNR), structural similarity, and generation accuracy. The results show that the proposed model's PSNR and structural similarity are both superior to those of the control group: after 100 iterations, the PSNR reaches 20.5 dB and the structural similarity reaches 0.8. After 600 cycles, the image generation accuracy reaches 99%, which is also higher than that of the control group. These results indicate that the LSTM-based DL model can extract an image's long-term semantic information and produce images that are both higher in quality and more accurate. The paper offers a reference for refining DL-based image generation techniques.
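For readers unfamiliar with the PSNR metric reported above, the following is a minimal sketch of the standard definition, PSNR = 10 · log10(MAX² / MSE). It is not the paper's evaluation code: the function name `psnr`, the flat-list image representation, the 8-bit peak value of 255, and the sample pixel values are all assumptions made for illustration.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Return the PSNR in dB between two equally sized images.

    Images are given as flat lists of pixel intensities (an illustrative
    simplification); max_value is the assumed peak intensity (255 for 8-bit).
    """
    if len(reference) != len(generated) or not reference:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical pixel values, not data from the paper.
reference = [52, 55, 61, 59, 79, 61, 76, 61]
generated = [50, 57, 60, 62, 75, 65, 70, 66]
print(f"PSNR: {psnr(reference, generated):.2f} dB")
```

A higher PSNR means a lower mean squared error relative to the peak intensity, which is why it is used here as one measure of generation quality.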
