Title

An efficient multi-path structure with staged connection and multi-scale mechanism for text-to-image synthesis

Authors
Corresponding author: Guo, Huanlei
Publication date
2023
DOI
Journal
ISSN
0942-4962
EISSN
1432-1882
Volume: 29, Issue: 3, Pages: 1391-1403
Abstract
Generating a realistic image that matches a given text description is a challenging task. The multi-stage framework, which obtains a high-resolution image by first constructing a low-resolution one, is widely adopted for text-to-image synthesis. However, the subsequent stages of existing generators have to construct the whole image repeatedly, even though the primitive features of the objects have already been sketched out in the preceding adjacent stage. To let the subsequent stages focus on enriching fine-grained details and to improve the quality of the final generated image, this paper proposes an efficient multi-path structure for the multi-stage framework. The proposed structure contains two parts: a staged connection and a multi-scale module. The staged connection transfers the feature maps of the generated image from the preceding adjacent stage to the end of the current stage. This path avoids the need for long-term memory and guides the network to focus on modifying and supplementing the details of the generated image. In addition, the multi-scale module extracts features at different scales and generates images with more fine-grained details. The proposed multi-path structure can be introduced into multi-stage algorithms such as StackGAN-v2 and AttnGAN. Extensive experiments are conducted on two widely used datasets, Oxford-102 and CUB, for the text-to-image synthesis task. The results demonstrate the superior performance of the methods with the multi-path structure over the base models.
© 2023, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
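The two parts named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the real generators use learned convolutional layers, whereas here a single-channel feature map is a nested list, the multi-scale module is approximated by average pooling at several window sizes, and the staged connection is assumed to be an element-wise addition (the abstract does not say whether features are added or concatenated). All function names are hypothetical.

```python
from typing import Callable, List, Sequence

Grid = List[List[float]]  # one single-channel feature map


def upsample_nearest(fmap: Grid, factor: int) -> Grid:
    """Repeat every value `factor` times along both axes."""
    out: Grid = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def avg_pool(fmap: Grid, k: int) -> Grid:
    """Non-overlapping k x k average pooling (sides must be divisible by k)."""
    h, w = len(fmap), len(fmap[0])
    assert h % k == 0 and w % k == 0
    return [
        [sum(fmap[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
         for j in range(0, w, k)]
        for i in range(0, h, k)
    ]


def multi_scale(fmap: Grid, scales: Sequence[int] = (1, 2)) -> Grid:
    """Assumed stand-in for the multi-scale module: average several pooled views."""
    views = [upsample_nearest(avg_pool(fmap, k), k) for k in scales]
    h, w = len(fmap), len(fmap[0])
    return [[sum(v[i][j] for v in views) / len(views) for j in range(w)]
            for i in range(h)]


def stage_with_staged_connection(prev: Grid, body: Callable[[Grid], Grid]) -> Grid:
    """One generator stage: upsample the previous stage's features, run the
    stage body, then add the upsampled features back at the end of the stage."""
    skip = upsample_nearest(prev, 2)   # features carried over from the previous stage
    detail = body(skip)                # what the current stage computes
    return [[s + d for s, d in zip(srow, drow)]
            for srow, drow in zip(skip, detail)]


prev_stage = [[1.0, 2.0], [3.0, 4.0]]
current = stage_with_staged_connection(prev_stage, multi_scale)
```

Because the previous stage's features reach the end of the current stage directly, the stage body only has to contribute the additional detail rather than rebuild the whole map, which is the motivation the abstract gives for the staged connection.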
Keywords
Related links: [Source record]
Indexed in
EI ; SCI
Language
English
University attribution
Corresponding
Funding
The authors acknowledge the financial support from the Fundamental Research Funds for the Provincial Universities of Zhejiang (Grant No. GK219909299001-015), Natural Science Foundation of China (Grant No. 62206082), National Undergraduate Training Program for Innovation and Entrepreneurship (Grant No. 202110336042), Planted talent plan (Grant No. 2022R407A002) and Research on higher teaching reform (YBJG202233).
WOS research area
Computer Science
WOS category
Computer Science, Information Systems ; Computer Science, Theory & Methods
WOS accession number
WOS:000939646100001
Publisher
EI accession number
20230913650700
EI main heading
Software engineering
EI classification code
Computer Programming:723.1
ESI research field
COMPUTER SCIENCE
Source database
EV Compendex
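Citation statistics
Times cited [WOS]: 0
Document type: Journal article
Item identifier: http://sustech.caswiz.com/handle/2SGJ60CL/519650
Collection: College of Science_Department of Statistics and Data Science
Author affiliations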
1.Computer and Software School, Hangzhou Dianzi University, Hangzhou; 310018, China
2.Department of Statistics and Data Science, Southern University of Science and Technology, Shenzhen; 518055, China
3.Zhuoyue Honors College, Hangzhou Dianzi University, Hangzhou; 310018, China
4.Hangzhou oke Technology Co Ltd, Hangzhou; 310000, China
5.Hangzhou Dianzi University Shangyu Institute of Science and Engineering, Shangyu; 312300, China
Corresponding author affiliation: Department of Statistics and Data Science
Recommended citation
GB/T 7714
Ding, Jiajun,Liu, Beili,Yu, Jun,et al. An efficient multi-path structure with staged connection and multi-scale mechanism for text-to-image synthesis[J]. MULTIMEDIA SYSTEMS,2023,29(3):1391-1403.
APA
Ding, Jiajun,Liu, Beili,Yu, Jun,Guo, Huanlei,Shen, Ming,&Shen, Kenong.(2023).An efficient multi-path structure with staged connection and multi-scale mechanism for text-to-image synthesis.MULTIMEDIA SYSTEMS,29(3),1391-1403.
MLA
Ding, Jiajun,et al."An efficient multi-path structure with staged connection and multi-scale mechanism for text-to-image synthesis".MULTIMEDIA SYSTEMS 29.3(2023):1391-1403.
Files in this item
There are no files associated with this item.