Title | imDedup: A Lossless Deduplication Scheme to Eliminate Fine-grained Redundancy among Images |
Authors | |
Corresponding Author | Xia,Wen |
DOI | |
Publication Date | 2022
|
Conference Name | 38th IEEE International Conference on Data Engineering (ICDE)
|
ISSN | 1084-4627
|
ISBN | 978-1-6654-0884-4
|
Proceedings Title | |
Volume | 2022-May
|
Pages | 1071-1084
|
Conference Date | 9-12 May 2022
|
Conference Location | Kuala Lumpur, Malaysia
|
Place of Publication | 10662 LOS VAQUEROS CIRCLE, PO BOX 3014, LOS ALAMITOS, CA 90720-1264 USA
|
Publisher | |
Abstract | Images occupy a large amount of storage in data centers. To cope with the explosive growth of the image storage requirement, image compression techniques are devised to shrink the size of every single image at first. Furthermore, image deduplication methods are proposed to reduce the storage cost as they could be used to eliminate redundancy among images. However, state-of-the-art image deduplication methods either can only eliminate file-level coarse-grained redundancy or cannot guarantee lossless deduplication. In this work, we propose a new lossless image deduplication framework to eliminate fine-grained redundancy among images. It first decodes images to expose similarity, then eliminates fine-grained redundancy on the decoded data by delta compression, and finally re-compresses the remaining data by image compression encoding. Based on this framework, we propose a novel lossless similarity-based deduplication (SBD) scheme for decoded image data (called imDedup). Specifically, imDedup uses a novel and fast sampling method (called Feature Map) to detect similar images in a two-dimensional way, which greatly reduces computation overhead. Meanwhile, it uses a novel delta encoder (called Idelta) which incorporates image compression encoding characteristics into deduplication to guarantee the remaining deduplicated image data to be friendly re-compressed via image encoding, which significantly improves the compression ratio. We implement a prototype of imDedup for JPEG images, and demonstrate its superiority on four datasets: Compared with exact image deduplication, imDedup achieves a 19%-38% higher compression ratio by efficiently eliminating fine-grained redundancy. Compared with the similarity detector and delta encoder of state-of-the-art SBD schemes running on the decoded image data, imDedup achieves a 1.8×-3.4× higher throughput and a 1.3×-1.6× higher compression ratio, respectively. |
Keywords | |
Institutional Attribution | Other
|
Language | English
|
Related Links | [Scopus Record] |
Indexed By | |
Funding Project | NSFC[61972441]
|
WOS Research Area | Computer Science
|
WOS Category | Computer Science, Artificial Intelligence
; Computer Science, Information Systems
; Computer Science, Theory & Methods
|
WOS Accession Number | WOS:000855078401011
|
EI Accession Number | 20223512637902
|
EI Subject Terms | Cost reduction
; Decoding
; Digital storage
; Encoding (symbols)
; Image coding
; Image compression
; Image enhancement
; Signal encoding
|
EI Classification Code | Information Theory and Signal Processing:716.1
; Data Storage, Equipment and Techniques:722.1
; Data Processing and Image Processing:723.2
|
Scopus Record ID | 2-s2.0-85136441662
|
Source Database | Scopus
|
Full Text Link | https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9835287 |
Citation Statistics |
Times Cited [WOS]: 3
|
Document Type | Conference Paper |
Item Identifier | http://sustech.caswiz.com/handle/2SGJ60CL/395611 |
Collection | Southern University of Science and Technology |
Author Affiliations | 1.Harbin Institute of Technology,Shenzhen,China 2.National University of Defense Technology,China 3.Southern University of Science and Technology,China |
Recommended Citation (GB/T 7714) |
Deng,Cai,Chen,Qi,Zou,Xiangyu,et al. imDedup: A Lossless Deduplication Scheme to Eliminate Fine-grained Redundancy among Images[C]. 10662 LOS VAQUEROS CIRCLE, PO BOX 3014, LOS ALAMITOS, CA 90720-1264 USA:IEEE COMPUTER SOC,2022:1071-1084.
|
Files in This Item | There are no files associated with this item. |
|