Title
Duplicate Detection for Quality Assurance of Document Image Collections
Subtitle (en)
Paper - iPRES 2012 - Digital Curation Institute, iSchool, Toronto
Language
English
Description (en)
Digital preservation workflows for image collections that involve automatic and semi-automatic image acquisition and processing are prone to quality loss. We present a method for quality assurance of scanned content based on computer vision. A visual dictionary derived from local image descriptors enables efficient perceptual image fingerprinting, which is used to compare scanned book pages and detect duplicated pages. A spatial verification step involving descriptor matching makes the approach more robust. Results for a digitized book collection of approximately 35,000 pages are presented. Duplicated pages are identified with high reliability, in good agreement with results obtained independently by human visual inspection.
Keywords (en)
iPRES, iSchool, Toronto, Canada, digital preservation, information retrieval, image processing
Author of the digital object
Reinhold Huber-Mork
Alexander Schindler
Sven Schlarb
Format
application/pdf
Size
1.7 MB
Licence Selected
CC BY-NC-SA 3.0 AT
Conferences
Conference 2012
Name of Publication (en)
"iPres 2012 - Proceedings of the 9th International Conference on Preservation of Digital Objects." Editors: Reagan Moore, Kevin Ashley, Seamus Ross
From Page
188
To Page
195
Name of Collection/Monograph (en)
"iPres 2012 - Proceedings of the 9th International Conference on Preservation of Digital Objects." Editors: Reagan Moore, Kevin Ashley, Seamus Ross
Publishing Address
140 St. George Street, Toronto, ON M5S3G6
Publisher
Digital Curation Institute, iSchool University of Toronto
Publication Date
2012-11-01
Link to bibliographic information
https://ipres.ischool.utoronto.ca/sites/ipres.ischool.utoronto.ca/files/iPres%202012%20Conference%20Proceedings%20Final.pdf
Details
Object type
PDFDocument
Created
2013-06-15 08:38:36