SEEK: Salford Environment for Expertise and Knowledge

Published Conference Proceedings - Paper
August 2013

The Significance of Reading Order in Document Recognition and its Evaluation

Clausner, C & Pletschacher, S & Antonacopoulos, A 2013, The Significance of Reading Order in Document Recognition and its Evaluation, in: 'Proceedings of the 12th International Conference on Document Analysis and Recognition (ICDAR2013)', IEEE-CS, Los Alamitos, CA, USA.

Abstract

  Reading order detection and representation is an important task in many digitisation scenarios that involve preserving the logical structure of a document. Evaluating the reading order results generated by layout analysis methods poses a particular challenge because the detected segmentation of a page may deviate from the ground truth segmentation. To this end, a novel evaluation approach is proposed that responds to this problem by incorporating region correspondence analysis. Furthermore, a sophisticated reading order representation scheme is presented and used by the system. It allows objects to be grouped with ordered and/or unordered relations, which is a typical requirement for documents with complex layouts such as magazines and newspapers. The evaluation method has been validated using the results of two state-of-the-art OCR / layout analysis systems and a basic top-to-bottom reading order detection algorithm, applied to representative samples from the PRImA contemporary and the IMPACT historical document datasets.
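  To make the idea of grouping with ordered and/or unordered relations more concrete, the following is a minimal sketch in Python. It is not the representation scheme or the evaluation method of the paper; the names Region, Group, top_to_bottom_order and precedence_pairs are hypothetical and chosen only for illustration. The sketch assumes that an ordered group fixes the sequence of its children, that an unordered group imposes no sequence on them, and that a top-to-bottom baseline simply sorts regions by their vertical position.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple, Union


@dataclass
class Region:
    """A layout region; only an id and a top-left corner are modelled (assumption)."""
    region_id: str
    x: float
    y: float


@dataclass
class Group:
    """A reading order group whose children are either in sequence or unordered."""
    ordered: bool
    children: List[Union["Group", str]] = field(default_factory=list)


def top_to_bottom_order(regions: List[Region]) -> Group:
    """Baseline: a single ordered group, regions sorted by y and then by x."""
    ranked = sorted(regions, key=lambda r: (r.y, r.x))
    return Group(ordered=True, children=[r.region_id for r in ranked])


def leaves(node: Union[Group, str]) -> List[str]:
    """All region ids below a node, in stored order."""
    if isinstance(node, str):
        return [node]
    out: List[str] = []
    for child in node.children:
        out.extend(leaves(child))
    return out


def precedence_pairs(node: Union[Group, str]) -> Set[Tuple[str, str]]:
    """Pairs (a, b) where region a must be read before region b."""
    if isinstance(node, str):
        return set()
    pairs: Set[Tuple[str, str]] = set()
    for child in node.children:
        pairs |= precedence_pairs(child)
    if node.ordered:
        # Only ordered groups constrain the sequence of their children.
        for i, earlier in enumerate(node.children):
            for later in node.children[i + 1:]:
                pairs |= {(a, b) for a in leaves(earlier) for b in leaves(later)}
    return pairs


# A heading followed by two articles that may be read in either order.
page = Group(ordered=True, children=[
    "heading",
    Group(ordered=False, children=[
        Group(ordered=True, children=["a1", "a2"]),
        Group(ordered=True, children=["b1", "b2"]),
    ]),
])
pairs = precedence_pairs(page)

baseline = top_to_bottom_order([
    Region("r1", 0, 30), Region("r2", 0, 10), Region("r3", 0, 20),
])
```

  In this sketch the heading precedes every article paragraph and each article keeps its internal sequence, but no precedence is recorded between the two articles. That is the kind of relation a newspaper or magazine page needs and a single flat sequence cannot express.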

Publication Details

Conference Proceedings
Antonacopoulos, A & Pletschacher, S & Clausner, C eds. 2013, Proceedings of the 12th International Conference on Document Analysis and Recognition (ICDAR2013), IEEE-CS, Los Alamitos, CA, USA.

Conference Details
12th International Conference on Document Analysis and Recognition (ICDAR2013)