Operation Manual
Section 9: Configure the Export
ePub
ePub (EPUB) is a free and open e-book standard developed by the International Digital Publishing Forum
(IDPF). EPUB is designed for reflowable content: the presentation of the content automatically adapts to
the device on which the file is opened. EPUB also supports fixed-layout content.
Note that Readiris only creates body text in ePub files. Images are not included.
Text Options
The option Merge lines into paragraphs is selected by default. Readiris wraps the recognized text
continuously until a new paragraph starts, and rejoins words that were hyphenated at the end of a line.
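The merging behavior described above can be illustrated with a short Python sketch. It is not Readiris code. It assumes that a blank line marks the start of a new paragraph and that any line ending in a hyphen carries a split word; the function name merge_lines is hypothetical.

```python
def merge_lines(lines):
    """Join recognized lines into paragraphs.

    Assumption: a blank line marks a paragraph boundary, and a line
    ending in "-" holds the first half of a hyphenated word.
    """
    paragraphs = []
    current = ""
    for raw in lines:
        line = raw.strip()
        if not line:
            # Blank line: close the paragraph collected so far.
            if current:
                paragraphs.append(current)
            current = ""
            continue
        if current.endswith("-"):
            # Rejoin a word that was hyphenated at the end of the previous line.
            current = current[:-1] + line
        elif current:
            # Ordinary line break inside a paragraph becomes a space.
            current += " " + line
        else:
            current = line
    if current:
        paragraphs.append(current)
    return paragraphs


sample = ["The recog-", "nized text is", "merged.", "", "A new paragraph."]
print(merge_lines(sample))
```

This simplification would also join genuine compound words that happen to break at their hyphen, which a real recognition engine is likely to handle more carefully.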
HTML
HTML stands for "Hypertext Markup Language". It is the predominant markup language for web pages. It
provides a means to describe the structure and formatting of text-based information in a document. This
file format can be opened in Microsoft Excel, in Web browsers such as Safari, and in Web page editors such
as Adobe Dreamweaver.
Note: HTML is the recommended format when saving documents to Evernote.
Layout and Graphics Options
The same Layout and Graphics options are available as for DOCX, ODT and RTF.
XLSX
XLSX is the standard spreadsheet file format used since Microsoft Excel 2008 for Mac (Excel 2007 on
Windows). XLSX files are created using the Office Open XML standard. Each cell in an XLSX file can have
its own formatting.
Layout Options
Worksheet
The option Create one worksheet per page creates one worksheet for each scanned page.
If a page contains both tables and text, everything is placed in the same worksheet.
The option Create one worksheet per table places each table in a separate worksheet and
includes the recognized text (outside the tables) in another worksheet.
If the document being processed contains more than one page, each page will be processed in
the same manner.
This option is useful when processing tables that differ in size and have different headings.
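The difference between the two worksheet options can be sketched in Python as follows. This is only an illustration of the layout rules described above, not the way Readiris is implemented. The function plan_worksheets, the block representation and the worksheet names are all hypothetical.

```python
def plan_worksheets(pages, per_table=False):
    """Return a list of (sheet_name, blocks) pairs.

    pages: a list of pages; each page is a list of (kind, content)
    blocks, where kind is "table" or "text". Sheet names are
    illustrative assumptions only.
    """
    sheets = []
    for page_no, blocks in enumerate(pages, start=1):
        if not per_table:
            # One worksheet per page: tables and text share the same sheet.
            sheets.append((f"Page {page_no}", list(blocks)))
            continue
        tables = [b for b in blocks if b[0] == "table"]
        text = [b for b in blocks if b[0] == "text"]
        # One worksheet per table ...
        for table_no, table in enumerate(tables, start=1):
            sheets.append((f"Page {page_no} Table {table_no}", [table]))
        # ... and the text outside the tables goes to another worksheet.
        if text:
            sheets.append((f"Page {page_no} Text", text))
    return sheets


document = [
    [("text", "Intro"), ("table", "T1"), ("table", "T2")],
    [("table", "T3")],
]
for name, blocks in plan_worksheets(document, per_table=True):
    print(name, blocks)
```

With per_table left at False, the same document yields one worksheet per page, each holding all of that page's tables and text.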
Other options
The option Retain word style and text paragraphs keeps the general format structure of your
scanned document.
The font type, size and type style are maintained across the recognition process.
The tabs and the alignment of each block are recreated.
The text blocks and columns aren't recreated; the paragraphs just follow each other.
Tables are recaptured correctly.
Pictures are not captured.