Monday, November 27, 2006

A speech at ATIA

I'm giving a speech at ATIA, and have three days to write the seemingly inevitable PowerPoint slides. Does anyone want to help?

The title is "Beyond Access - What's Next for Kurzweil 1000?"

I had to write a few paragraphs when I submitted the proposal - they are below:

"Software based reading programs for the Blind have moved well beyond their original purpose - access to print materials. We'll look at the many other problems that Kurzweil 1000 tries to solve. Although I'll describe current functionality, and answer questions about the current release, I'd like to spend most of the session in a discussion regarding possible features for future releases."

"This year is the 30th anniversary of the introduction of the first reading machine for the Blind. We'll briefly compare the price and performance of today's reading machine with those of the very first, which was a product from our distant corporate ancestor, Kurzweil Computer Products. Less interesting perhaps, but more important, we'll talk about the way in which product capabilities have expanded to address issues that go well beyond converting printed text into audio. I'll provide an overview of some of the newer capabilities of Kurzweil 1000, answer questions about these capabilities, and demonstrate them if that is of interest. Mainly, though, I would like a group discussion about where this product sector is going - what sort of problems should it solve in the future, and for whom?"

"Key learning objectives include:

A broader definition of what reading products for the Blind are generally, and what Kurzweil 1000 provides specifically.

A hint of what is likely to happen in future releases of Kurzweil 1000.

A sense that current users and solution providers can directly influence those future releases."

The fact that I'm asking for help here is mainly an indication that I'm still procrastinating - I hate writing these things, though I don't mind doing the presentation itself.

So, does anyone want to suggest some ideas?

Wednesday, November 15, 2006

A Miscellany of other New Features in Version 11

In previous posts, I've gone into a fair amount of detail about new features, with one feature listed per post. Here, I'll list the remaining features - they are pretty modest, but one or another may appeal to you.

An altered TTS Engine.

We've shipped IBM TTS for a number of years. With this release, we are switching to ETI Eloquence. IBM TTS has suffered from a lack of support for the last few years - we expect better of ETI Eloquence - in fact, a number of fixes were made at our request before we would accept the engine.

Recognition Optimization.

This new option works somewhat like scanning optimization, except that it uses the image associated with the current page, and runs it through several different recognition settings.

Table Output in HTML and DAISY.

Previous releases did not support table formatting when you saved documents as HTML or as DAISY. Version 11 supports this.

Read Newly Recognized Pages Setting.

If you are reading your mail, you might like to have reading begin at the top of whatever page you have just scanned. There is a setting that allows you to do that if you wish. You'll find it in the General Settings Dialog.

Searching for Blank Pages.

In the Find dialog, you can now search for blank pages. A blank page is one that contains no text, or only spaces, new lines, and tabs.
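For the programmers in the audience, the blank page rule is easy to express. This is only a sketch of the definition given above, not code from Kurzweil 1000; the function name and the sample pages are made up for illustration, and treating a carriage return as part of a new line is my assumption.

```python
def is_blank_page(text: str) -> bool:
    """Return True if the page has no text, or only spaces, new lines, and tabs."""
    # Strip exactly the characters named in the definition. The carriage
    # return is included on the assumption that Windows line endings count
    # as new lines.
    return text.strip(" \t\r\n") == ""


# Hypothetical four-page document; pages 2 and 3 count as blank.
pages = ["Dear reader,", "", " \t\n\n", "Page 4"]
blank_page_numbers = [n for n, page in enumerate(pages, start=1) if is_blank_page(page)]
print(blank_page_numbers)
```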

Enhanced Explore layout.

Explore layout now distinguishes between headers, footers, captions, tables, titles, and normal text blocks. Further, you can choose to save a picture region as a TIFF file. To do so, select the picture in the grid control, then tab to the Extract button and press enter.

A Bigger Recently Opened Files List.

The number of items in this list has been expanded from 5 to 10.

Continuous Reading in ListViews and TreeViews.

Within any dialog that contains ListView or TreeView controls, you can read the contents of those controls by pressing F5. You can stop at any item by pressing F5 again.

Expanded use of Batch Scanning Prefix.

In previous releases, you could use this setting to begin each TIFF file name with a unique prefix when you were scanning images. This was handy mainly because you could differentiate between one set of images and another. You can still do that, but you can also specify a folder name in the prefix, directing image files to a particular location, and (later), processing images from that location when you choose to recognize image files.

Extract All Images.

This is a new option in the File Utilities menu, allowing you to extract the images from all of the pages in a document.

New Features Guides.

At conferences I frequently have someone come up to me who is using, say, version 7 of Kurzweil 1000. He or she will, quite naturally, ask me what is new. To help me answer that question, the Help menu now contains a submenu labeled "New Features". Within that, you will be able to open any of the New Features guides for any of the versions of Kurzweil 1000 that were on your computer.

Suggested Dictionary Lookup.

If you misspell a word when you attempt to look it up in the dictionary, Kurzweil 1000 may suggest possible corrections that are available in the dictionary.

Copy the Entire Definition.

Once you have looked up a word, use Control+W to copy the entire definition to the clipboard.

A Few More Verbosity Settings.

You can now be notified, if you wish, when recognition of a multi-page file is complete, or when creation of audio files is complete. You can also create a chime to be played when continuous reading passes an end of paragraph, and/or a blank line.

Read All Punctuation.

A few dialogs have been changed to always read with all punctuation audible. This is handy, for example, in the edit corrections and edit pronunciations dialogs.

Switch Currently Active Document.

We have added a new item to the reading keypad. Shift+Up Arrow will move you from one document to the next, presuming you have more than one document open.

Stealing Focus after Scanning.

TWAIN Interfaces have a nasty habit of changing the keyboard focus when a scan begins. Kurzweil 1000 tries to compensate for this by saving the keyboard focus when a scan begins, and refocusing the keyboard once the scan ends. This causes trouble if you are using the Scanner Hot Key sequence in another application, and manually change the keyboard focus while a scan is in progress. You can now disable this behavior in the ScanConf diagnostic by changing the value of the setting labeled "Keep Kurzweil in Foreground".

Changed Behavior for Control+T.

Using Control+T, or the "Tools->What Time is It?" menu item, always gave you the time and the date. Now it will do one or the other. Press it once, and you will get the time. Press it again within 10 seconds, and you'll get the date. This is a handier approach, especially when used in conjunction with Control+C and Control+V, which will copy and paste the date or time into your document.
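The press-once, press-again behavior amounts to a small state machine. Here is a minimal sketch of the idea, not the actual implementation: the class name is hypothetical, I assume "within 10 seconds" includes exactly 10 seconds, and I assume that a press after the date has been announced starts over with the time.

```python
from datetime import datetime, timedelta
from typing import Optional

REPEAT_WINDOW = timedelta(seconds=10)  # "within 10 seconds", per the description


class TimeAnnouncer:
    """Hypothetical model of Control+T: first press gives the time, a quick second press the date."""

    def __init__(self) -> None:
        # When the time was last announced; None means the next press announces the time.
        self._last_time_press: Optional[datetime] = None

    def press(self, now: datetime) -> str:
        last = self._last_time_press
        if last is not None and now - last <= REPEAT_WINDOW:
            # Second press inside the window: announce the date, then reset (assumption).
            self._last_time_press = None
            return now.strftime("%Y-%m-%d")
        # First press, or the window has expired: announce the time.
        self._last_time_press = now
        return now.strftime("%H:%M")


announcer = TimeAnnouncer()
start = datetime(2006, 11, 15, 9, 30, 0)
print(announcer.press(start))                          # the time
print(announcer.press(start + timedelta(seconds=4)))   # the date
print(announcer.press(start + timedelta(seconds=60)))  # the time again
```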

Changed Defaults.

Scanner Threshold now defaults to Dynamic rather than to Static. Language Identification now defaults to Disabled rather than Once per Page. The default Reading and Message Voice volumes are now 80 (some voices exhibit bad behavior when driven at their maximum volume.)

Changes to the Font Properties Dialog.

In previous releases, we combined the bold and italic attributes for a font into one setting called a font style. That's unusual, and it turns out there is a good reason why it isn't done that way. It becomes difficult to, for example, remove the bold attribute of a block of selected text without affecting the italic attribute in the same block. So, we've separated it out into a Bold setting and an Italic setting. Each has two possible values usually: Enabled or Disabled. For the Font Format dialog, though, if you select text that has both states of those attributes, you can end up with a third possibility: Mixed. For consistency, the print properties dialog was changed in a similar manner - its font style setting is replaced with two settings, bold and italic.
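To make the reasoning concrete, here is a small sketch of why separate attributes are easier to work with. It is an illustration only: the Run structure and function names are hypothetical, not taken from Kurzweil 1000, and reporting an empty selection as Disabled is my assumption.

```python
from dataclasses import dataclass, replace
from typing import Iterable, List


@dataclass(frozen=True)
class Run:
    """A stretch of text with uniform formatting (hypothetical structure)."""
    text: str
    bold: bool
    italic: bool


def attribute_state(values: Iterable[bool]) -> str:
    """Summarize one attribute across a selection: Enabled, Disabled, or Mixed."""
    seen = set(values)
    if len(seen) > 1:
        return "Mixed"
    # An empty selection is reported as Disabled (assumption).
    return "Enabled" if True in seen else "Disabled"


def set_bold(runs: List[Run], enabled: bool) -> List[Run]:
    """Change bold across the selection without touching italic."""
    return [replace(run, bold=enabled) for run in runs]


selection = [Run("Bold only, ", True, False), Run("bold and italic", True, True)]
print(attribute_state(r.bold for r in selection))    # Enabled
print(attribute_state(r.italic for r in selection))  # Mixed

cleared = set_bold(selection, False)
print(attribute_state(r.bold for r in cleared))      # Disabled
print(attribute_state(r.italic for r in cleared))    # Mixed
```

With a single combined font style, the set_bold step would have to pick one style for the whole block, which is exactly the problem described above.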

Scanning Time Property.

You'll find a new item in the Recognition Properties dialog. It is the first control, and, like everything else there, it's a read only text box. It is labeled "Scan Time", and its mnemonic is ALT+N. It contains the time, in seconds, that elapsed between the press of the scan button and the indication to K1000 that the scan was complete. Depending on the scanner, this may or may not include the time it took for the scan bar to return to its home position.

Creating a List of Misspelled Words.

If you are scanning something that contains a large number of unusual words, they are likely to show up in the ranked spelling dialog as misspellings. If you have access to an expert in the subject of the document, that person might be able to help you determine which words are misspellings, and which ones aren't. You can now use Control+C in the misspellings list to copy the list to the clipboard. After that, you'll be able to copy it to a new document that can be sent to the obliging expert.

Read Number Setting affects the Message Voice.

The Read Number setting, which lets you choose between whole numbers and digits, now affects the message voice as well as the reading voice.

Select Audio Device.

We have had a number of examples where people have lost the ability to use SAPI 4 voices within Kurzweil 1000, because the Audio Device ID has been changed for the speech engine. You can now use the SapiReg diagnostic to correct that problem.

Conversion Settings

When Kurzweil 1000 is asked to open a file, it often reads and converts the file into a temporary file in its own format - KES. If the format of the original file is text, RTF, Braille, HTML, XML, or DAISY, it does that conversion using techniques that we have written here at Kurzweil Educational Systems. If the format of the original file is an image, or is PDF, then the conversion is actually a recognition, and an OCR engine is used. If the format of the original file is something else - Microsoft Word, for example - then a third party conversion program is used. It is told to convert the file into a temporary RTF file. That RTF file is then converted again by Kurzweil 1000 into KES. And you wondered why it took a long time to open some files?

When you save a file, if you are not saving into the KES format, then a conversion is happening. Again, if the output format is Text, RTF, Braille, HTML, XML, or DAISY, the conversion is done using code that was written (and is controlled) by Kurzweil Educational Systems. If the format is something else, then a two step process is used - K1000 will convert the file to RTF, and then will invoke a third party conversion program to create the final file in the requested format.
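The routing described in the last two paragraphs can be summarized in a few lines. This is a sketch of the description only, not the product's code: the function names and step labels are hypothetical, and I assume that opening or saving a KES file needs no conversion at all.

```python
from typing import List

# Formats converted by code written at Kurzweil Educational Systems, per the text above.
NATIVE_FORMATS = {"text", "rtf", "braille", "html", "xml", "daisy"}
# Formats that are opened by recognition rather than conversion.
RECOGNIZED_FORMATS = {"image", "pdf"}


def opening_steps(fmt: str) -> List[str]:
    """List the conversion steps used to open a file of the given format."""
    fmt = fmt.lower()
    if fmt == "kes":
        return []  # assumption: the native format needs no conversion
    if fmt in NATIVE_FORMATS:
        return ["built-in conversion to KES"]
    if fmt in RECOGNIZED_FORMATS:
        return ["OCR recognition to KES"]
    # Anything else (Microsoft Word, for example) takes two steps.
    return ["third party conversion to RTF", "built-in conversion from RTF to KES"]


def saving_steps(fmt: str) -> List[str]:
    """List the conversion steps used to save a document in the given format."""
    fmt = fmt.lower()
    if fmt == "kes":
        return []
    if fmt in NATIVE_FORMATS:
        return ["built-in conversion from KES"]
    return ["built-in conversion from KES to RTF", "third party conversion from RTF"]


print(opening_steps("Word"))
print(saving_steps("HTML"))
```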

Beginning with this release, we have provided a dialog that lets you control some of the details of file conversions. You can access this dialog using the Conversion menu item in the Settings menu. It's just below the Verbosity menu item. It will open the Conversion Settings dialog. This dialog has the usual OK and Cancel buttons at the bottom, and two important list controls at the top. The first is labeled "Action", and allows you to choose between those settings that affect the Opening of a document, and those that affect the Saving of a document. The second list is labeled "Format", and lets you choose among a selection of document formats. Below those two controls, there are a variable number of other dialog box controls. Exactly what they are and what they do is determined by the settings of the first two controls. I'll go through each of the possibilities here.

Action = Opening, Format = Text.

Split Long Pages - a list box, whose possible settings are Enabled and Disabled. The default is Enabled. K1000 looks for form feed characters when it opens a text file. If the amount of text between form feeds exceeds some amount, a page break is forced. This setting allows you to disable that action, so that the resulting KES file has no more pages than indicated by the form feeds in the text file. The mnemonic for this control is ALT+"P".
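Here is a rough sketch of what Split Long Pages does to a text file. It is illustrative only: the real limit is not stated above, so the max_chars value is a placeholder, and forcing the break at a line boundary is my assumption.

```python
from typing import List


def split_pages(text: str, split_long_pages: bool = True, max_chars: int = 3000) -> List[str]:
    """Split text into pages at form feeds, optionally forcing breaks in long pages.

    max_chars is a hypothetical threshold, not the value the product uses.
    """
    pages: List[str] = []
    for page in text.split("\f"):
        if not split_long_pages or len(page) <= max_chars:
            pages.append(page)
            continue
        # Force page breaks at line boundaries (assumption) so that each
        # piece stays within the limit where possible.
        current = ""
        for line in page.splitlines(keepends=True):
            if current and len(current) + len(line) > max_chars:
                pages.append(current)
                current = ""
            current += line
        pages.append(current)
    return pages


sample = "a\n" * 5 + "\f" + "b\n"
print(len(split_pages(sample, split_long_pages=True, max_chars=4)))   # extra breaks are forced
print(len(split_pages(sample, split_long_pages=False, max_chars=4)))  # only the form feed counts
```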

Paragraph Analysis - a list box, whose possible settings are Enabled and Disabled. The default is Enabled. K1000 uses a fairly sophisticated analysis to try to figure out where end of paragraph marks should be placed. The analysis is sensitive to attributes such as first line indent, average text length, the presence of blank lines, and even tries to make sense out of tables, block indents, and hanging indents. If you disable it, you will end up with each line in the original text file being treated as an end of paragraph. This preserves the look of the original text file, at the expense, often, of its editability. The mnemonic for this control is ALT+"A".

Action = Opening, Format = Braille.

Language - a list box, whose possible settings are Default, Danish, Dutch, English, German, Icelandic, Italian, Norwegian, Russian, Spanish, and Swedish. The default is, well, Default. Default behavior is to look at the language supported by the current reading voice, and use it whenever a Braille document is being opened. This setting won't do much if you aren't back translating, but it can be pretty useful if, for example, you know you are opening a Spanish Braille document. The mnemonic for this control is ALT+"L".

Action = Opening, Format = PDF.

Emphasis - a list box, whose possible settings are "Recognition of Images" and "Extraction of Text". The default is "Recognition of Images". The mnemonic for this control is ALT+"E". PDF files are unusual in that they can contain images and text. Unfortunately, they don't always contain text, and even when they do, that text may not contain all of the text that a sighted person would see when looking at the image of a page in the PDF file. When you open a PDF file, the recognition engine extracts the text and, potentially, recognizes the images for each page in the file. If you choose to emphasize the recognition of images, the text will be used to correct minor OCR mistakes, but the bulk of the results will come from the images. This is the default for this setting. Its primary advantage is that you are pretty much guaranteed to get access to all of the text that is represented in the PDF file - regardless of whether it is available as text from that file. There are, however, a few disadvantages. It is usually slower, and, if all of the text was there, it is likely to be less accurate. The alternative setting is "Extraction of Text". If text data is available for a page in a PDF file, that data will be trusted. Recognition will be done only to associate the text data with the image data on the page. Note that if no text data is available for a page, the image will be recognized and the results of that recognition will be made available to you. The advantages of this approach include both speed and accuracy. However, if portions of the page contain text represented only as an image, those portions will be ignored. It may be difficult for you to tell, when you read the page, that portions of it are missing. Note also that this setting interacts with your choice of recognition engines, and you will get somewhat different results depending on which engine you choose, and which treatment you choose to emphasize.
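If it helps, the per-page decision can be written as a small table in code. This is only a summary of the description above, not how the recognition engine is implemented; the function and the result labels are hypothetical.

```python
OCR = "recognize the page image"
OCR_CORRECTED = "recognize the page image, using the embedded text to correct minor mistakes"
EMBEDDED = "trust the embedded text, recognizing only to align it with the image"


def page_text_source(emphasis: str, has_text_data: bool) -> str:
    """Describe where the text for one PDF page comes from under each Emphasis setting."""
    if emphasis == "Recognition of Images":
        # The bulk of the result comes from the image either way.
        return OCR_CORRECTED if has_text_data else OCR
    if emphasis == "Extraction of Text":
        # Embedded text is trusted when present; otherwise the image is recognized.
        return EMBEDDED if has_text_data else OCR
    raise ValueError(f"unknown Emphasis setting: {emphasis!r}")


for emphasis in ("Recognition of Images", "Extraction of Text"):
    for has_text_data in (True, False):
        print(emphasis, has_text_data, "->", page_text_source(emphasis, has_text_data))
```

Note that the two settings differ only for pages that do carry text data, which is where the trade-off between completeness and speed comes from.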

Action = Opening, Format = RTF.

Split Long Pages - a list box, whose possible settings are Enabled and Disabled. The default is Enabled. RTF files may already contain page breaks, but K1000 will insert additional ones if the text of a page, in its assigned font, wouldn't fit on a 14 inch printed page. By disabling this setting, you can make sure that the number of pages in the opened file matches those that exist in the original RTF file. The mnemonic for this control is ALT+"P".

Action = Opening, Format = Other.

Use Microsoft Office for Conversions - a list box, whose possible settings are Enabled and Disabled. The default is Enabled. Microsoft Office comes with a conversion service that can convert documents in a number of different formats to RTF. From there, Kurzweil 1000 can convert the file from RTF to KES. This conversion package is usually a better choice than its alternative - a conversion service from another vendor that comes with Kurzweil 1000. However, if the conversion service from Microsoft has not been completely installed, our attempt to use it will bring up an unvoiced dialog from Office, asking you to complete the installation. If you do not have a screen reader running, this may look as though K1000 has hung. In this circumstance, it might be better to disable this setting. The mnemonic for this control is ALT+"M".

Action = Saving, Format = Text.

Add a Blank Line after each Paragraph - a list box, whose possible settings are Enabled and Disabled. The default is Disabled. One of the problems with plain text as a format is that it does not have a specific character used to mark an end of paragraph. Most of the settings for saving text have to do with trying to overcome, in one way or another, that limitation. If you enable this setting, each paragraph ending will always be followed by a blank line. If you do not enable it, paragraph endings will be followed by blank lines only if that is the case in the original file. The mnemonic for this control is ALT+"B".

Indent the First Line of each Paragraph - a list box, whose possible settings are Enabled and Disabled. The default is Disabled. If you enable this setting, the first line of each paragraph will begin with a tab, or with a certain number of spaces. If you do not enable it, paragraphs will have first line indentations only if the first line in the original paragraph begins with a tab. The mnemonic for this control is ALT+"I".

Spaces used for a First Line Indent - a text box. Possible values are the numbers 0 through 10. The default is 0. When zero, this setting indicates that a first line indent should be created with a tab character. Otherwise, the setting indicates the number of spaces to be used. This setting has no effect if the first line indent setting above it is disabled. The mnemonic for this control is ALT+"S".

Line Endings - a list box, whose possible settings are Preserve, Remove, or Wrap to Fit. The default is Preserve. When set in this manner, each line in the text file will have the same length as the original scanned lines - assuming that they were scanned by Kurzweil 1000. When set to Remove, each text line will be equal to a paragraph. Needless to say, this can create rather long text lines, but most text processors can automatically wrap long lines to fit within the width of the display window. Finally, the Wrap to Fit setting, which interacts with the maximum width setting that follows, will pretty much ignore the original line endings, but will introduce line endings as necessary to keep each line within a paragraph under a particular maximum limit. The mnemonic for this control is ALT+"L".

Maximum Width of each Text Line - a text box. Possible values are the numeric range 30 through 250. The default is 80. This setting is important only if Line Endings are set to "Wrap to Fit". It establishes what can be considered a margin for the document. Line endings will be added to keep lines under the number specified here. They can exceed that number only if a word has a length larger than the length specified here. The mnemonic for this control is ALT+"M".
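The five text saving settings interact, so a sketch may help show how they fit together. This is an illustration of the descriptions above, not the product's code. The function and parameter names are hypothetical, each paragraph is assumed to arrive as a list of its original lines with no indentation or blank lines of its own, and I treat the maximum width as the longest line allowed.

```python
import textwrap
from typing import List


def save_as_text(paragraphs: List[List[str]],
                 blank_line_after: bool = False,
                 indent_first_line: bool = False,
                 indent_spaces: int = 0,
                 line_endings: str = "Preserve",
                 max_width: int = 80) -> str:
    """Render paragraphs (lists of original lines) as plain text."""
    # Zero spaces means the first line indent is a tab character.
    indent = ""
    if indent_first_line:
        indent = "\t" if indent_spaces == 0 else " " * indent_spaces
    ending = "\n\n" if blank_line_after else "\n"

    blocks = []
    for lines in paragraphs:
        if line_endings == "Preserve":
            out = list(lines)
            if out:
                out[0] = indent + out[0]
        elif line_endings == "Remove":
            out = [indent + " ".join(lines)]
        elif line_endings == "Wrap to Fit":
            # A line exceeds max_width only when a single word is longer than it.
            out = textwrap.wrap(" ".join(lines), width=max_width,
                                initial_indent=indent,
                                break_long_words=False, break_on_hyphens=False)
        else:
            raise ValueError(f"unknown Line Endings setting: {line_endings!r}")
        blocks.append("\n".join(out) + ending)
    return "".join(blocks)


doc = [["The quick brown fox", "jumps over the lazy dog."], ["Second paragraph."]]
print(save_as_text(doc, line_endings="Wrap to Fit", max_width=30,
                   indent_first_line=True, indent_spaces=4, blank_line_after=True))
```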

Action = Saving, Format = Braille.

Type of Braille - a list box, whose possible settings are Grade 1 and Grade 2. The default is Grade 2. This setting takes effect whenever a Braille document is saved. The mnemonic for this control is ALT+"T".

Language - a list box, whose possible settings are Default, Danish, Dutch, English, German, Icelandic, Italian, Norwegian, Russian, Spanish, and Swedish. The default is, well, Default. Default behavior is to look at the language supported by the current reading voice, and use it whenever a Braille document is being written. The mnemonic for this control is ALT+"L".

Action = Saving, Format = Other.

Use Microsoft Office for Conversions - a list box, whose possible settings are Enabled and Disabled. The default is Enabled. Microsoft Office comes with a conversion service that can convert RTF documents to a number of different formats. This conversion package is usually a better choice than its alternative - a conversion service from another vendor that comes with Kurzweil 1000. However, if the conversion service from Microsoft has not been completely installed, our attempt to use it will bring up an unvoiced dialog from Office, asking you to complete the installation. If you do not have a screen reader running, this may look as though K1000 has hung. In this circumstance, it might be better to disable this setting. The mnemonic for this control is ALT+"M".

This new dialog caused a few other changes. First, we have removed the maximum text length setting from the General Settings dialog, since we've replaced it with a few different settings in the text saving section of the conversion settings dialog. Second, the dialog for saving partial changes has a new value in the list of possible settings categories: Conversion.

Insert Signatures into Documents

I mentioned signatures with the forms recognition feature. You can scan signatures into the system, save any number of them, and insert them into any open document.

How do you create signatures? In Kurzweil 1000, with no files open, you will find another new menu item at the bottom of the Scan menu list. It is named "Create a Signature File..." It will let you scan in a signature. To do so, first get a blank, white, sheet of paper. Use a guide if necessary, and enter your signature in the top left area of the page - not too close to the corner, and as straight as possible. Place that sheet on the scanner so that the orientation will be right side up in the resulting image - we can't automatically orient a signature. Use the Scan a Signature menu item to scan in the page. An image of your signature will appear on the screen. If it is acceptable, enter a name for it.

Once you have some signature files, you might find that you'd like to insert the signature in a document. That's straightforward. You'll find that the last item in the Edit menu is "Insert a Signature". Its mnemonic is "I". It opens a dialog box, that lets you choose from the available signature files. Pick one, press enter, and it will be inserted into your document at your cursor position.

Those of you with some sight might notice that the signature is a white rectangle, with the signature in black. This might look odd on the screen, as our default text display shows a black background. Suffice it to say, most paper is white. When you print the document, your signature should look fine.

Tuesday, November 14, 2006

Bilingual Dictionaries

Kurzweil 1000 comes with the American Heritage Dictionary, 4th Edition. You can also get the Concise Oxford Dictionary for it, if you'd like. With this version, we include 12 pairs of bilingual dictionaries - useful if you want to get a short definition in English for a Spanish document, for example, or the French word for an English one.

The following dictionaries are available, assuming that you choose to install them.

Larousse Concise French to English
LSI French to English Concise
LSI French to English Large
Larousse Concise English to French
LSI English to French Concise
LSI English to French Large
AHD Spanish to English Small
LSI Spanish to English Concise
LSI Spanish to English Large
AHD English to Spanish Small
LSI English to Spanish Concise
LSI English to Spanish Large
LSI Dutch to English Large
LSI English to Dutch Large
LSI German to English Large
LSI English to German Large
LSI Italian to English Large
LSI English to Italian Large

You choose a dictionary by going into the Dictionary Lookup Dialog, and using the list labeled "Dictionary Source". You'll find that the Part of Speech control is not available for any of the bilingual dictionaries.

When you look up a word, you'll get a brief definition or group of definitions in the other language. Words will be automatically spoken by a voice that is capable of handling the language.

The definitions are quite terse, and this is not a particularly good way to get a translation of a block of text written in a language that you don't know at all. It is, however, a handy way to look up a few terms in a language that you are familiar with, when you are reading text in a language with which you are not quite so familiar.

Writing Files onto a CD

You've been able to create audio DAISY documents for a few releases now, but had to use other software to actually write those documents to a CD for use in a portable player. Now you can write them onto your CD from within Kurzweil 1000.

This functionality is available only if you have Windows 2000, Windows 2003, Windows XP, or Windows Vista. If you are using an older operating system, or you don't have a CD-ROM drive that will write CDs, the appropriate menu items will not be available.

For the rest of you, you'll find a menu item at the bottom of your Tools menu titled "CD Writing Tasks". Its mnemonic is "C". Within it, you'll find six or seven menu items, each of which opens a dialog. They are, in sequence, Add Files, Remove Files, Start Writing, Status, Erase CD, Directly Write, and Select A Drive. Their mnemonics are "A", "R", "W", "S", "E", "D", and "L". The last of them, Select A Drive, is available only if you have more than one drive that is capable of writing to a CD.

Add Files will bring up the K1000 file dialog box, and allow you to select a folder, or any number of files. These are files that you will be adding to a queue. The queue is just a folder on your hard drive that will contain copies of the files and folders that you have selected.

Remove Files will once again bring up the K1000 file dialog box. This time, though, you are positioned within the CD Writing Queue. Again, you can select a folder or any number of files. This time, though, the files are being removed from the queue.

Start Writing will begin the process of writing the queued files onto the CD. You should have a writable CD in the drive when you begin this process. A dialog box will be brought up which contains one read only text box, an OK button, and a Cancel button. As the writing progresses, the text box will show a percent done. You can press enter and exit from the dialog if you wish - the writing process will continue while you do other things. If you'd like, you can press escape or activate the cancel button. You will be asked if you are sure that you wish to cancel the write. Answer in the affirmative, and the CD writing process will be canceled at the next opportune moment.

If you have not begun writing a CD, the Status dialog will tell you how many files you have queued to be written, and the total size of those files. It will also tell you whether or not a CD is in the drive, and if one is, how much free space is available on that CD. If writing is in progress, it will bring you back to the small dialog box described above.

The next menu item, Erase CD, will erase all contents on your CD, provided you have the appropriate type of CD. Like the Start Writing menu item, it will bring up the status dialog while the erasure is in progress.

You can use the Directly Write menu item to select a folder, and immediately begin writing the contents of that folder into the root drive of the CD. The dialog that comes up allows you to select the folder. Once you accept your selection, it is read, and then written. This does not affect the queue of files or folders that you intend to write to your CD, which you might have created in other menu items. This option is particularly useful for copying CDs.

If you have multiple drives that can be used to write CDs, the final menu item, Select a Drive, will let you choose which one you wish to use. It will bring up a list, showing the paths of the available drives. This, by the way, is a saveable setting.

One last thing. If you choose to create Audio Files, you can choose the path of the writable CD if you'd like. Kurzweil 1000 will automatically write the audio files into the queue. Once audio creation is finished, you can use the Start Writing menu item to complete the process.

Linking Documents and Settings

Kurzweil 1000 allows you to save almost all of the various user settings in named settings files. But it can be pretty annoying when you realize that you have forgotten to save those settings, or when you have forgotten the name of the settings file that you used. People work hard to come up with optimal scanner settings for a particular document, for example, but may not be able to complete the scanning of that document in one session. If they forget to save their settings, it can be difficult to remember them or recreate them at a later time.

There is a new setting in the configuration settings dialog called "Link Documents and Settings". Its mnemonic is "I". It has three possible values: Disabled, Scanning Settings Only, and Most Settings. Its default value is disabled. When disabled, settings behave pretty much as they have in previous releases, except that special settings files, whose names are based on the names of your documents, are created whenever you close a document, and even when you change the currently active document if you changed a setting in that document.

If you change the configuration setting such that scanning settings are linked to documents, then scanning, recognition, and scanner margin settings are loaded whenever you open an existing document and scan a new page into it. This can be quite useful, in that you no longer have to remember to save or load those settings yourself. It can, however, be confusing. If you open an existing document, and then change some scanning, recognition, or scanner margin settings, and then scan a page, you'll lose those settings, as your older settings will be automatically, and silently, loaded.

It gets even more powerful, and potentially more confusing, if you link most settings to documents. In this case, voice, reading, general, display, and verbosity settings, along with scanning, recognition, and scanner margin settings, will all be automatically loaded whenever you open an existing document, and whenever you switch from one open document to another. You can have one document, for example, read with NeoSpeech Kate at 180 words per minute, while another uses Reed at 240 words per minute. All sorts of things start happening automatically. It can be fun. If you change a setting while one of those documents is open, the linked settings file will be automatically saved when you close it, or when you switch focus to another document.

Suppose, though, that you eventually find a voice that you like better than any other, and you want to update all of your settings files. If you did so without document linkage, you could just use the "Save Partial Settings" feature, indicate that you want to just save voice settings, select all of your settings files, and update them in one simple operation. If that didn't affect linked settings files, you would find, though, that your old voice settings keep coming back whenever you open an existing document. To prevent that, you'll find that the Save Partial Settings operation can also let you change all of your linked settings files.

Monday, November 13, 2006

Access to Audio Files

One odd thing about the last few releases of Kurzweil 1000 was that it could create files that it didn't know how to read. Using the Create Audio Files facility, which can be found under the File->Utilities menu, you were able to create MP3, WAV, or Audio DAISY files. This was a very useful feature for taking those files elsewhere and reading them at your leisure, but it was a little peculiar to not be able to play them within Kurzweil 1000. Now you can.

You can open MP3, WAV, WMA, or Audio DAISY files within Kurzweil 1000 and play them. Kurzweil 1000 treats them as though they were open documents, and many of the same keystrokes apply. Control+Home will take you to the start of the document, Control+End to its end. The commands that move backward, forward, or repeat the current unit work pretty much as you'd expect, except that the units are blocks of time rather than blocks of text. Specifically, if the reading unit is line or sentence, it equates to 15 seconds in an audio file. If the reading unit is by paragraph, it equates to 30 seconds. You can, of course, stop reading and restart it.
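
The mapping from reading units to time can be sketched in a few lines of Python. The names here are hypothetical, and clamping at the start and end of the file is my assumption about what happens at the edges. Only the 15 and 30 second step sizes come from the description above.

```python
# Step sizes described above; other reading units are not covered here.
STEP_SECONDS = {"line": 15, "sentence": 15, "paragraph": 30}


def move(position, duration, unit, direction):
    """Return the new playback position in seconds.

    direction is +1 to move forward one unit, -1 to move backward.
    The result is clamped to the file (an assumption for this sketch).
    """
    target = position + direction * STEP_SECONDS[unit]
    return max(0, min(duration, target))
```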

You can also change the reading speed. The default reading speed for an audio file is set, somewhat arbitrarily, at 150 words per minute. At that speed, you will be hearing the expected speed of the recording. You can speed it up with F12, or slow it down with F11, until you reach 300 words per minute, or 75 words per minute. This works reasonably well for recorded text, but it can sound more than a little strange if you are playing music.
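
One way to think about those numbers, assuming the speed-up is simply proportional (I haven't spelled out the exact relationship here, so treat this as a sketch with made-up names): 150 words per minute is normal speed, 300 is double speed, and 75 is half speed.

```python
NORMAL_WPM = 150            # plays the recording at its recorded speed
MIN_WPM, MAX_WPM = 75, 300  # limits reachable with F11 and F12


def playback_rate(wpm):
    """Return the playback rate as a multiple of the recorded speed.

    Assumes a linear relationship between the words-per-minute
    setting and the playback rate.
    """
    clamped = max(MIN_WPM, min(MAX_WPM, wpm))
    return clamped / NORMAL_WPM
```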

You can also set and use bookmarks, create and read notes, and, of course, expect Kurzweil 1000 to maintain your current reading position between sessions.

Note that if you wish to open a DAISY 2 Audio document, you should find and open the file ncc.html.

Bookmarks and Notes in All Document Types

For many releases, clients have been able to create bookmarks and notes in open documents within Kurzweil 1000. Further, they've been able to close documents and know that they will be positioned at the spot where they were last reading when they open them again. However, this only worked for documents in Kurzweil's native format - .KES, or in DAISY 3 text documents (whose extensions are .OPF). If you added notes or bookmarks to a document, and then saved it in some other format, when you reopened it you would find that those various annotations were gone. Since those documents were written in formats that are not controlled by Kurzweil Educational Systems, and since the converters for those formats (in general) were not controlled by Kurzweil Educational Systems, we couldn't preserve those features.

Now we can. Beginning with version 11, you can create and preserve bookmarks or notes, as well as your last reading position, with any kind of document, as long as that document can be opened in Kurzweil 1000.

All of this extra information is kept in a database that is maintained by K1000. That database is backed up whenever you back up your settings, and restored whenever you restore them. One consequence of this is that you can't easily share your bookmarks and notes - they aren't really a part of the file that you are annotating.

The "key" that is used to look up information in this database consists of the file name and extension, and the file size. Note that the file name does not include its full path, so you can move a file from one folder to another and Kurzweil 1000 will still be able to find bookmarks, notes, and reading position information for it. What you can't do, though, is change the file name or alter the size of the file outside of Kurzweil 1000, and expect to still retain access to this extra information.
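
Here is a minimal Python sketch of that kind of key. It is an illustration, not the actual database code: the function name and the tuple layout are hypothetical, and the paths are invented examples.

```python
import ntpath  # Windows-style path handling, available on any platform


def annotation_key(path, size_in_bytes):
    """Build a lookup key from the file name (with extension) and size.

    The folder part of the path is deliberately ignored, so moving
    the file does not change the key.
    """
    return (ntpath.basename(path), size_in_bytes)


# A tiny in-memory stand-in for the database.
annotations = {}
annotations[annotation_key(r"C:\Docs\report.pdf", 2048)] = {"bookmarks": ["Chapter 2"]}

# Same name and size in a different folder: the annotations are still found.
moved = annotations.get(annotation_key(r"D:\Archive\report.pdf", 2048))

# Renamed, or changed in size: the lookup fails.
renamed = annotations.get(annotation_key(r"C:\Docs\report-final.pdf", 2048))
resized = annotations.get(annotation_key(r"C:\Docs\report.pdf", 2049))
```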

Your reading position is saved in the database when you close a file. Note that we use the database approach for all files other than KES files - largely because we had to completely rewrite DAISY files to save the reading position, and this approach is much faster. Bookmarks and notes are saved in the database only when you save a file in a format other than KES or DAISY.

Friday, November 10, 2006

Scan and Recognize from within Microsoft Word

The requests for this feature have perplexed me for years. Kurzweil 1000 costs more than commercially available, mainstream OCR packages. What people are paying for, presumably, is the inclusion of all sorts of features that make sense mainly for the Blind, and a seamless audible interface. If you scan from within Microsoft Word, you pretty much lose all of that. What I finally came to realize is that the people who wanted this feature didn't want to use it constantly - instead, it's a convenience when you happen to already be in Word, and you happen to want to work from material that is in print. So, this time around, we added the feature.

Suppose you are in Microsoft Word, and you wish to scan a page. At the bottom of the Insert menu, you will find a menu item called "Kurzweil Scan and Recognize". Activate that menu item, and a page will be scanned and recognized. The results will flow into your Word document, beginning at your current cursor position. Kurzweil 1000 doesn't need to be open.

You can adjust certain scanning and recognition options from within Microsoft Word. To do so, activate the "Kurzweil Options..." menu item near the bottom of the Tools menu. A dialog box will come up. Its contents include a list of scanners, so that you can select the scanner source, a slider for your brightness setting, a list to select the scan type (black and white, grayscale, or color - note no dynamic thresholding here!), a list for the OCR engine, a check box to enable or disable column identification, another checkbox to enable or disable despeckling, and a list of possible recognition languages.

That's about it. It's important to note that there are good reasons to think that this is not the way you want to do all of your scanning. Important settings might be missing at the moment, but the main thing is that you can't read and edit the document at the same time that you are scanning it. Remember that the recognition results will come in starting at your cursor position. Fiddling with that position while recognition was in progress would cause a lot of trouble.

We have gotten this feature to work with Word 2002, Word 2003, and Word 2007. It does not work with Word 95 or Word 97.

Wednesday, November 08, 2006

An Appointment Calendar Application

We've been meaning to write an appointment calendar application for a long time. Early in the planning stage for each release, this would be fairly high on our priority list. Each time, though, I'd throw so many requirements at it that it would become too time consuming to do. For example, I wanted a way to include holidays, and I wanted holiday categories that included dates drawn from one or more lunar calendars. I wanted it to be able to synchronize with PDAs. I wanted it to provide multiple mechanisms for reminding people about appointments.

This time, I finally learned my lesson. Make it really simple, so there was at least half a chance that we would get it done in time for the release. Add items that our pilot customers thought were absolutely essential. Expect to add more items over time, in future releases. So, you'll find that this is a relatively simple, though useful application. It is not for those of you who have mastered the intricacies of Microsoft Outlook and synchronize everything to each of your several PDAs. Instead, it's a straightforward way to create appointments and to be reminded of them.

You can start the appointment calendar application in one of two ways. Like other K1000 Applications, you can use the File Launch facility if you'd like. Get into the File menu, select the Launch item, and then the Appointment Calendar (the mnemonic is "M"). It's easier, though, to use the hot key that is associated with the Appointment Calendar. By default, it is Control+Alt+A. You can use that without first running Kurzweil 1000 if you'd like.

Please note that the Appointment Calendar is a separate talking application. As a consequence, you are likely to need to configure your screen reader to stop it from speaking at the same time.

If this application is going to have to remind you of any appointments, it needs to be running. If you create appointments for it, it will be started in the background whenever you restart your computer.

Once you have run the application, you will find that it has a menu bar with three separate items: File, Tools, and Help. Use the Help About item to learn about the functionality of the application. You'll find it's pretty straightforward. The most complicated area is creating or editing an appointment. Use File->New to create one, File->Edit to modify one, File->Delete to (of course) delete one, and File->Duplicate to copy one. Copying an appointment is useful if you want to make several appointments, where most of the appointment properties are similar.

The dialog that you will use to create or modify an event begins with a combo box titled "Name". This can be a little misleading. You can, in fact, name an appointment here in any manner you wish, but in some ways this is more like a category, or template, for appointments. You'll find that some have already been created for you, which you can access with the up and down arrow keys. These "names" include Anniversary, Appointment, Birthday, Daily Event, Holiday, Meeting, Monthly Event, Phone Call, Reminder, Weekly Event, and Yearly Event. As these names might suggest, they influence the default settings for the rest of the dialog - particularly whether or not an event recurs, and how often it recurs.

The next control is a list box labeled "Recurrence". You can use it to decide if an event occurs only once, if it happens monthly or yearly, or if it happens daily or weekly. The next cluster of choices lets you specify the time of the event.

After that, we get to the date choices, which are influenced by whether or not the event recurs. If it does not recur, you simply specify the day, month, and year. If it recurs monthly or yearly, you will find that there is a check box for each month of the year (the labels sound a little odd, by the way, because we are using just the first three letters of each month). Each month can be checked or unchecked. If checked, you are indicating that the event can occur in that particular month. After those twelve check boxes, there is a list box that lets you cause the event to repeat on a particular day of the month, or on a particular day of the week. If, however, the recurrence field was set to daily or weekly, you'll have seven check boxes - one for each day of the week. After that, there will be a list box which lets you choose between repeating the event every week, every second week, every third week, or every fourth week.

Beyond that, there is a text box that lets you enter details about the particular event. These details will be read to you when you are reminded about the appointment.
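
To make the daily or weekly case a little more concrete, here is a small Python sketch of how seven weekday check boxes and an "every Nth week" choice could decide whether an event falls on a given date. This is only an illustration under my own assumptions (in particular, that weeks are counted from the week of the first occurrence). The function and parameter names are made up, and it is not the calendar's actual code.

```python
from datetime import date


def occurs_weekly(day, start, weekdays, every_n_weeks=1):
    """Return True if a weekly event falls on the given day.

    weekdays is a set of integers, Monday=0 through Sunday=6, standing
    in for the seven check boxes. start is the first date of the series.
    """
    if day < start or day.weekday() not in weekdays:
        return False
    # Compare the Mondays of the two weeks to count whole weeks apart.
    start_monday = start.toordinal() - start.weekday()
    day_monday = day.toordinal() - day.weekday()
    weeks_apart = (day_monday - start_monday) // 7
    return weeks_apart % every_n_weeks == 0


# Mondays and Wednesdays, every second week, starting Monday, November 6, 2006.
start = date(2006, 11, 6)
boxes = {0, 2}
```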

Finally, there is a group of controls regarding how, or if, you would like to be reminded about the appointment. First is a check box - check it if you want a reminder. Then there is a numeric text box where you can enter a number, followed by a list box that specifies what that number is for: minutes, hours, or days before the appointment.
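
The arithmetic behind the reminder is simple enough to sketch in Python. The names are hypothetical; only the three units come from the dialog described above.

```python
from datetime import datetime, timedelta

# One unit of each choice offered in the list box.
UNITS = {
    "minutes": timedelta(minutes=1),
    "hours": timedelta(hours=1),
    "days": timedelta(days=1),
}


def reminder_time(appointment, amount, unit):
    """Return when the reminder should fire for a given appointment."""
    return appointment - amount * UNITS[unit]


# A 9:00 appointment with a reminder 30 minutes before.
when = reminder_time(datetime(2006, 11, 27, 9, 0), 30, "minutes")
```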

That's about it, really. If you have an appointment, you want to keep this application running in the background. Use the escape key to dismiss it, and return to your main application. If you would rather cause it to exit altogether, use File->Exit.

When you need a reminder, the appointment calendar will bring up a dialog box, and the contents of that box, which contain your comments regarding the appointment, will be spoken. You can also specify a wave file that will be played when reminders occur.

Monday, November 06, 2006

Updated OCR Engines

We currently ship with Optical Character Recognition technology developed by two different companies: Abbyy, which makes FineReader, and Nuance, which makes OmniPage. In general, the commercial versions of OCR products from those companies are released first. Later, sometimes much later, they package those technologies as toolkits that other companies, such as ours, can license and use. Both of those companies provided a new version of OCR technology in time for Version 11 of Kurzweil 1000 - FineReader Version 8, and ScanSoft Version 15.

We've taken advantage of some new features in both products, giving us some new functionality for analyzing forms (see an earlier post) and for opening PDF files. Mostly, though, we were interested in recognition improvements. We do see some, though we haven't done a large scale analysis yet of the differences between earlier versions and this one in terms of recognition accuracy.

One important point - the latest version of ScanSoft OCR no longer supports Windows '95, Windows '98, or Windows ME. If you use any of those operating systems, we will install an older version of ScanSoft OCR - version 12.6 - rather than the current one. As a consequence, form recognition will not be available.

Both engines now provide some speed control - that is, you can ask the engine to recognize something quickly, perhaps at the expense of the accuracy of the recognition. Some of you may remember that we listed ScanSoft's recognition engine twice in our previous release - one listing for speed, the other for accuracy. Now you will find it listed only once, because a separate setting applies to both recognition engines. That setting is labeled "Recognition Approach" - the setting options are "Accuracy" or "Speed". You'll find it immediately before the Engine setting in the Recognition Settings dialog. You'll also find that we removed two settings: Recognition of Light Text on a Dark Background and Questionable Character Markup. The former is not controllable in some engines, and is now always enabled. The latter wasn't possible with one of the new engines, and seemed to be used rarely, if at all.

While I'm on the subject of recognition settings, let me mention an important one. I'll talk about it more in a post on conversion settings, but it really has to do with character recognition. In the Conversion Settings Dialog, you'll find one setting that has to do with opening PDF documents. The setting is labeled "Emphasis", and your choices are "Recognition of Images" and "Extraction of Text". PDF files are unusual in that they contain both text and images. Sometimes they have no text, but they do have images of text - this happens most often when the person who created the PDF file used a scanner to make it. Sometimes they have text for portions of a page, but not for all of it. Sometimes the text for the full page is available, but it is not clear how it should be ordered. Both OCR engines can extract the text, if it is there, and use it. If you indicate that Extraction of Text should be emphasized, they will use the images only to establish the position of the text and its reading order. If you indicate that Recognition of Images should be emphasized, the text will be used only on a word by word basis to correct simple recognition errors. Recognition of Images is the default, although, if your PDF file has all of its text, it will be slower and less accurate than the other option. Unfortunately, if your PDF file contains images of text for which there is no text that can be extracted, the other option can cause entire sections of a page to be skipped.

Although we haven't independently verified the vendor's claims, I thought you might be interested in their claims about improvements in their new OCR engines.

For Abbyy, see http://www.abbyy.com/sdk/?param=35469
For Nuance, see http://www.nuance.com/omnipage/capturesdk/whatsnew.asp

I have reproduced some of the more pertinent claims here.

Abbyy has added a "Fast Mode", performing recognition up to two times faster.

Intelligent image analysis in FineReader Engine 8.0 delivers higher recognition accuracy. FineReader technology automatically adjusts its algorithms to account for image condition, resulting in increased accuracy by up to 30% on low resolution documents (scanned at under 200 dpi or faxes).

Abbyy also claims that their PDF processing is up to two times faster, and often more accurate, as they do a better job of analyzing the internal information within source PDF files, including annotations, metadata, text objects, font dictionaries, and content streams.

Nuance suggests that its newly developed 3-way voting system provides a 36% increase in accuracy over previous versions.

Your mileage, as they say, may vary. We'd be interested in what you think once you've taken version 11 out for a spin.

Friday, November 03, 2006

Scanner Button Support

For a number of years now, most scanners have had one or more buttons on them. In general, when you install the software that comes with the scanner, those buttons are assigned to bring up applications provided by the scanner vendor. Those applications may or may not be useful to you, as they may or may not be blindness friendly.

If you'd like, you can now assign one or more of those scanner buttons to Kurzweil 1000, or to a Kurzweil 1000 application. This can make it quite a bit easier to scan a book. Since you need to be by the scanner anyway to position the book properly on the platen, it is probably easier to press a scanner button than to press a button on the computer keyboard.

If you'd like to assign a button to a Kurzweil 1000 task, the first step is to simply press the button. It may bring up an application provided by the scanner vendor. If so, dismiss that application, go into your system's Control Panel, and run the "Scanners and Cameras" applet. (Note that I'm basing these instructions on my computer, which is running Windows XP and shows the "classic view" for Control Panel - your usage may vary from these instructions.) Select your scanner, and bring up the properties dialog. One of the panes of that dialog will let you configure the scanner buttons - it might be titled Events, or Buttons. I can't explain that dialog pane in any real detail, because what it contains will depend on your operating system, your scanner, and the various software applications that you have installed. Basically, though, you want it to either bring up a Kurzweil application, or let you choose from a list of possible applications when the button is pressed. I'm sorry to be so vague - among other issues, the scanner on my desk has no buttons at all, so I don't have much direct experience with this.

Eventually, one hopes, you will get to the point where, if you press the scanner button, a Kurzweil talking application will come up. If the button has not yet been configured from the perspective of the Kurzweil 1000, you will be asked what you would like to use the button for. Your options are "Scan", "Scan and fax", or "Scan and print". The first option is for K1000 itself, the second for the FAX application, and the third for the Photocopier application. Once you have chosen the application, it will be run, and a page scanned. If the appropriate application is already running, pressing the assigned scanner button will cause that application to scan another page.

Once you have configured a scanner button, you won't be asked about that button again. If you find that you wish to change the assignment of the button, run the diagnostic ScanConf.exe. You'll find this diagnostic in the folder "Program Files\Kurzweil Educational Systems\Diags". Check the "Reconfigure Next Scanner Button" check box, and then press enter to accept the change and exit from the diagnostic.
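
The ask-once behavior can be sketched as follows in Python. This is an illustration with invented class and method names, not how the product is written. Only the three option names and the applications they lead to come from the description above.

```python
# Option names and the application each one leads to, as described above.
TARGETS = {
    "Scan": "Kurzweil 1000",
    "Scan and fax": "FAX application",
    "Scan and print": "Photocopier application",
}


class ScannerButtons:
    def __init__(self, ask_user):
        self._ask_user = ask_user   # callable returning one of the option names
        self._assignments = {}      # button id -> chosen option name
        self._reconfigure = False   # set by reconfigure_next()

    def reconfigure_next(self):
        """Make the next button press ask for an assignment again."""
        self._reconfigure = True

    def press(self, button):
        """Return the application that should scan a page for this button."""
        if self._reconfigure or button not in self._assignments:
            self._assignments[button] = self._ask_user()
            self._reconfigure = False
        return TARGETS[self._assignments[button]]
```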

If you press a scanner button while a scan is in progress, the effect of the press will be delayed until the scan is complete.

Scanner buttons will not work properly if the scanner source is held. As a consequence, you might find that you need to change the "Hold Scanner" setting in the scanner settings dialog to "Never" in order to use scanner buttons. You might find that this slightly slows down the scanner's response when you try to scan a page. In the end, you may have to decide whether that extra time is worth being able to press the scanner button itself to initiate the scanning of a page.

Please note that all this works only if you have a scanner that properly supports the Windows interface for scanner events. Most scanners do. One notable exception, unfortunately, is the Plustek OpticBook Pro. This otherwise great scanner has a number of large scanner buttons, which can't be configured for anything other than Plustek applications.

Thursday, November 02, 2006

Form Recognition

As I wrote yesterday, my plan is to go into more detail about the more important new features, one post at a time. We'll start with Form Recognition.

What's a form? From the point of view of Kurzweil 1000, a form is a paper document that asks you one or more questions. The originator of the form expects you to respond to those questions by filling out fields on the form, and then returning the form. Those fields (for this product) may include horizontal lines where one or more words are expected to be written, rectangular boxes that define a blank space for writing, groups of small boxes that are intended to guide one to write one letter per box, check boxes that are used to ask yes or no questions, and groups of check boxes. A form may fit on one page, or it may be on multiple pages.

Often you, the unsuspecting user, won't know that the piece of paper in your hands is a form. You'll scan it into Kurzweil 1000, and start reading in the usual manner. Based on the text on the page, you will determine that it is a form. If you have kept the image (keeping images can be enabled in the general settings dialog) you can use the new menu item, "Rerecognize as a Form", which you'll find in the Scan menu. If you didn't keep images, delete the page, and scan it again, using the "Scan as a Form" menu item, also in the Scan menu.

When you begin reading a page that has been recognized as a form, it will first tell you that it is a form, and that it contains some number of fields. Like any other document, you can save it, name it, read it, discard it... whatever you'd like. If you'd like to fill it out, you can use one of two approaches. The first approach is primarily for people who have no usable vision, the second for people who have some vision, or at least have someone staring at the screen over their shoulder who can help.

The first approach is available by pressing Control+F3 when you are at a form page, or by activating the new menu item called "Fill a Field..." in the Edit menu. It brings up a dialog box that contains one control per field, followed by up to four command buttons: OK, Cancel, Next, and Back. You will be positioned at the field closest to your cursor position in the page. For fields where you are expected to enter text, you will find that the control is a text box. For single check boxes, you will find that the control is a list box with two options: Checked and Unchecked. For groups of check boxes, you will find that the control is a checked list box with as many options as there were check boxes. Fill out a field, tab to move to the next, and continue. If you find that there is a Next box, use it to move on to the next group of fields. Continue until the form is filled out.
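
The correspondence between field types and dialog controls can be summarized in a short Python sketch. The class, the field kind names, and the return strings are all hypothetical. Only the three pairings themselves come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class FormField:
    label: str
    kind: str                      # "text", "checkbox", or "checkbox_group"
    options: list = field(default_factory=list)  # labels in a check box group


def control_for(form_field):
    """Describe the dialog control used to fill in a recognized field."""
    if form_field.kind == "text":
        return "text box"
    if form_field.kind == "checkbox":
        return "list box: Checked / Unchecked"
    if form_field.kind == "checkbox_group":
        return f"checked list box with {len(form_field.options)} options"
    raise ValueError(f"unknown field kind: {form_field.kind}")
```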

Simple, right?

Well, actually it can be kind of hard. We try to associate the right label with each field, but that isn't always easy. It is often a good idea to read through the entire form first, and then to read through each field in the form fill out dialogs, and finally go back to fill out the fields. Completely accurate form recognition is difficult, and knowing what to write in each field is also difficult - even for people who can see the forms. Sometimes the software identifies fields that aren't really there - an example is that some forms have cross hairs for machine processing of the form at the corners of the page, and those can look like text boxes to the form recognizer. Typically they won't have labels that make sense, and their position, as I said, will be near the corners of the page. Sometimes an important field won't be labeled at all, or will have the wrong text. Surveying the entire form first can be useful.

Also useful are the following options in the form fill in dialog. Use F7 to find out what you've typed already, and to repeat the label. Use Shift+F7 to get all sorts of information about the field, including how many characters could fit into it, and its placement on the page. Use Control+X to hear the paragraph of text that is closest to the position of the field. Press Control+X a second time to hear all of the paragraphs of text that are within an inch of the field.

The other way to fill out the field is to show the image of the form. Use Control+W, or the "Show Image..." menu item in the Tools menu. This works pretty much the way it has always worked, except that you will find that you can tab from field to field. If there is a block of text between one field and another, tabbing will get to that intervening text as well. Once in a field, you can fill it in while still looking at the image. As in previous versions, you can magnify the image considerably if you need to - press F1 while looking at the image to get the long list of shortcut keys.

Suppose a field requires your signature? With either fill in approach, you can select from a list of signatures that you have scanned into the system by pressing Control+Down Arrow. I'll talk about creating those signatures later.

OK, once you have filled out a form, you probably would want to print it. No real changes here - just use the Print menu item in the File menu list, or Control+P for short. You will want to enable the printing of images. The original scanned form will be printed, with your changes inserted into the appropriate places in the fields.
If, by the way, you save the file, the data that you entered when you filled in the fields will be saved as well.

How do you create signatures? In Kurzweil 1000, with no files open, you will find another new menu item at the bottom of the Scan menu list. It is named "Create a Signature File..." It will let you scan in a signature. To do so, first get a blank, white, sheet of paper. Use a guide if necessary, and enter your signature in the top left area of the page - not too close to the corner, and as straight as possible. Place that sheet on the scanner so that the orientation will be right side up in the resulting image - we can't automatically orient a signature. Use the Scan a Signature menu item to scan in the page. An image of your signature will appear on the screen. If it is acceptable, enter a name for it.

So, how useful is this feature? I'm looking forward to feedback once people get this release into their hands. We've certainly been asked for a form fillout capability for years. My expectations, by the way, are that it will be useful and frustrating at the same time. It is only as good as the form recognition itself (which, by the way, is the same technology as in OmniForm). That recognition, in turn, is compromised by the rather poor layout and design of many forms. Often it is good enough for you to successfully fill out the form. But not always, and it can take a while to figure out whether a particular form is truly usable or not.

One more important gotcha. Form Recognition is being provided by the ScanSoft OCR engine. This engine does not support, and is not installed on, systems that are using Windows '95, Windows '98, or Windows ME. When Version 11 is installed on a system that is running one of those operating systems, the older version of ScanSoft's OCR engine (Version 12.6, I think) is installed, rather than the newer one. For people with those operating systems, Form Recognition will not be available.

Wednesday, November 01, 2006

An Introduction to Version 11

Version 11 of Kurzweil 1000 will be announced tomorrow. It should be available towards the end of November. Although I fully expect to be answering many questions at the Kurzweil 1000 listserve (see http://www.kurzweiledu.com/support_listserv.aspx to sign up), I thought a blog might be an interesting way to provide the same information in a format that allows for a little more structure.

For now, I'll use the blog to announce the changes in this version of Kurzweil 1000, and to discuss those changes. Later, I'd be happy to entertain possibilities for future versions as well.

So, what's new in Version 11?

I'm going to provide a long list of new features now, with very little text about each one. It will be about as interesting as a shopping list. On subsequent posts, I'll go into each one in more detail.

Here goes...

Form Recognition. For the first time (I think), a Blind user has a shot at being able to take a paper form, convert it into something electronic, fill it out, and print the filled out version, all without assistance. The software will automatically identify various types of fields in a form, including check boxes, text boxes, and character boxes, give you a couple of different mechanisms to explore and fill out the form, let you insert a signature into a field if you wish, and let you print a filled out copy of the original.

Scanner Buttons. Finally, you can set up the buttons on a scanner to do useful things in Kurzweil 1000. One might bring up the product if it's not already up and scan a page, another might bring up the Fax Application, a third might bring up the Photocopy application.

Updated OCR Engines. Major new upgrades of Abbyy FineReader (now at version 8), and Nuance's ScanSoft OCR (now at version 15) are included in this version.

An Appointment Calendar. We've written a new talking application - an appointment calendar. You can invoke it from Kurzweil 1000, or run it separately with a hot key. It's relatively simple, but does, I hope, much of what one would want from this sort of application.

Scan and Recognize from within Microsoft Word. We've been asked about this for several years, so we thought we'd finally get around to it. You can select from a variety of K1000 options within Microsoft Word, scan pages, and have recognized text flow into your current Word document.

Bookmarks and Notes in All Document Types. One of the most popular features of Kurzweil 1000 has been our ability to create notes, multi-level bookmarks, and, of course, keep track of where you stopped reading. This has only been available, however, when you read documents in our native format - KES, or in DAISY 3 format. Now, if we can open the file, we can also maintain this sort of information for the file, regardless of its format.

Play Audio Files. Speaking of opening files, Kurzweil 1000 can now open and play DAISY 2 Audio files, WAV files, MP3 files, and WMA files. Given the previous item, you can also make notes at various spots in those files, and create bookmarks for them.

Link Documents and Settings. You've been scanning a document, but you are interrupted. You come back to it later, but realize you've forgotten what the scanning and recognition settings were. Not a problem with version 11. If you'd like, you can have the K1000 automatically create and load settings files that are associated with a document. In one flavor of this new setting you can have it keep track of just scanning and recognition settings. In another flavor, you can have it keep track of just about everything.

Writing Files onto a CD. You've been able to create audio DAISY documents for a few releases now, but had to use other software to actually write those documents to a CD for use in a portable player. Now you can write them directly to your CD from within the product.

Bilingual Dictionaries. Kurzweil 1000 comes with the American Heritage Dictionary, 4th Edition. You can also get the concise Oxford Dictionary for it, if you'd like. With this version, we include 12 pairs of bilingual dictionaries - useful if you want to get a short definition in English for a Spanish document, for example, or the French word for an English one.

Insert Signatures into Documents. I mentioned signatures with the forms recognition feature. You can scan signatures into the system, save any number of them, and insert them into any open document.

Conversion Settings. For people who want to influence some of the nitty-gritty details of opening and saving documents in formats other than KES, this settings dialog will be the place to go.

An altered TTS Engine. We've shipped IBM TTS for a number of years. With this release, we are switching to ETI Eloquence.

OK, that is 15 of the new features. I have 22 others on my list, but they get increasingly esoteric. I think this will be enough with which to start.