Making image scans of monochrome text 

When scanning pages of text, particularly pages printed from microfilm, it can be difficult to get a good image suitable for putting online (scans for OCR are a different matter, of course).  Here is how I turned some A4 pages into 20 KB black-and-white images using an HP6350 scanner with HP Precision Scan, and Paint Shop Pro 6.

  1. Scan the image as true-colour, 300dpi.  The exposure setting is critical: use highlight=185, shadow=88, midtones (less important)=3.3.  If the exposure is too dark, there will be tiny dots all over the image, which will show up in the later steps; if it is too light, then when we remove the dots we will also lose thin strokes such as the horizontal bar of a lower-case 'e'.
  2. Resize the resulting image to 33% of its original size.  Save in the native .psp format, not as .jpg or another lossy format.
  3. Adjust the brightness and contrast.  Brightness=-18, contrast=40.
  4. Colours|Histogram Functions|Stretch, and then Equalize.
  5. Despeckle.
  6. Convert the image to two colours.
  7. Save as .gif (17k) or .png (13k).
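
For anyone without Paint Shop Pro, steps 4 to 6 can be approximated in code.  What follows is only a rough sketch in plain Python, working on a greyscale page held as a list of rows of 0-255 values.  The function names are made up for illustration.  The 3x3 median filter is an assumed stand-in for Despeckle, and the threshold of 128 is an assumed value, so neither is necessarily what Paint Shop Pro does internally.  The brightness/contrast and Equalize steps are left out.

```python
from statistics import median


def stretch(pixels):
    """Linear histogram stretch: darkest value -> 0, brightest -> 255."""
    lo = min(min(row) for row in pixels)
    hi = max(max(row) for row in pixels)
    if hi == lo:
        # Flat image: nothing to stretch, return a copy.
        return [row[:] for row in pixels]
    return [[round((v - lo) * 255 / (hi - lo)) for v in row] for row in pixels]


def despeckle(pixels):
    """3x3 median filter (assumed stand-in for Despeckle).

    Pixels beyond the border are taken from the nearest edge pixel.
    """
    h, w = len(pixels), len(pixels[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                pixels[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            row.append(int(median(window)))
        out.append(row)
    return out


def to_two_colours(pixels, threshold=128):
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[255 if v >= threshold else 0 for v in row] for row in pixels]


def clean_page(pixels):
    """Stretch, despeckle, then reduce to two colours."""
    return to_two_colours(despeckle(stretch(pixels)))


# A tiny "page": grey paper (200), one stray dot, and a two-pixel-thick stroke.
page = [
    [200, 200, 200, 200, 200],
    [200,  60, 200, 200, 200],   # isolated speck
    [200, 200, 200, 200, 200],
    [ 60,  60,  60,  60,  60],   # stroke
    [ 60,  60,  60,  60,  60],   # stroke
]
result = clean_page(page)
```

In this toy page the isolated dot is removed and the two-pixel stroke survives.  A stroke only one pixel thick would be wiped out by the same filter, which is the same trade-off as in step 1: removing dots also puts thin strokes at risk.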

If you can tolerate 100k images, the quality will be much better as greyscale.
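
A back-of-envelope calculation shows why greyscale costs so much more.  The sketch below assumes an A4 page of 8.27 x 11.69 inches, scanned at 300dpi and resized to 33%, and compares the uncompressed size at 1 bit per pixel (two colours) with 8 bits per pixel (greyscale).  The function name is made up for illustration, and these are raw sizes only: the saved .gif or .png will be smaller, by an amount that depends on the page content.

```python
A4_INCHES = (8.27, 11.69)  # 210 mm x 297 mm, rounded


def raw_size_bytes(dpi=300, scale=0.33, bits_per_pixel=1):
    """Uncompressed size in bytes of an A4 scan after resizing."""
    width = round(A4_INCHES[0] * dpi * scale)
    height = round(A4_INCHES[1] * dpi * scale)
    return width * height * bits_per_pixel // 8


two_colour = raw_size_bytes(bits_per_pixel=1)   # 118447 bytes
greyscale = raw_size_bytes(bits_per_pixel=8)    # 947583 bytes
```

Before compression the greyscale image is eight times the size of the two-colour one, which is roughly in line with the jump from 13-17k to 100k.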

Written 20th May 2005.

