Wednesday, July 13, 2011

Birth Certificate Debate Caused by OCR Software and Digital Optimization?

Why Obama and His Birth Documents are Not Authentic!

Scanned documents aren't valid evidence of anything even if they are legitimate. The entire reason there is a professional field known as forensic document examination is that a great deal can be told from examination of the original document itself. Much, much less can be told by looking at a photocopy of a document and very little, if anything at all, can be told from looking at a digital image that purports to be an image of an original document. Too much opportunity for adulteration, no opportunity to examine the paper, the ink, and any impressions made on the paper, etc. These online arguments discussing images are like people studying animals by examining imitation scat.

The documents Obama would like the public to accept would not meet the standards of "evidence" in any court of law, any administrative hearing, any congressional investigation, or any application for a license, passport, or official ID papers.

We the People will never know if Obama is a fraud unless the original documents are submitted to a panel of court-approved certified document examiners, for example Diplomates certified by the American Board of Forensic Document Examiners.

# # # ## # # # #

Birth Certificate Debate Caused by OCR Software and Digital Optimization?

Long-form Birth Certificate Image
There are a number of inconsistencies within the President’s scanned long-form birth certificate, as posted on the White House website. The presence of layers, kerning, different pixel sizes, unusual variations in color, areas with and without noise and aliasing, pixel-by-pixel reproductions of certain blocks and letters, a misspelled word, and mismatching thresholding patterns throughout the document are consistent with the scenario of scanning with an OCR option and then optimizing the text. Without a “fresh” copy of the document, one not subject to optimization or other digital manipulation, further forensic analysis will not reveal conclusive details.

Layers in the President’s Birth Certificate

Layers in the Birth Certificate
When you scan a page into your computer, and open it with editing software, you will normally see only one layer. You can then break it into layers for editing or manipulating if you want to change things around. If you scanned the page into your computer using OCR (Optical Character Recognition) settings, or an optimization program, however, it will already be in layers when you open it in your computer. Here are a few facts about scanning, OCR and optimization:
  • The OCR software compares each character on the page to an internal database, matches them when possible, and separates everything that it can match into another layer for potential editing.
  • Optimization software tries to ensure that the images are as clear as possible.
  • Most scanners are set to scan either with or without OCR as a default for each scanning job.
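The matching step described in the first bullet can be pictured with a toy example. This is only a minimal sketch, not how any particular OCR product works: the 3x3 glyph shapes, the TEMPLATES table, the match_glyph function and the min_score cutoff are all hypothetical, invented here for illustration.

```python
# Toy illustration of template matching; real OCR engines are far more complex.
# TEMPLATES, match_glyph and min_score are hypothetical names for this sketch.

TEMPLATES = {
    "I": ("010", "010", "010"),
    "L": ("100", "100", "111"),
    "T": ("111", "010", "010"),
}

def match_glyph(glyph, min_score=8):
    """Return the best-matching character, or None if no template is close enough."""
    best_char, best_score = None, -1
    for char, template in TEMPLATES.items():
        # Count the pixels on which the scanned glyph and the template agree.
        score = sum(
            g == t
            for g_row, t_row in zip(glyph, template)
            for g, t in zip(g_row, t_row)
        )
        if score > best_score:
            best_char, best_score = char, score
    return best_char if best_score >= min_score else None

clean_t = ("111", "010", "010")   # a perfect "T"
noisy_t = ("111", "010", "011")   # a "T" with one stray pixel
smudge = ("101", "010", "101")    # a shape that resembles no template

print(match_glyph(clean_t), match_glyph(noisy_t), match_glyph(smudge))
```

In this sketch, anything that matches a template would be the material moved to a separate text layer, while a shape like the smudge would stay behind in the background image.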
President Obama’s birth certificate was created on a Mac, in Quartz PDFContext, and clearly has a number of layers. It is impossible, however, to determine whether the layers are solely the result of OCR software during the scanning process coupled with optimization after the fact, or if additional layers were created at some later point.

Kerning and Document Authentication





Kerning in the Birth Certificate: Image from the "tu" in Student, Block 12a



Kerning is the process of aligning letters in text to make them fit together more evenly. Older typewriters simply place all letters an equal distance from one another, but a computer will nest letters together to ensure that the text is easy to read without taking up too much space.  The difference between text created by an old typewriter and output from a computer can identify a forgery in the eyes of a forensic document examiner.
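The difference between the two kinds of spacing can be shown with a short sketch. The widths, the single kerning pair and the function names below are assumptions made up for illustration; they are not measurements taken from the birth certificate or from any real typeface.

```python
# Hypothetical numbers, in arbitrary units, chosen only to show the contrast.
ADVANCE = {"S": 6, "t": 4, "u": 6, "d": 6, "e": 6, "n": 6}  # per-letter widths
KERN_PAIRS = {("t", "u"): -1}  # assumed adjustment: pull "u" toward "t"
TYPEWRITER_PITCH = 7           # a typewriter gives every character the same width

def typewriter_positions(text):
    """Left edge of each character when all characters are equally spaced."""
    return [i * TYPEWRITER_PITCH for i in range(len(text))]

def kerned_positions(text):
    """Left edge of each character with proportional widths plus pair kerning."""
    positions, x = [], 0
    for i, ch in enumerate(text):
        if i > 0:
            # Apply the pair adjustment, if any, before placing this character.
            x += KERN_PAIRS.get((text[i - 1], ch), 0)
        positions.append(x)
        x += ADVANCE[ch]
    return positions

print(typewriter_positions("Student"))
print(kerned_positions("Student"))
```

The typewriter positions advance by a constant step, while the kerned positions do not. That uneven spacing is the kind of pattern an examiner would look for.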
What does this have to do with the birth certificate? OCR software translates the images into text, and treats the translation as regular computerized text. During the process, it formats and arranges the letters for optimum readability, which affects kerning and the alignment of letters and words in any document. In the case of the President’s birth certificate, the use of OCR and optimization software during scanning prevents the definitive evaluation of kerning and typesetting in the online birth-certificate-long-form .pdf document.

Other Effects of OCR and Optimization Software

The use of OCR software and image optimization has a number of other effects on documents. Each of these issues, which can result from OCR or optimization processing, may have led to the appearance of tampering and manipulation, and accusations of forgery.
Pixel size: In any scanned image, pixels are all the same size. Pixels in the President’s birth certificate, however, are not. The pixels around the optimized text are a much smaller size than the background pixels.
Color Variations May be Due to OCR
Color variations: There are variations in the colors of the text, ranging from a very dark black to gray and even green. This is not a normal result for a document that is simply scanned as an image – a simple  scan would be true to the original.
Noise: In any scanned document, there are small dots called “noise” scattered throughout the document, particularly in areas of high contrast. In President Obama’s birth certificate, noise around the letters is inconsistent.
Aliasing: The term “aliasing” refers to the jagged, stair-stepped look of an edge in a digital image. An aliased image is choppy, while an anti-aliased image is artificially smoothed by the computer to produce a more pleasing line. President Obama’s long-form birth certificate contains both aliased and anti-aliased images.
Pixel-by-Pixel Twins: During the process of scanning, translation and optimization, the software searches for ways to create a document with the best possible appearance with the least required resources. One method of reducing effort is to duplicate similar characters from the first character identified, rather than re-forming each subsequent character from scratch. This results in a document, such as this one, with bits that are identical on a pixel-by-pixel basis.
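The duplication idea described under Pixel-by-Pixel Twins can be sketched as follows. This is an illustration under stated assumptions, not the actual algorithm of whatever software processed the document: the deduplicate and hamming functions, the max_diff tolerance and the 3x3 glyphs are all hypothetical.

```python
# Toy sketch of reusing one stored glyph for all near-identical glyphs.
# deduplicate, hamming and max_diff are hypothetical names for illustration.

def hamming(a, b):
    """Number of pixels on which two equally sized glyphs differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def deduplicate(glyphs, max_diff=1):
    """Replace each glyph with the first stored glyph that is close enough."""
    dictionary = []  # representative glyphs seen so far
    output = []
    for glyph in glyphs:
        for rep in dictionary:
            if hamming(glyph, rep) <= max_diff:
                output.append(rep)  # reuse the stored copy, pixel for pixel
                break
        else:
            # No stored glyph was close enough, so keep this one as new.
            dictionary.append(glyph)
            output.append(glyph)
    return output, len(dictionary)

scanned = [
    ("111", "010", "010"),  # first "T"
    ("111", "010", "011"),  # second "T", one pixel different in the scan
    ("100", "100", "111"),  # an "L"
]
output, stored = deduplicate(scanned)
print(output[0] == output[1], stored)
```

The two letters differed by one pixel in the scan, but after processing they are exact twins, because the second is simply a copy of the first.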



TXE: Misspelling likely due to poor translation by OCR

TXE and OCR Software: Optical character recognition software is not, by any means, perfect. When a document is translated using OCR, misspellings and other odd errors are the rule, not the exception. In a highly complex document, such as the President’s birth certificate, there would naturally be spelling or formatting errors. A single error of this nature is an unusually good result, but within the realm of possibility. In this case, the software would have been unable to fully match the “H” in “THE” and substituted an “X” instead.
Thresholding Patterns: During the optimization process, high-contrast areas are clarified with thresholding. Each pixel is evaluated and assigned a value based on a threshold set up by the program. Some areas of the President’s birth certificate have been optimized in this manner, while others have not.
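Thresholding as described here can be sketched in a few lines. The cutoff of 128, the function name and the sample pixel values are assumptions chosen for illustration; the settings used on the actual document are not known.

```python
# Minimal sketch of global thresholding on a grayscale image.
# The cutoff value and the sample data are assumed, not taken from the document.

def threshold(gray_rows, cutoff=128):
    """Map each 0-255 gray value to pure black (0) or pure white (255)."""
    return [[0 if pixel < cutoff else 255 for pixel in row] for row in gray_rows]

sample = [
    [12, 90, 200],
    [130, 127, 255],
]
print(threshold(sample))
```

An area processed this way contains only pure black and pure white, while an untouched area keeps its intermediate grays, which is why the two kinds of region look different side by side.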

President Barack Obama’s Birth Certificate: Conclusion

The changes made to the original document by OCR software and image optimization have rendered it impossible to determine whether these inconsistencies are due to manual tampering, or are simply the result of the optimization and scanning process.

About the author

Victoria Nicks
Victoria Nicks holds a Master of Science and Bachelor of Science in Information Technology. She has hands-on experience with a wide variety of computer systems and software platforms, and writes primarily on the topic of artificial intelligence.


1 comment:

  1. The computer replaced an H with a poor X? An H is nothing like an X. In any case, the photo of the scan of the document done by the press has the same X in TXE!!!!!!!!!!!!

    Also are we not told by a number of experts that the document was NOT OCRed as it is not 'searchable' and says it was not OCRed in the metadata files!

    Surely the FBI should just now take a look with this kerfuffle.
