Abstract
The U.S. Patent and Trademark Office (PTO) applies a rigorous Quality Assurance (QA) program to its text and image databases, which form part of the Automated Patent System (APS). QA efforts are applied from the time data is captured, whether by conventional data entry, optical character recognition, or digitization of paper files, to the point at which the data is loaded onto magnetic storage devices or optical media for on-line use. The volume of data inspected via QA will eventually exceed thirty terabytes, consisting of issued U.S. and foreign patents in both text and image formats.
