1. Jun
    Posted April 17, 2013 at 11:58 am | Permalink

    This is amazing! After having OCR’d countless German and Dutch texts, I appreciate this so much. Will the software for recognizing and parsing out the sections be made open source at some point? I have noticed that Acrobat’s OCR technology is not as good as whatever Google Books uses, and that it has trouble with serif scripts, mixing up the t’s and r’s, and the e’s and c’s.

    • Joe Shubitowski
      Posted April 18, 2013 at 10:26 am | Permalink

      Hi Jun,
      We have actually never discussed open sourcing the parsing code, but there is really no reason why we couldn’t. That said, the code is highly specific to the texts we are parsing, so it is one of those “your mileage may vary” situations when it comes to using the code effectively out of the box.

      I’ll talk with my development team about how we might package and document the code base to make it distributable.

      Best regards,
      Joe Shubitowski
      Head, Information Systems
      Getty Research Institute


  • Tumblr

    • photo from Tumblr

      The Queen Who Wasn’t

      Louis XIV clandestinely wed his mistress, Madame de Maintenon, at Versailles on October 9 or 10, 1683. The marriage was much gossiped about but never openly acknowledged. She was never queen.

      Madame de Maintenon had been the {judgy} governess to Louis XIV’s children by his previous mistress, Madame de Montespan. Louis gave these children moneyed titles—such as the comte de Toulouse, who ordered the tapestries shown here for his residence outside Paris.

      Louis’s secret marriage ushered in a period of religious fervor, in sharp contrast to the light-hearted character of his early reign. Madame de Maintenon was known for her Catholic piety, and founded a school for the education of impoverished noble girls at Saint-Cyr in 1686 that stayed in operation until 1793. This engraving of the Virgin and Child was dedicated to her by the king.

      Virgin and Child, late 1600s, Jean-Louis Roullet after Pierre Mignard; Johann Ulrich Stapf, engraver. The Getty Research Institute. Tapestries from the Emperor of China series. The J. Paul Getty Museum

