About Aspose.OCR for Python via C++

Optical Character Recognition API for Python

Aspose.OCR for Python via C++ is a library that adds OCR to Python applications with minimal code. It supports 130+ languages, including Latin, Cyrillic, Arabic, Persian, Indic, and Chinese scripts. It recognizes a variety of images, from scanned documents and smartphone photos to screenshots and scanned PDFs. Pre-processing filters handle rotated, skewed, and noisy images to improve accuracy, and GPU processing is used to improve performance.

Supported file formats

Images

  • JPEG
  • PNG
  • TIFF
  • BMP

Batch OCR

  • Multi-page PDF
  • ZIP
  • Folder

Recognition results

  • Text
  • PDF
  • Microsoft Word
  • Microsoft Excel
  • RTF
  • JSON
  • XML

Advanced Python OCR API Features

  • Photo OCR - Extract text from smartphone photos with scan-level accuracy.
  • Searchable PDF - Convert any scan into a fully searchable and indexable document.
  • URL recognition - Recognize an image from a URL without downloading it locally.
  • Bulk recognition - Read all images from multi-page documents, folders and archives.
  • Any font and style - Identify and recognize text in all popular typefaces and styles.
  • Fine-tune recognition - Adjust every OCR parameter for best recognition results.
  • Spell checker - Improve results by automatically correcting misspelled words.
  • Find text in images - Search for text or a regular expression within a set of images.
  • Compare image texts - Compare the texts on two images, regardless of case and layout.
  • Limit recognition scope - Limit the set of characters the OCR engine will look for.
  • Detect image defects - Automatically find potentially problematic areas of an image.
  • Recognize areas - Find and read only specific areas of an image, not all text.
  • 130+ Recognition Languages - Let the library detect the language automatically, or specify it explicitly for better performance.
    • Extended Latin alphabet: English, Spanish, French, Indonesian, Portuguese, German, Vietnamese, Turkish, Italian, Polish, and 80+ more.
    • Cyrillic alphabet: Russian, Ukrainian, Kazakh, Serbian, Belarusian, Bulgarian.
    • Arabic, Persian, Urdu.
    • Chinese, and Devanagari-script languages including Hindi, Marathi, Bhojpuri, and others.
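
The "Find text in images" and "Compare image texts" features listed above can be illustrated with a minimal, library-agnostic sketch. It does not call Aspose.OCR and is not the library's implementation. It assumes the text has already been recognized and is available as plain strings. The helper names (normalize, same_text, find_in_texts) and the sample data are hypothetical and exist only for this illustration.

```python
import re


def normalize(text: str) -> str:
    """Case-fold the text and collapse every whitespace run to a single space."""
    return " ".join(text.casefold().split())


def same_text(a: str, b: str) -> bool:
    """Compare two recognized texts, ignoring case and layout (line breaks, spacing)."""
    return normalize(a) == normalize(b)


def find_in_texts(texts: dict[str, str], pattern: str) -> list[str]:
    """Return the names of the images whose recognized text matches the regular expression."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [name for name, text in texts.items() if regex.search(text)]


# Hypothetical recognition results keyed by image file name (not real OCR output).
recognized = {
    "invoice_scan.png": "Invoice No. 1042\nTotal:   150.00 USD",
    "invoice_photo.jpg": "INVOICE NO. 1042 total: 150.00 usd",
    "receipt.png": "Thank you for your purchase",
}

# The scan and the photo carry the same text once case and layout are ignored.
print(same_text(recognized["invoice_scan.png"], recognized["invoice_photo.jpg"]))

# List the images that contain an invoice number.
print(find_in_texts(recognized, r"invoice\s+no\.\s*\d+"))
```

Normalizing both texts before comparison means differences in line breaks, spacing, and letter case do not count as mismatches. In a real application, the strings in the dictionary would come from the OCR engine's recognition results.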