About IronOCR for .NET

An excellent OCR (optical character recognition) library for reading text from images and PDFs in .NET applications

IronOCR for .NET enables software engineers to read text content from images and PDFs in .NET applications and websites. It reads text and barcodes from scanned images, supports multiple international languages, and outputs plain text, structured data, or searchable PDFs. Iron Software's OCR library can be used inside MVC, web, console, and desktop .NET applications. Licensing is available for commercial deployments, with support directly from the development team.

IronOCR Features

  • IronOCR reads text, barcodes, and QR codes from all major image and PDF formats using the latest Tesseract 5 engine. The library adds OCR functionality to desktop, console, and web applications in minutes.
  • IronOCR's Unique Features:
    • Pure .NET OCR API.
    • All OCR tasks run locally (no SaaS).
    • 127+ languages.
    • Barcode & QR Code reading.
    • Corrects low-quality, noisy, and distorted scans.
    • Performance tuned above and beyond any other known build of Tesseract OCR.
    • Reads PDFs.
    • Reads multi-page TIFFs.
    • Can save any OCR scan as a searchable PDF or XHTML document.
  • Data output options include: Plain Text, Barcode Data and an OCR Result class containing paragraphs, lines, words, and characters.
  • Language Support:
    • 127+ Languages including Arabic, Chinese, English, Finnish, French, German, Hebrew, Italian, Japanese, Korean, Portuguese, Russian, Spanish. Custom language packs can also be created.
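
As a rough illustration of the features listed above, the following minimal C# sketch reads one image, prints the recognized text and confidence, and saves a searchable PDF. It assumes the IronOcr NuGet package is installed and that the member names shown (IronTesseract, OcrInput.LoadImage, OcrResult.Text, OcrResult.Confidence, SaveAsSearchablePdf) match the installed release; names have changed between versions, so check the API reference for your version. The file names are hypothetical.

```csharp
using System;
using IronOcr;

class Program
{
    static void Main()
    {
        // The engine object; recognition runs locally in-process.
        var ocr = new IronTesseract();
        ocr.Language = OcrLanguage.English;

        // "scan.png" is a hypothetical input file used for illustration.
        using (var input = new OcrInput())
        {
            // Assumption: recent releases expose LoadImage;
            // older releases used AddImage for the same purpose.
            input.LoadImage("scan.png");

            OcrResult result = ocr.Read(input);

            // Plain-text output and the overall confidence score.
            Console.WriteLine(result.Text);
            Console.WriteLine($"Confidence: {result.Confidence}");

            // Export the scan as a searchable PDF (hypothetical output path).
            result.SaveAsSearchablePdf("scan-searchable.pdf");
        }
    }
}
```

The OcrInput object is disposed with a using block because it holds image data in memory for the duration of the read.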

OCR Engine

  • Underlying OCR Engine:
    • Tesseract 5 (Custom for .NET)
  • International Languages:
    • 125 Languages
    • Fast and High-Quality Options
    • Custom Languages
  • Text and Barcode Reading:
    • Text and Numbers
    • Multiple Languages at Once
    • Barcodes
  • Specialist Documents:
    • Receipts
    • Checks (Cheques)
    • Invoices
  • Concurrency:
    • Single and Multithreading
    • Async Support
    • Suspend Current Thread
    • Cancel OCR Reading
  • Computer Vision:
    • Find Text with Trained Models

OCR Input

  • Read from Many Formats:
    • Images (jpg, png, gif, tiff, bmp)
    • Multi-Page/Frame (tiff, gif)
    • System.Drawing Objects
    • Streams
    • PDFs (optimized target DPI)
  • Filters:
    • Filter Wizard (finds the best filter combination)
    • Image Correction (Sharpen, Enhance Resolution, Denoise, Dilate, Erode)
    • Fix Image Orientation (Rotate, Deskew, Scale)
    • Fix Image Colors (Binarize, Grayscale, Invert, ReplaceColor, SelectTextColor)
  • Apply a Crop Region:
    • CropRectangle

OCR Result

  • Simple Data Output:
    • .NET Text Strings
    • Barcode and QR Data
    • Images
  • Structured Data Output:
    • Pages
    • Blocks
    • Paragraphs
    • Lines
    • Words
    • Characters
  • Export Documents:
    • Searchable PDFs
    • hOCR Export
    • HTML Export
    • Page or Text as Image
    • Barcode or QR as Image
  • Highlight Text and Save:
    • Characters, Words, Lines, and Paragraphs
  • Status and Analytics:
    • OCR Progress Tracking
    • Result Confidence