by DBI Technologies - Product Type: Component / Managed/Unmanaged Code - without COM / DLL
Extractor by DBI Technologies
Powerful text summarization engine. Extractor consumes documents (text, HTML, email) and, using a patented genetic extraction algorithm (GenEx), analyzes the recurrence of words and phrases, their proximity to one another, and the uniqueness of the words to a particular document. The engine returns a list of key words and phrases found in the document, together with their relative ranking (how many times the word or phrase was found in the document) and contextual links back to the position of each key word or phrase in the document itself.
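To make the shape of that output concrete, here is a minimal Python sketch. It is not GenEx and does not call the Extractor API: it is a plain frequency count, and the function name `naive_key_terms` and its parameters are hypothetical, chosen only for illustration. It shows the kind of result described above: each term, how many times it occurs, and the character offsets that link back into the document.

```python
import re
from collections import defaultdict


def naive_key_terms(text, stop_words=frozenset(), top_n=5):
    """Rank words by occurrence count and record where each one occurs.

    Illustrative stand-in only: a simple frequency count, NOT the
    patented GenEx algorithm, and not part of the Extractor API.
    """
    positions = defaultdict(list)
    for match in re.finditer(r"[A-Za-z']+", text):
        word = match.group().lower()
        if word not in stop_words:
            # The character offset acts as the "contextual link" into the text.
            positions[word].append(match.start())
    # Most frequent first; ties broken alphabetically for a stable order.
    ranked = sorted(positions.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(word, len(offsets), offsets) for word, offsets in ranked[:top_n]]


if __name__ == "__main__":
    sample = "Cats chase mice. Mice fear cats. Cats sleep."
    for term, count, offsets in naive_key_terms(sample, top_n=2):
        print(term, count, offsets)
```

A real engine would also weigh phrase proximity and how distinctive a term is for the document, which this sketch ignores.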
Create relevancy from disparate sources of information. Integrate the Extractor Engine into your software applications to summarize documents into lists of keywords and key phrases, with contextual links back to the originating document(s).
What is text summarization?
By definition, text summarization is: to comprise in, or reduce to, a summary; to present briefly; quickly executed.
In terms of computer-automated text summarization, there are many definitions and implementations, including Bayesian, heuristic and linguistic approaches. Extractor uses a genetic approach, which in itself provides a learning process. This is important because it lets the summarization utility move from one domain to another, whereas other approaches are traditionally domain specific and therefore require greater human intervention to adjust from one domain to another.
The Extractor APIs have been designed for maximum flexibility, allowing a wide variety of applications to take advantage of this unparalleled technology. Supported development languages include:
C (C, C++, VC++)
Java
Visual Basic
Python
Perl
There are 26 primary API function calls that provide the development team with full control of the Extractor DLL and presentation of the extracted results.
Extractor supports Windows, Solaris and Linux computing platforms. Other computing platforms such as HP-UX, AIX or Mac OS can be custom compiled. (Once the computing platform is confirmed and the custom compilation is engaged, the process can take from one to two weeks for final testing and release.)
Multiple Threads with the Extractor API: The API for Extractor allows several documents to be processed simultaneously, using a separate thread for each document. This is useful, for example, when processing web pages. A major bottleneck when downloading web pages is waiting for web servers to respond to requests for pages. One way around this bottleneck is to download several pages simultaneously, using a separate thread to process each page.
Extractor is fully reentrant, to allow multithreading without the use of Win32 services such as semaphores and the EnterCriticalSection and LeaveCriticalSection functions. There should be a one-to-one relationship between threads and DocumentMemory values, so only one thread reads or writes to a given DocumentMemory. On the other hand, there may be a many-to-one relationship between threads and StopMemory values. That is, many threads may simultaneously read one StopMemory.
Most functions that take StopMemory as an argument only read StopMemory; they do not write. This is why many threads can safely access the same StopMemory. However, the functions ExtrAddStopWord and ExtrAddStopPhrase write StopMemory. These two functions should be called (one after the other; not at the same time) before any other threads access StopMemory. If one thread calls ExtrAddStopWord or ExtrAddStopPhrase with a given value of StopMemory while a second thread calls any function with the same value of StopMemory, the memory may become corrupted.
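The threading rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Extractor API: the `StopMemory` and `DocumentMemory` classes, the `add_stop_word` and `freeze` methods, and the `process_document` and `run` functions are hypothetical stand-ins, since the real function signatures are not given here. The sketch shows the pattern only: all stop-word writes happen sequentially before any worker thread starts, the stop-word store is then shared read-only, and each thread owns its own per-document state.

```python
import threading


class StopMemory:
    """Hypothetical stand-in for the shared stop-word store."""

    def __init__(self):
        self._words = set()
        self._frozen = False

    def add_stop_word(self, word):
        # Plays the role of ExtrAddStopWord: writes are setup-only.
        if self._frozen:
            raise RuntimeError("StopMemory is read-only once workers start")
        self._words.add(word.lower())

    def freeze(self):
        # After this call the store is only ever read.
        self._frozen = True

    def is_stop_word(self, word):
        return word.lower() in self._words


class DocumentMemory:
    """Hypothetical per-document state, owned by exactly one thread."""

    def __init__(self):
        self.counts = {}


def process_document(text, stop_memory, results, index):
    doc_memory = DocumentMemory()  # created inside the thread, never shared
    for word in text.lower().split():
        if not stop_memory.is_stop_word(word):  # concurrent reads are safe
            doc_memory.counts[word] = doc_memory.counts.get(word, 0) + 1
    results[index] = doc_memory.counts  # each thread writes its own slot


def run(documents, stop_words):
    stop_memory = StopMemory()
    for word in stop_words:  # one after the other, before any thread exists
        stop_memory.add_stop_word(word)
    stop_memory.freeze()

    results = [None] * len(documents)
    threads = [
        threading.Thread(target=process_document,
                         args=(doc, stop_memory, results, i))
        for i, doc in enumerate(documents)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(run(["the cat sat", "the dog and the cat"], ["the", "and"]))
```

The `freeze` guard is a design choice of this sketch: it turns an accidental late write into an immediate error rather than silent memory corruption.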
Applications of Text Summarization concepts: Text summarization is used in many applications, most notably for:
Content review - defining document suitability.
Pre-sorting document summaries for cataloging.
Creating document Indexes.
Providing interactive query refinement.
Defining document trends - performing document trend analysis.
Assisting in web page content analysis and determining web page content accuracy.
Enhancing Document Management systems.
Version History: The Extractor technology started as a machine learning and artificial intelligence research project at the National Research Council of Canada in the mid-1990s. In January 1997, the initial result of that R&D effort was the release of the first version of Extractor. To this day, research and development is ongoing through the exceptional efforts of Dr. Peter Turney at the Interactive Information Technology Group at the National Research Council of Canada and DBI Technologies Inc. For full product version history please see Extractor7History.htm.
Credits: Extractor is provided under a worldwide distribution license to DBI Technologies Inc. from the National Research Council of Canada. Extractor is a patented technology held by the National Research Council of Canada. All copyrights and intellectual property are under the sole ownership of the National Research Council of Canada.
PartNumbers: PC-514931-148243 514931-148243 PC-514931-148244 514931-148244 PC-514931-148245 514931-148245
Publisher PartNumbers: EXT072 EXT072W
PurchaseOptions: Extractor V7.2 1 SDK License (See Licensing section for details) , Extractor V7.2 1 WebServer Run-time License , Extractor V7.2 Source Code Escrow Annual Subscription - (An Escrow Agreement will be sent to you for signature - please read Licensing section below)
Resources: Browse the Extractor API Documentation Web page, Read the Extractor SDK General License Agreement, Download the Extractor V7.2 Windows VB evaluation onto your computer - Displays Nag Screen, Download the Extractor V7.2 Windows C evaluation onto your computer - Displays Nag Screen
Operating System for Deployment: Windows XP, Windows Server 2003, Windows ME, Windows 2000, Windows 98, Windows NT 4.0, Windows 95, Windows NT 3.51, Windows 3.X, Sun Solaris 9, Sun Solaris 8, HP-UX 10.x, IBM AIX 5.x, Linux Kernel V2.4.x, RedHat Linux 7.x, SUSE Linux 8.x
Architecture of Product: 32Bit
Product Type: Component
Component Type: Managed/Unmanaged Code - without COM, DLL
Web Services: Supports SOAP 1.2, Supports SOAP 1.1, Supports SOAP 1.0, SOAP Binding Style rpc, SOAP Binding Style document
General: Supports Component Categories, Supports Apartment Model Threading, Microsoft Transaction Server Compatible (MTS)
Application Servers: Adobe JRun 4.0, Oracle WebLogic Server 8.1 (formerly BEA), Oracle WebLogic Server 7.0 (formerly BEA), Oracle WebLogic Server 6.1 with J2EE 1.3 Features(formerly BEA), Oracle WebLogic Server 6.1 (formerly BEA), IBM WebSphere (TM) Application Server 4.0, IBM WebSphere (TM) Application Server 5.0, Iona iPortal Application Server, JBoss (TM) 3.0.x, Oracle Application Server 9i, Sun ONE Application Server 6.5, Sun ONE Application Server 7.0, Sybase Enterprise Application Server 3.5, Borland Enterprise Server
Compatible Containers: Microsoft Visual Studio .NET 2003, Microsoft Visual Studio .NET, Microsoft Visual Studio 6.0, Microsoft Visual Studio 97, Microsoft Visual Basic .NET 2003, Microsoft Visual Basic .NET, Microsoft Visual Basic 6.0, Microsoft Visual Basic 5.0, Microsoft Visual C++ .NET 2003, Microsoft Visual C++ .NET, Microsoft Visual C++ 6.0, Microsoft Visual C++ 5.0, Microsoft Visual C# .NET 2003, Microsoft Visual C# .NET, Microsoft Visual J++ 6.0, Microsoft Visual J++ 1.1, Microsoft Visual InterDev 6.0, Microsoft Visual FoxPro 6.0, Microsoft Office XP, Microsoft Office 2000, Microsoft Office 97, Microsoft Access 2003, Microsoft Access 2002, Microsoft SQL Server 2005, Microsoft SQL Server 2000, Microsoft SQL Server 7.0, Microsoft Outlook 2002, Microsoft Outlook 2000, Microsoft Exchange Server 5.5, Microsoft Exchange Server 5.0, BizTalk Server, Microsoft Internet Information Server 5.0, Microsoft FrontPage, Microsoft Internet Explorer 6.0, Microsoft Internet Explorer 5.5, Microsoft Internet Explorer 5.0, CodeGear C++ (formerly Borland), C++Builder 6, C++Builder 5, Delphi 8.0, Delphi 7.0, JBuilder 9, JBuilder 8, JBuilder 7, Kylix 3.0, IBM VisualAge for Java 4, Sybase PowerBuilder 9.0, Sybase SQL Anywhere 6.0, Oracle 9i JDeveloper, Visual Café 4.0, Sun ONE Studio 4 (Formerly FORTE for Java), Sun ONE Studio 5 (Formerly FORTE Compiler Collection), .NET Framework 1.1, .NET Framework 1.0, WebLogic Workshop, WebLogic Portal, Tuxedo
Search Items: New Product June 04
Keywords: Search searching searches DBI Technologies Artificial Intelligence AI A.I. Professional Partner Text Extraction, Text Summarization, keyword, key phrase, taxonomy