
Semantic Enhancement Engine: A Modular Document Enhancement Platform for Semantic Applications over Heterogeneous Content
To appear in Real World Semantic Web Applications, V. Kashyap and L. Shklar, Eds., IOS Press, 2002.

Brian Hammond, Amit Sheth*, Krzysztof Kochut*
Semagix, Inc.
*Also, LSDIS Lab, Computer Science, University of Georgia

Abstract. Traditionally, automatic classification and metadata extraction have been performed in isolation, usually on unformatted text. SCORE Enhancement Engine (SEE) is a component of a Semantic Web technology called the Semantic Content Organization and Retrieval Engine (SCORE). SEE takes the next natural steps by supporting heterogeneous content (not only unformatted text) and by following up automatic classification with extraction of contextually relevant, domain-specific (i.e., semantic) metadata. Extraction of semantic metadata includes identifying not only relevant entities but also relationships within the context of the relevant ontology.

This paper describes SEE's architecture, which provides a common API for heterogeneous document processing, with discrete, reusable and highly configurable modular components. This results in exceptional flexibility, extensibility and performance. The processors, referred to as SEE modules (SEEMs), are divided along functional lines, and each performs one of the following roles: restriction (determine the segments of the input text to operate upon); enhancement (discover textual features of semantic interest); filtering (augment, remove or supplement the features recognized); or outputting (generate reports, annotate the original, update databases, or perform other actions). Each SEEM manages its own configuration options, and SEEMs are arranged serially in virtual pipelines to perform designated semantic tasks. These configurations can be saved and reloaded on a per-document basis. This allows a single SEE installation to act logically as any number of Semantic Applications, and to compose these Semantic Applications as needed to perform even more complex semantic tasks.

SEE leverages SCORE's unique approach of creating and using a large knowledge base in semantic processing. It enables SCORE to provide flexible handling of highly heterogeneous content (including raw text, HTML, XML and documents of various formats); reliable automatic classification of documents; accurate extraction of semantic, domain-specific metadata; and extensive management of the enhancement processes, including various reporting and semantic annotation mechanisms. By supporting and exploiting domain-specific ontologies, SCORE can integrate heterogeneous content at a higher, semantic level, in contrast to approaches based on XML and RDF that work at the syntactic and structural levels. This work also presents an approach to automatic semantic annotation, a key scalability challenge faced in realizing the Semantic Web.

Keywords: semantic applications, document enhancement platform, semantic metadata, automated metadata extraction, automatic classification, semantic content integration, semantic annotation
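The pipeline arrangement described in the abstract can be illustrated with a minimal sketch. It is not SEE's actual API: every name here (`Document`, `restrict_to_lines`, `make_enhancer`, `make_filter`, `output_report`, `run_pipeline`, `PIPELINES`) is hypothetical, and simple substring matching stands in for real semantic enhancement. It only shows the idea of four module roles run serially, with a named configuration chosen per document.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Document:
    """Hypothetical container passed from module to module."""
    text: str
    segments: List[str] = field(default_factory=list)
    features: List[str] = field(default_factory=list)
    report: str = ""


# A module is any callable that takes a document and returns it.
Module = Callable[[Document], Document]


def restrict_to_lines(doc: Document) -> Document:
    # Restriction role: pick the segments to operate on (here, non-empty lines).
    doc.segments = [ln.strip() for ln in doc.text.splitlines() if ln.strip()]
    return doc


def make_enhancer(known_entities: List[str]) -> Module:
    # Enhancement role: record known entities found in the chosen segments.
    def enhance(doc: Document) -> Document:
        for segment in doc.segments:
            for entity in known_entities:
                if entity in segment and entity not in doc.features:
                    doc.features.append(entity)
        return doc
    return enhance


def make_filter(blocked: Set[str]) -> Module:
    # Filtering role: drop recognized features that this configuration excludes.
    def keep_allowed(doc: Document) -> Document:
        doc.features = [f for f in doc.features if f not in blocked]
        return doc
    return keep_allowed


def output_report(doc: Document) -> Document:
    # Outputting role: produce a simple comma-separated report.
    doc.report = ", ".join(doc.features)
    return doc


def run_pipeline(modules: List[Module], doc: Document) -> Document:
    # Modules run serially, each receiving the previous module's result.
    for module in modules:
        doc = module(doc)
    return doc


# Named configurations (assumed); one can be selected for each document.
PIPELINES: Dict[str, List[Module]] = {
    "publishers": [
        restrict_to_lines,
        make_enhancer(["Semagix", "IOS Press"]),
        make_filter({"Semagix"}),
        output_report,
    ],
    "vendors": [
        restrict_to_lines,
        make_enhancer(["Semagix", "IOS Press"]),
        make_filter({"IOS Press"}),
        output_report,
    ],
}
```

Calling `run_pipeline(PIPELINES["vendors"], Document(text))` and `run_pipeline(PIPELINES["publishers"], Document(text))` on the same text applies two different configurations with one set of modules, which is the sense in which a single installation can act as several applications.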
1. Introduction

Today, many organizations have access to a broad variety of information resources, including collections of internal documents, data repositories, and external public and subscription-based content. Collectively referred to as enterprise content, these information resources occur in a wide variety of formats and digital media. To truly take advantage of enterprise content, the next generation of enterprise content management systems requires novel, advanced capabilities. Making information resources searchable is a step in the right direction, but to get the most out of these resources, users must be able to access information through mechanisms that blend seamlessly into the workflow and applications

Download Semantic Enhancement Engine: A Modular Document En pdf from lsdis.cs.uga.edu, 22 pages, 515.77KB.