-
A Conflict-based Operator for Mapping Revision
,
Guilin Qi, Qiu Ji and Peter Haase
,
521-536
,
[OpenAccess]
,
[Publisher]
Ontology matching is one of the key research topics in the field of the Semantic Web. There are many matching systems that generate mappings between different ontologies either automatically or semi-automatically. However, the mappings generated by these systems may be inconsistent with the ontologies. Several approaches have been proposed to deal with the inconsistencies between mappings and ontologies. This problem is often called a mapping revision problem, as the ontologies are assumed to be correct, whereas the mappings are repaired when resolving the inconsistencies. In this paper, we first propose a conflict-based mapping revision operator and show that it can be characterized by two logical postulates adapted from some existing postulates for belief base revision. We then provide an algorithm for iterative mapping revision by using an ontology revision operator and show that this algorithm defines a conflict-based mapping revision operator. Three concrete ontology revision operators are given to instantiate the iterative algorithm, which result in three different mapping revision algorithms. We implement these algorithms and provide some preliminary but interesting evaluation results.
-
A Decomposition-based Approach to Optimizing Conjunctive Query Answering in OWL DL
,
Jianfeng Du, Guilin Qi, Jeff Z. Pan and Yi-Dong Shen
,
146-162
,
[OpenAccess]
,
[Publisher]
Scalable query answering over Description Logic (DL) based ontologies plays an important role for the success of the Semantic Web. Towards tackling the scalability problem, we propose a decomposition-based approach to optimizing existing OWL DL reasoners in evaluating conjunctive queries in OWL DL ontologies. The main idea is to decompose a given OWL DL ontology into a set of target ontologies without duplicated ABox axioms so that the evaluation of a given conjunctive query can be separately performed in every target ontology by applying existing OWL DL reasoners. This approach guarantees sound and complete results for the category of conjunctive queries that the applied OWL DL reasoner correctly evaluates. Experimental results on large benchmark ontologies and benchmark queries show that the proposed approach can significantly improve scalability and efficiency in evaluating general conjunctive queries.
-
A Generic Approach for Large-Scale Ontological Reasoning in the Presence of Access Restrictions to the Ontology's Axioms
,
Franz Baader, Martin Knechtel and Rafael Peñaloza
,
49-64
,
[OpenAccess]
,
[Publisher]
The framework developed in this paper can deal with scenarios where selected sub-ontologies of a large ontology are offered as views to users, based on criteria like the user's access right, the trust level required by the application, or the level of detail requested by the user. Instead of materializing a large number of different sub-ontologies, we propose to keep just one ontology, but equip each axiom with a label from an appropriate labeling lattice. The access right, required trust level, etc. is then also represented by a label (called user label) from this lattice, and the corresponding sub-ontology is determined by comparing this label with the axiom labels. For large-scale ontologies, certain consequences (like the concept hierarchy) are often precomputed. Instead of precomputing these consequences for every possible sub-ontology, our approach computes just one label for each consequence such that a comparison of the user label with the consequence label determines whether the consequence follows from the corresponding sub-ontology or not. In this paper we determine under which restrictions on the user and axiom labels such consequence labels (called boundaries) always exist, describe different black-box approaches for computing boundaries, and present first experimental results that compare the efficiency of these approaches on large real-world ontologies. Black-box means that, rather than requiring modifications of existing reasoning procedures, these approaches can use such procedures directly as sub-procedures, which allows us to employ existing highly-optimized reasoners.
-
A Practical Approach for Scalable Conjunctive Query Answering on Acyclic EL+ Knowledge Base
,
Jing Mei, Shengping Liu, Guo Tong Xie, Aditya Kalyanpur, Achille Fokoue, Yuan Ni, Hanyu Li and Yue Pan
,
408-423
,
[OpenAccess]
,
[Publisher]
Conjunctive query answering for EL++ ontologies has recently drawn much attention, as the Description Logic EL++ captures the expressivity of many large ontologies in the biomedical domain and is the foundation for the OWL 2 EL profile. In this paper, we propose a practical approach for conjunctive query answering in a fragment of EL++, namely acyclic EL+, that supports role inclusions. This approach can be implemented with low cost by leveraging any existing relational database management system to do the ABox data completion and query answering. We conducted a preliminary experiment to evaluate our approach using a large clinical data set and show our approach is practical.
-
A Weighted Approach to Partial Matching for Mobile Reasoning
,
Luke Albert Steller, Shonali Krishnaswamy and Mohamed Medhat Gaber
,
618-633
,
[OpenAccess]
,
[Publisher]
Due to significant improvements in the capabilities of small devices such as PDAs and smart phones, these devices can not only consume but also provide Web Services. The dynamic nature of mobile environments means that users need accurate and fast approaches for service discovery. In order to achieve high accuracy, semantic languages can be used in conjunction with logic reasoners. Since powerful broker nodes are not always available (due to lack of long-range connectivity), create a bottleneck (since mobile devices are all trying to access the same server) and constitute a single point of failure (in the case that a central server fails), on-board mobile reasoning must be supported. However, reasoners are notoriously resource intensive and do not scale to small devices. Therefore, in this paper we provide an efficient mobile reasoner which relaxes the current strict and complete matching approaches to support anytime reasoning. Our approach matches the most important request conditions (as deemed by the user) first and provides a degree of match and confidence result to the user. We provide a prototype implementation and performance evaluation of our work.
-
Actively Learning Ontology Matching via User Interaction
,
Feng Shi, Juanzi Li, Jie Tang, Guo Tong Xie and Hanyu Li
,
585-600
,
[OpenAccess]
,
[Publisher]
Ontology matching plays a key role for semantic interoperability. Many methods have been proposed for automatically finding the alignment between heterogeneous ontologies. However, in many real-world applications, finding the alignment in a completely automatic way is highly infeasible. Ideally, an ontology matching system would have an interactive interface to allow users to provide feedback to guide the automatic algorithm. Fundamentally, we need to answer the following questions: how can a system perform an efficient interactive process with the user? How many interactions are sufficient for finding a more accurate matching? To address these questions, we propose an active learning framework for ontology matching, which tries to find the most informative candidate matches to query the user. The user's feedback is used to: 1) correct mistaken matches and 2) propagate the supervision information to help the entire matching process. Three measures are proposed to estimate the confidence of each matching candidate. A correct propagation algorithm is further proposed to maximize the spread of the user's guidance. Experimental results on several public data sets show that the proposed approach can significantly improve the matching accuracy (+8.0% better than the baseline methods).
-
Analysis of a Real Online Social Network using Semantic Web Frameworks
,
Guillaume Erétéo, Michel Buffa, Fabien Gandon and Olivier Corby
,
180-195
,
[OpenAccess]
,
[Publisher]
Social Network Analysis (SNA) provides graph algorithms to characterize the structure of social networks, strategic positions in these networks, specific sub-networks and decompositions of people and activities. Online social platforms like Facebook form huge social networks, enabling people to connect, interact and share their online activities across several social applications. We extended SNA operators using semantic web frameworks to include the semantics of these graph-based representations when analyzing such social networks and to deal with the diversity of their relations and interactions. We present here the results of this approach when it was used to analyze a real social network with 60,000 users connecting, interacting and sharing content.
-
Automatically Constructing Semantic Web Services from Online Sources
,
José Luis Ambite, Sirish Darbha, Aman Goel, Craig A. Knoblock, Kristina Lerman, Rahul Parundekar and Thomas A. Russ
,
17-32
,
[OpenAccess]
,
[Publisher]
The work on integrating sources and services in the Semantic Web assumes that the data is either already represented in RDF or OWL or is available through a Semantic Web Service. In practice, there is a tremendous amount of data on the Web that is not available through the Semantic Web. In this paper we present an approach to automatically discover and create new Semantic Web Services. The idea behind this approach is to start with a set of known sources and the corresponding semantic descriptions and then discover similar sources, extract the source data, build semantic descriptions of the sources, and then turn them into Semantic Web Services. We implemented an end-to-end solution to this problem in a system called Deimos and evaluated the system across five different domains. The results demonstrate that the system can automatically discover, learn semantic descriptions, and build Semantic Web Services with only example sources and their descriptions as input.
-
Coloring RDF Triples to Capture Provenance
,
Giorgos Flouris, Irini Fundulaki, Panagiotis Pediaditis, Yannis Theoharis and Vassilis Christophides
,
196-212
,
[OpenAccess]
,
[Publisher]
Recently, the W3C Linking Open Data effort has boosted the publication and inter-linkage of large amounts of RDF datasets on the Semantic Web. Various ontologies and knowledge bases with millions of RDF triples from Wikipedia and other sources, mostly in e-science, have been created and are publicly available. Recording provenance information of RDF triples aggregated from different heterogeneous sources is crucial in order to effectively support trust mechanisms, digital rights and privacy policies. Managing provenance becomes even more important when we consider not only explicitly stated but also implicit triples (through RDFS inference rules) in conjunction with declarative languages for querying and updating RDF graphs. In this paper we rely on colored RDF triples represented as quadruples to capture and manipulate explicit provenance information.
-
Concept and Role Forgetting in ALC Ontologies
,
Kewen Wang, Zhe Wang, Rodney W. Topor, Jeff Z. Pan and Grigoris Antoniou
,
666-681
,
[OpenAccess]
,
[Publisher]
Forgetting is an important tool for reducing ontologies by eliminating some concepts and roles while preserving sound and complete reasoning. Attempts have previously been made to address the problem of forgetting in relatively simple description logics (DLs) such as DL-Lite and extended EL. The ontologies used in these attempts were mostly restricted to TBoxes rather than general knowledge bases (KBs). However, the issue of forgetting for general KBs in more expressive description logics, such as ALC and OWL DL, is largely unexplored. In particular, the problem of characterizing and computing forgetting for such logics is still open. In this paper, we first define semantic forgetting about concepts and roles in ALC ontologies and state several important properties of forgetting in this setting. We then define the result of forgetting for concept descriptions in ALC, state the properties of forgetting for concept descriptions, and present algorithms for computing the result of forgetting for concept descriptions. Unlike the case of DL-Lite, the result of forgetting for an ALC ontology does not exist in general, even for the special case of concept forgetting. This makes the problem of how to compute forgetting in ALC more challenging. We address this problem by defining a series of approximations to the result of forgetting for ALC ontologies and studying their properties and their application to reasoning tasks. We use the algorithms for computing forgetting for concept descriptions to compute these approximations. Our algorithms for computing approximations can be embedded into an ontology editor to enhance its ability to manage and reason in (large) ontologies.
-
DOGMA: A Disk-Oriented Graph Matching Algorithm for RDF Databases
,
Matthias Bröcheler, Andrea Pugliese and V. S. Subrahmanian
,
97-113
,
[OpenAccess]
,
[Publisher]
RDF is an increasingly important paradigm for the representation of information on the Web. As RDF databases increase in size to approach tens of millions of triples, and as sophisticated graph matching queries expressible in languages like SPARQL become increasingly important, scalability becomes an issue. To date, there is no graph-based indexing method for RDF data where the index was designed in a way that makes it disk-resident. There is therefore a growing need for indexes that can operate efficiently when the index itself resides on disk. In this paper, we first propose the DOGMA index for fast subgraph matching on disk and then develop a basic algorithm to answer queries over this index. This algorithm is then significantly sped up via an optimized algorithm that uses efficient (but correct) pruning strategies when combined with two different extensions of the index. We have implemented a preliminary system and tested it against four existing RDF database systems developed by others. Our experiments show that our algorithm performs very well compared to these systems, with orders of magnitude improvements for complex graph queries.
-
Decidable Order-Sorted Logic Programming for Ontologies and Rules with Argument Restructuring
,
Ken Kaneiwa and Philip H. P. Nguyen
,
328-343
,
[OpenAccess]
,
[Publisher]
This paper presents a decidable fragment for combining ontologies and rules in order-sorted logic programming. We describe order-sorted logic programming with sort, predicate, and meta-predicate hierarchies for deriving predicate and meta-predicate assertions. Meta-level predicates (predicates of predicates) are useful for representing relationships between predicate formulas, and further, they conceptually yield a hierarchy similar to the hierarchies of sorts and predicates. By extending the order-sorted Horn-clause calculus, we develop a query-answering system that can answer queries such as atoms and meta-atoms generalized by containing predicate variables. We show that the expressive query-answering system computes every generalized query in single exponential time, i.e., the complexity of our query system is equal to that of DATALOG.
-
Discovering and Maintaining Links on the Web of Data
,
Julius Volz, Christian Bizer, Martin Gaedke and Georgi Kobilarov
,
650-665
,
[OpenAccess]
,
[Publisher]
The Web of Data is built upon two simple ideas: employ the RDF data model to publish structured data on the Web, and create explicit data links between entities within different data sources. This paper presents the Silk Linking Framework, a toolkit for discovering and maintaining data links between Web data sources. Silk consists of three components: 1. A link discovery engine, which computes links between data sources based on a declarative specification of the conditions that entities must fulfill in order to be interlinked; 2. A tool for evaluating the generated data links in order to fine-tune the linking specification; 3. A protocol for maintaining data links between continuously changing data sources. The protocol allows data sources to exchange both linksets and detailed change information, and enables continuous link recomputation. The interplay of all the components is demonstrated within a life science use case.
-
Dynamic Querying of Mass-Storage RDF Data with Rule-Based Entailment Regimes
,
Giovambattista Ianni, Thomas Krennwallner, Alessandra Martello and Axel Polleres
,
310-327
,
[OpenAccess]
,
[Publisher]
RDF Schema (RDFS) as a lightweight ontology language is gaining popularity and, consequently, tools for scalable RDFS inference and querying are needed. SPARQL has recently become a W3C standard for querying RDF data, but it mostly provides means for querying simple RDF graphs only, whereas querying with respect to RDFS or other entailment regimes is left outside the current specification. In this paper, we show that SPARQL faces certain unwanted ramifications when querying ontologies in conjunction with RDF datasets that comprise multiple named graphs, and we provide an extension for SPARQL that remedies these effects. Moreover, since RDFS inference has a close relationship with logic rules, we generalize our approach to select a custom ruleset for specifying inferences to be taken into account in a SPARQL query. We show that our extensions are technically feasible by providing benchmark results for RDFS querying in our prototype system GiaBATA, which uses Datalog coupled with a persistent relational database as a back-end for implementing SPARQL with dynamic rule-based inference. By employing different optimization techniques like magic set rewriting, our system remains competitive with state-of-the-art RDFS querying systems.
-
Efficient Query Answering for OWL 2
,
Héctor Pérez-Urbina, Ian Horrocks and Boris Motik
,
489-504
,
[OpenAccess]
,
[Publisher]
The QL profile of OWL 2 has been designed so that it is possible to use database technology for query answering via query rewriting. We present a comparison of our resolution based rewriting algorithm with the standard algorithm proposed by Calvanese et al., implementing both and conducting an empirical evaluation using ontologies and queries derived from realistic applications. The results indicate that our algorithm produces significantly smaller rewritings in most cases, which could be important for practicality in realistic applications.
-
Executing SPARQL Queries over the Web of Linked Data
,
Olaf Hartig, Christian Bizer and Johann Christoph Freytag
,
293-309
,
[OpenAccess]
,
[Publisher]
The Web of Linked Data forms a single, globally distributed dataspace. Due to the openness of this dataspace, it is not possible to know in advance all data sources that might be relevant for query answering. This openness poses a new challenge that is not addressed by traditional research on federated query processing. In this paper we present an approach to execute SPARQL queries over the Web of Linked Data. The main idea of our approach is to discover data that might be relevant for answering a query during the query execution itself. This discovery is driven by following RDF links between data sources based on URIs in the query and in partial results. The URIs are resolved over the HTTP protocol into RDF data which is continuously added to the queried dataset. This paper describes concepts and algorithms to implement our approach using an iterator-based pipeline. We introduce a formalization of the pipelining approach and show that classical iterators may cause blocking due to the latency of HTTP requests. To avoid blocking, we propose an extension of the iterator paradigm. The evaluation of our approach shows its strengths as well as the still existing challenges.
-
Exploiting Partial Information in Taxonomy Construction
,
Rob Shearer and Ian Horrocks
,
569-584
,
[OpenAccess]
,
[Publisher]
One of the core services provided by OWL reasoners is classification: the discovery of all subclass relationships between class names occurring in an ontology. Discovering these relations can be computationally expensive, particularly if individual subsumption tests are costly or if the number of class names is large. We present a classification algorithm which exploits partial information about subclass relationships to reduce both the number of individual tests and the cost of working with large ontologies. We also describe techniques for extracting such partial information from existing reasoners. Empirical results from a prototypical implementation demonstrate substantial performance improvements compared to existing algorithms and implementations.
-
Exploiting User Feedback to Improve Semantic Web Service Discovery
,
Anna Averbakh, Daniel Krause and Dimitrios Skoutas
,
33-48
,
[OpenAccess]
,
[Publisher]
State-of-the-art discovery of Semantic Web services is based on hybrid algorithms that combine semantic and syntactic matchmaking. These approaches are purely based on similarity measures between parameters of a service request and available service descriptions, which, however, fail to completely capture the actual functionality of the service or the quality of the results returned by it. On the other hand, with the advent of Web 2.0, active user participation and collaboration has become an increasingly popular trend. Users often rate or group relevant items, thus providing valuable information that can be taken into account to further improve the accuracy of search results. In this paper, we tackle this issue, by proposing a method that combines multiple matching criteria with user feedback to further improve the results of the matchmaker. We extend a previously proposed dominance-based approach for service discovery, and describe how user feedback is incorporated in the matchmaking process. We evaluate the performance of our approach using a publicly available collection of OWL-S services.
-
Functions over RDF Language Elements
,
Bernhard Schandl
,
537-552
,
[OpenAccess]
,
[Publisher]
RDF data are usually accessed using one of two methods: either graphs are rendered in forms perceivable by human users (e.g., in tabular or graphical form), which are difficult to handle for large data sets; or query languages like SPARQL provide means to express information needs in structured form, and hence are targeted towards developers and experts. Inspired by the concept of spreadsheet tools, where users can perform relatively complex calculations by splitting formulas and values across multiple cells, we have investigated mechanisms that allow us to access RDF graphs in a more intuitive and manageable, yet formally grounded manner. In this paper, we make three contributions towards this direction. First, we present RDFunctions, an algebra that consists of mappings between sets of RDF language elements (URIs, blank nodes, and literals) under consideration of the triples contained in a background graph. Second, we define a syntax for expressing RDFunctions, which can be edited, parsed, and evaluated. Third, we discuss Tripcel, an implementation of RDFunctions using a spreadsheet metaphor. Using this tool, users can easily edit and execute function expressions and perform analysis tasks on the data stored in an RDF graph.
-
Goal-Directed Module Extraction for Explaining OWL DL Entailments
,
Jianfeng Du, Guilin Qi and Qiu Ji
,
163-179
,
[OpenAccess]
,
[Publisher]
Module extraction methods have proved to be effective in improving the performance of some ontology reasoning tasks, including finding justifications to explain why an entailment holds in an OWL DL ontology. However, the existing module extraction methods that compute a syntactic locality-based module for the sub-concept in a subsumption entailment, though ensuring the resulting module to preserve all justifications of the entailment, may be insufficient in improving the performance of finding all justifications. This is because a syntactic locality-based module is independent of the super-concept in a subsumption entailment and always contains all concept/role assertions. In order to extract smaller modules to further optimize finding all justifications in an OWL DL ontology, we propose a goal-directed method for extracting a module that preserves all justifications of a given entailment. Experimental results on large ontologies show that a module extracted by our method is smaller than the corresponding syntactic locality-based module, making the subsequent computation of all justifications more scalable and more efficient.
-
Graph-Based Ontology Construction from Heterogenous Evidences
,
Christoph Böhm, Philip Groth and Ulf Leser
,
81-96
,
[OpenAccess]
,
[Publisher]
Ontologies are tools for describing and structuring knowledge, with many applications in searching and analyzing complex knowledge bases. Since building them manually is a costly process, there are various approaches for bootstrapping ontologies automatically through the analysis of appropriate documents. Such an analysis needs to find the concepts and the relationships that should form the ontology. However, since relationship extraction methods are imprecise and cannot homogeneously cover all concepts, the initial set of relationships is usually inconsistent and rather imbalanced - a problem which, to the best of our knowledge, was mostly ignored so far. In this paper, we define the problem of extracting a consistent as well as properly structured ontology from a set of inconsistent and heterogeneous relationships. Moreover, we propose and compare three graph-based methods for solving the ontology extraction problem. We extract relationships from a large-scale data set of more than 325K documents and evaluate our methods against a gold standard ontology comprising more than 12K relationships. Our study shows that an algorithm based on a modified formulation of the dominating set problem outperforms greedy methods.
-
Investigating the Semantic Gap through Query Log Analysis
,
Peter Mika, Edgar Meij and Hugo Zaragoza
,
441-455
,
[OpenAccess]
,
[Publisher]
Significant efforts have focused in the past years on bringing large amounts of metadata online, and the success of these efforts can be seen by the impressive number of web sites exposing data in RDFa or RDF/XML. However, little is known about the extent to which this data fits the needs of ordinary web users with everyday information needs. In this paper we study what we perceive as the semantic gap between the supply of data on the Semantic Web and the needs of web users as expressed in the queries submitted to a major Web search engine. We perform our analysis on both the level of instances and ontologies. First, we look at how much data is actually relevant to Web queries and what kind of data it is. Second, we provide a generic method to extract the attributes that Web users are searching for regarding particular classes of entities. This method allows us to contrast class definitions found in Semantic Web vocabularies with the attributes of objects that users are interested in. Our findings are crucial to measuring the potential of semantic search, but also speak to the state of the Semantic Web in general.
-
Learning Semantic Query Suggestions
,
Edgar Meij, Marc Bron, Laura Hollink, Bouke Huurnink and Maarten de Rijke
,
424-440
,
[OpenAccess]
,
[Publisher]
An important application of semantic web technology is recognizing human-defined concepts in text. Query transformation is a strategy often used in search engines to derive queries that are able to return more useful search results than the original query and most popular search engines provide facilities that let users complete, specify, or reformulate their queries. We study the problem of semantic query suggestion, a special type of query transformation based on identifying semantic concepts contained in user queries. We use a feature-based approach in conjunction with supervised machine learning, augmenting term-based features with search history-based and concept-specific features. We apply our method to the task of linking queries from real-world query logs (the transaction logs of the Netherlands Institute for Sound and Vision) to the DBpedia knowledge base. We evaluate the utility of different machine learning algorithms, features, and feature types in identifying semantic concepts using a manually developed test bed and show significant improvements over an already high baseline. The resources developed for this paper, i.e., queries, human assessments, and extracted features, are available for download.
-
Modeling and Query Patterns for Process Retrieval in OWL
,
Gerd Gröner and Steffen Staab
,
243-259
,
[OpenAccess]
,
[Publisher]
Process modeling is a core task in software engineering in general and in web service modeling in particular. The explicit management of process models for purposes such as process selection and/or process reuse requires flexible and intelligent retrieval of process structures based on process entities and relationships, i.e. process activities, hierarchical relationship between activities and their parts, temporal relationships between activities, conditions on process flows as well as the modeling of domain knowledge. In this paper, we analyze requirements for modeling and querying of process models and present a pattern-oriented approach exploiting OWL-DL representation and reasoning capabilities for expressive process modeling and retrieval.
-
Multi Visualization and Dynamic Query for Effective Exploration of Semantic Data
,
Daniela Petrelli, Suvodeep Mazumdar, Aba-Sah Dadzie and Fabio Ciravegna
,
505-520
,
[OpenAccess]
,
[Publisher]
Semantic formalisms represent content in a uniform way according to ontologies. This enables manipulation and reasoning via automated means (e.g. Semantic Web services), but limits the user's ability to explore the semantic data from a point of view that originates from knowledge representation motivations. We show how, for user consumption, a visualization of semantic data according to some easily graspable dimensions (e.g. space and time) provides effective sense-making of data. In this paper, we look holistically at the interaction between users and semantic data, and propose multiple visualization strategies and dynamic filters to support the exploration of semantic-rich data. We discuss a user evaluation and how interaction challenges could be overcome to create an effective user-centred framework for the visualization and manipulation of semantic data. The approach has been implemented and evaluated on a real company archive.
-
On Detecting High-Level Changes in RDF/S KBs
,
Vicky Papavassiliou, Giorgos Flouris, Irini Fundulaki, Dimitris Kotzinos and Vassilis Christophides
,
473-488
,
[OpenAccess]
,
[Publisher]
An increasing number of scientific communities rely on Semantic Web ontologies to share and interpret data within and across research domains. These common knowledge representation resources are usually developed and maintained manually and essentially co-evolve along with experimental evidence produced by scientists worldwide. Detecting automatically the differences between (two) versions of the same ontology in order to store or visualize their deltas is a challenging task for e-science. In this paper, we focus on languages allowing the formulation of concise and intuitive deltas, which are expressive enough to describe unambiguously any possible change and that can be effectively and efficiently detected. We propose a specific language that provably exhibits those characteristics and provide a change detection algorithm which is sound and complete with respect to the proposed language. Finally, we provide a promising experimental evaluation of our framework using real ontologies from the cultural and bioinformatics domains.
-
OntoCase-Automatic Ontology Enrichment Based on Ontology Design Patterns
,
Eva Blomqvist
,
65-80
,
[OpenAccess]
,
[Publisher]
OntoCase is a framework for semi-automatic pattern-based ontology construction. In this paper we focus on the retain and reuse phases, where an initial ontology is enriched based on content ontology design patterns (Content ODPs), and especially the implementation and evaluation of these phases. Applying Content ODPs within semi-automatic ontology construction, i.e. ontology learning (OL), is a novel approach. The main contributions of this paper are the methods for pattern ranking, selection, and integration, and the subsequent evaluation showing the characteristics of ontologies constructed automatically based on ODPs. We show that it is possible to improve the results of existing OL methods by selecting and reusing Content ODPs. OntoCase is able to introduce a general top structure into the ontologies, and by exploiting background knowledge the ontology is given a richer overall structure.
-
Optimizing Web Service Composition while Enforcing Regulations
,
Shirin Sohrabi and Sheila A. McIlraith
,
601-617
,
[OpenAccess]
,
[Publisher]
To direct automated Web service composition, it is compelling to provide a template, workflow or scaffolding that dictates the ways in which services can be composed. In this paper we present an approach to Web service composition that builds on work using AI planning, and more specifically Hierarchical Task Networks (HTNs), for Web service composition. A significant advantage of our approach is that it provides much of the how-to knowledge of a choreography while enabling customization and optimization of integrated Web service selection and composition based upon the needs of the specific problem, the preferences of the customer, and the available services. Many customers must also be concerned with enforcement of regulations, perhaps in the form of corporate policies and/or government regulations. Regulations are traditionally enforced at design time by verifying that a workflow or composition adheres to regulations. Our approach supports customization, optimization and regulation enforcement all at composition construction time. To maximize efficiency, we have developed novel search heuristics together with a branch and bound search algorithm that enable the generation of high quality compositions with the performance of state-of-the-art planning systems.
-
Parallel Materialization of the Finite RDFS Closure for Hundreds of Millions of Triples
,
Jesse Weaver and James A. Hendler
,
682-697
,
[OpenAccess]
,
[Publisher]
In this paper, we consider the problem of materializing the complete finite RDFS closure in a scalable manner; this includes those parts of the RDFS closure that are often ignored such as literal generalization and container membership properties. We point out characteristics of RDFS that allow us to derive an embarrassingly parallel algorithm for producing said closure, and we evaluate our C/MPI implementation of the algorithm on a cluster with 128 cores using different-size subsets of the LUBM 10,000-university data set. We show that the time to produce inferences scales linearly with the number of processes, evaluating this behavior on up to hundreds of millions of triples. We also show the number of inferences produced for different subsets of LUBM10k. To the best of our knowledge, our work is the first to provide RDFS inferencing on such large data sets in such low times. Finally, we discuss future work in terms of promising applications of this approach including OWL2RL rules, MapReduce implementations, and massive scaling on supercomputers.
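The "embarrassingly parallel" property rests on the observation that, once the (small) schema is replicated to every process, each partition of instance triples can be closed independently. A minimal sketch of one worker's step, covering only type propagation along subClassOf (RDFS rules rdfs11/rdfs9), not the authors' C/MPI implementation:

```python
# Sketch of one worker's step in an embarrassingly parallel RDFS closure:
# the schema is replicated to every worker, instance triples are partitioned,
# and each worker closes its own partition independently (illustrative; the
# paper's C/MPI implementation covers the full finite RDFS closure).
def close_partition(schema, partition):
    """Apply rdfs9 (type propagation along rdfs:subClassOf) to one partition."""
    sub = {}  # class -> set of superclasses
    for s, p, o in schema:
        if p == "rdfs:subClassOf":
            sub.setdefault(s, set()).add(o)
    changed = True
    while changed:  # transitive closure of subClassOf (rdfs11) via fixpoint
        changed = False
        for c, supers in list(sub.items()):
            for d in list(supers):
                for e in sub.get(d, ()):
                    if e not in supers:
                        supers.add(e)
                        changed = True
    inferred = set()
    for s, p, o in partition:
        if p == "rdf:type":
            for c in sub.get(o, ()):
                inferred.add((s, "rdf:type", c))
    return inferred
```

Because no rule in this fragment joins two instance triples from different partitions, the union of the per-partition results equals the closure of the whole data set, which is why the approach scales linearly with the number of processes.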
-
Policy-Aware Content Reuse on the Web
,
Oshani Seneviratne,Lalana Kagal and Tim Berners-Lee
,
553-568
,
[OpenAccess]
,
[Publisher]
The Web allows users to share their work very effectively leading to the rapid re-use and remixing of content on the Web including text, images, and videos. Scientific research data, social networks, blogs, photo sharing sites and other such applications known collectively as the Social Web contain large amounts of increasingly complex information. Such information from several Web pages can be very easily aggregated, mashed up and presented in other Web pages. Content generation of this nature inevitably leads to many copyright and license violations, motivating research into effective methods to detect and prevent such violations. This is supported by an experiment on Creative Commons (CC) attribution license violations from samples of Web sites that had at least one embedded Flickr image, which revealed that the attribution license violation rate of Flickr images on the Web is around 70-90%. Our primary objective is to enable users to do the right thing and comply with CC licenses associated with Web media, instead of preventing them from doing the wrong thing or detecting violations of these licenses. As a solution, we have implemented two applications: (1) Attribution License Violations Validator, which can be used to validate users' derived work against attribution licenses of reused media and, (2) Semantic Clipboard, which provides license awareness of Web media and enables users to copy them along with the appropriate license metadata.
-
Queries to Hybrid MKNF Knowledge Bases through Oracular Tabling
,
José Júlio Alferes,Matthias Knorr and Terrance Swift
,
1-16
,
[OpenAccess]
,
[Publisher]
An important issue for the Semantic Web is how to combine open-world ontology languages with closed-world (non-monotonic) rule paradigms. Several proposals for hybrid languages allow concepts to be simultaneously defined by an ontology and rules, where rules may refer to concepts in the ontology and the ontology may also refer to predicates defined by the rules. Hybrid MKNF knowledge bases are one such proposal, for which both a stable and a well-founded semantics have been defined. The definition of Hybrid MKNF knowledge bases is parametric on the ontology language, in the sense that non-monotonic rules can extend any decidable ontology language. In this paper we define a query-driven procedure for Hybrid MKNF knowledge bases that is sound with respect to the original stable model-based semantics, and is correct with respect to the well-founded semantics. This procedure is able to answer conjunctive queries, and is parametric on an inference engine for reasoning in the ontology language. Our procedure is based on an extension of a tabled rule evaluation to capture reasoning within an ontology by modeling it as an interaction with an external oracle and, with some assumptions on the complexity of the oracle compared to the complexity of the ontology language, maintains the data complexity of the well-founded semantics for hybrid MKNF knowledge bases.
-
Scalable Distributed Reasoning using MapReduce
,
Jacopo Urbani,Spyros Kotoulas,Eyal Oren and Frank van Harmelen
,
634-649
,
[OpenAccess]
,
[Publisher]
We address the problem of scalable distributed reasoning, proposing a technique for materialising the closure of an RDF graph based on MapReduce. We have implemented our approach on top of Hadoop and deployed it on a compute cluster of up to 64 commodity machines. We show that a naive implementation on top of MapReduce is straightforward but performs badly and we present several non-trivial optimisations. Our algorithm is scalable and allows us to compute the RDFS closure of 865M triples from the Web (producing 30B triples) in less than two hours, faster than any other published approach.
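The shape of one MapReduce pass for a single RDFS rule can be sketched in plain Python; this simulates the map/shuffle/reduce contract for rule rdfs9 only, and deliberately ignores the paper's non-trivial optimisations (such as keeping schema triples in memory and avoiding fixpoint iterations):

```python
# Minimal simulation of one MapReduce pass for RDFS rule rdfs9
# ((x rdf:type C), (C rdfs:subClassOf D) => (x rdf:type D)).
# Function names are illustrative, not the authors' Hadoop code.
from collections import defaultdict

def map_phase(triple):
    s, p, o = triple
    if p == "rdf:type":
        yield (o, ("instance", s))        # key on the class C
    elif p == "rdfs:subClassOf":
        yield (s, ("superclass", o))      # key on the subclass C

def reduce_phase(key, values):
    instances = [v for tag, v in values if tag == "instance"]
    supers = [v for tag, v in values if tag == "superclass"]
    for x in instances:                   # join happens inside one reduce group
        for d in supers:
            yield (x, "rdf:type", d)

def run_job(triples):
    groups = defaultdict(list)            # the "shuffle": group values by key
    for t in triples:
        for k, v in map_phase(t):
            groups[k].append(v)
    out = set()
    for k, vs in groups.items():
        out.update(reduce_phase(k, vs))
    return out
```

Keying both triple patterns on the shared class term is what lets the join be computed locally within each reduce group, the basic trick that a real Hadoop job distributes across machines.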
-
Semantic Web Service Composition in Social Environments
,
Ugur Kuter and Jennifer Golbeck
,
344-358
,
[OpenAccess]
,
[Publisher]
-
Semantically-aided business process modeling
,
Chiara Di Francescomarino,Chiara Ghidini,Marco Rospocher,Luciano Serafini and Paolo Tonella
,
114-129
,
[OpenAccess]
,
[Publisher]
Enriching business process models with semantic annotations taken from an ontology has become a crucial necessity both in service provisioning, integration and composition, and in business process management. In our work we represent semantically annotated business processes as part of an OWL knowledge base that formalises the business process structure, the business domain, and a set of criteria describing correct semantic annotations. In this paper we show how Semantic Web representation and reasoning techniques can be effectively applied to formalise, and automatically verify, sets of constraints on Business Process Diagrams that involve both knowledge about the domain and the process structure. We also present a tool for the automated transformation of an annotated Business Process Diagram into an OWL ontology. The use of the Semantic Web techniques and tool presented in this paper results in novel support for the management of business processes in the phase of process modeling, whose feasibility and usefulness will be illustrated by means of a concrete example.
-
Synthesizing Semantic Web Service Compositions with jMosel and Golog
,
Tiziana Margaria,Daniel Meyer,Christian Kubczak,Malte Isberner and Bernhard Steffen
,
392-407
,
[OpenAccess]
,
[Publisher]
In this paper we investigate different technologies to attack the automatic solution of orchestration problems based on synthesis from declarative specifications, a semantically enriched description of the services, and a collection of services available on a testbed. In addition to our previously presented tableaux-based synthesis technology, we consider two structurally rather different approaches here: using jMosel, our tool for Monadic Second-Order Logic on Strings, and the high-level programming language Golog, which internally makes use of planning techniques. As a common case study we consider the Mediation Scenario of the Semantic Web Service Challenge, which is a benchmark for process orchestration. All three synthesis solutions have been embedded in the jABC/jETI modeling framework, and used to synthesize the abstract mediator processes as well as their concrete, running (Web) service counterparts. Using the jABC as a common frame helps highlight the essential differences and similarities. It turns out that, at least at the level of complexity of the considered case study, all approaches behave quite similarly, both in terms of performance and modeling. We believe that turning the jABC framework into an experimentation platform along the lines presented here will help in understanding the application profiles of the individual synthesis solutions and technologies, answering questions such as when the overhead to achieve compositionality pays off and where (heuristic) search is the technology of choice.
-
Task Oriented Evaluation of Module Extraction Techniques
,
Ignazio Palmisano,Valentina A. M. Tamma,Terry R. Payne and Paul Doran
,
130-145
,
[OpenAccess]
,
[Publisher]
Ontology Modularization techniques identify coherent and often reusable regions within an ontology. The ability to identify such modules, thus potentially reducing the size or complexity of an ontology for a given task or set of concepts, is increasingly important in the Semantic Web as domain ontologies increase in terms of size, complexity and expressivity. To date, many techniques have been developed, but evaluation of the results of these techniques is sketchy and somewhat ad hoc. Theoretical properties of modularization algorithms have only been studied in a small number of cases. This paper presents an empirical analysis of a number of modularization techniques, and the modules they identify over a number of diverse ontologies, by utilizing objective, task-oriented measures to evaluate the fitness of the modules for a number of statistical classification problems.
-
Towards Lightweight and Robust Large Scale Emergent Knowledge Processing
,
Vít Novácek and Stefan Decker
,
456-472
,
[OpenAccess]
,
[Publisher]
We present a lightweight framework for processing uncertain emergent knowledge that comes from multiple resources with varying relevance. The framework is essentially RDF-compatible, but also allows for direct representation of contextual features (e.g., provenance). We support soft integration and robust querying of the represented content based on well-founded notions of aggregation, similarity and ranking. A proof-of-concept implementation is presented and evaluated within large scale knowledge-based search in life science articles.
-
TripleRank: Ranking Semantic Web Data By Tensor Decomposition
,
Thomas Franz,Antje Schultz,Sergej Sizov and Steffen Staab
,
213-228
,
[OpenAccess]
,
[Publisher]
The Semantic Web fosters novel applications targeting a more efficient and satisfying exploitation of the data available on the web, e.g. faceted browsing of linked open data. Large amounts and high diversity of knowledge in the Semantic Web pose the challenging question of appropriate relevance ranking for producing fine-grained and rich descriptions of the available data, e.g. to guide the user along most promising knowledge aspects. Existing methods for graph-based authority ranking lack support for fine-grained latent coherence between resources and predicates (i.e. support for link semantics in the linked data model). In this paper, we present TripleRank, a novel approach for faceted authority ranking in the context of RDF knowledge bases. TripleRank captures the additional latent semantics of Semantic Web data by means of statistical methods in order to produce richer descriptions of the available data. We model the Semantic Web by a 3-dimensional tensor that enables the seamless representation of arbitrary semantic links. For the analysis of that model, we apply the PARAFAC decomposition, which can be seen as a multi-modal counterpart to Web authority ranking with HITS. The results are groupings of resources and predicates that characterize their authority and navigational (hub) properties with respect to identified topics. We have applied TripleRank to multiple data sets from the linked open data community and gathered encouraging feedback in a user evaluation where TripleRank results have been exploited in a faceted browsing scenario.
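The tensor representation itself is straightforward to construct: one subject-by-object adjacency matrix per predicate, stacked into a 3-way array. A minimal sketch (the construction step only; the PARAFAC decomposition applied to this tensor is not shown, and the function name is illustrative):

```python
# Sketch: modeling an RDF graph as a 3-way tensor (subject x object x
# predicate), the representation TripleRank decomposes with PARAFAC.
# Illustrative only; TripleRank's own preprocessing and weighting differ.
import numpy as np

def build_tensor(triples):
    entities = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
    predicates = sorted({p for _, p, _ in triples})
    ei = {x: i for i, x in enumerate(entities)}
    pi = {p: k for k, p in enumerate(predicates)}
    T = np.zeros((len(entities), len(entities), len(predicates)))
    for s, p, o in triples:
        T[ei[s], ei[o], pi[p]] = 1.0  # one adjacency slice per predicate
    return T, ei, pi
```

Each frontal slice `T[:, :, k]` is the adjacency matrix of one predicate; summing the slices and running HITS on the result recovers the single-mode case, while a rank-r PARAFAC of the full tensor additionally yields a topic factor over predicates.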
-
Using Naming Authority to Rank Data and Ontologies for Web Search
,
Andreas Harth,Sheila Kinsella and Stefan Decker
,
277-292
,
[OpenAccess]
,
[Publisher]
The focus of web search is moving away from returning relevant documents towards returning structured data as results to user queries. Link-based ranking algorithms are a vital part of search engine architectures, but they are targeted towards hypertext documents. Existing ranking algorithms for structured data, on the other hand, require manual input of a domain expert and are thus not applicable in cases where data integrated from a large number of sources exhibits enormous variance in vocabularies used. In such environments, the authority of data sources is an important signal that the ranking algorithm has to take into account. This paper presents algorithms for prioritising data returned by queries over web datasets expressed in RDF. We introduce the notion of naming authority which provides a correspondence between identifiers and the sources which can speak authoritatively for these identifiers. Our algorithm uses the original PageRank method to assign authority values to data sources based on a naming authority graph, and then propagates the authority values to identifiers referenced in the sources. We conduct performance and quality evaluations of the method on a large web dataset. Our method is schema-independent, requires no manual input, and has applications in search, query processing, reasoning, and user interfaces over integrated datasets.
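The first stage, running PageRank over the naming authority graph, is standard power iteration. A small self-contained sketch (this is textbook PageRank, not the paper's implementation; edge direction here is "source references identifiers governed by target"):

```python
# Sketch: PageRank power iteration over a naming authority graph, where an
# edge (s, t) means source s references identifiers that source t can speak
# authoritatively for. Textbook algorithm, illustrative of the first stage.
def pagerank(edges, d=0.85, iters=50):
    nodes = {n for e in edges for n in e}
    out = {n: [t for s, t in edges if s == n] for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - d) / len(nodes) for n in nodes}
        for n in nodes:
            targets = out[n] or list(nodes)  # dangling node: spread evenly
            share = d * rank[n] / len(targets)
            for t in targets:
                new[t] += share
        rank = new
    return rank
```

The second stage of the paper's method then propagates these per-source scores to the individual identifiers each source mentions, which is a simple weighted aggregation over the source-identifier incidence.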
-
What Four Million Mappings Can Tell You About Two Hundred Ontologies
,
Amir Ghazvinian,Natalya Fridman Noy,Clement Jonquet,Nigam H. Shah and Mark A. Musen
,
229-242
,
[OpenAccess]
,
[Publisher]
The field of biomedicine has embraced the Semantic Web probably more than any other field. As a result, there is a large number of biomedical ontologies covering overlapping areas of the field. We have developed BioPortal - an open community-based repository of biomedical ontologies. We analyzed ontologies and terminologies in BioPortal and the Unified Medical Language System (UMLS), creating more than 4 million mappings between concepts in these ontologies and terminologies based on the lexical similarity of concept names and synonyms. We then analyzed the mappings and what they tell us about the ontologies themselves, the structure of the ontology repository, and the ways in which the mappings can help in the process of ontology design and evaluation. For example, we can use the mappings to guide users who are new to a field to the most pertinent ontologies in that field, to identify areas of the domain that are not covered sufficiently by the ontologies in the repository, and to identify which ontologies will serve well as background knowledge in domain-specific tools. While we used a specific (but large) ontology repository for the study, we believe that the lessons we learned about the value of a large-scale set of mappings to ontology users and developers are general and apply in many other domains.
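The mapping technique the study builds on, matching concepts across ontologies by lexical similarity of names and synonyms, can be sketched with an inverted index over normalized labels. The normalization rule below is illustrative; the BioPortal study's actual preprocessing differs in detail:

```python
# Sketch: lexical mapping between two ontologies via normalized concept
# names and synonyms, in the spirit of the BioPortal/UMLS study. The
# normalization (lowercase, collapse punctuation) is illustrative.
import re

def normalize(label):
    return re.sub(r"[^a-z0-9]+", " ", label.lower()).strip()

def lexical_mappings(onto_a, onto_b):
    """onto_*: dict of concept_id -> list of labels and synonyms."""
    index = {}  # inverted index: normalized label -> concept ids in onto_b
    for cid, labels in onto_b.items():
        for lab in labels:
            index.setdefault(normalize(lab), set()).add(cid)
    mappings = set()
    for cid, labels in onto_a.items():
        for lab in labels:
            for target in index.get(normalize(lab), ()):
                mappings.add((cid, target))
    return mappings
```

Building the index once makes the pass over the first ontology linear in its number of labels, which is what makes a mapping set of this scale (millions of pairs across hundreds of ontologies) computationally feasible.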
-
XLWrap - Querying and Integrating Arbitrary Spreadsheets with SPARQL
,
Andreas Langegger and Wolfram Wöß
,
359-374
,
[OpenAccess]
,
[Publisher]
In this paper a novel approach is presented for generating RDF graphs of arbitrary complexity from various spreadsheet layouts. Currently, none of the available spreadsheet-to-RDF wrappers supports cross tables and tables where data is not aligned in rows. Similar to RDF123, XLWrap is based on template graphs where fragments of triples can be mapped to specific cells of a spreadsheet. Additionally, it features a full expression algebra based on the syntax of OpenOffice Calc and various shift operations, which can be used to repeat similar mappings in order to wrap cross tables including multiple sheets and spreadsheet files. The set of available expression functions includes most of the native functions of OpenOffice Calc and can be easily extended by users of XLWrap. Additionally, XLWrap is able to execute SPARQL queries, and since it is possible to define multiple virtual class extents in a mapping specification, it can be used to integrate information from multiple spreadsheets. XLWrap supports a special identity concept which allows linking anonymous resources (blank nodes), which may originate from different spreadsheets, in the target graph.
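The template-plus-shift idea can be illustrated in a few lines: a template maps cell references to triple positions, and a shift operation repeats the template across columns of a cross table. This is a loose sketch of the concept, not XLWrap's mapping language; all names and the single shift direction are invented:

```python
# Sketch: template-based cell-to-triple mapping with a column-shift
# operation, loosely following the XLWrap idea (names and semantics here
# are illustrative, not XLWrap's actual mapping specification).
def wrap(sheet, template, shifts):
    """sheet: 2D list of cells; template: triples whose positions are either
    (row, col) cell references or constant terms; shifts: number of times to
    repeat the template, shifting every cell reference right by one column."""
    triples = []
    for k in range(shifts):
        for s, p, o in template:
            resolve = lambda ref: (sheet[ref[0]][ref[1] + k]
                                   if isinstance(ref, tuple) else ref)
            triples.append((resolve(s), p, resolve(o)))
    return triples
```

For a cross table whose first row holds years and second row holds values, a one-triple template repeated with two shifts wraps both columns:

```python
sheet = [["", "2008", "2009"], ["sales", 10, 20]]
template = [((0, 1), "ex:salesInYear", (1, 1))]
wrap(sheet, template, 2)
# -> [('2008', 'ex:salesInYear', 10), ('2009', 'ex:salesInYear', 20)]
```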
-
A Case Study in Integrating Multiple E-commerce Standards via Semantic Web Technology
,
Yang Yu,Donald Hillman,Basuki Setio and Jeff Heflin
,
909-924
,
[OpenAccess]
,
[Publisher]
Internet business-to-business transactions present great challenges in merging information from different sources. In this paper we describe a project to integrate four representative commercial classification systems with the Federal Cataloging System (FCS). The FCS is used by the US Defense Logistics Agency to name, describe and classify all items under inventory control by the DoD. Our approach uses the ECCMA Open Technical Dictionary (eOTD) as a common vocabulary to accommodate all different classifications. We create a semantic bridging ontology between each classification and the eOTD to describe their logical relationships in OWL DL. The essential idea is that since each classification has formal definitions in a common vocabulary, we can use subsumption to automatically integrate them, thus mitigating the need for pairwise mappings. Furthermore our system provides an interactive interface to let users choose and browse the results and more importantly it can translate catalogs that commit to these classifications using compiled mapping results.
-
Bridging the Gap Between Linked Data and the Semantic Desktop
,
Tudor Groza,Laura Dragan,Siegfried Handschuh and Stefan Decker
,
827-842
,
[OpenAccess]
,
[Publisher]
The exponential growth of the World Wide Web in the last decade brought an explosion in the information space, which has important consequences also in the area of scientific research. Finding relevant work in a particular field and exploring the links between publications is currently a cumbersome task. Similarly, on the desktop, managing the publications acquired over time can represent a real challenge. Extracting semantic metadata, exploring the linked data cloud and using the semantic desktop for managing personal information represent, in part, solutions for different aspects of the above mentioned issues. In this paper, we propose an innovative approach for bridging these three directions with the overall goal of alleviating the information overload problem burdening early stage researchers. Our application combines harmoniously document engineering-oriented automatic metadata extraction with information expansion and visualization based on linked data, while the resulting documents can be seamlessly integrated into the semantic desktop.
-
Extracting Enterprise Vocabularies Using Linked Open Data
,
Julian Dolby,Achille Fokoue,Aditya Kalyanpur,Edith Schonberg and Kavitha Srinivas
,
779-794
,
[OpenAccess]
,
[Publisher]
A common vocabulary is vital to smooth business operation, yet codifying and maintaining an enterprise vocabulary is an arduous, manual task. We describe a process to automatically extract a domain specific vocabulary (terms and types) from unstructured data in the enterprise guided by term definitions in Linked Open Data (LOD). We validate our techniques by applying them to the IT (Information Technology) domain, taking 58 Gartner analyst reports and using two specific LOD sources, DBpedia and Freebase.
-
Lifting events in RDF from interactions with annotated Web pages
,
Roland Stühmer,Darko Anicic,Sinan Sen,Jun Ma,Kay-Uwe Schmidt and Nenad Stojanovic
,
893-908
,
[OpenAccess]
,
[Publisher]
In this demo we show the current state of our client-side rule engine for the Web. The engine is an implementation for creating and processing semantic events from interaction with Web pages which opens possibilities to build event-driven applications for the (Semantic) Web. Events, simple or complex, are models for things that happen e.g., when a user interacts with a Web page. Events are consumed in some meaningful way e.g., for monitoring reasons or to trigger actions such as responses. In order for receiving parties to understand events, i.e. comprehend what has led to an event, we demonstrate a general event schema using RDFS.
-
LinkedGeoData: Adding a Spatial Dimension to the Web of Data
,
Sören Auer,Jens Lehmann and Sebastian Hellmann
,
731-746
,
[OpenAccess]
,
[Publisher]
In order to employ the Web as a medium for data and information integration, comprehensive datasets and vocabularies are required as they enable the disambiguation and alignment of other data and information. Many real-life information integration and aggregation tasks are impossible without comprehensive background knowledge related to spatial features of the ways, structures and landscapes surrounding us. In this paper we contribute to the generation of a spatial dimension for the Data Web by elaborating on how the collaboratively collected OpenStreetMap data can be transformed and represented adhering to the RDF data model. We describe how this data can be interlinked with other spatial data sets, how it can be made accessible for machines according to the linked data paradigm and for humans by means of a faceted geo-data browser.
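The transformation step, turning an OpenStreetMap node with its tags into RDF triples, can be sketched minimally. The URIs and vocabulary prefixes below are invented for illustration and do not reproduce the actual LinkedGeoData namespaces:

```python
# Sketch: representing an OpenStreetMap node as RDF triples, roughly in the
# manner of LinkedGeoData. The "lgd:"/"lgdo:" prefixes and the tag-to-property
# mapping are illustrative, not the project's actual vocabulary.
def node_to_triples(node_id, lat, lon, tags):
    uri = f"lgd:node{node_id}"
    triples = [(uri, "geo:lat", str(lat)),
               (uri, "geo:long", str(lon))]
    for key, value in tags.items():
        triples.append((uri, f"lgdo:{key}", value))  # OSM tag -> RDF property
    return triples
```

Interlinking then amounts to adding owl:sameAs (or similar) triples from these node URIs to matching resources in other spatial data sets, e.g. by name and proximity.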
-
Live Social Semantics
,
Harith Alani,Martin Szomszor,Ciro Cattuto,Wouter Van den Broeck,Gianluca Correndo and Alain Barrat
,
698-714
,
[OpenAccess]
,
[Publisher]
Social interactions are one of the key factors in the success of conferences and similar community gatherings. This paper describes a novel application that integrates data from the semantic web, online social networks, and a real-world contact sensing platform. This application was successfully deployed at ESWC09, and actively used by 139 people. Personal profiles of the participants were automatically generated using several Web 2.0 systems and semantic academic data sources, and integrated in real-time with face-to-face contact networks derived from wearable sensors. Integration of all these heterogeneous data layers made it possible to offer various services to conference attendees to enhance their social experience, such as visualisation of contact data, and a site to explore and connect with other participants. This paper describes the architecture of the application, the services we provided, and the results we achieved in this deployment.
-
Produce and Consume Linked Data with Drupal!
,
Stephane Corlosquet,Renaud Delbru,Tim Clark,Axel Polleres and Stefan Decker
,
763-778
,
[OpenAccess]
,
[Publisher]
Currently a large number of Web sites are driven by Content Management Systems (CMS) which manage textual and multimedia content but also, inherently, carry valuable information about a site's structure and content model. Exposing this structured information to the Web of Data has so far required considerable expertise in RDF and OWL modelling and additional programming effort. In this paper we tackle one of the most popular CMSs: Drupal. We enable site administrators to export their site content model and data to the Web of Data without requiring extensive knowledge of Semantic Web technologies. Our modules create RDFa annotations and, optionally, a SPARQL endpoint for any Drupal site out of the box. Likewise, we add the means to map the site data to existing ontologies on the Web with a search interface to find commonly used ontology terms. We also allow a Drupal site administrator to include existing RDF data from remote SPARQL endpoints on the Web in the site. When brought together, these features allow networked RDF Drupal sites that reuse and enrich Linked Data. We finally discuss the adoption of our modules and report on a use case in the biomedical field and the current status of its deployment.
-
RAPID: Enabling Scalable Ad-Hoc Analytics on the Semantic Web
,
Radhika Sridhar,Padmashree Ravindra and Kemafor Anyanwu
,
715-730
,
[OpenAccess]
,
[Publisher]
As the amount of available RDF data continues to increase steadily, there is growing interest in developing efficient methods for analyzing such data. While recent efforts have focused on developing efficient methods for traditional data processing, analytical processing, which typically involves more complex queries, has received much less attention. The use of cost effective parallelization techniques such as Google's Map-Reduce offers significant promise for achieving Web scale analytics. However, currently available implementations are designed for simple data processing on structured data. In this paper, we present a language, RAPID, for scalable ad-hoc analytical processing of RDF data on Map-Reduce frameworks. It builds on Yahoo's Pig Latin by introducing primitives based on a specialized join operator, the MD-join, for expressing analytical tasks in a manner that is more amenable to parallel processing, as well as primitives for coping with the semi-structured nature of RDF data. Experimental evaluation results demonstrate significant performance improvements for analytical processing of RDF data over existing Map-Reduce based techniques.
-
Reasoning about Resources and Hierarchical Tasks Using OWL and SWRL
,
Daniel Elenius,David Martin,Reginald Ford and Grit Denker
,
795-810
,
[OpenAccess]
,
[Publisher]
Military training and testing events are highly complex affairs, potentially involving dozens of legacy systems that need to interoperate in a meaningful way. There are superficial interoperability concerns (such as two systems not sharing the same messaging formats), but also substantive problems such as different systems not sharing the same understanding of the terrain, positions of entities, and so forth. We describe our approach to facilitating such events: describe the systems and requirements in great detail using ontologies, and use automated reasoning to automatically find and help resolve problems. The complexity of our problem took us to the limits of what one can do with OWL, and we needed to introduce some innovative techniques of using and extending it. We describe our novel ways of using SWRL and discuss its limitations as well as extensions to it that we found necessary or desirable. Another innovation is our representation of hierarchical tasks in OWL, and an engine that reasons about them. Our task ontology has proved to be a very flexible and expressive framework to describe requirements on resources and their capabilities in order to achieve some purpose.
-
Semantic Enhancement for Enterprise Data Management
,
Li Ma,Xingzhi Sun,Feng Cao,Chen Wang,Xiaoyuan Wang,Nick Kanellos,Dan Wolfson and Yue Pan
,
876-892
,
[OpenAccess]
,
[Publisher]
Taking customer data as an example, the paper presents an approach to enhance the management of enterprise data by using Semantic Web technologies. Customer data is the most important kind of core business entity a company uses repeatedly across many business processes and systems, and customer data management (CDM) is becoming critical for enterprises because it keeps a single, complete and accurate record of customers across the enterprise. Existing CDM systems focus on integrating customer data from all customer-facing channels and front and back office systems through multiple interfaces, as well as publishing customer data to different applications. To make effective use of the CDM system, this paper investigates semantic query and analysis over the integrated and centralized customer data, enabling automatic classification and relationship discovery. We have implemented these features over IBM Websphere Customer Center, and shown the prototype to our clients. We believe that our study and experiences are valuable for both the Semantic Web and data management communities.
-
Semantic Web Technologies for the Integration of Learning Tools and Context-aware Educational Services
,
Zoran Jeremic,Jelena Jovanovic and Dragan Gasevic
,
860-875
,
[OpenAccess]
,
[Publisher]
One of the main software engineers' competencies, solving software problems, is most effectively acquired through an active examination of learning resources and work on real-world examples in small development teams. This obviously indicates a need for an integration of several existing learning tools and systems in a common collaborative learning environment, as well as advanced educational services that provide students with timely advice about learning resources and possible collaboration partners. In this paper, we present how we developed and applied a common ontological foundation for the integration of different existing learning tools and systems in a common learning environment called DEPTHS (Design Patterns Teaching Help System). In addition, we present a set of educational services that leverages semantically rich representations of learning resources and students' interaction data to recommend resources relevant to students' current learning context.
-
Supporting Multi-View User Ontology to Understand Company Value Chains
,
Landong Zuo,Manuel Salvadores,S. M. Hazzaz Imtiaz,John Darlington,Nicholas Gibbins,Nigel R. Shadbolt and James Dobree
,
925-940
,
[OpenAccess]
,
[Publisher]
The objective of the Market Blended Insight (MBI) project is to develop web based techniques to improve the performance of UK Business to Business (B2B) marketing activities. The analysis of company value chains is a fundamental task within MBI because it is an important model for understanding the market place and the company interactions within it. The project has aggregated rich data profiles of 3.7 million companies that form the active UK business community. The profiles are augmented by Web extractions from heterogeneous sources to provide unparalleled business insight. Advances by the Semantic Web in knowledge representation and logic reasoning allow flexible integration of data from heterogeneous sources, transformation between different representations and reasoning about their meaning. The MBI project has identified that the market insight and analysis interests of different types of users are difficult to maintain using a single domain ontology. Therefore, the project has developed a technique to undertake a plurality of analyses of value chains by deploying a distributed multi-view ontology to capture different user views over the classification of companies and their various relationships.
-
Using Hybrid Search and Query for E-discovery Identification
,
Dave Grosvenor and Andy Seaborne
,
811-826
,
[OpenAccess]
,
[Publisher]
We investigated the use of hybrid search and query for locating enterprise data relevant to a requesting party's legal case (e-discovery identification). We extended the query capabilities of SPARQL with search capabilities to provide integrated access to structured, semi-structured and unstructured data sources. Every data source in the enterprise is potentially within the scope of e-discovery identification, so we use some common enterprise structured data sources that provide product and organizational information to guide the search and restrict it to a manageable scale. We use hybrid search and query to conduct a rich high-level search, which identifies the key people and products to coarsely locate relevant data sources. Furthermore, the product and organizational data sources are also used to increase recall, which is a key requirement for e-discovery identification.
-
Vocabulary Matching for Book Indexing Suggestion in Linked Libraries - A Prototype Implementation and Evaluation
,
Antoine Isaac,Dirk Kramer,Lourens van der Meij,Shenghui Wang,Stefan Schlobach and Johan Stapel
,
843-859
,
[OpenAccess]
,
[Publisher]
In this paper, we report on a technology-transfer effort on using Semantic Web (SW) technologies, especially ontology matching, for solving a real-life library problem: book subject indexing. Our purpose is to streamline one library's book description process by suggesting new subjects based on descriptions created by other institutions, even when the vocabularies used are different. The case at hand concerns the National Library of the Netherlands (KB) and the network of Dutch local public libraries. We present a prototype subject suggestion tool, which is directly connected to the KB production cataloguing environment. We also report on the results of a user study and evaluation to assess the feasibility of exploiting state-of-the-art techniques in such a real-life application. Our prototype demonstrates that SW components can be seamlessly plugged into the KB production environment, which potentially brings a higher level of flexibility and openness to networked Cultural Heritage (CH) institutions. Technical hurdles can be tackled and the suggested subjects are often relevant, opening up exciting new perspectives on the daily work of the KB. However, the general performance level should be made higher to warrant seamless embedding in the production environment, notably by considering more contextual metadata for the suggestion process.
-
A Framework for Ontologies-based User Interface Integration
,
Heiko Paulheim and Florian Probst
,
[OpenAccess]
,
[Publisher]
-
A Proposed Diagrammatic Logic for Ontology Specification and Visualization
,
Ian Oliver,John Howse,Gem Stapleton,Esko Nuutila and Seppo Torma
,
[OpenAccess]
,
[Publisher]
We propose a diagrammatic logic that is suitable for specifying ontologies. We provide a specification of a simple ontology and include examples to show how to place constraints on ontology specifications and define queries. The framework also allows the depiction of instances, multiple ontologies to be related, and reasoning about ontologies.
-
An RDF-based Normalized Model for Biomedical Lexical Grid
,
Cui Tao,Jyotishman Pathak,Harold R. Solbrig,Wei-Qi Wei and Christopher G. Chute
,
[OpenAccess]
,
[Publisher]
The Lexical Grid (LexGrid) project is an on-going community-driven initiative coordinated by the Mayo Clinic Division of Biomedical Statistics and Informatics. It provides a common terminology model to represent multiple vocabulary and ontology sources as well as a scalable and robust API for accessing such information. While successfully used and adopted in the biomedical and clinical community, the LexGrid model now needs to be aligned with emerging Semantic Web standards and specifications. This paper introduces the LexRDF model, which maps the LexGrid model elements to corresponding constructs in W3C specifications such as RDF, OWL, and SKOS. With LexRDF, the terminological information represented in LexGrid can be translated into RDF triples, thereby allowing LexGrid to leverage standard tools and technologies such as SPARQL and RDF triple stores.
-
All About That - A URI Profiling Tool for monitoring and preserving Linked Data
,
Rob Vesse,Wendy Hall and Les Carr
,
[OpenAccess]
,
[Publisher]
All About That (AAT) is a URI Profiling tool which allows users to monitor and preserve Linked Data in which they are interested. Its design is based upon the principle of adapting ideas from hypermedia link integrity in order to apply them to the Semantic Web. As the Linked Data Web expands it will become increasingly important to maintain links such that the data remains useful and therefore this tool is presented as a step towards providing this maintenance capability.
-
Analogy Engines for the Semantic Web
,
Akshay Bhat
,
[OpenAccess]
,
[Publisher]
We propose a new utility for the Semantic Web called the Analogy Engine. The Analogy Engine employs an example-based search approach for retrieving the URIs most similar to a given URI by comparing the number of shared links. It is based on Analogy Space, which applies Singular Value Decomposition to a matrix representation of a semantic network. However, Analogy Space faces difficulty with networks having more than a few thousand nodes. We present our preliminary work on scaling Analogy Space by dividing the network into multiple communities and creating a separate Analogy Space for each community. We show that this procedure results in significant improvements and can be used for a large-scale network such as the Semantic Web.
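The core of the Analogy Space idea can be sketched in a few lines: build a concept-feature matrix, take a truncated SVD, and rank concepts by cosine similarity in the reduced space. The concepts, features, and data below are purely illustrative, not taken from the paper:

```python
import numpy as np

# Toy concept-feature matrix: rows are concepts (URIs), columns are
# shared links/features. Entries mark which links each concept has.
concepts = ["cat", "dog", "car"]
M = np.array([
    [1, 1, 0, 1],   # cat: has_fur, is_pet, has_wheels, is_animal
    [1, 1, 0, 1],   # dog
    [0, 0, 1, 0],   # car
], dtype=float)

# Truncated SVD projects each concept into a low-rank "Analogy Space".
U, s, Vt = np.linalg.svd(M, full_matrices=False)
k = 2
embedding = U[:, :k] * s[:k]          # one k-dimensional vector per concept

def most_similar(i):
    """Index of the concept most similar to concept i by cosine
    similarity in the reduced space (excluding i itself)."""
    v = embedding[i]
    sims = embedding @ v / (
        np.linalg.norm(embedding, axis=1) * np.linalg.norm(v) + 1e-12)
    sims[i] = -np.inf
    return int(np.argmax(sims))

print(concepts[most_similar(0)])      # "dog": shares all links with "cat"
```

The community-based scaling described in the abstract would amount to running this decomposition separately on the submatrix of each detected community rather than on the full network at once.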
-
Autonomous RDF Replication on Mobile Devices
,
Bernhard Schandl and Stefan Zander
,
[OpenAccess]
,
[Publisher]
Mobile applications are of increasing interest for research and industry. The widespread use and improved capabilities of portable devices enable the deployment of sophisticated and powerful applications that provide the user with services at any time and location. When such applications are built on top of Linked Data, permanent network connectivity is required, which is often not available or expensive to establish. Hence we propose a framework that uses RDF-based context descriptions to selectively and proactively replicate data to mobile devices. These replicas can be used when no network connection can be established, thus making mobile applications and users more autonomous and stable.
-
CoDR: A Contextual Framework for Diagnosis and Repair
,
Klaas Dellschaft,Qiu Ji and Guilin Qi
,
[OpenAccess]
,
[Publisher]
Ontologies play a central role for the formal representation of knowledge on the Semantic Web. A major challenge in collaborative ontology construction is to handle inconsistencies caused by changes to the ontology. In this paper, we present our CoDR system, which helps to diagnose and repair collaboratively constructed ontologies. CoDR integrates RaDON, an ontology diagnosis and repair tool, and Cicero, which provides discussion functionality for the ontology developers. CoDR is realized as a plugin for the NeOn Toolkit. It helps to use discussions held in Cicero as context information while repairing an ontology with RaDON, but it is also possible to use the diagnoses from RaDON during the discussions in Cicero.
-
Collaborative climate change research on the semantic web
,
Christian Battista and Benjamin Coe
,
[OpenAccess]
,
[Publisher]
WikiEarth (http://www.wikiearth.net) is a website designed for encouraging collaboration between researchers across the academic spectrum, and also serves as a test case to determine the limitations and benefits of using an ontological data structure to manage the input of natural science based data from around the world. Drawing upon Wikipedia's model of massive user collaboration, WikiEarth's motivation is to extend beyond this by formalizing the relationships between the data being entered. A semantic ontology is a natural candidate for data representation for three reasons: first, the hierarchical class structure of an OWL-Ontology helps avoid redundancy when developing simulations, as an operation can be applied to a class and all its subclasses; secondly, a framework like Jena helps eliminate human error and reduce the amount of data entry that needs to be performed; and finally, important restrictions regarding data entry are imposed by the ontological structure and Jena, as opposed to by a proprietary system developed for one specific application. Utilizing this infrastructure, a WikiEarth Climate Demonstration was successfully conceptualized, constructed, deployed and subsequently unveiled at the 2009 World Student Environmental Summit. The success of this application demonstrates that ontologies could be effectively purposed for a high-traffic production system.
-
Composition Optimizer: A Tool for Optimizing Quality of Semantic Web Service Composition
,
Freddy Lecue
,
[OpenAccess]
,
[Publisher]
Ranking and optimization of web service compositions are some of the most interesting challenges at present. Since web services can be enhanced with formal semantic descriptions, forming "semantic web services", it becomes conceivable to exploit the quality of semantic links between services (of any composition) as one of the optimization criteria. To this end, we propose to use the semantic similarities between output and input parameters of web services. Coupling this with other criteria such as quality of service (QoS) allows us to rank and optimize compositions achieving the same goal. We present the Composition Optimizer tool, using an innovative and extensible optimization model designed to balance semantic fit (or functional quality) with non-functional QoS metrics, in order to optimize service composition. To allow the use of this model in the context of a large number of services, as foreseen by the EC-funded project SOA4All, we propose and test the use of Genetic Algorithms.
-
Demonstration: Wireless Access Network Selection Enabled by Semantic Technologies
,
Carolina Fortuna,Bogdan Ivan,Zoltan Padrah,Luka Bradesko,Blaz Fortuna and Mihael Mohorcic
,
[OpenAccess]
,
[Publisher]
Service oriented access in a multi-application, multi-access network environment is faced with the problem of cross-layer interoperability among technologies. In this demo, we present a knowledge base (KB) which contains local (user terminal specific) knowledge that enables pro-active network selection by translating technology specific parameters to higher-level, more abstract parameters. We implemented a prototype which makes use of semantic technology (namely ResearchCyc) for creating the elements of the KB and uses reasoning to determine the best access network. The system implements technology-specific parameter mapping according to the IEEE 802.21 draft standard recommendation.
-
Finding Semantic Web Ontology Terms from Words
,
Lushan Han,Timothy Finin and Yelena Yesha
,
[OpenAccess]
,
[Publisher]
The Semantic Web was designed to unambiguously define and use ontologies to encode data and knowledge on the Web. Many people find it difficult, however, to write complex RDF statements and queries because it requires familiarity with the appropriate ontologies and the terms they define. We describe a framework that eases the experience of authoring and querying RDF data, focusing on automatically finding a set of appropriate Semantic Web ontology terms from a set of words used as the labels of nodes and edges in an incoming semantic graph.
-
GNOWSYS-mode in Emacs for collaborative construction of knowledge networks in plain text
,
Divya S,Alpesh Gajbe,Rajiv Nair,Ganesh Gajre and Nagarjuna G
,
[OpenAccess]
,
[Publisher]
GNOWSYS-mode is an Emacs extension package for knowledge networking and ontology management using GNOWSYS (Gnowledge Networking and Organizing SYStem) as a server. The demonstration shows how to collaboratively build ontologies and semantic networks in intuitive plain text without any RDF notation, though importing and exporting RDF is possible.
-
GoodRelations Tools and Applications
,
Martin Hepp,Andreas Radinger,Andreas Wechselberger,Alex Stolz,Daniel Bingel,Thomas Irmscher,Mark Mattern and Tobias Ostheim
,
[OpenAccess]
,
[Publisher]
The adoption of ontologies for the Web of Data can be increased by tools that help populating respective knowledge bases from legacy content, e.g. existing databases, business applications, or proprietary data formats. In this demo and poster, we show the results from our efforts of developing a suite of open-source tools for creating e-commerce descriptions for the Web of Data based on the GoodRelations ontology. Also, we demonstrate how RDF/XML data can be (1) submitted to Yahoo SearchMonkey via the RDF2DataRSS conversion tool, (2) inspected using the SearchMonkey Meta-Data Inspector, and (3) how common data inconsistencies can be spotted with the GoodRelations Validator.
-
Improved Semantic Graphs with Word Sense Disambiguation
,
Delia Rusu,Blaz Fortuna and Dunja Mladenic
,
[OpenAccess]
,
[Publisher]
Semantic graphs can be seen as a way of representing and visualizing textual information in more structured, RDF-like graphs. The reader thus obtains an overview of the content without having to read through the text. In building a compact semantic graph, an important step is grouping similar concepts under the same label and connecting them to external repositories. This is achieved through disambiguating word senses, in our case by assigning the sense to a concept given its context. The paper presents an unsupervised, knowledge-based word sense disambiguation algorithm for linking semantic graph nodes to the WordNet vocabulary. The algorithm is integrated in the semantic graph generation pipeline, improving the semantic graph readability and conciseness. Experimental evaluation of the proposed disambiguation algorithm shows that it gives good results.
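A classic knowledge-based approach of the kind the abstract describes is Lesk-style disambiguation: choose the sense whose dictionary gloss overlaps most with the concept's context. The tiny sense inventory below is invented for illustration; the paper itself links nodes to WordNet:

```python
# Illustrative sense inventory (sense id, gloss). In practice these
# would be WordNet synsets and their glosses.
SENSES = {
    "bank": [
        ("bank.n.01", "a financial institution that accepts deposits"),
        ("bank.n.02", "sloping land beside a body of water"),
    ],
}

def disambiguate(word, context):
    """Return the sense id whose gloss shares the most words with the
    context (a simplified Lesk-style overlap count)."""
    ctx = set(context.lower().split())
    best_id, best_overlap = None, -1
    for sense_id, gloss in SENSES.get(word, []):
        overlap = len(ctx & set(gloss.lower().split()))
        if overlap > best_overlap:
            best_id, best_overlap = sense_id, overlap
    return best_id

print(disambiguate("bank", "the river bank was covered in water"))
# -> bank.n.02 ("water" overlaps with the second gloss)
```

Real implementations add stop-word removal and gloss expansion via related synsets, but the overlap principle is the same.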
-
Integrated Ontology Matching and Evaluation
,
Isabel Cruz,Flavio Palandri Antonelli and Cosmin Stroe
,
[OpenAccess]
,
[Publisher]
The AgreementMaker system for ontology matching includes an extensible architecture, which facilitates the integration and performance tuning of a variety of matching methods, an evaluation mechanism, which can make use of a reference matching or rely solely on quality measures, and a multi-purpose user interface, which drives both the matching methods and the evaluation strategies. Our demo focuses on the tight integration of matching methods and evaluation strategies, a unique feature of our system.
-
Integrity Constraints in OWL
,
Jiao Tao and Evren Sirin
,
[OpenAccess]
,
[Publisher]
In many data-centric applications, it is desirable to use OWL as an expressive schema language with which one expresses constraints that must be satisfied by instance data. However, specific aspects of OWL's standard semantics---i.e., the Open World Assumption (OWA) and the absence of Unique Name Assumption (UNA)---make it difficult to use OWL in this way. In this paper, we present an Integrity Constraint (IC) semantics for OWL axioms, show that IC validation can be reduced to query answering, and present our preliminary results with a prototype implementation using Pellet.
-
Knowledge based conference video-recordings visualization
,
Maria Sokhn,Elena Mugellini,Omar Abou Khaled and Ahmed Serhrouchni
,
[OpenAccess]
,
[Publisher]
The advent of technologies in information retrieval driven by users' requests calls for an effort to conceive and develop semantic-based applications. In recent years the semantic web gave place for a new generation of search query engines that rely on the semantic of the documents expressed by metadata. In this paper we present a knowledge-based approach to visualizing and navigating through conference video-recordings. This approach is based on a conference ontology that models the information conveyed within a conference life cycle.
-
LMI-CliCKE: The Climate Change Knowledge Engine (CliCKE) using Semantic MediaWiki
,
David Manzolillo and Julia Kalloz
,
[OpenAccess]
,
[Publisher]
LMI is a not-for-profit research organization committed to helping government leaders and managers reach decisions that make a difference on issues of national importance. Climate change will be one of the defining issues of this century. It has moved from the province of specialists in environmental issues to one of concern for all government leaders. The International Panel on Climate Change (IPCC), the U.S. Global Change Research Program, and individual U.S. agencies have produced important studies of climate change. However, the IPCC Fourth Assessment Report (AR4) alone is over 2600 pages. Within these pages, LMI identified 2693 findings that include specific defined levels of uncertainty. The findings from the IPCC have been so thoroughly demonstrated by the scientific method that it would be a failure of responsibility to ignore them. They form the basis for the LMI Climate Change Knowledge Engine (LMI-CliCKE) and A Federal Leader's Guide to Climate Change, an LMI-published book written to assist leaders of federal agencies in addressing the challenges associated with climate change. Thorough analysis of the 2693 findings led LMI to develop a semantically driven, wiki-based web site that allows users to explore, analyze, evaluate, and compare scientific findings related to climate change. LMI-CliCKE gives full text and categorical details of the findings and relationships among them. As an initial prototype, the LMI climate team has selected and categorized all findings from the AR4.
-
Learning to Classify Identity Web References using RDF Graphs
,
Matthew Rowe and Jose Iria
,
[OpenAccess]
,
[Publisher]
The need to monitor a person's web presence has risen in recent years due to identity theft and lateral surveillance becoming prevalent web actions. In this paper we present a machine learning-inspired bootstrapping approach to monitor identity web references that only requires as input an initial small seed set of data modelled as an RDF graph. We vary the combination of different RDF graph matching paradigms with different machine learning classifiers and observe the effects on the classification of identity web references. We present preliminary results of an evaluation in order to show the variation in accuracy of these different permutations.
-
LifeLogOn: Log on to Your Lifelog Ontology!
,
Sangkeun Lee,Gihyun Gong and Sang-goo Lee
,
[OpenAccess]
,
[Publisher]
LifeLogOn is a system that enables users to easily and rapidly convert heterogeneous relational log data into an instance-level integrated log ontology without requiring any understanding of ontology languages. It also visualizes the created log ontology and allows users to navigate entities and events in the ontology by following their semantic relationships. This demo shows that integration of logs from many different sources can be a practical starting point for realizing life logging, which can support users' memory and future intelligent services.
-
Multi-faceted Tagging in TagMe!
,
Fabian Abel,Ricardo Kawase,Daniel Krause and Patrick Siehndel
,
[OpenAccess]
,
[Publisher]
In this paper we present TagMe!, a tagging and exploration front-end for Flickr images, which enables users to add categories to tag assignments and to attach tag assignments to a specific area within an image. We analyze the differences between tags and categories and show how both facets can be applied to learn semantic relations between concepts referenced by tags and categories. Further, we discuss how the multi-faceted tagging helps to improve the retrieval of folksonomy entities. The TagMe! system is currently available at http://tagme.groupme.org
-
NCBO Annotator: Semantic Annotation of Biomedical Data
,
Clement Jonquet,Nigam H. Shah,Cherie H. Youn,Chris Callendar,Margaret-Anne Storey and Mark A Musen
,
[OpenAccess]
,
[Publisher]
The National Center for Biomedical Ontology Annotator is an ontology-based web service for annotation of textual biomedical data with biomedical ontology concepts. The biomedical community can use the Annotator service to tag datasets automatically with concepts from more than 200 ontologies coming from the two most important sets of biomedical ontology and terminology repositories: the UMLS Metathesaurus and NCBO BioPortal. Through annotation (or tagging) of datasets with ontology concepts, unstructured free-text data becomes structured and standardized. Such annotations contribute to creating a biomedical semantic web that facilitates translational scientific discoveries by integrating annotated data.
-
OntoPipeliner: A Semantic Broker-based Manager for Pipelining Semantically-operated Services
,
Hanmin Jung,Seungwoo Lee,Pyung Kim,Mikyoung Lee and Beom-Jong You
,
[OpenAccess]
,
[Publisher]
Like web services, semantically-operated services can be assembled to construct a new composite service. For this, we designed a semantic broker that searches for semantic services matching given conditions, assembles them to dynamically generate pipelines of semantic services, and executes the pipelines. By executing the resulting pipelines, the user can select the one he/she really intended. In this way, our system can help users who want to design new semantically-operated services by mashing up existing semantically-operated services.
-
Open Ontology Repository
,
Todd Schneider and Kenneth Baclawski
,
[OpenAccess]
,
[Publisher]
The Open Ontology Repository is an open source effort to develop infrastructure for ontologies that is federated, robust and secure. This article describes the purpose, requirements and goals of this initiative.
-
Querying and Semantically Integrating Spreadsheet Collections with XLWrap-Server
,
Andreas Langegger and Wolfram Wöß
,
[OpenAccess]
,
[Publisher]
In this demo we will present XLWrap-Server, a wrapper for collections of spreadsheets that provides a SPARQL and Linked Data interface similar to D2R-Server. It is based on XLWrap, a novel approach for generating RDF graphs of arbitrary complexity from spreadsheets with different layouts. To the best of our knowledge, XLWrap is the first spreadsheet wrapper supporting cross tables and tables where data is not aligned in rows. It features a full expression algebra based on the syntax of OpenOffice Calc, which can be easily extended by users, and it supports Microsoft Excel, Open Document, and large CSV spreadsheets. XLWrap-Server can be used to integrate information from a collection of spreadsheets. We will show several use cases and mapping design patterns in our demonstration.
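The essence of wrapping a cross table (where observations span rows and columns rather than being aligned in rows) as RDF can be sketched as below. XLWrap itself uses a much richer mapping language; the URIs, vocabulary, and data here are made up for illustration:

```python
import csv
import io

# A toy cross table: products down the rows, years across the columns.
sheet = io.StringIO("product,2008,2009\nwidget,10,12\ngadget,7,9\n")

rows = list(csv.reader(sheet))
years = rows[0][1:]
triples = []
for row in rows[1:]:
    product = row[0]
    for year, value in zip(years, row[1:]):
        # One observation resource per (product, year) cell.
        s = f"http://example.org/sales/{product}-{year}"
        triples.append((s, "http://example.org/vocab#product", product))
        triples.append((s, "http://example.org/vocab#year", year))
        triples.append((s, "http://example.org/vocab#units", value))

print(len(triples))  # 2 products x 2 years x 3 triples each = 12
```

A SPARQL endpoint over the resulting graph can then answer queries that the original row/column layout could not express directly.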
-
RDA and the Open Metadata Registry
,
Jon Phipps and Diane Hillmann
As more and more of the world's databases are opened to the Semantic Web as linked data, there is a growing awareness of the need for upper-level ontologies and RDF vocabularies to support the dissemination of this data. For more than 150 years libraries have been developing standards for describing resources contained in the world's libraries. This year, for the first time in its long history, the library community is making that experience and knowledge freely available as a coordinated set of controlled vocabularies and upper-level ontologies. Resource Description and Access (RDA) is the international library community's new standard for resource description. A component of this standard -- the RDA Vocabularies -- will finally allow libraries to make the vast silos of library and museum metadata publicly available as semantically rich linked data, and provide the semantic web and linked data communities access to more than a century of library experience in describing resources. The Open Metadata Registry is hosting these vocabularies. The Registry is an Open Source, non-commercial project specifically designed to provide individuals, communities, and organizations an easy-to-use platform supporting the development and dissemination of multi-lingual controlled vocabularies and upper-level and domain-specific ontologies. This demo, poster and related handouts will introduce Resource Description and Access (RDA) and the Open Metadata Registry vocabulary development platform to the Semantic Web Community.
-
RDF2RDFa: Turning RDF into Snippets for Copy-and-Paste
,
Martin Hepp,Roberto Garcia and Andreas Radinger
,
[OpenAccess]
,
[Publisher]
In this demo and poster, we show a conceptual approach and an on-line tool that allows the use of RDFa for embedding non-trivial RDF models in the form of invisible div/span elements into existing Web content. This simplifies the publication of sophisticated RDF data, i.e. data that goes beyond simple property-value pairs, by broad audiences. Also, it empowers users whose access is limited to inserting XHTML snippets within Web-based authoring systems to add fully-fledged RDF and even OWL. This is a frequent limitation for users of CMS systems or wikis.
-
RKBPlatform: Opening up Services in the Web of Data
,
Hugh Glaser and Ian Millard
,
[OpenAccess]
,
[Publisher]
As the Linked Data initiatives and Web of Data become more widespread, sites that process and re-present the published data are growing in size and number. One challenge is to ensure that such sites do not themselves fall into the trap of failing to publish their new knowledge in a readily available manner. Not only should the work of such sites be re-published for Linked Data users, but it should also be accessible to site builders who have not yet embraced the Semantic Web. This paper presents the work that has been done with the RKBExplorer system to support this task, and describes examples of how it is used.
-
SIOOS: Semantically-driven Integration of Ocean Observing Systems
,
Mark Cameron,Jemma Wu,Kerry Taylor,David Ratcliffe,Geoffrey Squire and John Colton
,
[OpenAccess]
,
[Publisher]
The diversity and heterogeneity of ocean observing systems obstructs the information flow needed to fully realise their benefits. SIOOS is a prototype for semantically-driven integration of ocean observation systems. SIOOS is built upon our Semantic Service Architecture platform, making rich use of complex ontologies and ontology-to-resource mappings to offer a flexible, semantically-driven integration environment. The SIOOS prototype draws on a federation of autonomous web and sensor observation services from the Integrated Ocean Observing System (IOOS). In this demonstration, we will use typical information management scenarios drawn from the ocean observation community to highlight major features of SIOOS and show how these features address some of the challenges faced by the IOOS community.
-
SKOS2OWL: An Online Tool for Deriving OWL and RDF-S Ontologies from SKOS Vocabularies
,
Martin Hepp and Andreas Radinger
,
[OpenAccess]
,
[Publisher]
Hierarchical classifications are available for many domains of interest. They often provide a large amount of category definitions and some sort of hierarchy. Thanks to their size and popularity, they are a promising ground for publishing and organizing data on the Semantic Web. Unfortunately, classifications can mostly not be used directly as ontologies in OWL, because they are not ontologies (or at best very poor ones). In particular, the category labels often lack a context-neutral notion of what it means to be an instance of that category, and the meaning of the hierarchical relations is often not a strict subClassOf. SKOS2OWL is an online tool that allows deriving consistent RDF-S or OWL ontologies from most hierarchical classifications available in the W3C SKOS exchange format. SKOS2OWL helps the user narrow down the intended meaning of the available categories to classes and guides the user through several modeling choices. In particular, SKOS2OWL can draw a representative random sample of relevant conceptual elements in the SKOS file and ask the user to make statements about their meaning. This can be used to make reliable modeling decisions without looking at every single element, which would be infeasible for large classifications.
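One of the modeling choices such a derivation involves can be sketched as a simple triple rewrite: interpret each skos:Concept as an owl:Class and each skos:broader link as rdfs:subClassOf. This hard-codes the "class" interpretation that SKOS2OWL lets the user confirm or reject per sampled element; the example vocabulary is invented:

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"

def skos_to_owl(triples):
    """Rewrite SKOS triples under the 'class' interpretation:
    skos:Concept -> owl:Class, skos:broader -> rdfs:subClassOf."""
    out = []
    for s, p, o in triples:
        if p == RDF_TYPE and o == SKOS + "Concept":
            out.append((s, RDF_TYPE, OWL_CLASS))
        elif p == SKOS + "broader":
            out.append((s, RDFS + "subClassOf", o))
    return out

vocab = [
    ("ex:Dog", RDF_TYPE, SKOS + "Concept"),
    ("ex:Animal", RDF_TYPE, SKOS + "Concept"),
    ("ex:Dog", SKOS + "broader", "ex:Animal"),
]
print(len(skos_to_owl(vocab)))  # 3 rewritten triples
```

The abstract's warning applies exactly here: this rewrite is only sound for categories whose broader-links really are strict subclass relations, which is why the tool asks the user about a sample before committing.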
-
SPoX: combining reactive Semantic Web policies and Social Semantic Data to control the behaviour of Skype
,
Philipp Karger,Emily Kigel and VenkatRam Yadav Jaltar
,
[OpenAccess]
,
[Publisher]
In this demo paper we describe SPoX, a tool that allows users to define the behaviour of Skype based on reactive Semantic Web policies. SPoX (Skype Policy Extension) enables users to define policies stating, for example, who is allowed to call and whose chat messages show up. Moreover, SPoX reacts to arbitrary events in Skype's social network as well, such as on-line status changes of users or the birthday of a friend. The decisions about how SPoX reacts are defined by means of Semantic Web policies that not only consider the context of the user (such as time or on-line status) but also include Social Semantic Web data in the policy reasoning process. By this means, users can state that, for instance, only people defined as friends in their FOAF profile, only friends on Twitter, or even only people they wrote a paper with are allowed to call. Further, SPoX exploits Semantic Web techniques for advanced negotiations by means of exchanging policies over the Skype application channel. This procedure allows two clients to negotiate trust based on their SPoX policies before a connection - for example a Skype call - is established.
-
STARS - Semantic Tools for Screen Arts Research
,
Simon Price,Jasper Tredgold,Nikki Rogers,Mike Jones,Damian Steer and Angela Piccini
,
[OpenAccess]
,
[Publisher]
STARS is an open source e-research tool that enables screen arts researchers to browse, annotate and replay moving image content in order to better understand its thematic links to those people and communities involved in all aspects of its creation. The STARS software was built using Semantic Web technologies to address the technical challenges of integrated searching, browsing and visualisation across curated core data and user-contributed annotations.
-
SaHaRa: Discovering Entity-Topic Associations in Online News
,
Krisztian Balog,Maarten de Rijke,Raymond Franz,Hendrike Peetz,Bart Brinkman,Ivan Johgi and Max Hirschel
,
[OpenAccess]
,
[Publisher]
We present SaHaRa, a system that helps to discover and analyze the relationship between entities and topics in large collections of news articles. We augment entity related search by including semantically related linked open data.
-
Semantic RPC via Queries
,
Mohammad Reza Tazari
,
[OpenAccess]
,
[Publisher]
A vision of the Semantic Web is to facilitate global software interoperability. Many approaches and specifications are available that work towards realization of this vision: service-oriented architectures (SOA) provide a good level of abstraction for interoperability; Web Services provide programmatic interfaces for application-to-application communication in SOA; and there are ontologies that can be used for machine-readable description of service semantics. What is still missing is a standard for constructing semantically formulated service requests that rely solely on shared domain ontologies without depending on programmatic or even semantically described interfaces. Semantic RPC would then include the whole process from issuing such a request, matchmaking with semantic profiles of available and accessible services, deriving input parameters for the matched service(s), calling the service(s), getting the results, and mapping the results back onto an appropriate response to the original request. The standard must avoid realization-specific assumptions so that frameworks supporting semantic RPC can be built for bridging the gap between the semantically formulated service requests and matched programmatic interfaces. This poster introduces a candidate solution to this problem by outlining a query language for semantic service utilization based on an extension of the OWL-S ontology for service description.
-
Semantic Web Reasoning by Swarm Intelligence
,
Kathrin Dentler,Stefan Schlobach and Christophe Gueret
,
[OpenAccess]
,
[Publisher]
Semantic Web reasoning systems are confronted with the task to process growing amounts of distributed, dynamic resources. We propose a novel way of approaching the challenge by RDF graph traversal, exploiting the advantages of Swarm Intelligence. Our nature-inspired methodology is realised by self-organising swarms of autonomous, light-weight entities that traverse RDF graphs by following paths, aiming to instantiate pattern-based inference rules.
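As an illustration only (this is not the authors' implementation), the core idea of instantiating a pattern-based inference rule by traversing an RDF graph can be sketched with a light-weight "agent" that follows matching edges; here the rule is rdfs:subClassOf transitivity:

```python
# Illustrative sketch of rule instantiation by graph traversal.
# A light-weight agent follows a subClassOf edge, then a second
# subClassOf edge from its endpoint, and asserts the transitive triple.
SUB = "rdfs:subClassOf"

def walk_and_infer(triples):
    graph = set(triples)
    frontier = list(graph)          # agents keep moving until quiescent
    while frontier:
        s, p, o = frontier.pop()
        if p != SUB:
            continue
        for s2, p2, o2 in list(graph):   # follow an outgoing edge from o
            if s2 == o and p2 == SUB:
                new = (s, SUB, o2)
                if new not in graph:     # instantiate the rule head
                    graph.add(new)
                    frontier.append(new)
    return graph

facts = [("Ferry", SUB, "Ship"), ("Ship", SUB, "Vehicle")]
closed = walk_and_infer(facts)
```

A real swarm system would run many such agents asynchronously over distributed graph fragments; this single-threaded loop only shows the path-following pattern.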
-
Semantic Web Technologies to Improve Customer Service
,
Kerstin Denecke,Gideon Zenz and Wladimir Krasnov
,
[OpenAccess]
,
[Publisher]
In this paper, we present an approach that exploits semantic web technologies to categorize specialized text and to create hierarchical facets representing the document content. For this purpose, domain knowledge represented by a thesaurus of relevant, domain-specific terms is used to identify such terms in the text. Based on dependency information between single terms provided by the thesaurus (hypernymy, hyponymy), we create hierarchical facets representing the content of the text. The algorithm is applied to a collection of service messages and shows promising results in text categorization.
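The facet-building step can be sketched as follows (the thesaurus content and function names here are invented for illustration): hypernym links turn a flat term match into a root-to-term facet path.

```python
# Hedged sketch: build a hierarchical facet path from hypernym links.
# The tiny thesaurus below is invented; a real system would load a
# domain thesaurus with hypernymy/hyponymy relations.
HYPERNYM = {            # term -> broader term
    "laptop": "computer",
    "computer": "device",
    "printer": "device",
}

def facet_path(term):
    """Climb hypernym links to produce a root-to-term facet path."""
    path = [term]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return list(reversed(path))
```

Terms identified in a document would then be grouped under the shared prefixes of their facet paths, yielding the hierarchical facets described above.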
-
Semantic-Powered Research Profiling
,
Zhixiong Zhang,Ying Ding and Na Hong
,
[OpenAccess]
,
[Publisher]
Research profiling is a widely-adopted method to monitor research development and rank research performance. This paper describes a novel infrastructure to generate semantic-powered research profiling for research fields, organizations and individuals. It crawls related websites and news feeds, extracts research terms, research objects and relations from them, and uses the proposed Research Ontology to model them as RDF triples to facilitate semantic queries and semantic mining on burst detection, hot topic detection, dynamics of research, and relation mining. The authors implement a research profiling experiment in the Artificial Intelligence area to show the effectiveness of research profiling based on semantic mining.
-
Semantically Annotating RESTful Services with SWEET
,
Maria Maleshkova,Carlos Pedrinaci and John Domingue
,
[OpenAccess]
,
[Publisher]
This paper presents SWEET, the first tool developed to support users in creating semantic RESTful services by structuring service descriptions and associating semantic annotations with them. The aim is to support a higher level of automation when performing common tasks with RESTful services, such as their discovery and composition.
-
Sense Aware Searching and Exploration with MyTag
,
Klaas Dellschaft,Olaf Gorlitz and Martin Szomszor
,
[OpenAccess]
,
[Publisher]
In this work, we describe our approach to dealing with tag ambiguity in tagging systems and to enabling a sense-aware, or semantic, search. The sense-aware search is realized by means of a Sense Repository, which returns a list of potential senses for given terms. This list is then presented to the user of the cross-folksonomy search engine MyTag, who can explicitly select the sense to search for. The search results are then ranked according to this sense so that relevant resources appear higher in the result list.
-
SensorMasher: Enabling open linked data in sensor data mashup
,
Danh Le Phuoc and Manfred Hauswirth
In this demo, we demonstrate a platform which makes sensor data available following the linked open data principles and enables the seamless integration of such data into mashups. SensorMasher publishes sensor data as Web data sources which can then easily be integrated with other (linked) data sources and sensor data. Raw sensor readings and sensors can be semantically described and annotated by the user. These descriptions can then be exploited in mashups and in linked open data scenarios and enable the discovery and integration of sensors and sensor data at large scale. The user-generated mashups of sensor data and linked open data can in turn be published as linked open data sources and be used by others.
-
Spatial and Semantic Reasoning to Recognize Ship Behavior
,
Willem Robert van Hage,Gerben de Vries,Veronique Malaise,Guus Schreiber and Maarten van Someren
,
[OpenAccess]
,
[Publisher]
This demo shows the integration of spatial and semantic reasoning for the recognition of ship behavior. We recognize abstract behavior such as "ferry trip" and derive that the ship showing this behavior is a "ferry". We accomplish this by abstracting over low-level ship trajectory data and applying Prolog rules that express properties of ship behavior. These rules make use of the GeoNames ontology and a spatial indexing package for SWI-Prolog, which is available as open source software.
-
Tetherless World Mobile Wine Agent: An Application for Semantics on Mobile Devices
,
Evan Patton and Deborah McGuinness
,
[OpenAccess]
,
[Publisher]
The Tetherless World Mobile Wine Agent integrates semantics, geolocation, and social networking on a low-power, mobile platform to provide a unique food and wine recommender system. It provides a robust user interface that allows users to describe a wealth of information about foods and wines as OWL classes and instances and it allows users to share these descriptions with their friends via custom URIs. This demonstration will examine how the user interface simplifies generating RDF data, how location services such as GPS can simplify reasoning (reducing the ABox due to context-sensitive information), and how users of the Mobile Wine Agent can utilize social networking tools such as Facebook and Twitter to share content with others over the World Wide Web.
-
The Data-gov Wiki: A Semantic Web Portal for Linked Government Data
,
Li Ding,Dominic DiFranzo,Sarah Magidson,Deborah McGuinness and Jim Hendler
,
[OpenAccess]
,
[Publisher]
The Data-gov Wiki is the delivery site for a project where we investigate the role of linked data in producing, processing and utilizing the government datasets found on data.gov. The project has generated over 2 billion triples from government data and a few interesting applications covering data access, visualization, integration, linking and analysis.
-
The SILK System: Scalable and Expressive Semantic Rules
,
Benjamin Grosof,Mike Dean and Michael Kifer
,
[OpenAccess]
,
[Publisher]
SILK is a new knowledge representation (KR) language and system that integrates and extends recent theoretical and implementation advances in semantic rules and ontologies. It addresses fundamental requirements for scaling the Semantic Web to large knowledge bases in science and business that answer questions, proactively supply info, and reason powerfully. SILK radically extends the KR power of W3C OWL RL, SPARQL, and RIF-BLD, as well as of SQL and production rules. It includes defaults (cf. Courteous LP), higher-order features (cf. HiLog), frame syntax (cf. F-Logic), external actions (cf. production rules), and sound interchange with the main existing forms of knowledge/data in the Semantic Web and deep Web. These features cope with knowledge quality and context, provide flexible meta-reasoning, and activate knowledge.
-
The alpha Urban LarKC, a Semantic Urban Computing application
,
Emanuele Della Valle,Irene Celino and Daniele Dell'Aglio
,
[OpenAccess]
,
[Publisher]
This paper describes the alpha Urban LarKC, one of the first Urban Computing applications built with Semantic Web technologies. It is based on the LarKC platform and makes use of publicly available data sources on the Web which provide interesting information about an urban environment (the city of Milano, Italy).
-
Towards Soundness Preserving Approximation for TBox Reasoning in OWL 2
,
Yuan Ren,Jeff Z. Pan and Yuting Zhao
,
[OpenAccess]
,
[Publisher]
Large scale semantic web applications require efficient and robust description logic (DL) reasoning services. In this paper, we present a soundness-preserving tractable approximative reasoning approach for TBox reasoning in R, a fragment of OWL2-DL supporting ALC GCIs and role chains, with 2ExpTime-hard complexity. We first rewrite the ontologies into EL+ with an additional complement table maintaining the complementary relations between named concepts, and then classify the approximation. Preliminary evaluation shows that our approach can classify existing large-scale benchmarks efficiently with high recall.
-
Tripcel: Exploring RDF Graphs using the Spreadsheet Metaphor
,
Bernhard Schandl
,
[OpenAccess]
,
[Publisher]
Spreadsheet tools are often used in business and private scenarios in order to collect and store data, and to explore and analyze these data by executing functions and aggregations. They allow users to incrementally compose calculations by filling cells with formulas that are evaluated against data in the sheet, and expressions can be nested via cell references. In this paper we present Tripcel, a tool that applies the spreadsheet concept to RDF. It allows users to formulate expressions over the contents of an RDF graph, to arrange these expressions in a grid, and to interactively inspect their evaluation results. Thus it can be used to perform analysis tasks over large data sets within an understandable and familiar interface.
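The spreadsheet-over-RDF idea can be sketched in a few lines (cell names and the expression primitive below are ours, not Tripcel's API): cells hold expressions over an RDF graph, and a cell may reference another cell, so evaluations nest just like spreadsheet formulas.

```python
# Hedged sketch of a spreadsheet metaphor over an RDF graph.
# The graph, cell names, and expression primitive are invented.
graph = {
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "knows", "carol"),
}

def objects(g, s, p):
    """Expression primitive: all objects of triples matching (s, p, ?)."""
    return {o for (s2, p2, o) in g if s2 == s and p2 == p}

# A "sheet": each cell is a function of the graph and the sheet itself,
# which lets formulas reference other cells (like =COUNT(A1)).
sheet = {
    "A1": lambda g, sh: objects(g, "alice", "knows"),
    "B1": lambda g, sh: len(sh["A1"](g, sh)),   # nested via cell reference
}

def evaluate(cell):
    return sheet[cell](graph, sheet)
```

Interactive inspection in the tool would correspond to re-running `evaluate` whenever a cell or the underlying graph changes.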
-
Ultrawrap: Using SQL Views for RDB2RDF
,
Juan Sequeda,Rudy Depena and Daniel Miranker
,
[OpenAccess]
,
[Publisher]
Ultrawrap is an automatic wrapping system that synthesizes an OWL ontology from the database's SQL schema and provides SPARQL query services for legacy relational databases. The system intentionally defines triples by using SQL view statements. The benefits of this organization include: virtualizing the triple table assures real-time consistency between relational and semantic accesses to the database, and the existing SQL optimizer implements the most challenging aspects of rewriting SPARQL into equivalent queries on the relational representation of the data. Initial experiments are auspicious.
-
Understanding Justifications for Entailments in OWL
,
Matthew Horridge,Bijan Parsia and Ulrike Sattler
,
[OpenAccess]
,
[Publisher]
Recent work in explanation of entailments in ontologies has focused on justifications and their variants. While in many cases just presenting the justification is sufficient for user understanding, and in all cases justifications are much better than nothing, we have empirically identified cases where understanding how a justification supports an entailment is inordinately difficult. Indeed, there are naturally occurring justifications that people, with varying expertise in OWL, cannot understand. To address this problem, we have developed a novel conceptual framework for justification-oriented proofs. Given a justification for an entailment in an ontology, intermediate inference steps, called lemmas, are automatically derived that bridge the gap between the axioms in the justification and the entailment. The proof shows in a stepwise way how the lemmas and ultimately the entailment follow from the justification. At the heart of the framework is the notion of a "complexity model", which predicts how easy or difficult it is for a user to understand a justification, and is used for selecting the lemmas to insert into a proof. This poster and demo presents this framework backed by a prototype implementation.
-
Viewpoint Management for Multi-Perspective Issues of Ontologies
,
Kouji Kozaki,Takeru Hirota,Hiroko Kou,Mamoru Ohta and Riichiro Mizoguchi
,
[OpenAccess]
,
[Publisher]
This paper discusses semantic technologies for multi-perspective issues of ontologies based on ontological viewpoint management. We developed two technologies and implemented them in the environmental and medical domains. The first is a conceptual map generation tool which allows users to explore an ontology according to their own perspectives and visualizes it in a user-friendly form, i.e. a conceptual map. The other is on-demand reorganization of the is-a hierarchy of an ontology. Together they contribute to an integrated understanding of ontologies and a solution to their multi-perspective issues.
-
Web of Data Plumbing - Lowering the Barriers to Entry
,
Juergen Umbrich,Hugh Glaser,Tuukka Hastrup,Ian Millard and Michael Hausenblas
,
[OpenAccess]
,
[Publisher]
Publishing and consuming content on the Web of Data often requires considerable expertise in the underlying technologies, as the expected services to achieve this are either not packaged in a simple and accessible manner, or are simply lacking. In this poster, we address selected issues by briefly introducing the following essential Web of Data services designed to lower the entry-barrier for Web developers: (i) a multi-ping service, (ii) a meta search service, and (iii) a universal discovery service.
-
gOntt: a Tool for Scheduling Ontology Development Projects
,
Asuncion Gomez-Perez,Mari Carmen Suarez-Figueroa and Martin Vigo
,
[OpenAccess]
,
[Publisher]
The Ontology Engineering field lacks tools that guide ontology developers to plan and schedule their ontology development projects. gOntt helps ontology developers in two ways: (a) to schedule ontology projects; and (b) to execute such projects based on the schedule and using the NeOn Methodology.
-
iSMART: intelligent Semantic MedicAl Record reTrieval
,
Yuan Ni,Guo Tong Xie,Sheng Ping Liu,Han Yu Li,Jing Mei,Gang Hu,Hai Feng Liu and Xue Qiao Hou
,
[OpenAccess]
,
[Publisher]
We present iSMART, a system for intelligent Semantic MedicAl Record reTrieval. Health Level 7 Clinical Document Architecture (CDA)[4], an XML-based standard, is well recognized for the representation and exchange of medical records. In CDAs, medical ontologies/terminologies, e.g. SNOMED CT[2], are used to specify the semantic meaning of clinical statements. To better use the structural and semantic information in CDAs for more effective search, we propose and implement the iSMART system. Firstly, we design and implement an XML-to-RDF convertor that extracts RDF statements from medical records using a declarative mapping. Then, we design a reasoner that infers additional information over the extracted RDF statements by integrating knowledge from the domain ontologies. Finally, we index the inferred set of RDF statements and provide semantic search over them. A demonstration video is available online[1].
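The declarative XML-to-RDF mapping step can be illustrated as follows (the element paths, predicates, and document below are invented for the sketch and are not iSMART's actual mapping vocabulary):

```python
# Hedged sketch of a declarative mapping from CDA-style XML to RDF triples.
import xml.etree.ElementTree as ET

# Declarative mapping: element path -> RDF predicate (names invented).
MAPPING = {
    "./patient/name": "cda:patientName",
    "./observation/code": "cda:observationCode",
}

def xml_to_rdf(xml_text, subject):
    """Apply each mapping rule, emitting one triple per matched element."""
    root = ET.fromstring(xml_text)
    triples = []
    for path, predicate in MAPPING.items():
        for el in root.findall(path):
            triples.append((subject, predicate, el.text))
    return triples

doc = """<record>
  <patient><name>Alice</name></patient>
  <observation><code>SNOMED:22298006</code></observation>
</record>"""
triples = xml_to_rdf(doc, "record/1")
```

The extracted triples would then feed the reasoner and the semantic index described above.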
-
sClippy: Connecting Personal Information and Linked Open Data
,
Tudor Groza,Laura Dragan,Siegfried Handschuh and Stefan Decker
,
[OpenAccess]
,
[Publisher]
The exponential growth of the World Wide Web in the last decade brought an explosion of the information space, with important consequences also for scientific research. Finding relevant work in a particular field and exploring the links between publications is thus quite a cumbersome task. Similarly, on the desktop, managing the publications acquired over time can represent a real challenge. Extracting semantic metadata, exploring the linked data cloud, and using the semantic desktop for managing personal information represent, in part, solutions for different aspects of the issues mentioned above. In this poster/demo, we show an innovative approach to bridging these three directions, with the overall goal of alleviating the information overload problem burdening early-stage researchers.