
Automatic classification of earthquake-induced building damages

by Pavlos Fafalios




Institution: University of Crete (UOC); Πανεπιστήμιο Κρήτης
Department:
Year: 2016
Keywords: Εξερευνητική αναζήτηση; Διασυνδεδεμένα δεδομένα; Εξόρυξη οντοτήτων; Τυχαίος περιηγητής; Σημασιολογική αναζήτηση; Exploratory search; Linked Open Data; Named-entity extraction; Random walk; Semantic search
Posted: 02/05/2017
Record ID: 2127577
Full text PDF: http://hdl.handle.net/10442/hedi/38725


Abstract

In recent years we have witnessed an explosion in the publication of data on the Web, mostly in the form of Linked Data. An important question is how typical users, who mainly use keyword search queries, can access and exploit this constantly growing body of knowledge. Although existing interaction paradigms in Semantic Search hide their complexity behind easy-to-use interfaces, they have not managed to cover common search needs. At the same time, according to several studies, a large number of search tasks are exploratory in nature, and in such tasks the traditional 'ranked list' approach to interacting with the retrieved results is often inadequate. The objective of this thesis is to enable effective exploratory search services that bridge the gap between the classic responses of non-semantic search systems (e.g., Professional Search Systems, Web Search Engines) and semantic information expressed in the form of Linked Open Data (LOD). To this end, we introduce an approach in which named entities (such as names of persons, locations, and chemical substances) are exploited as the glue for automatically connecting documents (search results) with data and knowledge. We study an approach in which this entity-based integration is performed in real time, without any human intervention and without the need for prebuilt indexes. This allows the provision of 'fresh' information, the easy configuration of this functionality according to the needs of the underlying search application, and its easy exploitation by existing search systems. Providing this functionality is challenging. First, the LOD available on the Web is large, is distributed across many knowledge bases, grows and is updated continuously, and covers many domains. Consequently, there is a need for an interoperability model that allows the specification of the entities of interest as well as of the related and useful semantic data.
In addition, the number of entities that can be extracted from the search results can be very high, and the same is true of the amount of semantic information that can be retrieved from the LOD for these entities (i.e., the number of their attributes and of their associations with other entities). Thus, there is also a need for methods that can estimate which entities, attributes, and associations are important for the search context. To cope with the above challenges, this thesis proposes a semantic analysis process in which the search results are connected with data and knowledge in real time without any human intervention. For describing the entities of interest, as well as the related (and useful for the application context) semantic information, we propose a generic model for configuring a Named Entity Extraction (NEE) system, while for specifying the semantics of this model, we introduce an RDF/S vocabulary, called 'Open NEE Configuration Model', which allows a NEE system to describe (and publish as LOD) its entity-mining capabilities. To enable associating the result of a NEE process…