Extracting knowledge from web communities and linked data for case-based reasoning systems

Sauer, Christian; Roth-Berghofer, Thomas

Sauer, Christian and Roth-Berghofer, Thomas (2013) Extracting knowledge from web communities and linked data for case-based reasoning systems. In: 18th UK Workshop on Case-Based Reasoning, 10 Dec 2013, Cambridge, UK.

PDF
Extracting Knowledge from Web Communities and Linked Data for Case-based Reasoning Systems.pdf - Accepted Version

Official URL: http://ukcbr.org.uk/ukcbr13/index.html

Abstract

Recent developments in Web 2.0 have shifted web content from a static, formalised nature to a highly user-driven one. Such content includes blogs, forum posts and tweets, which are mostly expressed in an unsystematic manner. For this reason, retrieving and reusing this content has become challenging. As a solution, Reichle et al. [6] present a novel architecture named SEASALT, together with the docQuery project, an instantiation of that architecture focusing on the domain of travel medicine. This paper demonstrates the use of Twitter feeds as a knowledge source within the SEASALT architecture, expanding the knowledge base of the docQuery project. A multi-agent system is developed to acquire Twitter feeds related to travel medicine, which are then passed for further knowledge extraction to the Knowledge Extraction Workbench (KEWo), the Apprentice agent component of the SEASALT architecture. Twitter is analysed as a knowledge source in terms of the amount of data it can provide on a specific topic and how that number of tweets affects the performance of the extraction and the quality of the knowledge extracted from them. Furthermore, the paper analyses how well Twitter's hashtag feature can be employed as a source of structuring information. As a result of this analysis, a set of Group-By Features is introduced to enhance knowledge extraction based on attributes of Twitter feeds such as retweet count and number of followers. As its final output, the paper demonstrates how to create a virtual community of experts within the SEASALT architecture for further knowledge extraction from that community.