SIGIR 2015 Tutorial on Wikipedia Leverage for Information Retrieval

Exploiting Wikipedia for Information Retrieval Tasks.

Wikipedia - the online encyclopedia - has long been used by researchers as a source of information, and it is also a subject of research in its own right. It has been shown to be effective in recommender systems, sentiment analysis, validation, and multiple domains in information retrieval. One reason for Wikipedia's popularity among researchers and practitioners is the variety of information types it contains, which lets practitioners select the right "tool" for their respective tasks. Alongside its great potential, this multitude of information sources also poses a challenge: which sources are best suited to a specific problem, and how can different types of data be combined?

Slides are here!

Following the tutorial's presentation at SIGIR 2015, a report is now available. Please cite it if you use these resources in your own Wikipedia-based research.

What is this tutorial about?

This tutorial aims to provide a holistic view of Wikipedia's different features - text, links, categories, page views, editing history, etc. - and to explore the different ways they can be utilized in a machine learning framework. By presenting and contrasting recent works that utilize Wikipedia in multiple domains, it seeks to raise awareness among researchers and practitioners in these fields of the benefits of using Wikipedia in their respective domains, and in particular of using multiple sources of information simultaneously.

By the end of the tutorial, participants will:

Become familiar with the different sources of information available in Wikipedia and the various features that can be derived from them.

Understand which Wikipedia features can be successfully applied in various domains.

Become acquainted with the various ways in which machine learning can be used to leverage the knowledge stored in Wikipedia.

Learn how Wikipedia's semi-structured textual entities can be used to support search and retrieval tasks.