
Information Retrieval

Note: Whilst every effort is made to keep the syllabus and assessment records correct for this course, the precise details must be checked with the lecturer(s).


Code: M052 (Also taught as: GI15)
Year: 4
Prerequisites:
Term: 2
Taught By: Jun Wang (100%)
Aims: This course offers an entry-level study of information retrieval systems. While the basic theories and (probabilistic) models of information retrieval are covered, the course focuses primarily on practical algorithms for textual document indexing and relevance ranking, as well as their performance evaluation. Practical IR applications such as Web search engines and music/movie recommender systems will also be covered.
Learning Outcomes: Students are expected to master both the theoretical and practical aspects of information retrieval. More specifically, the student will understand:
  1. The basic concepts and processes of information retrieval systems.
  2. The common algorithms and techniques for document indexing and retrieval, query processing, etc.
  3. How IR systems are evaluated.
  4. The well-known probabilistic retrieval methods and ranking principle.
  5. The techniques and algorithms used in practical IR systems, such as those in web search engines and the Amazon book and Last.FM recommender systems.
  6. The challenges and existing techniques for the emerging topics of P2P-IR and Multimedia IR.

Content:

Overview of the field: Study some basic concepts of information retrieval, such as the concept of relevance.
Understand the conceptual model of an information retrieval system.
Indexing: Introduce various indexing techniques for textual information items, including inverted indices, tokenization, stemming and stop words.
Retrieval methods: Probability ranking principle.
Study popular retrieval models: 1. Boolean, 2. Vector space, 3. Binary independence, 4. Language modelling.
Other commonly used techniques include relevance feedback, pseudo relevance feedback, and query expansion.
Evaluation of retrieval performance: Measurements such as average precision and NDCG.
TREC conference.
Personalization: Study basic techniques for collaborative filtering and recommender systems, such as memory-based approaches and probabilistic latent semantic analysis (PLSA).
Personalized web search through click-through data.
Emerging areas: Peer-to-peer information retrieval: epidemic-based approaches and Distributed Hash Table (DHT) approaches.
Multimedia information retrieval: basic content analysis techniques, query by example, text-based image/video retrieval, and collaborative tagging.
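
As a rough illustration of two of the topics listed above (an inverted index queried with the Boolean model, and average precision as an evaluation measure), a minimal Python sketch might look like the following. It is not taken from the course material: the function names, the stop-word list and the toy documents are all assumptions made for this example, and stemming is omitted.

```python
from collections import defaultdict

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {"the", "a", "of", "and", "in"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    tokens = (t.strip(".,;:!?").lower() for t in text.split())
    return [t for t in tokens if t and t not in STOP_WORDS]


def build_inverted_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def boolean_and(index, query):
    """Return ids of documents containing every query term (Boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


def average_precision(ranking, relevant):
    """Sum precision@k at each rank k holding a relevant document,
    then divide by the total number of relevant documents."""
    hits = 0
    total = 0.0
    for k, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0


# Toy collection (assumed data, for illustration only).
docs = {
    1: "The probability ranking principle",
    2: "Ranking of web search results",
    3: "Collaborative filtering and recommender systems",
}
index = build_inverted_index(docs)
print(boolean_and(index, "ranking principle"))
print(average_precision([2, 1, 3], {1, 3}))
```

Sets are used for the posting lists to keep the Boolean AND short; a real system would more likely store sorted posting lists and merge them.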

Method of Instruction:

Lecture presentations, Practical exercises

Assessment:

The course has the following assessment components:

  • Written Examination (2.5 hours, 60%)
  • Coursework Section (1 piece, 40%)
To pass this course, students must:
  • Obtain an overall pass mark of 50% for all sections combined
The examination rubric is:
Answer question 1 in PART ONE, and any three of five questions (2-6) in PART TWO

Resources:

Introduction to Information Retrieval, Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Cambridge University Press, 2008.

Modern Information Retrieval (MIR), Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Addison-Wesley, 2000.

Managing Gigabytes (2nd Ed.), Ian H. Witten, Alistair Moffat and Timothy C. Bell, Morgan Kaufmann, San Francisco, California, 1999.

Pattern Recognition and Machine Learning, Christopher M. Bishop, Springer, 2006.


This page last modified: 26 May, 2010 by Nicola Alexander

Computer Science Department - University College London - Gower Street - London - WC1E 6BT - Telephone: +44 (0)20 7679 7214 - Copyright © 1999-2007 UCL

