There is growing interest in ways to incorporate models of human behavior into Cranfield-style evaluation of information retrieval systems. While there are many approaches to this problem, two prominent ones have been to start from a user interface and simulate user interaction with it, or to take traditionally styled evaluation measures and make them more realistic in terms of known human behavior.
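For instance, a measure such as rank-biased precision (RBP) builds a simple user model into the metric itself: a simulated user scans down the ranking and continues to the next result with a fixed persistence probability p. A minimal Python sketch (the persistence values and relevance vector are purely illustrative):

```python
# Rank-biased precision: the discount at each rank follows a simple
# user model in which the reader continues with persistence probability p.
def rbp(relevances, p=0.8):
    """relevances: per-rank judgments (0/1 or graded in [0, 1])."""
    return (1 - p) * sum(rel * p ** i for i, rel in enumerate(relevances))

ranking = [1, 0, 1, 1, 0]            # illustrative judged ranking
print(rbp(ranking, p=0.8))           # patient user: deeper ranks still count
print(rbp(ranking, p=0.3))           # impatient user: early ranks dominate
```

Varying p changes how deeply the modeled user is assumed to read, which is exactly the kind of parameter that observed human behavior can inform.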
The SIGIR 2013 Workshop on Modeling User Behavior for Information Retrieval Evaluation (MUBE 2013) aims to bring together researchers to discuss existing and new approaches, opportunities for collaboration, and other ideas and issues involved in improving information retrieval evaluation through the modeling of user behavior.
Areas of interest for the workshop include but are not limited to:
The workshop consists of three main parts: invited speakers, short paper presentations, and breakout groups and their presentations.
We are pleased to announce that our invited speakers at the workshop will be Ben Carterette and Leif Azzopardi:
Abstract: Information retrieval evaluation is highly reliant on averages: we typically evaluate an engine by testing for a significant difference in average effectiveness, computed using relevance judgments that may themselves be averaged over assessors, and perhaps with one or more parameters estimated as averages from user data. Using averages at every stage of the evaluation process in this way presents a false certainty; in reality there is so much potential variability at each point that even long-held conventional wisdom about effectiveness-enhancing techniques must be questioned.
In this talk, we argue for the importance of incorporating something about variability in user behavior into automatic (batch-style) retrieval evaluations. We will present results from three ongoing projects: (1) using user logs to incorporate variability in user behavior; (2) using many preference-based assessments to incorporate variability in relevance; (3) using different classes of tasks to incorporate variability across topics.
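To make concrete the average-based pipeline the talk questions: a typical batch evaluation averages a per-topic effectiveness score for each system and applies a paired significance test to the means. A minimal sketch, with synthetic per-topic scores standing in for real judged runs:

```python
from scipy import stats

# Synthetic per-topic effectiveness scores (e.g., average precision)
# for two systems over the same eight topics; illustrative values only.
system_a = [0.42, 0.31, 0.55, 0.20, 0.61, 0.48, 0.35, 0.27]
system_b = [0.38, 0.36, 0.49, 0.25, 0.58, 0.51, 0.30, 0.33]

mean_a = sum(system_a) / len(system_a)
mean_b = sum(system_b) / len(system_b)

# Paired t-test over per-topic scores: the standard batch comparison.
t_stat, p_value = stats.ttest_rel(system_a, system_b)
print(f"mean A = {mean_a:.3f}, mean B = {mean_b:.3f}, p = {p_value:.3f}")
```

The single mean and p-value summarize away per-topic, per-assessor, and per-user variability, which is precisely the concern the talk raises.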
Abstract: In this talk, I want to discuss a number of issues regarding the simulation of users and how we are "assimilating" users into our evaluations through the models that we create. First, I'll try to provide some definitions of what measures, models, and simulations are, and how they relate. Specifically, I argue that simulation has been a central component in most, if not all, evaluations. Then, I'll review some of the different kinds of simulations and the advantages, pitfalls, and challenges of performing them. It is here that I wonder about what we are trying to achieve, and what we want to do with all our models, simulations, and measures. Is it to evaluate or is it to assimilate? I'll argue that we need to look towards developing more explanatory models of user behaviour so that we can obtain a better understanding of users and their interaction with information systems.
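As a toy illustration of such a simulation, the sketch below pairs a simple click model with a stopping rule: a simulated user scans a ranking, clicks results, and abandons the session after too many consecutive non-relevant documents. All probabilities and parameters are assumed purely for illustration:

```python
import random

def simulate_session(relevances, click_prob=0.9, patience=2, seed=0):
    """Scan a judged ranking; return the ranks the simulated user clicks."""
    rng = random.Random(seed)
    clicks, misses = [], 0
    for rank, rel in enumerate(relevances, start=1):
        if rel and rng.random() < click_prob:  # assumed click model
            clicks.append(rank)
            misses = 0
        else:
            misses += 1
            if misses > patience:              # assumed stopping rule
                break
    return clicks

print(simulate_session([1, 0, 0, 1, 0, 0, 0, 1], patience=2))
```

Even a model this crude makes the modeling choices explicit, which is a prerequisite for the more explanatory models of user behaviour the talk argues for.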
We solicited 2-page poster papers to encourage the sharing of ideas and discussion. We accepted 10 papers for short presentation (ordered here by last name of first author):
Workshop proceedings (PDF).
Detailed schedule in PDF.
09:00-09:15 Welcome and Introduction
09:15-10:00 Invited Speaker: Ben Carterette
10:00-10:30 3 Short Paper Presentations
10:30-11:00 Coffee Break
11:00-11:45 Invited Speaker: Leif Azzopardi
11:45-12:15 3 Short Paper Presentations
12:15-12:30 Brainstorming Breakout Topics
12:30-14:00 Lunch (provided)
14:00-14:40 4 Short Paper Presentations
14:40-15:00 Finalize Breakout Topics and Assign Champions and Scribes
15:00-16:30 Breakout Sessions (Coffee from 15:30-16:00)
16:30-17:30 Breakout Presentations and Discussion
We extended the submission deadline by a week and adjusted the other deadlines accordingly.
Organizers' Contact Email: