Side Projects

Teaching at Carnegie Mellon University in Qatar

I am delighted by the challenge of teaching at CMU-Qatar.
In the Spring semester of 2017, I taught Information Retrieval there (course 67-300, Search Engines). The slides are publicly available on GitHub.

In the Fall semester of 2019, I will be teaching Practical Data Science (course 67-364). Slides will be available at the end of the semester.


I have been working on a set of tools for information retrieval evaluation in Python. It provides an interface for repetitive and common tasks such as analyzing runs, running an IR framework like Indri or Terrier with different baselines, evaluating runs with trec_eval, analyzing results, and even fusing ranked lists to create a more robust run.
For collection creators, it provides the groundwork for tasks such as document pool creation.

The project is under constant development. You can follow it on GitHub or install the latest version with pip.
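As an illustration of the rank-fusion side of the toolkit, here is a minimal sketch of Reciprocal Rank Fusion (RRF), a standard way to merge ranked lists from different systems. This is a generic example, not the package's actual API; the function name and inputs are my own for illustration.

```python
def rrf_fuse(runs, k=60):
    """Fuse several ranked lists of document ids into one.

    Each run is a list of doc ids ordered from best to worst.
    RRF score of a document: sum over runs of 1 / (k + rank),
    with rank starting at 1. k=60 is the commonly used constant.
    """
    scores = {}
    for run in runs:
        for rank, doc_id in enumerate(run, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Higher fused score comes first
    return sorted(scores, key=scores.get, reverse=True)

run_a = ["d1", "d2", "d3"]
run_b = ["d2", "d3", "d1"]
print(rrf_fuse([run_a, run_b]))
```

Because RRF only uses ranks, not raw scores, it needs no score normalization across systems, which is what makes fused runs robust to heterogeneous baselines.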

CLEF eHealth

Together with Dr. Guido Zuccon, I have been leading the Information Retrieval task at CLEF eHealth since 2014.

In 2014, we ran an ad-hoc information retrieval task focused on supporting laypeople in searching for and understanding their health information. The challenge mimicked patients querying for a key disease that appeared in their discharge summary. The document collection was provided by the Khresmoi project and contains more than one million Web pages covering a broad range of health topics, targeted at both the general public and healthcare professionals.

In 2015, we changed the task to focus on symptoms rather than diseases. We mimicked queries from laypeople who are confronted with a sign, symptom, or condition and attempt to find out more about what they may have. Recent research has indicated that current web search engines fail to effectively support these queries (Zuccon et al., Staton et al.). Another innovation for 2015 was our first experiments with understandability: we asked our medical assessors to judge, in parallel with document relevance, whether they would recommend the document to their patients, taking into consideration how difficult the document is to read. To the best of our knowledge, this was the first time document readability was assessed in an IR task, and we are going to investigate in detail its impact on the rankings created.

In 2016 and 2017, we moved the challenge a step forward with a larger and messier collection, ClueWeb12 B13, which is more representative of the current state of health information online. Topic stories and associated queries were created by mining health Web forums, such as Reddit’s AskDocs, to identify real consumer information needs.

See more: GitHub for all collections built in CLEF eHealth.


I participated in the TREC Clinical Decision Support track (TREC-CDS) in 2014 and 2015, representing the Vienna University of Technology. The focus of this task was on providing material to support physicians when making decisions about diagnoses, medical tests, and treatments. The document collection consisted of full-text articles from PubMed Central.

Although the main subject of this TREC track is also medical information retrieval, there were many significant differences from CLEF eHealth. For example, readability is not a concern in TREC-CDS, as its IR systems are meant for physicians rather than patients or the general public. Nevertheless, domain-specific medical resources, such as MetaMap annotations, MeSH, or UMLS, can be used in this task as well.

In 2015, our query expansion method took second place (out of 30 teams). You can check the details here.
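To give a flavor of what query expansion means here, below is a minimal pseudo-relevance feedback sketch: the query is enriched with the most frequent terms of the top-ranked documents. This is a generic textbook baseline for illustration only, not the method we actually submitted; the function and variable names are my own.

```python
from collections import Counter

def expand_query(query_terms, top_docs, n_terms=5):
    """Pseudo-relevance feedback: treat top_docs (document texts)
    as relevant and add their most frequent unseen terms."""
    counts = Counter()
    for doc in top_docs:
        counts.update(doc.lower().split())
    # Drop terms already in the query, keep the n most frequent
    expansion = [t for t, _ in counts.most_common()
                 if t not in query_terms][:n_terms]
    return list(query_terms) + expansion

docs = ["chest pain and shortness of breath",
        "acute chest pain suggests cardiac causes"]
print(expand_query(["chest", "pain"], docs, n_terms=2))
```

A real system would at least remove stopwords and weight terms (e.g., Rocchio or RM3); in the medical domain, resources like UMLS can replace raw term counts as the source of expansion terms.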


In 2014 and 2015, I participated in the Retrieving Diverse Social Images challenge at the MediaEval benchmark. The challenge consisted of re-ranking an initial Flickr result list to improve diversity. In this context, a diverse list of images is one that shows different perspectives of a specific point of interest: for example, a list that shows the Notre Dame cathedral from inside, from outside, from far away, and so on. We proposed an ensemble of clustering methods that worked relatively well (3rd place).
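The core idea of cluster-based diversification can be sketched in a few lines: group the initial results into clusters and interleave the clusters round-robin, so the top of the list covers several perspectives. This is a simplified illustration, not our actual MediaEval system; here the cluster assignments are assumed given (in practice they come from visual and textual features).

```python
def diversify(ranked_ids, cluster_of):
    """Re-rank for diversity.

    ranked_ids: initial result list, best first.
    cluster_of: dict mapping each id to a cluster label.
    """
    # Group ids by cluster, preserving the original relevance order
    buckets = {}
    for doc_id in ranked_ids:
        buckets.setdefault(cluster_of[doc_id], []).append(doc_id)
    # Round-robin over clusters until every image is placed
    reranked = []
    while any(buckets.values()):
        for bucket in buckets.values():
            if bucket:
                reranked.append(bucket.pop(0))
    return reranked

ranking = ["a", "b", "c", "d", "e"]
clusters = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 2}
print(diversify(ranking, clusters))  # → ['a', 'c', 'e', 'b', 'd']
```

The round-robin step trades a little relevance at each rank for coverage of all clusters, which is exactly the relevance/diversity trade-off the challenge measures.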

For 2015, we incorporated more features, including a deep learning solution, and explored ways to combine visual and textual features. Our system took first place.


One of the subjects I started working on during my Ph.D. is the readability of textual documents. The challenge is to match a person's reading skill with the best possible material for him or her. There are many traditional metrics for measuring how hard a text is, mostly based on surface-level characteristics of the text, i.e., the length of words and sentences. I implemented them all in an open-source Python package called ReadabilityCalculator.
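As an example of such a surface-feature metric, here is the classic Flesch Reading Ease score (higher means easier to read). This is an illustrative sketch, not the ReadabilityCalculator implementation; in particular, the vowel-group syllable counter below is a crude approximation of real syllabification.

```python
import re

def count_syllables(word):
    # Count each run of consecutive vowels as one syllable (rough heuristic)
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    # Flesch Reading Ease:
    # 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / sentences)
            - 84.6 * (syllables / len(words)))

print(round(flesch_reading_ease("The cat sat on the mat."), 1))  # → 116.1
```

Long sentences and long (many-syllable) words both lower the score, which is why these formulas are called surface metrics: they never look at word meaning, only at length statistics.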

If you are interested in learning more about readability, I highly recommend this link, which covers much of the literature on readability.

See more: readability-calculator source code


In 2013 and 2014 I had a great time on Kaggle. As soon as I discovered the site, I became addicted to it and started to participate in as many competitions as my schedule allowed. However, as everybody knows, time is a limited resource… Thus, I am much less active now. Nevertheless, if there is an interesting competition going on and you want to team up with me, please let me know! 🙂

See more: my Kaggle profile