It has been quite a while since my last blog post. One of the reasons for the delay is that I was in Vienna just before Easter for the European Conference on Information Retrieval (ECIR). ECIR is one of the best IR conferences in the world. As usual, the conference had great papers and even better discussions during the coffee breaks, where I saw old friends and met new ones. I also presented a demonstration, which was awarded best demonstration of the conference, and gave a presentation during the industry session.
On the first day of the conference, I attended the workshops, starting with the “Supporting Complex Search Tasks” session. I am not as experienced as I should be on the UI/UX side of IR, and this session was a perfect opportunity for me to dive deeper into this field. We split into small teams and I was lucky enough to end up in a group with real experts in the field such as Krisztian Balog, Elaine Toms and Marti A. Hearst. We spent most of our time on two specific questions. First, the group believed that a definition of “complex search” was needed before considering potential solutions for the interface. Some of the candidate definitions were “a task not achievable by query expansion” and “a task not answerable by individual documents”.
Second, we all agreed that a search box is not enough to solve complex search tasks. However, the open question presented to the group was whether a search box is enough as an entry point for a generic complex search task. Although the group was a bit more divided on this, most people shared my view: a simple search box is probably a good entry point for a complex search, provided that the UI responds and adapts to the user's information need. For instance, if a user searches for a known researcher, the system should “understand” the user's intent and provide a better-suited UI, probably closer to a vertical library search. On the other hand, if the search focuses on geographical characteristics, a map might be the best option. This would avoid the need for multiple vertical search engines by hiding most of this complexity in the backend of our generic interface.
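To make the “one search box, adaptive UI” idea a bit more concrete, here is a minimal sketch in Python. It is my own toy illustration, not something the group produced: the function names, the hand-written word lists and the view labels are all hypothetical, and a real system would need proper intent detection rather than keyword matching.

```python
# Toy sketch: a single search box whose result view depends on the detected intent.
# Every name, word list and view label below is a hypothetical placeholder.

KNOWN_RESEARCHERS = {"marti hearst", "krisztian balog", "elaine toms"}
GEO_HINTS = {"near", "map", "city", "country", "river"}

VIEW_FOR_INTENT = {
    "researcher": "library-style publication list",
    "geographic": "map",
    "generic": "ranked result list",
}


def detect_intent(query):
    """Guess the intent of a query using naive, hand-written rules."""
    normalized = query.lower().strip()
    if normalized in KNOWN_RESEARCHERS:
        return "researcher"
    if GEO_HINTS & set(normalized.split()):
        return "geographic"
    return "generic"


def choose_view(query):
    """Pick the UI view for a query typed into the single search box."""
    return VIEW_FOR_INTENT[detect_intent(query)]
```

With these assumptions, `choose_view("Marti Hearst")` selects the library-style view and `choose_view("restaurants near Vienna")` selects the map, while anything else falls back to a plain ranked list. The user only ever sees one entry point, and the choice of vertical stays hidden in the backend.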
Another interesting topic that came up during this session was the concept of “controversial queries”. Shiri Dori-Hacohen introduced this topic to me and, as far as I know, to several other people in the session. She presented her poster and explained the different types of controversial queries. For instance, I already knew that queries with a wide range of opinions are probably controversial. However, she also showed that you can have highly controversial information with no sentiment attached to it: “Obama was born in Kenya” vs “Obama was born in Honolulu”. Both of these sentences are written as facts, yet at least one of them is very controversial. This led to a long and enjoyable discussion about bias, controversy, manipulation and control in the news, with some references thrown into the mix on measuring news bias, extracting politicians' opinions from news, and controversy and sentiment in news.
For the second workshop, I attended “Bibliometric-Enhanced IR”. This workshop brings together the digital libraries and information retrieval communities to improve the retrieval and analysis of written publications (e.g., research publications). This type of research might be relevant for more general problems where textual and relational data (e.g., citations or hyperlinks) are equally important. One of the first things I learned here is that sending your abstracts to a conference early does affect your chances of being accepted. However, Guillaume Cabanac explained to us that the reasons might not be the ones you would expect. His analysis shows that the earlier you submit, the more bids your paper receives. This makes sense because the number of people clicking “next page” decreases with the number of pages they have already seen. The second piece of information is that the more bids a paper receives, the more interest it attracts. These two factors combined imply that the earlier you submit, the better the reviews you get and, finally, the higher your chances of being accepted.
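The “next page” effect is easy to picture with a toy calculation. The sketch below is only my own illustration, not the analysis presented at the workshop: it assumes that every bidder moves on to the next page of submissions with the same fixed probability, and the function name and all numbers are made up.

```python
# Toy model (an assumption for illustration, not the analysis from the workshop):
# each bidder continues to the next page of submissions with a fixed probability.

def expected_viewers(num_bidders, continue_prob, num_pages):
    """Expected number of bidders who reach each page of the submission list."""
    return [num_bidders * continue_prob ** page for page in range(num_pages)]


# Hypothetical numbers: 100 bidders, half of them stop after each page.
viewers = expected_viewers(num_bidders=100, continue_prob=0.5, num_pages=4)
print(viewers)  # [100.0, 50.0, 25.0, 12.5]
```

Under these made-up numbers, a paper listed on the fourth page is seen by an eighth of the bidders who see the first page, so it has far fewer chances to collect bids.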
The other main discussion in the workshop was the fact that Google Scholar is very good at finding a specific article that we know exists, but very limited when it comes to discovering or suggesting articles we do not know about. Based on this discussion, two of the people in the audience (Ian Wesley-Smith and Jevin West) suggested two public tools that recommend similar articles based on an initial set: theadvisor and recommends.eigenfactor. These tools are awesome and I recommend them to all my research friends.
More info about the rest of the conference soon.