Category Archives: Information Retrieval

Elasticsearch and Clojure: Getting Started

Search is omnipresent these days, from the moment we type a set of keywords into our favourite search engine to find a webpage, to the moment we type a name and expect our email client to find all the emails sent by that person. Both of these processes build on years of research and experimentation in the field of Information Retrieval, all aimed at finding the most relevant documents efficiently.

This blogpost will show how to set up Elasticsearch, one of the best and most popular search engines (with Solr being the other main alternative). Its main strength is that it offers high scalability and advanced querying and indexing capabilities with minimal engineering effort. In addition, I will show how to perform some basic operations using elastisch, a fantastic Clojure library for Elasticsearch.

Continue reading

NewsIR 2016 (Behind the scenes)

About a week ago I attended the European Conference on Information Retrieval (ECIR). The conference was great and I will write a blogpost about it soon. However, the main focus of this article is one specific workshop within that conference: Recent Trends in News Information Retrieval (NewsIR). I want to talk about it because I was the lead organiser and the event ended up being a much bigger success than we could have predicted. This blogpost will explain how the workshop idea was born and how the workshop was organised. We thought it was worth sharing this knowledge in the hope that other people can gain some insight from it. A later blogpost will focus on the content of the workshop itself.

Continue reading

Playing with Word2Vec in Clojure

Word2Vec is a novel technique that produces vector representations of words in which the meaning of words and the relationships between them are encoded spatially. As a result, words that are related to each other lie closer together in the resulting feature space. Word2Vec is gaining huge traction in the machine learning community and it is definitely worth knowing more about. This blogpost will illustrate the main characteristics of this method and provide a proof of concept using Clojure libraries.

Continue reading

ECIR 2015 (part 2)

In the previous blogpost I described the ECIR conference and the workshop and tutorials day. Now I will summarise some of the recurring topics of the main conference. The first session focused on reproducibility and the challenges of obtaining the same results as those reported in previous research.

Continue reading

ECIR 2015 (part 1)

It has been quite a while since my last blogpost. One of the reasons for the delay is that I was in Vienna just before Easter for the European Conference on Information Retrieval (ECIR). ECIR is one of the best conferences for IR in the world. As usual, the conference had great papers and even better discussions during the coffee breaks, where I saw old friends and met new ones. In addition, I gave a demonstration, which was awarded best demonstration of the conference, and a presentation during the industry session.

Continue reading