A long time ago I published a blog post explaining how to represent the Reuters-21578 collection (and, more generally, any textual collection for text classification). However, that post never explained how to perform the classification step itself. This post introduces some of the basic concepts of classification, quickly reviews the representation we came up with in the prior post, and finally focuses on how to perform and evaluate the classification.
I have once again attended the London PyData meet-up, and I am as happy as I was the first time. The day started with some news from the organisers, who listed some interesting discoveries within the Python ecosystem: