Category Archives: Kaggle

Python and Kaggle: Feature selection, multiple models and grid search

I have written before about the Kaggle ecosystem and the digit recognition challenge, and I have also shown how to improve the original version of the code. However, I made no attempt to improve the quality of the results over the initial solution. This blog post focuses on exactly that: what can we do to improve the quality of our results?


Python and Kaggle: Code improvements, logging and cross-validation

In the last post, I focused on a basic piece of functionality that provided a solution for one of the Kaggle challenges using Python. This post shows some improvements to the code itself, as well as to the classification process:

  1. Removing hand-written functionality that is already available in public libraries (pandas)
  2. Adding logging capabilities
  3. Including quality evaluation and cross-validation
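
To give a rough idea of what the last two items can look like, here is a minimal sketch that uses only the Python standard library. It is not the code from the post itself: the function names (`k_fold_indices`, `majority_class`, `cross_validate`) and the majority-class "model" are hypothetical stand-ins, chosen only to show per-fold quality being measured and logged.

```python
import logging
import random

# Assumed logging setup: timestamped INFO messages on the console.
logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("kaggle_sketch")


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Every k-th index goes to the same fold, so folds are disjoint.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def majority_class(labels):
    """Hypothetical stand-in model: predict the most common training label."""
    return max(set(labels), key=labels.count)


def cross_validate(labels, k=5):
    """Return the per-fold accuracy of the majority-class baseline."""
    scores = []
    folds = k_fold_indices(len(labels), k)
    for fold, (train_idx, test_idx) in enumerate(folds, start=1):
        prediction = majority_class([labels[j] for j in train_idx])
        hits = sum(labels[j] == prediction for j in test_idx)
        accuracy = hits / len(test_idx)
        logger.info("fold %d/%d: accuracy=%.3f", fold, k, accuracy)
        scores.append(accuracy)
    logger.info("mean accuracy over %d folds: %.3f",
                k, sum(scores) / len(scores))
    return scores
```

Calling `cross_validate(labels, k=5)` on a list of labels logs one line per fold plus the mean. A real classifier would take the place of `majority_class`, and a library routine would normally replace the hand-written split.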

Remembering Python and Kaggle

I think that every developer should periodically use more than one programming language and more than one programming paradigm, both to stay knowledgeable and to avoid developing a “tunnel vision” that makes us believe some solutions are impossible just because our current paradigm does not support them.

For the last year or so, I have mainly been using one programming language (Clojure). Do not get me wrong: I believe Clojure is the future and I love it as a language, but I do think that being a polyglot developer is something we should all aspire to. Therefore, I have decided to freshen up my Python skills by going back to the basics and using it to solve one Kaggle competition.
