I have spoken before about the Kaggle ecosystem and the digit recognition challenge, and I have also shown how to improve the original version of the code. However, I did not attempt to improve the quality of the results over the initial solution. This blogpost focuses exactly on that: what can we do to improve the quality of our results?
This blogpost is focused on one very important part of any business: aligning the goals of the company with those of each team and with personal objectives. One of the techniques to address this challenge in an elegant and productive way is known as OKRs (Objectives and Key Results). OKRs have been used before in large companies such as Oracle or Intel, but they are probably better known because Google has adopted them as a core part of the organisation.
In the last blog post, I focused on a basic piece of functionality that provided a solution for one of the Kaggle challenges using Python. This blogpost shows some improvements in the code itself, as well as in the classification process:
- Removing some of the functionality that was available in public repositories (pandas)
- Adding logging capabilities
- Including quality evaluation and cross-validation
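As a rough illustration of the last two points, here is a minimal sketch of k-fold cross-validation with logging, using only the Python standard library. It is not the code from the post: the logger name `digits`, the helper names, and the 1-nearest-neighbour classifier are all assumptions made for the example, standing in for whatever classifier is actually used.

```python
import logging
import random

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("digits")  # hypothetical logger name


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def nearest_neighbour_predict(train_x, train_y, sample):
    """Return the label of the training sample closest to `sample`.

    Stand-in classifier (assumption): squared Euclidean distance, 1 neighbour.
    """
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], sample)),
    )
    return train_y[best]


def cross_validate(features, labels, k=5, seed=0):
    """Return the accuracy measured on each of the k held-out folds."""
    folds = k_fold_indices(len(features), k, seed)
    scores = []
    for fold_number, test_idx in enumerate(folds, start=1):
        held_out = set(test_idx)
        # Train on everything that is not in the current fold.
        train_idx = [i for i in range(len(features)) if i not in held_out]
        train_x = [features[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        correct = sum(
            nearest_neighbour_predict(train_x, train_y, features[i]) == labels[i]
            for i in test_idx
        )
        accuracy = correct / len(test_idx)
        logger.info("fold %d/%d: accuracy=%.3f", fold_number, k, accuracy)
        scores.append(accuracy)
    logger.info("mean accuracy=%.3f", sum(scores) / len(scores))
    return scores
```

Calling `cross_validate(features, labels, k=5)` with a list of feature vectors and a matching list of labels logs one accuracy line per fold and returns the per-fold scores. Every sample is held out exactly once, so the mean of the scores is a fairer quality estimate than accuracy measured on the training data itself.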
I think that every developer should periodically use more than one programming language and more than one programming paradigm, both to stay knowledgeable and to avoid developing a “tunnel vision” that makes us believe some solutions are not possible just because our current paradigm does not support them.
For the last year or so, I have been using mainly one programming language (Clojure). Do not get me wrong, I believe Clojure is the future and I love it as a language, but I do think that being a polyglot developer is something we all should aspire to. Therefore, I have decided to freshen up my Python skills by going back to the basics and using it to solve one Kaggle competition.