If you have ever read an algorithms textbook, you know about the handful of sorting algorithms that run in O(n*log(n)) time. These include quicksort, heapsort, and mergesort. Under the hood, Python’s list.sort method uses yet another one called Timsort. That’s not the point of this post. The point of this post is to show you how […]
Snippets
Bits of knowledge, technical advice, and package recommendations that don't warrant a whole blog post. If it takes me a substantial amount of time or effort to track down a solution to a common issue, I'll write about it here.
Fastest Way to Flatten a List in Python
In my latest book, Fast Python, I bust a lot of speed myths in Python. Some of the most basic advice people give about optimizing Python code is wrong, likely because no one ever bothered to test it. In this book, I go back to the basics and verify (or bust) all of the advice […]
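For context, here is a minimal sketch of two common ways to flatten a list of lists by one level. The function names and sample data are illustrative assumptions, not taken from the book, and the sketch makes no claim about which approach is fastest; measure on your own data with `timeit`.

```python
from itertools import chain


def flatten_comprehension(nested):
    """Flatten one level of nesting with a nested list comprehension."""
    return [item for sublist in nested for item in sublist]


def flatten_chain(nested):
    """Flatten one level of nesting with itertools.chain.from_iterable."""
    return list(chain.from_iterable(nested))


# Hypothetical sample data, including an empty sublist.
nested = [[1, 2], [3], [], [4, 5, 6]]
flat_a = flatten_comprehension(nested)
flat_b = flatten_chain(nested)
```

Both functions handle only one level of nesting; deeper structures need a recursive approach.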
Multicore Repeated K-Fold Classifier
The following snippet is a reusable multicore K-Fold classifier for scikit-learn models. The return value is an array of cross validation scores of length N_SPLITS * N_REPEATS. The sklearn library already provides a simple interface for multicore cross validation through the cross_val_score function, but it does not provide a facility for repeating the cross validation […]
Calculating Triple Barrier Labels from Advances in Financial Machine Learning
In Marcos Lopez de Prado’s 2018 book, Advances in Financial Machine Learning, the author proposes a system for calculating labels for financial events based on the precipitation of events following a list of event dates. These labels are typically members of the set {-1, 0, 1}, and are ideal for fitting machine learning classification models. […]
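To make the labeling idea concrete, here is a simplified sketch, not the post's or the book's implementation. It assumes fixed percentage barriers (the book scales barriers by volatility) and assigns 0 when the time limit expires first. The function name and all parameters are hypothetical.

```python
def triple_barrier_label(prices, start, upper_pct, lower_pct, max_holding):
    """Label one event by whichever barrier the price touches first.

    Returns 1 if the upper (profit-taking) barrier is hit first,
    -1 if the lower (stop-loss) barrier is hit first, and
    0 if the vertical (time) barrier expires with neither touched.
    """
    entry = prices[start]
    upper = entry * (1 + upper_pct)  # horizontal barrier above the entry price
    lower = entry * (1 - lower_pct)  # horizontal barrier below the entry price
    # Vertical barrier: at most max_holding bars after the event.
    end = min(start + max_holding, len(prices) - 1)

    for price in prices[start + 1 : end + 1]:
        if price >= upper:
            return 1
        if price <= lower:
            return -1
    return 0


# Hypothetical price series and event positions.
prices = [100, 101, 103, 99, 96, 100]
labels = [triple_barrier_label(prices, i, 0.02, 0.03, 5) for i in (0, 2)]
```

Each event date gets exactly one label, which is what makes the output usable as a classification target.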
Activate Verbose Logging Output in Django
When you are developing with Django, you likely want the most verbose debugging output possible. Django uses the logging levels defined by Python's logging module, and reads its logging configuration from a Python dictionary in settings.py. Read more about Django logging. Example Logging Configuration My favorite logging configuration is to dump the most verbose output possible […]
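As an illustration only (the excerpt cuts off before the author's own configuration), a verbose setup might look like the sketch below. It follows the standard `logging.config.dictConfig` schema that Django's `LOGGING` setting uses. The formatter name and format string are assumptions.

```python
# settings.py (fragment): send everything at DEBUG and above to the console.
LOGGING = {
    "version": 1,
    # Keep Django's default loggers active instead of silencing them.
    "disable_existing_loggers": False,
    "formatters": {
        # Hypothetical formatter name and layout.
        "verbose": {
            "format": "{levelname} {asctime} {name} {message}",
            "style": "{",
        },
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "verbose",
            "level": "DEBUG",
        },
    },
    # The root logger catches records from every logger that propagates.
    "root": {
        "handlers": ["console"],
        "level": "DEBUG",
    },
}
```

DEBUG is the lowest of Python's standard logging levels, so setting it on both the handler and the root logger lets every record through.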