Hi Paolo! Yes, here is a real example. In one of my projects on programming-language classification across 38 different languages, training currently takes less than 5 minutes on ~20,000 files. That is only possible because I'm using a model without long memory dependencies (unlike, say, deep-learning-based LSTM models). Had I used a model that takes a couple of hours to train and needs expensive GPUs, scaling up to roughly 300 languages (which is what GitHub supports) would get really expensive, both in time to train and in cost to sustain.
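To give a concrete picture of what I mean by a lightweight setup, here is a minimal sketch. It assumes scikit-learn with character n-gram TF-IDF features and a plain linear classifier; my actual pipeline may differ, and the tiny `files`/`labels` lists are just placeholders for the ~20,000 real files:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy stand-ins for the real corpus of source files and their language labels.
files = [
    "def add(a, b):\n    return a + b",
    "int add(int a, int b) { return a + b; }",
    "function add(a, b) { return a + b; }",
]
labels = ["Python", "C", "JavaScript"]

pipeline = Pipeline([
    # Character n-grams pick up keywords, operators, and punctuation patterns
    # without any sequence modelling, so training stays cheap and CPU-only.
    ("tfidf", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))),
    ("clf", LogisticRegression(max_iter=1000)),
])

pipeline.fit(files, labels)
print(pipeline.predict(["console.log(add(1, 2));"]))
```

Something in this spirit trains in minutes on a laptop, which is what makes iterating on dozens (and eventually hundreds) of languages practical.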