Kavita Ganesan
1 min read · Nov 27, 2018

Hi Paolo! Yes, here is a real example. In one of my projects on language classification (programming languages) across 38 different languages, training currently takes less than 5 minutes on ~20,000 files. That is only possible because I'm using a model without long memory dependencies (unlike, say, deep-learning-based LSTM models). Had I used a model that takes a couple of hours to train and requires expensive GPUs, scaling up to roughly 300 languages (which is what GitHub supports) would be really expensive, both in training time and in the cost to sustain it.
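To make the lightweight-model point concrete, here is a minimal sketch of the kind of pipeline that trains this quickly. The post does not name the actual model or features, so the scikit-learn TF-IDF character n-grams and linear SVM below are illustrative assumptions, not the real project code.

# Minimal sketch of a fast, CPU-friendly programming-language classifier.
# Assumed setup: scikit-learn with TF-IDF character n-grams and a linear SVM;
# the actual model and features used in the project are not specified above.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

def train_language_classifier(file_contents, languages):
    # file_contents: raw source-file strings; languages: one label per file
    clf = Pipeline([
        # Character n-grams capture keywords and syntax with no long-range
        # memory, so training stays in the minutes range on a CPU.
        ("tfidf", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4),
                                  max_features=100_000)),
        ("svm", LinearSVC()),
    ])
    clf.fit(file_contents, languages)
    return clf

# Hypothetical usage:
# model = train_language_classifier(texts, labels)  # ~20k files in minutes
# predicted = model.predict(["def main():\n    print('hello')"])

Swapping the linear model for an LSTM would bring back the hours of training and the GPU requirement, which is exactly the scaling concern with 300 languages.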

Responses (2)

@Paolo Messina yes, that's quite possible. However, it all depends on the features and the data itself. If the dataset does not provide enough vocabulary information about a language, then it's possible to confuse the classifier. So should the…

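To see that kind of confusion concretely, one hypothetical check (not something from the original discussion) is to inspect the trained classifier's confusion matrix and list the language pairs it mixes up most often; languages with heavily overlapping vocabularies, such as C and C++, tend to dominate when the training data is thin.

# Hypothetical diagnostic, reusing the sketched model above plus held-out data.
from sklearn.metrics import confusion_matrix
import numpy as np

def most_confused_pairs(model, test_texts, test_labels, top_k=5):
    labels = sorted(set(test_labels))
    cm = confusion_matrix(test_labels, model.predict(test_texts), labels=labels)
    np.fill_diagonal(cm, 0)  # ignore correct predictions
    pairs = [(labels[i], labels[j], int(cm[i, j]))
             for i in range(len(labels))
             for j in range(len(labels)) if cm[i, j]]
    # The largest off-diagonal counts are the language pairs the model confuses most.
    return sorted(pairs, key=lambda p: -p[2])[:top_k]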

Thanks. Does the model's performance degrade when you scale it to 300 languages? Precision/recall is also a factor that depends on the scale of the problem.
