Join us for a whistle-stop tour of text classification using NLP: namely supervised machine learning with neural networks, word vector embeddings, class-sensitive cost functions, convolutional layers and novel text string generation using Markov chains.
Over the last year we have been developing software to classify vehicles from sales advert texts. This is a heavily imbalanced multi-class classification problem, with over 50,000 classes. We used data visualisations both to identify this data science project and to hone the resulting models. Seeking to comprehend these techniques (visually, mathematically and philosophically) represents the ‘science’ in ‘data science’, and better insight gives data scientists a much better understanding of how their models perform.
Coming directly from a physics background, we found that developing this system demanded a steep learning curve, littered with successes, failures and blind alleys. It is this (hitchhiked) journey that we hope to share, so we will present a good ol’ fashioned talk on our data science and data visualisation work.
So come along to cap hpi, Bond Court, Leeds from 5.00pm on Wednesday 24th April if you would like to find out more.