Sound Classification with ml5.js

What is SoundClassifier?

ml5.soundClassifier() lets you classify audio. With the right pre-trained model, you can detect whether a certain noise was made (e.g. a clap or a whistle) or whether a certain word was said (e.g. "up", "down", "yes", "no"). At the moment, ml5.soundClassifier() can load your own custom pre-trained speech-command model or the built-in "SpeechCommands18w" model, which recognizes the ten digits from "zero" to "nine", the words "up", "down", "left", "right", "go", "stop", "yes", and "no", and two additional categories, "unknown word" and "background noise".
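To make this concrete, here is a minimal sketch of listening for the 18 words with "SpeechCommands18w". It assumes the ml5.js 0.x API (a global `ml5` loaded through a script tag, and an error-first callback that receives an array of `{ label, confidence }` objects); newer releases may differ. The helper `topWord`, the `WORDS` list, and the 0.7 threshold are illustrative names and values, not part of ml5.js.

```javascript
// The 18 words listed above for "SpeechCommands18w".
const WORDS = [
  "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
  "up", "down", "left", "right", "go", "stop", "yes", "no",
];

// Hypothetical helper: return the most confident result if it is one of the
// 18 words, or null if the best guess is something else (e.g. noise).
function topWord(results) {
  if (!Array.isArray(results) || results.length === 0) return null;
  const best = results.reduce((a, b) => (b.confidence > a.confidence ? b : a));
  return WORDS.includes(best.label) ? best : null;
}

// Error-first callback, as assumed for ml5.js 0.x.
function gotResult(error, results) {
  if (error) {
    console.error(error);
    return;
  }
  const word = topWord(results);
  if (word) console.log(`Heard "${word.label}" (confidence ${word.confidence})`);
}

// Only runs in a page where ml5.js has been loaded.
if (typeof ml5 !== "undefined") {
  const options = { probabilityThreshold: 0.7 }; // illustrative value
  const classifier = ml5.soundClassifier("SpeechCommands18w", options, () => {
    // The model is ready: start listening to the microphone.
    classifier.classify(gotResult);
  });
}
```

Filtering on the word list keeps the "unknown word" and "background noise" categories from being reported as commands.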

What is ml5.js?

ml5.js is a JavaScript library that makes machine learning accessible to a broad audience. Artists, coders, and students can use it without deep knowledge of algorithms or machine learning, because it is approachable and well documented. The library provides access to machine learning algorithms and models in the browser, building on top of TensorFlow.js with no other external dependencies.
One good workflow is to train a model in Python with GPU acceleration on Gradient°, export the model to JavaScript, and run everything in the browser through the matching ml5.js method, which for audio is ml5.soundClassifier().

Step 1

First, upload your audio file. You can use any kind of song, as long as it has only one singer, so that the algorithm can work properly. Be creative! From rap to heavy metal to K-pop, any genre and any language can be used.

Step 2

Next, specify how old the singer is. To keep things simple, you choose from a set of age classes covering ages 15 to 65. The narrower the classes, the more precise the algorithm's answer can be.
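As an illustration of what "age classes" could look like in code, here is a small sketch that maps an age between 15 and 65 to a class label. The 10-year width, the label format, and the function name `ageToClass` are assumptions made for this example. The text above does not fix them.

```javascript
// Assumed bounds from the text: ages 15 to 65. The 10-year width is illustrative.
const MIN_AGE = 15;
const MAX_AGE = 65;
const BIN_WIDTH = 10;

// Hypothetical helper: map an age to a label such as "25-34".
// The last class is closed on the right so that 65 is included ("55-65").
function ageToClass(age) {
  if (!Number.isFinite(age) || age < MIN_AGE || age > MAX_AGE) {
    throw new RangeError(`age must be between ${MIN_AGE} and ${MAX_AGE}`);
  }
  const binCount = Math.floor((MAX_AGE - MIN_AGE) / BIN_WIDTH);
  const index = Math.min(Math.floor((age - MIN_AGE) / BIN_WIDTH), binCount - 1);
  const low = MIN_AGE + index * BIN_WIDTH;
  const high = index === binCount - 1 ? MAX_AGE : low + BIN_WIDTH - 1;
  return `${low}-${high}`;
}

console.log(ageToClass(27));
```

A smaller `BIN_WIDTH` gives finer classes, at the cost of needing more training examples per class.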

Step 3

And voilà! The model estimates the singer's age class and reports a probability that expresses how certain the prediction is. You can then use the result to target a specific audience in a marketing campaign and improve your ROI.
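Here is a minimal sketch of how such a prediction could be read. It assumes the classifier returns an array of `{ label, confidence }` objects whose labels are age classes. The function `summarizePrediction`, the 0.6 cut-off, and the sample values are invented for illustration and are not real model output.

```javascript
// Hypothetical helper: pick the most probable age class and flag the
// prediction as uncertain when its probability is below a chosen cut-off.
function summarizePrediction(results, minConfidence = 0.6) {
  if (!Array.isArray(results) || results.length === 0) return null;
  // Sort a copy so the caller's array is left untouched.
  const ranked = [...results].sort((a, b) => b.confidence - a.confidence);
  const top = ranked[0];
  return {
    ageClass: top.label,
    probability: top.confidence,
    uncertain: top.confidence < minConfidence,
  };
}

// Made-up sample values, for illustration only.
const sample = [
  { label: "15-24", confidence: 0.2 },
  { label: "25-34", confidence: 0.7 },
  { label: "35-44", confidence: 0.1 },
];
console.log(summarizePrediction(sample));
```

The `uncertain` flag is one simple way to decide when a prediction is too weak to act on.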

How does it work?

The most important thing to know is that ml5.js does not provide any data for speech recognition itself, so the data has to come from a large external dataset. In this case, we can use Mozilla's Common Voice Dataset and Google's AudioSet. Using ml5, the model is trained to recognize the voice spectrum that corresponds to a particular audience and to predict how old that audience is.
Source: Mozilla's Common Voice Dataset
Source: Google's AudioSet

This is what must be included in our code in order to start machine learning training.

The main advantage of AudioSet is that it is built from 10-second clips taken from YouTube videos, including music. It can therefore provide much more information about the data and improve the model's predictions.

For more information and data, see these webpages:

Please watch the video to understand how to use sound classification with ml5.js.

Target the Audience of a Song