Pre-published October 2023
Music streaming services have become an essential part of the digital landscape, and distinguishing instrumental tracks from vocal tracks is one of the major challenges they face. The distinction matters for a variety of uses, such as building playlists for specific purposes like concentration or relaxation, and as a first step in classifying the language of singing, which is important in multilingual markets. To address this challenge, a team of researchers from Amazon has proposed a multi-stage method for instrumental music detection, consisting of three main stages: source separation, quantification of the singing voice, and background-track analysis. The researchers contend that conventional approaches yield less-than-ideal results here; applied to instrumental music identification specifically, such models achieve low recall. The proposed multi-stage method, by contrast, has demonstrated strong performance in detecting instrumental music in a large-scale music catalog.
The first stage of the proposed method divides the audio recording into two parts: the vocals and the accompaniment (background music). This separation is essential because instrumental music should not contain any vocal components. In the second stage, the singing-voice content of the vocal signal is quantified; if this quantity exceeds a predetermined threshold, the track contains singing and is therefore not instrumental. In the third stage, the background track, which represents the song's instrumental components, is analyzed by a neural network trained to classify sounds into instrumental and non-instrumental categories. This network's main job is to determine whether the background recording contains any musical instruments at all.
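The three-stage decision logic can be sketched in a few lines of Python. This is a minimal, hypothetical illustration: the paper's actual system uses a learned source-separation model and a trained neural classifier, whereas here both are replaced by simple stand-ins (the two channels of a stereo array act as the "stems", RMS energy quantifies the vocals, and the threshold value is invented) so that only the control flow of the pipeline is shown.

```python
import numpy as np

# Hypothetical cut-off for "this track contains audible singing".
VOCAL_ENERGY_THRESHOLD = 0.01

def separate_sources(audio: np.ndarray):
    """Stage 1 stand-in: split a mix into (vocals, accompaniment).
    A real system would use a source-separation model; here the two
    rows of the input array are simply treated as the two stems."""
    vocals, accompaniment = audio[0], audio[1]
    return vocals, accompaniment

def vocal_energy(vocals: np.ndarray) -> float:
    """Stage 2: quantify singing-voice content as RMS energy."""
    return float(np.sqrt(np.mean(vocals ** 2)))

def background_is_instrumental(accompaniment: np.ndarray) -> bool:
    """Stage 3 stand-in for the trained instrumental/non-instrumental
    classifier: accepts any background that carries signal energy."""
    return float(np.sqrt(np.mean(accompaniment ** 2))) > 0.0

def is_instrumental(audio: np.ndarray) -> bool:
    vocals, accompaniment = separate_sources(audio)
    if vocal_energy(vocals) >= VOCAL_ENERGY_THRESHOLD:
        return False  # audible singing -> not an instrumental track
    return background_is_instrumental(accompaniment)

# Example: a silent "vocal" stem over a sine-wave backing track.
t = np.linspace(0, 1, 8000)
backing = 0.5 * np.sin(2 * np.pi * 440 * t)
silent = np.zeros_like(t)
print(is_instrumental(np.stack([silent, backing])))   # True: no vocals
print(is_instrumental(np.stack([backing, backing])))  # False: "vocals" present
```

The point of the ordering is that stage 2 can reject a track cheaply as soon as singing is detected, so the stage-3 classifier only ever runs on tracks whose vocal stem is essentially silent.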
This approach to detecting instrumental music in a large-scale music catalog has significant implications for music streaming services and other applications that require accurate identification of instrumental tracks. It can help build better playlists for specific purposes such as concentration or relaxation, and improve singing-language classification in multilingual markets.