Transfer Learning Using Musical/Non-Musical Mixtures for Multi-Instrument Recognition

EasyChair Preprint 10727

5 pages • Date: August 16, 2023

Hannes Bradl, Markus Huber and Franz Pernkopf

Abstract

Datasets for most music information retrieval tasks tend to be relatively small. However, in deep learning, insufficient training data often leads to poor performance. Typically, this problem is approached by transfer learning (TL) and data augmentation. In this work, we compare several of these methods for the task of multi-instrument recognition. A convolutional neural network (CNN) is able to identify eight instrument families and seven specific instruments from polyphonic music recordings. Training is conducted in two phases: after pre-training on a music tagging dataset, the CNN is retrained using multi-track data. Experiments with different TL methods suggest that training the final fully-connected layers from scratch while fine-tuning the convolutional backbone yields the best performance. Two different mixing strategies, musical and non-musical mixing, are investigated. It turns out that a blend of both mixing strategies works best for multi-instrument recognition.
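
The blend of mixing strategies mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not define the two strategies, so this assumes that musical mixing combines stems taken from one song and that non-musical mixing combines stems drawn from randomly chosen songs. The names `mix`, `sample_mixture`, `p_musical` and the toy `songs` data are hypothetical and are not taken from the preprint.

```python
import random


def mix(stems):
    """Sum stems sample by sample and collect the set of instrument labels."""
    length = min(len(audio) for audio, _ in stems)
    mixture = [sum(audio[i] for audio, _ in stems) for i in range(length)]
    labels = sorted({label for _, label in stems})
    return mixture, labels


def sample_mixture(songs, n_stems, p_musical, rng):
    """Draw one training mixture (hypothetical helper).

    songs: dict mapping a song id to a list of (audio, instrument_label) stems.
    p_musical: probability of a musical mix (all stems from one song);
               otherwise every stem comes from an independently chosen song.
    """
    song_ids = sorted(songs)
    if rng.random() < p_musical:
        # Assumed "musical" mixing: stems that belong to the same song.
        pool = songs[rng.choice(song_ids)]
        stems = rng.sample(pool, min(n_stems, len(pool)))
    else:
        # Assumed "non-musical" mixing: stems combined across songs at random.
        stems = [rng.choice(songs[rng.choice(song_ids)]) for _ in range(n_stems)]
    return mix(stems)


# Toy data: two songs with two very short stems each.
songs = {
    "a": [([1.0, 0.0], "guitar"), ([0.0, 1.0], "drums")],
    "b": [([0.5, 0.5], "piano"), ([0.2, 0.2], "violin")],
}
```

Setting `p_musical` to 1.0 or 0.0 recovers the pure strategies, and any value in between gives a blend. The ratio used in the preprint is not stated in the abstract, so the value is left to the caller.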

Keyphrases: Convolutional Neural Network, Multi-Instrument Recognition, Transfer Learning

Links:

https://easychair.org/publications/preprint/qQ18

BibTeX entry

BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:

@booklet{EasyChair:10727,
  author    = {Hannes Bradl and Markus Huber and Franz Pernkopf},
  title     = {Transfer Learning Using Musical/Non-Musical Mixtures for Multi-Instrument Recognition},
  howpublished = {EasyChair Preprint 10727},
  year      = {EasyChair, 2023}}
