Lookup NU author(s): Teck CHAN, Professor Cheng Chin
Full text for this publication is not currently held within this repository. Alternative links are provided below where available.
The main scientific question of this year's DCASE challenge Task 4, Sound Event Detection in Domestic Environments, is to investigate which types of data (strongly labeled synthetic data, weakly labeled data, unlabeled in-domain data) are required to achieve the best-performing system. In this paper, we propose a deep learning model that integrates a Convolutional Neural Network (CNN) with Non-Negative Matrix Factorization (NMF). On the validation dataset, the best-performing model achieves an event-based F1-score of 30.39%, compared with 23.7% for the baseline system. The results show that synthetic data, even though it is strongly labeled, cannot be used as the sole source of training data: doing so resulted in the worst performance. Although combining weakly and strongly labeled data achieved the highest F1-score, the improvement was not significant, so including synthetic data in the training set may not be worthwhile. The results also suggest that the quality of the labels assigned to unlabeled in-domain data is essential: inaccurate labeling can reduce accuracy rather than improve model performance.
Author(s): Chan TK, Chin CS, Li Y
Publication type: Conference Proceedings (inc. Abstract)
Publication status: Published
Conference Name: IEEE Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE 2019)
Year of Conference: 2019
Online publication date: 25/10/2019
Acceptance date: 29/07/2019
Publisher: IEEE
URL: http://dcase.community/documents/challenge2019/technical_reports/DCASE2019_Chan_5.pdf