Formulating our real-world problem

The main objective of our real-world case study is audio event identification and classification. This is a supervised learning problem: we will work with a dataset of audio samples, each labeled with a specific category (the source of the sound).

We will leverage concepts from transfer learning and deep learning to build a robust classifier that, given any audio sample belonging to one of our predetermined categories, should correctly predict the source of the sound. The dataset we will use is popularly known as the UrbanSound8K dataset (https://urbansounddataset.weebly.com/). It has 8,732 labeled audio files, each usually no longer than 4 seconds, that contain excerpts of common urban sounds. The ten categories of sounds in this dataset are as follows:

  • air_conditioner
  • car_horn
  • children_playing
  • dog_bark
  • drilling
  • engine_idling
  • gun_shot
  • jackhammer
  • siren
  • street_music

For a detailed description of this dataset and other potential datasets and initiatives, we recommend that readers visit the UrbanSound website and read the paper by its creators, J. Salamon, C. Jacoby, and J. P. Bello, A Dataset and Taxonomy for Urban Sound Research (http://www.justinsalamon.com/uploads/4/3/9/4/4394963/salamon_urbansound_acmmm14.pdf), 22nd ACM International Conference on Multimedia, Orlando USA, November 2014. We would like to thank them, as well as NYU's Center for Urban Science and Progress (CUSP), for making this dataset available.

To get the data, you will need to fill out a form on their website, after which you should get a download link over email. Once you unzip the file, you should be able to see all the audio files in ten folders (ten folds) and a readme file containing more details about the dataset.
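Once the archive is unzipped, a quick sanity check is to index the files and recover each label. The following is a minimal sketch, not a definitive loader. It assumes the fold folders are named fold1 to fold10, and that each file name follows the [fsID]-[classID]-[occurrenceID]-[sliceID].wav scheme, with class IDs 0 to 9 in the same order as the category list above. Verify both against the readme that ships with your copy. The path UrbanSound8K/audio and the helper names label_from_filename and index_dataset are hypothetical and used only for illustration.

```python
from collections import Counter
from pathlib import Path

# Class names in assumed classID order (0-9); check the dataset readme.
CLASS_NAMES = [
    "air_conditioner", "car_horn", "children_playing", "dog_bark", "drilling",
    "engine_idling", "gun_shot", "jackhammer", "siren", "street_music",
]


def label_from_filename(filename):
    """Return the class name encoded in a name such as '100032-3-0-0.wav'."""
    # Assumed naming scheme: [fsID]-[classID]-[occurrenceID]-[sliceID].wav
    class_id = int(Path(filename).stem.split("-")[1])
    return CLASS_NAMES[class_id]


def index_dataset(audio_root):
    """Collect (path, fold, label) tuples from the fold folders in audio_root."""
    records = []
    for fold_dir in sorted(Path(audio_root).glob("fold[0-9]*")):
        if not fold_dir.is_dir():
            continue
        fold = int(fold_dir.name.replace("fold", ""))
        for wav_path in sorted(fold_dir.glob("*.wav")):
            label = label_from_filename(wav_path.name)
            records.append((wav_path, fold, label))
    return records


if __name__ == "__main__":
    # Hypothetical location of the unzipped archive; adjust to your setup.
    records = index_dataset("UrbanSound8K/audio")
    print(len(records), "audio files found")
    print(Counter(label for _, _, label in records))
```

If the assumptions hold, the total should match the 8,732 files mentioned earlier, and the per-class counts give a first look at how balanced the categories are.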
