Understanding spectrograms

A spectrogram is a time-varying spectral representation that shows how the spectral density of a signal changes over time.

It visually represents the spectrum of frequencies in a sound or other signal, and it is used in various scientific fields, from audio fingerprinting and voice recognition to radar engineering and seismology.

Usually, the spectrogram layout is as follows: the x-axis represents time, the y-axis represents frequency, and the third dimension, the amplitude of a frequency-time pair, is color coded. This is three-dimensional data, so we can also create a 3D plot where intensity is represented as height on the z-axis. The problem with 3D charts is that humans are bad at reading and comparing them; they also tend to take up more space than 2D charts.
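
Since matplotlib can hand us the spectrogram matrix directly, we can try the 3D variant ourselves. The following is only a minimal sketch; the 440 Hz test tone and all variable names here are our own illustration, not part of the recipe's data:

import numpy as np
from matplotlib import pyplot as plt, mlab
from mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection

# a toy signal, purely for illustration: a 440 Hz tone sampled at 8 kHz
Fs = 8000
t = np.arange(0.0, 2.0, 1.0 / Fs)
y = np.sin(2 * np.pi * 440 * t)

# compute the spectrogram matrix without drawing the usual 2D image
pxx, freqs, bins = mlab.specgram(y, NFFT=256, Fs=Fs)

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
T, F = np.meshgrid(bins, freqs)
# intensity (converted to dB; the epsilon guards against log of zero)
# becomes height on the z-axis
ax.plot_surface(T, F, 10 * np.log10(pxx + 1e-12), cmap=plt.cm.gist_heat)
ax.set_xlabel('Time [sec]')
ax.set_ylabel('Frequency [Hz]')
ax.set_zlabel('Intensity [dB]')
plt.show()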

Getting ready

For serious signal processing, we would go into low-level details in order to detect patterns and automatically fingerprint specific signals, but for this data visualization recipe we will leverage a couple of well-known Python libraries to read in an audio file, sample it, and plot a spectrogram.

In order to read .wav files and visualize sound, we need to do some prep work: we need to install the libsndfile1 system library for reading and writing audio files. This is done via your favorite package manager; for Ubuntu, you can use:

$ sudo apt-get install libsndfile1-dev

It is important to install the dev package, which contains header files so pip can build the scikits.audiolab module.

We can also install the libasound2 ALSA (Advanced Linux Sound Architecture) headers to avoid a runtime warning. This is optional, as we are not going to use the features provided by the ALSA library. For Ubuntu Linux, issue the following command:

$ sudo apt-get install libasound2-dev

To install scikits.audiolab, which we will use to read .wav files, we will use pip:

$ pip install scikits.audiolab

Note

Always remember to enter the virtual environment for your current project, as you don't want to dirty system libraries.
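
If you don't have one set up yet, creating and entering a virtual environment only takes a moment; for example (the environment name spectro-env is arbitrary):

$ pip install virtualenv
$ virtualenv spectro-env
$ source spectro-env/bin/activate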

How to do it...

For this recipe, we will use the prerecorded sound file test.wav that can be found in the file repository accompanying this book. We could also generate a sample, which we will try later.

In the following example, we perform these steps in order:

  1. Read the .wav file that contains the recorded sound sample.
  2. Define the length of the window used for the Fourier transform (NFFT).
  3. Define the number of overlapping data points while sampling (noverlap).

    Note

    NFFT defines the number of data points used to compute the Discrete Fourier Transform in each block. The computation is most efficient when NFFT is a power of two. The windows can overlap, and the number of data points that are overlapped (that is, repeated) between consecutive blocks is given by the noverlap argument; the short sketch after this note illustrates the arithmetic.
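
Here is a quick back-of-the-envelope sketch of how these two parameters interact; the 5-second, 44.1 kHz sample length is only an assumed example:

NFFT = 128     # data points per FFT block
noverlap = 65  # data points shared between consecutive blocks

# each new block advances by the "hop" size
hop = NFFT - noverlap  # 63 samples

# assumed example: a 5-second sample at a 44.1 kHz sample rate
n_samples = 5 * 44100
n_blocks = (n_samples - noverlap) // hop
print("hop: %d samples, FFT blocks: %d" % (hop, n_blocks))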

from scikits.audiolab import Sndfile
import numpy as np
from matplotlib import pyplot as plt

# load the sound file into a Sndfile instance
soundfile = Sndfile("test.wav")


# define start/stop seconds and compute start/stop frames
start_sec = 0
stop_sec  = 5
start_frame = start_sec * soundfile.samplerate
stop_frame  = stop_sec * soundfile.samplerate

# go to the start frame of the sound object
soundfile.seek(start_frame)

# read number of frames from start to stop
delta_frames = stop_frame - start_frame
sample = soundfile.read_frames(delta_frames)


# name of the colormap used to encode intensity
colormap = 'CMRmap'

fig = plt.figure(figsize=(10, 6))
ax = fig.add_subplot(111)
# define number of data points for FT
NFFT = 128
# define number of data points to overlap for each block
noverlap = 65

pxx, freq, t, cax = ax.specgram(sample, Fs=soundfile.samplerate,
                                NFFT=NFFT, noverlap=noverlap,
                                cmap=plt.get_cmap(colormap))
plt.colorbar(cax)
plt.xlabel("Time [sec]")
plt.ylabel("Frequency [Hz]")

plt.show()

This generates the following spectrogram, with visible "white-like" traces for separate notes.

How it works...

We need to load a sound file first. To do this, we use the scikits.audiolab.Sndfile class and provide it with a filename. This instantiates a sound object that we can query for data and call functions on.

To read the data needed for the spectrogram, we read the desired frames from our sound object. We first seek() to the start frame and then call read_frames(), which accepts the number of frames to read. We calculate the frame numbers by multiplying the sample rate by the time points (start and stop) we want to visualize.
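
If you are unsure what a file holds before plotting it, the Sndfile instance also exposes a few basic properties we can inspect. A minimal sketch, assuming test.wav sits in the working directory:

from scikits.audiolab import Sndfile

soundfile = Sndfile("test.wav")
# basic properties of the loaded sound file
print("sample rate: %d Hz" % soundfile.samplerate)
print("channels: %d" % soundfile.channels)
print("frames: %d" % soundfile.nframes)
print("duration: %.2f sec" % (float(soundfile.nframes) / soundfile.samplerate))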

There's more...

If you can't find an audio (.wav) file, you can easily generate one. Here's how:

import numpy
from matplotlib import pyplot as plt


def _get_mask(t, t1, t2, lvl_pos, lvl_neg):
    # return lvl_pos where t1 < t < t2, and lvl_neg everywhere else
    if t1 >= t2:
        raise ValueError("t1 must be less than t2")

    return numpy.where(numpy.logical_and(t > t1, t < t2), lvl_pos, lvl_neg)


def generate_signal(t):
    sin1 = numpy.sin(2 * numpy.pi * 100 * t)
    sin2 = 2 * numpy.sin(2 * numpy.pi * 200 * t)

    # mix in the higher-pitched signal only between 2 and 5 seconds
    sin2 = sin2 * _get_mask(t, 2, 5, 1.0, 0.0)

    noise = 0.02 * numpy.random.randn(len(t))
    final_signal = sin1 + sin2 + noise
    return final_signal


if __name__ == '__main__':
    step = 0.001
    # the sampling frequency must equal 1 / step for the time axes to match
    sampling_freq = 1000
    t = numpy.arange(0.0, 20.0, step)
    y = generate_signal(t)

    # we can visualize the generated signal now,
    # first in the time domain
    plt.subplot(211)
    plt.plot(t, y)
    # and then in the frequency domain
    plt.subplot(212)
    plt.specgram(y, NFFT=1024, noverlap=900,
                 Fs=sampling_freq, cmap=plt.cm.gist_heat)

    plt.show()

This will give you the following figure, where the top subplot represents the signal we generated. Here, the x-axis is time and the y-axis is the signal's amplitude. The bottom subplot is the same signal in the frequency domain: the x-axis covers the same time span as in the top subplot (we matched the time axes by selecting the sampling rate), while the y-axis is the frequency of the signal.
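
We can convince ourselves that the two time axes line up with a little arithmetic: a step of 0.001 seconds implies a sampling frequency of 1/step = 1000 Hz, so the 20,000 samples passed to specgram() span exactly the 20 seconds shown in the top subplot. A quick check:

step = 0.001
sampling_freq = 1.0 / step    # 1000 Hz, the Fs we pass to specgram()
n_samples = int(20.0 / step)  # 20000 samples cover 0 to 20 seconds
duration = n_samples / sampling_freq
print(duration)               # 20.0, matching the top subplot's x-axis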
