In this notebook, we are going to take a closer look at the data. Let us begin by loading everything in.
import librosa
import pandas as pd
import numpy as np
from IPython.display import Audio
from matplotlib import pyplot as plt
import multiprocessing
from scipy import signal
anno = pd.read_pickle('data/annotations.dataframe.pkl.gz')
anno.head()
The annotations dataframe contains the extracted calls in the call column. This dataset does not include any other annotations. All of the calls were recorded at a sample rate of 24 kHz.
SAMPLE_RATE = 24000
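As a sanity check, a call's duration in seconds is simply its number of samples divided by the sample rate. A minimal sketch on a synthetic array (the array is a stand-in, not a real call from the dataset):

```python
import numpy as np

SAMPLE_RATE = 24000  # the dataset's sample rate

# Synthetic stand-in for a call: 1.5 seconds of silence at 24 kHz.
fake_call = np.zeros(int(1.5 * SAMPLE_RATE))

duration_seconds = fake_call.shape[0] / SAMPLE_RATE
print(duration_seconds)  # 1.5
```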
There are a total of 17882 calls in this dataset.
anno.shape
The calls are of varying types, and we do not have additional labels for them.
Here is what the distribution of call durations looks like:
call_durations = anno.call.apply(lambda x: x.shape[0] / SAMPLE_RATE)
plt.title('Call durations in seconds')
plt.xlabel('seconds')
plt.ylabel('count')
plt.hist(call_durations);
Out of the 17882 calls, 16432 (92% of all calls) are under two seconds long.
sum(call_durations < 2)
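The same boolean mask can also be used to subset the dataframe itself, which is how the shorter calls below are selected. A sketch on a toy dataframe (the column name `call` and the two-second threshold come from this notebook; the toy values do not):

```python
import numpy as np
import pandas as pd

SAMPLE_RATE = 24000

# Toy stand-in for anno: three calls under 2 s and one longer call.
toy = pd.DataFrame({
    'call': [np.zeros(n) for n in (12000, 24000, 36000, 72000)]
})
durations = toy.call.apply(lambda x: x.shape[0] / SAMPLE_RATE)

short_calls = toy[durations < 2]
print(len(short_calls), 'of', len(toy))  # 3 of 4
```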
Let us look at the distribution of these shorter calls more closely.
plt.title('Call durations in seconds')
plt.xlabel('seconds')
plt.ylabel('count')
plt.hist(call_durations[call_durations < 2]);
The vocalizations are extremely varied. Below is a non-exhaustive selection that illustrates some of the richness of this dataset.
The labels are not annotations, but my own qualitative descriptions of the calls.
| growl-like | bark-like | yawn-like |
|---|---|---|
| whistle-like | squeak-like | trumpet-like |
| elephant-like | parrot-like | |
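Beyond listening to a call with `Audio(call, rate=SAMPLE_RATE)`, a spectrogram is a useful way to inspect a vocalization visually. A hedged sketch using `scipy.signal.spectrogram` on a synthetic chirp (a real call from the `call` column would be passed in place of `y`; the window length of 512 is an arbitrary choice, not one made in this notebook):

```python
import numpy as np
from scipy import signal

SAMPLE_RATE = 24000

# Synthetic 1-second chirp standing in for a real call.
t = np.linspace(0, 1, SAMPLE_RATE, endpoint=False)
y = signal.chirp(t, f0=500, f1=8000, t1=1.0)

# f: frequency bins (Hz), tt: frame times (s), Sxx: power matrix.
f, tt, Sxx = signal.spectrogram(y, fs=SAMPLE_RATE, nperseg=512)
print(Sxx.shape)  # (frequency bins, time frames)
```

The resulting `Sxx` matrix can be rendered with `plt.pcolormesh(tt, f, Sxx)` to get the familiar time-frequency picture.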