AudioSample is an optimized, numpy-like audio manipulation library, created for researchers and used by developers.
It supports complex audio operations with ease and offers a familiar syntax for those accustomed to numpy.
AudioSample is well suited to data loading and ETL pipelines because it is fast and, thanks to lazy evaluation, has a low memory footprint.
Seamless Audio Operations: Perform a wide range of audio manipulations, including mixing, filtering, and transformations.
Integration with Numpy: Leverage numpy's syntax and capabilities for intuitive audio handling.
Integration with Torch: Export audio directly to and from torch tensors.
High Performance: Optimized for speed and efficiency, suitable for research and production environments. Most actions are lazy, so no operation is performed until absolutely necessary.
Extensive I/O Support: Easily read from and write to various audio formats. Uses PyAV to support a wide range of formats.
Numpy Compatibility: Supports numpy up to version 2.2.0.
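To make the idea of lazy actions concrete, here is a minimal sketch in plain Python. It is an illustration only: LazyClip, its methods, and the load function are hypothetical names invented for this example and do not describe AudioSample's actual internals. Each call merely records an operation, and the samples are loaded and processed only when materialize() is called.

```python
class LazyClip:
    """Hypothetical stand-in for a lazily evaluated audio object."""

    def __init__(self, loader, ops=()):
        self._loader = loader    # callable that returns the samples when invoked
        self._ops = tuple(ops)   # pending operations, applied in order

    def gain(self, factor):
        # Record the operation instead of touching any samples.
        return LazyClip(self._loader, self._ops + (lambda s: [x * factor for x in s],))

    def slice(self, start, stop):
        # Record the slice; nothing is loaded yet.
        return LazyClip(self._loader, self._ops + (lambda s: s[start:stop],))

    def materialize(self):
        # Only now is the data loaded and every pending operation applied.
        samples = self._loader()
        for op in self._ops:
            samples = op(samples)
        return samples


load_calls = []

def load():
    # Stand-in for an expensive file read; counts how often it runs.
    load_calls.append(1)
    return [0.1, 0.2, 0.3, 0.4]

clip = LazyClip(load).gain(2.0).slice(1, 3)  # no load has happened yet
calls_before = len(load_calls)
result = clip.materialize()                  # load, gain and slice all run here
```

In this sketch the loader runs exactly once, at the final step, which is the property that keeps memory use low when long chains of operations are built up before anything is written out.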
Streaming input, streaming output:
AudioSample now supports receiving a Python generator as input: Generator[Union[bytes, numpy.ndarray, AudioSample]].
Warning: Streamed data is currently still stored entirely in memory, so a stream cannot run indefinitely.
Plugin functionality is not supported in stream mode.
Streaming mode requires PyAV.
The constructor also accepts numpy buffers (equivalent to calling AudioSample.from_numpy); use force_read_sample_rate to set the sample rate.
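As an illustration of the kind of generator the streaming input refers to, the sketch below yields fixed-size chunks of bytes from any binary stream. The helper name iter_chunks, the chunk size, and the in-memory stand-in stream are assumptions made for this example; the exact AudioSample call that consumes such a generator is not shown here.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(stream: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield successive chunks of bytes until the stream is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Stand-in for encoded audio arriving from a file or socket opened in binary mode.
fake_stream = io.BytesIO(b"\x00" * 10000)
chunks = list(iter_chunks(fake_stream))
chunk_sizes = [len(c) for c in chunks]
```

Because the generator is consumed chunk by chunk, the producer never needs the whole input at once, although per the warning above the library itself currently keeps the received data in memory.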
To install AudioSample with all optional dependencies, use pip:
pip install audiosample[all]
#Possible extras are:
[av] - only av
[torch] - add torch
[tests] - include everything for tests.
[noui] - install without jupyter support.
[play] - bare, with ability to play audio in console. (uses pyaudio)
To install the system prerequisites (portaudio):
#Mac OS:
brew install portaudio
#linux/WSL:
apt-get install portaudio19-dev
Here's a quick example of how to load, process, and save audio using AudioSample:
import audiosample as ap
import numpy as np

# Create a 1 second audio sample with 48000 samples per second and 2 channels
au = ap.AudioSample.from_numpy(np.random.rand(2, 48000), rate=48000)
beep = ap.AudioSample().beep(1).to_stereo()

# Mix the attenuated noise with the beep and save the result
out = au.gain(-12) * beep
out.write("beep_with_overlayed_noise.mp3")

# Concatenate noise, one second of silence, and the beep, then save
out = au.gain(-10) + au.silence(1) + beep
out.write("noise_then_silence_then_beep.mp3")
Resampling: Fast resampling of audio.
Normalization: Easily normalize audio levels.
Mixing: Easily mix multiple audio sources together using the * operator.
Concatenation: Easily concatenate audio sources using the + operator.
Playback: Play audio directly in Jupyter notebooks or from the command line.
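The operations listed above can be pictured with plain Python lists of samples. This is a conceptual sketch, not AudioSample's implementation: it assumes that gain values are given in decibels, that mixing means sample-wise addition with the shorter source padded by silence, and that concatenation appends one source after another. All function names here are hypothetical.

```python
def db_to_linear(db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)


def apply_gain(samples, db):
    """Scale every sample by the linear factor for the given dB gain."""
    factor = db_to_linear(db)
    return [s * factor for s in samples]


def mix(a, b):
    """Overlay two sources: pad the shorter with zeros, then add sample-wise."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def concat(a, b):
    """Play one source after the other."""
    return a + b


noise = [1.0, -1.0, 1.0]
beep = [0.5, 0.5]
quieter = apply_gain(noise, -20)   # amplitude scaled by 0.1
overlay = mix(noise, beep)         # same length as the longer source
sequence = concat(noise, beep)     # length is the sum of both
```

The key difference is in the resulting duration: mixing keeps the length of the longest input, while concatenation adds the lengths together.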
AudioSample outperforms PyDub
Benchmark: open, concatenate, and save.
longbeep.wav is a 100-second WAV file of a beep.
import pydub
from audiosample import AudioSample

def test_audiosample():
    au = AudioSample()
    for i in range(0, 100):
        au += AudioSample("longbeep.wav")[50:51]
    au.write("out.wav")

def test_pydub():
    au = pydub.AudioSegment.empty()
    for i in range(0, 100):
        au += pydub.AudioSegment.from_file("longbeep.wav")[50:51]
    au.export("out.wav")

%timeit test_audiosample()
#52.9 ms ± 1.89 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit test_pydub()
#376 ms ± 15.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
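The %timeit magic used above is only available in IPython and Jupyter. Outside a notebook, a comparable measurement could be taken with the standard timeit module, as in the sketch below. The helper best_ms and the dummy workload are hypothetical; in practice the two benchmark functions above would be passed in instead. Note that this helper reports the best run, whereas %timeit reports mean and standard deviation.

```python
import timeit


def best_ms(fn, repeat=7, number=10):
    """Return the best per-call time in milliseconds over `repeat` runs of `number` calls."""
    times = timeit.repeat(fn, repeat=repeat, number=number)
    return min(times) / number * 1000.0


def dummy_workload():
    # Placeholder for a real benchmark function such as test_audiosample.
    return sum(i * i for i in range(1000))


elapsed_ms = best_ms(dummy_workload, repeat=3, number=5)
print(f"best of 3 runs: {elapsed_ms:.3f} ms per call")
```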
AudioSample is developed by Deepdub, a company specializing in AI-driven audio solutions. Deepdub focuses on enhancing media experiences through cutting-edge technology, enabling content creators to reach global audiences with high-quality, localized audio.
If you have questions or need help, please open an issue on GitHub.