ESP32 Audio Input Using I2S and Internal ADC

Learn how to capture audio data on an ESP32 using its built-in analog-to-digital converters in this detailed tutorial. See how the I2S peripheral and its DMA controller stream samples into memory, and how to improve audio quality with the MAX4466 and MAX9814 microphone breakout boards.

ESP32 Audio Output with I2S DMA and the MAX98357A Class D Amplifier - Learn how to use the MAX98357A breakout board with an ESP32 to output audio, create a digital audio path, configure the I2S interface, and read WAVE files from SPIFFS in this engaging tutorial.

Play MP3 Files on ESP32 Without Codec Chip: Easy Guide - Learn how to decode and play MP3 audio files on the ESP32 with both headphone support and I2S digital amplifiers. Discover techniques to enhance audio quality and reduce power interference for clearer sound.

ESP32 Audio Input Showdown: INMP441 vs SPH0645 MEMS I2S Microphones! - Discover the performance of two MEMS microphone boards, the SPH0645 and the INMP441, when connected to an ESP32. This video showcases their audio recording capabilities, noise handling and overall usability, with the INMP441 emerging as the winner!

I've got 150 breakout boards to test - Learn about testing MAX98357 stereo amplifier and ICS-43434 I2S microphone breakout boards in a reliable and efficient way, as well as dealing with possible faulty components.

ESP32 Audio Input - MAX4466, MAX9814, SPH0645LM4H, INMP441 - In this blog post, I've delved deep into the world of audio input for ESP32, exploring all the different options for getting analogue audio data into the device. After discussing the use of the built-in Analogue to Digital Converters (ADCs), I2S to read ADCs with DMA, and using I2S to read directly from compatible peripherals, I go on to present hands-on experiments with four different microphones (MAX4466, MAX9814, SPH0645, INMP441). This comprehensive look at getting audio into the ESP32 should be a valuable resource for anyone hungry for a deep-dive into ESP32's audio capabilities, complete with YouTube videos for an even more detailed look!

ESP32-S3 no DAC - No Problem! We'll Use PDM - In this post, I tackle the lack of a DAC on the ESP32-S3 by demonstrating how to use Pulse Density Modulated (PDM) audio with Sigma Delta Modulation to achieve analog audio output. I explore the simplicity of creating a PDM signal and its reconstruction into an audio signal using a low pass filter, even an RC filter, though a more sophisticated active filter is recommended. I guide through using both a timer and the I2S peripheral on the ESP32 for outputting PDM data, noting the quirks and solutions for each method. And I wrap up with how straight PDM signals can drive headphones or work with various amplifiers, including the MAX98358 or SSM2537, exhibiting the versatility of PDM in audio applications with the ESP32-S3.

ESP32 I2S DMA Settings - dma_buf_len and dma_buf_count Explained - In this blog post, we delve deep into the intriguing concepts of I2S audio and DMA, particularly focusing on parameters like dma_buf_count and dma_buf_len. We explore their roles, ideal values, and the impacts they have on aspects such as CPU load and latency. We also discuss the limitations that come hand in hand with these parameters. This post aims to provide you some insights on trading-off between latency, CPU load, memory usage and the overall buffer space allocation. However, the primary takeaway remains that the optimal configurations largely depend on individual context and application needs.

Decoding AVI Files for Fun and... - After some quality time with my ESP32 microcontroller, I've developed a version of the TinyTV and learned a lot about video and audio streaming along the way. Using Python and Wi-Fi technology, I was able to set up the streaming server with audio data, video frames, and metadata. I've also explored the picture quality challenges of uncompressed image data and learned about MJPEG frames. Together with JPEGDEC for depth decoding, I've managed to effectively use ESP32's dual cores to achieve an inspiring 28 frames per second. Discussing audio sync, storage options and the intricacies of container file formats for video storage led me to the AVI format. The process of reading and processing AVI file headers and the listing subtype 'movi' allowed me to make significant headway in my project. All in all, I'm pretty chuffed with my portable battery powered video player. You can check out my code over on Github!

DIY Alexa With the ESP32 and Wit.ai - This post provides a comprehensive guide to building a do-it-yourself (DIY) Alexa using an ESP32 and Wit.ai. It illustrates how to create a wake word detection system, use Python for machine learning and employ TensorFlow for the 'wake' word identification. It also covers the usage of Wit.ai for intent recognition and managing commands. The post is fully backed with code snippets, examples and video tutorials to deliver an interactive learning experience to readers.

[0:00] hey everyone for my next project i need
[0:01] to get audio data into the esp32
[0:04] to do this we’re going to use the
[0:06] built-in analog-to-digital converters
[0:08] on the esp32 now the esp32
[0:12] integrates two 12-bit analog to digital
[0:14] converters
[0:15] adc1 has 8 channels and adc2
[0:18] has 10 channels adc2 is used by the wifi
[0:22] driver so we can only use
[0:23] adc2 when not using wi-fi in addition
[0:26] some of the pins attached to adc2 are
[0:28] used as strapping pins
[0:30] so they cannot be used freely i’m going to
[0:31] be using wifi in my project so i’ll
[0:33] stick to using
[0:34] adc1 for simple low frequency sampling
[0:37] we can use the arduino analogRead
[0:39] or we can use the espressif adc
[0:42] functions directly if you need
[0:44] very accurate readings then you can
[0:45] calibrate your analog to digital
[0:47] converter for chips manufactured
[0:49] recently this may have been done
[0:50] in the factory but if required you can
[0:52] also do this manually check the
[0:54] description
[0:55] for some links on how to do this to get
[0:56] a calibrated value from the adc we need
[0:59] to do a few steps we need a fallback
[1:01] value for the vref
[1:02] if nothing is available from the
[1:03] calibration system we then need to set
[1:05] up our adc
[1:06] in a known configuration in this example
[1:09] code we are setting our adc
[1:10] to 12 bit resolution giving us a range
[1:13] from naught
[1:13] to 4095 we’re also setting our
[1:16] attenuation
[1:17] to 11 db this should give us the full
[1:19] input range
[1:20] from naught to 3.3 volts we can now pull
[1:23] back
[1:23] the calibration characteristics for
[1:25] these settings and now
[1:26] we can get the raw value from the adc
[1:29] and map it onto a voltage
[1:31] using the calibration characteristics to
[1:33] test this
[1:34] i’ve hooked up a potentiometer to one of
[1:37] the adc
[1:38] channels and printed out both the raw
[1:41] value
[1:42] and the calibrated value in a loop
[1:46] as we vary the voltage on the pin the
[1:49] value read from the adc
[1:52] changes
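
The raw-to-voltage step described above can be pictured with a small host-side sketch. It assumes an ideal, perfectly linear 12-bit converter with a nominal 3.3 V full scale, which a real ESP32 ADC is not. On the device, the legacy ESP-IDF calibration calls (`esp_adc_cal_characterize` followed by `esp_adc_cal_raw_to_voltage`) are what apply the stored characteristics. The helper name `ideal_raw_to_millivolts` and both constants are illustrative and are not taken from the video's code.

```cpp
#include <cassert>
#include <cstdint>

// Assumptions for illustration only: an ideal, linear ADC.
constexpr std::uint32_t kAdcMaxCode = 4095;           // largest 12-bit code (2^12 - 1)
constexpr std::uint32_t kFullScaleMillivolts = 3300;  // nominal full-scale input

// Hypothetical helper: map a raw 12-bit reading to millivolts,
// rounding to the nearest millivolt.
std::uint32_t ideal_raw_to_millivolts(std::uint32_t raw) {
    assert(raw <= kAdcMaxCode);
    return (raw * kFullScaleMillivolts + kAdcMaxCode / 2) / kAdcMaxCode;
}
```

With these assumptions a mid-scale reading of 2048 maps to roughly 1650 mV. A calibrated conversion on real hardware will differ, which is the reason for the calibration step.
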
[1:58] you can see from my dev board that it’s
[2:00] been factory calibrated
[2:02] with a vref value using the adc in this
[2:07] way is fine
[2:08] if you only need to get a value every so
[2:10] often or you want to sample
[2:12] at very low frequencies for very low
[2:16] quality speech
[2:17] we need to capture a bandwidth of around
[2:19] four kilohertz
[2:21] for good quality speech we need around
[2:24] eight kilohertz
[2:25] and for high quality audio we need a
[2:27] bandwidth of 20 kilohertz
[2:31] to avoid issues with aliasing we need to
[2:34] sample at or above the nyquist rate
[2:36] which is twice the highest frequency you
[2:38] want to capture
[2:39] for audio this would be around 40
[2:41] kilohertz
[2:42] you can see in this animation the effect
[2:45] when the sampling rate
[2:46] is too low as the input frequency
[2:49] approaches and passes the nyquist limit we
[2:52] start to see aliasing
[2:53] and the output signal cannot be
[2:55] reconstructed from the samples
[2:59] since our audio signals will go up to 20
[3:01] kilohertz
[3:02] we need to sample at a minimum of 40
[3:05] kilohertz
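
The Nyquist reasoning above can be checked numerically. The sketch below is a host-side illustration, not code from the video: `min_sample_rate_hz` doubles the bandwidth, and `apparent_frequency_hz` shows where a pure tone lands after sampling. Both function names are made up for this example.

```cpp
#include <cassert>
#include <cmath>

// Minimum sample rate needed to capture a given bandwidth (Nyquist criterion).
double min_sample_rate_hz(double bandwidth_hz) {
    assert(bandwidth_hz > 0.0);
    return 2.0 * bandwidth_hz;
}

// Frequency at which a pure tone appears after sampling at sample_rate_hz.
// Tones up to sample_rate_hz / 2 come back unchanged; higher tones fold
// down to a lower (aliased) frequency.
double apparent_frequency_hz(double tone_hz, double sample_rate_hz) {
    assert(sample_rate_hz > 0.0);
    const double nearest_multiple =
        std::round(tone_hz / sample_rate_hz) * sample_rate_hz;
    return std::fabs(tone_hz - nearest_multiple);
}
```

For example, a 25 kHz tone sampled at 40 kHz is indistinguishable from a 15 kHz tone, while a 5 kHz tone is captured as 5 kHz.
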
[3:07] now doing this in a loop on the cpu
[3:09] would leave us no time
[3:11] for doing any other work our cpu
[3:14] would be constantly polling the adc for
[3:16] new samples
[3:17] and would not have much time for any
[3:19] other processing
[3:21] we would also struggle to achieve a
[3:22] consistent sampling period
[3:24] leading to weird artifacts in our audio
[3:27] fortunately
[3:28] there is an alternative mechanism for
[3:30] reading samples from the adc
[3:33] we can use the i2s peripheral to read
[3:35] the samples
[3:37] this has a dedicated dma controller that
[3:40] allows us to stream samples
[3:41] straight into ram buffers independently
[3:44] from the cpu
[3:46] i’ve written a simple sketch that uses
[3:48] i2s
[3:49] to read the samples and as audio data
[3:52] becomes available
[3:53] it streams it to a server running on my
[3:56] desktop computer
[3:57] which writes the audio data to disk
[4:01] i’ve set a sample rate of 40 kilohertz
[4:04] and allocated four dma buffers at 1024
[4:08] bytes each
[4:09] this should give us sufficient time to
[4:11] process any buffers without them being
[4:13] overwritten
[4:14] with new data we’ve set up the i2s
[4:17] peripheral
[4:18] to read from adc one channel seven this
[4:21] equates to
[4:23] gpio35 and then we set up a task
[4:26] to read the values from the i2s queue
[4:28] looking at our reader task
[4:31] it waits for an i2s event rx done to be
[4:34] received
[4:35] and then reads bytes from the dma buffer
[4:37] into our own local buffer
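
How much time the reader task has before data is overwritten follows from the buffer sizes and the sample rate. The sketch below takes the figures from the video at face value (four buffers of 1024 bytes at 40 kHz) and additionally assumes 16-bit samples, which the video does not state. The function names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Microseconds of audio held by one DMA buffer.
// bytes_per_sample is an assumption (2 for 16-bit samples).
std::uint64_t buffer_duration_us(std::uint32_t buffer_bytes,
                                 std::uint32_t bytes_per_sample,
                                 std::uint32_t sample_rate_hz) {
    assert(bytes_per_sample > 0 && sample_rate_hz > 0);
    const std::uint64_t samples = buffer_bytes / bytes_per_sample;
    return samples * 1000000ULL / sample_rate_hz;
}

// Total audio the whole ring of buffers can hold before the oldest
// unread data starts to be overwritten.
std::uint64_t ring_duration_us(std::uint32_t buffer_count,
                               std::uint32_t buffer_bytes,
                               std::uint32_t bytes_per_sample,
                               std::uint32_t sample_rate_hz) {
    return buffer_count *
           buffer_duration_us(buffer_bytes, bytes_per_sample, sample_rate_hz);
}
```

Under these assumptions each buffer fills in 12.8 ms and the ring holds about 51 ms of audio, so the reader task needs to keep up on roughly that timescale.
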
[4:41] once we’ve read all the samples we’re
[4:42] interested in we do whatever processing
[4:45] we need to do
[4:46] and in our case i’m just sending the
[4:47] samples over to a local server
[4:50] running on my desktop machine here’s the
[4:52] output when we use our potentiometer
[4:54] to change the values on the adc for this
[4:57] example
[4:58] i’ve just used a sampling rate of 10
[5:00] kilohertz
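
The samples that arrive this way usually need unpacking before they can be treated as audio. The sketch below rests on an assumption about the data format that the video does not state: that each 16-bit word from the I2S ADC mode carries the channel number in the top 4 bits and the 12-bit reading in the low 12 bits. It also assumes the signal is biased at mid-scale (2048). The type and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One unpacked ADC word (layout is an assumption, see above).
struct AdcReading {
    std::uint8_t channel;  // top 4 bits of the word
    std::uint16_t value;   // low 12 bits of the word, 0..4095
};

AdcReading unpack_adc_word(std::uint16_t word) {
    return AdcReading{static_cast<std::uint8_t>(word >> 12),
                      static_cast<std::uint16_t>(word & 0x0FFF)};
}

// Convert unsigned 12-bit readings to signed samples centred on zero,
// assuming the input sits at mid-scale when silent.
std::vector<std::int16_t> to_signed_samples(const std::vector<std::uint16_t>& words) {
    std::vector<std::int16_t> out;
    out.reserve(words.size());
    for (std::uint16_t word : words) {
        out.push_back(static_cast<std::int16_t>(unpack_adc_word(word).value - 2048));
    }
    return out;
}
```
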
[5:02] we can now move on to getting audio into
[5:04] the system
[5:05] i’ve got two microphone breakout boards
[5:08] the max 4466 and the max
[5:11] 9814 now the max 4466
[5:15] has an adjustable gain from 25 times
[5:18] to 125 times and the max
[5:22] 9814 has a built-in automatic gain
[5:25] control
[5:26] this will make quiet sounds louder and
[5:28] loud sounds
[5:29] quieter both these modules are simple
[5:32] and easy to wire up
[5:33] only requiring power ground and an output connection
[5:37] so here’s an audio recording from the
[5:39] max 4466
[5:47] testing testing
[5:55] as you can hear we have some audio data
[5:58] but it’s terribly noisy and it’s not
[6:00] going to be usable
[6:02] here’s the audio recording from the
[6:03] max 9814
[6:11] testing testing one two three
[6:17] it’s a bit better but there is still
[6:19] quite a lot of noise coming through
[6:21] it looks like the noise coincides with
[6:24] our transmission of data over wi-fi
[6:26] i’ve attached an oscilloscope to the 3.3
[6:29] volt output from the dev board
[6:31] and the output from the max 4466
[6:35] you can see that our 3.3 volt line has a
[6:38] lot of noise
[6:39] when the module is transmitting this
[6:42] noise is being amplified by the
[6:43] sensitive op amps
[6:44] on the microphone board looking at the
[6:47] vin line
[6:48] we can see a similar problem so what can
[6:50] we do
[6:51] we could connect up a separate power
[6:53] supply or we could use a battery
[6:56] to power our microphone boards but this
[6:58] is not going to be very convenient
[7:01] we could also turn off wi-fi but i need
[7:04] wi-fi
[7:04] for my project so the underlying problem
[7:08] seems to be the current draw when the
[7:10] esp needs to transmit
[7:12] this is causing a large amount of noise
[7:14] on the 3v3
[7:16] power rail which is then picked up by
[7:18] the very sensitive microphone amplifiers
[7:21] i’ve tried adding capacitors which have
[7:23] a small effect
[7:25] but to really fix it we would need to
[7:27] add excessively large capacitors
[7:30] to both the vin and the 3v3 rails
[7:33] so our microphone boards don’t require
[7:36] very much current
[7:37] so what we can do is take the vin line
[7:41] and pass it through an rc filter for my
[7:44] tests i’m just using a
[7:45] 30 ohm resistor and a 470
[7:48] microfarad capacitor we can then feed
[7:51] this filtered signal
[7:53] into a low dropout regulator to generate
[7:56] a clean
[7:57] 3v3 power supply for the microphone
[7:59] boards
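
The effect of the RC filter can be estimated with the standard first-order formula f_c = 1 / (2πRC). The sketch below is a host-side calculation using the 30 ohm and 470 microfarad values from the video. The load current used in the usage note is a made-up figure for illustration, since the video only says the boards do not need much current.

```cpp
#include <cassert>

// Cutoff (-3 dB) frequency of a first-order RC low-pass filter:
// f_c = 1 / (2 * pi * R * C).
double rc_cutoff_hz(double resistance_ohms, double capacitance_farads) {
    assert(resistance_ohms > 0.0 && capacitance_farads > 0.0);
    const double pi = 3.14159265358979323846;
    return 1.0 / (2.0 * pi * resistance_ohms * capacitance_farads);
}

// Voltage dropped across the series resistor for a given load current
// (Ohm's law). This is why the approach suits low-current loads only.
double resistor_drop_volts(double resistance_ohms, double load_current_amps) {
    return resistance_ohms * load_current_amps;
}
```

`rc_cutoff_hz(30.0, 470e-6)` comes out at roughly 11.3 Hz, well below the audio band, so supply noise at audio frequencies is strongly attenuated. With a hypothetical 3 mA load, `resistor_drop_volts(30.0, 0.003)` is about 0.09 V, small enough to leave headroom for the regulator.
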
[8:00] looking at the low-pass filtered vin we
[8:03] can see that we have a
[8:04] cleaner signal and on the 3v3
[8:07] output of the regulator we now have a
[8:10] pretty clean signal
[8:11] with noise down to around 20 millivolts
[8:14] instead of the 200 millivolts
[8:16] that we were seeing on the original 3v3
[8:18] line
[8:19] from the board looking at a trace from
[8:21] our microphone boards
[8:23] here’s the max 4466
[8:26] we don’t see any noise now when the
[8:28] esp32
[8:30] transmits and the same is true for the
[8:32] max
[8:33] 9814 so here’s an audio sample
[8:37] from the max4466 testing testing
[8:42] one two three and an audio sample
[8:45] from the max 9814
[8:48] testing testing one two three
[8:53] they are both better but still quite
[8:55] noisy
[8:57] one thing we can do is filter out some
[8:59] of the noise
[9:00] by oversampling and then taking an
[9:03] average value
[9:04] as the actual sample value so i’ve tried
[9:06] using a
[9:07] median filter here and you can see
[9:10] that our simulation shows quite an
[9:12] impact
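
Both ideas mentioned above, averaging oversampled data and median filtering, fit in a few lines. This is a minimal host-side sketch and not the filter used in the video: `average_decimate` replaces each group of consecutive samples with its mean, and `median_of` returns the middle value of a small window, which rejects isolated spikes. Both names are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Reduce the sample rate by `factor`, replacing each group of `factor`
// consecutive samples with their mean. A trailing partial group is dropped.
std::vector<int> average_decimate(const std::vector<int>& samples, std::size_t factor) {
    assert(factor > 0);
    std::vector<int> out;
    out.reserve(samples.size() / factor);
    for (std::size_t i = 0; i + factor <= samples.size(); i += factor) {
        long long sum = 0;
        for (std::size_t j = 0; j < factor; ++j) {
            sum += samples[i + j];
        }
        out.push_back(static_cast<int>(sum / static_cast<long long>(factor)));
    }
    return out;
}

// Median of a small window. The window is taken by value because
// std::nth_element reorders its input.
int median_of(std::vector<int> window) {
    assert(!window.empty());
    const std::size_t mid = window.size() / 2;
    std::nth_element(window.begin(), window.begin() + mid, window.end());
    return window[mid];
}
```

The two behave differently on a spike: in the window {5, 100, 7} the median is 7, whereas the mean is pulled up to 37 by the outlier.
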
[9:14] and here’s the audio from the max 4466
[9:21] testing testing one two three
[9:24] and the audio from the max 9814
[9:28] testing testing one two three
[9:33] so which module should we actually use
[9:35] for my next project
[9:38] i think the max 9814
[9:41] comes out the winner it seems to be less
[9:44] susceptible to
[9:45] power supply noise and the automatic
[9:48] gain control makes it very useful
[9:51] we don’t have to fiddle with a
[9:53] potentiometer to change the volume
[9:56] if you really need high quality
[9:59] low noise input then the built-in adcs
[10:03] on the esp32
[10:05] are probably not suitable and i would
[10:07] take a look at some external boards
[10:09] for this kind of project but on balance
[10:12] i think we’ll use the
[10:13] max 9814 for my next project
[10:17] so thanks for watching i hope you found
[10:19] this video
[10:20] useful and interesting all the code is
[10:23] in a github repo
[10:24] the link’s in the description if you did
[10:26] find the video useful
[10:28] then please subscribe to the channel and
[10:30] hit the like button
[10:32] there’s another video coming soon where
[10:34] i’ll actually do something with the
[10:36] audio data
[10:37] so see you in the next video

Written by

Chris Greening

Supported by

atomic14

A collection of slightly mad projects, instructive/educational videos, and generally interesting stuff. Building projects around the Arduino and ESP32 platforms - we'll be exploring AI, Computer Vision, Audio, 3D Printing - it may get a bit eclectic...
