Streaming Video and Audio over WiFi with the ESP32

In this video, we dive into a hardware hack combining several components to create my version of the TinyTV, complete with a remote control and video streaming over Wi-Fi. We test how fast we can get an image onto the display, trying different libraries and tweaking performance along the way. We explore Motion JPEG (MJPEG) to decode and draw images quickly, and reach about 28 frames per second. We also handle audio using 8-bit PCM data at 16kHz, and deal with keeping the video and audio streams in sync. Finally, we add some interactive elements that let us change channels and control the volume, with a classic static animation thrown in for good measure. There are a few hiccups along the way, but that's part of the fun, right?

[0:00] “Oh! Have we got the video?”
[0:02] “Yes, we’ve got a video!”
[0:06] Yes, we have got a video.
[0:08] My version of the TinyTV is working.
[0:11] It’s even got a remote control.
[0:13] So a big shout out to the channel’s Patreon supporters,
[0:15] they’ve been getting a couple of sneak previews of progress along the way on this project.
[0:19] And once again thanks to PCBWay who manufactured the PCBs that I’m using for this project.
[0:24] They’ve been working flawlessly,
[0:25] but keep an eye out for a follow up video where I fix some of the mistakes that I’ve made.
[0:30] Now in the previous video I asked if you’d prefer to see streaming over Wi-Fi
[0:34] or streaming from an SD card.
[0:36] And the people’s choice was streaming over Wi-Fi.
[0:39] And in a way I was kind of pleased about this.
[0:41] I don’t have an SD card slot on my PCB and although we’ve seen in previous videos how
[0:46] easy it is to wire one up, it would have meant doing a lot of messing around with decoding video files.
[0:51] Streaming over Wi-Fi to me is a little bit easier.
[0:54] It gives me free rein over how to present the data to the ESP32.
[0:58] But before we get to the server, we’ve got some challenges to solve on the ESP32 side of things.
[1:03] The first question I had was how fast can we actually display an image on the display?
[1:09] We saw in previous videos using animated GIFs that the screen is pretty fast.
[1:13] But just how fast is it?
[1:15] So I’ve created a full screen uncompressed image and hard coded it into the sketch.
[1:19] Let’s see how quickly we can push it to the screen.
[1:22] I’m using the very excellent TFT_eSPI library for this test.
[1:26] So that’s pretty impressive.
[1:27] We can push 280 by 240 16-bit pixels in about 17 milliseconds.
[1:33] That would give us a theoretical frame rate of almost 60 frames per second.
[1:38] But there are some issues with this.
[1:39] I’m planning on streaming the frames over Wi-Fi and a single uncompressed frame is over 130K.
[1:45] That is pretty big.
[1:47] It might be doable over a very fast Wi-Fi network,
[1:50] but to get something that works we’re going to have to make it much smaller.
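
The figures quoted above can be checked with a little arithmetic. This is a plain Python sketch rather than firmware code; the helper names are made up for illustration, and the 15 frames per second used for the bandwidth estimate is the target frame rate mentioned later in the video.

```python
def frame_bytes(width: int, height: int, bits_per_pixel: int = 16) -> int:
    """Size of one uncompressed frame in bytes."""
    return width * height * bits_per_pixel // 8


def frames_per_second(frame_time_ms: float) -> float:
    """Best-case frame rate if every frame takes frame_time_ms."""
    return 1000.0 / frame_time_ms


def megabits_per_second(bytes_per_frame: int, fps: float) -> float:
    """Raw throughput needed to move frames at the given rate."""
    return bytes_per_frame * 8 * fps / 1_000_000


size = frame_bytes(280, 240)            # 134,400 bytes, a little over 130K
push_rate = frames_per_second(17)       # just under 59 frames per second
needed = megabits_per_second(size, 15)  # roughly 16 Mbit/s, before any protocol overhead

print(f"{size} bytes per frame, {push_rate:.1f} fps, {needed:.1f} Mbit/s at 15 fps")
```

Sustaining around 16 Mbit/s of payload on a small microcontroller over Wi-Fi is a tall order, which is why compressing each frame is the practical route.
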
[1:53] A really popular thing to do is to use something called Motion JPEG or MJPEG.
[1:58] This is just a stream of JPEG images so it’s really simple.
[2:01] The question now is how quickly can we decode and draw a JPEG?
[2:05] My initial attempts were not particularly promising.
[2:08] I used the sample code from the TFT_eSPI library
[2:12] and I’m pretty sure we can actually see the image being drawn.
[2:14] It’s taking around 180 milliseconds to draw an image.
[2:18] This gives us a maximum frame rate of about 8 to 9 frames per second.
[2:22] Not very impressive.
[2:24] Fortunately there is another JPEG decoder library called JPEGDEC.
[2:28] This has some impressive performance claims.
[2:30] Anyone who needs to use microseconds to measure time is obviously doing something right.
[2:34] If we use this library we’re now down to around 49 milliseconds to draw each frame.
[2:39] That’s a much more respectable 20 frames per second.
[2:42] That should give us time to download a frame, display it and still get a decent frame rate.
[2:47] We can go even faster by enabling DMA.
[2:50] This gets us down to just 36 milliseconds, which gives us 28 frames per second.
[2:55] The DMA solution works so well because we can hand off the pixels to the DMA controller
[2:59] and decode the next section of the image while the pixels are being sent to the display.
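
The benefit of overlapping decode and transfer can be seen in a simple timing model. This is only a sketch: it assumes the image is decoded in equally sized blocks and that each block takes a fixed time to decode and to send, neither of which is measured in the video. The numbers in the example are invented purely to show the shape of the saving.

```python
def serial_time(blocks: int, decode_ms: float, send_ms: float) -> float:
    """Decode a block, then wait for it to be sent, for every block."""
    return blocks * (decode_ms + send_ms)


def overlapped_time(blocks: int, decode_ms: float, send_ms: float) -> float:
    """Decode block n+1 while DMA is still sending block n."""
    if blocks == 0:
        return 0.0
    # The first decode and the last send cannot overlap with anything;
    # in between, each step costs whichever of the two is slower.
    return decode_ms + (blocks - 1) * max(decode_ms, send_ms) + send_ms


# Invented figures: 4 blocks, 2 ms to decode each, 1 ms to send each.
without_dma = serial_time(4, 2, 1)    # 12
with_dma = overlapped_time(4, 2, 1)   # 9
print(without_dma, with_dma)
```

In this model the saving approaches the whole of the shorter stage as the number of blocks grows, so DMA helps most when decoding and sending take similar amounts of time.
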
[3:03] We can get even more clever.
[3:05] We’ve got two cores on the ESP32.
[3:07] We can use one core to download the image and another core to decode and display it.
[3:12] This way, while we’re drawing a frame, we can be downloading the next frame to display.
[3:16] That’s pretty cool.
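
The two-core arrangement is a classic producer and consumer pipeline. Here is a minimal sketch of the idea using Python threads as stand-ins for the two cores; on the ESP32 the two halves would be separate tasks, and `fetch_frame` and `draw_frame` are hypothetical callables rather than anything from the project's code.

```python
import queue
import threading


def run_pipeline(timestamps, fetch_frame, draw_frame):
    """Fetch frames on one thread while the calling thread draws them."""
    frames = queue.Queue(maxsize=1)  # one frame waiting while one is being drawn
    done = object()                  # sentinel marking the end of the stream

    def downloader():
        try:
            for ts in timestamps:
                frames.put(fetch_frame(ts))  # blocks while the display side is busy
        finally:
            frames.put(done)                 # always release the display side

    worker = threading.Thread(target=downloader)
    worker.start()
    while True:
        frame = frames.get()
        if frame is done:
            break
        draw_frame(frame)
    worker.join()


# Fake "download" and "draw" steps so the sketch runs anywhere.
drawn = []
run_pipeline([0, 66, 133], lambda ts: f"jpeg@{ts}".encode(), drawn.append)
print(drawn)
```

The queue depth of one keeps memory use bounded: the downloader can be at most one frame ahead of the display.
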
[3:18] The target I had in my head for FPS was around 15 frames per second
[3:22] and this definitely seems achievable.
[3:24] The limiting factor will probably be my crappy Wi-Fi rather than any technical blockers.
[3:28] So how are we going to stream the frames from the server?
[3:32] I don’t want to make anything too complex as I want to keep the ESP32 code as simple as possible.
[3:37] The easiest thing that I can think of is to have an HTTP server
[3:40] that will give us a JPEG image from the video being played.
[3:43] I’m not trying to do live streaming so we can just have the ESP32 request a frame for a particular timestamp.
[3:49] This makes things very simple as we don’t need any complex synchronisation between the server and the client.
[3:54] The client can just consume and display frames at its own pace.
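
A server along these lines can be very small. The sketch below uses only the Python standard library; the `/frame?ts=` URL, the port and the function names are assumptions made for illustration and are not taken from the project's server.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse


def frame_index(timestamp_ms: int, fps: float, frame_count: int) -> int:
    """Map a playback timestamp to the pre-extracted frame that covers it."""
    index = int(timestamp_ms * fps / 1000)
    return max(0, min(index, frame_count - 1))  # clamp to the frames we have


def make_server(frames: list[bytes], fps: float, port: int = 8000) -> ThreadingHTTPServer:
    """Build a server that answers GET /frame?ts=<milliseconds> with one JPEG."""

    class FrameHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            url = urlparse(self.path)
            if url.path != "/frame" or not frames:
                self.send_error(404)
                return
            try:
                ts = int(parse_qs(url.query).get("ts", ["0"])[0])
            except ValueError:
                self.send_error(400)
                return
            body = frames[frame_index(ts, fps, len(frames))]
            self.send_response(200)
            self.send_header("Content-Type", "image/jpeg")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return ThreadingHTTPServer(("0.0.0.0", port), FrameHandler)
```

Calling `make_server(frames, fps=15).serve_forever()` with a list of JPEG byte strings would start it. Because the client names the timestamp it wants, the server holds no per-client playback state.
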
[3:57] And this works really well.
[3:59] You can see the frames being requested by the client.
[4:01] We do get the occasional long display due to my Wi-Fi but it works.
[4:05] On the server side, I’m pre-processing the videos to extract all the frames.
[4:10] This makes the server very fast as all the hard work has already been done and it just
[4:14] serves the JPEG data at the requested timestamp without needing to do any work.
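
One way to do this kind of pre-processing is to hand the work to ffmpeg. That is an assumption: the video does not say which tool the server uses. The output file names, the JPEG quality setting and the 15 frames per second extraction rate are also illustrative choices, and the audio settings match the 8-bit 16kHz format described in the next part of the video.

```python
import subprocess
from pathlib import Path


def frame_extract_args(movie: Path, out_dir: Path,
                       width: int = 280, height: int = 240, fps: int = 15) -> list[str]:
    """ffmpeg arguments that write one JPEG per frame at a fixed rate and size."""
    return [
        "ffmpeg", "-i", str(movie),
        "-vf", f"fps={fps},scale={width}:{height}",  # resample, then resize to the display
        "-q:v", "5",                                 # JPEG quality (lower is better)
        str(out_dir / "frame_%05d.jpg"),
    ]


def audio_extract_args(movie: Path, out_file: Path, sample_rate: int = 16000) -> list[str]:
    """ffmpeg arguments that write raw unsigned 8-bit mono PCM."""
    return [
        "ffmpeg", "-i", str(movie),
        "-vn",                    # ignore the video stream
        "-ac", "1",               # mono
        "-ar", str(sample_rate),  # resample to the target rate
        "-c:a", "pcm_u8", "-f", "u8",
        str(out_file),
    ]


def preprocess(movie: Path, out_dir: Path) -> None:
    """Extract every frame and the audio track for one movie (requires ffmpeg)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(frame_extract_args(movie, out_dir), check=True)
    subprocess.run(audio_extract_args(movie, out_dir / "audio.u8"), check=True)
```

Note that `scale=280:240` stretches the picture to fit, so a real script would probably want to crop or pad to preserve the aspect ratio.
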
[4:18] So it looks like we’ve got the vision part of the system solved.
[4:21] Now we just need sound.
[4:23] I don’t want to deal with any complex codecs on the ESP32
[4:26] so I’m just going to use 8-bit PCM data at 16kHz.
[4:30] This should still give us reasonable quality and it won’t be a lot of data to transfer.
[4:34] We can pull this audio data directly from the server.
[4:37] We’ll download a chunk of audio and send it to the I2S amplifier.
[4:41] This has its own internal DMA buffers.
[4:43] Once these have space for more data, we can fetch the next chunk of audio data.
[4:47] This will give us a continuously playing piece of audio without any gaps.
[4:51] It also works pretty well.
[4:52] I’m pulling down one second of audio at a time and I’ve got 4K of space in the DMA buffers,
[4:57] which gives us around 256 milliseconds to download the next chunk of data.
[5:02] If we needed to, we could increase the number of DMA buffers to allow for poor
[5:06] network conditions and give us more time to download data.
[5:08] But this seems to be working okay on my network.
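
The 256 millisecond figure follows directly from the buffer size and the audio format. A quick check in Python, assuming "4K" means 4096 bytes and that each sample occupies one byte in the buffers:

```python
def buffer_duration_ms(buffer_bytes: int, sample_rate_hz: int, bytes_per_sample: int = 1) -> float:
    """How long the buffered audio lasts before the buffers run dry."""
    return 1000.0 * buffer_bytes / (sample_rate_hz * bytes_per_sample)


def chunk_bytes(seconds: float, sample_rate_hz: int, bytes_per_sample: int = 1) -> int:
    """Size of a chunk of audio covering the given duration."""
    return int(seconds * sample_rate_hz * bytes_per_sample)


headroom_ms = buffer_duration_ms(4096, 16000)  # 256.0 ms to fetch the next chunk
one_second = chunk_bytes(1, 16000)             # 16,000 bytes per request

print(headroom_ms, one_second)
```

Doubling the buffer space doubles the headroom, at the cost of more RAM.
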
[5:11] There’s one potentially tricky challenge still remaining.
[5:14] We have our audio stream.
[5:15] This will play audio sequentially and the streaming rate is controlled by the I2S sample rate.
[5:20] And we have our video frame stream.
[5:23] The rate of this is controlled by the elapsed time
[5:25] and it just draws frames as quickly as it can download and display them.
[5:29] There’s a real danger of these two independent streams getting out of sync.
[5:33] If our audio stream has some network issue and gets delayed,
[5:36] our images won’t match up, which would be really irritating.
[5:39] There’s a fairly obvious solution to this.
[5:41] We just use the audio stream to calculate elapsed time.
[5:44] Every time we need to push data out to the I2S peripheral, we know how much time has passed,
[5:49] so we can use this to keep our images in sync.
[5:52] It’s a nice, elegant, simple solution.
[5:54] It won’t work for live streams, but it will work really nicely for our use case.
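
The idea of using the audio stream as the clock can be sketched in a few lines. This is an illustrative model in Python, not the firmware code; the class and method names are hypothetical.

```python
class AudioClock:
    """Derives playback time from the number of audio samples handed to the output."""

    def __init__(self, sample_rate_hz: int = 16000):
        self.sample_rate_hz = sample_rate_hz
        self.samples_written = 0

    def on_samples_written(self, count: int) -> None:
        """Call this each time a block of samples is pushed to the audio output."""
        self.samples_written += count

    def elapsed_ms(self) -> int:
        """Playback position implied by the audio sent so far."""
        return self.samples_written * 1000 // self.sample_rate_hz

    def reset(self) -> None:
        """Start again from zero, for example after a channel change."""
        self.samples_written = 0


clock = AudioClock()
clock.on_samples_written(16000)      # one second of audio pushed out
timestamp_for_next_frame = clock.elapsed_ms()
print(timestamp_for_next_frame)
```

If the audio download stalls, no samples are written, the clock stops advancing, and the video side keeps asking for the same timestamp, so the two streams stay together. One caveat of this simple version is that it counts samples queued rather than samples heard, so it can run ahead of the speaker by up to the depth of the DMA buffers.
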
[5:58] With our syncing solved, we’ve got a fully functional video streaming system.
[6:02] There’s only one thing really lacking.
[6:04] We don’t have any controls.
[6:05] We can’t change the channel or the volume.
[6:07] It’s a TV, so the obvious thing to do is to add a remote control.
[6:11] I’ve had a couple of these infrared receivers for ages and it’s finally time to put them to use.
[6:16] They are pretty easy to wire up.
[6:18] They’ve only got three pins, out, ground and VCC.
[6:21] I’ve stuck one in a breadboard and hooked it up to my oscilloscope.
[6:24] It’s pretty interesting to see the patterns for the different buttons.
[6:27] We have power, volume up, volume down, channel up and channel down.
[6:32] That should do us for our TV control.
[6:34] I’ve soldered the three legs directly to GPIO pins.
[6:37] I definitely need to add something to the next revision of the board to do this properly.
[6:40] Looking at the datasheet, we are supposed to have an RC filter on the power supply
[6:44] and a pull up resistor, but it seems to be working reasonably well without them.
[6:47] We do get a few spurious commands, but it seems OK.
[6:50] I’ve hooked up the IRremote Arduino library and we’re getting commands being detected.
[6:56] It’s working!
[6:56] We can now control our little TV.
[6:59] When I hit the power button, it asks the server for the list of channels.
[7:02] It also asks how long each channel is.
[7:05] At the moment I’m just looping around each video,
[7:07] but you could get clever here and move on to the next channel at the end automatically.
[7:11] The volume buttons work nicely and the channel up and down buttons
[7:15] just move through each channel and reset the play position back to zero.
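
The remote control logic amounts to a small piece of state. Here is a sketch in Python under a few assumptions that the video does not confirm: the channel list wraps around at either end, the volume runs from 0 to 10, and the class and field names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    name: str
    length_ms: int  # reported by the server along with the channel list


class Tuner:
    def __init__(self, channels: list[Channel], volume: int = 5, max_volume: int = 10):
        if not channels:
            raise ValueError("at least one channel is required")
        self.channels = channels
        self.index = 0
        self.position_ms = 0
        self.volume = volume
        self.max_volume = max_volume

    def change_channel(self, step: int) -> Channel:
        """Channel up (+1) or down (-1); playback restarts from zero."""
        self.index = (self.index + step) % len(self.channels)
        self.position_ms = 0
        return self.channels[self.index]

    def change_volume(self, step: int) -> int:
        """Volume up or down, clamped to the valid range."""
        self.volume = max(0, min(self.max_volume, self.volume + step))
        return self.volume

    def advance_to(self, elapsed_ms: int) -> int:
        """Loop the current video once the elapsed time passes its length."""
        self.position_ms = elapsed_ms % self.channels[self.index].length_ms
        return self.position_ms


tuner = Tuner([Channel("news", 1000), Channel("cartoons", 2500)])
tuner.change_channel(+1)
print(tuner.channels[tuner.index].name, tuner.advance_to(6000))
```

Moving on to the next channel automatically at the end of a video would be a small change to `advance_to`.
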
[7:18] I’ve also added a nice little animation to show static when changing the channel.
[7:23] This was pretty interesting to implement.
[7:25] My initial naive approach was to fill a buffer with random grayscale values
[7:29] using the random command, but this turned out to be painfully slow.
[7:33] But we don’t really need a proper random number,
[7:35] we can just generate one using a pretty rubbish pseudo-random number generator.
[7:39] This works really quickly and we get a nice little static effect.
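
The video does not say which generator is used, so the sketch below substitutes a 32-bit xorshift, a well-known cheap generator that needs only shifts and XORs. It is written in Python to show the logic; the function name, the seed and the byte order of the pixels are assumptions.

```python
def make_static(pixel_count: int, seed: int = 0x12345678) -> bytearray:
    """Fill an RGB565 frame buffer with grayscale noise from a xorshift32 generator."""
    state = (seed & 0xFFFFFFFF) or 1  # xorshift must never start at zero
    out = bytearray(pixel_count * 2)
    for i in range(pixel_count):
        # One xorshift32 step, masked to stay within 32 bits.
        state ^= (state << 13) & 0xFFFFFFFF
        state ^= state >> 17
        state ^= (state << 5) & 0xFFFFFFFF
        grey = state & 0xFF
        # Pack the same brightness into 5 bits of red, 6 of green, 5 of blue.
        pixel = ((grey >> 3) << 11) | ((grey >> 2) << 5) | (grey >> 3)
        out[2 * i] = pixel >> 8        # high byte first (assumed display byte order)
        out[2 * i + 1] = pixel & 0xFF
    return out


static = make_static(280 * 240)
print(len(static), "bytes of static")
```

Carrying the final state over as the seed for the next frame gives a different pattern each time, which is what makes the static appear to move.
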
[7:43] There’s nothing really specific about my custom board that makes this work.
[7:46] It should work on any ESP32 and SPI display.
[7:49] You’ll just need to add an amplifier to get the sound output.
[7:52] I’ve tried it out on the cheap yellow display that Brian Lough has been talking about,
[7:56] and it does work.
[7:57] Unfortunately there does seem to be a bug around using DAC for audio output at the moment,
[8:02] which needs to be investigated.
[8:03] But the streaming does work.
[8:05] It might be interesting to get it controlled via the touchscreen,
[8:08] but that’s probably for another video.
[8:10] If you want to try it out for yourself, there’s a couple of places you need to modify.
[8:14] Almost all the settings for the firmware are set up in the platformio.ini file.
[8:19] The only thing you’ll need to modify in the code is the Wi-Fi credentials
[8:22] and the IP address of the server.
[8:24] On the server side of things, you just drop any videos you want to play in the movies folder.
[8:28] Keep them fairly short because as I said earlier,
[8:31] it does pre-process them to extract all the JPEG images.
[8:34] You’ll need to change the JPEG size in the code to match the size of your display.
[8:38] The code is pretty rough and ready, and has been hacked together pretty quickly,
[8:47] so use it at your own risk. But as always, have fun and keep making stuff!

Written by

Chris Greening

Supported by

atomic14

A collection of slightly mad projects, instructive/educational videos, and generally interesting stuff. Building projects around the Arduino and ESP32 platforms - we'll be exploring AI, Computer Vision, Audio, 3D Printing - it may get a bit eclectic...
