ESP32 Super Fast JPEG Decoder - 20ms!


Want to keep up to date with the latest posts and videos? Subscribe to the newsletter


HELP SUPPORT MY WORK: If you're feeling flush then please stop by Patreon Or you can make a one off donation via ko-fi

In my search for the fastest JPEG decoder for the ESP32, I went from the original JPEG decoder library at 109 milliseconds, to the faster TJPEG decoder at 55 milliseconds, and then to the impressive JPEGDEC library at 32 milliseconds. But wait, there's more. Along came a GitHub issue suggesting decoding JPEGs with SIMD, which brings the decode time down to just 20 milliseconds. However, it threw a curveball when it came to drawing: the library decodes the whole image in one go, so decoding doesn't overlap with the pixel transfer, and decode plus draw ends up no faster than JPEGDEC. It still has real benefits for streaming JPEGs though, as decoding the next frame can start while the current one is being sent to the display. The catch is memory: a fully decoded image needs around 130 kilobytes, which can be a problem on a regular ESP32 module. There are promising improvements on the horizon, so stay tuned!

ESP32 SD Card Speedup With a Couple of Lines of Code - In this video, we explore the disappointingly slow SD card read and write speeds of the ESP32 in our TinyTV project. With 500 kilobytes/sec reading and a dismal 270 kilobytes/sec writing, we embark on an adventure to find a solution. After ditching the Arduino code in favor of IDF functions, we see big improvements. I also propose a truly bonkers plan: using an IC (the GL823) to interface the SD card with USB, together with a USB multiplexer switch and another switch to alternate between the ESP32 and the GL823. This could be a total disaster, but I'm game for the challenge. Stay tuned to see if it works out!

I Feel the Need – The Need for Hardware SPI… - A follow-up to my Arduino Nano ESP32 video. After criticism of the slow display update speed, a solution was found thanks to a helpful fellow, Nick. It turns out software SPI was the cause of the issue. A quick tweak to the code and voilà, we've got an SPI clock whizzing along at 80 megahertz. Quite the speed boost for changing just a few lines of code!

Streaming Video From an SD Card on the ESP32. - In this video, we navigate the convoluted process of setting up movie file playback on an ESP32 with an SD card. There were a few bumps along the way, such as confusing USB data pins and the intricacies of various video container formats, but our quirky PCBWay board came through. I discuss a method of creating a simple custom video container format with ffmpeg that can be easily parsed by the ESP32. And yes, since the tiny TV guys use AVI files, we also learned a thing or two about list chunks, sub formats, and hex dumps. The result? Smooth audio playback, with video frames skipped where needed to keep up. Check out the streaming version on WiFi for more fun!

A Faster ESP32 JPEG Decoder? - An intriguing issue appeared in the esp32-tv project about speeding up JPEG decoding using SIMD (Single Instruction Multiple Data) instructions, which gives a big performance boost. However, there were some notable differences in speed between drawing the images and simply decoding them. The cause turned out to be the way the DMA drawing mechanism interacts with the new fast library, which decodes the image all at once. Despite this hiccup, overlapping the decoding and display of successive frames can still achieve a high frame rate. Join me as I dissect the process, with initial tests showing a display rate of approximately 40 frames per second, on our journey to find the most efficient way to get images on screens.

Decoding AVI Files for Fun and... - After some quality time with my ESP32 microcontroller, I've developed a version of the TinyTV and learned a lot about video and audio streaming along the way. Using Python and Wi-Fi, I was able to set up a streaming server for audio data, video frames, and metadata. I've also explored the picture quality challenges of uncompressed image data and learned about MJPEG frames. Using JPEGDEC for decoding, I've managed to make effective use of the ESP32's dual cores to achieve 28 frames per second. Thinking about audio sync, storage options and the intricacies of container file formats for video storage led me to the AVI format. Reading and processing the AVI file headers and the 'movi' list chunk allowed me to make significant headway in my project. All in all, I'm pretty chuffed with my portable battery powered video player. You can check out my code over on GitHub!

ESP32-S3 Hardware SPI on the Adafruit ST7789 - I've had some commenters point out the issue with the slow display updates in my recent Arduino Nano ESP32 video. It turns out, the software SPI of the Adafruit_ST7789 library was the culprit. Lo and behold, the solution is simple - using the hardware SPI constructor of the library. Apparently, this isn't well documented, so I wrote some code to serve as reference for myself and others who might run into the same snags. Trust me, the difference in speed is absolutely bonkers. Check out the video to see the magic in action.

ESP32-S3 USBMSC - Can we make it faster? - After lots of tinkering, I've managed to improve the speed of writing to the SD Card of my ESP32-TV considerably, but it's still not as fast as I'd like. The Arduino 'readRaw' and 'writeRaw' functions were the culprits: they can only handle one sector at a time! After bypassing them and using IDF functions, writing speed improved by 70%. I also experimented with writing to the SD Card in the background, which yielded even better results. However, it's still slower than I'd like, so I've got a crazy new plan: using a cheap IC (GL823) for SD card interfacing and a USB multiplexer switch to swap connections between the ESP32 and the GL823. It's a wild ride, but that's how we make progress!

Self Organising WS2811 LEDs - I've successfully used addressable WS2811 LED strings and an ESP-CAM board to create an adjustable lighting system. The best part is that the image processing code can be duplicated in JavaScript, which allows you to use a plain dev board to drive the LEDs instead of needing a camera on your ESP32 board. If you want to replicate this project, you'll need your own ESP32 dev board and some addressable LEDs. After figuring out the location of each LED in 2D space, it's easy to map from each LED's x and y location onto a pattern you want to show on the frame buffer. To keep it accessible, I've posted detailed instructions and my sample code on GitHub, so anyone with basic knowledge can take on this fun DIY project!

[0:00] I’ve been sent a link to a very fast JPEG decoder for the ESP32, but just how fast is
[0:05] it and is it actually practical?
[0:07] We’ve been decoding JPEGs on the ESP32.
[0:09] We’ve gone from the fairly slow to the pretty snappy.
[0:13] We’ve got the original memory optimised JPEG decoder library clocking in at 109 milliseconds.
[0:19] And then we’ve got the improved version, TJPEG decoder, which is based on the tiny
[0:23] JPEG decoder codebase.
[0:24] This is about twice as fast at only 55 milliseconds.
[0:28] And then we’ve got the amazing JPEGDEC library that only takes 32 milliseconds.
[0:33] But can we do any better?
[0:34] Well, you wouldn’t be watching this if we couldn’t.
[0:37] I received a slightly vague GitHub issue on my ESP32 TV project, decoding JPEG with SIMD,
[0:44] along with a link to some sample code.
[0:46] Now SIMD stands for Single Instruction Multiple Data.
[0:49] Basically it can perform the same operation on multiple data elements simultaneously.
[0:54] The ESP32 has a bunch of these instructions and the ESP32-S3 has a bunch more.
[0:59] Now decoding JPEG can really take advantage of these instructions, potentially giving
[1:03] a huge performance improvement.
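
To make "the same operation on multiple data elements" concrete, here is a minimal sketch in plain Python. It packs four 8-bit values into one 32-bit word and adds all four lanes with a single expression. This only emulates the idea in software: it is not ESP32 code, it is not how the library from the GitHub issue works, and the names `pack4`, `unpack4` and `add4` are made up for this illustration.

```python
# Illustration only: plain Python emulating "one operation, several data
# elements". Real SIMD does this in hardware; this is not ESP32 code.

LOW7 = 0x7F7F7F7F  # low 7 bits of each of the four byte lanes
HIGH = 0x80808080  # top bit of each byte lane


def pack4(values):
    """Pack four 0..255 values into one 32-bit word, first value in the low byte."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & 0xFF) << (8 * i)
    return word


def unpack4(word):
    """Split a 32-bit word back into its four byte lanes."""
    return [(word >> (8 * i)) & 0xFF for i in range(4)]


def add4(a, b):
    """Add four byte lanes at once; each lane wraps at 256 independently."""
    # Add the low 7 bits of every lane (this cannot carry into the next lane),
    # then fix up the top bit of each lane with XOR.
    return ((a & LOW7) + (b & LOW7)) ^ ((a ^ b) & HIGH)


a = pack4([10, 200, 255, 7])
b = pack4([5, 100, 1, 250])
result = unpack4(add4(a, b))
print(result)
```

A JPEG decoder spends much of its time doing identical arithmetic on many coefficients and pixels, which is why processing several values per instruction can pay off so well.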
[1:05] If we used the library suggested in the GitHub issue, we’d be down to just 20 milliseconds
[1:09] to decode a JPEG image.
[1:11] Now that is pretty amazing.
[1:13] But there’s some really interesting timing when we combine decoding with drawing.
[1:18] This chart shows the overhead added when we start sending pixels to the display.
[1:22] Our first two libraries need an extra 10 milliseconds when we draw to the screen.
[1:27] The JPEGDEC library needs just 6 milliseconds extra.
[1:31] But for some reason our super fast library needs an extra 17 milliseconds to draw to
[1:35] the screen.
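
Adding the drawing overhead to the decode-only times gives the total time per frame. The short sketch below just does that arithmetic with the figures quoted above; the label "SIMD decoder" is my own shorthand for the new library, and the frames-per-second figure assumes nothing else happens between frames.

```python
# Figures quoted in the video, in milliseconds: (decode only, extra time to draw).
timings_ms = {
    "original JPEG decoder": (109, 10),
    "TJPEG decoder": (55, 10),
    "JPEGDEC": (32, 6),
    "SIMD decoder": (20, 17),  # shorthand label for the new library
}

# Total time to decode and draw one frame with each library.
totals_ms = {name: decode + draw for name, (decode, draw) in timings_ms.items()}

for name, total in totals_ms.items():
    print(f"{name}: {total} ms per frame, {1000 / total:.1f} frames/s")
```

Once drawing is included, the gap between JPEGDEC and the new library shrinks from 12 milliseconds to 1 millisecond.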
[1:36] So what’s going on?
[1:37] Surely we’re sending the same number of pixels.
[1:39] Why is there a difference?
[1:41] Well this is pretty interesting.
[1:43] We’re using DMA to draw the pixels to the display.
[1:46] This means that once the DMA process has been started, the CPU can get back to doing work,
[1:51] so potentially we can overlap decoding the JPEG with drawing the JPEG.
[1:55] The first two libraries decode the JPEG in 16 by 16 blocks of pixels at a time.
[2:00] I’ve slowed down this process here so you can see it drawing.
[2:03] We can send these 256 pixels off to the display using DMA and while that’s happening, the
[2:09] CPU can be working on the next block of pixels.
[2:12] This DMA transfer is pretty small so we end up waiting for the CPU to give us more data
[2:16] to send to the display.
[2:18] The JPEGDEC library takes us even further.
[2:21] It decodes 128 by 16 pixels at a time, which is 2048 pixels.
[2:27] This is much better.
[2:28] We can really take advantage of DMA to blast the pixels out to the display while the CPU
[2:32] is doing some work.
[2:34] So why does our new super fast JPEG decoder seem so slow to draw?
[2:38] Well, the fast JPEG decoder decodes the entire image in one go and then we have to draw all
[2:43] the pixels to the screen.
[2:45] This means that unlike the other libraries, we don’t get any overlap between the processing
[2:49] and the sending of the pixels.
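
A simplified timing model shows why decoding in one go loses the overlap. It assumes every block takes the same time to decode and to transfer, that starting a DMA transfer is free, and that the 17 milliseconds of extra draw time is all transfer time. The function names and the 10-block split are hypothetical, chosen only for illustration; this is not a measurement of any of the libraries.

```python
def sequential_ms(n_blocks, decode_ms, transfer_ms):
    """No overlap: every block is decoded, then sent, before the next starts."""
    return n_blocks * (decode_ms + transfer_ms)


def overlapped_ms(n_blocks, decode_ms, transfer_ms):
    """Two buffers: the CPU decodes block k+1 while DMA sends block k."""
    if n_blocks == 0:
        return 0
    # The first decode and the last transfer cannot overlap with anything;
    # in between, the slower of the two stages sets the pace.
    return decode_ms + (n_blocks - 1) * max(decode_ms, transfer_ms) + transfer_ms


# Whole image treated as a single block: 20 ms to decode, 17 ms to send.
whole_image = overlapped_ms(1, 20, 17)

# Hypothetical: the same work split evenly into 10 blocks.
ten_blocks = overlapped_ms(10, 2.0, 1.7)

print(whole_image, sequential_ms(1, 20, 17), round(ten_blocks, 1))
```

With a single block the overlapped and sequential figures are identical, because there is never a second block for the CPU to work on while the DMA transfer is running.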
[2:51] Now at first this may seem a bit disappointing.
[2:53] We’ve got a really fast JPEG decoder but it’s not actually any faster to use than
[2:57] JPEGDEC.
[2:59] However, for our TV project we’re streaming JPEGs so we can still take advantage of overlapping
[3:04] DMA and CPU work.
[3:06] We can start the DMA transfer of one frame and immediately start decoding the next frame.
[3:11] This could potentially work really well.
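
As a rough ceiling on what frame-level overlap could buy, the sketch below compares the time per frame with and without overlap. It uses the 20 millisecond decode and 17 millisecond draw figures from earlier and assumes the draw time is pure DMA transfer that can run fully in parallel with decoding. Reading the JPEG data and memory contention are ignored, so real results would be lower.

```python
DECODE_MS = 20    # whole-frame decode time quoted in the video
TRANSFER_MS = 17  # extra draw time quoted in the video, assumed to be all DMA transfer


def frame_period_ms(decode_ms, transfer_ms, overlapped):
    """Steady-state time per frame when streaming many frames back to back."""
    if overlapped:
        # Decoding frame N+1 runs while frame N is sent: the slower stage wins.
        return max(decode_ms, transfer_ms)
    return decode_ms + transfer_ms


for overlapped in (False, True):
    period = frame_period_ms(DECODE_MS, TRANSFER_MS, overlapped)
    print(f"overlapped={overlapped}: {period} ms per frame, {1000 / period:.1f} frames/s")
```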
[3:13] Now there is of course a massive trade-off going on here.
[3:16] Our slow libraries don’t need much RAM.
[3:19] A 16 by 16 block of pixels is only 512 bytes.
[3:23] A 128 by 16 block of pixels is only 4K.
[3:26] But a fully decoded image is around 130 kilobytes.
[3:31] The ESP32-S3 module I’m using does have PSRAM built in, so it’s not a huge problem
[3:35] for me but for your regular ESP32 modules this would cause quite a problem.
[3:40] It’s going to be pretty difficult to get that much free RAM in one contiguous block
[3:44] and it would be near impossible to get two blocks of RAM that size for overlapping decoding.
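
The buffer sizes follow directly from the pixel counts at 2 bytes per pixel, which is what 16 by 16 pixels in 512 bytes implies. In the sketch below the 280 by 240 resolution is an assumption of mine, picked only because it lands close to the quoted figure of around 130 kilobytes; the video does not state the image size.

```python
BYTES_PER_PIXEL = 2  # 16-bit colour, consistent with 16 x 16 pixels = 512 bytes


def buffer_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """RAM needed to hold a decoded block of width x height pixels."""
    return width * height * bytes_per_pixel


small_block = buffer_bytes(16, 16)   # block size of the first two libraries
wide_block = buffer_bytes(128, 16)   # block size used by JPEGDEC

# Assumed resolution (not stated in the video), for illustration only.
full_frame = buffer_bytes(280, 240)
two_frames = 2 * full_frame          # needed to decode one frame while sending another

for label, size in [("16 x 16", small_block), ("128 x 16", wide_block),
                    ("full frame", full_frame), ("two full frames", two_frames)]:
    print(f"{label}: {size} bytes ({size / 1024:.1f} KiB)")
```

Overlapping the decode of one frame with the transfer of the previous one needs two full-frame buffers, which doubles the requirement.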
[3:50] But stay tuned.
[3:52] The library is being actively worked on and partial decoding and improvements are in the
[3:56] works.
[3:57] It’s pretty exciting stuff.


Written by

Chris Greening

Supported by

atomic14

A collection of slightly mad projects, instructive/educational videos, and generally interesting stuff. Building projects around the Arduino and ESP32 platforms - we'll be exploring AI, Computer Vision, Audio, 3D Printing - it may get a bit eclectic...
