Published June 26, 2023 © Apache-2.0

AI upscaling techniques made easy with Pico

W5100S-EVB-Pico & Arducam with AI (Real-ESRGAN)

Beginner • Full instructions provided • 1 hour

Things used in this project

Hardware components

Arduino arducam HM0360

WIZnet W5100S-EVB-Pico

Software apps and online services

Adafruit CircuitPython

Microsoft VS Code

Story

Intro

I came across the project linked here and wanted to build something of my own with an Arducam, the RP2040, and the W5100S. (https://github.com/Innovation4x/WIZnet-EVB-Pico-ArduCam)

We asked ChatGPT to summarize the contents of the existing UCC link and to propose an AI project that could build on the project above.

ChatGPT example

This project is about upscaling images with AI using the W5100S-EVB-Pico and Arducam.

The project started with an interest in a project utilizing Arducam, rp2040, and W5100s. To this end, we asked ChatGPT to summarize the content of the existing UCC links and create an AI project that could be linked to the above projects. As a result, ChatGPT proposed using AI to upscale images.

As you can see, ChatGPT suggested upscaling the image using AI.

AI Model

Following ChatGPT's recommendation, I looked at a few models that can upscale images and found Real-ESRGAN to be the best fit for this project.

https://github.com/xinntao/Real-ESRGAN

Real-ESRGAN is an AI model that builds on ESRGAN, which stands for Enhanced Super-Resolution Generative Adversarial Networks, and is trained to restore real-world images. It is capable of transforming low-resolution images into high-resolution images.

The model is based on the concept of Generative Adversarial Networks (GANs). A GAN consists of two neural networks, the generator and the discriminator, which compete against each other during training. The generator aims to produce fake data that resembles the real data, while the discriminator aims to tell the generated data apart from the real data. Through this competition, the generator gradually produces data closer to the real data, and the discriminator becomes better at distinguishing real from fake.
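The competition between the two networks can be written as a pair of loss functions. The sketch below uses the standard binary cross-entropy GAN objective on single probabilities, purely as an illustration; it is not the exact loss used to train Real-ESRGAN, and the function names are made up for this example.

```python
import math

def bce(prediction, target):
    """Binary cross-entropy for a single probability in (0, 1)."""
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(prediction, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def discriminator_loss(d_real, d_fake):
    """The discriminator wants D(real) near 1 and D(fake) near 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)

def generator_loss(d_fake):
    """The generator wants the discriminator to score its fakes near 1."""
    return bce(d_fake, 1.0)

# A fake that is easily spotted (score 0.1) costs the generator more
# than a fake that fools the discriminator (score 0.9).
print(generator_loss(0.1) > generator_loss(0.9))  # prints True
```

Lowering one network's loss tends to raise the other's, which is the competitive process described above.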

ESRGAN applies this GAN concept to the generation of super-resolution images. In particular, ESRGAN has several improvements over the earlier SRGAN (Super-Resolution Generative Adversarial Network). One of them is a structure called the Residual in Residual Dense Block (RRDB). An RRDB nests dense blocks inside a residual block, which allows more information to be preserved and reproduces image details better.
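The data flow inside an RRDB can be sketched without any deep learning library. In the toy below, features are plain lists of numbers and each "layer" is a hypothetical stand-in for a convolution; the residual scaling factor of 0.2 is an assumption for illustration. It only shows the wiring (dense connections inside each block, a residual around each block, and a residual around the whole stack), not a trainable network.

```python
BETA = 0.2  # residual scaling factor (assumed value for this sketch)

def dense_block(x, layers, beta=BETA):
    """Each layer sees the input plus all earlier layer outputs, concatenated."""
    features = [x]
    for layer in layers:
        concatenated = [v for feat in features for v in feat]
        features.append(layer(concatenated, len(x)))
    # Inner residual: add the scaled last output back onto the block input.
    return [xi + beta * yi for xi, yi in zip(x, features[-1])]

def rrdb(x, blocks, beta=BETA):
    """Chain several dense blocks, then add an outer residual connection."""
    out = x
    for layers in blocks:
        out = dense_block(out, layers, beta)
    return [xi + beta * oi for xi, oi in zip(x, out)]

def mean_layer(inputs, width):
    """Stand-in for a convolution: every output value is the mean of the inputs."""
    mean = sum(inputs) / len(inputs)
    return [mean] * width

print(rrdb([1.0, 3.0], [[mean_layer, mean_layer]] * 3))
```

Because the original input is added back at both levels, the block can only refine the features it receives, which is one way to read the claim that more information is preserved.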

Moreover, Real-ESRGAN can be combined with GFPGAN to support face enhancement, and it can also restore animation images and videos. With this model, a variety of projects can enhance low-resolution images to high resolution.

vit-gpt2-image-captioning

vit-gpt2-image-captioning is an image captioning model built on transformers. It was trained by @ydshieh in Flax, and the version used here is the PyTorch port of it.

The model takes an image as input and generates a caption for the image. The model uses Vision Transformer (ViT) as the encoder and GPT-2 as the decoder. The encoder processes the image and generates a sequence of image features, which are then fed into the decoder to generate the caption.

Here is sample code for using the model:
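A minimal sketch of how the model might be called through the Hugging Face transformers library. It assumes the model ID `nlpconnect/vit-gpt2-image-captioning` (the original post does not name it), that `transformers`, `torch`, and `Pillow` are installed, and generation settings (`max_length`, `num_beams`) chosen only for illustration. The `clean_caption` helper is hypothetical.

```python
def clean_caption(text):
    """Collapse runs of whitespace so the caption is a single tidy line."""
    return " ".join(text.split())

def caption_image(image_path, model_id="nlpconnect/vit-gpt2-image-captioning"):
    """Return a one-line caption for the image at image_path."""
    # Heavy imports stay inside the function so this file loads without them.
    from PIL import Image
    from transformers import (
        AutoTokenizer,
        ViTImageProcessor,
        VisionEncoderDecoderModel,
    )

    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    processor = ViTImageProcessor.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # ViT encoder input: an RGB image converted to a pixel tensor.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # GPT-2 decoder output: token IDs that are decoded into text.
    output_ids = model.generate(pixel_values, max_length=16, num_beams=4)
    caption = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return clean_caption(caption)
```

Calling `caption_image("upscaled.png")` (a hypothetical file name) downloads the model weights on first use, so the first run needs a network connection.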

Creating

https://www.hackster.io/louis_m/w5100s-poe-web-camera-88002f

Follow the link above to build the hardware: combine the W5100S-EVB-Pico board with the Arducam, then use CircuitPython to get the webcam working.

We used the Bundle for Version 7.x of the CircuitPython libraries, together with release 1.12.15 of the Adafruit_CircuitPython_Wiznet5k library.

https://circuitpython.org/libraries

https://github.com/ArduCAM/PICO_SPI_CAM/tree/master/Python

https://github.com/adafruit/Adafruit_CircuitPython_Wiznet5k/releases/tag/1.12.15

We changed the existing streaming method to a capture method, and lowered the resolution as far as possible so that each capture is quick.
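On the PC side, grabbing one capture might look like the sketch below. It assumes the board serves each capture as a complete JPEG over HTTP; the URL, the output file name, and both function names are hypothetical, and the real firmware may expose the image differently.

```python
import urllib.request

JPEG_START = b"\xff\xd8"  # JPEG start-of-image marker
JPEG_END = b"\xff\xd9"    # JPEG end-of-image marker

def looks_like_jpeg(data):
    """Cheap sanity check that a capture arrived complete."""
    return len(data) >= 4 and data[:2] == JPEG_START and data[-2:] == JPEG_END

def fetch_capture(url, out_path, timeout=10.0):
    """Download one captured frame and save it; returns the byte count."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()
    if not looks_like_jpeg(data):
        raise ValueError("response does not look like a complete JPEG")
    with open(out_path, "wb") as f:
        f.write(data)
    return len(data)

# Example call (hypothetical address of the board on the local network):
# fetch_capture("http://192.168.0.50/capture", "capture.jpg")
```

Checking the start and end markers catches truncated transfers, which are more likely when a capture is pulled over a slow link.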

Curation

The rest was carried out in VS Code. The code was written in Python: we saved the images captured via the Arducam, then upscaled them by a factor of four (4x).
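To make the effect of a 4x scale concrete, the short calculation below shows how the output dimensions and pixel count grow. The 320x240 input size is only an example, not the resolution used in this project.

```python
def upscaled_size(width, height, scale=4):
    """Output dimensions after upscaling both sides by the same factor."""
    return width * scale, height * scale

def pixel_growth(scale=4):
    """The pixel count grows with the square of the scale factor."""
    return scale * scale

print(upscaled_size(320, 240))  # prints (1280, 960)
print(pixel_growth())           # prints 16
```

A small capture therefore keeps the transfer from the board fast, while the model fills in sixteen times as many pixels on the PC.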

Here is an example of upscaling using an image of IU.

You can find the detailed code on GitHub.

https://github.com/WiznetAI/CCC_image_upscaling_esrgan_img2txt_with_GPT

inference

inference based on the actual author

I wanted to add an image-to-text example on top of the GAN project above, so I wrote some multimodal code that combines GPT with image captioning to make inferences from pictures. It should be a useful reference, especially for extending AIoT projects.

Code that calls the GPT API to take an upscaled image, generate a name for it, and save the image under that name
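The sketch below covers only the last step, turning a generated name or caption into a safe file name. It does not call GPT; the function name, the `.png` extension, and the length limit are assumptions for illustration.

```python
import re

def caption_to_filename(caption, extension=".png", max_length=40):
    """Turn free text into a lowercase, underscore-separated file name."""
    # Replace every run of non-alphanumeric characters with one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", caption.lower()).strip("_")
    # Truncate long names and fall back to a default if nothing is left.
    slug = slug[:max_length].rstrip("_") or "image"
    return slug + extension

print(caption_to_filename("A woman smiling at the camera."))
# prints a_woman_smiling_at_the_camera.png
```

Sanitizing the text first matters because model output can contain spaces, punctuation, or characters that are not valid in file names.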

nextstep

While the example shows a human face, natural-looking upscaling is possible for many kinds of real-world images, not just people.

As a next project, we are considering video upscaling, and we plan to upgrade our features by adding a function that describes the photo using an AI model that provides image-captioning.

how to?

VS Code (Python 3)

You can run the tutorial Python file after setting its input and output to match your Pico setup.

!git clone https://github.com/jh941213/w5100s_image_upscaling 
!git clone https://github.com/xinntao/Real-ESRGAN.git

Tutorial Python code

Refer to the code above and run it step by step. You must have a CUDA-ready PC environment!
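Before running the upscaler, it can help to confirm that PyTorch sees a GPU and to assemble the inference command in one place. The sketch below assumes the `inference_realesrgan.py` script from the cloned Real-ESRGAN repository with its `-n`, `-i`, `-o`, and `--outscale` options and the `RealESRGAN_x4plus` model name; the helper names and the example paths are hypothetical.

```python
def cuda_ready():
    """Return True only if PyTorch is installed and can see a CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())

def realesrgan_command(input_path, output_dir, outscale=4,
                       model_name="RealESRGAN_x4plus"):
    """Build the argument list for the Real-ESRGAN inference script."""
    return [
        "python", "inference_realesrgan.py",
        "-n", model_name,
        "-i", input_path,
        "-o", output_dir,
        "--outscale", str(outscale),
    ]

print("CUDA available:", cuda_ready())
print(" ".join(realesrgan_command("capture.jpg", "results")))
```

The returned list can be passed to `subprocess.run(..., cwd="Real-ESRGAN", check=True)` once the repository and its requirements are installed.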

That concludes this post. Thanks!

simon

I am interested in artificial intelligence and looking for ways to combine it with IoT.
