Hardware:
- Minisforum Venus UM790 Pro mini PC
Software:
- Operating System: Windows or Linux.
- Python: Programming language used for model training and application development.
- Conda: Environment management tool.
- PyTorch: Deep learning framework used to define and train the UNet model.
- ONNX: Open Neural Network Exchange format for model export.
- ONNX Runtime: To run ONNX models with various execution providers.
- Flask: Web framework for creating the user interface and handling HTTP requests.
- Docker: For containerizing the Vitis AI tools and runtime environment.
- Vitis AI: Xilinx tools for AI model optimization and deployment.
Tools:
- Visual Studio Code: Integrated Development Environment (IDE) for code development.
- Git: Version control system for tracking changes and collaboration.
- Jupyter Notebook: For interactive model training and testing.
Image Processing Libraries:
- OpenCV: For image preprocessing and enhancement.
- PIL (Pillow): For handling image files in Python.
Other:
- Web Browser: For accessing and testing the web application interface.
I was really excited that my idea was selected for the AMD Pervasive AI Developer Contest, and I immediately started working on it. The mini PC had really good specs.
I then set it up; it came with Windows 11 preloaded. Next, I wanted to see what is actually special about these Ryzen AI chips. They have an additional accelerator called the NPU (Neural Processing Unit), built on the AMD XDNA architecture. What this unlocks is that developers can use AMD's Ryzen AI Software to build and deploy models trained in PyTorch or TensorFlow and run them directly on laptops powered by Ryzen AI, using ONNX Runtime with the Vitis AI Execution Provider (EP).
This frees up CPU and GPU resources for other compute tasks and allows developers to run private, on-device AI workloads (including LLMs) and concurrent applications efficiently and locally.
Learn More about Ryzen AI
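To make the workflow concrete, here is a minimal, hedged sketch of how an ONNX model is run through ONNX Runtime with the Vitis AI EP. The file names and the vaip_config.json path are placeholders, and the exact provider options depend on the Ryzen AI Software version installed:

import onnxruntime as ort

# Minimal sketch: create an inference session that targets the NPU through the
# Vitis AI Execution Provider and falls back to CPU for unsupported operators.
# "model_quantized.onnx" and "vaip_config.json" are placeholder paths.
session = ort.InferenceSession(
    "model_quantized.onnx",
    providers=["VitisAIExecutionProvider", "CPUExecutionProvider"],
    provider_options=[{"config_file": "vaip_config.json"}, {}],
)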
Installing Ryzen AI
I followed the installation instructions given on their website: Ryzen AI
I also followed this YouTube video: Getting Started With Ryzen AI Software
Issues: However, I faced quite a few issues along the way.
- I wasn't able to find "AMD IPU Device" under the "System Devices".
- After some Google searches I found that many other people faced the same issue; the mini PC manufacturer, Minisforum, had decided to disable the IPU by default. Luckily, the IPU can be enabled from the BIOS.
AMD IPU device not detected on 7940HS: This thread helped me fix the issue.
After this I could follow the remaining instructions and install everything successfully.
- I came across another issue while running one of the tutorials, "hello_world".
You can read about my issue here.
Following these steps helped me resolve it.
So This Is What I Managed to Build
Enhance Pro: A website where users can upload an image that needs to be enhanced; with the click of a button the image is enhanced and displayed back to the user. The enhancement itself runs on the IPU.
It asks the user to upload an image. Once the user uploads it and clicks the "Enhance Image" button, the following things happen (a minimal sketch of this flow is shown after the list):
- Handling the POST Request: The Flask application receives the POST request. The uploaded file is accessed from the request and saved to a specified directory (e.g., static folder) with a predefined name like uploaded_image.jpg.
- Image Enhancement Process: The enhance_image function is called with the path to the uploaded image. The image is loaded using PIL and preprocessed with transformations (resize and normalization to tensor).
- ONNX Runtime Session: The preprocessed image tensor is then fed into the ONNX Runtime inference session configured to use the Vitis AI Execution Provider (or CPU if Vitis AI is unavailable). The inference session processes the image through the quantized model, and the enhanced image tensor is obtained.
- Post-processing: The enhanced image tensor is post-processed (transpose and scaling) to convert it back to an image format. The enhanced image is saved back to the specified directory (e.g., static folder) with a predefined name like enhanced_image.jpg.
- Rendering the Enhanced Image: The Flask route returns an HTML page that includes both the original and enhanced images by referencing their paths. The webpage dynamically displays the original uploaded image and the enhanced image side by side.
- Display to the User: The user sees the original image and the enhanced image on the webpage, visually confirming the enhancement performed by the model. They can then download the image if they think it looks good.
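Below is a minimal, hedged sketch of this flow. It is not the exact application code: the template name index.html, the form field name "image", and the fixed 224 × 224 input size are assumptions based on the description above; uploaded_image.jpg and enhanced_image.jpg match the file names mentioned earlier.

import numpy as np
import onnxruntime as ort
from flask import Flask, request, render_template
from PIL import Image
from torchvision import transforms

app = Flask(__name__)

# Assumed session setup; see the Vitis AI EP snippet earlier.
session = ort.InferenceSession(
    "model_quantized.onnx",
    providers=["VitisAIExecutionProvider", "CPUExecutionProvider"],
    provider_options=[{"config_file": "vaip_config.json"}, {}],
)

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),          # HWC [0, 255] -> CHW float [0, 1]
])

def enhance_image(path):
    # Load and preprocess the uploaded image.
    img = Image.open(path).convert("RGB")
    x = preprocess(img).unsqueeze(0).numpy()               # shape (1, 3, 224, 224)
    # Run inference; the model returns the enhanced image tensor.
    y = session.run(None, {session.get_inputs()[0].name: x})[0]
    # Post-process: CHW -> HWC, clip to [0, 1], scale to 8-bit and save.
    out = np.clip(y[0].transpose(1, 2, 0), 0, 1) * 255
    Image.fromarray(out.astype(np.uint8)).save("static/enhanced_image.jpg")

@app.route("/", methods=["GET", "POST"])
def index():
    if request.method == "POST":
        f = request.files["image"]                         # form field name is assumed
        f.save("static/uploaded_image.jpg")
        enhance_image("static/uploaded_image.jpg")
        return render_template("index.html", show_result=True)
    return render_template("index.html", show_result=False)

if __name__ == "__main__":
    app.run(debug=True)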
Here is the Final Output
UNet
The model used is a UNet architecture, which is a popular neural network designed primarily for image segmentation tasks. However, it can be adapted for image-to-image translation tasks, such as image enhancement.
Key Components of UNet:
- Encoder (Contracting Path): Enc1: Two convolutional layers with 64 filters, ReLU activation, and padding. Enc2: Two convolutional layers with 128 filters, ReLU activation, and padding. Enc3: Two convolutional layers with 256 filters, ReLU activation, and padding. Pooling: Max pooling layers that reduce the spatial dimensions by half after each encoder block.
- Decoder (Expansive Path): Up-sampling: Increases the spatial dimensions using bilinear interpolation. Dec3: Two convolutional layers with 128 filters, ReLU activation, and padding. Dec2: Two convolutional layers with 64 filters, ReLU activation, and padding. Dec1: A final convolutional layer with 3 filters (corresponding to the RGB channels), padding, and no activation.
- Skip Connections: These connections concatenate feature maps from the encoder to the corresponding decoder layers, allowing the network to use both low-level and high-level features for accurate reconstruction.
- Crop and Concat: This function ensures that the spatial dimensions of the encoder's feature maps match those of the up-sampled feature maps before concatenation.
- Loss Function: Mean Squared Error (MSE) Loss, suitable for regression tasks where the goal is to minimize the difference between the predicted and target images.
- Optimizer: Adam optimizer with a learning rate of 0.001, which is effective for training deep neural networks.
- Transforms: Images are resized to 224 × 224 and converted to tensors.
- Dataset: A custom dataset class loads raw and enhanced images.
- DataLoader: Handles batching and shuffling of data during training.
- Forward Pass: The input image is passed through the encoder, reducing its spatial dimensions while capturing high-level features.
- Bottleneck: The most compressed representation of the image.
- Up-sampling: The decoder reconstructs the image, using skip connections to incorporate detailed information from the encoder.
- Loss Calculation: The difference between the predicted enhanced image and the target enhanced image is calculated using MSE loss.
- Backpropagation: The optimizer updates the model weights to minimize the loss.
- Epochs: The training loop runs for a specified number of epochs, iterating over the dataset multiple times. (A sketch of this training setup is shown after the list.)
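For reference, here is a minimal sketch of the training setup described above. UNet refers to the model class shown further down; the RawEnhancedDataset class, the raw/ and enhanced/ directory names, the batch size, and the step that resizes the target to match the model's 2x output resolution are illustrative assumptions. Only the 224 × 224 resize, MSE loss, Adam with lr=0.001, and the epoch loop come directly from the description.

import os
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
from PIL import Image

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

class RawEnhancedDataset(Dataset):
    """Pairs each raw image with its CLAHE-enhanced counterpart by filename (assumed layout)."""
    def __init__(self, raw_dir, enhanced_dir, transform=None):
        self.raw_dir, self.enhanced_dir = raw_dir, enhanced_dir
        self.files = sorted(os.listdir(raw_dir))
        self.transform = transform

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        raw = Image.open(os.path.join(self.raw_dir, self.files[idx])).convert("RGB")
        enhanced = Image.open(os.path.join(self.enhanced_dir, self.files[idx])).convert("RGB")
        if self.transform:
            raw, enhanced = self.transform(raw), self.transform(enhanced)
        return raw, enhanced

dataset = RawEnhancedDataset("raw", "enhanced", transform)
loader = DataLoader(dataset, batch_size=8, shuffle=True)    # batch size is an assumption

model = UNet()                                              # UNet is defined below
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for epoch in range(20):                                     # roughly 20 epochs, as mentioned later
    for raw, target in loader:
        optimizer.zero_grad()
        output = model(raw)
        # The model outputs 2x the input resolution (224 -> 448), so resize the
        # target to match before computing the loss (an assumption of this sketch).
        target = nn.functional.interpolate(target, size=output.shape[2:], mode="bilinear", align_corners=False)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
    print(f"Epoch {epoch + 1}: loss = {loss.item():.4f}")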
Model Evaluation and Inference:
- After training, the model is saved and can be exported to ONNX format for inference.
- The ONNX model is quantized for optimized performance on the IPU. (A sketch of the export and quantization steps is shown after this list.)
- During inference, the model processes an input image, and the enhanced output is displayed on the web interface. The model is designed to accept images of any size (they are resized before inference) and outputs an enhanced version of the image, leveraging the computational power of the IPU for efficient processing.
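A hedged sketch of the export and quantization steps, assuming the model and loader from the training sketch above. The file names are placeholders, and the quantization call shown here uses ONNX Runtime's static quantization API as an illustration; the actual Ryzen AI flow uses the Vitis AI quantizer, which exposes a similar interface, so treat this as an outline rather than the exact commands.

import torch
from onnxruntime.quantization import CalibrationDataReader, QuantFormat, QuantType, quantize_static

# 1. Export the trained PyTorch model to ONNX with a dynamic batch dimension
#    (the logs below show input [-1x3x224x224] and output [-1x3x448x448]).
model.eval()
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    model, dummy, "unet.onnx",
    input_names=["input"], output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    opset_version=13,
)

# 2. Quantize the exported model. A calibration reader feeds a handful of
#    representative images; here it simply reuses batches from the training loader.
class UNetCalibrationReader(CalibrationDataReader):
    def __init__(self, loader, num_batches=10):
        self.samples = [raw.numpy() for _, (raw, _) in zip(range(num_batches), loader)]
        self.it = iter(self.samples)

    def get_next(self):
        batch = next(self.it, None)
        return None if batch is None else {"input": batch}

quantize_static(
    "unet.onnx", "model_quantized.onnx", UNetCalibrationReader(loader),
    quant_format=QuantFormat.QDQ,
    activation_type=QuantType.QUInt8, weight_type=QuantType.QInt8,
)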
Training was done using two sets of images:
- Raw: Set of raw images from the MIT-Adobe 5k Dataset
- Enhanced: These are the raw images processed with an auto-tuning function to improve their quality. I used an auto-tone function based on CLAHE (Contrast Limited Adaptive Histogram Equalization) to generate enhanced versions of the raw images (a sketch is shown below).
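A minimal sketch of such a CLAHE-based auto-tone function using OpenCV; the clip limit, tile grid size, and the choice of applying CLAHE to the L channel in LAB color space are assumptions, not necessarily the exact parameters used to build the dataset.

import cv2

def auto_tone(image_path, output_path, clip_limit=2.0, tile_grid_size=(8, 8)):
    """Enhance contrast with CLAHE applied to the lightness channel (assumed parameters)."""
    img = cv2.imread(image_path)                           # BGR, uint8
    lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid_size)
    l = clahe.apply(l)                                     # equalize lightness only, preserving color
    enhanced = cv2.cvtColor(cv2.merge((l, a, b)), cv2.COLOR_LAB2BGR)
    cv2.imwrite(output_path, enhanced)

# Example: generate the "enhanced" target for one raw image (paths are illustrative).
# auto_tone("raw/0001.jpg", "enhanced/0001.jpg")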
The UNet model definition:

import torch
import torch.nn as nn

class UNet(nn.Module):
    def __init__(self):
        super(UNet, self).__init__()
        # Encoder (contracting path): each block is two 3x3 convolutions + ReLU.
        self.enc1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True)
        )
        self.enc2 = nn.Sequential(
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True)
        )
        self.enc3 = nn.Sequential(
            nn.Conv2d(128, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True)
        )
        # Max pooling halves the spatial dimensions; bilinear upsampling doubles them.
        self.pool = nn.MaxPool2d(2, 2)
        self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
        # Decoder (expansive path): input channels include the concatenated skip connection.
        self.dec3 = nn.Sequential(
            nn.Conv2d(256 + 128, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True)
        )
        self.dec2 = nn.Sequential(
            nn.Conv2d(128 + 64, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True)
        )
        # Final 3-channel convolution produces the RGB output (no activation).
        self.dec1 = nn.Conv2d(64, 3, kernel_size=3, padding=1)

    def forward(self, x):
        # Encoder
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        e3 = self.enc3(self.pool(e2))
        # Decoder with skip connections
        d3 = self.up(e3)
        d3 = self.crop_and_concat(d3, e2)
        d3 = self.dec3(d3)
        d2 = self.up(d3)
        d2 = self.crop_and_concat(d2, e1)
        d2 = self.dec2(d2)
        d1 = self.dec1(self.up(d2))
        return d1

    def crop_and_concat(self, upsampled, bypass):
        # Pad the upsampled tensor so its spatial size matches the encoder feature map,
        # then concatenate along the channel dimension.
        diffY = bypass.size()[2] - upsampled.size()[2]
        diffX = bypass.size()[3] - upsampled.size()[3]
        upsampled = nn.functional.pad(upsampled, [diffX // 2, diffX - diffX // 2, diffY // 2, diffY - diffY // 2])
        return torch.cat((upsampled, bypass), 1)

model = UNet()
Console output when running the app on the IPU:
(ryzenai-1.1-20240728-104357) PS C:\Users\Uday U\Desktop\amd_project\LastDance\MODEL\app> python -u "c:\Users\Uday U\Desktop\amd_project\LastDance\MODEL\app\app.py"
2024-07-29 09:13:16.3467393 [W:onnxruntime:Default, vitisai_provider_factory.cc:48 onnxruntime::VitisAIProviderFactory::CreateProvider] Construting a FlexML EP instance in Vitis AI EP
2024-07-29 09:13:16.3492791 [W:onnxruntime:Default, vitisai_execution_provider.cc:117 onnxruntime::VitisAIExecutionProvider::SetFlexMLEPPtr] Assigning the FlexML EP pointer in Vitis AI EP
2024-07-29 09:13:16.3679734 [W:onnxruntime:Default, vitisai_execution_provider.cc:137 onnxruntime::VitisAIExecutionProvider::GetCapability] Trying FlexML EP GetCapability
2024-07-29 09:13:16.3703036 [W:onnxruntime:Default, flexml_execution_provider.cc:180 onnxruntime::FlexMLExecutionProvider::GetCapability] FlexMLExecutionProvider::GetCapability, C:\amd\voe\binary-modules\ResNet.flexml\flexml_bm.signature can't not be found!
2024-07-29 09:13:16.3733385 [W:onnxruntime:Default, vitisai_execution_provider.cc:153 onnxruntime::VitisAIExecutionProvider::GetCapability] FlexML EP ignoring a non-ResNet50 graph
WARNING: Logging before InitGoogleLogging() is written to STDERR
I20240729 09:13:16.376204 17696 vitisai_compile_model.cpp:346] Vitis AI EP Load ONNX Model Success
I20240729 09:13:16.376204 17696 vitisai_compile_model.cpp:347] Graph Input Node Name/Shape (1)
I20240729 09:13:16.376204 17696 vitisai_compile_model.cpp:351] input : [-1x3x224x224]
I20240729 09:13:16.376204 17696 vitisai_compile_model.cpp:357] Graph Output Node Name/Shape (1)
I20240729 09:13:16.376204 17696 vitisai_compile_model.cpp:361] output : [-1x3x448x448]
I20240729 09:13:16.377209 17696 vitisai_compile_model.cpp:232] use cache key hello_cache
[Vitis AI EP] No. of Operators : CPU 85 IPU 59 40.97%
[Vitis AI EP] No. of Subgraphs :Actually running on IPU 4
W20240729 09:13:16.433452 17696 tool_function.cpp:171] The operator named /Shape_1_output_0, type: :Shape, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:16.433452 17696 tool_function.cpp:171] The operator named /Gather_3_output_0, type: :Gather, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:16.434509 17696 tool_function.cpp:171] The operator named /Sub_1_output_0, type: :Sub, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:16.434509 17696 tool_function.cpp:171] The operator named /Div_output_0, type: :Div, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:16.434509 17696 tool_function.cpp:171] The operator named /Unsqueeze_5_output_0, type: :Unsqueeze, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:16.437175 17696 tool_function.cpp:171] The operator named /Concat_1_output_0, type: :Concat, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:16.438256 17696 tool_function.cpp:171] The operator named /Reshape_output_0, type: :Reshape, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:16.438256 17696 tool_function.cpp:171] The operator named /Slice_output_0, type: :Slice, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:16.438256 17696 tool_function.cpp:171] The operator named /Transpose_output_0, type: :Transpose, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:16.439366 17696 tool_function.cpp:171] The operator named /Pad_output_0, type: :Pad, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:16.439366 17696 tool_function.cpp:171] The operator named /Pad_output_0_QuantizeLinear_Output, type: :QuantizeLinear, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:16.440290 17696 tool_function.cpp:171] The operator named /Pad_output_0_DequantizeLinear_Output, type: :DequantizeLinear, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:16.440290 17696 tool_function.cpp:171] The operator named /dec3/dec3.0/Conv_output_0, type: :Conv, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
* Serving Flask app 'app'
* Debug mode: on
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on http://127.0.0.1:5000
Press CTRL+C to quit
* Restarting with stat
2024-07-29 09:13:19.0921145 [W:onnxruntime:Default, vitisai_provider_factory.cc:48 onnxruntime::VitisAIProviderFactory::CreateProvider] Construting a FlexML EP instance in Vitis AI EP
2024-07-29 09:13:19.0946456 [W:onnxruntime:Default, vitisai_execution_provider.cc:117 onnxruntime::VitisAIExecutionProvider::SetFlexMLEPPtr] Assigning the FlexML EP pointer in Vitis AI EP
2024-07-29 09:13:19.1214067 [W:onnxruntime:Default, vitisai_execution_provider.cc:137 onnxruntime::VitisAIExecutionProvider::GetCapability] Trying FlexML EP GetCapability
2024-07-29 09:13:19.1240632 [W:onnxruntime:Default, flexml_execution_provider.cc:180 onnxruntime::FlexMLExecutionProvider::GetCapability] FlexMLExecutionProvider::GetCapability, C:\amd\voe\binary-modules\ResNet.flexml\flexml_bm.signature can't not be found!
2024-07-29 09:13:19.1283580 [W:onnxruntime:Default, vitisai_execution_provider.cc:153 onnxruntime::VitisAIExecutionProvider::GetCapability] FlexML EP ignoring a non-ResNet50 graph
WARNING: Logging before InitGoogleLogging() is written to STDERR
I20240729 09:13:19.131781 3632 vitisai_compile_model.cpp:346] Vitis AI EP Load ONNX Model Success
I20240729 09:13:19.131781 3632 vitisai_compile_model.cpp:347] Graph Input Node Name/Shape (1)
I20240729 09:13:19.131781 3632 vitisai_compile_model.cpp:351] input : [-1x3x224x224]
I20240729 09:13:19.131781 3632 vitisai_compile_model.cpp:357] Graph Output Node Name/Shape (1)
I20240729 09:13:19.131781 3632 vitisai_compile_model.cpp:361] output : [-1x3x448x448]
I20240729 09:13:19.133801 3632 vitisai_compile_model.cpp:232] use cache key hello_cache
[Vitis AI EP] No. of Operators : CPU 85 IPU 59 40.97%
[Vitis AI EP] No. of Subgraphs :Actually running on IPU 4
W20240729 09:13:19.196134 3632 tool_function.cpp:171] The operator named /Shape_1_output_0, type: :Shape, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:19.196134 3632 tool_function.cpp:171] The operator named /Gather_3_output_0, type: :Gather, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:19.197134 3632 tool_function.cpp:171] The operator named /Sub_1_output_0, type: :Sub, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:19.197134 3632 tool_function.cpp:171] The operator named /Div_output_0, type: :Div, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:19.197134 3632 tool_function.cpp:171] The operator named /Unsqueeze_5_output_0, type: :Unsqueeze, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:19.198151 3632 tool_function.cpp:171] The operator named /Concat_1_output_0, type: :Concat, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:19.198151 3632 tool_function.cpp:171] The operator named /Reshape_output_0, type: :Reshape, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:19.198151 3632 tool_function.cpp:171] The operator named /Slice_output_0, type: :Slice, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:19.198151 3632 tool_function.cpp:171] The operator named /Transpose_output_0, type: :Transpose, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:19.198151 3632 tool_function.cpp:171] The operator named /Pad_output_0, type: :Pad, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:19.198151 3632 tool_function.cpp:171] The operator named /Pad_output_0_QuantizeLinear_Output, type: :QuantizeLinear, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:19.198151 3632 tool_function.cpp:171] The operator named /Pad_output_0_DequantizeLinear_Output, type: :DequantizeLinear, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
W20240729 09:13:19.198151 3632 tool_function.cpp:171] The operator named /dec3/dec3.0/Conv_output_0, type: :Conv, is not defined in XIR. XIR creates the definition of this operator automatically. You should specify the shape and the data_type of the output tensor of this operation by set_attr("shape", std::vector<int>) and set_attr("data_type", std::string)
* Debugger is active!
* Debugger PIN: 105-239-934
127.0.0.1 - - [29/Jul/2024 09:15:54] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [29/Jul/2024 09:16:02] "POST / HTTP/1.1" 200 -
127.0.0.1 - - [29/Jul/2024 09:16:02] "GET /static/uploaded_image.jpg HTTP/1.1" 200 -
127.0.0.1 - - [29/Jul/2024 09:16:02] "GET /static/enhanced_image.jpg HTTP/1.1" 200 -
127.0.0.1 - - [29/Jul/2024 09:18:19] "POST / HTTP/1.1" 200 -
127.0.0.1 - - [29/Jul/2024 09:18:19] "GET /static/uploaded_image.jpg HTTP/1.1" 200 -
127.0.0.1 - - [29/Jul/2024 09:18:19] "GET /static/enhanced_image.jpg HTTP/1.1" 200 -
127.0.0.1 - - [29/Jul/2024 09:18:29] "POST / HTTP/1.1" 200 -
127.0.0.1 - - [29/Jul/2024 09:18:29] "GET /static/uploaded_image.jpg HTTP/1.1" 200 -
127.0.0.1 - - [29/Jul/2024 09:18:29] "GET /static/enhanced_image.jpg HTTP/1.1" 200 -
(ryzenai-1.1-20240728-104357) PS C:\Users\Uday U\Desktop\amd_project\LastDance\MODEL\app> ^C
All the remaining code is uploaded on my GitHub here. The output image is a little blurry since I wasn't able to train on the full set of 5,000 images; I only managed about 1,000, because training took 10+ hours for around 20 epochs. If the model were trained on the complete set of images, it would definitely perform better and produce a better result.
Comparison: Raw Image:
Output Image:
Ideal Image Output:
Takeaways: I did this project for the AMD Pervasive AI Developer Contest. I really enjoyed working on it and learnt a lot of new things. Thank you to all the Hackster.io coordinators for their helpful support.
Future Enhancements:
- Improve the model's performance.
- Add sliders and controls to let the user fine-tune the edits.