This project addresses the critical need for autonomous navigation in challenging environments where traditional navigation methods are infeasible. It aims to develop a robot that can understand and navigate its surroundings independently, with applications in disaster response, exploration of uncharted areas, and everyday tasks such as moving through crowded spaces without collisions or performing indoor chores. The challenge is to build a robot that makes informed decisions from its real-time understanding of the environment, without relying on predefined paths or external guidance.
What are we going to build to solve this problem? How is it different from existing solutions? Why is it useful?

We will build an autonomous, self-navigating robot that uses a vision-guided approach. The robot will use an AMD Kria KV260 AI accelerator to run a vision large language model (LLM) over its camera feed, letting it understand its surroundings and make decisions on the fly. Unlike existing solutions that depend on predefined paths or external guidance, this approach combines AI acceleration with the flexibility of vision-based navigation, so the robot can adapt to dynamic environments and carry out tasks more efficiently. It is particularly useful where the robot must navigate complex, unpredictable environments.
How does our solution work? What are the main features? Please specify how you will use the AMD AI Hardware in our solution.

The robot carries a high-resolution camera that captures its surroundings in real time. Each frame is processed on the AMD Kria KV260 AI accelerator, which can run complex AI models efficiently, and fed into a vision LLM that analyzes the visual input to identify objects, obstacles, and open paths. Based on this analysis, the robot chooses the best course of action, such as turning left or right, moving forward, or stopping, and adjusts its path as needed to avoid obstacles and reach its destination. A minimal sketch of this perception-decision loop follows the feature list below.

The main features of this solution include:
- Vision-Based Navigation: uses a camera and a vision LLM to understand the environment and navigate autonomously.
- AI Acceleration: leverages the AMD Kria KV260 AI accelerator for efficient processing of visual data, enabling real-time decision-making.
- Dynamic Adaptation: adapts to changes in the environment, such as new obstacles or changing lighting conditions.
- Task-Specific Navigation: can be programmed for specific tasks, such as searching for a person in a crowd or navigating an indoor space to complete a given job.
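To make the loop concrete, here is a minimal Python sketch of the capture-analyze-act cycle described above. The names `query_vision_model` and `send_motor_command` are hypothetical placeholders, not part of any existing library; a real build would wire them to the model running on the Kria board and to the motor controller.

```python
import time

import cv2  # OpenCV, assumed available for camera capture

VALID_ACTIONS = {"forward", "left", "right", "stop"}

def query_vision_model(frame) -> str:
    """Hypothetical stand-in for the vision LLM on the Kria accelerator.
    A real implementation would send `frame` to the accelerated model and
    parse its answer; this stub conservatively returns "stop"."""
    return "stop"

def send_motor_command(action: str) -> None:
    """Hypothetical stand-in for the motor control interface."""
    print(f"motor command: {action}")

def navigation_loop(camera_index: int = 0, period_s: float = 0.2) -> None:
    cap = cv2.VideoCapture(camera_index)
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                send_motor_command("stop")  # fail safe if the feed drops
                break
            action = query_vision_model(frame)
            if action not in VALID_ACTIONS:
                action = "stop"  # fail safe on unexpected model output
            send_motor_command(action)
            time.sleep(period_s)  # bound the decision rate
    finally:
        cap.release()

if __name__ == "__main__":
    navigation_loop()
```

The fail-safe defaults matter here: whenever the camera feed or the model output is not trustworthy, the loop commands a stop rather than guessing.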
Hardware and software we will use to build this.

Hardware:
- AMD Kria KV260 AI accelerator: processes visual data and runs the vision LLM.
- High-resolution camera: captures the robot's surroundings in real time.
- Robot chassis: equipped with wheels or tracks for mobility.
- Motor control system: drives the robot's movement based on the navigation instructions generated by the LLM.
- Power supply: powers the AI accelerator and the other components.

Software:
- Vision model (LLM): a large language model trained to interpret visual data and generate navigation instructions.
- AI accelerator software: drivers and tooling that enable the AMD Kria KV260 to process video feeds and run the LLM.
- Navigation control software: interprets the LLM's instructions and translates them into motor commands (see the sketch after this list).
- Safety and collision-avoidance algorithms: ensure the robot navigates safely, avoiding obstacles and preventing collisions.
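As an illustration of the navigation control layer, the sketch below maps the model's high-level action onto differential-drive wheel speeds and applies a collision-avoidance override. The `WheelSpeeds` type, the action table values, and `set_wheel_speeds` are illustrative assumptions; the real interface will depend on the chassis and motor driver we select.

```python
from dataclasses import dataclass

@dataclass
class WheelSpeeds:
    left: float   # normalized speed in [-1.0, 1.0]
    right: float  # normalized speed in [-1.0, 1.0]

# Action table for a differential-drive chassis: equal speeds drive
# straight, opposite speeds turn in place. Values are placeholders.
ACTION_TABLE = {
    "forward": WheelSpeeds(0.6, 0.6),
    "left":    WheelSpeeds(-0.4, 0.4),
    "right":   WheelSpeeds(0.4, -0.4),
    "stop":    WheelSpeeds(0.0, 0.0),
}

def set_wheel_speeds(speeds: WheelSpeeds) -> None:
    """Hypothetical motor-driver call (e.g., PWM duty-cycle updates)."""
    print(f"L={speeds.left:+.2f} R={speeds.right:+.2f}")

def execute_action(action: str, obstacle_close: bool) -> None:
    # Collision-avoidance override: a nearby obstacle always forces a
    # stop, regardless of what the vision LLM suggested.
    if obstacle_close:
        action = "stop"
    set_wheel_speeds(ACTION_TABLE.get(action, ACTION_TABLE["stop"]))

execute_action("forward", obstacle_close=False)  # drives straight
execute_action("forward", obstacle_close=True)   # safety override: stop
```

Keeping the safety override in this layer, below the model, means a bad or slow model answer can never drive the robot into an obstacle.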