In recent years, robotic arm technology has undergone unprecedented development, expanding its applications from traditional manufacturing into emerging fields such as healthcare, services, and logistics. This cross-industry expansion is driven by rapid advances in technology, especially in sensors, control systems, and artificial intelligence. In particular, the surge in artificial intelligence over the past year has opened up new possibilities for the intelligent upgrade of robotic arms, greatly broadening their application scenarios and raising their operational efficiency and level of intelligence.
With the integration and innovation of these technologies, we have witnessed many impressive application cases that not only demonstrate how advanced robotic arm technology has become but also point to future development trends. This article shares four selected cases published on our platform last year, covering the use of robotic arms combined with technologies such as depth cameras and face tracking.
Case 1: ChatGPT for Robotics: Design Principles and Model Abilities
Background Introduction:
Since the release of ChatGPT in 2022, it has received worldwide attention as a powerful artificial intelligence chatbot. As researchers from various fields continue to study and explore ChatGPT, many creative functions have been derived from it. Among these is the effort by Microsoft's Autonomous Systems and Robotics Group to extend ChatGPT's capabilities into the robotics domain, enabling intuitive language control of robotic arms, drones, home robots, and more.
You might think that it is already possible to issue operational commands to a robot and have it execute them, and wonder what difference this could possibly make. The difference is bigger than it might appear: previously, implementing such behavior relied on writing code for every command the robot should understand.
For example, if I wanted a robotic arm to move forward upon hearing a command, I would need to program it in advance with code that receives the "move forward" command and triggers the corresponding motion function.
This team sought to change that status quo by using OpenAI's language model, ChatGPT, to make natural human-machine interaction possible. It eliminates the need for extensive pre-written code for fixed programs; instead, simple conversations with ChatGPT can achieve the desired functions, with the aim of helping people interact with robots more easily, without needing to learn complex programming languages or detailed robotics knowledge.
The idea, for instance, is to program the corresponding APIs in advance and tell ChatGPT what these functions are for. Then, through conversation, ChatGPT can autonomously call the right API to control the robotic arm based on what the dialogue asks for and what each function does.
The article gives an example of using a myCobot 280 M5 to build the Microsoft logo with blocks.
Step 1:
First, define a set of APIs or a library for controlling the robotic arm. Here, pymycobot, a library written specifically for controlling the myCobot, can be used directly; each of its functions just needs a short annotation describing what it does.
pymycobot: https://github.com/elephantrobotics/pymycobot
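For reference, a minimal sketch of such an API layer built on pymycobot might look like the code below. The serial port, baud rate, and suction-pump pin number are assumptions made for illustration, and get_position() is only a placeholder for the vision side, since the article does not publish that part of the code.

# A minimal sketch of the API layer described above, built on pymycobot.
# The port, baud rate, and suction-pump pin are assumptions; get_position()
# is a placeholder for the vision system, which is not shown in the article.
from pymycobot.mycobot import MyCobot

mc = MyCobot("/dev/ttyAMA0", 1000000)

def grab():
    # Turn on the suction pump to grab an object (pin number is assumed)
    mc.set_basic_output(5, 0)

def release():
    # Turn off the suction pump to release an object
    mc.set_basic_output(5, 1)

def move_to(position):
    # Move the end effector to [X, Y, Z, Yaw, Pitch, Roll] (mm and degrees)
    mc.send_coords(position, 50, 1)

def get_position(object_name):
    # Placeholder: a real implementation would query the vision system
    # for the named object's pose; a fixed dummy pose is returned here.
    return [150.0, 0.0, 100.0, 0.0, 180.0, 0.0]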
Step 2:
Next, we need to write a text prompt for ChatGPT that describes the task objective, clarifies which functions in the pymycobot library can be used, and how to use them.
User:
Imagine we are working with a manipulator robot. This is a robotic arm with 6 degrees of freedom that has a suction pump attached to its end effector. I would like you to assist me in sending commands to this robot given a scene and a task.
At any point, you have access to the following functions:
grab(): Turn on the suction pump to grab an object
release(): Turns off the suction pump to release an object
get_position(object): Given a string of an object name, returns the coordinates and orientation of the vacuum pump to touch the top of the object [X, Y, Z, Yaw, Pitch, Roll]
move_to(position): It moves the suction pump to a given position [X, Y, Z, Yaw, Pitch, Roll].
You are allowed to create new functions using these, but you are not allowed to use any other hypothetical functions.
Keep the solutions simple and clear. The positions are given in mm and the angles in degrees. You can also ask clarification questions using the tag "Question - ". Here is an example scenario that illustrates how you can ask clarification questions.
Let's assume a scene contains two spheres. Pick up the sphere.
Step 3:
Through continued conversation with ChatGPT, we eventually reach the desired result and have ChatGPT generate the corresponding code.
User:
Awesome! I want now to use the blocks to build the logo from Microsoft on top of the white pad. It consists of four colors forming a square, blue on the bottom left, yellow on the bottom right, red on the top left and green on the top right.
ChatGPT:
Sure! Here's the Python code to build the Microsoft logo on top of the white pad using the blocks:
# get the positions of the blocks and the white pad
white_pad_pos = get_position("white pad")
blue_pos = get_position("blue block")
yellow_pos = get_position("yellow block")
red_pos = get_position("red block")
green_pos = get_position("green block")
# pick up the blue block
pick_up_object("blue block")
# calculate the position to place the blue block
place_pos = [white_pad_pos[0]-20, white_pad_pos[1]-20, white_pad_pos[2]+40, 0, 0, 0]
# place the blue block on the white pad
place_object(place_pos)
# pick up the yellow block
pick_up_object("yellow block")
# calculate the position to place the yellow block
place_pos = [white_pad_pos[0]+20, white_pad_pos[1]-20, white_pad_pos[2]+40, 0, 0, 0]
# place the yellow block on the white pad
place_object(place_pos)
# pick up the red block
pick_up_object("red block")
# calculate the position to place the red block
place_pos = [white_pad_pos[0]-20, white_pad_pos[1]+20, white_pad_pos[2]+40, 0, 0, 0]
# place the red block on the white pad
place_object(place_pos)
# pick up the green block
pick_up_object("green block")
# calculate the position to place the green block
place_pos = [white_pad_pos[0]+20, white_pad_pos[1]+20, white_pad_pos[2]+40, 0, 0, 0]
# place the green block on the white pad
place_object(place_pos)
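Note that ChatGPT's answer relies on two helpers, pick_up_object() and place_object(), that are not among the four primitives listed in the prompt; in the full conversation they would have been composed from those primitives in an earlier exchange. A plausible sketch of such helpers, purely for illustration and not the code from the original session, could be:

# Hypothetical helpers composed from the four primitives defined in the prompt.
def pick_up_object(object_name):
    # Approach from above, touch the object, switch the pump on, then lift.
    pos = get_position(object_name)
    hover = [pos[0], pos[1], pos[2] + 40, pos[3], pos[4], pos[5]]
    move_to(hover)
    move_to(pos)
    grab()
    move_to(hover)

def place_object(position):
    # Approach from above, descend, release the object, then retreat.
    hover = [position[0], position[1], position[2] + 40,
             position[3], position[4], position[5]]
    move_to(hover)
    move_to(position)
    release()
    move_to(hover)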
Case 2: Application of Face Tracking Technology in Robotic Arms
Background Introduction:
This case has been very popular because the author, himself a newcomer to robotic arms, wanted to implement face recognition and tracking with one. The project is divided into two parts: the first discusses the use of OpenCV algorithms for face recognition and outlines the project's framework and requirements; the second details the algorithm for controlling the motion of the robotic arm.
Technical Implementation:
The entire project uses the mechArm 270, a 6-degree-of-freedom robotic arm powered by a Raspberry Pi 4B as its mainboard. To achieve face recognition and robotic arm tracking, machine vision is essential. The most widely used recognition algorithms today come from OpenCV; the technology is relatively mature and can be used directly.
Face recognition generally consists of two parts: face detection and face identification. In face detection, the main task is to construct a classifier that can separate image regions containing faces from those that do not. OpenCV ships with several pre-trained cascade classifiers; as the name suggests, a cascade classifier filters different features in successive stages to decide which category a region belongs to.
Link: http://face-rec.org/databases/
By using the classifier and OpenCV algorithms, the video stream can detect faces.
import cv2

def video_info():
    # Load the pre-trained Haar cascade classifier shipped with OpenCV
    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
    # Input video stream from the default camera
    cap = cv2.VideoCapture(0)
    # To use a video file as input instead:
    # cap = cv2.VideoCapture('demo.mp4')
    while True:
        _, img = cap.read()
        # Conversion to grayscale
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        # Detecting faces
        faces = face_cascade.detectMultiScale(gray, 1.1, 4)
        # Drawing the outline and marking the center of each face
        for (x, y, w, h) in faces:
            cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
            center_x = x + w // 2
            center_y = y + h // 2
            cv2.circle(img, (center_x, center_y), 10, (0, 255, 255), 2)
        # Display effects
        cv2.imshow('img', img)
        k = cv2.waitKey(30) & 0xff
        if k == 27:  # press Esc to quit
            break
    cap.release()
    cv2.destroyAllWindows()

if __name__ == "__main__":
    video_info()
To capture video from the camera mounted at the robotic arm's end effector, the camera has to be positioned correctly, which brings in a concept known as hand-eye calibration. Hand-eye calibration lets the robotic arm know where objects captured by the camera are relative to the arm; in other words, it establishes a mapping between the camera's coordinate system and the robotic arm's coordinate system.
In this project, that means determining where a face seen by the camera lies relative to the robotic arm, which is crucial for capturing the corresponding face coordinates accurately and in real time.
After experimentation, the author found that the Raspberry Pi 4B lacked the computational power for full hand-eye calibration, which led to poor performance and too many uncertain parameters. The author therefore decided to control the motion using relative displacement instead. This requires a sampling mechanism that captures the face's displacement completely within each cycle so that the arm can track it. Let's see how it works; a rough sketch of the idea follows.
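As a rough, illustrative sketch of the relative-displacement idea (not the author's actual code): the offset of the detected face center from the image center, sampled once per cycle, is scaled into a small relative move of the arm. The gain, speed, and axis mapping below are assumptions.

# Sketch: track a face by relative displacement (illustrative values only).
from pymycobot.mycobot import MyCobot

mc = MyCobot("/dev/ttyAMA0", 1000000)   # port and baud rate are assumptions
K = 0.1                                 # assumed gain: mm of arm motion per pixel

def track_once(face_center, frame_size):
    # face_center: (x, y) pixel center of the detected face
    # frame_size:  (width, height) of the camera frame
    dx = face_center[0] - frame_size[0] / 2
    dy = face_center[1] - frame_size[1] / 2
    coords = mc.get_coords()            # current [x, y, z, rx, ry, rz]
    if not coords:
        return
    # How image offsets map onto arm axes depends on how the camera is
    # mounted; this mapping is only an example.
    coords[1] -= K * dx
    coords[2] -= K * dy
    mc.send_coords(coords, 30, 1)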
Case 3: Operate myCobot with the Spatial Recognition of D455
Background Introduction:
To meet the demands of more scenarios, depth cameras are now widely used, as the following user case shows. It uses the RealSense D455, a depth camera made by Intel, which can obtain multiple parameters of an object. The project aims to combine a depth camera with a robotic arm for spatial object tracking.
In the scene, there is a camera in the top left corner capturing the coordinates of a red object and then relaying this information to the robotic arm, which moves accordingly.
This case shares similarities with the previous one in its use of robotic arm vision, but differs in the use of a depth camera. The camera's built-in algorithms make it easy to obtain the target object's parameters, which simplifies hand-eye calibration. The case also mentions the RealSense motion module, whose settings can resolve values down to 0.01 m, or even roughly 0.001 m, allowing very fine parameters to be captured.
The entire project is built within ROS (Robot Operating System), linking the RealSense D455 and the myCobot.
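For readers without the ROS setup, a minimal way to read a depth value from a D455 using Intel's pyrealsense2 library looks roughly like this; the pixel coordinates are placeholders that would normally come from the color-detection step.

# Minimal pyrealsense2 sketch: read the distance at one pixel from a D455.
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
pipeline.start(config)

try:
    frames = pipeline.wait_for_frames()
    depth_frame = frames.get_depth_frame()
    if depth_frame:
        # Distance in meters at the pixel where the red object was detected;
        # the pixel coordinates here are placeholders.
        dist = depth_frame.get_distance(320, 240)
        print("Distance at image center: %.3f m" % dist)
finally:
    pipeline.stop()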
This article can greatly assist those who have purchased a robotic arm or depth camera but are unsure how to utilize them.
Case 4: Automating Fruit Harvesting and Sorting
Background Introduction:
This technical case leans towards the construction of a simulated scenario, involving multiple mechArm 270 units working together, a conveyor belt, and several camera modules. The main goal is to simulate the harvesting and sorting of fruit.
The entire scenario process showcases several key technical points:
1. Machine vision for recognition and acquiring target fruit coordinates.
2. Coordination between the robotic arms and the logic linking them to the conveyor belt.
Across cases 1 to 4, machine vision is indispensable whenever these scenarios are put into practice, which makes cameras a necessity, and executing a successful project requires some knowledge of machine vision. Let's see how vision is handled here. A depth camera is used in this case; compared with a 2D camera, it removes the need for some markers, because the depth camera's own parameters can be relied on directly for hand-eye calibration. Here the "eye" is fixed outside the hand, and the goal is to find a transformation matrix that describes the camera coordinate system's position and orientation relative to the robot base coordinate system; a small sketch of how such a matrix is applied follows.
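For illustration, once such a camera-to-base transformation matrix is known, a point detected in the camera frame is mapped into the robot base frame with a single homogeneous-coordinate multiplication. The matrix values below are placeholders, not the calibration result from the project.

# Sketch: map a point from the camera frame into the robot base frame.
import numpy as np

# 4x4 homogeneous transform of the camera frame expressed in the robot base
# frame; the rotation and translation here are placeholder values.
T_base_camera = np.array([
    [ 0.0, -1.0,  0.0, 250.0],
    [-1.0,  0.0,  0.0,   0.0],
    [ 0.0,  0.0, -1.0, 420.0],
    [ 0.0,  0.0,  0.0,   1.0],
])

def camera_to_base(point_camera):
    # point_camera: [x, y, z] of the fruit in the camera frame (mm)
    p = np.append(np.asarray(point_camera, dtype=float), 1.0)  # homogeneous point
    return (T_base_camera @ p)[:3]                             # [x, y, z] in base frame

print(camera_to_base([10.0, 20.0, 300.0]))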
Object recognition is performed using OpenCV's color recognition, differentiating fruits by color.
# Detection and identification: HSV color ranges used to tell the fruits apart
import numpy as np

HSV_DIST = {
    "redA": (np.array([0, 120, 50]), np.array([3, 255, 255])),
    "redB": (np.array([118, 120, 50]), np.array([179, 255, 255])),
    "orange": (np.array([8, 150, 150]), np.array([20, 255, 255])),
    "yellow": (np.array([28, 100, 150]), np.array([35, 255, 255])),
}
By defining these possible color ranges, the target fruit can be recognized precisely and its coordinates acquired.
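Using those ranges, the detection itself is a standard OpenCV inRange operation on an HSV image. The sketch below reuses the HSV_DIST dictionary shown above and simply returns the pixel center of the largest blob of the requested color; it is a simplified illustration, not the project's full detection class.

# Sketch: find the largest blob of one color from HSV_DIST in a frame.
import cv2
import numpy as np

def detect_color(frame, color_name):
    # Returns the pixel center of the largest region of the requested color,
    # or None if nothing is found.
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    lower, upper = HSV_DIST[color_name]
    mask = cv2.inRange(hsv, lower, upper)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    m = cv2.moments(largest)
    if m["m00"] == 0:
        return None
    return int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])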
The path the robotic arm takes to grasp the fruit is then determined mainly by several considerations (a short sketch follows the list below):
● Initial posture
● Grabbing posture
● Obstacle avoidance posture
These aspects ensure that the grasping of the target is not hindered by other objects in the scene.
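One simple way to encode these postures is an explicit waypoint sequence that the arm always traverses in the same order. The sketch below uses placeholder joint angles and an assumed serial port; it only illustrates the structure, not the project's real values.

# Sketch: traverse initial, obstacle-avoidance, and grabbing postures in order.
# All joint angles are placeholder values, not the project's real ones.
import time
from pymycobot.mycobot import MyCobot

mc = MyCobot("/dev/ttyAMA0", 1000000)    # port and baud rate are assumptions

INITIAL_POSE = [0, 0, 0, 0, 90, 0]       # safe starting posture
AVOID_POSE   = [-30, 20, -40, 0, 90, 0]  # raised posture that clears obstacles
GRAB_POSE    = [-30, 55, -60, 5, 90, 0]  # posture that reaches the fruit

def grasp_sequence():
    for pose in (INITIAL_POSE, AVOID_POSE, GRAB_POSE):
        mc.send_angles(pose, 40)
        time.sleep(2)                    # wait for the (non-blocking) move to finish
    # ... turn the suction pump on here, then retreat in reverse order ...
    for pose in (AVOID_POSE, INITIAL_POSE):
        mc.send_angles(pose, 40)
        time.sleep(2)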
Next, a critical step is managing the logical relationship between the two robotic arms and the conveyor belt: roles are assigned to determine which robotic arm moves first, ensuring a smooth process that does not end in program deadlock. A schematic way to express this turn-taking is sketched below.
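One way to keep the two arms from blocking each other is to make them share a single lock (or token) for the conveyor area, so that neither can start its move until the other has finished; the snippet below is only an illustration of that idea, not the project's actual control logic.

# Schematic turn-taking between the two arms over the shared conveyor area.
import threading

conveyor_lock = threading.Lock()        # only one arm may work over the belt at a time

def picking_arm_cycle(place_on_conveyor):
    # Arm 1: place a harvested fruit on the belt while holding the lock.
    with conveyor_lock:
        place_on_conveyor()

def sorting_arm_cycle(take_from_conveyor):
    # Arm 2: take a fruit off the belt and sort it while holding the same lock.
    with conveyor_lock:
        take_from_conveyor()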
This scenario application involves many different functionalities, helping beginners learn how to use robotic arms and build a complete scenario, and helping those who want to go further grasp robotic arm motion control, vision recognition, object grasping techniques, and more.
Summary
In 2023, the rapid development of robotic arm technology garnered widespread attention, with several notable cases demonstrating its immense potential for innovation and application. We welcome everyone interested in exploring this field and willing to share innovative ideas to get in touch with us. Let's witness and experience together how technology shapes the wonders of the future world!