Scout is a robot built to test 3D vision, SLAM, NLP, neural nets, mapping, and pathfinding. I am using it to test features intended for another, larger bot of mine (Ava v2). My goal is to perfect the ability to move intelligently around my house from any point to any other point accessible to the robot. I also intend to perfect the 3D perception system and a new spatial 3D memory for everything the bot sees. The bot fuses data from multiple sensors and neural nets to perform these functions. It can be controlled by voice remote, web page, or game controller, and also has an autonomous mode.
Video Demo
Neural Net of Neural Nets
The software for this robot is organized into many "neurons" that are "activated" at various times. Each neuron has one or more English names, and a neuron can activate other neurons. Each neuron can contain its own embedded script, written in plain English or English-based fuzzy logic. Each neuron can also contain its own embedded neural net. This means the brain is effectively a neural net of neural nets.
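A minimal sketch of this "neuron of neurons" idea, with hypothetical names and structure (the actual implementation is not shown in this post): each neuron carries English names, may hold an embedded English script, and can cascade activation to other neurons.

```python
# Hypothetical sketch: a neuron with English names, an optional embedded
# English script, and downstream neurons it can activate in turn.

class Neuron:
    def __init__(self, names, script=None, downstream=None):
        self.names = names                   # one or more English names
        self.script = script                 # optional plain-English script
        self.downstream = downstream or []   # neurons this one can activate
        self.fired = False

    def activate(self, context):
        self.fired = True
        if self.script:
            context.append(self.script)      # scripts are plain-English requests
        for n in self.downstream:            # cascade activation
            n.activate(context)
        return context

greet = Neuron(["greet", "say hello"], script="Say hello to the person.")
see_person = Neuron(["see person"], script="Look at the person.",
                    downstream=[greet])
log = see_person.activate([])
```

Activating `see_person` fires it and its downstream `greet` neuron, collecting both English scripts in order.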
Translucent 3D Printed and LED Lighted Brain
This bot has LED lighting that "fires" beneath a translucent brain, representing the firing of various neurons and conditions.
Design and Assembly
This robot was designed in about a week, based on a previous robot I had built around 8 years ago. Printing took about 10 days, and installing and testing all the sensors took another two weeks. It cost around $1,200. I have spent several years writing and refining the software, which is shared by a few different robots of mine.
Here is a link to the previous version of this bot (Anna) from years ago.
https://www.robotshop.com/community/robots/show/super-droid-bot-anna-w-learning-ai
This older bot explored the limits of verbal intelligence before modern transformer models were available.
English NLP - Not Just for Talking, used for Thinking and Orchestration
This robot uses plain English and NLP both to communicate with people and to pass messages from one part of the brain to another. If a set of actions needs to be performed by various services, a neuron simply composes a paragraph of English sentences asking the rest of the system to perform those actions.
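The idea of English as an internal message bus can be sketched as follows. The sentence format, handler names, and actions here are illustrative, not the robot's actual vocabulary: one part of the brain composes a paragraph of requests, and another parses it and dispatches each sentence to a handler.

```python
# Hypothetical sketch: compose a paragraph of English requests, then
# dispatch each sentence to a matching service handler.

def compose_request(actions):
    # Each requested action becomes one plain-English sentence.
    return " ".join(f"Please {a}." for a in actions)

# Illustrative handlers; a real system would route to motors, cameras, etc.
HANDLERS = {
    "turn left": lambda: "turning left",
    "take a picture": lambda: "snapped",
}

def dispatch(paragraph):
    results = []
    for sentence in paragraph.split("."):
        key = sentence.strip().removeprefix("Please ").strip()
        if key in HANDLERS:
            results.append(HANDLERS[key]())
    return results

msg = compose_request(["turn left", "take a picture"])
out = dispatch(msg)
```

The same paragraph a human could read ("Please turn left. Please take a picture.") is what the dispatcher consumes, which is the appeal of this style of orchestration.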
Machine Learning Models
I incorporated the following off-the-shelf neural nets into the brain of this robot. Most of these models run in Python on a separate laptop with a graphics card and are accessed through a custom-built Flask API.
Verbal Models
- NLP DialoGPT Model (Microsoft)
- NLP Text Generation Model
- NLP Sentiment Detection Model
- NLP Masking Model (Transformer)
- NLP Question Answering Model (Transformer)
- NLP Entity Recognition Model
Vision Models
- YOLO v3 DarkNet Model
- AlexNet Model
- Gender/Age Detection Model
- Face Detection Model
- Emotion Detection Model
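A sketch of how the robot side might talk to that laptop-hosted Flask API. The host name, route names, and payload shape below are hypothetical, not the author's actual API; the point is that each model gets its own HTTP endpoint and exchanges JSON.

```python
# Hypothetical client sketch for the laptop-side model server. Only the
# request construction is shown; a real caller would POST `body` to `url`
# (e.g. with requests.post) and read back JSON.

import json

BASE = "http://laptop:5000"  # GPU laptop running the Flask API (assumed host)

def build_request(model, payload):
    # One route per model, e.g. /sentiment, /yolo, /qa (illustrative names).
    url = f"{BASE}/{model}"
    body = json.dumps({"input": payload})
    return url, body

url, body = build_request("sentiment", "I love this robot")
```

Keeping the heavy models behind one HTTP boundary lets the robot's onboard computer stay light while the GPU laptop does the inference.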
I incorporated the following libraries and algorithms into the software. This list is only the major ones.
- Spacy NLP Library
- OpenCV - various algorithms for color detection, shape detection, etc.
- A Star Pathfinding (in Python)
- 2D Occupancy Grid Map (in Python)
- 3D Memory System - (in Python)
- Various custom built NLP Algos
- Various Graph Algos
- Fast Fourier Transforms (for audio spectrum analyzer)
This bot learns on its own any time a new word is encountered. For this, I use the following data sources.
- Word/Thesaurus API
- Wikipedia Text and API
- RDF Triple Sources (Dbpedia and others on Linked Open Data Web)
- Wolfram Alpha API
- ConceptNet
- GeoNames database
- Some custom built SQL Server databases
- Weather API
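The lookup flow for a new word might look like the sketch below. ConceptNet does expose a public REST API at api.conceptnet.io with `/c/en/<word>` concept URIs; the caching dictionary and `fetch` callable are illustrative, not the robot's actual code.

```python
# Sketch of the new-word learning flow: unknown words trigger a lookup
# against a knowledge source (ConceptNet here), and results are cached.

known_words = {}  # illustrative in-memory cache; the bot uses SQL Server

def conceptnet_url(word):
    # ConceptNet's public API addresses English concepts as /c/en/<word>.
    return f"http://api.conceptnet.io/c/en/{word.lower().strip()}"

def learn(word, fetch):
    # `fetch` is any callable that retrieves facts for a URL
    # (e.g. lambda url: requests.get(url).json() in a live system).
    if word not in known_words:
        known_words[word] = fetch(conceptnet_url(word))
    return known_words[word]

# Stub fetcher so the sketch runs offline; it just echoes the URL queried.
facts = learn("Dog", lambda url: {"queried": url})
```

Injecting the fetcher keeps the learning logic testable without network access, and the same shape works for the other APIs in the list.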
3D Depth Camera
This robot makes heavy use of its Orbbec 3D depth camera. The depth stream effectively amounts to roughly 307,000 forward-facing distance sensors.
The robot uses the 3D depth information, its position and orientation in 3D space, along with its sonars, to move around the house.
The data from all the sensors is fused into a 2D occupancy grid map.
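One way to picture that fusion step: each depth (or sonar) reading is a range and bearing relative to the robot, which gets projected onto the floor plane and marked in the grid. The cell size, grid dimensions, and projection below are made-up numbers for illustration.

```python
# Illustrative sketch: project a range + bearing reading onto a 2D
# occupancy grid centered on the robot. CELL and GRID are assumed values.

import math

CELL = 0.1           # grid cell size in meters (assumed)
GRID = 40            # 4 m x 4 m local grid, robot at the center

def mark_occupied(grid, range_m, bearing_rad, robot_xy=(0.0, 0.0)):
    # Convert the reading into world x/y, then into a grid cell.
    x = robot_xy[0] + range_m * math.cos(bearing_rad)
    y = robot_xy[1] + range_m * math.sin(bearing_rad)
    col = round(x / CELL) + GRID // 2
    row = round(y / CELL) + GRID // 2
    if 0 <= row < GRID and 0 <= col < GRID:
        grid[row][col] = 1   # mark the cell as occupied
    return row, col

grid = [[0] * GRID for _ in range(GRID)]
row, col = mark_occupied(grid, 1.0, 0.0)  # obstacle 1 m straight ahead
```

Running every depth pixel and sonar ping through the same projection is what lets very different sensors share one map.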
The robot creates another version of the map with a "force field" of added cost around all the objects in the map. This force field makes the robot prefer open areas when picking a path and navigating through rooms, hallways, etc.
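The force-field idea amounts to inflating a cost around every occupied cell so a planner steers wide of obstacles. The decay-with-distance rule below (cost falling off with Chebyshev distance) is one illustrative choice, not necessarily the one the robot uses.

```python
# Sketch of obstacle cost inflation: every occupied cell radiates a cost
# that decays with distance, so planners prefer open space.

def force_field(grid, radius=2, peak=10):
    rows, cols = len(grid), len(grid[0])
    cost = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1:
                continue                       # only obstacles radiate cost
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        d = max(abs(dr), abs(dc))          # Chebyshev distance
                        cost[rr][cc] = max(cost[rr][cc], peak - d * 4)
    return cost

grid = [[0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0]]
cost = force_field(grid)
```

Adding this cost map to the A* step cost makes paths hug the middle of hallways instead of skimming walls.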
What is Left to Do:
The main things I am working on now with this bot are building better 2D and 3D maps and moving the bot within the world based on those maps using pathfinding and the other sensors. I am also putting a lot of time into a new 3D memory system in which everything that is seen and recognized is also remembered. This will allow the robot to explore, and then get back to any previously seen object from any position in the house.
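The core of such a 3D memory can be sketched very simply: store each recognized object with a world position, then answer "where did I last see X?" queries. The data layout and function names here are hypothetical, since the post doesn't detail the actual design.

```python
# Hypothetical sketch of a spatial 3D object memory: sightings are stored
# with world coordinates and queried by label.

import math

memory = []  # list of (label, (x, y, z)) sightings, newest last

def remember(label, position):
    memory.append((label, position))

def last_seen(label):
    # Most recent sighting wins, since objects move around a house.
    for name, pos in reversed(memory):
        if name == label:
            return pos
    return None

def distance_to(label, robot_pos):
    pos = last_seen(label)
    return None if pos is None else math.dist(pos, robot_pos)

remember("keys", (2.0, 1.0, 0.8))
remember("cup", (0.0, 3.0, 0.9))
remember("keys", (4.0, 1.0, 0.8))          # keys were moved and re-seen
d = distance_to("keys", (1.0, 1.0, 0.8))   # robot's current position
```

Paired with the occupancy-grid pathfinding above, a query like this gives the goal point the planner navigates back to.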