Sunday, 22 September 2024

Kakashi: The Copycat Robot

In this post, I want to share "Kakashi: The Copycat Robot", a fun side project I built a few years ago. The name is inspired by the famous character from Naruto, Kakashi Hatake, also known as the Copycat Ninja.

The goal of this robot was to mimic Kakashi's ability to copy movements—though, of course, in a more limited way. Check it out for yourself!

Here are the things I used to build this:

  1. Arduino UNO board
  2. MAX7219 8x8 LED matrix
  3. 3D-printed pan and tilt brackets (2x)
  4. 4 servo motors
  5. Breadboard and jumper wires

I will go through the build in the following sections:
  • The Sharingan
  • Pan and Tilt motion 
  • Controller - Serial bus
  • Tracking algorithm

The Sharingan

Of course, our Kakashi needs a Sharingan! For those unfamiliar, the Sharingan is the special eye that grants Kakashi his copycat abilities in Naruto.



For this, I used a MAX7219 8x8 LED matrix. It has 5 pins, which I connected as follows:

  • VCC - connect to 5V
  • GND - connect to ground
  • DIN - data in, to a digital pin
  • CS - chip select, to a digital pin
  • CLK - clock, to a digital pin



Then I found an LED matrix editor, which I used to create hex mappings of the Sharingan at different angles, and wrote code that loops through them.
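
A minimal version of that loop, using the popular LedControl library, might look like this (the pin numbers and frame bytes here are placeholders, not the exact values from my build):

```cpp
#include <LedControl.h>

// DIN, CLK, CS pins and one MAX7219 device; use whatever pins you wired.
LedControl lc = LedControl(12, 11, 10, 1);

// Hex frames of the Sharingan at different rotation angles, exported
// from an LED matrix editor. These bytes are placeholders.
const byte FRAMES[][8] = {
  {0x3C, 0x42, 0x99, 0xA5, 0xA5, 0x99, 0x42, 0x3C},
  {0x3C, 0x5A, 0x81, 0xA5, 0xA5, 0x81, 0x5A, 0x3C},
};
const int NUM_FRAMES = sizeof(FRAMES) / sizeof(FRAMES[0]);

void setup() {
  lc.shutdown(0, false);  // wake the MAX7219 from power-saving mode
  lc.setIntensity(0, 8);  // medium brightness
  lc.clearDisplay(0);
}

void loop() {
  // Cycle through the frames so the eye appears to spin.
  for (int f = 0; f < NUM_FRAMES; f++) {
    for (int row = 0; row < 8; row++) {
      lc.setRow(0, row, FRAMES[f][row]);
    }
    delay(150);
  }
}
```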


Pan and Tilt

Pan and tilt are two motions that, used in combination, can cover essentially any movement.


I used two of these assemblies to mimic arm movements. Each one is made up of a pan and tilt bracket, which you can either 3D print or purchase pre-made from Amazon, with two servo motors attached. I won't go into assembly details, as there are plenty of great tutorials available on how to put one together.

Each servo motor has three pins: 5V power, ground, and control. I connected the four control cables to digital pins on the Arduino.


I wrote a simple class to control the 4 servo motors and map them into pan and tilt actions.
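
A minimal sketch of the idea, using the standard Arduino Servo library (the class name and pin numbers here are illustrative, not the original code):

```cpp
#include <Servo.h>

class PanTilt {
 public:
  void attach(int panPin, int tiltPin) {
    pan.attach(panPin);
    tilt.attach(tiltPin);
    moveTo(0, 0);  // point forward on startup
  }
  void moveTo(int panAngle, int tiltAngle) {
    // Servos only accept 0-180 degrees, so clamp before writing.
    pan.write(constrain(panAngle, 0, 180));
    tilt.write(constrain(tiltAngle, 0, 180));
  }
 private:
  Servo pan;
  Servo tilt;
};

PanTilt leftArm, rightArm;

void setup() {
  leftArm.attach(3, 5);   // assumed wiring
  rightArm.attach(6, 9);  // assumed wiring
}

void loop() {}
```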

   

Controller - Serial Bus

The serial bus acts as a communication channel between my computer and the Arduino via a wired connection. I use it to control the 8x8 LED display and handle pan and tilt actions. This setup is flexible and has been useful in several other projects as well. 

On the client side, I implemented a simple class that sends control messages. It can also record and play back actions, similar to how Kakashi copies techniques and reuses them.
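
A minimal Python sketch of such a client, using pyserial (the port name, baud rate, and line-based message format are assumptions):

```python
import time

import serial  # pyserial


class KakashiClient:
    """Sends servo commands over the serial bus and records them for playback."""

    def __init__(self, port="/dev/ttyUSB0", baud=115200):
        self.conn = serial.Serial(port, baud, timeout=1)
        self.recording = []  # (timestamp, line) pairs

    def move(self, pan1, tilt1, pan2, tilt2):
        """Send the four servo angles as one newline-terminated message."""
        self._send(f"move {pan1} {tilt1} {pan2} {tilt2}")

    def _send(self, line):
        self.conn.write((line + "\n").encode())
        self.recording.append((time.time(), line))

    def playback(self):
        """Replay the recorded messages with their original timing."""
        previous = None
        for stamp, line in self.recording:
            if previous is not None:
                time.sleep(stamp - previous)
            self.conn.write((line + "\n").encode())
            previous = stamp
```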


On the Arduino, I receive these messages and perform the appropriate action.
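
The receiving loop can be as simple as this (again a sketch, assuming the same line-based format as the client above and the PanTilt class from earlier):

```cpp
void setup() {
  Serial.begin(115200);
  leftArm.attach(3, 5);
  rightArm.attach(6, 9);
}

void loop() {
  if (Serial.available()) {
    String line = Serial.readStringUntil('\n');
    if (line.startsWith("move")) {
      int pan1, tilt1, pan2, tilt2;
      // Parse the four angles out of "move p1 t1 p2 t2".
      if (sscanf(line.c_str(), "move %d %d %d %d",
                 &pan1, &tilt1, &pan2, &tilt2) == 4) {
        leftArm.moveTo(pan1, tilt1);
        rightArm.moveTo(pan2, tilt2);
      }
    }
    // Other message types (e.g. switching the eye animation) go here.
  }
}
```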


Tracking Algorithm

   

In this section, I'll explain how I mapped my real-world movements to control the robot's actions. There were three main requirements:

  1. Hand Tracking: The system needed to track my hand movements and map them to four angles, corresponding to the servo motors in the pan and tilt setup.

  2. Scale Invariance: It had to be scale-invariant, meaning I could start from any position and move freely, with the robot replicating the same actions regardless of where I started.

  3. Smooth Movements: The movements had to be smooth and fault tolerant, taking into account the bandwidth limitations of the serial bus and the movement speed of the servo motors.

For hand tracking, I needed a model that could quickly provide hand landmarks while running efficiently on CPU/MPS (for Mac). Since high accuracy wasn't critical, I opted for the EfficientDet model via MediaPipe. You can find more details in the kakashi.py file.
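
As an illustration, here is a minimal landmark loop using MediaPipe's Hands solution at its lightest model setting; the actual model setup lives in kakashi.py and may differ:

```python
import cv2
import mediapipe as mp

# Lightest model settings: speed over accuracy.
hands = mp.solutions.hands.Hands(
    max_num_hands=2,
    model_complexity=0,
    min_detection_confidence=0.5,
)

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # MediaPipe expects RGB; OpenCV captures BGR.
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        for hand in results.multi_hand_landmarks:
            landmarks = hand.landmark  # 21 points with x/y normalized to [0, 1]
            # ... extract center, palm height, and fingertip average (see below)
```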




Once I have the hand landmarks, I extract three key pieces of information from each hand (see the sketch after the list):

  1. Center of the Hand (landmark 0)
  2. Palm Height (difference between landmarks 5 and 0) — used to scale the coordinates.
  3. Average Position of Finger Tips (landmarks 4, 8, 12, 16, 20) — since not all fingers might always be visible.
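
In code, that extraction might look like this (the function name is illustrative; the landmark indices follow MediaPipe's hand model):

```python
import numpy as np

FINGER_TIPS = [4, 8, 12, 16, 20]  # thumb, index, middle, ring, pinky tips

def extract_features(landmarks):
    """Reduce the 21 hand landmarks to (center, palm_height, tips)."""
    points = np.array([(lm.x, lm.y) for lm in landmarks])
    center = points[0]                                   # hand center (landmark 0)
    palm_height = np.linalg.norm(points[5] - points[0])  # used to scale coordinates
    tips = points[FINGER_TIPS].mean(axis=0)              # average fingertip position
    return center, palm_height, tips
```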

With the tracking data available for each frame, the next step is to map it to the pan and tilt actions, i.e., the four angles for the servo motors.

A servo motor can move between 0 and 180 degrees. I set the motors to point forward at 0 degrees, and whenever the program starts, the motors reset to this position. The tracking data from the first frame (td₀) serves as the reference point.

For each subsequent frame, we calculate the displacement along the x and y axes relative to the reference frame, dividing by the palm height to maintain scale invariance. The scaled displacement is clamped to a range of -3 to +3, normalized to a value between 0 and 1, and then converted into a corresponding angle between 0 and 180 degrees.

Here is a minimal sketch of that mapping, reusing the feature tuple from the earlier snippet (applied once per hand, it yields the four servo angles):
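
```python
import numpy as np

MAX_OFFSET = 3.0  # palm-heights of travel mapped onto the full servo range

def to_angles(td, td0):
    """Map one hand's tracking data to a (pan, tilt) angle pair.

    td and td0 are (center, palm_height, tips) tuples from extract_features;
    td0 is the reference frame captured at startup.
    """
    center, palm_height, _ = td
    center0, _, _ = td0
    # Displacement from the reference frame, in units of palm height.
    offset = (center - center0) / palm_height
    # Clamp to [-3, +3], normalize to [0, 1], then scale to 0-180 degrees.
    normalized = (np.clip(offset, -MAX_OFFSET, MAX_OFFSET) + MAX_OFFSET) / (2 * MAX_OFFSET)
    pan, tilt = (normalized * 180).astype(int)
    return int(pan), int(tilt)
```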


Then we put all this together and voila, we have Kakashi: The Copycat Robot!

PS: Feel free to check out the whole code on GitHub (I hope I get to clean it up someday).
