Posts

    Showing posts with label image processing. Show all posts

    Sunday, 22 September 2024

    Kakashi: The Copycat Robot

    In this post, I want to share "Kakashi: The Copycat Robot", a fun side project I built a few years ago. The name is inspired by the famous character from Naruto, Kakashi Hatake, also known as the Copycat Ninja.

    The goal of this robot was to mimic Kakashi's ability to copy movements—though, of course, in a more limited way. Check it out for yourself!

    Here are the things I used to build this:

    1. Arduino UNO board
    2. MAX7219 8x8 LED matrix
    3. 3D-printed pan and tilt brackets (2x)
    4. 4 servo motors
    5. Breadboard and jumper wires
    I will go through the build in the following sections:
    • The Sharingan
    • Pan and Tilt motion 
    • Controller - Serial bus
    • Tracking algorithm

    The Sharingan

    Of course, our Kakashi needs a Sharingan! For those unfamiliar, the Sharingan is the special eye that grants Kakashi his copycat abilities in Naruto.



    For this, I used a MAX7219 8x8 LED matrix. It has five pins, which I connected as follows:

    • VCC - connect to 5V
    • GND - connect to ground
    • DIN - data in (pin defined in ports.h)
    • CS - chip select (pin defined in ports.h)
    • CLK - clock (pin defined in ports.h)



    Then I found an LED matrix editor, which I used to create hex mappings of the Sharingan at different rotation angles, and wrote code that loops through them.
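    The Arduino sketch itself isn't reproduced here, but the idea can be sketched in Python: each 8x8 frame from the LED editor is eight row bytes, and the animation just cycles through the frames. The patterns below are made up for illustration, not the actual Sharingan mappings.

```python
# Each 8x8 frame is eight bytes; every byte encodes one row of the matrix.
# These example frames are illustrative placeholders, not the real patterns.
FRAMES = [
    [0x3C, 0x42, 0x99, 0xA5, 0xA5, 0x99, 0x42, 0x3C],  # pattern at angle 0
    [0x3C, 0x5A, 0x81, 0xA5, 0xA5, 0x81, 0x5A, 0x3C],  # rotated variant
]

def next_frame(index, frames=FRAMES):
    """Advance to the next frame, wrapping around to animate the rotation."""
    index = (index + 1) % len(frames)
    return frames[index], index
```

    On the Arduino side the same loop writes each row byte to the MAX7219 over DIN/CS/CLK.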


    Pan and Tilt

    Pan and tilt are two motions that, used in combination, can cover almost any movement.


    I used two of these to mimic arm movements. Each one is made up of a pan and tilt bracket, which you can either 3D print or purchase pre-made from Amazon. I attached two servo motors to each bracket. I won't go into assembly details, as there are plenty of great tutorials available on how to put one together. 

    Each servo motor has three pins: 5V power, ground, and control. I connected the four control cables to the following ports:


    I wrote a simple class to control the four servo motors and map them to pan and tilt actions.
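    A minimal sketch of what such a class could look like (the class name, channel layout, and 0-180 clamp are my assumptions, not the original code):

```python
class PanTiltPair:
    """Maps (pan, tilt) angles for two arms onto four servo channels.
    Channel ordering here is an assumption: [left_pan, left_tilt,
    right_pan, right_tilt]."""

    def __init__(self):
        self.angles = [0, 0, 0, 0]

    @staticmethod
    def _clamp(angle):
        # Hobby servos can only move between 0 and 180 degrees.
        return max(0, min(180, angle))

    def set_arm(self, arm, pan, tilt):
        """arm is 0 (left) or 1 (right); returns all four channel angles."""
        self.angles[2 * arm] = self._clamp(pan)
        self.angles[2 * arm + 1] = self._clamp(tilt)
        return self.angles
```

    Out-of-range requests are clamped rather than rejected, which keeps the arm from stalling against its mechanical limits.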

       

    Controller - Serial Bus

    The serial bus acts as a communication channel between my computer and the Arduino via a wired connection. I use it to control the 8x8 LED display and handle pan and tilt actions. This setup is flexible and has been useful in several other projects as well. 
    On the client side, I implemented a simple class that sends control messages. It also has the ability to record and play back actions, similar to how Kakashi copies techniques and reuses them.


    On the Arduino, I receive these messages and perform the corresponding action.


    Tracking Algorithm

       

    In this section, I'll explain how I mapped my real-world movements to control the robot's actions. There were three main requirements:

    1. Hand Tracking: The system needed to track my hand movements and map them to four angles, corresponding to the servo motors in the pan and tilt setup.

    2. Scale Invariance: It had to be scale-invariant, meaning I could start from any position and move freely, with the robot replicating the same actions regardless of where I started.

    3. Smooth Movements: The movements had to be smooth, taking into account the bandwidth limitations of the serial bus and the movement speed of the servo motors while being fault tolerant. 

    For hand tracking, I needed a model that could quickly provide hand landmarks while running efficiently on CPU/MPS (for Mac). Since high accuracy wasn't critical, I opted for the EfficientDet model via MediaPipe. You can find more details in the kakashi.py file.




    Once I have the hand landmarks, I extract three key pieces of information from each hand:

    1. Center of the Hand (landmark 0)
    2. Palm Height (difference between landmarks 5 and 0) — used to scale the coordinates.
    3. Average Position of Finger Tips (landmarks 4, 8, 12, 16, 20) — since not all fingers might always be visible.
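    Given the 21 landmarks as (x, y) pairs (indices follow MediaPipe's hand model), that extraction might look like this. The helper name is mine, not from the project:

```python
def extract_features(landmarks):
    """landmarks: list of 21 (x, y) tuples from the hand-tracking model."""
    center = landmarks[0]  # wrist landmark, used as the hand center
    # Palm height: distance between the index-finger base (5) and wrist (0).
    palm_height = ((landmarks[5][0] - landmarks[0][0]) ** 2 +
                   (landmarks[5][1] - landmarks[0][1]) ** 2) ** 0.5
    # Average of whichever finger tips (4, 8, 12, 16, 20) were detected,
    # since not all fingers might always be visible.
    tips = [landmarks[i] for i in (4, 8, 12, 16, 20) if landmarks[i] is not None]
    tip_avg = (sum(p[0] for p in tips) / len(tips),
               sum(p[1] for p in tips) / len(tips))
    return center, palm_height, tip_avg
```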

    With the tracking data available for each frame, the next step is to map it to the pan and tilt actions, i.e., the four angles for the servo motors.

    A servo motor can move between 0 and 180 degrees. I set the motors to point forward at 0 degrees, and whenever the program starts, the motors reset to this position. The tracking data from the first frame (td₀) serves as the reference point.

    For each subsequent frame, we calculate the distance along the x and y axes relative to the reference frame. This distance is scaled based on palm height to maintain scale invariance. After scaling, the distance is clipped to a range of -3 to +3, normalized to between 0 and 1, and then converted into a corresponding angle between 0 and 180 degrees.

    Here is the code that does this:
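    (The original snippet isn't reproduced here; the following is a sketch consistent with the description above, and the exact constants in the real code may differ.)

```python
def to_angle(delta, palm_height):
    """Map a displacement relative to the reference frame to a servo angle.
    delta: raw x or y distance from the first-frame tracking data (td0).
    palm_height: used to make the mapping scale-invariant."""
    scaled = delta / palm_height          # scale invariance
    scaled = max(-3.0, min(3.0, scaled))  # clip to [-3, +3]
    normalized = (scaled + 3.0) / 6.0     # normalize to [0, 1]
    return normalized * 180.0             # servo angle in [0, 180]
```

    With this mapping, zero displacement lands mid-range and the clip keeps wild frames (or tracking glitches) from slamming the servos to their limits.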


    Then we put all this together and voila, we have the Kakashi: The Copycat Robot! 

    PS: Feel free to check out the whole code on GitHub (I hope I get to clean it up someday).

    Saturday, 18 May 2013

    Jarvis, at your service


    Hello everyone! It's been a long time since I blogged. Today I will show you how to use 'Jarvis', an open source tool that lets you control your Linux system using hand motions and gestures. I built it as my Human-Computer Interaction project, and it is mainly an image-processing project developed in Python.



    Things you can do using Jarvis:


    1. The first thing you can do using Jarvis is control your mouse. You just need a colored object (preferably one whose color differs from its background). So you can do things like draw in the air!
    2. The second thing is that you can assign any gesture that is a combination of
      Left->Right, Right->Left, Top->Bottom, Bottom->Top to any command. So using this you can literally perform anything!!! The following are the things you can do by default:

    • Maximize/Minimize/Close current window
    • Go to the next and previous slide in a PPT presentation
    • Page up and Page down
    • Switch window (Alt+tab)
    • Take screenshot
    • Shutdown/Suspend system
    • Mute and unmute
    • Open Calculator, File manager, Gedit


    You can add practically anything else, too. All you need to know is the command that does it and the gesture you want to assign, expressed as a combination of Left->Right, Right->Left, Top->Bottom, and Bottom->Top.
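    To see how a path can be reduced to those four primitives, here is a simplified sketch (Jarvis's real matcher works on the full tracked path; the function name and coordinate convention, with y growing downward as on screens, are my assumptions):

```python
def classify_stroke(start, end):
    """Classify one stroke into a primitive direction by its dominant axis."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    if abs(dx) >= abs(dy):
        return "Left->Right" if dx > 0 else "Right->Left"
    return "Top->Bottom" if dy > 0 else "Bottom->Top"

# A gesture is then just the sequence of classified strokes, which gets
# looked up against the stored gesture-to-command mappings.
```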

    Getting dependencies:

    Ubuntu:
    $ sudo apt-get install python-opencv xdotool
    Fedora:
    $ sudo yum install python-opencv xdotool

    Installation:

    $ git clone https://github.com/alseambusher/jarvis.git
    $ cd jarvis
    $ ./install

    Now you need to set your screen resolution. If it is 1366x768, skip this step.
    $ gedit ~/.jarvis/config.py
    Change the value of the RESOLUTION variable to match your screen resolution.

    Running:

    Now simply open Jarvis from your Applications menu
    OR
    Do this:
    $ cd ~/.jarvis
    $ python main.py
    Now click on Start Jarvis 

    Add new gesture:

    1. Go to settings from the File menu


    2. Click on Add gesture from the File menu of settings


    3. Suppose you want to add a gesture that opens a terminal.
    Say the gesture you wish to assign is (Left to Right)->(Right to Left)->(Top to Bottom)->(Bottom to Top).
    The command to open GNOME Terminal is 'gnome-terminal'.

    4. Fill the details and save it.


    You are done!!!

    Editing and deleting gestures are simple :P

    How to use?

    You need two differently colored objects to run Jarvis: one is the tracker and the other is the flag!

    1. If the flag is not exposed, gestures are disabled and Jarvis works as a mouse controller.
    2. When both the tracker and the flag are exposed, the gesture begins. Perform the gesture using the tracker. Once the gesture is complete, hide the flag. Jarvis then processes and analyses the gesture performed and checks for a match in the existing database. If there is a match, it executes it!

    Customizing Tracker and Flag color:

    By default the tracker is yellow and the flag is blue.
    You can change this by editing the config.py file:

    $ gedit ~/.jarvis/config.py

    Change the MIN and MAX values of TRACKER_COLOR and GESTURE_COLOR to match the HSV range of the intended color.

    By default these are the values:
    TRACKER_COLOR={'MIN':[20,100,100],'MAX':[30,255,255]}
    GESTURE_COLOR={'MIN': [108.0, 100, 10],'MAX': [118.0, 255, 255]}
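    To see how these bounds act as a color filter, here is a tiny pure-Python equivalent of the per-pixel test (OpenCV's cv2.inRange applies the same check to every pixel of a frame; the helper name is mine):

```python
TRACKER_COLOR = {'MIN': [20, 100, 100], 'MAX': [30, 255, 255]}       # yellow
GESTURE_COLOR = {'MIN': [108.0, 100, 10], 'MAX': [118.0, 255, 255]}  # blue

def in_range(hsv_pixel, color):
    """True if an (H, S, V) pixel falls inside the configured bounds."""
    return all(lo <= v <= hi for v, lo, hi in
               zip(hsv_pixel, color['MIN'], color['MAX']))
```

    A pixel that passes the tracker test is treated as part of the tracker object; everything else is masked out before the position is computed.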

    Thank you. Don't forget to contribute to this open source project, as there is a lot of scope for improvement. :)

    Fork the project from here: https://github.com/alseambusher/jarvis