
Tuesday, 23 January 2018

Higher level ops for building neural network layers with deeplearn.js

I have been meddling with Google's deeplearn.js lately for fun. It is surprisingly good given how new the project is, and it seems to have a solid roadmap. However, it still lacks something like tf.layers and tf.contrib.layers, which provide many higher-level functions that have made using TensorFlow so easy. It looks like they will be added to Graph layers in the future, but the priority as of now is to fix the lower-level APIs first - which totally makes sense.

So, I quickly built equivalents of tf.layers.conv2d and tf.layers.flatten, which I will share in this post. I have kept them as close to the TensorFlow function definitions as possible.

1.  conv2d - Functional interface for the 2D convolution layer.

function conv2d(
  inputs: Tensor,
  filters: number,
  kernel_size: number,
  graph: Graph,
  strides: number = 1,
  padding = "valid",
  data_format = "channels_last",
  activation?,
  kernel_initializer: Initializer = new VarianceScalingInitializer(),
  bias_initializer: Initializer = new ZerosInitializer(),
  name: string = "")
Arguments:
  • inputs Tensor input.
  • filters Integer, the dimensionality of the output space (i.e. the number of filters in the convolution).
  • kernel_size Number specifying the height and width of the 2D convolution window.
  • graph Graph object.
  • strides Number specifying the stride of the convolution.
  • padding One of "valid" or "same" (case-insensitive).
  • data_format One of "channels_last" or "channels_first".
  • activation Optional activation function applied to the output of the layer. The function should accept a Tensor and a Graph as parameters.
  • kernel_initializer An initializer object for the convolution kernel.
  • bias_initializer An initializer object for the bias.
  • name String representing the name of the layer.
Returns:

Tensor output.

Usage:

// 32 5x5 filters
var network = conv2d(tensor, 32, 5, graph);
// 32 5x5 filters, stride 2, "same" padding with relu activation
var network = conv2d(tensor, 32, 5, graph, 2, "SAME", undefined, (layer, graph) => {return graph.relu(layer)});
// applying some kernel_initializer
var network = conv2d(x, 32, 5, g, undefined, undefined, undefined, undefined, new RandomUniformInitializer(0, 0.5));
Add this to your code:
function conv2d(
  inputs: Tensor,
  filters: number,
  kernel_size: number,
  graph: Graph,
  strides: number = 1,
  padding = "valid",
  data_format = "channels_last",
  activation?,
  kernel_initializer: Initializer = new VarianceScalingInitializer(),
  bias_initializer: Initializer = new ZerosInitializer(),
  name: string = "") {
  // get the number of input channels according to the data format
  const channel_axis = data_format == "channels_last" ? inputs.shape[2] : inputs.shape[0];
  // shape of the kernel that holds the filters
  const depthwise_kernel_shape = [kernel_size, kernel_size, channel_axis, filters];
  // create a new variable for the filter weights and apply the kernel initializer
  const weights = graph.variable(name + "w",
    kernel_initializer.initialize(depthwise_kernel_shape,
      kernel_size * kernel_size * channel_axis * filters, filters));
  // create a new variable for the bias and apply the bias initializer
  const bias = graph.variable(name + "b", bias_initializer.initialize([filters], kernel_size, filters));
  // call the actual conv2d op
  const layer = graph.conv2d(inputs, weights, bias, kernel_size, filters, strides,
    padding == "valid" || padding == "VALID" ? 0 : undefined);
  // return the tensor, applying the activation if defined
  return activation == undefined ? layer : activation(layer, graph);
}

2. flatten - Flattens an input tensor.

/**
* Flattens an input tensor.
* @param inputs Tensor input
* @param graph Graph object
*/
function flatten(inputs: Tensor, graph: Graph) {
  // reshape to a 1D tensor whose length is the product of all input dimensions
  return graph.reshape(inputs, [inputs.shape.reduce((size, dim) => size * dim, 1)]);
}

I wrote these snippets while building a tool with deeplearn.js that handles things like loading datasets, batching, saving checkpoints, and visualization. I will share more on that in future posts.

Thursday, 11 January 2018

Hacking FaceNet using Adversarial examples


With the rise in popularity of face recognition systems built with deep learning and their application in security/authentication, it is important to make sure they are not easy to fool. I recently finished the 4th course of deeplearning.ai, which has an assignment that asks us to build a face recognition system - FaceNet. While I was working on the assignment, I couldn't stop thinking about how easy it would be to fool it with adversarial examples. In this post I will tell you how I managed to do it.

First off, some basics about FaceNet. Unlike image recognition systems, which map every image to a class, face recognition cannot assign a class label to every face. For one, there are far too many faces a real-world system would have to handle to give each its own class; for another, the system could not deal with new people without retraining. So instead, we build a system that learns similarities and dissimilarities. There is a neural network similar to what we have in image recognition, but instead of applying a softmax at the end, we take the logits as an embedding of the input image and then minimize something called the triplet loss. Consider a face A with a positive match P and a negative match N. If f is the embedding function and L is the triplet loss, we have this:

L(A, P, N) = max(‖f(A) − f(P)‖² − ‖f(A) − f(N)‖² + α, 0)

where α is the margin enforced between positive and negative pairs.

Basically, it incentivizes a small distance between A and P and a large distance between A and N. Also, I really recommend watching Ian Goodfellow's lecture from Stanford's CS231n course if you want to learn about adversarial examples.
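To make this concrete, here is a minimal NumPy sketch of the triplet loss; the margin value and embedding sizes are illustrative, not the assignment's:

```python
import numpy as np

def triplet_loss(f_a, f_p, f_n, alpha=0.2):
    """Triplet loss over embeddings f(A), f(P), f(N) with margin alpha."""
    pos_dist = np.sum((f_a - f_p) ** 2)   # squared L2 distance anchor-positive
    neg_dist = np.sum((f_a - f_n) ** 2)   # squared L2 distance anchor-negative
    # hinge: zero loss once the negative is at least alpha farther than the positive
    return max(pos_dist - neg_dist + alpha, 0.0)
```

With identical anchor and positive embeddings and a distant negative the loss is zero; as the negative creeps inside the margin, the loss grows.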

Like I said earlier, this thought came to me while doing an assignment from the 4th course of deeplearning.ai, which can be found here, and I have built on top of it. The main idea is to find a small noise that, when added to someone's photo, causes virtually no visual change yet makes FaceNet identify them as the target.





[Figure: Benoit (attacker) + noise → identified as Kian, alongside the actual photo of Kian (target)]

First, let's load the images of the attacker, Benoit, and the target, Kian.


Now let A be the original attacker image, A` = A + noise the adversarial image, and T the target image. We want the triplet loss to achieve two things:

  1. Minimize the distance between A` and T
  2. Maximize the distance between A` and A (the original)

In other words, we reuse the triplet loss with A` as the anchor, T as the positive, and the original A as the negative:

L(anchor, positive, negative) = L(A`, T, A)

Now, let's compute the gradient of this loss with respect to the input image.



These gradients are used to update the adversarial noise as follows:

noise = noise - step_size * gradients
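This update loop can be sketched in NumPy. Everything here is a stand-in: a random linear map plays the role of FaceNet's embedding f, the "images" are random vectors, and for simplicity only the pull-toward-target term of the loss is used (its gradient has a closed form for a linear map); the sizes, step size, and iteration count are made up. In the real attack the gradients come from backpropagating the full triplet loss through the network, but the update rule is the same.

```python
import numpy as np

rng = np.random.default_rng(0)
D, E = 64, 16                                  # toy image / embedding sizes (assumptions)
W = rng.normal(size=(E, D)) / np.sqrt(D)       # toy stand-in for FaceNet's embedding f
embed = lambda x: W @ x

attacker = rng.random(D)                       # stand-in for Benoit's image
target_emb = embed(rng.random(D))              # stand-in for f(Kian)

noise = np.zeros(D)
step_size = 0.2
for _ in range(300):
    # gradient of ||f(A + noise) - f(T)||^2 with respect to the noise
    gradients = 2 * W.T @ (embed(attacker + noise) - target_emb)
    noise = noise - step_size * gradients

final_dist = np.linalg.norm(embed(attacker + noise) - target_emb)
```

After the loop, the embedding of the noised image has moved much closer to the target embedding than the original was.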

According to the assignment, an L2 distance between the embeddings of less than 0.7 indicates that two images are of the same person. So let's run the update until we get below that threshold.
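The verification check itself is just a thresholded embedding distance, along these lines (assuming 128-dimensional embeddings as in the assignment; the helper name is mine):

```python
import numpy as np

def is_same_person(emb1, emb2, threshold=0.7):
    """FaceNet-style verification: same person iff the embeddings are close in L2."""
    return float(np.linalg.norm(emb1 - emb2)) < threshold
```

The attack succeeds once is_same_person(f(A + noise), f(T)) becomes true.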



The distance decreases from 0.862257 to 0.485102, which is well below the 0.7 threshold.

[Figure: L2 distance between the embeddings of the attacker and the target]
This is impressive because all of this is done without visibly altering the image - just by adding a little calculated noise!



Also note that the L2 distances indicate the generated image is more Kian than Benoit, despite looking practically identical to Benoit. So there you go: adversarial example generation for FaceNet.