Monik's Blog https://monik.in Thoughts, tutorials and ideas Tue, 23 Feb 2021 23:46:47 +0000 en-US hourly 1 https://i0.wp.com/monik.in/wp-content/uploads/2021/02/cropped-android-chrome-512x512-1-1.png?fit=32%2C32&ssl=1 Monik's Blog https://monik.in 32 32 113098687 A quick guide to getting a Django – Uwsgi – Nginx server up on Ubuntu 16.04 (AWS EC2) https://monik.in/a-quick-guide-to-getting-a-django-uwsgi-nginx-server-up-on-ubuntu-16-04-aws-ec2/ https://monik.in/a-quick-guide-to-getting-a-django-uwsgi-nginx-server-up-on-ubuntu-16-04-aws-ec2/#comments Wed, 26 Jul 2017 18:16:09 +0000 http://monik.in/?p=325 Continue Reading >>]]>

I've wasted quite some time trying to get a server up for every new project; each time I end up reading multiple tutorials and picking up what works. I thought I'd write a simple guide and share it.

Ubuntu Stuff

ssh ubuntu@your-aws-instance-public-ip -i key.pem
cd ~

After logging in, let’s get all the required stuff installed.

sudo apt-get update
sudo apt-get install python-dev
sudo apt-get install python-pip

It is a good idea to work with a virtual environment. If you don’t know what it is, you can think of it as another shell inside the shell where we can install packages that won’t get installed system-wide. You can jump in and out of this virtualenv anytime.

mkdir our-project
cd our-project
pip install virtualenv

Let’s start the virtualenv and enter it.

virtualenv venv
source venv/bin/activate

Django Stuff

Once we're here, we need to get the Django files in place. For the sake of simplicity, I'll make a new Django project. If you have already set one up, skim through this step; you'd only have to install your requirements.

pip install django
django-admin.py startproject hello

If you had an existing project, you would be installing it from a requirements.txt file like this.

pip install -r requirements.txt

To test if this installation worked, we need to run the development server on port 8000 (make sure that this port is open to the public: you can do this by adding an inbound rule for port 8000 on the security group of your EC2 instance) and open it in the browser.

cd hello
python manage.py runserver 0.0.0.0:8000

Now if you fire up http://your-aws-instance-public-ip:8000, your Django project should load up.

Uwsgi Stuff

Now we've got a Django project that runs inside the virtualenv. We need uwsgi to serve Django to the web instead of the lightweight development server that we just ran using the manage.py runserver command. If the thought of running the runserver command inside a screen session passes your mind, drop it. The Django dev server is terribly lightweight, highly insecure, and absolutely cannot scale.

deactivate

We just came out of the virtualenv (notice that the prompt on your command screen changes) and we'll now install uwsgi system-wide, because we'll be running the server as the root user.

sudo pip install uwsgi

Let's run the server using uwsgi with the same config. This command does the same thing that manage.py runserver would do.

uwsgi --http :8000 --home PATH/TO/THE/VIRTUALENV --chdir /PATH/TO/THE/DJANGO/PROJECT/FOLDER/CONTAINING/MANAGE.PY/FILE -w YOUR-PROJECT-NAME.wsgi

So in our case, the command would be:

uwsgi --http :8000 --home /home/ubuntu/our-project/venv  --chdir /home/ubuntu/our-project/hello -w hello.wsgi

Now if you fire up http://your-aws-instance-public-ip:8000, your Django website should show up in the browser.

We need to run this in the 'background' (well, you could probably run it inside a screen session, but there are better ways), so that is what we're going to set up next.

The way we will do it is by using Ubuntu's systemd, which gets pid 1 (the first process to run after booting up) and is fully supported on versions 15.04 and beyond. We will let it initialise our uwsgi process.

To store our config options, we need to create an 'ini' file which will contain all the uwsgi config details (which virtualenv to use, where the home folder is, and the other arguments we passed while executing the command to run the server).

sudo mkdir /etc/uwsgi/sites
sudo vim /etc/uwsgi/sites/hello.ini

We’ll load the file with the config details.

[uwsgi]

# same paths as in the command above
chdir = /home/ubuntu/our-project/hello
home = /home/ubuntu/our-project/venv
module = hello.wsgi:application

master = true
# more processes, more computing power
processes = 5

# SOCKET_LOC
socket = /run/uwsgi/hello.sock
# user and user's group
chown-socket = ubuntu:www-data
chmod-socket = 660
# delete the socket after the process ends
vacuum = true
# kill and respawn a worker if a request takes more than 30 secs
harakiri = 30

Press ESC, type :wq and hit Enter to save the file. (Comments are kept on their own lines; trailing comments can end up inside the values.)

You'll notice that we did not mention a port like 8000 as we did before. We're going to route this via a socket file instead of a port, which is the preferred setup when the web server and uwsgi live on the same machine. Functionally there is no difference: whatever requests were routed to port 8000 will now go via the socket file.

You can test if this works by running the following command.

uwsgi --ini /etc/uwsgi/sites/hello.ini

If this works fine, you'll see a couple of log lines and a status message that 5 (or however many you configured) processes have been spawned.

Now, we need to let systemd (Ubuntu’s service manager) take care of this. So we will create a special service that will make sure our server is running.

Overall, this would be:

Ubuntu's SystemD --call-> Service we create --execute-> Uwsgi ini --run-> Our Django Project

sudo vim /etc/systemd/system/uwsgi.service

Paste the following stuff into it

[Unit]
Description=uWSGI Emperor service

[Service]
# make the folder where we'll store our socket file, with the right user/group permissions
ExecStartPre=/bin/bash -c 'mkdir -p /run/uwsgi; chown ubuntu:www-data /run/uwsgi'
# the command to execute on start
ExecStart=/usr/local/bin/uwsgi --emperor /etc/uwsgi/sites
# make sure the server keeps running
Restart=always
KillSignal=SIGQUIT
Type=notify
NotifyAccess=all

[Install]
WantedBy=multi-user.target
Press ESC, type :wq and hit Enter to save the file. (Note that systemd only allows comments on their own lines, so don't append them after directives.)

The gist of what we pasted is simple: the service will execute this line every time it comes up and make sure it stays up. You could even fire it up on the terminal to see that it runs the server for you. The only special thing here is --emperor. The emperor mode watches a particular folder (in our case, sites) for .ini files and fires up each of them (our hello.ini is sitting there), making it useful if we host multiple websites.

/usr/local/bin/uwsgi --emperor /etc/uwsgi/sites

Now let's tell systemd to run our service.

sudo systemctl restart uwsgi

If you want to make sure, you could run htop and check that the number of processes you wanted to spawn + 1 (for the master) are running (search for uwsgi).

So uwsgi is running. But we still need it to respond when an HTTP request comes in. For that, we're going to use Nginx.

Nginx Stuff

Nginx is a lightweight server and we’ll use it as a reverse proxy. What we’re trying to achieve is this:

WWW.DOMAIN.COM <--> NGINX <--Talk to the hello.sock --> UWSGI <--hello.wsgi--> DJANGO

You might wonder why we introduce Nginx in between instead of having uwsgi handle requests directly. You could let uwsgi run directly on port 80, but Nginx has many benefits (full discussion here) which make it desirable.

Let’s install Nginx.

sudo apt-get install nginx
sudo service nginx start

If you hit http://your-public-ec2-address, you will see an Nginx welcome page, because Nginx is listening on port 80 (the default HTTP port) according to its default configuration.

Nginx has two directories, sites-available and sites-enabled. Nginx looks for all conf files in the sites-enabled folder and configures the server accordingly. So let's create a conf file to connect the browser request to the uwsgi server we are running.

sudo vim /etc/nginx/sites-available/hello

Paste the following into it.

server {
    listen 80;
    # if you don't have a domain, put your EC2 public IP address here instead
    server_name yourdomain.com www.yourdomain.com;

    location = /favicon.ico { access_log off; log_not_found off; }
    client_max_body_size 20M;

    location / {
        include         uwsgi_params;
        uwsgi_pass      unix:/run/uwsgi/hello.sock; #SAME as #SOCKET_LOC in the hello.ini
    }
}

You can set server_name to a subdomain or to multiple domains where you want to serve the website. The uwsgi_pass must point to the socket file that our uwsgi ini file creates. You can configure the Nginx conf far more and add a lot of things; I've only added some basic stuff.

We need to add this to the sites-enabled directory for it to be picked up by Nginx. We can create a symlink to the file.

sudo ln -s /etc/nginx/sites-available/hello /etc/nginx/sites-enabled/

That’s all. Now restart nginx and you’re all set.

sudo service nginx restart

Now, if you have configured your domain and added the same domain to the nginx conf, your website should load on the domain; or if you added the IP, then on http://your-ec2-public-ip. If you're confused about pointing the domain to the address, read on.

Domain configuration stuff

This is fairly straightforward. You need to add a simple record to your DNS records. If you’ve bought your domain with popular domain name sellers like GoDaddy or Name.com, you’d have some panel to manage DNS settings.

Get to it and add an 'A record' with '@' or a blank host (and another record with 'www') and point it to your-ec2-public-ip, and everything should work! If you want to run a subdomain, then instead of '@', enter the subdomain name. This tells the DNS that in case anybody requests something at that URL, the request should be forwarded to the IP where we have our Nginx server listening and waiting to respond.

Django Config Stuff

You could run into 400 or 502 errors when trying to serve if you're running with DEBUG = False and have not set ALLOWED_HOSTS in settings.

You need to have allowed hosts configured to allow those domains. You could allow everything,

ALLOWED_HOSTS = ['*'] # in case you want to allow every host, but this may turn out to be unsafe

Or allow the domains we configured in the nginx conf,

ALLOWED_HOSTS = ['yourdomain.com', 'www.yourdomain.com', 'your-ec2-public-ip']

And finally, we're live with our Django website.

Reference
1. DigitalOcean

A noob's guide to implementing RNN-LSTM using Tensorflow
Sun, 19 Jun 2016

The purpose of this tutorial is to help anybody write their first RNN-LSTM model without much background in Artificial Neural Networks or Machine Learning. The discussion is not centered around the theory or working of such networks but on writing code to solve a particular problem. We will see how neural networks let us solve some problems effortlessly, and how they can be applied to a multitude of other problems.

What are RNNs?

Simple multi-layered neural networks are classifiers which, given a certain input, tag it as belonging to one of many classes. They are trained using the existing backpropagation algorithms. These networks are great at what they do, but they are not capable of handling inputs which come in a sequence. For example, for a neural net to identify the nouns in a sentence, having just the word as input is not helpful at all. A lot of information is present in the context of the word, which can only be determined by looking at the words near it. The entire sequence has to be studied to determine the output. This is where Recurrent Neural Networks (RNNs) find their use. As the RNN traverses the input sequence, the output for every input also becomes a part of the input for the next item of the sequence. You can read more about the utility of RNNs in Andrej Karpathy's brilliant blog post. Note the 'recurrent' property of the network: the previous output becomes part of the current input, which comprises the current item in the sequence and the last output. Done over and over, the last output is the result of all the previous inputs and the last input.
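The recurrence is easy to sketch in plain Python. The scalar weights and tanh squashing below are toy stand-ins (a real RNN learns vector-valued weights), but the shape of the computation is the same: the previous output is folded back in with the next item of the sequence.

```python
import math

def rnn_step(h, x, w_h=0.5, w_x=1.0):
    """One recurrent step: mix the previous output h with the current input x."""
    return math.tanh(w_h * h + w_x * x)

h = 0.0  # initial state
for x in [1, 0, 1, 1]:  # a short input sequence
    h = rnn_step(h, x)  # h now summarises everything seen so far
print(h)
```

By the end of the loop, h depends on every item that came before it, which is exactly the property that lets RNNs use context.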

What is LSTM?

RNNs are very apt for sequence classification problems, and the reason they're so good at this is that they're able to retain important data from previous inputs and use that information to modify the current output. If the sequences are quite long, the gradients (values calculated to tune the network) computed during training (backpropagation) either vanish (multiplication of many values between 0 and 1) or explode (multiplication of many large values), causing the network to train very slowly.
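A toy numeric illustration (not an actual gradient computation) of why long chains of multiplications misbehave:

```python
vanish, explode = 1.0, 1.0
for _ in range(50):  # a 50-step-long sequence
    vanish *= 0.5    # stand-in for a per-step gradient factor below 1
    explode *= 1.5   # stand-in for a per-step gradient factor above 1
print(vanish)   # ~8.9e-16: the signal all but disappears
print(explode)  # ~6.4e+08: the signal blows up
```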

Long Short Term Memory (LSTM) is an RNN architecture that addresses the problem of training over long sequences and retaining memory. LSTMs solve the gradient problem by introducing a few more gates that control access to the cell state. Colah's blog post is a great place to understand the working of LSTMs. If you didn't get what is being discussed, that's fine; you can safely move to the next part.

The task

Given a binary string (a string with just 0s and 1s) of length 20, we need to determine the count of 1s in it. For example, "01010010011011100110" has 10 ones. So the input to our program will be a string of length twenty that contains 0s and 1s, and the output must be a single number between 0 and 20 representing the number of ones in the string. Here is a link to the complete gist, in case you just want to jump to the code.

Even an amateur programmer can’t help but giggle at the task definition. It won’t take anybody more than a minute to execute this program and get the correct output on every input (0% error).

count = 0
for i in input_string:
    if i == '1':
        count+=1

Anybody in their right mind would wonder, if it is so easy, why the hell can’t a computer figure it out by itself? Computers aren’t that smart without a human instructor. Computers need to be given precise instructions and the ‘thinking’ has to be done by the human issuing the commands. Machines can repeat the most complicated calculations a gazillion times over but they still fail miserably at things humans do painlessly, like recognizing cats in a picture.

What we plan to do is feed the neural network enough input data and tell it the correct output values for those inputs. After that, we will give it input that it has not seen before and see how many of those the program gets right.

Generating the training input data

Each input is a binary string of length twenty. We will represent it as a python list of 0s and 1s. The input set used for training will contain many such lists.

import numpy as np
from random import shuffle

train_input = ['{0:020b}'.format(i) for i in range(2**20)]
shuffle(train_input)
train_input = [map(int,i) for i in train_input]
ti  = []
for i in train_input:
    temp_list = []
    for j in i:
            temp_list.append([j])
    ti.append(np.array(temp_list))
train_input = ti

There can be a total of 2^20 ≈ 10^6 combinations of 1s and 0s in a string of length 20. We generate a list of all 2^20 numbers, convert each to its binary string, and shuffle the entire list. Each binary string is then converted to a list of 0s and 1s. Tensorflow requires input as a tensor (a Tensorflow variable) of dimensions [batch_size, sequence_length, input_dimension] (a 3d variable). In our case, batch_size is something we'll determine later, but sequence_length is fixed at 20 and input_dimension is 1 (i.e. each individual bit of the string). Each bit will actually be represented as a list containing just that bit. A list of 20 such lists forms a sequence, which we convert to a numpy array. A list of all such sequences is the value of train_input that we're trying to compute. If you print the first few values of train_input, it would look like

[
 array([[0],[0],[1],[0],[0],[1],[0],[1],[1],[0],[0],[0],[1],[1],[1],[1],[1],[1],[0],[0]]), 
 array([[1],[1],[0],[0],[0],[0],[1],[1],[1],[1],[1],[0],[0],[1],[0],[0],[0],[1],[0],[1]]), 
 .....
]

Don't worry if the values don't match yours; they will be different because they are in random order.
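To see the nesting without numpy or Tensorflow, here is a scaled-down sketch of the same shaping, using length-4 strings instead of 20:

```python
from random import shuffle

seq_len = 4  # shortened from 20 purely for illustration
inputs = ['{0:04b}'.format(i) for i in range(2 ** seq_len)]
shuffle(inputs)
# each bit becomes a one-element list; each string becomes a list of 4 such lists
train = [[[int(bit)] for bit in s] for s in inputs]
print((len(train), len(train[0]), len(train[0][0])))  # (16, 4, 1)
```

The printed triple mirrors the [batch_size, sequence_length, input_dimension] layout the real code builds.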

Generating the training output data

For every sequence, the result can be anything between 0 and 20, so we have 21 choices per sequence. Clearly, our task is a sequence classification problem. Each sequence belongs to the class number that equals the count of ones in it. The output is represented as a list of length 21 with zeros at all positions except for a one at the index of the class to which the sequence belongs.

[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
 0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20

This is a sample output for a sequence belonging to the 4th class, i.e. one with 4 ones.

More formally, this is called the one hot encoded representation.

train_output = []

for i in train_input:
    count = 0
    for j in i:
        if j[0] == 1:
            count+=1
    temp_list = ([0]*21)
    temp_list[count]=1
    train_output.append(temp_list)

For every training input sequence, we generate an equivalent one hot encoded output representation.
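The one-hot step can also be phrased as a small helper function (illustrative only, not part of the original listing):

```python
def one_hot(count, num_classes=21):
    """Return a list of num_classes zeros with a single 1 at index count."""
    encoding = [0] * num_classes
    encoding[count] = 1
    return encoding

print(one_hot(4))  # a 1 at index 4, zeros everywhere else
```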

Generating the test data

For any supervised machine learning task, we need some data as training data to teach our program to identify the correct outputs, and some data as test data to check how our program performs on inputs that it hasn't seen before. Letting test and training data overlap is self-defeating: if you had already practiced the questions that were to come in your exam, you would most definitely ace it. Currently our train_input and train_output hold 2^20 (1,048,576) unique examples. We will split those into two sets, one for training and the other for testing. We will take 10,000 examples (about 1% of the entire data) from the dataset to use as training data, and use the remaining 1,038,576 examples as test data.

NUM_EXAMPLES = 10000
test_input = train_input[NUM_EXAMPLES:] 
test_output = train_output[NUM_EXAMPLES:] #everything beyond 10,000

train_input = train_input[:NUM_EXAMPLES]
train_output = train_output[:NUM_EXAMPLES] #till 10,000

Designing the model

This is the most important part of the tutorial. Tensorflow and various other libraries (Theano, Torch, PyBrain) provide tools to design the model without getting into the nitty-gritty of implementing the neural network, the optimization, or the backpropagation algorithm.

Danijar outlines a great way to organize Tensorflow models which you might want to use later to organize and tidy up your code. For the purpose of this tutorial, we will skip that and focus on writing code that just works.

Import the required packages to begin with. If you haven’t already installed Tensorflow, follow the instructions on this page and then continue.

import tensorflow as tf

After importing tensorflow, we will define two variables that will hold the input data and the target data.

data = tf.placeholder(tf.float32, [None, 20,1]) 
target = tf.placeholder(tf.float32, [None, 21])

The dimensions for data are [Batch Size, Sequence Length, Input Dimension]. We let the batch size be unknown, to be determined at runtime. Target will hold the training output data, i.e. the correct results that we desire. We've made Tensorflow placeholders, which are basically just what they sound like: placeholders that will be supplied with data later.

Now we will create the RNN cell. Tensorflow provides support for LSTM, GRU (slightly different architecture than LSTM) and simple RNN cells. We’re going to use LSTM for this task.

num_hidden = 24
cell = tf.nn.rnn_cell.LSTMCell(num_hidden,state_is_tuple=True)

For each LSTM cell that we initialise, we need to supply a value for the hidden dimension, or, as some people call it, the number of units in the LSTM cell. The value is up to you: too high a value may lead to overfitting, and a very low value may yield extremely poor results. As many experts have put it, selecting the right parameters is more of an art than a science.

Before we write any more code, it is imperative to understand how Tensorflow computation graphs work. From a hacker perspective, it is enough to think of it as having two phases. The first phase is building the computation graph where you define all the calculations and functions that you will execute during runtime. The second phase is the execution phase where a Tensorflow session is created and the graph that was defined earlier is executed with the data we supply.

val, state = tf.nn.dynamic_rnn(cell, data, dtype=tf.float32)

We unroll the network and pass the data to it and store the output in val. We also get the state at the end of the dynamic run as a return value but we discard it because every time we look at a new sequence, the state becomes irrelevant for us. Please note, writing this line of code doesn’t mean it is executed. We’re still in the first phase of designing the model. Think of these as functions that are stored in variables which will be invoked when we start a session.

val = tf.transpose(val, [1, 0, 2])
last = tf.gather(val, int(val.get_shape()[0]) - 1)

We transpose the output to switch batch size with sequence size. After that, we take the value of the output only at the sequence's last input, which means that in a string of 20 we're only interested in the output we got at the 20th character; the output for the previous characters is irrelevant here.

weight = tf.Variable(tf.truncated_normal([num_hidden, int(target.get_shape()[1])]))
bias = tf.Variable(tf.constant(0.1, shape=[target.get_shape()[1]]))

What we want to do is apply the final transformation to the outputs of the LSTM and map it to the 21 output classes. We define weights and biases, multiply the output with the weights, and add the bias values to it. The dimension of the weights will be num_hidden X number_of_classes. Thus on multiplication with the output (last), the resulting dimension will be batch_size X number_of_classes, which is what we are looking for.

prediction = tf.nn.softmax(tf.matmul(last, weight) + bias)

After multiplying the output with the weights and adding the bias, we will have a matrix with a variety of values for each class. What we are interested in is the probability score for each class, i.e. the chance that the sequence belongs to that class. We then apply the softmax activation to get the probability scores.

What is this function and why are we using it?

softmax(x)_i = exp(x_i) / Σ_j exp(x_j)

The softmax function takes in a vector of values and returns a probability distribution over the indices, depending on the values. It returns probability scores (all the values sum to one), which is the final output that we need. If you want to learn more about softmax, head over to this link.
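A minimal pure-Python softmax demonstrates both properties: the largest score gets the largest probability, and the outputs sum to one.

```python
import math

def softmax(values):
    # subtracting the max first would be more numerically stable;
    # skipped here to keep the sketch minimal
    exps = [math.exp(v) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)       # largest score -> largest probability
print(sum(probs))  # sums to one (up to float rounding)
```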

cross_entropy = -tf.reduce_sum(target * tf.log(tf.clip_by_value(prediction,1e-10,1.0)))

The next step is to calculate the loss or, in less technical words, our degree of incorrectness. We calculate the cross entropy loss (more details here) and use that as our cost function. The cost function will help us determine how poorly or how well our predictions stack against the actual results. This is the function that we are trying to minimize. If you don't want to delve into the technical details, it is okay to just understand what the cross entropy loss is calculating. The log term helps us measure the degree to which the network got it right or wrong. Say for example, if the target was 1 and the prediction is close to one, our loss would not be much because the value of -log(x) is almost 0 when x nears 1. For the same target, if the prediction was 0, the cost would increase by a huge amount because -log(x) is very high when x is close to zero. The log term thus penalizes the model heavily when it is terribly wrong and very little when the prediction is close to the target. The last step in model design is to prepare the optimization function.
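The asymmetry of the log penalty described above is easy to check numerically:

```python
import math

# confident and correct: the loss is tiny
good = -math.log(0.99)
# confident and wrong: the loss blows up
bad = -math.log(0.01)
print(good)  # ~0.01
print(bad)   # ~4.6
```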

optimizer = tf.train.AdamOptimizer()
minimize = optimizer.minimize(cross_entropy)

Tensorflow has a few optimization functions like RMSPropOptimizer, AdaGradOptimizer, etc. We choose AdamOptimizer and set minimize to the operation that will minimize the cross_entropy loss we calculated previously.

Calculating the error on test data

mistakes = tf.not_equal(tf.argmax(target, 1), tf.argmax(prediction, 1))
error = tf.reduce_mean(tf.cast(mistakes, tf.float32))

This error is a count of how many sequences in the test dataset were classified incorrectly. This gives us an idea of the correctness of the model on the test dataset.
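The same argmax-and-count logic in plain Python, on a hypothetical mini-batch of three predictions over three classes instead of 21:

```python
def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])

# hypothetical one-hot targets and softmax-style predictions
targets = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
predictions = [[0.1, 0.8, 0.1], [0.2, 0.7, 0.1], [0.1, 0.2, 0.7]]

# a mistake is wherever the predicted class differs from the target class
mistakes = [argmax(t) != argmax(p) for t, p in zip(targets, predictions)]
error = sum(mistakes) / float(len(mistakes))
print(error)  # one of three wrong -> ~0.33
```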

Execution of the graph

We’re done with designing the model. Now the model is to be executed!

init_op = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init_op)

We start a session and initialize all the variables that we’ve defined. After that, we begin our training process.

batch_size = 1000
no_of_batches = int(len(train_input)/batch_size)
epoch = 5000
for i in range(epoch):
    ptr = 0
    for j in range(no_of_batches):
        inp, out = train_input[ptr:ptr+batch_size], train_output[ptr:ptr+batch_size]
        ptr+=batch_size
        sess.run(minimize,{data: inp, target: out})
    print "Epoch - ",str(i)
incorrect = sess.run(error,{data: test_input, target: test_output})
print('Epoch {:2d} error {:3.1f}%'.format(i + 1, 100 * incorrect))
sess.close()

We decide the batch size and divide the training data accordingly. I’ve fixed the batch size at 1000 but you would want to experiment by changing it to see how it impacts your results and training time.

If you are familiar with stochastic gradient descent, this idea would seem fairly simple. Instead of updating the values after running through all the training samples, we break the training set into smaller batches and run it for those. After processing each batch, the values of the network are tuned. So every few steps, the network weights are adjusted. Stochastic optimization methods are known to perform better than their counterparts for certain functions, because they converge much faster, though this may not always be the case.
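Stripped of Tensorflow, the batching itself is just pointer arithmetic over the training list:

```python
train = list(range(10))  # stand-in for 10 training examples
batch_size = 3
no_of_batches = len(train) // batch_size  # 3; the leftover example is dropped

ptr = 0
batches = []
for _ in range(no_of_batches):
    batches.append(train[ptr:ptr + batch_size])
    ptr += batch_size

print(batches)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
```

Note that the integer division drops the leftover example, just as the original loop does when batch_size doesn't evenly divide the training set.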

For every batch, we get the input and output data and run minimize, the optimizer function, to minimize the cost. All the calculation of prediction, cost, and backpropagation is done by tensorflow. We pass the feed_dict to sess.run along with the operation. The feed_dict is a way of assigning data to the tensorflow placeholders for that run. So we pass the input data along with the target (correct) outputs. The operations that we defined above are now being executed.

That’s all. We’ve made our toy LSTM-RNN that learns to count just by looking at correct examples! This wasn’t very intuitive to me when I trained it for the first time, so I added this line of code below the error calculation that would print the result for a particular example.

    print sess.run(prediction,{data: [[[1],[0],[0],[1],[1],[0],[1],[1],[1],[0],[1],[0],[0],[1],[1],[0],[1],[1],[1],[0]]]})

So as the model trains, you will notice how the probability score at the correct index in the list gradually increases. Here’s a link to the complete gist of the code.

Concerns regarding the training data

Many would ask: why use a training set which is just 1% of all the data? Well, to be able to train on a CPU with a single core, a larger set would increase the training time enormously. You could of course adjust the batch size to keep the number of updates the same, but the final decision is always up to the model designer. Despite everything, you will be surprised when you realize that 1% of the data was enough for the network to achieve stellar results!

Tinkering with the model

You can try changing the parameter values to see how they affect the performance and training time. You can also try adding multiple layers to the RNN to make your model more complex and enable it to learn more features. An important feature you can implement is the ability to save the model values every few iterations and retrieve them to perform predictions in the future. You could also change the cell from LSTM to GRU or a simple RNN cell and compare the performance.

Results

Training the model with 10,000 sequences, a batch size of 1,000, and 5,000 epochs on a MacBook Pro (8GB, 2.4GHz i5, no GPU) took me about 3-4 hours. And now, the answer to the question everybody is waiting for: how well did it perform?

Epoch 5000 error 0.1%

For the final epoch, the error rate is 0.1% across the entire test dataset (which is almost all the data, since our test set covers 99% of all possible combinations)! This is pretty close to what somebody with the least programming skills could achieve with a simple loop (0% error). But our neural network figured it out by itself! We did not instruct it to perform any counting operations.

If you want to speed up the process, you could try reducing the length of the binary string and adjusting the values elsewhere in the code to make it work.

What can you do now?

Now that you've implemented your LSTM model, what else is there that you can do? Sequence classification can be applied to a lot of different problems, like handwritten digit recognition or even autonomous driving! Think of the rows of an image as individual steps or inputs, and the entire image as the sequence. You must classify the image as belonging to one of the classes, which could be halt, accelerate, turn left, turn right, or continue at the same speed. Training data could be a blocker but, hell, you could even generate it yourself. There is so much more waiting to be done!

*The post has been updated to be compatible with Tensorflow version 0.9 and above.

Jnana : Hyperfocus Education
Mon, 24 Dec 2012

This is a project that I've initiated at my school, and I would like to share it with all of you here. I'm embedding the flickr photo library for you to have a look at the pictures and videos. In case it does not work, you can check the album on flickr here. Below, I've put up a formal report on the project. Let me know if you have any ideas or suggestions!

Project Profile

The definition of underprivileged: "Not enjoying the same standard of living or rights as the majority of people in a society." (Ref: Oxford Dictionary). I chose to help "underprivileged" children because they will become the forerunners of our civilization.

Our country, like the universe, is far from the goal of equality and symmetry. Children are denied quality education as well as fair treatment in education centers. Although the government has made primary education compulsory in the country, the impact has not been significant due to drawbacks in the system at all levels.

The foundations of the entire human civilization rest on the fundamentals, the same that a generation learns during their early part of life. What happens when the system developed to impart knowledge fails miserably when it has to overcome obstacles like wealth and intelligence inequality?

The equilibrium of the world is gradually disturbed. Enter, “Jnana : HyperFocus Education”.

What one frequently sees in many charitable organisations is that they help a child for a day or two and then move on. This is something that we did not want to do. Our main aim in this project was to help the children in a wholesome manner by giving them the personal attention that they so desperately needed but evidently lacked at home and at school. We wanted them to think of us as family.

This is why, at the beginning of our project, we decided to take on only a few children so that we could give them individual attention and form a bond with them. Our intention was not only to become their teachers but also their mentors, guides and friends because we believe that in today’s busy, hurried world, a few words of guidance, encouragement and a pat on the back go a long, long way.

Chapter 2: DESIGN OF THE STUDY 

Issue 

Quality schools are essential, as they foster a population capable of taking advantage of opportunities created by increased demand. Unfortunately, primary and secondary schools in India are unequipped to do so as they are rife with corruption. The most pervasive and detrimental form of corruption perpetrated on the primary and secondary school system is teacher absenteeism at government-run schools, with about 13 percent of teachers failing to show up for work, yet still being paid.

Indian primary and secondary schools suffer from the additional weaknesses of infrastructure limitations and inefficiency. Inefficient teaching methods which focus on memorization as opposed to critical reasoning are also widespread at the primary and secondary school level. Studies by the Programme for International Student Assessment (PISA), an OECD initiative, and Wipro, an Indian IT services firm, found that students at the primary and secondary school level have regressed in math, science, and reading literacy in recent years.

The day we decided to begin the project, we visited Mahadevi Podar Prathmik Vidyalay, the school whose students we intended to assist. The condition of teaching in the school was worse than what we had anticipated. The average teacher-to-student ratio was about one to sixty, and some classes had no teacher at all. In the classes where there were teachers, none of the students seemed to be paying attention.

That day, when we began teaching the students, we realised the extent of damage that this sort of school environment had done to their minds. We asked some of the children to write essays and, after we had corrected their grammar (which was worse than we had expected), we asked a few of them to read their work out loud. They refused to do so because they were too embarrassed and ashamed of their work. There were many such problems that we faced with the children.

The students’ fear of asking questions was also a major problem that we had to overcome. The kids were ashamed of speaking up and asking things. The only probable reason for this was that they were punished for asking questions or expressing opinions.

Hence a system needs to be created which can expand the student’s horizons, enabling him to analyse a situation and harness an effective solution with the resources available to him.

When asked to write an English story, one of our students wrote:

DO BHAI THE EK THORA MAD THA AUR EK THIK THA EK DIN DO NO KAM KI TALASH ME BHAHAR NICLA USE EK OPHIC DIKHAEE DIYA DO JAN UNDAR GAE USE EK MAN MILA USNE DONO BHAI KO KAM PAR RAKH LIY EK BHAI JO MAD THA BHABARA PATU THA EK DI JAB KANA KANE BAITHA TO DAS BARAH JAN KA KHAN EK SATH KALIYA US KE MALIK PRESANHO GAAE AUR BOLE KI JIT BARAPLAIT KAL SE TUM LAOGE UTNAI KHANA TUME MILEGA DUSRE DIN BHA EK BANANA KA LIVIS LE AAYA AUR PURE JAN KA KHAN KHA LI YA US KE MALIK NEKHAHA KI KAL SE TUM MERE SATH MERE OPISH CLNA AUR RASTE ME JO CHIJ GIRE USE UTHA LENA JAB DUS RE DIN DONNO JAN OPHISGANE LGETO US KE MALIK HOURSPAR BAITH GAAE AUR BHA HOURS KE PICHE CHLNE LAGA AUR THORI DER BAD HORSE NE TOILETKIYA BHA USE UTHA KARBAG ME RAKH LIYA JAB UUKE MALIK NE USSE BAG MAGA TO USNE GAB BHA BAG DIYATO SARE FILI BHIGE HUAE THE  USNE MALIK SE KAHA ”

Objective 

The objective of my project is to bridge the gap created between students by an inefficient education system. Jnana aims to create a unique support system that integrates ‘hyper focus’ mentorship with practical exercises to nurture the child’s thinking capability. It seeks to motivate students in the age group of 12–14 years to overcome their weaknesses in basic subjects like Mathematics and English. Through our initiative we strive to make the students aware of the various technological advancements that surround them and enable them to adapt to such facilities. This would optimise the students’ ability to utilise available resources.

Jnana attempts to build the self-confidence of students and help them express their opinions effectively. Through this interactive mode of teaching, we hope to eradicate the inequality in the quality of teaching across schools and create analytical minds equipped to solve problems.

Methodology 

We had to make an impact on the minds of our students and make them receptive to a new approach towards education.

From our analysis, we inferred that Maths and English are the most vital subjects for a student. According to a survey by the College Board, students who performed well in those two subjects tended to excel at other subjects too. We designed our module around these subjects and chose the computer as the medium for delivering the courses.

We began looking for children who had not been as fortunate as we had. As we spoke to the “Moushis” and “Bais” at my school, we learnt that they sent their children to our neighbouring schools, where the fees were almost waived. Our research on these neighbouring schools shocked us, and we set out on a mission to make a change.

Fortunate children have the means to access quality education even outside of school, while the students we found had access to neither. It is our hyperfocused approach to selecting students, as well as mentoring them, which makes the program unique.

Implementation 

Every student in the class enjoyed the sessions as much as we, the teachers, did.

One hilarious moment I recall occurred during an English writing class. The students had been given 5 minutes and a 100-word limit on their computers to type about any topic. One of the students chose to write about the Olympics. His post was purely informative. One of my team members and I overheard a conversation between that student and his neighbour.

“Abbey, Aei story hai, moral toh likhna padega!” (Hindi – English : Hey, this is a story, you’ll have to write a moral!), to which the author replied, “Olympics hai, iska kya moral likhu?” (Hindi – English : It’s the Olympics, what moral do I write for it?).

Another moment that I clearly remember is when my team and I tried to make the “silent girl” of the class read out her story. She was so afraid that she grabbed the mouse and started deleting what she had written. After six attempts from me and my team members, she finally agreed to read it out in front of the class.

When we were teaching maths, we asked a few simple questions, such as continuing the sequence “21, 22, 23, …”. When I asked what “20” was, all I got was a stare. Tall buildings on weak foundations were not going to last long. This assured our team that we were on the right track, and it inspired us to do whatever we could to repair the damage done by a faulty system.

Chapter 3: DETAILS AND ANALYSIS

Activities and Visits

My team and I undertook various activities, such as reviewing schools and analysing the faults in their teaching systems. For this we visited our neighbouring government-unaided schools, i.e., Mahadevi Podar Prathmik Vidhyalaya and Podar School SSC. Through these visits we concluded that these schools lacked a wholesome system of education in which the child’s need for attention is catered to.

We also visited Bapsai village, 70 km from Mumbai, to help the village panchayat in the upkeep of their primary school and anganwadis. We volunteered to paint their school, provided the students with stationery required for their term, and also interacted with the students of the primary school.

My team and I also educated the children about the technology that is available to them. We demonstrated the basic utilities of a computer and the internet. This was done to create a level of awareness in the students’ mind and expose them to resources present in the world around them.


Project Module

After these experiences, my team and I chalked out a regular schedule for teaching the students. We invited the students to my school for a period of approximately five weeks.

We requested the schools to send us 18 children (7th Grade) from the English-medium school and 11 children (9th Grade) from the Hindi-medium school, and asked that they send us students who performed poorly in academics.

During the vacations, i.e., till the 15th of November, we conducted two classes a week each for the batches of students from the English-medium and the Hindi-medium schools. These two-hour sessions included activities planned for two subjects along with a doubt session for their school curriculum. Post the 15th of November, the frequency of classes was reduced to one class per batch per week.

Subject-wise activity

Maths:

  • The topics that the students found difficult were covered. The fundamental concepts of these topics including problems from their curriculum were explained.
  • Many conceptual ideas beyond the scope of their syllabus were introduced to them and explained with examples from daily life.
  • Tips and tricks were taught to the students to make problem solving easier and more efficient.
  • A variety of problems was solved and corrected in class, and sheets for homework were given to each child.
  • Reviews of their school tests and exams, along with solving their doubts from those tests, were taken up.
  • At the end of each session two problems involving all the concepts learnt were given to the children, who solved and corrected the problems themselves.

English

  • The students were given grammar exercises and sheets to enhance their writing abilities. They were taught basic principles and rules of English to help them frame grammatically correct sentences.
  • Sessions were conducted to teach them the effective and contextual use of words. Spelling sessions were conducted to point out the common errors made and help them overcome them.
  • Real-life conversations were simulated to illustrate the efficient usage of English. The students were asked to identify the mistakes made by the speakers and correct them to gain a better understanding of the subject.
  • The students were assigned 5 minutes to write 100 words on a topic of their choice every session and then read it out to the group. A mentor would help them correct the mistakes and teach them ways to express themselves better while writing.
  • Tips about public speaking, which included body language, voice modulation and eye contact, were given to the students. The students were asked to speak about a topic in front of the group to overcome their lack of self-confidence.
  • The students were made to read lessons from their school books as well as articles collected by us to enhance their reading skills and encourage them to read more often.

Computer

  • The students were taught fundamentals of the computer, such as searching for a file, changing the screensaver, etc.
  • The students were taught to use Microsoft Word and PowerPoint.
  • They were introduced to the internet and made aware of the various benefits of the web.
  • Sessions teaching the students how to search for text and images on Google for their projects were conducted. The students were also shown videos on various science topics on the internet.
  • They were shown videos on astronomy, a subject that they were not familiar with.
  • The students were also taught how to use Paint on the computer.
  • Since some students came from the Hindi-medium school, sessions to familiarize them with technical English terms were organised.

Strengths

  • Addresses the educational and social needs of the students through the one to one mentorship program.
  • Makes it easier to track the progress of the student and identify the weakness, strength and potential of each student.
  • Communication gap between the teacher and student is overcome as the teacher is also a student.
  • Implementation of this project in other schools is quicker as not many resources are required to set up such centres. The project can mobilise mass action and expand to many schools quickly.
  • Unifies a large number of individuals (students) to teach the less fortunate children and impart holistic education to them.
  • It is a supplementary course to the school curriculum, hence it enhances the child’s understanding capability.
  • It is a self-sustaining structure, as the alumni of this project would join and help other students as well.

Weaknesses

  • It is a challenge to maintain the quality of education imparted to children while working on the overall development of the child’s personality.
  • A lot of time is spent in making the child receptive to our style of teaching. Hence, time which could have been utilised to correct the fundamentals of the child and teach him is lost.
  • The child is often ignorant about the things beyond the syllabus and a lot of time is spent on making the child aware of the various options around him.
  • A fear of asking questions and getting doubts cleared often holds back a child who could learn more.


Opportunities 

  • This project is capable of expanding to schools with a low teacher-to-student ratio, where our one-to-one mentorship could attend to the needs of children.
  • The project can be extended to rural areas where, due to a lack of teachers, students remain ignorant about various subjects.
  • The principle of our project can be introduced in any school to develop hands-on experience in teaching as well as learning. Interaction among students would help break inhibitions and make the students comfortable getting their questions answered.
  • This project could also benefit students who receive no schooling at all. Regular classes could be conducted by students to teach them the basics of subjects they would require in life.
  • This project could be extended to teaching adults who were unable to receive quality education in their lifetime.

Threats

  • If this project is unable to inspire other students to teach, the idea of expanding to schools could be threatened. If the project does not gain popularity, mobilising mass action would be difficult.
  • Sustaining quality education after expansion would be difficult to monitor. Also, space and resource constraints could hamper the success of this project.
  • Schools unwilling to participate in this project, coupled with narrow-minded parents who question the teaching capability of students, could dampen the project.
  • The willingness of volunteers to teach students is of the essence in my project. Unwillingness and lack of determination in volunteers could affect the growth rate of my project.

Results 

  • The teacher-student barrier was broken in the very first class. The students started feeling comfortable because their “teachers” seemed approachable.
  • After a few classes, the students opened up and started gaining the courage to question. For the first three lectures, doubt sessions were met with silence. As we progressed, we had enough doubts to organize extra classes.
  • After five weeks of classes, the students’ marks in Mathematics and English rose by 10% (Source : maintained records and respective teachers).
  • The students began exploring beyond their syllabi and returned with questions about their surroundings and how they work.

Chapter 4: Conclusion

 Future Plans 

  • At my school, an alumni body of 20 members has been constituted to carry out the workings of the project. The project continues to run and will be passed on to newer students of our school.
  • Over the next year, we envision rolling out our plan state-wide and inviting various schools from Mumbai and other cities to participate in the program and constitute their own bodies under the common name, “Jnana”.
  • As we accept more student-volunteers, we will increase the batch sizes and the frequency of classes.
  • A website is also under construction to make it easier for student-volunteers to collaborate, plan lectures, and schedule topics to be taught.

This project has been an eye-opener for everyone who participated. Things we did not know, could not imagine, have happened and occur every day. It was our curiosity, the opportunity and good wishes that enabled us to explore the situation and accept the challenge boldly.

We have continued, and will continue, our struggle against the inefficient education system and against the suppression of our fellow friends who have not been as lucky as us.

Bibliography


  1. http://www.census2011.co.in/census/district/357-mumbai-city.html
  2. http://www.educationinindia.net/
  3. http://www.bbc.co.uk/wales/schoolgate/helpfromhome/content/howchildrenlearn.shtml
  4. http://www.psychologytoday.com/blog/the-moment-youth/201109/mistakes-improve-childrens-learning
  5. http://www.scilearn.com/blog/how-children-learn.php
  6. Biddle, B.J. (1979).  Role theory: Expectations, identities and behaviors.


]]>
https://monik.in/jnana/feed/ 0 146
Solving the Auto-Rickshaw Problem in Mumbai https://monik.in/solving-the-auto-rickshaw-problem-in-mumbai/ https://monik.in/solving-the-auto-rickshaw-problem-in-mumbai/#comments Sun, 01 Aug 2010 09:26:58 +0000 http://monik.in/?p=116 Continue Reading >>]]> I travel almost every day in an auto-rickshaw, a fate I share with 11 million travellers in the Financial Capital of India. Everyone wants growth, CEOs driving Porsches or the man by the street selling ‘Vada Pavs’. The million-dollar question is “HOW?”

Growth, scalability, feasibility and terms as such fit well in the presentations of a million-dollar company, but what about an auto-rickshaw? If you’re running a startup, for instance, you’ve got plans, marketing strategies, capital (somewhat), a few employees or a team, education for backup purposes, etcetera.

Now push all that into the bin and envisage yourself as one amongst those 100,000 rickshaw drivers, and you want growth. You want your life to be more of a success than a struggle. What are you going to do? How? There are 99,999 more people like you giving you competition, so if you don’t take a passenger, surely someone else will. How will you break through this situation?

The answer, in Seth Godin’s words, is “BE REMARKABLE”.

You need a reason why people must remember you, why they should reward you, why they like your company, why they like your services. They want something UNIQUE, and now, are you able to fulfil their demand? Well, that’s the only way you can get through this.

  1. The Techy Rickshaw – My mom told me about a ride she took while dropping my brother to his classes, in a techy auto-rickshaw whose driver had installed a television set, a radio, a stock market index, and all the things one would imagine in a heftily priced car. You don’t get to see such things even in cabs! Fancy auto-rickshaws! But the driver did something remarkable, and it got him a post on a blog. Imagine, an auto driver on tech blogs? Woah. And yes, the commenters seemed curious to have a ride around town on the hot engine!
  2. The Winner – The best way to get into someone’s head is SYMPATHY. My brain had been bruised at school, and tension had taken a new toll. I had no one to chat with except the auto driver who dropped me home. As soon as I took a seat, he looked at me and asked, “Sab Theek Hai Beta?” (Hindi – English : Is everything alright, son?). What amazed me was that none of my friends had asked this, yet a stranger seemed worried about me. It took him 5 minutes to drop me home and to win my sympathy. I ended up paying him half as much again just because I thought he deserved it. See, a few words make a huge difference.
  3. The Strategist – I guess this is an aberrant one, but it surely deserves a mention. In Mumbai, the worst problem is “No Rickshaw”. Though the rules state that an auto driver cannot refuse a passenger, drivers pick rides to suit their comfort. The Strategist could be around you, waiting and observing as you ask a few rickshaw drivers who happily slam a “No” in your face, and then finally making a superstar entry and saving the passenger. Obviously, you know what’s going on in the passenger’s mind! “Thanks Bhaisaab” (Hindi – English : Thanks, brother) is what most would say, and there you go, a reward for being different and kind goes to the driver!
  4. The Smartest – This one applies to cabs, but I am sure it can be the same for auto-rickshaws too. But first, you need to see this.

    Image Credits : Labnol.org

    Expected? Not at all! Obviously, out of those thousands of black-coloured cabs, I would crave for this cab! Design makes a difference, a hell of a lot.

  5. The Planner – Drivers look for longer rides, as they promise larger earnings compared to shorter rides. Out of serendipity, the driver finds 6 or 7 of the longer rides. But it only works as long as you’re lucky. Now it’s time for some math. First, the shorter rides:
    5 hours is the total ride time. He serves 16 passengers. Rs. 50 is earned per passenger, which totals Rs. 800.
    Now, the longer ones:
    5 hours total time. 7 rides. He earns Rs. 130 per passenger, which totals Rs. 910. Fancy enough? Many would wonder where the difference of Rs. 110 comes from. It’s quite simple. When the driver takes shorter rides, time is wasted searching for new rides, fuel is wasted, energy is wasted, and every new passenger means new directions and new areas. This means you’ve got to look at the long term and make calculations to support your logic and execute your plan. If I were the rickshaw driver, I would probably not look for passengers near schools, as children tend to have houses nearby and those who don’t travel by school bus; I’d wait near party halls instead, as those who travel far from there don’t mind spending much on the occasional ride.
  6. Using the unused – How can you promote an auto-rickshaw? Post an ad in the newspaper or broadcast on TV channels? You probably don’t have that budget, and besides, it’s almost useless for something like this. Then what? Here’s an idea, from the mind of Samson. His idea? He used the web to promote his ride. He posted a video on YouTube about his services, how he would treat you, and the benefits of riding in his rickshaw, and he was original and natural. Those videos, his website and his ideas brought him glory, with tourists from as far as San Francisco dying to book a ride! Sounds cool? Sometimes we tend to overlook what’s in front of us, even though it’s free!

    TukTastic Ride
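The Planner’s fare arithmetic above can be sketched as a few lines of Python. This is just a hypothetical back-of-the-envelope helper using only the figures quoted in the post (a 5-hour shift, 16 short rides at Rs. 50 vs. 7 long rides at Rs. 130):

```python
# Back-of-the-envelope earnings comparison for a 5-hour shift,
# using the figures from "The Planner" above.
def shift_earnings(rides, fare_per_ride):
    """Total earnings for a shift: number of rides times fare per ride."""
    return rides * fare_per_ride

short_rides = shift_earnings(16, 50)   # 16 short rides at Rs. 50 each
long_rides = shift_earnings(7, 130)    # 7 long rides at Rs. 130 each

print(short_rides)               # 800
print(long_rides)                # 910
print(long_rides - short_rides)  # 110, in favour of the longer rides
```

The same helper makes it easy to test other ride mixes before committing to a strategy.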

What’s your takeaway? Think like others don’t.

]]>
https://monik.in/solving-the-auto-rickshaw-problem-in-mumbai/feed/ 7 116