Breaking Aljazeera’s CAPTCHA

I was on Aljazeera Arabic's website the other day and, as I was voting on a poll, was presented with the following screen:

The CAPTCHA in the screen above immediately caught my attention. The distortions in it seemed very simple: the text was not warped in any way and there was no overlap between characters.

The following is a URL for one of the CAPTCHAs:

http://www.aljazeera.net/Sitevote/SiteServices/Contrlos/SecureCAPTCHA/GenerateImage.aspx?Code=EANmyyXghpajFhOX6rCRKQ==&Length=4

Opening the URL above and refreshing the page a few times gives the following CAPTCHAs:







The dashed grey lines are randomized, while the letters in the CAPTCHAs above are static. The letters are encoded in the Code parameter in the URL. Notice that there are two forms for each character: a straight form and another that is slightly rotated.

Aljazeera's CAPTCHA can easily be broken by doing the following:

  1. Removing the dashed grey lines
  2. Finding the characters in the image
  3. Separating the characters in the image
  4. Classifying each character
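
The four steps above can be sketched roughly as follows. This is a Python illustration on a tiny made-up image, not the Octave/Matlab code used in this post. It assumes the text is darker than the grey dashes, that characters are separated by at least one empty column, and that a template exists for every glyph; the pixel values, the threshold, and the "H"/"I" templates are all hypothetical.

```python
# Toy grayscale values: '#' is dark text, '-' is a grey dash, '.' is background.
LEVELS = {"#": 0, "-": 128, ".": 255}

def to_pixels(rows):
    """Turn rows of characters into a 2D list of grayscale values."""
    return [[LEVELS[c] for c in row] for row in rows]

def remove_grey(pixels, threshold=64):
    """Step 1: keep only dark pixels as ink (1); grey and white become 0."""
    return [[1 if p < threshold else 0 for p in row] for row in pixels]

def find_spans(binary):
    """Step 2: return (start, end) column ranges that contain ink."""
    width = len(binary[0])
    has_ink = [any(row[x] for row in binary) for x in range(width)]
    spans, start = [], None
    # The trailing False closes a span that touches the right edge.
    for x, ink in enumerate(has_ink + [False]):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            spans.append((start, x))
            start = None
    return spans

def crop(binary, span):
    """Step 3: cut one character out of the image."""
    a, b = span
    return tuple(tuple(row[a:b]) for row in binary)

def classify(glyph, templates):
    """Step 4: pick the template with the fewest mismatching pixels."""
    def distance(template):
        if len(template) != len(glyph) or len(template[0]) != len(glyph[0]):
            return float("inf")  # different size, cannot be this character
        return sum(a != b
                   for t_row, g_row in zip(template, glyph)
                   for a, b in zip(t_row, g_row))
    return min(templates, key=lambda name: distance(templates[name]))

def glyph(rows):
    """Build a binary template from rows of '#' and '.' characters."""
    return tuple(tuple(1 if c == "#" else 0 for c in row) for row in rows)

# Hypothetical templates; a real set would hold both forms of each character.
TEMPLATES = {
    "H": glyph(["#.#", "###", "#.#"]),
    "I": glyph(["###", ".#.", "###"]),
}

# A made-up 8x3 "CAPTCHA" with grey dashes crossing it.
image = to_pixels([
    "#-#..###",
    "###--.#.",
    "#.#..###",
])

binary = remove_grey(image)
spans = find_spans(binary)
decoded = "".join(classify(crop(binary, s), TEMPLATES) for s in spans)
print(decoded)  # HI
```

Because the letters are static and do not overlap, exact template matching is enough in this toy setting; storing the straight and the rotated form of each character as separate templates would cover both variants.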

I'll be using Octave/Matlab for the above tasks and will be explaining my algorithm using the following CAPTCHA as an example.


El-Tetris in HTML5. See it in action!

This follows up on my previous post on the El-Tetris algorithm, a Tetris player that clears 16 million rows on average per game and, at the time of this writing, is the most performant one-piece Tetris AI out there. Rather than just describing the algorithm, I thought I would provide an implementation.

The algorithm is implemented entirely in JavaScript and the rendering is done on an HTML5 canvas. The rendering is purely cosmetic (so you can actually see how the game is progressing). If you're only interested in the final score, you can speed up the game by enabling "Hardcore Mode". In that mode, board rendering is disabled and the algorithm runs continuously in the background. You can also change the size of the board; the smaller the board, the shorter the game.

Full source code can be found here.

Note: For faster execution, use Google Chrome.

Live Notes

For the past few months I have been involved with the project BigBlueButton, an open-source web conferencing system. That, along with looking into Etherpad's source code, really ignited my interest in real-time collaboration technologies.

I started an open-source project to extend BigBlueButton with real-time document collaboration for the conference's participants. The project is still at a very early stage, but will be out for beta testing in the next release of BigBlueButton.

Before I start ranting about the project, which I am tentatively calling "Live Notes", let me first show you a demo.

Etherpad on Debian/Ubuntu

Since the release of Etherpad's source code last month, I have been really interested in studying the code and algorithm behind Etherpad's real-time editing.

Etherpad's backend is mostly written in Scala. Seeing how Scala is starting to be adopted by many popular online services, it's also worth looking into.

I thought I'd start off by installing my own instance of Etherpad. Unfortunately, the installation didn't quite work for me straight out of the box. After working through a few quirks, and with the help of this guide, I finally got Etherpad running. I am using Ubuntu 9.10, but the instructions should also work for Debian.

  1. Install Sun Java JDK
    apt-get install sun-java6-jdk

    Note: You need to have Sun's Java as your default. You can verify this by running the following command and checking its output:
    java -version
    Java(TM) SE Runtime Environment (build X.X)
    Java HotSpot(TM) Client VM (build 14.3-b01, mixed mode, sharing)

    If you see anything other than what's above (for example, OpenJDK), you need to set Sun's Java as the default JDK. To do so, use the following command and follow the prompt.

    sudo update-alternatives --config java

  2. Install remaining prerequisites:
    apt-get install scala mysql-server libmysql-java mercurial
  3. Paste the following into /etc/profile, and be sure to replace X.X.X with the version of the MySQL connector that you have installed.
    export PATH
    export JAVA_HOME="/usr/lib/jvm/java-6-sun"
    export SCALA_HOME="/usr/share/java"
    export JAVA="/usr/bin/java"
    export SCALA="/usr/bin/scala"
    export PATH="/usr/bin:/usr/bin:/usr/local/mysql/bin:$PATH"
    export MYSQL_CONNECTOR_JAR="/usr/share/java/mysql-connector-java-X.X.X.jar"
    export JAVA_HOME SCALA_HOME JAVA SCALA MYSQL_CONNECTOR_JAR PATH
    umask 022
  4. Download the etherpad source to /usr/local/etherpad
    hg clone https://etherpad.googlecode.com/hg/ /usr/local/etherpad
  5. Set the environment variables: Again, remember to replace X.X.X with the corresponding version on your machine.
    export JAVA_HOME="/usr/lib/jvm/java-6-sun"
    export SCALA_HOME="/usr/share/java"
    export JAVA="/usr/bin/java"
    export SCALA="/usr/bin/scala"
    export PATH="/usr/bin:/usr/bin:/usr/local/mysql/bin:$PATH"
    export MYSQL_CONNECTOR_JAR="/usr/share/java/mysql-connector-java-X.X.X.jar"
  6. Add your domain to the superdomain section in /usr/local/etherpad/trunk/etherpad/src/etherpad/globals.js. If you will only be accessing it locally (through localhost), you don't need to do this.
  7. Create the etherpad mysql db and privileges:
    mysql -u root -p
    # enter your password when prompted
    create database etherpad;
    grant all privileges on etherpad.* to 'etherpad'@'localhost' identified by 'password';
    quit
  8. Compile the JARs:
    cd /usr/local/etherpad/trunk/etherpad/
    ln -s /usr/share/java /usr/share/java/lib
    bin/rebuildjar.sh

    Edit: As Mihira pointed out in comment #2, you may come across this error when compiling with the bin/rebuildjar.sh:
    Unable to establish connection to compilation daemon. Compilation failed.

    The problem may be that the server's hostname doesn't have an entry for 127.0.0.1 in the /etc/hosts file. In that case, add an entry mapping 127.0.0.1 to the server's hostname in /etc/hosts and run bin/rebuildjar.sh again.

  9. Run the web server:
    bin/run-local.sh
    Note: If you are running a machine or a VM with little RAM, you might encounter this message:
    Error occurred during initialization of VM
    Could not reserve enough space for object heap

    If you see this error when you try to run the web server, it can be resolved by decreasing the size of the needed heap. Edit bin/run-local.sh and change the variable MXRAM from 1G to something smaller (256m should do the trick), then try running bin/run-local.sh again.

  10. That's it! You can access Etherpad locally on http://localhost:9000

Adding Formula Fields to a Database with Ruby on Rails

Every now and then you might encounter a situation where you need to have a database column that has a value based on a formula consisting of values of other columns. Unfortunately, there is no standard way of embedding this into the SQL definition of a table. You must keep track of updating this value on your own when using the database.

But this raises the question: why would I want a formula field? Can't I just do all my calculations when I am fetching my data with a SQL query? Sure, you can. However, if you are doing calculations on vast amounts of data, having part of this data pre-calculated in a formula field can give you a solid performance boost.

If you're using Ruby on Rails for your project, there's a very simple solution you can implement. I'll show you how through a trivial example. In this example, we imagine a teacher wanting to store his students' assignment marks.

Let's go ahead and create our Rails project called teacherexample. I'll use MySQL as my database in this example, but the concept applies to any database system.
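
To illustrate the idea independently of Rails and MySQL, here is a minimal sketch using Python's built-in sqlite3 module. It is not the Rails solution this post builds; the marks table, its column names, and the save_marks helper are hypothetical. The point is only that the formula (here, an average of three assignment marks) is evaluated once at write time and stored, so later queries can read or sort by it without recomputing.

```python
import sqlite3

def save_marks(conn, student, a1, a2, a3):
    """Insert a row and store the pre-computed average next to the raw marks."""
    average = (a1 + a2 + a3) / 3.0  # the "formula", evaluated once at write time
    conn.execute(
        "INSERT INTO marks (student, a1, a2, a3, average) VALUES (?, ?, ?, ?, ?)",
        (student, a1, a2, a3, average),
    )

# Hypothetical schema: three assignment marks plus the formula field.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE marks (student TEXT, a1 REAL, a2 REAL, a3 REAL, average REAL)"
)

save_marks(conn, "alice", 80, 90, 100)
save_marks(conn, "bob", 60, 70, 65)

# Reads use the stored value directly; no arithmetic happens in the query.
top = conn.execute(
    "SELECT student, average FROM marks ORDER BY average DESC"
).fetchall()
print(top)  # [('alice', 90.0), ('bob', 65.0)]
```

The trade-off is that every code path that changes a1, a2, or a3 must also refresh average, which is exactly the bookkeeping mentioned above.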
