Thursday, 17 March 2011

Building a Google App Engine application

I've been poking around lately with an application for creating competitions and tournaments with your friends. The main purpose is to keep track of results and create rankings.

Similar to chess ratings, when you beat a higher-ranked opponent you gain a lot of rating points, while beating a lower-ranked opponent does not affect your ranking as much.
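To make the idea concrete, here is a minimal sketch of a chess-style (Elo) rating update. It is not the application's actual code: the class name EloSketch, the K-factor of 32 and the 400-point scale are assumptions borrowed from the common chess convention.

```java
// Minimal Elo-style rating update.
// ASSUMPTIONS: the class name, K = 32 and the 400-point scale are
// illustrative choices, not taken from the application itself.
final class EloSketch {
    static final double K = 32.0;

    /** Expected score (between 0 and 1) of a player rated `own` against `opponent`. */
    static double expectedScore(double own, double opponent) {
        return 1.0 / (1.0 + Math.pow(10.0, (opponent - own) / 400.0));
    }

    /** New rating after one game; score is 1.0 for a win, 0.5 for a draw, 0.0 for a loss. */
    static double updated(double own, double opponent, double score) {
        return own + K * (score - expectedScore(own, opponent));
    }

    public static void main(String[] args) {
        // An underdog (1400) beating a favourite (1600) gains far more
        // than the favourite would gain for beating the underdog.
        double upsetGain = updated(1400, 1600, 1.0) - 1400;
        double routineGain = updated(1600, 1400, 1.0) - 1600;
        System.out.printf("Underdog wins:  +%.1f%n", upsetGain);
        System.out.printf("Favourite wins: +%.1f%n", routineGain);
    }
}
```

With these assumed constants the upset is worth roughly three times as many points as the expected win, which is the asymmetry described above.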

For test data I've been using Premier League results over some time period and calculating the teams' rankings as a function of their performance.
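Replaying a list of results into a ranking table could look something like the sketch below. Everything in it is hypothetical: the team names and scores are made-up placeholders rather than real Premier League data, and the SeasonReplay class, the starting rating of 1500 and the K-factor of 32 are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Replays match results in order and keeps a running rating per team.
// ASSUMPTIONS: team names, scores, START and K are placeholders.
final class SeasonReplay {
    static final double START = 1500.0;
    static final double K = 32.0;

    /** One finished match: home team, away team and goals for each side. */
    static final class Result {
        final String home, away;
        final int homeGoals, awayGoals;

        Result(String home, String away, int homeGoals, int awayGoals) {
            this.home = home;
            this.away = away;
            this.homeGoals = homeGoals;
            this.awayGoals = awayGoals;
        }
    }

    /** Applies every result in order; teams not seen before start at START. */
    static Map<String, Double> replay(List<Result> results) {
        Map<String, Double> ratings = new LinkedHashMap<>();
        for (Result r : results) {
            double home = ratings.getOrDefault(r.home, START);
            double away = ratings.getOrDefault(r.away, START);
            double expectedHome = 1.0 / (1.0 + Math.pow(10.0, (away - home) / 400.0));
            // Home side's score: win = 1, draw = 0.5, loss = 0.
            double score = r.homeGoals > r.awayGoals ? 1.0
                         : r.homeGoals == r.awayGoals ? 0.5 : 0.0;
            double delta = K * (score - expectedHome);
            ratings.put(r.home, home + delta);  // what one side gains...
            ratings.put(r.away, away - delta);  // ...the other side loses
        }
        return ratings;
    }

    /** Team names ordered from highest to lowest rating. */
    static List<String> table(Map<String, Double> ratings) {
        List<String> teams = new ArrayList<>(ratings.keySet());
        teams.sort((a, b) -> Double.compare(ratings.get(b), ratings.get(a)));
        return teams;
    }

    public static void main(String[] args) {
        List<Result> results = Arrays.asList(
            new Result("Team A", "Team B", 2, 0),
            new Result("Team B", "Team C", 1, 1),
            new Result("Team C", "Team A", 3, 1));
        Map<String, Double> ratings = replay(results);
        for (String team : table(ratings)) {
            System.out.printf("%-8s %.1f%n", team, ratings.get(team));
        }
    }
}
```

Because each match moves the same number of points from one side to the other, the total rating across all teams stays constant, which makes a handy sanity check when loading a whole season of test data.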

I've been using Google App Engine (GAE) as the deployment target and trying to take advantage of what the GAE infrastructure can offer. I thought I'd write some blog posts about the pros and cons of using GAE and how my technology stack has worked out.

So what is GAE? GAE is a cloud-based hosting service that runs applications written in Java or Python. In contrast to, for example, Amazon EC2, you have no control over which OS or web/application server you are running on. This might be a problem if you wish to have total control. For my purposes it is perfect, since I don't need to worry about server and OS upgrades and so on.

The main advantage is that you can focus entirely on the application, while infrastructure concerns like scaling out to more servers and database backends come for free. The pricing model is also very generous. App Engine is free to use up to certain daily quota limits, measured in server requests, data store accesses, memcache data transfers and so on. These limits are so high that an average small to medium application is unlikely to breach them. If your application becomes a success, you will start paying for the extra traffic.

So what can you run on GAE? If you focus on Java applications, you must conform to the JEE container specs. What we get to work with is a JEE servlet container supporting most of the specification. Your application is a standard Java web application packaged in a WAR archive with the ordinary directory structure of WEB-INF and so on.

A GAE application runs in a sandboxed environment with a number of constraints: no forking of threads, no access to the file system or other IO tasks, and no complete Java SE library to work with (the AWT libraries, for example, are not bundled). But there are workarounds for most practical problems. For asynchronous jobs there is, for example, an API for creating long-running tasks. Similarly, there are alternatives to missing JEE standards such as JMS message queues.

Naturally, you don't want to write too much code relying on proprietary Google APIs, so that you can still move to other hosting systems. So when you hear that there is no standard relational SQL database to work with, you might feel uneasy!

Data storage relies on Google's BigTable, which you might see as a big and extremely fast lookup table. Fortunately, GAE provides different access mechanisms to BigTable. If you're familiar with Java ORM frameworks like Hibernate or JPA you can rest easy, since there are JPA and JDO bindings that let you stay data-store agnostic in your code.

Obviously, you can leverage a lot of Google APIs inside your application. For example, you can reuse the identification framework of Google Accounts or use Gmail as your email infrastructure.

I'll post some notes on examples of the APIs and more info on GAE deployment.