Monday, September 15, 2008

project euler repo

I have recently gotten back into the Project Euler problems. The website provides a bunch of math and logic problems that can be solved any way you like -- paper and pen or programming. I like the challenge every once in a while, so I thought I might share some of the code I have written to help solve the problems. There are comments in most of these. They are just quick and dirty hacks to help me get the answer. Sometimes the output will not be the answer and you might need to go look for it, but the logic is there.

Saturday, September 13, 2008

new job and location

I have neglected my duties as the maintainer of this blog for a few months. It has been for the better, though. About a month ago, I left my position at Chumby and moved to San Francisco to start working at [context]. I had a great time and experience working at Chumby, but I felt that San Francisco is where I needed to be, both for work and for experience. I grew up in one of the largest (and best) cities in the world, and I missed the lifestyle that came with it. San Diego has great weather, but I hated driving everywhere.

Now I am in San Francisco. Trying to sell my car. And enjoying my new job, people, and experience.

If you are in the area, please contact me. It would be nice to meet some readers. :)

Flash on S3

This is a continuation of my previous post talking about using S3 as a CDN. This post discusses some of the issues that occurred with hosting Flash content on S3, and the solutions for them.

Problems started to occur once Flash SWFs were loaded from the Chumby device and the Chumby website. Flash has a built-in security mechanism known as cross-domain policies, which let the owner of a domain specify which other domains may access its content. Think of it as a robots.txt for Flash SWFs.

With S3 there are two ways to access content from a bucket -- an AWS-based URL or a CNAME from your domain that points to AWS (Amazon Web Services). When the Flash content is on S3, the Flash player looks for the crossdomain.xml along the AWS URL path. We set up a CNAME, 'swf.chumby.com', and placed a crossdomain.xml that could be accessed via http://swf.chumby.com/crossdomain.xml and also one at the top level, http://chumby.com/crossdomain.xml. This allowed us to control which SWF movies could load the widgets.
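For readers who have not seen one, here is a rough sketch of what such a policy file contains, written as a small Ruby helper that generates it. The helper name and the list of allowed domains are assumptions for illustration, not the actual file we deployed; only the element names come from the cross-domain policy file format.

```ruby
# Build a Flash cross-domain policy file for a list of allowed domains.
# crossdomain_xml and the domains passed in below are illustrative assumptions.
def crossdomain_xml(allowed_domains)
  entries = allowed_domains.map do |domain|
    # Escape the characters that are unsafe inside an XML attribute value.
    safe = domain.gsub("&", "&amp;").gsub("<", "&lt;").gsub('"', "&quot;")
    %(  <allow-access-from domain="#{safe}" />)
  end

  [
    '<?xml version="1.0"?>',
    '<!DOCTYPE cross-domain-policy SYSTEM "http://www.adobe.com/xml/dtds/cross-domain-policy.dtd">',
    "<cross-domain-policy>",
    *entries,
    "</cross-domain-policy>"
  ].join("\n") + "\n"
end

# Example: let SWFs served from chumby.com and its subdomains load this content.
puts crossdomain_xml(["chumby.com", "*.chumby.com"])
```

The generated text would be saved as crossdomain.xml at the root of each host that serves content.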

Playing the SWF as a standalone Flash movie never showed any problems. When it was loaded via http://swf.chumby.com, the SWF would claim its domain to be swf.chumby.com instead of an AWS domain. The chumby website has a way to preview the content that will appear on your Chumby -- the Virtual Chumby. The SWF for the Virtual Chumby exists on the main chumby.com website. With the crossdomain.xml in place, it was able to load the widgets from swf.chumby.com with no problem, but when it wanted to send parameters to a widget, a problem occurred.

Flash apparently has various sandbox models for SWF files. This is good because it allows a SWF to stay secure and ensures your data is protected. This bit us in the ass though. Since a SWF only lets the (sub)domains it has explicitly allowed send it parameters, we had 1000s of widgets that we could play, but they didn't have access to any of the information that made them work well within the Virtual Chumby. There were two possible solutions. The first was to change every widget to include the allowDomain code, which would take weeks of contacting 3rd party developers, countless resources, etc. The second solution is even tougher: it would require moving the Virtual Chumby SWF over to the swf.chumby.com domain and updating the links to it on our website. :)

S3 as a CDN

I worked for a company that provides widgets as a primary resource for our product, the chumby. These widgets are purely static content in the form of Flash SWF files, each with an associated jpeg thumbnail. This content is served from both dynamic (database) and static (file server) resources. These resources are ready to scale to a certain calculated amount before we have to worry about more servers, bandwidth, etc. We try to stay ahead of the curve with growth.

The scaling numbers show that we can do one of two things -- expand our servers and use more bandwidth, or use a CDN to serve our content with caching. In short, the most cost effective solution is S3. Our content, widgets that can change instantaneously when someone uploads a new one, needs to be provided to all users within a reasonable time. A normal CDN could take minutes to hours to propagate and would take time to integrate. Expanding our servers would mean more time and maintenance on our end.

The architecture we have decided on is a two tier distribution, which will provide redundancy for widgets. The widgets will exist both on our servers, in the database, and on S3. Our database server is used to hold the widgets because it's easy to back up, restore and replicate. With our current system, when a user uploads/updates a widget, it is saved in the database directly, so the newest version can be pushed to users as soon as it gets approved.
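One way to picture the two tiers is as a choice of which host to hand out for a given widget. The sketch below is only an illustration under assumptions: the Widget struct, the on_s3 flag, the /widgets/ path layout and the fall-back-to-origin rule are all made up here, and only the two host names come from these posts.

```ruby
# Hypothetical record of a widget and whether its SWF has reached S3 yet.
Widget = Struct.new(:id, :on_s3)

CDN_HOST    = "http://swf.chumby.com" # CNAME that points at the S3 bucket
ORIGIN_HOST = "http://chumby.com"     # our own servers, backed by the database

# Prefer the S3 copy; fall back to our own servers until the upload has finished.
def widget_url(widget)
  host = widget.on_s3 ? CDN_HOST : ORIGIN_HOST
  "#{host}/widgets/#{widget.id}.swf" # path layout is invented for this sketch
end

puts widget_url(Widget.new(42, true))
puts widget_url(Widget.new(43, false))
```

With a rule like this, a freshly uploaded widget would still be reachable from the database tier while its copy is on the way to S3.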

Transferring files to S3 has proven to be quite simple to implement. The main problem has been adjusting our architecture to work with external URLs. On the frontend (website) side, changing URLs is pretty trivial, and all browsers support cross domain loading of content.

Pushing widgets to the database is easy. A simple create/update with ActiveRecord and you're done. When a user uploads a widget, the file is saved to the database in the same POST request, so there is no delay, and problems and errors with the file are reported in real time. This is a blocking operation for Rails, but with size limits imposed on the database, model, and web server it shouldn't be too slow.

Transferring the widgets to S3 from our database in 'real time' is the tricky part. This is a blocking operation that depends on factors beyond our control. The S3 servers could be down, our bandwidth pipe could be saturated with web hits so that uploads to an outside server are slow, etc. It is a blocking operation no matter what, but one we don't want the user to have to wait for when they upload a new widget. The solution was to push the transfer of a widget to S3 onto a job server, whose main purpose is to queue long running tasks. The job server was built using BackgroundRB, which integrates well with Ruby on Rails.
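The idea of handing the slow upload to a queue can be sketched in plain Ruby with a worker thread. To be clear, this is not BackgroundRB's API and not our actual job server: UploadQueue and FakeS3Uploader are hypothetical names, and the "upload" just sleeps and records the widget id.

```ruby
require "thread" # Queue is built in on modern Rubies; the require is harmless

# Hypothetical stand-in for the real S3 upload; it only records the widget id.
class FakeS3Uploader
  attr_reader :uploaded

  def initialize
    @uploaded = []
  end

  def upload(widget_id)
    sleep 0.01 # pretend the network is slow
    @uploaded << widget_id
  end
end

# A single background worker that drains upload jobs in the order they arrive.
class UploadQueue
  def initialize(uploader)
    @uploader = uploader
    @jobs = Queue.new
    @worker = Thread.new do
      # pop blocks until a job arrives; nil is the shutdown signal.
      while (widget_id = @jobs.pop)
        begin
          @uploader.upload(widget_id)
        rescue StandardError => e
          warn "upload of widget #{widget_id} failed: #{e.message}"
        end
      end
    end
  end

  # Called from the web request: returns immediately, without waiting for S3.
  def enqueue(widget_id)
    @jobs << widget_id
    self
  end

  # Finish whatever is already queued, then stop the worker.
  def shutdown
    @jobs << nil
    @worker.join
  end
end

uploader = FakeS3Uploader.new
queue = UploadQueue.new(uploader)
queue.enqueue(1).enqueue(2) # the user's request would return right here
queue.shutdown
p uploader.uploaded
```

A real job server would also want retries and persistence, so that queued uploads survive a restart; the sketch leaves both out.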

This post is to be continued in follow-up posts. There is still so much more to cover about the problems we had with Flash and the framework built to white label CDNs.