Monday, May 26, 2008

s3 logs with webalizer

I recently came across s3stat.com, a simple service that takes your S3 logs and runs them through webalizer to produce nice graphs and stats. The service costs $2/month. I thought to myself that this could surely be done for free.

An hour or so later, I think I have something that's pretty comparable in features to s3stat. I am in no way trying to put them out of business. They maintain their website and keep improving s3stat with more features. I simply wanted and needed a way to view S3 stats, and didn't want to pay for it.

This script requires that logging is enabled on the S3 bucket. Please enable it before using this script, or you will just get an error that logging is not turned on. The AWS::S3 rubygem (aws-s3) is required to run this script, and webalizer must be installed and on your path. Please look at the options hash for required parameters.
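
Since the script reads everything from its options hash, a small guard can catch blank settings before any request is made. This is only a sketch: the helper name missing_options and the REQUIRED_OPTIONS list are my own, and I am assuming the three options that default to an empty string are the required ones.

```ruby
# Hypothetical list of required keys: the options that default to ''.
REQUIRED_OPTIONS = [:access_key, :secret_key, :bucket_name]

# Hypothetical helper: returns the required keys that are still blank.
def missing_options(options)
  REQUIRED_OPTIONS.select { |key| options[key].to_s.strip.empty? }
end

# Example values only; these are not real credentials.
options = {
  :access_key  => 'AKIDEXAMPLE',
  :secret_key  => '',
  :bucket_name => 'my-bucket',
  :folder_name => 'webalizer'
}

missing = missing_options(options)
warn "Missing required options: #{missing.join(', ')}" unless missing.empty?
```

In the real script you would probably exit at this point instead of only printing a warning.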


#!/usr/bin/env ruby
require 'rubygems'
require 'aws/s3'
require 'getoptlong'
require 'tempfile'
require 'date'

#workaround for the s3 gem, which expects Date::ABBR_MONTHS to be defined
Date::ABBR_MONTHS = Date::Format::ABBR_MONTHS

#default arguments
options = {
  :access_key  => '',          #the Amazon access key
  :secret_key  => '',          #the Amazon secret key
  :bucket_name => '',          #bucket name to pull logs from
  :folder_name => 'webalizer', #folder name for webalizer output
  #:clear_webalizer_folder => true #delete local webalizer data
}

#establish the connection to s3
AWS::S3::Base.establish_connection!(
  :access_key_id     => options[:access_key],
  :secret_access_key => options[:secret_key]
)

#find the bucket specifying the log files
puts "Checking for logging for bucket #{options[:bucket_name]}"
if AWS::S3::Bucket.logging_enabled_for?(options[:bucket_name])
  new_log = File.new('bucket_log.log', 'w+')
  log_status = AWS::S3::Bucket.logging_status_for(options[:bucket_name])
  puts "Processing log files"
  AWS::S3::Bucket.logs(options[:bucket_name]).each do |log|
    #convert the lines of amazon s3 log to CLF (Common Log Format)
    log.lines.each do |line|
      #CLF timestamps use the abbreviated month name (%b), not the full name (%B)
      new_log << "#{line.remote_ip} - - [#{line.time.strftime("%d/%b/%Y:%H:%M:%S %z")}] \"#{line.request_uri}\" #{line.http_status || '-'} #{line.bytes_sent || '-'} \"#{line.referrer}\" \"#{line.user_agent}\"\n"
    end
  end
  new_log.close()
  #optionally clear out old webalizer output
  if options[:clear_webalizer_folder] && File.exist?(options[:folder_name])
    Dir["#{options[:folder_name]}/*"].each{|f| puts f; File.delete(f)}
    Dir.delete(options[:folder_name])
  end
  #make sure webalizer folder_name exists
  Dir.mkdir(options[:folder_name]) unless File.exist?(options[:folder_name])
  #run webalizer on current log file, writing into folder_name
  webalizer_output = `webalizer -o #{options[:folder_name]}/ -D dns.db -N 5 -F clf bucket_log.log`
  puts "output from webalizer:"
  puts webalizer_output
  #update webalizer bucket with newest info
  puts "updating webalizer to s3 bucket #{log_status.target_bucket}"
  Dir["#{options[:folder_name]}/*"].each do |filename|
    puts "uploading file #{filename}"
    AWS::S3::S3Object.store(filename, open(filename), log_status.target_bucket, {:access=>:public_read})
  end
else
  puts "Logging is not enabled for bucket #{options[:bucket_name]}"
end
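
The log conversion in the middle of the script is the part most worth trying in isolation, since webalizer will skip lines it cannot parse. Here is a minimal sketch of just that step. The LogLine struct is a hypothetical stand-in for the gem's parsed log line (only the field names come from the script), to_clf is my own helper name, and writing '-' for a missing referrer or user agent is my own choice rather than what the script does.

```ruby
# Hypothetical stand-in for one parsed S3 log line; the field names mirror
# the ones the script reads from the gem's log line objects.
LogLine = Struct.new(:remote_ip, :time, :request_uri, :http_status,
                     :bytes_sent, :referrer, :user_agent)

# Format one log line as a CLF entry followed by the referrer and user agent.
def to_clf(line)
  # CLF timestamps look like 02/Jun/2008:14:03:07 +0000 (abbreviated month).
  timestamp = line.time.strftime('%d/%b/%Y:%H:%M:%S %z')
  format('%s - - [%s] "%s" %s %s "%s" "%s"',
         line.remote_ip, timestamp, line.request_uri,
         line.http_status || '-', line.bytes_sent || '-',
         line.referrer || '-', line.user_agent || '-')
end

# Made-up sample data, with no bytes_sent and no referrer.
sample = LogLine.new('192.0.2.10',
                     Time.utc(2008, 6, 2, 14, 3, 7),
                     'GET /mybucket/index.html HTTP/1.1',
                     200, nil, nil, 'Mozilla/5.0')
puts to_clf(sample)
```

For the sample above this prints 192.0.2.10 - - [02/Jun/2008:14:03:07 +0000] "GET /mybucket/index.html HTTP/1.1" 200 - "-" "Mozilla/5.0", which is the shape webalizer expects with -F clf.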



If you have any improvements please let me know in the comments.

Tuesday, May 20, 2008

Rails top 100

This wiki maintains a list of the top 100 websites (by Alexa ranking) that use Rails. Chumby is ranked number 67 on the list.

NOTE: This is a shameless plug because I work for Chumby doing the RoR development. :)