Monday, March 17, 2008

dirty fields in ActiveRecord

Managing databases from a programming environment is always an ordeal. Rails has an ORM (ActiveRecord) that transparently provides a programming environment that maps directly to the database and generates all the SQL. In some cases ActiveRecord sacrifices efficiency for ease of use. I believe one of the most neglected of these cases is how ActiveRecord handles updating records in a table.

When ActiveRecord updates a record in a table, it updates all fields whether or not those fields have actually been changed. In most cases this is fine, since a record is so small that setting the fields again takes a trivial amount of time. I happen to have one of those rare cases where this is not a valid solution. I use my database for file storage, with files from 10KB to 1MB, so changing the name of a file causes the large chunk of data to be written again.

Only the fields that have changed (dirty fields) should be updated. ActiveRecord has no flags for which fields have changed. I took the time to extend ActiveRecord to support dirty flags for fields by modifying the write_attribute method to record in a hash that an attribute of a record has changed.
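To show the idea on its own, here is a minimal plain-Ruby sketch that does not depend on ActiveRecord. The class TrackedRecord and the methods dirty_attribute_names and read_attribute are hypothetical stand-ins I made up for illustration; only the name write_attribute is borrowed from ActiveRecord. Like the approach described above, it flags a field on every write, even if the new value equals the old one.

```ruby
# Hypothetical stand-in for a model; this is not ActiveRecord itself.
class TrackedRecord
  def initialize(attributes = {})
    @attributes = {}
    @dirtied_attrs = {}
    # Values loaded at construction time are considered clean.
    attributes.each { |name, value| @attributes[name.to_s] = value }
  end

  def read_attribute(name)
    @attributes[name.to_s]
  end

  # Every write goes through here, so this is the one place to set the flag.
  def write_attribute(name, value)
    @dirtied_attrs[name.to_s] = true
    @attributes[name.to_s] = value
  end

  # No argument: has anything changed? With a name: has that field changed?
  def dirty?(name = nil)
    name.nil? ? !@dirtied_attrs.empty? : @dirtied_attrs.key?(name.to_s)
  end

  def dirty_attribute_names
    @dirtied_attrs.keys
  end

  # Call this once the record has been synced to the database.
  def reset_dirty
    @dirtied_attrs = {}
  end
end

file = TrackedRecord.new(:name => "report.pdf", :data => "x" * 1024)
file.write_attribute(:name, "report-final.pdf")
puts file.dirty?(:name)  # true
puts file.dirty?(:data)  # false
```

The hash only stores names, not old values, which is all that is needed to decide which columns belong in an UPDATE.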

With the dirty fields known, the appropriate UPDATE statement needed to be generated. The UPDATE statement is currently generated by converting all fields and their values into SQL assignments in the function attributes_with_quotes. I just extended this function to check that an attribute has been dirtied before it is added to the assignment list. Since an UPDATE syncs the database with the current model, all the dirty fields are then flagged as no longer being dirty.
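To make the effect concrete, here is a rough plain-Ruby sketch of building the assignments from only the dirty fields. The helpers quote_value and update_sql, the table name files, and the naive quoting are all assumptions for illustration; ActiveRecord does its quoting through the database adapter, and this is not how its internals are written.

```ruby
# Naive quoting for illustration only. Real code should rely on the
# database adapter's quoting rather than this hypothetical helper.
def quote_value(value)
  case value
  when nil     then "NULL"
  when Numeric then value.to_s
  else "'" + value.to_s.gsub("'", "''") + "'"
  end
end

# Build an UPDATE that assigns only the columns listed in dirty_names.
# Returns nil when nothing is dirty, since there is nothing to write.
def update_sql(table, id, attributes, dirty_names)
  dirty = attributes.select { |name, _value| dirty_names.include?(name) }
  return nil if dirty.empty?
  assignments = dirty.map { |name, value| "#{name} = #{quote_value(value)}" }
  "UPDATE #{table} SET #{assignments.join(', ')} WHERE id = #{id.to_i}"
end

attributes = { "name" => "report-final.pdf", "data" => "x" * 1_000_000 }

full    = update_sql("files", 7, attributes, attributes.keys)
partial = update_sql("files", 7, attributes, ["name"])

puts partial
# UPDATE files SET name = 'report-final.pdf' WHERE id = 7
puts "full: #{full.length} bytes, partial: #{partial.length} bytes"
```

With a 1MB data column, the full statement carries the whole file on every rename, while the partial one stays a few dozen bytes.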

Show code:


module DirtyAttributes
  def self.included(base)
    base.class_eval do
      # Wrap the two ActiveRecord methods involved in writing and saving.
      alias_method_chain :write_attribute, :dirty
      alias_method_chain :attributes_with_quotes, :dirty
      after_update :reset_dirty
    end
  end

  # Once the database is in sync with the model, nothing is dirty.
  def reset_dirty
    @dirtied_attrs = nil
  end

  # Flag the attribute, then let the original method store the value.
  def write_attribute_with_dirty(attr_name, value)
    dirtied(attr_name)
    write_attribute_without_dirty(attr_name, value)
  end

  def dirtied(attr_name = nil)
    @dirtied_attrs ||= {}
    #logger.info "dirtied #{attr_name}"
    @dirtied_attrs[attr_name.to_s] = true if attr_name
  end

  # No argument: is any attribute dirty? With a name: is that attribute dirty?
  def dirty?(attr_name = nil)
    @dirtied_attrs ||= {}
    #logger.info "dirty? #{attr_name}"
    return (attr_name.nil? && !@dirtied_attrs.empty?) || @dirtied_attrs.has_key?(attr_name.to_s)
  end

  # Drop clean attributes from the assignments, except when inserting a new record.
  def attributes_with_quotes_with_dirty(*args)
    quoted = attributes_with_quotes_without_dirty(*args)
    quoted.delete_if {|key,value| !dirty?(key) } unless (self.new_record?)
    return quoted
  end
end

ActiveRecord::Base.send(:include, DirtyAttributes)

I would just like to emphasize how easy this was to do. It took no more than 30 minutes. This is why I love Rails and Ruby. Yet what gets me is why this wasn't done before. Sorry to say, but this is really one of those DUH! things that should have been implemented from the start. If there is a reason it wasn't, please let me know, because I am really curious.

NOTE: I have not made this a plugin yet. This code has not been tested in a production environment either.

UPDATE #1: Because of the way ActiveRecord handles method chaining with alias_method_chain, there are some edge cases that need to be solved. Mainly, I need to find a way to override the originally defined update function in ActiveRecord::Base. You would think it would be easy, but alias_method_chain renames functions, and since it is used to add support for the 'updated_at/updated_on' timestamps, it has proven to be difficult. I have found ways to make it work, but other plugins that apply alias_method_chain to update could cause problems. I think I will just submit a patch for ActiveRecord.
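For anyone who has not run into this, here is a plain-Ruby sketch of the renaming, written with alias_method by hand instead of alias_method_chain. The Record class, the :timestamps and :dirty feature names, and the returned strings are all hypothetical; this is my illustration of the general problem, not the actual ActiveRecord source.

```ruby
class Record
  # The original method that we would eventually like to replace.
  def update
    "UPDATE"
  end
end

# First extension (think of the timestamp support): wraps update.
class Record
  def update_with_timestamps
    "touch updated_at; " + update_without_timestamps
  end
  # Hand-written equivalent of: alias_method_chain :update, :timestamps
  alias_method :update_without_timestamps, :update
  alias_method :update, :update_with_timestamps
end

# Second extension (think of another plugin): wraps whatever update is now.
class Record
  def update_with_dirty
    "skip clean fields; " + update_without_dirty
  end
  alias_method :update_without_dirty, :update
  alias_method :update, :update_with_dirty
end

puts Record.new.update
# skip clean fields; touch updated_at; UPDATE

# The original body no longer lives under the name update. To replace it
# you have to know it now lives under update_without_timestamps.
class Record
  def update_without_timestamps
    "PARTIAL UPDATE"
  end
end

puts Record.new.update
# skip clean fields; touch updated_at; PARTIAL UPDATE
```

The catch is that the name holding the original body depends on which extension chained first, so any code that overrides it is coupled to the load order of everything else that wraps update.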

UPDATE #2: I have been able to edit the code so that it works correctly with the 1.2.3/2.0 versions of ActiveRecord. I had to learn a little more about the internals, but the final code shows that it is possible and pretty easy to do.

June 5, 2008 - It looks like Rails 2.1 is supporting dirty fields. Hurray, but it's weird, I had it first. ;)