Blog
what did i learn today

In our Rails 3 application we use mongo to log critical actions. At first we did not store a separate timestamp, since the _id (which is a BSON::ObjectId) contains a timestamp as well. Our model, simplified, looked like this:

[ruby]
class Log
  include Mongoid::Document

  # mongo id's contain timestamps: the first 4 bytes are the epoch seconds
  def timestamp
    Time.at([id.to_s].pack("H8").unpack("N")[0])
  end
end
[/ruby]

This is all fine and dandy, but when we wanted to build some reporting, we were of course unable to filter and query by date. Mongoid has an easy way to add timestamp fields, and will set the timestamp on all newly created documents for you. Inside your model just add:

[ruby]
include Mongoid::Timestamps::Created
[/ruby]

We only need to track the creation time, since we are not interested in any updates. Including Mongoid::Timestamps instead will add and maintain both created_at and updated_at.

Now all that remained was adding the created_at field to all existing data, because the reporting had to cover the existing data too. Luckily, the value of the field was known: it is the timestamp hidden in the id. Building a script to add and populate the field was also not too hard. This gist was my inspiration, but unlike that script, I was able to use the higher-level interface of Mongoid.

[ruby]
# To allow querying on time-ranges, we need to add the created_at field.
# Querying on the timestamp does not seem possible, which is not completely surprising
# as it is (if i understand correctly) a part of the _id field.
#
# As a one time operation, we iterate over all documents and add the created_at field
# This script has to be run in the Rails environment, please use :
#
#   rails runner script/convert_mongo_add_created_at.rb
#
COLLECTIONS = ["your-collection"] # Put a list of collection names here

def convert_collection(collection)
  skipped_docs = 0
  all_converted_docs = 0

  # Note: the documents come from the Audit::Log model;
  # the collection name is only used in the output below.
  Audit::Log.all.each do |doc|
    if doc.respond_to?(:created_at) && doc.created_at.present?
      skipped_docs += 1
    else
      doc[:created_at] = doc.timestamp
      doc.save
      all_converted_docs += 1
    end
  end

  puts " added :created_at to #{all_converted_docs} documents [skipped: #{skipped_docs}]"
  puts "Converted #{collection}"
end

puts "Start conversion ..."
@db = Mongoid.database
COLLECTIONS.each do |collection|
  convert_collection collection
end
[/ruby]
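As a side note, the four-byte trick used in the timestamp method can be checked without a database. This is only a minimal sketch, assuming the id is available as a 24-character hex string. The helper names object_id_time and fake_object_id_hex are made up for this illustration and are not part of Mongoid or the bson gem.

```ruby
# Hypothetical helpers, for illustration only (not part of Mongoid or bson).

# Decode the creation time from the first 4 bytes (8 hex chars) of an id.
def object_id_time(hex)
  seconds = [hex].pack("H8").unpack("N")[0] # big-endian unsigned 32-bit integer
  Time.at(seconds).utc
end

# Build a stand-in id for a given time: 8 hex chars of epoch seconds,
# followed by 16 zeros in place of the remaining 8 bytes of a real id.
def fake_object_id_hex(time)
  format("%08x", time.to_i) + "0" * 16
end

hex = fake_object_id_hex(Time.utc(2011, 4, 1))
puts hex
puts object_id_time(hex)
```

The round trip only keeps whole seconds, which is fine for a created_at that is used for per-day or per-month reporting.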

[mongoid] doing a group-by on date

Doing a simple group-by query using mongo and mongoid is actually pretty straightforward. According to the documentation, mongoid does not offer any group/aggregation function itself, but mongo does, and you can access mongo directly through the collection. So assume we have a mongo collection:

[ruby]
class Log
  include Mongoid::Document
  include Mongoid::Timestamps::Created

  field :action, :type => String
end
[/ruby]

and now I want to count all occurrences of the different actions.

[ruby]
Log.collection.group(:key => "action",
                     :initial => { :count => 0 },
                     :reduce => "function(doc, prev) { prev.count += 1; }")
[/ruby]

This will return an array of hashes as follows:

[ruby]
[{"action"=>"create", "count"=>1565.0},
 {"action"=>"update", "count"=>2142.0},
 {"action"=>"destroy", "count"=>27.0}]
[/ruby]

That is already very nice. But now I want to get the results grouped per day and month.

[ruby]
Log.collection.group(:keyf => "function(doc) { var d = new Date(doc.created_at); return { nr_month: d.getMonth(), nr_day: d.getDate() }; }",
                     :initial => { :visits => 0 },
                     :reduce => "function(doc, prev) { prev.visits += 1; }")
[/ruby]

Notice: the values of :keyf and :reduce are strings containing javascript. This is very flexible, but important: no ruby! Since they are strings, though, you can use string interpolation to get values in there. We should, of course, add a condition to limit the result set. So something like this:

[ruby]
Log.collection.group(:keyf => "function(doc) { var d = new Date(doc.created_at); return { nr_month: d.getMonth() }; }",
                     :initial => { :visits => 0 },
                     :reduce => "function(doc, prev) { prev.visits += 1; }",
                     :cond => { :action => 'create' })
[/ruby]

This will return the logged creates per month. Notice: :cond and :initial contain regular ruby hashes.
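To make the string interpolation remark concrete, here is a small sketch that builds the options hash in plain Ruby, so it runs without a mongo connection. The helper name monthly_group_options is hypothetical and only serves this illustration; the keys are the same ones used in the queries above.

```ruby
# Hypothetical helper, for illustration only: builds the options hash
# that would be handed to Log.collection.group.
def monthly_group_options(date_field, action)
  {
    # javascript, assembled with ruby string interpolation
    :keyf    => "function(doc) { var d = new Date(doc.#{date_field}); " \
                "return { nr_month: d.getMonth() }; }",
    :initial => { :visits => 0 },
    :reduce  => "function(doc, prev) { prev.visits += 1; }",
    # a regular ruby hash, no javascript here
    :cond    => { :action => action }
  }
end

options = monthly_group_options("created_at", "create")
puts options[:keyf]
# With a real model this would be passed on as: Log.collection.group(options)
```

Only interpolate values you control yourself: whatever ends up in the string is run as javascript.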

Selecting on a date-range is also pretty easy once you know how (this took me hours to find):

[ruby]
Log.collection.group(:keyf => "function(doc) { var d = new Date(doc.created_at); return { nr_month: d.getMonth() }; }",
                     :initial => { :visits => 0 },
                     :reduce => "function(doc, prev) { prev.visits += 1; }",
                     :cond => { :created_at => { '$gte' => Time.utc(2011, 4), '$lt' => Time.utc(2011, 5) } })
[/ruby]

I still have some weird offset error with the dates, since I also get a few results from the previous month, but I guess this has something to do with the time-zones.

Now, Mongoid can help us to write the condition. We can do something like:

[ruby]
conditions = Log.where(:created_at.gte => Date.today.at_beginning_of_month,
                       :created_at.lte => Date.today.at_end_of_month).selector

Log.collection.group(:keyf => "function(doc) { var d = new Date(doc.created_at); return { nr_month: d.getMonth() }; }",
                     :initial => { :visits => 0 },
                     :reduce => "function(doc, prev) { prev.visits += 1; }",
                     :cond => conditions)
[/ruby]

Hope this helps.
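
A footnote on the offset error mentioned above. One possible explanation, which is an assumption and not something verified against this setup: the :cond range is expressed in UTC, while getMonth() in the javascript reports the month in the local time zone of wherever the javascript runs. A document created just after midnight UTC on the first of the month can then be counted under the previous month. If that is the cause, using getUTCMonth() in the :keyf function should line the two up. Also keep in mind that javascript months are zero-based, so 3 means April. The sketch below shows the effect in plain Ruby; the helper utc_month_range is hypothetical.

```ruby
# Hypothetical helper, for illustration only:
# half-open UTC range [start, finish) covering one calendar month.
def utc_month_range(year, month)
  start  = Time.utc(year, month)
  finish = month == 12 ? Time.utc(year + 1, 1) : Time.utc(year, month + 1)
  [start, finish]
end

start, finish = utc_month_range(2011, 4)
cond = { :created_at => { '$gte' => start, '$lt' => finish } }
puts cond.inspect

# A document created half an hour into April (UTC) ...
t = Time.utc(2011, 4, 1, 0, 30)
puts t.month                    # month as seen in UTC
# ... still falls in March on a clock that runs two hours behind UTC.
puts t.getlocal("-02:00").month # month as seen with a -02:00 offset
```

The half-open range ('$gte' on the first of the month, '$lt' on the first of the next month) also avoids having to decide what exactly "end of month" means for the upper bound.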