I have an audit-log table with +300M rows and while most messages are just confirming we performed some very repetitive action, I was thinking of an easy way to compress the data: when a month is over, instead of adding 1000 lines per day with the message "CHECKED IT", replace them with 1 line saying we "CHECKED IT [logged X times on day Y]". Wrote the code, and that was pretty easy. But then came the moment I had to run this, and the simplest query to check the number of messages:
select count(*) as count
from audit_logs
where created_at < '2016-01-01' 
  and created_at > '2015-12-01'       
  and message = 'Checking for new map-requests'  
  and organization_id = 3
took 15 seconds. So then I checked my indexes: I had only an index on organization_id.
I read an article on the best way to create an index when searching for ranges, and it suggested a combined index on the equality operator first, and then the range second, so in my case I added a migration with the following addition
add_index :audit_logs, [:organization_id, :created_at]
Rerunning the query showed an immense improvement, and now the query only took a mere 20ms :) :) :)
Comments
Add comment