There is this recurring problem we have in GIS: getting road-segments and wanting to show complete roads. The naive approach would we to do something like the following:
insert into street_geoms
select ro.rd_ro_ident, ro.rd_ro_name, ro.com_code, ssdo_aggr_union(sdoaggrtype(rd.ro_geometry, 0.005)) as geom
from rd_road ro, rd_ro_sec ros
where ros.rd_ro_ident = ro.rd_ro_ident
group by ro.rd_ro_ident, ro.rd_ro_name, ro.com_code;
For good measure: we have 45.000+ roads, with a total of 230.000+ road segments. So when that query starts running and starts taking a long time, I started googling. Apparently there are two faster alternatives: SDO_AGGR_CONCAT_LINES
and SDO_AGGR_SET_UNION
. While the first was really quick, completed in minutes, the result was completely wrong (complete segments were missing). The second might be quicker, but it was really hard to get an idea about any progress, and if it would fail, everything should be lost (rolled back).
So I decided to write a little script, and issue a sql statement for each single road, allowing me to track progress and added restartibility. For each road I issued a statement like:
insert into street_geoms
select ro.rd_ro_ident, ro.rd_ro_name, ro.com_code, sdo_aggr_set_union(CAST(COLLECT(ros.rd_ros_geometry) AS mdsys.SDO_Geometry_Array),0.005) as geom
from rd_road ro, rd_ro_sec ros
where ros.rd_ro_ident = ro.rd_ro_ident
and ro.rd_ro_ident = 1895101
group by ro.rd_ro_ident, ro.rd_ro_name, ro.com_code;
I added some ruby code around it, to make sure it tracked the progress and calculated the remaining time, just to have an idea. The first "large" road it stumbled upon literally took hours. It only had to join 39 segments. A simple query learned I had 150+ roads with more segments, and a maximum of 125 segments in the database. I could not just simply ignore them :) So this was not going to work either.
Why would this be so hard? I just wanted to throw all linestrings together into one geometry. How could I do that? Querying the geometries was really easy, so what if I joined the geometries outside of oracle? And wouldn't that be hard? But there is a simple solution: convert the strings to WKT, and join all LINESTRING
in a MULTILINESTRING
. This would just be simple string manipulation. I can do that ;)
I had some hiccups with this approach: handling the long strings proved a bit akward (use CLOB
instead) and I had to regularly call GC.start
to make sure the open cursors were released. And I had to make sure not to build a string literal which was too long (ORA-06550).
But in the end I was able to join the road-sections for the 45.000 + roads in approx 1.5h, which is not blindingly fast, but faster than 1 single SDO_AGGR_SET_UNION
operation :) :)
For reference you can see the full code:
class StreetGeom < ActiveRecord::Base
self.primary_key = 'rd_ro_ident'
end
def format_time (t)
t = t.to_i
sec = t % 60
min = (t / 60) % 60
hour = t / 3600
sprintf("% 3d:%02d:%02d", hour, min, sec)
end
def eta(count)
if count == 0
"ETA: --:--:--"
else
elapsed = Time.now - @start_time
# eta = elapsed * @total / count - elapsed;
eta = (elapsed / count) * (@total - count)
sprintf("ETA: %s", format_time(eta))
end
end
all_roads = Road.count
geoms_to_calculate = all_roads - StreetGeom.count
@total = geoms_to_calculate
puts "Joining geometries for #{all_roads} roads [still #{geoms_to_calculate} to do]"
cntr = 1
@start_time = Time.now
done = 0
Road.order(:rd_ro_ident).each do |road|
street_count = StreetGeom.where(rd_ro_ident: road.rd_ro_ident).count
print "\rConverting #{cntr}/#{all_roads} [#{eta(done)}] "
if street_count == 0
print "..."
$stdout.flush
## get all geometries in WKT format
get_geoms_sql = <<-SQL
select sdo_cs.make_2d(ros.rd_ros_geometry).get_wkt() as wkt_geom from rd_ro_sec ros where ros.rd_ro_ident = #{road.rd_ro_ident}
SQL
cursor = Road.connection.execute(get_geoms_sql)
line_strings = []
while row = cursor.fetch
line_string = row[0].read.to_s
line_strings << line_string[10..-1]
end
insert_sql = <<-SQL
DECLARE
wkt_str clob;
BEGIN
wkt_str := 'MULTILINESTRING(#{line_strings.join(", ';\nwkt_str := wkt_str || '")})';
insert into street_geoms(rd_ro_ident, name, com_code, geom)
values (#{road.rd_ro_ident}, q'[#{road.rd_ro_name}]', '#{road.com_code}',
sdo_util.from_wktgeometry(to_clob(wkt_str)) );
END;
SQL
Road.connection.execute(insert_sql)
done += 1
else
print "_"
end
cntr += 1
# periodically cleanup GC so we release open cursors ...
# to avoid ORA-1000 errors
if (cntr % 50) == 0
GC.start
end
end
print "\n"
puts "\n\nDone!"
and I run this script in the rails environment as follows: rails runner lib\tasks\join_road_geometries.rb
.