Ruby’s garbage collector (GC) has become a problem for the Luz user experience.
The GC process can cause the entire application to pause for upwards of 200ms at a time (on a P3 1.2ghz), which is simply unacceptable for an application doing real-time animation where, to achieve even 24fps, a new frame must be generated every 42ms. As a result, we see ‘hickups’ in the animation.
People have suggested various solutions, which I’d like to address here.
Ruby’s GC can be disabled
When GC is disabled, if your application produces any junk objects at all, memory use will grow steadily.
In theory, if we could eliminate all garbage production, disabling GC would be feasible, but this is quite impossible in a Ruby application, as even the simplest Ruby code manages to produce garbage objects.
Ruby’s GC can be scheduled for convenient times
Unfortunately, for applications producing real-time animation, there is never a convenient time to pause the application.
Ruby’s GC time can be reduced by reducing the active object count
This is true, and to some degree it’s possible, but Ruby programmers shouldn’t be constantly worried about creating more objects!
Further, the gains to be had here are minimal, as most objects are, of course, necessary.
So where does this leave us?
At first, I thought the problem was simple: more objects means more GC time. In that case the only solutions would be a rewrite of the Ruby GC or running Luz on faster hardware.
But after writing some test programs, I’ve found out that the situation is more complex than I had previously thought.
Have a look at the following application:
module ObjectSpace
def self.object_count(base_class = Object)
count = 0
each_object(base_class) { count += 1 }
return count
end
end
def gc_report
puts "#{ObjectSpace.object_count} objects, #{start = Time.now ; GC.start ; Time.now - start} GC time, #{ObjectSpace.object_count} remain"
end
# Toss some junk in the trunk
trunk = []
strs = ('a'..'z').to_a
10.times do
trunk << (1..10000).collect { strs[rand(strs.size)] + strs[rand(strs.size)] } # ...random combinations...
gc_report
end
The output looks something like this:
10420 objects, 0.003659 GC time, 10424 remain ... 50427 objects, 0.019191 GC time, 50430 remain ... 100436 objects, 0.038279 GC time, 100434 remain
So 100k objects takes only 38ms to garbage collect, while the 23k in Luz takes around 160ms? What is going on here?
I tried another test, this time loading Ruby bindings for GTK+ library that the Luz editor uses.
# ...the same helper methods from above...
5.times { gc_report }
require 'gtk2'
5.times { gc_report }
The output surprised me:
403 objects, 0.047237 GC time, 390 remain 392 objects, 0.001476 GC time, 391 remain 393 objects, 0.001331 GC time, 391 remain 393 objects, 0.001326 GC time, 391 remain 393 objects, 0.002924 GC time, 391 remain ...gtk loaded... 11648 objects, 0.014791 GC time, 5172 remain 5174 objects, 0.052599 GC time, 5171 remain 5173 objects, 0.018333 GC time, 5171 remain 5173 objects, 0.056349 GC time, 5169 remain 5171 objects, 0.056296 GC time, 5169 remain
Two things to notice here.
The first is that GC time jumps dramatically after loading GTK, taking 40x as long in the common case. Note that we’re not actually using GTK, just loading the bindings.
The second thing to notice is that GC time is rather erratic. (I can’t explain that.)
Does the same happen when loading other libraries?
Loading just the Cairo bindings produces:
514 objects, 0.001132 GC time, 512 remain
Loading just webrick produces:
3401 objects, 0.015232 GC time, 3399 remain
One conclusion I’m pulling from this is that object count matters, but other factors matter much more.
Someone pointed out on the #ruby-lang IRC channel that libraries (such as at GTK bindings) can hook in to the garbage collection process, and may be doing extra work with every GC run. This seems like a very, very likely culprit and a good place to begin profiling.
So what does this mean for Luz?
The good news here is that Ruby itself is not the main problem.
Ruby is certainly not off the hook, however, as it would be a lot nicer (from a Ruby programmer’s point of view) to have a GC that runs concurrently, or one that can be set to run for a limited amount of time, or any number of other solutions (not to mention reference counting instead of GC).
If avoiding loading GTK is enough to solve Luz’s GC problems, that is a definitely possibility. Only the Luz editor requires GTK, and a live performance doesn’t need the editor.
We can extract the Luz engine + plugins into a separate process which shouldn’t suffer from the GC problems noted above. Only one way to find out, though.
Please take all of this with a grain of salt. The simple GTK test application above looks pretty damning, but I’m still not sure exactly what is going on under the hood, so I’m not pointing any fingers.
[...] Ruby Garbage Collection Update I have some more information about the Ruby/GTK+ garbage collection situation that I talked about in my previous post. [...]
| The first is that GC time jumps dramatically after loading GTK, taking 40x as long in the common | case. Note that we’re not actually using GTK, just loading the bindings.
I would guess the reason here is that in loading GTK, the amount of heap space used by ruby for the “buckets” containing objects goes up exponentially. The current design of ruby 1.8.6 prevents this space from ever being reclaimed. Because scanning of these buckets occurs at GC time, having a large number of “heap buckets” increases the time each GC pass takes.
would be interesting to see if the patch at http://lloydforge.org/projects/ruby/ affects the results in any interesting way.
lloyd
saw your previous post, would still be curious about the effects of this patch, but am less certain it would affect your situation positively.. sigh.
lloyd