Ruby’s Garbage Collection Problem

EDIT: This post deals with Ruby 1.8, and specifically Ruby with the Ruby-Gtk bindings loaded. The problems have been worked around in the Luz project, and Ruby 1.9 is also much smoother, enough for 30+ frames-per-second animation without noticeable GC hiccups. There are more updates at the bottom.

Ruby’s garbage collector (GC) has become a problem for the Luz user experience.

The GC process can cause the entire application to pause for upwards of 200ms at a time (on a 1.2 GHz Pentium III), which is simply unacceptable for an application doing real-time animation where, to achieve even 24fps, a new frame must be generated every 42ms. As a result, we see ‘hiccups’ in the animation.

People have suggested various solutions, which I’d like to address here.

Ruby’s GC can be disabled

When GC is disabled, if your application produces any junk objects at all, memory use will grow steadily.

In theory, if we could eliminate all garbage production, disabling GC would be feasible, but this is quite impossible in a Ruby application, as even the simplest Ruby code manages to produce garbage objects.
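A minimal sketch of why this fails, counting live Strings with ObjectSpace; the loop is just a stand-in for ordinary application work:

```ruby
# With GC disabled, even trivial code piles up garbage that is never reclaimed.
GC.start       # collect once so we start from a clean heap
GC.disable

before = 0
ObjectSpace.each_object(String) { before += 1 }

10_000.times { |i| i.to_s }   # each call allocates a brand-new String

after = 0
ObjectSpace.each_object(String) { after += 1 }
GC.enable

puts "#{after - before} strings accumulated while GC was off"
```

Every one of those throwaway strings stays on the heap until GC is re-enabled and run, so memory use only ever grows.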

Ruby’s GC can be scheduled for convenient times

Unfortunately, for applications producing real-time animation, there is never a convenient time to pause the application.
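For completeness, here is roughly what that suggestion looks like in code, a sketch in which render_frame is a hypothetical stand-in for per-frame work:

```ruby
# Pay the GC cost between batches of frames instead of at an arbitrary moment.
runs_before = GC.count            # total collections so far (Ruby 1.9+ API)

2.times do
  GC.disable                      # no pauses while "rendering"
  3.times do
    # render_frame                # hypothetical per-frame work
    1_000.times { |i| i.to_s }    # stand-in allocation
  end
  GC.enable
  GC.start                        # a full collection at a chosen moment
end

puts "forced #{GC.count - runs_before} collections at chosen times"
```

The forced collection still pauses for just as long; it merely moves the pause, which is why this doesn't help an animation that needs a frame every 42ms.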

Ruby’s GC time can be reduced by reducing the active object count

This is true, and to some degree it’s possible, but Ruby programmers shouldn’t have to constantly worry about creating objects!

Further, the gains to be had here are minimal, as most objects are, of course, necessary.
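That said, it is worth measuring where allocations actually happen before deciding. A sketch using the Ruby 1.9+ ObjectSpace.count_objects API (the helper name allocations_for is mine, not from the post):

```ruby
# Count roughly how many heap slots a block consumes, to find allocation hotspots.
def allocations_for
  GC.start                        # settle the heap before measuring
  counts = ObjectSpace.count_objects
  before = counts[:TOTAL] - counts[:FREE]
  yield
  counts = ObjectSpace.count_objects
  (counts[:TOTAL] - counts[:FREE]) - before
end

kept = nil
wasteful = allocations_for { kept = (1..1000).map { |i| i.to_s } }      # ~1000 Strings plus an Array
frugal   = allocations_for { s = String.new; 1000.times { s << 120 } }  # one growing String

puts "map: ~#{wasteful} objects, reused buffer: ~#{frugal}"
```

A reused buffer allocates orders of magnitude fewer objects, but rewriting idiomatic Ruby into that style everywhere is exactly the burden the paragraph above objects to.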

So where does this leave us?

At first, I thought the problem was simple: more objects means more GC time. In that case the only solutions would be a rewrite of the Ruby GC or running Luz on faster hardware.

But after writing some test programs, I’ve found that the situation is more complex than I had thought.

Have a look at the following application:

module ObjectSpace
	def self.object_count(base_class = Object)
		count = 0
		each_object(base_class) { count += 1 }
		return count
	end
end

def gc_report
	puts "#{ObjectSpace.object_count} objects, #{start = Time.now ; GC.start ; Time.now - start} GC time, #{ObjectSpace.object_count} remain"
end

# Toss some junk in the trunk
trunk = []
strs = ('a'..'z').to_a
10.times do
	trunk << (1..10000).collect { strs[rand(strs.size)] + strs[rand(strs.size)] }  # ...random combinations...
	gc_report
end

The output looks something like this:

10420 objects, 0.003659 GC time, 10424 remain
50427 objects, 0.019191 GC time, 50430 remain
100436 objects, 0.038279 GC time, 100434 remain

So 100k objects take only 38ms to garbage collect, while the roughly 23k objects in Luz take around 160ms? What is going on here?

I tried another test, this time loading the Ruby bindings for the GTK+ library that the Luz editor uses.

# ...the same helper methods from above...

5.times { gc_report }

require 'gtk2'
puts '...gtk loaded...'

5.times { gc_report }

The output surprised me:

403 objects, 0.047237 GC time, 390 remain
392 objects, 0.001476 GC time, 391 remain
393 objects, 0.001331 GC time, 391 remain
393 objects, 0.001326 GC time, 391 remain
393 objects, 0.002924 GC time, 391 remain
...gtk loaded...
11648 objects, 0.014791 GC time, 5172 remain
5174 objects, 0.052599 GC time, 5171 remain
5173 objects, 0.018333 GC time, 5171 remain
5173 objects, 0.056349 GC time, 5169 remain
5171 objects, 0.056296 GC time, 5169 remain

Two things to notice here.

The first is that GC time jumps dramatically after loading GTK, taking 40x as long in the common case. Note that we’re not actually using GTK, just loading the bindings.

The second thing to notice is that GC time is rather erratic. (I can’t explain that.)

Does the same happen when loading other libraries?

Loading just the Cairo bindings produces:

514 objects, 0.001132 GC time, 512 remain

Loading just webrick produces:

3401 objects, 0.015232 GC time, 3399 remain

One conclusion I’m pulling from this is that object count matters, but other factors matter much more.

Someone pointed out on the #ruby-lang IRC channel that libraries (such as the GTK bindings) can hook into the garbage collection process, and may be doing extra work with every GC run. This seems like a very, very likely culprit and a good place to begin profiling.
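As a first profiling step, GC::Profiler (a Ruby 1.9+ API; the 1.8-era code above predates it) breaks out per-run GC timings, so runs recorded before and after requiring a binding can be compared. A sketch:

```ruby
# GC::Profiler records timing details for each collection while enabled.
GC::Profiler.enable
3.times { GC.start }
report = GC::Profiler.result      # a table: invoke time, heap use, GC time per run
GC::Profiler.disable
GC::Profiler.clear

puts report.lines.first           # e.g. the "GC <n> invokes." summary line
```

This only times Ruby's own collector, not the extension's mark callbacks individually, but a jump in per-run time right after a require points at that library's GC hooks.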

So what does this mean for Luz?

The good news here is that Ruby itself is not the main problem.

Ruby is certainly not off the hook, however, as it would be a lot nicer (from a Ruby programmer’s point of view) to have a GC that runs concurrently, or one that can be set to run for a limited amount of time, or any number of other solutions (not to mention reference counting instead of GC).

If avoiding loading GTK is enough to solve Luz’s GC problems, that is a definite possibility. Only the Luz editor requires GTK, and a live performance doesn’t need the editor.

We can extract the Luz engine + plugins into a separate process which shouldn’t suffer from the GC problems noted above. Only one way to find out, though.

EDIT: Soon after this, the Luz Performer was born, a pure Ruby+SDL+OpenGL application that animates quite smoothly. It is what is run when the user clicks the Play Fullscreen button.

EDIT: Many years later, the Ruby-GTK team decided to stop including OpenGL support, leaving Luz Studio high and dry. This was the impetus to find a new UI toolkit for Luz.  Failing that, we chose to implement a custom widget toolkit using the same Ruby 3D-drawing API as Luz’s Actor Effect plugins use.  This effort would be called Luz 2.0.


4 Responses to Ruby’s Garbage Collection Problem

  1. […] Ruby Garbage Collection Update I have some more information about the Ruby/GTK+ garbage collection situation that I talked about in my previous post. […]

  2. | The first is that GC time jumps dramatically after loading GTK, taking 40x as long in the common case. Note that we’re not actually using GTK, just loading the bindings.

    I would guess the reason here is that in loading GTK, the amount of heap space used by ruby for the “buckets” containing objects goes up exponentially. The current design of ruby 1.8.6 prevents this space from ever being reclaimed. Because scanning of these buckets occurs at GC time, having a large number of “heap buckets” increases the time each GC pass takes.

    would be interesting to see if the patch at affects the results in any interesting way.


  3. saw your previous post, would still be curious about the effects of this patch, but am less certain it would affect your situation positively.. sigh.


  4. […] I found a nice article about this topic here and another one here – old but still […]
