Rubygems, HTTPS, and jRuby OpenSSL LoadError

UPDATE: It seems there was enough clamor over the unannounced change that a reversion to HTTP was warranted. The advice below isn’t needed currently, but may become relevant in the future, should HTTPS end up back in the mix without a suitable workaround.

Rubyists have been dropping by #rubygems this morning experiencing issues with jRuby and gem install. The issue is related to Rubygems.org’s switch to forcing HTTPS on connections to their Gem repository. Specifically, there is a recursive issue with jruby-openssl when trying to install Gems from Rubygems.org: you need jruby-openssl to install jruby-openssl.

If you attempt to install a Gem in any version of jRuby prior to 1.7 that doesn’t already have jruby-openssl installed, you will receive a LoadError that looks something like this:

Exception `LoadError' at /home/username/.rubies/jruby-1.5.1/lib/ruby/site_ruby/shared/jruby/openssl/autoloads/ssl.rb:8 - OpenSSL::SSL requires the jruby-openssl gem

The trouble is, you’ll get this message even if you’re trying to install the jruby-openssl gem. Uh oh.
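
A quick way to tell whether a given interpreter is affected is to try loading OpenSSL before reaching for gem install. This is only a minimal sketch: the method name openssl_available? is made up for illustration, and it assumes that touching OpenSSL::SSL is enough to trigger the LoadError shown above on an affected jRuby.

```ruby
# Hypothetical helper (not part of RubyGems or jRuby): reports whether
# OpenSSL::SSL can be loaded by the current interpreter.
def openssl_available?
  require "openssl"
  OpenSSL::SSL # referencing the constant forces any lazy autoload to run
  true
rescue LoadError
  false
end

puts "OpenSSL usable: #{openssl_available?}"
```

If this prints false, HTTPS gem sources are likely to fail in the way described here.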

The root of the problem comes down to two factors:

  1. Prior to jRuby 1.7, OpenSSL support was not included in jRuby core because of crypto export constraints
  2. Rubygems.org recently switched to forcing HTTPS as a means of increasing security in the rubygems ecosystem

The jRuby devs have worked through the crypto export issue, so updating to jRuby 1.7 will solve the problem; you no longer need jruby-openssl, period. If you’re stuck on an older version of jRuby and need to get gem install working again, you can use this horrible, hacky workaround:

Download the jruby-openssl .gem file (using something like wget/curl) and install it from the local file like so:

wget http://production.cf.rubygems.org/gems/jruby-openssl-0.7.7.gem
gem install --local ./jruby-openssl-0.7.7.gem

Be sure to replace the version number with one compatible with your version of jRuby. Also, understand that there are no guarantees that the URL scheme above will remain in place. Rubygems.org is meant to be consumed through its API, so the implementation behind it may change. The long term solution is to move to jRuby 1.7.
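
If you need to repeat this for several versions, the two commands can be generated rather than retyped. The sketch below is purely illustrative: the helper name is hypothetical, and the host and path are copied from the example above, so they carry the same "may change at any time" caveat.

```ruby
# Host and path taken from the example above; not guaranteed to stay valid.
GEM_HOST = "http://production.cf.rubygems.org/gems"

# Hypothetical helper: returns the shell commands for the workaround,
# given the jruby-openssl version you want to install.
def jruby_openssl_workaround(version)
  file = "jruby-openssl-#{version}.gem"
  [
    "wget #{GEM_HOST}/#{file}",
    "gem install --local ./#{file}"
  ]
end

puts jruby_openssl_workaround("0.7.7")
```

The helper only builds the command strings; it does not download or install anything on its own.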

Why be evil.rb?

Caius Durling shared some “hax” with the Ruby community that inspired some discussion over at Hacker News. I commented there, but it seemed like a decent topic for a blog post. One of the examples illustrates a means to define a method-local variable in the argument definition list:

  def output name=((default=true); "caius")
    "name: #{name.inspect} -- default: #{default.inspect}"
  end

  output() # => "name: \"caius\" -- default: true"
  output("fred") # => "name: \"fred\" -- default: nil"

Have a closer look at the method definition line. This code works because of the way parentheses are handled. Much like in math, when parentheses are encountered, we evaluate from the inside out. Fire up an IRB session and run this code:

((default=true); "caius")
default.inspect

The return value given for the first line is “caius”, which gets assigned to “name” in our argument list, but you can also see that default is set to true. Using the semicolon statement separator only works because we’ve wrapped the whole thing in parentheses. The return value is “caius” because a parenthesized group of statements evaluates to its last expression, the same rule that gives Ruby methods their implicit return value.

This might all seem trivial to you if you’ve been programming in Ruby for a while, but therein lies the crux of the discussion. If we only ever wrote code for ourselves, this would be a non-argument. If we expect other people to read and use our code, we should write in a way that is easy to interpret.

Like many viewpoints, the “wrongness” of this example is not black & white; it’s shades of gray. On one hand, you have the “anything that will eval is valid Ruby” view, and on the other you have the “If it’s not immediately obvious to a beginner, you shouldn’t do it” view. There may be better ways to express those two sides of the matter, but that’s the general idea.

The problem with this code (from the latter viewpoint) is that it crams too much program logic into the argument definitions. This example uses parentheses to force the evaluation of default=true; "caius" in the argument definition list. That’s only two statements, but it violates some common expectations:

  1. We generally expect argument definitions to be clear and readable, so that method definitions are self documenting (to some degree); this approach clutters the argument definitions.

  2. We expect argument definitions to sometimes assign default values.

  3. We expect program logic to appear in the body of a method, or to be DRY’d up in separate methods.

In this way, the example is not “incorrect” but awkward. To borrow an idea from the literate programming camp, I’d say that just because you can write awkward sentences with valid grammar, it doesn’t mean you should.
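
For comparison, here is one way the same behaviour might be written with the logic moved into the method body. It is a sketch rather than a recommendation from Caius or anyone else: the NOT_GIVEN sentinel is a name invented for this example, and it reports false instead of nil when an argument is passed.

```ruby
# Hypothetical sentinel used to detect "no argument was passed".
NOT_GIVEN = Object.new.freeze

def output(name = NOT_GIVEN)
  # The "was the default used?" logic lives in the body, not the signature.
  default = name.equal?(NOT_GIVEN)
  name = "caius" if default
  "name: #{name.inspect} -- default: #{default.inspect}"
end

puts output          # name: "caius" -- default: true
puts output("fred")  # name: "fred" -- default: false
```

It is a few lines longer, but the argument list goes back to doing one job.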

Side note: In Caius’ defense, he does say never to use these.

Walk a mile in Egor’s shoes

This was a rough weekend for Egor Homakov. Right now, some of you are questioning my framing of this weekend’s events as rough for Egor and not Github. After all, he went and violated Github’s ToS. There’s a tone of arrogance that rings through in his posts, but I suspect a portion of this can be attributed to his broken English. Recognition may be what motivated him to take the path he did (public disclosure), but I don’t mean to judge him. I can identify with Egor to some degree, because I’ve worn his shoes for about a year now.

Some time late last year, while working on a systems automation project, I discovered a DoS vulnerability in a fairly common piece of voice network hardware. This particular hardware is commonly used in telecommunications to provide voice interconnects between VoIP networks and legacy systems. There are thousands (maybe tens of thousands) sitting on public IP addresses. This particular vulnerability causes the gear in question to crash spectacularly, invoking a core dump and cold boot. The really nasty part is that the “exploit” doesn’t require any special skill. I’d say there’s a fairly high likelihood that someone else could stumble upon this, just like I did, provided they were using the same toolchain. Maybe they have.

I disclosed the vulnerability to the vendor in March of last year (2011). The last time I tested, the issue remained. I disclosed through a third party that has a VAR relationship with the manufacturer, so I don’t have any direct insight into their handling of the issue. There’s a small part of me that desires to make an example out of one, or both, of them.

It makes me pretty angry. I have some idea of how many of these devices are out in the wild. They’re in use by some very large companies. Exploiting the vulnerability causes the device to become unavailable for 3-5 minutes at a time. An attacker would only need a handful of hosts to keep a device offline for a long while, even if the device operator actively blocked attacking hosts. Mitigating the attack would take some pretty significant retooling of the network configuration because of the service that is exploited. This is telecommunications hardware. Imagine taking down a large company’s entire call center for a couple of hours. How big of a check do you think they’d write to make you go away?

So here I sit, knowing that this problem exists, but worrying that public disclosure could make me an outsider really quickly. Maybe Egor is just young and naive. Maybe he has guts. I don’t know him, so I can’t say, but I can tell you that knowing about a major security exploit is an uncomfortable position to be in. This is doubly so when you are ignored after disclosing it.

Before you judge him for the way he handled the situation, take a step back and make sure you understand which side of the table you’re sitting on.

Rails: rails vs ./script/rails

Longtime Rails developer Jesse Storimer (of Shopify) recently blogged about the efficiency of the new (since 3.0) rails command versus using plain old ./script/rails. You can find his piece here: The rails command and exec(2). You should definitely read that before moving on.

So Jesse’s premise is that you should consider using ./script/rails instead of the rails command because of the overhead involved with the rails script and exec. When I read this, I was skeptical. Just how much overhead can there be in a simple wrapper like the rails script? Why ask rhetorical questions when we can know for sure!?

I performed this testing in our staging environment:

  • Linode 512 instance running Debian 6.0
  • Rails 3.1.1
  • Ruby 1.9.3-p0
  • Bundler 1.0.21
  • Gems in bundle: 68

Since this is a capistrano deployed environment, we rely on bundler, so I wasn’t able to test the bare rails wrapper. Maybe another time.

I ran two sets of tests. First, I figured I’d run a test against rails runner because it wasn’t immediately obvious to me how I’d execute rails console using the GNU time utility. Once the console is open, the REPL interface would sit there waiting on me and be included in the time results. Easy fix. I simply added a call to exit to my ~/.irbrc file, causing irb to exit immediately and reliably without waiting on a slow human. Once I figured out how I’d test the console, I wired up my tests.

TL;DR – Avoiding the wrapper bundle exec rails saves you about 1 second for every execution of console or runner. Meh. Although, a full second can seem much longer when you’re reloading your console 5 times in a row because of some stupid error you keep making.

Read on to see the methodology and analysis.

Files

Exec: count gems (no file, just executed at a prompt)

# Be sure to subtract 1 for the header line!
bundle show | wc -l

File: test_script

puts "Runners gon' run!"

File: ~/.irbrc

# Be sure to remove this later or you'll be reeeeeally confused
exit

File: bench_runner.sh

#!/usr/bin/env bash

echo "Runner tests: 4 times each"
echo
echo "cmd: bundle exec rails runner -e staging test_script"
time bundle exec rails runner -e staging test_script
time bundle exec rails runner -e staging test_script
time bundle exec rails runner -e staging test_script
time bundle exec rails runner -e staging test_script
echo
echo "cmd: ./script/rails runner -e staging test_script"
time ./script/rails runner -e staging test_script
time ./script/rails runner -e staging test_script
time ./script/rails runner -e staging test_script
time ./script/rails runner -e staging test_script

File: bench_console.sh

#!/usr/bin/env bash

echo "Console tests: 4 times each"
echo
echo "cmd: bundle exec rails console staging"
time bundle exec rails console staging
time bundle exec rails console staging
time bundle exec rails console staging
time bundle exec rails console staging
echo
echo "cmd: ./script/rails console staging"
time ./script/rails console staging
time ./script/rails console staging
time ./script/rails console staging
time ./script/rails console staging
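
The same measurement could also be driven from Ruby instead of bash. The following is a rough sketch using the standard library’s Benchmark module; the helper name average_real_time is hypothetical and was not part of the original test run, and timings taken this way include a little overhead from Ruby spawning the command.

```ruby
require "benchmark"

# Hypothetical helper: runs a shell command several times and returns the
# average wall-clock ("real") time in seconds.
def average_real_time(command, runs = 4)
  times = Array.new(runs) { Benchmark.realtime { system(command) } }
  times.reduce(:+) / runs
end

# Example usage, mirroring the commands benchmarked above:
#   average_real_time("bundle exec rails runner -e staging test_script")
#   average_real_time("./script/rails runner -e staging test_script")
```

Like the shell scripts, this measures total elapsed time, which is the number that matters when you’re waiting on a prompt.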

Test Results

Let’s have a look at the results for runner:

Runner tests: 4 times each

cmd: bundle exec rails runner -e staging test_script
Runners gon' run!

real    0m9.949s
user    0m9.190s
sys 0m0.715s
Runners gon' run!

real    0m9.519s
user    0m8.753s
sys 0m0.724s
Runners gon' run!

real    0m9.652s
user    0m8.898s
sys 0m0.712s
Runners gon' run!

real    0m9.898s
user    0m9.105s
sys 0m0.747s

cmd: ./script/rails runner -e staging test_script
Runners gon' run!

real    0m8.765s
user    0m8.101s
sys 0m0.621s
Runners gon' run!

real    0m8.417s
user    0m7.731s
sys 0m0.642s
Runners gon' run!

real    0m8.991s
user    0m8.290s
sys 0m0.657s
Runners gon' run!

real    0m8.530s
user    0m7.828s
sys 0m0.658s

And now console:

Console tests: 4 times each

cmd: bundle exec rails console staging
Loading staging environment (Rails 3.1.1)

real    0m9.712s
user    0m8.885s
sys 0m0.782s
Loading staging environment (Rails 3.1.1)

real    0m9.725s
user    0m8.905s
sys 0m0.774s
Loading staging environment (Rails 3.1.1)

real    0m9.661s
user    0m8.861s
sys 0m0.754s
Loading staging environment (Rails 3.1.1)

real    0m9.742s
user    0m9.024s
sys 0m0.675s

cmd: ./script/rails console staging
Loading staging environment (Rails 3.1.1)

real    0m8.617s
user    0m7.978s
sys 0m0.596s
Loading staging environment (Rails 3.1.1)

real    0m9.168s
user    0m8.385s
sys 0m0.738s
Loading staging environment (Rails 3.1.1)

real    0m8.589s
user    0m7.868s
sys 0m0.677s
Loading staging environment (Rails 3.1.1)

real    0m8.573s
user    0m7.846s
sys 0m0.684s

Analysis

Time to do the math. Since we’re interested in the time we have to wait before we can do more work, we’ll look at the real field results (using Ruby, of course).

Runner:

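# Float#to_d comes from bigdecimal/util (already loaded in a Rails console)
require 'bigdecimal/util'

# average times for bundle exec rails runner
([9.949, 9.519, 9.652, 9.898].reduce(:+).to_d / 4).to_s

# average times for ./script/rails runner
([8.765, 8.417, 8.991, 8.530].reduce(:+).to_d / 4).to_s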

Using bundle exec rails runner: 9.7545s avg
Using ./script/rails runner: 8.67575s avg

Result: avoiding bundle exec rails for runner saves you an average of 1.07875 seconds every time you execute runner.

Console:

# average times for bundle exec rails console
([9.712, 9.725, 9.661, 9.742].reduce(:+).to_d / 4).to_s

# average times for ./script/rails console
([8.617, 9.168, 8.589, 8.573].reduce(:+).to_d / 4).to_s

Using bundle exec rails console: 9.71s avg
Using ./script/rails console: 8.73675s avg

Result: avoiding bundle exec rails for console saves you an average of 0.97325 seconds every time you boot a console.
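
For anyone who wants to rerun the arithmetic in one go, here is a small sketch that computes both savings from the raw numbers. The real times are copied from the test results; the RESULTS constant and the helper names are just for illustration.

```ruby
# "real" times in seconds, copied from the test results above.
RESULTS = {
  "runner" => {
    "bundle exec rails" => [9.949, 9.519, 9.652, 9.898],
    "./script/rails"    => [8.765, 8.417, 8.991, 8.530]
  },
  "console" => {
    "bundle exec rails" => [9.712, 9.725, 9.661, 9.742],
    "./script/rails"    => [8.617, 9.168, 8.589, 8.573]
  }
}

def average(times)
  times.reduce(:+) / times.size
end

# Seconds saved per invocation by skipping bundle exec rails.
def savings(timings)
  average(timings["bundle exec rails"]) - average(timings["./script/rails"])
end

RESULTS.each do |task, timings|
  puts format("%s: %.5f seconds saved", task, savings(timings))
end
# runner: 1.07875 seconds saved
# console: 0.97325 seconds saved
```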

An introduction to Rack

If you write much Ruby, you’ve definitely encountered Rack. For users of frameworks like Rails and Sinatra, Rack can easily be relegated to the world of “plumbing” down there with the web and app servers; something to be configured and ignored. But that would be a mistake. This intro to Rack provides the most straightforward explanation of Rack I’ve seen. Every concept from the bottom up is discussed with clear code examples. If you write web applications using Ruby, you owe it to yourself to brush up on your Rack knowledge.

From RubyLearning.com and Satish Talim: A Quick Introduction to Rack

Simple wrapper for Ruby scripts running in an RVM environment

Ruby is a fantastic programming language, but I’ve not been all that happy with the community’s viewpoint of the execution environment. Ruby is fast moving, so you often encounter scripts or applications that require specific versions of Ruby. This problem isn’t unique to Ruby, but the solutions the Ruby community offers are.

Typically, versioned interpreters are installed using a convention that works something like this. If Ruby 1.9 and Ruby 1.8 were installed concurrently, the interpreters would be installed using /usr/bin/ruby19 and /usr/bin/ruby18. Scripts that require Ruby would have their shebang altered appropriately. You’d also typically symlink /usr/bin/ruby to your preferred (default) Ruby install.

The Ruby community largely eschews this convention. The Ruby interpreters packaged by many Linux distributions do follow it, but most Rubyists will advise you to avoid OS packages because of other issues. As a system administrator, the entire scenario borders on horrifying. I get why Rubyists feel the way they do, but to people on the “other side”, it’s terrifying. Hearing “just install from source” keeps sysadmins up at night.

Enter RVM. Ruby enVironment Manager is a tool for installing and managing Ruby interpreters and RubyGems (Ruby’s library package manager). At its core, RVM is a collection of bash scripts written by Wayne Seguin. It’s an excellent tool, and while it gets some public criticism over methodologies like overriding common shell commands like cd, I’m generally happy with it. I’m one of a growing group of sysadmins who cross over into the programming side of things, so the power and flexibility of having multiple Ruby interpreters and multiple RubyGems gemsets has a lot of value to me.

RVM works by altering the execution environment. I won’t go into details about how to use RVM, but you should be able to glean enough from this simple statement to get some idea of what’s going on:

$ rvm use 1.9.2-p290@sprinkle

This tells RVM to “use” (load) the environment for Ruby version 1.9.2, patch level 290, and use the “sprinkle” gemset. A gemset is just a collection of gems that you give a name. This appears to work like magic, but it’s really quite simple. RVM changes a set of environment variables that are relevant to the execution of Ruby and RubyGems. Basically, a bunch of load and execution paths. That’s all.
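
To see this for yourself, you can print a few of the relevant variables before and after switching rubies. This is a minimal sketch: the list of variable names is my assumption of the usual suspects rather than an exhaustive inventory of what RVM manages, and the helper name is hypothetical.

```ruby
# Variables commonly touched when RVM switches rubies. This list is an
# assumption for illustration, not an exhaustive inventory.
RVM_RELATED_VARS = %w[PATH GEM_HOME GEM_PATH MY_RUBY_HOME]

# Returns a hash of the selected variables from the given environment.
def ruby_env_snapshot(env = ENV)
  RVM_RELATED_VARS.each_with_object({}) { |key, snap| snap[key] = env[key] }
end

ruby_env_snapshot.each { |key, value| puts "#{key}=#{value.inspect}" }
```

Run it once in a fresh shell and once after an rvm use, then compare the output.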

So where’s this all headed? The beauty of RVM is that it encapsulates Ruby execution environments. The primary drawback of RVM is that it encapsulates Ruby execution environments. How’s that? Take the ruby executable path for example:

/Users/myser/.rvm/rubies/ruby-1.9.2-p290/bin/ruby

Not exactly as friendly as /usr/local/bin/ruby. Fortunately, that’s a solved problem. We have /usr/bin/env to work around that problem. The catch is, env works by searching the current environment. As we know, RVM dynamically loads an environment that works for each Ruby. What happens if RVM is never executed, and why would that be the case?

One example is cron. When you run a cron task, a very minimal environment is loaded. This means that RVM isn’t loaded. The standard solution to this is to source the rvm script prior to execution, but this means including the source line in every cron task, which A) requires that you remember it, and B) adds a lot of visual clutter to the crontab. When you’re reviewing a list of cron tasks looking for something specific, all these source lines make a real mess.

Additionally, if you write shell scripts in Ruby, you may need some scripts to run in Ruby 1.8, while others work in Ruby 1.9. In an interactive session, Ruby shell scripts are executed in the currently loaded environment, so you may not get what you expect. I’d prefer a self contained script that loads an expected interpreter and environment explicitly.

Wayne provides some example code for loading the RVM environment on the scripting page, but stops short of a working script example. I’ve taken the very, very small leap to a working script wrapper. The benefit of this wrapper is that it is portable to any environment where RVM is available at a user or system-wide level. You will, however, have to ensure that the appropriate Ruby and gemset are available. The script looks like this:

#!/usr/bin/env bash

# Load RVM into a shell session *as a function*
if [[ -s "$HOME/.rvm/scripts/rvm" ]] ; then
  # First try to load from a user install
  source "$HOME/.rvm/scripts/rvm"
elif [[ -s "/usr/local/rvm/scripts/rvm" ]] ; then
  # Then try to load from a root install
  source "/usr/local/rvm/scripts/rvm"
else
  printf "ERROR: An RVM installation was not found.\n" >&2
  # Bail out rather than running with whatever ruby happens to be on the PATH
  exit 1
fi

# Configure ruby and gemset
rvm [email protected]_name >/dev/null

ruby <<-rb
  puts "Hello!"
rb
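
Since the whole point is to run under a known interpreter, the embedded Ruby could also verify that for itself. Here is a possible sketch; ensure_ruby! is a hypothetical name, not something RVM provides, and the version prefix you pass is up to you.

```ruby
# Hypothetical guard: raises unless the running interpreter's version starts
# with the expected prefix (for example "1.9").
def ensure_ruby!(expected_prefix, actual = RUBY_VERSION)
  return true if actual.start_with?(expected_prefix)
  raise "Expected Ruby #{expected_prefix} but running #{actual}"
end

# Always passes as written; swap in the version your script needs, e.g. "1.9".
ensure_ruby!(RUBY_VERSION)
```

A failure here is a lot easier to diagnose from a cron log than a syntax error caused by the wrong Ruby.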

I’m a jack of all trades, so I’m sure that there are different ideas and methods that may be better than this. I’d love to hear about them! Trackbacks are open on this page, so please comment and let me know what you think.