The McDev Blog

Kevin McKelvin's perspective of the Ruby world.

Enumerators in Ruby 1.9

Lazy evaluation of enumerables is one of the most exciting new features in Ruby 2.0’s standard library. Changing the execution sequence of an enumeration pipeline to yield item by item is as easy as starting the enumeration chain with lazy.

This type of lazy evaluation is the standard when working with IEnumerable<T> in the .NET space. It allows you to create a pipeline that can project from one data structure into another without needing to evaluate an entire stack of objects at a time. This is really useful when dealing with ETL tasks as you can work with one entry at a time instead of projecting an array of all entries at each step of the process. This gives tremendous efficiency when reading hundreds of thousands of entries in on one side of the pipeline, doing a few map/reduce transformations and saving the result of the transformation.
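To make that concrete, here’s a minimal sketch of a lazy pipeline. It assumes Ruby 2.0 or later, and the variable names and the squaring step are mine, purely for illustration.

```ruby
# Infinite source: nothing is computed until a consumer asks for a value.
numbers = (1..Float::INFINITY).lazy

evens_squared = numbers
  .select { |n| n.even? }  # filter one item at a time
  .map    { |n| n * n }    # transform one item at a time

# Only as many items as needed are pulled through the pipeline.
first_three = evens_squared.first(3)
puts first_three.inspect
# >> [4, 16, 36]
```

Without lazy, the select would try to walk the whole infinite range before map ever saw an item.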

While Ruby 2.0’s Enumerator::Lazy really brings Ruby up to that level of efficiency, there are ways of getting that behaviour in Ruby 1.9 using the Enumerator class.

Consider this example:

puts RUBY_VERSION

en = Enumerator.new do |e|
  puts "yielding a"
  e.yield 'a'
  puts "yielding b"
  e.yield 'b'
  puts "yielding c"
  e.yield 'c'
end

en.each do |e|
  puts "received #{e}"
end
# >> 1.9.3
# >> yielding a
# >> received a
# >> yielding b
# >> received b
# >> yielding c
# >> received c

Yielding from the Enumerator hands execution back to the consuming code after each entry, whereas if you project the enumerator into an array first, you get a different execution order:

puts RUBY_VERSION

en = Enumerator.new do |e|
  puts "yielding a"
  e.yield 'a'
  puts "yielding b"
  e.yield 'b'
  puts "yielding c"
  e.yield 'c'
end

en.to_a.each do |e|
  puts "received #{e}"
end
# >> 1.9.3
# >> yielding a
# >> yielding b
# >> yielding c
# >> received a
# >> received b
# >> received c

It’s a subtle difference, but yields extreme power (pun intended ;)).
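The same trick extends to a whole pipeline step. Here’s a rough sketch of a lazy map built on nothing but Enumerator, so it should behave the same on 1.9. The lazy_map helper is a hypothetical name of mine, not part of the standard library.

```ruby
# Hypothetical helper: wraps a source in a new Enumerator that transforms
# each item only when the consumer asks for it.
def lazy_map(source, &transform)
  Enumerator.new do |out|
    source.each { |item| out.yield(transform.call(item)) }
  end
end

log = []

source = Enumerator.new do |e|
  %w[a b c].each do |letter|
    log << "yielding #{letter}"
    e.yield letter
  end
end

upcased = lazy_map(source) { |s| s.upcase }
upcased.each { |s| log << "received #{s}" }

puts log
# >> yielding a
# >> received A
# >> yielding b
# >> received B
# >> yielding c
# >> received C
```

Each item flows through the transformation and reaches the consumer before the next one is produced, so no intermediate array is ever built.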

Faster Rails Testing With Zeus

Rails boot times suck. Waiting between 5 and 25 seconds to restart a server, run a test or to open the Rails console just doesn’t cut it. When doing TDD, you need the shortest feedback cycle possible.

Performance patched Ruby installs give you a significant improvement, and with Ruby 2.0, the require path is much faster than it used to be. These improvements are great, but we can still do more.

On cue, in comes Zeus. In a nutshell, Zeus preloads your development and test environments and makes the on-demand initialization of Rails servers, tests, rake tasks and consoles blisteringly fast. More technically speaking, Zeus is a process checkpointer for single-threaded applications. It’s been built for Ruby, but support for other languages is planned. For this post however, I just care about Ruby.

Zeus is an external piece of software to your Rails app. It’s distributed as a gem, but must not be included in your Gemfile. It’s designed to be run outside of bundler.

Installing

To install Zeus, simply install the gem: gem install zeus

Once Zeus is installed, cd to your project directory and run: zeus start

Zeus then fires up Ruby and checkpoints it at a point where you can connect to the process to run commands such as rake, test, rspec, console, and server.

To demonstrate, once Zeus has initialised, run zeus rspec spec and watch the magic happen.

Vim integration

@r00k has a really nifty script in his vimrc that will run tests through Zeus. Just drop this into your .vimrc.

""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
" Test-running stuff
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
function! RunCurrentTest()
  let in_test_file = match(expand("%"), '\(.feature\|_spec.rb\|_test.rb\)$') != -1
  if in_test_file
    call SetTestFile()

    if match(expand('%'), '\.feature$') != -1
      call SetTestRunner("!zeus cucumber")
      exec g:bjo_test_runner g:bjo_test_file
    elseif match(expand('%'), '_spec\.rb$') != -1
      call SetTestRunner("!zeus rspec")
      exec g:bjo_test_runner g:bjo_test_file
    else
      call SetTestRunner("!ruby -Itest")
      exec g:bjo_test_runner g:bjo_test_file
    endif
  else
    exec g:bjo_test_runner g:bjo_test_file
  endif
endfunction

function! SetTestRunner(runner)
  let g:bjo_test_runner=a:runner
endfunction

function! RunCurrentLineInTest()
  let in_test_file = match(expand("%"), '\(.feature\|_spec.rb\|_test.rb\)$') != -1
  if in_test_file
    call SetTestFileWithLine()
  end

  exec "!zeus rspec" g:bjo_test_file . ":" . g:bjo_test_file_line
endfunction

function! SetTestFile()
  let g:bjo_test_file=@%
endfunction

function! SetTestFileWithLine()
  let g:bjo_test_file=@%
  let g:bjo_test_file_line=line(".")
endfunction

Bind it to a hotkey:

map <leader>rt :call RunCurrentTest()<CR>
map <leader>rl :call RunCurrentLineInTest()<CR>

Gotcha with oh-my-zsh

If you’re using the OMZ bundler plugin, you’ll need to remove the zeus command from the plugin’s autobundle list in ~/.oh-my-zsh/plugins/bundler/bundler.plugin.zsh. Zeus will run your application through bundler by itself. Running Zeus inside bundler slows it down.

I find myself making single line changes and just running my specs habitually now, where I wouldn’t have bothered in the past. The responsiveness and feedback is addictive!

Happy testing!

Groping Test Tools and Their Effect on Object Oriented Design

Groping test tools like TypeMock exist in the realm of statically typed languages to allow you to bypass encapsulation and access private members of the objects you’re testing. In the Ruby world we’ve got metaprogramming and the #send method which can pretty much do the same thing, but in a deceptively simple way.

These tools have a place in our arsenal. If I’m testing a codebase that hasn’t considered object oriented best practices, it’s useful to have tools like Timecop that can give control over Ruby’s global date and time, or FakeWeb that intercepts HTTP calls. But if you’re using these tools on a fresh codebase, you’re violating object oriented best practices.

Just because you have a hammer, doesn’t mean that you should treat every problem as a nail. These ‘groping’ test tools have a tendency to be that hammer. We use encapsulation and information hiding to provide a cleaner API to consumers, but if we need to access private state from tests, then an argument can be made that the information should be available at a different scope, and that there’s a flaw in the design of the API.

Consider the testability of the following snippet. Notice the hard dependency on Time.zone.now - which is a non-deterministic function that accesses global state.

class Post < ActiveRecord::Base
  def publish!
    update_attributes(published_at: Time.zone.now)
  end

  def published?
    published_at && published_at < Time.zone.now
  end
end

To test this, we use a groping test tool - Timecop - to modify Ruby’s date and give us control. Timecop does this by using some of Ruby’s metaprogramming features.

describe Post do
  describe "#publish!" do
    it "sets the published at timestamp" do
      date = "2013-01-01".to_date
      Timecop.freeze(date)

      subject.publish!
      subject.published_at.should == date

      Timecop.return
    end
  end
end

My problem with this style of testing is that there’s no definition of intent. Reading the test, there’s no link between freezing the time and the command to publish the post. It’s implied, but it’s not clear. We’re changing something in a global state that just happens to have an effect on the published_at timestamp. It’s a side-effect. What’s more, if you forget to Timecop.return afterwards, you’ve infected the global state of DateTime with a trait that will be shared across other tests. If you’ve ever seen ridiculous profiles on run times in RSpec - that’s probably because you’ve forgotten to Timecop.return, and the Ruby global date is still in an altered state.

Allowing the dependency on time to be injected from outside of the method call gives us flexibility in testing. It gives us full control of a variable that is otherwise in a global, uncontrolled state.

Let’s refactor that first example a bit:

class Post < ActiveRecord::Base
  def publish!(clock = Time.zone)
    publish_at clock.now
  end

  def publish_at(time)
    update_attributes(published_at: time)
  end

  def published?(clock = Time.zone)
    published_at && published_at < clock.now
  end
end

Now #publish! depends on any object that responds to #now. So when we’re working in a test environment we can stub that out with OpenStruct.

require 'ostruct'

describe Post do
  describe "#publish!" do
    it "sets the published at timestamp" do
      clock = OpenStruct.new(now: Time.zone.now)
      subject.publish!(clock)
      subject.published_at.should == clock.now
    end
  end
end

In production, thanks to Ruby’s default parameter values, we can still maintain an easily readable API without having to pass Time.zone in everywhere. Just call post.publish!
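If you want to play with the idea outside of Rails, here’s a rough plain-Ruby sketch. The Post class below is a hypothetical stand-in with no ActiveRecord behind it, and I’ve assumed Time as the default clock because Time.zone only exists with ActiveSupport loaded.

```ruby
require 'ostruct'

# Plain-Ruby stand-in for the model above (no database, illustration only).
class Post
  attr_reader :published_at

  # The clock is any object that responds to #now.
  def publish!(clock = Time)
    publish_at clock.now
  end

  def publish_at(time)
    @published_at = time
  end

  def published?(clock = Time)
    published_at && published_at < clock.now
  end
end

# Two fixed clocks give the test full control over "now".
fixed_clock = OpenStruct.new(now: Time.utc(2013, 1, 1))
later_clock = OpenStruct.new(now: Time.utc(2013, 1, 2))

post = Post.new
post.publish!(fixed_clock)

puts post.published_at == fixed_clock.now  # >> true
puts post.published?(later_clock)          # >> true
```

Nothing global is touched, so there’s nothing to reset once the test is done.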

I far prefer this design since we’re keeping a similar level of abstraction at each method. We consider publishing to be the act of setting the published_at field, but there’s only one place that actually encapsulates that in code - in the publish_at method. This is evidence of following DRY.

While I prefer this design, there’s another level of refactoring we can do in this code around the dependency on Time.zone.now and determining whether a post is published or not, but I’m going to leave that for a future post. Stay tuned!

Rubyfuza 2013 in Hindsight

On 7 - 9 February, the Strand Tower Hotel in Cape Town played host to the 2013 instalment of Rubyfuza. Rubyfuza is the premier Ruby event of the year in Africa, attracting speakers from all over the world including top people from Heroku, Github and EngineYard.

This year’s conference was unquestionably the best one yet. The talks were insightful, thought provoking and were presented at a level that challenged my opinions on software. The opportunity to engage with industry professionals has re-ignited my passion for the craft of writing good software.

It was also breaking new ground for me as I gave my first ever conference talk, speaking about applying best practices when using RSpec. It’s a different ballgame speaking to 120 industry professionals at a conference over speaking to 25 people at a local usergroup. But as stressful as it was (I was nervous as hell), it was also really addictive. Now that I’ve broken into it, I’m dying to do it again and improve.

The three presentations that had the most impact on me were

  • Aslam Khan’s “Not Quite Object Oriented” - showing off some of the advantages of functional programming, monads and tree data structures.
  • Vicent Marti’s “Unicorns Die With Bullets Made Of Glitter” - explaining the problems in MRI’s garbage collector. I’m never going to lose the image of the two-headed troll named “Mark & Sweep”
  • Jesse Newland’s “ChatOps at Github” - Evidently, Hubot’s built for more than just finding funny cat photos. It’s one flexible piece of software when it comes to managing infrastructure in a visible way. After this talk, I’ve started building some services for Platform45’s infrastructure to do more dev-ops in our Campfire.

I must mention Marc Heiligers (the conference organiser) in closing. Spectacular job mate. The work you’ve put into Rubyfuza over the last 3 years has pulled a really tight knit community together and has established South Africa on the global map in the Ruby space.

Clearing Old Rails Logs

When developing Rails apps, the logs tend to subtly grow without being checked. On one machine I freed up 10GB of disk space just by clearing old Rails development and test logs.

If you’re like me, you probably have most of your projects sitting in a Code or Projects directory in your home. You probably tail -f the log file once every now and again, but don’t really need to keep the entire log file around.

Here’s a shell command that will go through every project in the ~/Code directory and clear out old log files.

find ~/Code -type d -iname log -maxdepth 2 | xargs -I% find % -type f -iregex ".*log$" | xargs rm

I’ve got this command aliased to rmlogs in my zshrc as well.
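For comparison, here’s a rough Ruby sketch of the same cleanup. The clear_rails_logs name is hypothetical, and I’ve assumed it only needs to look at *.log files in a log directory directly inside each project, which is narrower than the find command above. The demo runs against a throwaway temp directory so nothing real gets deleted.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: deletes every *.log file in <root>/<project>/log
# and returns the paths it removed.
def clear_rails_logs(root)
  pattern = File.join(root, '*', 'log', '*.log')
  Dir.glob(pattern)
     .select { |path| File.file?(path) }
     .each   { |path| File.delete(path) }
end

# Demo against a temporary directory instead of ~/Code.
Dir.mktmpdir do |root|
  log_dir = File.join(root, 'blog', 'log')
  FileUtils.mkdir_p(log_dir)
  File.write(File.join(log_dir, 'development.log'), 'noise')

  removed = clear_rails_logs(root)
  puts "removed #{removed.size} file(s)"
end
```

To run it for real you’d call clear_rails_logs(File.expand_path('~/Code')).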

Regain Your Sanity - Grouping RSpec Examples

When doing TDD it’s important to have a short feedback cycle. Fast tests aren’t just a nice-to-have, they’re an essential part of the cycle. If I have to wait 5 minutes to know whether my tests pass or fail, I’m naturally going to slack off on running them as often as I should. This breaks the red/green/refactor cycle.

Rails integration tests using Capybara and Selenium are painfully slow to run every time your code changes. Thankfully RSpec has a --tag argument that can alleviate this pain.

Any describe, context or it block can be tagged by passing a hash after the description. For example:

describe User, group: 'user' do
  ...
end

To run only the contents of specs tagged with group: 'user', run RSpec with: rspec spec --tag group:user

Tags can be inverted to run everything except specs with a given tag by prefixing the tag with ~. For example: rspec spec --tag ~group:user

Applying this to Rails

rspec-rails will automatically tag every spec in the spec/requests directory with type: 'request'. Capybara also uses the js: true tag to determine whether to run a headless test or to run the test through Selenium.

Applying our knowledge about tags and the RSpec runner, you can run everything except the request specs by running:

rspec spec --tag ~type:request

Or you can just ignore the JavaScript specs that run inside a browser by running: rspec spec --tag ~js:true

Happy testing!

Migrating to Octopress

I’ve been rather quiet lately, but over the last couple of weeks I’ve been determined to revisit my blog and spiff it up a bit.

I have been unhappy with Wordpress as a blog engine for a long time. I’ve been looking for good alternatives - to the point that I had even written my own blog engine using Rails. @iFrankZA pointed me to Octopress and I realized that this was the tool I had really been looking for.

A few things I really like about Octopress:

  • All posts are written in Markdown
  • It’s all static HTML (Huge scalability)
  • Adding new content is dead easy using rake tasks
  • It’s Ruby!

To migrate from Wordpress to Octopress, exitwp came in really handy. It converted all of my posts from Wordpress into Markdown with the headers that Octopress expects.

I encountered a few issues around the encoding of UTF-8 characters when generating the Octopress site, but adding the following lines to my .zshrc fixed the problem by changing the default encoding from ASCII to UTF-8.

.zshrc
export LC_CTYPE=en_US.UTF-8
export LANG=en_US.UTF-8

Deployment

My favourite part of Octopress is its deployment model. Because the whole site is generated into HTML, I can use rsync to deploy any new content to my web server over SSH. It just needs a slight modification to the default Rakefile:

ssh_user       = "user@domain.com"
ssh_port       = 22
document_root  = "~/domain.com/"
rsync_delete   = true
deploy_default = "rsync"

Then deployment is as easy as running rake gen_deploy

Rails 3.2 Released - the Upgrade Story

Rails 3.2 was released on Friday 20 January. So as any good developer would, I started playing around with it on an app I’m building.

Getting the bundle updated was a breeze: just add these lines to the Gemfile and run bundle update

gem 'rails', '3.2.0'
gem 'sass-rails',   '3.2.3'
gem 'coffee-rails', '~> 3.2.1'
gem 'uglifier', '>= 1.0.3'

Once the updated gems had installed, firing up the rails server spat a few errors out. It turns out ActiveAdmin v0.3.4 doesn’t work with Rails 3.2. Within a few hours of the Rails release there was already a fix for this which has been pulled into the activeadmin Github repo.

Bundling against the edge release is risky, but it works and all my tests were still passing:

gem 'activeadmin', git: 'git://github.com/gregbell/active_admin.git'

The last issue I had was related to using Ruby 1.9.3. WEBrick has a few issues, so I switched over to using Thin in development instead.

gem 'thin'

Overall I’ve noticed a significant improvement in speed in Rails 3.2’s development environment using Ruby 1.9.2. It feels even faster when I ramp it up to Ruby 1.9.3.

Check out the Rails 3.2 release announcement and the upgrade instructions doc.

OSX Lion Reverse Scrolling

The subject of mouse movement in OS X has been debated for years now. Personally I despise the default movement settings in OS X and have a whole array of tweaks in place. But that’s a debate for another day.

Today I’m looking at the new “natural” reversed scrolling feature in Lion. Having used it for a couple of days now I found that I like having the reverse scrolling on the trackpad, but whenever I reach for my mouse I prefer the classic scrolling method that we’ve been using for years.

I dug around and couldn’t find any way inside OS X of decoupling them, but I came across a cool app called Scroll Reverser. It allows you to customize the reverse scrolling of the trackpad and not of the mouse.

The settings might seem a bit confusing though. I have OS X set up to reverse scrolling so that it works across all applications. I then use Scroll Reverser to reverse the scrolling on the mouse from OS X’s behaviour back to the classic behaviour.

Scroll Reverser Settings

FakeWeb

I was recently testing a client I wrote for a server API. Being relatively new to testing with RSpec and Ruby, I initially took the naive approach of building a node.js application to behave as a dummy test server.

Originally when I wrote the code I knew there had to be a better way, but I only found that better way today.

I was revisiting some of that code and discovered FakeWeb. It’s a Ruby framework that makes it simple to test code that involves HTTP requests. It intercepts HTTP calls made through Net::HTTP and makes it dead simple to create predictable responses for those calls.

This means that tests that would have been regarded as integration tests before can be isolated from the dependency on an external server and can be executed as unit tests. Big win!

Here’s a sample of how it works in context of RSpec:

before do
  FakeWeb.allow_net_connect = false
  register_body = {
    :id => 1
  }.to_json
  FakeWeb.register_uri(:post, 'http://localhost/register', :body => register_body)
end

The first line in the before block tells FakeWeb to disallow any real network connections from happening. Every ‘connection’ must be handled inside FakeWeb. An exception is raised if a request can’t be handled by FakeWeb directly.

The register_body variable simply contains the key-value pairs to be returned in the HTTP response’s body as JSON.

FakeWeb.register_uri then registers a verb and URI to be handled by FakeWeb, and sets the body of the response. It’s also possible to set a status code and to handle :any verb.

Once the URI has been registered, any call made to that URI from Net::HTTP will be responded to by FakeWeb.

My tests are now running marginally faster and are far less flaky since all the external dependencies are now being substituted with a controlled and predictable stub.

To get going, just add the gem to your Gemfile and run bundler:

gem 'fakeweb'

Also check out the FakeWeb docs at RubyForge