Array() in Ruby

Dec 27, 2020

When I have a mix of values or variables that I want to treat as an array, I am in the habit of using the Array(...) method to make sure I am always working with an array, without having to explicitly check for nil or non-array values. The method first tries to call #to_ary, then #to_a, on the passed argument.

This basically means that if the argument is an array, it is returned as-is, and if it can act as an array, it is converted to one. Anything that responds to neither method is wrapped in a single-element array, which is why Array(3) => [3]. And since nil.to_a => [], we get an empty array if the argument is nil, which is often what I want.
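To make the conversion rules concrete, here is a small sketch of how Array() treats a few different arguments. The Playlist class is purely hypothetical and exists only to show that a custom #to_a is picked up.

```ruby
# Hypothetical class that defines #to_a but not #to_ary.
class Playlist
  def initialize(*songs)
    @songs = songs
  end

  def to_a
    @songs.dup
  end
end

Array([1, 2])                  # => [1, 2]       (already an array)
Array(1..3)                    # => [1, 2, 3]    (converted via Range#to_a)
Array({ a: 1 })                # => [[:a, 1]]    (converted via Hash#to_a)
Array('apple')                 # => ["apple"]    (no #to_ary or #to_a, so it is wrapped)
Array(Playlist.new('x', 'y'))  # => ["x", "y"]   (converted via the custom #to_a)
```

Note the Hash case: Array() turns a hash into key/value pairs rather than wrapping it, which may or may not be what you expect.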

Confusion over this method

I have seen other methods used where the result of Array() was what was really wanted. Here is a comparison of Array() with the other array methods I have seen it confused with.

Array(...)

?> Array(nil)
=> []

?> Array(3)
=> [3]

?> Array([3])
=> [3]

Array[...]

?> Array[nil]
=> [nil]

?> Array[3]
=> [3]

?> Array[[3]]
=> [[3]]

Array.new(...)

?> Array.new(nil)
TypeError: no implicit conversion from nil to integer

?> Array.new(3)
=> [nil, nil, nil]

?> Array.new([3])
=> [3]

The last example in particular has been a source of confusion, because it appears to give you exactly what you want when passing an existing array. But as soon as nil is passed, it raises an error. Very dangerous when testing is lacking...
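There is one more difference that the transcripts above do not show: Array.new copies an existing array, while Array() hands back the very same object. A minimal sketch, where the variable name list is just for illustration:

```ruby
list = [3]

Array(list).equal?(list)      # => true  (the same object comes back)
Array.new(list).equal?(list)  # => false (a new array with the same elements)

# Array.new raises for nil, while Array() quietly returns an empty array.
begin
  Array.new(nil)
rescue TypeError => e
  puts "Array.new(nil) raised #{e.class}"
end

Array(nil) # => []
```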

How it can be used

A common scenario where I use Array() is when I want to check if one variable is allowed or valid based on a second variable. This might look like this.

BY_CATEGORY = {
  'fruits' => %w(apple banana orange),
  'vegetables' => %w(carrot cucumber tomato)
}

Array(BY_CATEGORY[category]).include?(item)

And the result will be true or false, without having to check for nil values before checking for array inclusion. Very neat and compact in my opinion.

Here are a few alternatives that I have seen used, and have sometimes used myself.

BY_CATEGORY[category] && BY_CATEGORY[category].include?(item)

(arr = BY_CATEGORY[category]) && arr.include?(item)

BY_CATEGORY[category]&.include?(item)

All of these are variants of checking for nil before checking for inclusion. The first variant avoids assigning a new variable, but has the added expense of accessing the hash twice.

The second variant stores the hash value in a temporary variable before checking for inclusion, which is efficient but looks quite cumbersome and is disliked by certain style guides.

And the third variant uses the safe navigation operator (&.), which requires at least Ruby 2.3 (released in 2015).
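One practical difference between these variants and Array() is the return value when the category is missing. The sketch below wraps each variant in a method so they can be compared side by side; the method names are made up for this example.

```ruby
BY_CATEGORY = {
  'fruits' => %w(apple banana orange),
  'vegetables' => %w(carrot cucumber tomato)
}

def with_array(category, item)
  Array(BY_CATEGORY[category]).include?(item)
end

def with_double_lookup(category, item)
  BY_CATEGORY[category] && BY_CATEGORY[category].include?(item)
end

def with_temp_variable(category, item)
  (arr = BY_CATEGORY[category]) && arr.include?(item)
end

def with_safe_navigation(category, item)
  BY_CATEGORY[category]&.include?(item)
end

# Known category: all four agree.
with_array('fruits', 'apple')            # => true
with_safe_navigation('fruits', 'apple')  # => true

# Unknown category: only Array() gives false, the others give nil.
with_array('meat', 'apple')              # => false
with_double_lookup('meat', 'apple')      # => nil
with_temp_variable('meat', 'apple')      # => nil
with_safe_navigation('meat', 'apple')    # => nil
```

In a condition, nil and false behave the same, so this only matters if the result is stored, compared, or serialized.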

What about performance?

While I believe that performance differences on this level are most likely negligible compared to other improvements that can be made in production applications, I was interested to know how these examples compare. Obviously there is a cost (however minor) to creating an empty array instance just to check for an inclusion that will always be false. So I ran some numbers.

require 'benchmark'

BY_CATEGORY = {
  'fruits' => %w(apple banana orange),
  'vegetables' => %w(carrot cucumber tomato)
}

cat1 = 'fruits'
cat2 = nil
item = 'pineapple'

Benchmark.bm do |x|
  n = 10_000_000
  x.report("When arg exists: Array(group[cat]).include?        ") {
    n.times { Array(BY_CATEGORY[cat1]).include? item }
  }
  x.report("When arg exists: group[cat] && group[cat].include? ") {
    n.times { BY_CATEGORY[cat1] && BY_CATEGORY[cat1].include?(item) }
  }
  x.report("When arg exists: (res = group[cat]) && res.include?") {
    n.times { (res = BY_CATEGORY[cat1]) && res.include?(item) }
  }
  x.report("When arg exists: group[cat]&.include?              ") {
    n.times { BY_CATEGORY[cat1]&.include?(item) }
  }
  x.report("When arg is nil: Array(group[cat]).include?        ") {
    n.times { Array(BY_CATEGORY[cat2]).include? item }
  }
  x.report("When arg is nil: group[cat] && group[cat].include? ") {
    n.times { BY_CATEGORY[cat2] && BY_CATEGORY[cat2].include?(item) }
  }
  x.report("When arg is nil: (res = group[cat]) && res.include?") {
    n.times { (res = BY_CATEGORY[cat2]) && res.include?(item) }
  }
  x.report("When arg is nil: group[cat]&.include?              ") {
    n.times { BY_CATEGORY[cat2]&.include?(item) }
  }
end

Which resulted in the following:

                                                        user     system      total        real
When arg exists: Array(group[cat]).include?          1.456000   0.000000   1.456000 (  1.458288)
When arg exists: group[cat] && group[cat].include?   1.996000   0.004000   2.000000 (  2.002088)
When arg exists: (res = group[cat]) && res.include?  1.384000   0.000000   1.384000 (  1.384498)
When arg exists: group[cat]&.include?                1.312000   0.000000   1.312000 (  1.314109)
When arg is nil: Array(group[cat]).include?          1.824000   0.000000   1.824000 (  1.828075)
When arg is nil: group[cat] && group[cat].include?   0.584000   0.000000   0.584000 (  0.586510)
When arg is nil: (res = group[cat]) && res.include?  0.612000   0.000000   0.612000 (  0.612291)
When arg is nil: group[cat]&.include?                0.576000   0.000000   0.576000 (  0.577248)

As expected, there was a significant difference when the argument was nil and an empty array had to be created. That &. was the fastest in both scenarios is no surprise either, since it presumably takes advantage of runtime optimisations.

Just note that &. works in this scenario because we know that we will only ever be working with nil or array values. There might be scenarios where the argument could be a String or another type of object, in which case &. would call include? on that object directly, while Array() would first convert it to an array.
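A String is a good illustration of that caveat, because it has its own include? that does a substring match. A minimal sketch, with value as a stand-in for whatever was looked up:

```ruby
value = 'pineapple'

value&.include?('apple')            # => true  (String#include? matches a substring)
Array(value).include?('apple')      # => false (["pineapple"] has no element "apple")
Array(value).include?('pineapple')  # => true  (matches the whole element)
```

Both calls succeed without an error here, so this kind of mix-up can go unnoticed.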

Conclusion

As I already mentioned, the performance differences are minor, but if you want to account for them, I would suggest using the safe navigation operator as long as you know the value can only ever be nil or an array.

But if you are working with a combination of values where some are arrays and some are other objects, then Array() will be better suited. And if nil is a rare occurrence, then it is only marginally slower than &. (about 10% in the benchmark above, 1.46 s versus 1.31 s).

Happy Holidays and Happy Coding!

Also posted on dev.to