Using select, reject, collect, inject and detect.

I spend a lot of time convincing my friends to switch to a Mac. Some of my friends are also software developers so naturally, just when they think the evangelism has come to an end, I convince them to get on the Rails. However, learning Rails usually means learning Ruby for the first time as well. In this post I am going to address one of the issues I see for newcomers to Ruby. Looping.

Looping in Ruby seems to be a process of evolution for newcomers to the language. Newcomers will always find their way to the for loop almost immediately and when confronted with iterating an array, the first choice will generally be a for..in:

a = [1,2,3,4]
for n in a
  puts n
end

This works, but it’s not very… Ruby. The next stage of evolution will be using an iterator for the first time. So the for loop gets dropped altogether and each is used. The Rubyist is born at this point:

a.each do |n|
  puts n
end

What I see next is a lot of conditional logic being used inside the each block. The logic is generally introduced to perform the following operations:

  1. Building a list of items from an array.
  2. Total the items in an array.
  3. Find an item in the array.
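
For reference, here is a rough sketch of what that each-heavy code tends to look like for the three operations above. The variable names (big, total, found) are made up for illustration and are not from any real project.

```ruby
a = [1, 2, 3, 4]

# 1. Building a list of items from an array
big = []
a.each do |n|
  big << n if n > 2
end

# 2. Totaling the items in an array
total = 0
a.each do |n|
  total += n
end

# 3. Finding an item in the array
found = nil
a.each do |n|
  if n == 3
    found = n
    break
  end
end
```

Each loop needs a temporary variable set up outside the block, which is the bookkeeping the other iterators take off your hands.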

So if this is you, then stop. Ruby has plenty more iterators where each came from. Which one you should be using depends on what operation you are trying to perform. So let’s take a look at our previous list and see if we can find a more Ruby way to get them done.

Building a list of items from the array using select

For this operation you should be using select. The way select works is simple: it iterates through all the elements in your array and runs your block on each one. If the block returns a true value (anything other than false or nil), it adds the item to a new array, which it returns when the iteration is complete. Here’s an example:

a = [1,2,3,4]
a.select {|n| n > 2}

This will return a new array containing the last two elements: [3,4]. Why? Because 3 and 4 are both greater than 2, which was the logic we placed in the block. It’s worth noting that select has an evil step sister named reject, which performs the opposite operation. When the block returns false (or nil), the item is added to the array that is returned. Here’s the same example as before, except we swap select for reject:

a = [1,2,3,4]
a.reject {|n| n > 2}

In this example the return value is [1,2] because the condition is false for these elements.
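
One way to convince yourself that the two are opposites is to run both with the same block and compare the results. This is only a small sketch, and the variable names kept and dropped are mine.

```ruby
a = [1, 2, 3, 4]

kept    = a.select { |n| n > 2 }  # items where the block is truthy
dropped = a.reject { |n| n > 2 }  # items where the block is falsy

p kept     # prints [3, 4]
p dropped  # prints [1, 2]
p a        # prints [1, 2, 3, 4]; the original array is untouched
```

Every element ends up in exactly one of the two results, and neither call modifies a.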

I also have to mention another close sibling of select and reject: collect, which returns an array of values that are the RESULT of the logic in the block. Previously we returned the item itself, based on the result of the CONDITION in the block. So perhaps we need to square the values in our array:

a = [1,2,3,4]
a.collect {|n| n*n}

This returns a new array with each item in our array squared: [1,4,9,16].
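
To make the RESULT versus CONDITION distinction concrete, the sketch below feeds collect the same condition we gave select earlier. It also uses map, which is another name for collect. The variable names are only for illustration.

```ruby
a = [1, 2, 3, 4]

squares = a.collect { |n| n * n }  # the block's results
flags   = a.collect { |n| n > 2 }  # a condition just yields booleans
same    = a.map { |n| n * n }      # map is another name for collect

p squares          # prints [1, 4, 9, 16]
p flags            # prints [false, false, true, true]
p same == squares  # prints true
```

Notice that collect always returns an array the same length as the original, whatever the block does.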

Finally, note that using select, reject, and collect returns an array. If you want to return something different, because you are concatenating or totaling values, then check out inject.

Total the items in an array using inject

When you think of accumulating, concatenating, or totaling values in an array, then think of inject. The main difference between select and inject is that inject gives you another variable for use in the block. This variable, referred to as the accumulator, stores the running total or concatenation as the iteration proceeds. Whatever your block returns at the end of each iteration becomes the new value of the accumulator, and the final value is what inject returns. For example, let’s sum all the numbers in our array:

a = [1,2,3,4]
a.inject {|acc,n| acc + n}

This will return 10, the total of all the elements in our array. The logic in our block is simple: add the current element to the accumulator. Remember, the block must return the updated accumulator in each iteration. If we had simply placed n in the block, the final value would have been 4. Why? Because it’s the last value in the array, and since we never added it to the accumulator, the accumulator would simply be replaced by n in each iteration.
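
If the accumulator still feels abstract, it can help to record what the block sees on each pass. The sketch below does that with a throwaway steps array (my own addition, purely for illustration) and then tries the block that returns n alone.

```ruby
a = [1, 2, 3, 4]

# With no argument, inject starts the accumulator at the first element
# and begins iterating from the second.
steps = []
sum = a.inject do |acc, n|
  steps << [acc, n]
  acc + n
end

p steps  # prints [[1, 2], [3, 3], [6, 4]]
p sum    # prints 10

# Returning n alone throws the running value away each time.
last = a.inject { |acc, n| n }
p last   # prints 4
```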

You can also pass an argument to inject to set the initial value of the accumulator:

a = [1,2,3,4]
a.inject(10) {|acc,n| acc + n}

In this example the result is 20 because we assigned the accumulator an initial value of 10.

If you need to return a string or an array from inject, then you will need to treat the accumulator variable that way. You can use the initial value argument of inject to do this:

a = [1,2,3,4]
a.inject([]) {|acc,n| acc << n+n}

In this example I add n to itself and then append it to the accumulator variable, so the result is [2,4,6,8]. I initialized the accumulator as an empty array using the initial value argument.
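
The same trick works for strings. Here is a small sketch that starts the accumulator as an empty string and then repeats the array version for comparison. The names joined and doubled are just for illustration.

```ruby
a = [1, 2, 3, 4]

# Start from an empty string and append each number's text form.
joined = a.inject("") { |acc, n| acc + n.to_s }
p joined   # prints "1234"

# Start from an empty array and append each doubled value.
doubled = a.inject([]) { |acc, n| acc << n + n }
p doubled  # prints [2, 4, 6, 8]
```

In both cases the block returns the updated accumulator: acc + n.to_s builds a new string, and << returns the array it appended to.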

Find an item in the array using detect

Our last example operation was to find an element in the array. Let’s just put it out there and say that other iterators could be used to select the correct value from the array, but I am going to show you how to use detect to round out our exploration of these iterators.

So let’s find the value 3 in our array using detect:

a = [1,2,3,4]
a.detect {|n| n == 3}

This returns 3, the value we were looking for. If no element had matched, detect would have returned nil.
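
Here is a short sketch of both outcomes, plus a counter showing that detect stops looking once it has a match. The checked counter and the first_big name are only there for the demonstration.

```ruby
a = [1, 2, 3, 4]

p a.detect { |n| n == 3 }  # prints 3
p a.detect { |n| n == 9 }  # prints nil

# detect returns the first match and stops; count how many items it looks at.
checked = 0
first_big = a.detect do |n|
  checked += 1
  n > 1
end

p first_big  # prints 2
p checked    # prints 2
```

Even though 3 and 4 are also greater than 1, only the first match comes back, and the block never runs for them.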

So if your head is spinning at this point as to which iterator to use when, then remember this:

  1. Use select or reject if you need to select or reject items based on a condition.
  2. Use collect if you need to build an array of the results from logic in the block.
  3. Use inject if you need to accumulate, total, or concatenate array values together.
  4. Use detect if you need to find an item in an array.

By using these iterators you will be one step closer to mastering… Ruby-Fu.

  • http://chadnantais.com/ chad

    brilliant.

  • http://www.dcmanges.com/ Dan Manges

    I also use any? and all? fairly regularly. Although not as often as the methods you mentioned.

  • Josh

    In this example the result is 20 because we assigned the accumulator an initial value of 20.

    I think you meant 10… Otherwise, great article!

  • curtis

    The for..in is functionally identical to for each. What would be really unforgivable is if someone did this in any case that an iterator based loop would do:

    a = [1,2,3,4]
    1.upto(a.length-1) do |i|
      foo(a[i])
    end

  • http://sam.aaron.name/ Sam Aaron

    Very nicely written post, thanks. My only point is that I use the synonym map instead of collect, although I’m not really sure why, and it might help people starting out to know that map and collect are actually the same thing. Keep it up!

  • Mike Breen

    Nice post. Coming from .NET to Ruby I certainly went through this learning process when working with arrays.

  • Travis

    very nicely explained, good job! simple and to the point.

  • Ed

    Simple yet good explanation :-)

  • Vojtech Salbaba

    Nice work with the inject, thank you.

  • http://matthewcarriere.com Matthew

    You’re right Josh. Nice catch. Thanks.

  • Soundar Rathinasamy

    nice.very useful

  • Ritesh

    Great tutorial..I think I just moved one more level up in ruby :)

  • Maria Elena Caldereta

    Great article. Thanks!

  • jiangxiao

    thianks

  • Rite

    thanx.. it helps a lot

  • http://www.aerogami.com.br Mohamad El-Husseini

    One important difference between collect and detect is that detect will stop iterating as soon as the block evaluates to true, while collect will continue until the end of the array.

    • http://matthewcarriere.com Matthew Carriere

      Good point!