Consider this code:

h = Hash.new(0)  # New hash pairs will by default have 0 as values
h[1] += 1  #=> {1=>1}
h[2] += 2  #=> {2=>2}

That’s all fine, but:

h = Hash.new([])  # Empty array as default value
h[1] <<= 1  #=> {1=>[1]}                  ← Ok
h[2] <<= 2  #=> {1=>[1,2], 2=>[1,2]}      ← Why did `1` change?
h[3] << 3   #=> {1=>[1,2,3], 2=>[1,2,3]}  ← Where is `3`?

At this point I expect the hash to be:

{1=>[1], 2=>[2], 3=>[3]}

but it’s far from that. What is happening and how can I get the behavior I expect?

I wonder if you’d consider reevaluating your accepted answer, since, IMO, it is incomplete and not wholly correct. I wrote an answer that elaborates more fully on each possible method and why it’s wrong/correct (though obviously you need not choose mine, there are several other answers that improve/expand to varying degrees on the accepted one). – Andrew Marshall
@AndrewMarshall, you're right, thank you for the answer. It's funny to see how things that were unclear 5 years ago make perfect sense now. – Valentin

5 Answers

You're specifying that the default value for the hash is a reference to that particular (initially empty) array.

I think you want:

h = Hash.new { |hash, key| hash[key] = [] }

That sets the default value for each key to a new array.
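As a minimal sketch of the difference (the variable names `shared` and `per_key` are made up for illustration), comparing object identity shows that the argument form hands back one array for every missing key, while the block form builds and stores a separate array per key:

```ruby
# One shared default object versus a fresh array per key.
shared  = Hash.new([])
per_key = Hash.new { |hash, key| hash[key] = [] }

# Every missing key of `shared` returns the very same Array object.
puts shared[:a].equal?(shared[:b])    # true

# The block form creates and stores a distinct Array for each key.
puts per_key[:a].equal?(per_key[:b])  # false

per_key[:a] << 1
per_key[:b] << 2
puts per_key == { a: [1], b: [2] }    # true
```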

How can I use separate array instances for each new hash? – Valentin
That block version gives you new Array instances on each invocation. To wit: h = Hash.new { |hash, key| hash[key] = []; puts hash[key].object_id }; h[1] # => 16348490; h[2] # => 16346570. Also: if you use the block version that sets the value (Hash.new { |hash, key| hash[key] = [] }) rather than the one that simply generates the value (Hash.new { [] }), then you only need <<, not <<=, when adding elements. – James A. Rosen

When you call Hash.new([]), the default value for any key is not just an empty array, it's the same empty array.

To create a new array for each default value, use the block form of the constructor: Hash.new { [] }
But be careful to actually perform an assignment when using the hash created this way: h[1]<<1 won't work. More info here: //… – Mladen Jablanović
Mladen, you help me 3rd time in a row, thanks a great lot! – Valentin
Hash.new {|h,k| h[k]=[]} <-- make that a snippet in your editor ;) – John Douthat
This answer won't work. This is what you need: Hash.new {|h,k| h[k]=[]} – Yossi Shasho

The operator +=, when applied to those hashes, works as expected.

[1] pry(main)> foo = Hash.new( [] )
=> {}
[2] pry(main)> foo[1]+=[1]
=> [1]
[3] pry(main)> foo[2]+=[2]
=> [2]
[4] pry(main)> foo
=> {1=>[1], 2=>[2]}
[5] pry(main)> bar = Hash.new { [] }
=> {}
[6] pry(main)> bar[1]+=[1]
=> [1]
[7] pry(main)> bar[2]+=[2]
=> [2]
[8] pry(main)> bar
=> {1=>[1], 2=>[2]}

This may be because foo[bar]+=baz is syntactic sugar for foo[bar]=foo[bar]+baz. When foo[bar] on the right-hand side of = is evaluated, it returns the default value object, and the + operator does not change it (it returns a new array). The left-hand side is syntactic sugar for the []= method, which won't change the default value either.

Note that this doesn't apply to foo[bar]<<=baz, as it is equivalent to foo[bar]=foo[bar]<<baz, and << will change the default value.
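The two expansions can be checked directly. This is only a sketch; the local `default` is introduced here so the shared default object can be inspected, and is not part of the original session:

```ruby
default = []
h = Hash.new(default)

# h[1] += [1] expands to h[1] = h[1] + [1].
# Array#+ builds a new array, so the default stays empty.
h[1] += [1]
puts default.inspect       # []
puts h[1].equal?(default)  # false

# h[2] <<= 2 expands to h[2] = h[2] << 2.
# Array#<< appends in place, so the shared default is changed
# and then that same object is stored under key 2.
h[2] <<= 2
puts default.inspect       # [2]
puts h[2].equal?(default)  # true
puts h[3].inspect          # [2]
```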

Also, I found no difference between Hash.new{[]} and Hash.new{|hash, key| hash[key]=[];}, at least on Ruby 2.1.2.

Nice explanation. It seems like on Ruby 2.1.1 Hash.new{[]} is the same as Hash.new([]) for me, with the lack of expected << behavior (though of course Hash.new{|hash, key| hash[key]=[];} works). Weird small things breaking all the things :/ – butterywombat

First, note that this behavior applies to any default value that is subsequently mutated (e.g. hashes and strings), not just arrays.
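For instance, a String or Hash default is shared in exactly the same way. The sketch below uses made-up variable names (`names`, `nested`) and `String.new` to get a mutable empty string:

```ruby
# A String default is shared just like an Array default.
names = Hash.new(String.new)
names[:a] << "x"
names[:b] << "y"
puts names[:c]                # xy
puts names.empty?             # true (nothing was ever stored)

# The same holds for a Hash default.
nested = Hash.new({})
nested[:a][:count] = 1
puts nested[:b].key?(:count)  # true
puts nested.empty?            # true
```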

TL;DR: Use Hash.new { |h, k| h[k] = [] } if you want the simplest, most idiomatic solution.

What doesn’t work

Why Hash.new([]) doesn’t work

Let’s look more in-depth at why Hash.new([]) doesn’t work:

h = Hash.new([])
h[0] << 'a'  #=> ["a"]
h[1] << 'b'  #=> ["a", "b"]
h[1]         #=> ["a", "b"]

h[0].object_id == h[1].object_id  #=> true
h  #=> {}

We can see that our default object is being reused and mutated (it is passed as the one and only default value, so the hash has no way of getting a fresh, new default value), but why are there no keys or values in the hash, despite h[1] still giving us a value? Here’s a hint:

h[42]  #=> ["a", "b"]

The array returned by each [] call is just the default value, which we’ve been mutating all this time so now contains our new values. Since << doesn’t assign to the hash (there can never be assignment in Ruby without an = present), we’ve never put anything into our actual hash. Instead we have to use <<= (which is to << as += is to +):

h[2] <<= 'c'  #=> ["a", "b", "c"]
h             #=> {2=>["a", "b", "c"]}

This is the same as:

h[2] = (h[2] << 'c')

Why Hash.new { [] } doesn’t work

Using Hash.new { [] } solves the problem of reusing and mutating the original default value (as the block given is called each time, returning a new array), but not the assignment problem:

h = Hash.new { [] }
h[0] << 'a'   #=> ["a"]
h[1] <<= 'b'  #=> ["b"]
h             #=> {1=>["b"]}

What does work

The assignment way

If we remember to always use <<=, then Hash.new { [] } is a viable solution, but it’s a bit odd and non-idiomatic (I’ve never seen <<= used in the wild). It’s also prone to subtle bugs if << is inadvertently used.

The mutable way

The documentation for Hash.new states (emphasis my own):

If a block is specified, it will be called with the hash object and the key, and should return the default value. It is the block’s responsibility to store the value in the hash if required.

So we must store the default value in the hash from within the block if we wish to use << instead of <<=:

h = Hash.new { |h, k| h[k] = [] }
h[0] << 'a'  #=> ["a"]
h[1] << 'b'  #=> ["b"]
h            #=> {0=>["a"], 1=>["b"]}

This effectively moves the assignment from our individual calls (which would use <<=) to the block passed to Hash.new, removing the burden of unexpected behavior when using <<.

Note that there is one functional difference between this method and the others: this way assigns the default value upon reading (as the assignment always happens inside the block). For example:

h1 = Hash.new { |h, k| h[k] = [] }
h1[:x]  # a plain read of a missing key
h1  #=> {:x=>[]}

h2 = Hash.new { [] }
h2[:x]
h2  #=> {}
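If storing on read is unwanted for a particular lookup, one option is to avoid a plain [] read there. As a rough sketch: Hash#key? and Hash#fetch do not invoke the default block, so they leave the hash untouched.

```ruby
h = Hash.new { |hash, key| hash[key] = [] }

# Hash#key? and Hash#fetch never invoke the default block,
# so they do not create entries.
puts h.key?(:x)               # false
puts h.fetch(:x, []).inspect  # []
puts h.empty?                 # true

# A plain [] read does invoke the block and stores the new array.
h[:x]
puts h.key?(:x)               # true
```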

The immutable way

You may be wondering why Hash.new([]) doesn’t work while Hash.new(0) works just fine. The key is that Numerics in Ruby are immutable, so we naturally never end up mutating them in-place. If we treated our default value as immutable, we could use Hash.new([]) just fine too:

h = Hash.new([].freeze)
h[0] += ['a']  #=> ["a"]
h[1] += ['b']  #=> ["b"]
h[2]           #=> []
h              #=> {0=>["a"], 1=>["b"]}

Of all the ways, I personally prefer this way—immutability generally makes reasoning about things much simpler (this is, after all, the only method that has no possibility of hidden or subtle unexpected behavior).
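A short sketch of what freezing buys: an accidental in-place mutation of the default raises instead of silently corrupting it. The rescue clause assumes Ruby 2.5 or later, where the error is FrozenError (a RuntimeError subclass); older versions raise a plain RuntimeError.

```ruby
h = Hash.new([].freeze)

begin
  h[:a] << 1             # in-place mutation of the frozen default
rescue RuntimeError => e # FrozenError inherits from RuntimeError
  puts e.class           # FrozenError on Ruby 2.5+
end

h[:a] += [1]             # non-mutating: builds and stores a new array
puts h[:a].inspect       # [1]
puts h[:b].inspect       # []
puts h[:a].frozen?       # false
```

Note that the arrays stored by += are ordinary unfrozen arrays, so a later << on an existing key mutates only that key's array.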

A note on the claim that there can never be assignment in Ruby without an = present: this isn’t strictly true. Methods like instance_variable_set bypass this, but they must exist for metaprogramming since the l-value in = cannot be dynamic.

It bears mentioning that using "the mutable way" also has the effect of causing every hash lookup to store a key value pair (since there's an assignment happening in the block), which may not always be desired. – johncip
@johncip Not every lookup, just the first one to each key. But I see what you mean, I’ll add that to the answer later; thanks! – Andrew Marshall
Whoops, being sloppy. You're right, of course, it's the first lookup of an unknown key. I almost feel like { [] } with <<= has the fewest surprises, were it not for the fact that accidentally forgetting the = might lead to a very confusing debugging session. – johncip
pretty clear explanations about differences when initializing hash with default values – cisolarix

When you write,

h = Hash.new([])

you pass a reference to a single default array that is shared by all elements in the hash. Because of that, all elements in the hash refer to the same array.

If you want each element in the hash to refer to a separate array, you should use

h = Hash.new{[]}

For more detail on how it works in Ruby, please go through this:

This is wrong, Hash.new { [] } does not work. See my answer for details. It’s also already the solution proposed in another answer. – Andrew Marshall
