A Gotcha With perl’s “each()”

It’s my bi-monthly perl blog entry, to earn my place on planet.perl.org! ;)

Here’s an interesting “gotcha”. Take this code:

    perl -e '%t=map{$_=>1}qw/1 2 3/;
    while(($k,$v)=each %t){print "1: $k\n"; last;}
    while(($k,$v)=each %t){print "2: $k\n";}'

In other words, iterate through all the key-value pairs in %t once, then do it again — but exit early in the first loop.

You would expect to get something like this output:

    1: 1
    2: 1
    2: 3
    2: 2

Instead, you see:

    1: 1
    2: 3
    2: 2

The “1” entry in the second loop is AWOL. Here’s why — as “perldoc -f each” notes:

There is a single iterator for each hash, shared by all “each”, “keys”, and “values” function calls in the program

That’s all “each” calls, throughout the entire codebase, possibly in a different class entirely. Argh.
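To make the "different class entirely" point concrete, here is a minimal sketch. The helper `peek_first_key` is hypothetical, standing in for any distant code that calls “each” once on the same hash and then walks away:

```perl
use strict;
use warnings;

my %t = map { $_ => 1 } qw/1 2 3/;

# Hypothetical helper, imagined to live in some other module:
# it reads one pair and returns, leaving the hash's iterator advanced.
sub peek_first_key {
    my ($hash) = @_;
    my ($k) = each %$hash;    # list context: (key, value); keep the key
    return $k;
}

my $peeked = peek_first_key(\%t);

# This loop looks self-contained, but it resumes where the helper stopped.
my @seen;
while (my ($k, $v) = each %t) {
    push @seen, $k;
}

print "peeked: $peeked\n";
print "loop saw: @seen\n";    # two keys, and never the peeked one
```

Which key goes missing depends on the hash's internal ordering, so the bug can look intermittent from the caller's side.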

The workaround: reset the iterator using “keys” between calls to “each”:

    perl -e '%t=map{$_=>1}qw/1 2 3/;
    while(($k,$v)=each %t){print "1: $k\n"; last;}
    keys %t;
    while(($k,$v)=each %t){print "2: $k\n";}'

This got us in SpamAssassin — bug 4829.
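A rough sketch of how this tends to bite in real code, where the early exit is a `return` rather than a `last`. The function names here are made up for illustration and are not taken from the SpamAssassin patch:

```perl
use strict;
use warnings;

my %t = map { $_ => 1 } qw/1 2 3/;

# Hypothetical lookup with an early return: the iterator is left mid-hash.
sub first_key_leaky {
    my ($hash) = @_;
    while (my ($k, $v) = each %$hash) {
        return $k;
    }
    return;
}

# Same lookup, but: remember the result, leave the loop, reset, then return.
sub first_key_reset {
    my ($hash) = @_;
    my $found;
    while (my ($k, $v) = each %$hash) {
        $found = $k;
        last;
    }
    keys %$hash;    # keys in void context resets the iterator
    return $found;
}

# Counts how many pairs a fresh-looking each() loop actually visits.
sub count_pairs {
    my ($hash) = @_;
    my $n = 0;
    while (my ($k, $v) = each %$hash) { $n++ }
    return $n;
}

first_key_leaky(\%t);
my $after_leaky = count_pairs(\%t);    # 2: one pair was already consumed

first_key_reset(\%t);
my $after_reset = count_pairs(\%t);    # 3: iterator was reset first

print "after leaky lookup: $after_leaky pairs\n";
print "after reset lookup: $after_reset pairs\n";
```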

To be honest, having to call “keys” after the loop is kludgy. As you can see from the patch in bug 4829, we had to change from a “return inside loop” pattern to a “set variable and exit loop, reset state, then return” pattern. It’d be nice to have a scoped version of each(), instead of this single iterator shared across the whole program, so that this would work:

    perl -e '%t=map{$_=>1}qw/1 2 3/;
    { while(($k,$v)=scoped_each %t){print "1: $k\n"; last;} }
    # that each() iterator is now out of scope, so garbage-collected;
    # the next call uses a new iterator, starting from scratch
    { while(($k,$v)=scoped_each %t){print "2: $k\n";} }'

Scoping, of course, has the benefit of allowing “return early” patterns to work; in my opinion, those are clearer, if only because they require fewer lines of code ;)
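Something close to a scoped iterator can be approximated today with a closure. This is only a sketch: `make_each` is a hypothetical helper, not a Perl builtin, and it works by snapshotting the keys up front rather than by giving the hash a second real iterator.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a private iterator over a snapshot of
# the hash's keys, taken when the iterator is created.
sub make_each {
    my ($hash) = @_;
    my @keys = keys %$hash;
    return sub {
        return unless @keys;            # empty list ends the while loop
        my $k = shift @keys;
        return ($k, $hash->{$k});
    };
}

my %t = map { $_ => 1 } qw/1 2 3/;
my (@first, @second);

{
    my $it = make_each(\%t);
    while (my ($k, $v) = $it->()) { push @first, $k; last; }
}
# $it is out of scope here; the next block starts from scratch
{
    my $it = make_each(\%t);
    while (my ($k, $v) = $it->()) { push @second, $k; }
}

print "1: $_\n" for @first;
print "2: $_\n" for @second;
```

The trade-offs: the snapshot costs memory proportional to the number of keys, a key deleted after the snapshot comes back with an undefined value, and the “keys” call inside `make_each` itself resets the hash's built-in iterator, so it can still disturb someone else's in-progress “each” loop.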

  1. Posted April 3, 2006 at 13:10

    Yeah, almost everyone walks into this gotcha at one point or another. On PerlMonks, it is a periodical feature – every other month, someone who just discovered it posts a meditation to warn others about it. In fact I’ve grown tired of telling people “yeah, everyone runs into this.” ;-)

    Once in a deep blue moon, this single-iterator behaviour is in fact desired.

    However, why go to such lengths? Wouldn’t it suffice to just couple the keys call with the last?

    while(($k,$v)=each %t){print "1: $k\n"; keys %t, last;}
    while(($k,$v)=each %t){print "2: $k\n";}

    Sometimes, whether something is a kludge or not depends merely on how it is cast…

  2. Posted April 3, 2006 at 13:34

    Aristotle — nice idiom! It hadn’t occurred to me that it’d be fine to reset the iterator inside the loop scope itself. I like it a lot.

  3. Posted April 3, 2006 at 15:26

    “each” is ugly, basically for this reason. All it offers over “for $k (keys %t) { $v = $t{$k} }” is a little less typing and a performance gain that in almost all cases is trivially small. Working around it as you describe loses you the typing advantage, and makes the code harder to understand. Just use “keys” :)

  4. ben
    Posted April 3, 2006 at 17:37

    Ugly? A Perl function? I’m shocked.