JavaScript – Iteration

Mozilla’s JavaScript extensions introduce new iteration
techniques, including the for/each
loop and Python-style iterators and generators. They are detailed in
the subsections below.

The for/each Loop

The for/each loop is a new
looping statement standardized by E4X. E4X (ECMAScript for XML) is a
language extension that allows XML tags to appear literally in
JavaScript programs and adds syntax and API for operating on XML
data. E4X has not been widely implemented in web browsers, but it is
supported by Mozilla’s JavaScript 1.6 (released in Firefox 1.5). In
this section, we’ll cover only the for/each loop and its use with non-XML
objects. See E4X: ECMAScript for XML for details on the rest of
E4X.

The for/each loop is much
like the for/in loop. Instead of
iterating through the properties of an object, however, it iterates
through the values of those properties:

let o = {one: 1, two: 2, three: 3};
for(let p in o) console.log(p);       // for/in: prints 'one', 'two', 'three'
for each (let v in o) console.log(v); // for/each: prints 1, 2, 3

When used with an array, the for/each loop iterates through the
elements (rather than the indexes) of the array. It typically
enumerates them in numerical order, but this is not actually
standardized or required:

let a = ['one', 'two', 'three'];
for(let p in a) console.log(p);       // Prints array indexes 0, 1, 2
for each (let v in a) console.log(v); // Prints array elts 'one', 'two', 'three'

Note that the for/each loop
does not limit itself to the array elements of an array—it will
enumerate the value of any enumerable property of the array
including enumerable methods inherited by the array. For this
reason, the for/each loop is
usually not recommended for use with arrays. This is particularly
true for code that must interoperate with versions of JavaScript
before ECMAScript 5 in which it is not possible to make user-defined
properties and methods non-enumerable. (See Iterating Arrays for a similar discussion of the
for/in loop.)
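
For readers working in an engine that lacks the for/each statement, the same distinction between names and values can be sketched in standard ECMAScript. This is only an approximation and not the for/each loop itself: Object.values() covers own enumerable properties only, whereas for/each also visits inherited ones. The variable names (names, values, viaForIn, viaForOf) are illustrative.

```javascript
// Names versus values of an object's properties
const o = {one: 1, two: 2, three: 3};
const names = [];
for (const p in o) names.push(p);                  // Property names, as with for/in
const values = [];
for (const v of Object.values(o)) values.push(v);  // Property values (own properties only)
console.log(names.join(","));                      // one,two,three
console.log(values.join(","));                     // 1,2,3

// Why enumerating the properties of an array can surprise you
const a = ['one', 'two', 'three'];
a.note = 'not an element';                // An enumerable, non-index property
const viaForIn = [];
for (const p in a) viaForIn.push(a[p]);   // Visits the three elements and 'note'
const viaForOf = [];
for (const v of a) viaForOf.push(v);      // Visits the three elements only
console.log(viaForIn.length, viaForOf.length);     // 4 3
```

The second half shows the hazard described above: any enumerable property of the array takes part in a property enumeration, not just the indexed elements.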

Iterators

JavaScript 1.7 enhances the for/in loop with more general behavior.
JavaScript 1.7’s for/in loop is
more like Python’s for/in and
can iterate over any iterable object. In
order to understand this, some definitions are required.

An iterator is an object that allows
iteration over some collection of values and maintains whatever
state is necessary to keep track of the current “position” in the
collection. An iterator must have a next() method. Each call to next() returns the next value from the
collection. The counter()
function below, for example, returns an iterator that returns
successively larger integers on each call to next(). Note the use of the function scope
as a closure that holds the current state of the counter:

// A function that returns an iterator
function counter(start) {
    let nextValue = Math.round(start);  // Private state of the iterator
    return { next: function() { return nextValue++; }}; // Return iterator obj
}

let serialNumberGenerator = counter(1000);
let sn1 = serialNumberGenerator.next();    // 1000
let sn2 = serialNumberGenerator.next();    // 1001

Iterators that work on finite collections throw StopIteration from their next() method when there are no more
values to iterate. StopIteration
is a property of the global object in JavaScript 1.7. Its value is
an ordinary object (with no properties of its own) that is reserved
for this special purpose of terminating iterations. Note, in
particular, that StopIteration is
not a constructor function like TypeError() or RangeError(). Here, for example, is a
rangeIter() function that returns
an iterator that iterates the integers in a given range:

// A function that returns an iterator for a range of integers
function rangeIter(first, last) {
    let nextValue = Math.ceil(first);
    return {
        next: function() {
            if (nextValue > last) throw StopIteration;
            return nextValue++;
        }
    };
}

// An awkward iteration using the range iterator.
let r = rangeIter(1,5);                  // Get an iterator object
while(true) {                            // Now use it in a loop
    try {
        console.log(r.next());           // Try to call its next() method
    }
    catch(e) {
        if (e == StopIteration) break;   // Exit the loop on StopIteration
        else throw e;
    }
}

Note how awkward it is to use an iterator object in a loop
where the StopIteration exception
must be handled explicitly. Because of this awkwardness, we don’t
often use iterator objects directly. Instead we use
iterable objects. An
iterable object represents a collection of
values that can be iterated. An iterable object must define a method
named __iterator__() (with two
underscores at the start and end of the name) which returns an
iterator object for the collection.

The JavaScript 1.7 for/in
loop has been extended to work with iterable objects. If the value
to the right of the in keyword is
iterable, then the for/in loop
will automatically invoke its __iterator__() method to obtain an
iterator object. It then calls the next() method of the iterator, assigns the
resulting value to the loop variable, and executes the loop body.
The for/in loop handles the
StopIteration exception itself,
and it is never visible to your code. The code below defines a
range() function that returns an
iterable object (not an iterator) that represents a range of
integers. Notice how much easier it is to use a for/in loop with an iterable range than it
is to use a while loop with a
range iterator.

// Return an iterable object that represents an inclusive range of numbers
function range(min,max) {
    return {                           // Return an object representing a range.
        get min() { return min; },     // The range's bounds are immutable.
        get max() { return max; },     // and stored in the closure.
        includes: function(x) {        // Ranges can test for membership.
            return min <= x && x <= max;
        },
        toString: function() {         // Ranges have a string representation.
            return "[" + min + "," + max + "]";
        },
        __iterator__: function() {     // The integers in a range are iterable.
            let val = Math.ceil(min);  // Store current position in closure.
            return {                   // Return an iterator object.
                next: function() {     // Return next integer in the range.
                    if (val > max)     // If we're past the end then stop.
                        throw StopIteration;
                    return val++;      // Otherwise return next and increment.
                }
            };
        }
    };
}

// Here's how we can iterate over a range:
for(let i in range(1,10)) console.log(i);  // Prints numbers from 1 to 10
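
The steps that the extended for/in loop performs on an iterable object can be written out as an ordinary function. The sketch below is an emulation in plain JavaScript, not how any engine implements the loop. It assumes a stand-in StopIteration object (engines without these extensions do not define the global) and a hypothetical helper named forIn(); the range() here is a cut-down version of the one above.

```javascript
// Stand-in for the JavaScript 1.7 global of the same name (an assumption of this sketch)
const StopIteration = {};

// Hypothetical helper: does by hand what the extended for/in loop does for you
function forIn(iterable, body) {
    const iter = iterable.__iterator__();     // 1. Obtain an iterator object
    while (true) {
        let value;
        try {
            value = iter.next();              // 2. Ask it for the next value
        } catch (e) {
            if (e === StopIteration) return;  // 3. Normal end of the iteration
            throw e;                          //    Anything else is a real error
        }
        body(value);                          // 4. Execute the loop body
    }
}

// A cut-down iterable range, following the range() function above
function range(min, max) {
    return {
        __iterator__: function() {
            let val = Math.ceil(min);         // Current position, held in the closure
            return {
                next: function() {
                    if (val > max) throw StopIteration;
                    return val++;
                }
            };
        }
    };
}

const collected = [];
forIn(range(1, 5), function(i) { collected.push(i); });
console.log(collected.join(","));             // 1,2,3,4,5
```

The loop body is called outside the try block so that only the exception thrown by next() is treated as the end of the iteration.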

Note that although you must write an __iterator__() method and throw a StopIteration exception
to create iterable objects and their iterators, you are not expected
(in normal use) to call the __iterator__() method nor to handle the
StopIteration
exception; the for/in loop does
this for you. If for some reason you want to explicitly obtain an
iterator object from an iterable object, call the Iterator() function. (Iterator() is a
global function that is new in JavaScript 1.7.) If the argument to
this function is an iterable object, it simply returns the result of
a call to the __iterator__() method, keeping your
code cleaner. (If you pass a second argument to Iterator(), it will
pass that argument on to the __iterator__() method.)

There is another important purpose for the Iterator() function, however. When you
call it on an object (or array) that does not have an __iterator__() method, it returns a custom
iterable iterator for the object. Each call to this iterator’s
next() method returns an array of
two values. The first array element is a property name, and the
second is the value of the named property. Because this object is an
iterable iterator, you can use it with a for/in loop instead of calling its
next() method directly, and this
means that you can use the Iterator() function along with
destructuring assignment to conveniently loop through the properties
and values of an object or array:

for(let [k,v] in Iterator({a:1,b:2}))  // Iterate keys and values
    console.log(k + "=" + v);          // Prints "a=1" and "b=2"

There are two other important features of the iterator
returned by the Iterator()
function. First, it ignores inherited properties and only iterates
“own” properties, which is usually what you want. Second, if you
pass true as the second argument
to Iterator(), the returned iterator
will iterate only property names, not property values. The following
code demonstrates these two features:

o = {x:1, y:2};                             // An object with two properties
Object.prototype.z = 3;                     // Now all objects inherit z
for(p in o) console.log(p);                 // Prints "x", "y", and "z"
for(p in Iterator(o, true)) console.log(p); // Prints only "x" and "y"
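
Where the Iterator() function is unavailable, standard ECMAScript offers a rough counterpart to both behaviors: Object.entries() produces [name, value] pairs for own properties, and Object.keys() produces own property names only. This sketch is an approximation rather than a reimplementation of Iterator(), and it inherits from a hypothetical proto object instead of modifying Object.prototype.

```javascript
const proto = {z: 3};               // Supplies an inherited, enumerable property
const o = Object.create(proto);     // o inherits z...
o.x = 1;                            // ...and has two properties of its own
o.y = 2;

const all = [];
for (const p in o) all.push(p);     // for/in includes the inherited property
console.log(all.join(","));         // x,y,z

const pairs = [];
for (const [k, v] of Object.entries(o))   // Own properties only, as [name, value]
    pairs.push(k + "=" + v);
console.log(pairs.join(","));       // x=1,y=2

const own = Object.keys(o);         // Own property names only
console.log(own.join(","));         // x,y
```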

Generators

Generators are a JavaScript 1.7 feature (borrowed from Python)
that use a new yield keyword,
which means that code that uses them must explicitly opt in to
version 1.7, as described in Constants and Scoped Variables. The yield keyword is used in a function and
works something like return
to return a value from the function. The difference between yield and return, however, is that a function that
yields a value to its caller retains its internal state so that it
is resumable. This resumability makes yield a perfect tool for writing
iterators. Generators are a very powerful language feature, but they
can be tricky to understand at first. We’ll begin with some
definitions.

Any function that uses the yield keyword (even if the yield is unreachable) is a
generator function. Generator functions return
values with yield. They may use
the return statement with no
value to terminate before reaching the end of the function body, but
they may not use return with a
value. Except for their use of yield, and this restriction on the use of
return, generator functions are
pretty much indistinguishable from regular functions: they are
declared with the function
keyword, the typeof operator
returns “function”, and they inherit from Function.prototype just as ordinary
functions do. When invoked, however, a generator function behaves
completely differently than a regular function: instead of executing
the body of the generator function, the invocation instead returns a
generator object.

A generator is an object that represents
the current execution state of a generator function. It defines a
next() method that resumes
execution of the generator function and allows it to continue
running until its next yield
statement is encountered. When that happens, the value of the
yield statement in the generator
function becomes the return value of the next() method of the generator. If a
generator function returns (by executing a return statement or reaching the end of
its body), the next() method of
the generator throws StopIteration.

The fact that generators have a next() method that can throw StopIteration should make it clear that
they are iterator objects.[19] In fact, they are iterable iterators, which means that
they can be used with for/in
loops. The following code demonstrates just how easy it is to write
generator functions and iterate over the values they yield:

// Define a generator function for iterating over a range of integers
function range(min, max) {
    for(let i = Math.ceil(min); i <= max; i++) yield i;
}

// Invoke the generator function to obtain a generator, then iterate it.
for(let n in range(3,8)) console.log(n); // Prints numbers 3 through 8.
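
The claim that invoking a generator function does not run its body can be checked directly. The sketch below uses the standardized form of generators (function* and for/of) so that it runs on current engines; in that form next() returns an object with value and done properties instead of throwing StopIteration. The started flag is a hypothetical addition used only to observe when the body begins to execute.

```javascript
let started = false;                 // Set when the function body first runs

function* range(min, max) {
    started = true;
    for (let i = Math.ceil(min); i <= max; i++) yield i;
}

const g = range(3, 8);               // Returns a generator; the body has not run yet
const startedBeforeNext = started;   // false
const first = g.next();              // Runs the body up to the first yield
const rest = [];
for (const n of g) rest.push(n);     // The generator is itself iterable
const after = g.next();              // The function has returned

console.log(typeof range);           // function
console.log(startedBeforeNext);      // false
console.log(first.value);            // 3
console.log(rest.join(","));         // 4,5,6,7,8
console.log(after.done);             // true
```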

Generator functions need never return. In fact, a canonical
example is the use of a generator to yield the Fibonacci
numbers:

// A generator function that yields the Fibonacci sequence
function fibonacci() {
    let x = 0, y = 1;
    while(true) {
        yield y;
        [x,y] = [y,x+y];
    }
}
// Invoke the generator function to obtain a generator.
f = fibonacci();
// Use the generator as an iterator, printing the first 10 Fibonacci numbers.
for(let i = 0; i < 10; i++) console.log(f.next());

Notice that the fibonacci()
generator function never returns. For this reason, the generator it
returns will never throw StopIteration. Rather than using it as an
iterable object in a for/in loop
and looping forever, we use it as an iterator and explicitly call
its next() method ten times.
After the code above runs, the generator f still retains the execution state of the
generator function. If we won’t be using it anymore, we can release
that state by calling the close()
method of f:

f.close();

When you call the close()
method of a generator, the associated generator function terminates
as if there were a return
statement at the location where its execution was suspended. If this
location is inside one or more try blocks, any finally clauses are run before close() returns. close() never has a return value, but if a
finally block raises an exception
it will propagate from the call to close().
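
The interaction between closing a generator and finally clauses can be sketched with standardized generators, where the method that plays the role of close() is named return(). The log array is a hypothetical addition that records when the cleanup code runs.

```javascript
const log = [];                      // Records that the finally clause ran

function* fibonacci() {
    let x = 0, y = 1;
    try {
        while (true) {
            yield y;
            [x, y] = [y, x + y];
        }
    } finally {
        log.push("cleanup");         // Runs when the suspended generator is closed
    }
}

const f = fibonacci();
const firstFive = [];
for (let i = 0; i < 5; i++) firstFive.push(f.next().value);

const closed = f.return();           // Terminate as if by a return at the yield
const afterClose = f.next();         // The generator stays finished

console.log(firstFive.join(","));    // 1,1,2,3,5
console.log(log.join(","));          // cleanup
console.log(closed.done, afterClose.done);  // true true
```

Note that the finally clause runs here because the generator was suspended inside the try block when it was closed.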

Generators are often useful for sequential processing of
data—elements of a list, lines of text, tokens from a lexer, and so
on. Generators can be chained in a way that is analogous to a
Unix-style pipeline of shell commands. What is interesting about
this approach is that it is lazy: values are
“pulled” from a generator (or pipeline of generators) as needed,
rather than being processed in multiple passes. Example 11-1 demonstrates.

Example 11-1. A pipeline of generators

// A generator to yield the lines of the string s one at a time.
// Note that we don't use s.split(), because that would process the entire
// string at once, allocating an array, and we want to be lazy instead.
function eachline(s) {
    let p;
    while((p = s.indexOf('\n')) != -1) {
        yield s.substring(0,p);
        s = s.substring(p+1);
    }
    if (s.length > 0) yield s;
}

// A generator function that yields f(x) for each element x of the iterable i
function map(i, f) {
    for(let x in i) yield f(x);
}

// A generator function that yields the elements of i for which f(x) is true
function select(i, f) {
    for(let x in i) {
        if (f(x)) yield x;
    }
}

// Start with a string of text to process
let text = " #comment \n  \n  hello \nworld\n quit \n unreached \n";

// Now build up a pipeline of generators to process it.
// First, break the text into lines
let lines = eachline(text);
// Next, trim whitespace from the start and end of each line
let trimmed = map(lines, function(line) { return line.trim(); });
// Finally, ignore blank lines and comments
let nonblank = select(trimmed, function(line) {
    return line.length > 0 && line[0] != "#";
});

// Now pull trimmed and filtered lines from the pipeline and process them,
// stopping when we see the line "quit".
for (let line in nonblank) {
    if (line === "quit") break;
    console.log(line);
}
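
The laziness of the pipeline can be made visible by counting how many lines are actually pulled from the source. The following sketch restates Example 11-1 with standardized generators (function* and for/of) so that it runs on current engines; the pulled counter is a hypothetical addition used only for observation.

```javascript
let pulled = 0;                      // How many lines eachline() has produced

function* eachline(s) {
    let p;
    while ((p = s.indexOf('\n')) !== -1) {
        pulled++;
        yield s.substring(0, p);
        s = s.substring(p + 1);
    }
    if (s.length > 0) { pulled++; yield s; }
}

function* map(i, f) {
    for (const x of i) yield f(x);
}

function* select(i, f) {
    for (const x of i) {
        if (f(x)) yield x;
    }
}

const text = " #comment \n  \n  hello \nworld\n quit \n unreached \n";
const nonblank = select(
    map(eachline(text), line => line.trim()),
    line => line.length > 0 && line[0] !== "#");

const output = [];
for (const line of nonblank) {
    if (line === "quit") break;      // Stop pulling values from the pipeline
    output.push(line);
}
console.log(output.join(","));       // hello,world
console.log(pulled);                 // 5
```

Only five of the six lines are ever extracted from the string: the line after "quit" is never requested, so no stage of the pipeline processes it.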

Typically generators are initialized when they are created:
the values passed to the generator function are the only input that
the generator receives. It is possible, however, to provide
additional input to a running generator. Every generator has a
send() method, which works to
restart the generator like the next() method does. The difference is that
you can pass a value to send(),
and that value becomes the value of the yield expression. (In generator
functions that do not accept additional input, the yield keyword looks like a statement. In
fact, however, yield is an
expression and has a value.) In addition to next() and send(), another way to restart a generator
is with throw(). If you call this
method, the yield expression
raises the argument to throw() as
an exception. The following code demonstrates:

// A generator function that counts from an initial value.
// Use send() on the generator to specify an increment.
// Use throw("reset") on the generator to reset to the initial value.
// This is for example only; this use of throw() is bad style.
function counter(initial) {
    let nextValue = initial;                 // Start with the initial value
    while(true) {
        try {
            let increment = yield nextValue; // Yield a value and get increment
            if (increment)                   // If we were sent an increment...
                nextValue += increment;      // ...then use it.
            else nextValue++;                // Otherwise increment by 1
        }
        catch (e) {                          // We get here if someone calls
            if (e==="reset")                 // throw() on the generator
                nextValue = initial;
            else throw e;
        }
    }
}

let c = counter(10);           // Create the generator at 10
console.log(c.next());         // Prints 10
console.log(c.send(2));        // Prints 12
console.log(c.throw("reset")); // Prints 10
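
The same counter can be sketched with standardized generators for engines that lack the JavaScript 1.7 form. In that form there is no separate send() method: the value is passed as the argument to next() instead, while throw() works as described above. Each call returns an object whose value property holds the yielded value.

```javascript
function* counter(initial) {
    let nextValue = initial;                   // Start with the initial value
    while (true) {
        try {
            const increment = yield nextValue; // Value passed to next(), if any
            if (increment) nextValue += increment;
            else nextValue++;
        } catch (e) {                          // Raised by throw() on the generator
            if (e === "reset") nextValue = initial;
            else throw e;
        }
    }
}

const c = counter(10);
const results = [
    c.next().value,          // 10: runs to the first yield
    c.next(2).value,         // 12: 2 becomes the value of the yield expression
    c.next().value,          // 13: no increment supplied, so add 1
    c.throw("reset").value   // 10: the yield expression raises "reset"
];
console.log(results.join(","));   // 10,12,13,10
```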

Array Comprehensions

An array comprehension is another feature
that JavaScript 1.7 borrowed from Python. It is a technique for
initializing the elements of an array from or based on the elements
of another array or iterable object. The syntax of array
comprehensions is based on the mathematical notation for defining
the elements of a set, which means that expressions and clauses are
in different places than JavaScript programmers would expect them to
be. Be assured, however, that it doesn’t take long to get used to
the unusual syntax and appreciate the power of array
comprehensions.

Here’s an array comprehension that uses the range() function developed above to
initialize an array to contain the even square numbers up to
100:

let evensquares = [x*x for (x in range(0,10)) if (x % 2 === 0)];

It is roughly equivalent to the following five lines:

let evensquares = [];
for(let x in range(0,10)) {
    if (x % 2 === 0)
        evensquares.push(x*x);
}

In general, an array comprehension looks like this:

[ expression for ( variable in object ) if ( condition ) ]

Notice that there are three main parts within the square
brackets:

  • A for/in or for/each loop with no body. This piece
    of the comprehension includes a
    variable (or, with destructuring
    assignment, multiple variables) that appears to the left of the
    in keyword, and an
    object (which may be a generator, an
    iterable object, or an array, for example) to the right of the
    in. Although there is no loop
    body following the object, this piece of the array comprehension
    does perform an iteration and assign successive values to the
    specified variable. Note that neither the var nor the let keyword is allowed before the
    variable name—a let is
    implicit and the variable used in the array comprehension is not
    visible outside of the square brackets and does not overwrite
    existing variables by the same name.

  • An if keyword and a
    conditional expression in parentheses
    may appear after the object being iterated. If present, this
    conditional is used to filter iterated values. The conditional
    is evaluated after each value is produced by the for loop. If it is false, that value is skipped and
    nothing is added to the array for that value. The if clause is optional; if omitted, the
    array comprehension behaves as if if (true) were present.

  • An expression that appears
    before the for keyword. This
    expression can be thought of as the body of the loop. After a
    value is returned by the iterator and assigned to the variable,
    and if that value passes the
    conditional test, this expression is
    evaluated and the resulting value is inserted into the array
    that is being created.

Here are some more concrete examples to clarify the
syntax:

data = [2,3,4, -5];                   // An array of numbers
squares = [x*x for each (x in data)]; // Square each one: [4,9,16,25]
// Now take the square root of each non-negative element
roots = [Math.sqrt(x) for each (x in data) if (x >= 0)];

// Now we'll create arrays of property names of an object
o = {a:1, b:2, f: function(){}};
let allkeys = [p for (p in o)];
let ownkeys = [p for (p in o) if (o.hasOwnProperty(p))];
let notfuncs = [k for ([k,v] in Iterator(o)) if (typeof v !== "function")]
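
In an engine without array comprehensions, the three parts of a comprehension map fairly directly onto array methods: the object being iterated becomes the source array, the if clause becomes a filter() call, and the leading expression becomes a map() call. The sketch below restates the examples above that way. It is an approximation: it builds an intermediate array instead of iterating a range() object, and the name range0to10 is illustrative.

```javascript
// [x*x for (x in range(0,10)) if (x % 2 === 0)]
const range0to10 = Array.from({length: 11}, (_, i) => i);   // 0 through 10
const evensquares = range0to10
    .filter(x => x % 2 === 0)        // The if clause
    .map(x => x * x);                // The expression before the for keyword
console.log(evensquares.join(","));  // 0,4,16,36,64,100

const data = [2, 3, 4, -5];
const squares = data.map(x => x * x);
const roots = data.filter(x => x >= 0).map(x => Math.sqrt(x));
console.log(squares.join(","));      // 4,9,16,25
console.log(roots.length);           // 3

const o = {a: 1, b: 2, f: function() {}};
const ownkeys = Object.keys(o);      // Own property names
const notfuncs = Object.entries(o)   // [name, value] pairs, with destructuring
    .filter(([k, v]) => typeof v !== "function")
    .map(([k]) => k);
console.log(ownkeys.join(","));      // a,b,f
console.log(notfuncs.join(","));     // a,b
```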

Generator Expressions

In JavaScript 1.8,[20] you can replace the square brackets around an array
comprehension with parentheses to produce a generator expression. A
generator expression is like an array
comprehension (the syntax within the parentheses is exactly the same
as the syntax within the square brackets), but its value is a
generator object rather than an array. The benefits of using a
generator expression instead of an array comprehension are that you
get lazy evaluation—computations are performed as needed rather than
all at once—and that you can work with potentially infinite
sequences. The disadvantage of using a generator instead of an array
is that generators allow only sequential access to their values
rather than random access. Generators, that is, are not indexable
the way arrays are: to obtain the nth value,
you must iterate through all n-1 values that
come before it.

Earlier in this chapter we wrote a map() function like this:

function map(i, f) { // A generator that yields f(x) for each element of i
    for(let x in i) yield f(x);
}

Generator expressions make it unnecessary to write or use such
a map() function. To obtain a new
generator h that yields f(x) for
each x yielded by a generator g,
just write this:

let h = (f(x) for (x in g));

In fact, given the eachline() generator from Example 11-1, we can trim whitespace and filter
out comments and blank lines like this:

let lines = eachline(text);
let trimmed = (l.trim() for (l in lines));
let nonblank = (l for (l in trimmed) if (l.length > 0 && l[0]!='#'));
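
Generator expression syntax is specific to JavaScript 1.8, but the trade-off it offers (lazy evaluation and infinite sequences, at the cost of random access) can be sketched with an anonymous standardized generator function that is invoked immediately. The helper names naturals() and nth() are hypothetical.

```javascript
// An infinite sequence: 1, 2, 3, ...
function* naturals() {
    let n = 1;
    while (true) yield n++;
}

const g = naturals();
const f = x => x * x;

// Rough counterpart of (f(x) for (x in g)): nothing is computed until requested
const h = (function*() { for (const x of g) yield f(x); })();

const firstFour = [];
for (let i = 0; i < 4; i++) firstFour.push(h.next().value);
console.log(firstFour.join(","));    // 1,4,9,16

// Sequential access only: reaching the nth value means stepping past the others
function nth(gen, n) {
    let result;
    for (let i = 0; i < n; i++) result = gen.next().value;
    return result;
}
const squares = (function*() { for (const x of naturals()) yield f(x); })();
const third = nth(squares, 3);
console.log(third);                  // 9
```

Because h pulls from g only on demand, wrapping an infinite generator is safe as long as the consumer stops asking for values.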


[19] Generators are sometimes called “generator iterators” to
clearly distinguish them from the generator functions by which
they are created. In this chapter, we’ll use the term
“generator” to mean “generator iterator.” In other sources, you
may find the word “generator” used to refer to both generator
functions and generator iterators.

[20] Generator expressions are not supported in Rhino at the
time of this writing.
