
JavaScript – JavaScript Subsets

Most language subsets are defined to allow the secure execution
of untrusted code. There is one interesting subset defined for
different reasons. We’ll cover that one first, and then cover secure
language subsets.

The Good Parts

Douglas Crockford’s short book JavaScript: The Good Parts
(O’Reilly) describes a JavaScript subset
that consists of the parts of the language that he thinks are worth
using. The goal of this subset is to simplify the language, hide
quirks and imperfections, and ultimately, make programming easier
and programs better. Crockford explains his motivation:

Most programming languages contain good parts and bad parts.
I discovered that I could be a better programmer by using only the
good parts and avoiding the bad parts.

Crockford’s subset does not include the with and continue statements or the eval() function. It defines functions
using function definition expressions only and does not include the
function definition statement. The subset requires the bodies of
loops and conditionals to be enclosed in curly braces: it does not
allow the braces to be omitted if the body consists of a single
statement. It requires any statement that does not end with a curly
brace to be terminated with a semicolon.

The subset does not include the comma operator, the bitwise
operators, or the ++ and -- operators. It also disallows == and != because of the type conversion they
perform, requiring use of === and
!== instead.
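The type conversions that motivate this rule are easy to see in a few comparisons. The following is a minimal sketch; the variable names are illustrative only and do not come from Crockford's book.

```javascript
// Each comparison with == converts its operands before comparing them,
// so values of different types can compare equal.
var looseResults = [
    0 == "",            // true: "" is converted to the number 0
    "1" == 1,           // true: "1" is converted to the number 1
    null == undefined   // true: a special case in the == algorithm
];

// The same comparisons with === perform no conversion.
// Operands of different types are never equal.
var strictResults = [
    0 === "",           // false
    "1" === 1,          // false
    null === undefined  // false
];
```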

Since variables declared with var do not have block scope, Crockford’s subset
restricts the var statement to
appear only at the top level of a function body and requires
programmers to declare all of a function’s variables using a single
var as the first statement in a
function body. The subset discourages the use of global variables,
but this is a coding convention rather than an actual language
restriction.
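A short sketch may make the reasoning behind the var rule clearer. The first function shows why a var inside a block is misleading, and the second is written in the style the subset asks for. The function names are hypothetical, and the second function is only one plausible reading of the rules described above, not code taken from the book.

```javascript
// A var declared inside a block is scoped to the whole function.
var hoistingDemo = function () {
    // No ReferenceError here: the declaration of `late` below is hoisted
    // to the top of the function, so the variable exists but is undefined.
    var before = late;
    if (before === undefined) {
        var late = "set inside a block";
    }
    // The variable is still visible after the block has ended.
    return late;
};

// The same kind of code written in the subset's style: a function
// expression, a single var as the first statement, braces around every
// body, += 1 instead of ++, and === instead of ==.
var sumPositive = function (values) {
    var i, total = 0;
    for (i = 0; i < values.length; i += 1) {
        if (values[i] > 0) {
            total += values[i];
        }
    }
    return total;
};
```

Declaring everything in one var at the top makes the code say what the interpreter already does.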

Crockford’s online code-quality checking tool at http://jslint.com includes an option to enforce
conformance to The Good Parts. In addition to ensuring that your
code uses only the allowed features, the JSLint tool also enforces
coding style rules, such as proper indentation.

Crockford’s book was written before the strict mode of
ECMAScript 5 was defined, but many of the “bad parts” of JavaScript
he seeks to discourage in his book are prohibited by the use of
strict mode. With the adoption of the ECMAScript 5 standard, the
JSLint tool now requires programs to include a “use strict”
directive when “The Good Parts” option is selected.
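As one example of the overlap between strict mode and the subset's goals, strict mode turns an accidental global variable into an error. The sketch below assumes that no variable named undeclaredCounter exists anywhere in the program; both names are hypothetical.

```javascript
var strictAssign = function () {
    "use strict";
    try {
        // In strict mode, assigning to an undeclared variable throws
        // instead of silently creating a property of the global object.
        undeclaredCounter = 1;
        return "no error";
    } catch (e) {
        return e.name;
    }
};
```

Calling strictAssign() returns "ReferenceError". Without the directive, the same assignment would quietly create a global variable.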

Subsets for Security

The Good Parts is a language subset designed for aesthetic
reasons and with a desire to improve programmer productivity. There
is a larger class of subsets that have been designed for the purpose
of safely running untrusted JavaScript in a secure container or
“sandbox.” Secure subsets work by disallowing all language features
and APIs that can allow code to break out of its sandbox and affect
the global execution environment. Each subset is coupled with a
static verifier that parses code to ensure that it conforms to the
subset. Since language subsets that can be statically verified tend
to be quite restrictive, some sandboxing systems define a larger,
less restrictive subset and add a code transformation step that
verifies that code conforms to the larger subset, transforms it to
use a smaller language subset, and adds runtime checks where static
analysis of the code is not sufficient to ensure security.

In order to allow JavaScript to be statically verified to be
safe, a number of features must be removed:

  • eval() and the Function() constructor are not allowed
    in any secure subset because they allow the execution of
    arbitrary strings of code, and these strings cannot be
    statically analyzed.

  • The this keyword is
    forbidden or restricted because functions (in non-strict mode)
    can access the global object through this. Preventing access to the global
    object is one of the key purposes of any sandboxing
    system.

  • The with statement is
    often forbidden in secure subsets because it makes static code
    verification more difficult.

  • Certain global variables are not allowed in secure
    subsets. In client-side JavaScript, the browser window object
    does double-duty as the global object, so code is not allowed to
    refer to the window object.
    Similarly, the client-side document object defines methods that
    allow complete control over page content. This is too much power
    to give to untrusted code. Secure subsets can take two different
    approaches to global variables like document. They can forbid them
    entirely, and instead define a custom API that sandboxed code
    can use to access the limited portion of the web page that has
    been allotted to it. Alternatively, the “container” in which the
    sandboxed code is run can define a facade or proxy document object that implements only
    the safe parts of the standard DOM API.

  • Certain special properties and methods are forbidden in
    secure subsets because they give too much power to the sandboxed
    code. These typically include the caller and callee properties of the arguments object (though some subsets
    do not allow the arguments
    object to be used at all), the call() and apply() methods of functions, and the
    constructor and prototype properties. Nonstandard
    properties such as __proto__
    are also forbidden. Some subsets blacklist unsafe properties and
    globals. Others whitelist a specific set of properties known to
    be safe.

  • Static analysis is sufficient to prevent access to special
    properties when the property access expression is written using
    the . operator. But property
    access with [] is more
    difficult because arbitrary string expressions within the square
    brackets cannot be statically analyzed. For this reason, secure
    subsets usually forbid the use of square brackets unless the
    argument is a numeric or string literal. Secure subsets replace
    the [] operators with global
    functions for querying and setting object properties—these
    functions perform runtime checks to ensure that they aren’t used
    to access forbidden properties.
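The last two restrictions work together, and a small sketch shows why. Suppose a verifier rejects any program that contains the text .constructor or a call to Function(). The code below contains neither, yet it still reaches the Function() constructor because the property name is computed at runtime. All variable names here are illustrative.

```javascript
// The forbidden property name never appears literally in the source,
// so a purely static check for ".constructor" finds nothing to reject.
var pieces = ["con", "struc", "tor"];
var key = pieces.join("");

// Every function inherits a constructor property that refers to the
// built-in Function constructor.
var FunctionCtor = (function () {})[key];

// With the Function constructor in hand, the code can compile and run
// an arbitrary string, which is exactly what the subset tried to forbid.
var result = FunctionCtor("return 6 * 7;")();
```

Here result is 42, but the string could just as easily have contained code that touches the global object.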

Some of these restrictions, such as forbidding the use of
eval() and the with statement, are not much of a burden
for programmers, since these features are not commonly used in
JavaScript programming. Others, such as the restriction on the use
of square brackets for property access, are quite onerous, and this
is where code translation comes in. A translator can automatically
transform the use of square brackets, for example, into a function
call that includes runtime checks. Similar transformations can allow
the safe use of the this keyword.
There is a tradeoff, of course, between the safety of these runtime
checks and execution speed of the sandboxed code.
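The runtime helpers that a translator targets might look something like the following. This is only a sketch under stated assumptions: the names safeGet, safeSet, guardThis, and FORBIDDEN_KEYS are hypothetical and are not taken from any of the sandboxing systems described here, and the blacklist simply repeats the properties named in the list above. It also assumes an environment that provides globalThis.

```javascript
// Property names that sandboxed code may never read or write.
// An array is used rather than an object literal because a "__proto__"
// key in an object literal does not create an ordinary property.
var FORBIDDEN_KEYS = [
    "caller", "callee", "constructor", "prototype",
    "__proto__", "call", "apply"
];

// Convert the key to a string exactly once, then check it.
var checkKey = function (key) {
    var name = String(key);
    if (FORBIDDEN_KEYS.indexOf(name) !== -1) {
        throw new Error("Access to '" + name + "' is not allowed");
    }
    return name;
};

// A translator could rewrite  obj[expr]  as  safeGet(obj, expr).
var safeGet = function (obj, key) {
    return obj[checkKey(key)];
};

// A translator could rewrite  obj[expr] = v  as  safeSet(obj, expr, v).
var safeSet = function (obj, key, value) {
    obj[checkKey(key)] = value;
    return value;
};

// A translator could rewrite  this  as  guardThis(this)  so that a
// function called without a receiver never sees the global object.
var guardThis = function (value) {
    return value === globalThis ? null : value;
};
```

With these helpers, safeGet({a: 1}, "a") returns 1, while safeGet({}, ["con", "structor"].join("")) throws. Every property access now costs an extra function call and a list search, which is the speed tradeoff mentioned above.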

A number of secure subsets have been implemented. Although a
complete description of any subset is beyond the scope of this book,
we’ll briefly describe some of the most important:

ADsafe

ADsafe (http://adsafe.org) was one
of the first security subsets proposed. It was created by
Douglas Crockford (who also defined The Good Parts subset).
ADsafe relies on static verification only, and it uses JSLint
(http://jslint.com) as its verifier. It
forbids access to most global variables and defines an
ADSAFE variable that
provides access to a secure API, including special-purpose DOM
methods. ADsafe is not in wide use, but it was an influential
proof-of-concept that shaped other secure
subsets.

dojox.secure

The dojox.secure subset (http://www.sitepen.com/blog/2008/08/01/secure-mashups-with-dojoxsecure/)
is an extension to the Dojo toolkit (http://dojotoolkit.org) that was inspired by
ADsafe. Like ADsafe, it is based on static verification of a
restrictive language subset. Unlike ADsafe, it allows use of
the standard DOM API. Also, it includes a verifier written in
JavaScript, so that untrusted code can be dynamically verified
before being evaluated.

Caja

Caja (http://code.google.com/p/google-caja/) is
Google’s open-source secure subset. Caja (Spanish for “box”)
defines two language subsets. Cajita (“little box”) is a
narrow subset like that used by ADsafe and dojox.secure.
Valija (“suitcase” or “baggage”) is a much broader language
that is close to regular ECMAScript 5 strict mode (with the
removal of eval()). Caja
itself is the name of the compiler that transforms (or
“cajoles”) web content (HTML, CSS, and JavaScript code) into
secure modules that can be safely hosted on a web page without
being able to affect the page as a whole or other modules on
the page.

Caja is part of the OpenSocial API (http://code.google.com/apis/opensocial/) and
has been adopted by Yahoo! for use on its websites. The
content available at the portal http://my.yahoo.com, for example, is organized
into Caja modules.

FBJS

FBJS is the variant of JavaScript used by Facebook
(http://facebook.com) to allow untrusted
content on users’ profile pages. FBJS relies on code
transformation to ensure security. The transformer inserts
runtime checks to prevent access to the global object through
the this keyword. It also
renames all top-level identifiers by adding a module-specific
prefix. Any attempt to set or query global variables or
variables belonging to another module is prevented because of
this renaming. Furthermore, any calls to eval() are transformed by this
identifier prefixing into calls to a nonexistent function.
FBJS emulates a safe subset of the DOM API.
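The effect of identifier prefixing can be sketched by hand. The code below imagines what a prefixing transformer might produce for a module that declared counter and increment and tried to call eval(). The prefix app42_ is purely hypothetical, and the real FBJS output format is not shown here.

```javascript
// Original module source (before transformation), shown as a comment:
//     var counter = 0;
//     var increment = function () { counter += 1; return counter; };
//     var tryEval = function () { return eval("1 + 1"); };

// After prefixing, every top-level identifier carries the module prefix,
// so it cannot collide with or reach another module's variables.
var app42_counter = 0;
var app42_increment = function () {
    app42_counter += 1;
    return app42_counter;
};

// The call to eval was renamed along with everything else. No function
// named app42_eval exists, so the call fails at runtime.
var app42_tryEval = function () {
    try {
        return app42_eval("1 + 1");
    } catch (e) {
        return e.name;
    }
};
```

Calling app42_tryEval() returns "ReferenceError" rather than evaluating the string.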

Microsoft Web Sandbox

Microsoft’s Web Sandbox (http://websandbox.livelabs.com/) defines a
broad subset of JavaScript (plus HTML and CSS) and makes it
secure through radical code rewriting, effectively
reimplementing a secure JavaScript virtual machine on top of
nonsecure JavaScript.
