Chapter 4. Caja Support

Introduction

What is Caja?

Caja is a system that transforms ordinary HTML and JavaScript into a restricted form of JavaScript. The transformation is called "cajoling", and the result is "cajoled script". Caja is an Open Source project sponsored by Google and hosted at Google Code.

The cajoled script is then run within a security sandbox created in your browser. This provides a way to safely include arbitrary third-party content on any Web page.

In principle, Caja should be transparent. Most JavaScript behaves the same whether it's run directly or cajoled. However, since Caja is currently incomplete and rapidly evolving, there are many noticeable differences.

Caja Status for This Release

Since Caja is used to transform an application's HTML and JavaScript into a restricted form that prevents malicious applications from doing damage, applications cannot contain arbitrary ActiveX objects, use eval to get around the ActiveX restriction, or use iframes to get around the eval restriction.

Other than that, most JavaScript application elements should work. Our goal is to make Caja as unobtrusive as possible for ordinary applications. However, we're not there yet. Caja still has many rough edges, and you may experience mysterious Caja behavior. This document will describe some of those mysteries in detail.

To get started right away, use Firefox with the Firebug add-on and set alert to bring up the Firebug console. Other browsers are not currently supported.

Note the following restrictions that apply to this release:

  • Complex libraries such as YUI, jQuery, and Prototype are not yet supported.
  • The document.write method isn't supported. However, innerHTML and many commonly-used DOM interfaces are currently supported.
  • Global variables that are not defined with var will cause compile-time and run-time errors.
  • If something doesn't work, check your Firebug console, even if the Firebug icon doesn't indicate any errors. There are several common runtime errors that don't raise exceptions. Instead, they show up as plain messages in the console like this:

    Not readable: ([SomeClass]).foo

    Messages of this type usually mean you're trying to do something that isn't yet supported, and you need to find an alternative.

  • Caja currently restricts obj.prototype and constructors in a way that blocks some common JavaScript idioms, such as monkey-patching.
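
The restriction on undeclared globals can be illustrated with a minimal sketch in plain JavaScript. It uses a strict-mode function only as a stand-in, because strict mode also rejects assignments to undeclared variables; Caja's own messages and timing will differ. The names counter, increment, leakGlobal, and undeclaredTotal are hypothetical.

```javascript
// Declared with var: the pattern this release expects for globals.
var counter = 0;

function increment() {
  counter += 1;
  return counter;
}

// Assigns to a name that was never declared with var.
// Strict mode (used here as a stand-in for Caja) rejects this at run time.
function leakGlobal() {
  "use strict";
  undeclaredTotal = 42; // hypothetical name, deliberately not declared
}

// Capture the failure instead of letting it stop the script.
var leakError = null;
try {
  leakGlobal();
} catch (e) {
  leakError = e;
}
```

Adding a var declaration for undeclaredTotal at the top of the script is the fix in both plain strict mode and, according to the list above, under Caja.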

Why Do We Need Caja?

When a website wants to include arbitrary third-party content, it needs to consider many potential security problems. One of the harder problems is "drive-by downloads": an attacker inserts malicious HTML that tries to install malware when you view the page.

A typical vector is an <iframe src=...> tag pointing at the attacker's website. Your browser automatically loads the iframe, which runs a script that figures out what browser and extensions you have, then downloads malware targeted specifically at the vulnerabilities known for your system.

The traditional solution to this problem is to aggressively sanitize third-party content by removing iframes, removing scripts, etc. That works well in many cases, but aggressive sanitization makes it difficult to create interesting applications.

Today, we want to allow anybody to create interesting applications that can appear on our site, but we also want to limit our users' exposure to scripts that install malware.

Sanitizing JavaScript is difficult, and that's what Caja is about.

How Does Caja Work?

Caja has two main parts:

  • server-side translator
  • client-side runtime support

The Server-Side Translator

The Caja translator rewrites arbitrary HTML and JavaScript into safe HTML and JavaScript, using white list security principles, by

  • Removing anything it doesn't understand
  • Removing HTML and CSS that isn't on a white list
  • Modifying CSS rules, limiting them to a sandbox <div>
  • Transforming JavaScript into forms known to be safe
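
The white list principle in the steps above can be sketched in a few lines. This is an illustration only, not Caja's implementation: the list contents and the names ALLOWED_TAGS, filterTags, and keptTags are assumptions, and a real translator works on parsed markup rather than a list of tag names.

```javascript
// Hypothetical white list; the real Caja lists are not given in this guide.
var ALLOWED_TAGS = { div: true, p: true, span: true, b: true };

// Keeps only the tag names that appear on the white list.
// Anything unknown is dropped rather than inspected.
function filterTags(tagNames) {
  var kept = [];
  for (var i = 0; i < tagNames.length; i++) {
    var name = tagNames[i].toLowerCase();
    // hasOwnProperty avoids matching inherited names such as "constructor".
    if (Object.prototype.hasOwnProperty.call(ALLOWED_TAGS, name)) {
      kept.push(name);
    }
  }
  return kept;
}

var keptTags = filterTags(["DIV", "iframe", "script", "p", "constructor"]);
// keptTags is ["div", "p"]
```

The design choice worth noting is that the check names what is allowed, not what is forbidden, so a tag the filter has never heard of is removed by default.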

The JavaScript transformation is the complicated part. It's basically a form of virtualization, in which the translator:

  • Replaces references to global variables with references to a sandbox-specific IMPORTS___ object
  • Rewrites references to this to prevent access to the real global scope
  • Replaces most JavaScript code with semantically similar code that has runtime checks for security
  • Rejects some JavaScript code early, such as with(obj){...}.

Here's an example transformation. This JavaScript source code:

is cajoled into something like this:

Note

The actual Caja transformation is slightly different. This example has been modified a bit to make it easier to see what Caja is doing under the hood.

The main purpose of the transformation is to guarantee that cajoled script can't access arbitrary global variables. Cajoled script can only use objects and functions that are explicitly given to it by the container. Basically, cajoled script conforms to an object-capability security model.
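
The idea can be sketched in plain JavaScript. This is a hand-written approximation, not the cajoler's real output: the function guestModule, the show function, and the contents of the imports object are hypothetical, and only the name IMPORTS___ comes from this chapter.

```javascript
// What the guest author might write (shown as a comment):
//   var greeting = "hello";
//   show(greeting.toUpperCase());

// A rough approximation of the rewritten form. Every global name is
// looked up on the imports object that the container passes in.
function guestModule(IMPORTS___) {
  IMPORTS___.greeting = "hello";
  IMPORTS___.show(IMPORTS___.greeting.toUpperCase());
}

// The container decides exactly what the guest can reach.
var shown = []; // stand-in for whatever the container does with output
var imports = {
  show: function (text) { shown.push(text); }
};

guestModule(imports);
// shown is ["HELLO"], and imports.greeting is "hello"
```

Because the guest only ever touches the object it was handed, the container can grant or withhold a capability simply by adding or omitting a property. Plain JavaScript does not enforce this on its own; in Caja, the rewriting and runtime checks are what make it hold.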

For more details about the JavaScript transformation, see the Caja project page.

(As of September 2008, the Caja project is migrating to a new rewriting scheme called "Valija". The description in the Caja paper is accurate for the current Caja generation, but it will be out of date once Valija takes over.)

The Client-Side Runtime

Cajoled script can't access any real global objects without help, and that's what the Caja runtime system is for. The runtime system creates a useful sandbox environment by adding objects to an IMPORTS___ object that's given to the cajoled script.

Some of the imported objects are the real thing. For example, IMPORTS___.Array is identical to the browser's Array.

Some of the imported objects are proxies. For example, IMPORTS___.document is a proxy object that exposes a safe subset of the DOM interface. The proxy function document.getElementById will return objects that are also proxies. Basically, you never get direct access to a real DOM object, but that generally doesn't matter, because for most purposes the proxy objects are similar enough to the real thing.
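
A minimal sketch of the proxy idea follows. It is not Caja's DOM wrapper: realDocument is a stand-in object so the example runs outside a browser, and makeElementProxy, makeDocumentProxy, and the method names on the proxies are hypothetical.

```javascript
// Stand-in for a real DOM; a browser would supply the real document.
var realDocument = {
  cookie: "session=secret",
  elements: { title: { id: "title", innerHTML: "Hi" } },
  getElementById: function (id) { return this.elements[id] || null; }
};

// Wraps an element so that only a chosen subset of it is reachable.
function makeElementProxy(element) {
  return {
    getId: function () { return element.id; },
    getInnerHTML: function () { return element.innerHTML; },
    // A real proxy would also sanitize the HTML before storing it.
    setInnerHTML: function (html) { element.innerHTML = String(html); }
  };
}

// Wraps the document; lookups return proxies, never real elements.
function makeDocumentProxy(doc) {
  return {
    getElementById: function (id) {
      var element = doc.getElementById(id);
      return element ? makeElementProxy(element) : null;
    }
  };
}

var safeDocument = makeDocumentProxy(realDocument);
var title = safeDocument.getElementById("title");
title.setInnerHTML("Hello");
```

The guest can change the element's content through the proxy, but safeDocument.cookie is undefined because the proxy never exposes it.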

The runtime system also enforces the Caja security model, by checking that objects and functions were properly tagged before they're used. You can see Caja's internal tagging when you examine objects in Firebug. Most objects will have properties that end with triple-underbar, such as length_canRead___, ___FROZEN___, etc.
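
The tagging check can be pictured with a short sketch. The suffix _canRead___ is taken from the property names mentioned above, but the exact semantics are an assumption, and markReadable, readProp, and the messages array (a stand-in for the Firebug console) are hypothetical.

```javascript
var messages = []; // stand-in for the Firebug console

// Tags a property as readable by setting a companion flag on the object.
function markReadable(obj, name) {
  obj[name + "_canRead___"] = true;
  return obj;
}

// Reads a property only if it was tagged. Otherwise it records a plain
// message and returns undefined instead of raising an exception.
function readProp(obj, name) {
  if (obj[name + "_canRead___"] === true) {
    return obj[name];
  }
  messages.push("Not readable: " + name);
  return undefined;
}

var point = markReadable({ x: 3, secret: 7 }, "x");
var xValue = readProp(point, "x");           // tagged, so the value is returned
var secretValue = readProp(point, "secret"); // untagged, so a message is logged
```

This mirrors the behavior described earlier in this chapter, where an unsupported access shows up as a console message rather than an exception.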

Copyright © 2009 Yahoo! Inc. All rights reserved.
