In the Add-on SDK we have a problem that is both annoying and confusing for a lot of our users. The problem is not SDK-specific though and maybe interesting for anyone dealing with JS in concurrent execution contexts.
Problem overview
The Add-on SDK was designed to be compatible with browser architecture where chrome and content may be in a separate isolated processes. This imposes a lot of limitations on how add-on code can interact with a page / tab content, and the lack of concurrency constructs in Javascript make this problem a lot more irritating. In the SDK we ended up implementing an API that closely resembles web workers. The main add-on code can execute content scripts in a page context that acts like a worker and interaction between add-on and content scripts happen through a message passing (for more details please take a look at content script documention)
We quickly discovered that not a lot of people are comfortable with message passing APIs, but even putting that aside it’s really annoying to be creating separate content script files for a few lines of code (most of the add-ons have content scripts that consist of few lines of code, unless accompanied with external js libraries like jQuery):
#!env/javascript
var pageMod = require("sdk/page-mod")
var mod = new pageMod.PageMod({
include: ["*.co.uk"],
contentScriptFile: require("self").data.url("./my.js")
}))
As an option we also allow passing content scripts in form of JS string, which is far from ideal:
#!env/javascript
var pageMod = require("page-mod")
pageMod.add(new pageMod.PageMod({
include: ["*.co.uk"],
contentScript: "document.body.innerHTML = " +
"'<h1>this page has been eaten</h1>'"
}))
What you would actully want is to just write a function that will be executed in the context of the content. Unfortunately it’s not just matter of serialising function and then evaluating it in the context, since functions can access bindings from the outer scopes.
Idea
If we could prove that function does not refers to anything but arguments passed to it or definitions with in it, it would be pretty safe to transplant it into completely different execution context:
#!env/javascript
var pageMod = require("sdk/page-mod")
var mod = new pageMod.PageMod({
include: ["*.co.uk"],
contentScript: isolate(function() {
document.body.innerHTML = "<h1>this page has been eaten</h1>"
}))
As a matter of fact such isolate
function can be implemented relatively easy,
all it has to do is serialise given function, parse it, perform static analyzes
to verify no outer references are made and return some wrapping of source back.
If given function does have prohibited references then throw TypeError
back.
Code
With a great projects like Esprima, Acorn parsing JS is no-brainer. Also for SDK we could / should just use Spidermonkey parser API, but since all of them produce de facto standad AST format we can swap parsers as we pleased. So the only remaining chunk of work was static analysis. I made few micro-libraries solving very specific problems.
interset
While playing with AST I quickly discovered the need for binary operations for logical sets. So I end up writing interset small library for doing exactly that:
#!env/javascript
var union = require("interset/union")
union([1, 2], [2, 3], [3, 4])
// => [1, 2, 3, 4]
var intersection = require("interset/intersection")
intersection([1, 2], [2, 3])
// => [2]
var difference = require("interset/difference")
difference([1, 2, 3], [1], [1, 4], [3])
// => [2]
episcope
ECMAScript scope analyzer. This Library provides set of functions that perform analyzes on the nodes of the AST in the de facto AST format. All the API function take AST nodes denoting a lexical scope and performed static analyzes at the given scope level (Note: examples use esprima but library does not really cares about it and expects AST format implemented by all popular JS parsers):
#!env/javascript
var esprima = require("esprima")
var references = require("episcope/references")
var ast = esprima.parse("console.log('>>>', error)")
references(ast)
// => [{ type: "Identifier", name: "console" }, { type: "Identifier", name: "error" }]
var bindings = require("episcope/bindings")
var ast = esprima.parse("function foo(a, b) { var c = a + b; return c * c }")
ast.body[0].id
// => { type: 'Identifier', name: 'foo' }
bindings(ast.body[0])
// => [ { type: 'Identifier', name: 'a' },
// { type: 'Identifier', name: 'b' },
// { type: 'Identifier', name: 'c' } ]
var scopes = require("episcope/scopes")
var ast = esprima.parse(String(function root() {
function nested() { /***/ }
try { /***/ } catch(error) { /***/ }
}))
scopes(ast.body[0])
// => [
// {
// type: 'FunctionDeclaration',
// id: { type: 'Identifier', name: 'nested' },
// // ...
// },
// {
// type: 'CatchClause',
// param: { type: 'Identifier', name: 'error' },
// body: { type: 'BlockStatement', body: [] }
// }
//]
isoscope
ECMAScript function isolation analyzer. This is very alpha, but intention is to provide API for performing static analyzes on the JS AST nodes denoting a function definition / declaration and perform analyzes to gather info about their isolation:
#!env/javascript
var esprima = require("esprima")
var enclosed = require("isoscope/enclosed")
// Parse some code
var form = esprima.parse(String(function fn(a, b) {
console.log(String(a) + b)
}))
// Get a function form we'll be analyzing
var fn = form.body[0]
fn.id
// => { type: "Identifier", name: "fn" }
// Get names of enclosed references
enclosed(fn)
// => [ "console", "String" ]
Now the only thing left for isolate
to do is ensure that enclosed references are
legit globals or built-ins and that nasty things like with and eval are
not used.
If this experiment works out in practice that would allow us to explore a lot more to make new and brave concurrent world a lot more accessible and pleasant to use for our users.
All the libraries mentioned are open source and work both in SDK and Node, probably in browsers too. So feel free to play with them, break them and maybe even contribute fixes!