What does it mean to “Sandbox” a script?
Sandboxed scripts are isolated from their environment and can only talk to other scripts through a secure channel, much like Docker containers or virtual machines.
Here we have a simple script that is printing out the website’s title to the debug console. We should expect it to say “Welcome to My Website!”.
Sandboxing (marked hypothetically using the sandbox attribute in this example) means the script being executed does not have access to the document.
A sandboxed script also cannot access global variables.
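A minimal sketch of the two restrictions above, written as plain functions so the behaviour can be checked outside a browser. The sandbox attribute on script tags is hypothetical, and readTitle, readGlobal and the scope parameter are illustrative names of my own: scope stands in for whatever global object the script would see.

```javascript
// `scope` stands in for the global object visible to the script.
// In a normal script this would be `window`; under the hypothetical
// sandbox attribute it is assumed to be an isolated global with no `document`.
function readTitle(scope) {
  if (typeof scope.document === "undefined") {
    return null; // sandboxed: the host document is unreachable
  }
  return scope.document.title;
}

// Globals set by the host page live on the host's global object,
// so an isolated global simply does not contain them.
function readGlobal(scope, name) {
  return Object.prototype.hasOwnProperty.call(scope, name)
    ? scope[name]
    : undefined;
}

const hostScope = {
  document: { title: "Welcome to My Website!" },
  apiKey: "abc",
};
const sandboxedScope = {}; // assumption: nothing from the host is exposed

console.log(readTitle(hostScope)); // "Welcome to My Website!"
console.log(readTitle(sandboxedScope)); // null
console.log(readGlobal(sandboxedScope, "apiKey")); // undefined
```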
Sandboxed scripts would communicate with the host page using the postMessage API, as is typical for page-external entities.
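As a rough sketch of what the host side of that channel could look like, here is a message handler that only answers requests it recognises. The message types ("get-page-title", "page-title"), the origin https://thirdparty.example and the function name handleSandboxMessage are all assumptions for illustration. How a sandboxed script would actually identify itself to the host is not defined anywhere, so this borrows the conventions used with iframes today.

```javascript
// Host-side decision logic, kept pure so it can run anywhere.
// Returns the reply to send back, or null to ignore the message.
function handleSandboxMessage(event, trustedOrigin, pageData) {
  if (event.origin !== trustedOrigin) return null; // ignore unknown senders
  const message = event.data;
  if (!message || message.type !== "get-page-title") return null;
  return { type: "page-title", title: pageData.title };
}

// Browser-only wiring; skipped when there is no window (for example in Node).
// Assumes the sender is a window-like source, as it is with an iframe.
if (typeof window !== "undefined") {
  window.addEventListener("message", (event) => {
    const reply = handleSandboxMessage(event, "https://thirdparty.example", {
      title: document.title,
    });
    if (reply && event.source) {
      event.source.postMessage(reply, event.origin);
    }
  });
}
```

The point of the design is that the host decides what leaves the page: anything not matched by the handler is dropped.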
Why is this useful?
You might be wondering what value there is in limiting the reach that a script has.
Often applications need to reference an external script to perform a task the engineers are unable to code themselves. Examples include third party social login, Firebase, analytics tooling, and performance monitoring tooling.
Granting these third party integrations completely unregulated, root level access to your web application means it’s possible for them to scrape the application for sensitive user data.
Sandboxing a script like this means the third party only has access to data explicitly sent by the hosting web application.
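To make "explicitly sent" concrete, here is a small sketch of a host building the payload it hands to a third party from an allow-list of fields. The function name buildSharedPayload, the user object and its fields are made up for the example.

```javascript
// Copies only the allow-listed fields into the payload that will be
// sent to the third party; everything else stays on the host page.
function buildSharedPayload(user, allowedFields) {
  const payload = {};
  for (const field of allowedFields) {
    if (Object.prototype.hasOwnProperty.call(user, field)) {
      payload[field] = user[field];
    }
  }
  return payload;
}

// Hypothetical user record held by the host application.
const user = {
  id: "u_123",
  locale: "en-GB",
  email: "jane@example.com",
  paymentToken: "tok_secret",
};

// Only `id` and `locale` are shared; email and paymentToken never leave.
const payload = buildSharedPayload(user, ["id", "locale"]);
console.log(payload);
```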
Leading on from the previous point, it’s also possible for these third party scripts to collect or leak user data unwittingly through vulnerabilities in their own client side script.
Knowing that a script does not have access to the host context allows browsers to execute the script in its own process or thread. This allows for parallel execution, which can improve the performance of websites that would otherwise evaluate this code directly on the host thread.
This sounds like a Web Worker?
In some ways, yes. This resembles a web worker, but there are some notable differences:
Cross origin script compatibility
Web workers cannot be instantiated from a script located on a cross origin domain. So if a third party offers an integration, that integration cannot use multi threading (at least not easily) and cannot be sandboxed.
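The "not easily" part usually refers to a workaround along these lines: create a tiny same-origin worker from a Blob and have it pull in the cross origin script with importScripts. This is a sketch rather than a recommendation. The function names and the URL are placeholders, a Content Security Policy may block blob workers, and the third party script has to be written to run in a worker scope in the first place.

```javascript
// Builds the source of a one-line worker that loads the cross origin script.
// JSON.stringify quotes the URL safely as a string literal.
function buildWorkerBootstrap(scriptUrl) {
  return "importScripts(" + JSON.stringify(scriptUrl) + ");";
}

// Browser-only: wraps the bootstrap in a Blob so the URL passed to
// the Worker constructor belongs to the host page rather than the third party.
function createCrossOriginWorker(scriptUrl) {
  const blob = new Blob([buildWorkerBootstrap(scriptUrl)], {
    type: "text/javascript",
  });
  const blobUrl = URL.createObjectURL(blob);
  return new Worker(blobUrl);
}

// Placeholder URL, shown only to illustrate the generated source.
console.log(buildWorkerBootstrap("https://thirdparty.example/sdk.js"));
// importScripts("https://thirdparty.example/sdk.js");
```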
Opt-in sandbox features and same-origin Iframe access
Web Workers are unconditionally sandboxed from their execution context, meaning they are strictly unable to access anything from the host.
However, iframes can access the content of other iframes on the same origin. Offering opt-in features on the script sandbox attribute, in a similar way to the iframe sandbox attribute, opens the door to more use cases.
In a case like social login, perhaps we need an iframe created by the sandboxed script, and that iframe needs to communicate with the sandboxed script.
Opt-in sandbox attribute values would allow for more versatility for safer third party integrations.
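The opt-in model can be sketched as a deny-by-default check: an empty sandbox value grants nothing, and each token adds one capability back. The token names below (allow-same-origin, allow-popups) are borrowed from the existing iframe sandbox attribute. Whether a script-level sandbox would use the same tokens is purely my assumption, and parseSandbox and isAllowed are illustrative names.

```javascript
// Parses a sandbox attribute value into the set of opted-in capabilities.
// Tokens are space separated and compared case-insensitively,
// as they are for the iframe sandbox attribute.
function parseSandbox(value) {
  return new Set(
    value
      .split(/\s+/)
      .filter(Boolean)
      .map((token) => token.toLowerCase())
  );
}

// Deny by default: a capability is available only if its token is present.
function isAllowed(sandboxValue, capability) {
  return parseSandbox(sandboxValue).has(capability);
}

console.log(isAllowed("", "allow-same-origin")); // false
console.log(isAllowed("allow-same-origin allow-popups", "allow-popups")); // true
```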
You might wonder why I advocate for a sandbox flag over simply asking for Worker to accept cross origin scripts. Honestly, if that was on offer I’d take it.
In addition to a lack of opt-in sandbox permission, the browser cannot eagerly optimise a Web Worker on page load.
When a web page is loaded, the browser will do a quick first pass “once over” of the HTML to see if there is anything it can download or evaluate in parallel right away, later executing it in the order specified in the page. A Web Worker is created from JavaScript at runtime, so this first pass cannot discover it.
Worker scripts are also instantiated synchronously, whereas <script> tags can be evaluated asynchronously with the async attribute.
Currently companies will use hidden iframes to sandbox third party or cross origin scripts.
Iframes will initialize their own DOM and are expected to render another website in an embedded format within a host site.
When using hidden iframes to sandbox scripts, the embedded website is never shown, so constructing an entire DOM for it is overkill.
In addition, although it is unclear, it appears that a cross origin iframe uses memory equivalent to an entire new browser tab, which can get quite costly if you have several third party integrations.
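For reference, the hidden iframe approach described above tends to look something like the sketch below. The helper name createHiddenSandboxFrame, the embed URL and the choice of sandbox="allow-scripts" are my assumptions, not something every integration does. The document is passed in as a parameter so the function can be exercised without a browser.

```javascript
// Creates an invisible iframe whose only job is to run a third party script.
// `doc` is the host document; it is a parameter so the helper is testable.
function createHiddenSandboxFrame(doc, src) {
  const frame = doc.createElement("iframe");
  // Assumption: scripts may run, but no other capability is opted in.
  frame.setAttribute("sandbox", "allow-scripts");
  frame.style.display = "none"; // never rendered, yet a full DOM is still built
  frame.src = src;
  doc.body.appendChild(frame);
  return frame;
}

// Browser-only usage; skipped when no document exists (for example in Node).
if (typeof document !== "undefined") {
  createHiddenSandboxFrame(document, "https://thirdparty.example/embed.html");
}
```

Every integration loaded this way pays for its own document, which is the overhead a script-level sandbox would avoid.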