The Challenges of Executing Arbitrary Code: Lessons Learned

Executing arbitrary code is challenging. Learn from our experiences with tools like VM2, Docker, and Linux and ensure safety in high-demand environments.

Published
August 22, 2024

Running code written by someone else, a practice referred to as arbitrary code execution, is becoming more common for tasks like automating workflows or integrating multiple services. Like any powerful tool, though, it carries serious risks. Understanding how to execute such code safely and efficiently has therefore become one of our greatest concerns. Today we want to share what we have discovered, which obstacles we came across, and how we addressed them.

What Is Arbitrary Code Execution?

Let’s begin by defining arbitrary code execution (ACE). Typically, software consists of code that developers have written, reviewed, and deployed themselves. With ACE, the code being run is written and supplied by users. Developers are therefore running code they did not write and have not reviewed, which complicates matters. They must assume this code could be dangerous, and their job is to execute it securely so that it cannot interfere with other customers’ data or with our servers.

VM2: Our Major Wake-Up Call

When we first began executing arbitrary code, we used VM2, a sandbox library for running JavaScript code safely within a Node.js server. The idea behind VM2 looked good at first: it promised that untrusted code could be executed without harm to the system. As time passed, however, things began to deteriorate.

VM2 had many security vulnerabilities, and its maintainers found them increasingly difficult to fix. They eventually decided to cease development, and we had to look for an alternative. This was a major wake-up call for us: even tools we considered very safe can one day become unsafe to use.

The Technical Challenges of Running Untrusted Code

Running code written by someone else isn’t just tough; it’s one of the hardest problems in programming. Why? Because code can do just about anything. Programming languages are powerful, and there are endless possibilities for what a piece of code might try to do. We must think through every possible scenario and ensure the code can’t do anything harmful.
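To make "code can do just about anything" concrete, here is a minimal sketch of why sandboxing inside the host process is so fragile. It is a Python analogy, not a reproduction of any VM2 vulnerability, and the helper name `run_in_naive_sandbox` is hypothetical. The sandbox removes every builtin, yet the guest code can still walk from an ordinary tuple back into the interpreter's class hierarchy.

```python
def run_in_naive_sandbox(source: str) -> dict:
    """Execute source with an empty builtins table and return its namespace.

    Hypothetical helper for illustration only; this is NOT a safe sandbox.
    """
    namespace = {"__builtins__": {}}  # no open(), no __import__(), no print()
    exec(source, namespace)
    return namespace


# Direct use of a dangerous builtin is blocked, which looks reassuring...
try:
    run_in_naive_sandbox("open('/etc/passwd')")
    blocked = False
except NameError:
    blocked = True

# ...but attribute access needs no builtins, so the guest can still reach
# every class loaded in the host interpreter through an ordinary object.
escape = "classes = ().__class__.__base__.__subclasses__()"
leaked = run_in_naive_sandbox(escape)["classes"]

print(f"open() blocked: {blocked}")
print(f"host classes reachable from inside the sandbox: {len(leaked)}")
```

Each reachable class is a potential stepping stone toward files, processes, or the network, which is why a block list inside the same process tends to turn into an endless game of catch-up.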

Some of the biggest challenges include:

  • Isolation: Each piece of code must run in isolation so that it cannot interact with other components of the system.
  • Multiple Users: Many customers may execute code at the same time, so we must ensure that one customer’s code cannot affect another’s in any way.
  • Resetting the Environment: Cleaning up after every execution ensures that no stored data or hidden threats carry over to the next run.
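The three challenges above can be sketched with nothing but the standard library. This is an illustrative baseline under stated assumptions, not our production executor: the function name `run_untrusted`, the allow-listed environment variables, and the five-second timeout are all hypothetical choices. A separate process and a throwaway directory give each run its own memory and working files, but they do not restrict filesystem or network access on their own.

```python
import os
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path

# Assumption: only these host variables are passed through to guest code.
ALLOWED_ENV = ("PATH", "LANG", "LD_LIBRARY_PATH", "SYSTEMROOT")


def run_untrusted(source: str, timeout_s: float = 5.0) -> subprocess.CompletedProcess:
    """Run source in a separate interpreter inside a throwaway directory."""
    workdir = Path(tempfile.mkdtemp(prefix="exec-"))  # fresh directory per run
    try:
        script = workdir / "main.py"
        script.write_text(source)
        env = {k: v for k, v in os.environ.items() if k in ALLOWED_ENV}
        return subprocess.run(
            [sys.executable, "-I", str(script)],  # -I: Python's isolated mode
            cwd=workdir,            # relative paths resolve inside the sandbox dir
            env=env,                # do not leak host secrets via the environment
            capture_output=True,
            text=True,
            timeout=timeout_s,      # raises subprocess.TimeoutExpired on runaway code
        )
    finally:
        # Reset step: delete everything the run left behind.
        shutil.rmtree(workdir, ignore_errors=True)


# Two "customers" run back to back; the second cannot see the first one's file.
first = run_untrusted("open('secret.txt', 'w').write('tenant A data'); print('wrote')")
second = run_untrusted("import os; print(os.path.exists('secret.txt'))")
print(first.stdout.strip(), second.stdout.strip())
```

The guest in this sketch can still read any file the host user can read and can open network connections. Closing those gaps is the job of stronger isolation layers such as containers and kernel features.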

What About Other Tools?

Perhaps the surest way to guarantee that code cannot interfere with anything else is to run it on a separate computer that is not connected to the internet and contains nothing but the code to be run. Once the code has finished, you would then have to get rid of that computer to eliminate any lingering problems. Obviously, that’s not practical.

Lately, we have been making significant upgrades to the back-end infrastructure of our code executors. To address the problems of executing arbitrary code, we took inspiration from the sandboxing technologies used in the Docker and Linux kernel projects. This update doesn’t just make code execution safer; it also allows you to install dependencies in a single-use, disposable environment.

These techniques let us provide a clean environment for every code run, which is the foundation of the security we are aiming for. Regardless of how many executions run concurrently or how demanding the tasks are, we are confident that with the new setup each execution is independent and secure. Docker-inspired containers combined with kernel-level Linux isolation give us a solid starting point for hosting and running arbitrary code efficiently and securely.
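As a rough illustration of what a single-use, locked-down container can look like, the sketch below builds a `docker run` command line from standard Docker CLI flags. It is not the setup described above, which is only inspired by Docker. The function name `docker_run_command`, the resource limits, the `/sandbox` mount point, and the example image are all assumptions made for this example. The function only assembles the argument list, so it can be inspected without Docker installed.

```python
import shlex


def docker_run_command(
    image: str,
    script_path: str,
    memory: str = "256m",
    cpus: str = "0.5",
    pids: int = 64,
) -> list[str]:
    """Build (but do not run) a docker command for one disposable execution."""
    return [
        "docker", "run",
        "--rm",                                  # delete the container when it exits
        "--network", "none",                     # no network access for guest code
        "--memory", memory,                      # cap RAM
        "--cpus", cpus,                          # cap CPU share
        "--pids-limit", str(pids),               # limit process count (fork bombs)
        "--read-only",                           # root filesystem is immutable
        "--tmpfs", "/tmp",                       # scratch space that vanishes on exit
        "--cap-drop", "ALL",                     # drop all Linux capabilities
        "--security-opt", "no-new-privileges",   # block privilege escalation via setuid
        "--user", "65534:65534",                 # run as an unprivileged uid/gid
        "--volume", f"{script_path}:/sandbox/main.py:ro",  # mount the script read-only
        image,
        "python", "/sandbox/main.py",
    ]


# Example image and host path are placeholders for illustration.
cmd = docker_run_command("python:3.12-slim", "/tmp/job-123/main.py")
print(shlex.join(cmd))
```

Because this sketch disables networking entirely, installing dependencies would have to happen in a separate, earlier step (for example while building the image), which is one of several design choices a real executor has to make.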

To Wrap Things Up

Our journey with arbitrary code execution has taught us some valuable lessons:

  • No Tool is Perfect: Every tool has limits to its effectiveness, so tools must be reviewed and updated regularly to stay safe.
  • Isolation is Critical: Code should be isolated from the rest of the system. Whether you use VMs, Docker, or anything else, isolation keeps a problem in one execution from spreading.
  • Clean Up After Each Use: Resetting the environment after every run keeps leftover data or threats from compromising later executions, keeping things safe for the next user.
  • Keep an Eye Out: Monitor continuously to detect unusual activity so you can respond before it turns into an attack.
