In this post, I will demonstrate a technique to escape the V8 sandbox by applying JIT spraying to WebAssembly. By choosing the immediate constants in a WebAssembly function, we can make the JIT compiler emit our shellcode bytes into executable memory, and by redirecting control flow we can then run them. This exploit assumes we have already achieved arbitrary read/write primitives within the sandbox and focuses on crossing the sandbox boundary to achieve full remote code execution.
void I64Const(FullDecoder* decoder, Value* result, int64_t value) {
  // The {VarState} stores constant values as int32_t, thus we only store
  // 64-bit constants in this field if it fits in an int32_t. Larger values
  // cannot be used as immediate value anyway, so we can also just put them in
  // a register immediately.
  int32_t value_i32 = static_cast<int32_t>(value);
  if (value_i32 == value) {
    __ PushConstant(kI64, value_i32);
  } else {
    LiftoffRegister reg = __ GetUnusedRegister(reg_class_for(kI64), {});
    __ LoadConstant(reg, WasmValue(value));
    __ PushRegister(kI64, reg);
  }
}
LiftoffCompiler::I64Const() calls LiftoffAssembler::LoadConstant() to generate instructions loading the constant value into reg.
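The branch in I64Const() can be modelled in a few lines of JavaScript. This is only a sketch of the range check, not V8 code: the names fitsInInt32 and i64ConstPath, and the string labels they return, are made up for illustration.

```javascript
// Hypothetical helper mirroring `static_cast<int32_t>(value) == value`.
// `value` is expected to be a BigInt within the signed 64-bit range.
function fitsInInt32(value) {
  // BigInt.asIntN(32, v) wraps v to a signed 32-bit value, like the C++ cast.
  return BigInt.asIntN(32, value) === value;
}

// Reports which branch of I64Const() a constant would take.
// The returned labels are illustrative only.
function i64ConstPath(value) {
  return fitsInInt32(value) ? "PushConstant" : "LoadConstant";
}

console.log(i64ConstPath(0x7fffffffn));         // fits in an int32_t
console.log(i64ConstPath(0x1122334455667788n)); // does not fit
```

Under this model, only constants that take the LoadConstant path are loaded into a register as a full 64-bit value.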
Assembler::emit_mov() emits the opcode for the 64-bit mov instruction followed by the raw value.
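To make the byte layout concrete, here is a minimal sketch of the standard x86-64 encoding of mov r64, imm64: a REX.W prefix, the opcode B8+r, then the immediate in little-endian order. The function encodeMovImm64 is a hypothetical helper written for this explanation, not V8's Assembler, and regCode is assumed to be the usual x86-64 register number (0 for rax, 10 for r10).

```javascript
// Hypothetical helper: encodes `mov r64, imm64` as a list of bytes.
// regCode is the x86-64 register number (0..15), value is a BigInt.
function encodeMovImm64(regCode, value) {
  const rex = 0x48 | (regCode >> 3);   // REX.W, plus REX.B for r8..r15
  const opcode = 0xb8 | (regCode & 7); // B8+r
  const bytes = [rex, opcode];
  let v = BigInt.asUintN(64, value);   // treat the constant as raw 64 bits
  for (let i = 0; i < 8; i++) {
    bytes.push(Number(v & 0xffn));     // little-endian: lowest byte first
    v >>= 8n;
  }
  return bytes;
}

const encoded = encodeMovImm64(0, 0x1122334455667788n);
console.log(encoded.map((b) => b.toString(16).padStart(2, "0")).join(" "));
```

The last eight bytes of the result are exactly the constant supplied by the WebAssembly code, which is the property the rest of this post relies on.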
JIT Spraying
JIT spraying is an exploitation technique for bypassing memory protections that prevent injected data from being executed. It abuses the fact that a JIT compiler emits immediate constants directly into the compiled code, which lets an attacker place shellcode bytes in executable memory. We can apply this technique to WebAssembly because Liftoff (the baseline compiler) emits the operand of the i64.const instruction directly into the compiled code, as analyzed above.
After Liftoff compiles the function f, we can observe the following instructions:
We can therefore place arbitrary 8-byte constants in the middle of the function's code. If we redirect the instruction pointer to the exact location of a constant, its bytes are decoded as instructions, that is, as shellcode.
Consequently, we can execute arbitrary 8-byte shellcode. However, 8 bytes is typically insufficient to execute a complex payload, such as execve("/bin/sh", 0, 0). To overcome this limitation, we can chain several shellcode segments using the relative jmp instruction.
let builder = new WasmModuleBuilder();
builder.addFunction("f", makeSig([], [])).addBody([]).exportFunc();
let instance = builder.instantiate();

instance.exports.f();
The JavaScript code above creates a WebAssembly module containing an empty function, then calls the exported function.
src/objects/js-function.tq
extern class JSFunction extends JSFunctionOrBoundFunctionOrWrappedFunction {
  shared_function_info: SharedFunctionInfo;
  context: Context;
  feedback_cell: FeedbackCell;
  @if(V8_EXTERNAL_CODE_SPACE) code: CodeDataContainer;
  @ifnot(V8_EXTERNAL_CODE_SPACE) code: Code;
  // Space for the following field may or may not be allocated.
  prototype_or_initial_map: JSReceiver|Map;
}
When an exported WebAssembly function is called, the function call handler reads the address of the function's CodeDataContainer object from the code field of the JSFunction object.
Next, the handler retrieves the entrypoint from the CodeDataContainer object and jumps to that address. This is not the WebAssembly function’s entrypoint, but rather the generic JS-to-Wasm wrapper.
The wrapper then retrieves the SharedFunctionInfo from the shared_function_info field of the JSFunction object.
@generateBodyDescriptor
extern class SharedFunctionInfo extends HeapObject {
  // function_data field is treated as a custom weak pointer. We visit this
  // field as a weak pointer if there is aged bytecode. If there is no bytecode
  // or if the bytecode is young then we treat it as a strong pointer. This is
  // done to support flushing of bytecode.
  @customWeakMarking
  function_data: Object;

  name_or_scope_info: String|NoSharedNameSentinel|ScopeInfo;
  outer_scope_info_or_feedback_metadata: HeapObject;
  script_or_debug_info: Script|DebugInfo|Undefined;
  // [length]: The function length - usually the number of declared parameters
  // (always without the receiver).
  // Use up to 2^16-2 parameters (16 bits of values, where one is reserved for
  // kDontAdaptArgumentsSentinel). The value is only reliable when the function
  // has been compiled.
  length: int16;
  // [formal_parameter_count]: The number of declared parameters (or the special
  // value kDontAdaptArgumentsSentinel to indicate that arguments are passed
  // unaltered).
  // In contrast to [length], formal_parameter_count includes the receiver.
  formal_parameter_count: uint16;
  function_token_offset: uint16;
  // [expected_nof_properties]: Expected number of properties for the
  // function. The value is only reliable when the function has been compiled.
  expected_nof_properties: uint8;
  flags2: SharedFunctionInfoFlags2;
  flags: SharedFunctionInfoFlags;
  // [function_literal_id] - uniquely identifies the FunctionLiteral this
  // SharedFunctionInfo represents within its script, or -1 if this
  // SharedFunctionInfo object doesn't correspond to a parsed FunctionLiteral.
  function_literal_id: int32;
  // [unique_id] - For --log-maps purposes, an identifier that's persistent
  // even if the GC moves this SharedFunctionInfo.
  @if(V8_SFI_HAS_UNIQUE_ID) unique_id: int32;
}
From there, it obtains the address of the WasmExportedFunctionData object, which is stored in the function_data field.
extern class WasmFunctionData extends HeapObject {
  // The wasm-internal representation of this function object.
  internal: WasmInternalFunction;
  // Used for calling this function from JavaScript.
  @if(V8_EXTERNAL_CODE_SPACE) wrapper_code: CodeDataContainer;
  @ifnot(V8_EXTERNAL_CODE_SPACE) wrapper_code: Code;
}

extern class WasmExportedFunctionData extends WasmFunctionData {
  // This is the instance that exported the function (which in case of
  // imported and re-exported functions is different from the instance
  // where the function is defined -- for the latter see WasmFunctionData::ref).
  instance: WasmInstanceObject;
  function_index: Smi;
  signature: Foreign;
  wrapper_budget: Smi;
  // The remaining fields are for fast calling from C++. The contract is
  // that they are lazily populated, and either all will be present or none.
  @if(V8_EXTERNAL_CODE_SPACE) c_wrapper_code: CodeDataContainer;
  @ifnot(V8_EXTERNAL_CODE_SPACE) c_wrapper_code: Code;
  packed_args_size: Smi;
  // Functions returned by suspender.returnPromiseOnSuspend() have this field
  // set to the host suspender object.
  suspend: Smi;  // Boolean.
}
Subsequently, it accesses the WasmInternalFunction referenced by the WasmExportedFunctionData object.
// This is the representation that is used internally by wasm to represent
// function references.
// The {foreign_address} field inherited from {Foreign} points to the call
// target.
extern class WasmInternalFunction extends Foreign {
  // This is the "reference" value that must be passed along in the "instance"
  // register when calling the given function. It is either the target instance
  // (for wasm functions), or a WasmApiFunctionRef object (for functions defined
  // through the JS or C APIs).
  // For imported functions, this value equals the respective entry in
  // the module's imported_function_refs array.
  ref: WasmInstanceObject|WasmApiFunctionRef;
  // The external (JS) representation of this function reference.
  external: JSFunction|Undefined;
  // This field is used when the call target is null.
  @if(V8_EXTERNAL_CODE_SPACE) code: CodeDataContainer;
  @ifnot(V8_EXTERNAL_CODE_SPACE) code: Code;
}
Finally, it reads the call target address from the WasmInternalFunction object and jumps to that address, which points to the WebAssembly jump table.
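The chain of lookups described above can be summarised with a toy object model. This is a sketch for orientation only: the field names follow the Torque definitions quoted in this post, but the objects are plain JavaScript objects, the address is made up, and resolveCallTarget is a hypothetical helper rather than anything in V8.

```javascript
// Toy stand-ins for the heap objects involved. The address is invented.
const internalFunction = {
  kind: "WasmInternalFunction",
  foreign_address: 0x7f0000001000n, // call target, outside the sandbox
};
const exportedFunctionData = {
  kind: "WasmExportedFunctionData",
  internal: internalFunction,
};
const sharedFunctionInfo = {
  kind: "SharedFunctionInfo",
  function_data: exportedFunctionData,
};
const jsFunction = {
  kind: "JSFunction",
  shared_function_info: sharedFunctionInfo,
};

// Follows the same sequence of fields as the wrapper and returns the
// address it would finally jump to.
function resolveCallTarget(fn) {
  return fn.shared_function_info.function_data.internal.foreign_address;
}

console.log("0x" + resolveCallTarget(jsFunction).toString(16));
```

Every object in this chain except the final target lives on the V8 heap, so the last hop is where a raw pointer is read from sandboxed memory.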
The WasmInternalFunction object resides on the heap inside the V8 sandbox, but its foreign_address field holds a raw pointer to executable machine code (a JIT page) located outside the sandbox. With a sandboxed arbitrary write primitive, we can overwrite this field with the address of our shellcode, which we compute by leaking the code address of f() and adding the offset of our constants. The next call to the exported function then jumps to the shellcode instead of the jump table.
The above patch moves the call target address into the external pointer table, which is located outside the V8 sandbox and referenced by an index (handle), thus preventing the direct overwrite.
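The effect of this indirection can be illustrated with a toy model. This is a deliberately simplified sketch, not V8's external pointer table: the table layout, the field name call_target_handle, the helper lookupCallTarget, and all values are assumptions made for illustration.

```javascript
// Toy table living "outside the sandbox": frozen here to model that a
// sandboxed write primitive cannot modify it. All values are made up.
const externalPointerTable = Object.freeze([0x0n, 0x7f0000001000n]);

// Inside the sandbox, the object now stores only an index (handle).
const sandboxedObject = { call_target_handle: 1 };

// Resolves a handle through the table; unknown handles are rejected.
function lookupCallTarget(obj) {
  const entry = externalPointerTable[obj.call_target_handle];
  if (entry === undefined) {
    throw new RangeError("invalid external pointer handle");
  }
  return entry;
}

console.log("0x" + lookupCallTarget(sandboxedObject).toString(16));

// An attacker who overwrites the in-sandbox field with a raw address
// no longer controls the jump target directly in this model.
sandboxedObject.call_target_handle = 0x41414141;
try {
  lookupCallTarget(sandboxedObject);
} catch (e) {
  console.log(e.message);
}
```

In this model the write primitive can at most select among entries that already exist in the table, since the raw address itself is no longer stored in sandboxed memory.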