Use-after-free when a decode hook re-enters feed() #695

@methane

Reported by @oakkaya


If a decode hook (object_hook / object_pairs_hook / list_hook / ext_hook) calls .feed()
on the same Unpacker while it is still unpacking, append_buffer() may reallocate the internal
buffer (allocate a new one, copy, then PyMem_Free the old one) while the in-progress
unpack_execute() is still reading from the old one. When the hook returns, the parser keeps
reading the remaining bytes from the freed buffer, which is a use-after-free. On a stock build
this is a hard crash (SIGSEGV).

The application supplies the (re-entrant) hook; the attacker controls the bytes, which decide when
the hook fires and how large the re-entrant feed grows the buffer.

Reproduction

import struct
from msgpack import Unpacker

up = None
def ext_hook(code, data):
    # re-entrant feed on the SAME unpacker, large enough to force a buffer realloc
    up.feed(b"\xc0" * 8_000_000)
    return 0

up = Unpacker(ext_hook=ext_hook, max_buffer_size=64 * 1024 * 1024)
# array(200): [ ext (fires the re-entrant hook), then 199 more elements ]
up.feed(b"\xdc" + struct.pack(">H", 200) + b"\xd4\x05A" + b"\x2a" * 199)
for _ in up:           # SIGSEGV
    pass

Under ASan:

ERROR: AddressSanitizer: heap-use-after-free  READ of size 1
    #0 unpack_execute msgpack/unpack_template.h:162
freed by: PyMem_Free  <-  Unpacker.feed -> append_buffer
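
For readers unfamiliar with the wire format, the PoC payload can be rebuilt piece by piece. This is only an illustrative breakdown using the standard msgpack format bytes; the constant names are made up here and are not msgpack-python identifiers.

```python
import struct

# Format bytes from the msgpack specification (names are illustrative only).
ARRAY16 = 0xDC    # array header followed by a 16-bit big-endian element count
FIXEXT1 = 0xD4    # ext value with exactly one data byte
EXT_TYPE = 5      # application-defined ext type used by the PoC
FIXINT_42 = 0x2A  # positive fixint encoding the integer 42

payload = (
    bytes([ARRAY16]) + struct.pack(">H", 200)  # array of 200 elements
    + bytes([FIXEXT1, EXT_TYPE]) + b"A"        # element 1: ext, triggers ext_hook
    + bytes([FIXINT_42]) * 199                 # elements 2..200: parsed after the hook returns
)

# Read the header back to confirm the layout.
kind, count = struct.unpack_from(">BH", payload, 0)
same_as_poc = payload == b"\xdc" + struct.pack(">H", 200) + b"\xd4\x05A" + b"\x2a" * 199
```

The 199 trailing integers matter: they are the bytes the parser still has to read after the hook has returned, so they are the ones read from the freed buffer.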

Root cause

Unpacker._unpack runs the parser over the internal buffer:

    ret = execute(&self.ctx, self.buf, self.buf_tail, &self.buf_head)

unpack_execute keeps local p / pe pointers into self.buf. A decode hook is invoked from
inside execute (at map/array end or for ext). If the hook calls up.feed(...),
append_buffer (_unpacker.pyx) reallocates the buffer:

    new_buf = <char*>PyMem_Malloc(new_size)
    ...
    memcpy(new_buf, buf + head, tail - head)
    PyMem_Free(buf)            # <-- frees the buffer the outer execute() is reading

After the hook returns, unpack_execute continues reading through its stale p / pe pointers,
which still point into the now-freed buffer.

(The same applies to a file_like.read() that re-enters feed()/unpack(); the unpacker is not
re-entrant but does not guard against it.)
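
The lifetime problem can be modeled in pure Python. The sketch below is a toy, not msgpack's code: ToyUnpacker, the 0xFF "call the hook" marker, and the POISON byte are all hypothetical, and the toy reallocates on every feed(), whereas the real append_buffer only does so when it needs more room. Python cannot read freed memory, so the old buffer is overwritten with a poison value to make the stale reads visible.

```python
class ToyUnpacker:
    """Hypothetical stand-in for the Cython Unpacker (illustration only)."""

    POISON = 0xDD  # written over a "freed" buffer so stale reads show up

    def __init__(self, hook):
        self.buf = bytearray()
        self.hook = hook

    def feed(self, data):
        # Mimic append_buffer's grow path: copy into a new buffer,
        # then "free" the old one (here: overwrite it with POISON).
        old = self.buf
        self.buf = bytearray(old) + bytearray(data)
        for i in range(len(old)):
            old[i] = self.POISON

    def execute(self):
        # Like unpack_execute, hold a local reference to the buffer
        # for the whole parse instead of re-reading self.buf.
        p = self.buf
        out = []
        for i in range(len(p)):
            byte = p[i]
            if byte == 0xFF:
                self.hook(self)  # hook runs in the middle of the parse
            else:
                out.append(byte)
        return out


# A re-entrant hook: feeds the same unpacker while execute() is running.
up = ToyUnpacker(hook=lambda u: u.feed(b"\x09"))
up.feed(b"\x01\xff\x02\x03")
out = up.execute()
```

In this model the element before the hook decodes correctly, while everything after it comes back as the poison value, because the local reference p still names the old buffer. In the C parser the equivalent read hits freed memory instead.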

Suggested fix (verified)

Add a re-entrancy guard so a buffer-mutating call during an active parse fails cleanly instead of
corrupting memory. Set a flag around the execute(...) call and reject feed() while it is set:

    # field:  cdef bint _in_exec      (init False in __init__)

    def feed(self, next_bytes):
        ...
        if self._in_exec:
            raise RuntimeError("Unpacker.feed() called re-entrantly during unpacking")
        ...

    # in _unpack, around the execute call:
                self._in_exec = True
                try:
                    ret = execute(&self.ctx, self.buf, self.buf_tail, &self.buf_head)
                finally:
                    self._in_exec = False

Verified: the PoC now raises RuntimeError instead of crashing (clean under ASan), and normal
streaming (feed() between objects), iteration, and the file_like path are unaffected
(read_from_file calls append_buffer between execute() calls, where the flag is not set, so
there is no false positive). A broader guard could also reject re-entrant unpack()/skip().
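
The guard pattern itself can be exercised outside Cython. The following is a minimal pure-Python sketch of the same idea, assuming a hypothetical GuardedToyUnpacker with a 0xFF "call the hook" marker; only the _in_exec flag and the try/finally shape mirror the proposed patch.

```python
class GuardedToyUnpacker:
    """Hypothetical toy showing the re-entrancy guard (not msgpack's code)."""

    def __init__(self, hook):
        self.buf = bytearray()
        self.hook = hook
        self._in_exec = False  # True only while execute() is running

    def feed(self, data):
        # Reject buffer mutation while a parse is in progress.
        if self._in_exec:
            raise RuntimeError("feed() called re-entrantly during unpacking")
        self.buf += data

    def execute(self):
        self._in_exec = True
        try:
            out = []
            for byte in bytes(self.buf):
                if byte == 0xFF:
                    self.hook(self)  # a re-entrant feed() raises here
                else:
                    out.append(byte)
            return out
        finally:
            # Always clear the flag, even if the hook raised.
            self._in_exec = False


up = GuardedToyUnpacker(hook=lambda u: u.feed(b"\x09"))
up.feed(b"\x01\xff\x02")

try:
    up.execute()
    error = None
except RuntimeError as exc:
    error = exc

# feed() between parses still works because the flag was reset in finally.
up.feed(b"\x03")
```

The finally clause is the important part of the design: without it, an exception raised inside a hook would leave the flag set and every later feed() would be rejected.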
