Was doing some passive recon on bentleymotors.com and stumbled onto a subdomain I hadn't seen before, id.bentleymotors.com. Running NGINX, clearly some kind of identity / login portal. Looked custom-built, which is always interesting because custom auth flows tend to have weird edges.
The first thing that caught my eye was the CSP header. Buried in img-src there was a wildcard pointing at https://*.vwgroup.io. That's a pretty strong signal this isn't actually a Bentley-owned stack. It's VW Group's shared "Identity Kit" platform that Bentley just happens to be a tenant on. Filed that away for later because it ends up mattering.
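For anyone who wants to spot the same kind of tell, here is a minimal sketch of pulling wildcard hosts out of a CSP directive. The header value in `sample_csp` is a made-up illustration, not the header the site actually returned, and `wildcard_sources` is a hypothetical helper name.

```python
def wildcard_sources(csp_header: str, directive: str = "img-src") -> list[str]:
    """Return the source expressions in `directive` that contain a wildcard host."""
    for policy_part in csp_header.split(";"):
        tokens = policy_part.split()
        # The first token of each part is the directive name, the rest are sources.
        if tokens and tokens[0].lower() == directive:
            return [src for src in tokens[1:] if "*." in src]
    return []


# Hypothetical header value, for illustration only.
sample_csp = (
    "default-src 'self'; "
    "img-src 'self' data: https://*.vwgroup.io; "
    "script-src 'self'"
)

print(wildcard_sources(sample_csp))  # ['https://*.vwgroup.io']
```

A wildcard pointing at a parent organisation's domain is only a hint about shared infrastructure, so it still needs confirming some other way.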
Started mapping routes. Most of them behaved. Hit /api/profile unauthenticated, got bounced with a 401, no big surprise. Then I tried /landing-page.
200 OK. Full internal application shell. HTML, JS routing, the API map, all of it, served to me with no session, no token, nothing. The frontend auth guard was clearly doing its job client-side, but the server wasn't enforcing anything on this route. Classic case of trusting the SPA to handle auth.
Remembering the VW Group CSP tie-in, I pointed the same request at id.vwgroup.io. Identical behavior, same route, same 200, same dump.
So this isn't a Bentley misconfiguration. It's a misconfiguration in the underlying Identity Kit platform that every VW Group tenant is inheriting. That raised the scope question real fast.
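The missing control here is a server-side check that runs before the shell is served. A rough sketch of what that could look like as WSGI middleware follows. Everything in it is an assumption for illustration: the route prefixes, the cookie name `session`, and the in-memory `VALID_SESSIONS` set stand in for whatever the real platform uses, which I don't know.

```python
from http.cookies import SimpleCookie

# Hypothetical values: the real route list and session store are unknown.
PROTECTED_PREFIXES = ("/landing-page", "/api/")
VALID_SESSIONS = {"example-session-id"}
SESSION_COOKIE = "session"


def has_valid_session(environ) -> bool:
    """Check the request's cookie header against the (stand-in) session store."""
    cookie = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    morsel = cookie.get(SESSION_COOKIE)
    return morsel is not None and morsel.value in VALID_SESSIONS


def require_session(app):
    """WSGI middleware: reject protected paths before the app shell is served."""
    def guarded(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path.startswith(PROTECTED_PREFIXES) and not has_valid_session(environ):
            start_response("401 Unauthorized", [("Content-Type", "text/plain")])
            return [b"authentication required\n"]
        return app(environ, start_response)
    return guarded


def shell_app(environ, start_response):
    """Stand-in for the application that serves the SPA shell."""
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<html>app shell</html>"]


app = require_session(shell_app)
```

The point of the design is that the decision happens on the server for every protected route, so the client-side guard becomes a convenience and not the only line of defence.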
While poking at authenticated routes, I noticed something weirder. Hit the OAuth callback with a bogus redirect:
Two things wrong here. First, the www-authenticate header tells me exactly where the app expects the access token to live ("in cookies"). That's free intel for an attacker: no need to guess where to stuff a stolen token.
Second, and this is the weird one, a failed auth attempt is mutating state. The server issues a valid session cookie (identity-kit-profile-session) on a request it just rejected with a 401. Error responses shouldn't be writing session state. At minimum it's sloppy; at worst it could be abused, depending on what downstream code trusts that cookie's presence.
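Both problems are easy to turn into a regression check on the defending side. The sketch below flags an error response that sets a cookie or that mentions cookies in its www-authenticate header. The header values in `sample_headers` are invented for illustration (only the cookie name comes from what I observed), and `audit_error_response` is a hypothetical helper.

```python
def audit_error_response(status: int, headers: list[tuple[str, str]]) -> list[str]:
    """Return findings for a 4xx response that writes state or discloses token handling."""
    findings = []
    if 400 <= status < 500:
        for name, value in headers:
            lname = name.lower()
            if lname == "set-cookie":
                cookie_name = value.split("=", 1)[0].strip()
                findings.append(f"error response sets cookie: {cookie_name}")
            elif lname == "www-authenticate" and "cookie" in value.lower():
                findings.append("www-authenticate discloses token location")
    return findings


# Hypothetical header values, for illustration only.
sample_headers = [
    ("WWW-Authenticate", 'Bearer error="invalid_token", '
                         'error_description="access token not found in cookies"'),
    ("Set-Cookie", "identity-kit-profile-session=abc123; Path=/; HttpOnly"),
]

for finding in audit_error_response(401, sample_headers):
    print(finding)
```

Run against recorded responses in CI, a check like this would catch either behavior coming back after a fix.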
The moment I confirmed this was VW Group shared infrastructure and not just Bentley, I stopped. Bentley's bug bounty scope is Bentley's IT systems, and going further on a platform that handles identity for the entire VW Group would've blown past that line. Reported both findings to Bentley, noted the VW Group tie-in, and let them handle escalation internally.
As of writing, both issues are still unpatched. The landing page route still serves the internal app shell to unauthenticated users, and the 401 response still sets the session cookie and discloses the access token location.
Was hunting for LPE primitives that don't need setuid. Spent a while on race conditions in the VFS layer, kept hitting dead ends, and pivoted to logic bugs in the idmapping chain. The idmapped mount feature is relatively new and the interaction with OverlayFS felt underexplored, so I started reading fs/overlayfs/inode.c.
The setup that matters: a host running a container with a user namespace, where container UID 0 maps to something like host UID 100000. Inside that container, you mount an idmapped mount that does the UID translation, and then you stack OverlayFS on top of it. The lower layer is the idmapped mount, the upper layer is container-local. The kernel is supposed to use the idmap for every permission check on the OverlayFS files so the container's UID 0 never gets confused with host UID 0.
The affected functions in inode.c (ovl_setattr(), ovl_permission(), and ovl_set_acl()) receive the correct idmap argument from the VFS, then throw it away and use &nop_mnt_idmap instead. nop_mnt_idmap is the identity mapping, meaning no UID/GID translation happens at all.
So every permission check on OverlayFS files goes through the identity map. Container UID 0, which the host is supposed to translate to UID 100000, never gets translated. It's just UID 0 all the way down.
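To make the difference concrete, here is a toy model of the translation in Python. It is not kernel code: `IdMapExtent` and `map_to_host` are hypothetical names, the 65536-ID range is an assumed typical container mapping, and the identity extent only models what nop_mnt_idmap does conceptually.

```python
from typing import NamedTuple, Optional


class IdMapExtent(NamedTuple):
    inside_first: int   # first ID as seen inside the container / mount
    outside_first: int  # first ID it maps to on the host
    count: int          # number of consecutive IDs covered


def map_to_host(uid: int, extents: list[IdMapExtent]) -> Optional[int]:
    """Translate an inside UID to a host UID, or None if it is unmapped."""
    for ext in extents:
        if ext.inside_first <= uid < ext.inside_first + ext.count:
            return ext.outside_first + (uid - ext.inside_first)
    return None


# Assumed mapping: container UID 0 maps to host UID 100000.
container_map = [IdMapExtent(0, 100000, 65536)]
# Model of an identity mapping: every UID maps to itself.
identity_map = [IdMapExtent(0, 0, 2**32 - 1)]

print(map_to_host(0, container_map))  # 100000
print(map_to_host(0, identity_map))   # 0
```

With the container mapping, UID 0 lands on an unprivileged host UID. With the identity mapping, it stays 0, which is the whole problem.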
The attack chain is straightforward once the bug is clear. Inside the container, you create a file through the OverlayFS mount. Then you chmod or chown it. Because ovl_setattr uses nop_mnt_idmap, the ownership change happens in the host's UID space directly, with no translation. Container UID 0 becomes host UID 0. The permission check passes because, as far as the kernel is concerned, you ARE UID 0 on the host.
No setuid file involved. No race condition. No kernel address leak needed. The whole thing is a clean logic bug.
A minimal PoC looks like this:
While auditing around the same area I found a handful of other permission gaps. None of them are as clean as the OverlayFS one, but they're worth noting.
lookup_noperm. This helper and its siblings do a dcache lookup with no inode_permission() call. They're exported to filesystem modules. A malicious or buggy module can use them to walk paths without DAC checks. Severity is high but you need a filesystem module, which limits the attack surface.
O_PATH bypass. Opening a file with O_PATH sets f_op = &empty_fops and returns early, before security_file_open() and fsnotify_open_perm() get called. You can then reopen the fd via /proc/self/fd/ and the LSM never sees the original open. Medium severity, but useful for flying under monitoring.
xattr DAC skip. Setting xattrs with the security. or system. prefix skips the inode_permission() call entirely. You can write security xattrs without proper DAC checks. Medium severity.
ATTR_FORCE bypass. If ATTR_FORCE is set in the iattr mask, the code jumps straight to kill_priv and skips every permission check. Internal callers or buggy filesystems can use this to bypass chown/chmod checks entirely. Medium, internal-triggered.
Two more threads I want to pull on but haven't confirmed yet.
FUSE as OverlayFS lower layer. If the kernel allows a FUSE filesystem as the lower layer of an OverlayFS mount, you could craft a FUSE fs that returns files owned by the attacker with capability xattrs set. When OverlayFS does copy-up, the upper layer ends up with attacker-owned files with capabilities. That would be critical if it works. I haven't verified whether the kernel actually allows FUSE as a lower layer yet.
Idmapped mount edge cases. The fallback when mapping fails (INVALID_UID internally, which userspace sees as the overflow UID 65534 / nobody) could grant access to files owned by unmapped UIDs. There's also a potential capability namespace mismatch in capable_wrt_inode_uidgid(), which uses mnt_userns while the inode UID might be in a different namespace. Nested namespaces with overlapping mappings could cause privilege confusion. All of these need verification.
| vulnerability | severity | exploitable | works without setuid | priority |
|---|---|---|---|---|
| nop_mnt_idmap | critical | yes | yes | #1 |
| FUSE lower layer | critical | tbd | yes | #2 |
| lookup_noperm | high | needs module | yes | #3 |
| idmapped mount bugs | high | tbd | yes | #4 |
| O_PATH bypass | med | yes | yes | #5 |
| xattr DAC skip | med | yes | yes | #6 |
| ATTR_FORCE bypass | med | internal | yes | #7 |
The fix for the main bug is a one-liner per call site. Pass the idmap that was already passed in.
Same fix in ovl_permission() at line 309 and ovl_set_acl() at lines 537 and 546. Reported and patched upstream.
Was looking at the futex subsystem on Linux 6.12.74+deb13+1-amd64. Originally chasing race conditions in the private hash code, but CONFIG_FUTEX_PRIVATE_HASH isn't compiled into this kernel and the FUTEX2_NUMA TOCTOU path isn't implemented. Dead ends. Pivoted to the robust futex list, which is older code and gets walked every time a thread exits.
The relevant path is exit_robust_list() calling into handle_futex_death(). The list head lives in userspace. The kernel reads it, walks the entries, and for each entry computes the futex address as entry + futex_offset.
futex_offset is a signed 64-bit long read straight from userspace with get_user. No bounds check, no VMA validation, no magnitude check, no verification that the computed address is inside the process's own memory. Positive offset goes forward from entry, negative offset goes backward, full 64-bit range.
The compat path (compat_exit_robust_list()) has the same bug, with the added weirdness that 32-bit compat mode truncates pointers, so the address calculation can land somewhere unexpected. Same root cause, slightly different shape.
Inside handle_futex_death(), the kernel does a get_user on the computed address. That's an arbitrary userspace read. Then it masks the value with FUTEX_TID_MASK (0x3fffffff) and checks whether it equals the dying thread's TID. If it matches, it does a cmpxchg that sets bit 30 (FUTEX_OWNER_DIED).
So the primitive is: read any userspace address, and if you can place the dying thread's TID in the low 30 bits at that address, set bit 30 on it. There's also a secondary path: if FUTEX_WAITERS (bit 31) is set and it's not a PI futex, the kernel calls futex_wake() on the arbitrary address. That gives you a cross-process futex wake primitive on addresses the dying task never actually held.
| capability | details |
|---|---|
| read | arbitrary userspace read via get_user() |
| write condition | (value & 0x3fffffff) == dying_thread_tid |
| write effect | value = (value & 0x80000000) \| 0x40000000 |
| bit modified | bit 30 (OWNER_DIED) |
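
The conditional update in the table is easier to see as plain arithmetic. This is a userspace model of the value transformation only, not the kernel function itself; `owner_died_update` is a hypothetical name, while the three constants are the futex flag values already quoted above.

```python
FUTEX_WAITERS = 0x80000000     # bit 31
FUTEX_OWNER_DIED = 0x40000000  # bit 30
FUTEX_TID_MASK = 0x3FFFFFFF    # low 30 bits


def owner_died_update(value: int, dying_tid: int) -> int:
    """Model the conditional update applied to a 32-bit futex word."""
    if (value & FUTEX_TID_MASK) != dying_tid:
        return value  # TID does not match: the word is left alone
    # Keep only the WAITERS bit and set OWNER_DIED; the TID bits are cleared.
    return (value & FUTEX_WAITERS) | FUTEX_OWNER_DIED


print(hex(owner_died_update(1234, 1234)))                  # 0x40000000
print(hex(owner_died_update(FUTEX_WAITERS | 1234, 1234)))  # 0xc0000000
print(hex(owner_died_update(1235, 1234)))                  # 0x4d3
```

Note that a matching word doesn't just gain bit 30: the low 30 bits are wiped too, which matters when thinking about what the write can and can't do to a target.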
The conditional bit-30 write is restrictive on its own, but the arbitrary read is immediately useful. You can leak stack canaries, heap pointers, and any other userspace data without any special privileges. The read doesn't require the TID condition; it fires on every list entry.
For the write, the TID condition is manageable. The dying thread's TID is known (it's the thread calling exit), so you pre-place that value at the target address before the thread exits. The bit-30 write can corrupt flags in data structures, with two caveats: the update goes through the kernel's user-access helpers, so the target has to be a user-mapped address, and the TID bits get cleared at the same time bit 30 is set. That rules out kernel objects like struct file (whose FMODE_LSEEK bit lives in f_mode, not f_flags, in any case), so the realistic targets are flag words in userspace data structures where bit 30 gates a code path.
The futex wake on arbitrary addresses is the most directly weaponizable path. The kernel resolves the address in the dying task's own address space, so the cross-process case needs memory shared with the target. If you point the robust list at an address in such a shared mapping where you've placed a value with the dying thread's TID in the low bits and FUTEX_WAITERS set, the kernel will call futex_wake() on that address. Any thread in the target process that's doing a FUTEX_WAIT on that address gets woken spuriously. That's a denial of service primitive, and in the right conditions it can desynchronize locking logic in multi-threaded programs.
The cleanest escalation path I see is using the arbitrary read to leak a pointer from the stack or heap, then using the conditional write to corrupt a function pointer or vtable entry. The bit-30 constraint means you need to find a target where setting that bit actually changes control flow. In practice this means targeting a flags field that gates a code path, not a pointer directly.
Alternatively, if you have a second vulnerability that gives you an arbitrary write (even a limited one), the futex primitive serves as the oracle. Read any address, use the information to calculate the right offset for the second vuln, and chain them. The futex bug doesn't need to be the final step; it just needs to be the information-leak step.
The fix is straightforward: validate futex_offset before using it. At minimum, check that entry + futex_offset falls within a valid VMA owned by the calling process. Ideally, cap the magnitude to something reasonable (page-aligned, within the range of the containing VMA).
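As a sanity check on what that validation would need to cover, here is a small Python model of the address computation and the proposed bounds check. It is a sketch of the logic only, not a kernel patch: `offset_is_acceptable`, the `mappings` list of (start, end) ranges, and the 4096-byte `max_offset` cap are all assumptions I picked for illustration.

```python
MASK64 = (1 << 64) - 1
FUTEX_WORD_SIZE = 4  # a futex word is 32 bits


def futex_address(entry: int, futex_offset: int) -> int:
    """entry + futex_offset with 64-bit wraparound, as pointer arithmetic behaves."""
    return (entry + futex_offset) & MASK64


def offset_is_acceptable(entry: int, futex_offset: int,
                         mappings: list[tuple[int, int]],
                         max_offset: int = 4096) -> bool:
    """Accept the offset only if it is small and stays inside the entry's mapping."""
    if abs(futex_offset) > max_offset:  # hypothetical magnitude cap
        return False
    addr = futex_address(entry, futex_offset)
    return any(
        start <= entry < end and start <= addr and addr + FUTEX_WORD_SIZE <= end
        for start, end in mappings
    )


# Hypothetical layout: one mapping covering 0x1000..0x3000.
mappings = [(0x1000, 0x3000)]
print(offset_is_acceptable(0x2000, 16, mappings))        # True
print(offset_is_acceptable(0x2000, -0x2000, mappings))   # False
print(offset_is_acceptable(0x2000, 0x1000, mappings))    # False
```

The third case is the interesting one: the offset is under the cap, but the computed address falls off the end of the mapping, so the magnitude check alone is not enough.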
Same fix needed in compat_exit_robust_list() with the appropriate 32-bit offset type. As of writing, this is unpatched. The robust list path has been stable for over a decade and nobody seems to have audited the offset handling.
| property | value |
|---|---|
| kernel tested | 6.12.74+deb13+1-amd64 |
| primitive | arbitrary read + conditional bit-30 write + futex wake |
| privileges required | none |
| setuid needed | no |
| user namespace escape | requires chaining |
| patched | no |
| reported | pending |
completed targets get moved to their respective writeups above