NPE in IndexedXMLObjectChildrenList

Description

We're seeing around a dozen or two of these errors each day in production:

 

ERROR [net.shibboleth.idp.saml.profile:-2] - Uncaught runtime exception java.lang.NullPointerException: null at org.opensaml.core.xml.util.ListView.isEmpty(IndexedXMLObjectChildrenList.java:332)

Unfortunately this is only happening in production so I haven't been able to turn on DEBUG logging to troubleshoot further.

Please let me know if I can provide additional information.

 

Environment

None

Activity

Show:

Brent Putman March 4, 2021 at 5:04 AM

Huge thanks for that, it's a big help.

Unfortunately the lack of any future errors won't necessarily mean it's fixed (maybe statistically suggests it), but if you do see continued errors we'll positively know it's not fixed.

 

Scott Cantor March 4, 2021 at 1:15 AM

I've deployed a patched opensaml-core to production tonight.

Brent Putman March 3, 2021 at 6:53 PM
Edited

That might be true in a general case, but here I don't think it's an issue. It's because the critical section is just the classic "check map and create key/value if it doesn't exist". So it (logically) should actually only happen once for a given key, the first time something requests that QName index. It should be ok now, since that's guarded and so (physically) it will absolutely only happen once - only 1 thread can now eval "!objectIndex.containsKey(index)" as true.

And then the code below checkAndCreateIndex doesn't mutate anything, just reading the map, which is now guaranteed to be in a consistent state.

My read of the race is: The actual error was an NPE in ListView#isEmpty(). That must be because the view's indexList member is null. That means that in the view's ctor the "indexList = backingList.get(index)" must have assigned null. And I think that could happen if one thread was partially through checkAndCreateIndex such that a second thread evaled the "!objectIndex.containsKey(index)" as false, fell through and later (before the first thread finished the critical section) called the map get() before the value was actually available from the map.

I'm pretty sure that can happen since - and I always forget this - the plain Java collection types (at least the hashtable-based ones) are NOT thread-safe for mutation. That is in fact the reason for the existence for the ConcurrentHashMap, -HashSet impls - to at least allow the collection to be structurally correct and not corrupted, etc in the fact of concurrent mods.  Esp important since those will internally re-hash and re-structure themselves based on the capacity algorithm, when you add new items.  (They don't of course help in this kind of logical concurrency issue of "check and create if doesn't exist", because that involves 2+ calls to the collection).

Scott Cantor March 3, 2021 at 1:24 PM

I haven't exactly figured out what the actual race is, but my thought was that it might be a problem to have one thread reading the collection below the call to checkAndCreateIndex at the same time something else is inside it.

Since both threads have to go through that initial step, I can imagine it's fine, but since I don't quite understand what's happening now I'm being paranoid.

I'll try and get a patched version in but I really haven't seen the issue often, so it probably won't prove it's fixed unfortunately.

Brent Putman March 3, 2021 at 4:17 AM

Yes, but simply reading the map in multiple threads doesn't cause an issue, I think. The initial setup and mutation of the index map for a given QName key is now governed by the mutex, which is where I think the problem is. If you see otherwise, let me know. Always good to have multiple sets of eyes on concurrency problems (semi-pun intended).

Yeah, if you could cherry pick this and test in your production IdP that would be awesome.

Fixed
Pinned fields
Click on the next to a field label to start pinning.

Details

Assignee

Reporter

Fix versions

Affects versions

Created November 10, 2020 at 11:04 PM
Updated March 24, 2021 at 3:41 PM
Resolved March 3, 2021 at 2:49 AM

Flag notifications