Disassembler stage may result in schema-invalid metadata
Description
Activity
Ian Young August 1, 2024 at 7:11 PM
It’s possible that this ends up being the same as , or at least that it would be addressed by the same fix.
Ian Young November 20, 2019 at 3:07 PM
There is no fix for this issue today, which is why it's still "open".
It is, unfortunately, not an easy thing to fix, because the Java API we use to break down EntitiesDescriptors simply does (what I would describe as) the wrong thing. Rather than producing an extracted copy of a node that includes definitions of all namespace prefixes in scope in the source document, it apparently includes only those used in the subtree, in a sense that excludes use within xsi:type attributes. Fixing the issue would probably require working out the complete namespace context and then synthesising new declarations to fill in the gaps.
I'm not aware of any consequences of stripping AttributeValue/@xsi:type attributes from metadata. The UK federation has always taken this approach and I have never heard of a problem resulting from it. To a certain extent, this has lowered the priority of this issue in my mind. Given that it has come to the fore again, I'd like to address it for 0.10.0 if I can, but I can't guarantee that given the lack of insight we currently have into the cause.
Former user November 20, 2019 at 2:50 PM
Our old MDA installation for SWITCHaai also choked on this. Is there a release where this issue is fixed?
What are the consequences of stripping xsi:type attributes?
Ian Young November 19, 2019 at 5:25 PM
The same issue came up today for the Canadian federation when the eduGAIN aggregator changed the way it was handling namespaces, such that the declarations required by xsi:type values moved from the element where xsi:type was used up to a point beyond the boundary of individual entities.
The simplest fix for this in practice has been to strip all xsi:type attributes from AttributeValue elements, which is what both the UK federation and InCommon do.
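For anyone wanting to see what that stripping amounts to, here is a minimal sketch using the plain JAXP DOM API. It is not the configuration either federation actually uses; the class and method names (XsiTypeStripper, parse, strip) are made up for illustration, and it assumes AttributeValue means the element in the SAML 2.0 assertion namespace.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper, not part of the MDA: drops xsi:type from saml:AttributeValue elements. */
final class XsiTypeStripper {
    static final String SAML_NS = "urn:oasis:names:tc:SAML:2.0:assertion";
    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";

    /** Parses XML with namespace awareness switched on (JAXP leaves it off by default). */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    /** Removes xsi:type from every AttributeValue below root; returns the number removed. */
    static int strip(Element root) {
        NodeList values = root.getElementsByTagNameNS(SAML_NS, "AttributeValue");
        int removed = 0;
        for (int i = 0; i < values.getLength(); i++) {
            Element value = (Element) values.item(i);
            if (value.hasAttributeNS(XSI_NS, "type")) {
                value.removeAttributeNS(XSI_NS, "type");
                removed++;
            }
        }
        return removed;
    }
}
```

Calling XsiTypeStripper.strip on the aggregate's document element before disassembly removes the only place where the otherwise invisible prefix is used, which is why the missing declaration stops mattering.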
Ian Young September 3, 2011 at 11:54 AM
Here's a reasonably short example. Moving the xmlns:xs definition down from the EntitiesDescriptor to the EntityAttributes element results in schema-valid metadata after disassembly.
It looks like when the disassembler stage creates a new DomElementItem, the namespace context constructed for the new item contains only those namespace prefix definitions that are visibly used in elements and attributes in the subtree. This seems to optimise out the case of a namespace prefix definition which is only used in an attribute of QName type: this is only visible as a use of the prefix if you know the schema that defines the attribute, which the disassembler does not.
As a result, the new DomElementItem can fail schema checks even when the original document would pass.
The case I came across was of an xsi:type="xs:string" in an EntityAttributes element, courtesy of Steven Carmody. With xmlns:xs defined on the aggregate, everything validates at the aggregate level. The individual EntityDescriptor is faulted, however:
checkSchemas: UndeclaredPrefix: Cannot resolve 'xs:string' as a QName: the prefix 'xs' is not declared.
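The condition behind that error can be checked without a schema. The following is a rough sketch, not how checkSchemas works internally: it walks an element's subtree and reports any prefix used in an xsi:type value that has no declaration in scope. The class and method names are hypothetical, and it only looks at xsi:type, not at other QName-typed attributes.

```java
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical diagnostic, not part of the MDA: finds xsi:type prefixes with no declaration. */
final class XsiTypePrefixCheck {
    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";

    /** Parses XML with namespace awareness switched on. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    /** Returns the prefixes used in xsi:type values under root that are not declared in scope. */
    static Set<String> undeclaredPrefixes(Element root) {
        Set<String> missing = new TreeSet<>();
        collect(root, missing);
        return missing;
    }

    private static void collect(Element element, Set<String> missing) {
        // getAttributeNS returns an empty string when the attribute is absent.
        String type = element.getAttributeNS(XSI_NS, "type").trim();
        int colon = type.indexOf(':');
        if (colon > 0) {
            String prefix = type.substring(0, colon);
            // lookupNamespaceURI searches this element and its ancestors for a declaration.
            if (element.lookupNamespaceURI(prefix) == null) {
                missing.add(prefix);
            }
        }
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                collect((Element) child, missing);
            }
        }
    }
}
```

Run against the aggregate, where xmlns:xs sits on the EntitiesDescriptor, this should report nothing; run against an EntityDescriptor that has lost that declaration, it should report xs.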
One workaround appears to be to write documents such that the problematic prefix is defined within each EntityDescriptor, but obviously any kind of namespace normalisation occurring before the disassembly stage would negate that.
Fixing this by carrying across all namespace prefix declarations in scope for the EntityDescriptor (but not appearing on the newly created one) might work, but could leave genuinely unused declarations cluttering things up and needing to be cleaned up later. Then again, we already knew we needed more sophisticated namespace normalisation.
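To make the carry-across idea concrete, here is a minimal sketch in plain DOM terms. It is an illustration under assumptions rather than a proposed patch: the names NamespaceCarryOver and carryOver are invented, it assumes the original element is still attached to the aggregate while the copy is being fixed up, and it makes no attempt to prune declarations that turn out to be unused.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper, not part of the MDA: copies inherited namespace declarations. */
final class NamespaceCarryOver {
    /** Namespace that the DOM assigns to xmlns and xmlns:* attributes. */
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    /** Parses XML with namespace awareness switched on. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    /**
     * Adds to target every namespace declaration found on source's ancestors
     * that target does not already carry. Returns the number of declarations added.
     */
    static int carryOver(Element source, Element target) {
        int added = 0;
        // Walk from the nearest ancestor outwards, so nearer declarations win.
        for (Node n = source.getParentNode(); n instanceof Element; n = n.getParentNode()) {
            NamedNodeMap attrs = n.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Attr attr = (Attr) attrs.item(i);
                if (!XMLNS_NS.equals(attr.getNamespaceURI())) {
                    continue; // not a namespace declaration
                }
                // Local name is the prefix, or "xmlns" for the default namespace.
                if (target.hasAttributeNS(XMLNS_NS, attr.getLocalName())) {
                    continue; // already declared on target or by a nearer ancestor
                }
                target.setAttributeNS(XMLNS_NS, attr.getNodeName(), attr.getValue());
                added++;
            }
        }
        return added;
    }
}
```

This copies everything in scope, used or not, so it trades the validation failure for exactly the clutter described above; deciding which declarations are really needed is the part that requires schema knowledge.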
I will try and come up with a small example document.