Default logging setup in BasicParserPool considered harmful
Basics
Logistics
Basics
Logistics
Description
TL;DR: The logging setup in BasicParserPool seems inconsistent, and interacts with the implementation of LoggingErrorHandler to (in my opinion) over-log minor errors that are actually handled elsewhere.
At length:
BasicParserPool}} builds a {{DocumentBuilder}} as a result of properties set on the bean. One aspect of this is the (optional) {{errorHandler, set as follows (l.590):
if (errorHandler != null) {
builder.setErrorHandler(errorHandler);
}
This seems to imply that errorHandler can be null, and that if so the default behaviour of DocumentBuilder is required. In fact, this code is never executed because:
BasicParserPool's constructor initialises errorHandler with a LoggingErrorHandler, and
setErrorHandler does not permit the property to be set to null
This seems like an example of an inconsistent intent. Is the default behaviour of a parser in the absence of an explicit ErrorHandler supposed to be obtainable, or not?
This effect is compounded by the implementation of LoggingErrorHandler:
This not only logs the exception, but also a full stack trace. Given that the exception is also thrown so that it can be caught by another part of the code and processed there, the stack trace seems like overkill. It certainly makes logs unnecessarily hard to read when they are regularly interrupted by blocks of stack.
From first principles, I would not have expected an error to be logged and then re-thrown (to be potentially logged a second time at a higher level). This is pretty much baked into the default behaviour of BasicParserPool now, though.
I think it's much less clear that it should not be possible to configure BasicParserPool with a null error handler, or, failing that, to provide a non-logging class alongside LoggingErrorHandler.
Finally, I don't think that LoggingErrorHandler should be causing a stack trace for any error. That seems to me to be debug-level information. Two alternatives would be to just log the exception itself, perhaps including the cause; or logging the stack trace on error only if DEBUG was enabled for the logger.
It's best to require an ErrorHandler because this is one of the "standard" XMLisms that's fairly non-intutive. Non-fatal errors by default are typically not reported by the default error handling in the parser. We probably don't really need a logging variant, just throwing would be fine.
I did note the stack trace on this is extremely minimal, it's about 2 lines, unlike the more typical ones. The big ones you're seeing are probably from all the other areas we do this poorly, because that log and rethrow issue is endemic.
For now, I corrected the javadocs and made the code clearer, and add a flag defaulting off to log the exception to eliminate the stack trace. But the metadata layer logs the same error about 4 times, so that's the bigger challenge.
TL;DR: The logging setup in
BasicParserPool
seems inconsistent, and interacts with the implementation ofLoggingErrorHandler
to (in my opinion) over-log minor errors that are actually handled elsewhere.At length:
BasicParserPool}} builds a {{DocumentBuilder}} as a result of properties set on the bean. One aspect of this is the (optional) {{errorHandler
, set as follows (l.590):if (errorHandler != null) { builder.setErrorHandler(errorHandler); }
This seems to imply that
errorHandler
can benull
, and that if so the default behaviour ofDocumentBuilder
is required. In fact, this code is never executed because:BasicParserPool
's constructor initialiseserrorHandler
with aLoggingErrorHandler
, andsetErrorHandler
does not permit the property to be set tonull
This seems like an example of an inconsistent intent. Is the default behaviour of a parser in the absence of an explicit
ErrorHandler
supposed to be obtainable, or not?This effect is compounded by the implementation of
LoggingErrorHandler
:public void error(final SAXParseException exception) throws SAXException { log.error("XML Parsing Error", exception); throw exception; }
This not only logs the exception, but also a full stack trace. Given that the exception is also thrown so that it can be caught by another part of the code and processed there, the stack trace seems like overkill. It certainly makes logs unnecessarily hard to read when they are regularly interrupted by blocks of stack.
From first principles, I would not have expected an error to be logged and then re-thrown (to be potentially logged a second time at a higher level). This is pretty much baked into the default behaviour of
BasicParserPool
now, though.I think it's much less clear that it should not be possible to configure
BasicParserPool
with anull
error handler, or, failing that, to provide a non-logging class alongsideLoggingErrorHandler
.Finally, I don't think that
LoggingErrorHandler
should be causing a stack trace for any error. That seems to me to be debug-level information. Two alternatives would be to just log the exception itself, perhaps including the cause; or logging the stack trace on error only if DEBUG was enabled for the logger.