Libcurl versions before 7.20 do not refresh caches of IdP IP address look-ups - this breaks SAML1 attribute query if the IdP's IP address is changed

Description

Please see this shibboleth-users discussion, in particular the last six emails:

http://groups.google.com/group/shibboleth-users/browse_thread/thread/03b347520659428c/1b866419a1198144?pli=1

There is a bug in libcurl versions lower than 7.20: once an IdP's IP address has been looked up for attribute query purposes, it is cached and no further DNS lookups are made for that hostname until the SP is restarted. The bug was fixed in libcurl 7.20; see http://daniel.haxx.se/blog/2010/02/09/a-big-curl-forward/ and http://curl.haxx.se/changes.html

So, if an IdP's IP address is changed and the SP runs up against this bug, the SP operator has to restart shibd to get things working again for that IdP/SP pair.
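As a rough aid for operators, the version threshold above can be checked with a few lines of code. This is only a sketch: `parseVersion` and `hasStaleDnsCacheBug` are hypothetical helper names, and the version string is assumed to be the one reported by the libcurl that shibd actually loads, which is not necessarily the distribution's default package.

```cpp
#include <array>
#include <cstddef>
#include <sstream>
#include <string>

// Parse "major.minor.patch" into three integers; missing parts stay 0.
std::array<int, 3> parseVersion(const std::string& text) {
    std::array<int, 3> parts = {{0, 0, 0}};
    std::istringstream in(text);
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (!(in >> parts[i])) {
            parts[i] = 0;
            break;
        }
        if (in.peek() != '.') break;  // no further component follows
        in.get();                     // consume the '.'
    }
    return parts;
}

// True if the given libcurl version predates 7.20, the release in which
// the stale DNS cache behaviour described in this ticket was fixed.
bool hasStaleDnsCacheBug(const std::string& libcurlVersion) {
    const std::array<int, 3> fixedIn = {{7, 20, 0}};
    return parseVersion(libcurlVersion) < fixedIn;  // lexicographic compare
}
```

For example, `hasStaleDnsCacheBug("7.15.5")` is true for the CentOS 5 / RedHat 5 package mentioned in this ticket, while `hasStaleDnsCacheBug("7.26.0")` is false.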

I know that this is not an SP bug, but the problem is that the CentOS5/RedHat5 libcurl is only 7.15.5! This is something of a problem for operators of SPs on those platforms. But I know that the CentOS 6 and RH6 platforms are not affected, even though they also ship a version of libcurl lower than 7.20, because of this:

https://wiki.shibboleth.net/confluence/display/SHIB2/NativeSPLinuxRH6

Might I suggest that this solution is built for CentOS/RH5 as well, in order to work around this libcurl bug?

Environment

CentOS 5, RedHat 5, any other distribution that ships a version of libcurl before 7.20
CentOS 6 and RedHat 6 are not affected; please see the Description for the reason why.

Activity

Scott Cantor March 5, 2013 at 3:23 PM

Red Hat, as expected, just closed the bug WONTFIX.

Scott Cantor June 26, 2012 at 5:03 PM

Bug filed:
https://bugzilla.redhat.com/show_bug.cgi?id=835639

Will discuss on dev call, but my inclination given that RH5 EOL is 2017 is that we consider dumping the built-in version and force the override to a current package. I don't see how we can expect to stay with 7.15 for 5 more years when 7.26 is out now.

Scott Cantor June 26, 2012 at 4:23 PM

http://svn.shibboleth.net/view/cpp-xmltooling?rev=981&view=rev

I set the timeout just for the heck of it, but that shouldn't help.

I added a bit of code that prevents handles/connections from being saved to the connection pool if there's a failure during basic HTTP send/receive. My hope is that the bug might be mitigated by throwing away failing connections so that the DNS cache entries get freed up. Since most cases probably involve the old server failing to respond, the hope is that detecting failure at that layer is enough to help out.

I'm going to look into opening a Red Hat bug.
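The mitigation described in this comment can be sketched roughly as follows. This is not the committed xmltooling code (that is in the revision linked above); `Handle` and `HandlePool` are hypothetical names, and `Handle` merely stands in for whatever transport handle the real pool stores. The point is only that a handle whose request failed is destroyed instead of being returned to the pool.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pooled transport handle.
struct Handle {
    std::string host;
};

class HandlePool {
public:
    // Reuse an idle handle for the host if one exists, else create one.
    std::unique_ptr<Handle> acquire(const std::string& host) {
        std::vector<std::unique_ptr<Handle>>& idle = idle_[host];
        if (idle.empty())
            return std::unique_ptr<Handle>(new Handle{host});
        std::unique_ptr<Handle> handle = std::move(idle.back());
        idle.pop_back();
        return handle;
    }

    // Give a handle back after use. Only handles whose send/receive
    // succeeded are pooled; a failed handle is destroyed here, taking
    // any state it cached (such as a stale address) with it.
    void release(std::unique_ptr<Handle> handle, bool succeeded) {
        if (!succeeded)
            return;  // handle is freed when it goes out of scope
        std::vector<std::unique_ptr<Handle>>& idle = idle_[handle->host];
        idle.push_back(std::move(handle));
    }

    // Number of idle handles currently pooled for a host.
    std::size_t idleCount(const std::string& host) const {
        auto it = idle_.find(host);
        return it == idle_.end() ? 0 : it->second.size();
    }

private:
    std::unordered_map<std::string, std::vector<std::unique_ptr<Handle>>> idle_;
};
```

Whether discarding the handle actually frees the stale DNS entry depends on libcurl internals, which is why the comment above describes this as a hope rather than a fix.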

Scott Cantor June 26, 2012 at 4:05 PM

Moved to xmltooling as an RFE, since this is not our bug.

Scott Cantor June 26, 2012 at 3:17 PM

Of course it's not that easy. Latest libcurl is not the same soname, it's now libcurl.so.4 and the version in RH5 is libcurl.so.3, so there's no way to build for the original and substitute the new one.

It's all or nothing, no choice.

I might be able to add a lifetime to the cached connections and stop using them after they expire. I don't know if that will fix it, but it would keep the connection pool from staying around forever, and maybe that will allow the stale entries to expire during periods of less use. I don't know. There's not a lot I can do here; the real fix is to ask Red Hat to backport the patch.
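The lifetime idea floated in this comment might look something like the sketch below. It is purely illustrative: `ExpiringPool`, `PooledHandle` and the integer handle ids are hypothetical, and the ticket does not say this approach was ever implemented. Each pooled entry records when it was created, and entries older than a configured maximum age are dropped instead of being reused.

```cpp
#include <chrono>
#include <cstddef>
#include <deque>

using Clock = std::chrono::steady_clock;

// Hypothetical pooled entry: an id standing in for a real handle,
// plus the time at which the handle was created.
struct PooledHandle {
    int id;
    Clock::time_point created;
};

class ExpiringPool {
public:
    explicit ExpiringPool(std::chrono::seconds maxAge) : maxAge_(maxAge) {}

    // Store an idle handle together with its creation time.
    void put(int id, Clock::time_point created) {
        idle_.push_back(PooledHandle{id, created});
    }

    // Return the id of an idle handle that is still within its lifetime,
    // or -1 if the caller must create a fresh one. Expired handles found
    // along the way are discarded rather than reused.
    int take(Clock::time_point now) {
        while (!idle_.empty()) {
            PooledHandle handle = idle_.front();
            idle_.pop_front();
            if (now - handle.created < maxAge_)
                return handle.id;
        }
        return -1;
    }

    // Number of idle handles currently held.
    std::size_t size() const { return idle_.size(); }

private:
    std::chrono::seconds maxAge_;
    std::deque<PooledHandle> idle_;
};
```

Passing the current time in as a parameter, rather than calling `Clock::now()` inside `take`, is just a design choice that keeps the sketch easy to exercise in a test.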

Fixed

Created April 18, 2012 at 5:46 PM
Updated March 5, 2013 at 3:23 PM
Resolved June 26, 2012 at 4:23 PM