Warner Losh imp at bsdimp.com
Fri Aug 9 03:15:21 EDT 2013

On Aug 7, 2013, at 3:32 PM, Rob Seaman wrote:

> The actual title is rather less dramatic: "How the Leap Second Led Facebook to Build DCIM Tools". The take-away comment from the Facebook Site Operations VP:
>
> “The number of cabinets brought down was not significant enough for it to impact our users.”

Based on my experience over the last 8 or so leap days, they don't bring down any servers, let alone multiple cabinets. The only leap year bug I can recall that got a lot of press was the Zune one, and since nobody bought a Zune, its impact was tiny :)

Yet each of the last few leap seconds has had measurable consequences that elevate it beyond any leap day snafu I've seen. Why should leap seconds cause so much more collateral damage?

One can only conclude that the leap day standard has been well understood and well implemented for the past few hundred years in most of the world, while the leap second standards and rules have been poorly implemented. Large users such as Facebook have little confidence in their vendors' ability to implement the standard well enough to obviate the need for extra monitoring. Facebook hasn't gone off and implemented a special monitoring system to ensure its systems get leap days right....


