Recent Less Wrong downtime

post by drpowell · 2011-07-18T04:50:14.546Z · LW · GW · Legacy · 3 comments

Many users may have noticed that Less Wrong experienced about 6 hours of downtime on 16 July 2011.

CAUSE: The server came under an unusual amount of load and started up a new instance to load-balance the traffic. Unfortunately, a bug in the script that starts the new instance caused that instance to run an inconsistent mix of old and new code. The symptom seen by users was that any post with comments was inaccessible.

RESPONSE: A hotfix was deployed as soon as the problem was detected. Unfortunately, it was a Saturday, so the response time was slower than we would like. We have since implemented a proper fix for the particular bug that caused this problem. We are also creating some extra monitoring probes so we'll be notified promptly of any similar problems in the future.
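
For readers curious what a probe of this kind might look like, here is a minimal sketch. It is not the actual monitoring code; the function names `fetch_status` and `failing_pages` are made up for illustration, and it assumes a probe only needs to check that a handful of pages (for example, a post that has comments) answer with HTTP 200.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if no response arrives."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def failing_pages(urls, fetch=fetch_status):
    """Return (url, status) pairs for every page that did not answer 200.

    `fetch` is injectable so the check can be exercised without a network.
    """
    results = [(url, fetch(url)) for url in urls]
    return [(url, status) for url, status in results if status != 200]
```

A scheduled job could call `failing_pages` with the URLs of a few posts known to have comments and send an alert whenever the returned list is non-empty. Which pages to check and how to deliver the alert are left open here.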

Apologies for the inconvenience.

3 comments

Comments sorted by top scores.

comment by Paul Crowley (ciphergoth) · 2011-07-18T12:32:17.522Z · LW(p) · GW(p)

Thanks for writing this up!

comment by Nisan · 2011-07-18T13:44:13.786Z · LW(p) · GW(p)

Thanks for the response!

comment by jsalvatier · 2011-07-18T14:50:16.381Z · LW(p) · GW(p)

Thanks for letting us know :)