Current status: All our ducks are in a row.

What is this site? We monitor the status of coursera.org and our class sites, and we update here whenever there are interruptions in service.

Tuesday, October 22, 2013

2013-10-22: [Resolved] Site-wide interruptions

At 20:27 PST, normal access to our site was restored. We are currently investigating and working with AWS to understand the issues. We will continue to monitor for further problems.

We treat downtime very seriously and apologize for the inconvenience. Thank you for your patience.

2013-10-22: Site-wide interruptions

We are currently experiencing issues with our servers that are causing site-wide interruptions. We are working to resolve the issue. Please be patient as we investigate the problem.

Friday, October 4, 2013

2013-10-04: [Update] Unusual site activity

At 20:54 Pacific time, we deployed remediation code to neutralize the unusual activity. Coursera has been restored to full functionality, and we will continue to monitor the situation. Thank you for your patience.

2013-10-04: Unusual site activity

At 7:58pm Pacific time, unusual activity commenced on our site. Most of the site remained functional. We have begun remediation attempts to further quarantine this activity and prevent it from impacting other users and classes. Thank you for your patience.

Sunday, August 25, 2013

2013-08-25: [Update] Period of increased latency

We have shifted traffic to avoid the AWS datacenter that is currently experiencing issues, and are serving at full capacity from Amazon's other datacenters. Site latency and error rates seem to be normal at this point.

2013-08-25: Period of increased latency

Beginning around 12:50 PDT, we observed increased site latency and error rates. Investigations are ongoing, although it looks like we might have been affected by the degraded performance of EBS (the disks backing our servers) reported by Amazon. The issue looks resolved now, but we will continue to monitor for further problems.

Saturday, May 11, 2013

2013-05-11: [Follow up] Site-wide Interruptions


Beginning around 23:22 UTC May 11, our database hosted on Amazon Web Services unexpectedly entered a "failed connection" state. As our database caches expired, we experienced progressive site failure, ultimately leading to a site-wide outage. We restored all class.coursera.org functionality at 00:00 UTC (12 May), along with www.coursera.org functionality for all users except those on Chrome. Chrome users continued to experience issues browsing courses on www.coursera.org until 00:23 UTC, due to an experiment to optimize loading speed that required additional time to fix.
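
To illustrate why a failure of this kind shows up gradually rather than all at once, here is a minimal Python sketch. It is an illustration only and not our production code: the `TTLCache` and `Database` classes, the page names, and the 10-second expiry are all hypothetical. Each cached entry keeps serving requests until it expires, so pages start failing one by one as their entries age out while the database is unreachable.

```python
class Database:
    """Hypothetical stand-in for a database that can lose connectivity."""

    def __init__(self):
        self.up = True

    def read(self, key):
        if not self.up:
            raise ConnectionError("failed connection")
        return f"row for {key}"


class TTLCache:
    """Tiny read-through cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl, load):
        self.ttl = ttl
        self.load = load      # function that reads from the database
        self.entries = {}     # key -> (value, expires_at)

    def get(self, key, now):
        hit = self.entries.get(key)
        if hit is not None and now < hit[1]:
            return hit[0]     # still fresh: the database is not needed
        value = self.load(key)  # expired or missing: must reach the database
        self.entries[key] = (value, now + self.ttl)
        return value


def simulate():
    """Return, for each point in time, the pages that could not be served."""
    db = Database()
    cache = TTLCache(ttl=10, load=db.read)
    pages = ["home", "catalog", "profile"]

    # Warm the cache at different times so entries expire at different times.
    for t, page in zip((0, 3, 6), pages):
        cache.get(page, now=t)

    db.up = False  # the database fails at t=7 (simulated clock)

    failures = {}
    for now in (8, 11, 14, 17):
        failed = []
        for page in pages:
            try:
                cache.get(page, now=now)
            except ConnectionError:
                failed.append(page)
        failures[now] = failed
    return failures


if __name__ == "__main__":
    for now, failed in simulate().items():
        print(now, failed)
```

In this sketch no page fails right after the database goes down, and every page fails once the last cached entry has expired.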

Based on early reports, it appears that the database outage was due to a bug in the way Amazon Web Services interacts with MySQL, the software powering our databases. Specifically, we believe the problem was caused by an issue with the MySQL binlog, a log that ensures that updates to the database are not lost. Engineers at Coursera and Amazon are continuing their investigation to determine the root cause of the outage.

Any downtime is unacceptable to us; we will be working closely with Amazon to prevent a similar issue from occurring again. We sincerely apologize for the inconvenience caused to the Coursera community.

2013-05-11: [Resolved] Site-wide Interruptions

Normal access to our site has been restored, as of 5:25 pm PDT. Thank you for your patience.

2013-05-11: Site-wide Interruptions

We are currently experiencing issues with our servers that are causing site-wide interruptions. We are working with Amazon to resolve the issue. Please be patient as we investigate the problem.

Monday, April 29, 2013

2013-04-29: [Resolved] Latency Increase

We have resolved the latency issues that affected the home page. Although logging into a class intermittently failed, the logged-in in-class experience was not affected. We thank everyone for their patience and understanding.

2013-04-29: Latency Increase

We are working with Amazon Web Services to resolve an issue with the database that powers the home page. The in-class experience is not affected. We will update this page when we have further details. We apologize for the inconvenience.

Friday, March 29, 2013

2013-03-29: Preventative Maintenance Concluded

The preventative maintenance has concluded without any service interruption. Thank you for your understanding as we work to make Coursera as reliable and speedy as possible.

2013-03-29 - Preventative Maintenance

We are performing preventative maintenance on our homepage (www.coursera.org). Users may encounter slowness during the course browsing experience. The in-class experience will not be affected. Check back here (www.coursera.org) for updates. Thank you for your patience as we make Coursera as reliable and fast as possible.

Thursday, March 28, 2013

2013-03-28 - Resolved: Minor service interruptions

Normal access to the home page has been restored. Access was restored around 11:15am PDT. We apologize for the inconvenience.

2013-03-28 - Update: Minor Service Interruptions

Access to the home page (www.coursera.org) remains intermittent. A team of more than 5 engineers from Coursera and a team of more than 3 from Amazon Web Services have been working on the issue since 5:45am PDT. Working with the team from Coursera, Amazon engineers have traced the underlying issues to their own internal management processes. We continue to work closely with the team from Amazon to resolve the underlying issues. In parallel, we are rewriting portions of the home page to function without the underlying database so students and instructors can continue to navigate to their classes. The in-class experience is functioning normally with no service interruptions or downtime. We sincerely apologize for the inconvenience.

2013-03-28 Minor service interruptions

We are experiencing issues with the database that powers www.coursera.org, which is causing intermittent outages and latency issues. We will continue working to bring Coursera back into a stable state. Students and staff should still be able to access class.coursera.org.

Monday, March 18, 2013

2013-03-18 Service interruptions resolved

We have resolved the connectivity issues on the Coursera website. Assignment submissions should be good to go! A small number of students were affected, and we will be reaching out to them to resolve their problems. We apologize for any inconvenience caused to students and instructors, and we thank you for your patience.

2013-03-18 Minor Service Interruption

Earlier today, users were experiencing intermittent connectivity issues. This affected some users who were trying to submit quizzes and peer assessments on our website. We are currently monitoring the situation, and will post a further update once the problem is resolved.

Friday, March 8, 2013

2013-03-08: Amazon Network Issues

Amazon's Elastic Compute Cloud is currently experiencing network connectivity issues. Fortunately, Coursera is still up and running, although some students may experience increased latency. Although we operate in 3 availability zones (data centers) within the Northern Virginia region, the network issues appear to be affecting the whole region. For further information, please check Amazon's status page at: http://status.aws.amazon.com/.

Update: As of 10:51 PM PST, Amazon's network connectivity issues have been resolved.  All Coursera systems are operating normally at this time.

Friday, February 22, 2013

2013-02-22: Site Outage Conclusion

Earlier today, we experienced site-wide intermittent access and high latency issues. We've now resolved these issues, and have determined the underlying causes.

The launch of the new course catalog, with its interactive search features, unfortunately came with a glitch that caused an overload on the database that serves www.coursera.org. In an attempt to address the performance issues this morning, we transitioned from our primary database to our secondary database on Amazon Web Services. However, the Amazon hardware that the secondary database was located on appears to have been malfunctioning.

After exhausting other options, we moved away from this secondary machine to another one. Immediately after, all our performance metrics recovered. We have been gradually restoring disabled home page features. Throughout most of this, our in-course experience continued to function normally, with only brief minor interruptions.

We sincerely apologize for the inconvenience this has caused. We take reliability very seriously and will be reviewing the timeline and causes of today's events very closely.

2013-02-22: Site Outage Update

We are experiencing issues with the database that powers www.coursera.org, which is causing intermittent outages and latency issues. We will continue working to bring Coursera back into a stable state. Students and staff should still be able to access class.coursera.org.

2013-02-22: Site Outage Update

Update: We now have www.coursera.org back up (as of around 11AM PST), but we have disabled the course explorer interface and are actively investigating high latency issues.

2013-02-22: Site Outage Update

We were able to restore service to class.coursera.org (the class platform) at around 9 AM PST. We are still experiencing an outage / high latency issues on www.coursera.org (the course explorer and account dashboard) and are working to bring those back up.

2013-02-22: Site Outage

We are currently experiencing an outage on www.coursera.org and class.coursera.org. We are investigating the issue and hope to have it resolved soon.

Friday, January 18, 2013

2013-01-18 - 2013-01-21 - Planned Maintenance

Over this weekend, Coursera will be performing routine maintenance on archived, inactive and upcoming classes. This will involve turning off access to individual course sites (including new enrollments and access to the class site itself) for up to 15 minutes.

Classes currently ongoing will not be affected by this maintenance. We are sorry for any inconvenience caused.

Monday, January 14, 2013

2013-01-14: Issues Resolved

The issues mentioned before only affected a couple of classes. The vast majority of classes were not affected. Everything is operating smoothly now. We apologize for the inconvenience to those students and instructors.

2013-01-14: Site Outage

We are investigating issues. We apologize for the inconvenience.

Saturday, January 12, 2013

2013-01-12 - Shibboleth Access Problems

Due to a configuration problem, university-based users may not be able to authenticate to our systems via Shibboleth authentication systems (e.g. Stanford Weblogin, Rice NetID). We are working closely with InCommon to address this issue and will post updates as we make progress.

We are sorry for any inconvenience caused.

[Update 2:46pm - Issue fixed. Huge thanks to the team at InCommon!]

Saturday, January 5, 2013

2013-01-05 - SSL Certificate Changes

Yesterday at 6:30pm, we revoked our old SSL certificate (as it was due to expire), and issued a new one. We quickly updated the SSL certificate used on our two main domains (www.coursera.org and class.coursera.org). No downtime occurred. Unfortunately, a few auxiliary services were not upgraded to the new SSL certificate. We upgraded these auxiliary services by noon (PST) on Saturday (today). We apologize for the inconvenience.

Wednesday, January 2, 2013

2013-01-02: Planned Database Upgrades

We will be performing preventative maintenance on our databases tonight beginning around 11pm PST. Classes will become unavailable in a rolling window. We expect the maintenance to take around 2 hours. We apologize for the inconvenience. We will post further details here if necessary. Thanks!

Edit: All completed last night. The upgrade was completed within minutes with no disruptions to our services. Thanks for your patience!