Site slowdown - backlog of workers
Incident Report for 17hats
Resolved
All systems are caught up. We have identified the root cause and added measures to ensure this does not happen again.

We have also fixed the monitors that should have alerted us sooner but didn't.

Thank you all for your patience today as we figured out what caused this!
Posted Feb 11, 2019 - 17:06 PST
Update
The site itself is functioning properly and has been since about 9am PT this morning. Some background workers are still catching up.

We are still investigating the root cause.
Posted Feb 11, 2019 - 14:55 PST
Update
We are continuing to investigate this issue.
Posted Feb 11, 2019 - 08:55 PST
Investigating
Over the weekend, our background worker machines became overloaded and began delaying certain tasks, such as calendar sync. For reasons we are still investigating, no alerts fired until this morning.

When we unclogged the worker machines, however, we received so many requests from third-party integrations that they clogged all of our app servers, which slowed down the site. We paused the workers and are bringing them back one by one so that the site stays available as normal. The workers should be fully back to normal within the next few hours.

These things are never fun, especially on a Monday morning! We will keep you posted with updates.
Posted Feb 11, 2019 - 08:55 PST
This incident affected: Web App.