Service Degradation in Roadmunk
Incident Report for Roadmunk
Postmortem
  1. What happened - The issue appears to have been a memory leak in the sync servers on our European deployment.
  2. Why it happened - We do not deploy over the weekend, so the server did not restart as it usually does. During this period, memory usage on the machine continued to rise until it ran out of available memory, which caused the incident.
  3. What was done to fix it - We identified the affected host and rebooted it to restore its availability.
  4. How will this be prevented in the future -
  • Began an investigation into the memory usage of the sync servers
  • Increased the memory and CPU on proxy servers
  • Added another proxy server for our European deployment
  • Updated our platform plans to include better alerting on this condition so that we can enable self-healing in future releases
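
As an illustration of the alerting and self-healing item above, here is a minimal sketch of how a memory check might decide between doing nothing, raising an alert, and restarting a host. It is not Roadmunk's actual tooling: the names `MemoryPolicy`, `decide_action`, and `is_steadily_rising`, and the 80% and 95% thresholds, are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class MemoryPolicy:
    """Hypothetical thresholds, expressed as a fraction of total memory."""
    alert_fraction: float = 0.80    # assumed value: notify the on-call team
    restart_fraction: float = 0.95  # assumed value: restart before memory runs out


def decide_action(used_bytes: int, total_bytes: int,
                  policy: MemoryPolicy = MemoryPolicy()) -> str:
    """Return "ok", "alert", or "restart" for one memory reading."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    used_fraction = used_bytes / total_bytes
    if used_fraction >= policy.restart_fraction:
        return "restart"
    if used_fraction >= policy.alert_fraction:
        return "alert"
    return "ok"


def is_steadily_rising(samples: Sequence[int], min_samples: int = 3) -> bool:
    """Return True if every reading is higher than the one before it.

    A run of strictly increasing readings is only a rough hint of a leak;
    it is not proof of one.
    """
    if len(samples) < min_samples:
        return False
    return all(later > earlier for earlier, later in zip(samples, samples[1:]))
```

In a setup like this, a scheduler would call `decide_action` with periodic readings from each host, so that a weekend without deployments would trigger an alert or a restart before memory is exhausted.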
Posted Sep 23, 2021 - 18:36 UTC

Resolved
This incident has been resolved.
Posted Sep 20, 2021 - 13:26 UTC
Update
We are continuing to investigate this issue.
Posted Sep 20, 2021 - 13:01 UTC
Investigating
A number of our EU customers are experiencing issues accessing Roadmunk. Our team is aware and currently looking into this issue.
Posted Sep 20, 2021 - 12:07 UTC
This incident affected: Roadmapping.