On-call engineers at the online shipping marketplace use Wavefront’s alerting and anomaly detection to monitor new code deployments.

CASE STUDY: uShip

uShip, Inc. operates the uShip website, a global online marketplace for shipping services that is present in 19 countries. Individuals and businesses use it to post items they need shipped, with listings that include auto transport, moving services, and the transport of heavy industrial equipment.

The Challenge

uShip runs an online shipping platform that offers shipping services worldwide. They manage that platform using a DevOps approach that can see as many as 20 code releases pushed to production in a single day. Such an operation needs the agility to monitor new releases for unintended effects so that code can be fixed and re-released before it becomes a significant issue for customers.

Initially, developers were on call for their own code, at least during off hours. This worked when the company was small, but as it grew, support became centralized, focusing the on-call duties on a smaller group of support engineers, with developers back working on new code. This created two issues: burn-out for the on-call group, and slower issue resolution, since cases weren’t being handled by whoever wrote the code. uShip needed to spread the on-call load.

That load would be further exacerbated by false alerts. uShip needed a way to reduce the shared burden of support by creating better, more meaningful alerts that were not only legitimate, but were also specific enough to suggest a solution, speeding resolution.

Finally, the monitoring solutions that uShip used allowed them to look for anticipated problems, but they didn’t have the scope necessary to flag unexpected problems. This created the possibility that issues could arise without them being noticed by the monitors, putting the customer experience at risk.

The Solution

uShip addressed these challenges by bringing on two tools: Wavefront, a SaaS-based analytics platform for application performance monitoring and alerts, and VictorOps, for scheduling and managing on-call support duties. It was VictorOps that would issue the pager alerts to whoever was on call at the time. This helped spread the load, but since VictorOps was not the origin of the alerts, it didn’t address the quality-of-alert or monitoring challenges. That’s where Wavefront was critical.

Wavefront provides instant insights for engineering teams from a variety of data sources and across the full stack, with a rich query language that allows uShip to build metrics and alerts that bridge data and team silos while monitoring broadly for the unexpected. VictorOps is one of the many systems with which Wavefront integrates. Wavefront ensures that the alerts sent to on-call engineers are legitimate, meaningful, and helpful.

The Results

Within minutes of installing Wavefront, uShip’s engineers gained clear visibility into the causes behind some CPU spikes. As they continued deployment, they found that they could quickly detect, and therefore repair, code issues. This saved significant time for developers trying to fix their code. By monitoring all phases of the development pipeline, developers were often able to detect issues before they went to production.

Meanwhile, the Wavefront/VictorOps combo was able to deliver high-quality, useful alerts to the engineers on call. The broader spread of on-call duty resulted in a number of benefits:

  • Better morale for the developers on call
  • Better code ownership by developers
  • Better developer empathy for the potential issues facing other developers
  • Better developer understanding of the overall system and how it interacts with code
  • Overall better code and better developer productivity
  • A better, more competitive customer experience

“We’ve tried to identify and solve some code problems with Graphite and Grafana over the years, but we’ve never been able to crunch and analyze that data in a way that I can now do with Wavefront.”

Raleigh Schickel, Sr. Manager of DevOps, uShip