
Inside How Zipline Tests and Improves Safety Systems

Early this year, Zipline launched home deliveries with our new P2 aircraft, after completing one of the largest ever aviation test campaigns. We conducted tens of millions of virtual flights and more than 150,000 deliveries at our test facilities across America – that’s 15 times more flights than the U.S. military’s F-35 fighter jet had before it entered service.¹ 

Our service has launched, but every day we’re still working just as hard to improve our system with new features, better performance, and even more testing. We’ve now cycle-tested some of our components more than 4 million times in extreme durability tests, and our team now routinely completes more than 9,000 test flights a week. This massive effort lets us make sure that every update we make improves the safety, reliability, and performance of our system. It’s an exhaustive process, but we think it’s essential that we continuously improve everything we do, with safety above all else.

Now that we get to watch the joy on people’s faces as they get their delivery, I’ve been reflecting on all of the hard work that’s gone into making Zipline the world’s largest autonomous delivery service. Today, I’m peeling back the curtain and sharing one example of how we respond to and learn from flight events to make our system better. 

Flights flown before entering service, by aircraft

March 24, 2025 at a Zipline Test Site:

5:50pm - The Incident

Zip 290 was returning from a successful delivery when our team deliberately cut power to one of its motors mid-flight. This is a common stress test we run to see how the aircraft will respond to a rare event. Zip 290 was testing a new version of software which had already passed tens of thousands of virtually simulated flights, and flown hundreds of real-world test flights. But in this real-life scenario, the system did not perform as expected.

The Zip, which conducts more than 500 safety checks each second, immediately attempted to stabilize the flight. Normally our Zips compensate for a motor-out and return to stable flight within moments, but this time its attempts weren’t working. Instead of a straightforward return to normal flying, the remaining motors had to push to their absolute limits to continue the flight. 

While the Zip made it back to its docking area, it wasn’t able to complete the final maneuver, where the aircraft thrusts up into the dock to lock itself in place. Zip 290’s safety system ran through its logic and determined the best option was to land by parachute instead. The Zip cut its motors, the parachute deployed, and Zip 290 landed on the safety barrier below the dock, which is designed to harmlessly catch a Zip in unexpected circumstances like this. All parts of our safety system worked as designed.

Three things then occurred simultaneously: 

  1. Zipline’s remote pilot, who was monitoring the flight, alerted the ground team to spring into action. 

  2. The Zip’s computer began sending through important information that could help the team diagnose the issue. 

  3. Zipline’s fleet-level software flagged the issue and automatically kicked off an incident response protocol.

Other flights at the test facility were grounded while the team retrieved the aircraft to start identifying the root cause of the issue. Within minutes, Zipline’s on-call engineering team was brought into the fold, equipped with data, photos, and everything they needed to begin triaging.

6:15pm-1:45am - The Investigation

As expected, the logs showed a normal flight until the fault injection. After that point, the flight looked abnormal compared to other motor-out test flights. The on-call team spent the next hour digging deeper into the data, and isolated the problem to the controls system. They pulled together a team of specialized engineers to begin working on a fix.

Aircraft stay in stable flight using coordinated motors and actuators that spin propellers and move “control surfaces.” Pilots do this manually, while Zips do so autonomously. Zipline’s autonomy software makes positioning adjustments roughly every 20 milliseconds – 5 times faster than the blink of an eye – allowing flights smooth enough to deliver drinks without spilling.

In Zip 290’s case, those microadjustments seemed to make the flight worse. The team’s analysis of the flight found the Zip had experienced undamped oscillation: with the way this specific Zip’s hardware and the new software vibrated in flight, the software’s microadjustments were timed poorly, and created a feedback loop that worsened stability. Now that they had identified the cause, the team knew what to fix.
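To make the idea of undamped oscillation concrete, here is a minimal sketch of how a growing wobble might be flagged in logged data. It is an illustration only, not Zipline’s analysis tooling: the function name `oscillation_growth`, the synthetic signals, and the simple first-half versus second-half comparison are all assumptions made for this example.

```python
import math


def oscillation_growth(samples):
    """Ratio of the peak |value| in the second half of a trace to the first half.

    A ratio well above 1 suggests the oscillation is growing (undamped);
    a ratio well below 1 suggests it is dying out.
    Assumes the first half of the trace is not all zeros.
    """
    mid = len(samples) // 2
    first_peak = max(abs(s) for s in samples[:mid])
    second_peak = max(abs(s) for s in samples[mid:])
    return second_peak / first_peak


# Synthetic stand-ins for a logged attitude error (made up, not real flight data).
damped = [math.exp(-0.05 * t) * math.sin(0.5 * t) for t in range(200)]
growing = [math.exp(0.05 * t) * math.sin(0.5 * t) for t in range(200)]

print("damped ratio:", oscillation_growth(damped))    # below 1: wobble fades
print("growing ratio:", oscillation_growth(growing))  # above 1: wobble builds
```

A real detector would likely work on live sensor streams and look at specific frequencies, but the core question is the same: is each swing bigger than the last?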

The next day: The Fix

Oscillation challenges are incredibly tricky to solve on an aircraft as advanced as a Zip, and our process is far more rigorous than just “fixing it.” The team had to: 

  1. Reproduce the issue in simulation;

  2. Fix the immediate oscillation issue that led to flight instability, and confirm the fix solved the issue as reproduced in simulation;

  3. Strengthen the flight computer to detect and handle similar issues;

  4. Validate that the fix didn’t degrade overall performance;

  5. Incorporate this edge case scenario into our automated testing to prevent regressions in future software versions. 
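The last step, turning an incident into a permanent automated test, can be sketched as a small regression suite. Everything here is hypothetical: the `Scenario` fields, the scenario names, the thresholds, and the two stand-in "builds" with made-up error values are assumptions chosen to show the shape of the idea, not Zipline’s actual test harness.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Scenario:
    """One replayable test case, such as a motor-out injected mid-flight."""
    name: str
    fault: str             # label for the injected fault (hypothetical)
    max_peak_error: float  # pass threshold on simulated peak error (made up)


# A candidate build maps a scenario to the peak error seen in simulation.
Candidate = Callable[[Scenario], float]


def failing_scenarios(candidate: Candidate, suite: List[Scenario]) -> List[str]:
    """Return the names of scenarios where the candidate exceeds its threshold."""
    return [s.name for s in suite if candidate(s) > s.max_peak_error]


suite = [
    Scenario("nominal_return", fault="none", max_peak_error=0.5),
    # Added after the incident so future builds are always checked against it.
    Scenario("motor_out_on_return", fault="motor_out", max_peak_error=1.0),
]


def old_build(s: Scenario) -> float:
    # Stand-in results: fine in nominal flight, unstable after a motor-out.
    return {"none": 0.1, "motor_out": 4.0}[s.fault]


def fixed_build(s: Scenario) -> float:
    # Stand-in results: stays within bounds in both scenarios.
    return {"none": 0.1, "motor_out": 0.6}[s.fault]


print(failing_scenarios(old_build, suite))    # the motor-out scenario is flagged
print(failing_scenarios(fixed_build, suite))  # nothing is flagged
```

Once a scenario is in the suite, any later software version that reintroduces the problem fails the check before it reaches an aircraft.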

The controls engineers knew the core of the solution would include strategically changing the speed of the microadjustments in order to break the feedback loop. Using a simplified model, they replicated the problem and evaluated several approaches. The most promising ones were advanced to the next stage, where thousands of virtual flights were run, including an exact replica of Zip 290’s flight. Within hours they’d aligned on a fix: by triggering select control movements slightly slower, every 200 milliseconds, they could eliminate the feedback loop and fix the resonance issue, leading to a stabilized flight. Counterintuitively, fewer adjustments were the key to stability.
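Why can fewer adjustments help? One way to build intuition is a toy feedback loop in which the controller acts on a slightly stale reading. This is a deliberately simplified sketch, not Zipline’s flight controller: the `run_loop` function, the gain of 1.2, the one-tick latency, and the abstract "ticks" (which do not correspond to the millisecond figures above) are all assumptions for illustration.

```python
def run_loop(gain, latency_ticks, period_ticks, ticks=60):
    """Toy loop: every `period_ticks`, subtract gain * (error measured `latency_ticks` ago)."""
    x = [1.0] * (latency_ticks + 1)  # error sits at 1.0 before control starts
    for t in range(ticks):
        current = x[-1]
        if t % period_ticks == 0:
            measured = x[-1 - latency_ticks]  # stale reading of the error
            current -= gain * measured        # correction based on that reading
        x.append(current)
    return x


def peak_tail(trace, n=10):
    """Largest |error| over the last n samples of a trace."""
    return max(abs(v) for v in trace[-n:])


# Same gain and same sensor latency; only the update period differs.
fast = run_loop(gain=1.2, latency_ticks=1, period_ticks=1)
slow = run_loop(gain=1.2, latency_ticks=1, period_ticks=2)

print("correct every tick:      ", peak_tail(fast))  # error keeps growing
print("correct every other tick:", peak_tail(slow))  # error shrinks toward zero
```

In the fast case, each correction is issued before the previous one shows up in the reading, so the loop keeps overcorrecting and the swings grow. In the slow case, every correction is based on a reading that already reflects the last one, and the error dies out.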

Visualization of flight stability before and after fixing (brighter colors indicate increased instability)

By 11am, just 17 hours after the event, the improved code was submitted for further testing and validation. Parachute landings are rare, but when they happen, finding the root cause and thoroughly fixing the issue as soon as possible is critical.

March 25-27: Validating the fixes at scale

Now that the team was aligned on a fix, they had to verify and validate that the updated software met all of Zipline’s robust safety criteria before it was deployed commercially. First, the updated software was tested through tens of thousands of additional digitally simulated flights, then on actual Zip hardware, running a virtual environment on a benchtop. Using these tools, Zipline’s engineers can simulate flights 10,000 times faster than real-world testing. Within hours, the updated code had passed tens of thousands of simulations and was ready for real-world testing.

At 2:47pm the day after the incident, our test facilities team began flying real-world flights, starting at short distances, then progressively longer, reviewing after each one. At 6:18pm, just 24 hours after the initial incident, the team used the updated software to re-fly the same test flight that Zip 290 had flown the day before. The flight remained stable even after the injected motor-out issue. The next day, dozens of aircraft were flying at our test facilities with the new software. By the afternoon of March 27th the changes had been validated on more than 1,000 real-world test flights. The team determined that the fixes were a success and added them into our commercial codebase. The issue has not reappeared.

The Conclusion

Zipline’s testing goes far beyond the scale and velocity of traditional aviation test campaigns. In just a few days our team was able to identify an issue, fix it, and then validate the fix across more than a thousand real-world flights. It’s a benefit that comes with having designed every element of our system from the ground up for safety and reliability, from having incredible teams that work together, and from the sheer scale of our comprehensive test campaign.

Want to solve problems like these? View our open roles at flyzipline.com/careers