Are Self-Driving Cars Really Safer Than Human Drivers?

Self-driving cars

Photo: iStockPhoto / NanoStockk

Self-driving vehicles are one of the most exciting and impactful applications of AI.

More than 35,000 people die every year in motor vehicle crashes in the US alone. Since self-driving vehicles can theoretically react faster than human drivers and don’t drive drunk, text while driving, or get tired, they should be able to dramatically improve vehicle safety. They also promise to increase the independence and mobility of seniors and others who cannot easily drive.

More than $250 billion has been invested in self-driving vehicles over the last three years. A sampling of companies, grouped by vehicle type, is shown below:

DELIVERY ROBOTS: Gofurther.ai, Idriverplus, Kiwibot, Neolix, Nuro, Refraction.ai, Scout (Amazon), Serve Robotics, Starship Technologies, Unity Drive, Yours Technology

SHUTTLES AND BUSES: Auto (Ridecell), Baidu, Beep, Coast Autonomous, e.Go, EasyMile, Local Motors, Milla Pod, May Mobility, Navya (Softbank), Ohmio, Optimus Ride, Sensible4, TransDev, Venti Technologies, Voyage (Cruise), Yutong

TAXIS: Argo, Aurora, AutoX, Baidu Apollo, Cruise, Didi Chuxing, Dongfeng, Hyundai, Motional, Pony.ai, Waymo, WeRide, Zoox (Amazon)

TRUCKS: Aurora, Daimler, Einride, Embark, Gatik, Ike (Nuro), Inceptio, Kodiak Robotics, Locomation, Plus.ai, Pony.ai, Pronto.ai, Tesla, TuSimple, Udelv, UPS, Waymo, Zipline

CONSUMER VEHICLES: Apple, Aurora, Baidu, BMW, Cruise, Daimler, Ford, Honda, Huawei, Hyundai, Kia, Mazda, Nissan, Peugeot, SAIC, Subaru, Tata Elxsi, Tesla, Tencent, Toyota, Volkswagen, Volvo

But will self-driving vehicles actually be safer? The biggest issue for the automotive industry is handling the unexpected situations known as edge cases. In fact, two new automotive safety standards, ISO 21448 and UL 4600, attempt to address these edge cases. However, these standards are not prescriptive, and regulatory agencies are not requiring compliance with these or any other standard for autonomous vehicles. Worse, as I'll explain below, there are good reasons to believe that some types of autonomous vehicles may not be capable of handling these edge cases.

Levels of Automation

First, let’s define what we mean by “self-driving”. The Society of Automotive Engineers has defined six levels of driving automation as illustrated below:

[Table: the six SAE levels of driving automation. Table created by author.]

Levels 3-5 are considered Automated Driving Systems (ADSs), in which the driver does not need to pay attention to the road. At Level 3, the driver can read a book or watch a movie but must be able to take over control within 10-60 seconds if asked to do so by the vehicle. One big issue for Level 3 vehicles is that a crash might occur during the seconds the driver spends taking over, so Level 3 operation will probably need to be restricted to conditions where a slow handover is reasonably safe (e.g., low-speed highway traffic jams).

The difference between Levels 4 and 5 is that Level 4 vehicles are restricted to an Operational Design Domain (ODD), which usually includes restricted geography (e.g. a limited set of streets in a city) and may include other restrictions based on weather, time of day, precipitation, road grades and curvature, and other factors. Level 5 vehicles can drive anywhere with no restrictions and are theoretically effective replacements for consumer vehicles and commercial trucks.
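To make the idea of an ODD more concrete, here is a minimal sketch of how an ODD might be represented and checked in software. The field names, thresholds, and street names are hypothetical illustrations, not taken from any standard or manufacturer.

```python
from dataclasses import dataclass

@dataclass
class ODD:
    """Hypothetical Operational Design Domain specification (illustrative only)."""
    allowed_streets: set[str]    # restricted geography
    max_speed_mph: float         # speed restriction
    daylight_only: bool          # time-of-day restriction
    max_precip_mm_per_hr: float  # weather / precipitation restriction
    max_road_grade_pct: float    # road grade restriction

    def permits(self, street: str, speed_mph: float, is_daylight: bool,
                precip_mm_per_hr: float, road_grade_pct: float) -> bool:
        """Return True only if the current conditions fall inside the ODD."""
        return (street in self.allowed_streets
                and speed_mph <= self.max_speed_mph
                and (is_daylight or not self.daylight_only)
                and precip_mm_per_hr <= self.max_precip_mm_per_hr
                and abs(road_grade_pct) <= self.max_road_grade_pct)

# Example: a low-speed campus shuttle ODD versus an open-road trip.
shuttle_odd = ODD(allowed_streets={"Campus Loop Rd"}, max_speed_mph=15.0,
                  daylight_only=True, max_precip_mm_per_hr=1.0,
                  max_road_grade_pct=6.0)
print(shuttle_odd.permits("Campus Loop Rd", 10.0, True, 0.0, 2.0))  # True
print(shuttle_odd.permits("I-95", 65.0, False, 5.0, 2.0))           # False
```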

Many of the consumer vehicles on the road today, such as Teslas, have driver assistance capabilities. They can keep the vehicle centered in its lane, and they can accelerate and brake automatically. However, these are Level 2 vehicles, not ADSs: it is not safe for the driver to read a book or watch a movie, and the driver must constantly monitor the road and be ready to take over control instantly. For example, I was driving my Tesla last week on a New York highway in Autopilot mode when I hit a big bump. The car swerved and went ding-ding-ding, which means "Steve, you are on your own," and I had to react quickly to steer it back into my lane. An issue for these Level 2 vehicles is that marketing terms like "full self-driving" may be convincing drivers to act dangerously, as in the April 2021 Tesla crash with no one in the driver's seat.

The Big Problem: Edge Cases

People use commonsense reasoning to handle unexpected phenomena while driving: A deer darts onto the highway. A flood makes the road difficult or impossible to navigate. Cars are fishtailing, trying to get up an icy hill.

People do not learn about all these possible edge cases in driving school. Instead, we use our everyday commonsense reasoning skills to predict actions and outcomes. If we see a ball roll onto the street, we know to look out for children chasing the ball. We change our driving behavior when we see the car in front of us swerving, knowing that the driver might be intoxicated or texting.

Unfortunately, no one knows how to build commonsense reasoning into cars, or into computers in general. In lieu of commonsense reasoning, ADS developers must anticipate and code for every possible situation. Machine learning helps only to the extent that manufacturers can anticipate each situation and provide training examples for it.
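To see why this limitation matters, here is a toy sketch (using scikit-learn, with entirely made-up data) of how training examples only cover anticipated situations: a classifier trained to distinguish two known object types will still confidently force an unfamiliar object into one of those types, because it has no notion of "something I was never trained on."

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy training data: two "anticipated" object classes described by two
# made-up features (say, height and length in meters).
cars = rng.normal(loc=[1.5, 4.5], scale=0.2, size=(100, 2))   # class 0
bikes = rng.normal(loc=[1.1, 1.8], scale=0.2, size=(100, 2))  # class 1
X = np.vstack([cars, bikes])
y = np.array([0] * 100 + [1] * 100)

clf = LogisticRegression().fit(X, y)

# An out-of-distribution object (say, an overturned trailer) that resembles
# neither training class. The model still forces it into one of the two
# known classes, often with high confidence -- it cannot say "unknown."
weird_object = np.array([[3.0, 10.0]])
print(clf.predict(weird_object), clf.predict_proba(weird_object))
```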

Worse, there are millions, maybe billions, of these edge cases. Everyone has at least one unusual driving story, and there are 1.4 billion drivers in the world. If there are on the order of 1.4 billion edge cases, how can they possibly all be identified, much less coded? And if ADSs cannot perform commonsense reasoning to handle all these edge cases, are they really safer than human drivers?

The Importance of Commonsense Reasoning Is ODD-Dependent

Level 4 vehicles are restricted to a specific ODD. This will usually dramatically reduce the number of edge cases compared to Level 5 vehicles that have no ODD. For example, we are already seeing Level 4 point-to-point shuttles on corporate campuses that drive very slowly. These vehicles are unlikely to encounter many edge cases because there are not many unexpected things that can happen on a single road between two locations. And if something were to happen, the shuttles travel so slowly that there is little risk for passengers or pedestrians.

Level 4 self-driving taxis whose ODD is limited to specific city streets will encounter more edge cases than corporate shuttles, but probably nowhere near the number that consumer vehicles might encounter. Limiting the driving domain to specific streets also makes it possible to maintain detailed maps (e.g., of traffic lights and construction zones). In contrast, a Level 5 vehicle must be able to drive on every street in the world, or at least in the consumer's country. The Washington Post counted over one million roads in the US alone.

This is why so many ADS developers are testing self-driving taxis in cities like Phoenix and San Francisco. Most tests are being carried out with safety drivers who are ready to take over instantly in a dangerous situation. Only a small number of fully driverless tests are being carried out, and those use restrictive ODDs.

Self-Driving Vehicles Don’t “See” Like People

Another issue for ADSs is that computer vision systems are prone to mistakes: they can be fooled in ways that people are not. For example, researchers showed that a minor alteration to a speed limit sign could cause a machine learning system to read the sign as 85 mph instead of 35 mph. Similarly, hackers tricked Tesla's Autopilot into changing lanes by using brightly colored stickers to create a fake lane. In both cases, the changes fooled cars but did not fool people, and these are only a few of the ways a bad actor could confuse cars or trucks into driving off the road or into obstacles.
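Those demonstrations were physical attacks on real vehicles' cameras. As a rough digital analogue, the sketch below uses the fast gradient sign method (FGSM), a standard way to show that a small, human-imperceptible perturbation can change an image classifier's output. It uses a generic pretrained PyTorch model and a random stand-in image; it is not the technique used in either of the studies mentioned above.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# A generic pretrained classifier stands in for a perception network.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

def fgsm_perturb(image: torch.Tensor, true_label: int, epsilon: float = 0.01) -> torch.Tensor:
    """Fast Gradient Sign Method: nudge each pixel by +/- epsilon in the
    direction that increases the classification loss."""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), torch.tensor([true_label]))
    loss.backward()
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()

# A random stand-in image (a real attack would start from a photo of a sign).
x = torch.rand(1, 3, 224, 224)
original_pred = model(x).argmax(dim=1).item()
adversarial_pred = model(fgsm_perturb(x, original_pred)).argmax(dim=1).item()
print(original_pred, adversarial_pred)  # often differ despite the tiny perturbation
```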

The differences in how self-driving cars perceive the world lead to concerns that go far beyond hackers. In real-world driving, many Tesla owners have reported that their cars treat shadows, such as those cast by tree branches, as real objects. In the case of the Uber test car that killed a pedestrian, the car's object recognition software first classified the pedestrian as an unknown object, then as a vehicle, and finally as a bicycle. I don't know about you, but I worry about being on the road, as a pedestrian or a driver, if vehicles cannot recognize pedestrians and other objects with 100 percent accuracy!

Why Testing is Necessary

We have some good reasons to believe that ADSs will be safer than human drivers, and we have some good reasons to worry that ADSs will not be as safe as human drivers.

From a regulatory perspective, it would be silly to just look at the good reasons and give ADS developers carte blanche to put ADSs on the road without proving they are safer than human drivers. But that is exactly what regulatory authorities are doing!

The position of the National Highway Traffic Safety Administration (NHTSA) is that

The safety benefits of automated vehicles are paramount. Automated vehicles’ potential to save lives and reduce injuries is rooted in one critical and tragic fact: 94% of serious crashes are due to human error. Automated vehicles have the potential to remove human error from the crash equation, which will help protect drivers and passengers, as well as bicyclists and pedestrians. When you consider more than 35,000 people die in motor vehicle-related crashes in the United States each year, you begin to grasp the lifesaving benefits of driver assistance technologies.

The NHTSA published a proposed framework for ADS safety in December 2020 that was criticized by the National Transportation Safety Board (NTSB) for “… the lack of a requirement for mandatory submission of the safety self-assessment reports and the absence of a process for NHTSA to evaluate their adequacy.”

Image snipped from public document

The NTSB also noted that individual states are creating their own regulations in the absence of federal regulations. Arizona, for example, has minimal restrictions, which the NTSB said were at least partially responsible for the 2018 pedestrian fatality involving an Uber ADS test vehicle. Florida statute 316.85 specifically allows the operation of autonomous vehicles and explicitly states that a driver does not need to pay attention to the road in a self-driving vehicle (e.g., the driver can watch movies). It also explicitly permits autonomous vehicle operation without a driver even present in the vehicle. And there are no requirements for manufacturers to pass safety tests beyond those that were in place before self-driving capabilities existed. Whenever a car, truck, bus, or taxi company decides it is ready, it is free to test and sell driverless vehicles. I own a home in Florida, and this terrifies me. Many other states are also encouraging the rollout of self-driving vehicles without safety standards.

Photo: iStockPhoto | gorodenkoff

Safety Standards for ADSs

There are three very different types of safety testing needed for autonomous vehicles. The first is to ensure that all the components that feed information into ADS decision-making are working correctly. In Taiwan in 2020, a Tesla Model 3 in Autopilot mode crashed into an overturned tractor-trailer while traveling at 70 mph. The crash was reportedly caused by a software failure in the car's forward-facing sensor array, which prevented the automatic braking from working properly. Adequate sensor testing should prevent this type of failure.

Waymo (Google) reports that it tests each camera before putting it in the car, then tests the car once it’s integrated, and finally, it tests various abilities associated with the camera such as the ability to detect pedestrians.

The ISO 26262 standard has been widely adopted by the automotive industry for testing software bugs and hardware failures. It ensures that sensors and other components are working as designed.

The second type of safety testing is to prove that the vehicle can handle the types of real-world scenarios it is expected to encounter. In 2016, the NHTSA outlined 36 scenarios that should be tested but noted that the list was not complete. Example scenarios included:

  • Detect and Respond to Speed Limit Changes and Speed Advisories
  • Perform High-Speed Merge (e.g., Freeway)
  • Perform Low-Speed Merge

Waymo tests these and other scenarios in both a driving simulator and on a closed course in a 113-acre facility in California before testing them on real streets with a safety driver.

Testing Edge Cases

The third type of safety testing that should be conducted is an analysis of whether systems respond safely when they encounter unexpected situations for which they were not trained or programmed.

Two standards have been developed to support this type of safety testing: ISO 21448, also known as Safety Of The Intended Functionality (SOTIF), was designed for Level 1 and 2 vehicles but can be used for ADSs, and UL 4600 was specifically designed for ADSs. These standards ask developers to list and test for the known edge cases. However, neither standard determines the overall safety level of an ADS, because neither can test for unknown edge cases.
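As a rough illustration of what "list and test for the known edge cases" can look like in practice, here is a sketch of a scenario catalogue driving automated checks. The scenario names, the run_in_simulator stub, and its canned results are all hypothetical; a real harness would execute the ADS in a high-fidelity simulator and check for collisions and rule violations. Note what the sketch cannot do: it only ever tests the cases someone thought to write down.

```python
from dataclasses import dataclass

@dataclass
class Scenario:
    name: str
    description: str

# A tiny, hypothetical catalogue of edge cases the developer has identified.
KNOWN_EDGE_CASES = [
    Scenario("ball_then_child", "A ball rolls into the street, followed by a running child"),
    Scenario("overturned_truck", "An overturned tractor-trailer blocks the lane ahead"),
    Scenario("faded_lane_markings", "Lane markings are partially missing after repaving"),
]

def run_in_simulator(scenario: Scenario) -> bool:
    """Stand-in for a call into a driving simulator. Here it just returns a
    canned result so the sketch runs end to end."""
    canned_results = {"ball_then_child": True,
                      "overturned_truck": False,
                      "faded_lane_markings": True}
    return canned_results[scenario.name]

if __name__ == "__main__":
    for scenario in KNOWN_EDGE_CASES:
        outcome = "PASS" if run_in_simulator(scenario) else "FAIL"
        print(f"{outcome}: {scenario.name} - {scenario.description}")
```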

So, how do you test for an unknown edge case? One method is to use disengagement data. In California, many manufacturers are testing self-driving taxis in cities like San Francisco. These vehicles operate in Level 4 mode with an ODD restricted to specific city streets, but a safety driver is present who is required to be fully alert and ready to take over instantly, as if it were a Level 2 vehicle. When the vehicle gets into trouble, the safety driver takes over. This is known as a disengagement.

We know how many miles the average human drives between accidents. If one assumes that each disengagement would have resulted in an accident, then the disengagement rate can be compared to the accident rate for people (or, more specifically, for taxi drivers). Unfortunately, estimating how many miles autonomous vehicles need to be driven to demonstrate safety is not that simple. The RAND Corporation estimated in 2016 that it would require hundreds of millions of self-driving miles, which is impractical. Since then, Waymo alone has logged over 20 million self-driving test miles.
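For a sense of where numbers like RAND's come from, here is a minimal sketch of the underlying arithmetic: if a fleet drives N miles with zero failures, the failure rate can be bounded, at a given confidence level, at roughly -ln(1 - confidence) / N. The human fatality rate used below is approximate, and the calculation is my own simplification, not RAND's full analysis.

```python
import math

def miles_needed(target_rate_per_mile: float, confidence: float = 0.95) -> float:
    """Failure-free miles needed so that, at the given confidence level, the
    true failure rate can be bounded below target_rate_per_mile
    (zero events in N miles bounds the rate at -ln(1 - confidence) / N)."""
    return -math.log(1.0 - confidence) / target_rate_per_mile

# Approximate US human-driver fatality rate: ~1.1 per 100 million vehicle miles.
human_fatality_rate = 1.1 / 100_000_000
print(f"{miles_needed(human_fatality_rate):,.0f} failure-free miles needed")
# Prints roughly 272 million miles -- consistent with "hundreds of millions."
```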

Another issue is that we do not have human-driver crash statistics for specific ODDs. However, we could at least compare ADS crash rates for specific ODDs to overall human-driver crash rates. And because roughly 94% of human-driver crashes are attributed to human error (the NHTSA figure quoted above), ODD-specific human crash rates might not deviate much from the overall rates anyway: most crashes are caused by the driver rather than by the driving environment.

A final issue is that many of the disengagements that occur during ADS testing would not have resulted in accidents. Disengagements also occur for benign reasons, such as needing to drive on a street that is not part of the ODD. Still, the testing process could be enhanced so that some degree of comparison is possible. For example, Aurora safety drivers record when they disengaged specifically to prevent an accident. Argo records the sensor data for each disengagement so that analysts can review it. Waymo goes further and uses the disengagement sensor data to recreate the conditions in simulation, which can determine whether a crash would have occurred had the safety driver not disengaged.

Companies like Aurora, Argo, and Waymo are testing Level 4 vehicles, primarily self-driving taxis. Analyzing disengagement data is harder for Level 5 consumer vehicles, which are difficult to test with safety drivers because of the enormous variety of environments they must handle. However, consumer vehicles effectively have safety drivers when operating in Level 2 mode: the safety driver is the consumer. In fact, some consumer vehicles already record disengagements. For example, when a Tesla is driven in Autopilot mode and the human driver takes over control, the event is recorded as a disengagement and reported back to Tesla. However, to my knowledge, Tesla does not share this data with regulators. Consumer vehicle manufacturers could test their vehicles with safety drivers who record the reasons for disengagements, and ADS developers could be required to share this data with regulators.

Summary

Automated Driving Systems (ADSs) are designed to drive a car, taxi, bus, or other vehicle while the driver is otherwise engaged (e.g., reading a book). Level 3 and Level 4 ADSs are restricted to Operational Design Domains (ODDs), which may limit geography and impose restrictions based on weather, time of day, precipitation, road grades and curvature, and other factors. Level 5 vehicles can operate anywhere and have no ODD.

There are good reasons to assume that ADSs will be safer than human drivers. They never get tired, text while driving, or drive after drinking.

There are equally good reasons to assume they will not be safer. No one knows how to build common sense into computers. However, commonsense reasoning is needed for an ADS to handle all the unexpected situations (edge cases) it might encounter. Without commonsense reasoning, ADSs can only handle edge cases that have been explicitly coded into the ADS software or edge cases that the ADS has been trained to handle. Accidents and traffic jams can occur when edge cases are encountered that were not anticipated by the ADS engineers.

The reality is that ADSs will likely be safe for certain ODDs (e.g., a corporate shuttle that runs back and forth along a single street at 5 mph) but perhaps not for other ODDs (e.g., a taxi service that covers a broad geography in all weather conditions) or for Level 5 operation.

Regulatory authorities in the US and around the world are rushing ADS technology to market because of potential benefits such as lower accident rates and improved mobility for seniors and people with disabilities. For example, the National Highway Traffic Safety Administration (NHTSA) has indicated that it does not believe testing of ADS capabilities should be required. Some ADSs will be proven safe for certain ODDs. Some will likely be proven unsafe for other ODDs and/or for Level 5 driving.

Shouldn’t ADS developers be required to prove an ADS is at least as safe as a human driver for a specific ODD before allowing it out onto public roads? This lack of regulation has the potential to turn our roads into a vast experiment with disastrous consequences.

This article has discussed the types of testing that should be performed before ADSs are allowed on our roads, and it calls on regulatory authorities to require such testing.

This article was originally published in The Gradient.