Quick takeaways from Tesla’s AI Day Event for AV Motion Planning
Excitingly, Tesla just shared many details of its vision and planning methods at the Tesla AI Day event held on Aug 19th. They showed a lot of fancy stuff, especially their frontier vision techniques. As traffic engineers, the authors are more interested in the planning part, i.e. how Autopilot plans its longitudinal and lateral movements in self-driving mode, which correspond to the traditional car-following (CF) and lane-changing (LC) models in the traffic research domain.
This short article will share some quick takeaways pertinent to the planning methods revealed in Tesla’s AI event. As a disclaimer, all screenshots listed below are taken from the YouTube recording, with or without comments added by the writer of this blog.
Karpathy started by introducing the work he has done since joining Tesla four years ago. One item related to planning is speed estimation from images/videos; see Fig. 1. Karpathy mentioned that fusing information across video frames clearly improves speed estimation compared with differentiating single-frame estimates.
Right after that, the lead of Tesla’s Planning & Control team talked about the detailed methods used in Autopilot.
Planning is an optimization problem with a balanced objective:
The speaker started by introducing the three pillars of the planning module, which are safety, comfort, and efficiency, as shown in Fig. 2. The goal of planning is to maximize a balanced total of the three.
Those three aspects are not surprising. But combining them into a single total indicates that Tesla treats planning as an optimal control problem, in which it needs to compute a trajectory that maximizes a certain function of the three terms.
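To make the idea of a balanced objective concrete, here is a minimal sketch that scores a few candidate trajectories with a weighted sum of three cost terms and picks the cheapest one (minimizing a total cost is equivalent to maximizing its negative). The weights, the cost definitions, and every name in the snippet (`Candidate`, `min_gap`, `max_jerk`, `travel_time`, `total_cost`) are illustrative assumptions of the writer; Tesla did not disclose its actual objective.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    min_gap: float      # smallest distance to any other road user [m]
    max_jerk: float     # largest jerk magnitude along the trajectory [m/s^3]
    travel_time: float  # time needed to finish the maneuver [s]


# Hypothetical weights; the real trade-off used by Autopilot is not public.
W_SAFETY, W_COMFORT, W_EFFICIENCY = 10.0, 1.0, 0.5


def total_cost(c: Candidate) -> float:
    safety = 1.0 / max(c.min_gap, 0.1)  # small gaps are penalized heavily
    comfort = c.max_jerk                # harsh maneuvers are uncomfortable
    efficiency = c.travel_time          # slow maneuvers are inefficient
    return W_SAFETY * safety + W_COMFORT * comfort + W_EFFICIENCY * efficiency


def pick_best(candidates):
    # The planner keeps the candidate with the lowest balanced cost.
    return min(candidates, key=total_cost)


candidates = [
    Candidate("aggressive", min_gap=1.0, max_jerk=4.0, travel_time=8.0),
    Candidate("balanced", min_gap=5.0, max_jerk=1.5, travel_time=10.0),
    Candidate("timid", min_gap=10.0, max_jerk=0.5, travel_time=20.0),
]
best = pick_best(candidates)
print(best.name, total_cost(best))
```

With these made-up numbers the "balanced" candidate wins: the aggressive one pays too much for safety and comfort, and the timid one pays too much for efficiency.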
What are the key problems in planning?
Solving planning as an optimization problem is not a simple task. What are the challenges? The speaker mentioned two major problems; see Fig. 3. First, the action space is non-convex, meaning there exist multiple local optima, so an algorithm can easily converge to a local minimum and fail to find the global one. Second, the problem is high-dimensional: it is too expensive to search directly for a continuous trajectory because Tesla needs to predict the next 20 s. Instead, a discrete search with finite and fewer dimensions makes it tractable.
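The non-convexity issue can be illustrated with a toy example. The sketch below runs plain gradient descent on a one-dimensional function with two valleys; the function, the step size, and the starting points are arbitrary choices of the writer and have nothing to do with Tesla's real cost. Depending on where it starts, the same algorithm ends in a different valley.

```python
def cost(x):
    # A toy non-convex cost with two local minima.
    return x**4 - 3 * x**2 + x


def grad(x):
    # Analytical derivative of the toy cost.
    return 4 * x**3 - 6 * x + 1


def gradient_descent(x0, lr=0.01, steps=2000):
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)  # always walk downhill from the current point
    return x


x_from_right = gradient_descent(2.0)
x_from_left = gradient_descent(-2.0)
print(x_from_right, cost(x_from_right))
print(x_from_left, cost(x_from_left))
```

Starting from the right, the search settles in the shallower valley near x ≈ 1.13, while starting from the left reaches the deeper one near x ≈ -1.30. A purely local optimizer cannot tell which of the two it has found.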
A hybrid planning system:
To circumvent these challenges, Tesla adopts the pipeline depicted in Fig. 4. Simply put, Autopilot first conducts a low-dimensional coarse search, e.g. for a route consisting of 10 points. Around that route there is a zone called the ‘convex corridor’, which should contain only one optimum, the one giving the best objective value. Inside that convex corridor, continuous optimization of a high-dimensional trajectory becomes feasible because the computational cost is lower.
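The coarse-then-fine idea can be sketched in one dimension. The snippet below first evaluates a toy non-convex cost on a handful of grid points, then treats the neighborhood of the best grid point as the "corridor" and refines the solution there with a continuous search. The cost function, the grid size, and the use of ternary search for the refinement step are assumptions made for illustration; the real planner works on full trajectories, not on a single scalar.

```python
def cost(x):
    # Toy non-convex cost: a shallow valley near 1.13, a deeper one near -1.30.
    return x**4 - 3 * x**2 + x


def coarse_search(lo, hi, n):
    # Discrete stage: evaluate the cost on n evenly spaced points.
    step = (hi - lo) / (n - 1)
    grid = [lo + i * step for i in range(n)]
    return min(grid, key=cost), step


def refine(lo, hi, tol=1e-6):
    # Continuous stage: ternary search, valid only if the cost has a single
    # minimum on [lo, hi] (the toy counterpart of the convex corridor).
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if cost(m1) < cost(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


seed, step = coarse_search(-2.0, 2.0, 9)
x_star = refine(seed - step, seed + step)
print(seed, x_star, cost(x_star))
```

The coarse stage only has to land in the right valley; it picks the grid point -1.5. The fine stage then works on the interval [-2, -1], where this toy cost is convex, and converges to the global minimum near -1.30.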
Example of the hybrid planning method:
Then the speaker gave us an example of the hybrid planning method, which is a lane change case.
Autopilot then picks the one solution, out of the 2,500 searched, that best balances the three pillars in its objective function: safety, comfort and efficiency.
Although the speaker did not show an example of longitudinal movement, we can conjecture that a similar paradigm is used for the car-following case.
How to deal with surrounding traffic?
Another key question in planning is how to model the reactions of surrounding vehicles or predict their trajectories. In many research papers, a simplified model such as the Intelligent Driver Model (IDM) is typically used for all other vehicles, which has become a lingering issue in the research area. It is interesting to see how Tesla deals with it.
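For readers less familiar with it, the IDM mentioned above is a closed-form car-following rule: the follower's acceleration depends only on its own speed, the gap to the leader, and the approach rate. A minimal sketch is given below. The formula is the standard IDM, with the common variant that clips the dynamic part of the desired gap at zero; the parameter values are typical illustrative numbers chosen by the writer, not calibrated ones.

```python
import math


def idm_acceleration(v, gap, dv, v0=30.0, T=1.5, a_max=1.0, b=2.0,
                     s0=2.0, delta=4.0):
    """Acceleration of the following vehicle under the IDM.

    v    : follower speed [m/s]
    gap  : bumper-to-bumper distance to the leader [m]
    dv   : approach rate, follower speed minus leader speed [m/s]
    v0   : desired speed, T: desired time headway, a_max: max acceleration,
    b    : comfortable deceleration, s0: standstill gap, delta: exponent
    """
    # Desired gap: standstill distance plus a speed-dependent part.
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    # Free-road term minus interaction term.
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


print(idm_acceleration(v=0.0, gap=1e9, dv=0.0))   # empty road, standing still
print(idm_acceleration(v=30.0, gap=1e9, dv=0.0))  # empty road, at desired speed
print(idm_acceleration(v=20.0, gap=10.0, dv=5.0)) # closing in on a slow leader
```

On an empty road the model accelerates at `a_max` from standstill and stops accelerating at the desired speed, while a short gap to a slower leader produces strong braking. The simplicity is exactly why it is popular in research, and also why using it for every surrounding vehicle is a strong assumption.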
We have a short and direct answer from the speaker: Tesla runs Autopilot for all surrounding vehicles as well as for itself. He showed a vivid example in which a Tesla vehicle meets two oncoming cars in a narrow lane. The speaker mentioned that Tesla predicts the trajectories of the oncoming vehicles and has different solutions for different scenarios; see Fig. 7.
For the author of this blog, it is a bit surprising to see Tesla run Autopilot for all vehicles, because that means Tesla has to be very confident about the similarity between Autopilot and a real human driver. If that similarity does not hold, simulating all vehicles using Autopilot won’t make a lot of sense.
In the end, the speaker used a driving scene, possibly from India, to show that driving tasks can be very different across the world. He further commented that a learning-based solution is needed to handle those ‘complicated’ driving jobs; see Fig. 8.
The learning-based approach is for the future:
The speaker briefly introduced their current plan to develop a learning-based motion planning approach. With a neural net planner, the whole architecture would look something like this:
Honestly, the speaker did not reveal many details on how to train such a neural net planner, possibly because it is still under development. However, we see a clear reinforcement learning (RL) approach, in which Tesla still treats planning as an optimization problem but leverages RL methods similar to those in the well-known MuZero paper and its successful results on Atari games.
Similar to MuZero, Autopilot tries to predict the states and the action distribution and then conducts a Monte Carlo tree search with its customized cost functions, similar to the balanced objective function mentioned above; see Fig. 10.
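To give a feeling for how a tree search with a hand-made cost works, here is a heavily simplified sketch on a toy lane-choice problem. It is not Tesla's planner and not MuZero: the scene (`LANES`, `HORIZON`, `OBSTACLES`), the cost numbers, and all function names are made up by the writer, the selection rule is plain UCB1 rather than MuZero's prior-weighted rule, and the learned value network is replaced by a constant placeholder `value_estimate`.

```python
import math

LANES = 3               # lanes are numbered 0..2
HORIZON = 4             # number of planning steps
OBSTACLES = {(1, 0)}    # hypothetical scene: lane 0 is blocked at step 1
ACTIONS = (-1, 0, 1)    # move one lane left, keep lane, move one lane right


def legal_actions(lane):
    return [a for a in ACTIONS if 0 <= lane + a < LANES]


def step_reward(step, lane, action):
    cost = 0.1 * abs(action)                    # comfort: lane changes cost a bit
    if (step + 1, lane + action) in OBSTACLES:  # safety: collisions cost a lot
        cost += 10.0
    return -cost


def value_estimate(step, lane):
    return 0.0  # placeholder for a learned value network


class Node:
    def __init__(self, step, lane):
        self.step, self.lane = step, lane
        self.children = {}  # action -> Node
        self.visits = 0
        self.total = 0.0    # sum of returns backed up through this node

    def value(self):
        return self.total / self.visits if self.visits else 0.0


def ucb_score(node, action, c):
    child = node.children[action]
    if child.visits == 0:
        return math.inf  # try every action at least once
    q = step_reward(node.step, node.lane, action) + child.value()
    return q + c * math.sqrt(math.log(node.visits) / child.visits)


def simulate(node, c=1.4):
    if node.step == HORIZON:
        ret = 0.0  # end of the horizon: nothing left to pay
    elif not node.children:
        # Expansion: create the children and evaluate the leaf.
        for a in legal_actions(node.lane):
            node.children[a] = Node(node.step + 1, node.lane + a)
        ret = value_estimate(node.step, node.lane)
    else:
        # Selection: descend along the action with the best UCB score.
        a = max(node.children, key=lambda act: ucb_score(node, act, c))
        ret = step_reward(node.step, node.lane, a) + simulate(node.children[a], c)
    # Backup: update the statistics on the way back to the root.
    node.visits += 1
    node.total += ret
    return ret


def plan(start_lane, n_simulations=200):
    root = Node(0, start_lane)
    for _ in range(n_simulations):
        simulate(root)
    # Act according to the most visited root action.
    return max(root.children, key=lambda a: root.children[a].visits), root


best_action, root = plan(start_lane=0)
print(best_action, {a: ch.visits for a, ch in root.children.items()})
```

In this toy scene the ego vehicle starts in lane 0, which is blocked one step ahead, so the search quickly concentrates its visits on the "move right" action. The structure (select, expand, evaluate, back up) is the part that carries over; the cost terms play the role of the balanced objective discussed earlier.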
In summary, Tesla solves the planning problem using a hybrid search-based method, with a human-designed balanced goal consisting of safety, comfort, and efficiency. Interestingly, Tesla predicts all surrounding vehicles using the same Autopilot model and interacts with them to increase its competitive capabilities. Tesla recognizes the challenges in more complex driving scenarios and is currently on its way to adopting an RL method for the planning problem, similar to the recent RL successes on Atari games.