Waymo and Google Brain partner to advance data augmentation research

Posted: 6 April 2020

Through the collaboration, Waymo aims to extend previous research to automatically discover optimal data augmentation policies to improve perception tasks for the ‘Waymo Driver’.


To advance its machine learning models and further improve its self-driving system’s perception, Waymo, the self-driving car service, has teamed up with colleagues from Google Brain, a deep learning artificial intelligence research team at Google. The aim is to extend automated data augmentation research and test it on Waymo’s autonomous driving dataset.

“Data augmentation allows us to increase the quantity and diversity of data we observe without additional collecting or labelling costs. The principle behind augmenting data is simple. Let’s say you have a picture of a dog – by using various image augmentation operations such as rotation, cropping, image mirroring, colour shifting, etc., you can morph and transform the photo – but it doesn’t change the fact that it’s an image of a dog. These simple transformations turn one image of a dog into many, though determining which combinations of augmentation operations to use and applying them requires a lot of manual engineering,” Waymo wrote in a blog post.
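The principle described in the quote can be illustrated with a short sketch. It is not Waymo's code: it assumes a tiny greyscale image stored as nested lists, and the function names (`mirror`, `rotate90`, `crop`, `shift_brightness`, `augment`) are hypothetical, chosen only to match the operations named above.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed format: rows of 0-255 greyscale pixel values


def mirror(img: Image) -> Image:
    """Flip the image left to right."""
    return [row[::-1] for row in img]


def rotate90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def crop(img: Image, top: int, left: int, height: int, width: int) -> Image:
    """Keep a height x width window starting at (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def shift_brightness(img: Image, delta: int) -> Image:
    """Add delta to every pixel, clamped to the 0-255 range."""
    return [[min(255, max(0, p + delta)) for p in row] for row in img]


def augment(img: Image, label: str) -> List[Tuple[Image, str]]:
    """Turn one labelled image into several; the label never changes."""
    variants = [
        img,
        mirror(img),
        rotate90(img),
        crop(img, 0, 0, len(img) - 1, len(img[0]) - 1),
        shift_brightness(img, 40),
    ]
    return [(v, label) for v in variants]


dog = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
examples = augment(dog, "dog")  # five labelled examples from one labelled image
```

Each variant reuses the original label, which is why augmentation adds training data without adding labelling cost. Which operations to combine, and how strongly to apply them, is the part that normally takes manual tuning.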

The Google Brain team designed a new search space consisting of augmentation policies – combinations of augmentation operations. The team was reportedly able to use reinforcement learning to automatically explore which augmentation policies to use. By finding the optimal image transformation policies from the data itself, the team was then able to improve performance on image recognition tasks across various academic datasets and extend these ideas to object localisation problems. The researchers also discovered a way to reduce the computational cost of searching for effective data augmentation policies, which is said to make the approach an effective and inexpensive tool for Waymo to use across its dataset.
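To give a sense of why this search is automated rather than done by hand, the sketch below represents a policy as a short sequence of operations. The article only says that policies are combinations of operations; the probability and magnitude attached to each operation, the discretisation into 11 levels, and the names `sample_policy` and `search_space_size` are all assumptions made for illustration.

```python
import random
from typing import List, Tuple

# Operation names taken from the examples quoted in the article.
OPERATIONS = ["rotate", "crop", "mirror", "colour_shift"]

# Assumed encoding: (operation, probability of applying it, magnitude level).
PolicyStep = Tuple[str, float, int]


def sample_policy(rng: random.Random, num_ops: int = 2) -> List[PolicyStep]:
    """Draw one candidate policy at random from the assumed search space."""
    return [
        (rng.choice(OPERATIONS), rng.randrange(11) / 10, rng.randrange(11))
        for _ in range(num_ops)
    ]


def search_space_size(num_ops: int = 2, prob_levels: int = 11,
                      magnitude_levels: int = 11) -> int:
    """Count the distinct policies under the assumed discretisation."""
    choices_per_step = len(OPERATIONS) * prob_levels * magnitude_levels
    return choices_per_step ** num_ops


rng = random.Random(0)
candidate = sample_policy(rng)
total = search_space_size()  # (4 * 11 * 11) ** 2 candidate policies
```

Even this toy space holds hundreds of thousands of two-step policies, and each one would need a training run to evaluate. That cost is the reason a guided search, and a cheaper way of running it, matters.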

To automate the process of finding good augmentation policies, Waymo created a new automated data augmentation algorithm – Progressive Population Based Augmentation (PPBA). PPBA builds on Waymo’s previous Population Based Training (PBT) work, in which it trained neural nets with evolutionary computation using principles similar to Darwin’s theory of natural selection. PPBA reportedly learns to optimise augmentation strategies effectively and efficiently by narrowing down the search space at each population iteration and adopting the best parameters discovered in past iterations.
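The general idea of a population-based search can be sketched as follows. This is a toy illustration and not Waymo's PPBA: the scoring function is a made-up stand-in for validation accuracy, the parameter names are hypothetical, and perturbing only a small subset of parameters per iteration is a loose assumption about what narrowing the search space might look like.

```python
import random
from typing import Dict, List

Params = Dict[str, float]


def toy_score(params: Params) -> float:
    """Hypothetical stand-in for validation accuracy; best when every value is 0.5."""
    return -sum((v - 0.5) ** 2 for v in params.values())


def population_search(param_names: List[str], population_size: int = 8,
                      iterations: int = 20, params_per_step: int = 2,
                      seed: int = 0) -> Params:
    rng = random.Random(seed)
    # Start from randomly initialised augmentation parameters in [0, 1].
    population = [{n: rng.random() for n in param_names}
                  for _ in range(population_size)]
    for _ in range(iterations):
        # Selection: keep the better-scoring half of the population.
        population.sort(key=toy_score, reverse=True)
        survivors = population[:population_size // 2]
        children = []
        for parent in survivors:
            child = dict(parent)  # exploit: inherit the best parameters so far
            # Explore: perturb only a few parameters (assumed narrowing step).
            for name in rng.sample(param_names, params_per_step):
                child[name] = min(1.0, max(0.0, child[name] + rng.uniform(-0.1, 0.1)))
            children.append(child)
        population = survivors + children
    return max(population, key=toy_score)


names = ["rotate_magnitude", "flip_probability", "scale_magnitude", "drop_probability"]
best = population_search(names)
```

Because survivors carry over unchanged, the best score in the population never gets worse from one iteration to the next. In a real system each score would come from training and evaluating a detection model, so limiting how many parameters change per step keeps the search affordable.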

By automating data augmentation for lidar point clouds in the Waymo Open Dataset, PPBA is said to achieve significant performance improvements across detection architectures.

“Our experiments also show PPBA is much faster and more effective in finding data augmentation strategies compared to a random search or a PBA baseline. Additionally, because we rely on labelled lidar data to train our neural nets, PPBA also allows us to save on labelling costs, in turn improving our data efficiency as one labelled example becomes many. Our 3D detection control experiments on the Waymo Open Dataset show that using PPBA is up to 10 times more data efficient than training nets without augmentation,” the blog post continued.

“Our experiments show that by applying automated data augmentation to lidar data, we can significantly improve 3D object detection without additional data collection or labelling. On the baseline 3D detection model, our method is up to 10x more data efficient than without augmentation, enabling us to train machine learning models with fewer labelled examples, or use the same amount of data for better results, at a lower cost. The increase in data efficiency is especially important as it means we can speed up the training process and improve the perception tasks of our fifth-generation Waymo Driver, enabling us to serve our Waymo Via partners and Waymo One riders more effectively and efficiently.”