Worth noting that they already use BERT in Search.

The raw neural network does use search during training, though; it is only during evaluation that it does not rely on search.