
Traveling in Style with Google Street View

I created a software program to apply a Neural Style Transfer algorithm to Google Street View data to make an animation that looks like you are traveling inside a painting.

Google collects panorama photos from all around the world and makes them available through the Google Street View service. Although they visit many interesting locations, the photos often look dull. My goal for this project was to make the photos look artistic and visually appealing. I was inspired by the aesthetics of neural style transfers and wanted to use these new algorithms to improve the Google Street View photos.

To accomplish this, I needed to obtain the photos through Google's API and apply a temporally coherent style transfer algorithm so that the sequence looks like a painted animation. All of the code is written in Python: one part downloads the photos through Google's API, and the other uses TensorFlow to implement the style transfers.
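As a rough illustration of the download step, here is a minimal sketch that builds a request URL for Google's Street View image endpoint (currently documented as the Street View Static API). The endpoint and the size, location, heading, pitch, fov, and key parameters come from that API. The function name, the default values, and the sample coordinates are placeholders chosen for this example, not taken from the project's code.

```python
from urllib.parse import urlencode

# Base endpoint of the Street View Static API.
STREET_VIEW_ENDPOINT = "https://maps.googleapis.com/maps/api/streetview"


def build_street_view_url(lat, lng, heading, api_key,
                          size="640x640", pitch=0, fov=90):
    """Return a request URL for one Street View image (hypothetical helper)."""
    params = {
        "size": size,                 # output image size in pixels, WIDTHxHEIGHT
        "location": f"{lat},{lng}",   # latitude and longitude of the viewpoint
        "heading": heading,           # compass direction of the camera, in degrees
        "pitch": pitch,               # up/down camera angle, in degrees
        "fov": fov,                   # horizontal field of view, in degrees
        "key": api_key,               # your own API key
    }
    # Keep the comma in "lat,lng" readable instead of percent-encoding it.
    return f"{STREET_VIEW_ENDPOINT}?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    # Sample coordinates are arbitrary; replace the key before requesting the URL.
    print(build_street_view_url(41.3, -73.98, 180, "YOUR_API_KEY"))
```

Stepping the location along a route and requesting one image per step would give the sequence of frames to stylize.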

Here are two examples. The first is a trip down the Hudson River in the style of Monet's Sunset on the Seine at Lavacourt (1880). I also mixed in a small amount of a photograph by Javan Ng to give the result more structure.

The second is a trip down US 12 in Idaho. The style is a mixture of Picasso's Seated Nude (1909) and a photograph by Cat Connor of a sunset at Yosemite National Park.

Here are the stylized videos side by side with the original photos from Google Street View. It is interesting to compare the two to see the improvements.

Style Transfer Experimentation Tools

I did more than just make these two videos; I made a set of tools that can do this again and again for any location visited by Google Street View. All of the photos come from Google's Street View Image API.

Unfortunately, the algorithm I currently use for the style transfer is extremely slow. It takes my computer several minutes or more to stylize a single reasonably sized image or frame. At that rate, a 60-second animation at a low frame rate of just 6 frames per second (360 frames) can take several days to compute. Because of the slow speed, the cost of a failed style transfer is high, so I put a lot of effort into small test runs to get an idea of whether a larger job will give me the results I am looking for.
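To make that estimate concrete, here is a small back-of-the-envelope calculation. The figure of 10 minutes per frame is an assumed value standing in for "several minutes or more"; the actual time depends on image size and settings.

```python
def estimate_render_time(duration_s, fps, minutes_per_frame):
    """Return (frame_count, total_hours) for stylizing every frame in sequence."""
    frames = duration_s * fps
    total_hours = frames * minutes_per_frame / 60
    return frames, total_hours


if __name__ == "__main__":
    # 60-second animation at 6 fps; 10 minutes per frame is an assumption.
    frames, hours = estimate_render_time(60, 6, 10)
    print(f"{frames} frames, {hours:.0f} hours, {hours / 24:.1f} days")
```

With these numbers, the 360 frames take 60 hours, or 2.5 days of continuous computation.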

Technical Challenges

To speed up the process of stylizing each frame of a video, I used cloud resources to accelerate the work. Using cloud resources for this project was fun and educational, but moving forward I'm probably not going to continue using them for style transfers. Nothing about this project is time sensitive, so I'm almost always going to be fine doing this at home with my NVIDIA GTX 1080 Ti GPU. In general, I believe that using advanced hardware like V100s is justified only after making an effort to optimize one's algorithms and code. For a lot of reasons, I believe that my current hardware is underutilized and that there is a lot I can do to make this faster.

Next Steps

My most important next step is to improve algorithm performance. My current idea is to use some kind of progressive image resizing to accelerate the early stages of the optimization. My initial attempts at implementing this didn't work as well as I hoped. It failed because of some details about how the optimization function works; I'll explore this in the future.
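The general idea of progressive resizing can be sketched as a coarse-to-fine resolution schedule: optimize a small version of the image first, then upscale the result and use it as the starting point at the next size. The function below only computes such a schedule. Its name, the number of levels, and the scale factor are assumptions made for illustration; it is not the implementation attempted in this project and does not address the issue with the optimization function mentioned above.

```python
def resolution_schedule(final_width, final_height, levels=3, factor=2):
    """Return image sizes from coarsest to finest, ending at the final size."""
    sizes = []
    for level in range(levels - 1, -1, -1):
        scale = factor ** level
        # Integer division keeps sizes whole; never go below one pixel.
        width = max(1, final_width // scale)
        height = max(1, final_height // scale)
        sizes.append((width, height))
    return sizes


if __name__ == "__main__":
    for width, height in resolution_schedule(640, 480):
        # In a real pipeline, each stage would run the style transfer
        # optimizer at this size, starting from the upscaled previous result.
        print(f"optimize at {width}x{height}")
```

Because the early stages work on images with a fraction of the pixels, each of their iterations is much cheaper, and the hope is that the final full-size stage then needs fewer iterations.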

My code uses the algorithm described in Ruder, Dosovitskiy, and Brox's paper "Artistic Style Transfer for Videos". There are other papers that describe completely different and more sophisticated approaches, which I will learn more about and experiment with. There's still a lot more for me to learn from the current paper, though, so I am going to stick with it before I move on to others.
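What makes that approach coherent from frame to frame is a temporal consistency term: the current stylized frame is penalized for differing from the previous stylized frame after it has been warped by optical flow, with a per-pixel weight that switches the penalty off where the flow is unreliable (for example at occlusions). Below is a simplified sketch of that term on flattened pixel values in plain Python. The function and argument names are mine, and a real implementation, including this project's TensorFlow code, would operate on image tensors and combine this term with the content and style losses.

```python
def temporal_loss(current, warped_previous, weights):
    """Weighted mean squared difference between two flattened frames.

    current:          pixel values of the frame being stylized
    warped_previous:  previous stylized frame, warped by optical flow
    weights:          per-pixel weights in [0, 1]; 0 where the flow is unreliable
    """
    if not (len(current) == len(warped_previous) == len(weights)):
        raise ValueError("all inputs must have the same length")
    if not current:
        raise ValueError("inputs must not be empty")
    total = sum(c * (x - w) ** 2
                for x, w, c in zip(current, warped_previous, weights))
    return total / len(current)


if __name__ == "__main__":
    current = [1.0, 0.5, 3.0, 2.0]
    warped_previous = [1.0, 1.0, 0.0, 2.0]
    weights = [1, 1, 0, 1]  # third pixel is masked out, so its large difference is ignored
    print(temporal_loss(current, warped_previous, weights))
```

In this toy example only the second pixel contributes, because the first and fourth pixels already match and the third is masked out.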

I also want to switch from TensorFlow to PyTorch. I think PyTorch will put me in a better position to experiment with the code in the way I'd like.

Finally, I would like to apply these tools to 360-degree videos. This was my original goal for this project. Doing so would be amazing, but it would take much more computation time than is feasible right now. I'm determined to make this happen.