Milestone #2: Data Assembly
The second step of this project is to access the Google Street View data and organize it in a suitable format. In my project plan my target was to reach this goal by February 21st (last Wednesday). Although I have accomplished a lot, I have not finished everything I wanted for this milestone. I expect to hit it by next week at the latest.
Here's what I have achieved.
First, I can download all of the relevant data from Google. This includes all of the panorama image data and metadata. I can also access the panorama ids for the neighboring locations. All of the metadata is stored in a database.
%cd ..
from common import db
data = db.dump_database()
data.head(1).T
Below are the database table column names.
data.columns.tolist()
The download code is written in such a way that the download job can be paused and resumed at a later date. This will be essential later if I hit the Google Street View API download limit of 25,000 requests per day.
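The pause-and-resume pattern can be sketched as follows. This is not my actual download code; `download_panorama` is a hypothetical stand-in for the real Street View fetch, and the progress table here is a minimal illustration of recording completed ids so a rerun skips them.

```python
import sqlite3

def download_panorama(pano_id):
    """Hypothetical stand-in for the real Street View download."""
    return b'...image bytes...'

def run_job(pano_ids, db_path='progress.db'):
    """Download panoramas, skipping any already marked done, so the
    job can be stopped and resumed without repeating API requests."""
    con = sqlite3.connect(db_path)
    con.execute('CREATE TABLE IF NOT EXISTS done (pano_id TEXT PRIMARY KEY)')
    finished = {row[0] for row in con.execute('SELECT pano_id FROM done')}
    for pano_id in pano_ids:
        if pano_id in finished:
            continue                  # already downloaded on a prior run
        download_panorama(pano_id)
        con.execute('INSERT INTO done VALUES (?)', (pano_id,))
        con.commit()                  # persist progress after each item
    con.close()
```

Committing after each item means an interrupted run loses at most one in-flight download.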
The raw image files are stored in an image directory. The filenames contain the panorama id and the heading/pitch of the image. There are 6 images for each panorama id.
!ls -l data/images/test/ | head -6
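A filename like the ones in this listing can be unpacked like so. This is a sketch assuming the `<panoid>_<heading>_<pitch>.jpg` convention visible above; since a panorama id can itself contain underscores, the split has to come from the right.

```python
from pathlib import Path

def parse_pano_filename(path):
    """Split 'PANOID_HHH_PPP.jpg' into (pano_id, heading, pitch).

    Pano ids may themselves contain underscores, so split from the right.
    """
    stem = Path(path).stem
    pano_id, heading, pitch = stem.rsplit('_', 2)
    return pano_id, int(heading), int(pitch)

print(parse_pano_filename('0G-deBj1AdAD4afV_n-ARQ_090_000.jpg'))
# → ('0G-deBj1AdAD4afV_n-ARQ', 90, 0)
```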
Here's one of those image files. The heading of this file is 0 degrees, meaning the camera is pointing directly north.
from IPython.display import Image
with open('data/images/test/0G-deBj1AdAD4afV_n-ARQ_000_000.jpg', 'rb') as f:
img = f.read()
Image(img)
Observe that the image is 640x640 pixels.
I can also look directly east:
from IPython.display import Image
with open('data/images/test/0G-deBj1AdAD4afV_n-ARQ_090_000.jpg', 'rb') as f:
img = f.read()
Image(img)
Or straight up:
from IPython.display import Image
with open('data/images/test/0G-deBj1AdAD4afV_n-ARQ_000_090.jpg', 'rb') as f:
img = f.read()
Image(img)
And so on. The 6 directions are north, east, south, west, up, and down.
Of course I probably don't want to use these same directions for my project. That's why I wrote code that will assemble these 6 images into any image I want. Observe:
from sequencing import assembler
planar = assembler.Planar('test', 2000, 1000, 500, 60)
planar.generate('0G-deBj1AdAD4afV_n-ARQ')
My code allows me to change the heading to arbitrary values. Here's that same location turned to the left 60 degrees:
import numpy as np
planar.generate('0G-deBj1AdAD4afV_n-ARQ', heading=np.deg2rad(-60))
Both of these pull pixels from multiple images and assemble them into one single image.
This is a lot of functionality, but you might be wondering why I went through the trouble of coding it this way. Why not just download the image data for the orientations I want to use? Why download images for the fixed directions north, east, south, west, up, and down if I need to re-orient everything later?
The reason is that downloading data is slow, and I only want to download data for a location once. This approach allows me to decide later what orientations I want. I can manage data problems stemming from the Google Street View car's actual recording path or other data idiosyncrasies. I can make larger images with an aspect ratio other than 1:1, and I can change my mind about this as many times as I please.
And another reason is I wanted to do this:
equi = assembler.Equirectangular('test', 2000)
equi.generate('0G-deBj1AdAD4afV_n-ARQ', heading=np.deg2rad(-60))
That's a valid equirectangular projection, similar to what a 360-degree camera would give. If I get a series of these in sequence, I can make a 360 video.
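The geometry behind an equirectangular assembly can be sketched like this. This is not my `assembler` code: the face ordering here (+x, -x, +y, -y, +z, -z) and the pixel conventions are assumptions for illustration, but the idea is the same: map every output pixel's longitude/latitude to a direction vector, pick the cube face the vector hits, and project onto it.

```python
import numpy as np

def equirect_lookup(width, height, face_size=640):
    """Map each equirectangular output pixel to a cube face index and a
    (row, col) within that face. Face order is a hypothetical
    (+x, -x, +y, -y, +z, -z)."""
    lon = (np.arange(width) + 0.5) / width * 2 * np.pi - np.pi
    lat = np.pi / 2 - (np.arange(height) + 0.5) / height * np.pi
    lon, lat = np.meshgrid(lon, lat)

    # unit view direction for every output pixel
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)

    comp = np.stack([x, y, z])                     # (3, H, W)
    axis = np.argmax(np.abs(comp), axis=0)         # dominant axis per pixel
    dom = np.take_along_axis(comp, axis[None], 0)[0]
    face = axis * 2 + (dom < 0)                    # face index 0..5

    # the two non-dominant components, projected onto the face plane
    u = np.choose(axis, [z, x, x]) / np.abs(dom)
    v = np.choose(axis, [y, z, y]) / np.abs(dom)
    col = ((u + 1) / 2 * (face_size - 1)).round().astype(int)
    row = ((1 - v) / 2 * (face_size - 1)).round().astype(int)
    return face, row, col

face, row, col = equirect_lookup(2000, 1000)
```

Since the lookup depends only on the output size, it can be computed once and reused for every panorama.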
And here's the best part of my code:
%time img = planar.generate('0G-deBj1AdAD4afV_n-ARQ', heading=np.deg2rad(-60))
%time img = equi.generate('0G-deBj1AdAD4afV_n-ARQ', heading=np.deg2rad(-60))
Both images can be assembled in a fraction of a second. That's fast!
The code to do all of this is modeled after similar code in my Camera3D library. The code is non-trivial, but since I had done it before, I had working code to model my Python implementation on. The basic idea is the same, except I had to use several layers of numpy's advanced indexing to apply the lookup tables.
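The advanced-indexing trick at the heart of this can be sketched with random stand-ins for the real lookup tables. Once the per-pixel (face, row, col) tables exist, the whole output image is gathered in a single vectorized expression:

```python
import numpy as np

# Six 640x640 RGB faces stacked into one array (random stand-ins here).
faces = np.random.randint(0, 256, size=(6, 640, 640, 3), dtype=np.uint8)

# Hypothetical precomputed lookup tables: for every output pixel,
# which face to read from and which (row, col) within that face.
h, w = 1000, 2000
face_idx = np.random.randint(0, 6, size=(h, w))
rows = np.random.randint(0, 640, size=(h, w))
cols = np.random.randint(0, 640, size=(h, w))

# One advanced-indexing expression gathers all pixels at once;
# no Python-level loop over the two million output pixels.
out = faces[face_idx, rows, cols]
print(out.shape)   # (1000, 2000, 3)
```

All the expensive per-pixel geometry lives in the precomputed tables, so the per-image cost is just this one gather.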
The performance improvement of my approach is significant. A naive implementation would take at least a minute for a single image.
Next Steps
I have more work to do.
- One problem with my image assembly code is that it does not allow me to change the pitch (angle up or down). Most of the time I won't want this, but adding the feature is an easy fix. There will be a performance cost, but only a few hundred milliseconds.
- Constructing single images is great, but I want to assemble a sequence of images into a video. This sounds easy but isn't. I need to get the correct set of panorama ids in the proper order. To accomplish this I plan on building a graph using Python's networkx library to do the necessary data cleaning and inspection.
- I still need to figure out how to parse the depth data. I don't need it now but I want that information accessible.
- I need to add more error handling and monitor API call limits.
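The graph idea for ordering panoramas can be sketched with networkx. The ids and neighbor links below are made up for illustration; in practice the links would come from the neighbor metadata already stored in the database.

```python
import networkx as nx

# Hypothetical neighbor links pulled from the panorama metadata:
# each pano id maps to the ids reported as adjacent.
links = {
    'pano_a': ['pano_b'],
    'pano_b': ['pano_a', 'pano_c'],
    'pano_c': ['pano_b', 'pano_d'],
    'pano_d': ['pano_c'],
}

G = nx.Graph()
for pano_id, neighbors in links.items():
    for n in neighbors:
        G.add_edge(pano_id, n)

# Ordering the panoramas for a video becomes a path query on the graph.
route = nx.shortest_path(G, 'pano_a', 'pano_d')
print(route)   # ['pano_a', 'pano_b', 'pano_c', 'pano_d']
```

A graph view also makes it easy to inspect dead ends, branches, and disconnected runs before committing to a video sequence.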
Other Problems
While doing the above work I learned some things about this data. This will shape what I can and cannot achieve.
- The Google Street View car seems to snap images once every 10 meters. My initial investigation shows this to be precise to within a millimeter or two. The precision is very helpful, but consider a video with a camera that moves 10 meters from one frame to the next. If the video has 30 frames per second, the camera will seem to move about 670 miles per hour. I can reduce that to maybe 15 frames a second, but that is still unusually fast for an automobile. Speed is going to have to be a part of the work.
- I'm at the mercy of the actual path the Google Street View car took when it recorded images. If Google's car drove down the entire length of Broadway in one continuous run, I can use all of those images for one continuous animation sequence. If Google's car turned off of Broadway and returned later, a video made from the same sequence of locations will appear to have a temporal break. This is something I'll have to deal with as I work with the data.
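The speed figure above is worth making explicit:

```python
# 10 meters per frame at 30 frames per second, converted to mph.
meters_per_frame = 10
fps = 30
mps = meters_per_frame * fps      # 300 meters per second
mph = mps * 3600 / 1609.344       # meters/second -> miles/hour
print(round(mph))                 # 671
```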
The best Google Street View data will come from highways in the Midwest or other locations with roads in wide open space, along with non-road data like Google's trips down the Amazon or the Colorado River. Other locations, like the interiors of historic buildings, might be useful, but I will have to use them differently.