Data Assembly

Milestone #2: Data Assembly

The second step of this project is to access the Google Street View data and organize it in a suitable format. In my project plan my target was to reach this goal by February 21st (last Wednesday). Although I have accomplished a lot, I have not finished everything I wanted for this milestone. I expect to complete it by next week at the latest.

Here's what I have achieved.

First, I can download all of the relevant data from Google. This includes all of the panorama image data and metadata. I can also access the panorama ids for the neighboring locations. All of the metadata is stored in a database.

In [1]:
%cd ..
/home/jim/Projects/ITP/pds
In [2]:
from common import db

data = db.dump_database()

data.head(1).T
Out[2]:
0
pano_id h43_C8RtuVS-eyaBTeDuLw
road_argb 2164127858
road_description Broadway
description 719 Broadway
lat 40.7293
lng -73.9935
pano_date 2017-09
elevation_egm96_m -14.6791
elevation_wgs84_m -14.6791
links [('4ElQmDQdNK49Kc_Mxjp1Tw', 31.38), ('6pXUO87O...
link_count 2
download_date 2018-02-27 23:31:03
job_name test
metadata_complete Y
tiles_complete Y

Below are the database table column names.

In [3]:
data.columns.tolist()
Out[3]:
['pano_id',
 'road_argb',
 'road_description',
 'description',
 'lat',
 'lng',
 'pano_date',
 'elevation_egm96_m',
 'elevation_wgs84_m',
 'links',
 'link_count',
 'download_date',
 'job_name',
 'metadata_complete',
 'tiles_complete']

The download code is written in such a way that the download job can be paused and resumed at a later date. This will be essential later if I hit the Google Street View API download limit of 25,000 requests per day.
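The pause/resume bookkeeping can be sketched roughly like this. The helper names are hypothetical, and the real code stores the completion flags in the database table shown above rather than in a list:

```python
def resume_job(rows, fetch_metadata, fetch_tiles, daily_limit=25000):
    """Process only incomplete rows, stopping at the daily request budget."""
    requests = 0
    for row in rows:
        if requests >= daily_limit:
            break                                  # pause; rerun tomorrow
        if row['metadata_complete'] != 'Y':
            fetch_metadata(row['pano_id'])         # 1 metadata request
            row['metadata_complete'] = 'Y'
            requests += 1
        if row['tiles_complete'] != 'Y':
            fetch_tiles(row['pano_id'])            # 6 image requests, one per direction
            row['tiles_complete'] = 'Y'
            requests += 6
    return requests

# toy stand-ins for database rows and the download helpers
rows = [{'pano_id': 'a', 'metadata_complete': 'Y', 'tiles_complete': 'N'},
        {'pano_id': 'b', 'metadata_complete': 'N', 'tiles_complete': 'N'}]
made = resume_job(rows, fetch_metadata=lambda p: None, fetch_tiles=lambda p: None)
```

Because completed work is marked as it happens, rerunning the job simply skips everything already done.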

The raw image files are stored in an image directory. The filenames contain the panorama id and the heading/pitch of the image. There are 6 images for each panorama id.
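A parser for that filename convention might look like the sketch below. I'm reading the listing as using an `n` separator to mark a negative angle, so `..._000n090.jpg` would be heading 0, pitch -90 (the downward-facing image); that interpretation is an assumption from the filenames alone.

```python
import re

def parse_filename(name):
    # pano_id, 3-digit heading, separator ('_' positive / 'n' negative), 3-digit pitch
    pano_id, heading, sign, pitch = re.match(
        r'(.+)_(\d{3})([_n])(\d{3})\.jpg$', name).groups()
    return pano_id, int(heading), -int(pitch) if sign == 'n' else int(pitch)

parse_filename('0G-deBj1AdAD4afV_n-ARQ_000n090.jpg')
# -> ('0G-deBj1AdAD4afV_n-ARQ', 0, -90)
```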

In [4]:
!ls -l data/images/test/ | head -6
total 18460
-rw-rw-r-- 1 jim jim 73127 Feb 27 23:32 0G-deBj1AdAD4afV_n-ARQ_000_000.jpg
-rw-rw-r-- 1 jim jim 72587 Feb 27 23:32 0G-deBj1AdAD4afV_n-ARQ_000_090.jpg
-rw-rw-r-- 1 jim jim 46300 Feb 27 23:32 0G-deBj1AdAD4afV_n-ARQ_000n090.jpg
-rw-rw-r-- 1 jim jim 66686 Feb 27 23:32 0G-deBj1AdAD4afV_n-ARQ_090_000.jpg
-rw-rw-r-- 1 jim jim 69896 Feb 27 23:32 0G-deBj1AdAD4afV_n-ARQ_180_000.jpg
ls: write error: Broken pipe

Here's one of those image files. The heading of this file is 0 degrees, meaning the camera is pointing directly north.

In [5]:
from IPython.display import Image

with open('data/images/test/0G-deBj1AdAD4afV_n-ARQ_000_000.jpg', 'rb') as f:
    img = f.read()

Image(img)
Out[5]:
City street with a few cars parked on the side and an apartment building in the background.

Observe that the image is 640x640 pixels.

I can also look directly east:

In [6]:
from IPython.display import Image

with open('data/images/test/0G-deBj1AdAD4afV_n-ARQ_090_000.jpg', 'rb') as f:
    img = f.read()

Image(img)
Out[6]:
City intersection with a few cars parked on the side and fresh and co on the corner. There are miscellaneous buildings in the background.

Or straight up:

In [7]:
from IPython.display import Image

with open('data/images/test/0G-deBj1AdAD4afV_n-ARQ_000_090.jpg', 'rb') as f:
    img = f.read()

Image(img)
Out[7]:
View looking up from a new york city intersection, showing clouds and buildings reaching for the sky.

And so on. The 6 directions are north, east, south, west, up, and down.

Of course I probably don't want to use these same directions for my project. That's why I wrote code that will assemble these 6 images into a single image with any orientation I want. Observe:

In [8]:
from sequencing import assembler

planar = assembler.Planar('test', 2000, 1000, 500, 60)

planar.generate('0G-deBj1AdAD4afV_n-ARQ')
Out[8]:
City street with a few cars parked on the side and an apartment building in the background.

My code allows me to change the heading to arbitrary values. Here's that same location turned to the left 60 degrees:

In [9]:
import numpy as np

planar.generate('0G-deBj1AdAD4afV_n-ARQ', heading=np.deg2rad(-60))
Out[9]:
Looking down a city street with a few cars parked on both sides and apartment buildings in the background.

Both of these pull pixels from multiple images and assemble them into one single image.
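The core of the planar assembly can be sketched like this. The names and math here are illustrative, not my actual code: build a unit view ray for every output pixel, rotate by the heading, and let each ray's dominant axis pick which cube face to sample from.

```python
import numpy as np

def ray_directions(width, height, fov_deg, heading_rad):
    """One unit-length 3D view ray per output pixel (x east, y north, z up)."""
    f = (width / 2) / np.tan(np.deg2rad(fov_deg) / 2)   # focal length in pixels
    xs, ys = np.meshgrid(np.arange(width) - width / 2,
                         height / 2 - np.arange(height))
    d = np.stack([xs, np.full(xs.shape, f), ys], axis=-1)
    c, s = np.cos(heading_rad), np.sin(heading_rad)
    rot = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])  # yaw about the z axis
    d = d @ rot.T
    return d / np.linalg.norm(d, axis=-1, keepdims=True)

rays = ray_directions(2000, 1000, 60, np.deg2rad(-60))
faces = np.abs(rays).argmax(axis=-1)   # dominant axis chooses the cube face
```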

This is a lot of functionality, but you might be wondering why I went through the trouble to code it this way. Why not download the image data for the orientations I want to use? Why download images for the fixed directions north, east, south, west, up, and down if I need to re-orient everything later?

The reason is that downloading data is slow, and I only want to download data for a location once. This approach allows me to decide later what orientations I want. I can manage data problems stemming from the Google Street View car's actual recording path or other data idiosyncrasies. I can make larger images with an aspect ratio other than 1:1, and I can change my mind about this as many times as I please.

And another reason is I wanted to do this:

In [10]:
equi = assembler.Equirectangular('test', 2000)

equi.generate('0G-deBj1AdAD4afV_n-ARQ', heading=np.deg2rad(-60))
Out[10]:
Equirectangular projection with sky at the top, the street below on the bottom, a view down a street in the center, and buildings on both sides of the road.

That's a valid equirectangular projection, similar to what a 360 degree camera will give. If I get a series of these in a sequence I can make a 360 video.
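The equirectangular version is conceptually simpler: every output pixel corresponds to a (longitude, latitude) pair on the sphere, which converts to a unit direction and gets sampled the same way. A rough sketch, again with illustrative names rather than my real code:

```python
import numpy as np

def equirect_directions(width, heading_rad=0.0):
    """Unit direction for every pixel of a 2:1 equirectangular image."""
    height = width // 2                  # 2:1 aspect covers the full sphere
    lon = np.linspace(-np.pi, np.pi, width, endpoint=False) + heading_rad
    lat = np.linspace(np.pi / 2, -np.pi / 2, height)
    lon, lat = np.meshgrid(lon, lat)
    return np.stack([np.cos(lat) * np.sin(lon),     # x: east
                     np.cos(lat) * np.cos(lon),     # y: north
                     np.sin(lat)], axis=-1)         # z: up

dirs = equirect_directions(2000, heading_rad=np.deg2rad(-60))
```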

And here's the best part of my code:

In [11]:
%time img = planar.generate('0G-deBj1AdAD4afV_n-ARQ', heading=np.deg2rad(-60))
CPU times: user 180 ms, sys: 0 ns, total: 180 ms
Wall time: 179 ms
In [12]:
%time img = equi.generate('0G-deBj1AdAD4afV_n-ARQ', heading=np.deg2rad(-60))
CPU times: user 247 ms, sys: 4.2 ms, total: 252 ms
Wall time: 250 ms

Both images can be assembled in a fraction of a second. That's fast!

The code to do all of this is modeled after similar code in my Camera3D library. The code is non-trivial, but since I had done it before, I had working code to model my Python implementation after. The basic idea is the same, except I had to use several layers of numpy's advanced indexing to apply the lookup tables.
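The lookup-table idea in miniature (toy data, with random indices standing in for the real projection math): compute once, for every output pixel, which source face, row, and column it reads from, and every subsequent assembly is a single vectorized gather.

```python
import numpy as np

rng = np.random.default_rng(0)
faces = rng.integers(0, 256, size=(6, 640, 640, 3), dtype=np.uint8)  # 6 cube faces

H, W = 1000, 2000
face_idx = rng.integers(0, 6, size=(H, W))     # precomputed lookup tables:
row_idx = rng.integers(0, 640, size=(H, W))    # source face, row, and column
col_idx = rng.integers(0, 640, size=(H, W))    # for every output pixel

out = faces[face_idx, row_idx, col_idx]        # one advanced-indexing gather
```

The projection math runs once, when the tables are built; assembling each new panorama is just the gather on the last line, which is why it takes only a fraction of a second.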

The performance improvement of my approach is significant. A naive implementation would take at least a minute for a single image.

Next Steps

I have more work to do.

  • One problem with my image assembly code is that it does not allow me to change the pitch (angle up or down). Most of the time I won't need this, but adding the feature is an easy fix. There will be a performance cost of only a few hundred milliseconds.
  • Constructing single images is great, but I want to assemble a sequence of images into a video. This sounds easy but isn't. I need to get the correct set of panorama ids in the proper order. To accomplish this I plan to build a graph of panorama links using Python's networkx library, which will also help with the necessary data cleaning and inspection.
  • I still need to figure out how to parse the depth data. I don't need it now but I want that information accessible.
  • I need to add more error handling and monitor API call limits.
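For the sequencing work in the second bullet above, the graph build might look roughly like this. The rows here are made up, standing in for the database dump, whose `links` column holds (pano_id, heading) pairs:

```python
import networkx as nx

rows = [{'pano_id': 'A', 'links': [('B', 31.4)]},
        {'pano_id': 'B', 'links': [('A', 211.4), ('C', 28.9)]},
        {'pano_id': 'C', 'links': [('B', 208.9)]}]

# nodes are panoramas; directed edges follow the neighbor links
g = nx.DiGraph()
for row in rows:
    for neighbor, heading in row['links']:
        g.add_edge(row['pano_id'], neighbor, heading=heading)

# a continuous animation sequence is then just a path through the graph
sequence = nx.shortest_path(g, 'A', 'C')
```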

Other Problems

While doing the above work I learned some things about this data. This will shape what I can and cannot achieve.

  • The Google Street View car seems to snap images once every 10 meters. My initial investigation shows this to be precise to within a millimeter or two. The precision is very helpful, but consider a video with a camera that moves 10 meters from one frame to the next. If the video has 30 frames per second, the camera will seem to move at 670 miles per hour. I can reduce that to maybe 15 frames per second, but it is still unusually fast for an automobile. Speed is going to have to be a part of the work.
  • I'm at the mercy of the actual path the Google Street View car took when it recorded images. If Google's car drove down the entire length of Broadway in one continuous run, I can use all of those images for one continuous animation sequence. If Google's car turned off of Broadway and returned later, a video made from the same sequence of locations will appear to have a temporal break. This is something I'll have to deal with as I work with the data.
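For reference, the apparent-speed arithmetic from the first bullet above:

```python
METERS_PER_FRAME = 10
MPS_TO_MPH = 3600 / 1609.344            # meters/second to miles/hour

for fps in (30, 15):
    print(f'{fps} fps -> {METERS_PER_FRAME * fps * MPS_TO_MPH:.0f} mph')
# 30 fps -> 671 mph
# 15 fps -> 336 mph
```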

The best Google Street View data for this project will be highways in the Midwest or other roads in wide open spaces, along with non-road data like Google's trips down the Amazon and the Colorado River. Other locations, like the interiors of historic buildings, might be useful, but I will have to use them differently.