Digital images can be tagged to record where they were photographed. This is done by storing the geographical latitude and longitude of the camera in the EXIF or XMP metadata of the image. The latitude and longitude may be supplied by a GPS receiver or added later by an application such as GeoSetter, in which the camera position is manually selected on a Google Maps display.
Geotagging with latitude and longitude enables applications to fly to the location of an image. In Adobe Lightroom, for example, clicking the arrow → on a GPS field in the Metadata panel brings up Google Maps in a browser page, with a marker at the GPS location:-
Latitude and longitude alone, however, do not fully capture the relationship of image content to the scene surrounding the camera. The image content obviously depends on the bearing, tilt, and roll of the camera, and the resulting view volume after zooming and cropping. These additional camera parameters can be specified in Google's PhotoOverlay KML element, but the required values are not easy to determine. In any case, this KML element is not part of the image metadata.
An alternative approach to image registration comes from the science of computer vision. The fundamental matrix describes the projective relationship between corresponding points in two different photographic images of the same subject. A fundamental matrix F is a 3 × 3 array of nine values. The mathematical relationship is shown below, where x1, y1 and x2, y2 are the coordinates of corresponding points in the first and second images, respectively:-
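The relationship is the standard epipolar constraint: the homogeneous vectors [x2 y2 1] and [x1 y1 1] satisfy [x2 y2 1] · F · [x1 y1 1]ᵀ = 0. A minimal NumPy sketch, with a made-up rank-2 matrix F chosen purely for illustration, constructs a correspondence and checks the constraint:

```python
import numpy as np

# Epipolar constraint: corresponding points x1 in the first image and x2 in
# the second satisfy  [x2 y2 1] . F . [x1 y1 1]^T = 0.
# The values of F are made up for illustration (any rank-2 matrix will do).
F = np.array([[ 0.0, -2.0,  1.0],
              [ 1.0,  0.0, -1.0],
              [-1.0,  4.0, -1.0]])

x1 = np.array([4.0, 5.0, 1.0])      # a homogeneous point in the first image

# F maps x1 to its epiline (a, b, c) in the second image: a*x + b*y + c = 0.
a, b, c = F @ x1

# Any point on that epiline is a valid correspondence; take the one at x = 10.
x2 = np.array([10.0, -(a * 10.0 + c) / b, 1.0])

residual = x2 @ F @ x1
print(residual)                     # zero, up to rounding
```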
The first and second images can have arbitrary coordinate systems. Calculation of the fundamental matrix requires no knowledge of either camera, such as the focal length of its lens or its tilt, so it does not depend on any existing image metadata.
For each point in one image, F determines a line in the other image on which the corresponding point must lie. All such lines in an image are called epilines; they all pass through a single point called the epipole. The epipole in each image is the projected position of the camera which photographed the other image. Both epipoles can readily be recovered from the values of F.
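The recovery is a null-space computation: F e1 = 0 gives the epipole in the first image and Fᵀ e2 = 0 gives the epipole in the second. A sketch, using a made-up F constructed (as [t]ₓ · H with t = (2, 1, 1)) so that the answers are known in advance:

```python
import numpy as np

# A fundamental matrix has rank 2, and its two null vectors are the epipoles:
#   F  e1 = 0  (epipole in the first image)
#   F^T e2 = 0  (epipole in the second image)
# This F is made up so the answers are known: by construction
# e1 = (1, 0.5) and e2 = (2, 1).
F = np.array([[ 0.0, -2.0,  1.0],
              [ 1.0,  0.0, -1.0],
              [-1.0,  4.0, -1.0]])

def epipoles(F):
    """Return both epipoles as inhomogeneous (x, y) coordinates."""
    # The singular vectors for the (near-)zero singular value span the
    # null spaces of F and of F transposed.
    U, s, Vt = np.linalg.svd(F)
    e1 = Vt[-1]        # right null vector:  F  e1 = 0
    e2 = U[:, -1]      # left  null vector:  F^T e2 = 0
    return e1[:2] / e1[2], e2[:2] / e2[2]

e1, e2 = epipoles(F)
print(e1, e2)          # e1 is approximately (1, 0.5); e2 is (2, 1)
```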
An in-depth technical description of epipolar geometry and the fundamental matrix is contained in chapter 9 of Multiple View Geometry in Computer Vision, by Richard Hartley and Andrew Zisserman.
A fundamental matrix relates pairs of images: the concept was originally introduced to analyse stereoscopic pairs. But we now have global satellite and aerial imagery, made readily available by Google Earth and Google Maps, to which any topographic image content can be related. The fundamental matrix can then relate the pixel coordinate system of a digital photograph to the World Geodetic System (WGS 84) used by Google Earth and Google Maps. We then call it a geodetic matrix, G, shown by:-
where λ is the longitude and φ is the latitude of a point in WGS 84; x and y are coordinates of a point in the pixel coordinate system of an image. The relationship only holds in the vicinity of the photograph, where the earth is effectively flat.
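Applying G is the same mechanics as applying F. A sketch, assuming the convention [λ φ 1] · G · [x y 1]ᵀ = 0 and a row-major reading of the nine stored values (here the values of the XMP example below): G maps a pixel to its epiline in WGS 84 coordinates.

```python
import numpy as np

# Geodetic matrix convention assumed here:  [lam phi 1] . G . [x y 1]^T = 0
# G is read row-major from the nine values of the XMP example in this article.
G = np.array([[ 1.7112808513742855e-05, -1.9510534820659577e-07, 0.01282266732236652],
              [-1.171387582434056e-05,   2.8483914997267486e-06, 0.08675795480824508],
              [-0.0029842750344773179,   0.0001259856117215182,  1.0]])

def geo_epiline(G, x, y):
    """Epiline a*lam + b*phi + c = 0 in WGS 84 for pixel (x, y)."""
    return G @ np.array([x, y, 1.0])

# All geographic points corresponding to pixel (1500, 1000) lie on this line:
a, b, c = geo_epiline(G, 1500.0, 1000.0)
phi = 51.5                       # pick a latitude on the line...
lam = -(b * phi + c) / a         # ...and recover the longitude there
```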
A geodetic matrix can be stored in the metadata of the image. It is easy to create a new tag and namespace in the XMP metadata, which was designed by Adobe to be extensible. Here is an example:-
<rdf:Description rdf:about='' xmlns:shortcipher='http://ns.shortcipher.com/1.0/'>
  <shortcipher:GeodeticMatrix>
    1.7112808513742855e-005, -1.9510534820659577e-007, 0.01282266732236652,
    -1.171387582434056e-005, 2.8483914997267486e-006, 0.08675795480824508,
    -0.0029842750344773179, 0.0001259856117215182, 1.0
  </shortcipher:GeodeticMatrix>
</rdf:Description>
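An application can read the matrix back with a standard XML parser. A sketch, assuming the element is named shortcipher:GeodeticMatrix (XML element names cannot contain spaces) and the nine comma-separated values are stored row-major:

```python
import xml.etree.ElementTree as ET
import numpy as np

# A cut-down XMP packet holding the geodetic matrix, as in the example above.
xmp = """<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
 <rdf:Description rdf:about='' xmlns:shortcipher='http://ns.shortcipher.com/1.0/'>
  <shortcipher:GeodeticMatrix>
   1.7112808513742855e-005, -1.9510534820659577e-007, 0.01282266732236652,
   -1.171387582434056e-005, 2.8483914997267486e-006, 0.08675795480824508,
   -0.0029842750344773179, 0.0001259856117215182, 1.0
  </shortcipher:GeodeticMatrix>
 </rdf:Description>
</rdf:RDF>"""

ns = {'shortcipher': 'http://ns.shortcipher.com/1.0/'}
text = ET.fromstring(xmp).find('.//shortcipher:GeodeticMatrix', ns).text
G = np.array([float(v) for v in text.split(',')]).reshape(3, 3)
```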
As the satellite or aerial camera is very high above the ground, the geographical epipole, which is the projected position of the other camera, is effectively at the longitude and latitude of the camera location. The geodetic matrix can also be used to obtain epilines corresponding to the four corners of the image: these show the projection on the ground of the camera's view volume. The epiline for the centre of the image effectively gives the camera's bearing. An application can therefore use the geodetic matrix in an image's metadata to fly to the camera location in Google Earth, display the projected view volume (blue lines), and rotate the Google Earth display to make the viewing direction upright. This is shown below:-
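Recovering the camera location from G follows the same null-space idea as for F. A sketch, again assuming the convention [λ φ 1] · G · [x y 1]ᵀ = 0 and row-major values from the XMP example: the geographical epipole lies on every epiline, so Gᵀ e = 0, and an SVD gives the best such e.

```python
import numpy as np

# The geographical epipole e = (lam, phi, 1) lies on every epiline
# G [x y 1]^T, so G^T e = 0: e is the left singular vector of G with the
# smallest singular value.  G is row-major from the XMP example; the
# convention [lam phi 1] . G . [x y 1]^T = 0 is assumed.
G = np.array([[ 1.7112808513742855e-05, -1.9510534820659577e-07, 0.01282266732236652],
              [-1.171387582434056e-05,   2.8483914997267486e-06, 0.08675795480824508],
              [-0.0029842750344773179,   0.0001259856117215182,  1.0]])

U, s, Vt = np.linalg.svd(G)
e = U[:, -1]                 # best left null vector: G^T e is as small as possible
lam, phi = e[:2] / e[2]      # effectively the camera's longitude and latitude
```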
Furthermore, the epiline for any point in the image can readily be obtained. In Google Earth, this is displayed as a magenta line showing the line of sight from the camera to that point, which helps to find its geographical location:-
The application had selected a point (under a box cursor □) on the front corner of the bridge tower in this photograph:-
The calculation of G can be performed using the pixel coordinates of seven points in the image, and their corresponding geographical coordinates. An application can obtain these coordinates by mouse clicking on the points in a display of the image and in a Google Earth view of the image location. Because of Google Earth's bird's eye view, suitable points are often found on rooftops. The Google Earth view may be slightly oblique, showing the top of a building, for example, slightly displaced from its base. The geodetic matrix takes this into account, as long as the corresponding image point is chosen at the same height. The selection of seven pairs of corresponding points is shown in this example:-
The blue epilines in both views pass right through the selected points, as an exact solution for G can be found using the 7-point algorithm. It is not possible to specify the position of points with perfect accuracy, however, especially with lower resolution satellite imagery. As a result, this solution may not be accurate for all points in the photograph. Additional pairs of points may be specified, and the 8-point algorithm used to obtain a better solution which minimises discrepancies between the points and the epilines on which they should lie.
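The least-squares step can be sketched as a linear solve in the spirit of the 8-point algorithm. Expanding [λ φ 1] · G · [x y 1]ᵀ = 0 makes each point pair one row of a matrix A, and the singular vector of A with the smallest singular value minimises |A g|. The matrix G_true and the point pairs below are made up (and not geographically realistic) purely so that the recovery can be checked; a real application would use clicked coordinates, and Hartley's coordinate normalisation, omitted here for brevity, improves numerical conditioning.

```python
import numpy as np

def estimate_G(pixels, geos):
    """Least-squares geodetic matrix from n >= 8 pairs (x, y) <-> (lam, phi)."""
    # Each pair gives one row of A, from expanding [lam phi 1].G.[x y 1]^T = 0.
    A = np.array([[lam*x, lam*y, lam, phi*x, phi*y, phi, x, y, 1.0]
                  for (x, y), (lam, phi) in zip(pixels, geos)])
    # The singular vector with the smallest singular value minimises |A g|.
    _, _, Vt = np.linalg.svd(A)
    G = Vt[-1].reshape(3, 3)
    return G / G[2, 2]                    # fix the arbitrary overall scale

# Synthetic, made-up ground truth (rank 2, like a fundamental matrix):
G_true = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0],
                   [7.0, 8.0, 9.0]])

# For each made-up pixel, pick a geographic point on its epiline under G_true.
pixels = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2), (3, 2), (2, 3), (4, 1)]
geos = []
for i, (x, y) in enumerate(pixels):
    a, b, c = G_true @ np.array([x, y, 1.0])
    phi = 0.5 * i - 2.0                   # choose a latitude on the epiline...
    lam = -(b * phi + c) / a              # ...and the longitude there
    geos.append((lam, phi))

G_est = estimate_G(pixels, geos)          # recovers G_true up to scale
```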
In the photograph, the epilines are almost vertical and parallel, because the aerial camera is high in the sky.
An application program Vicinity, written in Python, demonstrates the geodetic matrix concept:-
© Christopher B. Jones, July 2009