Spatial Footprint Visualization
Each of the items in a geospatial digital library or geographic
information system has associated with it a subset of the Earth's
surface that represents the item's spatial coverage or spatial
relevance [1]. We'll refer to this subset
as the item's footprint. Visualizing footprints against a
background map is a useful way to contextualize and evaluate the
associated items, especially the items belonging to a query result
set. And in the specific case of a query result set, visualization
can also yield information about the query itself: for example,
whether or not the query was underspecified, and if so, how it might
be further constrained.
Visualizing footprints is relatively easy if all footprints are
points:
- Points can be rendered literally or iconically.
- Visual point representations are small, and are not as prone to
overlap as other shapes are.
- When there are large numbers of points, relatively simple
techniques exist for clustering points and visualizing the density of
points.
- Points remain points regardless of map scale.
- Sounds silly to say, but where no point appears, there is no
point. I.e., where no rendered point appears, there is no
footprint.
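As an illustration of the "relatively simple techniques" for clustering points, here is a minimal sketch of grid-based binning. It is one possible approach, not the method of any particular system; the function names, the `(lon, lat)` tuple layout, and the `cell_deg` parameter are assumptions made for this example.

```python
import math
from collections import defaultdict

def cluster_points(points, cell_deg):
    """Bin (lon, lat) points into square cells that are cell_deg degrees wide.

    Returns a dict mapping a (column, row) cell index to the points in that cell.
    """
    cells = defaultdict(list)
    for lon, lat in points:
        key = (math.floor(lon / cell_deg), math.floor(lat / cell_deg))
        cells[key].append((lon, lat))
    return dict(cells)

def cluster_summary(cells):
    """Return one (centroid_lon, centroid_lat, count) tuple per non-empty cell."""
    summary = []
    for members in cells.values():
        n = len(members)
        lon = sum(p[0] for p in members) / n
        lat = sum(p[1] for p in members) / n
        summary.append((lon, lat, n))
    return summary
```

Each summary tuple could be drawn as a single icon whose size or label reflects the count, and choosing `cell_deg` as a function of map scale would keep the number of icons on screen roughly constant.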
However, visualizing shapes that have areal extent introduces
several complications (for this discussion we'll focus on boxes, but
the same complications arise with polygons, circles, etc.):
- Boxes are larger, and box overlap and occlusion tend to be a
significant problem.
- It is possible, but more difficult, to cluster boxes and visualize
box density.
- Box appearance varies depending on the map scale. At extreme
small scales, boxes degenerate into points.
- Where no box appears, there may nevertheless be a box. I.e., if
boxes are rendered by drawing box edges only, it may not be apparent
whether any given point on the background map is contained within a
footprint or not. In the extreme case, boxes can disappear entirely
because their edges are entirely outside the map's field of view.
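One way to make box density visible, sketched here under simplifying assumptions, is to sample the map on a regular grid and count how many boxes contain each sample point. Boxes are assumed to be `(west, south, east, north)` tuples in degrees that do not cross the antimeridian; the function names and the grid resolution parameters are hypothetical.

```python
def contains(box, lon, lat):
    """True if the point (lon, lat) lies inside or on the boundary of box."""
    west, south, east, north = box
    return west <= lon <= east and south <= lat <= north

def coverage_grid(boxes, view, cols, rows):
    """Count, for the centre of each grid cell in view, the boxes containing it.

    Returns a list of rows (southernmost first), each a list of counts.
    """
    vw, vs, ve, vn = view
    dx = (ve - vw) / cols
    dy = (vn - vs) / rows
    grid = []
    for r in range(rows):
        lat = vs + (r + 0.5) * dy
        row = []
        for c in range(cols):
            lon = vw + (c + 0.5) * dx
            row.append(sum(contains(b, lon, lat) for b in boxes))
        grid.append(row)
    return grid
```

A box whose edges all lie outside the view still contributes to every count, so shading the map by these counts would reveal footprints that an edges-only rendering hides.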
As of this writing, few spatial query systems support visualization
of query result sets. Most systems (Geospatial One-Stop and the Geography Network are
representative) support map-based, spatial search, but revert to
simple linear textual listings when presenting results. Those systems
that do support visualization of spatial result sets (Google Local, Yahoo! Local, and MSN are recent examples) support
visualization of point footprints only.
The Alexandria Digital Library's default webclient
supports visualization of boxes. Currently it can display only one
result footprint at a time, a soon-to-be-removed limitation of the
third-party map software it is bundled with. Improved map software
will allow multiple footprints to be displayed simultaneously, raising
the issue of how query result sets can best be visualized.
To evaluate different footprint visualization techniques on real
data, we have developed a test collection of result sets that
illustrates the wide variety of result sets users encounter in
practice and the kinds of visualization challenges such result sets
pose. The collection was generated by randomly selecting 100 queries
from the ADL log files (among queries that yielded at least one
result, that is), categorizing the associated result sets, and
hand-selecting the following 11 representatives. Footprints are
visualized here as red rectangles [2].
-
Ideal.
Here, the query "online images containing phrase 'simi valley'"
yielded 8 results having 8 homogeneous, clustered, yet clearly
distinct footprints. This is the simplest and ideal case; if only it
were always this easy.
Query
Results
-
Duplicates.
A ubiquitous problem is that footprints are not unique; not only do
they overlap and occlude one another, they are often completely
coincident. For example, this query ("cartographic works overlapping
a query region and containing words 'new haven'"; the query region is
indicated here in green) yielded 38 results but only 7 distinct
footprints.
Query
Results
-
Inside.
In queries that use the "within" spatial operator, the query region
often provides good context for interpreting the results.
Query
Results
-
Outside.
The "overlaps" and "within" spatial operators account for
approximately 80% and 16% of spatial queries, respectively; the
"contains" operator (as in, "find items that completely contain the
query region") is used only 4% of the time. When it is, it often
provides very poor context for interpreting the results, as this
example shows.
Query
Results
-
Range of sizes.
A result set can hold footprints of wildly different sizes. Here, the
query "containing phrase 'point dume'" yielded the cluster of 22
results shown on the right, along with another 6 (coincident) results
whose footprints are over 1,000× larger in terms of area.
Query
Results
-
High density.
Result set density varies considerably. Here, a relatively
underspecified query ("digital items overlapping a query region")
yielded 244 unique footprints. Given the query limit of 250 results,
a reasonable inference the user could draw from this visualization is
that there is plenty of data distributed over the query region, and
some additional criteria must be supplied.
Query
Results
-
Medium density.
Query
Results
-
Low density.
This query ("online, containing word 'oil'") yielded a very low
density result set. The footprints are so small in relation to their
spatial distribution that iconic representation might be more
appropriate in this case.
Query
Results
-
Telescope effect.
An "overlaps" spatial query (the tiny green dot) produced this classic
result set. The results span the spectrum from the specific (a result
with a footprint slightly larger than the query region; approximately
1 deg²) to the most general possible (a political map of
the world; 64,800 deg²). Given the relative specificity of
the query in this case, the user would likely find most of these
results to be irrelevant, and thus the whole-world view of the result
set that we've shown here is less than desirable. This is a
frequently occurring pattern.
Query
Results
-
Tiled.
The tiling of this result set is due to the tiled nature of the
underlying collection.
Query
Results
-
Extreme.
A result set that is extreme in terms of the number of footprints (250
results; 218 unique footprints), distribution of footprints
(worldwide), density of footprints (it's not apparent in this
visualization, but the density is high in the Southern California
region), and range of footprint sizes (the ratio of the areas of the
largest and smallest footprints is
1.8 × 10⁶). The query in this case
("online maps") was overly general, and so the goal of visualizing the
result set should be to communicate that fact and to indicate how the
user might profitably refine the query.
Query
Results
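Several of the characteristics used above (number of distinct footprints, degree of duplication, ratio of largest to smallest area) can be computed directly from a result set. The following is a minimal sketch, assuming footprints are `(west, south, east, north)` tuples in degrees, that coincident footprints compare exactly equal, and that area is measured in square degrees as in the text (so the whole world is 360 × 180 = 64,800 deg²). The function names are hypothetical.

```python
from collections import Counter

def area_deg2(box):
    """Area of a box in square degrees (planar, not true surface area)."""
    west, south, east, north = box
    return (east - west) * (north - south)

def summarize(footprints):
    """Summarize a result set given as a list of box tuples."""
    counts = Counter(footprints)  # coincident footprints collapse to one key
    positive = [a for a in (area_deg2(b) for b in counts) if a > 0]
    return {
        "results": len(footprints),
        "distinct": len(counts),
        "max_duplicates": max(counts.values(), default=0),
        # Ratio of largest to smallest non-degenerate footprint area.
        "area_ratio": max(positive) / min(positive) if positive else None,
    }
```

A large gap between `results` and `distinct` signals the duplicates problem, and a large `area_ratio` signals that a single map scale is unlikely to show every footprint well.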
Typically, icons are used in visualization when literal
representations would be too small or too dense. The Acme GeoRSS Map
Viewer gives some nice examples of, and an algorithm for,
automatically clustering point footprints and visualizing footprint
clusters with icons.
Here's an idea for using icons for the opposite reason:
for visualizing items that are too large. The figure to the
right is result set 9 above ("telescope effect"), but zoomed in so
that (the edges of) some of the larger footprints are no longer
visible. The existence of these footprints has instead been indicated
by arrow icons on the right side of the map. If the figure were an
active map, clicking on the icons would presumably zoom the map out to
reveal the selected footprint.
There are obviously many variants and refinements of this core
idea. Icons could represent distinct footprints, or they could
represent fixed, larger zoom levels at which more footprints are
visible. A rule is needed to determine when a footprint is visualized
literally versus iconically. One possibility is to visualize a
footprint literally whenever at least two of the footprint's edges are
at least partially visible (i.e., at least one corner or two parallel
edges). In any case, the general idea is to use icons to indicate the
existence of items that would not otherwise be apparent.
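The "at least two edges at least partially visible" rule can be sketched as follows. This is only an illustration of that one possibility: boxes and the map view are assumed to be `(west, south, east, north)` tuples that do not cross the antimeridian, and the function names are hypothetical.

```python
def overlaps_1d(a_lo, a_hi, b_lo, b_hi):
    """True if the closed intervals [a_lo, a_hi] and [b_lo, b_hi] intersect."""
    return a_lo <= b_hi and b_lo <= a_hi

def visible_edges(box, view):
    """Count the edges of box that are at least partially inside view."""
    bw, bs, be, bn = box
    vw, vs, ve, vn = view
    count = 0
    # West and east edges: fixed longitude, spanning the box's latitude range.
    for x in (bw, be):
        if vw <= x <= ve and overlaps_1d(bs, bn, vs, vn):
            count += 1
    # South and north edges: fixed latitude, spanning the box's longitude range.
    for y in (bs, bn):
        if vs <= y <= vn and overlaps_1d(bw, be, vw, ve):
            count += 1
    return count

def render_mode(box, view):
    """Return 'literal' if two or more edges are visible, otherwise 'icon'."""
    return "literal" if visible_edges(box, view) >= 2 else "icon"
```

For example, with a view of `(0, 0, 10, 10)`, a box that encloses the whole view has no visible edges and is rendered as an icon, while a box with one corner or two parallel edges in view is rendered literally.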
There doesn't appear to be a lot of previous work in this area.
The closest is the MetaViz project [3],
which experimented with a number of techniques for visualizing
geographic metadata. This work did not really address the issues of
footprint occlusion and disambiguation and footprint disappearance as
a function of map zoom. Nevertheless, the techniques are intriguing,
and it would be nice to apply and extend this work to the real world
result sets given above.
A little farther afield is the University of Maryland's work on Generalized Query
Previews.
[1] There are some subtleties in defining
spatial relevance that we're ignoring here. See: Douglas R. Caldwell,
Unlocking
the Mysteries of the Bounding Box, Coordinates: Online Journal
of the Map and Geography Round Table, ser. A, no. 2 (American
Library Association; August 29, 2005).
[2] Background maps and rendering courtesy
of Google Maps.
[3] Volker Jung, MetaViz: Visual
Interaction with Geospatial Digital Libraries, International
Computer Science Institute (Berkeley, CA) technical report TR-99-017
(October 1999).
Greg Janée
Created: 2006-01-24
Last modified: 2008-02-28 10:44