Simple Geometry Language
This page describes some incomplete work undertaken in 2003 to
define a standard, XML-based language for describing geographic
regions, i.e., geometric regions on the Earth's surface. For brevity,
we'll call any such language a geometry language. A geometry
language defines a set of possible shapes and standard representations
and encodings of those shapes, and also addresses the handling of
cartographic quantities (Earth datums, projections, and coordinate
systems), either by mandating standard quantities or by providing
standard declaration mechanisms.
The motivation for a standard geometry language is rooted in the
observation that every system/service/effort that has had to deal with
geographic regions has ended up defining its own geometry language.
All these languages have broadly similar capabilities to varying
degrees, yet all have enough idiosyncrasies to bedevil easy
interoperability. It is instructive to compare and contrast the
geometry languages embedded in specifications such as:
- FGDC Content Standard for Digital Geospatial Metadata
(see "Spatial Domain," p. 5, and "Latitude and Longitude," p. x)
- ADN (ADEPT/DLESE/NASA) metadata framework
(see Geospatial overview and geospatial.xsd)
- ADL Middleware Server
(see elements <spatial-value> in ADL-bucket-report.dtd
and <spatial-constraint> in ADL-query.dtd)
- ADL Gazetteer Protocol
(see Reports and Query language and elements <bounding-box> and
<footprints> in gazetteer-standard-report.xsd
and <footprint-query> in gazetteer-query.xsd)
- Dublin Core
(see DCMI Point and DCMI Box encoding schemes)
- OpenGIS Simple Features Specification for SQL
(see §2.1, "Geometry Object Model," p. 2-1, and §3.2.5, "SQL
Textual Representation of Geometry," p. 3-11)
- IBM Informix Geodetic DataBlade Module
(see §3, "Geodetic Data Types," p. 3-1)
- MapInfo SpatialWare for SQL Server
(see §8, "Constructing Geometry," p. 62)
- PostgreSQL
(see Geometric Types)
- OpenGIS Web Feature Service Implementation Specification
(see §13.3.3, "Bounding box," p. 61)
- OpenGIS Web Map Service
(see §6.7.4, "Bounding boxes," p. 11, and §7.2.4.6.8,
"BoundingBox," p. 20)
- GCMD Directory Interchange Format (DIF)
(see Spatial Coverage)
(A number of additional geometry languages are derived from one or
more of the above.) A standard geometry language would facilitate
interoperability across different systems, particularly among
consumers of geographic regions such as renderers and spatial
indexers.
From the perspective of distributed geospatial digital libraries
and distributed gazetteer services, which use geometry only for the
limited purposes of representing object footprints and query regions
and performing spatial comparisons between the two, a geometry
language must satisfy three requirements:
1. The language must support enough possible shapes, and complex
enough shapes, so that spatial matching over those shapes yields
acceptable search precision. For gazetteers a sufficient set of
shapes is not known, but necessary shapes include points for point
features such as water wells, polylines for linear features such as
rivers, and at least simple polygons for areal features.
2. The spatial reference system (SRS) in which shapes are defined
(i.e., the Earth datum and coordinate system) must not be mandated by
the language, but must be declarable in a standard way. Mandating a
particular SRS forces language users to translate SRSs, which can be
mathematically complex and can introduce unintended consequences such
as formation of aggregate shapes.
3. The language must provide a lingua franca that virtually all
geometry producers and consumers can operate on; in practice, due to
simplicity of implementation, ease of mappability, and general
widespread support, the lingua franca is latitude/longitude-aligned
minimum bounding rectangles, or bounding boxes for
short. This requirement carries three conditions:
(i) Notwithstanding requirement 2 above, to support interoperability,
bounding boxes must be defined in a standard SRS, e.g., WGS84
latitude/longitude coordinates. (It is reasonably easy to compute
such bounding boxes from commonly used cylindrical and polar
projections.)
(ii) In principle, bounding boxes are deterministically computable from
primary shapes; nevertheless, bounding boxes must explicitly accompany
all primary shapes in instance documents. To fail in this regard is
to place the burden of computing bounding boxes on the very geometry
consumers that are incapable of doing so: those that rely on bounding
boxes because they're incapable of operating on more complex
shapes.
(iii) Bounding boxes must be defined in a manner that supports geodetic
continuity, that is, in a manner that recognizes that the Earth is,
topologically, a sphere. In particular, there must be no
discontinuity that bounding boxes are not allowed to cross such as, in
many geometry languages, the ±180° meridian.
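To make the geodetic continuity condition above concrete, here is a minimal Python sketch of a bounding box with explicitly labeled boundaries and an overlap test that works across the ±180° meridian. The class name, field names, and sample coordinates are our own illustrative assumptions and are not part of any of the specifications listed above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class BoundingBox:
    """Box with explicitly labeled boundaries, in degrees.

    `west` may be numerically greater than `east`; that is how a box
    crossing the ±180° meridian is expressed (an assumed convention).
    """
    south: float
    north: float
    west: float
    east: float

    def crosses_antimeridian(self) -> bool:
        return self.west > self.east

    def lon_intervals(self) -> List[Tuple[float, float]]:
        """Split the longitude extent into one or two ordinary intervals."""
        if self.crosses_antimeridian():
            return [(self.west, 180.0), (-180.0, self.east)]
        return [(self.west, self.east)]

    def overlaps(self, other: "BoundingBox") -> bool:
        """True if the two boxes share at least one point."""
        if self.north < other.south or other.north < self.south:
            return False
        return any(
            a_lo <= b_hi and b_lo <= a_hi
            for a_lo, a_hi in self.lon_intervals()
            for b_lo, b_hi in other.lon_intervals()
        )


# Hypothetical boxes, chosen only to exercise the crossing case.
box_a = BoundingBox(south=-10.0, north=10.0, west=170.0, east=-170.0)
box_b = BoundingBox(south=-5.0, north=5.0, west=-175.0, east=-160.0)
box_c = BoundingBox(south=-5.0, north=5.0, west=0.0, east=10.0)

print(box_a.overlaps(box_b), box_a.overlaps(box_c))
```

Because the west and east boundaries are labeled rather than inferred from numeric order, a consumer needs only one extra branch to handle boxes that wrap around the sphere.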
The Open GIS Consortium's Geography
Markup Language (GML), version 3.0, is one well-known attempt to
define a standard geometry language. It is a comprehensive
specification having many desirable characteristics, but it suffers
from two defects that are shared by many of the aforementioned
geometry languages. First, in balancing the concerns of consumers of
the language, who generally prefer uniformity and simplicity, versus
producers, who generally prefer expressiveness and flexibility, GML
weighs heavily in favor of producers. It defines many, many possible
shapes and shape-related options. The effect of this imbalance is
that, in practice, each consumer accepts only an idiosyncratic
fraction of the entire GML language. The second defect
is that GML does not meet any of the conditions of requirement 3
above.
The XML schema below represents a first effort at defining a
geometry language that addresses these concerns. The language is a
profile of GML, that is, a subset and logical restriction of GML such
that any instance document that adheres to the language below also
adheres to GML and can be interpreted by any GML consumer.
This deliberately simple geometry language supports just three
possible shapes: points, polylines ("linestrings" in GML parlance),
and simple (i.e., self-intersection-free and hole-free) polygons.
Each shape is represented in the language by both an XML schema type
(e.g., PolygonType) and an XML element (e.g., <Polygon>). However,
the intention of the language is that only schema type
AbstractFeatureType be referenced by application schemas; this usage
forces a bounding box to be associated with every shape in instance
documents. SRSs can be declared using the srsName attribute.
ADL-geometry.xsd
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:gml="http://www.opengis.net/gml"
        targetNamespace="http://www.opengis.net/gml"
        elementFormDefault="qualified">
  <element name="coordinates" type="string"/>
  <!-- needed only by ADL-geometry-extended.xsd -->
  <element name="radius">
    <complexType>
      <simpleContent>
        <extension base="double">
          <attribute name="uom" type="anyURI" use="required"/>
        </extension>
      </simpleContent>
    </complexType>
  </element>
  <complexType name="AbstractGeometryType" abstract="true">
    <attribute name="srsName" type="anyURI"/>
  </complexType>
  <element name="_Geometry" type="gml:AbstractGeometryType"/>
  <complexType name="PointType">
    <complexContent>
      <extension base="gml:AbstractGeometryType">
        <sequence>
          <element ref="gml:coordinates"/>
        </sequence>
      </extension>
    </complexContent>
  </complexType>
  <element name="Point" type="gml:PointType"
           substitutionGroup="gml:_Geometry"/>
  <complexType name="LineStringType">
    <complexContent>
      <extension base="gml:AbstractGeometryType">
        <sequence>
          <element ref="gml:coordinates"/>
        </sequence>
      </extension>
    </complexContent>
  </complexType>
  <element name="LineString" type="gml:LineStringType"
           substitutionGroup="gml:_Geometry"/>
  <complexType name="PolygonType">
    <complexContent>
      <extension base="gml:AbstractGeometryType">
        <sequence>
          <element name="exterior">
            <complexType>
              <sequence>
                <element name="LinearRing">
                  <complexType>
                    <sequence>
                      <element ref="gml:coordinates"/>
                    </sequence>
                  </complexType>
                </element>
              </sequence>
            </complexType>
          </element>
        </sequence>
      </extension>
    </complexContent>
  </complexType>
  <element name="Polygon" type="gml:PolygonType"
           substitutionGroup="gml:_Geometry"/>
  <complexType name="AbstractFeatureType" abstract="true">
    <sequence>
      <element name="boundedBy">
        <complexType>
          <sequence>
            <element name="Envelope">
              <complexType>
                <sequence>
                  <element ref="gml:coordinates"/>
                </sequence>
              </complexType>
            </element>
          </sequence>
        </complexType>
      </element>
      <element name="location">
        <complexType>
          <sequence>
            <element ref="gml:_Geometry"/>
          </sequence>
        </complexType>
      </element>
    </sequence>
  </complexType>
</schema>
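As an illustration of what an instance of this schema might look like, the following Python sketch builds a polygon footprint together with its mandatory bounding box. Several things here are assumptions rather than part of the work described: the element name my-element stands in for an application element derived from AbstractFeatureType, the srsName URI is a made-up placeholder, the coordinate values are invented, and the coordinate string is assumed to use commas within a tuple and spaces between tuples. The envelope is computed naively from minimum and maximum values, which is only appropriate for shapes that do not cross a discontinuity such as the ±180° meridian.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"
ET.register_namespace("gml", GML_NS)


def gml(tag):
    """Return the ElementTree name for a tag in the GML namespace."""
    return "{%s}%s" % (GML_NS, tag)


def encode(points):
    """Encode (x, y) pairs as 'x,y x,y ...' (assumed separators)."""
    return " ".join("%s,%s" % (x, y) for x, y in points)


def build_footprint(element_name, ring, srs_name):
    """Build an element shaped like AbstractFeatureType: boundedBy, then location."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    root = ET.Element(element_name)

    # Bounding box first, as two opposite corners (naive min/max).
    bounded_by = ET.SubElement(root, gml("boundedBy"))
    envelope = ET.SubElement(bounded_by, gml("Envelope"))
    ET.SubElement(envelope, gml("coordinates")).text = encode(
        [(min(xs), min(ys)), (max(xs), max(ys))])

    # Then the primary shape, here a hole-free polygon.
    location = ET.SubElement(root, gml("location"))
    polygon = ET.SubElement(location, gml("Polygon"), {"srsName": srs_name})
    exterior = ET.SubElement(polygon, gml("exterior"))
    linear_ring = ET.SubElement(exterior, gml("LinearRing"))
    ET.SubElement(linear_ring, gml("coordinates")).text = encode(ring)
    return root


# Invented ring, closed by repeating the first vertex.
ring = [(-120.5, 34.0), (-119.5, 34.0), (-119.5, 35.0),
        (-120.5, 35.0), (-120.5, 34.0)]
footprint = build_footprint("my-element", ring, "urn:example:srs:hypothetical")
print(ET.tostring(footprint, encoding="unicode"))
```

A consumer that understands only bounding boxes can read the boundedBy child and ignore the location child entirely.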
The above geometry language, expressed as a profile of GML, has a
number of nice properties, not the least of which is that it weeds out
99% of the 600-plus-page GML specification. However, there are a
number of serious deficiencies which are still unresolved:
- To satisfy requirement 3(i), the bounding box SRS must be
standardized to, for example, WGS84 latitude/longitude coordinates.
If the above language were an independent specification, such a
requirement could be stated as part of the specification itself; an
explicit declaration of the SRS need not be present in instance
documents or even in the schema. But to avoid ambiguity as a profile
of GML, all SRSs must be made explicit, and GML makes this possible by
allowing an srsName attribute to be placed on the <Envelope> element.
Unfortunately, at the time of this work, there appeared to be no
standard means of referring to SRSs.
- In GML, an <Envelope> element "defines an extent using a pair of
positions defining opposite corners," that is, using a pair of
minimum and maximum coordinate values. A consequence of being defined
this way, as opposed to being defined in terms of explicitly labeled
east and west boundaries, is that it is not possible to describe a
bounding box that crosses the ±180° meridian (or other
discontinuity).
If east/west bounding coordinates are mapped to minimum/maximum
coordinates according to their values, then a bounding box such as
Russia's will be misinterpreted (its east bounding coordinate, being
less than its west, will be considered the minimum coordinate value),
with the result that the GML envelope will describe the longitudinal
complement of the desired bounding box. Always mapping the west
(east) bounding coordinate to the minimum (maximum) coordinate value,
even when west is numerically greater than east, would solve the
problem (this is effectively equivalent to explicitly labeling the
east and west boundaries), but the GML specification gives no
indication that this is admissible or that SRSs may employ such
modular arithmetic.
It seems that the only unambiguous and correct method of encoding a
bounding box that crosses the ±180° meridian is to convert the
bounding box to a whole-world band. But this loss of shape fidelity
results in many false positives by spatial search engines and is
unacceptable.
- The language is both clumsy and misleading. It is clumsy because,
to satisfy requirement 3(ii), application schemas that use the
language must restrict their use to the AbstractFeatureType type.
Thus, instead of being able to say
  <element name="my-element" type="gml:PolygonType"/>
the application schema must say
  <element name="my-element">
    <complexType>
      <complexContent>
        <extension base="gml:AbstractFeatureType"/>
      </complexContent>
    </complexType>
  </element>
But notice in the above that the ability to restrict the possible
shapes <my-element> can take on to, say, polygons has been lost.
The language is misleading because declarations such as PolygonType
and <Polygon> are publicly visible, and application schemas will
naturally assume that they can be directly referenced.
An alternative approach would be to abandon AbstractFeatureType
altogether and use GML's <metaDataProperty> element for storing
associated bounding boxes. In this approach, an application schema
could say
  <element name="my-element" type="gml:PolygonType"/>
and an instance document would look like
  <my-element>
    <gml:metaDataProperty>
      <gml:GenericMetaData>
        <gml:boundedBy>
          <gml:Envelope>
            <gml:coordinates>...</gml:coordinates>
          </gml:Envelope>
        </gml:boundedBy>
      </gml:GenericMetaData>
    </gml:metaDataProperty>
    <gml:exterior>
      ...
    </gml:exterior>
  </my-element>
Whether the
language should support aggregate shapes (i.e., sets of disjoint
shapes treated as first-order shapes)—and if so, which
kinds—is an open question. On the one hand, aggregate shapes
are desirable because they can offer vastly greater fidelity to true
region shapes: consider the footprint of the United States described
as an aggregate of three shapes (contiguous 48 states; Alaska; Hawaii)
versus as a convex hull or bounding box of those shapes. On the other
hand, aggregate shapes bring concomitantly large increases in
interface and implementation complexity. Then again, if the language
were to support aggregates, consumers would always have the option of
falling back to bounding boxes.
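The envelope ambiguity described among the deficiencies above can be made concrete with a small Python sketch. It compares three treatments of a box whose west boundary is numerically greater than its east boundary: labeled boundaries, mapping to minimum/maximum by value, and widening to a whole-world band. The function names are ours, and the longitudes are rough, illustrative values for a Russia-like box rather than authoritative figures.

```python
def lon_in_labeled_box(lon, west, east):
    """Membership test with explicitly labeled west/east boundaries."""
    if west <= east:
        return west <= lon <= east
    # The box wraps around the ±180° meridian.
    return lon >= west or lon <= east


def envelope_by_value(west, east):
    """Map the two bounding longitudes to (min, max) purely by value."""
    return (min(west, east), max(west, east))


def envelope_whole_world_band(west, east):
    """Widen a box that crosses ±180° to cover every longitude."""
    if west > east:
        return (-180.0, 180.0)
    return (west, east)


def lon_in_envelope(lon, envelope):
    """Membership test for a plain (min, max) envelope."""
    lo, hi = envelope
    return lo <= lon <= hi


# Rough, illustrative longitudes only.
west, east = 19.0, -169.0
inside_lon = 37.6     # a longitude the box is meant to cover
outside_lon = -100.0  # a longitude the box is meant to exclude

by_value = envelope_by_value(west, east)
band = envelope_whole_world_band(west, east)

for label, lon in (("inside", inside_lon), ("outside", outside_lon)):
    print(label,
          lon_in_labeled_box(lon, west, east),
          lon_in_envelope(lon, by_value),
          lon_in_envelope(lon, band))
```

Mapping by value inverts both answers, which is the longitudinal complement effect, while the whole-world band keeps the true match but also admits the longitude that should have been excluded.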
Finally, below is an extension to the above geometry language that
adds a disk shape (defined by center and radius) and several
convenience declarations. As an extension, it is necessarily
incompatible with GML.
ADL-geometry-extended.xsd
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:adlgml="tag:alexandria.ucsb.edu,2003:geometry"
        xmlns:gml="http://www.opengis.net/gml"
        targetNamespace="tag:alexandria.ucsb.edu,2003:geometry"
        elementFormDefault="qualified">
  <import namespace="http://www.opengis.net/gml"
          schemaLocation="ADL-geometry.xsd"/>
  <complexType name="DiskType">
    <complexContent>
      <extension base="gml:AbstractGeometryType">
        <sequence>
          <element ref="gml:coordinates"/>
          <element ref="gml:radius"/>
        </sequence>
      </extension>
    </complexContent>
  </complexType>
  <element name="Disk" type="adlgml:DiskType"
           substitutionGroup="gml:_Geometry"/>
  <complexType name="FeatureType">
    <complexContent>
      <extension base="gml:AbstractFeatureType"/>
    </complexContent>
  </complexType>
  <element name="Feature" type="adlgml:FeatureType"/>
  <element name="Footprint" type="adlgml:FeatureType"/>
  <element name="QueryRegion" type="adlgml:FeatureType"/>
</schema>
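Since a bounding box must accompany every shape, a producer of disks needs some way to derive one. The Python sketch below shows one possible derivation under assumptions of our own: a spherical Earth of mean radius 6,371,008.8 m, a center given as latitude/longitude in degrees, and a radius in meters measured along the surface. None of these choices is prescribed by the schema, whose uom attribute leaves the unit to the instance document. West greater than east in the result signals a box that crosses the ±180° meridian.

```python
import math
from typing import NamedTuple

EARTH_RADIUS_M = 6371008.8  # assumed mean radius of a spherical Earth


class Box(NamedTuple):
    south: float
    north: float
    west: float
    east: float


def wrap_longitude(lon_deg: float) -> float:
    """Normalize a longitude in degrees to the range [-180, 180)."""
    return (lon_deg + 180.0) % 360.0 - 180.0


def disk_bounding_box(lat_deg: float, lon_deg: float, radius_m: float) -> Box:
    """Bounding box of a disk on a sphere, given center and surface radius."""
    angular_radius = radius_m / EARTH_RADIUS_M
    lat = math.radians(lat_deg)
    south = lat - angular_radius
    north = lat + angular_radius
    half_pi = math.pi / 2.0

    if north >= half_pi or south <= -half_pi:
        # The disk touches or contains a pole, so it spans every longitude.
        return Box(math.degrees(max(south, -half_pi)),
                   math.degrees(min(north, half_pi)),
                   -180.0, 180.0)

    # The widest longitudes are reached away from the center's parallel,
    # hence the asin term rather than a simple division by cos(lat).
    half_width = math.degrees(
        math.asin(math.sin(angular_radius) / math.cos(lat)))
    return Box(math.degrees(south), math.degrees(north),
               wrap_longitude(lon_deg - half_width),
               wrap_longitude(lon_deg + half_width))


# Hypothetical disks: one near the ±180° meridian, one near the north pole.
two_degrees_m = EARTH_RADIUS_M * math.radians(2.0)
near_antimeridian = disk_bounding_box(0.0, 179.0, two_degrees_m)
near_pole = disk_bounding_box(89.0, 0.0, two_degrees_m)
print(near_antimeridian)
print(near_pole)
```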
Greg Janée
Created: 2004-08-25
Last modified: 2008-02-28 10:58