|
Spatial Similarity FunctionsA spatial similarity function is a scalar function of two regions in the Euclidean plane. We intuitively think of the function as indicating the similarity of the regions in terms of size, shape, and location. Here's a quick comparison of three candidate spatial similarity functions. [If your browser can't display mathematical symbols, try reading this version instead.] Given regions P and Q, functions A (think addition) and U (think union) are given by A(P, Q) = 2 area(P∩Q) / (area(P) + area(Q)) Function H, the Hausdorff distance, is given by H(P, Q) = max(h(P, Q), h(Q, P)) where h(P, Q) = max{∀p∈P | min{∀q∈Q | dist(p, q)}} The helper function h(P, Q) can be thought of as the "directed distance" from P to Q: it is the distance between the point p∈P that is farthest from any point in Q and the point q∈Q that is closest to p. Intuitively, the Hausdorff distance is a measure of the mismatch between P and Q; if the Hausdorff distance is d, then every point of P is within distance d of some point of Q and vice versa. H(P, Q) is a true distance function and forms a metric space over the set of all closed, bounded sets in the Euclidean plane. If both regions are rectangular boxes, calculating the Hausdorff distance is relatively easy, and sample Python code to do so is given below. Some examples...
The above examples show that Hausdorff distance has many of the characteristics we desire for a spatial similarity function with the exception of being entirely dependent on extremal distances. An improvement over Hausdorff distance might be made by incorporating more information about the distribution of points, for example, by computing the average distance between all pairs of points in the two regions. Unfortunately, analytic solutions to average distance problems are surprisingly complex and burdensome to compute; see Birendranath Ghosh, "Random Distances Within a Rectangle and Between Two Rectangles," Bulletin of the Calcutta Mathematical Society 43 (1951): 17-24; and Steven R. Dunbar, "The Average Distance Between Points in Geometric Figures," The College Mathematics Journal 28(3) (May 1997): 187-197. The Python code below computes the Hausdorff distance between two rectangular boxes P and Q. The algorithm is based on the following observation. If h(P, Q) = d, then the distance d must be achieved by one of the corners of P; call it p. The closest point to p in Q will be either the corner of Q nearest p or the perpendicular projection of p onto the nearest boundary of Q.
Addendum: in general, the Hausdorff distance between any two closed, convex (and bounded and nonempty) sets is achieved on the boundaries of those sets. See: M. D. Wills, Hausdorff Distance and Convex Sets, Journal of Convex Analysis 14(1) (January 2007): 109-118. Greg Janée Created: 2003-05-21 Last modified: 2007-01-08 15:20 |