How Does It Work
Natural Language Processing
The algorithm used by GEOLocate
begins by standardizing the locality string into common terms and parsing out
distances, compass directions, and key geographic identifiers.
This information is then used in a series of lookups and displacement
calculations to determine geographic coordinates.
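The parse-then-displace step described above can be illustrated with a minimal sketch. It is not GEOLocate's actual code: the regular expression, the function names parse_offset and displace, and the tiny gazetteer are all hypothetical, and the great-circle destination formula on a spherical Earth is only one reasonable way to apply a distance and compass direction to a looked-up point.

```python
import math
import re

# Bearings in degrees clockwise from north for 16-point compass abbreviations.
COMPASS_BEARINGS = {
    "N": 0.0, "NNE": 22.5, "NE": 45.0, "ENE": 67.5,
    "E": 90.0, "ESE": 112.5, "SE": 135.0, "SSE": 157.5,
    "S": 180.0, "SSW": 202.5, "SW": 225.0, "WSW": 247.5,
    "W": 270.0, "WNW": 292.5, "NW": 315.0, "NNW": 337.5,
}

# Hypothetical pattern for "<distance> <unit> <direction> [of] <place>",
# e.g. "3 mi. SE of Pierre" or "3.5 mi. ENE Locust Grove".
OFFSET_PATTERN = re.compile(
    r"(?P<dist>\d+(?:\.\d+)?)\s*(?P<unit>mi|km)\.?\s+"
    r"(?P<dir>[NSEW]{1,3})\.?\s+(?:of\s+)?"
    r"(?P<place>[A-Z][A-Za-z ]+?)(?=[,.;]|\s+-|$)"
)

KM_PER_MILE = 1.609344
EARTH_RADIUS_KM = 6371.0088  # mean Earth radius, spherical approximation


def parse_offset(locality):
    """Return (distance_km, bearing_deg, place) or None if no offset is found."""
    match = OFFSET_PATTERN.search(locality)
    if match is None or match.group("dir") not in COMPASS_BEARINGS:
        return None
    distance = float(match.group("dist"))
    if match.group("unit") == "mi":
        distance *= KM_PER_MILE
    return distance, COMPASS_BEARINGS[match.group("dir")], match.group("place")


def displace(lat, lon, distance_km, bearing_deg):
    """Move a point along a great circle by the given distance and bearing."""
    phi1 = math.radians(lat)
    lam1 = math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_km / EARTH_RADIUS_KM  # angular distance in radians
    phi2 = math.asin(
        math.sin(phi1) * math.cos(delta)
        + math.cos(phi1) * math.sin(delta) * math.cos(theta)
    )
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    return math.degrees(phi2), math.degrees(lam2)


if __name__ == "__main__":
    # Hypothetical one-entry gazetteer; the coordinates are approximate.
    gazetteer = {"Pierre": (44.3683, -100.3510)}
    parsed = parse_offset("Missouri River 3 mi. SE of Pierre, South Dakota.")
    if parsed is not None:
        distance_km, bearing_deg, place = parsed
        print(displace(*gazetteer[place], distance_km, bearing_deg))
```

A real parser would need far more patterns (spelled-out directions, other units, "approximately", and so on); the sketch only shows how a parsed distance and direction feed the displacement calculation.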
Placename, river mile, legal land description, and highway-waterbody
crossing datasets are used for lookups.
Displacements from these lookups are calculated if indicated by the parsed
locality information. Coordinates
output from the initial georeferencing may be further refined by an additional
function that scans the locality string for waterbody names and “snaps” the
output coordinates to the nearest point on the waterbody found.
This feature has proven very useful for aquatic collections.
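As a rough illustration of the "snap" step, the sketch below moves a point to the nearest location on a waterbody represented as a polyline of (latitude, longitude) vertices. The representation, the function name snap_to_waterbody, and the local flat-Earth approximation are assumptions made for the example; the source does not say how GEOLocate stores waterbodies or measures distance.

```python
import math


def _closest_on_segment(p, a, b):
    """Closest point to p on the segment a-b; all arguments are planar (x, y)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return a  # degenerate segment: both endpoints coincide
    # Projection parameter of p onto the segment, clamped to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def snap_to_waterbody(point, waterbody):
    """Snap (lat, lon) to the nearest point on a polyline of (lat, lon) vertices.

    Assumes a small area away from the poles, so longitude differences can be
    scaled by cos(latitude) and the result treated as a flat plane.
    """
    if len(waterbody) < 2:
        raise ValueError("waterbody needs at least two vertices")
    lat0, lon0 = point
    k = math.cos(math.radians(lat0))

    # Local coordinates centred on the input point, in degrees of latitude.
    verts = [((lon - lon0) * k, lat - lat0) for lat, lon in waterbody]

    best, best_dist_sq = None, math.inf
    for a, b in zip(verts, verts[1:]):
        q = _closest_on_segment((0.0, 0.0), a, b)
        dist_sq = q[0] ** 2 + q[1] ** 2
        if dist_sq < best_dist_sq:
            best, best_dist_sq = q, dist_sq

    x, y = best
    return (lat0 + y, lon0 + x / k)
```

For example, snap_to_waterbody((34.1, -90.5), [(34.0, -91.0), (34.0, -90.0)]) lands on the segment directly south of the input point, while a point beyond either end of the polyline snaps to the nearest endpoint.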
The resulting coordinates are then ranked based on the type of
information found within the string and plotted on the digital map display for
user verification, correction and error determination. You can try out an
online version of the process
here or request a free copy of the full featured
The following is an
actual sample of data georeferenced by GEOLocate:
Arkansas River at River Mile 10 boat landing., USA, Arkansas, Desha
Green River at Roachville ford approximately 2 mi. E. of Greensburg, USA, Kentucky, Green
Alabama River at Wilcox Bar; River Mile 120., USA, Alabama, Wilcox
Missouri River 3 mi. SE of Pierre, South Dakota., USA, South Dakota, Hughes
Natalbany River at U.S. Hwy. 190, USA,
Tussahaw Creek at LeGuin Mill Road, approximately 3.5 mi. ENE Locust Grove - Segment 3., USA, Georgia, Henry
Escatawpa River at Hwy. 612., USA,
South Fork Little Red River at Arkansas Hwy. 95, SW Clinton, Section 11., USA, Arkansas, Van Buren
Little Pine Barren Creek at Hwy. 99; T4N R32W Sec. 4., USA, Florida, Escambia
t1N r3e sec. 13, USA, Nebraska, Jefferson
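The sample strings above contain several of the key identifiers mentioned earlier: river miles, highway crossings, and township/range/section legal land descriptions. The sketch below shows one way such identifiers might be pulled out with regular expressions. The patterns and the function name extract_identifiers are hypothetical and cover only the forms seen in this sample, not the full range a production parser would have to handle.

```python
import re

# Township / range / section, e.g. "T4N R32W Sec. 4" or "t1N r3e sec. 13".
TRS_PATTERN = re.compile(
    r"\bT\s*(\d+)\s*([NS])\.?,?\s*R\s*(\d+)\s*([EW])\.?,?\s*"
    r"Sec(?:tion|\.)?\s*(\d+)",
    re.IGNORECASE,
)
# River mile, e.g. "River Mile 120".
RIVER_MILE_PATTERN = re.compile(r"\bRiver\s+Mile\s+(\d+(?:\.\d+)?)", re.IGNORECASE)
# Highway number, e.g. "Hwy. 99" or "U.S. Hwy. 190".
HIGHWAY_PATTERN = re.compile(r"\bHwy\.?\s+(\d+)", re.IGNORECASE)


def extract_identifiers(locality):
    """Return a dict of the identifiers recognised in a locality string."""
    found = {}

    match = TRS_PATTERN.search(locality)
    if match:
        township, ns, rng, ew, section = match.groups()
        found["trs"] = (int(township), ns.upper(), int(rng), ew.upper(), int(section))

    match = RIVER_MILE_PATTERN.search(locality)
    if match:
        found["river_mile"] = float(match.group(1))

    match = HIGHWAY_PATTERN.search(locality)
    if match:
        found["highway"] = match.group(1)

    return found


if __name__ == "__main__":
    for text in (
        "Alabama River at Wilcox Bar; River Mile 120., USA, Alabama, Wilcox",
        "Little Pine Barren Creek at Hwy. 99; T4N R32W Sec. 4., USA, Florida, Escambia",
        "t1N r3e sec. 13, USA, Nebraska, Jefferson",
    ):
        print(extract_identifiers(text))
```

Each recognised identifier type would then select the matching lookup dataset (river mile, highway-waterbody crossing, or legal land description).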
One of our goals was to provide an interface by which users could georeference
records one by one or in batches from files, visualize and correct calculated
coordinates, and determine polygonal error descriptions. GEOLocate
uses XML as its native file format but also supports data import from .CSV and
delimited .TXT files. Once coordinates have been derived from
a locality description, adjustments may be made by simply clicking and
dragging a displayed point on the map. Error estimates can then be
recorded as the maximum extent that a description could occupy. This
extent is represented as a comma-delimited array of polygon vertices and can
easily be drawn onto the map.
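To make the comma-delimited polygon representation concrete, here is a small sketch that reads and writes such an array and checks whether a point falls inside it. The exact layout is an assumption: the example takes the values to be alternating latitude and longitude ("lat1,lon1,lat2,lon2,..."), which the text above does not specify, and the function names are hypothetical.

```python
def parse_polygon(text):
    """Parse "lat1,lon1,lat2,lon2,..." into a list of (lat, lon) vertices."""
    values = [float(v) for v in text.split(",") if v.strip()]
    if len(values) % 2 != 0 or len(values) < 6:
        raise ValueError("expected at least three latitude,longitude pairs")
    return list(zip(values[0::2], values[1::2]))


def format_polygon(vertices):
    """Serialise (lat, lon) vertices back to a comma-delimited string."""
    return ",".join(f"{lat:.6f},{lon:.6f}" for lat, lon in vertices)


def polygon_contains(vertices, point):
    """Ray-casting point-in-polygon test on (lat, lon) pairs.

    Treats coordinates as planar, which is adequate for small error polygons.
    """
    lat, lon = point
    inside = False
    n = len(vertices)
    for i in range(n):
        lat1, lon1 = vertices[i]
        lat2, lon2 = vertices[(i + 1) % n]
        # Does this edge straddle the point's latitude?
        if (lat1 > lat) != (lat2 > lat):
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside


if __name__ == "__main__":
    extent = parse_polygon("34.0,-91.0,34.0,-90.0,35.0,-90.0,35.0,-91.0")
    print(format_polygon(extent))
    print(polygon_contains(extent, (34.5, -90.5)))
```

A containment check like this could be used, for instance, to confirm that a manually corrected point still lies within the recorded error extent.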