Tài liệu Expert SQL Server 2008 Development- P7 - Pdf 92

CHAPTER 9  DESIGNING SYSTEMS FOR APPLICATION CONCURRENCY
281
 Tip If you are
interested in finding further statistics about the waits enforced by Resource Governor, try looking
for rows in the
sys.dm_os_wait_stats
DMV where
wait_type
is
RESMGR_THROTTLED
.
Summary
Concurrency is a complex topic with many possible solutions. In this chapter, I introduced the various
concurrency models that should be considered from a business process and data collision point of view,
and explained how they differ from the similarly named concurrency models supported by the SQL
Server database engine. Pessimistic concurrency is probably the most commonly used form, but it can
be complex to set up and maintain. Optimistic concurrency, while more lightweight, might not be so
applicable to many business scenarios, and multivalue concurrency control, while a novel technique,
might be difficult to implement in such a way that allowing collisions will help deliver value other than a
performance enhancement.
Finally, I covered an overview of how Resource Governor can balance th
e way in which limited
resources are allocated between different competing requests in a concurrent environment. The
discussion here only scratched the surface of the potential for this technique, and I recommend that
readers interested in the subject dedicate some time to further research this powerful feature.
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.

Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
C H A P T E R 10

  

Point, LineString, and Polygon:
283
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
CHAPTER 10  WORKING WITH SPATIAL DATA
• A Point is the most fundamental type of geometry, representing a singular
location in space. A Point geometry is zero-dimensional, meaning that it has no
associated area or length.
• A LineString is comprised of a series of two or more distinct points, together with
the line segments that connect those points together. LineStrings have a length,
but no associated area. A simple LineString is one in which the path drawn
between the points does not cross itself. A closed LineString is one that starts and
ends at the same point. A LineString that is both simple and closed is known as a
ring.
• A Polygon consists of an exterior ring, which defines the perimeter of the area of
space contained within the polygon. A polygon may also specify one or more
internal rings, which define areas of space contained within the external ring but
excluded from the Polygon. Internal rings can be thought of as “holes” cut out of
the Polygon. Polygons are two-dimensional—they have a length measured as the
total length of all defined rings, and also an area measured as the space contained
within the exterior ring (and not excluded by any interior rings).
 Note The word geometry has two distinct meanings when dealing with spatial data in SQL Server. To make the
distinction clear, I will use the word geometry (regular font) as the generic name to describe Points, LineStrings,
and Polygons, and
geometry
(code font) to refer to the
geometry
datatype.
Sometimes, a single feature may be represented by more than one geometry, in which case it is
known as a GeometryCollection. GeometryCollections may be homogenous or heterogeneous. For
example, the Great Wall of China is not a single contiguous wall; rather, it is made up of several distinct

Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
CHAPTER 10  WORKING WITH SPATIAL DATA
Spatial Reference Systems
A spatial reference system is a system designed to unambiguously identify and describe the location of
any point in space. This ability is essential to enable spatial data to store the coordinates of geometries
used to represent features on the earth.
To describe the positions of points in space, every spatial reference system is based on an
underlying coordinate system. There are many different types of coordinate systems used in various
fields of mathematics, but when defining geospatial data in SQL Server 2008, you are most likely to use a
spatial reference system based on either a geographic coordinate system or a projected coordinate
system.
Geographic Coordinate Systems
In a geographic coordinate system, any position on the earth’s surface can be defined using two angular
coordinates:
• The latitude coordinate of a point measures the angle between the plane of the
equator and a line drawn perpendicular to the surface of the earth at that point.
• The longitude coordinate measures the angle in the equatorial plane between a
line drawn from the center of the earth to the point and a line drawn from the
center of the earth to the prime meridian.
Typically, geographic coordinates are measured in degrees. As such, latitude can vary between –90°
(at the South Pole) and +90° (at the North Pole). Longitude values extend from –180° to +180°.
Figure 10-2 illustrates how a geographic coordinate system can be used to identify a point on the
earth’s surface.
Projected Coordinate Systems
In contrast to the geographic coordinate system, which defines positions on a three-dimensional, round
model of the earth, a projected coordinate system describes positions on the earth’s surface on a flat,
two-dimensional plane (i.e., a projection of the earth’s surface). In simple terms, a projected coordinate
system describes positions on a map rather than positions on a globe.
If we consider all of the points on the earth’s surface to lie on a flat plane, we can define positions on
that plane using familiar Cartesian coordinates of x and y (sometimes referred to as Easting and

axes, which is known as the inverse-flattening ratio.
Different reference ellipsoids provide different approximations of the shape of the earth, and there
is no single reference ellipsoid that provides a best fit across the whole surface of the globe. For this
reason, spatial applications that operate at a regional level tend to use a spatial reference system based
on whatever reference ellipsoid provides the best approximation of the earth’s surface for the area in
question. In Britain, for example, this is the Airy 1830 ellipsoid, which has a semimajor axis of
6,377,563m and a semiminor axis of 6,356,257m. In North America, the NAD83 ellipsoid is most
commonly used, which has a semimajor axis of 6,378,137m and a semiminor axis of 6,356,752m.
The reference frame defines a set of locations in the real world that are assigned known coordinates
relative to the reference ellipsoid. By establishing a set of points with known coordinates, these points
can then be used to correctly line up the coordinate system with the reference ellipsoid so that the
coordinates of other, unknown points can be determined. Reference points are normally places on the
earth’s surface itself, but they can also be assigned to the positions of satellites in stationary orbit around
the earth, which is how the WGS84 datum used by global positioning system (GPS) units is realized.
Prime Meridian
As defined earlier, the geographic coordinate of longitude is the angle in the equatorial plane between
the line drawn from the center of the earth to a point and the line drawn from the center of the earth to
the prime meridian. Therefore, any spatial reference system must state its prime meridian—the axis
from which the angle of longitude is measured.
It is a common misconception to believe that there is a single prime meridian based on some
inherent fundamental property of the earth. In fact, the prime meridian of any spatial reference system
is arbitrarily chosen simply to provide a line of zero longitude from which all other coordinates of
longitude can be measured. One commonly used prime meridian passes through Greenwich, London,
but there are many others. If you were to choose a different prime meridian, the value of every longitude
coordinate in a given spatial reference system would change.

288
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
CHAPTER 10  WORKING WITH SPATIAL DATA


DATUM[
"World Geodetic System 1984",
ELLIPSOID[
"WGS 84",
6378137,
298.257223563
]
],
PRIMEM["Greenwich", 0],
UNIT["Degree", 0.0174532925199433]
]
Returning to the example at the beginning of this chapter, using this spatial reference system, we
can describe the approximate location of each corner of the US Pentagon building as a pair of latitude
and longitude coordinates as follows:
38.870, -77.058
38.869, -77.055
38.871, -77.053
38.873, -77.055
38.872, -77.058
Note that, since we are describing points that lie to the west of the prime meridian, the longitude
coordinate in each case is negative.
Now let’s consider another spatial reference system—the Universal Transverse Mercator (UTM)
Zone 18N system, which is a projected coordinate system used in parts of North America. This spatial
reference system is based on the 1983 North American datum, which has a reference ellipsoid of
6,378,137m and an inverse-flattening ratio of 298.257222101. This geodetic model is projected using a
transverse Mercator projection, centered on the meridian of longitude 75°W, and coordinates based on
the projected image are measured in meters. The full properties of this system are expressed in WKT
format as follows:
290
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.

321465, 4304585
Comparing these results clearly demonstrates that any coordinate pair only describes a unique
location on the earth when stated with the details of the coordinate system from which they were
obtained. However, it would be quite cumbersome if we had to write out the full details of the datum,
prime meridian, unit of measurement, and projection details every time we wanted to quote a pair of
coordinates. Fortunately, there is an established set of spatial reference identifiers (SRIDs) that provide
a unique integer code associated with each spatial reference system. The two spatial reference systems
used in the preceding examples are represented by SRID 4326 and SRID 26918, respectively.
Every time you state an item of spatial data using the geography or geometry types in SQL Server
2008, you must state the corresponding SRID from which the coordinate values were obtained. What’s
more, since SQL Server does not provide any mechanism for converting between spatial reference
systems, if you want to perform any calculations involving two or more items of spatial data, each one
must be defined using the same SRID.
If you don’t know the SRID associated with a set of coordinates—say, you looked up some latitude
and longitude coordinates from a web site that didn’t state the system used—the chances are more than
likely that they are geographic coordinates based on SRID 4326, the system used by GPSs.
291
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
CHAPTER 10  WORKING WITH SPATIAL DATA
 Note To find out the SRID associated with any given spatial reference system, you can use the search facility
provided at
www.epsg-registry.org
.
Geography vs. Geometry
Early Microsoft promotional material for SQL Server 2008 introduced the geography datatype as suitable
for “round-earth” data, whereas the geometry datatype was for “flat-earth” data. These terms have since
been repeated verbatim by a number of commentators, with little regard for explaining the practical
meaning of “flat” or “round.” A simple analogy might be that, in terms of geospatial data, the geometry
datatype operates on a map, whereas the geography datatype operates on a globe.
With that distinction in mind, one obvious difference between the datatypes concerns the types of

and
M
coordinates, which can represent two further dimensions associated with
each point (typically,
Z
is elevation above the surface, and
M
is a measure of time). However, while these values
can be stored and retrieved, none of the methods provided by the
geography
or
geometry
datatypes account for
the value of
Z
and
M
coordinates in their calculations.
292
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
CHAPTER 10  WORKING WITH SPATIAL DATA
Standards Compliance
The geometry datatype operates on a flat plane, where the two coordinate values for each point represent
the x and y position from a designated origin on the plane. As a result, many of the standard methods
provided by the geometry datatype can be performed using elementary trigonometry and geometry. For
example, the following code listing demonstrates how to calculate the distance between a Point located
at (50,100) and a Point at (90,130) using the STDistance() method of the geometry datatype:
DECLARE @point1 geometry = geometry::Point(50, 100, 0);
DECLARE @point2 geometry = geometry::Point(90, 130, 0);
SELECT @point1.STDistance(@point2);

CHAPTER 10  WORKING WITH SPATIAL DATA
In normal English language, most people would describe these two LineStrings as touching, but not
crossing. However, according to the OGC definitions, the reverse is true. You can test this for yourself by
examining the results of the STTouches() and STCrosses() methods, as shown in the following code
listing:
DECLARE @x geometry = geometry::STLineFromText('LINESTRING(0 0, 0 10)', 0);
DECLARE @y geometry = geometry::STLineFromText('LINESTRING(10 0, 0 5, 10 10)', 0);
SELECT
@x.STCrosses(@y),
@x.STTouches(@y);
The result of the STCrosses() method is 1, indicating that the LineString x crosses over the
LineString y. According to the OGC standards, two LineStrings cross each other if the geometry created
by their intersection is zero-dimensional. In this case, the two LineStrings intersect at a single point (5,5),
so they are deemed to cross. In contrast, two LineStrings only touch each other if the points at which
they intersect lie in the boundary (i.e., the ends) of the LineString. In this case, the point (5,5) lies in the
interior of both LineStrings rather than in their boundary, so the result of STTouches() is 0 (i.e., false). Be
careful to check the documentation of any methods to ensure that the behavior is exactly as you expect!
Accuracy
The world is round. The geometry datatype, however, operates on a flat plane. By definition, therefore,
any geospatial calculations performed using the geometry datatype will involve a degree of error. This is
not a limitation of the geometry datatype in itself, but rather of the inevitable distortions introduced
when using a projected coordinate system to represent a round model of the earth.
Generally speaking, the effects of distortion become greater as the area of projection is increased.
For this reason, results obtained using the geometry datatype will become less accurate than results
obtained using the geography datatype over large distances.
In global spatial applications, the geography datatype is a more suitable choice, as there are few
projected systems that can be used for general global purposes with sufficient accuracy. For storing
spatial data contained within a single country or smaller area, the geometry datatype will generally
provide sufficient accuracy, and comes with the benefits of additional functionality over the geography
type.


Figure 10-6. Polygon ring orientation is significant for the geography datatype
The solution used by SQL Server (and in common with some other spatial systems) is to consider
the ring orientation of the Polygon—i.e., the order in which the points of the ring are specified. When
defining a geography Polygon, SQL Server treats the area on the “left” of the path drawn between the
points as contained within the ring, whereas the points on the “right” side are excluded. Thus, the
Polygon depicted in Figure 10-6 represents the Northern Hemisphere. Whenever you define geography
polygons, you must ensure that you specify the correct ring orientation or else your polygons will be
“inside-out”—excluding the area they were intended to contain, and including everything else. In
geometry, data ring orientation is not significant, as there is no ambiguity as to the area contained within
a Polygon ring on a flat plane.
295
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
CHAPTER 10  WORKING WITH SPATIAL DATA
A final technical difference concerns invalid geometries. In an ideal world, we would always want
our spatial data to be “valid”—that is, it meeting all the OGC specifications for that type of geometry.
However, as developers we have to reluctantly accept that spatial data, like any other data, is rarely as
perfect as we would like. This means that you will frequently encounter invalid data where, for example,
Polygons do self-intersect.
Rather perversely, perhaps, the geometry datatype, which conforms to OGC standards, is also the
datatype that provides options for dealing with data that fails to meet those standards. For example, not
only can the geometry datatype be used to store invalid geometries, but it also provides the STIsValid()
method to identify whether a geometry is valid or not, and the MakeValid() method to attempt to “fix”
invalid geometries. All geography data, in contrast, is assumed to be valid at all times. Although this
means that once geography data is in SQL Server, you can work with it comfortable in the knowledge
that it is always valid, it can provide an obstacle to importing that data in the first place. Since SQL Server
cannot import invalid geography data, you may have to rely on external tools to validate and fix any
erroneous data prior to importing it.
Creating Spatial Data
The first challenge presented to many users new to the spatial features in SQL Server 2008 is how to get

In order to create more complex geometries from WKT, simply specify the individual coordinate
pairs of each point in a comma-delimited list, as shown in the following example, which creates a
LineString between two points representing Sydney Harbor Bridge:
DECLARE @SydneyHarbourBridge geography;
SET @SydneyHarbourBridge = geography::STLineFromText(
'LINESTRING(151.209 -33.855, 151.212 -33.850)', 4326);
GO
Note that when using WKT to express coordinates for use in the geography datatype, as in the last
example, the longitude coordinate must be listed first in each coordinate pair, followed by the latitude
coordinate. This is in contrast to the expression of a “latitude, longitude” coordinate pair, which most
people are familiar with using in everyday speech.
One disadvantage of the WKT format is that, as with any text-based representation, it is not possible
to precisely state the value of certain floating-point coordinate values obtained from binary methods.
The inevitable rounding errors introduced when attempting to do so will lead to a loss of precision.
Additionally, since SQL Server must parse the text in a WKT representation to create the relevant spatial
object, instantiating objects from WKT can be slower than when using other methods.
Well-Known Binary
The WKB format, like the WKT format, is a standardized way of representing spatial data defined by the
OGC. In contrast to the text-based WKT format, WKB represents a geometry or geography object as a
contiguous stream of bytes in binary format. Every WKB representation begins with a header section
that specifies the order in which the bytes are listed (big-endian or little-endian), a value defining the
type of geometry being represented, and a stream of 8-byte values representing the coordinates of each
point in the geometry.
The following code demonstrates how to construct a Point geometry from WKB representing the
city of Warsaw, Poland, located at latitude 52.23 and longitude 21.02, using the geography
STPointFromWKB() method:
DECLARE @Warsaw geography;
SET @Warsaw = geography::STPointFromWKB(
0x010100000085EB51B81E0535403D0AD7A3701D4A40,
4326);

locations of cities, or the paths of roads and railways. There are lots of places to obtain such generic
spatial data, from a variety of commercial and free sources.
SQL Server doesn’t provide any specific tools for importing predefined spatial data, but there are a
number of third-party tools that can be used for this purpose. It is also possible to use programmatic
techniques based on the functionality provided by the SqlServer.Types.dll library, which contains the
methods used by the geography and geometry datatypes themselves. To demonstrate one method of
importing spatial data, and to provide some sample data for use in the remaining examples in this
chapter, we’ll import a dataset from the Geonames web site (www.geonames.org) containing the
geographic coordinates of locations around the world.
To begin, download and unzip the main dataset from the Geonames web site, available from
http://download.geonames.org/export/dump/allCountries.zip. This archive contains a tab-delimited
text file containing nearly 7 million rows, and when unzipped, occupies nearly 800MB. If you would like
to use a smaller dataset, you can alternatively download the
http://download.geonames.org/export/dump/cities1000.zip archive, which uses the same schema but
contains a subset of approximately 80,000 records, representing only those cities with a population
exceeding 1,000 inhabitants.
 Caution The Geonames
allCountries.zip
export is a large file (approximately 170MB), and may take some
time to download.
To store the Geonames information in SQL Server, first create a new table as follows:
298
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
CHAPTER 10  WORKING WITH SPATIAL DATA
CREATE TABLE allCountries(
[geonameid] int NOT NULL,
[name] nvarchar(200) NULL,
[asciiname] nvarchar(200) NULL,
[alternatenames] nvarchar(4000) NULL,
[latitude] real NULL,

pane.
On the Advanced pane, click each column name in turn, and configure the column properties to
match the values shown in Table 10-1.
299
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
CHAPTER 10  WORKING WITH SPATIAL DATA

Figure 10-7. Previewing data downloaded from the Geonames web site

300
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.


Nhờ tải bản gốc

Tài liệu, ebook tham khảo khác

Music ♫

Copyright: Tài liệu đại học © DMCA.com Protection Status