In a Spatial Join, two geometry objects are merged based on their spatial relationship to one another.
# One GeoDataFrame of countries, one of cities.
# Want to merge so we can get each city's country.
In [15]: countries.head()
Out[15]:
geometry country
0 MULTIPOLYGON (((180.000000000 -16.067132664, 1... Fiji
1 POLYGON ((33.903711197 -0.950000000, 34.072620... Tanzania
2 POLYGON ((-8.665589565 27.656425890, -8.665124... W. Sahara
3 MULTIPOLYGON (((-122.840000000 49.000000000, -... Canada
4 MULTIPOLYGON (((-122.840000000 49.000000000, -... United States of America
In [16]: cities.head()
Out[16]:
name geometry
0 Vatican City POINT (12.453386500 41.903282200)
1 San Marino POINT (12.441770200 43.936095800)
2 Vaduz POINT (9.516669500 47.133723800)
3 Lobamba POINT (31.199997100 -26.466667500)
4 Luxembourg POINT (6.130002800 49.611660400)
# Execute spatial join
In [17]: cities_with_country = cities.sjoin(countries, how="inner", predicate='intersects')
In [18]: cities_with_country.head()
Out[18]:
name geometry index_right country
0 Vatican City POINT (12.453386500 41.903282200) 141 Italy
1 San Marino POINT (12.441770200 43.936095800) 141 Italy
226 Rome POINT (12.481312600 41.897901500) 141 Italy
2 Vaduz POINT (9.516669500 47.133723800) 114 Austria
212 Vienna POINT (16.364693100 48.201961100) 114 Austria
GeoPandas provides two spatial-join functions:
- GeoDataFrame.sjoin(): joins based on binary predicates (intersects, contains, etc.)
- GeoDataFrame.sjoin_nearest(): joins based on proximity, with the ability to set a maximum search radius.
Note:
For historical reasons, both methods are also available as top-level functions sjoin() and sjoin_nearest(). It is recommended to use the methods, as the functions may be deprecated in the future.
Binary Predicate Joins
Binary predicate joins are available via GeoDataFrame.sjoin().
GeoDataFrame.sjoin() has two core arguments: how and predicate.
predicate
The predicate argument specifies how geopandas decides whether or not to join the attributes of one object to another, based on their geometric relationship.
The values for predicate correspond to the names of geometric binary predicates and depend on the spatial index implementation.
The default spatial index in geopandas currently supports the following values for predicate, which are defined in the Shapely documentation:
- intersects
- contains
- within
- touches
- crosses
- overlaps
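To make the difference between predicates concrete, here is a minimal sketch on toy data. The frames `squares` and `points` and their column names are made up for illustration; only `GeoDataFrame.sjoin()` and its `how` and `predicate` arguments come from the text above. It compares three of the listed predicates on a point inside a square, a point on its edge, and a point outside it.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical toy data: one unit square and three points.
squares = gpd.GeoDataFrame({"square": ["A"]}, geometry=[box(0, 0, 1, 1)])
points = gpd.GeoDataFrame(
    {"label": ["inside", "on_edge", "outside"]},
    geometry=[Point(0.5, 0.5), Point(1.0, 0.5), Point(2.0, 2.0)],
)

# "intersects": the point shares at least one location with the square,
# so both the interior point and the boundary point match.
hits_intersects = points.sjoin(squares, how="inner", predicate="intersects")

# "within": the point must lie in the square's interior,
# so the boundary point no longer matches.
hits_within = points.sjoin(squares, how="inner", predicate="within")

# "touches": the geometries meet only at the boundary,
# so only the boundary point matches.
hits_touches = points.sjoin(squares, how="inner", predicate="touches")

print(hits_intersects["label"].tolist())
print(hits_within["label"].tolist())
print(hits_touches["label"].tolist())
```

The point outside the square is dropped in all three cases because `how="inner"` keeps only matched rows.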
how
The how argument specifies the type of join that will occur and which geometry is retained in the resultant GeoDataFrame. It accepts the following options:
- left: use the index from the first (or left_df) GeoDataFrame that you provide to GeoDataFrame.sjoin(); retain only the left_df geometry column
- right: use the index from the second (or right_df); retain only the right_df geometry column
- inner: use the intersection of index values from both GeoDataFrames; retain only the left_df geometry column
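The effect of the three options is easiest to see on a small example. The sketch below uses two made-up frames, `left_df` (points) and `right_df` (boxes), with deliberately distinct index values so that the origin of the result index is visible. The data and column names are assumptions for illustration only.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical toy data with distinct indexes on each side.
left_df = gpd.GeoDataFrame(
    {"city": ["p0", "p1"]},
    geometry=[Point(0.5, 0.5), Point(5.0, 5.0)],
    index=[10, 11],
)
right_df = gpd.GeoDataFrame(
    {"zone": ["z0", "z1"]},
    geometry=[box(0, 0, 1, 1), box(8, 8, 9, 9)],
    index=[100, 101],
)

# inner: only the matched pair survives; geometry comes from left_df.
inner = left_df.sjoin(right_df, how="inner", predicate="intersects")

# left: every left_df row is kept; unmatched rows get missing right-hand values.
left = left_df.sjoin(right_df, how="left", predicate="intersects")

# right: every right_df row is kept; the index and geometry come from right_df.
right = left_df.sjoin(right_df, how="right", predicate="intersects")

print(inner.index.tolist(), inner.geometry.geom_type.unique())
print(left.index.tolist(), left["zone"].isna().sum())
print(right.index.tolist(), right.geometry.geom_type.unique())
```

Only `p0` falls inside a box, so the inner join returns one row, while the left and right joins each return two rows with gaps on the unmatched side.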
Note that more complicated spatial relationships can be studied by combining geometric operations with a spatial join. To find all polygons within a given distance of a point, for example, one can first use the buffer() method to expand each point into a circle of appropriate radius, then intersect those buffered circles with the polygons in question.
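The buffer-then-join approach described above can be sketched as follows. The frames `points` and `polygons`, their column names, and the radius value are illustrative assumptions. The radius is expressed in the units of the coordinate reference system, so real data would normally be in a projected CRS first.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical toy data: one site and two parcels, in arbitrary planar units.
points = gpd.GeoDataFrame({"site": ["s0"]}, geometry=[Point(0, 0)])
polygons = gpd.GeoDataFrame(
    {"parcel": ["near", "far"]},
    geometry=[box(1, 0, 2, 1), box(10, 10, 11, 11)],
)

radius = 1.5  # assumed search distance, in CRS units

# Replace each point with a (polygonal approximation of a) circle around it.
buffered = points.copy()
buffered["geometry"] = points.buffer(radius)

# Any polygon intersecting a buffer lies within `radius` of the original point.
nearby = buffered.sjoin(polygons, how="inner", predicate="intersects")

print(nearby[["site", "parcel"]])
```

The result keeps the buffered geometry; if the original point geometry is needed, it can be restored from `points` by index.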
Nearest Joins
Proximity-based joins can be done via GeoDataFrame.sjoin_nearest().
GeoDataFrame.sjoin_nearest() shares the how argument with GeoDataFrame.sjoin(), and includes two additional arguments: max_distance and distance_col.
max_distance
The max_distance argument specifies a maximum search radius for matching geometries. Limiting the search can have a considerable performance impact in some cases, so it is highly recommended that you set this parameter whenever you can.
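As a rough illustration of how max_distance changes the result, the sketch below joins two made-up frames, `shops` and `stations`, with and without a search radius. The names and coordinates are assumptions chosen so the distances are easy to check by hand: 5 units for the first shop, and just over 97 units for the second.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical toy data in arbitrary planar units.
shops = gpd.GeoDataFrame(
    {"shop": ["a", "b"]}, geometry=[Point(0, 0), Point(100, 0)]
)
stations = gpd.GeoDataFrame({"station": ["s1"]}, geometry=[Point(3, 4)])

# Without a limit, every shop is matched to its nearest station,
# however far away that station is.
unbounded = shops.sjoin_nearest(stations, how="inner")

# With max_distance=10, shop "b" has no station in range and is dropped
# by the inner join.
bounded = shops.sjoin_nearest(stations, how="inner", max_distance=10)

print(unbounded["shop"].tolist())
print(bounded["shop"].tolist())
```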
distance_col
If set, the resultant GeoDataFrame will include a column with this name containing the computed distances between an input geometry and the nearest geometry.
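A short sketch of distance_col, using made-up `houses` and `roads` frames (all names and coordinates are illustrative assumptions). Each house is matched to its nearest road, and the distance is stored in a column named `dist_to_road`, a name chosen only for this example.

```python
import geopandas as gpd
from shapely.geometry import LineString, Point

# Hypothetical toy data in arbitrary planar units.
houses = gpd.GeoDataFrame(
    {"house": ["h0", "h1"]}, geometry=[Point(0, 3), Point(10, 1)]
)
roads = gpd.GeoDataFrame(
    {"road": ["x_axis", "far_road"]},
    geometry=[
        LineString([(-20, 0), (20, 0)]),    # runs along y = 0
        LineString([(-20, 50), (20, 50)]),  # runs along y = 50
    ],
)

# distance_col names the output column that holds the distance from each
# house to the road it was matched with.
joined = houses.sjoin_nearest(roads, how="left", distance_col="dist_to_road")

print(joined[["house", "road", "dist_to_road"]])
```

Distances are reported in the units of the coordinate reference system, so they are only meaningful as lengths when the data is in a projected CRS.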