Upgrading to GeoPandas 0.7 with pyproj > 2.2 and PROJ > 6

Starting with GeoPandas 0.7, the .crs attribute of a GeoSeries or GeoDataFrame stores the CRS information as a pyproj.CRS, and no longer as a proj4 string or dict.

Before, you might have seen this:

>>> gdf.crs
{'init': 'epsg:4326'}

while now you will see something like this:

>>> gdf.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)

>>> type(gdf.crs)
pyproj.crs.CRS

This gives a better user interface and integrates improvements from pyproj and PROJ 6, but it might also require some changes in your code. See this blogpost for more background; the subsections below cover the possible migration issues.

See the pyproj docs for more on the pyproj.CRS object.

Importing data from files

When reading geospatial files with geopandas.read_file(), things should mostly work out of the box. For example, reading the example countries dataset yields a proper CRS:

In [9]: df = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))

In [10]: df.crs
Out[10]: 
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

However, in certain cases (with older CRS formats), the resulting CRS object might not be exactly what you expect. See the section “Why is it not properly recognizing my CRS?” below for possible reasons and how to solve them.

Manually specifying the CRS

When you specify the CRS manually in your code (e.g., because your data does not yet have a CRS, or when converting to another CRS), you might need to change your code.
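
As a minimal sketch of both situations with the current API, the example below builds a small GeoDataFrame with a manually specified CRS and then converts it to another one. The column name, the point coordinates and the target CRS (EPSG:3857) are made up for illustration.

```python
import geopandas
from shapely.geometry import Point

# Illustrative data: one point given as (lon, lat), i.e. (x, y).
gdf = geopandas.GeoDataFrame(
    {"city": ["Brussels"]},
    geometry=[Point(4.35, 50.85)],
    crs="EPSG:4326",  # CRS specified manually because the data carries none
)

# Converting to another CRS (EPSG:3857 is only an example target).
projected = gdf.to_crs("EPSG:3857")

print(type(gdf.crs))
print(projected.crs.to_epsg())
```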

“init” proj4 strings/dicts

Currently, many people specify the EPSG code using the “init” proj4 string (the GeoPandas docs also showed this before):

## OLD
GeoDataFrame(..., crs={'init': 'epsg:4326'})
# or
gdf.crs = {'init': 'epsg:4326'}
# or
gdf.to_crs({'init': 'epsg:4326'})

The above now raises a deprecation warning from pyproj. Instead of the “init” proj4 string, use only the EPSG code itself, as follows:

## NEW
GeoDataFrame(..., crs="EPSG:4326")
# or
gdf.crs = "EPSG:4326"
# or
gdf.to_crs("EPSG:4326")
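
If legacy “init” values are scattered through an older code base or configuration, a small helper can translate them before they reach GeoPandas. The function below is a hypothetical sketch (the name modernize_crs is not part of GeoPandas or pyproj). It only handles the plain forms shown above and returns anything else unchanged.

```python
def modernize_crs(crs):
    """Translate a legacy "init" CRS value into the "EPSG:<code>" form.

    Handles {'init': 'epsg:4326'} dicts and '+init=epsg:4326' strings.
    Any other value (including more complex proj4 strings) is returned as-is.
    """
    if isinstance(crs, dict) and set(crs) == {"init"}:
        value = crs["init"]
    elif isinstance(crs, str) and crs.strip().lower().startswith("+init="):
        value = crs.strip()[len("+init="):]
    else:
        return crs

    authority, _, code = value.partition(":")
    if authority.lower() == "epsg" and code.isdigit():
        return f"EPSG:{code}"
    return crs


print(modernize_crs({"init": "epsg:4326"}))
print(modernize_crs("+init=epsg:31370"))
```

The result can then be passed to the crs keyword or to to_crs() as in the NEW examples above.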

proj4 strings/dicts

Although a full proj4 string is not deprecated (as opposed to the “init” string above), it is still recommended to replace it with an EPSG code if possible.

For example, instead of:

gdf.crs = "+proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 +a=6370997 +b=6370997 +units=m +no_defs"

we recommend:

gdf.crs = "EPSG:2163"

if you know the EPSG code for the projection you are using.

One possible way to find the EPSG code is to use pyproj:

>>> import pyproj
>>> crs = pyproj.CRS("+proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 +a=6370997 +b=6370997 +units=m +no_defs")
>>> crs.to_epsg()
2163

(You might need to set the min_confidence keyword of to_epsg to a lower value if the match is not perfect.)
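
One way to apply this is to try progressively lower thresholds and report which one produced a match. The sketch below assumes pyproj's default min_confidence of 70. The helper name find_epsg and the fallback thresholds (50, 20) are illustrative choices, not recommended values.

```python
import pyproj


def find_epsg(definition, confidences=(70, 50, 20)):
    """Return (epsg_code, confidence) for the first threshold that matches.

    Returns (None, None) if no EPSG code is found at any threshold.
    """
    crs = pyproj.CRS.from_user_input(definition)
    for confidence in confidences:
        code = crs.to_epsg(min_confidence=confidence)
        if code is not None:
            return code, confidence
    return None, None


proj4 = (
    "+proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 "
    "+a=6370997 +b=6370997 +units=m +no_defs"
)
print(find_epsg(proj4))
```

A match found only at a low threshold is worth checking by hand before you rely on it.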

Further, websites such as spatialreference.org and epsg.io describe many CRSs, including their EPSG codes and proj4 string definitions.

Other formats

Besides the EPSG code mentioned above, there are also other ways to specify the CRS: an actual pyproj.CRS object, a WKT string, a PROJ JSON string, etc. Anything that is accepted by pyproj.CRS.from_user_input() can be specified to the crs keyword/attribute in GeoPandas.

Compatible CRS objects, such as those from the rasterio package, can also be passed directly to GeoPandas.
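
To illustrate, the sketch below feeds several representations of the same CRS through pyproj.CRS.from_user_input() and compares each result with a reference object. EPSG:4326 is used only as an example, and the WKT and PROJ JSON strings are generated from the reference object rather than written by hand.

```python
import pyproj

reference = pyproj.CRS.from_epsg(4326)

# Different representations of the same CRS.
candidates = {
    "EPSG string": "EPSG:4326",
    "EPSG integer": 4326,
    "WKT string": reference.to_wkt(),
    "PROJ JSON string": reference.to_json(),
    "pyproj.CRS object": reference,
}

for label, value in candidates.items():
    crs = pyproj.CRS.from_user_input(value)
    print(f"{label}: equal to reference -> {crs == reference}")
```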

The axis order of a CRS

Starting with PROJ 6 / pyproj 2, the axis order of the official EPSG definition is honoured. For example, when using geographic coordinates (degrees of longitude and latitude) in the standard EPSG:4326, the CRS will look like:

>>> pyproj.CRS("EPSG:4326")
<Geographic 2D CRS: EPSG:4326>

...
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
...

This lists the order as (lat, lon), as that is the official order of coordinates in EPSG:4326. In GeoPandas, however, the coordinates are always stored as (x, y), and thus in (lon, lat) order, regardless of the CRS (i.e. the “traditional” order used in GIS). When reprojecting, GeoPandas and pyproj take care of this difference in axis order under the hood, so you don’t need to worry about it.
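
If you want to check the axis order yourself or transform coordinates directly with pyproj, the sketch below may help. It reads the axes of a CRS from its axis_info attribute and builds a pyproj Transformer with always_xy=True, which is pyproj's option for accepting and returning coordinates in the traditional (x, y) order. The sample point and the target CRS are illustrative.

```python
import pyproj

crs = pyproj.CRS("EPSG:4326")

# Official axis order of the CRS definition (latitude first for EPSG:4326).
for axis in crs.axis_info:
    print(axis.abbrev, axis.direction)

# always_xy=True: input and output use (x, y), i.e. (lon, lat) order,
# regardless of the axis order in the CRS definitions.
transformer = pyproj.Transformer.from_crs("EPSG:4326", "EPSG:3857", always_xy=True)
x, y = transformer.transform(4.35, 50.85)  # lon, lat
print(x, y)
```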

Why is it not properly recognizing my CRS?

There are many file sources and CRS definitions out there “in the wild” that might have a CRS description that does not fully conform to the new standards of PROJ > 6 (proj4 strings, older WKT formats, …). In such cases, you will get a pyproj.CRS object that might not be fully what you expected (e.g. not equal to the expected EPSG code). Below we list a few possible cases.

I get a “Bound CRS”?

Some CRS definitions include a “towgs84” clause, which can cause problems in recognizing the actual CRS.

For example, both the proj4 and WKT representation for EPSG:31370 (the local projection used in Belgium) as can be found at https://spatialreference.org/ref/epsg/31370/ include this. When taking one of those definitions from that site, and creating a CRS object:

>>> import pyproj
>>> crs = pyproj.CRS("+proj=lcc +lat_1=51.16666723333333 +lat_2=49.8333339 +lat_0=90 +lon_0=4.367486666666666 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +towgs84=106.869,-52.2978,103.724,-0.33657,0.456955,-1.84218,1 +units=m +no_defs")
>>> crs
<Bound CRS: +proj=lcc +lat_1=51.16666723333333 +lat_2=49.83333 ...>
Name: unknown
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- undefined
Coordinate Operation:
- name: Transformation from unknown to WGS84
- method: Position Vector transformation (geog2D domain)
Datum: Unknown based on International 1909 (Hayford) ellipsoid
- Ellipsoid: International 1909 (Hayford)
- Prime Meridian: Greenwich
Source CRS: unknown

Notice that the above is not a “Projected CRS” as expected, but a “Bound CRS”. This is because it is “bound” to a conversion to WGS84, and will always use this when reprojecting instead of letting PROJ determine the best conversion.

To get the actual underlying projected CRS, you can use the .source_crs attribute:

>>> crs.source_crs
<Projected CRS: PROJCRS["unknown",BASEGEOGCRS["unknown",DATUM["Unk ...>
Name: unknown
...

Now we have a “Projected CRS”, and to_epsg() also recognizes the correct EPSG number:

>>> crs.to_epsg()

>>> crs.source_crs.to_epsg()
31370
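
If you receive CRS definitions from mixed sources, you can apply this unwrapping defensively. The helper below is a hypothetical sketch (unwrap_bound_crs is not a pyproj or GeoPandas function). It relies on the is_bound and source_crs attributes of pyproj.CRS and returns the CRS unchanged when it is not a Bound CRS.

```python
import pyproj


def unwrap_bound_crs(crs):
    """Return the underlying source CRS of a Bound CRS, else the CRS itself."""
    crs = pyproj.CRS.from_user_input(crs)
    if crs.is_bound and crs.source_crs is not None:
        return crs.source_crs
    return crs


belgian_lambert = (
    "+proj=lcc +lat_1=51.16666723333333 +lat_2=49.8333339 +lat_0=90 "
    "+lon_0=4.367486666666666 +x_0=150000.013 +y_0=5400088.438 +ellps=intl "
    "+towgs84=106.869,-52.2978,103.724,-0.33657,0.456955,-1.84218,1 "
    "+units=m +no_defs"
)
crs = unwrap_bound_crs(belgian_lambert)
print(crs.is_bound, crs.is_projected)
```

Keep in mind that unwrapping drops the bound transformation to WGS84, which leaves PROJ to determine the conversion when reprojecting.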

I have a different axis order?

As mentioned above, pyproj now honours the axis order of the EPSG definition. However, proj4 strings or older WKT versions don’t specify this correctly, which can be a reason that the CRS object is not equal to the expected EPSG code.

Consider the following example of a Canadian projected CRS “EPSG:2953”. When constructing the CRS object from the WKT string as provided on https://epsg.io/2953:

>>> crs = pyproj.CRS("""PROJCS["NAD83(CSRS) / New Brunswick Stereographic",
...    GEOGCS["NAD83(CSRS)",
...        DATUM["NAD83_Canadian_Spatial_Reference_System",
...            SPHEROID["GRS 1980",6378137,298.257222101,
...                AUTHORITY["EPSG","7019"]],
...            AUTHORITY["EPSG","6140"]],
...        PRIMEM["Greenwich",0,
...            AUTHORITY["EPSG","8901"]],
...        UNIT["degree",0.0174532925199433,
...            AUTHORITY["EPSG","9122"]],
...        AUTHORITY["EPSG","4617"]],
...    PROJECTION["Oblique_Stereographic"],
...    PARAMETER["latitude_of_origin",46.5],
...    PARAMETER["central_meridian",-66.5],
...    PARAMETER["scale_factor",0.999912],
...    PARAMETER["false_easting",2500000],
...    PARAMETER["false_northing",7500000],
...    UNIT["metre",1,
...        AUTHORITY["EPSG","9001"]],
...    AUTHORITY["EPSG","2953"]]""")

>>> crs
<Projected CRS: PROJCS["NAD83(CSRS) / New Brunswick Stereographic" ...>
Name: NAD83(CSRS) / New Brunswick Stereographic
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
...

Although this is the WKT string as found online for “EPSG:2953”, this CRS object does not evaluate equal to this EPSG code:

>>> crs == "EPSG:2953"
False

If we construct the CRS object from the EPSG code (truncated output):

>>> pyproj.CRS("EPSG:2953")
<Projected CRS: EPSG:2953>
Name: NAD83(CSRS) / New Brunswick Stereographic
Axis Info [cartesian]:
- N[north]: Northing (metre)
- E[east]: Easting (metre)
...

You can see that the CRS object constructed from the WKT string has an (Easting, Northing) (i.e. x, y) axis order, while the CRS object constructed from the EPSG code has a (Northing, Easting) axis order.

A difference in axis order alone is no problem when using the CRS in GeoPandas, since GeoPandas always stores the data in (x, y) order regardless of the CRS definition. However, you might still want to verify that the CRS is equivalent to the expected EPSG code. When you lower min_confidence, the axis order is ignored:

>>> crs.to_epsg()

>>> crs.to_epsg(min_confidence=20)
2953
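
A possible alternative, not covered in the text above, is an explicit comparison that ignores the axis order. The sketch below assumes pyproj 2.5 or later, where CRS.equals() accepts an ignore_axis_order keyword. The helper name is hypothetical. Whether a given WKT string compares equal this way still depends on the rest of its definition, so treat the result as a check rather than a guarantee.

```python
import pyproj


def same_crs_ignoring_axis_order(crs, epsg_code):
    """Compare a CRS with an EPSG code, ignoring differences in axis order."""
    left = pyproj.CRS.from_user_input(crs)
    right = pyproj.CRS.from_epsg(epsg_code)
    return left.equals(right, ignore_axis_order=True)


print(same_crs_ignoring_axis_order("EPSG:2953", 2953))
print(same_crs_ignoring_axis_order("EPSG:4326", 2953))
```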

The .crs attribute is no longer a dict or string

If you relied on the .crs object being a dict or a string, such code may break now that it is a pyproj.CRS object. But this object actually provides a more robust interface to get information about the CRS.

For example, if you used the following code to get the EPSG code:

gdf.crs['init']

This will no longer work. To get the EPSG code from a crs object, you can use the to_epsg() method.

Or, to check whether a CRS is a UTM projection:

'+proj=utm ' in gdf.crs

could be replaced with the more robust check (requires pyproj 2.6+):

gdf.crs.utm_zone is not None

Many other methods and attributes are available on the pyproj.CRS class to get information about the CRS.
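
As a brief sketch, the helper below collects a few of these into a dict. The function name describe_crs and the choice of attributes are illustrative. The same attributes are available on gdf.crs, since it is a pyproj.CRS object, and utm_zone requires pyproj 2.6+ as noted above.

```python
import pyproj


def describe_crs(crs):
    """Collect a few commonly needed facts about a CRS."""
    crs = pyproj.CRS.from_user_input(crs)
    return {
        "name": crs.name,
        "epsg": crs.to_epsg(),
        "is_geographic": crs.is_geographic,
        "is_projected": crs.is_projected,
        "utm_zone": crs.utm_zone,  # None when the CRS is not a UTM zone
    }


print(describe_crs("EPSG:32631"))
print(describe_crs("EPSG:4326"))
```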
