geopandas can read almost any vector-based spatial data format including ESRI shapefile, GeoJSON files and more using the command:
geopandas.read_file()
which returns a GeoDataFrame object. This is possible because geopandas makes use of the great fiona library, which in turn makes use of a massive open-source program called GDAL/OGR designed to facilitate spatial data transformations.
Any arguments passed to geopandas.read_file() after the file name are passed directly to fiona.open(), which does the actual data import. In general, geopandas.read_file() should do what you want without extra arguments, but for more help, type:
import fiona; help(fiona.open)
Among other things, you can explicitly set the driver (shapefile, GeoJSON) with the driver keyword, or pick a single layer from a multi-layered file with the layer keyword:
countries_gdf = geopandas.read_file("package.gpkg", layer='countries')
Currently fiona only exposes the default drivers. To display those, type:
import fiona; fiona.supported_drivers
There are a number of unexposed but supported (depending on the GDAL build) drivers. You can activate these at runtime by updating the supported_drivers dictionary, for example:
fiona.supported_drivers["NAS"] = "raw"
Where supported in fiona, geopandas can also load resources directly from a web URL, for example GeoJSON files from geojson.xyz:
url = "http://d2ad6b4ur7yvpq.cloudfront.net/naturalearth-3.3.0/ne_110m_land.geojson"
df = geopandas.read_file(url)
You can also load ZIP files that contain your data:
zipfile = "zip:///Users/name/Downloads/cb_2017_us_state_500k.zip"
states = geopandas.read_file(zipfile)
If the dataset is in a folder in the ZIP file, you have to append its name:
zipfile = "zip:///Users/name/Downloads/gadm36_AFG_shp.zip!data"
If there are multiple datasets in a folder in the ZIP file, you also have to specify the filename:
zipfile = "zip:///Users/name/Downloads/gadm36_AFG_shp.zip!data/gadm36_AFG_1.shp"
It is also possible to read any file-like object with a read() method, such as a file handle (e.g. via the built-in open() function) or StringIO:
filename = "test.geojson"
file = open(filename)
df = geopandas.read_file(file)
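The same idea works for in-memory buffers. This sketch assumes a binary buffer (io.BytesIO) holding GeoJSON text; the feature content is made up for the example.

```python
import io
import json

import geopandas

# Hypothetical GeoJSON document held entirely in memory.
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "first"},
            "geometry": {"type": "Point", "coordinates": [1.0, 1.0]},
        },
        {
            "type": "Feature",
            "properties": {"name": "second"},
            "geometry": {"type": "Point", "coordinates": [2.0, 2.0]},
        },
    ],
}

# Any object with a read() method can be passed instead of a file name.
buffer = io.BytesIO(json.dumps(feature_collection).encode("utf-8"))
df = geopandas.read_file(buffer)
print(df)
```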
File-like objects from fsspec can also be used to read data, allowing for any combination of storage backends and caching supported by that project:
import fsspec

path = "simplecache::http://download.geofabrik.de/antarctica-latest-free.shp.zip"
with fsspec.open(path) as file:
    df = geopandas.read_file(file)
You can also read path objects:
import pathlib
path_object = pathlib.Path(filename)
df = geopandas.read_file(path_object)
Reading subsets of the data
Since geopandas is powered by Fiona, which is powered by GDAL, you can take advantage of pre-filtering when loading larger datasets. This can be done geospatially with a geometry or bounding box. You can also filter the rows loaded with a slice. Read more at geopandas.read_file().
Geometry Filter
The geometry filter only loads data that intersects with the geometry.
gdf_mask = geopandas.read_file(
    geopandas.datasets.get_path("naturalearth_lowres")
)
gdf = geopandas.read_file(
    geopandas.datasets.get_path("naturalearth_cities"),
    mask=gdf_mask[gdf_mask.continent=="Africa"],
)
Bounding Box Filter
The bounding box filter only loads data that intersects with the bounding box.
bbox = (
    1031051.7879884212, 224272.49231459625, 1047224.3104931959, 244317.30894023244
)
gdf = geopandas.read_file(
    geopandas.datasets.get_path("nybb"),
    bbox=bbox,
)
Row Filter
Filter the rows loaded in from the file using an integer (for the first n rows) or a slice object.
gdf = geopandas.read_file(
    geopandas.datasets.get_path("naturalearth_lowres"),
    rows=10,
)
gdf = geopandas.read_file(
    geopandas.datasets.get_path("naturalearth_lowres"),
    rows=slice(10, 20),
)
Field/Column Filters
Load in a subset of fields from the file:
Note: Requires Fiona 1.9+
gdf = geopandas.read_file(
    geopandas.datasets.get_path("naturalearth_lowres"),
    include_fields=["pop_est", "continent", "name"],
)
Note: Requires Fiona 1.8+
gdf = geopandas.read_file(
    geopandas.datasets.get_path("naturalearth_lowres"),
    ignore_fields=["iso_a3", "gdp_md_est"],
)
Skip loading geometry from the file:
Note: Requires Fiona 1.8+
Note: Returns pandas.DataFrame
pdf = geopandas.read_file(
    geopandas.datasets.get_path("naturalearth_lowres"),
    ignore_geometry=True,
)
SQL WHERE Filter
Load in a subset of data with a SQL WHERE clause.
Note: Requires Fiona 1.9+ or the pyogrio engine.
gdf = geopandas.read_file(
    geopandas.datasets.get_path("naturalearth_lowres"),
    where="continent='Africa'",
)