GeoPandas Basics: Working with Spatial Data in Python

A practical introduction to GeoPandas for reading, inspecting, filtering, and exporting vector spatial data in Python.

Problem statement

If you need to work with shapefiles or GeoJSON in Python, the main problem is usually not the file format itself. The problem is getting spatial data into a form where you can inspect attributes, check geometry, confirm the coordinate reference system, filter features, and save the result without handling low-level GIS details manually.

This is where GeoPandas helps. It gives you a practical way to work with vector spatial data in Python using a table-like structure that feels similar to pandas, with geometry support built in.

This page shows how to:

  • read spatial files
  • inspect geometry and attributes
  • view the CRS
  • filter features
  • preview data with a quick plot
  • save results to a new file

Quick answer

GeoPandas is a Python library for working with vector spatial data in a GeoDataFrame, which is similar to a pandas DataFrame but includes a geometry column.

A basic workflow looks like this:

  1. Load a file with gpd.read_file()
  2. Inspect columns, rows, and geometry
  3. Check the CRS with .crs
  4. Filter records by attribute or geometry presence
  5. Save the output with .to_file()

import geopandas as gpd

gdf = gpd.read_file("data/parcels.shp")
print(gdf.head())
print(gdf.crs)

filtered = gdf[gdf["zone"] == "RESIDENTIAL"]
filtered.to_file("output/residential_parcels.geojson", driver="GeoJSON")

Step-by-step solution

Install GeoPandas

A common installation method is:

pip install geopandas

If you use Conda, install GeoPandas in your environment before running the examples:

conda install geopandas

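If installation fails, fix the Python environment before debugging your code. Install problems often come from the compiled GIS dependencies (such as GDAL, GEOS, and PROJ) rather than from GeoPandas itself.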

Import GeoPandas

Most vector workflows start with one import:

import geopandas as gpd

Using the gpd alias is standard and keeps code readable.

Read a shapefile or GeoJSON into a GeoDataFrame

Use gpd.read_file() to load common vector formats.

Read a parcel shapefile

import geopandas as gpd

parcels = gpd.read_file("data/parcels/parcels.shp")
print(parcels.head())

Read a neighborhoods GeoJSON

import geopandas as gpd

neighborhoods = gpd.read_file("data/city_neighborhoods.geojson")
print(neighborhoods.head())

This returns a GeoDataFrame with attribute columns and one active geometry column.
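To make the "attribute columns plus one geometry column" structure concrete, here is a minimal sketch that builds a GeoDataFrame from in-memory GeoJSON content instead of a file. The neighborhood names, population values, and coordinates are invented for illustration only.

```python
import geopandas as gpd

# Hypothetical GeoJSON content; a real file holds the same structure on disk.
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "Old Town", "population": 1200},
            "geometry": {"type": "Point", "coordinates": [-80.0, 40.0]},
        },
        {
            "type": "Feature",
            "properties": {"name": "Riverside", "population": 3400},
            "geometry": {"type": "Point", "coordinates": [-80.1, 40.1]},
        },
    ],
}

# GeoJSON coordinates are longitude/latitude, so EPSG:4326 is assigned here.
neighborhoods = gpd.GeoDataFrame.from_features(
    feature_collection["features"], crs="EPSG:4326"
)

# Each GeoJSON property becomes an attribute column; the shape goes in "geometry".
print(neighborhoods.columns.tolist())
print(neighborhoods.geometry.name)
```

Reading the same content from disk with gpd.read_file() should give an equivalent table.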

Inspect the GeoDataFrame structure

Before doing any analysis, inspect the dataset structure.

print(parcels.head())
print(parcels.columns)
print(parcels.dtypes)
print(parcels.geometry.name)
print(len(parcels))

This helps you confirm:

  • available fields
  • geometry column name
  • data types
  • number of records

If you are reading unfamiliar data, this step prevents many later errors.
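One way to turn this inspection into a guard is to check for the fields you rely on before any analysis runs. The sketch below uses a small in-memory layer; the column names parcel_id and zone are assumptions standing in for whatever your dataset actually contains.

```python
import geopandas as gpd
from shapely.geometry import Point

# Stand-in for a layer loaded with gpd.read_file(); values are illustrative.
parcels = gpd.GeoDataFrame(
    {"parcel_id": [1, 2], "zone": ["RESIDENTIAL", "COMMERCIAL"]},
    geometry=[Point(0, 0), Point(1, 1)],
    crs="EPSG:26917",
)

# Fail early with a clear message if an expected field is absent.
required = {"parcel_id", "zone"}
missing = required - set(parcels.columns)
if missing:
    raise KeyError(f"Missing expected columns: {sorted(missing)}")

print(f"{len(parcels)} records, geometry column: {parcels.geometry.name}")
```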

Check the coordinate reference system

CRS matters in every GIS workflow. If the CRS is wrong or missing, maps, overlays, and distance-based analysis can be misleading.

print(parcels.crs)
print(neighborhoods.crs)

Typical output may look like:

EPSG:26917

or:

EPSG:4326

Check CRS before plotting, combining layers, or measuring anything.
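Beyond printing the CRS, it can help to ask what kind of CRS it is. The following sketch uses two made-up single-point layers and the CRS object's to_epsg(), is_geographic, and is_projected members to compare them.

```python
import geopandas as gpd
from shapely.geometry import Point

# Illustrative layers; the coordinates are invented.
lonlat = gpd.GeoDataFrame(geometry=[Point(-80.0, 40.0)], crs="EPSG:4326")
utm = gpd.GeoDataFrame(geometry=[Point(500000, 4400000)], crs="EPSG:26917")

for name, layer in [("lonlat", lonlat), ("utm", utm)]:
    crs = layer.crs
    # is_geographic: coordinates are angles in degrees
    # is_projected: coordinates are linear units such as metres
    print(name, crs.to_epsg(), crs.is_geographic, crs.is_projected)

# Layers should share a CRS before they are combined or overlaid.
same_crs = lonlat.crs == utm.crs
print("Same CRS:", same_crs)
```

A geographic CRS is a hint that lengths and areas computed directly from the coordinates would be in degrees, which is rarely what you want.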

Filter spatial records

A common task is filtering features based on an attribute field.

Example: keep only residential parcels

residential = parcels[parcels["zone"] == "RESIDENTIAL"]
print(residential.head())
print(len(residential))

You can also remove rows with missing geometry:

parcels_with_geometry = parcels[parcels.geometry.notna()]

This is a simple but useful quality check before plotting or exporting.
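Attribute and geometry filters can be combined. The sketch below, using an invented four-row layer, keeps several zone values at once and then drops rows whose geometry is either missing or empty. The MIXED zone value is hypothetical.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

parcels = gpd.GeoDataFrame(
    {
        "parcel_id": [1, 2, 3, 4],
        "zone": ["RESIDENTIAL", "COMMERCIAL", "RESIDENTIAL", "MIXED"],
    },
    # Row 3 has no geometry at all; row 4 has an empty polygon.
    geometry=[Point(0, 0), Point(1, 1), None, Polygon()],
    crs="EPSG:26917",
)

# Keep several zone values at once with isin().
selected = parcels[parcels["zone"].isin(["RESIDENTIAL", "MIXED"])]

# notna() drops missing (None) geometry; is_empty catches geometry
# objects that exist but contain no coordinates.
usable = selected[selected.geometry.notna() & ~selected.geometry.is_empty].copy()
print(usable["parcel_id"].tolist())
```

The .copy() at the end gives an independent table, which avoids pandas warnings if you later add or modify columns on the filtered result.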

Check geometry types

GeoPandas makes it easy to inspect geometry types.

print(parcels.geometry.geom_type.head())
print(parcels.geometry.geom_type.value_counts())

Common geometry types include:

  • Point
  • LineString
  • Polygon

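If you expected polygons but see unrelated types such as Point or LineString, verify the source data before continuing. A mix of Polygon and MultiPolygon is common in parcel and boundary layers and is usually not a problem.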
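A short sketch of that check: list the geometry types you consider acceptable and pull out any rows that fall outside them. The layer and the choice of acceptable types are assumptions for illustration.

```python
import geopandas as gpd
from shapely.geometry import MultiPolygon, Point, Polygon

square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
other = Polygon([(2, 2), (3, 2), (3, 3), (2, 3)])

# Illustrative layer with one stray point among polygon features.
layer = gpd.GeoDataFrame(
    {"parcel_id": [1, 2, 3]},
    geometry=[square, MultiPolygon([square, other]), Point(5, 5)],
    crs="EPSG:26917",
)

# Treat single-part and multi-part polygons as acceptable.
expected_types = ["Polygon", "MultiPolygon"]
is_expected = layer.geometry.geom_type.isin(expected_types)

unexpected = layer[~is_expected]
print(unexpected["parcel_id"].tolist())
```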

Plot the data for a quick visual check

A quick plot helps confirm that the layer loaded correctly. This is not advanced cartography. It is a fast validation step.

residential.plot()

For a slightly clearer result:

ax = residential.plot(figsize=(8, 6), edgecolor="black")
ax.set_title("Residential Parcels")

This can help you spot empty layers, unexpected extents, or geometry problems.
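When the code runs as a script rather than in a notebook, a plot window may not appear on its own. One option is to save the figure to an image file, as in this sketch. The two rectangles and the output path in a temporary folder are placeholders.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so this also runs without a display

import geopandas as gpd
from shapely.geometry import Polygon

# Stand-in for the filtered residential layer.
residential = gpd.GeoDataFrame(
    {"parcel_id": [1, 2]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(2, 0), (3, 0), (3, 1), (2, 1)]),
    ],
    crs="EPSG:26917",
)

ax = residential.plot(figsize=(8, 6), edgecolor="black")
ax.set_title("Residential Parcels")

# Hypothetical output location; plot() returns a matplotlib Axes,
# and its parent figure can be written to disk.
plot_path = os.path.join(tempfile.mkdtemp(), "residential_parcels.png")
ax.figure.savefig(plot_path, dpi=150)
print("Saved plot to", plot_path)
```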

Save the result to a new file

After filtering or cleaning data, write it back to disk.

Save to GeoJSON

residential.to_file("output/residential_parcels.geojson", driver="GeoJSON")

Save to shapefile

residential.to_file("output/residential_parcels.shp")

GeoJSON is often easier for exchange and web workflows. Shapefile is still common in older GIS systems.

Code examples

Example 1: Load a shapefile into GeoPandas

import geopandas as gpd

parcels = gpd.read_file("data/parcels/parcels.shp")
print(parcels.head())

Example 2: Load a GeoJSON file and inspect columns

import geopandas as gpd

neighborhoods = gpd.read_file("data/city_neighborhoods.geojson")

print(neighborhoods.columns)
print(neighborhoods.head())
print(neighborhoods.geometry.name)

Example 3: Check CRS and geometry types

import geopandas as gpd

neighborhoods = gpd.read_file("data/city_neighborhoods.geojson")

print("CRS:", neighborhoods.crs)
print(neighborhoods.geometry.geom_type.value_counts())

Example 4: Filter features by attribute

import geopandas as gpd

parcels = gpd.read_file("data/parcels/parcels.shp")

commercial = parcels[parcels["land_use"] == "COMMERCIAL"]
print(commercial[["parcel_id", "land_use"]].head())

Example 5: Plot and export filtered data

import geopandas as gpd

parcels = gpd.read_file("data/parcels/parcels.shp")
residential = parcels[(parcels["zone"] == "RESIDENTIAL") & (parcels.geometry.notna())]

ax = residential.plot(figsize=(8, 6), edgecolor="black")
ax.set_title("Residential Parcels")

residential.to_file("output/residential_parcels.geojson", driver="GeoJSON")

Explanation

A GeoDataFrame is the core object in GeoPandas. It extends a pandas DataFrame by adding support for spatial geometry. That means you still get rows and columns like a normal table, but one column stores shapes such as points, lines, or polygons.

The geometry column is what makes spatial operations possible. Without it, you just have attribute data. With it, GeoPandas can plot features, inspect geometry types, and support later GIS tasks such as clipping or spatial joins.

This is the main difference between pandas and GeoPandas:

  • pandas works with tabular data
  • GeoPandas works with tabular data plus spatial geometry
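The relationship can be checked directly. In this sketch, a one-row layer with an invented 10 by 20 rectangle behaves as a pandas DataFrame and also exposes spatial properties through its geometry column.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

parcels = gpd.GeoDataFrame(
    {"parcel_id": [1], "zone": ["RESIDENTIAL"]},
    geometry=[Polygon([(0, 0), (10, 0), (10, 20), (0, 20)])],
    crs="EPSG:26917",
)

# A GeoDataFrame is a pandas DataFrame subclass, so pandas operations still work.
print(isinstance(parcels, pd.DataFrame))
print(parcels["zone"].value_counts())

# The geometry column adds spatial properties such as area and bounds.
# Area is in squared CRS units (square metres for EPSG:26917).
areas = parcels.geometry.area
print(areas.iloc[0])
print(parcels.total_bounds)
```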

Checking the CRS is also an essential part of a reliable spatial data workflow. Two datasets can each look valid on their own but still fail to align if their CRS values differ. Even in a basic workflow, checking .crs should be standard.

This page is a foundation for common vector tasks in Python. It covers loading, inspecting, filtering, plotting, and exporting data without going into reprojection, spatial joins, or geoprocessing.

Edge cases or notes

Missing or invalid geometry

Some rows may have null geometry values or invalid shapes. Check for missing geometry before plotting or analysis:

gdf = gdf[gdf.geometry.notna()]

If you need to check geometry validity, use:

print(gdf.geometry.is_valid.value_counts())

Invalid geometry can cause errors in later operations.
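To see which rows are invalid and why, you can pair is_valid with Shapely's explain_validity helper. The sketch below uses a deliberately broken "bowtie" polygon as test data.

```python
import geopandas as gpd
from shapely.geometry import Polygon
from shapely.validation import explain_validity

# A "bowtie" polygon whose edges cross each other is a classic invalid shape.
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])
square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])

gdf = gpd.GeoDataFrame({"id": [1, 2]}, geometry=[bowtie, square], crs="EPSG:26917")

invalid = gdf[~gdf.geometry.is_valid]
for row_id, geom in zip(invalid["id"], invalid.geometry):
    # explain_validity returns a short text reason for the problem.
    print(row_id, explain_validity(geom))
```

Listing the affected rows first makes it easier to decide whether to repair them or go back to the data provider.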

CRS may be missing

Some files do not contain CRS metadata. If .crs returns None, do not guess. Confirm the CRS from the data provider or the accompanying documentation before mapping or measuring, because an unknown CRS makes mapping and spatial analysis unreliable.
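If the correct CRS is known from a trusted source, it can be attached with set_crs(). The sketch below assumes, purely for illustration, that the provider documented the data as EPSG:4326. Note that set_crs() only labels the data and leaves the coordinates unchanged.

```python
import geopandas as gpd
from shapely.geometry import Point

# Layer created without CRS metadata, similar to a shapefile with no .prj file.
gdf = gpd.GeoDataFrame({"id": [1]}, geometry=[Point(-80.0, 40.0)])
print(gdf.crs)

if gdf.crs is None:
    # Assumption: the data provider documented the coordinates as EPSG:4326.
    # set_crs() attaches the label only; it does not move any coordinates.
    gdf = gdf.set_crs("EPSG:4326")

print(gdf.crs)
print(gdf.geometry.iloc[0].x, gdf.geometry.iloc[0].y)
```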

Shapefile field limitations

Shapefiles have older format constraints, including:

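  • field names limited to 10 characters, so longer names are truncated on export
  • multiple sidecar files such as .shp, .shx, .dbf, and .prj
  • weaker support for long text (text fields are limited to 254 characters) and metadata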

If possible, consider GeoJSON or GeoPackage for newer workflows.

Large files can be slow

Large parcel or boundary datasets may load slowly and be expensive to plot. Start by inspecting structure, columns, and row count before doing heavier operations.

For the broader concept, see Python for GIS: What It Is and When to Use It.

For related tasks, see How to Read a Shapefile with GeoPandas and How to Read GeoJSON in Python with GeoPandas.

If your data does not line up or the CRS is unclear, see Coordinate Reference Systems (CRS) Explained for Python GIS.

FAQ

What is GeoPandas used for?

GeoPandas is used for working with vector spatial data in Python. Common tasks include reading shapefiles and GeoJSON, filtering features, checking CRS, plotting data, and exporting results.

What is the difference between pandas and GeoPandas?

pandas handles standard tabular data. GeoPandas adds a geometry column so the table can store and work with spatial features such as points, lines, and polygons.

Can GeoPandas read shapefiles and GeoJSON?

Yes. gpd.read_file() can read both shapefiles and GeoJSON, along with other vector formats supported by the installed I/O backend (pyogrio or Fiona, both built on GDAL).

How do I check the CRS of a GeoDataFrame?

Use the .crs attribute:

print(gdf.crs)

This shows the coordinate reference system if it is stored with the dataset. If no CRS is stored, it prints None.