How to Reproject Spatial Data in Python (GeoPandas)
How to reproject spatial data in Python using GeoPandas to_crs(), with examples for common coordinate systems.
Problem statement
A common GIS task is changing the coordinate reference system of vector data so it matches another layer, works with web maps, or produces correct distance and area results.
Typical cases include:
- a shapefile in EPSG:4326 needs to display with web tiles in EPSG:3857
- you need a projected CRS before calculating area, length, or buffers
- two datasets do not align because they use different CRS values
- a file has valid coordinates, but the CRS metadata is missing
If you need to reproject a GeoDataFrame in Python, the main issue is knowing whether the CRS is already defined correctly. If it is, you transform it. If it is missing but known from external metadata or documentation, you assign it first.
Quick answer
Use GeoPandas .to_crs() to reproject geometries.
Basic workflow:
- check the current CRS with gdf.crs
- if the CRS is missing but known, assign it with gdf.set_crs()
- reproject with gdf.to_crs(...)
- save the result or continue analysis
Example:
import geopandas as gpd
gdf = gpd.read_file("data/roads.shp")
print(gdf.crs)
gdf_3857 = gdf.to_crs(epsg=3857)
print(gdf_3857.crs)
Important: .to_crs() only works correctly if the current CRS is already defined correctly.
Step-by-step solution
Check the current CRS
Before changing anything, inspect the CRS.
import geopandas as gpd
gdf = gpd.read_file("data/parcels.shp")
print(gdf.crs)
Example output:
EPSG:4326
This tells you what coordinate system the current geometry coordinates use.
You need to distinguish between:
- missing CRS: gdf.crs is None
- incorrect CRS: a CRS is present, but it does not match the actual coordinates
A missing CRS prevents safe reprojection. An incorrect CRS is worse because reprojection will run, but the output will be wrong.
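To see the missing-CRS case in practice, here is a minimal sketch using a small in-memory GeoDataFrame. The point coordinates and the `failed` flag are illustrative, not part of any dataset used elsewhere in this guide. It shows that calling to_crs() on a layer with no CRS raises an error instead of returning output.

```python
import geopandas as gpd
from shapely.geometry import Point

# Illustrative layer built without any CRS metadata
gdf = gpd.GeoDataFrame({"name": ["A"]}, geometry=[Point(-73.9857, 40.7484)])
print("CRS:", gdf.crs)

try:
    gdf.to_crs(epsg=3857)
    failed = False
except ValueError as err:
    # GeoPandas refuses to transform geometries whose source CRS is unknown
    failed = True
    print("Reprojection failed:", err)
```

An incorrect CRS gives no such error, which is why it is harder to notice.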
Code example: read a shapefile and inspect its CRS
import geopandas as gpd
gdf = gpd.read_file("data/city_boundary.shp")
print("Current CRS:", gdf.crs)
print(gdf.head())
Use this as the first check any time you work with a new shapefile or GeoJSON file.
Set the CRS if it is missing
Use .set_crs() only when the coordinates are already in a known CRS but the metadata is missing.
For example, if coordinates are longitude and latitude in WGS84, but gdf.crs is None, assign EPSG:4326.
if gdf.crs is None:
    gdf = gdf.set_crs(epsg=4326)
This does not change coordinate values. It only labels what the existing coordinates mean.
Code example: assign a missing CRS with set_crs()
import geopandas as gpd
from shapely.geometry import Point
gdf = gpd.GeoDataFrame(
    {"name": ["A", "B"]},
    geometry=[Point(-73.9857, 40.7484), Point(-73.9819, 40.7681)]
)
print("Before:", gdf.crs)
gdf = gdf.set_crs(epsg=4326)
print("After:", gdf.crs)
Use this only if you know the source coordinates are in EPSG:4326. Do not guess the CRS from coordinates alone.
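Coordinates alone cannot confirm a CRS, but they can rule one out. The sketch below uses a hypothetical helper, `looks_like_lon_lat()`, which is not a GeoPandas function, to check whether the layer's bounds fit inside the valid longitude/latitude range. Treat it as a plausibility check only: passing it does not prove the data is in EPSG:4326.

```python
import geopandas as gpd
from shapely.geometry import Point


def looks_like_lon_lat(gdf):
    """Return True if all coordinates fall inside the longitude/latitude range."""
    minx, miny, maxx, maxy = gdf.total_bounds
    return bool(minx >= -180 and maxx <= 180 and miny >= -90 and maxy <= 90)


# Illustrative data: one layer in degrees, one in projected meters
lonlat = gpd.GeoDataFrame(geometry=[Point(-73.9857, 40.7484)])
meters = gpd.GeoDataFrame(geometry=[Point(585000, 4511000)])

print(looks_like_lon_lat(lonlat))  # True
print(looks_like_lon_lat(meters))  # False
```

If the check fails, assigning EPSG:4326 would be wrong, and you should go back to the data source or its documentation to find the real CRS.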
Reproject the GeoDataFrame with to_crs()
Once the source CRS is correct, use .to_crs() to transform the coordinates into a new system.
Code example: reproject to Web Mercator with to_crs()
A common case is converting from EPSG:4326 to EPSG:3857 for web map display.
import geopandas as gpd
gdf = gpd.read_file("data/points.geojson")
print("Source CRS:", gdf.crs)
gdf_web = gdf.to_crs(epsg=3857)
print("Reprojected CRS:", gdf_web.crs)
print(gdf_web.geometry.head())
This transforms the geometry coordinates from geographic coordinates (typically degrees in EPSG:4326) to projected coordinates in meters in Web Mercator (EPSG:3857).
Use EPSG:3857 for display, not for accurate area or distance calculations.
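To illustrate why, the following sketch compares the area of the same small rectangle in Web Mercator and in UTM zone 18N. The rectangle is made up for illustration and sits near New York City. Web Mercator inflates areas by roughly 1/cos²(latitude), so the ratio should be well above 1 at this latitude.

```python
import geopandas as gpd
from shapely.geometry import box

# Illustrative rectangle in lon/lat near New York City (about 40.75 degrees north)
gdf = gpd.GeoDataFrame(
    geometry=[box(-73.99, 40.74, -73.98, 40.75)],
    crs="EPSG:4326",
)

area_3857 = gdf.to_crs(epsg=3857).area.iloc[0]  # Web Mercator, distorted
area_utm = gdf.to_crs(epsg=32618).area.iloc[0]  # UTM zone 18N, close to true area

ratio = area_3857 / area_utm
print(f"Web Mercator area is about {ratio:.2f} times the UTM area")
```

The distortion grows with distance from the equator, so the error is larger for data at higher latitudes.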
Common target CRS choices:
- EPSG:3857 for web mapping
- local UTM zones for measurement
- national projected CRS for local or regional analysis
Code example: reproject to a projected CRS for area or distance analysis
For area and distance work, use a projected CRS instead of EPSG:4326.
import geopandas as gpd
gdf = gpd.read_file("data/parcels.geojson")
# Example: UTM zone 18N
gdf_utm = gdf.to_crs(epsg=32618)
gdf_utm["area_sqm"] = gdf_utm.area
print(gdf_utm[["area_sqm"]].head())
If your data is in a different location, choose the appropriate UTM zone or another suitable local projected CRS.
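If you are unsure which UTM zone applies, one option is to let GeoPandas suggest one with estimate_utm_crs(). The sketch below assumes a reasonably recent GeoPandas version that provides this method, and uses two illustrative points in New York City. The data must already have a correct CRS for the estimate to be meaningful.

```python
import geopandas as gpd
from shapely.geometry import Point

# Illustrative points with a known geographic CRS
gdf = gpd.GeoDataFrame(
    {"name": ["A", "B"]},
    geometry=[Point(-73.9857, 40.7484), Point(-73.9819, 40.7681)],
    crs="EPSG:4326",
)

# Ask GeoPandas for a UTM CRS that fits the layer's extent
utm_crs = gdf.estimate_utm_crs()
print("Suggested UTM CRS:", utm_crs.to_epsg())

# Reproject using the suggested CRS
gdf_utm = gdf.to_crs(utm_crs)
print(gdf_utm.crs)
```

For data that spans several UTM zones, a national or regional projected CRS is usually a better fit than any single zone.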
Save the reprojected data
After reprojection, save the output to a new file.
Code example: save the reprojected output
import geopandas as gpd
gdf = gpd.read_file("data/roads.shp")
gdf_3857 = gdf.to_crs(epsg=3857)
gdf_3857.to_file("output/roads_3857.shp")
You can also save to GeoJSON or GeoPackage:
gdf_3857.to_file("output/roads_3857.geojson", driver="GeoJSON")
gdf_3857.to_file("output/roads_3857.gpkg", driver="GPKG")
Verify the saved CRS by reading the file again:
check = gpd.read_file("output/roads_3857.gpkg")
print(check.crs)
Code examples
Reproject in one short workflow
import geopandas as gpd
gdf = gpd.read_file("data/buildings.geojson")
if gdf.crs is None:
    gdf = gdf.set_crs(epsg=4326)
gdf_projected = gdf.to_crs(epsg=32618)
gdf_projected.to_file("output/buildings_32618.gpkg", driver="GPKG")
Match one layer to another layer's CRS
This is common before overlay, clipping, or spatial joins.
import geopandas as gpd
parcels = gpd.read_file("data/parcels.gpkg")
zoning = gpd.read_file("data/zoning.gpkg")
zoning = zoning.to_crs(parcels.crs)
print("Parcels CRS:", parcels.crs)
print("Zoning CRS:", zoning.crs)
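The same pattern can be tried without any files. This sketch builds two illustrative in-memory layers (the names, coordinates, and zone polygon are invented), aligns the points to the polygon layer's CRS, and then runs a spatial join. It assumes a GeoPandas version where sjoin() accepts the `predicate` argument.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Illustrative points stored in EPSG:4326
points = gpd.GeoDataFrame(
    {"name": ["inside", "outside"]},
    geometry=[Point(-73.985, 40.745), Point(-73.900, 40.800)],
    crs="EPSG:4326",
)

# Illustrative zone polygon stored in UTM zone 18N
zones = gpd.GeoDataFrame(
    {"zone": ["Z1"]},
    geometry=[box(-73.99, 40.74, -73.98, 40.75)],
    crs="EPSG:4326",
).to_crs(epsg=32618)

# Align the points to the zone layer before joining
if points.crs != zones.crs:
    points = points.to_crs(zones.crs)

joined = gpd.sjoin(points, zones, predicate="within")
print(joined[["name", "zone"]])
```

Comparing the two crs attributes first avoids an unnecessary transformation when the layers already match.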
Explanation
set_crs() vs to_crs()
This is the most important distinction in GeoPandas CRS workflows.
set_crs()
Use this when the coordinates are already correct, but the CRS label is missing.
It answers: what do these existing coordinates mean?
gdf = gdf.set_crs(epsg=4326)
to_crs()
Use this when you want to transform coordinates into a different coordinate system.
It answers: how do I convert these coordinates into another CRS?
gdf_projected = gdf.to_crs(epsg=3857)
If you use set_crs() when you meant to_crs(), your layer will not actually move into a new coordinate system. If you use to_crs() on data with a wrong or missing CRS, the output will be incorrect.
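A quick way to see the difference is to print the coordinates after each call. In this minimal sketch, with an illustrative point and variable names, set_crs() leaves the coordinate values untouched while to_crs() replaces them with values in the target CRS.

```python
import geopandas as gpd
from shapely.geometry import Point

# Illustrative point with no CRS metadata yet
raw = gpd.GeoDataFrame(geometry=[Point(-73.9857, 40.7484)])

labeled = raw.set_crs(epsg=4326)       # attaches metadata only
projected = labeled.to_crs(epsg=3857)  # recomputes the coordinates

for label, layer in [("raw", raw), ("labeled", labeled), ("projected", projected)]:
    point = layer.geometry.iloc[0]
    print(label, layer.crs, (point.x, point.y))
```

The raw and labeled layers print the same longitude and latitude, while the projected layer prints values in meters.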
Choosing the right target CRS
EPSG:4326 is common for storage and exchange, especially with GeoJSON, but it is not a good choice for measurement because coordinates are stored in degrees.
Use a projected CRS when you need:
- distance
- area
- length
- buffers
- overlay analysis based on local accuracy
Examples:
- EPSG:3857 for web display
- UTM CRS such as EPSG:32618 for local metric analysis
- a national CRS for country-specific workflows
The correct CRS depends on where the data is located and what you need to do with it.
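As a small illustration of metric analysis, the sketch below reprojects two example points to UTM zone 18N and then measures the distance between them and builds 100 meter buffers. The points and the buffer radius are chosen only for illustration. Both results are in meters because the CRS units are meters.

```python
import geopandas as gpd
from shapely.geometry import Point

# Illustrative points in lon/lat
gdf = gpd.GeoDataFrame(
    {"name": ["A", "B"]},
    geometry=[Point(-73.9857, 40.7484), Point(-73.9819, 40.7681)],
    crs="EPSG:4326",
)

# Reproject to a projected CRS whose units are meters
gdf_utm = gdf.to_crs(epsg=32618)

# Distance between the two points, in meters
a, b = gdf_utm.geometry.iloc[0], gdf_utm.geometry.iloc[1]
distance_m = a.distance(b)
print(f"Distance: {distance_m:.0f} m")

# 100 m buffers around each point
buffers = gdf_utm.buffer(100)
print(buffers.area.round(0))
```

Running the same buffer call on the EPSG:4326 layer would interpret 100 as degrees, which is not a meaningful radius.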
Why layers fail to align
A mismatched CRS is one of the most common reasons layers do not line up.
Other causes include:
- one layer has no CRS metadata
- the wrong CRS was assigned with set_crs()
- data was exported without correct CRS information
- axis order confusion in some external workflows
Before overlay, join, clip, or measurement, make sure both layers use the same CRS and that it suits the task.
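When a layer carries the wrong CRS label and you know the correct one, set_crs() can replace the label if you pass allow_override=True. The sketch below assumes an illustrative point whose values are really UTM zone 18N meters but were labeled EPSG:4326. It fixes the label first and only then transforms.

```python
import geopandas as gpd
from shapely.geometry import Point

# Illustrative case: coordinates are UTM zone 18N meters, but labeled EPSG:4326
gdf = gpd.GeoDataFrame(geometry=[Point(585000, 4511000)], crs="EPSG:4326")

# Replace the wrong label; the coordinate values are not touched
gdf_fixed = gdf.set_crs(epsg=32618, allow_override=True)

# A real transformation now yields plausible longitude/latitude values
gdf_lonlat = gdf_fixed.to_crs(epsg=4326)
print(gdf_lonlat.geometry.iloc[0])
```

Without allow_override=True, set_crs() raises an error when the layer already has a different CRS, which protects against relabeling by accident.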
Edge cases / notes
Notes and common mistakes
- reprojection fails if gdf.crs is None
- assigning the wrong CRS can produce bad output even when the code runs
- GeoJSON is commonly used with EPSG:4326
- GeoPackage usually preserves CRS metadata more reliably than shapefile
- shapefiles have format limitations, so GeoPackage is often a better output format for practical workflows
- large datasets can take longer to transform
- bring both layers into the same CRS before spatial joins, overlays, or clipping, and into a projected CRS before measurements
- invalid geometries can cause problems in later analysis, even if reprojection itself succeeds
A simple geometry check:
invalid = ~gdf.is_valid
print(gdf[invalid])
If needed, repair invalid geometries before more complex processing.
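One possible repair approach is the make_valid() method on the geometry column. This sketch assumes a GeoPandas version that provides GeoSeries.make_valid(), and uses an illustrative self-intersecting "bowtie" polygon. Note that a repaired geometry may come back as a different type, such as a MultiPolygon.

```python
import geopandas as gpd
from shapely.geometry import Polygon

# Illustrative "bowtie" polygon whose edges cross, which makes it invalid
bowtie = Polygon([(0, 0), (1, 1), (1, 0), (0, 1)])
gdf = gpd.GeoDataFrame({"id": [1]}, geometry=[bowtie], crs="EPSG:32618")

valid_before = bool(gdf.is_valid.all())
print("Valid before:", valid_before)

# Replace each geometry with a valid equivalent
gdf["geometry"] = gdf.geometry.make_valid()

valid_after = bool(gdf.is_valid.all())
print("Valid after:", valid_after)
print(gdf.geometry.geom_type.iloc[0])
```

Check the repaired geometries before relying on them, since the result may not match what the original author intended.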
Internal links
If you need background on coordinate systems, see Coordinate Reference Systems (CRS) in Python GIS.
For related tasks, see:
If you need to export the result, see How to Export GeoJSON in Python with GeoPandas.
If your data still does not line up, check Why GeoPandas Layers Do Not Align on a Map.
FAQ
How do I reproject a GeoDataFrame in Python with GeoPandas?
Use gdf.to_crs(...) after confirming that gdf.crs is already set correctly.
gdf = gdf.to_crs(epsg=3857)
What is the difference between set_crs() and to_crs() in GeoPandas?
set_crs() assigns CRS metadata without changing coordinates. to_crs() transforms coordinates into a different coordinate system.
Why does reprojection fail when my GeoDataFrame has no CRS?
GeoPandas cannot transform coordinates if it does not know the source coordinate system. Set the CRS first if it is known:
gdf = gdf.set_crs(epsg=4326)
Which CRS should I use for distance or area calculations in GeoPandas?
Use a projected CRS, such as a local UTM zone or national projected CRS. Do not use EPSG:4326 for area or distance calculations.