Coordinate Reference Systems (CRS) Explained for Python GIS
Understand coordinate reference systems in Python GIS and learn how to check, assign, and convert them with GeoPandas.
Problem statement
Many Python GIS errors are really CRS problems.
A layer can load without errors and still give wrong results because its coordinate reference system is missing, incorrect, or different from the other layers in your workflow. This often causes:
- overlays that do not line up
- spatial joins that return no matches
- buffers with unrealistic sizes
- distance and area calculations that are wrong
- maps that display data in the wrong place
If you have ever asked what CRS means in GIS, the practical answer is simple: CRS controls what your coordinate values mean. This page explains CRS well enough to help you inspect, assign, and transform spatial data correctly in Python.
Quick answer
A CRS tells GIS software how coordinate numbers relate to real locations on the earth.
In practice, there are two separate questions:
- What numbers are stored in the geometry?
  Example: (-73.98, 40.75) or (583240, 4501120)
- What do those numbers mean?
  Are they longitude/latitude in degrees, or projected coordinates in meters?
The practical rule for Python GIS workflows is:
- always check CRS before analysis
- use set_crs() only when the CRS is missing but you already know the correct one
- use to_crs() when you need to transform data into a different CRS
- reproject layers to a common CRS before overlay, join, clip, buffer, or measurement
Step-by-step solution
1. Understand what a CRS does
A CRS defines how coordinates map to real-world locations. The same geometry values can represent completely different places depending on the CRS.
For example, these numbers:
POINT (12 55)
could mean:
- longitude 12°, latitude 55° in a geographic CRS
- x=12 meters, y=55 meters in a projected CRS
Without CRS information, the coordinates are just numbers.
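As a minimal sketch of this point, the snippet below labels the same stored numbers with two different CRS and then expresses both in longitude/latitude. The choice of EPSG:32633 for the projected case is only an illustration; any projected CRS would make the same point.

```python
import geopandas as gpd
from shapely.geometry import Point

# The same stored numbers, labelled with two different CRS.
as_degrees = gpd.GeoSeries([Point(12, 55)], crs="EPSG:4326")  # lon/lat in degrees
as_meters = gpd.GeoSeries([Point(12, 55)], crs="EPSG:32633")  # x/y in meters (UTM 33N)

# Express both in longitude/latitude so they can be compared directly.
meters_as_lonlat = as_meters.to_crs("EPSG:4326")

print(as_degrees.iloc[0])        # stays at longitude 12, latitude 55
print(meters_as_lonlat.iloc[0])  # lands close to the equator instead
```

The two results are thousands of kilometers apart, even though the stored geometry values were identical.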
2. Know the difference between geographic and projected CRS
A geographic CRS stores angular coordinates, usually longitude and latitude in degrees.
Common example:
EPSG:4326: WGS 84
This is useful for:
- storing global data
- exchanging data
- web APIs and GeoJSON
- basic mapping
A projected CRS converts the curved earth to a flat coordinate system with planar units such as meters or feet.
Common example:
EPSG:32633: WGS 84 / UTM zone 33N
This is useful for:
- distance calculations
- area calculations
- buffering
- local or regional analysis
In practice, geographic CRS is good for storing and sharing locations, while projected CRS is usually better for measurement.
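If you are unsure which kind of CRS you are looking at, pyproj can tell you. The sketch below uses pyproj directly; since GeoPandas stores a pyproj CRS object in .crs, the same attributes should be available on gdf.crs as well.

```python
from pyproj import CRS

for code in (4326, 32633):
    crs = CRS.from_epsg(code)
    kind = "geographic" if crs.is_geographic else "projected"
    unit = crs.axis_info[0].unit_name  # unit of the first axis
    print(f"EPSG:{code} is {kind}, axis unit: {unit}")
```

Checking the axis unit is a quick way to confirm that a buffer distance or a measured length will be in meters rather than degrees.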
3. Use EPSG codes to identify CRS definitions
An EPSG code is a standard identifier for a CRS definition.
Examples:
- EPSG:4326: WGS 84 geographic coordinates
- EPSG:3857: Web Mercator, common in web maps
- EPSG:32633: UTM zone 33N, projected in meters
For most day-to-day Python GIS work, the EPSG code is enough. GeoPandas and pyproj use it to look up the full CRS definition.
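A short sketch of that lookup, using pyproj on its own: an EPSG code given as a string or as an integer resolves to the same CRS object, and the object can report its name and code back.

```python
from pyproj import CRS

wgs84 = CRS.from_user_input("EPSG:4326")

print(wgs84.name)                    # human-readable name of the definition
print(wgs84.to_epsg())               # back to the numeric code
print(wgs84 == CRS.from_epsg(4326))  # string and integer forms resolve to the same CRS

for code in (3857, 32633):
    print(code, CRS.from_epsg(code).name)
```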
4. Inspect CRS before doing any analysis
Use the .crs attribute on a GeoDataFrame.
import geopandas as gpd
gdf = gpd.read_file("data/cities.geojson")
print(gdf.crs)
Possible output:
EPSG:4326
or a longer WKT definition.
You can also compare CRS values directly:
print(gdf.crs == "EPSG:4326")
A missing CRS appears as:
None
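For logging or quick checks across many layers, a small helper can turn the .crs attribute into a one-line summary. describe_crs below is a hypothetical helper written for this page, not part of GeoPandas.

```python
import geopandas as gpd
from shapely.geometry import Point

def describe_crs(gdf):
    """Return a short summary of a layer's CRS (hypothetical helper)."""
    if gdf.crs is None:
        return "no CRS metadata"
    epsg = gdf.crs.to_epsg()  # can be None for custom definitions
    label = f"EPSG:{epsg}" if epsg is not None else "no EPSG match"
    return f"{gdf.crs.name} ({label})"

labelled = gpd.GeoDataFrame(geometry=[Point(-73.98, 40.75)], crs="EPSG:4326")
unlabelled = gpd.GeoDataFrame(geometry=[Point(-73.98, 40.75)])

print(describe_crs(labelled))
print(describe_crs(unlabelled))
```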
5. Distinguish between assigning CRS and reprojecting data
This is the most important CRS distinction in Python GIS work.
- set_crs() assigns CRS metadata to existing coordinates
- to_crs() transforms coordinates into a new CRS
Use set_crs() only if the coordinates are already in that CRS but the file is unlabeled.
Use to_crs() when you want to change coordinates from one CRS to another.
If you confuse these two operations, the data may look valid but produce wrong results.
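The difference is easiest to see side by side. In this sketch, set_crs() leaves the numbers untouched, to_crs() recomputes them, and a second set_crs() with a different CRS is rejected unless allow_override=True is passed. The point and the target CRS (EPSG:32618) are illustrative choices.

```python
import geopandas as gpd
from shapely.geometry import Point

raw = gpd.GeoSeries([Point(-73.98, 40.75)])  # no CRS yet

labelled = raw.set_crs("EPSG:4326")        # metadata only, numbers unchanged
projected = labelled.to_crs("EPSG:32618")  # coordinates recomputed in meters

print(labelled.iloc[0])   # same numbers as raw
print(projected.iloc[0])  # different numbers, same place on the ground

# Re-labelling data that already has a different CRS is almost always a mistake,
# so GeoPandas raises an error unless allow_override=True is given.
try:
    labelled.set_crs("EPSG:32618")
except ValueError as err:
    print("set_crs refused:", err)
```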
6. Choose a projected CRS for measurement tasks
Use a geographic CRS when you need:
- standard storage format
- longitude/latitude output
- web service compatibility
Use a projected CRS when you need:
- distance in meters or feet
- area calculations
- buffering
- better local accuracy
For analysis, choose a projected CRS that matches your study area. In practice, this is often a local UTM zone or a national or regional projected CRS used by the data provider.
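If a local UTM zone is a reasonable choice, its EPSG code can be derived from a representative longitude and latitude. The function below is a simplified sketch with a hypothetical name (utm_epsg_for); it ignores the special zones around Norway and Svalbard. Recent GeoPandas versions also offer estimate_utm_crs() on a GeoDataFrame, which is likely the better option when you already have the data loaded.

```python
import math

def utm_epsg_for(lon, lat):
    """Return the EPSG code of the WGS 84 / UTM zone containing (lon, lat).

    Simplified: does not handle the irregular zones near Norway and Svalbard.
    """
    if not (-180 <= lon <= 180 and -80 <= lat <= 84):
        raise ValueError("UTM covers longitudes -180..180 and latitudes -80..84")
    zone = min(int(math.floor((lon + 180) / 6)) + 1, 60)  # lon == 180 maps to zone 60
    base = 32600 if lat >= 0 else 32700  # northern / southern hemisphere series
    return base + zone

print(utm_epsg_for(-73.98, 40.75))  # 32618
print(utm_epsg_for(12, 55))         # 32633
```

For data that spans several UTM zones, a national or regional projected CRS from the data provider is usually the safer choice.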
7. Reproject layers to a common CRS before combining them
Before running:
- overlay()
- sjoin()
- clip()
- buffering
- area or distance calculations
check that both layers use the same CRS.
Basic check:
if gdf1.crs != gdf2.crs:
    gdf2 = gdf2.to_crs(gdf1.crs)
This avoids many CRS mismatch problems in GeoPandas workflows.
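When a workflow has more than two layers, the same check can be wrapped in a function. to_common_crs is a hypothetical helper, and the layer names and coordinates are made up for illustration. Unlike the basic check, it stops early if a layer has no CRS, because to_crs() cannot transform unlabeled data.

```python
import geopandas as gpd
from shapely.geometry import Point

def to_common_crs(layers, target_crs):
    """Reproject every layer in a dict to target_crs (hypothetical helper)."""
    aligned = {}
    for name, gdf in layers.items():
        if gdf.crs is None:
            raise ValueError(f"Layer '{name}' has no CRS; assign the correct one first.")
        aligned[name] = gdf if gdf.crs == target_crs else gdf.to_crs(target_crs)
    return aligned

layers = {
    "stops": gpd.GeoDataFrame(geometry=[Point(-73.98, 40.75)], crs="EPSG:4326"),
    "depots": gpd.GeoDataFrame(geometry=[Point(586000, 4511000)], crs="EPSG:32618"),
}

aligned = to_common_crs(layers, "EPSG:32618")
print({name: gdf.crs.to_epsg() for name, gdf in aligned.items()})
```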
Code examples
Example 1: Read a file and inspect its CRS
import geopandas as gpd
roads = gpd.read_file("data/roads.shp")
print("CRS:", roads.crs)
if roads.crs is None:
    print("This layer has no CRS metadata.")
Use this before any analysis. Never assume the CRS.
Example 2: Assign a CRS to data that has none
import geopandas as gpd
from shapely.geometry import Point
gdf = gpd.GeoDataFrame(
    {"name": ["A", "B"]},
    geometry=[Point(-73.98, 40.75), Point(-74.01, 40.72)]
)
print(gdf.crs)  # None
# Coordinates are known to be longitude/latitude in WGS84
gdf = gdf.set_crs("EPSG:4326")
print(gdf.crs)
This is correct only if you already know the coordinates are in WGS 84. If you guess, you may attach the wrong CRS to valid geometry.
Example 3: Reproject a layer to a projected CRS
import geopandas as gpd
parcels = gpd.read_file("data/parcels.geojson")
print("Original CRS:", parcels.crs)
parcels_utm = parcels.to_crs("EPSG:32618")
print("Projected CRS:", parcels_utm.crs)
print(parcels.geometry.iloc[0])
print(parcels_utm.geometry.iloc[0])
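After reprojection, the coordinate values change. That is expected. EPSG:32618 (UTM zone 18N) is only an example here; use the projected CRS that covers your own data. The geometries now use projected units, so distance and area calculations return meters and square meters instead of degrees.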
Example 4: Align two layers to the same CRS before overlay
import geopandas as gpd
buildings = gpd.read_file("data/buildings.shp")
flood_zones = gpd.read_file("data/flood_zones.geojson")
print("Buildings CRS:", buildings.crs)
print("Flood zones CRS:", flood_zones.crs)
if buildings.crs != flood_zones.crs:
    flood_zones = flood_zones.to_crs(buildings.crs)
print("Aligned:", buildings.crs == flood_zones.crs)
This is a standard preparation step before clipping, spatial join, or overlay.
Example 5: Compare distance results in geographic vs projected CRS
import geopandas as gpd
from shapely.geometry import Point
gdf = gpd.GeoDataFrame(
    geometry=[Point(-73.98, 40.75), Point(-73.99, 40.76)],
    crs="EPSG:4326"
)
# Distance in degrees, usually not useful for analysis
distance_degrees = gdf.geometry.iloc[0].distance(gdf.geometry.iloc[1])
print("Distance in geographic CRS:", distance_degrees)
# Reproject to a local projected CRS in meters
gdf_proj = gdf.to_crs("EPSG:32618")
distance_meters = gdf_proj.geometry.iloc[0].distance(gdf_proj.geometry.iloc[1])
print("Distance in projected CRS (meters):", distance_meters)
This shows why measurement tasks should usually be done in a projected CRS.
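Buffers follow the same rule. A minimal sketch, assuming a 100 m buffer around a single point and EPSG:32618 as the local projected CRS: buffer in the projected CRS, then convert back to longitude/latitude if the result needs to be stored or shown on a web map.

```python
import geopandas as gpd
from shapely.geometry import Point

stops = gpd.GeoSeries([Point(-73.98, 40.75)], crs="EPSG:4326")

# Buffer in a projected CRS so the distance argument is in meters.
buffers_m = stops.to_crs("EPSG:32618").buffer(100)
print("Buffer area (m^2):", round(buffers_m.area.iloc[0]))

# Convert the result back to longitude/latitude for storage or display.
buffers_lonlat = buffers_m.to_crs("EPSG:4326")
print(buffers_lonlat.crs)
```

The printed area should be slightly below pi * 100^2, because the buffer is a polygon approximation of a circle. Calling buffer(100) directly on the EPSG:4326 data would instead create a shape with a radius of 100 degrees.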
Explanation
In operational terms:
- A CRS gives meaning to coordinate values.
- Geographic CRS stores angular coordinates, usually longitude and latitude in degrees.
- Projected CRS stores planar coordinates, usually in meters or feet.
- Measurements like distance, area, and buffers should usually be done in a projected CRS.
- set_crs() labels existing coordinates.
- to_crs() transforms coordinates to a new system.
If your analysis looks wrong, check CRS first. In many workflows, that is the real issue.
Edge cases or notes
Missing CRS metadata
A file can contain valid geometry and still have no CRS metadata. In that case, only assign a CRS if you know the source coordinate system from documentation, data provider notes, or another trusted source.
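Coordinate ranges can support (but never replace) that documentation. The sketch below is a plausibility check with a hypothetical name, looks_like_lonlat. It only tells you whether the bounds could be longitude/latitude, not which CRS the data actually uses. With GeoPandas, gdf.total_bounds provides the four values in this order.

```python
def looks_like_lonlat(bounds):
    """Return True if (minx, miny, maxx, maxy) fits inside the lon/lat range.

    A plausibility check only: it cannot identify the actual CRS.
    """
    minx, miny, maxx, maxy = bounds
    return -180 <= minx <= maxx <= 180 and -90 <= miny <= maxy <= 90

print(looks_like_lonlat((-74.01, 40.72, -73.98, 40.75)))      # True
print(looks_like_lonlat((583240, 4501120, 586000, 4511000)))  # False
```

Values in the hundreds of thousands or millions almost certainly belong to a projected CRS, so labeling them EPSG:4326 would be wrong.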
Wrong CRS metadata
Sometimes the file has a CRS value, but it is wrong. Common symptoms:
- layer appears far from expected location
- overlay returns no matches
- distances are extremely large or small
Axis order and longitude-latitude confusion
Some services and formats may use latitude/longitude order instead of longitude/latitude. If points appear in the wrong continent, inspect coordinate ranges and verify map output.
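For point layers, one possible fix is to rebuild the geometry with x and y exchanged. The input below is a made-up example of a point that was created as (latitude, longitude) by mistake. This sketch only applies to point geometries.

```python
import geopandas as gpd

# Hypothetical input: a point built as (lat, lon) instead of (lon, lat).
swapped = gpd.GeoDataFrame(
    {"name": ["A"]},
    geometry=gpd.points_from_xy([40.75], [-73.98]),
    crs="EPSG:4326",
)
print(swapped.total_bounds)  # inspect the ranges before changing anything

# Rebuild the points with x and y exchanged, keeping the same CRS label.
fixed = swapped.set_geometry(
    gpd.points_from_xy(swapped.geometry.y, swapped.geometry.x, crs=swapped.crs)
)
print(fixed.geometry.iloc[0])
```

Only swap after confirming the problem on a map or against known locations, since both orderings can fall inside valid ranges.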
Mixed CRS in multi-layer workflows
Spatial joins, overlays, and clipping are unreliable when layers are in different CRS. Always compare .crs values before combining layers.
Invalid geometries
CRS is not the only source of failures. Overlay and buffer operations can also break if polygons are invalid. If CRS looks correct but analysis still fails, inspect geometry validity too.
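A quick validity check might look like the sketch below. The self-intersecting "bowtie" polygon is a constructed example, and make_valid from shapely.validation is one repair option among several; it assumes a reasonably recent Shapely and GEOS.

```python
import geopandas as gpd
from shapely.geometry import Polygon
from shapely.validation import explain_validity, make_valid

# A self-intersecting "bowtie" polygon, invalid by construction.
bowtie = Polygon([(0, 0), (1, 1), (1, 0), (0, 1)])
gdf = gpd.GeoDataFrame(geometry=[bowtie], crs="EPSG:32618")

print(gdf.geometry.is_valid.tolist())
print(explain_validity(gdf.geometry.iloc[0]))

# Repair each geometry; the CRS of the layer is kept.
repaired = gdf.set_geometry(gdf.geometry.apply(make_valid))
print(repaired.geometry.is_valid.tolist())
```

Repair can change the geometry type (for example, a polygon may become a multipolygon), so review the result before continuing.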
Web mapping CRS
Web maps often use EPSG:3857 for display. That does not mean it is the best CRS for measurement. Display CRS and analysis CRS are often different.
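To illustrate, the sketch below measures the same two points in a local UTM zone and in EPSG:3857. The points and the UTM zone are example choices; the size of the difference depends on latitude.

```python
import geopandas as gpd
from shapely.geometry import Point

pts = gpd.GeoSeries([Point(-73.98, 40.75), Point(-73.99, 40.76)], crs="EPSG:4326")

def planar_distance(series, crs):
    """Planar distance between the first two points after reprojecting to crs."""
    projected = series.to_crs(crs)
    return projected.iloc[0].distance(projected.iloc[1])

d_utm = planar_distance(pts, "EPSG:32618")
d_web = planar_distance(pts, "EPSG:3857")

print(f"UTM zone 18N: {d_utm:.0f} m")
print(f"Web Mercator: {d_web:.0f} map units")
print(f"Ratio: {d_web / d_utm:.2f}")
```

At this latitude the Web Mercator value comes out roughly a third larger than the UTM distance, and the inflation grows toward the poles.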
Internal links
For the broader concept, see Projected vs Geographic Coordinate Systems in GIS.
Related task pages:
If your layers do not align, see How to Fix CRS Mismatch Errors in GeoPandas.
FAQ
What is a CRS in GIS?
A CRS is the definition that tells GIS software how coordinate values relate to real positions on the earth. Without it, coordinates have no reliable spatial meaning.
What is the difference between EPSG:4326 and a projected CRS?
EPSG:4326 is a geographic CRS using longitude and latitude in degrees. A projected CRS uses planar units such as meters or feet, which makes it more suitable for distance, area, and buffering.
When should I use set_crs() instead of to_crs() in GeoPandas?
Use set_crs() when the data already uses a known CRS but the metadata is missing. Use to_crs() when you need to transform the coordinates into a different CRS.
Why are my distance or area results wrong in Python GIS?
This usually happens because you measured data in a geographic CRS instead of a projected CRS, or because the layer had the wrong CRS assigned.
Do all layers need the same CRS before a spatial join or overlay?
Yes. GeoPandas does not reproject layers automatically, so reproject all participating layers to the same CRS before a spatial join, clip, or overlay.