Reading Shapefiles into Pandas Dataframes
24 Jan 2017
I’ve just about had it up to here with ArcMap and arcpy. Today I begin my quest to free myself from ever needing to rely on ESRI for spatial analysis and mapping.
Geopandas seems great, but I have had a lot of trouble getting it installed and have therefore been hesitant to rely on it in any package I create. Instead, I’ve used the following snippet to read a shapefile into a Pandas dataframe for quick analysis. You will need the pyshp package and Pandas. If you don’t have these, install them via pip and you’re ready to go:
import shapefile #the pyshp module
import pandas as pd
#read file, parse out the records and shapes
shapefile_path = r'path/to/shapefile/'
sf = shapefile.Reader(shapefile_path)
#grab the shapefile's field names (omit the first pseudo field, pyshp's DeletionFlag)
fields = [x[0] for x in sf.fields][1:]
records = sf.records()
shps = [s.points for s in sf.shapes()]
#write the records into a dataframe
shapefile_dataframe = pd.DataFrame(columns=fields, data=records)
#add the coordinate data to a column called "coords"
shapefile_dataframe = shapefile_dataframe.assign(coords=shps)
Now shapefile_dataframe has all of the input shapefile’s records and geometry data.
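As a rough sketch of the "quick analysis" this enables, here is one way to work with the coords column once it exists. The small dataframe below is a made-up stand-in for shapefile_dataframe (the NAME field and the point values are hypothetical), and bounding_box is a helper I'm introducing for illustration, not part of pyshp or Pandas. It assumes each coords entry is a list of (x, y) points, which is what s.points gives you in the snippet above.

```python
import pandas as pd

# Hypothetical stand-in for shapefile_dataframe: two records whose "coords"
# column holds a list of (x, y) points, like the column built from s.points.
df = pd.DataFrame({
    "NAME": ["parcel_a", "parcel_b"],
    "coords": [
        [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)],
        [(10.0, 10.0), (12.0, 15.0)],
    ],
})


def bounding_box(points):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


# Derive per-record geometry summaries straight from the coords column
df["bbox"] = df["coords"].apply(bounding_box)
df["n_points"] = df["coords"].apply(len)

print(df[["NAME", "n_points", "bbox"]])
```

Keep in mind that the coords column is just plain Python lists, so anything beyond simple summaries like these (areas, intersections, reprojection) is where a real geometry library starts to earn its keep.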