5th Sep 2024 10 minutes read A Guide to PostGIS: Basic Geospatial Data Query Examples Michał Wasiluk PostgreSQL Table of Contents Our Sample Geospatial Data Basic Spatial Queries with PostGIS Visualizing Geospatial Data Converting Geospatial Data to Text Spatial Relationships ST_Intersection ST_Contains ST_Within Distance and Area Calculations Distance Calculations Area Calculations Advanced Spatial Analysis Buffering and Proximity Analysis Overlay Operations Network Analysis Continue Exploring Spatial Data with PostGIS Geospatial data is becoming increasingly important in many fields, from urban planning to environmental science. In this article, we’ll demonstrate basic PostgreSQL PostGIS queries for working with this type of data. Geospatial data, which contains information about locations on Earth, requires specialized tools for effective use. PostGIS is a powerful PostgreSQL extension that turns a Postgres database into a full-featured Geographic Information System (GIS). With PostGIS, you can store geographic objects, run spatial queries, and perform advanced analyses directly in SQL. This makes it an essential tool for anyone working with location-based data. Installing PostGIS allows you to perform spatial operations that usually require specialized GIS software. Whether you need to calculate distances, measure areas, or analyze spatial relationships, PostGIS equips you with the tools necessary to handle complex spatial data efficiently. If you're looking to deepen your expertise in PostGIS, consider enrolling in our PostGIS course. It will take you from GIS basics to advanced PostGIS techniques. When you’ve completed it, you’ll understand how to leverage the power of PostGIS in your projects. In this guide, we’ll introduce you to the basics of PostGIS. We’ll start with simple spatial queries and move on to advanced techniques like proximity analysis and network analysis. If you want to learn even more about geospatial data, check out our article An Introduction to Spatial Databases and Geospatial Data. Our Sample Geospatial Data In this article, we’ll be working with a spatial dataset drawn from San Francisco, California. We’ll be using several tables: sf_tram_stops: This table contains the locations of tram stops across the city; it also stores location coordinates and whether transfers between lines are possible at that stop. It has the following columns: id – A unique identifier for each stop. coordinates – The location coordinates of the stop. transfer_possible – If a passenger can change trams at this stop. sf_planning_districts: This table defines the boundaries of San Francisco’s planning districts; this information is essential for urban analyses. It has the following columns: id – A unique identifier for each planning district. name – The district’s name. boundaries – The district’s boundaries (as location coordinates). sf_bicycle_routes: This table stores information on the city's bicycle routes; you’ll need this to analyze SanFran’s biking infrastructure. It has the following columns: id – A unique identifier for each bicycle route. course – The route’s course. condition_rating – The route’s rating (based on its conditions). sf_restaurants: This table lists information on San Francisco's restaurants, including their names, locations, ratings, and types of cuisine. It has the following columns: id – A unique identifier for each restaurant. name – The restaurant’s name. food_type – The type of cuisine served in this restaurant. rating – The restaurant’s rating. coordinates – The restaurant’s location coordinates. sf_sights: This table records information on notable landmarks and points of interest (POIs) within the city. It has the following columns: id – A unique identifier for each point of interest. name – The POI’s name. coordinates – The POI’s location. sf_atms: This table stores details of SanFran’s ATMs, including each ATM’s location and which company operates it. It has the following columns: id – A unique identifier for each ATM. company – The company that operates this ATM. coordinates – The ATM’s location. This data serves as the foundation for the spatial queries we'll be exploring. We’ll use these queries to analyze proximity between landmarks and amenities, calculate areas within planning districts, and much more. Each query will demonstrate how PostGIS helps you extract meaningful insights from spatial data. Basic Spatial Queries with PostGIS Once your geospatial data is stored in PostgreSQL with PostGIS, you can start querying it! Spatial queries allow you to select, filter, and manipulate data based on its location, shape, and spatial relationships. So, let’s begin working with geospatial data. Visualizing Geospatial Data First, let’s see what geospatial data looks like. Here we select tram stop IDs and coordinates for those stops where riders can transfer: SELECT id, coordinates FROM sf_tram_stops WHERE transfer_possible = true; Here’s what the result looks like: idcoordinates 10101000020E610000030D80DDB16995EC0742497FF90CE4240… 40101000020E61000000DAB7823F3985EC010AFEB17ECCE4240… 50101000020E6100000FDD98F1491995EC07AC7293A92CF4240… … As you can see, the geospatial data in the coordinates column is not really readable. For it to be really usable, you’d need specialized software to plot it on the map. Fortunately, our PostGIS course comes with a built-in map; now you can see the location of these tram stops: Converting Geospatial Data to Text Maybe you want to see the location coordinates of the above tram stops without a map. You can use the following query to convert the data into human-readable coordinates: SELECT id, ST_AsText(coordinates), ST_Y(coordinates), ST_X(coordinates) FROM sf_tram_stops WHERE transfer_possible = true; This query uses the PostGIS function ST_AsText to get the coordinates in a readable format. It uses ST_Y and ST_X to extract the latitude and longitude values. Here is a partial result: idst_astextst_yst_x 1POINT(-122.39202 37.6138)37.6138-122.39202 2POINT(-122.38984 37.61658)37.61658-122.38984 3POINT(-122.39948 37.62165)37.62165-122.39948 … Spatial Relationships PostGIS offers several functions to explore the spatial relationships of geospatial objects. Let’s quickly review them. ST_Intersection This function returns the shared part (i.e. the intersection) of two geometries. The following query shows all bicycle routes (or their parts) within the borders of the Downtown district: SELECT sfbr.id, ST_Intersection(sfpd.boundaries, sfbr.course) FROM sf_bicycle_routes sfbr JOIN sf_planning_districts sfpd ON ST_Intersects(sfpd.boundaries, sfbr.course) WHERE sfpd.name = 'Downtown'; The ST_Intersection function finds the area where two shapes overlap – in this case, where bicycle routes cross the boundaries of the "Downtown" district. The ST_Intersects function checks if the bicycle routes and the district boundaries touch or cross each other, making sure only those that do are included. And here’s a partial result as text: idst_intersection 4090102000020E61000000C000000438F3471DF9A5EC058F13A2… 4410102000020E6100000100000007451E9429B995EC0EF7A00F.. 4120102000020E610000009000000ED6FEAA8999A5EC06469EB2.. … And as a map: ST_Contains The ST_Contains function checks if one geometry fully contains another. To list planning districts that contain ATMs from the Crown Financial company, you can run this query: SELECT DISTINCT sfn.name FROM sf_planning_districts sfn JOIN sf_atms sfa ON ST_Contains(sfn.boundaries, sfa.coordinates) WHERE sfa.company = 'Crown Financial Inc.'; Here’s the result: name Downtown Northeast ST_Within The ST_Within function also checks if one geometry is entirely within another. This query locates restaurants with a rating above 4.0 within the Northeast district: SELECT sep.name, sep.coordinates FROM sf_planning_districts spd JOIN sf_restaurants sep ON ST_Within(sep.coordinates, spd.boundaries) WHERE rating > 4.0 AND spd.name = 'Northeast'; This query returns the names and coordinates of the restaurants, making them easy to visualize on a map. namecoordinates Fast Duck0101000020E6100000B9FC87F4DB995EC02E73BA2C26E64240 Red Curry0101000020E6100000569FABADD8995EC0E1455F419AE54240 The Saloon0101000020E6100000D52137C30D9A5EC0431CEBE236E64240 Distance and Area Calculations Calculating distances and areas is another fundamental aspect of working with geospatial data. These calculations can help answer questions like "How far is this ATM from a particular sight?" or "What is the area of this planning district?" Distance Calculations The ST_Distance function returns the distance between two geospatial objects. It takes two geometry arguments: ST_Distance(geometryA, geometryB). To visualize all ATMs located within 300 meters of Fisherman’s Wharf, you can use the following query: SELECT sa.id, sa.coordinates, ST_Distance( ST_Transform(sa.coordinates, 26910), ST_Transform(ss.coordinates, 26910)) AS distance FROM sf_sights ss JOIN sf_atms sa ON ST_Distance( ST_Transform(sa.coordinates, 26910), ST_Transform(ss.coordinates, 26910)) < 300 WHERE ss.name = 'Fisherman''s Wharf'; In this query, the ST_Transform function converts the given coordinates into a specific coordinate system suited for spatial calculations. The coordinate system ensures that all spatial operations (like distance calculations) are both accurate and relevant to the specific needs of the geographical area and the project. In this query we use the coordinate system defined by the SRID (Spatial Reference System Identifier) 26910, which corresponds to the NAD83 / UTM zone 10N system. This system is commonly used for accurate distance calculations in the San Francisco area. How can you find the right SRID? Try one of these websites: Spatial Reference Website: Use the EPSG tab. io: Type the country name, code, or coordinate system name to search and get all the possible coordinate reference systems. By transforming the coordinates, the query ensures that the distance calculation between the ATM and Fisherman’s Wharf is precise. Here’s the result: idcoordinatesdistance 1320101000020E6100000FC00A436719A5EC009FEB7921DE74240189.1839552624224 1330101000020E61000000D897B2C7D9A5EC033A7CB6262E74240277.9680083048927 1350101000020E61000000E15E3FC4D9A5EC0650113B875E74240216.11656973079064 1360101000020E6100000910A630B419A5EC033FE7DC685E74240283.89507791796825 Area Calculations To calculate the area of a planning district, use the ST_Area function. This takes a geometry argument (ST_Area(geometry)) and returns the area of the resulting shape. Here’s the query. Note that we’re using SRID = 26910 again as our coordinate system: SELECT ST_Area(ST_Transform(boundaries, 26910)) FROM sf_planning_districts WHERE name = 'Buena Vista'; And this is the result: st_area 2617829.8666631826 With SRID = 26910, the default unit is meters. Thus, the result of the ST_Area is presented in square meters. Advanced Spatial Analysis As you become more familiar with PostGIS, you can use its advanced capabilities to perform in-depth spatial analyses. These techniques allow you to explore spatial relationships, perform proximity analyses, and even conduct network analysis – making your geospatial data insights much more actionable. Buffering and Proximity Analysis Buffering creates zones around spatial features. This is particularly useful in proximity analysis, where you need to determine how close certain features are to one another. For example, if you want to identify all restaurants within a 3000-meter radius of the 'Palace of Fine Arts', you can use the ST_DWithin function: SELECT sep.name, sep.rating, sep.food_type, sep.coordinates, ST_Distance( ST_Transform(sep.coordinates, 26910), ST_Transform(ss.coordinates, 26910)) AS distance FROM sf_sights ss JOIN sf_restaurants sep ON ST_DWithin( ST_Transform(sep.coordinates, 26910), ST_Transform(ss.coordinates, 26910), 3000) WHERE ss.name = 'Palace of Fine Arts'; And here are your results in table and map form: nameratingfood_typecoordinatesdistance Olive2.10Greek0101000020E6100000C9AB730CC89A5EC0527E52EDD3E542402764.17980825426 La fragranza4.86Italian0101000020E61000008811C2A38D9D5EC038F3AB3940E442402361.1920187470487 Overlay Operations In mapping, overlay operations allow us to see how different sets of geographical information match up or interact. Imagine you have two maps, one showing all the parks in a city and another showing areas prone to flooding. By placing one map on top of the other, you can find out which parks are at risk of flooding. This process helps us to understand and visualize where different geographical features coincide; it makes planning, analyses, and decision-making more effective. The following query uses the ST_Union function to merge the boundaries of all ‘recreational’ planning districts into a unified geometric shape: SELECT ST_Union(boundaries) FROM sf_planning_districts WHERE type = 'recreational'; Similarly, you could use ST_Intersection to accomplish the same thing. Network Analysis Network analysis involves examining routes, identifying the shortest paths, and improving travel efficiency within a network. It’s critical for understanding and optimizing transportation systems and connectivity. PostGIS – when paired with pgRouting – offers robust tools for this type of analysis. The pgRouting extension provides advanced algorithms for route optimization. One of the best-known is the pgr_dijkstra function; it computes the shortest or fastest path between two points by considering factors such as road types, traffic conditions, and other variables. This capability is invaluable for navigation systems, logistics planning, transportation management and more. By leveraging such extensions, users can enhance route planning, reduce travel times, and improve overall network efficiency. Continue Exploring Spatial Data with PostGIS PostGIS, with its seamless integration into PostgreSQL, provides a powerful platform for managing and analyzing geospatial data. Throughout this article, we've covered the basics of spatial queries, ventured into advanced spatial analysis, and demonstrated how PostGIS can help you answer complex geographic questions with precision and efficiency. Learning PostGIS can transform your approach to spatial data. It will make it easier to extract valuable insights, optimize workflows, and make data-driven decisions. Whether you're involved in urban planning, environmental analysis, transportation networks, or any GIS-related field, PostGIS equips you with the tools necessary to handle geospatial data effectively. As you explore PostGIS’s possibilities, you'll find it enhances your ability to work with geospatial data and opens up new avenues for innovation and discovery. The combination of PostgreSQL's robust database capabilities and PostGIS's geospatial functions empowers you to tackle even the most complex spatial challenges with confidence. Before I go, I’ll just mention our comprehensive PostGIS course again; it’s really a great choice to get real-world practice with spatial data. It will guide you through every aspect of PostGIS, from the fundamentals to advanced techniques. Before you know it, you’ll be confidently tackling geospatial data projects. PostGIS is more than just an extension of PostgreSQL. It's a gateway to unlocking the full potential of geospatial data within your database systems. By mastering PostGIS’s features, you can turn spatial data into actionable insights that drive success in your projects and for your company. Tags: PostgreSQL