Introduction to the SnowEx Database
Contents
Introduction to the SnowEx Database¶
What we’re gonna attempt to cover¶
Introduction
Database Structure/ Contents
Forming Useful Queries
Examples
Exporting Data
Group Exercise
QGIS Setup
Why a database?¶
”Dude, I am into pits not bits. What gives?!”
Standardizing diverse datasets
Cross referencing data
Enables GIS functionality
Ready for use in your code
Provenance!
Ready for use in a GIS software like ArcGIS or QGIS!
TL;DR Do less wrangling, do more crunching.¶
What is it exactly?¶
PostgreSQL database
PostGIS extension
Supports vector and raster data
And a host of GIS operations
What’s in it?¶
Snow pits - Density, hardness profiles, grain types + sizes
Manual snow depths - TONS of depths, Can you say spirals?
Snow Micropenetrometer profiles - (Subsampled to every 100th)
Snow depth + SWE rasters from ASO inc
GPR
Pit site notes
Camera Derived snow depths
Snow off DEM from USGS 3DEP at GM
And almost all the associated metadata
All this and more is easily indexed, cross referenceable, and put into GIS ready formats!
How do I get at this magical box of data?¶
# Import the connection function from the snowexsql library
from snowexsql.db import get_db
# This is what you will use for all of hackweek to access the db
db_name = 'snow:hackweek@db.snowexdata.org/snowex'
# Using the function get_db, we receive 2 ways to interact with the database
engine, session = get_db(db_name)
1. Using the Engine Object¶
The engine
object returned from the get_db
function is not used much in the snowexsql library. It does allow you to use typical SQL
strings to interact with the database.
Note: Users who have used python + SQL before will likely be more familiar with this approach. Additionally those who don’t know python but know SQL will also be more comfortable here.
# Form a typical SQL query and use python to populate the table name
qry = "SELECT DISTINCT site_id FROM sites"
# Then we execute the sql command and collect the results
results = engine.execute(qry)
# Create a nice readable string to print the site names using python
out = ', '.join((row['site_id'] for row in results))
# Print it with a line return for readability
print(out + '\n')
8C18, 1C1, Open 6, 2S4, 2S7, 6N36, Skyway Open, 6S22, 2S6, 9S40, 9N30, 1N1, 9N47, 1N3, 9N29, 8N45, 9N39, 8C35, 1C5, 6N31, 3S33, 5N24, 3N53, 1C7, 7N40, SNB 2, 8N55, County Line Open, 6N18, 5C20, 5N10, 8N52, 2N21, 1N5, 6S32, GML, 1S1, 2C3, 8S41, JPL 1, Forest 14, 6S53, C1, 8C11, Open, 2C13, Open Flat, Joe Wright, 8N54, 7S23, 6N16, 8C31, Open 2, 5S42, Mesa West Open, 8N34, Upper, 6C37, 5S49, 9S39, 1S17, 2S35, 1C8, BA Flux Tower, 8N9, FL2A, 5S31, 8N38, 6S26, Caples Lake, Open 4, 2S46, 8S28, 8C36, 5N15, 2C33, 6N46, 3S14, 6S15, 8N35, Skyway Tree, TLSFL2A, 2N13, 3N26, 1S8, Saddle, Banner Open, 3S47, 3S52, 4N2, 2S9, 9S51, Trench 13, 6C24, Panorama Dome, 5C27, Senator Beck, 2S25, Swamp Angel, FL1B, 9N42, 1N6, JPL 2, 2S11, 2N8, 9N59, 1N7, 8C25, 3S5, 8N58, 9C28, 2S10, 2S45, 5C21, 5S24, 7S50, 2N49, 8C22, Forest 13, 2N14, 9C17, 5N19, 2C9, 5N50, 2N4, Mores Creek Summit, LDP Tree, 1C14, 2C2, CUES, SNB 1, 8S18, Michigan River, 7C15, Irwin Barn, 2S20, 1S12, 6S44, 2S48, 9C19, 9N43, 9N56, 9N44, 8S30, 8C26, 7N57, 3S38, 9C16, 5N11, 6S34, 4N27, Forest 12, 5S21, 2C12, 2N12, 9C23, 1S2, 3N22, 2N48, 2S3, 5S29, 8N25, 2C4, Forest North, 2C6, Forest, 2S37, 2S16, HQ Met Station, 1S13, Alta Collins, 4C30, County Line Tree, Bogus Upper, 2S36, 6S19, 8N37, Forest Flat, Atwater, LDP Open, 9N28, 1N23, Gothic, 6N17, 8N51, 1N20, Forest South, 5N41, 8C32, 5S43, 8N33, 5N32, 6C10, Tower 4, Banner Snotel, 2S27, Mesa West Trees, 6C34, Aspen, 8C29
2. Using the Session Object¶
The session object allows a user to interact with the database in a pure python form. This approach is called Object Relational Mapping (ORM).
# Import the table classes from our data module which is where our ORM classes are defined
from snowexsql.data import SiteData
# Form the query to receive all the site_id from the sites table
qry = session.query(SiteData.site_id).distinct()
# Execute the query and collect the results
results = qry.all()
# Print it with a line return for readability
print(', '.join([row[0] for row in list(results)]))
# Close your session to avoid hanging transactions
session.close()
8C18, 1C1, Open 6, 2S4, 2S7, 6N36, Skyway Open, 6S22, 2S6, 9S40, 9N30, 1N1, 9N47, 1N3, 9N29, 8N45, 9N39, 8C35, 1C5, 6N31, 3S33, 5N24, 3N53, 1C7, 7N40, SNB 2, 8N55, County Line Open, 6N18, 5C20, 5N10, 8N52, 2N21, 1N5, 6S32, GML, 1S1, 2C3, 8S41, JPL 1, Forest 14, 6S53, C1, 8C11, Open, 2C13, Open Flat, Joe Wright, 8N54, 7S23, 6N16, 8C31, Open 2, 5S42, Mesa West Open, 8N34, Upper, 6C37, 5S49, 9S39, 1S17, 2S35, 1C8, BA Flux Tower, 8N9, FL2A, 5S31, 8N38, 6S26, Caples Lake, Open 4, 2S46, 8S28, 8C36, 5N15, 2C33, 6N46, 3S14, 6S15, 8N35, Skyway Tree, TLSFL2A, 2N13, 3N26, 1S8, Saddle, Banner Open, 3S47, 3S52, 4N2, 2S9, 9S51, Trench 13, 6C24, Panorama Dome, 5C27, Senator Beck, 2S25, Swamp Angel, FL1B, 9N42, 1N6, JPL 2, 2S11, 2N8, 9N59, 1N7, 8C25, 3S5, 8N58, 9C28, 2S10, 2S45, 5C21, 5S24, 7S50, 2N49, 8C22, Forest 13, 2N14, 9C17, 5N19, 2C9, 5N50, 2N4, Mores Creek Summit, LDP Tree, 1C14, 2C2, CUES, SNB 1, 8S18, Michigan River, 7C15, Irwin Barn, 2S20, 1S12, 6S44, 2S48, 9C19, 9N43, 9N56, 9N44, 8S30, 8C26, 7N57, 3S38, 9C16, 5N11, 6S34, 4N27, Forest 12, 5S21, 2C12, 2N12, 9C23, 1S2, 3N22, 2N48, 2S3, 5S29, 8N25, 2C4, Forest North, 2C6, Forest, 2S37, 2S16, HQ Met Station, 1S13, Alta Collins, 4C30, County Line Tree, Bogus Upper, 2S36, 6S19, 8N37, Forest Flat, Atwater, LDP Open, 9N28, 1N23, Gothic, 6N17, 8N51, 1N20, Forest South, 5N41, 8C32, 5S43, 8N33, 5N32, 6C10, Tower 4, Banner Snotel, 2S27, Mesa West Trees, 6C34, Aspen, 8C29
Crash Course in Object Relational Mapping (ORM)¶
Question: How is a database used in pure python?!…Are you down with the O.O.P? The answer is as a Class where each column is mapped to that class as an attribute e.g. obj.attribute AND… in the correct type for python!
Consider the following table:
id |
site_id |
ground_roughness |
---|---|---|
0 |
GML |
rough |
1 |
2S27 |
smooth |
2 |
3S52 |
smooth |
In our python repo we have a made up class SiteData
defined to map to this table.
from snowexsql.data import SiteData
id |
site_id |
ground_roughness |
---|---|---|
SiteData.site_id |
SiteData.ground_roughness |
|
SiteData.site_id |
SiteData.ground_roughness |
|
SiteData.site_id |
SiteData.ground_roughness |
If we queried the whole table above using the session object we would get back 3 Sitedata objects in a list. 1 for each row.
[<snowexsql.data.SiteData object at 0x7fcf0f9bea90>, <snowexsql.data.SiteData object at 0x7fcf0f9bec70>, <snowexsql.data.SiteData object at 0x7fcf0f9bed00>]
This at first doesn’t seem useful until you start to use the objects.
print(my_queried_data[0].ground_roughness)
rough
Question
How would you access from our list the
site_id
of the 2nd row?
Recap¶
You just:
Accessed a geodatabase using python
Saw two methods for interacting with the db using the snowexsql library
Pulled all the unique pit site id numbers from the db
Had a high level intro to ORM