Forming Queries
Contents
Forming Queries¶
Get Familiar with querying the database. BUT don’t forget your cheat sheets!
Process¶
Getting Connected¶
Getting connected to the database is easiest done using the snowexsql library function get_db
# Import the function to get connect to the db
from snowexsql.db import get_db
# This is what you will use for all of hackweek to access the db
db_name = 'snow:hackweek@db.snowexdata.org/snowex'
Importing the tables classes¶
These are critical for build queries. You will need at least one of these every query since they reflect the data were interested in.
from snowexsql.data import SiteData, PointData, LayerData, ImageData
Query Time!¶
We build queries in python using session.query()
. Whatever we put inside of the query parentheses is what we will get back in the results!
# This is what you will use for all of hackweek to access the db
engine, session = get_db(db_name)
# Lets grab a single row from the points table
qry = session.query(PointData).limit(1)
# Execute that query!
result = qry.all()
session.close()
Pause for moment and consider what is in result
….
Is it:
A. a single value
B. a bunch of values
C. an object
D. a row of values
# uncomment the line below and print out the results
print(result)
[<snowexsql.data.PointData object at 0x7f27183208b0>]
This feels soooo limited :)
Questions
What happens if we changed the number in the limit? What will we get back?
Where are our column names?
What if I only wanted a single column and not a whole row?
Filtering¶
The database had a silly number of records, and asking for all of them will crash your computer.
So let talk about using .filter()
All queries can be reduced by applying session.query(__).filter(__)
and a lot can go into the parentheses. This is where your cheat sheet will come in handy.
# This is what you will use for all of hackweek to access the db
engine, session = get_db(db_name)
# Its convenient to store a query like the following
qry = session.query(LayerData)
# Then filter on it to just density profiles
qry = qry.filter(LayerData.type == 'density')
# protect ourselves from a lot of data
qry = qry.limit(5)
result = qry.all()
print(result)
session.close()
[<snowexsql.data.LayerData object at 0x7f27183231f0>, <snowexsql.data.LayerData object at 0x7f2718323880>, <snowexsql.data.LayerData object at 0x7f2718323a00>, <snowexsql.data.LayerData object at 0x7f2718323a60>, <snowexsql.data.LayerData object at 0x7f2718323ac0>]
Questions
What happens if I filter on a qry that’s been filtered?
What happens if I just want a single column/attribute back? How do I do that?
How do I know what to filter on?¶
Queries and .distinct()
!
# This is what you will use for all of hackweek to access the db
engine, session = get_db(db_name)
# Get the unique datanames in the table
results = session.query(LayerData.type).distinct().all()
print('Available types = {}'.format(', '.join([r[0] for r in results])))
# Get the unique instrument in the table
results = session.query(LayerData.instrument).distinct().all()
print('\nAvailable Instruments = {}'.format(', '.join([str(r[0]) for r in results])))
# Get the unique dates in the table
results = session.query(LayerData.date).distinct().all()
print('\nAvailable Dates = {}'.format(', '.join([str(r[0]) for r in results])))
# Get the unique surveyors in the table
results = session.query(LayerData.observers).distinct().all()
print('\nAvailable surveyors = {}'.format(', '.join([str(r[0]) for r in results])))
session.close()
Available types = sample_signal, force, density, grain_size, reflectance, permittivity, lwc_vol, manual_wetness, equivalent_diameter, specific_surface_area, grain_type, temperature, hand_hardness
Available Instruments = IS3-SP-15-01US, IRIS, snowmicropen, None, IS3-SP-11-01F
Available Dates = 2020-03-12, 2020-02-04, 2020-03-07, 2020-02-21, 2020-04-06, 2020-01-09, 2020-05-03, 2019-12-10, 2020-01-04, 2020-02-27, 2020-02-13, 2020-02-14, 2020-04-17, 2020-01-19, 2020-01-17, 2020-02-19, 2019-12-18, 2020-02-26, 2019-12-30, 2020-02-03, 2020-03-13, 2020-01-30, 2019-10-24, 2020-03-11, 2020-02-01, 2020-01-24, 2020-04-22, 2020-02-09, 2020-04-19, 2020-03-31, 2020-05-12, 2020-01-08, 2020-01-23, 2020-02-06, 2020-03-05, 2020-04-01, 2019-10-29, 2020-01-31, 2020-04-18, 2020-04-29, 2020-01-22, 2020-02-20, 2020-02-07, 2020-02-12, 2020-05-06, 2020-03-18, 2019-12-03, 2020-02-08, 2020-05-13, 2019-12-20, 2019-12-23, 2020-04-28, 2020-02-28, 2020-05-20, 2020-01-28, 2020-02-05, 2020-01-16, 2020-04-21, 2019-12-19, 2020-02-11, 2020-03-28, 2020-01-27, 2020-02-10, 2020-03-06, 2020-03-04, 2019-11-24, 2020-04-09, 2020-03-21, 2020-03-16, 2020-02-02, 2020-04-27, 2020-02-25, 2020-04-08, 2020-03-25, 2019-12-16, 2020-01-29
Available surveyors = None, Juha Lemmetyinen, Kate Hale, Céline Vargel, Carrie Vuyovich, Juha Lemmetyinen & Ioanna Merkouriadi, Carrie Vuyovich & Juha Lemmetyinen, Kehan Yang
Recap¶
You just explored using the session object to form queries and compounding filters results with it
You should know:
How to build queries using filtering
How to isolate column data
Determine what values to filter on
If you don’t feel comfortable with these, you are probably not alone, let’s discuss it!