B Appendix B - Instructions for Accessing Tabulated Hydrology Results via BigQuery

Tabulated Results via BigQuery

Tabulated hydrology results are available via Google BigQuery, a cloud-based relational database that includes a distributed SQL engine. The data are located on the tnc-data-v1 data bucket (sign-in require). The table is named tnc-data-v1:hydrology.gfdl. BigQuery supports several client libraries. See https://cloud.google.com/bigquery/docs/reference/libraries for a list of supported clients libraries.

Using R, the tnc-data-v1 databucket can be accessed through a database connection using the DBI package:

Schema

The table schema are shown in Table B.1.

Table B.1: BigQuery Table Schema for tnc-data-v1:hydrology.gfdl.
Fieldname Type Description
grid STRING WRF Grid ID Number
year INTEGER Year of Simulation
month INTEGER Month of Simulation
comp STRING HSPF Runoff component (AGWO, IFWO, SURO)
hru000 … hru252 STRING Runoff (mm) (one column for each HRU)
Datetime TIMESTAMP Simulation Hour (UTC)
simulation_day INTEGER Day of simulation (01-Jan-1970 = Day 1)
simulation_day INTEGER Day of simulation (01-Jan-1970 = Day 1)|

Querying Tabulated Results

The data may be queried through Google Cloud Platform directly, or through a number of available software libraries. Queries are performed through standard SQL language. Some example queries are provided below.

Get all surface flow components from the SeaTac precipitation grid (ID16_V7) for the years 1970-1999:

SELECT
  *
FROM
  `tnc-data-v1.hydrology.gfdl`
WHERE
  grid = "ID16_V7"
  AND comp IN ('suro',
    'ifwo')
  AND year BETWEEN 1970
  AND 1999

Get the annual peak flow for surface flow components from the SeaTac precipitation grid (ID16_V7) for the years 1970-1999:

SELECT
  max(hru250) as peakQ, Year, 
FROM
  `tnc-data-v1.hydrology.gfdl`
WHERE
  grid = "ID16_V7"
  AND comp IN ('suro',
    'ifwo')
  AND year BETWEEN 1970
  AND 1999
 Group by Year

Querying Geometry

Google BigQuery supports PostGIS geometry functions (see https://cloud.google.com/bigquery/docs/reference/standard-sql/geography_functions for instructions).

Grid geometries are available from the tnc-data-v1.gfdl.geometry table on Big Query. The table schema is as follows:

Fieldname Type Description
grid STRING WRF Grid ID Number
xy GEOGRAPHY Centroid of the grid (PostGIS point) )
geohash String PostGIS geohash string approximating grid boundary
geometry STRING Well known text format of the grid boundary

An example query to return the Grid ID covering the Seattle Center:

WITH
  SeattleCenter AS (
  SELECT
    ST_geogpoint(-122.35,
      47.62) AS location)
SELECT
  grid
FROM
  `tnc-data-v1.gfdl.geometry`
WHERE
  ST_DWITHIN(ST_GEOGFROMTEXT(geometry),
    (
    SELECT
      location
    FROM
      SeattleCenter),
    0)

Returns the grid ID pertaining to this location:

Row|grid 1|ID16_V9