class: center, middle, title

# Open Source Tools Applied Toward Agro-ecological Insight (and more)

##

---
layout: true

---
class: center, middle

# Jonah Duckles
## @jduckles
## jonah@duckles.org

# February 23, 2018
# AbacusBIO

---
class: middle, center

![](/img/DSC_0195.jpg)

---
class: middle

# About me

* B.S. - Physics
* M.S. - Landscape Ecology (GIS/Remote Sensing)
* Dabbled in Open Source since the early 1990s.
* Have the Open Source ecosystem to thank for being a foundation of my varied careers.
* It all comes together to empower scientifically capable people to master their own workflows.

---
class: center, middle

# Long-term Technology Roadmapping (TRM)

.one-half[
.center[![](/img/TRM.png)]
]

.one-half[
* Long-term technology visioning work
* Linked to business process at large tech companies
]

.citation[
IMG Source [Roadmapping for strategy and innovation](https://www.ifm.eng.cam.ac.uk/uploads/Research/CTM/Roadmapping/roadmapping_overview.pdf)
]

---

# Master's Research

.one-half[
![](/img/muskegon.png)
]

.one-half[
* GIS & Remote Sensing
* Land Use Change Model - LTM
  * Artificial Neural Networks (ANNs)
  * Trained on historical changes in Urban, Forest, Agriculture extent.
* Policy Scenarios
  * Environmental response to applied policies
]

---

# Backcasting

![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/Screen-Shot-2018-02-20-14-35-24.png)

Using historical tabular data (by county), synthesize a spatial map.
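
A minimal sketch of the idea, not the method used in the thesis: split each county's reported total across its grid cells in proportion to a per-cell weight. The county names, cell ids, and weights below are hypothetical placeholders; the slides do not say how weights were derived.

```python
from collections import defaultdict


def backcast(county_totals, cells):
    """Spread county-level totals over grid cells in proportion to cell weights.

    county_totals: {county: reported total, e.g. acres in agriculture}
    cells: list of (cell_id, county, weight); weights are hypothetical
           suitability scores, not values from the original analysis.
    """
    # Sum the weights per county so each cell can take its share.
    weight_sum = defaultdict(float)
    for _, county, weight in cells:
        weight_sum[county] += weight

    surface = {}
    for cell_id, county, weight in cells:
        total = county_totals.get(county, 0.0)
        share = weight / weight_sum[county] if weight_sum[county] else 0.0
        surface[cell_id] = total * share
    return surface


# Illustrative inputs only.
cells = [("a1", "county_A", 3.0), ("a2", "county_A", 1.0), ("b1", "county_B", 2.0)]
surface = backcast({"county_A": 400.0, "county_B": 150.0}, cells)
print(surface)
```

The cell values within a county always sum back to the tabular total, so the map stays consistent with the historical record.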
---
class: middle, center

![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/Screen-Shot-2018-02-22-10-27-59.png)

---
class: middle, center

.center[![](/img/MS_Thesis.png)]

---
class: middle, center

.center[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/Screen-Shot-2018-02-20-14-30-05.png)]

---
class: middle, center

.center[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/Screen-Shot-2018-02-20-14-30-56.png)]

---

# Sub watersheds

.center[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/Screen-Shot-2018-02-20-14-32-24.png)]

---

# % Rainfall Runoff under given policy

.center[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/Screen-Shot-2018-02-20-14-34-29.png)]

---
class: middle, center

# Modeling Corn/Soy/Wheat Production Systems

---
class: middle

# Making estimates of total production

.one-half[
![](/img/corn-planting.jpg)
![](/img/harvest.jpg)
]

.one-half[
Getting reasonable estimates of total agricultural production requires:

* Estimating total acreage planted
  * split by crop
* Modeling the upper and lower bounds of yield
* Building a spatially explicit yield estimate
* Adjusting model/estimates to statistical reporting
]

---
class: middle, center

# Production

ACRES x YIELD = PRODUCTION

1 acre ≈ 0.4 hectares

---
class: middle

# Acres

.one-half[
![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/Screen-Shot-2018-02-21-13-53-17.png)
]

.one-half[
## Assumptions

* Most acres are planted every year
* Rotations are **usually** predictable
* We can "focus" medium-resolution data using a high-resolution classified map

## Caveats

* Rotations can change under
  * Delayed planting
  * Extreme weather
  * Conservation land exit/entrance
  * Wheat/Fallow rotations
]

---
class: center, middle

# Rotations

![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/Screen-Recording-2018-02-21-14-05-14.gif)

---

# Data sets for acreage

.one-half[![](/img/cs_crp_compare_.png)]

.one-half[
* Moderate resolution satellite time-series data (250m
8-day composites of NDVI)
* High Resolution Classified Map (30-50m Annual classified crop map)
* High Resolution Vegetation Index (30m, not atmospherically corrected)
* Land parcels (sometimes)
* Conservation exits
* Snow depth
* Microwave Soil Moisture
* Soil survey data
]

---
class: middle

.one-half[![](https://github.com/jduckles/fossmodules/raw/gh-pages/GRASS/images/ndvi_sample.png)
.caption[MODIS - 250m resolution]]

.one-half[![](https://github.com/jduckles/fossmodules/raw/gh-pages/GRASS/images/L5_ndvi.png)
.caption[Landsat 5 - 30m resolution]]

---

# Acres (observational)

.one-half[
![](/img/fourstates.png)
]

.one-half[
* At emergence, validate the "planted" assumption
* Crop Progress (observational report)
  * Planted
  * Emerging
  * Etc.
* Confirm/reject planted
  * Is there an increase in vegetation index over time? (planted)
* All acres eventually get planted; we're really looking for the planting date.
* Can build a "planted date" surface to adjust later yield estimates.
]

---

# Acres (modeling)

.one-half[
![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/Screen-Shot-2018-02-22-10-49-10.png)
]

.one-half[
* Ground-truth major producing districts to verify corn/soy/other split.
* 3-5 teams drive 3,000 miles per week, gathering GPS points that record the crop that is emerging
* Automatically translate points from road to fields
* Compare rotations observed to historical practice (5-10 years of classified crop maps)
* Confirm/reject expected rotation vs.
observed rotations
]

---

# Yield

.one-half[
![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/Screen-Shot-2018-02-22-10-30-24.png)
]

.one-half[
* Computationally "grow" an idealized field for each crop at each available weather station over a 20-30 year history
* Adjust yield output from the crop model to the historically (statistically) reported yield for that district
* Run that calibrated district model on this year's weather
* Project good, average, and bad weather to complete the year
]

---
class: middle, center

# Production

ACRES x YIELD = PRODUCTION

1 acre ≈ 0.4 hectares

---
class: middle, center

# Acre ≠ Acre ≠ Acre

# Statistical - Cadastral - Raster

---
class: middle

# What we learned:

.one-half[
![](/img/tractor.jpg)
.citation[[Stuck tractor](http://billingsgazette.com/news/state-and-regional/montana/stuck-tractor/image_f28ff368-a8af-504c-a9e7-e1d89cca1b4d.html)]
]

.one-half[
* Agricultural statistical reporting works fine in "average" years
* Under extreme conditions it fails
* Being "right" can result in 6-12 months of looking "wrong" (due to harvest cycle)
* The statistical universe is adjusted slowly when there are mistakes in it
]

---
class: middle, center

# Informatics Support for Research

---

# Oklahoma - CyberCommons

.one-half[
![](https://github.com/cybercommons/cybercom-docs/blob/master/docs/images/cybercommons.png?raw=true)
]

.one-half[
* Architected and built an open source Service-Oriented Architecture
* Leveraging open source tools:
  * Databases (NoSQL, Geospatial)
  * Pub/Sub Distributed Asynchronous Task Queues
  * Data driven web frameworks
  * Web mapping
* Doing more, for more researchers, with less effort
* Wrapping research code to bring it to the web
]

---
class: center

![](/img/igos.png)

.citation[[EOMF Website](http://www.eomf.ou.edu/aboutus/news)]

---

# Animal Migration

.one-half[
![](/img/pabu.jpg)
]

.one-half[
![](/img/bunting_migration_map.gif)
]

.citation[
[animalmigration.org](http://www.animalmigration.org/bunting/index.htm)
]

---
class: middle, center

# Radar Aeroecology

.one-half[
![](/img/bats.jpg)
]

.one-half[
![](/img/nexrad.jpg)
]

.citation[[Twitter](https://twitter.com/RadarAndStuff/status/963747860662669312)]

---
class: middle, center

# Peer teaching

![inline](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/iStock_95290313_LARGE.jpg)

---
class: middle, center

.full-width[
![](/img/TheCarpentries.svg)
![](/img/SWC_and_DC.png)
]

---

Alphabet soup of tools, all free:

.one-half[
* R
* Python
* MongoDB
* PostgreSQL
* Celery
* RabbitMQ
* BASH
]

.one-half[
* Docker
* Puppet/Chef/Vagrant
* Git
* GRASS
* PostGIS
* QGIS
]

---

# Creating data from data for insight

* Crop modeling
  * Cropland + Vegetation time series = Cropland vegetation state
  * Vegetation State by Reporting District = Actionable intelligence
* Radar Aeroecology
  * Time series of Radar Reflectivity + Filter for "not weather" = bat population estimate
* Phenocam
  * Daily picture of forest, pasture, cropland + image analysis = in-situ vegetation phenology estimate

---

# Some opportunities in NZ

.one-half[![](https://jduckles-dropshare.s3-us-west-2.amazonaws.com/Screen-Shot-2018-02-23-09-20-26.png)]

.one-half[
* Land Parcels
* Forage vegetation condition compared to historical time series
* Timber age mapping
* Titles List
* Change detection on Titles List
* Informing land transactions and due-diligence
* Identifying land/parcels with particular geographic characteristics
]

---

# Golden Age of Open Software

.one-half[![](/img/lego.jpg)]

.one-half[There is almost nothing in Geospatial or Statistical analysis today that requires a proprietary license to conduct.]
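
---

# Sketch: ACRES x YIELD in plain Python

As one illustration that free tools are enough, here is a rough sketch of the ACRES x YIELD = PRODUCTION arithmetic from earlier. It assumes the district calibration is a simple ratio of mean reported yield to mean modeled yield; the talk does not specify the adjustment actually used, and all numbers and function names are hypothetical.

```python
def calibration_factor(modeled, reported):
    """Ratio of mean reported yield to mean modeled yield over the history.

    Assumption: a single multiplicative factor per district. The real
    adjustment may have been more elaborate.
    """
    mean_reported = sum(reported) / len(reported)
    mean_modeled = sum(modeled) / len(modeled)
    return mean_reported / mean_modeled


def production(acres, modeled_yield, factor):
    """ACRES x YIELD = PRODUCTION, with crop-model yield scaled to the district."""
    return acres * modeled_yield * factor


# Hypothetical district history, in bushels per acre.
modeled_history = [200.0, 180.0, 220.0]
reported_history = [160.0, 150.0, 170.0]
factor = calibration_factor(modeled_history, reported_history)

# This year's planted acres and crop-model yield (also made up).
estimate = production(acres=1000.0, modeled_yield=210.0, factor=factor)
print(round(factor, 2), round(estimate))
```

Running the same function with good, average, and bad weather yields would give the range of outcomes for the rest of the season.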
---

All tools are available in a Windows, Mac, Linux or Cloud environment

* [GRASS GIS](https://grass.osgeo.org/) - best as a raster GIS
* [PostGIS](https://postgis.net/) - PostgreSQL extensions for vector GIS analysis
* [QGIS](https://qgis.org/en/site/) - Overlays/quick exploratory and visual analysis
* [GDAL](http://www.gdal.org/) - manipulation of vector and raster data (re-project, re-sample etc.)
* [R](https://www.r-project.org/) - Visualization, statistical modeling, static mapping
* Mapbox - slippy web maps
* [Python](https://python.org) - great geospatial scripting environment and packages

---
class: middle, center

# Small tools + Glue (scripts)

---
class: middle, center

# Major Computational Skill Areas of the Data Driven Analyst

---
class: middle, center

# Syntax / Computational Thinking
## Python, R, Shell

---
class: middle, center

# Data persistence/access methods
## filesystems, SQL, bucket stores, NoSQL

---
class: middle, center

# Collaboration
## Git, GitHub, GitLab

---
class: middle, center

# Visualization
## Exploratory data analysis, Shiny, Never Surrender

---
class: middle, center

# Reporting
## Literate programming, document automation, templating

---
class: middle, center

# Geospatial **(optional)**
## Raster/Vector Analysis Methods

---
class: middle, center

# What are the Data/GIS/Analysis challenges at AbacusBIO?

---
class: middle, center

# Thank you!
## @jduckles
## jonah@duckles.org