2014 FOSS4G

Notes from the conference talks I attended. Videos of the talks are linked from the titles. "Hi Peter!"


Wednesday

Keynote: Mike Bostock, The New York Times

"The Toolmakers Guide"
  • Mike Bostock is known for his work at The New York Times, D3.js, and TopoJSON
  • People are pushed to use defaults in a tool even if changes are possible.
  • The RGB color space does not give an equivalent change in perception for an equivalent change in each color channel's intensity. RGB was the default color space in D3, which now provides perceptual alternatives: "HCL", "ColorBrewer", and "Cubehelix".
  • Important! Provide examples that use the best practices even for things that are not the point of the example.
  • Map Projection Bias:
    • Usually thought of as a point transformation.
    • However, this ignores cutting the sphere.
    • The projection system needs to be a generic geometry transformation.
    • Mercator-projected data, even in lat/lon coordinates, is not truly spherical data. Straight lines in Mercator are great circles. But transformations to other projections rarely transform those great circles correctly.
  • Solve the smallest possible interesting problem.
  • People don't use tools in the way the designer intended. Design them to fail obviously and usefully.
  • Standards based tools reduce the "viscosity" of switching to different tools.
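
A minimal sketch of the RGB point from the keynote notes above, using only the Python standard library. It compares how much relative luminance changes when each channel is raised by the same amount. Relative luminance (with the standard sRGB channel weights) is only a rough proxy for perceived brightness, and the function names and the starting gray value are my own choices for illustration, not anything shown in the talk.

```python
def srgb_to_linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel value to linear light (0.0 to 1.0)."""
    c = channel / 255.0
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(r: int, g: int, b: int) -> float:
    """Relative luminance using the standard sRGB channel weights."""
    return (0.2126 * srgb_to_linear(r)
            + 0.7152 * srgb_to_linear(g)
            + 0.0722 * srgb_to_linear(b))


def luminance_step(channel_index: int, start: int = 100, step: int = 100) -> float:
    """Luminance change when one channel rises by `step` from a gray of `start`."""
    before = [start, start, start]
    after = list(before)
    after[channel_index] += step
    return relative_luminance(*after) - relative_luminance(*before)


if __name__ == "__main__":
    # The same +100 step gives a different luminance change in each channel.
    for name, index in (("red", 0), ("green", 1), ("blue", 2)):
        print(f"+100 in {name}: luminance change {luminance_step(index):.4f}")
```

The green step changes luminance the most and the blue step the least, which is one reason a ramp that is linear in RGB does not look evenly spaced.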

Session 1

Track 1: Fiona and Rasterio: Data Access for Python Programmers and Future Python Programmers - Sean Gillies, Mapbox

  • Interfaces to OGR and GDAL
  • Native GeoJSON speakers
  • Command line oriented
  • Developed for use in producing cloudless imagery mosaics
  • Replaces the older GDAL Python bindings (osgeo)
  • Ships data as simple objects with no built-in methods. Use Shapely to get geometry methods.
  • Rasterio is for the raster data.
  • Fiona is for the vector data.
  • rio is a command line program using the rasterio module to replace gdalinfo.
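
To illustrate the "simple object with no methods" point in the notes above, here is a small standard-library sketch. The record is written by hand in a GeoJSON-like shape with made-up values; it is not produced by Fiona, and the bounds helper is a hypothetical stand-in for the kind of behavior a geometry library such as Shapely would provide.

```python
import json

# Hypothetical record, written by hand in a GeoJSON-like shape.
feature = {
    "type": "Feature",
    "id": "0",
    "properties": {"name": "example trail"},
    "geometry": {
        "type": "LineString",
        "coordinates": [[-122.68, 45.52], [-122.66, 45.53], [-122.65, 45.51]],
    },
}


def bounds(record: dict) -> tuple:
    """Return (min_x, min_y, max_x, max_y) for a LineString record."""
    coords = record["geometry"]["coordinates"]
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


if __name__ == "__main__":
    # Behavior lives in plain functions, not in methods on the data.
    print(bounds(feature))
    # Because the record is plain data, it serializes to GeoJSON directly.
    print(json.dumps(feature))
```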

Track 8: Government as a Contributing Member of the OpenStreetMap (OSM) Community - Bibiana McHugh, TriMet

  • Why OSM? Seamless, routable, affordable, community based.
  • Commercial data sets not affordable.
  • Existing municipal data sets not routable.
    • Single line
    • No turn restrictions
  • Uses Open Trip Planner (OTP) to route on OSM, elevation, and another dataset.
  • TriMet uses OSM with Computer Aided Dispatch
  • Started improving OSM in 2011 with college students (PSU)
  • Consider the public benefit and open data policy
  • Spider (an open source tool) compares OSM with Regional Land Information System (RLIS) and other area shape files.
  • Presentation created using Sozi & Inkscape
License issues relevant to King County:
  • Is the King County data distribution policy compatible with the ODbL?
  • The speaker is hoping to spur a donation terms system to help municipal governments participate.

Track 5: The Manager’s Guide to PostGIS - Paul Ramsey

Managers care about:
  • What Is it?
    Database with data types and functions that are spatial in nature. Maintains data integrity. Handles large volumes of data. Functionally equivalent to Oracle Spatial.
  • Who Else Uses it?
    Skype (PostgreSQL, not for GIS), Instagram (PostgreSQL), Google, DigitalGlobe, UK Ordnance Survey (500 million records), FAA, Pierce County WA
  • Large agencies tend to start adopting it for the web and then work it into the enterprise.
  • How to do a migration:
    • Two databases run for a period of time
    • Test interoperability
    • Hands-on opportunity for staff
    • "Upgrade" the staff. Good staff want to learn.
    • "Augment" Staff with an existing expert
    • Analysts and developers have more access to data and functions
    • Support: Buy it from companies. Boundless (the speaker's company) supports it through the OpenGeo Suite.
    • QGIS plugins can provide versioning on top of PostGIS. GeoGig (formerly GeoGit) can also provide versioning

Session 2

Track 1: GeoScript – A Geospatial Swiss Army Knife - Justin Deoliveira

  • Co-Presented by Jared Erickson
  • Spatial capabilities for scripting languages on the Java Virtual Machine (JVM)
  • Groovy, Jython (Python on the JVM), Java Rhino (JavaScript engine on JVM)
  • Similar API across languages
  • Builds on top of GeoTools and JTS Java libraries
  • GeoScript Modules: Plot; Process; Rendering
  • GeoScript Groovy Shell (gsgs) can use these tools
  • GeoServer can use Python scripts using these tools
  • uDig Spatial Tools can use these tools
  • Can embed them in Java applications

Track 5: Managing public data on GitHub: Pay no attention to that git behind the curtain - Landon Reed, Atlanta Regional Commission

  • Considering using OSM but...
    • Need to attach attributes to existing street network
    • Custom Interface
    • Track Changes
  • They didn't know how it was accomplished last time, ten years ago
  • Why use GitHub?
    • GeoJSON.io
    • GitHub Pages = "The Curtain", which is crucial in local government
    • Issue tracking
    • Integrated communications
    • Ability to share with other agencies through forking.
  • Workflow
    • Shape file to GeoJSON: Had to drop features and split data set to fit on GitHub with size restrictions.
    • Users assigned to counties via GitHub teams.
    • Using Leaflet, Mapbox, and the GitHub API to produce a page with a map linked to forms that present GitHub issues.
  • Extract GeoJSON change requests and convert to shapefiles for review
  • Challenges: unfamiliar data; old browsers
  • Considering extending to another data set
  • koop is a server that translates GitHub repository of GeoJSON to shapefiles for ArcMap users to consume.
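
The workflow notes above mention splitting the data set to fit GitHub's size restrictions. A rough standard-library sketch of that idea follows. The function names and the byte limit are hypothetical, and this is not the tooling the speaker used; it just packs features into FeatureCollections that each stay under a chosen encoded size.

```python
import json


def encoded_size(features: list) -> int:
    """Size in bytes of a FeatureCollection holding `features`, as UTF-8 JSON."""
    text = json.dumps({"type": "FeatureCollection", "features": features})
    return len(text.encode("utf-8"))


def split_feature_collection(collection: dict, max_bytes: int) -> list:
    """Split a FeatureCollection into parts that stay under max_bytes.

    Feature order is preserved. A single feature larger than max_bytes
    still gets its own part. Re-encoding on every step is slow but simple.
    """
    parts = []
    current = []
    for feature in collection["features"]:
        candidate = current + [feature]
        if current and encoded_size(candidate) > max_bytes:
            parts.append({"type": "FeatureCollection", "features": current})
            current = [feature]
        else:
            current = candidate
    if current:
        parts.append({"type": "FeatureCollection", "features": current})
    return parts


if __name__ == "__main__":
    # Made-up point features, only to exercise the splitter.
    sample = {
        "type": "FeatureCollection",
        "features": [
            {"type": "Feature", "properties": {"n": i},
             "geometry": {"type": "Point", "coordinates": [i, i]}}
            for i in range(10)
        ],
    }
    for part in split_feature_collection(sample, max_bytes=300):
        print(len(part["features"]), "features,", encoded_size(part["features"]), "bytes")
```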

Track 5: Building Open Source Projects in Government ESRI Ecosystems - Lyzi Diamond, Code for America

Slides available as PDF
  • Open Source web applications
  • Requires work from government agencies and external web developer
  • Code for America - constituent facing applications
  • "Civic Technologists"
  • Dealing with GIS data, ArcGIS Server (AGS) or ArcGIS Online, available on web
  • Sometimes the data needed is legally protected, or restricted, or hoarded institutionally.
  • Government exposes a REST service to the public
  • Extract Transform and Load (ETL): Usually transform to a GeoJSON
  • opentraildata.org is an example of this process. It requires jurisdictions uploading shapefiles.
  • 5 star classified open data rating
  • Got a good response to my question about whether the ESRI Open Data web interface was authentically useful to open data activists. They said that it wasn't that is was.
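
As a sketch of the "transform to GeoJSON" step in the ETL notes above, the function below converts point features from an Esri-style JSON feature set (each feature carrying "attributes" and an x/y "geometry") into a GeoJSON FeatureCollection. It handles points only, assumes the coordinates are already longitude/latitude, and uses made-up sample data. The function name is hypothetical, and this is not the code used by any project mentioned in the talk.

```python
def esri_points_to_geojson(esri_feature_set: dict) -> dict:
    """Convert Esri-style JSON point features to a GeoJSON FeatureCollection.

    Features without an x/y point geometry are skipped.
    """
    features = []
    for item in esri_feature_set.get("features", []):
        geometry = item.get("geometry")
        if not geometry or "x" not in geometry or "y" not in geometry:
            continue  # not a point, or no geometry at all
        features.append({
            "type": "Feature",
            "properties": dict(item.get("attributes", {})),
            "geometry": {
                "type": "Point",
                "coordinates": [geometry["x"], geometry["y"]],
            },
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    # Made-up response body in the shape a REST query might return.
    sample = {
        "geometryType": "esriGeometryPoint",
        "features": [
            {"attributes": {"NAME": "Trailhead A"},
             "geometry": {"x": -122.68, "y": 45.52}},
            {"attributes": {"NAME": "No geometry"}},
        ],
    }
    print(esri_points_to_geojson(sample))
```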

Session 3

Invited Talk: Making Space for Diverse Mappers - Alyssa Wright, MapZen

  • A call to make the open geography community more diverse and welcoming of diverse mappers.
  • Identify awkward moments and know that you are experiencing diversity. Do not try to avoid awkward moments.
  • Who is weird? Everyone
  • Admonition: "Get your head out of your code!"

Track 2: Client-side versus server-side geoprocessing: Benchmarking the performance of web browsers processing geospatial data using common GIS operations. - Erin Hamilton, University of Wisconsin-Madison

  • Testing suite based on JTS
  • Tested randomized data sets sized up to 3.8MB
  • Tested multiple browsers and multiple OSes
  • Machines had different specs, but results were normalized using Geekbench results
  • Results:
    • IE and Safari were slowest browsers on client side
    • Firefox could process 30,000+ vertices, no other browsers could
    • Server at least an order of magnitude faster than any browser
    • With a 1 second wait tolerance, could process ~1k vertices
    • With a 10 second wait tolerance, could process ~10k vertices
  • Conclusion: Server was faster. The server was a "medium" Amazon server instance.
  • Testing was with only 1 simultaneous user
  • Details in master's thesis
  • Several developers in the audience were skeptical that results could be so poor in the client-side browsers. They were likely the people writing the JavaScript tools that were being tested. The speaker's defense was that the analysis being tested was implemented using the online documentation currently available for these tools.

Thursday

Keynote: Sarah Novotny, NGINX

  • Open source software needs to be released under an open license. Having no license creates uncertainty and risk for the people who use it.
  • GitHub initiative - choosealicense.com
  • Humanity, Respect, Trust (HRT), pronounced "heart", is a guiding concept for forming an open source community. Read about it in a book called Team Geek.
  • Free and open source software is "free like a puppy"

Session 1

Track 7: GRASS GIS 7: your reliable geospatial number cruncher - Markus Neteler, Fondazione Edmund Mach

GRASS 7 features:
  • Works well with massive amounts of raster data
  • Can combine a histogram into the legend
  • Geospatial Modeler is still in development. Will be able to produce a python script from the model.
  • Vector data is topological and GRASS provides a topological vector digitizer.
  • Much faster than 6
  • Not yet the stable version
  • Provides network analysis capabilities
  • New tools for hydrology
  • New Python API: PyGRASS (the grass.pygrass package)
  • Can integrate as a processing extension in QGIS which allows the use of QGIS to get data from PostGIS and process with GRASS tools
  • Integrates with R
  • Web Processing Service (WPS)
  • Image Rectification
  • New space-time framework for raster data
  • 3D animation tools

Track 4: Next Generation of Printed Maps - Jesse Eichar, Camptocamp SA

Mapfish Print Version 3 is a Web based Java application that can print Reports including maps.
  • Web site
  • Not clear if you can print out at a specified scale on paper.
  • Jasper Reports is used to create the reports
  • Jasper Studio can be used to create templates that control the formatting of the reports
  • Multiple Maps, tables, and/or charts per page of report
  • Data Sources: Web, Files, Database, Vector, Raster
  • Web service is scalable to server clusters

Track 6: Mapping Words and Phrases from Geographic Knowledge on the Web - Benjamin Adams, Centre for eResearch, The University of Auckland

Session 2

Track 7: From Nottingham to PDX: QGIS 2014 roundup - Pirmin Kalberer, Sourcepole AG

Cool new features:
  • Inner stroke on Polygon
  • Inverted Renderer
  • Shape Burst
  • Categorization based on expressions
  • Rule based renderer
  • Print Composer contains a color blindness simulation preview
  • NOT Curves. They are expected by version 2.8

Track 5: Oregon Metro’s combination of FOSS4G with enterprise in web app development - Ben Sainsbury, Oregon Metro

"The right thing for the right stuff"
  • Suggests publishing open data with an explicit Open Database License (ODbL) to promote uptake of your data into OpenStreetMap (OSM)
  • Oregon Metro serves a similar function to the Puget Sound Regional Council (PSRC)
  • Using: OGC, GDAL, TileMill, Node.js, and Leaflet
  • Switched from the ESRI binary data type to the ST_Geometry data type in the Oracle database used with ESRI SDE
  • Seen as the first step towards migrating to a PostgreSQL/PostGIS enterprise database
  • Serves Mapbox tiles with an ASP.NET service
  • Uses an ESRI JSON format and a TileJSON format
  • TileMill has superior cartographic quality to ArcGIS Server
  • Mentioned Coalition for a Livable Future 2.0 which seems relevant to King County ESJI work.
  • Presented some interesting heat maps

Session 3

Invited Talk: OSGeoLive: An Overview of the best Geospatial Open Source Software - Angelos Tzotsos, NTUA

  • OSGeo-Live distribution: GRASS, gvSIG, uDig, OpenJUMP GIS, Desktop GIS (Kosmo & SAGA)
  • Leaflet is mobile friendly
  • Geomajas - Browser based GIS
  • Mapbender - Geoportal framework
  • Cartaro & GeoNode - Geospatial CMS
  • pycsw - metadata catalog
  • Data stores: PostGIS, SpatiaLite (based on SQLite)
  • Marble - Virtual Globe
  • OpenStreetMap tools: JOSM
  • Viking - GPS Navigation
  • Mapnik - Cartographic Rendering
  • Natural Earth: global data sets

Track 6: Easy ETL with OGR - Pirmin Kalberer, Sourcepole AG

  • High level Extract Transform and Load (ETL): HALE, GeoKettle, Talend
  • Lower level ETL: PostGIS, XSLT, OGR (ogr2ogr)
  • OGR Data Model:
    • Attribute fields
    • Feature ID
    • One geometry field (through v1.10)
    • Multiple geometry fields (newer versions)
  • OGR Virtual format
  • GitHub repository
  • QGIS Plugin without additional downloads
OGR Tools - Python Library
pip install ogrtools
ogr -h
ogr sql example.shp "SELECT * from example_table"
ogr genconfig -format GeoJSON example.shp
ogr transform --config ex.config example.JSON example.shp

Friday

Session 1

Track 2: Server-Side Marker Clustering For Rapid Display of Large Datasets - Eric Ingbar

  • Demonstration Site
  • Backend is MongoDB - provides "awesome" full text search
  • Tech used: Node.js, MongoDB/spatial, AWS S3 and SES
  • Originally used Leaflet with a marker cluster plugin, but it is slow with over 10,000 features because processing is done on the client side.
  • Switched to server-side Leaflet, used to push GeoJSON to MongoDB. Clusters are pre-generated at various zoom scales and cached in MongoDB nightly. Used the $geoWithin query operator.
  • Considering switching to GeoTools for real time clustering
  • PostGIS has a clustering data type that could be useful for this project
  • GeoServer has a feature called point stacker that could also be useful

Track 4: Writing better PostGIS queries - Regina Obe, Paragon Corporation

Consider reading her book, she says to get the 2nd edition. The tips in the presentation seemed excellent.
  • Describing PostgreSQL 9.3 (with hstore) and PostGIS 2.1
  • Using OpenStreetMap data for examples
  • To convert an existing column, use ALTER rather than DROP for performance
  • Don't forget to create spatial indices
  • To determine proximity, ST_Distance is slow because it doesn't utilize spatial indices. ST_DWithin gives better performance because it does use them.
  • To count records, don't query for all the records and then count them. Just count them; it's faster.
  • Create a compound GiST index if you are querying spatial and attribute indices at the same time.
  • Don't use Web Mercator for proximity analysis because of distance distortion. You can compute a distance with a cast to geography. You could also buffer in geography, then transform the buffer to Mercator and intersect.
  • You can create a geography based spatial index.
  • N nearest neighbors: use the k-nearest-neighbor (KNN) operators in the ORDER BY clause so the index is used. They compare bounding boxes only, not the detailed geometry.
  • Lateral clause is new. I'm not sure what that is
  • Segmentize a linestring in geography to avoid projection errors.
  • Raster tip: Clip first, then union

Track 3: Open source, open standards and 50 lines of code: A look behind GitHub’s GeoJSON rendering and diffing - Ben Balter, GitHub

  • Mentioned Where-Gov, which sounds interesting
  • Can use GitHub for collaborative mapping
  • geojson-diff - Could this be used to show Urban Growth change over time?
  • geojson-diff takes two GeoJSON files as input, and produces three GeoJSON files as output: added, removed, and unchanged
  • Presentation is here.
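
To make the added/removed/unchanged idea in the notes above concrete, here is a simplified standard-library sketch. It is not the geojson-diff implementation: it treats two features as the same only when their geometry and properties match exactly, and exact duplicates within one file collapse into one. The function names are hypothetical.

```python
import json


def feature_key(feature: dict) -> str:
    """Canonical string for a feature, built from its geometry and properties."""
    return json.dumps(
        {"geometry": feature.get("geometry"),
         "properties": feature.get("properties")},
        sort_keys=True,
    )


def diff_feature_collections(before: dict, after: dict) -> dict:
    """Return 'added', 'removed', and 'unchanged' FeatureCollections."""
    before_keys = {feature_key(f): f for f in before["features"]}
    after_keys = {feature_key(f): f for f in after["features"]}

    def collection(features):
        return {"type": "FeatureCollection", "features": features}

    return {
        "added": collection(
            [f for k, f in after_keys.items() if k not in before_keys]),
        "removed": collection(
            [f for k, f in before_keys.items() if k not in after_keys]),
        "unchanged": collection(
            [f for k, f in after_keys.items() if k in before_keys]),
    }
```

A feature whose geometry was edited shows up here as one removed feature plus one added feature, since nothing in this sketch matches features by ID.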

Session 2

Track 2: MapLoom: A New Web-client With Versioned Editing (GeoGit) Integration - Syrus Mesdaghi, LMN Solutions / NSP-Noblis; Tyler Garner, LMN Solutions / NSP-Noblis

  • Developed under the Rogue project and GeoSHAPE
  • GeoSHAPE has to be free and open source data. Each partner will have their own instance of GeoSHAPE
  • Accesses GeoServer
  • Didn't use GeoExplorer because the identify button was a barrier for naive users
  • Can edit features and show all commits with details from GeoGit. Can show a diff for a feature with two or more commits

Track 1: Mending Spatial Data with PostGIS - Leo Hsu, Paragon Corporation

Consider reading his book; he says to get the 2nd edition. He and Regina Obe are a husband-and-wife team, and it is the same book as mentioned above.
  • A union of four contiguous line segments creates a multi-line feature. Use line merge if everything has matching coordinates at the end-points. Use simplify to remove unneeded vertices.
  • ST_Snap function - Does what it sounds like.
  • Correcting undershoots is problematic to automate. ST_ShortestLine is not often what you want, and ST_Snap isn't either
    [Diagrams: ST_ShortestLine; ST_Snap]
  • Suggests that OpenJUMP can be used successfully to mend spatial data.
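
The line merge point in the notes above depends on end-points matching exactly. The standard-library sketch below shows that idea on plain coordinate lists. It is only an illustration with a hypothetical function name, not a reimplementation of the PostGIS line merge: it applies no tolerance and does not treat junctions of three or more lines specially.

```python
def merge_lines(lines):
    """Join lines that share exact end-point coordinates into longer lines.

    Each line is a list of (x, y) tuples. A line may be reversed to make
    the join. Lines with a gap between them are left separate.
    """
    remaining = [list(line) for line in lines if len(line) >= 2]
    merged = []
    while remaining:
        current = remaining.pop(0)
        extended = True
        while extended:
            extended = False
            for i, other in enumerate(remaining):
                if current[-1] == other[0]:
                    current = current + other[1:]
                elif current[-1] == other[-1]:
                    current = current + other[-2::-1]  # append reversed
                elif current[0] == other[-1]:
                    current = other[:-1] + current
                elif current[0] == other[0]:
                    current = other[:0:-1] + current  # prepend reversed
                else:
                    continue
                remaining.pop(i)
                extended = True
                break
        merged.append(current)
    return merged


if __name__ == "__main__":
    # Four contiguous segments, one of them digitized in reverse.
    segments = [
        [(0, 0), (1, 0)],
        [(1, 0), (2, 0)],
        [(3, 0), (2, 0)],
        [(3, 0), (4, 1)],
    ]
    print(merge_lines(segments))
    # An undershoot (a small gap) prevents the merge.
    print(merge_lines([[(0, 0), (1, 0)], [(1.1, 0), (2, 0)]]))
```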

Track 3: Projections in web browsers are terrible and you should be ashamed of yourself - Calvin Metcalf