I had the opportunity to attend the 2019 Spatial Data Science Conference (SDSC) at Columbia University on October 16th. Hosted by NYC-based Carto, the event was attended by over 500 people from government, nonprofits, industry and business, and academia. The day-long conference was highlighted by a variety of presentations and lightning talks from data scientists and program managers representing organizations from around the world including Uber, Airbnb, Datarobot, Waze, Instacart, MIT, Sidewalk Labs, Two Sigma, and Facebook among others. All of the presenters use Carto’s Software as a Service (SaaS) platform which provides GIS functionality, web mapping, and spatial data science tools.
Data Science
Founded in 2017, SDSC brings together organizations that are pushing the boundaries of spatial data modelling, ranging from large enterprises to cities and government, as well as thought leaders from academic institutions. Across the geospatial applications and organizational project initiatives on display, there was no doubt about the common denominator and focus of those attending: data. It was definitely worth the trip. SDSC is similar to, but still a very different kind of, “geospatial conference” for those of us who have spent a career running in traditional government geospatial circles. Attendees and presenters are largely a completely different make-up from those normally attending the annual New York State GIS conferences (of the 500+ preconference registrants, nearly 55% were from the private sector). The day’s event included high-quality presentations on the latest in modelling techniques, data science and analytics, visualization practices, and new data streams. This latter issue is an increasingly important and interesting one across the statewide geospatial community, as the day’s conversation clearly illustrated the growing number of both online geospatial data sources (for example, numerous references were made to the U.S. Census Bureau American Community Survey) and related data access tools. Speakers noted both the importance and benefits of open data portals, though not without the caveat that it is not uncommon to have to clean and often normalize the data before using it in applications. Numerous references were made to popular sites such as GitHub, Leaflet, OpenStreetMap, and Elastic (which presented at the event) with regard to supporting the open source ecosystem.
Data for the Social Good
Another common theme throughout the day was the use of geospatial data for the social good. It was interesting to hear the point being made not only by nonprofits and academia, but by the private sector as well. A long way from the normal Albany GIS crowd, Stuart Lynn gave a presentation focusing on how Two Sigma, an investment management firm located in New York City, provides spatial analysis support through the company’s Data Clinic program to nonprofits, academic institutions, and government organizations. Their focus: enabling and promoting social impact through data-driven predictive models while funding breakthrough research, education, and a wide range of charities and foundations. The image below identifies some of their recent projects. Great stuff, and I’m already planning a more in-depth article with Stuart in the future.
Stuart’s talk, as well as others during the day, made reference to the Data for Good movement, a social media movement first started by DataKind to highlight how data science could be used to help address a range of humanitarian issues. DataKind is a collaborative network of organizations that work together with data scientists to leverage the science of data for social impact. DataKind’s afternoon presentation, “Spatial Data Science for Social Good: Improving Access to Dignified Sanitation in Haiti,” was excellent. Videos of this session and all of the day’s presentations are now available online at the Spatial Data Science Conference website.
Arguably, the demand for “data scientists” will increasingly intersect and expand in government geospatial programs in areas such as health and human services, climate change, public safety, sustainability and resiliency, and social and environmental justice, to name only a few. As the universe of geospatial data continues to expand and is integrated with new artificial intelligence (AI) and machine reading technologies, and combined with the availability of more powerful GIS software, it is easy to see where the demand for data scientists focusing on locational and predictive analytics is headed.
I asked Javier de la Torre, Carto founder and SDSC organizer, what his impressions were of the day’s presentations and content. He replied:
“First, moving the data discussion from WHERE to Why. Time to move to analyzing data using maps opposed to just seeing data in maps. Second, the rise of the Spatial Data Scientist and/or where advance GIS is emerging as a new platform. And third, identifying the need for better data marketplaces which provides interactive solutions resulting in increased performance to users”
Links to the 2017 and 2018 presentations are also available on the SDSC website.