Hold the Mayo: Yeah, a Sandwich Map

Nothing worse than being in a place where you know there are zillions of places to shop and eat, but you really don't know where anything is. Around the next corner? Street level at the next subway stop? Somewhere right in front of you? Tired of being lost in this regard, NYC sandwich guru and data scientist Dan DeWitz took matters into his own hands and came up with his own solution for the next time he has that certain palate craving.

Background

Held this past March 22-30, Open Data Week is an annual festival of community-driven events organized by the NYC Open Data Team at the Office of Technology and Innovation (OTI), BetaNYC, and Data Through Design. It is held each March to coincide with the anniversary of the City’s first open data law, which was signed in March 2012. One of the highlights of Open Data Week is the NYC School of Data, which includes a myriad of workshops, panels, and demonstrations intended to demystify the policies and practices around data and technology in the City. This year’s program included over 30 presentations. At its foundation, it is an immersive week of users, open data, open software, and open development discussions.

Not all of the week’s presentations are specific to GIS technology, as topics such as law, journalism, government budget transparency, data analytics, career development, even data comics, and much more are covered in the agenda. And as Open Data Week has evolved over the years, there’s also always been a healthy dose of the traditional GIS and mapping concepts which the geospatial community is actively engaged in. There is always space for the unexpected mapping product and presentation, too. Something a little out-of-the-box and out of the norm. This year was no different.

Yeah, a Sandwich Map.

As part of the Data Week’s events, Dan DeWitz led an online, interactive, hour-long workshop showing attendees how he built the sandwich map from scratch. Working from his Jupyter Notebook, DeWitz illustrated how he used ChatGPT as a coding partner, along with Python code, to scrape a list of sandwich names – and associated information – from a website and create an interactive sandwich map using the Google Maps API and GitHub Pages.

New York Times Article

Central to the map was a New York Times article published February 18, 2024. Authored by Nikita Richardson, the article was based on three months of work by New York Times Food staff, who covered all five boroughs in search of heroes, bodega staples, breakfast specialties, veggie entrees, and much more. The intent wasn’t necessarily to rank the sandwiches but rather to illustrate the culinary diversity across the New York City urban landscape. For mappers who are always worrying about icons and how to symbolize our work, the images depicting the sandwich categories are a treat, too, and DeWitz was able to scrape and use them in his web map as well.

Nine sandwich types were identified in the article: Breakfast Bangers, Hero Worship, Veg In, Pastrami City, Gotham Greats, “Let Me Get, Uhh..”, Diner Party, Honorary New Yorkers, and Extremely Online. Something for everyone!

Scraping and Geocoding

DeWitz’s initial dialog with ChatGPT to start building and ultimately generating the sandwich map is shown in the image below. He initially intended to parse each of the variables shown – sandwich and restaurant names, addresses, descriptions, and images – from the Times website, but determined that large language models (LLMs) like ChatGPT still can’t parse websites just from a URL. DeWitz notes, “I used Python to  scrape data from a page and guiding the model through each step: identifying the structure of the site, selecting relevant HTML elements, handling errors, formatting results, and mapping the output – all while applying judgment about what matters and why”.

DeWitz’s initial dialog with ChatGPT to start building and ultimately generating the sandwich map
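To make the scraping step a bit more concrete, here is a minimal sketch of pulling sandwich records out of a page with Python's standard-library HTML parser. It is not DeWitz's code: the class names (`sandwich-name`, `restaurant-name`, `address`), the sample markup, and the `parse_sandwiches` helper are all hypothetical stand-ins, since the actual structure of the Times page is not described here.

```python
from html.parser import HTMLParser

# Hypothetical CSS class names mapped to output field names.
# The real article markup will differ; inspect the page to find the right ones.
FIELDS = {
    "sandwich-name": "sandwich",
    "restaurant-name": "restaurant",
    "address": "address",
}


class SandwichParser(HTMLParser):
    """Collect one record per sandwich from elements with known classes."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._field = None  # field waiting for its text, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            if cls in FIELDS:
                self._field = FIELDS[cls]
                if self._field == "sandwich":
                    # A sandwich name starts a new record.
                    self.records.append({})

    def handle_data(self, data):
        text = data.strip()
        if self._field and text and self.records:
            self.records[-1][self._field] = text
            self._field = None

    def handle_endtag(self, tag):
        # Stop waiting for text once the element closes.
        self._field = None


def parse_sandwiches(html):
    """Return a list of dicts, one per sandwich found in the HTML."""
    parser = SandwichParser()
    parser.feed(html)
    return parser.records


# Made-up markup for illustration only.
SAMPLE_HTML = """
<div class="entry">
  <h3 class="sandwich-name">Example Hero</h3>
  <p class="restaurant-name">Example Deli</p>
  <p class="address">135 India Street (Manhattan Avenue)</p>
</div>
"""

records = parse_sandwiches(SAMPLE_HTML)
print(records)
```

In practice the page would be downloaded first and the class names adjusted after inspecting its HTML, which is the "identifying the structure of the site" step DeWitz describes.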

For geocoding, DeWitz used the Google Maps API, and ChatGPT handled writing most of the Python code, much of which was the focus of his Data Week presentation. DeWitz felt the Google Maps API did a good job with the addresses, particularly given that the restaurant addresses, as presented in the Times article, included only a cross street and no city name or zip code. For example, just passing the text (address) below, which was scraped from the article, to the Google Maps API was good enough for most of the shops, and the API returned the correct lat/long:

135 India Street (Manhattan Avenue)

For the remaining shops that did not match, he queried the API by restaurant name and then manually looked at the results and filtered as needed.  The web map base map also comes from the Google Maps API.
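The address-first, name-second approach can be sketched roughly as follows. This is an illustration under stated assumptions rather than DeWitz's actual code: it assumes the Google Geocoding API's JSON endpoint, it assumes the restaurant-name fallback goes through that same endpoint (the source does not say which Google Maps service he used for name queries), and the helper names and the `fetch` parameter are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(query, api_key):
    """Build a Geocoding API request URL for a free-text query."""
    return f"{GEOCODE_URL}?{urlencode({'address': query, 'key': api_key})}"


def extract_lat_lng(payload):
    """Return (lat, lng) from a geocoding response, or None if no match."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


def fetch_json(url):
    """Download a URL and decode its JSON body (makes a network call)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def geocode(address, restaurant_name, api_key, fetch=fetch_json):
    """Try the scraped address first, then fall back to the restaurant name.

    Returns (coords, query_used). A match found by name should still be
    reviewed by hand, as described above.
    """
    for query in (address, restaurant_name):
        coords = extract_lat_lng(fetch(build_geocode_url(query, api_key)))
        if coords is not None:
            return coords, query
    return None, None


# Building the request for the scraped address; no network call is made here.
url = build_geocode_url("135 India Street (Manhattan Avenue)", "YOUR_API_KEY")
print(url)
```

Passing `fetch` as a parameter is a design choice for this sketch only: it lets the lookup logic be exercised with canned responses instead of live API calls and a real key.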

In the Jupyter Notebook, the interface is made up of individual cells, some containing text and others containing code that can be executed. The blocks with a tinted blue background are the ones where users write and execute Python code. This code sometimes comes from AI tools such as chatgpt.com or the ChatGPT assistant which is integrated into the Jupyter Notebook. All of the major steps and code DeWitz used in the development process are available on his GitHub page.

The End Product

For those wanting to take a deeper dive into the process and product, a  recording of the actual Data Week online presentation can be found at this YouTube link.   And of course, here’s the online Iconic NYC Sandwiches interactive map.

Interesting to note in context of determining that he actually accomplished what he set out to do, DeWitz feels the process had a built in quality assurance test. “Since I’m making a map, the output itself is the quality assurance test— especially that now I can actually use the map to go to restaurants and shops.”

It certainly appears that the QA passed.

Contact:

Dan DeWitz
dewitz.dan@gmail.com

10 Questions: ChatGPT for New York State GIS

I thought I was going to get out in front of all the ChatGPT for GIS discussion for the New York State GIS community, but that turned out to be off by a magnitude of sorts. YouTube is already replete with ChatGPT for GIS tutorials, and if we wait until next week there will probably be another ChatGPT alternative. (I found the Matt Forrest YouTube video a good ChatGPT for GIS intro, and actually formed a question after one of his examples.) Even Amazon jumped into the fray recently with their Amazon Bedrock option. Take your pick: Bing Chat, Chatsonic, YouChat, Google Bard (available via a waiting list), and a whole host of others on the market in 2023.

No, the robots aren’t taking over and I suspect we won’t see any of the early Terminator Cyberdyne Systems androids applying for your local GIS Analyst position anytime soon – though it might be a solution for those long dreadful digitizing jobs in the cart lab.  One way or another, Chatbot AI technology is definitely here to stay, evolve, and become another piece of the geospatial toolbox.  

Nothing specific or implied in any of the questions. Just random chains of thought, though it does take a little getting used to putting the questions together vs. a basic Google search. And it does seem to help to refresh and ask the same question twice – or even three times – to get more detailed results. Never was a programmer, so a little short on questions in that space.

A lot here.  Actually 17 questions in all.  Straight copy and paste. Skim through as you want – find a topic/question you might be interested in.  There are tons more which could have been included.

Without further ado, my first machine interview:

eSpatiallyNewYork:  What freshwater fish datasets are best for mapping fish population based on water depth in the Great Lakes?

ChatGPT:  There are several freshwater fish datasets that can be useful for mapping fish populations based on water depth in the Great Lakes. Here are a few options:

  1. Great Lakes Acoustic Telemetry Observation System (GLATOS) – This dataset provides information on the movements and behaviors of tagged fish in the Great Lakes. It includes data on depth, temperature, and location, which can be used to create maps of fish populations based on water depth.
  2. Great Lakes Fishery Commission (GLFC) – The GLFC maintains a database of fish populations in the Great Lakes. This dataset includes information on fish species, size, and location, which can be used to map fish populations based on water depth.
  3. Great Lakes Environmental Database (GLENDA) – GLENDA is a comprehensive database of environmental data for the Great Lakes region. It includes information on water depth, temperature, and environmental factors that can be used to map fish populations.
  4. USGS Great Lakes Science Center – The USGS Great Lakes Science Center conducts research on fish populations in the Great Lakes and maintains a database of fish surveys. This dataset includes information on fish species, abundance, and location, which can be used to map fish populations based on water depth.

Ultimately, the best freshwater fish dataset for mapping fish populations based on water depth in the Great Lakes will depend on the specific research question and the type of analysis being conducted.

eSpatiallyNewYork:  To what degree of accuracy can I map features from New York State orthophotography?

ChatGPT:  The degree of accuracy with which you can map features from New York State orthophotography depends on various factors such as the resolution of the imagery, the accuracy of the ground control points used for georeferencing, and the skill and expertise of the analyst.

New York State orthophotography typically has a resolution ranging from 0.5 to 1 meter, which means that individual features such as buildings, roads, and other infrastructure can be identified and mapped with a reasonable level of accuracy. However, mapping smaller features such as individual trees, shrubs, or small structures may be more challenging, as these features may not be visible at this resolution.

The accuracy of mapping also depends on the quality of the georeferencing. The accuracy of the ground control points used for georeferencing should be of a similar or higher resolution than the orthophotography itself. This ensures that the imagery is correctly aligned with the ground, allowing for accurate measurements and mapping.

The skill and expertise of the analyst are also critical factors in the accuracy of mapping features from orthophotography. A trained analyst with experience in image interpretation and feature extraction techniques can accurately identify and map features with a high degree of accuracy.

Overall, mapping features from New York State orthophotography can provide a reasonable degree of accuracy for larger features such as buildings, roads, and other infrastructure. However, mapping smaller features may require higher resolution imagery or other sources of data. It is always important to validate and verify the accuracy of the mapping results through ground truthing and other quality control measures.
