In the early days of my dissertation, I was collecting Twitter data but had no idea how to store, manage, or query these data. First, I tried PostgreSQL with PostGIS. I had been wanting to learn PostGIS anyway, not to mention a suitable alternative to MS SQL Server. After learning PostGIS’s intricacies, I soon found it was unsuitable for my needs. After some searching, I stumbled upon MongoDB.
So I started using Elasticsearch to handle my data. After many painstaking attempts to index geographic data in Elasticsearch, I was becoming concerned that I’d spend the next several years trying a never-ending list of databases, learning new skills, drowning in the digital sea….. and never complete my dissertation. Thankfully, with the help of MapButcher’s tutorial, I was finally able to index my data and perform spatial queries in a reasonable amount of time.
My search for a suitable database stopped after three. This process was incredibly valuable in that I learned many new skills and created some useful tools for myself, but ten months had elapsed before I found a suitable workflow. I thought, What about researchers who don’t have ten months to spare learning the intricacies of multiple databases? I now had three different workflows for working with spatially enabled databases, but each was somewhat clunky. Since I knew I’d be working with Elasticsearch for quite some time, it was in my best interest to create something more efficient. If I could make a generalizable tool, others could benefit from it and avoid some of the mishaps I encountered along the way. Combine these elements with my need to present something at the AAG Annual Meeting (and save dissertation article presentations for later), and the stars had wonderfully aligned. I decided to create a software tool to aid geographers in inserting/indexing spatial data.
There really was no need to create a shapefile to PostGIS tool, since it
shp2pgsql. What I needed (and sought to create) was the NoSQL
equivalent, which I call
shp2nosql. Currently it works with Elaticsearch and
MongoDB, but I’d love others to contribute and make the tool more robust,
suggest new features, and possibly support more databases. Technical details can
be found on my GitHub page here, and
the presentation I delivered at the AAG Annual Meeting can be