Today, I joined GITA, the Geospatial Information & Technology Association, and frankly I’m kind of excited about it because I’d been wanting to do so for some time. I’m hoping that this will be a rich resource for networking with other GIS professionals and satisfying what I have previously dubbed my interest in the Science of Geographic Information Systems. Maybe, if I feel bold, I can squeeze in the October Conference in Houston, although that’s happening at the same time as the Texas GIS Forum 2010.
Monthly Archives: July 2010
I started this blog about two years ago (04/2008) in an effort to: 1) document my journey (adventures) in GIS, 2) make some connections with readers and others in the GIS community, and 3) create a GIS presence on the web. So far, it’s served me well regarding 1) and 2). I won’t make any claims about 3). I’ve made some interesting contacts with people whose blogs I now subscribe to, and on numerous occasions, I have referred to my own previous posts while trying to remember something. Long term, I would like this blog to supplement my GIS resume.
If you’ve been here before, you might’ve noticed that although most blog posts are somehow related to GIS, there really isn’t much of a common thread, and while some themes or projects recur, other threads are seemingly abandoned as loose ends. If this is a cause for frustration, I apologize. I wish I could find the time to document all my projects and discoveries to completion. But I can’t.
I started this blog as “Geologist”. My job title now reads “Geologist/GIS Specialist”. What this means is that I have been spending more and more time exploring ways we (my employer) can successfully leverage GIS in our field operations, data storage, and analysis, and have achieved – if not a quantum leap – a shift towards more streamlined data collection (e.g. GPS) and improved record keeping. Yet, my day job still includes many responsibilities not related to GIS (note the hard hat in the picture), and – at least for the time being – I enjoy the balance of identifying a hands-on challenge in the field and solving it with GIS tools from my desk. And while not all challenges on my wishlist are bound to get fixed – some are simply not practical, some do not warrant the expenditure in equipment or man-hours – I find, based on comments and emails from readers, that many of the things I tinker with are related to what others in the community work on. This is documented in the number of people who google their way to my blog.
The more time I spend solving GIS problems and using different software packages, the more interested I get in the nitty gritty that lies beneath the surface. The Science of Geographic Information Systems. When it comes right down to it, I don’t think it matters much which software you use. I worked some with ArcINFO and Atlas GIS in college in the mid-1990s. Then with MapInfo in about 2000. It’s all a blur now because I’ve since switched to ArcGIS.
Rather than spend a lot of time trying to decipher one package’s every trick and toolbar, I’ve decided that I want to spend my time more wisely and really learn how to program. I believe that as a GIS hacker you’re bound to be more powerful than someone who can answer 1001 FAGQ (Frequently Asked GIS Questions).
Needless to say, the choice of language for me is Python. I’ve used it for ArcGIS geoprocessing, to create KML files, to hack into my Garmin GPS, to collect and format data for gINT, and most recently to manipulate raster images. Naturally, my to-do-list for Python is much longer. It includes spending more time with Geodjango, IronPython, numpy, postgres/postgis…
Python is straightforward, fun, well documented, everywhere, and unstoppable! I still believe it’s the super-glue language for GIS. But enough of my ramblings. I hope I haven’t bored you. Let’s get back to GIS. Leave me a comment if you like. I would love to hear from you.
While I am always interested in remote sensing, my focus has been on GIS/GPS data and applications, largely due to the nature of my day-to-day routines at work which include GIS analysis and map production using almost exclusively vector data.
Working with Google Earth the other day, I wondered how easy it would be to select an area of a raster image based on a certain color, i.e. to select or identify features characterized by certain color ranges in each raster band. A simple quality control measure would be a comparison of the raster selection with a shapefile containing point data for a few sample features.
Since I don’t have a Spatial Analyst license, I tried to attack this with only ArcView…
Find a raster file and add individual bands as separate layers to your map. In my case, this was a (more or less random) 6MB (approx. 3000x2000 px) GeoTIFF with 3 bands.
Find the color range in each band that is typical for the type of feature. – By default, one-band rasters will be displayed as grayscale images, using a stretched color ramp. A neat little tool I wasn’t familiar with before is the Pixel Inspector. By moving your cursor over the raster, you get a look at the 8-bit color value of each pixel and thus a sense of the range values for your feature.
With a bit of trial and error, you can narrow down the range of pixel values that best outlines the features you’re interested in, and by using the classified color ramp on the Symbology tab, you’re able to check what areas will be included when using a certain range of values.
Now it’s time to create a new raster file that contains only those colors making up the features of interest. Of course, there will be some other noise of similarly colored features. But I am hoping that upon merging the three bands, only those areas that are part of all three rasters will be left over.
Turns out that to turn my original one-band raster into one containing only the pixels of interest, I need the Raster Calculator, which, I understand, is a feature of Spatial Analyst.
Time to look for Open Source help. And I quickly found OpenEV, which is part of FWTools, uses GDAL, has Python bindings, and provides all kinds of raster manipulation capabilities. All I had to do was open my one-band raster and use some Python to run my own – VERY CRUDE – raster calculation. I don’t know much about NumPy (Python’s module for all kinds of numerical/mathematical tricks), but this was pretty straightforward and goes something like this:
# LoadFile comes with the gdalnumeric module bundled with FWTools
from gdalnumeric import LoadFile

# Band1.img is the single-band raster saved in ArcGIS
# rows and columns are the dimensions of the raster array
# 100 and 180 are the min/max pixel values of interest
array1 = LoadFile('Band1.img')
rows, columns = array1.shape
for r in range(rows):
    for c in range(columns):
        if array1[r, c] < 100 or array1[r, c] > 180:
            array1[r, c] = 0  # black out pixels outside the range
Now, I can export array1 as another GeoTIFF and open it in ArcGIS. Basically, I get a raster where the pixels of interest are unchanged, and everything else has been set to zero or BLACK. That’s as far as I’ve gotten. Now, I just need to do the same for all three bands and work on recombining them.
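As a sketch of that recombination step (my own toy example with made-up 2x2 arrays and a hypothetical mask_band helper – not code from the actual project), a pixel would survive only if it is nonzero in all three filtered bands:

```python
import numpy as np

# Hypothetical helper: keep pixels inside [lo, hi], zero out the rest.
def mask_band(band, lo, hi):
    masked = band.copy()
    masked[(band < lo) | (band > hi)] = 0
    return masked

# Three tiny synthetic "bands" standing in for the real GeoTIFF bands.
b1 = mask_band(np.array([[120, 50], [200, 150]]), 100, 180)
b2 = mask_band(np.array([[130, 140], [90, 160]]), 100, 180)
b3 = mask_band(np.array([[110, 30], [40, 170]]), 100, 180)

# Keep only pixels that passed the filter in all three bands.
combined = np.where((b1 > 0) & (b2 > 0) & (b3 > 0), 1, 0)
```

The boolean & does the same job as looping over every pixel, just in one line.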
PS: I stumbled across another open source project that might help with raster files – TerraLib/TerraView. But I haven’t had a chance to try it out.
Part 1) is about the concept of Objects: For some reason, I’m always surprised to read that there are veteran programmers who have a hard time understanding or implementing OOP. The first Python book I worked through explained some of the concepts of OOP, and it made total sense. In fact, I couldn’t imagine programming things differently. That may, of course, be a shortcoming of mine since I’m new to programming in general. Anyway, the idea of creating blueprints (classes) for objects you’re working with, and then creating objects as instances of those classes, appeals to me, and has been the perfect fit for my recent programming project.
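As a quick illustration (a made-up minimal sketch, loosely inspired by but not copied from my project code), the blueprint/instance idea looks like this in Python:

```python
# Blueprint: every Project object gets a title, a location, and a label() method.
class Project:
    def __init__(self, title, location):
        self.title = title
        self.location = location

    def label(self):
        return "%s (%s)" % (self.title, self.location)

# Instances: two distinct objects created from the same blueprint.
site_a = Project("Site Investigation", "Houston, TX")
site_b = Project("Phase II Evaluation", "Austin, TX")
```

Each instance carries its own data, but both share the behavior defined once in the class.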
Part 2) is about Modeling objects (e.g. UML): I am really not familiar with the formal aspects of modeling and modeling language. I will probably skip this part for now. In the context of my current work and programming challenges, I can get away with scribbling little diagrams on pieces of paper. But there certainly appears to be lots of “meat” in this part of the book for those interested.
Part 3) is about putting the concepts to work in C#: While OOP makes sense to me, and Python does, too, I do hope to find some more sense in C# one day. I have been complaining a lot about C# and realize this has to do with the mere look and – what I perceive to be – lack of clarity of the code. I like to think it’s a lack of effort on my part to wrap my brain around and embrace C#. So while the OOP concepts in C# already make sense to me, I hope for a better understanding of the language in Part 3.
Ok, I downloaded the new ArcGIS 10 from esri.com (approx. 4GB ISO file) and installed it on Windows 7. I don’t remember seeing this in prior versions – a conflict detection feature on the installation menu. I remember fighting with this in the past, hunting down leftovers of ArcGIS (X-1) before installing ArcGIS X. Unfortunately, this time around, again, everything 9.3… had to be removed.
The GUI looks very similar. I like the Python window, and one thing I noticed was the built-in KML >> SHP conversion in ArcToolBox. I have been using a Python script up to this point (which I preferred to Arc2Earth).
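For what it’s worth, the heart of such a script is just pulling placemark names and coordinates out of the KML. A rough sketch (the sample KML string and all names here are made up, and a real script would of course also have to write out the shapefile):

```python
import xml.etree.ElementTree as ET

KML_NS = "{http://www.opengis.net/kml/2.2}"

# Made-up minimal KML standing in for a real file saved from Google Earth.
kml = """<?xml version="1.0"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Placemark><name>Site A</name>
    <Point><coordinates>-95.36,29.76,0</coordinates></Point>
  </Placemark>
</kml>"""

root = ET.fromstring(kml)
points = []
for pm in root.iter(KML_NS + "Placemark"):
    name = pm.find(KML_NS + "name").text
    coords = pm.find(".//" + KML_NS + "coordinates").text.strip()
    # KML coordinates are lon,lat[,elevation]; keep the first two.
    lon, lat = [float(v) for v in coords.split(",")[:2]]
    points.append((name, lon, lat))
```

From a list of (name, lon, lat) tuples like this, writing point features is the easy part.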
Also, first visit to ArcGIS 10 Resource Center, where there is finally some info on developments with arcpy & Python. All very promising. Looking forward to working with this new version, and using Python for working the actual .
Epilogue: Tried upgrading to ArcGIS 10 on another machine, and while the conflict detection feature stubbornly declared “No conflict”, the installer for 10 would inform me that there were still 9.3.something elements installed. After giving the thing the benefit of the doubt a couple of times, I went into the Windows registry and wiped out anything ESRI. That worked. Somehow, this has done little to increase my confidence in version 10.
I’m about 500 lines of code into my Python project, and I am having a blast. Finally, I’ve begun to apply the knowledge gained from reading countless Python tomes. I used to wonder how people cranked out thousands of lines of code. I don’t anymore. It’s surprisingly easy to write a lot of code in a short time. I probably need to come up with a better way of testing/debugging than running whole sections of code over and over just to see if the last changes will crash the program.
What I enjoy especially has been writing some code, and then going back and improving it, e.g. by creating and calling functions rather than writing repetitive sections of code. I guess you could call that refactoring.
An example (taken from my project) follows. To get data into my program, I’ve been writing things like this – a method that’s part of a class to be instantiated every time a project object is created (for more context see recent posts):
self.Title = raw_input("Enter Project Title Line 1: ")
self.Title2 = raw_input("Enter Project Title Line 2: ")
self.Name = raw_input("Enter Project Title Line 3: ")
self.Location = raw_input("Enter Project Location Line 1: ")
and on and on… pretty redundant code, huh? Plus, I was wondering how to add some validation to my inputs, e.g. data type, values etc. So I came up with this function to be called instead of raw_input every time I prompt the user for data entry.
def dataEntry(item, prompt, checktype, choices):
    while True:
        question = "Enter " + item + ": <" + prompt + "> "
        data = raw_input(question)
        if data == "":
            data = prompt  # hitting <Enter> accepts the default
        # correct type and a valid choice (case-insensitive)?
        if type(data) == checktype and data.upper() in [c.upper() for c in choices]:
            return data
        print "Select from: ",
        for each in choices:
            if each != choices[-1]:
                print each, "or",
            else:
                print each
A simple example of how this would work is:
Entry = dataEntry("Title", "Investigation", types.StringType, ["Investigation", "Evaluation", "Examination"])
You are prompted:
Enter Title: <Investigation>
You can enter your own string, which has to be one of the three in choices, or accept the default by just hitting <Enter>. I put the check for data type in because I will be using this for a wide range of data entry that will include numeric and string data, and I need data to be re-entered if it’s the wrong kind. Essentially, instead of just collecting data with raw_input and then running checks, the check parameters are fed to the function and all the checking happens in the function definition.
I hope what I tried to accomplish here is more or less self-explanatory. Please – I’m a beginner – let me know how I can make this even better or more Pythonic.
PS (07/15/2010): My type checking, of course, is hampered by the fact that anything entered through raw_input is a string. Obviously, this needs some refining.
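One direction for that refinement (just an idea I’m toying with, not something I’ve built into the project yet) would be to pass the target type as a callable and try converting the string before checking it:

```python
# Hypothetical helper: try to convert the raw string to the desired type.
# Returns the converted value, or None if the string isn't valid for that type.
def coerce(data, checktype):
    try:
        return checktype(data)  # e.g. int("42") -> 42, float("3.5") -> 3.5
    except ValueError:
        return None
```

Then dataEntry could loop until coerce() stops returning None, instead of comparing type(data) – which is always a string – against checktype.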