Python Standard LIbrary – Book

Working in Python, I find myself googling for solutions to programming snags all the time. More and more often, my questions turn out to be about use of the Python Standard Library, be it hashlib for file comparison purposes or shutil for standard file system operations, or whatever. So I finally took the plunge and placed the Standard Library on my desk – the Addison-Wesley Professional/Developers Library text. I’ve said before, I learn more from reading 2 pages in a book than googling 20 pages.

This is not a beginners book on how to learn Python. This is the book you need, being the bookish type, if you use Python every day and would like to learn to finer details of how to use the standard modules effectively. This is a great book to just browse and start reading anywhere. I promise you that you will learn a lot.

Leave a Comment

Filed under Uncategorized

ArcSDE Revisited

I stumbled upon my post from this spring about ArcSDE. Well, I’ve since had the opportunity to get to know SDE a little better. I installed ArcSDE 10 and created a geodatabase in SQL Server 2008. I haven’t pushed a lot of data into SDE since, mainly because I have been trying to come up with a good plan of how to structure our geospatial data, which will be accessed by both ArcGIS Server and other Non-ESRI applications, e.g. LandWorks Property Management (LPM). Once I’ve completed my analysis, it will be time to start moving data. During all my reading and googling on ArcSDE/SQL Server functionality and configuration though, I again lament ESRI’s decision not to update the PDF documents that they were still issuing for most ArcGIS 9.3 products. I find it so much easier to read a chapter in a book-like PDF file than trying to do so in the online or on-disk help file, hampered by constantly expanding and collapsing bullets and distracting links. But maybe that’s just me.

Leave a Comment

Filed under Uncategorized

FME Desktop 2011 Class – Safe Software, Vancouver

I recently had the pleasure of escaping the Texas summer heat and traveling to Vancouver, BC, for a Safe Software training class. I had stopped at the Safe Software booth at ESRI UC for a few minutes this July, and one of their reps gave me a quick intro to FME Desktop. Well, one thing led to another, and before I knew it, I was signed up for the class.

I haven’t taken a whole lot of similar classes but enough to know which ones are a failure, and which ones are worth your time. The one for FME Desktop 2011 was one of the best I’ve taken. In fact, I couldn’t help myself and had to rave to the friends at Safe about it. So let me quote from my own email:

First of all, let me share with you that I found the FME Desktop class to be top-notch ! I have taken a few similar classes or workshops in the past and this one stands out. Not only did I get the impression that [the instructor] really knows her stuff never mind the occasional FATAL ERROR :-) but the format of the class (30 minutes teaching, 10 minutes hands-on), plus the quality of the manual, made for a real good package that I would recommend to anyone interested in FME. Well done ! —  That said, I think FME is fantastic. I think I mentioned to both of you that I have pulled my hair out trying to wrap my brain around ESRI’s ModelBuilder more than once. FME makes more sense to me, is a lot more intuitive to my thinking and the work flows I deal with …

So, there you have it. If you haven’t taken a look at FME and deal with lots of ETL (extraction-transformation-loading) procedures, then this may be for you. Funny thing is (and I think both instructor and manual shared this with us), they say, that 90% of all FME transformations consist of shapefile >>> shapefile conversion e.g. reprojection, adding attributes… What does that say about ArcGIS if people use FME to transform shapefiles  ? I’ve certainly tried to use ModelBuilder more than once, and every time I found that is was much easier and straight forward to just write things out in Python. With FME, I felt that – while things can get very complex – the approach stays very clear and intuitive.

Leave a Comment

Filed under Uncategorized

Getting PETRA well bore paths into ArcMAP using Python

I recently had to bring in some IHS well data from Petra (or IHS Enerdeq) into ArcMap. While I endured 2 days of Introduction to Petra for Geoscientists this summer, I easy and expertise when it comes to using Petra is rather limited. I agree with many online posters who complain about how clumsy, cluttered, and counter-intuitive PETRA is. But that’s another story. I was told by a well seasoned PETRA user that PETRA does not export to SHP file format, upon which she exported a well data table with lat/long’s for me as XLS. Whether the SHP limitation is indeed the case, I have not further investigated. I simply created an ESRI feature class from the LL data. Another needed step was exporting the well bore paths for directional wells into a polyline feature class for use in ArcMap. Again, the only way to get the data out of PETRA appeared to be as a table. I chose CSV.

The table has the following headers:

UWI | NAME | MD | TVD | EWOFFSET | NSOFFSET | DIP | AZM | TVDSS | XPATH | YPATH | SURNAME | SURSTAT

and provides a row of data for each survey depth, with UWI(API) repeating for all rows belonging to the same well.

To read this into Python, I did this:

import csv

### Reference to Input CSV File
infile  = open('c:/temp/dirEXPORT.csv', "rb")
reader = csv.reader(infile)

rownum = 0
well = ""
Wells = []
points = []

### Read CSV File, Line by Line
for row in reader:

    if rownum == 0: # First line in file: read headers
        colcount = 0
        for column in row:
            print column,
            if column == "UWI":
                uwi = colcount
            if column == "XPATH":
                xpath = colcount
            if column == "YPATH":
                ypath = colcount
            if column == "MD":
                z = colcount
            colcount = colcount + 1
            rownum=+1

    else:

        print "\n"

        ### Check if row (line) is for the same well as row above
        ### if so, read coordinates from field and assign to xyz as list
        if row[uwi] == well:
            xyz = [row[xpath],row[ypath],row[z]]
            points.append(xyz)
            print row[uwi], xyz
            rownum=+1

        ### If row does not belong to the same well as the the last row,
### then append coordinate list and well name to Wells list, reset
### coordinates list (=[]), and assign the coordinates to the new well
        else:
            Wells.append([well,points])
            print "New Well"
            points = []
            well = row[uwi]
            xyz = [row[xpath],row[ypath],row[z]]
            points.append(xyz)
            print row[uwi], xyz
            rownum=+1
infile.close

Any remaining print statements allow me to watch what’s going on when the program is run. It’s my crude way of debugging code.
Once all the data is loaded into a list, I start some arcpy magic:

import arcpy
from arcpy import env
import os

env.overwriteOutput = True
env.workspace = "C:/temp"

cnt = 0

mypath = env.workspace
outputFile = r"wormtracks.shp"
template = r"c:\temp\template.shp"
fieldLength = 25

sr = arcpy.Describe(template).spatialReference

arcpy.CreateFeatureclass_management(mypath,outputFile,"POLYLINE","","","",sr)
arcpy.AddField_management(outputFile, 'API', "TEXT", "", "", fieldLength) # add API as new table field

rows = arcpy.InsertCursor(outputFile)
array_container = arcpy.Array()

for well in Wells[1:]: # Ignore first item in Wells list. Poor programming where an empty list is created.
    well_api = well[0] # First item in list is the API

    wellList = well[1] # 2nd item in list is a list of XYZ points

    for pt in wellList:
        point_object = arcpy.Point()
        point_object.X = float(pt[0])
        point_object.Y = float(pt[1])

        print pt[0]
        print pt[1]
        array_container.append(point_object) # I couldn't figure out the difference between array.append and array.add
        print point_object
        del point_object

    row = rows.newRow()
    row.shape = arcpy.Polyline(array_container) # Do not use arcpy.Multipoint which created a really messy polyline w/points ouf of sequence.
    row.API = well_api
    rows.insertRow(row)
    array_container.removeAll()

del row #unlock row
del rows #unlock table

del arcpy
del outputFile

Leave a Comment

Filed under Uncategorized

ESRI/arcpy – Locked Files

When working with geoprocessing tools in ArcGIS, especially using arcpy, you quickly learn about locked files. These files are locked because they are currently being edited, or being viewed in ArcCatalog, or otherwise accessed with ESRI products. This creates a LOCK files that is displayed in e.g. Windows Explorer. Typically, upon successful completion of your geoprocessing task, the lock is removed (the LOCk files disappears). Unfortunately, when you run arcpy scripts outside of ArcMap’s Python window, oftentimes this doesn’t work, especially with cursors. I have found that I was getting errors like:

ExecuteError: ERROR 000258: Output C:/temp/myfile.shp already exists
Failed to execute (CreateFeatureclass

That despite the fact that I had set

arcpy.env.overwriteOutput = True

(see me last post). So it’s a good habit to delete any references to cursors, e.g.:


del row #unlock row
del rows #unlock table
del arcpy


A couple of links with similar explanations are here: from ESRI (scroll down on page), and from Penn State GIS. Also very useful is Python garbage collection module to force deleting any references to strange objects or files you may be missing. Very simply:

import gc
gc.collect()

This has been may weapon of last resort. More about the module at Python Central.

Leave a Comment

Filed under Uncategorized

ESRI/arcpy frustrations [Python Case Sensitivity]

When I don’t use ESRI Python module for geoprocessing  – arcpy – every day, some of the idiosyncratic syntax of this (often not very Pythonic) module slips my mind. I was recently reminded of this when I ran into some problems with very short scripts. Turns out that since Python is case sensitive, you’re best bet as to which words to capitalize or camelcase, isn’t always correct. More frustrating though is the fact

>>> import arcpy
>>> sr = arcpy.Spatialreference()

Traceback (most recent call last):
File “<pyshell#4>”, line 1, in <module>
sr = arcpy.Spatialreference()
AttributeError: ‘module’ object has no attribute ‘Spatialreference’

>>> sr = arcpy.SpatialReference()
>>>

So ‘Spatialreference throws an error but ‘SpatialReference’ doesn’t. Remember to capitalize that ‘R’. ‘SpatialReference’ is a class of the arcpy module, and classes, according to PEP8, should be named using the CapWord (or CamelCase) spelling convention.
So what about this …

 >>> fc = "c:/temp/myfile.shp"
 >>> desc = arcpy.Describe(fc)
 >>> print desc.spatialreference.name
 NAD_1927_StatePlane_Texas_South_Central_FIPS_4204
 >>>

Apparently, here I don’t have to capitalize anything. ‘spatialreference’ works as the name of the property of dataset fc. But guess what ? So do:

>>> print desc.spatialreference.name
NAD_1927_StatePlane_Texas_South_Central_FIPS_4204
>>> print desc.SpatialReference.name
NAD_1927_StatePlane_Texas_South_Central_FIPS_4204
>>> print desc.SpatialReference.name
NAD_1927_StatePlane_Texas_South_Central_FIPS_4204

Well, that seems very inconsistent. Another one (taken from that tripped me up was:
fc = "c:/temp/dirSurvey.shp"
desc = arcpy.Describe(fc)
sr = desc.spatialReference
arcpy.CreateFeatureClass_management("C:/temp/","newFC","POLYLINE", "", "", "", sr)

Traceback (most recent call last):
  File “<pyshell#14>”, line 1, in <module>
    arcpy.CreateFeatureClass_management(“C:/temp/”,”newFC”,”POLYLINE”, “”, “”, “”, sr)
AttributeError: ‘module’ object has no attribute ‘CreateFeatureClass_management’
Apparently, this either used to be the correct name or some of the ESRI online documentation is wrong. But the method really is:
arcpy.CreateFeatureclass_management("C:/temp/","newFC","POLYLINE", "", "", "", sr)

Another one I keep getting wrong because I forget to capitalize is:
from arcpy import env
env.overwriteOutput = True # instead of overwriteoutput which would be wrong

To make a long story short, I’ve learned that when I can’t remember what the propert name is, I type:
>>> dir (arcpy) # or dir(arcpy.env)

and that gives me all the correct names for attributes and methods of arcpy that I need.

2 Comments

Filed under Uncategorized

Merging PDF Files with Python

I think of at least one good topic a day for a blog post but I never seem to get around to documenting them. So here is a recent discovery in Python & PDF.

I was tasked with going through a directory tree (sound familiar ? I have been do this sort of thing a lot lately), finding all PDF files and merging them into one large PDF file. The first thought was: what kind of Adobe product do we need for that. Then, I said: let me try that in Python. That in fact, turned out to be very straightforward. Simply download pyPdf, and take a look at the example there. No needc for Adobe.


import pyPdf
import os

startDir = "c:/temp"
os.chdir(startDir)

fileList = os.listdir(startDir)
output = pyPdf.PdfFileWriter()

for item in fileList:
if os.path.splitext(item)[1].upper() == ".PDF":
pdfDocument = os.path.join(startDir,item)
input1 = pyPdf.PdfFileReader(file(pdfDocument, "rb"))
for page in range(input1.getNumPages()):
output.addPage(input1.getPage(page))

outputStream = file("MyNewOutput.pdf", "wb")
output.write(outputStream)
outputStream.close()

Just replace “c:\temp” with the directory you want to pull the PDF from and you’re in business. If I combine this with the recent script I wrote to explore directory structure, you can drill down through the directory tree searching for and merging PDF files.

Leave a Comment

Filed under Uncategorized

ESRI UC 2011 – First Time Attending

This was the first year that I made it out to the ESRI User Conference in San Diego. As expected, the sheer size of the event was a little overwhelming. While I had spent some time looking over the agenda to see what I might be interested in, once inside the Convention Center, it felt like I was spending more time running from session to session than actually sitting still. There is definitely a lot of resources to be found at the UC but you have got to pick out the goodies. I thought some of the most useful material came from the “tips and tricks” types technical sessions while the audience size for something like “How to implement Enterprise Solution …” made for a less fruitful learning experience. Most of all, I enjoyed meeting a bunch of fellow GIS folks, e.g. at the the PUG Social. Looking forward to 2012 !

Leave a Comment

Filed under Uncategorized

Exploring Directory Structure [PYTHON]

So,here is a recent Python script I cobbled together. If you’re more of a programmer than myself, you will probably scream at some of my syntax. Do I really need all the ‘globals’ ? I have a sense there a more Pythonic (succinct) ways of accomplishing the same result. But this worked.

The idea was to type in a starting directory and a name for a text file, after which the script recursively drills down through the directory tree, printing file names, examining file extensions, counting files and file size, and then spits out totals, plus – for a visual aid -  prints a quick and dirty histogram. So let me know what to improve next time around.


from __future__ import division

### Cobbled Together by Arne, July 2011
### using the MyOutput() bits by xiao, from
### http://tech.xster.net/tips/python-log-stdout-to-file/

import os, sys

class printToFile():
''' directs print output (stdout) to textfile'''
def __init__(self, logfile):
self.stdout = sys.stdout
self.log = open(logfile, 'w')

def write(self, text):
self.stdout.write(text)
self.log.write(text)
self.log.flush()

def close(self):
self.stdout.close()
self.log.close()

def dictHistogram(extCount):
''' Creates simple histogram based on key-value entries in
file extension/frequency dictionary (extCount) '''
vcount = 0
kcount = 0
for k,v in extCount.iteritems():
vcount = vcount + v

for k,v in extCount.iteritems():
#print v
share = round((v/vcount) * 100,2)
print v, "\t", "File Type ", k, "\t", int(share)*"#", share, "% of total"

def displayFileInfo(entryPath):
'''Displays file name, size when exploreSub() encounters an entry
that is not a directory. New extensions are added to the extensions
list, files are counted in totalFiles, and file size is added up
in totalSize'''
global extensions
global totalSize
global fileCount
global totalFiles
global extCount

print "\t",os.path.basename(entryPath)
ext = os.path.splitext(entryPath)[1].upper()

totalSize = totalSize + os.path.getsize(entryPath)

if ext not in extensions:
extensions.append(ext)
extCount[ext] = 1

else:
a = extCount[ext]
a = a + 1
extCount[ext] = a

def exploreSub(dirEx):
''' Drills down into a file tree, starting with dirEx. If an empty
directory is encountered, function breaks from loop, if a file is
encountered, the file's path is added to a list of files to examined
later, and if a non-empty directory is encountered, exploreSub is
recursively drills down to the next level. Once all directories have
been explored, the files in the list are examined one at a time.'''

global files
global fileCount
print
print dirEx

if os.listdir(dirEx) == []:

print "Is an empty directory."
print

else:
for entry in os.listdir(dirEx):
entryPath = os.path.join(dirEx,entry)
if os.path.isdir(entryPath):
print "\t", entry, "<Dir>"
else:
displayFileInfo(entryPath)

for entry in os.listdir(dirEx):
entryPath = os.path.join(dirEx,entry)
if os.path.isdir(entryPath):
try:
exploreSub(entryPath)
except:
Print "Unable to open ", entryPath

extensions = []
files = []
extCount = {}
fileCount = 0
totalFiles = 0
totalSize = 0

start = raw_input("Enter Directory: ")
log = raw_input("Enter Name for Logfile: ")

sys.stdout = printToFile(log)
exploreSub(start)

print
print "Total Number of Files ", totalFiles
print "Total File Volume ", round(totalSize/1048576), " MB"
print "File Types Encountered: "
print extensions
print

dictHistogram(extCount)

2 Comments

Filed under Uncategorized

First 2 Months as GIS Analyst…

As mentioned back in June, I started a new job as GIS Analyst for an oil & gas (E/P) company, and I’ve been having a blast. After straddling the worlds of GIS and engineering geology for a few years, I’m glad to finally get to focus on GIS. And there is no lack of challenges in the new job. The learning curve has been a joy!

The GIS department I’m part of has been tasked to build an enterprise GIS using ArcGISServer, ArcSDE, SQLServer 2008, and MS Silverlight. Moreover, we are working on a Master Data Management Solution that will allow data from multiple departments to be available through the Silverlight powered web app.

For me, the challenge, therefore, has been and continues to be two fold:

1) Brush up on many aspects of ESRI technology:

ArcGIS Server, ArcSDE, Python. Silverlight

2) Learn about a whole range of oil/gas industry software applications:

IHS Petra, SMT Kingdom, NeuraSection, Landworks, Merrick/RIO, the list continues

and how they might become part of the MDMS using NeuraDB, Volant, FME

I guess you could say we got our work cut out for us.

Leave a Comment

Filed under Uncategorized