Tag Archives: oil/gas

Fuzzy Date Matching – Python

This won’t cover every aspect of matching dates between similar records stored in different databases, but it could be the starting point for another Python solution. Picking up where I left off in my last post about fuzzy well name matching in oil/gas, matching dates for different events in the life of an oil/gas well (permitting, drilling, completing) is another challenge.

Matching “09/30/2014” to “30 Sept 2014” is one thing. But what if the dates are approximate and you’d consider Sept 30 and Oct 2 a match because they’re close?

### https://pypi.python.org/pypi/fuzzyparsers
### https://docs.python.org/2/library/datetime.html
import datetime
from fuzzyparsers import parse_date
date1 = "September 30, 1985"
date2 = "10/02/1985"

transformed_date1 = parse_date(date1)
transformed_date2 = parse_date(date2)

timediff = transformed_date2-transformed_date1

# Number of whole days between the two parsed dates
print(timediff.days)

>>> type(timediff)
<type 'datetime.timedelta'>

So parse_date from fuzzyparsers cleans up your dates, and the datetime.timedelta type lets you set a threshold for what you consider a match, for example treating two dates as a match when abs(timediff.days) is 3 or less.
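Here is a rough sketch of how that threshold check might look as a reusable function. To keep it free of third-party packages it swaps parse_date for datetime.strptime with a short list of formats; the names KNOWN_FORMATS, to_date and dates_match and the 3-day default tolerance are my own placeholders, not part of fuzzyparsers.

```python
from datetime import datetime

# Hypothetical list of formats; extend it to match your own data sources.
KNOWN_FORMATS = ("%m/%d/%Y", "%B %d, %Y", "%d %b %Y")


def to_date(text):
    """Try each known format in turn and return a datetime.date."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError("unrecognized date format: %r" % text)


def dates_match(text1, text2, tolerance_days=3):
    """Return True when the two dates are within tolerance_days of each other."""
    delta = to_date(text1) - to_date(text2)  # a datetime.timedelta
    return abs(delta.days) <= tolerance_days


print(dates_match("September 30, 1985", "10/02/1985"))  # True
print(dates_match("September 30, 1985", "10/15/1985"))  # False
```

The right tolerance probably depends on the event: permit dates from two sources may differ by weeks, while completion dates might only be off by a day or two.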



From GIS Technologist to Geo Technologist

I feel blessed to be so busy at work that I haven’t found the time to blog. But it’s a shame I’ve been so blogless.

For the past 6 months or so, I have been working on all kinds of geospatial challenges. I’m slowly moving from GIS into data management. Working for an oil/gas company, I’m a lot more exposed to the non-GIS software applications used in the E&P industry. So I haven’t only learned how to operate Trimble GPS hardware and software and come to understand ArcSDE better, I am also getting better with SQL Server and SQL databases. I’ve been involved in our company’s enterprise-wide Master Data Management efforts, developing schemata for organizing unstructured E&P data in an RDBMS and investigating uses and applications of PPDM (Public Petroleum Data Model) and pipeline models… the list goes on.

Sometimes, I wonder if I’m still a GIS guy since much of this isn’t really geospatial. But that’s okay. Branching out makes life and work more interesting. Also, I’ve had the chance to get better with Python. I signed up for a 4-course Python Certificate online, and am about 50% done. Most of what was covered in the 1st course I already knew. But the 2nd course was a good bang for my buck, and I’m excited about the rest.

What I’d like to check out is the extension of Python’s DB API for SQL Server. I’ve spent a little time working with the FME SQL Server edition to grab data from various SQL Server databases and plug them into something else (e.g. ArcSDE), and FME isn’t just easy to use, it’s also nice and fast. But I would love to be able to just write in Python and do the same thing. So while Python/MySQL was covered in my 2nd course, I need to look into SQL Server. Well, that wraps it up for the day. I hope to be a regular on my blog again soon. Headed to the ESRI PUG 2012 in Houston next. So hopefully that will help me re-connect with GIS and provide enough inspiration for new GIS challenges and blog posts.

