Fuzzy Date Matching – Python

This post won’t cover every aspect of matching dates between similar records stored in different databases, but it could be the start of another Python solution. Picking up where I left off with my last post about fuzzy well name matching in oil and gas, matching dates for different events in the life of an oil/gas well (permitting, drilling, completing) is another challenge.

Matching “09/30/2014” to “30 Sept 2014” is one thing, but what if the dates are approximate and you’d consider Sept 30 and Oct 2 a match because they’re close?

### https://pypi.python.org/pypi/fuzzyparsers
### https://docs.python.org/2/library/datetime.html
import datetime
from fuzzyparsers import parse_date
date1 = "September 30, 1985"
date2 = "10/02/1985"

transformed_date1 = parse_date(date1)
transformed_date2 = parse_date(date2)

timediff = transformed_date2-transformed_date1

print(timediff.days)

>>> 
2
>>> type(timediff)
<type 'datetime.timedelta'>

So parse_date from fuzzyparsers cleans up your dates, and subtracting one parsed date from the other gives a datetime.timedelta, which lets you set a threshold for what you consider a match.
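
As a rough sketch of that threshold idea, something like the following could work. It assumes the dates have already been turned into datetime.date objects (for example by parse_date), so it only needs the standard library. The helper name dates_match and the max_days tolerance are made up for illustration and are not part of fuzzyparsers.

```python
import datetime


def dates_match(d1, d2, max_days=3):
    """Return True when the two dates are at most max_days apart.

    dates_match and max_days are hypothetical names for this sketch.
    """
    # abs() makes the comparison independent of which date comes first
    return abs(d1 - d2) <= datetime.timedelta(days=max_days)


# Stand-ins for the parsed values of "September 30, 1985" and "10/02/1985"
date1 = datetime.date(1985, 9, 30)
date2 = datetime.date(1985, 10, 2)

print(dates_match(date1, date2))              # True: 2 days apart, tolerance 3
print(dates_match(date1, date2, max_days=1))  # False: 2 days apart, tolerance 1
```

The right tolerance depends on the event type, so a permit date and a completion date would probably not share the same max_days.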
