This won’t cover every aspect of matching dates across similar records stored in different databases, but it could be the start of another Python solution. Picking up where I left off in my last post about fuzzy well name matching in oil and gas, matching dates for different events in the life of an oil/gas well (permitting, drilling, completing) is another challenge.
Matching “09/30/2014” to “30 Sept 2014” is one thing, but what if the dates are approximate and you’d consider Sept 30 and Oct 2 a match because they’re close?
    ### https://pypi.python.org/pypi/fuzzyparsers
    ### https://docs.python.org/2/library/datetime.html
    import datetime
    from fuzzyparsers import parse_date

    date1 = "September 30, 1985"
    date2 = "10/02/1985"

    # Normalize both strings into date objects
    transformed_date1 = parse_date(date1)
    transformed_date2 = parse_date(date2)

    # Subtracting the parsed dates gives a datetime.timedelta
    timediff = transformed_date2 - transformed_date1
    print(timediff.days)   # 2
    print(type(timediff))  # a datetime.timedelta
So parse_date from fuzzyparsers cleans up your dates, and because subtracting them gives a datetime.timedelta, you can set a threshold (a few days, for example) for what you consider a match.
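Here is a rough sketch of what that threshold check could look like. It uses only the standard library, with plain date objects standing in for whatever your date parser returns. The function name dates_match, the variable names, and the three-day default are my own placeholders, not anything from fuzzyparsers.

```python
from datetime import date, timedelta


def dates_match(first, second, tolerance=timedelta(days=3)):
    """Return True when the two dates are within `tolerance` of each other.

    `dates_match` and the three-day default are illustrative assumptions;
    pick a tolerance that suits the events you are comparing.
    """
    # abs() makes the check independent of which date comes first.
    return abs(second - first) <= tolerance


# Hypothetical dates for the same well pulled from two databases.
date_in_db_a = date(1985, 9, 30)
date_in_db_b = date(1985, 10, 2)

print(dates_match(date_in_db_a, date_in_db_b))                     # True, 2 days apart
print(dates_match(date_in_db_a, date_in_db_b, timedelta(days=1)))  # False, tolerance too tight
```

Taking the absolute value of the difference means it doesn't matter which database recorded the earlier date, and passing the tolerance as a timedelta lets you loosen or tighten it per event type.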