Exploring Directory Structure [PYTHON]

So, here is a recent Python script I cobbled together. If you’re more of a programmer than I am, you will probably scream at some of my syntax. Do I really need all the ‘globals’? I have a sense there are more Pythonic (succinct) ways of accomplishing the same result. But this worked.

The idea was to type in a starting directory and a name for a text file (the log), after which the script recursively drills down through the directory tree, printing file names, examining file extensions, counting files and adding up file sizes. It then spits out totals and, as a visual aid, prints a quick and dirty histogram of file types. So let me know what to improve next time around.


from __future__ import division

### Cobbled Together by Arne, July 2011
### using the MyOutput() bits by xiao, from
### http://tech.xster.net/tips/python-log-stdout-to-file/

import os, sys

class printToFile():
    ''' Directs print output (stdout) to the screen and to a text file '''
    def __init__(self, logfile):
        self.stdout = sys.stdout
        self.log = open(logfile, 'w')

    def write(self, text):
        self.stdout.write(text)
        self.log.write(text)
        self.log.flush()

    def close(self):
        # Put the real stdout back (it must stay open) and close the log
        sys.stdout = self.stdout
        self.log.close()

def dictHistogram(extCount):
    ''' Creates simple histogram based on key-value entries in
    file extension/frequency dictionary (extCount) '''
    # Total number of files across all extensions
    vcount = sum(extCount.values())

    for k, v in extCount.iteritems():
        # Percentage share of this extension, one '#' per whole percent
        share = round((v/vcount) * 100, 2)
        print v, "\t", "File Type ", k, "\t", int(share)*"#", share, "% of total"

def displayFileInfo(entryPath):
    '''Displays the file name when exploreSub() encounters an entry
    that is not a directory. New extensions are added to the extensions
    list, files are counted in totalFiles, and file size is added up
    in totalSize'''
    # Only names that are reassigned need 'global'; extensions and
    # extCount are just modified in place
    global totalSize
    global totalFiles

    print "\t", os.path.basename(entryPath)
    ext = os.path.splitext(entryPath)[1].upper()

    # Count the file and add its size to the running totals
    totalFiles = totalFiles + 1
    totalSize = totalSize + os.path.getsize(entryPath)

    if ext not in extensions:
        extensions.append(ext)
        extCount[ext] = 1
    else:
        extCount[ext] = extCount[ext] + 1

def exploreSub(dirEx):
    ''' Drills down into a file tree, starting with dirEx. If an empty
    directory is encountered, this is reported. Otherwise each entry is
    listed: files are passed to displayFileInfo() and sub-directories
    are marked <Dir>. Once the whole directory has been listed,
    exploreSub recursively drills down into each sub-directory.'''
    print
    print dirEx

    entries = os.listdir(dirEx)
    if entries == []:
        print "Is an empty directory."
        print
    else:
        # First pass: list everything in this directory
        for entry in entries:
            entryPath = os.path.join(dirEx, entry)
            if os.path.isdir(entryPath):
                print "\t", entry, "<Dir>"
            else:
                displayFileInfo(entryPath)

        # Second pass: descend into each sub-directory
        for entry in entries:
            entryPath = os.path.join(dirEx, entry)
            if os.path.isdir(entryPath):
                try:
                    exploreSub(entryPath)
                except OSError:
                    print "Unable to open ", entryPath

extensions = []
extCount = {}
totalFiles = 0
totalSize = 0

start = raw_input("Enter Directory: ")
log = raw_input("Enter Name for Logfile: ")

sys.stdout = printToFile(log)
exploreSub(start)

print
print "Total Number of Files ", totalFiles
print "Total File Volume ", round(totalSize/1048576), " MB"
print "File Types Encountered: "
print extensions
print

dictHistogram(extCount)
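
On the question of the globals: one possible alternative is to have the traversal hand its tallies back as return values instead of updating module-level names. What follows is only a minimal sketch under a few assumptions. It is written for Python 3 (the script above is Python 2), the names `tally_tree` and `format_histogram` are made up for illustration, and it leaves out the per-file printing and the log file.

```python
import os
from collections import Counter


def tally_tree(root):
    """Return (extension counts, total bytes) for all files under root."""
    ext_count = Counter()
    total_size = 0
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                # Recurse, then merge the sub-directory's tallies into ours.
                sub_count, sub_size = tally_tree(entry.path)
                ext_count.update(sub_count)
                total_size += sub_size
            elif entry.is_file(follow_symlinks=False):
                ext = os.path.splitext(entry.name)[1].upper()
                ext_count[ext] += 1
                total_size += entry.stat(follow_symlinks=False).st_size
    return ext_count, total_size


def format_histogram(ext_count):
    """Return one histogram line per extension, most common first."""
    total = sum(ext_count.values())
    lines = []
    for ext, count in ext_count.most_common():
        # Percentage share, drawn as one '#' per whole percent.
        share = round(count / total * 100, 2)
        lines.append("%d\tFile Type %s\t%s %s %% of total"
                     % (count, ext, "#" * int(share), share))
    return lines
```

Calling `tally_tree(start)` gives back a `Counter` and a byte total, and the `Counter` can be passed straight to `format_histogram`. The total number of files is then just the sum of the counts, so no separate counter is needed.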

2 responses to “Exploring Directory Structure [PYTHON]”

  1. Charles Morton

    It doesn’t do everything your script does, but Python’s os.walk function is really great for traversing all of the subfolders and files of a path.

    • Arne

      I started out looking at os.walk(). But to go one dir at a time, dig through the sub-dirs, then back to the higher level and on to the next dir… I wasn’t getting anywhere. Not saying it can’t be done. I just didn’t get it to work. No doubt it’s a nice function though.
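
For what it is worth, os.walk does go one directory at a time: in its default top-down order it yields a directory, then works through each of that directory's sub-directories before moving on to the next sibling. Below is a minimal sketch of that traversal, assuming Python 3. The name `walk_report` is hypothetical, and the function only collects the listing lines rather than counting extensions or sizes.

```python
import os


def walk_report(root):
    """Return one line per directory and entry, in os.walk's top-down order."""
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting dirnames in place fixes the order in which os.walk descends.
        dirnames.sort()
        lines.append(dirpath)
        if not dirnames and not filenames:
            lines.append("\tIs an empty directory.")
        for name in dirnames:
            lines.append("\t%s <Dir>" % name)
        for name in sorted(filenames):
            lines.append("\t%s" % name)
    return lines
```

Printing the result with `print("\n".join(walk_report(start)))` gives a listing in roughly the same shape as the script's output, without any explicit recursion.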
