Order through Concatenation
April 25, 2020, 6:51 p.m.
I had an idea for another "spring cleaning" project: concatenating a pile of text files that had grown too numerous and unmanageable. I wanted a single unified file that listed all of these text files in a table of contents and then, in descending order of modified date, printed each file's contents set off by separators, with its title and modified date right above.
Initially I hardcoded a test folder path with 10 text files to guide my results. Using Python's "os" module, I was able to set the directory to pull from, then create a file list based on my sorted descending filter as follows:
fileList = sorted(filter(os.path.isfile, os.listdir('.')), key=os.path.getmtime, reverse=True)
I wanted the concatenated file to save in the directory it was called from with a custom name and today's date. This snippet would help me achieve that:
today = str(datetime.today().strftime('%Y-%m-%d'))
fileName = dir_path + "\\" + txtName + " " + today + ".txt"
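For reference, the listing and naming steps above can also be written so they do not depend on the current working directory or on Windows-style separators. This is a minimal sketch rather than the code used in this project: the helper names list_files_newest_first and build_output_path are hypothetical, and it assumes every regular file in the folder should be included.

```python
import os
from datetime import datetime


def list_files_newest_first(directory):
    """Return the names of regular files in `directory`, newest modified first."""
    paths = [os.path.join(directory, name) for name in os.listdir(directory)]
    files = [p for p in paths if os.path.isfile(p)]  # skips sub-folders
    files.sort(key=os.path.getmtime, reverse=True)
    return [os.path.basename(p) for p in files]


def build_output_path(directory, txt_name, today=None):
    """Build '<directory>/<txt_name> <YYYY-MM-DD>.txt' with the platform separator."""
    today = today or datetime.today()
    return os.path.join(directory, f"{txt_name} {today.strftime('%Y-%m-%d')}.txt")
```

Joining the directory onto each name before calling os.path.isfile and os.path.getmtime means the result is the same no matter which folder the script is launched from.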
With the directory, file list and filename set up, it was time to create my table of contents with every file name.
with open(fileName, "a", encoding="utf-8") as concatFile:
    # Table of Contents of File Names
    concatFile.write("\t\t\tFile Contents Summary\n")
    concatFile.write("=" * 78 + "\n")
    for title in fileList:
        modDate = os.path.getmtime(directory + "\\" + title)
        formattedModDate = time.strftime("%D %I:%M:%S %p", time.localtime(modDate))
        concatFile.write(title + "\n")
    concatFile.write("=" * 78 + "\n\n\n\n\n")
On running the app to this point, I was able to generate a text file that listed all files in descending order of modified date. The next part tripped me up initially because of how I wanted the file contents listed.
I wanted to append the file's name and its modified date, then a row of "=" separators, and finally the file contents to the newly concatenated file with all formatting. Abstractly it seemed simple enough, but I had to be careful of the indentation and positioning in Python and just how to nest the loops with the file appending.
Code within code. Simple.
I ran into an issue where the full list of file names, separators and all, was written out again every time a single file was concatenated. After some Stackoverflowing, I realized I was opening my file twice and running the entire piece of code over again, causing these duplicates.
Within the for loop that appends each file name, I had to open the individual file and then concatenate it into the master file. Since these files were a mix of UTF-8, ANSI and Unicode encodings, I wanted the read to ignore any special character errors from the start, and if a decoding error still came up, to just print it and move on. After double checking my test files, everything transferred over to the master file without any data loss.
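An alternative to ignoring decode errors outright is to guess each file's encoding first and only fall back when strict UTF-8 fails. The sketch below is an assumption about how that could look, not what this project does: the function names guess_encoding and read_text_lenient are hypothetical, and it assumes that "Unicode" files start with a UTF-16 byte order mark and that anything else that is not valid UTF-8 is Windows "ANSI" (cp1252).

```python
import codecs


def guess_encoding(path):
    """Guess between UTF-16, UTF-8 and cp1252 by inspecting the raw bytes."""
    with open(path, "rb") as raw:
        data = raw.read()
    # A byte order mark is the only cheap, reliable hint for UTF-16 files.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    try:
        data.decode("utf-8-sig")  # strict UTF-8, with or without a BOM
        return "utf-8-sig"
    except UnicodeDecodeError:
        return "cp1252"  # assumption: the rest are Windows "ANSI" files


def read_text_lenient(path):
    """Read `path` as text; bytes that still cannot be decoded become U+FFFD."""
    with open(path, "r", encoding=guess_encoding(path), errors="replace") as infile:
        return infile.read()
```

With errors="replace", anything that still fails to decode shows up as a visible replacement character in the master file instead of silently disappearing.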
The final bit of code is as follows:
with open(fileName, "a", encoding="utf-8") as concatFile:
    # Append Title & Modified Date
    for title in fileList:
        modDate = os.path.getmtime(directory + "\\" + title)
        formattedModDate = time.strftime("%D %I:%M:%S %p", time.localtime(modDate))
        concatFile.write(title + "\t\tMod: " + str(formattedModDate) + "\n")
        # Append File Contents to new Concat File with Formatting
        # (opened with the same directory prefix used for getmtime above)
        with open(directory + "\\" + title, "r", encoding="utf-8", errors='ignore') as infile:
            concatFile.write("=" * 78 + "\n")
            try:
                for line in infile:
                    concatFile.write(line)
            except UnicodeDecodeError as e:
                print(e)
            concatFile.write("\n" + "=" * 78 + "\n")
            concatFile.write("\n\n\n")
After getting the consistent results I set out for, I wanted to make this applet more modular by prompting the user for the master directory to concatenate from and for the file name. There is no recursion at this point, so all files are assumed to be in the same root directory.
directory = input("Please enter directory to concatenate files from: ")
txtName = input("Now please name your file: ")
Now whenever I need to concatenate multiple text files while preserving certain file information, I only have to perform a few steps to get the results I want. My motherboard must be so proud of me being able to clean my hard-drive.
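For reference, here is one way the pieces above could be assembled into a single function. It is a minimal sketch under a few assumptions, not the finished applet: the name concatenate_text_files is hypothetical, it opens the master file with "w" instead of "a" so a rerun on the same day overwrites rather than appends, it skips the master file itself, and it spells the date as %m/%d/%y because %D does not appear to be among the strftime directives Python documents as portable.

```python
import os
import time
from datetime import datetime

SEPARATOR = "=" * 78


def concatenate_text_files(directory, txt_name):
    """Merge the files in `directory` into one dated text file; return its path."""
    out_name = f"{txt_name} {datetime.today():%Y-%m-%d}.txt"
    out_path = os.path.join(directory, out_name)

    # Regular files only, newest first, never the master file itself.
    names = [n for n in os.listdir(directory)
             if n != out_name and os.path.isfile(os.path.join(directory, n))]
    names.sort(key=lambda n: os.path.getmtime(os.path.join(directory, n)),
               reverse=True)

    # One open() for the whole run, so the table of contents is written once.
    with open(out_path, "w", encoding="utf-8") as concat_file:
        concat_file.write("\t\t\tFile Contents Summary\n" + SEPARATOR + "\n")
        for name in names:
            concat_file.write(name + "\n")
        concat_file.write(SEPARATOR + "\n\n\n\n\n")

        for name in names:
            path = os.path.join(directory, name)
            stamp = time.strftime("%m/%d/%y %I:%M:%S %p",
                                  time.localtime(os.path.getmtime(path)))
            concat_file.write(f"{name}\t\tMod: {stamp}\n{SEPARATOR}\n")
            with open(path, "r", encoding="utf-8", errors="ignore") as infile:
                concat_file.write(infile.read())
            concat_file.write("\n" + SEPARATOR + "\n\n\n\n")
    return out_path
```

The two input() prompts shown earlier could feed it directly, passing "directory" and "txtName" as the two arguments.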