Saving the footer (last few rows) of a CSV as a separate file using pandas in Python

I have a CSV file with extra rows at the very end (the last 9 rows) that are important but do not fit the schema at all and need to be processed differently. They just contain the number of clicks for different sites. I want to split these last few rows off from the original CSV and save them as a separate file.

So far, I can get the main rows out using pandas by skipping the footer. If the number of rows were consistent, I could save the footer the same way, using skiprows to skip a fixed range such as rows 0-2000, but the row count changes from file to file.

The code to save all the main rows is as follows:

import os
import pandas as pd

# skipfooter is only supported by the Python parser engine
reader = pd.read_csv(os.path.join(DATA_DIR, file), encoding='utf8', header=0, skipfooter=9, index_col=0, engine='python')
trimmed_file_name = 'trimmed_{}'.format(file)
# os.path.join inserts the separator, so no backslash escaping is needed
full_path = os.path.join(DATA_DIR, trimmed_file_name)
print(full_path)
reader.to_csv(full_path, mode='a')

So how do I get just those last 9 rows without relying on a fixed 'skiprows' value? Any ideas? The footer is consistently the last 9 rows, if that helps.

Answer

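After reading in the first DataFrame, we know how many regular rows there are. So just read the remainder by skipping the header line plus those rows (this assumes every record occupies exactly one line in the file). Passing header=None keeps the first footer row from being used as column names:

footer = pd.read_csv(os.path.join(DATA_DIR, file), encoding='utf8', skiprows=len(reader) + 1, header=None)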
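
To make the idea concrete, here is a minimal end-to-end sketch of the two-pass read. The helper name `split_footer`, the sample data, and the file names are all made up for illustration. It assumes each record sits on a single line, so that the header line plus `len(main)` data lines is exactly what has to be skipped to reach the footer.

```python
import os
import tempfile

import pandas as pd

FOOTER_ROWS = 9  # from the question: the footer is always the last 9 rows


def split_footer(csv_path, footer_rows=FOOTER_ROWS):
    """Hypothetical helper: return (main, footer) DataFrames for one CSV file."""
    # skipfooter is only supported by the Python parser engine.
    main = pd.read_csv(csv_path, encoding="utf8", header=0, index_col=0,
                       skipfooter=footer_rows, engine="python")
    # Skip the header line plus all main rows. header=None stops pandas from
    # consuming the first footer row as column names.
    footer = pd.read_csv(csv_path, encoding="utf8",
                         skiprows=len(main) + 1, header=None)
    return main, footer


# Made-up sample data: 3 schema rows followed by a 9-row click-count footer.
lines = ["id,name,value", "1,a,10", "2,b,20", "3,c,30"]
lines += ["site{},{}".format(i, i * 100) for i in range(9)]

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "sample.csv")
    with open(csv_path, "w", encoding="utf8") as handle:
        handle.write("\n".join(lines))

    main, footer = split_footer(csv_path)

    # Save the footer as its own file, without an index or header row.
    footer_path = os.path.join(tmp_dir, "footer_sample.csv")
    footer.to_csv(footer_path, index=False, header=False)
    with open(footer_path, encoding="utf8") as handle:
        saved_footer = handle.read().splitlines()
```

Writing the footer with index=False and header=False keeps the saved file limited to the original footer rows, since the integer column labels pandas assigns under header=None are not part of the data.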
source: stackoverflow.com