
Split CSV By Unique Columns


I ran into a problem trying to split my CSV into the minimum number of CSV files so that each file contains only unique ids.

By running

count = df['id'].value_counts().max()

I already know how many CSV files I should create (file1, file2, file3, file4).

My expected output is:

file1

person_name     id    Total  Paid        Date          No
     Deniss  55227  1191,75  0,00  21/08/2019  15/06/2018
    RINALDS  56002   169,00  0,00  21/08/2019  15/06/2018
       OLGA  54689   812,90  0,00  21/08/2019  15/05/2018

file2

person_name     id    Total  Paid        Date        No
     Deniss  55227  1191,75  0,00  21/08/2019  20180615
    RINALDS  56002   169,00  0,00  21/08/2019  20180615
       OLGA  54689   812,90  0,00  21/08/2019  20180515

file3

person_name     id    Total  Paid        Date        No
     Deniss  55227  1191,75  0,00  21/08/2019  20180613
    RINALDS  56002   169,00  0,00  21/08/2019  20180614

file4

person_name     id    Total  Paid        Date        No
     Deniss  55227  1191,75  0,00  21/08/2019  20180612



Answer

Use GroupBy.cumcount to build a counter Series that numbers each occurrence of an id, then group by that counter and write one file per group in a loop:

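# Number each row by how many times its id has appeared so far (1-based)
g = df.groupby('id').cumcount() + 1

# Use a separate loop variable so the original df is not overwritten
for i, part in df.groupby(g):
    part.to_csv(f'file{i}.csv', index=False)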
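
To see why the counter works, here is a minimal sketch on a made-up frame (the ids and their order are assumptions for illustration, not the asker's data). Each row gets the number of times its id has appeared so far, so rows that share a number can never share an id, and the largest number should equal `df['id'].value_counts().max()`:

```python
import pandas as pd

# Hypothetical ids in an arbitrary order (illustration only)
df = pd.DataFrame({"id": [55227, 55227, 56002, 55227, 56002, 54689]})

# 1-based occurrence number of each id, in row order
g = df.groupby("id").cumcount() + 1
print(g.tolist())  # [1, 2, 1, 3, 2, 1]

# The highest occurrence number matches the count of the most frequent id
print(g.max() == df["id"].value_counts().max())  # True
```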

Test with sample data, printing each group instead of writing it:

for i, part in df.groupby(g):
    print(part)

      person_name     id    Total  Paid        Date          No
    0      Deniss  55227  1191,75  0,00  21/08/2019  15/06/2018
    4     RINALDS  56002   169,00  0,00  21/08/2019  15/06/2018
    7        OLGA  54689   812,90  0,00  21/08/2019  15/05/2018
      person_name     id    Total  Paid        Date        No
    1      Deniss  55227  1191,75  0,00  21/08/2019  20180615
    5     RINALDS  56002   169,00  0,00  21/08/2019  20180615
    8        OLGA  54689   812,90  0,00  21/08/2019  20180515
      person_name     id    Total  Paid        Date        No
    2      Deniss  55227  1191,75  0,00  21/08/2019  20180613
    6     RINALDS  56002   169,00  0,00  21/08/2019  20180614
      person_name     id    Total  Paid        Date        No
    3      Deniss  55227  1191,75  0,00  21/08/2019  20180612
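
For an end-to-end check, the same idea could be wrapped in a small helper and run against a temporary directory. This is only a sketch: `split_by_occurrence` is a hypothetical name, and the sample frame keeps just two columns with the same id frequencies as the data above.

```python
import tempfile
from pathlib import Path

import pandas as pd


def split_by_occurrence(df, key, out_dir):
    """Write one CSV per occurrence number of `key` and return the paths."""
    counter = df.groupby(key).cumcount() + 1
    paths = []
    for i, part in df.groupby(counter):
        path = Path(out_dir) / f"file{i}.csv"
        part.to_csv(path, index=False)
        paths.append(path)
    return paths


# Reduced sample: same id frequencies as the question (4, 3 and 2 rows)
sample = pd.DataFrame({
    "person_name": ["Deniss"] * 4 + ["RINALDS"] * 3 + ["OLGA"] * 2,
    "id": [55227] * 4 + [56002] * 3 + [54689] * 2,
})

with tempfile.TemporaryDirectory() as tmp:
    paths = split_by_occurrence(sample, "id", tmp)
    names = [p.name for p in paths]
    # Read each file back to confirm its size and that its ids are unique
    sizes = [len(pd.read_csv(p)) for p in paths]
    all_unique = all(pd.read_csv(p)["id"].is_unique for p in paths)

print(names)       # ['file1.csv', 'file2.csv', 'file3.csv', 'file4.csv']
print(sizes)       # [3, 3, 2, 1]
print(all_unique)  # True
```

Returning the written paths makes the result easy to verify; in real use `out_dir` would be whatever folder the files should land in.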
source: stackoverflow.com