Python: How To Separate Array Into Chunks

I am still very new to Python programming.

I have an array I am trying to break down into chunks. My array seems to have multiple arrays within it (I think).

The output looks something like this:

[array([None, '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
       '0', '0', '0', '0', '0', '0', '0', '0', None, None, None],
      dtype=object)
 array([None, None, '0', '0', '0', '1', '0', '0', '0', '0', None, None,
       None, None, None, None, None, None, None, None, None, None, None,
       None], dtype=object)
 array([None, None, '0', '0', '0', '0', '0', '0', None, None, None, None,
       None, None, None, None, None, None, None, None, None, None, None,
       None], dtype=object)

This is a snippet of the printed output. Is there any way to display this output as one array with 24 columns?
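If the printed object is a 1-D object array (or a list) whose elements are themselves equal-length arrays, one possible way to get a single 2-D array is `np.vstack`. This is a minimal sketch, not the asker's actual data: the name `rows` is hypothetical and the sample is shortened to 4 columns instead of 24.

```python
import numpy as np

# Hypothetical stand-in for the printed output: a 1-D object array
# whose elements are themselves 1-D arrays (shortened to 4 values each).
rows = np.empty(3, dtype=object)
rows[0] = np.array([None, '0', '0', '0'], dtype=object)
rows[1] = np.array([None, None, '0', '1'], dtype=object)
rows[2] = np.array([None, None, '0', '0'], dtype=object)

# vstack stacks equal-length 1-D arrays as the rows of one 2-D array.
stacked = np.vstack(list(rows))
print(stacked.shape)
print(stacked)
```

With the real data, the same call should give one row per inner array and 24 columns, provided every inner array really has 24 entries.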

I created my array from a dataframe with 24 columns, which I wanted to populate using a for loop. The loop works, but it only populates the array, not the dataframe.

Here is some sample output from my dataframe. I have 24 "status" columns and a column named "Account Opened Date".

This is the output of one of the status columns:

0       1
1       0
2       P
3       0
4    None
Name: status6, dtype: object 

The idea is to take the output of all 24 status columns and place it in new columns named "stat", which are also numbered 1 to 24, so the output of status24 would populate stat1, status23 would populate stat2, etc.
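The mapping described above (last status column into the first stat column, and so on) can be sketched without a loop by reversing the column order. This is only an illustration of the plain reversal, assuming columns named `status1..statusN` and `stat1..statusN` style names as in the question; it uses a hypothetical 3-column frame and ignores the month-offset logic in the loop further down.

```python
import pandas as pd

# Hypothetical miniature frame with 3 status columns instead of 24.
n = 3
df = pd.DataFrame({
    'status1': ['a', 'd'],
    'status2': ['b', 'e'],
    'status3': ['c', 'f'],
})

status_cols = [f'status{k}' for k in range(1, n + 1)]
stat_cols = [f'stat{k}' for k in range(1, n + 1)]

# Reverse the column order so status{n} feeds stat1, status{n-1} feeds stat2, ...
reversed_values = df[status_cols].to_numpy()[:, ::-1]

# Attach the reversed values as new stat columns, aligned on the same index.
df = df.join(pd.DataFrame(reversed_values, columns=stat_cols, index=df.index))
print(df)
```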

I saw this example of how to break a list into chunks, but I couldn't get the output I wanted: https://www.geeksforgeeks.org/break-list-chunks-size-n-python/

from datetime import date
import pandas as pd

df = pd.read_sql(sql,cnxn)

#add stat1-24 into the data frame
df = df.join(pd.DataFrame({
        'stat1':'','stat2':'','stat3':'','stat4':'',
        'stat5':'','stat6':'','stat7':'','stat8':'',
        'stat9':'','stat10':'','stat11':'','stat12':'',
        'stat13':'','stat14':'','stat15':'','stat16':'',
        'stat17':'','stat18':'','stat19':'','stat20':'',
        'stat21':'','stat22':'','stat23':'','stat24':'',},index=df.index))

#call status1-24 from the data frame and store the columns in an array
#(as_matrix was removed in pandas 1.0; to_numpy is its replacement)
status = df.iloc[:, 6:30].to_numpy()

#call stat1-24 from the data frame and store the columns in an array
stat = df.iloc[:, 31:55].to_numpy(copy=True)

l = len(df)

#calculate difference in months between startDate and AccountOpenedDate
def monthly_diff(d2,startDate):
    return(d2.year - startDate.year) * 12 + d2.month - startDate.month

startDate = date(year=2017, month = 7, day = 1)

df['Difference_IN_Months'] = df['AccountOpenedDate']


for x in range(l):
    d2_1=df['AccountOpenedDate'][x]
    d2=d2_1.date()
    df['Difference_IN_Months'][x]= monthly_diff(d2,startDate)
    for i in range(0,23):
        if 3 <= 24 - monthly_diff(d2,startDate) - i + 1 <=24:    
            stat[x,i] = status[24 - monthly_diff(d2,startDate) - i + 1] 
        else: stat[x,i]=''


print(stat[1,:])

I hope my code isn't too confusing. Everything works fine except the part where my array "stat" should populate my dataframe columns (stat1-stat24) with the relevant data.
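One possible reading of the printed output is that `status[...]` in the inner loop is indexed with a single number, which selects a whole row of the 2-D array, so each cell of `stat` ends up holding an entire array. Separately, an array extracted from a dataframe is generally an independent copy, so filling `stat` does not change `df` until the values are assigned back. The sketch below illustrates both points on a hypothetical 2-row, 3-column frame; the column names and the simple reversed mapping are assumptions for illustration, not the asker's month-offset logic.

```python
import pandas as pd

# Hypothetical 2-row, 3-column stand-in for the status/stat columns.
df = pd.DataFrame({
    'status1': ['0', '1'], 'status2': ['P', '0'], 'status3': ['2', '0'],
    'stat1': '', 'stat2': '', 'stat3': '',
})
status_cols = ['status1', 'status2', 'status3']
stat_cols = ['stat1', 'stat2', 'stat3']

status = df[status_cols].to_numpy()
stat = df[stat_cols].to_numpy(copy=True)  # explicit copy so it is writable

print(status[1].shape)  # a single index selects a whole row
print(status[1, 2])     # row and column indices select one cell

# Fill stat cell by cell, always indexing status with BOTH row and column
# (here the mapping is just the reversed column order).
for x in range(len(df)):
    for i in range(len(stat_cols)):
        stat[x, i] = status[x, len(status_cols) - 1 - i]

# The array is separate from the dataframe, so write it back explicitly.
df[stat_cols] = stat
print(df[stat_cols])
```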

Answer

This is the best I can understand from your code and question.

import pandas as pd
import numpy as np

# Sample data from the question: a list of 1-D object arrays, 24 values each
arrays = [np.array([None, '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
       '0', '0', '0', '0', '0', '0', '0', '0', None, None, None],
      dtype=object),
 np.array([None, None, '0', '0', '0', '1', '0', '0', '0', '0', None, None,
       None, None, None, None, None, None, None, None, None, None, None,
       None], dtype=object),
 np.array([None, None, '0', '0', '0', '0', '0', '0', None, None, None, None,
       None, None, None, None, None, None, None, None, None, None, None,
       None], dtype=object)]

# One-row dataframe with empty columns stat1..stat24
d = {f'stat{k}': '' for k in range(1, 25)}
df = pd.DataFrame(d, index=[0])

print(df)
# Append each array as a new row at the next integer index
for row in arrays:
    df.loc[len(df)] = row
print(df)

Output:

  stat1 stat2 stat3 stat4 stat5 stat6 stat7 stat8 stat9  ... stat16 stat17 stat18 stat19 stat20 stat21 stat22 stat23 stat24
0                                                        ...

[1 rows x 24 columns]


  stat1 stat2 stat3 stat4 stat5 stat6 stat7 stat8 stat9  ... stat16 stat17 stat18 stat19 stat20 stat21 stat22 stat23 stat24
0                                                        ...
1  None     0     0     0     0     0     0     0     0  ...      0      0      0      0      0      0   None   None   None
2  None  None     0     0     0     1     0     0     0  ...   None   None   None   None   None   None   None   None   None
3  None  None     0     0     0     0     0     0  None  ...   None   None   None   None   None   None   None   None   None

[4 rows x 24 columns]
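The row-by-row approach above keeps the blank placeholder row at index 0. A possible alternative, sketched here under the assumption that all inner arrays have the same length, is to pass the list directly to the `DataFrame` constructor. The sample data is shortened to 4 columns for illustration.

```python
import numpy as np
import pandas as pd

# Shortened, hypothetical sample: two arrays of 4 values instead of 24.
arrays = [
    np.array([None, '0', '0', '0'], dtype=object),
    np.array([None, None, '0', '1'], dtype=object),
]
columns = [f'stat{k}' for k in range(1, 5)]

# A list of equal-length 1-D arrays is interpreted as a list of rows.
df = pd.DataFrame(arrays, columns=columns)
print(df)
```

For the full data, `range(1, 25)` would produce the 24 column names `stat1` to `stat24`.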
source: stackoverflow.com