
How To Append Data To A Panel Which Is Stored In HDFStore File

I have a Panel stored in an HDF5 file, and I want to append more data to it. Appending in memory works fine, but appending to the file raises the error shown after this code:

import pandas as pd
import numpy as np

df = pd.DataFrame(data=np.random.randn(5, 6), columns=('a', 'b', 'c', 'd', 'e', 'f'))
pw = pd.Panel(major_axis=df.columns, minor_axis=df.index)
pw2 = pd.Panel(major_axis=df.columns, minor_axis=df.index)
pw['A'] = df
pw['B'] = df * 2
pw['C'] = df * 3
pw2['D'] = df * 4

# The first write creates the table with items A, B and C.
pw.to_hdf('proc.h5', 'proc', mode='w', format='table', append=True)
# The second call appends a panel whose only item is D; this is the call that fails.
pw2.to_hdf('proc.h5', 'proc', mode='a', format='table', append=True)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 884, in to_hdf
    return pytables.to_hdf(path_or_buf, key, self, **kwargs)
  File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 279, in to_hdf
    f(store)
  File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 272, in <lambda>
    f = lambda store: store.append(key, value, **kwargs)
  File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 914, in append
    **kwargs)
  File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 1273, in _write_to_group
    s.write(obj=value, append=append, complib=complib, **kwargs)
  File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 3578, in write
    **kwargs)
  File "C:\Python27\lib\site-packages\pandas\io\pytables.py", line 3229, in create_axes
    item in items))
ValueError: cannot match existing table structure for [A,B,C] on appending data

Answer

The axes parameter is covered in the pandas HDFStore documentation.

Storing an object with more than two dimensions (a Panel has three) flattens it into a table structure. In this case the major_axis and minor_axis become the indices, and the items axis becomes the columns of the table.
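To make the flattened layout concrete, here is a minimal in-memory sketch. It does not touch HDF5 at all, and because Panel was removed in pandas 0.25 it uses a plain dict of DataFrames as a stand-in for the panel; the names items and flat are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(5, 6), columns=list("abcdef"))

# Stand-in for a Panel: one DataFrame per item (assumption: dict instead of pd.Panel).
items = {"A": df, "B": df * 2, "C": df * 3}

# Every (row label, column label) pair of a frame becomes one row of the table.
index = pd.MultiIndex.from_product(
    [df.index, df.columns], names=["major_axis", "minor_axis"]
)

# Each item becomes one column; ravel() walks the values row by row,
# which matches the order produced by from_product above.
flat = pd.DataFrame(
    {name: frame.to_numpy().ravel() for name, frame in items.items()},
    index=index,
)

print(flat.head())
```

The result has 30 rows (5 x 6 index pairs) and the three columns A, B and C, which is roughly the shape of the table that ends up on disk.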

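Appending is therefore allowed along either of the indices, so you can append a new panel whose major and/or minor axes differ. The items axis, however, is fixed the first time the table is appended to. Here pw2 supplies item D while the stored table has items A, B and C, which is why the append fails.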

To achieve efficiency, PyTables/HDF5 requires this fixed dimension.
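The same rule can be reproduced with a DataFrame, whose columns play the role of the fixed axis. This is a sketch under assumptions, not the original Panel case: Panel is gone from pandas 0.25 and later, PyTables must be installed, and the scratch file name demo.h5 is made up for the example.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.h5")

first = pd.DataFrame(np.random.randn(5, 3), columns=["A", "B", "C"])
more_rows = pd.DataFrame(
    np.random.randn(5, 3), columns=["A", "B", "C"], index=range(5, 10)
)
new_column = pd.DataFrame(np.random.randn(5, 1), columns=["D"])

with pd.HDFStore(path, mode="w") as store:
    store.append("proc", first)      # the first append fixes the columns A, B, C
    store.append("proc", more_rows)  # same columns, new index values: accepted
    stored = store.select("proc")

    try:
        store.append("proc", new_column)  # different columns: rejected
        rejected = False
    except ValueError as err:
        rejected = True
        print("append failed:", err)

print(stored.shape)
```

Appending rows extends the table, while the frame that only carries column D is refused with a ValueError, mirroring what happens when pw2 is appended to the table created from pw.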

You can specify different axes to append along if you like, e.g. axes=['items','major_axis'], or simply transpose the panel to get it into the form you need. The axes parameter must be specified on the first append.

You can view the structure that is created with ptdump -av <file.h5>.

source: stackoverflow.com