Making Arrays Of Columns (or Rows) Of A (space-delimited) Textfile In Python


I have seen similar questions, but the answers always give the rows as strings. I want to make arrays from the columns of a text file. The file looks like this, but has 106 rows and 19 columns:

O2     CO2     NOx     Ash     Other
20.9     1.6     0.04     0.0002    0.0
22.0     2.3     0.31     0.0005    0.0    
19.86     2.1     0.05     0.0002    0.0
17.06     3.01     0.28     0.006    0.001

I expect to get arrays of the columns (either a 2D array of all columns or a 1D array for each column). The first row contains only the names, so I would like it as a separate list. I want to plot the columns later.

The desired result would be for example for a column:

         0.31 ,
         0.28 ], dtype=float32)

and for the first row:

   species = ['O2', 'CO2', 'NOx', 'Ash', 'Other']


I'd recommend not looping over the values manually in a large data set (in this case, a delimited table of numbers). Instead, use the methods of a reliable, well-known library such as NumPy:

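import numpy as np

# Skip the header row, then transpose so that data[i] is the i-th column.
# delimiter="\t" assumes a TAB-separated file; adjust it to match your file.
data = np.transpose(np.loadtxt("/path/to/file.txt", skiprows=1, delimiter="\t"))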

The inner loadtxt reads the file, and the skiprows=1 parameter skips the first row (the column names), so that the numeric data is not mixed with strings and needs no further conversion. If you need this row as well, keep it in a separate list: inserting it at index 0 of the numeric array would not work, since a float array cannot hold strings. The result then has to be transposed, for which NumPy also provides a method. I passed the output of loadtxt (a 2D array with one row per line of the file) directly to transpose to get a one-liner. It is better, though, to keep the two calls apart: that avoids "train wrecks" and lets you inspect the intermediate result and correct anything unexpected.
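The advice about keeping the two calls apart could look roughly like this. It is only a sketch, not the answer's original code: the function name load_columns is hypothetical, and it assumes a whitespace-delimited file (the loadtxt default) with exactly one header row. dtype=np.float32 is used only to match the output requested in the question.

```python
import numpy as np


def load_columns(path):
    """Return (names, columns) for a whitespace-delimited table with one header row.

    Hypothetical helper for illustration; not part of NumPy.
    """
    # Read the header on its own: it holds strings, not numbers.
    with open(path) as handle:
        names = handle.readline().split()

    # Step 1: parse the numeric rows. ndmin=2 keeps the result two-dimensional
    # (n_rows, n_columns) even if the file has a single data row.
    rows = np.loadtxt(path, skiprows=1, dtype=np.float32, ndmin=2)

    # Step 2: transpose so that columns[i] is the i-th column as a 1D array.
    columns = rows.T
    return names, columns
```

With the sample file from the question, names would be the list of species, and columns[names.index("NOx")] would be the 1D array of the NOx column, ready for plotting.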

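PS: the delimiter parameter must match the delimiter used in the original file; I assumed it to be a TAB. If the columns are separated by spaces, omit the parameter, since loadtxt splits on any whitespace by default. Check the loadtxt documentation for more information. @KostasCharitidis, thanks for your note.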
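To illustrate the point about the delimiter, here is a small sketch that parses the same values from a space-separated and a TAB-separated string. The sample strings and the helper names read_columns and tab_delimiter_fails_on_spaces are assumptions made for this example; only np.loadtxt and its skiprows, delimiter and ndmin parameters come from NumPy.

```python
import io

import numpy as np

# Sample rows in the style of the question: columns separated by runs of spaces.
SPACE_SEPARATED = (
    "O2     CO2     NOx\n"
    "20.9     1.6     0.04\n"
    "22.0     2.3     0.31\n"
)
# The same values separated by single TAB characters.
TAB_SEPARATED = SPACE_SEPARATED.replace("     ", "\t")


def read_columns(text, delimiter=None):
    """Parse the numeric rows of `text` and return a 2D array with one row per column."""
    # delimiter=None (the loadtxt default) splits on any run of whitespace,
    # which covers both spaces and tabs.
    rows = np.loadtxt(io.StringIO(text), skiprows=1, delimiter=delimiter, ndmin=2)
    return rows.T


def tab_delimiter_fails_on_spaces():
    """Return True if delimiter='\\t' cannot parse the space-separated sample."""
    try:
        read_columns(SPACE_SEPARATED, delimiter="\t")
    except ValueError:
        # Each line is treated as one field, which is not a valid float.
        return True
    return False
```

The default delimiter handles both strings, while delimiter="\t" works only for the TAB-separated one, so for a file like the one in the question the parameter is probably best left out.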