How To Get The Column Header Of A Particular Column In Numpy

I have the following data

admit_data = np.genfromtxt('/content/drive/My Drive/Colab/admission_predict.csv', delimiter=',')

What I need is the header (name) of a particular column. I am using the following code to get the data, but I am not able to get the column names:

print(admit_data[1:].tolist())

Is there any function like .tolist() so that I can extract only that column's name?

Edit 1

Added sample data format

[Image: sample data format]

Answer

First, you need to read the column names from the CSV with np.genfromtxt(), e.g. by passing names=True. The column names then end up in the dtype and are available as data.dtype.names, e.g.:

import io
import numpy as np


data = np.genfromtxt(
    io.StringIO('A,B,C\n1,2,3\n4,5,6'),
    dtype=None, names=True, delimiter=',', encoding='utf8')
print(data)
# [(1, 2, 3) (4, 5, 6)]
print(data.dtype.names)
# ('A', 'B', 'C')
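
To get the header of one particular column, you can index that tuple of names by position, or look up the position of a known name. Below is a minimal sketch using the same toy CSV; the variable names (csv_text, second_name, position_of_c) are only for illustration. Keep in mind that genfromtxt may sanitize header text (for instance, replacing spaces with underscores), so the names can differ slightly from the raw header line.

```python
import io
import numpy as np

# Same toy CSV as above; the header row supplies the field names.
csv_text = 'A,B,C\n1,2,3\n4,5,6'
data = np.genfromtxt(
    io.StringIO(csv_text),
    dtype=None, names=True, delimiter=',', encoding='utf8')

names = data.dtype.names          # plain tuple of strings

second_name = names[1]            # header of the column at position 1
position_of_c = names.index('C')  # position of a known header

print(second_name)
# B
print(position_of_c)
# 2
print(data[second_name])          # use the name to pull out that column
# [2 5]
```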

However, please note that with data[1:] you are not selecting columns, but rows! To select columns, you have to use one or more of the names:

print(data[1:])
# [(4, 5, 6)]

print(data['A'])
# [1 4]

print(data[['A', 'B']])
# [(1, 2) (4, 5)]

More advanced indexing is actually a bit cumbersome, because the structured array is one-dimensional:

print(data.shape)
# (2,)
print(data[1:][0][1])
# 5
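
If you stay with NumPy, two options may make this less awkward: index by field name first and then by row, or convert the structured array to an ordinary 2-D array. The sketch below assumes all columns share a compatible numeric type (true for this toy data, not necessarily for your file) and uses structured_to_unstructured from numpy.lib.recfunctions:

```python
import io
import numpy as np
from numpy.lib import recfunctions as rfn

data = np.genfromtxt(
    io.StringIO('A,B,C\n1,2,3\n4,5,6'),
    dtype=None, names=True, delimiter=',', encoding='utf8')

# Option 1: pick the field by name, then the row by position.
value = data['B'][1]
print(value)
# 5

# Option 2: convert to a regular 2-D array (the column names are lost,
# so keep data.dtype.names around if you still need them).
plain = rfn.structured_to_unstructured(data)
print(plain.shape)
# (2, 3)
print(plain[1, 1])
# 5
```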

On the other hand, pandas offers a much more direct syntax, which is one of the main reasons it is the preferred tool for this use case:

import io
import pandas as pd


df = pd.read_csv(io.StringIO('A,B,C\n1,2,3\n4,5,6'))


print(df['A'])
# 0    1
# 1    4
# Name: A, dtype: int64

print(df['A'][0])
# 1
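
Since the question is specifically about column headers, here is a short sketch of how that could look in pandas, again on the toy data. The variable names are illustrative; df.columns, get_loc and iloc are standard pandas:

```python
import io
import pandas as pd

df = pd.read_csv(io.StringIO('A,B,C\n1,2,3\n4,5,6'))

all_names = df.columns.tolist()          # every header as a plain list
second_name = df.columns[1]              # header of the column at position 1
position_of_c = df.columns.get_loc('C')  # position of a known header
value = df.iloc[1, 1]                    # row 1, column 1, by position

print(all_names)
# ['A', 'B', 'C']
print(second_name)
# B
print(position_of_c)
# 2
print(value)
# 5
```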
source: stackoverflow.com