# Python - Scipy: Multivariate_normal - Select The Right Subsets Of Input

## 16 October 2019 - 1 answer

Any help that pushes me towards the right solution is greatly appreciated...

I am trying to do a classification in two steps:

1.) Calculate mu, sigma, and pi on the training set.

2.) Create a test routine that takes

- mu, sigma, pi
- an array of Feature IDs
- testx and testy.

Part 1.) works. It returns

- mu, shape (4, 13)
- sigma, shape (4, 13, 13)
- pi, shape (4,)

``````
import numpy as np

def fit_generative_model(x, y):
    k = 3  # labels 1,2,...,k
    d = x.shape[1]  # number of features
    mu = np.zeros((k+1, d))
    sigma = np.zeros((k+1, d, d))
    pi = np.zeros(k+1)
    for label in range(1, k+1):
        indices = (y == label)
        mu[label] = np.mean(x[indices,:], axis=0)
        sigma[label] = np.cov(x[indices,:], rowvar=0, bias=1)
        pi[label] = float(sum(indices))/float(len(y))
    return mu, sigma, pi
``````
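As a quick sanity check of the shapes listed for Part 1, here is a minimal sketch that runs the same fitting logic on synthetic data. The arrays `demo_x` and `demo_y` and the `k=3` default argument are assumptions made for illustration; they are not part of the original post. Row 0 of each result stays at zero because the labels run from 1 to k.

```python
import numpy as np

def fit_generative_model(x, y, k=3):
    """Per-class mean, covariance, and prior for labels 1..k (row 0 stays unused)."""
    d = x.shape[1]  # number of features
    mu = np.zeros((k + 1, d))
    sigma = np.zeros((k + 1, d, d))
    pi = np.zeros(k + 1)
    for label in range(1, k + 1):
        indices = (y == label)
        mu[label] = np.mean(x[indices, :], axis=0)
        sigma[label] = np.cov(x[indices, :], rowvar=False, bias=True)
        pi[label] = indices.sum() / len(y)
    return mu, sigma, pi

# Hypothetical stand-in data: 90 samples, 13 features, labels 1..3.
rng = np.random.default_rng(0)
demo_x = rng.normal(size=(90, 13))
demo_y = np.repeat([1, 2, 3], 30)

mu, sigma, pi = fit_generative_model(demo_x, demo_y)
print(mu.shape, sigma.shape, pi.shape)
```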

Part 2.) does not work, as I seem to be unable to select the right subsets of mu and sigma:

``````
from scipy.stats import multivariate_normal

def test_model(mu, sigma, pi, features, tx, ty):
    mu, sigma, pi = fit_generative_model(trainx,trainy)
    # set the variables
    k = 3 # Labels 1,2,...,k
    nt = len(testy)
    score = np.zeros((nt,k+1))
    covar = sigma
    for i in range(0,nt):
        for label in range(1,k+1):
            score[i,label] = np.log(pi[label]) + \
                multivariate_normal.logpdf(testx[i,features], mean=mu[label,:], cov=covar[label,:,:])
    predictions = np.argmax(score[:,1:4], axis=1) + 1

    errors = np.sum(predictions != testy)

    return errors
``````

It should return the number of mistakes made by the generative model on the test data when restricted to the specified features.

`mean=mu[label,features], cov=covar[label,features,features]`
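A note on the fragment above: when `features` is a list or integer array, NumPy pairs the two index arrays elementwise, so `covar[label, features, features]` returns only the diagonal entries (a vector of length m) rather than the m-by-m sub-covariance. One way to get the full block is `np.ix_`. The sketch below shows the difference and then uses it in a scoring routine. The helper name `count_errors_on_features`, the `k=3` default, and the synthetic data are assumptions made for illustration, not the original poster's code.

```python
import numpy as np
from scipy.stats import multivariate_normal

def count_errors_on_features(mu, sigma, pi, features, tx, ty, k=3):
    """Count misclassified rows of tx using only the columns listed in `features`."""
    features = np.asarray(features)
    score = np.zeros((len(ty), k + 1))
    for label in range(1, k + 1):
        mean_sub = mu[label, features]                      # shape (m,)
        cov_sub = sigma[label][np.ix_(features, features)]  # shape (m, m)
        # logpdf accepts a batch of points, so no loop over test rows is needed
        score[:, label] = np.log(pi[label]) + multivariate_normal.logpdf(
            tx[:, features], mean=mean_sub, cov=cov_sub)
    predictions = np.argmax(score[:, 1:k + 1], axis=1) + 1
    return int(np.sum(predictions != ty))

# Hypothetical synthetic data: three well-separated classes, 13 features.
rng = np.random.default_rng(0)
k, d, n_per_class = 3, 13, 40
labels = np.repeat(np.arange(1, k + 1), n_per_class)
data = rng.normal(size=(k * n_per_class, d)) + 10.0 * labels[:, None]

# Fit per-class parameters the same way as in Part 1 (row 0 unused).
mu = np.zeros((k + 1, d))
sigma = np.zeros((k + 1, d, d))
pi = np.zeros(k + 1)
for label in range(1, k + 1):
    rows = data[labels == label]
    mu[label] = rows.mean(axis=0)
    sigma[label] = np.cov(rows, rowvar=False, bias=True)
    pi[label] = len(rows) / len(labels)

features = [0, 2, 6]
diag_only = sigma[1, features, features]          # diagonal entries only
sub_block = sigma[1][np.ix_(features, features)]  # full sub-covariance block
print(diag_only.shape, sub_block.shape)

errors = count_errors_on_features(mu, sigma, pi, features, data, labels)
print(errors)
```

The sketch scores the training data itself only to keep the example short; with real data, `tx` and `ty` would be the held-out test set.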