
Python - Scipy: Multivariate_normal - Select The Right Subsets Of Input

- 1 answer

Any help that pushes me towards the right solution is greatly appreciated...

I am trying to do a classification in two steps:

1. Calculate mu, sigma, and pi on the training set.
2. Create a test routine that takes
   - mu, sigma, pi
   - an array of feature IDs
   - testx and testy.

Part 1 works. It returns:

- mu     # shape (4, 13)
- sigma  # shape (4, 13, 13)
- pi     # shape (4,)

import numpy as np

def fit_generative_model(x, y):
    k = 3                          # labels 1, 2, ..., k
    d = x.shape[1]                 # number of features
    mu = np.zeros((k+1, d))        # index 0 unused so labels index directly
    sigma = np.zeros((k+1, d, d))
    pi = np.zeros(k+1)
    for label in range(1, k+1):
        indices = (y == label)
        mu[label] = np.mean(x[indices, :], axis=0)
        sigma[label] = np.cov(x[indices, :], rowvar=False, bias=True)
        pi[label] = np.sum(indices) / len(y)
    return mu, sigma, pi
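As a quick sanity check, the fitting step can be exercised on synthetic data. This is a self-contained sketch (the function is repeated so the snippet runs on its own); the sample size and label range are made up, but the shapes are chosen to match the (4, 13) description above:

```python
import numpy as np

def fit_generative_model(x, y):
    k = 3
    d = x.shape[1]
    mu = np.zeros((k+1, d))
    sigma = np.zeros((k+1, d, d))
    pi = np.zeros(k+1)
    for label in range(1, k+1):
        indices = (y == label)
        mu[label] = np.mean(x[indices, :], axis=0)
        sigma[label] = np.cov(x[indices, :], rowvar=False, bias=True)
        pi[label] = np.sum(indices) / len(y)
    return mu, sigma, pi

# Synthetic data: 90 samples, 13 features, labels drawn from {1, 2, 3}
rng = np.random.default_rng(0)
x = rng.normal(size=(90, 13))
y = rng.integers(1, 4, size=90)

mu, sigma, pi = fit_generative_model(x, y)
print(mu.shape, sigma.shape, pi.shape)  # (4, 13) (4, 13, 13) (4,)
```

The class priors in `pi[1:]` sum to 1, and row 0 of each array stays zero, which is what lets `label` index the arrays directly.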

Part 2 does not work: I seem unable to select the right subsets of mu and sigma.

from scipy.stats import multivariate_normal

def test_model(mu, sigma, pi, features, tx, ty):
    k = 3  # labels 1, 2, ..., k
    nt = len(ty)
    score = np.zeros((nt, k+1))
    for i in range(nt):
        for label in range(1, k+1):
            score[i, label] = np.log(pi[label]) + \
                multivariate_normal.logpdf(tx[i, features], mean=mu[label, :], cov=sigma[label, :, :])
    predictions = np.argmax(score[:, 1:k+1], axis=1) + 1
    errors = np.sum(predictions != ty)
    return errors

It should return the number of mistakes made by the generative model on the test data when restricted to the specified features.


Answer

Try subsetting both the mean and the covariance by `features`:

mean=mu[label, features], cov=sigma[label][np.ix_(features, features)]

Note that `sigma[label, features, features]` would return only the diagonal entries (a 1-D array, which scipy would treat as a diagonal covariance); `np.ix_` selects the full submatrix, covariances included.
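A small sketch of the difference between the two indexing styles, using a toy 4x4 array (not a real covariance matrix, just made-up values to make the selected entries easy to spot) and hypothetical feature indices:

```python
import numpy as np

cov = np.arange(16).reshape(4, 4).astype(float)
features = np.array([0, 2])

# Passing the same index array twice pairs the indices elementwise:
# this picks cov[0, 0] and cov[2, 2] only, i.e. the diagonal entries.
diag_only = cov[features, features]
print(diag_only)   # [ 0. 10.]

# np.ix_ builds an open mesh, selecting rows 0, 2 and columns 0, 2:
# the full 2x2 submatrix, off-diagonal covariances included.
submatrix = cov[np.ix_(features, features)]
print(submatrix)   # [[ 0.  2.]
                   #  [ 8. 10.]]
```

For a multivariate Gaussian restricted to a feature subset, the full submatrix is the one to pass as `cov`, since dropping the off-diagonal terms would silently assume the selected features are uncorrelated.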

source: stackoverflow.com