Error When Calculating Predicted Values Of A Polynomial Regression In Python

I am trying to calculate predicted values after running a polynomial regression in Python using the following code:

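import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

np.random.seed(0)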
n = 15
x = np.linspace(0,10,n) + np.random.randn(n)/5
y = np.sin(x) + x/6 + np.random.randn(n)/10

X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)

X = X_train.reshape(-1, 1)
X_predict = np.linspace(0, 10, 100)

poly = PolynomialFeatures(degree=2)
X_train_poly = poly.fit_transform(X)

model = LinearRegression()
reg_poly = model.fit(X_train_poly, y_train)
y_predict = model.predict(X_predict)

After running it I get the following error:

ValueError: Expected 2D array, got 1D array instead:
array=[ 0.          0.1010101   0.2020202   0.3030303   0.4040404   0.50505051  ......
Reshape your data either using array.reshape(-1, 1) 
if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

I tried reshaping the array as the error message suggests, so the last line of code became:

y_predict = model.predict(X_predict.reshape(-1,1))

But as a result I got this error:

ValueError: shapes (100,1) and (3,) not aligned: 1 (dim 1) != 3 (dim 0)

Can someone please explain what I am doing wrong?

Answer

The model was trained on polynomial features, so the data you predict on has to go through the same transformation. X_train_poly has three columns (1, x, x^2), while X_predict.reshape(-1, 1) has only one, which is why the second error reports that shapes (100,1) and (3,) are not aligned.

The number of columns passed to predict must match the number used for training, so apply the already-fitted poly object to X_predict as well. Use transform rather than fit_transform here: the transformer was fitted on the training data and should not be refitted on the data you predict on. Your last line should therefore look like:

y_predict = model.predict(poly.transform(X_predict.reshape(-1, 1)))
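
For reference, here is a minimal end-to-end sketch of the corrected script, assuming NumPy and scikit-learn are installed. It reuses the question's data and settings; the names X_train_2d, X_predict_2d and X_predict_poly are introduced here only for illustration and do not appear in the original code. The final print shows that the training and prediction matrices end up with the same number of columns.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# Same synthetic data as in the question.
np.random.seed(0)
n = 15
x = np.linspace(0, 10, n) + np.random.randn(n) / 5
y = np.sin(x) + x / 6 + np.random.randn(n) / 10

X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)

# scikit-learn expects 2D input: one row per sample, one column per feature.
# (X_train_2d and X_predict_2d are illustrative names, not from the question.)
X_train_2d = X_train.reshape(-1, 1)
X_predict_2d = np.linspace(0, 10, 100).reshape(-1, 1)

# Fit the transformer on the training data only, then reuse it for prediction.
poly = PolynomialFeatures(degree=2)
X_train_poly = poly.fit_transform(X_train_2d)   # columns: 1, x, x^2
X_predict_poly = poly.transform(X_predict_2d)   # same three columns

model = LinearRegression()
model.fit(X_train_poly, y_train)
y_predict = model.predict(X_predict_poly)

# Both feature matrices now have the same number of columns.
print(X_train_poly.shape, X_predict_poly.shape, y_predict.shape)
```

The same pattern applies to X_test: reshape it to a single column and pass it through poly.transform before calling model.predict or model.score.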
source: stackoverflow.com