Are The Indices That KFold.split Returns For A DataFrame Meant For Iloc Or Loc?
When we call KFold.split(X), where X is a DataFrame, are the indices it generates for splitting the data into training and test sets meant for iloc
(purely integer-location based indexing for selection by position) or for loc
(selection of a group of rows and columns by label(s))?
Answer
You need DataFrame.iloc to select rows by position, because KFold.split returns integer positions, not index labels:
Sample:
import numpy as np
import pandas as pd

np.random.seed(100)
df = pd.DataFrame(np.random.random((10,5)), columns=list('ABCDE'))
#changed default index values
df.index = df.index * 10
print (df)
           A         B         C         D         E
0   0.543405  0.278369  0.424518  0.844776  0.004719
10  0.121569  0.670749  0.825853  0.136707  0.575093
20  0.891322  0.209202  0.185328  0.108377  0.219697
30  0.978624  0.811683  0.171941  0.816225  0.274074
40  0.431704  0.940030  0.817649  0.336112  0.175410
50  0.372832  0.005689  0.252426  0.795663  0.015255
60  0.598843  0.603805  0.105148  0.381943  0.036476
70  0.890412  0.980921  0.059942  0.890546  0.576901
80  0.742480  0.630184  0.581842  0.020439  0.210027
90  0.544685  0.769115  0.250695  0.285896  0.852395
from sklearn.model_selection import KFold
#added some parameters
kf = KFold(n_splits = 5, shuffle = True, random_state = 2)
result = next(kf.split(df), None)
print (result)
(array([0, 2, 3, 5, 6, 7, 8, 9]), array([1, 4]))
train = df.iloc[result[0]]
test = df.iloc[result[1]]
print (train)
           A         B         C         D         E
0   0.543405  0.278369  0.424518  0.844776  0.004719
20  0.891322  0.209202  0.185328  0.108377  0.219697
30  0.978624  0.811683  0.171941  0.816225  0.274074
50  0.372832  0.005689  0.252426  0.795663  0.015255
60  0.598843  0.603805  0.105148  0.381943  0.036476
70  0.890412  0.980921  0.059942  0.890546  0.576901
80  0.742480  0.630184  0.581842  0.020439  0.210027
90  0.544685  0.769115  0.250695  0.285896  0.852395
print (test)
           A         B         C         D         E
10  0.121569  0.670749  0.825853  0.136707  0.575093
40  0.431704  0.940030  0.817649  0.336112  0.175410
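The sample above takes only the first split. As a minimal sketch of the same idea over every fold, the loop below uses iloc for each split and also shows one way to get the same rows with loc, by translating positions into labels first. The small integer frame and the names folds, first_test and by_label are made up for illustration; shuffle is left off so the folds are consecutive blocks.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

# Illustrative frame whose labels (0, 10, ..., 90) differ from positions (0..9)
df = pd.DataFrame(np.arange(20).reshape(10, 2), columns=["A", "B"])
df.index = df.index * 10

# Without shuffle, the test folds are consecutive blocks of positions
kf = KFold(n_splits=5)

folds = []
for train_pos, test_pos in kf.split(df):
    # train_pos / test_pos are integer positions, so iloc is the right accessor
    train = df.iloc[train_pos]
    test = df.iloc[test_pos]
    folds.append((train, test))

# First test fold: positions 0 and 1, which carry the labels 0 and 10
first_test = folds[0][1]
print(first_test)

# To use loc instead, map the positions to labels through the index
first_test_pos = next(kf.split(df))[1]
by_label = df.loc[df.index[first_test_pos]]
print(by_label.equals(first_test))
```

Passing the positions straight to loc here (for example df.loc[[0, 1]]) would look up the labels 0 and 1 instead. Label 1 does not exist in this index, so recent pandas versions should raise a KeyError. With a default RangeIndex the two accessors happen to agree, which is why the difference is easy to miss.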
source: stackoverflow.com