Machine Learning

Best Practices

September 15, 2021

Feature Engineering

Normalize parameter to get the percentage

The following example shows how to use the normalize parameter of value_counts to get the size of each group as a percentage instead of a raw count.

    print(y_test.value_counts(normalize=True) * 100)

Output:

    Exited
    0    79.25
    1    20.75

Split to Train, Test and Validation

The following example shows how to divide the data into temporary and test sets with a ratio of 80:20, then divide the temporary set into train and validation sets with a ratio of 75:25. Because 75% of the 80% temporary set is 60% of the full data, the final split is 60% train, 20% validation and 20% test. Passing stratify keeps the class proportions the same in every subset.

    from sklearn.model_selection import train_test_split

    # first we split data into 2 parts, say temporary and test
    X_temp, X_test, y_temp, y_test = train_test_split(
        X, y, test_size=0.2, random_state=1, stratify=y
    )

    # then we split the temporary set into train and validation
    X_train, X_val, y_train, y_val = train_test_split(
        X_temp, y_temp, test_size=0.25, random_state=1, stratify=y_temp
    )

    print(X_train.shape, X_val.shape, X_test.shape)

Data Encoding

    from sklearn.preprocessing import LabelEncoder
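The post breaks off after the LabelEncoder import, so the following is only a minimal sketch of how the encoder is commonly used, not the original example. The label values ("stayed", "exited") and the variable names are hypothetical and chosen purely for illustration. The sketch fits the encoder on the training labels only and reuses that mapping for the test labels, so both sets share the same integer codes.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical label values, used only for illustration.
y_train_raw = ["stayed", "exited", "stayed", "stayed"]
y_test_raw = ["exited", "stayed"]

encoder = LabelEncoder()

# Learn the label-to-integer mapping from the training labels only.
y_train_enc = encoder.fit_transform(y_train_raw)

# Reuse the same mapping for the test labels.
y_test_enc = encoder.transform(y_test_raw)

# classes_ holds the sorted unique labels; each label's index is its code.
print(list(encoder.classes_))
print(y_train_enc.tolist())
print(y_test_enc.tolist())

# inverse_transform maps the integer codes back to the original labels.
print(encoder.inverse_transform(y_test_enc).tolist())
```

LabelEncoder is documented by scikit-learn as a tool for encoding target labels. For categorical input features, OrdinalEncoder or OneHotEncoder from the same module are usually the better fit.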
