
Model Training

Model training is the core process of machine learning, in which an algorithm learns patterns from data so that it can make predictions or decisions on new inputs.

Steps in Model Training

  1. Data Split: Divide data into training, validation, and test sets.
  2. Model Selection: Choose an appropriate algorithm for your problem.
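  3. Hyperparameter Tuning: Search for the hyperparameter values (settings fixed before training, such as tree depth or number of trees) that give the best validation performance.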
  4. Model Fitting: Train the model on the training data.
  5. Model Validation: Evaluate the model on the validation set.
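Step 1 asks for three sets, while a single call to train_test_split produces only two. A minimal sketch of one common way to get all three is to call it twice. The synthetic data from make_classification and the 60/20/20 proportions are assumptions chosen for illustration, not part of the original text.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption; substitute your own X and y).
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# First hold out 20% of the data as the test set.
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Then split the remaining 80% into training (60% of the total)
# and validation (20% of the total): 0.25 * 0.8 = 0.2.
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=0.25, random_state=42, stratify=y_temp
)

print(len(X_train), len(X_val), len(X_test))  # 600 200 200
```

The validation set is used to compare models and hyperparameters; the test set is left untouched until the final evaluation.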

Data Splitting

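from sklearn.model_selection import train_test_split

# X (feature matrix) and y (labels) are assumed to be loaded already.
# Hold out 20% of the data for testing; random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)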

# Model Selection
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Hyperparameter Tuning
from sklearn.model_selection import GridSearchCV

param_grid = {
    'n_estimators': [100, 200, 300],
    'max_depth': [None, 5, 10, 15]
}

grid_search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
grid_search.fit(X_train, y_train)

best_model = grid_search.best_estimator_

# Cross-validation
from sklearn.model_selection import cross_val_score

scores = cross_val_score(model, X_train, y_train, cv=5)
print(f"Cross-validation scores: {scores}")
print(f"Average score: {scores.mean()}")
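The snippets above stop before the held-out test set is used. As a hedged sketch, the final evaluation could look like the following. The synthetic data, the reduced parameter grid (kept small so the example runs quickly), and accuracy as the metric are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (an assumption; substitute your own X and y).
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# A deliberately small grid so the search finishes quickly.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
grid_search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
grid_search.fit(X_train, y_train)

# With the default refit=True, best_estimator_ is retrained on all of X_train.
best_model = grid_search.best_estimator_

# Score once on data that played no part in training or tuning.
test_accuracy = accuracy_score(y_test, best_model.predict(X_test))
print(f"Best parameters: {grid_search.best_params_}")
print(f"Test accuracy: {test_accuracy:.3f}")
```

Because the test set was not seen during fitting or the grid search, this score is a less biased estimate of performance on new data than the cross-validation scores used for tuning.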