How Decision Trees Work in ML

Decision Trees are like flowcharts: they split the data based on conditions on feature values. They are used for both classification and regression.

A decision tree works by using feature values to split the dataset into smaller, more manageable subgroups. Every internal node represents a test on a feature, every branch denotes an outcome of that test, and every leaf node represents a class label (the decision).
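
To make the flowchart idea concrete, here is a minimal sketch (using the same Wine dataset as the example later in this post, with the depth capped purely so the printout stays readable) that trains a small tree and prints its splits with scikit-learn's export_text:

from sklearn.datasets import load_wine
from sklearn.tree import DecisionTreeClassifier, export_text

# Keep the tree shallow so the printed flowchart stays short
wine = load_wine()
tree = DecisionTreeClassifier(max_depth=2, random_state=42)
tree.fit(wine.data, wine.target)

# Indented lines are internal nodes (feature tests);
# the "class:" lines are leaf nodes (decisions)
print(export_text(tree, feature_names=list(wine.feature_names)))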

Evaluation Metrics:

For classification: Accuracy, precision, recall, and F1 score.

For regression: Mean Squared Error (MSE) and R-squared.
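
The classification metrics are used in the worked example below. For the regression side, here is a minimal sketch (using scikit-learn's Diabetes dataset purely as an illustration) of computing MSE and R-squared for a DecisionTreeRegressor:

from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error, r2_score

# Illustrative regression task: predict disease progression from 10 features
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

reg = DecisionTreeRegressor(max_depth=3, random_state=42)
reg.fit(X_train, y_train)
y_pred = reg.predict(X_test)

print("MSE:", mean_squared_error(y_test, y_pred))
print("R-squared:", r2_score(y_test, y_pred))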

Applying it with scikit-learn:

We’ll use the Wine dataset for the Decision Tree example, a classification task: classifying wines into three types based on different attributes. We’ll train the model, predict wine types, and evaluate it using classification metrics.

Here are the steps to follow to write the code:

1. Load the Wine dataset:

The Wine dataset contains the results of a chemical analysis of three distinct varieties of wine produced in the same region of Italy. The analysis identified thirteen constituents, found in different amounts in each of the three categories of wine.
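
A quick sketch to confirm this after loading the data (the printed values reflect scikit-learn's copy of the dataset: 178 samples, 13 features, 3 classes):

from sklearn.datasets import load_wine

wine = load_wine()
print(wine.data.shape)      # (178, 13): 178 wines described by 13 chemical attributes
print(wine.feature_names)   # e.g. 'alcohol', 'malic_acid', 'flavanoids', ...
print(wine.target_names)    # ['class_0' 'class_1' 'class_2']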

2. Split the dataset:

The dataset is split into training and testing sets. This is done so the model is trained on one part of the data (the training set) and its performance is tested on unseen data (the testing set). We use 80% of the data for training and 20% for testing.
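
A minimal sketch of that split; the stratify=y argument is an optional addition (not used in the main snippet below) that keeps the three wine classes in roughly the same proportions in both sets:

from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True)

# 80% for training, 20% for testing; stratify keeps the class proportions balanced
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(len(X_train), len(X_test))  # 142 36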

3. Create and train a Decision Tree model:

A DecisionTreeClassifier is created and learns from the training data. It builds a tree-like model of decisions in which each internal node tests a feature of the dataset and the branches represent the decision rules that lead to different outcomes or classifications.
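
The tree-building process can also be steered with hyperparameters; a short sketch (the specific values here are illustrative, not tuned for the Wine dataset):

from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# criterion picks the split-quality measure; max_depth limits how far the tree grows
model = DecisionTreeClassifier(criterion="gini", max_depth=4, random_state=42)
model.fit(X_train, y_train)

print("Tree depth:", model.get_depth())
print("Number of leaves:", model.get_n_leaves())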

4. Predict and evaluate:

The model is used to predict the classifications of the test set. Its performance is then assessed by comparing these predictions with the actual labels.
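
Besides the four metrics used in the full snippet below, scikit-learn can break this comparison down per class; a minimal sketch:

from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report, confusion_matrix

wine = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.2, random_state=42
)

model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Per-class precision/recall/F1 plus a confusion matrix of predictions vs. actual labels
print(classification_report(y_test, y_pred, target_names=wine.target_names))
print(confusion_matrix(y_test, y_pred))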

Putting it all together, here is the full code snippet:

from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Load the Wine dataset
wine = load_wine()
X, y = wine.data, wine.target

# Splitting the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Creating and training the Decision Tree model
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# Predicting the test set results
y_pred = model.predict(X_test)

# Evaluating the model
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='macro')
recall = recall_score(y_test, y_pred, average='macro')
f1 = f1_score(y_test, y_pred, average='macro')

# Print the results
print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)
print("F1 Score:", f1)


Running the code prints the accuracy, precision, recall, and F1 score for the test set.


These results indicate that the Decision Tree model performs very well on this dataset. The high precision suggests that when it predicts a particular class of wine, it’s usually correct.
