Posts

Showing posts from August, 2024

K-Nearest Neighbors (KNN)

K-Nearest Neighbors (KNN) is an easy-to-understand approach for both classification and regression. A data point is classified according to the classes of its neighbors: KNN looks at the ‘K’ closest points (neighbors) to a data point and assigns it the majority class among those neighbors. For regression, it takes the average of the ‘K’ nearest points. Evaluation metrics for classification: Accuracy, Precision, Recall, F1 Score; for regression: Mean Squared Error (MSE), R-squared. Applying with scikit-learn: We’ll use the Wine dataset again, this time with KNN. We’ll train the KNN model to classify the types of wine and evaluate its performance with classification metrics. Here are the steps we’ll follow. 1. Create and Train the KNN Model: A K-Nearest Neighbors (KNN) model is created with n_neighbors=3, meaning the model looks at the three nearest neighbors of a data point to make a prediction. The model is trained (fitted) with the training data. During training, it does...
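A minimal sketch of this workflow, assuming the usual scikit-learn train/test split (the exact code in the full post is truncated here, so variable names and split parameters are illustrative):

```python
# Sketch of the KNN workflow described above (illustrative, not the post's exact code)
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report

# Load the Wine dataset and split into training and test sets
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create and train the KNN model with 3 neighbors
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Predict wine types and evaluate with classification metrics
y_pred = knn.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
```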

What is BERT?

BERT (Bidirectional Encoder Representations from Transformers) is a deep learning model developed by Google in 2018 that revolutionized natural language processing (NLP). It is based on the Transformer architecture and has significantly improved the state of the art in various NLP tasks, such as sentiment analysis, question answering, and named entity recognition. Key Concepts of BERT: 1. Transformer Architecture: BERT is built on the Transformer architecture, introduced by Vaswani et al. in 2017. The Transformer model relies on self-attention mechanisms to process input data in a parallelized manner, which is more efficient than traditional RNNs and LSTMs. 2. Bidirectional Training: Unlike previous models like GPT (which is unidirectional and processes text from left to right), BERT is bidirectional. This means BERT looks at both the left and right context simultaneously when processing words. This bidirectional approach allows BERT to capture richer contextual in...
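To make the idea of contextual, bidirectional encoding concrete, here is a minimal sketch that pulls token embeddings from a pretrained BERT model. It assumes the Hugging Face `transformers` library and PyTorch, which the post itself does not mention:

```python
# Sketch: extracting contextual embeddings from a pretrained BERT model
# (assumes Hugging Face `transformers` and PyTorch are installed)
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# Tokenize a sentence; BERT attends to both the left and right context of every token
inputs = tokenizer("BERT reads text bidirectionally.", return_tensors="pt")
outputs = model(**inputs)

# One contextual vector per token: shape (batch_size, sequence_length, hidden_size)
print(outputs.last_hidden_state.shape)
```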

A Beginner's Guide to Probabilistic Classifiers

Naive Bayes classifiers are a family of simple probabilistic classifiers that apply Bayes’ theorem with strong (naive) independence assumptions between the features. They are particularly popular for text classification. The model estimates the probability of each class and the conditional probability of each input value given each class; these probabilities are then combined to assign a new value to the class with the highest probability. Evaluation Metrics: Accuracy measures the overall correctness of the model. Precision, Recall, and F1 Score are especially important where the class distribution is imbalanced. Applying with scikit-learn: We’ll use the Digits dataset, which involves classifying images of handwritten digits (0–9). This is a multi-class classification problem. We’ll train the Naive Bayes model, predict digit classes, and evaluate using classification metrics. Here are the steps we’ll follow. Load the Digits Dataset: The Digits dataset consists of 8x8 pixel images of handwritten...
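A minimal sketch of these steps, assuming a Gaussian Naive Bayes variant (the post’s preview does not say which variant it uses, and the variable names are illustrative):

```python
# Sketch of the Naive Bayes workflow described above (illustrative only)
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report

# Load the Digits dataset (8x8 images of handwritten digits, 10 classes)
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train a Gaussian Naive Bayes classifier and predict digit classes
nb = GaussianNB()
nb.fit(X_train, y_train)
y_pred = nb.predict(X_test)

# Evaluate with classification metrics
print("Accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
```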

How Decision Trees Work in ML

Decision Trees are like flowcharts, splitting the data based on certain conditions or features. They are applied to both classification and regression. A tree works by using feature values to split the dataset into smaller, more manageable subgroups: every internal node represents an attribute test, every branch denotes the test’s outcome, and every leaf node represents a class label (decision). Evaluation Metrics: for classification, Accuracy, Precision, Recall, and F1 Score; for regression, Mean Squared Error (MSE) and R-squared. Applying with scikit-learn: We’ll use the Wine dataset for Decision Trees, a classification task about sorting wines into three types based on different attributes. We’ll train the model, predict wine types, and evaluate it using classification metrics. Here are the steps we’ll follow. 1. Load the Wine dataset: Chemical analyses of three distinct varieties of wines produced in the same region of Italy are ...
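A minimal sketch of this workflow, assuming scikit-learn’s default tree settings and an illustrative train/test split (the post’s full code is truncated here):

```python
# Sketch of the Decision Tree workflow described above (illustrative only)
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report

# Load the Wine dataset (three wine types, 13 chemical attributes)
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train the Decision Tree and predict wine types
tree = DecisionTreeClassifier(random_state=42)
tree.fit(X_train, y_train)
y_pred = tree.predict(X_test)

# Evaluate with classification metrics
print("Accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
```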

An Easier Way to Learn Logistic Regression

Logistic Regression is used for classification problems. It predicts the probability that a given data point belongs to a certain class, like yes/no or 0/1, using a logistic function to output a value between 0 and 1. This value is then mapped to a specific class based on a threshold (usually 0.5). Evaluation Metrics: Accuracy is the ratio of correctly predicted observations to total observations. Precision is the ratio of correctly predicted positive observations to all predicted positive observations, and Recall is the proportion of correctly predicted positive observations to all observations in the actual positive class. The F1 Score balances precision and recall (their harmonic mean). Applying with scikit-learn: We’ll use the Breast Cancer dataset, another preloaded dataset in scikit-learn. It’s a binary classification task, making it suitable for Logistic Regression. Here are the steps we’ll follow to apply logistic regression. Load the Breast Cancer...
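A minimal sketch of these steps, assuming an illustrative train/test split and default solver settings (the post’s full code is truncated above):

```python
# Sketch of the Logistic Regression workflow described above (illustrative only)
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Load the Breast Cancer dataset (binary classification: malignant vs. benign)
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train the model; max_iter raised so the solver converges on this dataset
clf = LogisticRegression(max_iter=10000)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Evaluate with the metrics discussed above
print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall   :", recall_score(y_test, y_pred))
print("F1 Score :", f1_score(y_test, y_pred))
```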