In many machine learning problems, the target variable contains more than two classes.

Let's understand how to visualize and evaluate model performance when there are more than 2 classes.

Let's assume our target variable has the below three classes:

# Let's assume three classes

D = 'Dog'

C = 'Cat'

R = 'Rat'

# Since we are just testing, let's define our actual and predicted values

y_test = [D, D, D, D, D, D, C, C, C, C, C, C, R, R, R, R, R, R]

y_predict = [D, D, D, C, D, D, C, C, C, R, D, C, R, R, R, D, R, R]

**Python Implementation**

from sklearn import metrics

# Calculate the confusion matrix

Conf_metrics = metrics.confusion_matrix(y_test, y_predict)

print(Conf_metrics)

**Picture 1:** All predicted values are shown vertically for a given class, and all the actual values are…
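As a hedged end-to-end illustration, the sketch below reproduces the example and adds scikit-learn's `classification_report` for per-class precision and recall (the `labels` argument, which fixes the row/column order to Dog, Cat, Rat, is an addition not shown above):

```python
from sklearn import metrics

D, C, R = 'Dog', 'Cat', 'Rat'
y_test = [D] * 6 + [C] * 6 + [R] * 6
y_predict = [D, D, D, C, D, D, C, C, C, R, D, C, R, R, R, D, R, R]

# Rows = actual classes, columns = predicted classes, in the order Dog, Cat, Rat
conf = metrics.confusion_matrix(y_test, y_predict, labels=[D, C, R])
print(conf)

# Per-class precision, recall and F1 derived from the same matrix
print(metrics.classification_report(y_test, y_predict))
```

The diagonal holds the correct predictions for each class; everything off the diagonal is a misclassification.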

In statistics we have various types of tests to validate a single group or multiple groups. Here we can think of a group as a feature in the data set. These features can be either categorical or numerical.

With the help of the T-test and the Chi-square test we can conclude whether the given groups/features are independent of or dependent on each other.
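For the Chi-square side (used for categorical features), a minimal sketch with `scipy.stats.chi2_contingency` on a hypothetical 2×2 contingency table (the table values are made up purely for illustration):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows and columns are two categorical features
table = np.array([[20, 30],
                  [25, 25]])

# Null hypothesis: the two categorical features are independent
chi2, p, dof, expected = chi2_contingency(table)
print(chi2, p, dof)  # reject independence if p < 0.05
```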

When we want to compare one or two numerical variables for independence we use the T-test.

**It is of three types:**

1. 1-Sample T-test

2. 2-Sample T-test or Independent T-test

3. Paired T-test

**1-Sample T-test**

To compare the population mean and the sample mean, if…
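A minimal 1-Sample T-test sketch with `scipy.stats.ttest_1samp`, assuming a hypothetical sample and an assumed population mean of 50:

```python
import numpy as np
from scipy import stats

# Hypothetical sample of 30 observations (values chosen only for illustration)
np.random.seed(0)
sample = np.random.normal(loc=52, scale=5, size=30)

# Null hypothesis: the sample comes from a population with mean 50
t_stat, p_value = stats.ttest_1samp(sample, popmean=50)
print(t_stat, p_value)  # reject the null hypothesis if p_value < 0.05
```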

According to the given definition:

Deep learning is a subset of machine learning in artificial intelligence (AI) that has networks capable of learning unsupervised from data that is unstructured or unlabeled. It is also known as deep neural learning or a deep neural network.

Deep learning can be broadly divided into three major categories.

- Artificial Neural Network, also known as ANN
- Convolutional Neural Network, known as CNN
- Recurrent Neural Network, known as RNN

Artificial Neural Network, widely known as ANN, is an information processing system that is inspired by the way biological nervous systems, such as the human brain, process information.

In simple terms…
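In simple terms, an ANN passes weighted inputs through layers of neurons and an activation function. A minimal forward-pass sketch in NumPy (the layer sizes, random weights, and input values are arbitrary, chosen only for illustration):

```python
import numpy as np

def sigmoid(x):
    # Squashes a neuron's weighted input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical network: 3 inputs -> 4 hidden neurons -> 1 output
rng = np.random.default_rng(42)
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

x = np.array([0.5, -1.0, 2.0])      # one example input
hidden = sigmoid(x @ W1 + b1)       # each hidden neuron sums weighted inputs
output = sigmoid(hidden @ W2 + b2)  # output layer does the same
print(output)
```

Training would then adjust `W1`, `W2`, `b1`, `b2` by backpropagation; this sketch only shows the information flow.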

In the previously discussed topics, the Bag of Words (BOW) & TF-IDF approaches, semantic information is not stored. BOW gives equal preference to each word in the corpus, whereas TF-IDF gives importance to uncommon words.

Semantic means that in a sentence the order & relation of words are important. For example, in a sentence like "He is going to College", the ordering between the words matters.

There is also a chance of overfitting with BOW & TF-IDF.

The solution to both of the above issues is **Word2Vec**.

**Introduction & Working of Word2Vec**

1. In Word2Vec, each word is basically…

As we know from my previous article on Bag of Words, we convert sentences into vectors of words through BOW, which encodes each word as either 0 (when the word is not in the sentence) or 1 (when the word is there).

**Bag of Words** just creates a set of vectors containing the count of **word** occurrences in the sentences/corpus, but it does not contain information on important **words**.

So we have another technique to capture word importance, called **TF-IDF**, which means Term Frequency and Inverse Document Frequency. It is a scoring measure widely used in information retrieval (IR) or summarization.…

When dealing with a corpus we come across multiple words, which we use in Natural Language Processing (NLP) applications to get meaningful insight. To do that we need to convert those words into something a model can understand. We are going to discuss here something known as **"Bag of Words"**.

In Simple terms

"Bag of Words", also called BOW, is a representation of words as vectors of the numbers 0 (zero) and 1 (one). These words can come from various sentences and various paragraphs.

**Understanding Bag of Words in details**

Two major things which we achieve from Bag of Words, or BOW, are:

- Frequency…

Natural language is language which is human readable, like text and messages. Processing these languages by machine for use in different applications is called Natural Language Processing, or NLP.

Some practical examples of NLP are sentiment analysis, analyzing restaurant reviews, and Google/Alexa voice search, which converts speech into text and then uses it for internal processing.

1. Details about NLP Application

2. Applications using Natural Language Processing (NLP)

3. Understand NLP using Python & NLTK library

4. Word Tokenizer and Sentence Tokenizer

5. Part-of-Speech Tagging

6. Stemming and Lemmatization

As we know, today almost everyone has a smartphone/laptop and easy…

Cosine similarity is used to determine the similarity between documents or vectors. Mathematically, it measures the cosine of the angle between two vectors projected in a multi-dimensional space. There are other similarity-measuring techniques, like Euclidean distance or Manhattan distance, but we will be focusing here on cosine similarity and cosine distance.

The relation between cosine similarity and cosine distance can be defined as below.

- Similarity decreases when the distance between two vectors increases.

- Similarity increases when the distance between two vectors decreases.
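The relation above can be sketched directly from the definition (the vectors here are arbitrary examples):

```python
import numpy as np

def cosine_similarity(a, b):
    # cos(angle) = (a . b) / (|a| * |b|)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])   # same direction as a, so similarity ≈ 1
c = np.array([-1.0, 0.0, 1.0])  # different direction, so lower similarity

print(cosine_similarity(a, b))      # ≈ 1.0, cosine distance ≈ 0
print(1 - cosine_similarity(a, c))  # cosine distance = 1 - similarity
```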

The ANOVA test (specifically one-way ANOVA here) in statistics is used to determine whether there are any statistically significant differences between the means of three or more independent (unrelated) groups, and to draw a conclusion based on that for any test.

An **ANOVA** test is a way to find out if survey or experiment results are significant.

Let's try to understand ANOVA by setting up our Null Hypothesis and Alternate Hypothesis.
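A minimal one-way ANOVA sketch with `scipy.stats.f_oneway` on three hypothetical independent groups (the scores are made up for illustration):

```python
from scipy import stats

# Hypothetical scores from three independent groups
group_a = [85, 86, 88, 75, 78, 94]
group_b = [91, 92, 93, 85, 87, 84]
group_c = [79, 78, 88, 94, 92, 85]

# Null hypothesis: all three group means are equal
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f_stat, p_value)  # reject the null hypothesis if p_value < 0.05
```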

There are various distribution types used in machine learning to describe the distribution of data in a population or a sample of a data set. In machine learning they are used to visualize the data distribution and to detect outliers in the data set.

- **Gaussian Distribution**
- **Z-Distribution**
- **T-Distribution**

Discovered by Carl Friedrich Gauss, the Gaussian Distribution, also known as the Normal Distribution, is a bell-shaped curve that shows the distribution of the data values of a population. It is used to check the deviation and skewness of the data.

**Gaussian distribution follows the empirical rule:**

1. This is a symmetrical curve where 50 percent of the data lies…
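The empirical rule (roughly 68/95/99.7 percent of values within 1, 2, and 3 standard deviations of the mean) can be checked on simulated data; a sketch with a synthetic standard-normal sample:

```python
import numpy as np

# Simulated Gaussian sample (synthetic data, for illustration only)
rng = np.random.default_rng(0)
data = rng.normal(loc=0, scale=1, size=100_000)

mu, sigma = data.mean(), data.std()

# Fraction of values within 1, 2, and 3 standard deviations of the mean
within_1sd = np.mean(np.abs(data - mu) <= 1 * sigma)
within_2sd = np.mean(np.abs(data - mu) <= 2 * sigma)
within_3sd = np.mean(np.abs(data - mu) <= 3 * sigma)

print(within_1sd, within_2sd, within_3sd)  # ≈ 0.68, 0.95, 0.997
```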