Finding patterns in Predictor Variables

Finding correlated variables is an important step in reducing multicollinearity in multiple regression models. Once you know which variables are correlated, you can keep a few of them, or maybe just one. The choice depends on the correlation value and other factors. Let's visualize the correlations between variables in a nice plot. Open RStudio and start typing!

Let's load the corrplot library for correlation matrix visualization

library(corrplot)

Let's store the mtcars dataset in a data frame named DataFrame

DataFrame <- mtcars

Let's find the correlation matrix of this data set

corrMatrix <- cor(DataFrame)
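
Before plotting, it can help to list the strongly correlated pairs directly from the matrix. The following is a minimal sketch in base R; the cutoff of 0.8 and the names cutoff, idx and strongPairs are assumptions made for illustration, not part of the original walkthrough.

```r
# Correlation matrix of the built-in mtcars data, as in the text
corrMatrix <- cor(mtcars)

# Hypothetical cutoff for "strongly correlated"; adjust as needed
cutoff <- 0.8

# Use only the upper triangle so each pair is listed once
# and the diagonal (always 1) is skipped
idx <- which(abs(corrMatrix) > cutoff & upper.tri(corrMatrix), arr.ind = TRUE)

strongPairs <- data.frame(
  var1 = rownames(corrMatrix)[idx[, "row"]],
  var2 = colnames(corrMatrix)[idx[, "col"]],
  correlation = corrMatrix[idx]
)

# Sort by absolute correlation, strongest first
strongPairs <- strongPairs[order(-abs(strongPairs$correlation)), ]
print(strongPairs)
```

Each row is a candidate pair from which you might keep only one variable in a regression model, subject to the other factors mentioned earlier.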

Let's visualize the correlation matrix. This plot shows the correlation between each pair of feature variables in the data set.
Dark red means correlation = -1
Dark blue means correlation = 1
White means correlation = 0

corrplot(corrMatrix, method="circle")

[Figure: correlation matrix plot, method="circle"]

Let's use hierarchical clustering to find patterns in the feature variables. This reorders the variables so that similar variables sit next to each other, and addrect = 5 draws rectangles around 5 clusters.

corrplot(corrMatrix, order = "hclust", addrect = 5)
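
To see which variables end up in which group, the clustering can be reproduced in base R. This is a sketch under assumptions: it uses 1 - correlation as the distance and complete linkage, which may not match corrplot's internals exactly, and the names distMatrix, tree and groups are illustrative.

```r
# Correlation matrix of the built-in mtcars data, as in the text
corrMatrix <- cor(mtcars)

# Assumption: convert correlation to a distance so that highly
# correlated variables are "close" (1 - r is one common choice)
distMatrix <- as.dist(1 - corrMatrix)

# Agglomerative clustering with hclust's default "complete" linkage
tree <- hclust(distMatrix)

# Cut the tree into 5 groups, matching addrect = 5 in the call above
groups <- cutree(tree, k = 5)

# List the variables that fall into each group
print(split(names(groups), groups))

# Variable order along the dendrogram
print(tree$labels[tree$order])
```

Variables that share a group are candidates for being represented by a single predictor.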

[Figure: correlation matrix plot reordered by hierarchical clustering, with 5 cluster rectangles]