Finding correlated variables is an important step in reducing multicollinearity in multiple regression models. Once you know which variables are correlated, you can keep a few of them, or perhaps just one. The choice depends on the correlation values and other factors. Let’s visualize the correlations between variables in a plot. Open RStudio and start typing!
Let’s load the corrplot library for correlation matrix visualization.
library(corrplot)
## Warning: package 'corrplot' was built under R version 3.2.5
Let’s store the mtcars dataset in a data frame named DataFrame.
DataFrame <- mtcars
Let’s compute the correlation matrix of this dataset.
corrMatrix <- cor(DataFrame)
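Before plotting, it can help to list the strongly correlated pairs directly, since those are the candidates for removal when dealing with multicollinearity. The following is a minimal sketch in base R; the 0.8 cutoff and the names threshold and highPairs are illustrative assumptions, not part of the original walkthrough.

```r
# Correlation matrix of the built-in mtcars dataset, as in the text
corrMatrix <- cor(mtcars)

# Hypothetical cutoff for calling a pair "highly correlated"
threshold <- 0.8

# Use only the upper triangle so each pair appears once
# and the diagonal (always 1) is skipped
idx <- which(upper.tri(corrMatrix) & abs(corrMatrix) > threshold,
             arr.ind = TRUE)

highPairs <- data.frame(
  var1 = rownames(corrMatrix)[idx[, "row"]],
  var2 = colnames(corrMatrix)[idx[, "col"]],
  correlation = corrMatrix[idx]
)

# Strongest relationships first
highPairs <- highPairs[order(-abs(highPairs$correlation)), ]
print(highPairs)
```

Which variable to drop from each pair is a judgment call; the cutoff only narrows down where to look.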
Let’s visualize the correlation matrix. The plot shows the correlation between each pair of feature variables in the dataset.
Dark red means correlation=-1
Dark blue means correlation=1
White means correlation=0
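The call that produces this first plot is not shown in the text. A minimal sketch, assuming the plot was drawn with corrplot's default settings (an assumption, since the original call is missing), could look like this:

```r
library(corrplot)

# Same correlation matrix as in the text
corrMatrix <- cor(mtcars)

# Assumption: default arguments, which draw circles colored
# from dark red (negative) through white to dark blue (positive)
corrplot(corrMatrix)
```

With the defaults, the variables appear in their original column order, so related variables are not necessarily next to each other.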
Let’s use hierarchical clustering to find patterns among the feature variables. This reorders the variables so that similar ones sit next to each other. The addrect = 5 argument draws rectangles around five clusters.
corrplot(corrMatrix, order = "hclust", addrect = 5)
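To see which variables fall into each group as text rather than reading them off the plot, the clustering can be approximated in base R. This is a sketch under stated assumptions: it uses 1 - correlation as the distance and hclust's default complete linkage, which is meant to mirror the plot rather than reproduce corrplot's internals exactly. The names distMatrix, tree and groups are illustrative.

```r
# Same correlation matrix as in the text
corrMatrix <- cor(mtcars)

# Assumption: treat 1 - correlation as a distance, so strongly
# positively correlated variables end up close together
distMatrix <- as.dist(1 - corrMatrix)

# Hierarchical clustering with hclust's default (complete) linkage
tree <- hclust(distMatrix)

# Cut the tree into 5 groups, matching addrect = 5 in the plot
groups <- cutree(tree, k = 5)

# List the variable names in each group
print(split(names(groups), groups))
```

Variables that land in the same group are the ones to compare when deciding which to keep in a regression model.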