
Importing libraries

Let’s load the ggplot2 package, which provides the ggplot() function and also ships the diamonds data set used below

``````library(ggplot2)
``````

Let’s copy the built-in “diamonds” data set into a data frame named DataFrame

``````DataFrame<-diamonds
``````

Looking at data

Let’s check the structure of the data set with str()

``````str(DataFrame)
``````
``````## Classes 'tbl_df', 'tbl' and 'data.frame':    53940 obs. of  10 variables:
##  $ carat  : num  0.23 0.21 0.23 0.29 0.31 0.24 0.24 0.26 0.22 0.23 ...
##  $ cut    : Ord.factor w/ 5 levels "Fair"<"Good"<..: 5 4 2 4 2 3 3 3 1 3 ...
##  $ color  : Ord.factor w/ 7 levels "D"<"E"<"F"<"G"<..: 2 2 2 6 7 7 6 5 2 5 ...
##  $ clarity: Ord.factor w/ 8 levels "I1"<"SI2"<"SI1"<..: 2 3 5 4 2 6 7 3 4 5 ...
##  $ depth  : num  61.5 59.8 56.9 62.4 63.3 62.8 62.3 61.9 65.1 59.4 ...
##  $ table  : num  55 61 65 58 58 57 57 55 61 61 ...
##  $ price  : int  326 326 327 334 335 336 336 337 337 338 ...
##  $ x      : num  3.95 3.89 4.05 4.2 4.34 3.94 3.95 4.07 3.87 4 ...
##  $ y      : num  3.98 3.84 4.07 4.23 4.35 3.96 3.98 4.11 3.78 4.05 ...
##  $ z      : num  2.43 2.31 2.31 2.63 2.75 2.48 2.47 2.53 2.49 2.39 ...
``````
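
Since str() reports cut as an ordered factor and price as an integer column, it can help to look at both a little more closely before plotting. The following is a minimal sketch using base R functions on the same DataFrame object; the names cut_levels and cut_counts are only illustrative.

```r
library(ggplot2)          # provides the diamonds data set
DataFrame <- diamonds

# Levels of the ordered factor, from lowest to highest
cut_levels <- levels(DataFrame$cut)
print(cut_levels)

# Number of diamonds in each cut category
cut_counts <- table(DataFrame$cut)
print(cut_counts)

# Summary statistics of the continuous price variable
print(summary(DataFrame$price))
```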

Boxplot using ggplot

Let’s plot the cut and price variables from the data set.

• cut is a categorical variable (an ordered factor)
• price is a continuous variable
• A boxplot lets us compare the distribution of price across the cut categories
``````ggplot(data = DataFrame) +
  geom_boxplot(aes(x = cut, y = price),
               color = "orange",
               fill = "blue",
               alpha = 0.5) +
  scale_x_discrete() +
  scale_y_continuous(name = "Price range in each cut category") +
  theme_bw()
``````

Figure: Boxplot of price for each cut category
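
To read the boxplot numerically, the quartiles of price within each cut category can be computed directly. This is a small sketch rather than part of the original walkthrough; price_quartiles is an illustrative name, and the 25%, 50% and 75% rows should correspond to the lower hinge, median line and upper hinge of each box.

```r
library(ggplot2)
DataFrame <- diamonds

# Split price by cut, then compute the quartiles within each group.
# The result is a matrix with one column per cut category.
price_quartiles <- sapply(split(DataFrame$price, DataFrame$cut),
                          quantile, probs = c(0.25, 0.5, 0.75))
print(price_quartiles)
```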

Meaning of arguments in ggplot function:

The code above can serve as a template for any boxplot visualization.
The functions and parameters it uses are explained below.

Functions used:

• ggplot() is the base function that starts every ggplot2 visualization
• Its data argument takes the name of the data frame to plot
• geom_boxplot() adds the boxplot layer. It takes the aes() mapping and other arguments that control appearance
• scale_x_discrete() customizes the x axis (categorical or discrete variable)
• scale_y_continuous() customizes the y axis (continuous variable); here its name argument sets the axis title
• theme_bw() applies a black-and-white theme with a white plot background
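
Because each of these functions adds one layer or setting, the plot can also be built up step by step and stored in a variable. The sketch below assumes the same DataFrame; the object name p and the axis titles are illustrative choices, not part of the original example.

```r
library(ggplot2)
DataFrame <- diamonds

# Start with the base plot object; no layer has been added yet
p <- ggplot(data = DataFrame)

# Add the boxplot layer, then the axis scales and the theme
p <- p +
  geom_boxplot(aes(x = cut, y = price)) +
  scale_x_discrete(name = "Cut") +        # illustrative axis title
  scale_y_continuous(name = "Price") +    # illustrative axis title
  theme_bw()

print(p)  # printing the object renders the plot
```

Storing the plot in a variable makes it easy to try a different theme or scale without retyping the whole call.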

Parameters used:

• fill sets the colour used to fill the rectangular boxes of the boxplot
• color sets the colour of the box edges, the median line and the outliers
• alpha sets the transparency, from 0 (fully transparent) to 1 (fully opaque). It is very useful when you want to plot one layer over another
• x is the name of the x variable, which is categorical
• y is the name of the y variable, which is continuous
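
One point worth illustrating is where these parameters are placed. In the example above, fill, color and alpha sit outside aes(), so they apply the same fixed value to every box. Putting fill inside aes() instead maps it to a variable. The following sketch contrasts the two; p_fixed and p_mapped are illustrative names.

```r
library(ggplot2)
DataFrame <- diamonds

# Fixed appearance: fill, color and alpha are set outside aes(),
# so every box is drawn the same way
p_fixed <- ggplot(data = DataFrame) +
  geom_boxplot(aes(x = cut, y = price),
               fill = "blue", color = "orange", alpha = 0.5)

# Mapped appearance: fill is inside aes(), so each cut category
# gets its own fill colour and a legend is added
p_mapped <- ggplot(data = DataFrame) +
  geom_boxplot(aes(x = cut, y = price, fill = cut), alpha = 0.5)

print(p_fixed)
print(p_mapped)
```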