Scatter plot

Introduction

Let’s learn how to draw a scatter plot in R using ggplot2.

Importing libraries

Let’s load the ggplot2 library, which provides the ggplot() plotting functions.

library(ggplot2)

Reading data set

Let’s load the data set named “diamonds”, which ships with ggplot2, into a data frame named “DataFrame”.

DataFrame <- diamonds

Looking at the data

Let’s check the structure of the data set with str().

str(DataFrame)
## Classes 'tbl_df', 'tbl' and 'data.frame':    53940 obs. of  10 variables:
##  $ carat  : num  0.23 0.21 0.23 0.29 0.31 0.24 0.24 0.26 0.22 0.23 ...
##  $ cut    : Ord.factor w/ 5 levels "Fair"<"Good"<..: 5 4 2 4 2 3 3 3 1 3 ...
##  $ color  : Ord.factor w/ 7 levels "D"<"E"<"F"<"G"<..: 2 2 2 6 7 7 6 5 2 5 ...
##  $ clarity: Ord.factor w/ 8 levels "I1"<"SI2"<"SI1"<..: 2 3 5 4 2 6 7 3 4 5 ...
##  $ depth  : num  61.5 59.8 56.9 62.4 63.3 62.8 62.3 61.9 65.1 59.4 ...
##  $ table  : num  55 61 65 58 58 57 57 55 61 61 ...
##  $ price  : int  326 326 327 334 335 336 336 337 337 338 ...
##  $ x      : num  3.95 3.89 4.05 4.2 4.34 3.94 3.95 4.07 3.87 4 ...
##  $ y      : num  3.98 3.84 4.07 4.23 4.35 3.96 3.98 4.11 3.78 4.05 ...
##  $ z      : num  2.43 2.31 2.31 2.63 2.75 2.48 2.47 2.53 2.49 2.39 ...
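
A scatter plot needs two continuous variables, so it can help to list the numeric columns reported by str(). The snippet below is a minimal sketch of one way to do that; the names is_num and numeric_cols are introduced here for illustration only.

```r
library(ggplot2)

DataFrame <- diamonds

# Flag each column as numeric (TRUE) or not (FALSE).
is_num <- sapply(DataFrame, is.numeric)

# Keep only the names of the numeric columns; the ordered factors
# (cut, color, clarity) are dropped.
numeric_cols <- names(DataFrame)[is_num]
print(numeric_cols)
```

Any two of the listed columns can be used as the x and y variables of a scatter plot.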

Scatter plot using ggplot 

Let’s choose the variables for the x and y axes. Here we use x (the length of
the diamond in mm) and carat. Both are continuous variables, so let’s
visualize them in a scatter plot.

ggplot(data = DataFrame) +
  geom_point(aes(x = x, y = carat),
             color = "orange",
             size = 2,
             alpha = 0.3) +
  scale_x_continuous(name = "x") +
  scale_y_continuous(name = "carat") +
  ggtitle(label = "Relation between x and carat") +
  theme_bw()
[Figure: scatter plot of carat against x, titled “Relation between x and carat”]

Meaning of the functions and arguments used above:

The code above is a template that you can reuse for any scatter plot.
The functions and parameters it uses are explained below.

Functions used are:

  • ggplot() is the basic function used in every visualization. Its data
    argument takes the data frame to plot.
  • geom_point() draws the scatter plot. It takes an aes() mapping and
    other arguments that control how the points look.
  • scale_x_continuous() and scale_y_continuous() customize the x and y
    axes; here they set the axis names.
  • theme_bw() applies a black-and-white theme (white background with grey
    grid lines).
  • ggtitle() sets the main title.
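
To illustrate how the same set of functions can be reused, here is a sketch that applies it to a different pair of continuous variables from the diamonds data. The choice of carat and price, and the object name p, are assumptions made for this example.

```r
library(ggplot2)

DataFrame <- diamonds

# Same template, different variables: carat on the x axis, price on the y axis.
p <- ggplot(data = DataFrame) +
  geom_point(aes(x = carat, y = price),
             color = "orange",
             size = 2,
             alpha = 0.3) +
  scale_x_continuous(name = "carat") +
  scale_y_continuous(name = "price") +
  ggtitle(label = "Relation between carat and price") +
  theme_bw()

# Printing the stored plot object draws it.
print(p)
```

Only the aes() mapping, the axis names and the title change; the rest of the code stays the same.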


Parameters used are:

  • x = name of the variable on the x axis
  • y = name of the variable on the y axis
  • alpha = transparency of the points, from 0 (fully transparent) to 1
    (fully opaque). Very useful when many points are plotted over one another
  • color = color of the points
  • size = size of the points
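
To see what alpha does in practice, the sketch below draws the same scatter plot twice, once with fully opaque points and once with alpha = 0.3. The helper scatter_with_alpha() is hypothetical and written only for this comparison; it is not part of ggplot2.

```r
library(ggplot2)

DataFrame <- diamonds

# Hypothetical helper: the x versus carat scatter plot with a chosen alpha.
scatter_with_alpha <- function(alpha_value) {
  ggplot(data = DataFrame) +
    geom_point(aes(x = x, y = carat),
               color = "orange",
               size = 2,
               alpha = alpha_value) +
    ggtitle(label = paste("alpha =", alpha_value)) +
    theme_bw()
}

p_opaque <- scatter_with_alpha(1)    # overlapping points hide each other
p_faint  <- scatter_with_alpha(0.3)  # dense regions appear darker

print(p_opaque)
print(p_faint)
```

With more than fifty thousand points, the lower alpha value should make it easier to see where the points are concentrated.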