Plotting in RStudio
A commonly used plotting function in RStudio is ggplot(), which is part of the ggplot2 package. To get access to it, first load the package with library(ggplot2). Each ggplot() call names the dataset and, inside aes(), the variables mapped to the x-axis and y-axis. In addition, the call must contain a geom layer, added with +, that sets the type of plot. There are various types of plots, such as scatterplots, boxplots, and heat maps.
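As a minimal sketch of that structure, the call below combines a dataset, an axis mapping, and one geom layer. It assumes the built-in mtcars dataset as a stand-in for your own data; the axis labels are illustrative.

```r
# Load ggplot2 (install once with install.packages("ggplot2"))
library(ggplot2)

# mtcars ships with R; it stands in for your own dataset here
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +   # dataset and axis mapping
  geom_point() +                               # geom layer: type of plot
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

print(p)  # in RStudio the plot appears in the Plots pane
```

Swapping geom_point() for another geom changes the type of plot while the rest of the call stays the same.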
Scatterplot
A scatterplot provides an overview of how data is distributed by displaying the relationship between two continuous variables. In ggplot2, the code for a scatterplot includes the geom_point() layer. The figure below is an example of a scatterplot made with RStudio. It uses the 'Mall Customers' dataset, which is available from the Kaggle Data Science Community. In the example, the scatterplot shows the relationship between income and spending score. The plot indicates that a higher income does not necessarily imply a higher spending score. Additionally, the color of each dot denotes the age of the customer.
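A scatterplot of this kind could be sketched as follows. The data frame is made up so that the example runs on its own; the column names income, spending_score, and age are hypothetical and may differ from those in the Kaggle file.

```r
library(ggplot2)

# Made-up data; column names are hypothetical, not taken from the Kaggle file
set.seed(1)
customers <- data.frame(
  income = runif(100, 15, 140),
  spending_score = runif(100, 1, 100),
  age = sample(18:70, 100, replace = TRUE)
)

# geom_point() draws one dot per customer; the colour aesthetic encodes age
p <- ggplot(customers, aes(x = income, y = spending_score, colour = age)) +
  geom_point() +
  labs(x = "Income", y = "Spending score", colour = "Age")
print(p)

# Correlation coefficient as a numeric check of the visual impression
cor(customers$income, customers$spending_score)
```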

Boxplot
Another plot that gives an overview of how data is distributed is a boxplot. A boxplot summarizes the data with five values: minimum, first quartile, median, third quartile, and maximum. When outliers are drawn separately, as in the figure below where they appear as individual points, the whiskers end at the most extreme values that are not outliers. The plot shows how tightly the data is grouped and whether it is skewed. The median is the middle value of the dataset and does not necessarily equal the mean. From the plot, you can also read the median amount of purchases for each gender. Generally, a boxplot is convenient for comparing summary statistics across groups.
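A boxplot per group could be sketched like this. The purchases data frame and its columns gender and amount are made up for illustration and are not the dataset behind the figure.

```r
library(ggplot2)

# Made-up purchase amounts for two groups (hypothetical column names)
set.seed(2)
purchases <- data.frame(
  gender = rep(c("Female", "Male"), each = 50),
  amount = c(rnorm(50, 60, 15), rnorm(50, 55, 20))
)

# geom_boxplot() draws points further than 1.5 * IQR from the box individually
p <- ggplot(purchases, aes(x = gender, y = amount)) +
  geom_boxplot()
print(p)

# Median and mean per gender; the two generally differ
medians <- tapply(purchases$amount, purchases$gender, median)
means <- tapply(purchases$amount, purchases$gender, mean)
print(rbind(median = medians, mean = means))
```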

Heat Map
Data plots can also look like an image. A heat map represents data values as colors. The data behind the graph below comes from the Kaggle Data Science Community and covers Australian weather from November 2007 to June 2017. The heat map below only contains data for 2017 and only for well-known places in Australia; observations from other locations were removed from the dataset. The heat map displays the temperature at 3 p.m. for each location in each time period.

#Load dplyr for %>% and filter()
>library(dplyr)
#Filter the dataset by year
>Australian_Weather_2017 <- Australian_Weather %>% filter(Date >= "2017-01-01")
#Filter the dataset by popular cities
>Australian_Weather_Main_Cities <- Australian_Weather_2017 %>% filter(city == "Adelaide" | city == "AliceSprings" | city == ……… | etc.)
#Run the code for the heat map
>ggplot(Australian_Weather_Main_Cities, aes(x = time, y = city, fill = Temperature)) + geom_tile()
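The commands above depend on the downloaded weather file. As a self-contained sketch of the same idea, the example below builds a small made-up table and draws it with geom_tile(). The column names month and temp_3pm and all temperature values are assumptions for illustration only.

```r
library(ggplot2)

# Made-up temperatures: one value per city and month (hypothetical columns)
set.seed(3)
weather <- expand.grid(
  month = factor(month.abb[1:6], levels = month.abb[1:6]),
  city = c("Adelaide", "AliceSprings", "Darwin"),
  stringsAsFactors = FALSE
)
weather$temp_3pm <- round(runif(nrow(weather), 15, 35), 1)

# geom_tile() draws one tile per city and month; fill maps the value to a colour
p <- ggplot(weather, aes(x = month, y = city, fill = temp_3pm)) +
  geom_tile() +
  labs(x = "Month", y = "City", fill = "Temp. at 3 p.m.")
print(p)
```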
Violin Plot
A violin plot is similar to a boxplot. As illustrated in the figure below on Australian weather data, a violin plot has a layout similar to that of a boxplot. In contrast to a boxplot, a violin plot also shows the estimated probability density: the curved outline traces the shape of the distribution. A wider section, such as the one for the main city 'Darwin' in the figure below, indicates a high probability that the temperature is approximately 31.5 degrees. Like a boxplot, a violin plot can mark a lower adjacent value, a median, and an upper adjacent value.
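A violin plot could be sketched as follows, with a narrow boxplot drawn inside each violin to mark the median and quartiles. The temperatures are simulated, and the column names city and temp_3pm are hypothetical.

```r
library(ggplot2)

# Simulated 3 p.m. temperatures for two cities (made-up values)
set.seed(4)
temps <- data.frame(
  city = rep(c("Adelaide", "Darwin"), each = 200),
  temp_3pm = c(rnorm(200, 22, 6), rnorm(200, 31.5, 2))
)

p <- ggplot(temps, aes(x = city, y = temp_3pm)) +
  geom_violin() +             # outline follows a kernel density estimate
  geom_boxplot(width = 0.1)   # narrow boxplot adds median and quartiles
print(p)
```

The violin is widest where the simulated values are most concentrated, which for the Darwin group is near the chosen mean of 31.5.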

Histogram
Histograms show the distribution of numerical data by grouping the values into intervals (bins) and counting the observations in each. A histogram differs from a bar graph, which displays categorical data. The plot below shows the frequency of purchase amounts by gender. The amounts are probably expressed in cents rather than dollars.
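A histogram split by gender could be sketched like this. The data is made up, and the bin width of 500 is an arbitrary choice for the simulated amounts, not a value taken from the original plot.

```r
library(ggplot2)

# Made-up purchase amounts (hypothetical column names and units)
set.seed(5)
purchases <- data.frame(
  gender = rep(c("Female", "Male"), each = 100),
  amount = c(rnorm(100, 6000, 1500), rnorm(100, 5500, 2000))
)

# binwidth sets the size of each interval; alpha makes overlapping bars visible
p <- ggplot(purchases, aes(x = amount, fill = gender)) +
  geom_histogram(binwidth = 500, position = "identity", alpha = 0.5) +
  labs(x = "Amount of purchases", y = "Frequency", fill = "Gender")
print(p)
```

A smaller bin width shows more detail but also more noise, so it is worth trying a few values.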
