

“Gapminder” Exploratory Data Analysis using R for Data Science

The main focus is to investigate the Gapminder dataset and interact with it. To illustrate the basic use of EDA with the dplyr and ggplot2 packages, I use the “gapminder” dataset. This data is a data.frame of life expectancy, population, and GDP per capita by country and year. I use the dplyr package to transform and manipulate the data, and the ggplot2 package to analyze it visually.

Load Packages

#install.packages("gapminder")
library(gapminder)
library(dplyr)
library(ggplot2)

The variables are explained as follows:

country: factor with 142 levels
continent: factor with 5 levels
year: ranges from 1952 to 2007 in increments of 5 years
lifeExp: life expectancy at birth, in years
pop: population
gdpPercap: GDP per capita

head(gapminder_unfiltered, 5) # Unfiltered data
tail(gapminder_unfiltered, 5)

Display names of variables:

names(gapminder_unfiltered)

Data Cleaning:

Check for missing values. As we can see, this data has no missing values.

str(gapminder_unfiltered)
sum...
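The missing-value check mentioned above could look roughly like the sketch below. It is not necessarily the exact code used in this post: `count_missing` is a hypothetical helper name, and the `toy` data frame holds made-up values with one NA injected on purpose, so that the check has something to report. Only base R is used.

```r
# Hypothetical helper: count missing values in each column of a data frame.
count_missing <- function(df) {
  # is.na() on a data frame returns a logical matrix; colSums() counts TRUEs per column
  colSums(is.na(df))
}

# Toy data with gapminder-style column names.
# The values are invented for illustration; one NA is injected in lifeExp.
toy <- data.frame(
  country   = c("A", "B", "C"),
  year      = c(1952, 1957, 1962),
  lifeExp   = c(40.1, NA, 55.3),
  gdpPercap = c(800, 950, 1200)
)

missing_per_column <- count_missing(toy)  # named vector, one count per column
total_missing <- sum(missing_per_column)  # total number of missing cells

print(missing_per_column)
print(total_missing)
```

With the gapminder package loaded, the same helper could be applied as `count_missing(gapminder_unfiltered)`. A total of zero would agree with the statement that the data has no missing values.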