This code calculates the correlation between the number of bales and total weight, and between the number of bales and cost per pound. First, the is.na() function is used to identify the locations of the missing values (NAs), and the rows without missing values are assigned to a new data frame. This process is repeated for each of the columns (number of bales, total weight, and cost per pound), gradually reducing the data frame to one that contains numeric values in every position. Lastly, the number of rows is returned so that the reduction in rows can be compared as the NA values are removed.
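The column-by-column filtering described above might look like the following minimal sketch. The data frame `bales` and its column names (`num_bales`, `total_weight`, `cost_per_pound`) are hypothetical stand-ins with made-up values, not the original data set.

```r
# Hypothetical sample data; names and values are assumptions for illustration.
bales <- data.frame(
  num_bales      = c(10, NA, 25, 40, 15, 30),
  total_weight   = c(5000, 6200, NA, 20100, 7400, 15200),
  cost_per_pound = c(0.62, 0.58, 0.60, NA, 0.65, 0.55)
)

n_start <- nrow(bales)

# Keep only the rows where the given column is not NA, one column at a time.
step1 <- bales[!is.na(bales$num_bales), ]
step2 <- step1[!is.na(step1$total_weight), ]
clean <- step2[!is.na(step2$cost_per_pound), ]

# Row counts after each step show how many rows each filter removed.
row_counts <- c(start = n_start,
                after_bales = nrow(step1),
                after_weight = nrow(step2),
                after_cost = nrow(clean))
print(row_counts)
```

If the intermediate row counts are not needed, `bales[complete.cases(bales), ]` would give the same final data frame in one step.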
With all missing values removed, the new data frame can be used to assess the relationship between the columns. This is done with the plot() and cor() functions. A strong relationship is observed between the number of bales and total weight: the scatter plot created with plot() shows a clear positive slope, and the correlation coefficient returned by cor() is approximately 0.81. This value measures the strength and direction of the linear relationship, not the probability that the relationship occurred by chance. It means that an increase in the number of bales tends to be accompanied by an increase in the total weight.
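As a rough illustration of this step, the sketch below plots one column against the other and computes the Pearson correlation coefficient. The data frame `clean` and its values are hypothetical, so the coefficient it produces will not match the figure reported for the real data.

```r
# Hypothetical cleaned data (no NA values); not the original data set.
clean <- data.frame(
  num_bales    = c(10, 15, 20, 25, 30, 40),
  total_weight = c(5100, 7400, 9800, 12900, 15200, 20100)
)

# Scatter plot of total weight against number of bales.
plot(clean$num_bales, clean$total_weight,
     xlab = "Number of bales", ylab = "Total weight")

# Pearson correlation coefficient: a value between -1 and 1,
# where values near 1 indicate a strong positive linear relationship.
r_weight <- cor(clean$num_bales, clean$total_weight)
print(r_weight)
```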
A correlation analysis was also conducted on the relationship between cost per pound and the number of bales, using the same functions as above. The relationship in this case is weak and negative, with a correlation coefficient of approximately -0.26. As before, this figure describes the strength of the linear relationship rather than the likelihood that it occurred by chance. It indicates only a slight tendency for cost per pound to fall as the number of bales rises.
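Whether a correlation could plausibly have arisen by chance is a separate question from its strength, and cor() alone does not answer it. One way to check, not part of the analysis described here, is base R's cor.test(), which reports a p-value alongside the coefficient. The sketch below uses hypothetical data and column names.

```r
# Hypothetical cleaned data; names and values are assumptions for illustration.
clean <- data.frame(
  num_bales      = c(10, 15, 20, 25, 30, 40, 12, 35),
  cost_per_pound = c(0.62, 0.65, 0.58, 0.61, 0.60, 0.59, 0.57, 0.63)
)

# cor.test() runs a significance test on the Pearson correlation.
result <- cor.test(clean$num_bales, clean$cost_per_pound)

r_cost  <- unname(result$estimate)  # correlation coefficient (same as cor())
p_value <- result$p.value           # probability of a correlation at least this
                                    # strong if the true correlation were zero
print(c(r = r_cost, p = p_value))
```

A small p-value (conventionally below 0.05) would suggest the correlation is unlikely to be due to chance alone; a weak coefficient from a small sample will often fail that threshold.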