A House Is a Home

A lot of us are spending a lot more time at home than usual. The fact that I’m even considering replacing my habit of working on my phone from bed with formally sitting at a desk I might order is proof of that. If experimenting with new packages in R is not in the cards for you during this time, that’s totally understandable. But if it is, I have a tutorial for you, one inspired by our additional nesting hours.

Before learning R, the only significance of “gg” to me was Gossip Girl. That still remains largely the case. However, I’m making the rounds through the many and constantly growing “gg” packages, the newest being “ggformula.”

install.packages("ggformula")
library(ggformula)

When I ran these commands, I was greeted with a message telling me to check out a tutorial about “ggformula.” I had no idea what would happen, but I typed in the code it suggested:

learnr::run_tutorial("introduction", package = "ggformula") 

Running the code opened a browser window with detailed instructions on how to create plots using “ggformula.”
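If you want to see which tutorials a package ships with before launching one, “learnr” can list them. This is only a small sketch, and it assumes your installed version of “learnr” is recent enough to include available_tutorials(). The name gg_tutorials is just my own placeholder.

```r
# Assumption: learnr and ggformula are installed, and learnr is new enough
# to provide available_tutorials()

# List the tutorials bundled with ggformula
gg_tutorials <- learnr::available_tutorials(package = "ggformula")
print(gg_tutorials)

# Leave out the package argument to list tutorials from every installed package
print(learnr::available_tutorials())
```

Any name shown in that listing should be usable as the first argument to learnr::run_tutorial(), together with the matching package name.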

I had been looking for robust datasets involving housing features and I found the package “AmesHousing”, which contains a few datasets with a large number of variables. After installing and loading the package, I found the variables “Kitchen_Qual” (the quality rating of the houses’ kitchens) and “Sale_Price” most interesting. I wanted to create scatterplots that visualize the sale price of the houses by their lot areas for each level of kitchen quality:

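library(AmesHousing)
house <- make_ames()  # assumption: "house" is the processed Ames data from make_ames()

gf_point(Sale_Price / 100000 ~ Lot_Area, data = house) %>%
  gf_facet_grid(Kitchen_Qual ~ .) %>%
  gf_labs(title = "Iowa Housing Prices", y = "Sale Price (in $100,000)", x = "Lot Area")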
[Figure: IowaPrices.png, scatterplots of sale price against lot area, one panel per kitchen quality level]

It’s no surprise that the houses with higher-quality kitchens sell for higher prices. I did find it a bit surprising that the size of a house’s lot doesn’t seem to impact its price at all, though perhaps if I saw the houses in person, the differences in lot size wouldn’t be all that noticeable.
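To back up what the eye picks out of the plot, a couple of quick summaries can help. This is a rough sketch rather than a proper analysis, and it assumes the data comes from make_ames() in “AmesHousing”. The names price_by_kitchen and lot_price_cor are my own placeholders.

```r
library(AmesHousing)

# Assumption: the processed Ames data from make_ames() is the dataset being plotted
house <- make_ames()

# Median sale price (in dollars) for each kitchen quality level
price_by_kitchen <- aggregate(Sale_Price ~ Kitchen_Qual, data = house, FUN = median)
print(price_by_kitchen)

# Spearman rank correlation between lot area and sale price across all houses;
# a rank-based measure is less swayed by a handful of very large lots
lot_price_cor <- cor(house$Lot_Area, house$Sale_Price, method = "spearman")
print(lot_price_cor)
```

A correlation near zero would agree with the impression that lot size matters little, while a clearly positive value would suggest the shared axis scale in the plot is hiding a trend.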

If you’re curious to check out other variables within the “AmesHousing” package’s datasets, I highly recommend it, as there are 81 from which to choose. I’m curious to know if other packages offer built-in tutorials like the one I stumbled upon with “ggformula.” If you’ve come across this with other packages, please give me a holler!
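For browsing those variables, here is one quick way to get the lay of the land. It is a sketch that again assumes the processed data from make_ames(), and the name ames is just my placeholder.

```r
library(AmesHousing)

# Assumption: the processed data from make_ames() is the dataset of interest
ames <- make_ames()

# Number of houses (rows) and variables (columns)
print(dim(ames))

# Every variable name, to pick candidates for the next plot
print(names(ames))

# How many houses fall into each kitchen quality level
print(table(ames$Kitchen_Qual))
```

Any of the names printed here can be dropped into the gf_point() formula or the gf_facet_grid() formula in place of the ones I used.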
