# Tree Harvests in RStudio using the mosaic package

This is an RStudio Markdown document to accompany Eric Marland's presentation on the Tree Harvest data at the MAA Prep Workshop July 2012.

``````require(mosaic)
trellis.par.set(theme = col.mosaic())
``````
``````trees = fetchData("TreeHarvest.csv")
``````
``````## Retrieving data from http://www.mosaic-web.org/go/datasets/TreeHarvest.csv
``````
``````names(trees)
``````
``````## [1] "time"    "rate"    "species"
``````

We can calculate some summary statistics by species.

``````mean(rate ~ species, data = trees)
``````
``````##  black walnut loblolly pine mountain pond
##         0.965         1.030         1.829
``````
``````favstats(rate ~ species, data = trees)
``````
``````##                min     Q1 median    Q3  max  mean     sd  n missing
## black walnut  0.40 0.7475  1.000 1.212 1.35 0.965 0.3071 16       0
## loblolly pine 0.10 0.2700  0.510 1.562 4.30 1.030 1.1351 16       0
## mountain pond 0.65 1.1750  1.875 2.475 2.97 1.829 0.7720 16       0
``````

Next we plot the relationship between rate and time, separately for each species.

``````plotPoints(rate ~ time | species, data = trees, layout = c(1, 3))
``````

Let's focus on the mountain ponderosa pine species.

``````plotPoints(rate ~ time, data = subset(trees, species == "mountain pond"))
``````

We can fit two models (one spline, one smoother) to these data, and compare the fit to the observed values. We also add the average value (in purple).

``````mp.spline = spliner(rate ~ time, data = subset(trees, species ==
"mountain pond"))
mp.smooth = smoother(rate ~ time, data = subset(trees, species ==
"mountain pond"))
plotFun(mp.spline(time) ~ time, time.lim = c(5, 155))
plotFun(mp.smooth(time) ~ time, add = TRUE, col = "red")
plotPoints(rate ~ time, add = TRUE, data = subset(trees, species ==
"mountain pond"))
# calculate average value
F <- antiD(mp.smooth(x) ~ x)
average <- makeFun(1/x * F(x) ~ x)
plotFun(average(x) ~ x, add = TRUE, xlim = c(0, 155), col = "purple")
``````