Getting Started

Computing the Cubage

The cubage function calculates the volume of a tree (cubage) using different cubing methods. It can handle both single trees and multiple trees in a dataset. The available cubing methods are Smalian, Newton and Huber.

Cubing a Simple Tree

When cubing a single tree, you need to provide vectors of diameters (d) and heights (h) measured at different points along the tree stem. Diameters should be in centimeters, and heights should be in meters. The diameter at breast height (DBH) and total tree height (Ht) are essential inputs.

using ForestMensuration

# Diameters at different heights (cm)
d = [9.0, 7.0, 5.8, 5.1, 3.8, 1.9, 0.0]

# Corresponding heights (m)
h =  [0.3, 1.3, 3.3, 5.3, 7.3, 9.3, 10.8]

# Calculate cubage using the Smalian method
cubage(Smalian, h, d)
1×11 DataFrame
 Row │ vt         v0          vc         vr       vn           d        h        hc       aff       nff       qf
     │ Float64    Float64     Float64    Float64  Float64      Float64  Float64  Float64  Float64   Float64   Float64
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 0.0229254  0.00190852  0.0208751      0.0  0.000141764      7.0     10.8      9.3  0.551578  0.488267  0.719286


  • vt: Total volume
  • v0: Volume of the stump
  • vc: Commercial bole volume
  • vr: Residual volume above the commercial limit
  • vn: Volume of the top (cone)
  • d: Diameter at breast height, DBH (cm)
  • h: Total tree height (m)
  • hc: Commercial height (m)
  • aff: Artificial form factor
  • nff: Natural form factor
  • qf: Form quotient

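As a sanity check, the Smalian totals above can be reproduced by hand. This is a sketch based on the standard formulas (stump treated as a cylinder, Smalian sections for the bole, a cone for the tip); the package's internals may differ in details:

```julia
# Cross-sectional area in m² from a diameter in cm: g = π·d²/40000
d = [9.0, 7.0, 5.8, 5.1, 3.8, 1.9, 0.0]
h = [0.3, 1.3, 3.3, 5.3, 7.3, 9.3, 10.8]
g = π .* d .^ 2 ./ 40000

v0 = g[1] * h[1]                                  # stump volume (cylinder)
vc = sum((g[i] + g[i+1]) / 2 * (h[i+1] - h[i])    # Smalian: mean area × section length
         for i in 1:length(d)-2)
vn = g[end-1] * (h[end] - h[end-1]) / 3           # tip volume (cone)
vt = v0 + vc + vn                                 # total volume ≈ 0.0229254 m³

aff = vt / (g[2] * h[end])                        # form factor vs. cylinder at DBH ≈ 0.551578
```

The values match the vt, v0, vc, vn, and aff columns of the table within rounding.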

Including Bark Thickness

When bark thickness values are provided, the function also calculates the bark factor and the total and commercial volumes without bark. Note: the thickness supplied should be the 'single thickness', in centimeters; the function converts it to 'double thickness' internally.

# Bark thickness at corresponding heights (cm)
bark = [0.9, 0.5, 0.3, 0.2, 0.2, 0.1, 0.0]

# Define a commercial diameter limit
diameter_limit = 4.0

# Calculate cubage using the Newton method, including bark thickness and diameter limit
cubage(Newton, h, d, bark, diameter_limit)
1×17 DataFrame
 Row │ vt         v0          vc         vr       vn           d        h        hc       aff       nff       qf        k         vtwb       v0wb        vcwb        vrwb     vnwb
     │ Float64    Float64     Float64    Float64  Float64      Float64  Float64  Float64  Float64   Float64   Float64   Float64   Float64    Float64     Float64     Float64  Float64
─────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 0.0130233  0.00190852  0.0109731      0.0  0.000141764      7.0     10.8  6.99231  0.313338  0.277372  0.719286  0.932515  0.0113249  0.00165962  0.00954201      0.0  0.000123276


Additional columns include:

  • k: Bark factor
  • vtwb, v0wb, vcwb, vrwb, vnwb: Corresponding volumes without bark
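
Since volume scales with diameter squared, the without-bark volumes relate to the with-bark ones through the bark factor as volume × k². A quick check using the values from the Newton table above (the exact definition of k inside cubage is not shown here and is an assumption):

```julia
# without-bark volume = k² × with-bark volume (volume ∝ diameter²)
k    = 0.932515            # bark factor from the table above
vt   = 0.0130233           # total volume with bark
vtwb = k^2 * vt            # ≈ 0.0113249, matching the vtwb column
```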

Cubing Multiple Trees

To calculate cubage for multiple trees, organize your data in a DataFrame with columns for tree identifiers, heights, diameters, and optionally bark thickness.

using ForestMensuration
using DataFrames

# Sample data for multiple trees
data = DataFrame(
    tree = [148, 148, 148, 148, 148, 148, 148, 222, 222, 222, 222, 222, 222, 222, 222, 222, 222, 222],
    h = [0.3, 1.3, 3.3, 5.3, 7.3, 9.3, 10.8, 0.3, 1.3, 3.3, 5.3, 7.3, 9.3, 11.3, 13.3, 15.3, 17.3, 19.5],
    d = [9.0, 7.0, 5.8, 5.1, 3.8, 1.9, 0.0, 16.0, 12.0, 11.6, 10.1, 9.4, 8.2, 7.0, 6.0, 4.0, 2.0, 0.0],
    bark = [0.9, 0.5, 0.3, 0.2, 0.2, 0.1, 0.0, 1.2, 0.5, 0.3, 0.3, 0.2, 0.2, 0.3, 0.0, 0.0, 0.0, 0.0]
)

# Define a commercial diameter limit
diameter_limit = 2.5

# Calculate cubage for each tree using the Huber method
cubage(Huber, :tree, :h, :d, data, diameter_limit)
2×12 DataFrame
 Row │ tree   vt        v0          vc         vr           vn           d        h        hc       aff       nff       qf
     │ Int64  Float64   Float64     Float64    Float64      Float64      Float64  Float64  Float64  Float64   Float64   Float64
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │   148  0.023353  0.00190852  0.0206984  0.000604364  0.000141764      7.0     10.8  8.66842  0.561867  0.497375  0.719286
   2 │   222  0.106503  0.00603186  0.0993921  0.00084823   0.000230383     12.0     19.5  16.8     0.482918  0.493554  0.660833


Including Bark Thickness for Multiple Trees

Additionally, bark thickness values can be provided to calculate bark factors and volumes without bark.

# Calculate cubage including bark thickness
cubage(Huber, :tree, :h, :d, :bark, data, diameter_limit)
2×18 DataFrame
 Row │ tree   vt        v0          vc         vr           vn           d        h        hc       aff       nff       qf        k         vtwb       v0wb        vcwb      vrwb         vnwb
     │ Int64  Float64   Float64     Float64    Float64      Float64      Float64  Float64  Float64  Float64   Float64   Float64   Float64   Float64    Float64     Float64   Float64      Float64
─────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │   148  0.023353  0.00190852  0.0206984  0.000604364  0.000141764      7.0     10.8  8.66842  0.561867  0.497375  0.719286  0.932515  0.0203074  0.00165962  0.017999  0.000525546  0.000123276
   2 │   222  0.106503  0.00603186  0.0993921  0.00084823   0.000230383     12.0     19.5  16.8     0.482918  0.493554  0.660833  0.965238  0.0992267  0.00561978  0.092602  0.000790282  0.000214644

Fitting Linear Regressions

The regression function automatically generates and evaluates multiple regression models based on the provided data. It explores various transformations of the dependent and independent variables, creating a comprehensive set of models for analysis.

Adjusting a Hypsometric Relationship

In forestry, hypsometric relationships model the relationship between tree height (h) and diameter at breast height (dbh). The regression function generates numerous models to find the best fit.

using ForestMensuration
using DataFrames

# Sample dataset with tree heights and diameters
data = DataFrame(
    plot = ["A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A",
            "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B",
            "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C",
            "D", "D", "D", "D", "D", "D", "D", "D", "D", "D", "D", "D", "D", "D", "D"],
    h = [20.9, 19.6, 13.2, 23.3, 19.2, 16.2, 8.3, 19.7, 11.0, 24.0, 25.8, 28.2, 24.2, 26.2, 28.3,
         14.4, 14.9, 15.6, 8.2, 22.1, 16.7, 22.3, 19.5, 15.9, 16.7, 24.5, 21.7, 23.8,
         20.8, 17.7, 19.3, 16.7, 22.2, 18.6, 6.9, 22.3, 8.7, 22.1, 21.0, 23.5,
         19.5, 19.7, 18.2, 13.9, 12.3, 14.5, 12.3, 18.6, 18.0, 17.4, 24.3, 22.8, 23.2, 23.5, 25.2],
    dbh = [31.5, 30.0, 26.5, 31.0, 29.0, 26.5, 14.5, 28.8, 19.0, 31.5, 32.5, 33.8, 32.5, 33.3, 36.0,
           24.0, 28.0, 23.0, 15.5, 31.0, 27.0, 29.0, 28.0, 26.0, 29.0, 30.0, 29.0, 30.5,
           25.0, 26.8, 27.5, 26.0, 26.0, 25.8, 10.8, 27.0, 16.5, 26.5, 27.0, 26.3,
           26.0, 25.5, 25.0, 23.5, 22.0, 23.0, 23.0, 26.0, 25.5, 27.5, 26.5, 26.5, 27.8, 26.0, 27.0]
)

# Perform regression analysis between height and diameter
models = regression(:h, :dbh, data)
492-element Vector{LinearModel}:
 h = -2.4842 + 0.6487 * dbh + 0.005977 * dbh ^ 2
 h = -26.5433 + 4.373 * log(dbh) ^ 2 - 128.4 * dbh ^ -1 + 2419.0 * dbh ^ -2
 h = 57.2762 - 1335.0 * dbh ^ -1 + 8652.0 * dbh ^ -2
 h = 157.3244 - 0.01234 * dbh ^ 2 - 123.6 * log(dbh) + 25.66 * log(dbh) ^ 2
 h = -87.3282 + 0.001564 * dbh ^ 2 + 31.09 * log(dbh) + 2360.0 * dbh ^ -2
 h = 33.7974 - 374.1 * dbh ^ -1
 h = -93.5981 - 0.9571 * dbh + 11.0 * log(dbh) ^ 2 + 525.8 * dbh ^ -1
 h = -16.632 + 3.346 * log(dbh) ^ 2
 h = -46.5986 + 20.15 * log(dbh)
 h = -159.8033 + 48.06 * log(dbh) + 564.4 * dbh ^ -1
 ⋮
 dbh ^ 2 / (h - 1.3) = 134.678 + 4.77 * dbh - 19.67 * log(dbh) ^ 2 - 6344.0 * dbh ^ -2
 dbh ^ 2 / (h - 1.3) = -63.893 + 1.922 * dbh + 1932.0 * dbh ^ -1 - 13460.0 * dbh ^ -2
 dbh ^ 2 / (h - 1.3) = 19.927 + 0.7885 * dbh
 dbh ^ 2 / (h - 1.3) = 429.0095 + 0.04163 * dbh ^ 2 - 107.9 * log(dbh) - 1689.0 * dbh ^ -1
 dbh ^ 2 / (h - 1.3) = 4.3221 - 0.4626 * dbh + 4.575 * log(dbh) ^ 2
 dbh ^ 2 / (h - 1.3) = 47.2955 + 0.005149 * dbh ^ 2 - 258.4 * dbh ^ -1
 dbh ^ 2 / (h - 1.3) = 713.3716 + 5.465 * dbh - 221.5 * log(dbh) - 2430.0 * dbh ^ -1
 dbh ^ 2 / (h - 1.3) = 685.7053 - 370.1 * log(dbh) + 54.29 * log(dbh) ^ 2 - 10760.0 * dbh ^ -2
 dbh ^ 2 / (h - 1.3) = -16.9943 + 0.0001126 * dbh ^ 2 + 17.72 * log(dbh)

This generates 492 different models combining various transformations of h and dbh.

# Number of fitted regressions
length(models)
492

Regression Selection Criteria

After fitting the models, you can evaluate and rank them based on specific criteria using the criteria_table function.

# Evaluate models
best_models = criteria_table(models)
10×8 DataFrame
 Row │ model                                                     rank   adjr2     d         syx      aic      bic      significance
     │ LinearMo…                                                 Int64  Float64   Float64   Float64  Float64  Float64  Float64
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ log(h) = -1.4894 + 1.348 * log(dbh)                          17  0.729603  0.917946  13.9489  267.57   271.585           1.0
   2 │ h = -96.1659 + 34.02 * log(dbh) + 2613.0 * dbh ^ -2          20  0.729649  0.92076   13.9477  269.561  275.583           1.0
   3 │ log_minus(h - 1.3) = -2.0472 + 1.495 * log(dbh)              21  0.729026  0.918973  13.9638  267.687  271.702           1.0
   4 │ h = -159.8033 + 48.06 * log(dbh) + 564.4 * dbh ^ -1          25  0.729505  0.920792  13.9514  269.59   275.612           1.0
   5 │ h = -5.7502 + 0.9352 * dbh                                   26  0.727517  0.917983  14.0026  267.993  272.008           1.0
   6 │ log1p(h) = 0.7588 + 0.2064 * log(dbh) ^ 2                    28  0.726995  0.919647  14.016   268.098  272.113           1.0
   7 │ log1p(h) = 4.9266 - 65.76 * dbh ^ -1 + 375.7 * dbh ^ -2      34  0.726429  0.921681  14.0305  270.212  276.234           1.0
   8 │ log(h) = 0.5413 + 0.2215 * log(dbh) ^ 2                      35  0.725095  0.919865  14.0647  268.48   272.494           1.0
   9 │ log(h) = 4.9804 - 69.14 * dbh ^ -1 + 389.6 * dbh ^ -2        37  0.725409  0.921744  14.0567  270.417  276.439           1.0
  10 │ log_minus(h - 1.3) = 5.064 - 74.0 * dbh ^ -1 + 407.0 * …     42  0.72357   0.921815  14.1037  270.784  276.806           1.0

# Evaluate models based on Adjusted R² and Standard Error, choosing the 5 best
best_5_models = criteria_table(models, :adjr2, :syx, best=5)
5×4 DataFrame
 Row │ model                                                     rank   adjr2     syx
     │ LinearMo…                                                 Int64  Float64   Float64
─────┼─────────────────────────────────────────────────────────────────────────────────────
   1 │ h = -35.3323 + 4.828 * log(dbh) ^ 2 + 1745.0 * dbh ^ -2       2  0.729676  13.947
   2 │ h = -96.1659 + 34.02 * log(dbh) + 2613.0 * dbh ^ -2           4  0.729649  13.9477
   3 │ log(h) = -1.4894 + 1.348 * log(dbh)                           6  0.729603  13.9489
   4 │ h = -159.8033 + 48.06 * log(dbh) + 564.4 * dbh ^ -1           8  0.729505  13.9514
   5 │ log1p(h) = -1.1303 + 1.255 * log(dbh)                        10  0.729334  13.9559


Selecting the Best Model

To select the best model based on the combined ranking, you can simply use the criteria_selection function:

# Select the top model
top_model = criteria_selection(models)
h = -96.1659 + 34.02 * log(dbh) + 2613.0 * dbh ^ -2

Plotting the Regression

You can visualize the regression model using the plot_regression function.

# Plot the top model
plot_regression(top_model)
Example block output


Predict

The predict function allows you to generate predicted values from a regression model on the original scale of the dependent variable. This is particularly useful when the model involves transformations of the dependent variable (e.g., logarithmic transformations). The function automatically applies the appropriate inverse transformations and corrections, such as the Meyer correction factor for logarithmic models.

# Returns the predicted values from the model on the original scale
h_pred = predict(top_model)
55-element Vector{Float64}:
 23.839396334946976
 22.449444919838996
 19.046655226169193
 23.380687056885154
 21.499772962767707
 19.046655226169193
  7.239680599465139
 21.307637639879843
 11.245167159289501
 23.839396334946976
  ⋮
 15.446233283156385
 18.543111236482176
 18.03556588328393
 20.0411325227971
 19.046655226169193
 19.046655226169193
 20.33608673315981
 18.543111236482176
 19.546038496517237

The predict! function extends this by adding the predicted values directly to your DataFrame. It creates new columns for the predicted and actual values, combining observed measurements with model predictions where data may be missing. This is especially useful in forest inventory datasets where certain tree attributes might not be measured for every tree, and predictions need to be filled in for these gaps.

# Automatically adds predicted and actual height columns to the provided DataFrame.
# This combines observed heights and predicted heights for trees with missing or unmeasured heights.
predict!(top_model, data)

# First rows of the dataset
println(data[1:10, :])
10×5 DataFrame
 Row │ plot    h        dbh      h_predict  h_real
     │ String  Float64  Float64  Float64?   Float64
─────┼──────────────────────────────────────────────
   1 │ A          20.9     31.5   23.8394      20.9
   2 │ A          19.6     30.0   22.4494      19.6
   3 │ A          13.2     26.5   19.0467      13.2
   4 │ A          23.3     31.0   23.3807      23.3
   5 │ A          19.2     29.0   21.4998      19.2
   6 │ A          16.2     26.5   19.0467      16.2
   7 │ A           8.3     14.5    7.23968      8.3
   8 │ A          19.7     28.8   21.3076      19.7
   9 │ A          11.0     19.0   11.2452      11.0
  10 │ A          24.0     31.5   23.8394      24.0
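
The fill-in rule described above amounts to taking the observed value where it exists and the model prediction otherwise. A minimal illustration with hypothetical values (not the package's internal code):

```julia
# Observed heights with a gap, and model predictions for the same trees
h_obs  = [20.9, missing, 13.2]
h_pred = [23.8, 22.4, 19.0]

# Observed value where present, predicted value otherwise
h_real = coalesce.(h_obs, h_pred)   # → [20.9, 22.4, 13.2]
```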

Adjusting a Qualitative (Dummy) Hypsometric Relationship

If your data includes categorical variables (e.g., different plots or species), you can include them in the regression analysis.

# Perform regression including 'plot' as a categorical variable
qualitative_models = regression(:h, :dbh, data, :plot)

# Select the best model
top_qual_model = criteria_selection(qualitative_models, :adjr2, :syx, :aic)
log(h) = 0.9926 + 0.06795 * dbh + 0.04215 * plot: B + 0.2089 * plot: C + 0.1953 * plot: D


Plotting the Qualitative Regression

# Plot the top qualitative model
plot_regression(top_qual_model)
Example block output

Site Classification

The site classification functions enable you to evaluate and classify forest sites based on regression models relating tree height and age. These functions are particularly useful for assessing site productivity and quality by comparing observed data with expected values derived from well-calibrated models.

Calculating Site Classification

The site_classification function calculates the expected dominant height at a given index age for each observation based on a fitted regression model. This is a key step in classifying the productivity of a forest site.

using ForestMensuration, DataFrames

# Create a DataFrame containing tree plot data
data = DataFrame(
    plot = repeat(1:6, inner=5),
    age  = repeat([36, 48, 60, 72, 84], outer=6),
    h    = [13.6, 17.8, 21.5, 21.5, 21.8,
            14.3, 17.8, 21.0, 21.0, 21.4,
            14.0, 17.5, 21.2, 21.2, 21.4,
            13.4, 18.0, 20.8, 20.8, 23.2,
            13.2, 17.4, 20.3, 20.3, 22.0,
            13.2, 17.8, 21.3, 21.3, 22.5]
)

# Fit a regression model to relate height (h) to age
reg = regression(:h, :age, data) |> criteria_selection

# Define the target index age (for example, 60 months)
index_age = 60

# Calculate the site classification values (site indices) for each observation
site_indices = site_classification(reg, data, index_age)

println("Site Classification Values:")
println(site_indices)
Site Classification Values:
[20.5, 20.4, 21.5, 20.7, 20.3, 21.1, 20.4, 21.0, 20.0, 19.6, 20.8, 20.1, 21.2, 20.3, 19.6, 20.4, 20.5, 20.8, 19.7, 22.8, 20.2, 20.0, 20.3, 19.1, 20.6, 20.2, 20.4, 21.3, 20.4, 21.5]

Calculating Dominant Height Classification

The hdom_classification function uses the site classification values to predict the dominant height for each observation at the specified index age. This reverses the site classification process, allowing you to forecast tree heights based on site productivity.

# Now, compute the dominant heights for each observation using the site indices
dominant_heights = hdom_classification(reg, data, index_age, site_indices)

println("Dominant Height Values:")
println(dominant_heights)
Dominant Height Values:
[13.6, 17.8, 21.5, 21.5, 21.8, 14.3, 17.8, 21.0, 21.0, 21.4, 13.9, 17.5, 21.2, 21.2, 21.4, 13.5, 18.0, 20.8, 20.8, 23.2, 13.2, 17.4, 20.3, 20.3, 22.0, 13.2, 17.8, 21.3, 21.3, 22.5]

Generating a Site Table

The site_table function creates a comprehensive site table and an associated site plot. This table shows the predicted dominant heights at various ages for different site index classes. You can specify a height increment (hi) to define the granularity of the site classes.

# Generate the site table
analysis = site_table(reg, index_age)
5×6 DataFrame
 Row │ age      S_19.5   S_20.5   S_21.5   S_22.5   S_23.5
     │ Float64  Float64  Float64  Float64  Float64  Float64
─────┼──────────────────────────────────────────────────────
   1 │    36.0     12.4     13.6     14.8     16.2     17.7
   2 │    48.0     16.8     18.0     19.2     20.4     21.8
   3 │    60.0     19.5     20.5     21.5     22.5     23.5
   4 │    72.0     20.6     21.4     22.1     22.9     23.6
   5 │    84.0     21.3     21.9     22.5     23.0     23.6

Plotting the Site Index

# Generate site plot
analysis.site_plot
Example block output

Frequency and Statistical Functions

Creating Frequency Tables

The frequency_table function creates frequency distributions for a vector of values, which is useful for analyzing the distribution of diameters or heights in your data.

# Frequency table for diameters using Sturges' formula for class intervals
frequency_table(data.dbh)
6×7 DataFrame
 Row │ LI       Xi       LS       fi     Fi     fri       Fri
     │ Float64  Float64  Float64  Int64  Int64  Float64   Float64
─────┼─────────────────────────────────────────────────────────────
   1 │    10.0     12.5     15.0      1      1   1.81818    1.81818
   2 │    15.0     17.5     20.0      3      4   5.45455    7.27273
   3 │    20.0     22.5     25.0      2      6   3.63636   10.9091
   4 │    25.0     27.5     30.0     27     33  49.0909    60.0
   5 │    30.0     32.5     35.0     19     52  34.5455    94.5455
   6 │    35.0     37.5     40.0      3     55   5.45455  100.0


  • LI: Lower class limit
  • Xi: Class center
  • LS: Upper class limit
  • fi: Frequency count
  • Fi: Cumulative frequency
  • fri: Relative frequency (%)
  • Fri: Cumulative relative frequency (%)
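
The cumulative and relative columns follow directly from the class counts, so the table above can be reproduced from fi alone (n = 55 observations):

```julia
fi  = [1, 3, 2, 27, 19, 3]   # class counts (fi) from the table above
n   = sum(fi)                # 55 observations
Fi  = cumsum(fi)             # cumulative frequency: [1, 4, 6, 33, 52, 55]
fri = 100 .* fi ./ n         # relative frequency (%): first class ≈ 1.81818
Fri = cumsum(fri)            # cumulative relative frequency, ends at 100.0
```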


Specifying Class Width

You can specify the class width (hi) to customize the intervals.

# Frequency table for heights with class width of 4 meters
frequency_table(data.h, 4)
6×7 DataFrame
 Row │ LI       Xi       LS       fi     Fi     fri       Fri
     │ Float64  Float64  Float64  Int64  Int64  Float64   Float64
─────┼─────────────────────────────────────────────────────────────
   1 │     8.0     10.0     12.0      4      4   7.27273    7.27273
   2 │    12.0     14.0     16.0      5      9   9.09091   16.3636
   3 │    16.0     18.0     20.0     12     21  21.8182   38.1818
   4 │    20.0     22.0     24.0     14     35  25.4545   63.6364
   5 │    24.0     26.0     28.0     17     52  30.9091   94.5455
   6 │    28.0     30.0     32.0      3     55   5.45455  100.0

Calculating Dendrometric Averages

The dendrometric_averages function computes various dendrometric averages of a forest stand, providing insights into the stand structure and growth patterns.

# Calculate dendrometric averages for the dataset
dendrometric_averages(data.dbh, area=0.05)
1×7 DataFrame
 Row │ d₋       d̄        dg       dw       dz       d₁₀₀     d₊
     │ Float64  Float64  Float64  Float64  Float64  Float64  Float64
─────┼───────────────────────────────────────────────────────────────
   1 │ 21.8401  26.5164  26.9182     27.2     26.5    33.62  31.1927


  • d₋: Lower Hohenadl's diameter
  • d̄: Mean diameter
  • dg: Quadratic mean diameter
  • dw: Weise's diameter (60th percentile)
  • dz: Diameter of the tree with central basal area
  • d₁₀₀: Mean diameter of the 100 largest trees per hectare (returns NaN if fewer than 100 trees)
  • d₊: Upper Hohenadl's diameter

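Several of these averages follow from standard definitions and can be checked by hand. A sketch using the 55 dbh values from the hypsometric dataset, with Hohenadl's diameters taken as mean ∓ one standard deviation (an assumption that matches the table within rounding):

```julia
using Statistics

dbh = [31.5, 30.0, 26.5, 31.0, 29.0, 26.5, 14.5, 28.8, 19.0, 31.5, 32.5, 33.8, 32.5, 33.3, 36.0,
       24.0, 28.0, 23.0, 15.5, 31.0, 27.0, 29.0, 28.0, 26.0, 29.0, 30.0, 29.0, 30.5,
       25.0, 26.8, 27.5, 26.0, 26.0, 25.8, 10.8, 27.0, 16.5, 26.5, 27.0, 26.3,
       26.0, 25.5, 25.0, 23.5, 22.0, 23.0, 23.0, 26.0, 25.5, 27.5, 26.5, 26.5, 27.8, 26.0, 27.0]

d̄  = mean(dbh)                 # mean diameter ≈ 26.5164
dg = sqrt(mean(dbh .^ 2))       # quadratic mean diameter ≈ 26.9182
s  = std(dbh)                   # sample standard deviation
d₋ = d̄ - s                     # lower Hohenadl's diameter ≈ 21.8401
d₊ = d̄ + s                     # upper Hohenadl's diameter ≈ 31.1927
```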

Estimating Heights Using a Regression Model

If you have a regression model (e.g., top_model from earlier), you can estimate the corresponding heights for each calculated diameter.

# Estimate heights for dendrometric averages using the regression model
dendrometric_averages(top_model, area=0.05)
1×14 DataFrame
 Row │ d₋       d̄        dg       dw       dz       d₁₀₀     d₊       h₋       h̄        hg       hw       hz       h₁₀₀     h₊
     │ Float64  Float64  Float64  Float64  Float64  Float64  Float64  Float64  Float64  Float64  Float64  Float64  Float64  Float64
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 21.8401  26.5164  26.9182     27.2     26.5    33.62  31.1927  14.2244  19.0631  19.4646  19.7446  19.0467  25.7336   23.558