Woody encroachment, the invasion of woody plants, is rapidly shifting tallgrass prairie into shrub- and evergreen-dominated ecosystems, driven largely by the exclusion of fire. Tracking the pace and extent of woody encroachment is difficult because shrubs and small trees are much smaller than the coarse resolution (>10 m²) of commonly used remotely sensed imagery. However, the US government has been investing in finer-resolution (<2 m²) remote sensing through the USDA National Agriculture Imagery Program (NAIP) and the National Ecological Observatory Network (NEON), both of which cost millions of dollars each year and provide different remotely sensed products. We compared two classification methods (random forests and support vector machines) applied to these two freely available aerial image sources to determine whether, and by how much, NEON improves classification accuracy, and which machine learning method was more accurate. All models achieved high overall classification accuracy (>91%), with the NEON imagery a few percentage points more accurate than NAIP. The NEON-based classification relies heavily on canopy height (LiDAR), whereas band importance is more evenly distributed in the NAIP-based classification. Lastly, classification accuracy for Eastern Red Cedar specifically was high with NEON (78–84%) but comparatively low with NAIP (55–61%).
The purpose of our research is to determine which machine learning method and which aerial imagery source (NAIP or NEON) more accurately classifies vegetation on Konza, and Eastern Red Cedar in particular. We collected training polygons at various locations across Konza from June to August 2021, using a high-precision GPS to trace trees, shrubs, and open areas of grassland by walking around them. These field polygons were supplemented with computer-drawn polygons, traced in ArcGIS using a combination of the 2020 NEON RGB 10 cm imagery and publicly available 1 m² RGB imagery. All training polygons were merged and overlaid on a raster stack, and the value of each layer at each pixel was extracted into an Excel file. The data were then split, with a random 70% used to train each model and the remaining 30% used to evaluate it.
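The extraction-and-split step above can be sketched as follows. This is a minimal illustration in Python with NumPy (the actual workflow used ArcGIS, Excel, and R); the raster stack, band count, and training labels here are synthetic stand-ins, not the real Konza data.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative stand-in for the 17-layer raster stack over a small scene.
# (The real stack combined NAIP and NEON layers; values here are random.)
n_bands, height, width = 17, 100, 100
stack = rng.random((n_bands, height, width))

# Illustrative stand-in for rasterized training polygons: a class label per
# pixel, with 0 marking pixels that fall outside every polygon.
labels = rng.integers(0, 4, size=(height, width))

# Extract the per-band values for every labeled (training-polygon) pixel.
mask = labels > 0
X = stack[:, mask].T          # shape: (n_pixels, n_bands)
y = labels[mask]

# Random 70/30 split into training and evaluation sets.
idx = rng.permutation(len(y))
cut = int(0.7 * len(y))
X_train, y_train = X[idx[:cut]], y[idx[:cut]]
X_eval, y_eval = X[idx[cut:]], y[idx[cut:]]
```

Boolean-mask indexing pulls every band value for each training pixel in one step, which is the array equivalent of the per-pixel extraction described above.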
We created six models in R: two machine learning methods (random forests and support vector machines) crossed with three aerial image sets (NAIP, NEON, and NAIP+NEON stacked together). We used 9 bands from NAIP, 8 bands from NEON, and all 17 for the full raster stack. NAIP imagery provides 4 bands (red, green, blue, and infrared), and we calculated 5 more (red neighborhood, green neighborhood, blue neighborhood, infrared neighborhood, and normalized difference vegetation index [NDVI]). NAIP NDVI was calculated in R and is therefore not included in the table of inputs below. The 8 NEON bands were enhanced vegetation index [EVI], normalized difference nitrogen index [NDNI], normalized difference lignin index [NDLI], soil-adjusted vegetation index [SAVI], atmospherically resistant vegetation index [ARVI], localized NDVI, neighborhood NDVI, and canopy height [LiDAR]. In R, we used the packages ‘e1071’ and ‘randomForest’ to build the support vector machine and random forest models, respectively. Models were trained on the training data, and their accuracies were assessed with the evaluation data. Finally, each model was used to classify the entire area of Konza by predicting a class for every pixel in the raster stack. Because the classification output is a flat list of class labels rather than a raster, we took a Konza aspect raster as a template, set all of its values to NA, and filled them with the predicted classes, producing a classified vegetation map of Konza.
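The NAIP NDVI mentioned above follows the standard formula, NDVI = (NIR − Red) / (NIR + Red). A minimal sketch in Python (the actual calculation was done in R), with random arrays standing in for the red and near-infrared NAIP bands:

```python
import numpy as np

# Illustrative stand-ins for the NAIP red and near-infrared bands.
rng = np.random.default_rng(1)
red = rng.random((4, 4))
nir = rng.random((4, 4))

# NDVI = (NIR - Red) / (NIR + Red); values range from -1 to 1,
# with higher values indicating denser green vegetation.
ndvi = (nir - red) / (nir + red)
```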
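The train–evaluate–classify pipeline above can be sketched as follows. The original models were built in R with ‘randomForest’ and ‘e1071’; this is an illustrative scikit-learn analogue in Python, with random stand-in data, and the scene dimensions, class labels, and hyperparameters are assumptions for the sketch rather than the study's actual settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Stand-in 17-band training (70%) and evaluation (30%) pixel samples,
# with four hypothetical vegetation classes labeled 1-4.
n_train, n_eval, n_bands = 700, 300, 17
X_train = rng.random((n_train, n_bands))
y_train = rng.integers(1, 5, n_train)
X_eval = rng.random((n_eval, n_bands))
y_eval = rng.integers(1, 5, n_eval)

# Train both classifiers (analogues of R's randomForest() and e1071::svm()).
rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X_train, y_train)
svm = SVC(kernel="rbf").fit(X_train, y_train)

# Assess each model's accuracy on the held-out 30%.
rf_acc = accuracy_score(y_eval, rf.predict(X_eval))
svm_acc = accuracy_score(y_eval, svm.predict(X_eval))

# Classify every pixel of a full scene: flatten the stack to a pixel table,
# predict, then pour the flat vector of labels back into an NA-filled
# template raster to obtain a classified map.
height, width = 50, 60
scene = rng.random((n_bands, height, width))
flat_pixels = scene.reshape(n_bands, -1).T     # (n_pixels, n_bands)
predicted = rf.predict(flat_pixels)

template = np.full((height, width), np.nan)    # NA-filled template raster
template.ravel()[:] = predicted                # classified vegetation map
```

The final two lines mirror the paper's trick of reusing an existing raster as a template so the flat prediction vector regains the scene's spatial shape.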
Note: the raster stack file (rs.stack) is a .tiff with 17 layers, which can be found here.