Main folder “DataTemplates_and_Codes_Mondragone”
Commented R Script providing all codes to reproduce all analyses and figures contained in the paper (code.R)
The present README text both in .html and .Rmd format
Jessica Cristina Menghi Sartorio (Department of Enterprise Engineering “Mario Lucertini”, University of Rome “Tor Vergata”)
Eugenio Bortolini (Department of Cultural Heritage, University of Bologna)
Gregorio Oxilia (Department of Cultural Heritage, University of Bologna)
The present digital archive accompanies the paper: Oxilia, G., Bortolini, E., Marciani, G., Menghi Sartorio, J.C., et al. Direct evidence that late Neanderthal occupation precedes a technological shift in southwestern Italy, Journal of Human Evolution.
During the Middle to Upper Palaeolithic transition (between 50,000 and 40,000 years ago), interaction between Neanderthals and Homo sapiens varied across Europe. In southern Italy, the association between Homo sapiens fossils and non-Mousterian material culture, as well as the mode and tempo of Neanderthal demise, are still actively debated. This work presents two lower deciduous molars uncovered at Roccia San Sebastiano (Mondragone-Caserta, Italy), stratigraphically associated with Mousterian (RSS1) and Uluzzian (RSS2) artefacts. Using virtual morphometric methods and supervised learning algorithms, we show that RSS1, whose Mousterian context appears more recent than 44,800-44,230 cal BP, can be attributed to a Neanderthal, while RSS2, found in an Uluzzian context that we dated to 42,640-42,380 cal BP, is attributed to Homo sapiens. This site therefore yields the most recent direct evidence for a Neanderthal presence in southern Italy and confirms a later shift to Early Upper Palaeolithic technology in southwestern Italy compared to the earliest Uluzzian evidence at Grotta del Cavallo (Puglia, Italy).
The present repository contains R code for running the analyses on cervical and crown outlines presented in the paper, including dataset generation, Generalised Procrustes Analysis, Principal Component Analysis, Permutational Multivariate ANOVA, and probabilistic taxonomic attribution through supervised learning algorithms. The latter comprise:
Flexible Discriminant Analysis (FDA), a flexible extension of Linear Discriminant Analysis (LDA) that uses non-linear combinations of predictors allowing for a low misclassification error when modelling non-linear, non-normal, and non-homogeneous data;
Multivariate Adaptive Regression Splines (MARS; Friedman 1991, Hastie 2017). This algorithm identifies the value intervals that best discriminate between groups by iteratively fitting linear regressions for each group and finding the predictor values (knots) that minimize the within-group total error. These knots are then used to link the individual linear functions into the final model (Hastie et al., 1994, 2009). Overfitting is controlled through generalized cross-validation (GCV), a stepwise process that weighs the goodness of fit of the model against its number of parameters.
Random Forest (RF) classifier (Liaw & Wiener, 2002), which uses recursive binary splitting to grow classification trees on samples drawn with replacement and assigns each specimen to the class predicted most often across all trees (SOM S2).
Results obtained with all methods (FDA, MARS, and RF) are validated through repeated 10-fold cross-validation.
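As an illustration of the validation scheme described above, the following is a minimal sketch and not the code in code.R. It assumes that the classifiers are fitted through the caret package (using its "fda", "earth", and "rf" methods), that principal component scores serve as predictors, and that five repeats are used; none of these choices is stated in this README. The data are simulated and the group labels are placeholders.

```r
library(caret)  # the mda, earth, and randomForest packages must also be installed

set.seed(1)
# Simulated stand-in for PC scores: two groups of 40 specimens, 3 components
n <- 40
scores <- data.frame(
  PC1 = c(rnorm(n, -1), rnorm(n, 1)),
  PC2 = c(rnorm(n, 0.5), rnorm(n, -0.5)),
  PC3 = rnorm(2 * n)
)
species <- factor(rep(c("Neanderthal", "Hsapiens"), each = n))

# Repeated 10-fold cross-validation (the number of repeats is an assumption)
ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 5)

# Fit the three classifiers with the same resampling scheme
methods <- c(FDA = "fda", MARS = "earth", RF = "rf")
fits <- lapply(methods, function(m) {
  set.seed(1)  # same folds for every method
  train(x = scores, y = species, method = m, trControl = ctrl)
})

# Best cross-validated accuracy found for each method
cv_accuracy <- sapply(fits, function(f) max(f$results$Accuracy))
print(round(cv_accuracy, 3))

# Class probabilities for a hypothetical new specimen
new_specimen <- data.frame(PC1 = -1.2, PC2 = 0.4, PC3 = 0.1)
probs <- lapply(fits, predict, newdata = new_specimen, type = "prob")
print(probs)
```

Resetting the seed before each call to train() keeps the folds identical across methods, so that differences in accuracy reflect the classifiers rather than the partitioning.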
NB: the present digital archive does not include the raw data on cervical and crown outlines, as they are currently unpublished and were kindly provided by different stakeholders. They will be published separately and independently in forthcoming publications. Therefore, at present, the results of our paper cannot be fully reproduced. Nevertheless, we are making the full code available to facilitate replication and further development of the methods we used, to the best of our current ability.
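Since the raw outlines are not distributed, the shape-analysis steps can be tried on simulated data. The sketch below is an assumption-laden illustration and not an excerpt from code.R: it uses procGPA() from the shapes package for Generalised Procrustes Analysis, prcomp() for Principal Component Analysis, and adonis2() from vegan for Permutational Multivariate ANOVA. The simulated outlines (16 points on noisy ellipses), the helper sim_outline(), and the group names "A" and "B" are hypothetical.

```r
library(shapes)  # procGPA
library(vegan)   # adonis2

set.seed(42)
k <- 16            # points per outline (hypothetical)
n_per_group <- 15  # specimens per group (hypothetical)
theta <- seq(0, 2 * pi, length.out = k + 1)[-(k + 1)]

# Simulate one outline: noisy ellipse with random rotation and size
sim_outline <- function(elongation) {
  xy <- cbind(elongation * cos(theta), sin(theta)) +
    matrix(rnorm(2 * k, sd = 0.03), k, 2)
  a <- runif(1, 0, 2 * pi)
  rot <- matrix(c(cos(a), sin(a), -sin(a), cos(a)), 2, 2)
  runif(1, 0.5, 2) * xy %*% rot
}

elong <- rep(c(1.1, 1.4), each = n_per_group)
outlines <- simplify2array(lapply(elong, sim_outline))  # k x 2 x n array
group <- factor(rep(c("A", "B"), each = n_per_group))

# Generalised Procrustes Analysis: removes location, size, and orientation
gpa <- procGPA(outlines, scale = TRUE)

# One row per specimen, one column per aligned coordinate
aligned <- t(apply(gpa$rotated, 3, as.vector))

# Principal Component Analysis of the aligned coordinates
pca <- prcomp(aligned)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained[1:3], 3))

# Permutational MANOVA on Euclidean distances between aligned shapes
perm <- adonis2(dist(aligned) ~ group, data = data.frame(group = group),
                permutations = 999)
p_value <- perm$`Pr(>F)`[1]
print(perm)
```

The principal component scores in pca$x are the kind of low-dimensional summary that could then be passed to a classifier.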
A .txt file providing the list of individual sample labels paired with the respective species (label_specie.txt)
A .txt file providing the template for the list of cervical outline coordinates to be used in Generalised Procrustes Analysis and following analyses (Lower_dm2_cervix_outline.txt)
A .txt file providing the template for the list of crown outline coordinates to be used in Generalised Procrustes Analysis and following analyses (Lower_dm2_cervix_outline.txt)
A .txt file providing the list of individual sample labels used in the analysis of cervical outlines (Lower_dm2_label.Cervix.txt)
A .txt file providing the list of individual sample labels used in the analysis of crown outlines (Lower_dm2_label.Crown.txt)
A .txt file providing the list of species labels attributed to each sample, in the same order as sample labels, for cervical data (Lower_dm2_specie.Cervix.txt)
A .txt file providing the list of species labels attributed to each sample, in the same order as sample labels, for crown data (Lower_dm2_specie.Crown.txt)
Code: MIT (https://choosealicense.com/licenses/mit/), year: 2021, copyright holders: Jessica Cristina Menghi Sartorio, Eugenio Bortolini, and Gregorio Oxilia.
R version 4.1.0 (2021-05-18)
Packages
* rgl (v.0.106.8)
* shapes (v.1.2.6)
* tripack (v.1.3.9.1)
* MASS (v.7.3.54)
* lmtest (v.0.9.38)
* ape (v.5.5)
* ade4 (v.1.7.16)
* pls (v.2.7.3)
* Morpho (v.2.8)
* geomorph (v.4.0.0)
* geometry (v.0.4.5)
* car (v.3.0.10)
* grDevices (v.4.1.0)
* factoextra (v.1.0.7)
* vegan (v.2.5.7)
* randomForest (v.4.6.14)
* mda (v.0.5.2)
* RColorBrewer (v.1.1.2)
* caret (v.6.0.88)
* earth (v.5.3.0)
* RVAideMemoire (v.0.9.79)
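To compare a local installation against the versions listed above, something along the following lines may help. This is only a convenience sketch in base R; the helper name check_versions() is hypothetical and is not part of code.R. Version strings are compared after R normalises separators (for example, "7.3-54" becomes "7.3.54").

```r
# Package versions as listed in this README
required <- c(
  rgl = "0.106.8", shapes = "1.2.6", tripack = "1.3.9.1", MASS = "7.3.54",
  lmtest = "0.9.38", ape = "5.5", ade4 = "1.7.16", pls = "2.7.3",
  Morpho = "2.8", geomorph = "4.0.0", geometry = "0.4.5", car = "3.0.10",
  grDevices = "4.1.0", factoextra = "1.0.7", vegan = "2.5.7",
  randomForest = "4.6.14", mda = "0.5.2", RColorBrewer = "1.1.2",
  caret = "6.0.88", earth = "5.3.0", RVAideMemoire = "0.9.79"
)

# Hypothetical helper: report installed versus listed versions
check_versions <- function(required) {
  installed <- vapply(names(required), function(p) {
    # packageVersion() errors if the package is missing; return NA instead
    tryCatch(as.character(packageVersion(p)), error = function(e) NA_character_)
  }, character(1))
  data.frame(
    package = names(required),
    listed = unname(required),
    installed = unname(installed),
    match = !is.na(installed) & installed == required,
    row.names = NULL
  )
}

report <- check_versions(required)
print(report)
```

A row with installed shown as NA indicates a package that is not installed; a FALSE in the match column indicates a version different from the one listed here, which does not necessarily mean the code will fail.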