2.4. Statistical Analysis
A descriptive analysis of farming practices, environmental context (i.e. landscape and wild boar density) and risk factors related to potential pathogen transmission and interactions with wild boars was first performed. For this analysis, outdoor surface area, the number of pigs on the farm and the wild boar density index were categorised using either breaks in the distribution or, when no break was clearly visible, tertiles as cut-off points.
The farms were then classified into groups using a multiple factor analysis (MFA) followed by a hierarchical cluster analysis (HCA). This approach identified farm types, i.e. groups of farms with similar farming practices, environmental context, biosecurity levels and reported incursions of, or interactions with, wild boars (Table 2). MFA reduces dimensionality by characterising each farm with synthetic variables (referred to as factors) instead of the original variables. Each factor captures the information carried by a set of associated original variables, so that the number of factors required to capture most of the information in the original data is usually much lower than the number of original variables. The contributions of the original variables to the MFA factors reflect the links among them. Furthermore, MFA is more appropriate than multiple correspondence analysis (MCA) when the variables are structured in groups (here four: farming practices, environmental context, biosecurity level against wild boar incursions, and interactions with wild boars) and the number of variables differs from one group to another. Indeed, MFA balances the influence of the groups on the factors even when the groups contain different numbers of variables (Escofier & Pagès, 1994). A total of 13 original variables were used to characterise each farm.
Eight, two, one and two variables were related to farming practices, environmental context, protection against wild boar intrusions and interactions with wild boars, respectively (Table 1). In the second stage, HCA was used to identify groups of farms with similar characteristics (Table 2). HCA was conducted on the farms' MFA factor scores using Ward's method. With this method, farms are grouped so that both the homogeneity within each group and the heterogeneity between groups are maximised (Ward, 1963). The set of characteristics statistically associated with each group of farms was identified using the tests for differences among groups for categorical variables provided in the output of the HCPC function of the FactoMineR package. Finally, a chi-squared test was used to test whether the proportion of farms inside versus outside the OD outbreak zone varied among farm types. Data processing, descriptive statistics and multivariate analyses were performed with R version 3.1.1 (R Core Team, 2014), using the FactoMineR package for MFA and HCA (Husson, Lê, & Pagès, 2017; Lê, Josse, & Husson, 2008). Maps of farm locations were produced with the ggplot2, ggmap and cartography packages in R.
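As an illustration of the final step, the chi-squared test of independence between farm type and location relative to the outbreak zone can be sketched as follows. This is a minimal Python sketch of the arithmetic only, not the code used in the study (which was run in R). The counts in `observed`, the labels `type_1` to `type_3` and the function name `chi_squared_independence` are hypothetical and chosen purely for illustration.

```python
# Hypothetical counts, NOT data from the study.
# Keys are farm types (clusters from the HCA); values are
# (farms inside the outbreak zone, farms outside the outbreak zone).
observed = {
    "type_1": (12, 8),
    "type_2": (5, 15),
    "type_3": (9, 11),
}


def chi_squared_independence(table):
    """Return (statistic, degrees of freedom, expected counts) for an r x c table."""
    rows = list(table.values())
    n_cols = len(rows[0])
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(row[j] for row in rows) for j in range(n_cols)]
    grand_total = sum(row_totals)

    # Expected count under independence: row total * column total / grand total.
    expected = [
        [rt * ct / grand_total for ct in col_totals] for rt in row_totals
    ]

    # Pearson statistic: sum of (observed - expected)^2 / expected over all cells.
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(rows, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(rows) - 1) * (n_cols - 1)
    return statistic, dof, expected


statistic, dof, expected = chi_squared_independence(observed)
print(f"chi-squared = {statistic:.3f}, df = {dof}")
```

The statistic is then compared with the chi-squared distribution with `dof` degrees of freedom to obtain a p-value. The approximation is usually considered reliable only when expected counts are not too small, which is worth checking via `expected` when some farm types contain few farms.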