2.4. Statistical Analysis
A descriptive analysis of farming practices, environmental context (i.e.
landscape and wild boar density, risk factors with regard to potential
pathogen transmission characteristics and interactions with wild boars)
was first performed. For this analysis, outdoor surface area, number of
pigs on the farm and the wild boar density index were categorised using
either breaks in the distribution or, when no break was clearly visible,
tertiles as cut-off points. The farms were then classified into different
groups using a multiple factor analysis (MFA) followed by a hierarchical
cluster analysis (HCA). This approach allowed the identification of
farm types, i.e. groups of farms that have similar farming practices, environmental
context, biosecurity levels and reported events of incursions or
interactions with wild boars (Table 2). In this approach, MFA allows
dimension reduction by characterising each farm with synthetic variables
(referred to as factors) instead of the original variables. Each factor
captures the information of a number of original variables that are
associated, so that the number of factors required to capture most of
the information in the original data is usually much lower than the
number of original variables. The contributions of the different
original variables to the MFA factors reflect the links among them.
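
As a reading aid, the construction of the MFA factors can be written compactly. The notation below is introduced here for illustration and is not taken from the original analysis: X_j denotes the coded table of group j (for categorical variables, the indicator table analysed as in multiple correspondence analysis), J the number of groups, n the number of farms, and \lambda_1^{(j)} the first eigenvalue of the separate analysis of group j.

```latex
% Global table: each group is scaled by the inverse square root of its own first eigenvalue
X = \left[ \frac{X_1}{\sqrt{\lambda_1^{(1)}}} \;\Big|\; \cdots \;\Big|\; \frac{X_J}{\sqrt{\lambda_1^{(J)}}} \right]

% s-th MFA factor (synthetic variable), with u_s the s-th unit eigenvector of the global analysis
F_s = X u_s, \qquad \tfrac{1}{n} X^{\top} X \, u_s = \mu_s u_s, \qquad \mu_1 \ge \mu_2 \ge \cdots
```

After this scaling, the largest axial inertia of every group equals 1, so no single group can generate the first global factor on its own, whatever its number of variables.
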
Furthermore, MFA is more appropriate than multiple correspondence
analysis (MCA) when the variables are grouped (here into four categories: farming
practices, environmental context, biosecurity level against wild boar
incursions, interactions with wild boars) and the number of variables
varies from one group to another. Indeed, with MFA, even when the number
of variables differs among groups, the influence of the different groups
on the MFA factors is balanced (Escofier & Pagès, 1994). A total of 13
original variables were used to characterise each farm. Eight, two, one
and two variables were related to farming practices, environmental
context, protection against wild boar intrusions and interactions with
wild boars, respectively (Table 1). In the second stage, HCA was used to
identify groups of farms with similar characteristics (Table 2). HCA was
conducted on the farms’ MFA factor scores, using Ward’s method. With
this method, the farms are grouped so that both the homogeneity within
each group and the heterogeneity between groups are maximized (Ward,
1963). The set of characteristics statistically associated with each
group of farms was identified using the statistical test for differences
among groups for categorical variables provided in the output of the HCPC
function of the FactoMineR package. Finally, a chi-squared test was used to test for
variation in the proportion of farms inside or outside the OD outbreak
zone among farm types. Data processing, descriptive statistics and
multivariate analyses were performed with R version 3.1.1 (R Core Team,
2014), using the package FactoMineR for MFA and HCA (Husson, Lê, &
Pagès, 2017; Lê, Josse, & Husson, 2008). Maps of farm locations were
produced with the ggplot2, ggmap and cartography packages for R.
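
The last two steps described above, Ward clustering of factor scores followed by a chi-squared comparison of farm types against the outbreak zone, can be illustrated with a minimal sketch. It is written in plain Python for readability and is not the FactoMineR code used in the study. The function names, the factor scores and the counts are all hypothetical, and the statistic is computed without a continuity correction or a p-value.

```python
from itertools import combinations


def ward_clusters(points, k):
    """Agglomerate points into k clusters using Ward's criterion.

    At each step, merge the two clusters whose fusion gives the smallest
    increase in within-cluster sum of squares:
        delta = n_a * n_b / (n_a + n_b) * ||centroid_a - centroid_b||^2
    """
    # Each cluster is stored as (member indices, centroid, size).
    clusters = [([i], list(p), 1) for i, p in enumerate(points)]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            _, ca, na = clusters[a]
            _, cb, nb = clusters[b]
            dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
            delta = na * nb / (na + nb) * dist2
            if best is None or delta < best[0]:
                best = (delta, a, b)
        _, a, b = best
        ma, ca, na = clusters[a]
        mb, cb, nb = clusters[b]
        # The merged centroid is the size-weighted mean of the two centroids.
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append((ma + mb, centroid, na + nb))
    return [sorted(members) for members, _, _ in clusters]


def chi_squared(table):
    """Return (Pearson statistic, degrees of freedom) for an r x c count table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical factor scores for six farms on two MFA dimensions.
scores = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
          (3.0, 3.1), (3.2, 2.9), (2.9, 3.0)]
groups = ward_clusters(scores, k=2)

# Hypothetical counts: rows are farm types, columns are inside / outside the zone.
stat, dof = chi_squared([[10, 20], [20, 10]])
```

The pairwise search is quadratic at every merge, which is acceptable for a few hundred farms; dedicated implementations use update formulas instead of recomputing every distance.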