What winter tourism, ethnicity and employment structure tell us about the spread of SARS-Cov2

In the first post of the, hopefully, better 2021 we'll again make a short trip to the northern Italian province of South Tyrol. There are multiple reasons to love northern Italy: magnificent alpine landscapes, fine wine and cuisine, better coffee than on the other side of the Alps... For a data scientist, there is one more: excellent public data availability, both for the COVID-19 pandemic and for city and village statistics. It's enough to go to the portal of the provincial or national statistical authority or to the local COVID-19 dashboard, and you may freely download even the most insane data for each South Tyrolean and Italian commune - simply great! This gives us a unique opportunity to analyze the course of the SARS-Cov2 outbreak in South Tyrol, in particular the spring and fall/winter waves, to identify some factors fostering the spread and, with a bit of imagination, to speculate on a local protective (herd) immunity against the pathogen.
 
As usual, the scripts of my analyses can be found here.

Did the European outbreak start in Ischgl?

It was a sunny weekend day at the beginning of March 2020. I was traveling on a terribly crowded ski bus to the small North Tyrolean ski resort Axamer Lizum, squeezed with my ski touring gear somewhere between a group of elderly sledgers and young freeriders loudly talking about virgin powder lines down to a neighboring valley. Of course, everybody had heard about Wuhan and the drama in Bergamo. But SARS-Cov2 in Austria, in ski resorts? Improbable! Looking back from almost a year later, I know that the passengers of the ski bus, including me, were pretty lucky... At the same time, ignored by the health authorities and the Austrian media, a trio of Tyrolean ski destinations, Ischgl, St. Anton and Sölden, had already been developing into a COVID hot spot, not only affecting the inhabitants and seasonal workers but also exporting the disease with infected international guests to numerous European countries.
 

One ride a day keeps my COVID-19 away... May be true if it's away from the crowded ski destinations of North Tyrol and Italy
 
Well, although the case of Ischgl gained the greatest media hype across the continent, it's improbable that winter tourism in a few particular North Tyrolean villages ignited the pan-continental spread. What about other ski destinations? What about other factors augmenting the transmission of the bug? And, finally, how may a high incidence during the first wave influence the strength of the second COVID-19 wave? Is there a place for herd immunity? I'll try to check whether the great data treasure from South Tyrol can answer some of these questions.

Three types of 2020 SARS-Cov2 spread in South Tyrol's communes: struck by the first wave, protected from the second

Well, the only benefit for the local community from the Ischgl disaster in March 2020 is some kind of protection from being infected anew with SARS-Cov2 during the second wave of the outbreak. In summer 2020, antibodies against the pathogen were found in more than 40% of the village population. However, the local cohort has not been investigated with systematic follow-ups and hence real-life evidence for the local 'herd immunity' is missing. Alternatively, one may assume that a high prevalence during the spring wave drives the incidence during the second one. Unfortunately, I was not able to test those two scenarios with Austrian data, because incidence figures for single communes are not available. In turn, it's not hard to get such information for the neighboring Italian province - let's have a look at how the COVID-19 incidence in the first wave relates to the spread in the second one in single South Tyrolean municipalities (Graphic 1). A few words of explanation: for each commune, the cumulative incidence for each half of 2020 was calculated, i.e. the sum of cases up to 2020-06-30 and the sum of cases from 2020-06-30 on, and presented as the percentage of infected persons among the commune's inhabitants.
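To make the calculation concrete, here is a minimal Python sketch of the per-commune figure described above. It is not the script behind Graphic 1: the function name, the input layout (pairs of date and new cases) and the numbers are made up for illustration, and I assume that the cutpoint day itself is counted in the first half.

```python
from datetime import date

# Cutpoint between the two halves of 2020 (taken from the text).
CUTPOINT = date(2020, 6, 30)


def half_year_incidence(daily_cases, population):
    """Return (first_half_pct, second_half_pct) for one commune.

    daily_cases: iterable of (day, new_cases) pairs (hypothetical layout)
    population:  number of inhabitants of the commune
    Assumption: cases reported on the cutpoint day belong to the first half.
    """
    first = sum(n for day, n in daily_cases if day <= CUTPOINT)
    second = sum(n for day, n in daily_cases if day > CUTPOINT)
    return 100 * first / population, 100 * second / population


# Made-up example commune with 2000 inhabitants.
cases = [
    (date(2020, 3, 15), 12),
    (date(2020, 4, 2), 8),
    (date(2020, 11, 10), 60),
    (date(2020, 12, 1), 40),
]
first_pct, second_pct = half_year_incidence(cases, population=2000)
print(f"first half: {first_pct:.1f}%, second half: {second_pct:.1f}%")
```

In this toy commune, 20 spring cases and 100 fall/winter cases translate to 1% and 5% of the inhabitants, respectively.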

 
Graphic 1. Cumulative SARS-Cov2 incidence in the first and second half of 2020 in 116 South Tyrolean communes (cutpoint: 2020-06-30). The cumulative incidence is presented as a percentage of the commune population. (A) Cumulative incidences in the halves of 2020. The blue line represents a fitted LOESS trend. Note the far higher incidence during the fall/winter outbreak. (B) Normalized cumulative incidences in the halves of 2020. The normalized data were subjected to k-means clustering; point colors represent the incidence clusters: Cluster 1 - strong first wave, weak second wave, Cluster 2 - weak first wave, strong second wave, Cluster 3 - both waves suppressed. (C) Comparison of cumulative incidences in the halves of 2020 between the clusters. Statistical significance was determined by one-way ANOVA with Tukey post-hoc test.
 
At first glance, the incidence plot (Graphic 1A) suggests that the latter scenario sketched above doesn't hold true: if the spring wave were a driver of the fall/winter one, I would expect a bunch of municipalities in the upper right corner of the plot. In other words, something like a simple direct correlation might have been expected. This is not the case! Instead, there are three quite clear types of spread in the communes:
  • Cluster 1: high incidence in spring and less-than-average incidence in fall/winter
  • Cluster 2: low incidence in spring and above-average incidence in fall/winter
  • Cluster 3: below-average to average incidence during both waves
In fact, we can apply a proper statistical tool called k-means clustering to assign the municipalities to the above-mentioned clusters (Graphic 1B). To account for the substantially lower incidence in spring 2020 than in fall/winter, the numbers were normalized, i.e. mean-centered and scaled by the standard deviation. To verify the result of the cluster assignment, I compared the incidences in the year halves between the commune clusters: the differences in case numbers are highly significant and, hence, our assignment looks sound (Graphic 1C).
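For the curious, the two steps (normalization, then k-means) can be sketched in a few lines of plain Python. This is an illustration only, not the analysis code behind Graphic 1B: the incidences are invented, the implementation is a bare-bones Lloyd's algorithm, and the starting centers are hand-picked, whereas a real analysis would rather rely on a library implementation with several random starts.

```python
from statistics import mean, stdev


def zscore(values):
    """Mean-center the values and scale them by their standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    centers = list(centers)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: squared_distance(pt, centers[k]))
            for pt in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            new_centers.append(
                tuple(mean(dim) for dim in zip(*members)) if members else old
            )
        if new_centers == centers:  # nothing moved: converged
            break
        centers = new_centers
    return labels, centers


# Invented cumulative incidences (% of inhabitants) for six communes.
spring = [3.0, 2.8, 0.2, 0.3, 0.2, 0.1]
fall = [1.0, 1.2, 6.0, 5.5, 1.0, 1.5]

points = list(zip(zscore(spring), zscore(fall)))
start = [points[0], points[2], points[4]]  # hand-picked starting centers
labels, centers = kmeans(points, start)
print(labels)
```

With these toy numbers the first two communes (strong spring, weak fall) end up together, as do the middle two (weak spring, strong fall) and the last two (both weak), mirroring the three types of spread listed above.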
 
Table 1. Population data for Clusters' communes.

In Table 1, I gathered some basic population data for each Cluster. Curiously, Cluster 1, i.e. the set of towns struck primarily by the spring COVID-19 wave, is the smallest one, both in terms of the number of communes and total population. In addition, it tends to contain rather small municipalities, as indicated by the lowest mean and median commune population. Hence, it seems probable that the initial SARS-Cov2 spread, at least in South Tyrol, was limited to a handful of municipalities that are home to 10 - 15% of the province's inhabitants. In contrast, the second wave-affected Cluster 2 contains roughly 50% of the province's population even though it consists of only 34 towns! This is a first important hint that the fall/winter outbreak displays its highest intensity in large, densely populated communes - you'll soon see that the capital region of Bozen/Bolzano - Meran/Merano (MeBo) is the focus of the recent disease activity.
 
Let's have a short look at representative members of those clusters (Graphic 2).
 

 Graphic 2. Cumulative SARS-Cov2 incidence in the first and second half of 2020 in representative communes assigned to incidence Cluster 1 (A, strong first wave, weak second wave), Cluster 2 (B, weak first wave, strong second wave) and Cluster 3 (C, both waves suppressed). Dashed lines represent the cutpoint date of 2020-06-30.

 
Cluster 1 seems to me to be the most striking one: as you can see for Kastelruth/Castelrotto and Wolkenstein/Selva di Val Gardena (Graphic 2A), the communes were particularly dramatically affected by the first wave and 'only' mildly swept by the second one, at least in relation to the other clusters. In fact, the real prevalence of the disease in Cluster 1 was greatly underestimated by the official incidence figures. As in Ischgl, the presence of anti-SARS-Cov2 antibodies was measured in a few Cluster 1 municipalities located in the Dolomites' Grödental/Val Gardena during early summer 2020 and found to be roughly 27%. It's a lot, but still far below the 60 - 72% seropositivity required for SARS-Cov2 herd immunity (for details see: here). So, most likely, the suppression of the second wave in Cluster 1 communes does not rely solely on the presence of protective immunity in the local community but, as we'll see later, can be a result of the shutdown of a local economy extensively based on tourism.
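A quick back-of-the-envelope check of that gap: under the textbook approximation, the herd immunity threshold equals 1 - 1/R0, where R0 is the basic reproduction number. The R0 values below are my own illustrative picks, chosen only because they roughly reproduce the 60 - 72% range; they are not taken from the linked source.

```python
def herd_immunity_threshold(r0):
    """Textbook approximation: share of immune people needed, 1 - 1/R0."""
    return 1 - 1 / r0


# Seroprevalence reported for the Val Gardena communes (from the text).
seroprevalence = 0.27

# Illustrative R0 values (assumption, not from the linked source).
for r0 in (2.5, 3.6):
    threshold = herd_immunity_threshold(r0)
    gap = threshold - seroprevalence
    print(f"R0 = {r0}: threshold {threshold:.0%}, still missing {gap:.0%}")
```

Even for the lower of the two assumed R0 values, about a third of the population would still have to become immune before the threshold is reached.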
 
The Cluster 2 municipalities are only a bit less interesting (Graphic 2B). For me, it was a real surprise that the capital Bozen/Bolzano, by far the biggest town of the province and a typical Cluster 2 municipality, was hardly struck by the spring SARS-Cov2 outbreak, even though its hospitals and ICUs were full of COVID-19 patients. Apparently, the size, the population density and a local economy involving lots of manpower from the surrounding towns didn't count as driving factors of the spring wave! Instead, they seem to play a role right now in the winter season, and the high prevalence in the capital city fans out to the neighboring municipalities, as we'll see later. Another example of a Cluster 2 town is Sexten/Sesto, a small peripheral municipality in the remote upper Pustertal/Val di Pusteria, which experienced just one case in the spring and a cumulative incidence of 10% between October and December 2020! A possible explanation for that phenomenon follows in a moment.

Finally, there's Cluster 3, which seems not to be particularly affected by either seasonal outbreak (Graphic 2C). As representative examples I chose two communes lying at opposite ends of the province - you'll see that their location outside the most populated Bozen/Bolzano - Meran/Merano (MeBo) center of the province serves as a 'geographic' protective factor.

Geography matters when it comes to coronavirus spread

Let's delve for a moment into the location of the clusters' communes on the map of South Tyrol and check it for distribution patterns (Graphic 3).


Graphic 3. Geographic distribution of communes assigned to incidence Clusters 1, 2 and 3. Lines in (B) represent two-dimensional density function of spatial commune distribution.


First of all, Cluster 1 communes tend to form one clear center in the south-eastern part of the province, more precisely, in the central Dolomites with Grödental/Val Gardena and Gadertal/Val Badia. Another, sparser focus encompasses the southern end of South Tyrol. Those are relatively sparsely populated regions outside the capital MeBo region - in line with the population data presented in Table 1 - and, importantly, touristic ones with the famous ski resorts Seiseralm, Selva di Val Gardena, Corvara and Alta Badia. Surprisingly, the map of Cluster 1 communes does not include any of the large winter destinations in the west of the province, including Meran/Merano, Sulden/Solda and Schnalstal. To me, it suggests that the spring SARS-Cov2 outbreak in South Tyrol, and maybe also in Austrian North Tyrol, was initiated by isolated superspreader events in single ski resorts. In other words, in spring 2020 the disease was still rare and strictly regional, which explains why I came back home healthy after my trip to Axamer Lizum in March...

For Cluster 2, with municipalities affected by the second wave, three centers can be distinguished: the central, densely populated and interconnected region around MeBo (Meran/Bozen) and two peripheral foci at the opposite, remote ends of the province: Upper Vinschgau/Val Venosta and Upper Pustertal/Val Pusteria. The first region consists of two pretty large cities and their commuter communes, which supply the economy of the capital region with daily commuting workers. The latter two comprise mostly small villages and towns (like the extreme cases of Sexten/Sesto or Schluderns/Sluderno, Graphic 1) with, probably, tightly knit communities favoring the local spread.

Cluster 3 includes most of South Tyrol's municipalities, with two sparse geographical centers in the middle parts of Vinschgau/Val Venosta in the west and Pustertal/Val Pusteria in the east. Those are mostly middle-sized towns located at the periphery of the capital MeBo region or, in the case of Pustertal, well outside it. Their looser connections with the agglomeration and their, at least in part, agricultural character seem to protect them from both SARS-Cov2 outbreaks.

Extensive tourism, private sector economy and Ladin language are the hallmarks of the communes affected by the spring SARS-Cov2 wave

Now, let's try to find some non-demographic and non-geographic factors which distinguish Cluster 1, 2 and 3 municipalities. To this end, I analyzed a colorful medley of commune variables provided by the South Tyrolean statistical authority ASTAT. Methodologically, I applied one-way ANOVA with Tukey's post-hoc test to compare each pair of clusters. For strongly non-normally distributed variables, a log10 transformation was used.
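To illustrate what the ANOVA does under the hood, here is a small Python sketch that computes the F statistic for one log10-transformed variable. The guest numbers are invented and the function is my own minimal version; the p value (from the F distribution with k - 1 and n - k degrees of freedom) and Tukey's post-hoc test are left out here and would normally come from a statistics package.

```python
from math import log10
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Invented yearly hotel guest counts for three communes per cluster.
guests = {
    "Cluster 1": [100_000, 1_000_000, 10_000_000],
    "Cluster 2": [1_000, 10_000, 100_000],
    "Cluster 3": [10_000, 100_000, 1_000_000],
}

# The log10 transformation tames the strongly skewed counts.
log_groups = [[log10(v) for v in values] for values in guests.values()]
f_stat = anova_f(log_groups)
print(f"F = {f_stat:.2f}")
```

On the log10 scale the toy cluster means are 6, 4 and 5 with equal spread within each cluster, which gives F = 3 for 2 and 6 degrees of freedom.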
 
Graphic 4. Commune stats significantly varying between the incidence clusters identified by one-way ANOVA. The bars display p values corrected for multiple comparisons with Benjamini-Hochberg/FDR method (pFDR). The dashed line represents the significance threshold (pFDR = 0.05)
 

As presented in Graphic 4, the ANOVA points to statistically significant differences for several stats referring to language structure, the number of private enterprises and employment in them, the number of agricultural units and, last but not least, the extent of local tourism. We'll now investigate them in more detail.
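A side note on the pFDR values in Graphic 4: since many commune variables are tested at once, the raw p values have to be corrected for multiple comparisons. Below is a minimal Python sketch of the Benjamini-Hochberg adjustment; the raw p values are invented and the function is an illustration rather than the code used for the figure.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented raw ANOVA p values for four commune variables.
raw_p = [0.01, 0.04, 0.03, 0.5]
p_fdr = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in p_fdr]
print([round(p, 4) for p in p_fdr], significant)
```

In this toy case three raw p values are below 0.05, but only the smallest one stays under the pFDR = 0.05 threshold after correction.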
 

Graphic 5. Language structure of communes assigned to incidence Clusters 1, 2 and 3. Data were analyzed by one-way ANOVA with Tukey's post-hoc test.

A little piece of information for readers from beyond the Austrian or Italian Alps: South Tyrol is, due to its turbulent history, a multi-ethnic region with three main languages spoken: German (and dialects), Italian and Ladin. The most strikingly different Cluster when it comes to the distribution of those languages is the first one (Graphic 5). One can easily identify there a distinct bunch of communes with Ladin as the predominant language. Yes, it fits perfectly well with the geographic location shown in Graphic 3: the central part of the Dolomites with its local predominance of the Ladin ethnicity. Interestingly, Cluster 2 tends to have an increased frequency of Italian speakers, in line with its location in the capital region with the mostly Italian-speaking Bozen/Bolzano. The least COVID-19 affected Cluster 3, in contrast, tends to be highly dominated by German speakers, which, again, fits well with its peripheral geographic location.
 

Graphic 6. Number of private companies, employment in the private economy sector and number of agricultural units in communes assigned to incidence Clusters 1, 2 and 3. Data were analyzed by one-way ANOVA with Tukey's post-hoc test.

Among the economy stats analyzed, the number of private enterprises and the number of their employees were found to differ significantly between the clusters (Graphic 6AB). Both figures peaked in Cluster 1, struck predominantly by the spring disease outbreak. As a careful reader, you may notice two details. First, Cluster 1 contains a few municipalities with an extreme number of, probably tiny, firms: say 1.5 to 3 enterprises per 10 inhabitants, including babies, infants and the elderly! Second, there are some towns there with employment in the private sector far above 50% or even 100%, again counted as a percentage of the total and not of the professionally active population. Taken together, this suggests an atomized structure of the local economy with a large contribution of commuting and seasonal workers - quite characteristic of touristic destinations!

At the very other end of the scale, the number of farms was found to be the hallmark of the 'corona-protected' Cluster 3, especially when compared with the 'urban' second wave-prone Cluster 2. This confirms my previous assumption that the slowest SARS-Cov2 spread takes place in the peripheral agricultural regions of South Tyrol.


 
Graphic 7. Tourism stats for communes assigned to incidence Clusters 1, 2 and 3. Data were analyzed by one-way ANOVA with Tukey's post-hoc test.

A more detailed look at the tourism stats (Graphic 7) corroborates our observations for Cluster 1: it includes communes with up to 2 hotels per 10 inhabitants and some 2 to 6 hotel beds per capita, characteristic not of 'slow' or 'near-to-nature' tourism but of a real tourism-oriented industry! Similarly, the figures for total, summer and winter hotel guests reach their maxima in Cluster 1 (mind the log10 scale!) and are drastically higher than in the second-wave Cluster 2, which turns out to be the least 'guest-oriented'.
 
In summary, the first SARS-Cov2 wave in South Tyrol was, as in the neighboring land to the north, driven by tourism with an extremely high turnover of guests and employees of the hotels, restaurants and ski facilities. It also means that the slow kinetics of the fall/winter outbreak in the communes that suffered from the spring wave is most probably not the effect of some protective immunity but rather of the almost complete shutdown of tourism in Europe due to travel restrictions and fear among potential guests. Well, this piece of data should be taken into consideration when opening ski resorts right now, in the middle of the winter COVID-19 season, as has happened in Austria. Switching on the tourism machinery, even with tough hygienic restrictions and for locals only, is like lighting the fuse of a heavy bomb - with exploding incidence counts at the end of the day!

At the moment, the fall/winter SARS-Cov2 outbreak in South Tyrol seems to be fueled by population density and unavoidable professional commuting in and out of the agglomeration regions and, to a lesser extent, by local spread in small, closed communities. Well, it pertains to the issue raised many times in my blog: reducing the intensity and frequency of contacts and social distancing in everyday situations is a 100% proven measure to brake the spread of the bug. For local and non-local authorities: favoring home office and remote working will do the same. At least till we have a widely available vaccination - more on that in the next post! Stay healthy!
