1. Introduction
In this project we will cluster cities located in the proximity of DOCG areas in the Veneto region of Italy. To do so, we first need to get familiar with some key concepts.
We will start with a brief introduction to Italian wine classification and to Veneto, and then we will dive into how we will tackle this project.
A. Italian wine classification
Wine is produced in every region of Italy, which is home to some of the oldest wine-producing areas in the world. Italy is the world's largest producer of wine, with an area of 702,000 hectares under vineyard cultivation.
In 1963, the first official Italian system of classification of wines was launched. Since then, several modifications and additions to the legislation have been made. The last modification established four basic wine categories. The categories, from the bottom to the top level, are:
- Vino da Tavola: generic wines that are made either mostly from one kind of authorized 'international' grape variety or entirely from two or more of them.
- Vini IGT: wines produced in a specific territory within Italy. These must follow a series of specific and precise regulations on authorized varieties, viticultural and vinification practices, labeling instructions, etc.
- Vini DOP: this category includes two sub-categories:
  - Vini DOC: DOC wines generally come from smaller areas that are particularly well suited to winemaking thanks to their climatic and geological characteristics and to the quality and originality of local winemaking traditions.
  - Vini DOCG: in addition to fulfilling the requisites for DOC wines, DOCG wines must pass stricter analyses prior to commercialization. They must also have demonstrated superior commercial success.
DOCG wines are the top-level wines produced in Italy, so the areas where they are produced are generally popular tourist destinations.
B. Veneto
Veneto is a gem of a region in the northeast corner of Italy. Bound on the west by Lake Garda, on the north by the Dolomite Mountains and on the east by the Adriatic Sea, the landscape of the Veneto is rich and varied. From the grandeur of crumbly old Venice to the medieval flavor of Bassano del Grappa, and on to Belluno, a striking town that's a gateway for visiting the Dolomites, the Veneto makes a fascinating region to explore.
Veneto is one of the leading Italian regions in terms of both the quantity and the quality of its grape production. The wines produced in this region are famous throughout the world: Prosecco, Amarone, Recioto, Soave, Valpolicella and Bardolino are only a few of the names known at international level.
Veneto is also among the Italian regions with the highest number of DOCG areas.
C. Our objective
Now that we are familiar with DOCG areas and the Veneto region we can define our objective.
There are many beautiful cities located within or close to DOCG areas. These cities offer a wide variety of activities, venues and places of interest. If a person is interested in visiting the DOCG vineyards and staying in a nearby city, it would be great if they could choose a destination based on their preferred type of activities. That is the purpose of our project.
2. Data
A. DOCG geographical areas
The geographical information regarding the DOCG appellation areas is contained in an SHP file.
A SHP (shapefile) is a simple, nontopological format for storing the geometric location and attribute information of geographic features. Geographic features in a shapefile can be represented by points, lines, or polygons (areas). In Python, we can access the shapefile with the `pyshp` library. We then need to convert the file to GeoJSON format in order to use it with `folium` for plotting our maps.
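As a rough illustration of the conversion step, the sketch below turns rows made of attributes plus a `coords` list into a GeoJSON FeatureCollection. The helper names `read_shapefile_rows` and `rows_to_geojson` are hypothetical, the sample values are made up, and each shape is assumed to be a single polygon ring (multi-part shapes would need extra handling). The reader function relies on pyshp's `Reader`, `shapeRecords()` and `as_dict()`, and is only needed when working from an actual shapefile.

```python
import json


def read_shapefile_rows(path):
    """Read a shapefile into a list of dicts (requires the pyshp package)."""
    import shapefile  # provided by pyshp; imported lazily so the rest runs without it

    reader = shapefile.Reader(path)
    rows = []
    for shape_record in reader.shapeRecords():
        row = shape_record.record.as_dict()
        # Shape points are (x, y) pairs, i.e. (longitude, latitude).
        row["coords"] = [tuple(point) for point in shape_record.shape.points]
        rows.append(row)
    return rows


def rows_to_geojson(rows, geometry_key="coords"):
    """Build a GeoJSON FeatureCollection, treating each row as one polygon ring."""
    features = []
    for row in rows:
        ring = [list(point) for point in row[geometry_key]]
        if ring and ring[0] != ring[-1]:
            ring.append(ring[0])  # GeoJSON polygon rings must be closed
        properties = {k: v for k, v in row.items() if k != geometry_key}
        features.append({
            "type": "Feature",
            "properties": properties,
            "geometry": {"type": "Polygon", "coordinates": [ring]},
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up row, for illustration only.
sample_rows = [{
    "appellation": "EXAMPLE AREA",
    "code": "X000",
    "zone": "A",
    "coords": [(11.0, 45.0), (11.1, 45.0), (11.1, 45.1)],
}]
docg_geojson = rows_to_geojson(sample_rows)
print(json.dumps(docg_geojson)[:80])
```

The resulting dictionary can be written to disk with `json.dump` and handed to a mapping library that accepts GeoJSON.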
After having created the dataset we can see the first 5 rows:
| | appellation | code | zone | coords |
---|---|---|---|---|
0 | RECIOTO SOAVE CLASSICO | A021 | A | [(11.252029507865142, 45.41758433331832), (11.... |
1 | RECIOTO SOAVE | A021 | X | [(11.207064614961942, 45.4507371295929), (11.2... |
2 | BARDOLINO SUPERIORE CLASSICO | A025 | A | [(10.794778650134427, 45.518760038125784), (10... |
3 | BARDOLINO SUPERIORE | A025 | X | [(10.843049258063267, 45.43160165449561), (10.... |
4 | SOAVE SUPERIORE CLASSICO | A026 | A | [(11.252029507865142, 45.41758433331832), (11.... |
There are four features:
- appellation: the name of the DOCG
- code: the area code
- zone: the zone type
- coords: a list of (longitude, latitude) coordinate pairs
And here is the DOCG data plotted on a map.
IFrame('maps/docgs.html', width=1000, height=450)
B. Veneto municipalities geographical areas
We also obtained another SHP file containing the geographical coordinates of every municipality in the Veneto region. As with the previous shapefile, we can store the information in both a `DataFrame` and a GeoJSON file.
This is how the data looks:
| | Comune | Prov | CODISTAT | NOMCOM | PROVINCIA | AREA | PERIMETER | ID1 | coords |
---|---|---|---|---|---|---|---|---|---|
0 | 29033 | 29 | 29033 | Occhiobello | RO | 3.251909e+07 | 28900.15864 | 527 | [(11.574961743249316, 44.95070722627497), (11.... |
1 | 29025 | 29 | 29025 | Gaiba | RO | 1.206460e+07 | 18468.00608 | 526 | [(11.479809311494138, 44.97789519756071), (11.... |
2 | 29009 | 29 | 29009 | Canaro | RO | 3.266567e+07 | 33974.60289 | 525 | [(11.661722276937734, 44.97455175786645), (11.... |
3 | 29021 | 29 | 29021 | Ficarolo | RO | 1.796072e+07 | 21152.56640 | 524 | [(11.440782497382445, 44.98232147591316), (11.... |
4 | 29045 | 29 | 29045 | Stienta | RO | 2.408899e+07 | 24452.03201 | 523 | [(11.559372185120054, 44.98162314511416), (11.... |
We do not need to focus on any of the fields here besides NOMCOM, which is the name of the municipality, and coords, which we will need for plotting.
This is the comunes data plotted on a `folium` map.
IFrame('maps/comunes.html', width=1000, height=500)
This dataset will then need to be filtered to include only the cities relevant to our study.
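How the filtering is done is not detailed here. One possible approach, sketched below under stated assumptions, is to keep a municipality when its reference point lies inside a DOCG polygon or within some distance of one of the polygon's vertices. The function names, the 10 km threshold and the sample coordinates are illustrative and not taken from the project.

```python
from math import asin, cos, radians, sin, sqrt


def point_in_polygon(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the ring of (lon, lat) vertices?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray going east from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two (lon, lat) points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def is_near(lon, lat, ring, max_km=10.0):
    """True if the point is inside the ring or within max_km of one of its vertices."""
    if point_in_polygon(lon, lat, ring):
        return True
    return any(haversine_km(lon, lat, x, y) <= max_km for x, y in ring)


# Made-up square area and points, for illustration only.
area = [(11.0, 45.0), (11.2, 45.0), (11.2, 45.2), (11.0, 45.2)]
print(is_near(11.1, 45.1, area))  # a point inside the square
print(is_near(12.5, 46.5, area))  # a point far away from the square
```

Measuring distance to vertices only is a simplification: it works reasonably when boundaries are densely sampled, while a geometry library would give exact point-to-edge distances.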
C. Touristic cities
The data, stored in a CSV file, contains the number of tourists visiting each of Veneto's comunes in a given year. The data was collected from 2003 to 2013. Here we can see the last 5 rows:
| | year | comune | province | n_tourists |
---|---|---|---|---|
5572 | 2013 | Taglio di Po | ROVIGO | 4819.0 |
5573 | 2013 | Trecenta | ROVIGO | 482.0 |
5574 | 2013 | Villadose | ROVIGO | 584.0 |
5575 | 2013 | Villamarzana | ROVIGO | NaN |
5576 | 2013 | Porto Viro | ROVIGO | 2566.0 |
It contains information about:
- year: the relevant year
- comune: the name of the municipality
- province: the name of the province
- n_tourists: the number of tourists who visited the comune in the given year
For our analysis, we selected the top 32 cities in proximity of DOCG areas, ranked by total number of tourists. With the help of the Nominatim geolocator we found the latitude and longitude coordinates of all of these cities. The first five rows of our final dataset look like this:
comune | latitude | longitude |
---|---|---|
Abano Terme | 45.360314 | 11.789783 |
Asiago | 45.875377 | 11.510700 |
Bardolino | 45.547559 | 10.724215 |
Bassano del Grappa | 45.766911 | 11.734347 |
Brenzone | 45.707599 | 10.765873 |
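The ranking step can be sketched in plain Python as follows. The function name, the `candidates` set (comunes already known to be near a DOCG area) and the sample rows are assumptions made for illustration; missing counts (NaN) are simply skipped. Geocoding with Nominatim is not shown.

```python
import math
from collections import defaultdict


def top_comunes_by_tourists(rows, candidates, n=32):
    """Sum n_tourists per comune over all years and return the n largest."""
    totals = defaultdict(float)
    for row in rows:
        value = row["n_tourists"]
        if row["comune"] not in candidates or value is None or math.isnan(value):
            continue  # skip comunes outside the study area and missing counts
        totals[row["comune"]] += value
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:n]]


# Made-up rows in the shape of the tourism table, for illustration only.
rows = [
    {"year": 2012, "comune": "A", "province": "P", "n_tourists": 100.0},
    {"year": 2013, "comune": "A", "province": "P", "n_tourists": float("nan")},
    {"year": 2013, "comune": "B", "province": "P", "n_tourists": 250.0},
    {"year": 2013, "comune": "C", "province": "P", "n_tourists": 900.0},
]
print(top_comunes_by_tourists(rows, candidates={"A", "B"}, n=2))
```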
Here the selected cities are pinned to the map of the DOCG areas. You can hover over the map to see tooltips with the names of the comunes and of the DOCG appellations.
IFrame('maps/selected_cities.html', width=1000, height=450)
D. Foursquare API
We used the Foursquare API to find venues and places of interest in each of the locations.
The final result is a dataset containing information about venues within these municipalities. The search used a radius of 5 km around each location. Once again, we can have a look at the first five rows.
| | city | city_lat | city_lon | venue | venue_lat | venue_lon | category |
---|---|---|---|---|---|---|---|
0 | Abano Terme | 45.360314 | 11.789783 | L'ombra Che Conta | 45.361623 | 11.790219 | Trattoria/Osteria |
1 | Abano Terme | 45.360314 | 11.789783 | Abano Grand Hotel | 45.354321 | 11.785206 | Hotel |
2 | Abano Terme | 45.360314 | 11.789783 | Panoramic Hotel Plaza | 45.354413 | 11.783820 | Hotel |
3 | Abano Terme | 45.360314 | 11.789783 | Grand Hotel Trieste & Victoria | 45.352713 | 11.781310 | Hotel |
4 | Abano Terme | 45.360314 | 11.789783 | Parco Urbano Termale | 45.351798 | 11.783535 | Park |
The dataset contains the following fields:
- city: the municipality
- city_lat: the latitude of the municipality
- city_lon: the longitude of the municipality
- venue: the venue name
- venue_lat: the latitude of the venue
- venue_lon: the longitude of the venue
- category: the category of the venue
Our query returned a total of 1925 venues in the areas of interest. We can see the first five rows of the table containing the total number of venues found for each category.
category | venues |
---|---|
Accessories Store | 2 |
Agriturismo | 3 |
American Restaurant | 7 |
Argentinian Restaurant | 1 |
Art Gallery | 3 |
Now that all the datasets have been acquired, we are ready to start the analysis.
3. Methodology
This section represents the main component of our analysis. As a reminder, our purpose is to cluster cities that offer similar activities. As always, we start with some exploratory data analysis. In particular, our first question is whether these municipalities have venue categories in common.
We can plot the top 5 categories for each of the comunes.
We can see some categories in common amongst some of the municipalities. For example, there are places like Caorle or Cavallino-Treporti that have more maritime-related venues and activities. In another instance, Castelnuovo del Garda and Peschiera del Garda both have theme park attractions as their main category. We can also see that venues like restaurants are very popular in all of these locations.
We need to perform some transformations on our data before we can cluster the municipalities. We will approach the clustering problem with the k-means algorithm. k-means is a distance-based method that iteratively updates the location of k cluster centroids until convergence. The main user-defined "ingredients" of the k-means algorithm are the distance function (often Euclidean distance) and the number of clusters k. The latter needs to be set according to the application or problem domain.
In a nutshell, k-means groups the data by minimizing the sum of squared distances between the data points and their respective closest centroid. It is commonly used in problems involving spatial data.
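The exact transformations are not spelled out in this write-up. A common choice for this kind of data, assumed in the sketch below, is to describe each city by the share of its venues falling in each category, so that every city becomes a numeric vector of the same length. The function name and the sample venues are hypothetical.

```python
from collections import Counter


def category_frequencies(venues):
    """Return (cities, categories, matrix) with one frequency row per city."""
    counts = {}
    for venue in venues:
        counts.setdefault(venue["city"], Counter())[venue["category"]] += 1
    cities = sorted(counts)
    categories = sorted({cat for counter in counts.values() for cat in counter})
    matrix = []
    for city in cities:
        total = sum(counts[city].values())
        # A Counter returns 0 for categories the city does not have.
        matrix.append([counts[city][cat] / total for cat in categories])
    return cities, categories, matrix


# Made-up venues, for illustration only.
venues = [
    {"city": "Town A", "category": "Hotel"},
    {"city": "Town A", "category": "Hotel"},
    {"city": "Town A", "category": "Park"},
    {"city": "Town B", "category": "Beach"},
]
cities, categories, matrix = category_frequencies(venues)
for city, row in zip(cities, matrix):
    print(city, dict(zip(categories, row)))
```

Using shares rather than raw counts keeps cities with many listed venues from dominating the distance computation.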
In Python we can use the `KMeans` class from scikit-learn. We then analysed the inertia for different values of k and picked k = 5. Here we can see a plot of inertia values for different values of k.
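To make the role of inertia concrete, here is a small plain-Python version of the k-means loop (Lloyd's algorithm). It is a stand-in for illustration only, not scikit-learn's `KMeans`, which adds smarter initialisation and multiple restarts. The function names and the toy points are made up.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids, inertia)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its closest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: assignments no longer change
        labels = new_labels
        # Update step: move each centroid to the mean of its points.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    # Inertia: sum of squared distances from each point to its centroid.
    inertia = sum(
        squared_distance(p, centroids[label]) for p, label in zip(points, labels)
    )
    return labels, centroids, inertia


# Two well-separated toy groups, for illustration only.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
for k in range(1, 4):
    _, _, inertia = kmeans(points, k)
    print(k, round(inertia, 3))
```

Inertia generally keeps falling as k grows, so the usual heuristic is to look for the value of k after which the improvement flattens out.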
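After fitting our model we are able to assign a cluster to each municipality. The following are the first five rows of the municipalities with their assigned cluster. Note that the labels here start at 0, whereas the clusters in the Results section are numbered from 1 to 5.
| | city | cluster |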
---|---|---|
8 | Abano Terme | 2 |
29 | Asiago | 0 |
9 | Bardolino | 3 |
26 | Bassano del Grappa | 0 |
24 | Brenzone | 2 |
The code for the data acquisition and creating the model can be found in the exploration notebook, which is part of this repo. We are now ready to check the results of our clustering.
4. Results
All of our cities are now clustered into 5 different groups. As a first step we can visualize the different clusters.
IFrame('maps/cities_clustered.html', width=1000, height=550)
Our next step is to see what distinguishes these groups. This could be useful, for example, for recommending to tourists who want to visit the DOCG areas which cities to visit or stay in, based on the activities they would like to do there. We will check how the clusters were formed.
Grouping by cluster, we can see a normalized table with the share of each venue category in each of the clusters.
| | Accessories Store | Agriturismo | American Restaurant | Argentinian Restaurant | Art Gallery | Art Museum | Arts & Crafts Store | Asian Restaurant | Athletics & Sports | BBQ Joint | Bagel Shop | Bakery | Bar | Basketball Court | Basketball Stadium | Bay | Beach | Beach Bar | Bed & Breakfast | Beer Bar | Beer Garden | Bistro | Board Shop | Boarding House | Bookstore | Boutique | Bowling Alley | Brazilian Restaurant | Breakfast Spot | Brewery | Bridge | Buffet | Burger Joint | Cafeteria | Café | Campground | Canal | Castle | Cheese Shop | Chinese Restaurant | Chocolate Shop | Church | City | Clothing Store | Cocktail Bar | Coffee Shop | Comfort Food Restaurant | Concert Hall | Coworking Space | Creperie | Cupcake Shop | Deli / Bodega | Department Store | Dessert Shop | Diner | Discount Store | Dive Bar | Dive Spot | Donut Shop | Eastern European Restaurant | Electronics Store | Event Space | Fast Food Restaurant | Fish Market | Flea Market | Flower Shop | Food | Food & Drink Shop | Football Stadium | Fried Chicken Joint | Furniture / Home Store | Gaming Cafe | Garden | Garden Center | Gas Station | Gastropub | General Entertainment | German Restaurant | Golf Course | Gourmet Shop | Greek Restaurant | Grocery Store | Gym | Gym / Fitness Center | Gym Pool | Harbor / Marina | Hill | Historic Site | History Museum | Hobby Shop | Hockey Arena | Hot Spring | Hotel | Hotel Bar | Hotel Pool | Ice Cream Shop | Indian Restaurant | Italian Restaurant | Japanese Restaurant | Kids Store | Lake | Lighthouse | Liquor Store | Lounge | Market | Mediterranean Restaurant | Men's Store | Mexican Restaurant | Middle Eastern Restaurant | Monument / Landmark | Mountain | Movie Theater | Multiplex | Museum | Music Venue | Neighborhood | Nightclub | Noodle House | Nudist Beach | Opera House | Outdoors & Recreation | Outlet Store | Park | Pastry Shop | Pedestrian Plaza | Performing Arts Venue | Pharmacy | Piadineria | Pizza Place | Plaza | Pool | Pub | Public Art | Racetrack | Record Shop | Resort | Rest Area | Restaurant | River | Road | Rock Club | Sandwich Place | Scenic Lookout | Science Museum | Sculpture Garden | Seafood Restaurant | Shoe Store | Shop & Service | Shopping Mall | Shopping Plaza | Skating Rink | Ski Area | Snack Place | Soccer Field | Spa | Sporting Goods Shop | Stadium | Steakhouse | Supermarket | Sushi Restaurant | Tea Room | Tennis Court | Thai Restaurant | Theater | Theme Park | Theme Park Ride / Attraction | Toy / Game Store | Trail | Train Station | Trattoria/Osteria | University | Used Bookstore | Vacation Rental | Vegetarian / Vegan Restaurant | Veneto Restaurant | Video Game Store | Water Park | Waterfront | Wine Bar | Wine Shop | Winery | Women's Store | Zoo |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
cluster | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1 | 0.012241 | 0.011494 | 0.021419 | 0.01 | 0.01 | 0.01 | NaN | 0.012463 | 0.010000 | 0.057844 | 0.076923 | 0.016616 | 0.032521 | NaN | 0.025641 | NaN | 0.090000 | NaN | 0.014519 | 0.012463 | 0.012463 | 0.010000 | 0.01 | 0.01 | 0.013247 | 0.051948 | 0.011494 | 0.033248 | 0.014311 | 0.030553 | 0.015 | 0.025641 | 0.015633 | NaN | 0.079821 | 0.013772 | NaN | 0.017195 | 0.047619 | 0.014563 | 0.012987 | 0.020 | 0.025641 | 0.034926 | 0.035887 | 0.014975 | 0.01 | NaN | NaN | NaN | 0.027035 | NaN | 0.034412 | 0.019370 | 0.023018 | 0.019658 | NaN | NaN | NaN | 0.011494 | 0.025466 | 0.076923 | 0.023901 | NaN | 0.02439 | 0.01321 | 0.034542 | 0.016043 | NaN | 0.011494 | 0.032484 | 0.011494 | 0.015 | NaN | NaN | 0.030445 | 0.01 | NaN | 0.017755 | 0.014925 | 0.011494 | 0.016085 | 0.027854 | 0.018079 | 0.01 | NaN | 0.01 | 0.018772 | 0.023810 | 0.01 | 0.071429 | NaN | 0.059223 | 0.025974 | NaN | 0.026077 | 0.01 | 0.116428 | 0.026454 | 0.012821 | NaN | NaN | 0.014925 | 0.020967 | 0.01 | 0.014925 | 0.012987 | 0.01641 | NaN | 0.02 | NaN | 0.017463 | 0.01214 | 0.01813 | 0.01 | 0.010000 | 0.018430 | 0.012637 | NaN | NaN | 0.090909 | 0.018734 | 0.018445 | 0.01 | 0.01 | 0.012987 | 0.01 | 0.019658 | 0.105097 | 0.033375 | 0.013136 | 0.040340 | 0.025641 | 0.01 | 0.01 | 0.010000 | 0.013455 | 0.042192 | 0.050455 | 0.01094 | 0.017544 | 0.015705 | 0.013772 | 0.010000 | NaN | 0.041800 | 0.015949 | 0.012821 | 0.017024 | 0.012241 | 0.01 | 0.02381 | 0.015705 | 0.010498 | 0.028063 | 0.01593 | 0.012821 | 0.01527 | 0.031048 | 0.01112 | 0.01 | 0.017821 | NaN | 0.010000 | 0.011494 | 0.025641 | 0.014925 | 0.028810 | 0.030773 | 0.045753 | 0.01 | NaN | NaN | 0.01 | 0.017508 | 0.014925 | NaN | NaN | 0.024114 | 0.01 | 0.014832 | 0.012987 | 0.011494 |
2 | NaN | 0.018353 | 0.018353 | NaN | NaN | NaN | NaN | NaN | NaN | 0.020833 | NaN | NaN | NaN | NaN | NaN | NaN | 0.031746 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.060516 | 0.044643 | NaN | NaN | NaN | 0.020833 | NaN | NaN | NaN | NaN | 0.015873 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.041667 | 0.015873 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.020833 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.041667 | 0.052579 | NaN | NaN | 0.015873 | NaN | 0.149802 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.020833 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.047123 | NaN | 0.015873 | NaN | NaN | NaN | NaN | NaN | 0.020833 | 0.026290 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.055060 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.018353 | NaN | NaN | NaN | NaN | 0.018353 | 0.034226 | 0.283234 | NaN | NaN | NaN | 0.047619 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.015873 | NaN | NaN | NaN | NaN | NaN |
3 | NaN | NaN | 0.012937 | NaN | 0.02 | 0.01 | 0.01 | 0.012937 | 0.033333 | NaN | NaN | 0.030000 | 0.015000 | 0.02 | NaN | 0.01 | 0.043478 | NaN | NaN | NaN | NaN | NaN | NaN | 0.02 | NaN | NaN | NaN | 0.010000 | 0.020000 | NaN | 0.010 | NaN | NaN | NaN | 0.047572 | 0.033333 | 0.01 | NaN | NaN | 0.017937 | 0.010000 | 0.020 | 0.033333 | NaN | 0.023370 | 0.020000 | NaN | 0.01 | NaN | NaN | 0.025000 | 0.01 | 0.010000 | 0.021958 | 0.045608 | NaN | NaN | 0.012937 | NaN | NaN | 0.020000 | NaN | NaN | NaN | NaN | NaN | 0.015291 | 0.015000 | 0.02 | 0.010000 | NaN | NaN | NaN | 0.010000 | NaN | 0.015873 | NaN | NaN | 0.019735 | 0.010000 | NaN | NaN | 0.020000 | 0.020000 | NaN | 0.046763 | NaN | 0.027937 | 0.012937 | 0.02 | NaN | 0.012937 | 0.227050 | NaN | 0.038333 | 0.029841 | NaN | 0.162131 | NaN | NaN | NaN | 0.01000 | NaN | 0.015873 | NaN | 0.010000 | NaN | NaN | 0.015873 | NaN | 0.01 | NaN | NaN | 0.02500 | NaN | NaN | 0.020873 | NaN | NaN | 0.01 | 0.012937 | NaN | 0.016468 | 0.01 | NaN | NaN | NaN | NaN | 0.103801 | 0.080000 | 0.010000 | 0.025291 | 0.020000 | NaN | NaN | 0.030317 | NaN | 0.061739 | NaN | NaN | 0.012937 | 0.010000 | 0.030000 | 0.012937 | 0.01 | 0.031739 | NaN | NaN | 0.010000 | NaN | NaN | NaN | 0.020000 | NaN | 0.010000 | NaN | NaN | NaN | NaN | NaN | NaN | 0.015873 | NaN | 0.010000 | NaN | NaN | NaN | 0.033333 | NaN | 0.028360 | NaN | 0.01 | 0.033333 | NaN | 0.010000 | NaN | NaN | NaN | 0.018624 | 0.02 | 0.015000 | NaN | NaN |
4 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.035208 | NaN | NaN | NaN | 0.035151 | 0.013611 | 0.010000 | NaN | 0.016022 | 0.037037 | NaN | NaN | NaN | NaN | 0.037037 | NaN | NaN | NaN | NaN | NaN | 0.010000 | 0.037037 | 0.037636 | 0.030814 | NaN | 0.010000 | 0.037037 | NaN | NaN | NaN | 0.022045 | NaN | 0.025208 | NaN | NaN | NaN | 0.037037 | 0.01 | NaN | NaN | 0.037037 | NaN | 0.014419 | NaN | 0.01 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.020000 | 0.020000 | NaN | NaN | NaN | NaN | NaN | 0.037037 | NaN | 0.010000 | NaN | 0.01 | 0.016628 | NaN | NaN | NaN | NaN | 0.013611 | NaN | NaN | NaN | 0.010000 | NaN | NaN | NaN | 0.020000 | 0.125005 | NaN | 0.023256 | 0.037469 | NaN | 0.338304 | NaN | NaN | 0.023256 | NaN | NaN | NaN | NaN | 0.010000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.02000 | NaN | 0.037037 | 0.010000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.054743 | 0.010000 | 0.010000 | 0.020188 | NaN | NaN | NaN | 0.010000 | NaN | 0.043709 | NaN | NaN | NaN | NaN | 0.016818 | NaN | NaN | 0.012708 | NaN | 0.010000 | 0.010000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.013611 | NaN | 0.016944 | 0.010000 | NaN | 0.020818 | NaN | 0.038453 | NaN | NaN | NaN | NaN | NaN | NaN | 0.01 | NaN | 0.013333 | NaN | NaN | NaN | NaN |
5 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.031426 | NaN | NaN | NaN | 0.187273 | 0.024695 | NaN | NaN | 0.025000 | 0.024390 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.025000 | NaN | 0.036890 | 0.153846 | NaN | NaN | NaN | 0.025000 | NaN | 0.025 | 0.024390 | NaN | 0.031731 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.025000 | NaN | NaN | NaN | NaN | 0.02439 | NaN | NaN | NaN | NaN | 0.02439 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.024390 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.024695 | NaN | NaN | 0.049390 | NaN | 0.176313 | NaN | NaN | NaN | 0.02439 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.025000 | NaN | 0.025 | NaN | NaN | NaN | 0.024390 | NaN | NaN | NaN | NaN | NaN | 0.070341 | NaN | NaN | 0.024695 | NaN | NaN | NaN | 0.069887 | NaN | 0.088462 | NaN | NaN | NaN | NaN | 0.024390 | NaN | NaN | 0.082958 | NaN | NaN | 0.024390 | NaN | NaN | NaN | 0.031426 | NaN | NaN | NaN | NaN | NaN | 0.025000 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 0.024695 | NaN | NaN | NaN | NaN |
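A per-cluster summary of this kind can be sketched as below, assuming each city is described by a row of category shares and has a cluster label. This version averages over all cities in a cluster and counts absent categories as zero, which may differ from how the table above was computed (there, NaN marks categories that do not occur in a cluster). The function names and the toy numbers are hypothetical.

```python
def cluster_profiles(matrix, labels):
    """Average the per-city frequency rows within each cluster."""
    grouped = {}
    for row, label in zip(matrix, labels):
        grouped.setdefault(label, []).append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in grouped.items()
    }


def top_categories(profile, categories, n=3):
    """Return up to n categories with the highest non-zero mean share."""
    ranked = sorted(zip(categories, profile), key=lambda pair: pair[1], reverse=True)
    return [name for name, share in ranked[:n] if share > 0]


# Toy data: three cities, three categories, two clusters.
categories = ["Beach", "Hotel", "Theme Park"]
matrix = [[0.6, 0.4, 0.0], [0.8, 0.2, 0.0], [0.0, 0.3, 0.7]]
labels = [0, 0, 1]
profiles = cluster_profiles(matrix, labels)
for label in sorted(profiles):
    print(label, top_categories(profiles[label], categories))
```

Reading off the top categories per cluster is one way to arrive at short descriptions of what each cluster offers.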
Based on the venue categories we can differentiate and define the different clusters as:
- Cluster 1: This cluster contains the following cities: Asiago, Bassano del Grappa, Bussolengo, Eraclea, Iesolo, Mira, Mogliano Veneto, Noventa di Piave, Padova, San Giovanni Lupatoto, San Michele al Tagliamento, Treviso, Verona, Vicenza and Villafranca di Verona. It mainly includes venues or activities like spas, river walks and cocktail bars.
- Cluster 2: This is the smallest cluster and includes the cities of Castelnuovo del Garda and Peschiera del Garda. These locations are situated close to an amusement park and both feature theme park attractions.
- Cluster 3: Venice, Abano Terme, Brenzone, Garda, Montegrotto Terme and Preganziol form our third cluster. This cluster offers plenty of hotels and all sorts of gastronomic venues.
- Cluster 4: This cluster is almost entirely made up of towns in the proximity of Lake Garda. Bardolino, Costermano, Lazise, Quarto d'Altino, San Zeno di Montagna and Torri del Benaco form this cluster. The majority of the venues are wine and food related, with many restaurants and wine bars.
- Cluster 5: This cluster includes the maritime areas situated on the Gulf of Venice and offers beaches, seafood restaurants and resorts. The municipalities included are Caorle, Cavallino-Treporti and Chioggia.
5. Discussion
We are able to propose different destinations, based on the type of activities, to a person visiting the DOCG areas.
We can ultimately summarize the different clusters into what they would be best suited for in the following table.
Cluster | Cities forming the cluster | Best suited for |
---|---|---|
1 | Asiago, Bassano del Grappa, Bussolengo, Eraclea, Iesolo, Mira, Mogliano Veneto, Noventa di Piave, Padova, San Giovanni Lupatoto, San Michele al Tagliamento, Treviso, Verona, Vicenza, Villafranca di Verona | Relaxation destinations |
2 | Castelnuovo del Garda, Peschiera del Garda | Family destinations |
3 | Venice, Abano Terme, Brenzone, Garda, Montegrotto Terme, Preganziol | City experience |
4 | Bardolino, Costermano, Lazise, Quarto d'Altino, San Zeno di Montagna, Torri del Benaco | Gastronomical tour |
5 | Caorle, Cavallino-Treporti, Chioggia | Beach destination |
Finally, it is important to note that this classification is limited to the information retrieved through the Foursquare API. The number of venues taken into consideration is only a fraction of the actual number.
6. Conclusion
We analysed some beautiful cities located in the proximity of DOCG areas, and we are now able to pick a holiday destination based on our preferred activities. It is now up to you to decide which location suits you best in order to visit those wonderful areas made of outstanding wines and food.
I hope you enjoyed this journey in the land of wines. You can find the code and all the assets by following this link. The repo includes all of the files used in this project, including the datasets with the geographical data.
If you are not able to view the maps on GitHub, you can read the notebook by following this link.