Barcelona Trees' import

This Jupyter notebook contains the script for importing Barcelona's trees into OSM together with the documentation of the whole process, so that the process, the results and the decisions taken can all be reviewed in a single file.

The goal is to manually merge and import all the trees' information provided by Barcelona City Council, while testing the scripts for data preparation.

Data Sources

Two datasets provided by Barcelona City Council will be used:

  • Arbrat viari: Names of the species and geolocation of the trees of the city of Barcelona located on public roads. The information includes, among other data, the scientific name, the common name, the height, the direction and the width of the sidewalk. Trees in parks are not included. Coordinates are expressed in the ETRS89 reference system. This dataset complements the Zone trees of the city of Barcelona dataset. Historical resources, containing the data available up to the last week of each term, are also published. Resources are ordered by year and term, both of which appear in the resource name.
  • Arbrat zona: Names of the species and geolocation of the trees of the city of Barcelona located on public roads. The information includes, among other data, the scientific name, the common name, the height, the direction and the width of the sidewalk. Trees in parks are not included. Coordinates are expressed in the ETRS89 reference system. This dataset complements the Street trees of the city of Barcelona dataset. Historical resources, containing the data available up to the last week of each term, are also published. Resources are ordered by year and term, both of which appear in the resource name.

License

We have express authorization from Barcelona City Council to reuse the open data published on its Open Government website.

Import type

This import will be done manually, using JOSM to edit the data. Consider using Task Manager.

Data preparations

All data preparations will be made automatically in this notebook.

import pandas as pd
import geopandas as gpd
from osmi_helpers import data_gathering as osmi_dg

# Define Data Sources

ARBRAT_VIARI_URL = "https://opendata-ajuntament.barcelona.cat/data/dataset/27b3f8a7-e536-4eea-b025-ce094817b2bd/resource/28034af4-b636-48e7-b3df-fa1c422e6287/download"
ARBRAT_ZONA_URL = "https://opendata-ajuntament.barcelona.cat/data/dataset/9b525e1d-13b8-48f1-abf6-f5cd03baa1dd/resource/8f2402dd-72dc-4b07-8145-e3f75004b0de/download"

CSV_PARSER = 'fields_mapping.csv'

Fields' mapping.

# Read CSV file with fields' mapping and description.
fields_mapping = pd.read_csv(CSV_PARSER)

# Display table.
fields_mapping
    Original field      Description                                        OSM tagging    Comments
0   CODI                Internal ID                                        source:pkey    Primary Key, tagging as proposed in Osmsync's ...
1   X_ETRS89            X coordinates, ETRS89 format                       NaN            Not imported
2   Y_ETRS89            Y coordinates, ETRS89 format                       NaN            Not imported
3   LATITUD_WGS84       Latitude coordinates, WGS84 format                 lat            Geometry information. No tagging will be used.
4   LONGITUD_WGS84      Longitude coordinates, WGS84 format                lon            Geometry information. No tagging will be used.
5   TIPUS_ELEMENT       Object's type (viari/zona)                         NaN            Not imported
6   ESPAI VERD          Name of Green space where the tree is located      NaN            Not imported
7   ADRECA              Address                                            NaN            Not imported
8   ALCADA              Tree's height. It does not use meters, but cat...  height         height is calculated according to this field, ...
9   CAT_ESPECIE_ID      Species' ID                                        NaN            Not imported
10  NOM_CIENTIFIC       scientific name of the species (popularly know...  species        NaN
11  NOM_CASTELLA        Name in Spanish                                    species:es     NaN
12  NOM_CATALA          Name in Catalan                                    species:ca     NaN
13  CATEGORIA_ARBRAT    Tree's category. Internal classification accor...  circumference  Not directly imported, but used to calculate c...
14  AMPLADA_VORERA      Sidewalk's width                                   NaN            Not imported.
15  DATA_PLANTACIO      Date in which the tree was planted                 planted_date   NaN
16  TIPUS_AIGUA         Water type                                         NaN            Not imported.
17  TIPUS_REG           Watering mechanism                                 NaN            Not imported.
18  TIPUS_SUPERFICIE    Surface type                                       NaN            Not imported.
19  TIPUS_SUPORT        Support type                                       NaN            Not imported.
20  COBERTURA_ESCOCELL  Whether the Tree pit is covered or not             NaN            Not currently imported due to lack of specific...
21  MIDA_ESCOCELL       Tree pit Size                                      NaN            Not currently imported due to lack of specific...
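
To make the role of this table concrete, the following is a minimal sketch of how such a mapping could be applied to a raw dataframe: fields with an OSM tagging are kept and renamed, and the rest are dropped. The function apply_fields_mapping and the toy data are hypothetical and only illustrate the idea; this is not the code of osmi_helpers, whose csv_parser may behave differently.

```python
import pandas as pd


def apply_fields_mapping(df_raw: pd.DataFrame, mapping: pd.DataFrame) -> pd.DataFrame:
    """Keep only mapped columns and rename them to their OSM keys.

    Hypothetical helper for illustration; not the osmi_helpers implementation.
    """
    # Rows without an OSM tagging correspond to the "Not imported" fields.
    mapped = mapping.dropna(subset=["OSM tagging"])
    # Build {original column name: OSM key}, skipping fields absent from the data.
    rename = {
        src: tag
        for src, tag in zip(mapped["Original field"], mapped["OSM tagging"])
        if src in df_raw.columns
    }
    return df_raw[list(rename)].rename(columns=rename)


# Toy data shaped like the tables in this notebook.
mapping = pd.DataFrame({
    "Original field": ["CODI", "X_ETRS89", "LATITUD_WGS84", "LONGITUD_WGS84", "ALCADA"],
    "OSM tagging": ["source:pkey", None, "lat", "lon", "height"],
})
raw = pd.DataFrame({
    "CODI": ["0000025AR"],
    "X_ETRS89": [430270.562],
    "LATITUD_WGS84": [41.437287],
    "LONGITUD_WGS84": [2.165353],
    "ALCADA": ["PETITA"],
})

result = apply_fields_mapping(raw, mapping)
print(list(result.columns))
```

Skipping fields that are missing from the data keeps the sketch tolerant of small naming differences between the mapping file and the downloaded CSVs; a stricter version could raise an error instead.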

Data gathering

Run the code below to download the original data sources and load them into dataframes.

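# Download each dataset and load it into a dataframe.
df_aviari = pd.read_csv(ARBRAT_VIARI_URL)
df_azona = pd.read_csv(ARBRAT_ZONA_URL)

# Combine both data sources into a single dataframe with a fresh index,
# so that row labels are not repeated across the two datasets.
df_raw = pd.concat([df_aviari, df_azona], ignore_index=True)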

df_raw.head(10)
CODI X_ETRS89 Y_ETRS89 LATITUD_WGS84 LONGITUD_WGS84 TIPUS_ELEMENT ESPAI_VERD ADRECA ALCADA CAT_ESPECIE_ID ... CATEGORIA_ARBRAT AMPLADA_VORERA DATA_PLANTACIO TIPUS_AIGUA TIPUS_REG TIPUS_SUPERFICIE TIPUS_SUPORT COBERTURA_ESCOCELL MIDA_ESCOCELL VORA_ESCOCELL
0 0000022AR 430319.118 4587765.810 41.438442 2.165919 ARBRE VIARI Can Ensenya, C.V. (Fabra i Puig 439, Villalba ... Pg Fabra i Puig, 468 NaN 1104 ... NaN NaN NaN NaN MÀNEGA GESPA PARTERRE SENSE COBERTURA major que o igual a 100 cm VORA METÀL·LICA
1 0000025AR 430270.562 4587637.998 41.437287 2.165353 ARBRE VIARI Central de Nou Barris, Parc Pg Fabra i Puig, 450 PETITA 152 ... PRIMERA NaN 09/05/2017 NaN GOTEIG AVARIAT PAVIMENT ESCOCELL TRIANGULAR SENSE COBERTURA major que o igual a 100 cm ALTRES
2 0000028AR 430277.559 4587643.344 41.437335 2.165436 ARBRE VIARI Central de Nou Barris, Parc Pg Fabra i Puig, 450 PETITA 152 ... PRIMERA NaN 09/05/2017 NaN GOTEIG AVARIAT PAVIMENT ESCOCELL TRIANGULAR SENSE COBERTURA major que o igual a 100 cm ALTRES
3 0000386AR 430035.239 4587693.836 41.437769 2.162530 ARBRE VIARI Central de Nou Barris, Parc C\ Doctor Letamendi, 90 MITJANA 126 ... SEGONA NaN NaN NaN SENSE INFORMAR PAVIMENT ESCOCELL TRIANGULAR SENSE COBERTURA major que o igual a 100 cm VORA METÀL·LICA
4 0000387AR 430032.831 4587696.005 41.437788 2.162501 ARBRE VIARI Central de Nou Barris, Parc C\ Doctor Letamendi, 90 MITJANA 126 ... SEGONA NaN NaN NaN SENSE INFORMAR PAVIMENT ESCOCELL TRIANGULAR SENSE COBERTURA major que o igual a 100 cm VORA METÀL·LICA
5 0000388AR 430030.367 4587698.393 41.437810 2.162471 ARBRE VIARI Central de Nou Barris, Parc C\ Doctor Letamendi, 90 GRAN 126 ... SEGONA NaN NaN NaN SENSE INFORMAR PAVIMENT ESCOCELL TRIANGULAR SENSE COBERTURA major que o igual a 100 cm VORA METÀL·LICA
6 0000423AR 430250.886 4587703.209 41.437872 2.165110 ARBRE VIARI Central de Nou Barris, Parc Pg Fabra i Puig, 423 GRAN 108 ... EXEMPLAR NaN NaN NaN GOTEIG ALTRES ESCOCELL RECTANGULAR NaN NaN NaN
7 0001109AR 430196.862 4587543.015 41.436425 2.164482 ARBRE VIARI Central de Nou Barris, Parc Pg Fabra i Puig, 438 EXEMPLAR 152 ... TERCERA NaN NaN NaN GOTEIG AVARIAT PAVIMENT ESCOCELL TRIANGULAR SENSE COBERTURA major que o igual a 100 cm ALTRES
8 0001110AR 430198.143 4587545.750 41.436449 2.164497 ARBRE VIARI Central de Nou Barris, Parc Pg Fabra i Puig, 438 PETITA 2336 ... PRIMERA NaN 31/10/2019 NaN GOTEIG AVARIAT PAVIMENT ESCOCELL TRIANGULAR SENSE COBERTURA major que o igual a 100 cm ALTRES
9 0001111AR 430199.000 4587548.000 41.436470 2.164507 ARBRE VIARI Central de Nou Barris, Parc Pg Fabra i Puig, 450 MITJANA 152 ... SEGONA NaN 01/01/2008 NaN GOTEIG AVARIAT PAVIMENT ESCOCELL TRIANGULAR SENSE COBERTURA major que o igual a 100 cm ALTRES

10 rows × 23 columns
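
Since two datasets are combined here, it may be worth checking that no tree identifier appears twice before importing. The sketch below shows one way to do that with pandas. The two small frames are made-up stand-ins for df_aviari and df_azona, and the "ARBRE ZONA" value is an assumption used only for illustration; whether the real datasets overlap is not known from this notebook.

```python
import pandas as pd

# Toy frames standing in for df_aviari and df_azona (illustrative values only).
df_a = pd.DataFrame({
    "CODI": ["0000022AR", "0000025AR"],
    "TIPUS_ELEMENT": ["ARBRE VIARI", "ARBRE VIARI"],
})
df_b = pd.DataFrame({
    "CODI": ["0000025AR", "0000900AR"],
    "TIPUS_ELEMENT": ["ARBRE ZONA", "ARBRE ZONA"],  # assumed label
})

# ignore_index=True gives the combined frame a fresh 0..n-1 index.
combined = pd.concat([df_a, df_b], ignore_index=True)

# keep=False flags every row whose CODI appears more than once.
duplicated = combined[combined["CODI"].duplicated(keep=False)]
print(f"{len(combined)} rows, {duplicated['CODI'].nunique()} duplicated CODI value(s)")
```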

Data conversion

Run the cells below to convert the raw data into an OSM-friendly structure, following the field mappings defined in the CSV file referenced by the CSV_PARSER variable.

df_mapping = pd.read_csv(CSV_PARSER)

df_mapping
# Select and rename fields according to the CSV mapping.
df = osmi_dg.csv_parser(df_raw, CSV_PARSER)

# Calculate some fields.

# Create genus column
df['genus'] = df['species'].str.split().str[0]

# Convert columns into categories (R's factors)
# https://pandas.pydata.org/pandas-docs/stable/user_guide/categorical.html
# When converted to categories, export to Geojson does not work.
# df["species"] = df["species"].astype("category")
# df["species:ca"] = df["species:ca"].astype("category")

# TODO: Populate leaf_cycle column according to species
# https://wiki.openstreetmap.org/wiki/Key:leaf_cycle

# Calculate height, according to city council's guide:
# https://ajuntament.barcelona.cat/ecologiaurbana/sites/default/files/Plagestioarbratviaribcn_cat.pdf (p. 22)
df.loc[df.height == "PETITA", 'height'] = 5
df.loc[df.height == "MITJANA", 'height'] = 10
df.loc[df.height == "GRAN", 'height'] = 15
df.loc[df.height == "EXEMPLAR", 'height'] = 20


# Convert 'CATEGORIA' into circumference, according to city council's
# guide: https://ajuntament.barcelona.cat/ecologiaurbana/sites/default/files/Plagestioarbratviaribcn_cat.pdf (p. 19)
df.loc[df.circumference == "PRIMERA", 'circumference'] = 0.4
df.loc[df.circumference == "SEGONA", 'circumference'] = 0.8
df.loc[df.circumference == "TERCERA", 'circumference'] = 1.1
df.loc[df.circumference == "EXEMPLAR", 'circumference'] = 1.5

# TODO: Tag tree pits for accessibility purposes.

# Create a source column with "Opendata Ajuntament de Barcelona"
df['source'] = "Opendata Ajuntament de Barcelona"

df.head(10)
   source:pkey  lat        lon       height  species                          species:es      species:ca              circumference  planted_date  genus     source
0  0000022AR    41.438442  2.165919  NaN     Celtis australis                 Almez           Lledoner                NaN            NaN           Celtis    Opendata Ajuntament de Barcelona
1  0000025AR    41.437287  2.165353  5       Populus nigra 'Italica'          Chopo lombardo  Pollancre gavatx        0.4            09/05/2017    Populus   Opendata Ajuntament de Barcelona
2  0000028AR    41.437335  2.165436  5       Populus nigra 'Italica'          Chopo lombardo  Pollancre gavatx        0.4            09/05/2017    Populus   Opendata Ajuntament de Barcelona
3  0000386AR    41.437769  2.162530  10      Platanus x hispanica             Plátano         Plàtan                  0.8            NaN           Platanus  Opendata Ajuntament de Barcelona
4  0000387AR    41.437788  2.162501  10      Platanus x hispanica             Plátano         Plàtan                  0.8            NaN           Platanus  Opendata Ajuntament de Barcelona
5  0000388AR    41.437810  2.162471  15      Platanus x hispanica             Plátano         Plàtan                  0.8            NaN           Platanus  Opendata Ajuntament de Barcelona
6  0000423AR    41.437872  2.165110  15      Pinus pinea                      Pino piñonero   Pi pinyoner; pi pinyer  1.5            NaN           Pinus     Opendata Ajuntament de Barcelona
7  0001109AR    41.436425  2.164482  20      Populus nigra 'Italica'          Chopo lombardo  Pollancre gavatx        1.1            NaN           Populus   Opendata Ajuntament de Barcelona
8  0001110AR    41.436449  2.164497  5       Fraxinus angustifolia 'Raywood'  -               -                       0.4            31/10/2019    Fraxinus  Opendata Ajuntament de Barcelona
9  0001111AR    41.436470  2.164507  10      Populus nigra 'Italica'          Chopo lombardo  Pollancre gavatx        0.8            01/01/2008    Populus   Opendata Ajuntament de Barcelona
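
The category conversions above can also be written with lookup dictionaries, which keeps all values in one place and leaves the converted columns numeric. The sketch below is an alternative formulation rather than the notebook's own code: the dictionary names are made up, the values are copied from the conversion cell, and the date step assumes that DATA_PLANTACIO uses day/month/year (as 31/10/2019 suggests) and that ISO dates are wanted for planted_date.

```python
import pandas as pd

# Values copied from the conversion cell; the dictionary names are hypothetical.
HEIGHT_BY_CATEGORY = {"PETITA": 5, "MITJANA": 10, "GRAN": 15, "EXEMPLAR": 20}
CIRCUMFERENCE_BY_CATEGORY = {"PRIMERA": 0.4, "SEGONA": 0.8, "TERCERA": 1.1, "EXEMPLAR": 1.5}

# Toy rows shaped like the parsed dataframe.
df = pd.DataFrame({
    "height": ["PETITA", "GRAN", None],
    "circumference": ["PRIMERA", "EXEMPLAR", None],
    "planted_date": ["09/05/2017", None, "31/10/2019"],
})

# Series.map returns NaN for missing or unlisted categories.
df["height"] = df["height"].map(HEIGHT_BY_CATEGORY)
df["circumference"] = df["circumference"].map(CIRCUMFERENCE_BY_CATEGORY)

# Assumption: dates are day/month/year. Unparseable values become missing.
parsed = pd.to_datetime(df["planted_date"], format="%d/%m/%Y", errors="coerce")
df["planted_date"] = parsed.dt.strftime("%Y-%m-%d")

print(df)
```

Unlike in-place assignment with df.loc, unlisted category labels become missing values here instead of staying as text, so it is worth checking for unexpected NaN values after the conversion.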

Export clean data into a geojson

If the attributes above are correct, export them to a GeoJSON file that can be used in the Task Manager project.

# Convert dataframe into a GeoDataFrame. Coordinates are WGS84 lon/lat,
# so the CRS is set explicitly.
gdf_trees = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df.lon, df.lat),
    crs="EPSG:4326")


# Export to geojson.
gdf_trees.to_file("data/processed/bcn_trees.geojson", driver='GeoJSON')

# TODO: drop latitude and longitude fields.

The resulting GeoJSON file can be found at data/processed/bcn_trees.geojson in this repo.

TODOs:

  • Drop latitude and longitude fields from the GeoJSON, Issue #19 (help appreciated!)
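
As a starting point for this TODO, the sketch below shows the intended result using only the standard library: each tree becomes a GeoJSON Feature whose coordinates live in the geometry, with lat and lon removed from the properties. The function to_feature_collection is hypothetical and is not part of this repo. With geopandas, dropping the two columns before export, for example gdf_trees.drop(columns=["lat", "lon"]), should have the same effect, although that has not been tested against this notebook.

```python
import json


def to_feature_collection(records):
    """Build a GeoJSON FeatureCollection, keeping lat/lon only in the geometry.

    Hypothetical helper for illustration.
    """
    features = []
    for record in records:
        # Copy so that the caller's dictionaries are not modified.
        props = dict(record)
        lat = props.pop("lat")
        lon = props.pop("lon")
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": props,
        })
    return {"type": "FeatureCollection", "features": features}


# One row taken from the converted table in this notebook.
trees = [
    {"source:pkey": "0000025AR", "lat": 41.437287, "lon": 2.165353, "height": 5},
]
collection = to_feature_collection(trees)
print(json.dumps(collection, indent=2, ensure_ascii=False))
```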