Comunitat Valenciana pharmacies import

This Jupyter notebook contains the script for importing the pharmacies of the Comunitat Valenciana into OSM, together with the documentation of the whole process. Keeping both in a single file makes it easier to review the process, the results, and the decisions taken.

The goal is to manually merge and import all the pharmacy information provided by the Generalitat Valenciana, while testing the scripts for data preparation.

Data Sources

License

We have requested authorization because of the COVID-19 emergency. Data is released under CC

Import type

This import will be done manually, using JOSM to edit the data. Consider using Task Manager.

Data preparations

All data preparations will be made automatically in this notebook.

import numpy as np
import pandas as pd
import geopandas as gpd
import geopy
from osmi_helpers import data_gathering as osmi_dg

# Define Data Sources

DATA_RAW = 'data/interim/ListadoOficinasFarmacia_clean.csv'

CSV_PARSER = 'fields_mapping.csv'

Field mapping

# Read the CSV file with the field mapping and descriptions.
fields_mapping = pd.read_csv(CSV_PARSER)

# Display table.
fields_mapping
   Original field         Description   OSM tagging    Comments
0  ESTABLECIMIENTO        Internal ID   source:pkey    NaN
1  PROVINCIA              Province      addr:province  NaN
2  MUNICIPIO              Municipality  addr:city      NaN
3  TITULAR                NaN           operator       NaN
4  DIRECCIÓN              Address       addr:full      Full address, to be splitted into `adr:street`...
5  DEPARTAMENTO DE SALUD  NaN           NaN            Not imported
6  ZONA ZBS               NaN           NaN            Not imported
7  ZONA FARMACÉUTICA      NaN           NaN            Not imported

Data gathering

Run the code below to load the original data source and convert it into a dataframe.

# Read the source CSV file into a dataframe.
df_raw = pd.read_csv(DATA_RAW)

df_raw.head(10)
ESTABLECIMIENTO PROVINCIA MUNICIPIO TITULAR DIRECCIÓN DEPARTAMENTO DE SALUD ZONA ZBS ZONA FARMACÉUTICA
0 A-164-F ALICANTE Agost CARLOS GISBERT ARQUES PLAZA DE ESPAÑA, 19, 03698 19 8 25
1 A-285-F ALICANTE Agost FRANCISCO AYUSO MACIA Avinguda de Virgen de la Paz, 30, 03698 19 8 25
2 A-510-F ALICANTE Agres INMACULADA FERRERO PEREZ Carrer de San Antonio, 13, 03837 15 9 15
3 A-540-F ALICANTE Aigües JAVIER VILLAMAYOR PIÑAS Carrer de CANALEJAS, 12, 03569 17 6 22
4 A-13-F ALICANTE Albatera ISABEL BALSAMEDA MORALES Carrer de Ramon y Cajal, 6, 03340 21 1 35
5 A-14-F ALICANTE Albatera MARIA JOSE BALMASEDA DEL ALAMO, ARTURO CORBACH... Carretera de HONDON. EDIF.ROMEO, S/N, 03340 21 1 35
6 A-580-F ALICANTE Albatera MARIA DEL MAR GARCIA MOLINA Avinguda de Calvario, 39, 03340 21 1 35
7 A-814-F ALICANTE Albatera FRANCISCO GARCIA CANOVAS Carrer de Meson, 45, 03340 21 1 35
8 A-531-F ALICANTE Alcalalí ENCARNACION MOLL MENGUAL Carrer de Calvari, 7, 03728 13 1 1
9 A-748-F ALICANTE Alcalalí RAQUEL FERRER GONZALEZ Carrer de SALAMANCA, 19, 03723 13 8 1

Data conversion

Run the cell below to convert the raw data into an OSM-friendly structure, following the field mappings in the CSV file referenced by the CSV_PARSER variable.
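The internals of `osmi_dg.csv_parser` are not shown in this notebook. Purely as an illustration of the idea, selecting and renaming columns from a mapping table might look like the sketch below. The helper name `build_rename_map` and the rule that rows without an OSM tag are skipped are assumptions, not taken from `osmi_helpers`.

```python
def build_rename_map(mapping_rows):
    """Return {original field: OSM tag} for rows that define an OSM tag.

    Hypothetical helper, not part of osmi_helpers. `mapping_rows` is an
    iterable of dicts, e.g. fields_mapping.to_dict('records').
    """
    rename_map = {}
    for row in mapping_rows:
        tag = row.get('OSM tagging')
        # Missing tags arrive as NaN (a float) or empty strings: skip them.
        if isinstance(tag, str) and tag.strip():
            rename_map[row['Original field']] = tag.strip()
    return rename_map


# Small sample mirroring three rows of the mapping table.
rows = [
    {'Original field': 'ESTABLECIMIENTO', 'OSM tagging': 'source:pkey'},
    {'Original field': 'TITULAR', 'OSM tagging': 'operator'},
    {'Original field': 'ZONA ZBS', 'OSM tagging': float('nan')},
]
rename_map = build_rename_map(rows)
```

With pandas, such a map could then be applied as `df_raw[list(rename_map)].rename(columns=rename_map)`, which keeps only the mapped columns and renames them in one step.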

# Selects and renames fields according to CSV parser.
df = osmi_dg.csv_parser(df_raw, CSV_PARSER)

# Fix uppercase.
df['operator'] = df['operator'].str.title()
df['addr:province'] = df['addr:province'].str.title()
df['addr:full'] = df['addr:full'].str.title()

# Split the address into street, house number and postcode.
addr_parts = df['addr:full'].str.split(',', n=2, expand=True)
df['addr:street'] = addr_parts[0].str.strip()
df['addr:housenumber'] = addr_parts[1].str.strip().str.replace('S/N', '', regex=False)
df['addr:postcode'] = addr_parts[2].str.strip()

# Create some hardcoded fields.
df['source'] = "Opendata Generalitat Valenciana"
df['amenity'] = 'pharmacy'


df.head(10)
source:pkey addr:province addr:city operator addr:full addr:street addr:housenumber addr:postcode source amenity
0 A-164-F Alicante Agost Carlos Gisbert Arques PLAZA DE ESPAÑA, 19, 03698 PLAZA DE ESPAÑA 19 03698 Opendata Generalitat Valenciana pharmacy
1 A-285-F Alicante Agost Francisco Ayuso Macia Avinguda de Virgen de la Paz, 30, 03698 Avinguda de Virgen de la Paz 30 03698 Opendata Generalitat Valenciana pharmacy
2 A-510-F Alicante Agres Inmaculada Ferrero Perez Carrer de San Antonio, 13, 03837 Carrer de San Antonio 13 03837 Opendata Generalitat Valenciana pharmacy
3 A-540-F Alicante Aigües Javier Villamayor Piñas Carrer de CANALEJAS, 12, 03569 Carrer de CANALEJAS 12 03569 Opendata Generalitat Valenciana pharmacy
4 A-13-F Alicante Albatera Isabel Balsameda Morales Carrer de Ramon y Cajal, 6, 03340 Carrer de Ramon y Cajal 6 03340 Opendata Generalitat Valenciana pharmacy
5 A-14-F Alicante Albatera Maria Jose Balmaseda Del Alamo, Arturo Corbach... Carretera de HONDON. EDIF.ROMEO, S/N, 03340 Carretera de HONDON. EDIF.ROMEO 03340 Opendata Generalitat Valenciana pharmacy
6 A-580-F Alicante Albatera Maria Del Mar Garcia Molina Avinguda de Calvario, 39, 03340 Avinguda de Calvario 39 03340 Opendata Generalitat Valenciana pharmacy
7 A-814-F Alicante Albatera Francisco Garcia Canovas Carrer de Meson, 45, 03340 Carrer de Meson 45 03340 Opendata Generalitat Valenciana pharmacy
8 A-531-F Alicante Alcalalí Encarnacion Moll Mengual Carrer de Calvari, 7, 03728 Carrer de Calvari 7 03728 Opendata Generalitat Valenciana pharmacy
9 A-748-F Alicante Alcalalí Raquel Ferrer Gonzalez Carrer de SALAMANCA, 19, 03723 Carrer de SALAMANCA 19 03723 Opendata Generalitat Valenciana pharmacy
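Splitting on the first two commas assumes the street name itself never contains a comma. A slightly more defensive variant, sketched here in plain Python, splits from the right instead, so that the last two parts are always taken as house number and postcode. The function name `split_address` and the padding of short addresses with empty strings are assumptions for illustration.

```python
def split_address(full_address):
    """Split 'street, housenumber, postcode' into a 3-tuple.

    Hypothetical helper: splits from the right so commas inside the
    street name are preserved, and maps 'S/N' (no number) to ''.
    """
    parts = [part.strip() for part in full_address.rsplit(',', 2)]
    # Pad short addresses so the result always has three items.
    while len(parts) < 3:
        parts.append('')
    street, housenumber, postcode = parts
    if housenumber.upper() == 'S/N':
        housenumber = ''
    return street, housenumber, postcode


plain = split_address('PLAZA DE ESPAÑA, 19, 03698')
no_number = split_address('Carretera de HONDON. EDIF.ROMEO, S/N, 03340')
comma_in_street = split_address('Carrer Major, Edifici A, 7, 03728')
```

The function works on single strings, so in the notebook it could be applied row by row with `df['addr:full'].apply(split_address)`.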

Geocode dataframe

# Geocode
from geopy.geocoders import Photon
geolocator = Photon(timeout=10, user_agent = "myGeolocator")

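# Uncomment to test the geocoder on a small sample first.
#df = df.iloc[0:25, :]
df['addr_full'] = df['addr:street'] + ', ' + df['addr:city'] + ', ' + df['addr:province']
# Treat empty house numbers as missing values.
df['addr:housenumber'] = df['addr:housenumber'].replace(r'^\s*$', np.nan, regex=True)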


df['gcode'] = df.addr_full.apply(geolocator.geocode)

# Store rows that have not been geolocated.
df_not_found = df[df['gcode'].isnull()]

df_not_found
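Geocoding more than two thousand addresses in one `apply` call sends requests to a public service as fast as the loop can run. A minimal sketch of a wrapper that waits between requests and reuses results for repeated queries is shown below. The name `make_polite_geocoder` and the one-second default delay are assumptions. geopy also ships `geopy.extra.rate_limiter.RateLimiter`, which covers the throttling part.

```python
import time


def make_polite_geocoder(geocode, min_delay_seconds=1.0):
    """Wrap a geocode callable with a delay between calls and a cache.

    Hypothetical helper; `geocode` is any callable taking a query string,
    such as geolocator.geocode.
    """
    cache = {}
    last_call = None

    def polite_geocode(query):
        nonlocal last_call
        # Repeated addresses are answered from the cache (including misses).
        if query in cache:
            return cache[query]
        if last_call is not None:
            wait = min_delay_seconds - (time.monotonic() - last_call)
            if wait > 0:
                time.sleep(wait)
        result = geocode(query)
        last_call = time.monotonic()
        cache[query] = result
        return result

    return polite_geocode


# Demonstration with a stand-in geocoder, so no network access is needed.
calls = []


def fake_geocode(query):
    calls.append(query)
    return None if 'EDIF' in query else (38.4, -0.6)


polite = make_polite_geocoder(fake_geocode, min_delay_seconds=0)
queries = ['Carrer de Meson, Albatera, Alicante'] * 2
results = [polite(query) for query in queries]
```

In the notebook, the wrapper could replace the direct call, for example `df['gcode'] = df.addr_full.apply(make_polite_geocoder(geolocator.geocode))`.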


# Proceed with geolocated values only (copy to avoid modifying a view of df).
df_loc = df[df['gcode'].notna()].copy()

# Generate `lat` and `lon` columns with latitude and longitude values.
df_loc['lat'] = [g.latitude for g in df_loc.gcode]
df_loc['lon'] = [g.longitude for g in df_loc.gcode]

df_loc
source:pkey addr:province addr:city operator addr:full addr:street addr:housenumber addr:postcode source amenity addr_full gcode lat lon
0 A-164-F Alicante Agost Carlos Gisbert Arques PLAZA DE ESPAÑA, 19, 03698 PLAZA DE ESPAÑA 19 03698 Opendata Generalitat Valenciana pharmacy PLAZA DE ESPAÑA, Agost, Alicante (Alt de la Venta de Agost, 03698, Monforte del... 38.402291 -0.658807
1 A-285-F Alicante Agost Francisco Ayuso Macia Avinguda de Virgen de la Paz, 30, 03698 Avinguda de Virgen de la Paz 30 03698 Opendata Generalitat Valenciana pharmacy Avinguda de Virgen de la Paz, Agost, Alicante (avinguda Verge de la Pau, 03698, Agost, Valen... 38.437354 -0.638250
2 A-510-F Alicante Agres Inmaculada Ferrero Perez Carrer de San Antonio, 13, 03837 Carrer de San Antonio 13 03837 Opendata Generalitat Valenciana pharmacy Carrer de San Antonio, Agres, Alicante (Carrer San Antonio, 03730, Xàbia / Jávea, Val... 38.797296 0.181354
3 A-540-F Alicante Aigües Javier Villamayor Piñas Carrer de CANALEJAS, 12, 03569 Carrer de CANALEJAS 12 03569 Opendata Generalitat Valenciana pharmacy Carrer de CANALEJAS, Aigües, Alicante (Carrer de Canalejas, 03569, Aigües, Valencian... 38.500772 -0.364732
4 A-13-F Alicante Albatera Isabel Balsameda Morales Carrer de Ramon y Cajal, 6, 03340 Carrer de Ramon y Cajal 6 03340 Opendata Generalitat Valenciana pharmacy Carrer de Ramon y Cajal, Albatera, Alicante (carrer de Ramón y Cajal, 03698, Agost, Valenc... 38.437354 -0.638250
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2352 V-812-F Valencia Xirivella Julia Miranda Sanz, Carmen Miranda Sanz Avinguda de Del Cami Nou, 92, 46950 Avinguda de Del Cami Nou 92 46950 Opendata Generalitat Valenciana pharmacy Avinguda de Del Cami Nou, Xirivella, Valencia (Avinguda del Camí Nou, 46950, Xirivella, Vale... 39.463784 -0.435451
2353 V-887-F Valencia Xirivella Inmaculada Diaz Delgado Avinguda de Del Cami Nou, 195, 46950 Avinguda de Del Cami Nou 195 46950 Opendata Generalitat Valenciana pharmacy Avinguda de Del Cami Nou, Xirivella, Valencia (Avinguda del Camí Nou, 46950, Xirivella, Vale... 39.463784 -0.435451
2354 V-451-F Valencia Yátova Inmaculada Gil Pellin Carrer de Mayor, 42, 46367 Carrer de Mayor 42 46367 Opendata Generalitat Valenciana pharmacy Carrer de Mayor, Yátova, Valencia (Carrer de Yátova, 46183, l'Eliana, Valencian ... 39.565037 -0.539945
2355 V-1072-F Valencia La Yesa Mª Jose Monserrat Soria Carrer de Calvo Sotelo, 25, 46178 Carrer de Calvo Sotelo 25 46178 Opendata Generalitat Valenciana pharmacy Carrer de Calvo Sotelo, La Yesa, Valencia (Carrer de Calvo Sotelo, 46185, la Pobla de Va... 39.592464 -0.551801
2356 V-1018-F Valencia Zarra Maria Jesus Soler Landete Carrer de Camino del Lavadero, 1, 46621 Carrer de Camino del Lavadero 1 46621 Opendata Generalitat Valenciana pharmacy Carrer de Camino del Lavadero, Zarra, Valencia (Camino del Lavadero, 46621, Zarra, Valencian ... 39.090594 -1.073273

2179 rows × 14 columns
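Some rows in the output above were matched to a different municipality. For example, row 2 is in Agres (postcode 03837), but the geocoder returned a street in Xàbia / Jávea (postcode 03730). One possible sanity check, sketched below, compares the source postcode with the five-digit codes found in the geocoded address text. The helper name `postcode_matches` is hypothetical, and the check assumes that the geocoded address string contains a postcode, as it does in the rows shown.

```python
import re


def postcode_matches(expected_postcode, geocoded_address):
    """Return True if the expected postcode appears in the geocoded address.

    Hypothetical helper. Addresses without any five-digit code return
    False, so they are flagged for manual review as well.
    """
    found = re.findall(r'\b\d{5}\b', geocoded_address)
    return expected_postcode.strip() in found


agost_ok = postcode_matches('03698', 'avinguda Verge de la Pau, 03698, Agost')
agres_ok = postcode_matches('03837', 'Carrer San Antonio, 03730, Xàbia / Jávea')
```

Applied to the dataframe, this could look like `df_loc.apply(lambda row: postcode_matches(row['addr:postcode'], row['gcode'].address), axis=1)`, and rows where the result is False are candidates for manual review in JOSM.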

Now display all the rows that haven't been geolocated.

df_not_found
source:pkey addr:province addr:city operator addr:full addr:street addr:housenumber addr:postcode source amenity addr_full gcode
5 A-14-F Alicante Albatera Maria Jose Balmaseda Del Alamo, Arturo Corbach... Carretera de HONDON. EDIF.ROMEO, S/N, 03340 Carretera de HONDON. EDIF.ROMEO 03340 Opendata Generalitat Valenciana pharmacy Carretera de HONDON. EDIF.ROMEO, Albatera, Ali... None
13 A-122-F Alicante Alcoi / Alcoy Inés Llopis Boluda, Maria José Llopis Boluda Carrer de Na Saurina d’Entença, 45, 03803 Carrer de Na Saurina d’Entença 45 03803 Opendata Generalitat Valenciana pharmacy Carrer de Na Saurina d’Entença, Alcoi / Alcoy,... None
19 A-229-F Alicante Alcoi / Alcoy Miguel Domenech Lloret Avinguda de Alameda de Camilo Sesto, 53, 03803 Avinguda de Alameda de Camilo Sesto 53 03803 Opendata Generalitat Valenciana pharmacy Avinguda de Alameda de Camilo Sesto, Alcoi / A... None
20 A-231-F Alicante Alcoi / Alcoy María Carmen Azorín Navarro Avinguda de Juan Gil - Albert, 13, 03804 Avinguda de Juan Gil - Albert 13 03804 Opendata Generalitat Valenciana pharmacy Avinguda de Juan Gil - Albert, Alcoi / Alcoy, ... None
29 A-338-F Alicante Alcoi / Alcoy Juan Antonio Lopez Cobelo, Carlos Lopez Cobelo Carrer de Lluis Braille, 17, 03802 Carrer de Lluis Braille 17 03802 Opendata Generalitat Valenciana pharmacy Carrer de Lluis Braille, Alcoi / Alcoy, Alicante None
... ... ... ... ... ... ... ... ... ... ... ... ...
1960 V-216-F Valencia València Maria Rosario Vidal Blasco Carrer de Luis Garcia-Berlanga Marti (Director... Carrer de Luis Garcia-Berlanga Marti (Director... 5B 46023 Opendata Generalitat Valenciana pharmacy Carrer de Luis Garcia-Berlanga Marti (Director... None
2227 V-829-F Valencia València Jose Miguel Cavero Rausell, Ana Roda Segrelles Avinguda de Professor Lopez Piñero (Historiad... Avinguda de Professor Lopez Piñero (Historiad... 16 46013 Opendata Generalitat Valenciana pharmacy Avinguda de Professor Lopez Piñero (Historiad... None
2335 V-550-F Valencia Xàtiva Angel Bruno Dominguez Barbera Avinguda de Alameda Jaume I, 32, 46800 Avinguda de Alameda Jaume I 32 46800 Opendata Generalitat Valenciana pharmacy Avinguda de Alameda Jaume I, Xàtiva, Valencia None
2342 V-485-F Valencia Xeresa Joaquin Vicente Loras Lovaco Carrer de Dr Miguel Vivo, 34, 46790 Carrer de Dr Miguel Vivo 34 46790 Opendata Generalitat Valenciana pharmacy Carrer de Dr Miguel Vivo, Xeresa, Valencia None
2345 V-1246-F Valencia Xirivella Rafael Navarro Sanchez PLAZA Doctor Gerardo Garces, 10, 46950 PLAZA Doctor Gerardo Garces 10 46950 Opendata Generalitat Valenciana pharmacy PLAZA Doctor Gerardo Garces, Xirivella, Valencia None

178 rows × 12 columns

Export clean data

If the attributes above are correct, export them to CSV and GeoJSON files that can be used in the Task Manager project.

# Drop unnecessary fields.
df_loc = df_loc.drop(columns=['addr:full', 'addr_full', 'gcode'])


# Generate the CSV files.
df_loc.to_csv('data/processed/pharmacies_cval.csv', index = False)
df_not_found.to_csv('data/processed/pharmacies_not_found_cval.csv', index = False)

# Convert the dataframe into a GeoDataFrame (coordinates are WGS84 longitude/latitude).
gdf = gpd.GeoDataFrame(
    df_loc,
    geometry=gpd.points_from_xy(df_loc.lon, df_loc.lat),
    crs='EPSG:4326')

# Export to geojson.
gdf.to_file('data/processed/pharmacies_cval.geojson', driver='GeoJSON')

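As a result of this script, we get the following files, all of them stored in the data/processed folder:

  • pharmacies_cval.csv (geolocated pharmacies)
  • pharmacies_not_found_cval.csv (pharmacies that could not be geolocated)
  • pharmacies_cval.geojson (geolocated pharmacies as points)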

TODOs:

  • Drop the latitude and longitude fields from the GeoJSON output, Issue #19 (help appreciated!)
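One possible way to approach the TODO above is to keep the coordinates only in the geometry and leave them out of the feature properties. The sketch below shows the idea with the standard library only. The function name `rows_to_geojson` is hypothetical, and this is not necessarily how Issue #19 is meant to be resolved.

```python
import json


def rows_to_geojson(rows, lat_key='lat', lon_key='lon'):
    """Build a GeoJSON FeatureCollection from a list of dict rows.

    Hypothetical helper: coordinates go into the point geometry and are
    excluded from the properties of each feature.
    """
    features = []
    for row in rows:
        properties = {
            key: value for key, value in row.items()
            if key not in (lat_key, lon_key)
        }
        features.append({
            'type': 'Feature',
            # GeoJSON positions are ordered longitude, latitude.
            'geometry': {
                'type': 'Point',
                'coordinates': [row[lon_key], row[lat_key]],
            },
            'properties': properties,
        })
    return {'type': 'FeatureCollection', 'features': features}


rows = [{
    'source:pkey': 'A-164-F',
    'amenity': 'pharmacy',
    'lat': 38.402291,
    'lon': -0.658807,
}]
collection = rows_to_geojson(rows)
geojson_text = json.dumps(collection, ensure_ascii=False)
```

With geopandas, a similar result could probably be reached by dropping the two columns once the geometry exists, for example `gdf.drop(columns=['lat', 'lon'])` before calling `to_file`.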