Tools to import and mantain public transport data from opendata in differents formats to OpenStreetMap
- AMB (Autoritat Metropolitana de Barcelona) provides a GTFS dataset with the bus lines off the city of Barcelona
- TMB (Transports Metropolitans de Barcelona) provides a GTFS dataset with the bus and metro lines of the city of Barcelona
- FGC (Ferrocarrils de la Generalitat de Catalunya) provides a GTFS dataset with suburban train lines and metro lines that depend on the Catalonia Government
- Generalitat de Catalunya provide a propietary format dataset with interurban bus lines of Catalonia
Less structured data that can be found in other opendata portals
- Terrassa city council provide a shapefile and a data table with the city bus lines and schedule
- Sabadell city council provide a set of files provide a set of KML files with the bus lines of the city.
The GTFS files are processed with the gtfsdb to import to a database (postgreSQL + PostGIS). An example of the import command:
$ ./bin/gtfsdb-load --database_url postgresql://your-username:your-password@localhost:5432/amb --is_geospatial ../google_transit-amb.zip
It is recommended to create a different cluster with the different databases because the different gtfs files must be stores in different databases An example of the commands to set up a cluster and a PostGIS database:
$ sudo pg_createcluster -d /home/db/postgresql/clusters/gtfs -l /home/db/postgresql/logs/gtfs.log -p 5435 --start --start-conf auto 9.5 gtfs
$ sudo su postgres
$ createuser -p 5435 -P gtfsuser
$ createdb -p 5435 -E UTF8 -O gtfsuser amb
$ psql -p 5435 -d amb -c "CREATE EXTENSION postgis; CREATE EXTENSION postgis_topology; CREATE EXTENSION postgis_sfcgal;"
$ exit
Unfortunately not all GTFS files are 100% standard compliant. For example,
the TMB dataset do not provide a valid agency_id
in the routes.txt
file.
- Add
agnecy_id
to the routes
UPDATE routes SET agency_id = 1;
- AMB trips file do not provide a headsign. This headsign is necessary to create
the OSM route for the
to
andfrom
attributes text. The solution of This problem must have a manual check so a CSV file is used
Step 1: Run the select query (through a shell or with a client)
psql -h localhost -p 5432 -W -U your_username amb -f ../queries/select-itinerary.sql > itiniraris.csv
COPY (SELECT routes.route_short_name, routes.route_long_name, agency.agency_name,
CASE WHEN trips.direction_id = 0 THEN TRIM(split_part(routes.route_long_name, '-', 2), ' ')
ELSE TRIM(split_part(routes.route_long_name, '-', 1), ' ')
END AS hint, trips.direction_id
FROM routes
INNER JOIN agency ON routes.agency_id = agency.agency_id
INNER JOIN trips ON routes.route_id = trips.route_id
INNER JOIN patterns ON trips.shape_id = patterns.shape_id
GROUP BY patterns.geom, routes.route_short_name, routes.route_long_name, agency.agency_name, hint, trips.direction_id
ORDER BY routes.route_short_name, agency.agency_name, trips.direction_id)
TO STDOUT
DELIMITER ','
CSV;
Step 2: Check the CSV file and modify it with an editor (LibreOffice, Excel, gedit...)
Step 3: This step should be just a quary, but if you try to create a temporary table and a copy a csv from stdin in one script, the copy script consumes the the script instead of the input. So first create a table:
psql -h localhost -p 5432 -W -U gtfsuser amb -f ../queries/update-trip-headsign-1.sql
CREATE TABLE tmp_trips (route_id varchar(255), route_short_name varchar(255), route_long_name varchar(255), agency_nameas varchar(255), headsign varchar(255), direction int);
Step 4: Copy the data
cat itiniraris-update.csv | psql -h localhost -p 5432 -W -U gtfsuser amb -c "COPY tmp_trips FROM STDIN (FORMAT csv, DELIMITER ',', QUOTE '\"');"
Step 5: Update the trips table
psql -h localhost -p 5432 -W -U gtfsuser amb -f ../queries/update-trip-headsign-3.sql
UPDATE trips
SET trip_headsign = tmp_trips.headsign
FROM tmp_trips
WHERE trips.route_id = tmp_trips.route_id
AND trips.direction_id = tmp_trips.direction
;
DROP TABLE tmp_trips;
- AMB - SF2 - Only have one whilst there are three (mondays, saturdays and other days)