Advanced Python Course
Chalmers DAT690 / DIT516 / DAT516
2025
by Aarne Ranta & John J. Camilleri
2025-11-13: Added a better example and explanation to the time dictionary specification. The specification itself has not changed, but it is hopefully clearer now. If you already have your lab working well, you don’t need to do anything. But as several students have asked about it, I thought it would be useful to improve the explanation.
The purpose of Lab 1 is to read information from different formats and combine it into useful data structures. We will consider two different data formats:
- JSON, which is read with the json library,
- plain text, which is processed with indexing [0], slices [1:5], and standard string methods such as split(), strip(), join().
The data collected from these files is saved in a new JSON file,
tramnetwork.json, which is ready to be used in applications - including Labs 2 and 3.
The command python3 tramdata.py init produces this file.
The target data structures are dictionaries, which enable efficient queries about the data.
If run with the command python3 tramdata.py, the program will enable the following kind of dialogue:
$ python3 tramdata.py
> via Chalmers
['6', '7', '8', '10', '13']
> between Chalmers and Valand
['7', '10']
> time with 6 from Chalmers to Järntorget
10
> distance from Chalmers to Järntorget
1.628
These structures and queries are preparation for the later labs, where they are embedded in an object-oriented hierarchy (Lab 2) and used in the back-end of a web application (Lab 3).
Learning outcomes:
- math library
- the JSON data format and the json library
The task is to write three functions that build dictionaries, four functions that extract information from them, and a dialogue function that answers queries. The dialogue function should be divided into two parts to enable more accurate testing and debugging.
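build_tram_stops(jsonobject), building a stop dictionary, where
- keys are the names of tram stops,
- values are dictionaries with the latitude ('lat') and the longitude ('lon') of the stop.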
Here is a part of the stop dictionary, showing just one stop:
{
  'Majvallen': {
    'lat': 57.6909343,
    'lon': 11.9354935
  }
}
An input file in the expected format is tramstops.json.
The function involves an easy conversion using the json library.
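A minimal sketch of this conversion follows. It assumes, which is not stated above, that each entry of the input file stores the coordinates under a key "position" as a pair of strings [lat, lon], and that jsonobject is an open file. Check the actual contents of tramstops.json and adjust the key names accordingly.

```python
import io
import json

def build_tram_stops(jsonobject):
    """Build {stop: {'lat': float, 'lon': float}} from an open JSON file."""
    raw = json.load(jsonobject)
    stopdict = {}
    for name, info in raw.items():
        # ASSUMPTION: coordinates are stored as "position": [lat, lon] (strings)
        lat, lon = info["position"]
        stopdict[name] = {"lat": float(lat), "lon": float(lon)}
    return stopdict

# An in-memory file standing in for tramstops.json
sample = io.StringIO('{"Majvallen": {"position": ["57.6909343", "11.9354935"]}}')
print(build_tram_stops(sample))
```

With the real file, the call would be wrapped in a with open(...) block so that the file is closed after reading.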
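build_tram_lines(lines), building a line dictionary, where
- keys are line numbers (as strings),
- values are lists of stop names, in the order in which the line runs through them.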
Here is an example:
{
  "9": [
    "Angered Centrum",
    "Storås",
    "Hammarkullen",
    # many more stops in between
    "Sandarna",
    "Kungssten"
  ]
}
An input file in the expected format is
tramlines.txt.
It is a textual representation of timetables for each line, looking as
follows:
1:
Östra Sjukhuset 10:00
Tingvallsvägen 10:01
Kaggeledstorget 10:03
Ättehögsgatan 10:03
Thus, for each tram line, there is a section starting with the line number and a colon. After that, the stops are given together with times. For simplicity, each line starts from time 10:00. We are not interested in these times as such, but in the transition times between adjacent stops. Thus, for instance, the transition time between Tingvallsvägen and Kaggeledstorget is 2 minutes. We want to store the transition times in a non-redundant way, under the following assumptions:
- the transition time between two adjacent stops is the same in both directions,
- the transition time between two adjacent stops is the same on every line that connects them.
Hence, we don’t want to add transition times to the line dictionary, because this would lead to storing redundant information.
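Instead, from the file tramlines.txt, we also build a time dictionary which stores the times between adjacent stops, where
- keys are stop names,
- values are dictionaries from the names of adjacent stops to the transition time in minutes.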
Here is an example of a time dictionary entry:
{
  "Kaggeledstorget" : {
    "Tingvallsvägen": 2,
    "Ättehögsgatan": 0
  }
}
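To summarize, the general idea with these data structures and functions is to avoid redundancy: every piece of information is given only once in the dictionaries. In particular,
- the time between two adjacent stops is stored only once, not separately for each line that connects them,
- the time between two adjacent stops is stored in one direction only.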
Hint (not necessary to follow, but may be useful): A way to enforce the latter condition is to use alphabetical order: the time dictionary of Kaggeledstorget includes Tingvallsvägen, but not the other way round. When you then need to look up the time from Tingvallsvägen to Kaggeledstorget, you can find it by first looking up Kaggeledstorget.
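The parsing and the alphabetical-order hint can be sketched together as follows. This is only one possible approach: parse_timetable is a hypothetical helper (the lab asks for separate build functions), and it assumes that every stop row ends with a time in HH:MM format separated from the stop name by whitespace.

```python
def parse_timetable(text):
    """Return (linedict, timedict) parsed from timetable text."""
    linedict, timedict = {}, {}
    current = None  # line number of the section being read
    prev = None     # (stop name, minutes) of the previous row in this section
    for row in text.splitlines():
        row = row.strip()
        if not row:
            continue
        if row.endswith(":"):
            # Section header such as "1:" starts a new tram line
            current = row[:-1]
            linedict[current] = []
            prev = None
            continue
        # ASSUMPTION: the last whitespace-separated token is the time,
        # everything before it is the (possibly multi-word) stop name
        name, clock = row.rsplit(maxsplit=1)
        hours, minutes = clock.split(":")
        now = int(hours) * 60 + int(minutes)
        linedict[current].append(name)
        if prev is not None:
            # Store each pair once, under the alphabetically smaller name
            first, second = sorted([prev[0], name])
            timedict.setdefault(first, {}).setdefault(second, now - prev[1])
        prev = (name, now)
    return linedict, timedict

sample = """1:
Östra Sjukhuset 10:00
Tingvallsvägen 10:01
Kaggeledstorget 10:03
Ättehögsgatan 10:03
"""
linedict, timedict = parse_timetable(sample)
print(timedict["Kaggeledstorget"])
```

Note that sorted() compares strings by code point, so names starting with Ä or Ö sort after all unaccented capitals. This does not matter for the purpose here, since any consistent order works.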
Moreover, you should aim at the following:
build_tram_network(stopfile, linefile) puts everything together. It reads the two input files and writes a
third one, entitled tramnetwork.json.
This JSON file represents a dictionary that contains the three dictionaries built:
{
  "stops": {
    "Östra Sjukhuset": {
      "lat": 57.7224618,
      "lon": 12.0478166
    }, // and so on, the entire stop dict
  },
  "lines": {
    "1": [
      "Östra Sjukhuset",
      "Tingvallsvägen",
      // and so on, all stops on line 1
    ], // and so on, the entire line dict
  },
  "times": {
    "Tingvallsvägen": {
      "Kaggeledstorget": 2
    }, // and so on, the entire time dict
  }
}
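Writing the combined dictionary could look roughly like the sketch below. The helper name write_network and its parameters are hypothetical; in the lab, this step would be part of build_tram_network(), after the three dictionaries have been built.

```python
import json

def write_network(stopdict, linedict, timedict, path="tramnetwork.json"):
    """Write the three dictionaries into one JSON file."""
    network = {"stops": stopdict, "lines": linedict, "times": timedict}
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps names such as "Östra Sjukhuset" readable
        # in the file instead of escaping them as \u00d6...
        json.dump(network, f, indent=2, ensure_ascii=False)
```

Reading the file back with json.load() (opened with encoding="utf-8") gives a dictionary with the keys "stops", "lines", and "times".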
Each of the following functions uses one or more of the dictionaries you built.
lines_via_stop(linedict, stop) lists the lines that go via the given stop.
The lines should be sorted in their numeric order, that is, ‘2’ before ‘10’.
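A possible sketch of this function is shown below. It assumes that every line name is a string of digits, so that int can be used as the sort key; the small linedict is made-up sample data.

```python
def lines_via_stop(linedict, stop):
    """List the lines that pass through stop, in numeric order."""
    lines = [line for line, stops in linedict.items() if stop in stops]
    # ASSUMPTION: all line names are numeric strings, so int() never fails
    return sorted(lines, key=int)

# Made-up sample data
linedict = {"10": ["A", "B"], "2": ["B", "C"], "7": ["C", "D"]}
print(lines_via_stop(linedict, "B"))  # ['2', '10']
```

Sorting without key=int would compare the names as strings and put '10' before '2'.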
lines_between_stops(linedict, stop1, stop2) lists the lines that go from stop1 to stop2.
The lines should be sorted in their numeric order, that is, ‘2’ before ‘10’.
Notice that all lines are assumed to run in both directions.
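Since all lines run in both directions, the order of the two stops on a line does not matter, and it is enough to check that both are on it. A minimal sketch, again assuming numeric line names and using made-up sample data:

```python
def lines_between_stops(linedict, stop1, stop2):
    """List the lines that serve both stops, in numeric order."""
    lines = [line for line, stops in linedict.items()
             if stop1 in stops and stop2 in stops]
    # ASSUMPTION: all line names are numeric strings
    return sorted(lines, key=int)

# Made-up sample data
linedict = {"10": ["A", "B", "C"], "7": ["C", "B"], "3": ["A", "D"]}
print(lines_between_stops(linedict, "B", "C"))  # ['7', '10']
```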
time_between_stops(linedict, timedict, line, stop1, stop2) calculates the time from stop1 to stop2 along the given line. This is obtained as the sum of all times between adjacent stops. If the stops are not along the same line, an error message is printed.
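The summation can be sketched as follows. The helper adjacent_time is hypothetical; it covers the case where each pair of stops is stored in one direction only. Returning None after printing the error message is likewise just one possible design choice.

```python
def adjacent_time(timedict, a, b):
    """Time between adjacent stops a and b, whichever way it is stored."""
    if b in timedict.get(a, {}):
        return timedict[a][b]
    return timedict[b][a]

def time_between_stops(linedict, timedict, line, stop1, stop2):
    """Sum the transition times from stop1 to stop2 along the given line."""
    stops = linedict[line]
    if stop1 not in stops or stop2 not in stops:
        print(f"{stop1} and {stop2} are not both on line {line}")
        return None  # ASSUMPTION: None signals the error to the caller
    # Walk from the earlier position to the later one, in either direction
    i, j = sorted([stops.index(stop1), stops.index(stop2)])
    return sum(adjacent_time(timedict, stops[k], stops[k + 1]) for k in range(i, j))

linedict = {"1": ["Östra Sjukhuset", "Tingvallsvägen", "Kaggeledstorget", "Ättehögsgatan"]}
timedict = {
    "Tingvallsvägen": {"Östra Sjukhuset": 1},
    "Kaggeledstorget": {"Tingvallsvägen": 2, "Ättehögsgatan": 0},
}
print(time_between_stops(linedict, timedict, "1", "Ättehögsgatan", "Östra Sjukhuset"))  # 3
```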
distance_between_stops(stopdict, stop1, stop2) calculates the geographic distance between any two stops, based on their latitude and longitude.
The distance is hence not dependent on the tram lines.
You can implement this function by using the
Haversine library.
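As an alternative to a ready-made library, the haversine formula can be written directly with the math library. The sketch below assumes a mean Earth radius of 6371 km and rounds to three decimals, as the dialogue examples suggest; both choices are assumptions rather than requirements stated here.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # ASSUMPTION: mean Earth radius

def distance_between_stops(stopdict, stop1, stop2):
    """Great-circle distance in km between two stops (haversine formula)."""
    lat1 = radians(stopdict[stop1]["lat"])
    lon1 = radians(stopdict[stop1]["lon"])
    lat2 = radians(stopdict[stop2]["lat"])
    lon2 = radians(stopdict[stop2]["lon"])
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return round(2 * EARTH_RADIUS_KM * asin(sqrt(h)), 3)

# Coordinates taken from the stop dictionary examples in this document
stopdict = {
    "Majvallen": {"lat": 57.6909343, "lon": 11.9354935},
    "Östra Sjukhuset": {"lat": 57.7224618, "lon": 12.0478166},
}
print(distance_between_stops(stopdict, "Majvallen", "Östra Sjukhuset"))
```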
Testing will be addressed more systematically in Lab 2 and also be a part of it.
However, you can already train your hand at writing tests, because it is a great help in developing your code.
The file templates/test_tramdata.py already tests if all stops associated with lines in linedict also exist in stopdict.
You could try and add at least the following tests:
- that all tram lines listed in tramlines.txt are included in linedict,
- that the list of stops for each line is the same in tramlines.txt and linedict.
The dialogue(tramfile) function implements a dialogue about tram information.
It starts by reading the data from the JSON file tramnetwork.json,
which has been produced by your program.
Then it takes user input and answers any number of questions by using your query functions.
The following kinds of input are interpreted:
- via <stop>, answered by lines_via_stop()
- between <stop1> and <stop2>, answered by lines_between_stops()
- time with <line> from <stop1> to <stop2>, answered by time_between_stops()
- distance from <stop1> to <stop2>, answered by distance_between_stops()
- quit - terminating the program
The program prompts for each input with > .
The main challenge is to deal correctly with stop names that consist of more than one word.
A hint for this is to locate the positions of keywords such as “and”, which can appear between stop names, and consider slices starting or ending at them.
The simplest method is the standard index() method of strings.
The regular expression library re could also be used, but it is probably more complicated to learn unless you already know it.
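The keyword-and-slice idea can be sketched for the between query as follows. The helper parse_between is hypothetical, and it uses find(), the sibling of index() that returns -1 instead of raising an exception when the keyword is missing.

```python
def parse_between(query):
    """Return (stop1, stop2) for 'between <stop1> and <stop2>', else None."""
    prefix, keyword = "between ", " and "
    if not query.startswith(prefix):
        return None
    # Position of the keyword that separates the two stop names
    pos = query.find(keyword, len(prefix))
    if pos == -1:
        return None
    stop1 = query[len(prefix):pos]
    stop2 = query[pos + len(keyword):]
    return stop1, stop2

print(parse_between("between Sankt Sigfrids Plan and Botaniska Trädgården"))
# ('Sankt Sigfrids Plan', 'Botaniska Trädgården')
```

Searching for " and " with surrounding spaces avoids matching the letters "and" inside a stop name such as Valand. A stop name that itself contains the separate word "and" would still need special treatment.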
For the purpose of testing, and more generally to cleanly separate input and output from processing, the dialogue() function should be divided into two separate functions:
answer_query(tramdict, query), which takes the query string and returns the answer as a value (list or integer or float). You should decide how to handle queries that cannot be interpreted.
dialogue(tramfile) itself, which reads the file into a dictionary, loops by asking for input, and for each input prints the value returned by answer_query(tramdict, query), except for input quit (terminating the loop) and for uninterpreted input (asking the user to try again).
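The separation could be sketched as below. The name run_dialogue and the read and write parameters are hypothetical additions that make the loop testable without a keyboard. The sketch also assumes that answer_query() returns None for input it cannot interpret, which is only one of several possible conventions.

```python
def run_dialogue(tramdict, answer_query, read=input, write=print):
    """Ask for queries until 'quit', printing the answer to each one."""
    while True:
        query = read("> ").strip()
        if query == "quit":
            break
        answer = answer_query(tramdict, query)
        if answer is None:
            # ASSUMPTION: None means the query could not be interpreted
            write("sorry, try again")
        else:
            write(answer)

# Scripted session with a stub in place of the real answer_query()
def stub_answer(tramdict, query):
    return ["6", "7"] if query.startswith("via ") else None

inputs = iter(["via Chalmers", "hello", "quit"])
outputs = []
run_dialogue({}, stub_answer, read=lambda prompt: next(inputs), write=outputs.append)
print(outputs)  # [['6', '7'], 'sorry, try again']
```

In the lab, dialogue(tramfile) would first load the JSON file into a dictionary and then run a loop of this kind with the real answer_query().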
Testing a complete dialogue is tricky, but you can easily test the answer_query(tramdict, query) function.
What you should test is that the answer printed for a query
(in the format written by the user) is the same as the expected answer.
This then tests that queries are parsed and interpreted correctly.
There is already one example of this in test_tramdata.py, which you should extend with your own test cases.
Here are some more examples to get you started:
> via Botaniska Trädgården
['1', '2', '7', '8', '13']
> between Medicinaregatan and Saltholmen
['13']
> time with 5 from Munkebäckstorget to Sankt Sigfrids Plan
9
> distance from Temperaturgatan to Lackarebäck
10.092
Remember to also test negative examples, to ensure that your error handling works correctly, e.g.:
> between Medicinareberget and Saltholmen
unknown arguments
> distance between Chalmers and Ramberget
sorry, try again
At the end of your file, make a conditional call under
if __name__ == '__main__':
calling build_tram_network() if the argument init is present, dialogue() otherwise.
Hint: You can check the presence of this argument by using sys.argv:
if __name__ == '__main__':
    if sys.argv[1:] == ['init']:
        build_tram_network("tramstops.json", "tramlines.txt")
    else:
        dialogue("tramnetwork.json")
You also need to import sys.
You should submit the following files:
- tramdata.py
- tramnetwork.json (generated by your code with tramdata.py init)
The submitted code must be usable in the following ways:
- python3 tramdata.py init to produce the file tramnetwork.json
- python3 tramdata.py to start the query dialogue
- import tramdata from another Python file or the Python shell, without starting the dialogue or printing anything