WARNING! Huge technical text!
In my last blog entry I promised to start a nice metagroup for the community. I want to explain in detail why I decided to delay the start.
In its current state, the program saves all of its data to the local disk (well, in the case of a Raspberry Pi, the SD card), which seemed fine for a while.
I toyed with the idea of using an SQL server to store the group's data. I started sketching the database and stopped working on it at this stage:
On second thought, it seems a pretty poor design. However, it made me stick with local storage ... well, at least for the last few days. According to the simulations I ran, the group works nicely with files. At the time of the blog post, I planned to write the note style and the code that deals with disconnections, and then push the start button. As I started putting the pieces together, I thought about the program's file handling and figured I could make it a little better before the kick-off.
Currently, everything is saved in separate files. This leads to more than a thousand files in a season directory. Each pairing gets one file for each of the following:
- Pairings (content example: ["Undead","Dwarf"])
- Done (content: one-byte boolean)
- Attendances (content: two four-byte integers, one per race, counting how many times each race has played from the start of the round up to this pairing)
- Report (content: the JSON container of the match report)
Since there are 23 rounds with 12 pairings each for normal races, this leads to 1104 files. In addition, one file per round stores the attendances of races during the round, and another stores the start date and match ID of the round. This increases the number of files by 46. Moreover, another 11 files store the head-to-head results of both coaches and races, the races a coach played, the scores of coaches, the WDL of races, etc.
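The file counts above can be double-checked with a few lines of Python. This is only a sanity check of the arithmetic; the constant names are made up for the illustration, and the factor of four assumes that each pairing gets exactly the four files listed above.

```python
# All names here are hypothetical; the numbers come from the text above.
ROUNDS = 23
PAIRINGS_PER_ROUND = 12
FILES_PER_PAIRING = 4   # pairings, done, attendances, report
FILES_PER_ROUND = 2     # round attendances + start date/match ID
SEASON_WIDE_FILES = 11  # head-to-head results, scores, WDL, etc.

pairing_files = ROUNDS * PAIRINGS_PER_ROUND * FILES_PER_PAIRING
round_files = ROUNDS * FILES_PER_ROUND
total_files = pairing_files + round_files + SEASON_WIDE_FILES

print(pairing_files, round_files, total_files)
```

So a single season directory ends up with well over a thousand small files.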
Any developer should know that opening and closing a file costs time. Working with several thousand files cannot be called efficient on small hardware, even if it uses a memory card instead of a hard drive.
What I want instead is to keep the data in just a few files per season, each with a more complex structure, without losing flexibility. By flexibility I mean that replacing a tiny part of the structure should not require rewriting the whole file. How is that possible? Let's say I want to store the number of appearances of each race. I clearly need 24 integers for that. Integers can be stored in many ways, since larger possible values require more bytes. For this task, four-byte integers are more than enough, which makes my file 96 bytes in size. The nice part of this structure is that I can replace a single value by seeking to its four-byte slot and rewriting only those four bytes (or maybe fewer), which is a very tiny I/O operation.
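To make the idea concrete, here is a minimal sketch in Python using the standard `struct` module. It is not the code from my repository: the file name, the function names, and the choice of little-endian signed integers are all assumptions made for the illustration.

```python
import os
import struct
import tempfile

RACE_COUNT = 24
INT_FORMAT = "<i"                       # assumption: little-endian signed 4-byte int
INT_SIZE = struct.calcsize(INT_FORMAT)  # 4 bytes per counter

def create_counter_file(path):
    """Write RACE_COUNT zeroed counters: 24 * 4 = 96 bytes."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{RACE_COUNT}i", *([0] * RACE_COUNT)))

def write_counter(path, index, value):
    """Overwrite one counter in place without touching the rest of the file."""
    if not 0 <= index < RACE_COUNT:
        raise IndexError(index)
    with open(path, "r+b") as f:        # r+b: update in place, do not truncate
        f.seek(index * INT_SIZE)
        f.write(struct.pack(INT_FORMAT, value))

def read_counter(path, index):
    """Read back a single counter by seeking to its slot."""
    if not 0 <= index < RACE_COUNT:
        raise IndexError(index)
    with open(path, "rb") as f:
        f.seek(index * INT_SIZE)
        return struct.unpack(INT_FORMAT, f.read(INT_SIZE))[0]

# Usage: update the counter of race number 5 only.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "appearances.bin")  # hypothetical file name
    create_counter_file(path)
    write_counter(path, 5, 17)
    file_size = os.path.getsize(path)
    value = read_counter(path, 5)
    neighbour = read_counter(path, 4)
```

The file stays at 96 bytes no matter how often a counter is updated, and each update writes only four bytes.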
I do, however, want to allow more complex nested structures in my files without losing the ability to manage them in a nice and fast way. I started working on this from scratch. The current state of the repository is here, if anyone is interested. I put data of various types together in a row structure, including an array of arrays of integers (which is a matrix), and was satisfied. This would be great to develop and use(!) ... and I will continue its development shortly after I have released the metagroup.
I figured out that because the Pi would run 24/7, I should either go back to the SQL way (PostgreSQL, specifically) or buy a UPS to protect the system from power losses. PostgreSQL has a nice transaction mechanism which guarantees I won't end up messing up my entire database if a power loss hits in the middle of a running transaction.
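The all-or-nothing behaviour of a transaction can be sketched in a few lines. To keep the example runnable without a server, it uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL, and a raised exception stands in for the interruption; a real power loss is handled by the database's recovery on restart, which this sketch does not show. The table and column names are hypothetical, not my actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE matches (id INTEGER PRIMARY KEY, winner TEXT)")
conn.execute("CREATE TABLE standings (coach TEXT PRIMARY KEY, points INTEGER)")
conn.execute("INSERT INTO standings VALUES ('alice', 0)")
conn.commit()

def record_win(conn, match_id, winner, fail_midway=False):
    """Insert the match and update the standings as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO matches VALUES (?, ?)", (match_id, winner))
            if fail_midway:
                raise RuntimeError("simulated interruption between two writes")
            conn.execute(
                "UPDATE standings SET points = points + 3 WHERE coach = ?",
                (winner,),
            )
    except RuntimeError:
        pass  # the half-finished transaction has already been rolled back

def snapshot(conn):
    """Return (number of stored matches, alice's points)."""
    matches = conn.execute("SELECT COUNT(*) FROM matches").fetchone()[0]
    points = conn.execute(
        "SELECT points FROM standings WHERE coach = 'alice'"
    ).fetchone()[0]
    return matches, points

record_win(conn, 1, "alice", fail_midway=True)
after_failure = snapshot(conn)  # neither write is visible
record_win(conn, 1, "alice")
after_success = snapshot(conn)  # both writes are visible
```

With separate files, the same interruption could leave the match report written but the standings untouched, which is exactly the kind of mess I want to avoid.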
I have long-term plans for the file container project, though. Some of you may remember cheating in various games by editing save files with hex editors. You had to replace a specific part of the file with FF hex values, which gave you the maximum credits or whatever the game could provide. The author of such a guide reverse engineered the game's save file format by saving the game and searching for a value he knew. As I said, integers have many faces, so it is not an easy task. He probably had to make many saves with different values and search for the appropriate value in each one to pin down the location.
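The search described above can be sketched as follows. This is a toy illustration, not part of my package: the function names, the list of encodings tried, and the fake save files are all invented for the example.

```python
import struct

# A few common integer layouts to try (an illustrative, incomplete list).
ENCODINGS = {
    "uint16 little-endian": "<H",
    "uint16 big-endian": ">H",
    "uint32 little-endian": "<I",
    "uint32 big-endian": ">I",
}

def find_value(data, value):
    """Return {encoding name: [offsets]} for every place `value` appears."""
    hits = {}
    for name, fmt in ENCODINGS.items():
        try:
            needle = struct.pack(fmt, value)
        except struct.error:
            continue  # the value does not fit into this integer type
        offsets = []
        pos = data.find(needle)
        while pos != -1:
            offsets.append(pos)
            pos = data.find(needle, pos + 1)
        if offsets:
            hits[name] = offsets
    return hits

def stable_candidates(saves):
    """Keep only (encoding, offset) pairs that match in every save."""
    common = None
    for data, known_value in saves:
        found = {
            (name, offset)
            for name, offsets in find_value(data, known_value).items()
            for offset in offsets
        }
        common = found if common is None else common & found
    return common or set()

def fake_save(credits):
    """Made-up save format: 4-byte header, uint32 credits, 2 trailing bytes."""
    return b"SAVE" + struct.pack("<I", credits) + b"\x00\x07"

candidates = stable_candidates([(fake_save(1500), 1500), (fake_save(2300), 2300)])
```

Even in this tiny example the result stays ambiguous: the credits match both as a 16-bit and as a 32-bit little-endian integer at the same offset, so more saves with larger values would be needed to tell the two apart.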
I want to develop an open-source package which is not only a binary data file builder and manager, but also a reverse engineering library. The description of a file format could be used for I/O and file management. It will be human-readable and could be published, making the structure of a binary data file understandable and open knowledge. So my time is not wasted at all.
You guys, however, may still be waiting for the metagroup to start. Well, despite my earlier promise, I should rethink the database structure and migrate the data of the RRRC there. I really believe now that it will spare me a lot of headaches and maintenance work later on. I want this stuff to run with as little maintenance as possible.
Sorry for the delay!
Be patient, and play some Fantasy Football until the RRRC kicks off!
The development repository is here:
I split the database into separate schemas (layers). The public layer deals with the data on FUMMBL. Think of it as a mirror. The RRRC layer is for the metagroup. It gets notified of incoming matches in the public layer and updates the results and standings based on them.
You can view the structure of the database schemas by opening the DIA files in Dia Diagram Editor.