Merge flush-in-batch capability #40

BLuedtke · 2024-11-26T14:44:16Z

Closes #39

So far, when bidib_flush was called (bidib_transmission_send.c), the content of the send buffer would be written byte-by-byte. This results in one write system call per byte (see the bidib_serial_port_write function). However, write is perfectly capable of writing more than one byte at a time.

I therefore added a write_n ("write n bytes") analogue to the write call we have used so far. However, we cannot simply pass the buffer and the buffer size to write_n as-is, because escape bytes and the CRC (checksum) have to be inserted at the right places in the buffer. Various ways of dealing with this have been benchmarked (see the linked comment). The variant that uses an auxiliary, statically allocated buffer performed best.

The old byte-by-byte flushing is still present in case it is needed as a fallback (e.g., if the auxiliary buffer is too small, though I don't expect this to happen).
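To illustrate the idea, here is a minimal sketch of batch flushing through a static auxiliary buffer. It is not the code from this PR: all `demo_*` names, the buffer size, the delimiter/escape values, and the XOR-with-0x20 escaping are assumptions for illustration, and the checksum is a simple XOR placeholder standing in for the real CRC.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* All values and names below are assumptions for this sketch. */
#define DEMO_MAGIC    0xFEu  /* assumed frame delimiter */
#define DEMO_ESCAPE   0xFDu  /* assumed escape marker */
#define DEMO_AUX_SIZE 256u   /* hypothetical size of the auxiliary buffer */

/* Statically allocated auxiliary buffer that holds the fully framed data. */
static uint8_t demo_aux[DEMO_AUX_SIZE];

/* Placeholder checksum (XOR of all bytes); the real code uses a CRC. */
static uint8_t demo_checksum(const uint8_t *data, size_t len) {
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum ^= data[i];
    }
    return sum;
}

/* Appends one byte to the auxiliary buffer, escaping it if it collides
 * with a control byte. Returns -1 if the buffer would overflow. */
static int demo_put(size_t *pos, uint8_t byte) {
    int escape = (byte == DEMO_MAGIC || byte == DEMO_ESCAPE);
    if (*pos + (escape ? 2u : 1u) > DEMO_AUX_SIZE) {
        return -1;
    }
    if (escape) {
        demo_aux[(*pos)++] = DEMO_ESCAPE;
        byte ^= 0x20u;
    }
    demo_aux[(*pos)++] = byte;
    return 0;
}

/* Builds delimiter + escaped payload + escaped checksum + delimiter.
 * Returns the frame length, or -1 if the auxiliary buffer is too small. */
static ssize_t demo_build_frame(const uint8_t *payload, size_t len) {
    size_t pos = 0;
    demo_aux[pos++] = DEMO_MAGIC;
    for (size_t i = 0; i < len; i++) {
        if (demo_put(&pos, payload[i]) != 0) {
            return -1;
        }
    }
    if (demo_put(&pos, demo_checksum(payload, len)) != 0) {
        return -1;
    }
    if (pos + 1 > DEMO_AUX_SIZE) {
        return -1;
    }
    demo_aux[pos++] = DEMO_MAGIC;
    return (ssize_t)pos;
}

/* Writes the whole frame with as few write() calls as possible. */
static int demo_flush_batch(int fd, const uint8_t *payload, size_t len) {
    ssize_t total = demo_build_frame(payload, len);
    if (total < 0) {
        return -1; /* the caller could fall back to byte-by-byte here */
    }
    size_t written = 0;
    while (written < (size_t)total) {
        ssize_t n = write(fd, demo_aux + written, (size_t)total - written);
        if (n < 0) {
            return -1;
        }
        written += (size_t)n;
    }
    return 0;
}
```

The loop around write handles short writes, so in the common case a frame costs one system call instead of one per byte.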

In addition to the write changes, I also added logging for the case where processing a received packet takes longer than a certain threshold. This is not directly related, but it was due to be added anyway, IMO.

I want to re-run the physical tests a few times before merging, but I think the code is ready to review anyway.

BLuedtke · 2024-11-27T11:11:38Z

The physical test "drive with two trains at the same time" for swtbahn-full seems relatively fragile regarding occupancy detection/misses, especially if the train's wheels are somewhat dirty. As far as I can tell, these issues are not related to the flush-in-batch changes. -> Improve physical tests.

BLuedtke · 2024-11-27T14:32:34Z

The issue seems to go deeper: from time to time, the whole application locks up. I don't know if it's because of lock contention or something else. If anything, the trackstate rwlock seems to be the problem, which is not surprising as it has to be locked extremely often. However, I can't rule out external influences, e.g. from the Raspberry Pi. -> Not sure what is actually blocking.

BLuedtke · 2024-11-27T14:52:09Z

Trying to analyze it with gdb -> problem doesn't occur when running via gdb. WTF.

BLuedtke · 2024-11-28T16:33:08Z

I rewrote the guard mechanism for the trackstate members: there is now one mutex per trackstate member, except for points and signals, which are grouped together. Until now, we had one lock for the whole trackstate collection, which is very coarse. The change from a read-write lock to a mutex keeps things a bit simpler; with this more fine-grained access control, I don't think we need the parallel read capability for the trackstate anymore.
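As a rough illustration of the difference, the sketch below guards each group of members with its own mutex. The struct layout, field names, and accessor functions are hypothetical and much simpler than the real trackstate.

```c
#include <pthread.h>

/* Hypothetical, simplified track state: one mutex per member group
 * instead of one read-write lock for the whole structure. */
typedef struct {
    pthread_mutex_t points_signals_mutex; /* points and signals share a mutex */
    int point_aspect;
    int signal_aspect;

    pthread_mutex_t trains_mutex;         /* trains have their own mutex */
    int train_speed;
} demo_track_state_t;

static demo_track_state_t demo_track_state = {
    .points_signals_mutex = PTHREAD_MUTEX_INITIALIZER,
    .trains_mutex = PTHREAD_MUTEX_INITIALIZER,
};

/* Updating a train only blocks other train accesses,
 * not accesses to points or signals. */
static void demo_set_train_speed(int speed) {
    pthread_mutex_lock(&demo_track_state.trains_mutex);
    demo_track_state.train_speed = speed;
    pthread_mutex_unlock(&demo_track_state.trains_mutex);
}

static int demo_get_train_speed(void) {
    pthread_mutex_lock(&demo_track_state.trains_mutex);
    int speed = demo_track_state.train_speed;
    pthread_mutex_unlock(&demo_track_state.trains_mutex);
    return speed;
}

static void demo_set_point_aspect(int aspect) {
    pthread_mutex_lock(&demo_track_state.points_signals_mutex);
    demo_track_state.point_aspect = aspect;
    pthread_mutex_unlock(&demo_track_state.points_signals_mutex);
}

static int demo_get_point_aspect(void) {
    pthread_mutex_lock(&demo_track_state.points_signals_mutex);
    int aspect = demo_track_state.point_aspect;
    pthread_mutex_unlock(&demo_track_state.points_signals_mutex);
    return aspect;
}
```

If a code path ever needs two of these mutexes at once, they should always be acquired in the same order to avoid deadlocks.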

BLuedtke · 2024-12-03T10:29:59Z

I implemented a good part of the trivial requested changes and marked the corresponding conversations as resolved.

BLuedtke · 2024-12-03T10:37:19Z

One more change I would like to perform:
There are two read-write locks, one for bidib_boards and one for bidib_trains. As far as I can see, the collections they guard contain mostly static information parsed from the config files. However, they are defined in bidib_state.c, which is why the corresponding locks were called bidib_state_boards_rwlock and bidib_state_trains_rwlock, respectively.
The actual runtime state of entities that changes regularly is contained in bidib_track_state, also defined in bidib_state.c.

When I return to the codebase after a longer break, I'm sometimes confused by the naming - bidib_state_trains_rwlock sounds like it should guard the STATE of the trains, i.e., bidib_track_state.trains, but it guards bidib_trains. Therefore, I'd like to rename bidib_state_boards_rwlock to bidib_boards_rwlock and bidib_state_trains_rwlock to bidib_trains_rwlock. Might as well do it now, with all these changes to the locks/mutexes already done.
I will probably do that after all the previously requested changes have been addressed, to keep the timeline simpler.

BLuedtke · 2024-12-03T13:56:08Z

Tested the memory-related changes with Valgrind, fixed leaks accordingly.

eyip002 · 2024-12-03T20:10:05Z

@BLuedtke What SWTbahn platform did you test the code on? On the SWTbahn Lite, I remember the Linux GUI freezing every so often for a few seconds, but couldn't figure out why. This was happening even when I was only navigating through folders.

BLuedtke · 2024-12-04T10:42:25Z

> @BLuedtke What SWTbahn platform did you test the code on? On the SWTbahn Lite, I remember the Linux GUI freezing every so often for a few seconds, but couldn't figure out why. This was happening even when I was only navigating through folders.

So far, I've always been testing on the swtbahn-full.
I can't recall full UI freezes on the swtbahn-full, but I also suspect that the freeze/lockup we are seeing is related to the Raspberry Pi in one way or another. Perhaps it just overheats and throttles, perhaps it's unfortunate scheduling, or maybe it happens when the syslog gets synchronized/written to the SD card -> syslog blocks, and so on.
We already have Raspberry Pi 5s here, but we haven't had the time to switch the Pis. That also requires some setup adjustments, as for the Pi 5 I'm considering switching to the Raspberry Pi Ubuntu image instead of Raspbian. I should task Jochen with setting up the Pi 5 to also work with the screen for showing the IP address, the VNC server setup, etc.
It would still be good to know what exactly is happening, but maybe that would cost too much time. I guess we move forward with the rest of the changes and see what happens once we switch to a Pi 5.

BLuedtke · 2024-12-04T10:48:26Z

One more thing I could try is to switch to the realtime kernel and see if the freezes still occur.

BLuedtke · 2024-12-04T12:16:13Z

The heartbeat thread/log causes the tests to take a bit longer, because the thread might be in a sleep(2) when it is joined. I will see how that can be fixed.

BLuedtke · 2024-12-04T12:41:17Z

Heartbeat problem fixed by dividing the long sleep into several 0.1s ones and checking if bidib is still running.
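The pattern can be sketched roughly as follows. The names `demo_running` and `demo_sleep_while_running` are hypothetical, and the real code checks its own "bidib is running" state instead of this atomic flag.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical flag; cleared by whoever stops the library. */
static atomic_bool demo_running = true;

/* Sleeps for roughly total_ms milliseconds in 100 ms slices.
 * Returns false early if demo_running was cleared in the meantime,
 * so a joining thread waits at most about one slice. */
static bool demo_sleep_while_running(unsigned int total_ms) {
    const struct timespec slice = { .tv_sec = 0, .tv_nsec = 100L * 1000L * 1000L };
    for (unsigned int elapsed = 0; elapsed < total_ms; elapsed += 100) {
        if (!atomic_load(&demo_running)) {
            return false;
        }
        nanosleep(&slice, NULL);
    }
    return atomic_load(&demo_running);
}
```

A heartbeat loop would then call `demo_sleep_while_running(2000)` instead of `sleep(2)` and exit as soon as it returns false.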

The read-write locks I wrote about earlier have been renamed.

I think that's all for this pull request.

eyip002 · 2024-12-04T18:39:19Z

Changes look good

BLuedtke added 17 commits May 6, 2024 10:10

add benchmark time for flush and new write_n bytes

43dfa86

new batch flush variants

9f9a0f7

change flush standard to batching

845ab0a

adjusted physical test

55e36fe

further testsuite adjustments

27c30b1

fix track cov test case running forever

df63348

tuned receive thread reading

4cd55dc

log if receive-packet takes longer than 0.1s

2fdb0a3

handle receive benchmarking

615c8c0

more split packet time logging

39ce1d9

more time logging

09e752d

time logging

914ddb6

time logging ad.

1312b16

time logging adjustment

6290bf6

adjusted "took long" threshold for auto receive thread

11cddc7

stop logging every flush duration

9b5ba56

documentation improved for additions and removed commented out code

cd788cb

BLuedtke linked an issue Nov 26, 2024 that may be closed by this pull request

Flush in batch instead of byte-by-byte #39

Closed

BLuedtke self-assigned this Nov 26, 2024

BLuedtke added the enhancement label Nov 26, 2024

BLuedtke added 2 commits November 27, 2024 14:34

adjustments for debugging

8702c37

more debug stuff

3f79d82

BLuedtke added 3 commits November 28, 2024 16:09

rewrote trackstate access protection single lock to individ. mutexes

48e3d7e

check other locks, documentation, naming

6cf84d9

fix incorrect include

7d322b3

change heartbeat to be more visible for debugging.

bf24354

BLuedtke added 2 commits December 3, 2024 10:56

remove all "Old: ..." in function docs

0bdc375

Remove all old trackstate readwrite lock lines

b07b1b6

BLuedtke added 6 commits December 3, 2024 11:47

improved flush-batch to not require byte-per-byte fallback

054dd38

remove commented code with the dupl loop for train peripheral bits

b9f67c0

Fixed alloc and free for reverser_state->data state_id.

6dcd97f

where sensible, set pointer to NULL after free, and also make deep co…

1f08eb8

…pies for state.data.state_id members of various kinds

fix missing check if mem has to be freed and remove unnecessary setti…

e56b03f

…ng to NULL.

minor fix one memleak, heartbeat thread join, const.

c302abb

BLuedtke and others added 6 commits December 4, 2024 12:14

removed single-byte write and related functions

7187679

physical testsuite driveTo cleanup a little

bfcf8f1

cleanup heartbeat log, and remove ineffective field specifier in prin…

76a35f4

…ting time

adjusted log level on bidib_state_lc_stat and some minor adjustments …

98bec1b

…in bidib_state_setter overall

formatting/whitespace

e296092

Merge branch 'master' into 39-flush-in-batch

ba9680a

BLuedtke added 4 commits December 4, 2024 13:25

heartbeat log sleep partitioned into smaller sleeps, bidib_stop logs …

7f21700

…and order improved

remove ineffective format specifiers (%.9 has no effect on integers, …

0f784e6

…only on floats/reals)

read-write locks renamed to be less confusing

8ea9959

formatting/whitespace in bidib_state.c

c097281

eyip002 approved these changes Dec 4, 2024

View reviewed changes

eyip002 merged commit 1959bd1 into master Dec 4, 2024
1 check passed

eyip002 deleted the 39-flush-in-batch branch December 4, 2024 18:40
