-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GCS_MAVLink: respect txbuf flow control for FTP messages #26098
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -833,6 +833,15 @@ float GCS_MAVLINK::telemetry_radio_rssi() | |||||
return last_radio_status.rssi/254.0f; | ||||||
} | ||||||
|
||||||
bool GCS_MAVLINK::last_txbuf_is_greater(uint8_t txbuf_limit) | ||||||
{ | ||||||
if (AP_HAL::millis() - last_radio_status.received_ms > 5000) { | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Check for zero to make sure we do the correct thing if never received. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would a comment pointing out that the initialization of txbuf accomplishes this be OK instead? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, comment is OK too. Really the key thing is to be able to understand what is going on without having to go look in the |
||||||
// stale report | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
return true; | ||||||
} | ||||||
return last_radio_status.txbuf > txbuf_limit; | ||||||
} | ||||||
|
||||||
void GCS_MAVLINK::handle_radio_status(const mavlink_message_t &msg) | ||||||
{ | ||||||
mavlink_radio_t packet; | ||||||
|
@@ -849,7 +858,7 @@ void GCS_MAVLINK::handle_radio_status(const mavlink_message_t &msg) | |||||
last_radio_status.remrssi_ms = now; | ||||||
} | ||||||
|
||||||
last_txbuf = packet.txbuf; | ||||||
last_radio_status.txbuf = packet.txbuf; | ||||||
|
||||||
// use the state of the transmit buffer in the radio to | ||||||
// control the stream rate, giving us adaptive software | ||||||
|
@@ -1164,7 +1173,7 @@ uint16_t GCS_MAVLINK::get_reschedule_interval_ms(const deferred_message_bucket_t | |||||
interval_ms *= 4; | ||||||
} | ||||||
#if AP_MAVLINK_FTP_ENABLED | ||||||
if (AP_HAL::millis() - ftp.last_send_ms < 500) { | ||||||
if (AP_HAL::millis() - ftp.last_send_ms < 1000) { | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It seems odd that we would use 1 second here but 3 seconds to time out a FTP session. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've only looked at QGroundControl which has a 1 second ack/nak timeout with 3 retries. Does Mission Planner use a 3 second timeout? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was thinking of this timeout:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (our responses crossed here) OK. I think things should work OK with this changed to 3 seconds. The stream rates will just take 3 seconds to go back to normal instead of 1. Should we move the define to a header file and use FTP_SESSION_TIMEOUT instead of 500 or 1000 in GCS_Common.cpp? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Actually, after more consideration, I think there is little reason that this timeout should be the same as the session timeout. This is essentially a heuristic and should be based on the expected behavior of the GCS, including radio latency, but doesn't need to match the session timeout which really needs to be conservative (long) as discarding the open state too early would be bad. With the current GCS behavior, I think 1 second is fine. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My thinking is that if FTP thinks the session is still active then we should still lower its data rates. However a quick test shows the MAVProxy terminates the session after param download whereas QGC and MP do not. If we get the GCS to terminate correctly there should be no problem in using the longer time out here. |
||||||
// we are sending ftp replies | ||||||
interval_ms *= 4; | ||||||
} | ||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -97,12 +97,8 @@ void GCS_MAVLINK::handle_file_transfer_protocol(const mavlink_message_t &msg) { | |
|
||
bool GCS_MAVLINK::send_ftp_reply(const pending_ftp &reply) | ||
{ | ||
/* | ||
provide same banner we would give with old param download | ||
*/ | ||
if (ftp.need_banner_send_mask & (1U<<reply.chan)) { | ||
ftp.need_banner_send_mask &= ~(1U<<reply.chan); | ||
send_banner(); | ||
if (!last_txbuf_is_greater(33)) { // It helps avoid GCS timeout if this is less than the threshold where we slow down normal streams (<=49) | ||
return false; | ||
} | ||
WITH_SEMAPHORE(comm_chan_lock(reply.chan)); | ||
if (!HAVE_PAYLOAD_SPACE(chan, FILE_TRANSFER_PROTOCOL)) { | ||
|
@@ -121,11 +117,6 @@ bool GCS_MAVLINK::send_ftp_reply(const pending_ftp &reply) | |
reply.chan, | ||
0, reply.sysid, reply.compid, | ||
payload); | ||
if (reply.req_opcode == FTP_OP::TerminateSession) { | ||
ftp.last_send_ms = 0; | ||
} else { | ||
ftp.last_send_ms = AP_HAL::millis(); | ||
} | ||
return true; | ||
} | ||
|
||
|
@@ -155,9 +146,25 @@ void GCS_MAVLINK::ftp_error(struct pending_ftp &response, FTP_ERROR error) { | |
// send our response back out to the system | ||
void GCS_MAVLINK::ftp_push_replies(pending_ftp &reply) | ||
{ | ||
ftp.last_send_ms = AP_HAL::millis(); // Used to detect active FTP session | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why move this? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Because we want to detect the active FTP session as soon as possible in get_reschedule_interval_ms(), even if we are blocked from sending the first response because of txbuf flow control. This slows down normal streams sooner and allows the radio's buffer to drain more quickly and reduces the delay the GCS sees, which helps avoid GCS timout on the first FTP response. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, that makes sense. But its there now a risk of of FTP timing out if we sit here waiting to send for a while. Looks like the threshold is 3 seconds? Maybe we need a better way to check if there is FTP stuff happening than just looking at There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ultimately, since messages can be lost and the GCS may give up on FTP, testing for FTP activity will need to involve a timeout and I think this may be the simplest way. Where is the timeout 3 seconds? QGC seems to use 1 second timeouts. |
||
|
||
while (!send_ftp_reply(reply)) { | ||
hal.scheduler->delay(2); | ||
} | ||
|
||
if (reply.req_opcode == FTP_OP::TerminateSession) { | ||
ftp.last_send_ms = 0; | ||
} | ||
IamPete1 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
/* | ||
provide same banner we would give with old param download | ||
Do this after send_ftp_reply() to get the first FTP response out sooner | ||
on slow links to avoid GCS timeout. The slowdown of normal streams in | ||
get_reschedule_interval_ms() should help for subsequent responses. | ||
*/ | ||
if (ftp.need_banner_send_mask & (1U<<reply.chan)) { | ||
ftp.need_banner_send_mask &= ~(1U<<reply.chan); | ||
send_banner(); | ||
} | ||
} | ||
|
||
void GCS_MAVLINK::ftp_worker(void) { | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -75,7 +75,7 @@ GCS_MAVLINK::queued_param_send() | |
} | ||
count -= async_replies_sent_count; | ||
|
||
while (count && _queued_parameter != nullptr && get_last_txbuf() > 50) { | ||
while (count && _queued_parameter != nullptr && last_txbuf_is_greater(33)) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What would we call it? one_third? Is that really better? Anything less obvious and I would be annoyed when reading the code to have to look up the constant. I considered eliminating the parameter and hard code 33 in last_txbuf_is_greater(), but I wanted to allow for other threshold values, even though I doubt we will need them. But I'll make the change if you let me know what I should call it. |
||
char param_name[AP_MAX_NAME_SIZE]; | ||
_queued_parameter->copy_name_token(_queued_parameter_token, param_name, sizeof(param_name), true); | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the new timeout we don't need to init to 100 anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we add the check for 0 ms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems more efficient to initialize just once than to execute the extra check every time.