Improve stability under certain abnormal network conditions #2

Open
wants to merge 5 commits into base: master

Conversation

longtengmcu
Contributor

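1. Fix the socket file handle exhaustion caused by the MQTT client not releasing socket resources when the server closes the connection.
2. Tidy up the log output in a few places: normal messages use the LOG_I level and error messages use LOG_E.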
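
As a rough illustration of the first point, the sketch below shows one way to make socket release idempotent, so that every disconnect path can call it without leaking or double-closing a descriptor. The type `demo_conn_t` and the function `demo_conn_release` are hypothetical names for this example and are not taken from kawaii-mqtt.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical connection record for this sketch; not a kawaii-mqtt type. */
typedef struct {
    int fd; /* socket descriptor, or -1 when no socket is open */
} demo_conn_t;

/* Close the descriptor if one is open and mark the record as closed.
 * Safe to call repeatedly.
 * Returns 1 if a descriptor was closed by this call, 0 otherwise. */
static int demo_conn_release(demo_conn_t *conn)
{
    if (conn == NULL || conn->fd < 0) {
        return 0;
    }
    close(conn->fd);
    conn->fd = -1; /* prevents a second close() on a recycled descriptor */
    return 1;
}
```

Calling a function like this before each reconnect attempt means a server-side disconnect costs at most one descriptor at a time, instead of one per reconnect.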

@ueJone

ueJone commented Jan 15, 2021

    rc = mqtt_read_packet(c, &packet_type, timer);

    switch (packet_type) {
        case 0: /* timed out reading packet */
            /* When the server closes the MQTT connection, mqtt_read_packet returns
               immediately without waiting, so call mqtt_sleep_ms to yield the CPU
               (zhaoshimin, 2020-07-23). */
            mqtt_sleep_ms(100); /* sleep */
            break;
    }
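  1. Disconnection is now detected, and the socket is released and reconnected, but only after a noticeable delay. Why sleep instead of releasing the socket directly when recv returns 0?
  2. Also, the call chain for receiving data is very deep. Won't that hurt performance?
mqtt_packet_handle->mqtt_read_packet->network_read->platform_net_socket_recv_timeout->platform_net_socket_recv->recv()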

@longtengmcu
Contributor Author

longtengmcu commented Jan 15, 2021 via email

@ueJone

ueJone commented Jan 16, 2021

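Because MQTT has a thread that handles connecting to and disconnecting from the network

I see what you mean. Here is my understanding: thread switching cannot happen because, after the socket is disconnected, it is not released right away and recv() keeps being called. At that point recv no longer blocks and returns immediately, so the thread hogs the CPU. Now look at the code below: the disconnect case, nread == 0, is not handled. I think it would be better to release the socket as soon as the disconnect is detected, but I am not yet familiar with this library's logic. Could you advise whether the socket can be released here?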

int platform_net_socket_recv_timeout(int fd, unsigned char *buf, int len, int timeout)
{
   ...
   platform_net_socket_setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, (char *)&tv, sizeof(struct timeval));

    while (nleft > 0) {
        nread = platform_net_socket_recv(fd, ptr, nleft, 0);
        if (nread < 0) {
            return -1;
        } else if (nread == 0) {
            break;
        }

        nleft -= nread;
        ptr += nread;
    }
   ...
}
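
To make the question concrete, here is a minimal sketch of a read loop that reports a peer close separately from a timeout, written against plain POSIX sockets. It is not the library's code: the function `demo_recv_timeout` and the `DEMO_RECV_*` codes are names invented for this example, and it assumes a blocking stream socket and a non-zero timeout (with `SO_RCVTIMEO`, a zero timeout means wait forever).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical result codes for this sketch; not kawaii-mqtt constants. */
#define DEMO_RECV_ERROR       (-1) /* recv() failed for a reason other than timeout */
#define DEMO_RECV_PEER_CLOSED (-2) /* recv() returned 0 before any byte arrived */

/* Read up to len bytes, waiting at most timeout_ms in each recv() call.
 * Returns the number of bytes read (fewer than len after a timeout or a
 * close that follows partial data), DEMO_RECV_PEER_CLOSED if the peer closed
 * before any byte arrived, or DEMO_RECV_ERROR on any other failure. */
static int demo_recv_timeout(int fd, unsigned char *buf, int len, int timeout_ms)
{
    struct timeval tv;
    unsigned char *ptr = buf;
    int nleft = len;

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0) {
        return DEMO_RECV_ERROR;
    }

    while (nleft > 0) {
        ssize_t nread = recv(fd, ptr, (size_t)nleft, 0);
        if (nread > 0) {
            nleft -= (int)nread;
            ptr += nread;
        } else if (nread == 0) {
            /* Orderly shutdown by the peer. */
            if (nleft == len) {
                return DEMO_RECV_PEER_CLOSED;
            }
            break; /* hand back the partial data first */
        } else if (errno == EINTR) {
            continue; /* interrupted by a signal: retry */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            break; /* receive timeout expired */
        } else {
            return DEMO_RECV_ERROR;
        }
    }
    return len - nleft;
}
```

With a distinct code for the closed case, a caller could release the socket and reconnect at once rather than sleeping and polling again. Whether that fits this library's threading model is the open question in this thread.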

@longtengmcu
Contributor Author

longtengmcu commented Jan 16, 2021 via email

When the socket read returns 0 here, it means no data was read. Either the socket connection is fine and the MQTT server has not sent anything, or a network fault kept the server's data from reaching the MQTT client, so the socket must not be closed at this point. During normal communication, a lost connection has to be detected through the keepalive timeout. If the server actively closes the connection, the socket read returns -1 and the MQTT thread reconnects immediately. The way the MQTT library currently handles socket disconnects is already fairly reasonable and covers most applications. I do not recommend such a small optimization here, and without a deep understanding of the whole communication mechanism it is hard to improve on it.

@ueJone

ueJone commented Jan 16, 2021

A receive return value greater than 0 means data was received, 0 means the peer closed the connection, and any other failure returns -1 with the specific error given by errno. Once we know for certain that the socket has been closed, there is no need to wait for keepalive.
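
The return-value convention described in this comment matches POSIX `recv()` on a stream socket when a non-zero length is requested. A hedged sketch of how a caller might turn it into a decision follows; the enum `demo_action_t` and the function `demo_next_action` are hypothetical names for illustration and are not part of kawaii-mqtt.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical actions for this sketch; not kawaii-mqtt types. */
typedef enum {
    DEMO_ACT_PROCESS,  /* n > 0: data arrived, pass it to the packet parser */
    DEMO_ACT_WAIT,     /* timeout or interrupted call: the link may still be fine */
    DEMO_ACT_RECONNECT /* peer closed or hard error: release the socket, reconnect */
} demo_action_t;

/* Map the result of recv() (n) and the errno value captured right after the
 * call (err) to the next step.
 * Assumes a stream socket and a non-zero requested length. */
static demo_action_t demo_next_action(ssize_t n, int err)
{
    if (n > 0) {
        return DEMO_ACT_PROCESS;
    }
    if (n == 0) {
        return DEMO_ACT_RECONNECT; /* orderly shutdown by the peer */
    }
    if (err == EAGAIN || err == EWOULDBLOCK || err == EINTR) {
        return DEMO_ACT_WAIT; /* keepalive still covers a silent link failure */
    }
    return DEMO_ACT_RECONNECT; /* e.g. ECONNRESET */
}
```

Under this mapping, keepalive is only needed for the case where the link fails silently and recv() keeps timing out. An explicit close by the peer is acted on immediately.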

@supperthomas

RT-Thread/packages#1618

3 participants