Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

qos2 message lost when followed by a qos0 message #994

Open
gmondada opened this issue Sep 23, 2024 · 6 comments
Open

qos2 message lost when followed by a qos0 message #994

gmondada opened this issue Sep 23, 2024 · 6 comments
Labels

Comments

@gmondada
Copy link

System Information

  • Aedes: 0.51.3 (commit d0c4414)
  • NodeJS: 20.16.0
  • OS: Windows 10
  • Arch: x86_64

Describe the bug
The broker has to dispatch two messages to the same client: a qos2 message followed by a qos0 message.
These two messages go through Aedes.prototype.publish(). They are numbered (the brokerCounter property is assigned) and go through MQEmitter.prototype.emit() and deliverQoS() in client.js.
At that moment, however, the order has been inverted. The qos0 comes first and gets delivered. The qos2 message immediately follows but is detected as duplicate and gets discarded.
In fact, its brokerCounter property is lower then the last delivered packet and get discarded by getToForwardPacket().

To Reproduce
I have two clients: C1 and C2 which play ping pong with messages. C1 publishes two messages which C2 receives. C2 replies by publishing two messages that C1 receives, etc. All these messages are qos2. In addition to that, C2 publishes a sort of heartbeat message in qos0 every second. At some point C1, waiting 2 messages from C2, actually gets only one messages and waits for the second message forever.

Expected behavior
No message lost, especially when qos2 is in use.

Additional context
C1 is connected via a WebSocket. C2 via a raw socket.

@gmondada gmondada added the bug label Sep 23, 2024
@gmondada
Copy link
Author

I could help fixing this issue and submit a pull request, but I'm not sure how message should be managed in this scenario. I could suggest having a different counter for each qos, for instance, but need confirmation that this results in the right behavior.
I guess getToForwardPacket() want to detect the case where a client has subscribed to two different topics which overlap. In that scenario, we want to avoid sending a message twice to the same client. But what should happen if these two topics are subscribed with different qos? Should the broker send the message only once? Should it use the highest qos?

@robertsLando
Copy link
Member

@gmondada What emitter are you using? Could you try a different emitter (like redis/mongo) and tell me if the issue persists? I think the issue is that the emitter should respect the order

@gmondada
Copy link
Author

gmondada commented Sep 24, 2024

Hi @robertsLando, thanks for replying. I didn't specify any special emitter or message queue, I guess it's the one by default (MQEmitter?). I use a single broker with no persistence. Here below is how I instanciante the broker. Please note that aedes.on("publish", ...)displays the messages in the right order.
In my debugging session I saw the two messages going through different queues because of their different qos.
To test with redis/mongo, do I need those databases? I could build a small test program to reproduce if it's helpful.

const aedes = new Aedes();
const socket = createSocket(aedes.handle);
socket.listen(port, function () {
  console.log("Aedes MQTT listening on port", port);
});

const http = createHttp();
const ws = new wsServer({ server: http });
ws.on("connection", (connection, _) => {
  let stream = createWebSocketStream(connection);
  aedes.handle(stream);
});
http.listen(wsPort, () => {
  console.log("Aedes MQTT-WS listening on port", wsPort);
});

aedes.on("publish", (packet, client) => {
    // messages are in the right order here
});

@robertsLando
Copy link
Member

@gmondada I was asking because there were some issues with on disk persistences a while ago but if you are using the default one the issue is not the persistence, could you create a test that reproduce the issue and submit a fix for this? I'm not sure about what's the best way to fix this ATM

@gmondada
Copy link
Author

gmondada commented Oct 1, 2024

I build a simple test program to show the problem: https://github.com/gmondada/aedes/blob/gm-tests/test1.js

To reproduce:

git clone https://github.com/gmondada/aedes.git
cd aedes
git checkout gm-tests
npm i
node test1.js

Then, wait a minute, until the program stops...

@robertsLando
Copy link
Member

I actually have no time to look at this, feel free to submit a PR to fix the issue, I will be happy to review it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants