Control protocol content header #41

Open
BenediktBurger opened this issue Feb 8, 2023 · 22 comments
Labels
discussion-needed A solution still needs to be determined distributed_ops Aspects of a distributed operation, networked or on a node messages Concerns the message format

Comments

@BenediktBurger
Member

In #33 we decided upon one frame for header information.
In this issue, we can discuss the content of that header frame.

@BenediktBurger BenediktBurger added distributed_ops Aspects of a distributed operation, networked or on a node discussion-needed A solution still needs to be determined messages Concerns the message format labels Feb 8, 2023
@bilderbuchi
Member

bilderbuchi commented Feb 8, 2023

AFAIK, the current proposal is to include

  • message type
  • conversation ID
  • message ID

I think we should be able to use fixed-length byte sequences for all of those, which would allow us to decode this header by length/position.
I suggest including the items in the above order, as that is presumably in decreasing order of importance.
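
To illustrate decoding by length/position, here is a minimal sketch. The field sizes (1 byte message type, 16 bytes conversation ID, 4 bytes message ID) and the function names are assumptions for illustration only; nothing in this thread has fixed them.

```python
import struct

# Hypothetical layout, in the order suggested above:
# 1 byte message type, 16 bytes conversation ID, 4 bytes message ID.
HEADER = struct.Struct("!B16s4s")


def pack_header(message_type: int, conversation_id: bytes, message_id: bytes) -> bytes:
    """Build the fixed-length header frame."""
    return HEADER.pack(message_type, conversation_id, message_id)


def unpack_header(frame: bytes) -> tuple[int, bytes, bytes]:
    """Decode the header purely by length/position."""
    return HEADER.unpack(frame)


frame = pack_header(1, bytes(range(16)), b"\x00\x00\x00\x07")
message_type, conversation_id, message_id = unpack_header(frame)
```

Because every field has a fixed width, the receiver needs no delimiters or length prefixes inside the header frame.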

Question 1: If we include a conversation ID, do we also need the message ID, or would it be sufficient to track request/response/response/... patterns/sequence by conversation ID alone?

Question 2: What should we use as conversation/message ID? A UUID was proposed, but it might be wastefully long. We can consult our goals to get order-of-magnitude estimates of the (unique) message counts we need to keep track of.

Question 3: Do we want to include a timestamp? If yes, it could be folded into the message ID (e.g. UUIDv7; note that UUIDv5 is name-based and carries no timestamp), in which case we would include the message ID.

@BenediktBurger
Member Author

> Question 1: If we include a conversation ID, do we also need the message ID, or would it be sufficient to track request/response/response/... patterns/sequence by conversation ID alone?

If we have a conversation ID, we do not need a message ID for tracking responses etc.

However, we lose the option to identify every message (and the timestamps in the message IDs).

Conversation ID

2 bytes should be sufficient, as the ID is only relevant for the requesting party, which should not open more than 65536 (2^16) conversations at once.

@bilderbuchi
Member

bilderbuchi commented Feb 9, 2023

> 2 bytes should be sufficient, as the ID is only relevant for the requesting party, which should not open more than 65536 (2^16) conversations at once.

Don't we want conversation IDs to be unique for ~a whole session? That would make it ~trivial to filter e.g. all messages belonging to one conversation out of a log file/database.

edit: If I computed correctly using https://kevingal.com/apps/collision.html, 16 bits of ID at 100 concurrent conversations (1 per device, order-of-magnitude estimate, e.g. 33 new conversations per second at 3 seconds average conversation duration) gives us a 7.3% probability of a collision! Way too high, imo.
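
The figure can be reproduced with the standard birthday-problem calculation. This is a minimal sketch assuming IDs are drawn uniformly at random; the function name is made up for illustration.

```python
def collision_probability(id_bits: int, concurrent: int) -> float:
    """Probability that at least two of `concurrent` random IDs coincide."""
    space = 2 ** id_bits
    p_unique = 1.0
    for k in range(concurrent):
        # The k-th ID must avoid the k IDs already drawn.
        p_unique *= (space - k) / space
    return 1.0 - p_unique


p16 = collision_probability(16, 100)   # roughly 0.073
p32 = collision_probability(32, 100)   # several orders of magnitude smaller
```

Doubling the ID length from 16 to 32 bits already brings the risk for 100 concurrent conversations down to roughly one in a million.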

@BenediktBurger
Member Author

There are two ideas:

  • The conversation_id of a response is message_id of the original message
  • The conversation_id of a response is the conversation_id of the original message (so it is chosen by the sender)

CoAP follows the second idea:

Token

Every request carries a token (but it may be zero length) whose value was generated by the client. The server must echo every token value without any modification back to the client in the corresponding response. It is intended for use as a client-local identifier to match requests and responses, especially for concurrent requests.

Matching requests and responses is not done with the message ID because a response may be sent in a different message than the acknowledgement (which uses the message ID for matching). For example, this could be done to prevent retransmissions if obtaining the result takes some time. Such a detached response is called "separate response". In contrast, transmitting the response directly in the acknowledgement is called "piggybacked response" which is expected to be preferred for efficiency reasons.

(https://en.wikipedia.org/wiki/Constrained_Application_Protocol#Token)

@BenediktBurger
Member Author

> edit: If I computed correctly using https://kevingal.com/apps/collision.html, 16 bits of ID at 100 concurrent conversations (1 per device, order-of-magnitude estimate, e.g. 33 new conversations per second at 3 seconds average conversation duration) gives us a 7.3% probability of a collision! Way too high, imo.

I thought that a Component (let's say a Director) has an internal conversation counter, so it can open 65536 (256*256) conversations before it starts from zero again.
So you only get a collision if a conversation lasts longer than it takes to start 65536 other ones.
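
A minimal sketch of such a counter, assuming 2-byte IDs. The class and method names are hypothetical, and skipping IDs that are still in use is an added assumption that avoids the wrap-around collision altogether.

```python
class ConversationCounter:
    """Hands out 2-byte conversation IDs from a wrapping counter."""

    def __init__(self, id_bytes: int = 2) -> None:
        self._modulus = 256 ** id_bytes
        self._next = 0
        self._open: set[int] = set()

    def open_conversation(self) -> int:
        """Return the next ID that is not used by a still-open conversation."""
        if len(self._open) >= self._modulus:
            raise RuntimeError("all conversation IDs are in use")
        while self._next in self._open:
            self._next = (self._next + 1) % self._modulus
        conversation_id = self._next
        self._open.add(conversation_id)
        self._next = (self._next + 1) % self._modulus
        return conversation_id

    def close_conversation(self, conversation_id: int) -> None:
        """Release an ID once its conversation has finished."""
        self._open.discard(conversation_id)


counter = ConversationCounter()
first = counter.open_conversation()            # ID 0, stays open
for _ in range(256 * 256 - 1):
    counter.close_conversation(counter.open_conversation())
after_wrap = counter.open_conversation()       # counter wrapped; ID 0 is skipped
```

This only guarantees uniqueness within one Component; two Components can still pick the same number independently.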

@BenediktBurger
Member Author

Here is the CoAP protocol definition: https://www.rfc-editor.org/rfc/rfc7252

I read their concept:

  • Messages have a message_id and a token
  • A server must respond to a request with the same token
  • There are confirmable and non-confirmable messages.
  • Confirmable messages require either an Acknowledgement or a Reset. In either case, this response has the same message_id as the request. The request is resent until either response arrives.
  • An answer to a non-confirmable message has a new message_id.
  • If it takes some time to respond to a confirmable message, the server sends an ACK (same message_id, no token) and later the response with the content (new message_id, original token)
  • Message_ids are used for duplicate detection and responses to confirmable messages.
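
The separate-response case from the list above can be sketched as follows. This is a simplified model, not the CoAP wire format: the `Message` and `Requester` names and the string labels are assumptions, and Reset handling, retransmission and piggybacked responses are left out.

```python
from dataclasses import dataclass


@dataclass
class Message:
    kind: str            # "CON", "ACK" or "RESPONSE" (simplified labels)
    message_id: int
    token: bytes = b""   # empty token stands for "no token"


class Requester:
    def __init__(self) -> None:
        self.unacknowledged: dict[int, Message] = {}       # keyed by message_id
        self.awaiting_response: dict[bytes, Message] = {}  # keyed by token

    def send(self, request: Message) -> None:
        self.unacknowledged[request.message_id] = request
        self.awaiting_response[request.token] = request

    def receive(self, message: Message) -> str:
        if message.kind == "ACK":
            # The ACK carries the request's message_id: stop retransmitting.
            self.unacknowledged.pop(message.message_id, None)
            return "acknowledged"
        # The content arrives later with a new message_id but the original token.
        request = self.awaiting_response.pop(message.token, None)
        return "response matched" if request else "unexpected"


requester = Requester()
requester.send(Message("CON", 17, b"\xab"))
ack_result = requester.receive(Message("ACK", 17))
response_result = requester.receive(Message("RESPONSE", 99, b"\xab"))
```

The two lookups are independent: the message_id settles delivery, the token settles which request a response belongs to.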

@BenediktBurger
Member Author

> AFAIK, the current proposal is to include

I proposed including the content format (JSON, Avro, binary), not the message type. However, we can consider putting the message type in the header as well.

@BenediktBurger
Member Author

> edit: If I computed correctly using https://kevingal.com/apps/collision.html, 16 bits of ID at 100 concurrent conversations (1 per device, order-of-magnitude estimate, e.g. 33 new conversations per second at 3 seconds average conversation duration) gives us a 7.3% probability of a collision! Way too high, imo.

Oh, I see one difficulty with the conversation_id: If both endpoints send each other a message with the same conversation_id (because both happened to choose the same ID), they will interpret the other's message as a response, not as a new request.
Did you mean that?

We could mitigate that if the message types differ (the request type is different from the response type).

@bilderbuchi
Member

> Here is the CoAP protocol definition: https://www.rfc-editor.org/rfc/rfc7252
>
> I read their concept:
>
>   • Messages have a message_id and a token
>   • A server must respond to a request with the same token
>   • There are confirmable and non-confirmable messages.
>   • Confirmable messages require either an Acknowledgement or a Reset. In either case, this response has the same message_id as the request. The request is resent until either response arrives.
>   • An answer to a non-confirmable message has a new message_id.
>   • If it takes some time to respond to a confirmable message, the server sends an ACK (same message_id, no token) and later the response with the content (new message_id, original token)
>   • Message_ids are used for duplicate detection and responses to confirmable messages.

These points look useful/applicable in our situation. The token basically plays the role of our conversation ID.

@bilderbuchi
Member

bilderbuchi commented Feb 17, 2023

> I proposed including the content format (JSON, Avro, binary), not the message type. However, we can consider putting the message type in the header as well.

With "message type" I mean the command verbs (#29). We need to have those somewhere; I thought the content header would be the logical place.

Serialisation information will also be important, in case we use more than one scheme.
Just for completeness, Avro can use either JSON or binary encoding internally, but I assume you mean self-made schemes here.

@bilderbuchi
Member

bilderbuchi commented Feb 17, 2023

> Oh, I see one difficulty with the conversation_id: If both endpoints send each other a message with the same conversation_id (because both happened to choose the same ID), they will interpret the other's message as a response, not as a new request.
> Did you mean that?

Basically, yes. We can't use sequential codes because two Components might be on the same "offset" concurrently. If we use random ones, we have a certain risk of collision, i.e. two nodes accidentally choosing the same ID via RNG. By adjusting the length/complexity of the ID scheme against the expected message ID generation frequency, we can tune the collision risk to acceptable levels.
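
As a rough sketch of that tuning, the birthday approximation can be inverted to get the ID length for a target collision risk. The function name and the one-in-a-million target are assumptions for illustration; the acceptable level is still to be decided.

```python
import math


def bits_needed(concurrent: int, max_probability: float) -> int:
    """Smallest ID length in bits keeping the collision risk below the limit.

    Uses the birthday approximation p ~ n*(n-1) / (2 * 2**bits), which
    slightly overestimates the risk and is therefore on the safe side.
    """
    pairs = concurrent * (concurrent - 1) / 2
    return max(1, math.ceil(math.log2(pairs / max_probability)))


bits_for_one_in_a_million = bits_needed(100, 1e-6)
```

For 100 concurrent random IDs and a one-in-a-million target this gives 33 bits, so a 5-byte ID would already be enough under these assumptions.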

I'm not sure message type helps us with this, because message type is not a "random" variable.

@BenediktBurger
Member Author

> I'm not sure message type helps us with this, because message type is not a "random" variable.

It helps: If I receive a "response type" message, I know it is the response to my request with the same conversation_id. If I receive a "request type" message, I know it is a new request and I have to return a response with the same conversation_id.
For example, Acknowledge and Error are response message types.
GET/SET/CALL are request message types.

@bilderbuchi
Member

Is your argument that having two kinds of messages lowers the probability of a collision by a factor of 2?

@BenediktBurger
Member Author

BenediktBurger commented Feb 17, 2023

I'd say there are no collisions anymore:

Yes, several messages with the same conversation ID might arrive, but only those of "response type" are a response to my request. As the original sender sets the conversation ID, it can determine what it gets back (just by using "free" IDs).

All the other messages with the same ID have to be requests, and therefore this ID is not relevant for the Components under scrutiny.

More clearly:

  • we look at some Component CA
  • it sends every request with a conversation ID that is unique within CA
  • all answers to the requests will carry this conversation ID and have a message type of "response"
  • requests arriving at CA will be recognizable as requests by their message type, and CA will not try to match such a message to a request it sent itself.
  • CA answers requests with the conversation ID it received in the request

So, we have a combination of two filters: the message type has to be "response" and the conversation ID has to match. Only then have we got the response to our request.
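
The two filters can be sketched like this. The `Component` class, its method names and the returned strings are hypothetical; the message type names follow the examples given earlier in this thread.

```python
REQUEST_TYPES = {"GET", "SET", "CALL"}
RESPONSE_TYPES = {"ACKNOWLEDGE", "ERROR"}


class Component:
    def __init__(self, name: str) -> None:
        self.name = name
        self.pending: set[int] = set()   # conversation IDs of our open requests
        self._next_id = 0

    def new_request(self) -> int:
        """Pick a conversation ID that is unique within this Component."""
        while self._next_id in self.pending:
            self._next_id += 1
        self.pending.add(self._next_id)
        return self._next_id

    def classify(self, message_type: str, conversation_id: int) -> str:
        """Apply both filters: response type AND matching conversation ID."""
        if message_type in RESPONSE_TYPES and conversation_id in self.pending:
            self.pending.discard(conversation_id)
            return "response to our request"
        if message_type in REQUEST_TYPES:
            # A foreign request; answer it with the conversation ID it carries.
            return "new request"
        return "unmatched"


ca = Component("CA")
cid = ca.new_request()
incoming_request = ca.classify("GET", cid)           # same ID, chosen by the other side
incoming_response = ca.classify("ACKNOWLEDGE", cid)  # the answer to our own request
```

A request from another Component that happens to carry the same ID is never mistaken for a response, because it fails the message type filter.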

@bilderbuchi
Member

bilderbuchi commented Feb 17, 2023

So if Co1 sees/routes two "response type" messages from CA->CB and from CE->CF, which could have the same conversation id because they originate from different Components, what happens then?

I'm also thinking of e.g. logging streams, it would be practical/simple if the invariant "one conversation id <==> one conversation" would always be fulfilled (without further logic/analysis).

@BenediktBurger
Member Author

I see the conversation ID as a help for the end points of a conversation (especially the requesting one).

The routing Coordinators do not care whether a routed message belongs to one conversation or another.

For logging purposes: you need the combination of recipient/sender and conversation Id to get the "full conversation Id".
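
A small sketch of such a combined key when filtering a log. The record layout and the Component names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical log records: (sender, recipient, conversation_id, message_type)
log = [
    ("CA", "CB", 7, "GET"),
    ("CE", "CF", 7, "GET"),
    ("CB", "CA", 7, "ACKNOWLEDGE"),
    ("CF", "CE", 7, "ERROR"),
]

conversations = defaultdict(list)
for sender, recipient, conversation_id, message_type in log:
    # The endpoint pair plus the conversation ID forms the "full conversation ID".
    key = (frozenset((sender, recipient)), conversation_id)
    conversations[key].append(message_type)
```

Both conversations use ID 7, yet they end up in separate groups because their endpoint pairs differ.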

I think it is difficult to achieve collision-free conversation IDs without a central authority.

I see that stream logging is very important to you (I had not thought about it).

Oh: logging has another difficulty: you have to combine the logs of all Coordinators to get a full message log.

@bilderbuchi
Member

bilderbuchi commented Feb 17, 2023

> I see that stream logging is very important to you (I had not thought about it).

I'm thinking of the poor folks that have to troubleshoot future messaging/routing problems. :-D

> I think it is difficult to achieve collision-free conversation IDs without a central authority.

Yes, just like you can't achieve collisionless git hashes. What you can do is lower the probability of a collision until you're comfortable (number TBD, but much less than 7% :-p)

@BenediktBurger
Member Author

In my test system I use the conversation_id more and more, but not the message_id.

We could include the timestamp in the conversation_id and let the message_id consist of the conversation_id plus a temporal offset from the beginning of the conversation. So:

  • The conversation_id is a timestamp plus a few random bytes (see the discussion about unique ids): unique.
  • The message_id is not additional to the conversation_id, but contains it and a temporal offset (a few bytes)
  • Example structure for complete ID part of the header (contains both IDs): "Timestamp", "Random bytes", "Message offset".

Advantages:

  • The temporal offset does not require a large byte count, such that the combination of both IDs has a low byte count (instead of two independently unique IDs).
  • Timestamp of every message can be calculated.
  • Easy ordering of messages in a conversation
  • You can identify the beginning of a conversation (temporal offset 0) easily from the ID itself (not from the message type)

Disadvantage:

  • The message timestamp has to be calculated.
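
A minimal sketch of this ID structure. The sizes (6-byte millisecond timestamp, 4 random bytes, 2-byte offset) and the function names are assumptions picked for illustration; the thread has not settled on any of them.

```python
import os
import struct


def new_conversation_id(now_ms: int) -> bytes:
    """Timestamp plus a few random bytes."""
    return now_ms.to_bytes(6, "big") + os.urandom(4)


def message_id(conversation_id: bytes, now_ms: int) -> bytes:
    """Conversation ID followed by the offset from the conversation start."""
    start_ms = int.from_bytes(conversation_id[:6], "big")
    return conversation_id + struct.pack("!H", now_ms - start_ms)


def message_timestamp_ms(msg_id: bytes) -> int:
    """Recover the message timestamp: conversation start plus offset."""
    start_ms = int.from_bytes(msg_id[:6], "big")
    (offset,) = struct.unpack("!H", msg_id[10:12])
    return start_ms + offset


start = 1_676_000_000_000                      # example start time in ms
cid = new_conversation_id(start)
request_id = message_id(cid, start)            # offset 0 marks the first message
response_id = message_id(cid, start + 250)     # sent 250 ms later
```

With a 2-byte millisecond offset a conversation could last at most about 65 seconds, which is one reason the offset width needs discussion.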

@bilderbuchi
Member

bilderbuchi commented Mar 5, 2023

At first glance, this sounds reasonable. I haven't thought deeply about it yet.

@BenediktBurger
Member Author

With the resurfaced links in #16, here is my proposal for the content header:

  • UUIDv7 as conversation ID, maybe with the first 12 bits of the random part used for sub-millisecond resolution
  • 4 bytes (number of bytes TBD) offset from the conversation ID timestamp in milliseconds (e.g. the first message has 0, a response has some value different from 0). The UUIDv7 plus this offset is the message ID (the timestamp can be calculated).
  • Alternatively, instead of the offset, we could have a one-byte counter: each response (with the same conversation_id) increases it by one. Advantage: simpler ordering; disadvantage: the timestamp of the response is not known.
  • Last: 1 Byte for the Serialization scheme
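
The proposed header can be sketched as follows. The UUIDv7 is assembled by hand from its bit layout so the example does not depend on a UUIDv7 library function; the function names and the scheme value 1 are hypothetical, and the offset variant (not the counter variant) is shown.

```python
import os
import struct
import time
import uuid


def uuid7(now_ms: int) -> uuid.UUID:
    """Build a UUIDv7 by hand: 48-bit ms timestamp, version, variant, random bits."""
    value = (now_ms & ((1 << 48) - 1)) << 80
    value |= 0x7 << 76                                        # version 7
    value |= int.from_bytes(os.urandom(2), "big") >> 4 << 64  # 12 random bits
    value |= 0b10 << 62                                       # RFC variant
    value |= int.from_bytes(os.urandom(8), "big") >> 2        # 62 random bits
    return uuid.UUID(int=value)


# 16 bytes conversation ID, 4 bytes offset in ms, 1 byte serialization scheme.
HEADER = struct.Struct("!16sIB")


def pack_header(conversation_id: uuid.UUID, offset_ms: int, scheme: int) -> bytes:
    return HEADER.pack(conversation_id.bytes, offset_ms, scheme)


def message_time_ms(header: bytes) -> int:
    """Timestamp of the message: UUIDv7 timestamp plus the offset."""
    raw_id, offset_ms, _scheme = HEADER.unpack(header)
    return (int.from_bytes(raw_id, "big") >> 80) + offset_ms


start_ms = time.time_ns() // 1_000_000
conversation_id = uuid7(start_ms)
request = pack_header(conversation_id, 0, 1)     # first message: offset 0
response = pack_header(conversation_id, 42, 1)   # response 42 ms later
```

Under these assumptions the whole content header is 21 bytes long.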

@BenediktBurger
Member Author

When talking about PyMoDAQ in combination with LECO, we concluded that it is useful to send binary data, so it is good that we have the byte indicating the serialization scheme.

@BenediktBurger
Member Author

Regarding the serialization scheme byte: We could reserve a certain range (let's say 127-255) for user-defined applications.
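
A minimal sketch of interpreting that byte. The predefined assignments (0, 1, 2) and the function name are hypothetical, and the user-defined range is just the example range mentioned above.

```python
USER_DEFINED = range(127, 256)   # example range from above; not a decided value

# Hypothetical assignments for the predefined part of the range.
PREDEFINED = {0: "json", 1: "avro", 2: "binary"}


def describe_scheme(scheme_byte: int) -> str:
    """Map the serialization scheme byte to a readable description."""
    if not 0 <= scheme_byte <= 255:
        raise ValueError("serialization scheme must fit in one byte")
    if scheme_byte in USER_DEFINED:
        return "user-defined"
    return PREDEFINED.get(scheme_byte, "reserved")
```

For example, `describe_scheme(200)` falls into the user-defined range, while an unassigned low value such as 50 is reported as reserved.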


2 participants