
Message Encoding #43

Closed
BenediktBurger opened this issue Feb 9, 2023 · 51 comments · Fixed by #54
Labels
discussion-needed (A solution still needs to be determined) · distributed_ops (Aspects of a distributed operation, networked or on a node) · messages (Concerns the message format)

Comments

@BenediktBurger
Member

Split out of #20

How do we encode the content of messages?

We already have a header frame, which could indicate different encoding styles. For example, we could allow binary (for file transfer), JSON (for easy implementation), Avro (for RPC via Apache Avro), etc.

Some ideas (from the beginning of #20):

  • We can use JSON serialization with lists and dictionaries to convey key-value pairs. In my implementation (growing with this project and going to be adjusted) I currently test a list of lists: each entry of the outer list is a "sentence". A sentence is a list consisting of a command type and possibly arguments. An example: b"[['GET', ['property1', 'another property']], ['SET', {'property1': 7}], ['GET', ['property1']]]". In this example I request the values of two properties, I set one of these properties to a specific value, and I request the property again.
  • yaq uses Apache Avro RPC to serialize data. That could be another possibility.
    @bilderbuchi: I read a bit on the subway today; it really looks like a good option. This would also take some of the handshaking, capability listing, reply tracking, message verification (with schema), and RPC burden from us. zmq would probably mostly be the transport layer (so one frame per Avro message, plus maybe the topic frame). It offers JSON or binary encoding and has implementations for several languages.
    Specification here.
    Yaq notes on why/how they use avro: https://yeps.yaq.fyi/107/
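As a rough sketch of the first idea (the command names and structure follow the example above; nothing here is a fixed LECO format), such "sentences" could be serialized with plain JSON:

```python
import json

def encode_sentences(sentences):
    """Serialize a list of command sentences into one message frame."""
    return json.dumps(sentences).encode()

def decode_sentences(frame):
    """Parse a received frame back into a list of sentences."""
    return json.loads(frame.decode())

message = encode_sentences([
    ["GET", ["property1", "another property"]],
    ["SET", {"property1": 7}],
    ["GET", ["property1"]],
])
assert decode_sentences(message)[1] == ["SET", {"property1": 7}]
```

Note that JSON only has double-quoted strings, so the actual wire format would differ slightly from the Python-literal example above.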
@BenediktBurger added the distributed_ops, discussion-needed, and messages labels on Feb 9, 2023
@BenediktBurger
Member Author

I like the idea of encoding Transport Layer messages with a simple protocol (e.g. JSON) in order to reduce complexity in basic message handling. That way it is easier to write additional Components in different languages.

For example, I can see someone using LabVIEW for instrument control. However, I guess it is quite some work to use Apache Avro in LabVIEW (at least a quick search did not turn up an answer).

@bilderbuchi
Member

OTOH, you'll then need some way to distribute schema information, do the handshake, compare versions, etc. (which is what Avro does behind the scenes).

Wouldn't/couldn't LabVIEW be a Driver component accessed via an Actor? Or can we provide a translation component? Can LabVIEW interface with a C/C++/C# API (I guess yes)?

@bklebel
Collaborator

bklebel commented Feb 10, 2023

I have to admit, Avro was a bit of a roadblock for me when I looked at yaq. It enables a rather easy backend that provides simple commands on the "frontend": in the case of yaq it looks a bit like talking to a pymeasure Instrument "over the wire". In our case, a Director could have (in Python) an object for every connected Component it cares about, and mirror the methods/attributes/properties of that Component directly onto that object, so that in the Director we could use them as if we were directly manipulating the Actor itself. I suspect that if we aimed for something like this, Avro would be a good choice.
I guess that works well if we have somebody dedicated to the programming part, and users would only ever see the frontend, like I see it at user facilities.
AFAIK, our scope is slightly more limited, even more so as the low entry threshold is quite important to me. When I introduce new students to my current implementation, which uses dictionaries instead of lists to send such commands, students who have just finished the Python tutorial up to the point of dictionaries do not have an easy time understanding even this. I am not sure whether even I want to invest the time to understand Avro well enough to use it appropriately, and I am very reluctant to hit students over the head with it. Maybe things will change in the future, when students might have learned programming in school already, but right now that is not the case.
Avro is just a beast: it can do a lot (as far as I understand/perceive it), but it is also quite a huge thing in itself.
Designing our own schemata for special things might become messy and time-consuming too, but hopefully the definition/specification would not be that huge. I think I will now have to take a deeper look at Avro to make a more informed decision, but just the matter that I do have to do so is a bad sign for me.

@BenediktBurger
Member Author

I see the advantage of Avro: you know the capabilities of the other side.

But I see the same problem as @bklebel and am still hesitant about whether I want to use Avro. I want to try it first.

My original proposal was a simpler setup:
We only specify get, set, call.

The user (director) has to know the possible attributes to get/set and methods to call.

For example you would send "get, some_property" and the Actor will try to read "some_property" (for example via "getattr" in Python).
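A minimal sketch of that get/set/call idea (the Driver class and all names here are invented for illustration):

```python
# Hypothetical Driver standing in for a real instrument driver.
class Driver:
    some_property = 42

    def do_something(self, x):
        return x * 2

class Actor:
    """Resolve get/set/call commands on a wrapped Driver via getattr/setattr."""

    def __init__(self, driver):
        self.driver = driver

    def handle(self, command, name, *args):
        if command == "get":
            return getattr(self.driver, name)
        if command == "set":
            setattr(self.driver, name, args[0])
            return None
        if command == "call":
            return getattr(self.driver, name)(*args)
        raise ValueError(f"unknown command: {command}")

actor = Actor(Driver())
assert actor.handle("get", "some_property") == 42
actor.handle("set", "some_property", 7)
assert actor.handle("get", "some_property") == 7
```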

@bilderbuchi
Member

The user (director) has to know the possible attributes to get/set and methods to call.

For which the Actor has to send some kind of schema to the Director (which is built into Avro already, including version compatibility etc).
I see your points about Avro complexity, though. Let's study it in more detail until we know enough.

@bklebel e.g., I'm not sure where using Avro actually complicates things for your students. The protocol would autoconstruct the Avro stuff/schema from e.g. any pymeasure Instrument; there is no need for the students to touch that. The pymeasure-leco interface is not something that students will write. Message content will be serialized/deserialized by pyleco Components (or e.g. the logging solution). Where do you see your students interacting with Avro directly?

@bklebel
Collaborator

bklebel commented Feb 11, 2023

For which the Actor has to send some kind of schema to the Director (which is built into Avro already, including version compatibility etc).

I disagree. Hardware devices do not typically tell you what commands you might send them; you have to look that up in the manual. Of course, that manual then needs to be written, too, and stored in a safe place. As a matter of fact, we have an Instrument in our lab for which we do not have a manual, and which does not have a clear brand/name/serial number or much of anything to search for a manual with; being able to request the command set would be nice, no doubt.

The protocol would autoconstruct the Avro stuff/schema from e.g. any pymeasure Instrument,

And what if I want to use some obscure Driver I found on the internet, for some obscure Device? I cannot expect that whatever we put into pyleco will be able to handle it out of the box. Somebody will need to implement that.

When I compare our system to a LEGO set (e.g. a police station, or something bigger, some set from the movies): if I don't consider building it as the play itself, but only toying around with the finished thing (whatever "finished" means), it is not as if I can only play with it once I have put everything together as described. I can start fooling around with individual bricks from the start if I have a fancy. Going through with building it is strongly suggested, but it is still optional.
If possible, I would want this system to have a similar optionality. If somebody wants to take the time (and we would strongly suggest it), they may include all that command schema, so that a Director can request and use that information, but nobody has to do it for a connection between Actor and Director to work. A bit similar to pymeasure, where you do not need to implement a full device; you can just put in one command and be done with it.

Parts of that might also boil down to good documentation, the cake everybody wants to eat but almost nobody wants to make (it seems to me, including myself). I will need to think more about it.

@BenediktBurger
Member Author

For which the Actor has to send some kind of schema to the Director (which is built into Avro already, including version compatibility etc).

No, it is not necessary to send the schema etc.
If I write a Director for my laser system, I know (because I wrote the Driver, or from the pymeasure API) that I set the property "emission" to True for my seed laser and then set the output power setpoint of the amplifier (most recent pymeasure PR 😁) to some value.

Still, a device may send its capabilities (via some command).

Maybe we start with a simple message layer (I'm for JSON) and experiment in parallel with Avro.
Then JSON could be the "simple layer" understood by every Component, and Avro a more complex, optional message layer version.

The header could indicate the version/type of content (JSON, Avro, binary, ...).
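A sketch of such a content-type flag in the header (the flag values are invented for illustration, not part of any agreed header format):

```python
import json

# Invented flag values; a real specification would fix these.
JSON_TYPE = b"\x01"
BINARY_TYPE = b"\x02"

def decode_payload(content_type, payload):
    """Dispatch on the header's content-type flag."""
    if content_type == JSON_TYPE:
        return json.loads(payload.decode())
    if content_type == BINARY_TYPE:
        return payload  # pass through untouched, e.g. for file transfer
    raise ValueError(f"unknown content type: {content_type!r}")

assert decode_payload(JSON_TYPE, b'{"emission": true}') == {"emission": True}
assert decode_payload(BINARY_TYPE, b"\x00\xff") == b"\x00\xff"
```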

@bklebel
Collaborator

bklebel commented Feb 11, 2023

I agree. At least, Avro can be encoded as JSON, and it will not be invalidated if we add additional commands.
In that sense, we could first go for the more simple stuff, which might end up as "additional things" in Avro. If a user wants to use the more elaborate/complicated structure of Avro they can do so, but it is not a necessary condition to building a system with leco.

I was a bit hesitant to propose such a pathway, because we would essentially work on two message formats in parallel. However, it could well be the best way to handle the use cases we are aiming for.

In general, we could also put the Avro implementation in the backlog, focus on something more simple now, get it going/running rather quickly, and look at the more sophisticated options with Avro in a future version. Possibly we could in the first version already allow Avro, but not specify it completely, although I am not sure whether that would be wise.

@bilderbuchi
Member

bilderbuchi commented Feb 12, 2023

I disagree. Hardware devices do not typically tell you what commands you might send them; you have to look that up in the manual. Of course, that manual then needs to be written, too, and stored in a safe place. As a matter of fact, we have an Instrument in our lab for which we do not have a manual, and which does not have a clear brand/name/serial number or much of anything to search for a manual with; being able to request the command set would be nice, no doubt.

No, it is not necessary to send the schema etc.
If I write a Director for my laser system, I know (because I wrote the Driver, or from the pymeasure API) that I set the property "emission" to True for my seed laser and then set the output power setpoint of the amplifier (most recent pymeasure PR 😁) to some value.

I fear you'll have to unpack that a little more. I must be missing something here.

Note: I see this as primarily about whether we want to require schemas or not. Forget about Avro or JSON for the moment; how we propagate this information is an implementation (well, specification) detail.
This is not about Avro, but about how automatic we want things to be. I did not realise your apparent positions on this until now.

Do I understand you correctly that to get a setup running with pyleco, users of a laser system will have to

  1. Study the manual to learn what their device can do
  2. Study the manual when implementing a Driver, e.g. in pymeasure, or yaq, or whatever, to interface the device's capability with software
  3. (Let's say you don't need to write an Actor as you are using an instrument library that an Actor implementation already exists for.)
  4. You study the manual again (or the Driver source) when creating your pyleco Director/GUI, to find out what the device can do?! (Although the Driver should be perfectly capable of telling its Actor what it can do, e.g. set "emission" to true.)

Mind that the people doing step 2 and step 3 will probably not be the same, as one of the main intentions behind our Python packages is to avoid people needlessly redoing work, because they reuse the work of others who came before:

Hardware devices do not typically tell you what commands you might send them; you have to look that up in the manual.

I guess 80% of pymeasure's value proposition is that users do not have to look that up in the manual, as it's a tab-completion away.

@bilderbuchi
Member

bilderbuchi commented Feb 12, 2023

And what if I want to use some obscure Driver I found on the internet, for some obscure Device? I cannot expect that whatever we put into pyleco will be able to handle it out of the box. Somebody will need to implement that.

So, let's go through this:

  • You find some obscure driver on the internet. You're lucky this time, it's in Python.
  • You study how it works and how its API is set up (turns out all its "Action" equivalents are prefixed with do_, except for two, and "Parameters" are implemented as get_ and set_ methods; maybe a C++ programmer wrote the driver).
  • You want to integrate this as a Driver into your pyleco network, so you write an Actor class (based on the pyleco base Actor) that interfaces this Driver's API with the consistent Actor API (Actions and Parameters, basically, and how incoming messages trigger these).
  • Now, the new Actor already knows the capabilities of the Driver, but cannot tell you (as no schema exchange was foreseen). So when you start to connect the new Actor to your GUI (Director), it cannot autopopulate the widgets. You have to dive into the manual (or into the implementation/docs of your Actor) to find out which buttons should be there, and which text fields, and which of those should be editable (controls) or not (measurements). To use the LEGO analogy, you have a new Actor brick, but to find out how/if it fits the Director brick, you have to check the (colorful) instructions.
  • change of scene: Some person halfway around the planet uses a LECO setup in their thesis work. They have a weird device in the lab that they have to use, it's a power supply. They find some obscure LECO Actor that some helpful soul wrote for exactly that device on the internet -- what are the chances?? 😅 They connect the device to their network. Now, how do they find out what the device can do? Read the source code to find out...
  • In a parallel universe, the Actor tells the Director what capabilities it has (details how TBD), and the Driver uses that information to populate its GUI with widgets. The user can immediately set a voltage and check the "output enabled" checkbox.

I'm not sure how much we can stretch the LEGO analogy, but for me the bricks are the Components (Actor, ...), and they fit together, and you can assemble something from them, without consulting device manuals to see if that new brick you found has the thing you need.

@bilderbuchi
Member

bilderbuchi commented Feb 12, 2023

Now, there is of course a wrinkle here: Python is not statically typed, so sometimes it will be hard to find out the type of values to exchange with some Driver interface (boolean, int, float, ...), and thus hard to fully populate widgets without some additional information.
We have a similar problem in pymeasure, which is why it would be good to standardize property docstrings to include the type -- then the PymeasureActor can infer that information automatically. We could also use dicts to codify and look up that information. I think this is a tractable problem.
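A rough sketch of that inference (the docstring convention and the pattern below are assumptions for illustration, not pymeasure's actual format):

```python
import re

TYPE_MAP = {"bool": bool, "int": int, "float": float, "str": str}

def infer_type(docstring):
    """Extract a declared type from a docstring like 'Control the voltage (float in V).'"""
    match = re.search(r"\((bool|int|float|str)\b", docstring)
    return TYPE_MAP[match.group(1)] if match else None

assert infer_type("Control the output voltage (float in V).") is float
assert infer_type("Control the output state (bool).") is bool
assert infer_type("An undocumented property.") is None
```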

@bilderbuchi
Member

bilderbuchi commented Feb 12, 2023

In conclusion, I guess we could allow an Actor to not include schemata for the instrument driver packages (or single-model code) whose API it interfaces, but I would consider that to offer "degraded experience" in LECO. So, optional/MAY if you prefer, but IMO we should plan/design the base case to include schema exchange.

If we want to use something other than Avro to achieve this, that's fine, too. I suspect we (and others who might code LECO implementations) will have to implement much of the machinery that would come with Avro ourselves. Ultimately it will be more, and duplicated, work, but maybe it makes sense because we only need a very small subset of Avro's capability.
https://json-schema.org/ comes to mind.
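To illustrate the idea with stdlib-only code (a real implementation would likely use the jsonschema package; the schema content here is invented):

```python
# A toy "schema": parameter names mapped to their expected Python types.
schema = {"voltage": float, "output_enabled": bool}

def validate_command(command):
    """Reject commands with unknown parameters or wrong value types."""
    for name, value in command.items():
        expected = schema.get(name)
        if expected is None:
            raise KeyError(f"unknown parameter: {name}")
        if not isinstance(value, expected):
            raise TypeError(f"{name} expects {expected.__name__}")

validate_command({"voltage": 12.5})  # valid: passes silently
try:
    validate_command({"powre": 5})  # typo: rejected before anything is sent
except KeyError as exc:
    assert "powre" in str(exc)
```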

@BenediktBurger
Member Author

Where is the difference between looking at the pymeasure API documentation or looking at the schema returned by an actor?
Basically, the schema contains the same information as the api documentation, doesn't it?

I see the advantages of a schema (I was never against a command returning a schema), so I am for "an Actor SHOULD provide a schema of its capabilities".

For the data protocol, schemata might be more difficult, but that is another story.

@bilderbuchi
Member

bilderbuchi commented Feb 12, 2023

Where is the difference between looking at the pymeasure API documentation or looking at the schema returned by an actor?

The schema can be processed/consumed by a program, i.e. the Director code or Actor code. Also, the schema is consistent across all instrument libraries: pymeasure, instrumentkit, etc.

Basically, the schema contains the same information as the api documentation, doesn't it?

Yes, maybe more (type information), and it's homogeneous across instrument driver libraries (same point as above).

@bklebel
Collaborator

bklebel commented Feb 12, 2023

I guess 80% of pymeasure's value proposition is that users do not have to look that up in the manual, as it's a tab-completion away.

I might be a bit old-fashioned, but I currently do not use an IDE with tab completion for Python, and for pymeasure I typically look at the source code, or at least at the instrument docs. Even so, for completions I already need some idea of what the instrument's contributors would have called something, and though it helps avoid typing mistakes, I do not magically get exactly what I want/need in a particular instance. If I have missed out on existing magic like this hitherto, and you can point me to it, I would be pleased.
I agree that if we have a clear schema, this might be resolved into a clean structure of attributes by the Director. However, in that case you definitely assume an interactive environment where the information is actually present in the Director object. If I am writing a "static" Director, which should e.g. consume a sequence definition and tell all Actors how to behave, I would rather look up the API definition of the Actor/Driver than open an interactive session, spin up a Director, and start querying the Actor for what it might be capable of.
And since quite often I find the documentation of Python packages to have only limited scope and quality, I tend to actually look at the source code, especially if I want to use it to control some hardware which I might break (e.g. thinking of an SC magnet).

I guess I agree that this discussion is fundamentally about how much and what should be automatic (for example, autogenerated GUIs for Directors would rely on Actors providing this information). However, that also touches on the scope we want to reach, the amount of work we are prepared to invest, the size this protocol would grow to, and through that the entry threshold for using it.

@bilderbuchi
Member

I might be a bit old-fashioned, but I currently do not use an IDE with tab completion for Python, and for pymeasure I typically look at the source code, or at least at the instrument docs. Even so, for completions I already need some idea of what the instrument's contributors would have called something, and though it helps avoid typing mistakes, I do not magically get exactly what I want/need in a particular instance. If I have missed out on existing magic like this hitherto, and you can point me to it, I would be pleased.

I'm not sure what you mean by that last sentence, but let's say you have a Keithley2260B PSU, know nothing except that this pymeasure thingy has a driver for it, and want to know how to talk to it. For this you only need an IPython REPL, not an IDE. I pressed TAB after the period:
[screenshot: tab-completion suggestions listing the driver's attributes, including power]

To find out more about that power property you noticed:
[screenshot: IPython "?" help output for the power property]

@bilderbuchi
Member

bilderbuchi commented Feb 12, 2023

I agree that if we have a clear schema, this might be resolved into a clean structure of attributes by the Director. However, in that case you definitely assume an interactive environment where the information is actually present in the Director object.

I am definitely not assuming an interactive environment (well, it depends on your definition of "interactive"), and the information does not have to be present in the Director object (that might be convenient, though, and would be the case for Avro, as that is part of the handshake), but the Director can get that information from a connected Actor when required.

If I am writing a "static" Director, which should e.g. consume a sequence definition and tell all Actors how to behave, I would rather look up the API definition of the Actor/Driver [...]

Excellent example! If your Director consumes a sequence definition, and if your Director has an idea of the capabilities of the Actor (by getting a capability description via a schema), the Director can check your sequence definition against that schema when you load it, and will be able to determine that on line 135 you mis-spelled powre instead of power, and tell you that that is an invalid thing to query. If this is not the case, you will find out at runtime, maybe a couple of hours into your experiment's run.
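That load-time check could be as small as this sketch (the capability names are hard-coded here for illustration; in practice they would come from the Actor's schema):

```python
CAPABILITIES = {"power", "emission", "wavelength"}  # invented example set

def check_sequence(sequence):
    """Return (line_number, name) pairs for parameters the Actor does not know."""
    return [
        (number, name)
        for number, (name, _value) in enumerate(sequence, start=1)
        if name not in CAPABILITIES
    ]

sequence = [("emission", True), ("powre", 5.0)]  # note the typo on line 2
assert check_sequence(sequence) == [(2, "powre")]
```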

Mixing languages is something we want to enable by communicating via zmq. How would you look up the API definition of an Actor that is implemented in C++, and for which you might only have a compiled shared library?

In any case, maybe a misconception: The schema thing is not necessarily/primarily for you (the user), but for the Components of the protocol, to be able to automatically use information for the good of the user (check sequence definitions, reject invalid inputs, autocreate GUIs,...).

@bilderbuchi
Member

bilderbuchi commented Feb 12, 2023

Anyway, it looks to me like you are not convinced of the benefits of mandating/baselining a schema/capability description for Actors.
Before we derail this discussion further, I'll take out an action for myself to write up a more complete description of what I mean and why I think it should be done, and maybe find a way to prototype/demonstrate this (in finite time :D). - #45

@BenediktBurger
Member Author

To find out more about that power you notice

Oh nice: I did not know that you could use the question mark in that way.

@bklebel
Collaborator

bklebel commented Feb 12, 2023

I'm not sure what you mean with that last sentence, but let's say you have a [...]

It turns out I was looking for quite exactly what you showed, with which I was not familiar. Sadly I tend not to be 100% up to date with nice tools, and it seems I have quite totally missed out on IPython; I should look into it. What is tied to that, however, is that at least in the labs I am working in, the typical person who would work on something like that is even less knowledgeable than me (in Python), and their supervisors increasingly so. What most people know is "this is the code in this script here, and you can execute it by writing 'python script.py' in the terminal", and not necessarily much more with regard to convenience tools like the IPython REPL.

Still, the example you showed is what I call interactive. But if we can get Directors to be that simple to use, with the Actors passing all the necessary information, it could be workable to query device capabilities like that, keeping a live Director session open to guide the development of the static script. However, again, spinning up a Director involves having all the configuration at hand, which is one more thing to do. Finding out about these possibilities is again one more thing to do, and not always the most probable outcome.

Mixing languages is something we want to enable by communicating via zmq. How would you look up the API definition of an Actor that is implemented in C++, and for which you might only have a compiled shared library?

That is so far the most convincing argument for me - indeed, having a way of requesting the capabilities in a more or less straightforward manner would be great.

the Director can check your sequence definition against that schema when you load it, and will be able to determine that on line 135 you mis-spelled powre instead of power

That too is a good example of how it could be very nice. I am not sure it would come out like this, though.

@bklebel
Collaborator

bklebel commented Feb 12, 2023

Do I understand you correctly that to get a setup running with pyleco, users of a laser system will have to

  1. Study the manual to learn what their device can do

Well, my approach starts a little differently:

  1. I want to achieve a specific thing in the laboratory, and need to find out how I can do that.
  2. I search all the Devices in the laboratory which are currently not in use, to see whether one of them might have the capability which I need - that already involves reading some of their manuals, except maybe if it is a very basic power supply.
  3. In case I cannot find such a Device, I need to go and see whether I can buy one which suits my needs - again, I would need to make sure the Device can do what I need, before I spend a significant amount of money (many of the Devices for which pymeasure sports Drivers do not come cheap). Again, I am reading manuals.
  4. Now, whether bought or found, I have a Device in hand, which can typically do a lot of stuff, of which I need only a small subset. I will search for a Driver which implements exactly those functions which I need (and yes, quite often they will be the ones many others need too), and look them up in their API definition.

At this point, I have already studied the manuals and the API definition - they will be somewhat familiar to me.

In a parallel universe, the Actor tells the Director what capabilities it has (details how TBD), and the Driver uses that information to populate its GUI with widgets. The user can immediately set a voltage and check the "output enabled" checkbox.

Yes, well, I agree, that sounds quite nice. However, in my parallel universe, I have money to hire a scientific software engineer who will deal with that in their way.
In my actual universe, much of the code written in the laboratories I know is written by people who have barely ever developed any kind of code. Maybe they had one semester of C/C++ (which they barely survived) a few years ago. If I am lucky they are actually interested in programming, but it may well be that they are not, and they just have to do it themselves because there is nobody properly qualified to do this technical part for them. So they throw together some garbage; somehow it works, but the next person who has to work with it will just put everything in the trash and start over. There is zero documentation (possibly the docstring has something halfway meaningful, if it exists), and the code is developed by somebody who went through a Python tutorial up to the point where they know the most common types and how to define functions, and then off they go. More experienced colleagues (if they exist in that context, and did not leave academia/the lab once they had acquired a certain level of skill) will be working on different tasks. I even have quite a hard time convincing people to use a framework like pymeasure in the first place, instead of just putting together a few functions over a plain serial connection.

I am not sure whether this is the same around the globe; maybe we are an outlier here (I would hope so, but I am not sure). It would be nice to have more data on who would use this and how, to tailor it to the specific scope, need, and probability of it being used.

@bklebel
Collaborator

bklebel commented Feb 12, 2023

Anyway, it looks to me like you are not convinced of the benefits of mandating/baselining a schema/capability description for Actors.

I am not sure this is so much the case; I am just trying to wrap my head around it, and I can only extrapolate from my own experiences until somebody takes the time to bring in theirs, or explains a bit more, like you did, which I am quite happy about! I am looking forward to the content of #45!

Also, I have to be honest: when I argue that if we ask people to write Avro-compliant Actor behavior they will run away screaming, I do feel like running away from Avro myself. In fact, I have been quite actively running away from it, to the effect of never having got past the first few lines of the specification document. I am repeating myself; I should take a deeper look into it, I guess...

@bilderbuchi
Member

To find out more about that power you notice

Oh nice: I did not know that you could use the question mark in that way

If you use two question marks, you get more info, mainly the source code. 😀

@BenediktBurger
Member Author

Well, my approach starts a little differently:
...
In my actual universe, much of the code written in the laboratories I know is written by people who have barely ever developed any kind of code. Maybe they had one semester of C/C++ (which they barely survived) a few years ago,

That is also my experience: I know my devices, as I have to study their capabilities before I use them anyway.

I'm managing the experiment and have a few students doing their Bachelor's or Master's theses (so a high rotation of short-term collaborators). Therefore, I mainly maintain the code for the instrument interaction myself, as I want to use it in the long run.

Also, I have to be honest, when I argue along the lines that if we ask people to write Avro-compliant Actor behavior they will run away screaming, I do feel like running away from Avro myself. In fact, I have been quite actively running away from it, to the effect of never having passed the first few lines of the specification document. I am repeating myself, I should take a deeper look into it I guess....

I have looked at the Avro specification, but I am unsure yet how it will work out in the end. I want to try it out, but finishing the Transport Layer definition (and my day job) is my priority.

I have a working version of an Actor, small Director and Coordinator (according to the current PR state), so they could serve to try Avro out. As a message layer (for the control messages) I use json.

@bilderbuchi
Member

In my actual universe, much of the code written in the laboratories I know is written by people who have barely ever developed any kind of code. Maybe they had one semester of C/C++ (which they barely survived) a few years ago. If I am lucky they are actually interested in programming, but it may well be that they are not, and they just have to do it themselves because there is nobody properly qualified to do this technical part for them. So they throw together some garbage; somehow it works, but the next person who has to work with it will just put everything in the trash and start over. There is zero documentation (possibly the docstring has something halfway meaningful, if it exists), and the code is developed by somebody who went through a Python tutorial up to the point where they know the most common types and how to define functions, and then off they go. More experienced colleagues (if they exist in that context, and did not leave academia/the lab once they had acquired a certain level of skill) will be working on different tasks. I even have quite a hard time convincing people to use a framework like pymeasure in the first place, instead of just putting together a few functions over a plain serial connection.

I am not sure whether this is the same around the globe; maybe we are an outlier here (I would hope so, but I am not sure).

I'm pretty sure it's approximately the same everywhere -- what you are describing is what people call "programming in academia".

Interfacing between this world and "proper" software engineering is the job of "research software engineers". Fun fact, this turned out to be part of my tasks in the previous job (Python "guru"--not my word, got called that-- for 15 experimentalists), as well as in the current one (outside academia now, still in R&D).

@bklebel
Collaborator

bklebel commented Feb 13, 2023

I'm pretty sure it's approximately the same everywhere -- what you are describing is what people call "programming in academia".

Ok, I suspected as much... Then our question might be whom we want to "target" with leco and pyleco: research software engineers, or BSc/MSc/PhD students? Can we do both - something simple for students, which can be extended into something nice by engineers? Or at least once they become research software engineers? But once something works somehow, will the engineers really go and make it nice, or will they rather be busy patching up whatever does not work for somebody who cannot fix it themselves? Sadly we do not have a research software engineer in my group (as a dedicated position, although it might be half of my night-job besides doing a PhD), nor at my institute (although perhaps somebody in the whole institution has the money to pay for one), so I cannot judge the typical scenarios that unfold.

I understand that as a research software engineer, one might not be the one designing the experiment, so the first time somebody looks at the manual might be when they already need to develop a Driver.

@bilderbuchi
Member

bilderbuchi commented Feb 14, 2023

Can we do both - something simple for students, which can be extended into something nice by engineers?

Well, ideally yes. Who is pymeasure "for", in your opinion? The nice thing about coding is that you can hide complexity ("abstract it away") from others as needed!

  • Using a premade pymeasure Instrument should not be taxing for anybody who can speak Python somewhat.
  • Writing a pymeasure instrument driver takes a bit more skill, but not much, depending on the device model (simple SCPI instrument vs. multichannel byte-based VNA, e.g.).
  • Extending pymeasure (e.g. to accommodate multiple channels, Prologix stuff, etc.) takes yet more skill to do well.

These are three different levels for three different kinds of people, and imo it works out fine.

Btw, this is also why I don't fully get your reluctance about Avro w.r.t. inexperienced students -- ideally, the students use pyleco and have no contact with Avro at all (maybe aside from writing a small dictionary to help define Actor Parameter type info).

@BenediktBurger
Member Author

I took another look at json-schema and I like it, except for the fact that method calls are missing.

open-rpc is an interface description format for JSON-RPC (similar to what json-schema is for JSON documents); JSON-RPC in turn is a standard for encoding procedure calls.
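For illustration, a JSON-RPC 2.0 request/response pair is quite compact; a minimal sketch using the stdlib json module (the method name and parameters are hypothetical):

```python
import json

# JSON-RPC 2.0 request: call a (hypothetical) method with named parameters.
request = json.dumps({
    "jsonrpc": "2.0",
    "method": "set_property",
    "params": {"property1": 7},
    "id": 1,
})

# A matching success response carries the same "id" as the request.
response = json.dumps({"jsonrpc": "2.0", "result": None, "id": 1})

print(request)
print(response)
```
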

I propose to

  • use open-rpc for encoding method calls,
  • either have a default method "get" / "set" for getting/setting properties
  • or use a different (LECO) command type for getting/setting properties according to json-schema

Matthew Burkard has published Python packages to send JSON-RPC requests (client side) and to handle them (server side): jsonrpc2-objects, jsonrpc2-pyclient, openrpc

@bilderbuchi
Member

Thanks for these links! I'll try to dig into them.

@BenediktBurger
Member Author

I found protocol buffers. Maybe they can serve as message content:
https://protobuf.dev/overview/

@bilderbuchi
Member

I guess there are N → ∞ serialisation formats out there. I haven't investigated the trade-offs w.r.t. protobuf, but I'd assume it's a rather large dependency, and it comes from Google, which I'm a bit hesitant about.
The initial attraction of Avro for me was that we get RPC with the same package. It seems that that hasn't caught on so well, so we have more freedom (for better or worse).
Maybe we can kick that can down the road, and stay with plaintext (and/or json) for the first iteration, and deal with the serialisation format when we have more experience with a practical implementation and can make a better-educated guess?

@BenediktBurger
Member Author

btw. json-rpc already includes a "conversation_id", which would be redundant with the one in the content header frame.

@BenediktBurger
Member Author

To make the decision simpler, I propose the following setup:

Regarding properties, I have two ideas:

  1. We have a single "get" and a single "set" method, which you can call via RPC (no property description via open-rpc possible).
  2. For each property we create a "get_property" and a "set_property" method (maybe with a period instead of an underscore). The advantage is that all properties can be described by open-rpc.

My favourite version is version 2.
Advantages:

  • The whole message serialization/deserialization is defined by simple standards. These standards could be implemented easily by ourselves; however, good Python packages for json-rpc and open-rpc by Matthew Burkard are already available on gitlab.
  • We do not need to use two different standards (and some way of combining them) for method calls and property access.

The only remaining question is how we name our getters and setters:

  • With a leading underscore?
  • With an underscore or with a period separating "get" from the property name?

I'm also open to allowing other serialization schemes (with a marker in the content type header, see #53 ), but I like the idea of having a human-readable, simple scheme for the network itself (sign-ins etc.) and as a default implementation.

Do you see any problem with json-rpc, @bilderbuchi , @bklebel ?

Btw. the specification of json-rpc is a really quick read and contains some examples.
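A rough sketch of what version 2 could look like in Python (all names hypothetical, and the error handling JSON-RPC mandates is omitted): the Actor generates one "get_..."/"set_..." method pair per property and dispatches incoming JSON-RPC requests to them.

```python
import json

class Actor:
    """Sketch of proposal 2: one getter/setter method pair per property."""

    def __init__(self):
        self._properties = {"property1": 7}  # hypothetical instrument state
        self._methods = {}
        # Generate a get/set method pair per property, describable via open-rpc.
        for name in self._properties:
            self._methods[f"get_{name}"] = lambda n=name: self._properties[n]
            self._methods[f"set_{name}"] = (
                lambda value, n=name: self._properties.__setitem__(n, value)
            )

    def handle(self, raw: bytes) -> bytes:
        """Dispatch one JSON-RPC request to the registered method."""
        request = json.loads(raw)
        result = self._methods[request["method"]](*request.get("params", []))
        return json.dumps(
            {"jsonrpc": "2.0", "result": result, "id": request["id"]}
        ).encode()

actor = Actor()
actor.handle(b'{"jsonrpc": "2.0", "method": "set_property1", "params": [42], "id": 1}')
reply = actor.handle(b'{"jsonrpc": "2.0", "method": "get_property1", "id": 2}')
```
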

@BenediktBurger
Member Author

As a consequence, the question about the message types (#29 ) turns into the question of which method calls each Component should offer. For example, a Coordinator should allow sign-ins/sign-outs and an Actor should allow getting/setting some properties.

btw. json-rpc already includes a "conversation_id", which would be redundant with the one in the content header frame.

As you can do more than one method call in a single message (and they can be handled in parallel and in any order!), these individual "id"s are complementary to our message's conversation_id.
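To illustrate: a JSON-RPC batch carries one "id" per call, while the zmq message around it carries a single conversation_id in its header, so the per-call ids only disambiguate calls within one conversation. A sketch (method names hypothetical):

```python
import json

# A JSON-RPC batch: several calls in one message, each with its own "id".
batch = json.dumps([
    {"jsonrpc": "2.0", "method": "get_property1", "id": 1},
    {"jsonrpc": "2.0", "method": "set_property1", "params": [7], "id": 2},
])

# Responses may arrive in any order; they are matched to requests by "id".
responses = [
    {"jsonrpc": "2.0", "result": None, "id": 2},
    {"jsonrpc": "2.0", "result": 5, "id": 1},
]
by_id = {response["id"]: response for response in responses}
```
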

@BenediktBurger
Member Author

One reason to implement "proper leco protocol" messages (sign in, etc.) as remote method calls as well:

Users of leco won't ever need to care about message formatting. They only care about calling procedures remotely.

Leco procedures will do the transition from call to message and from message to call.

Even writing a new actor will be easy, as users have to offer certain methods (getting / setting properties, locking resources...), but don't have to react directly to messages.

@bilderbuchi
Member

* We do not need to use two different standards (and some way of combining them) for method calls and property access.

If I understand correctly (and to summarise back),

  • we define (as per Message type collection #29) LECO messages that are exchanged between Components
  • some of these messages are for getting/setting Parameter values, and calling Actions
  • you propose using json-rpc to trigger method calls in a LECO implementation (e.g. pyleco)
  • you propose using open-rpc to
    • specify valid messages (and their details) for Components
    • specify an Actor's capabilities (Parameter get/setting, Action calling)
  • An Actor (e.g. for pymeasure instruments) implemented in a LECO implementation (e.g. pyleco) takes care of the mapping between LECO method calls (controlled via json-rpc) and whatever the respective Driver has to do (read a property myprop, call a getter getmyprop, ...)

This sounds good to me so far. Whether an instrument library (Driver) used with LECO uses properties for its Parameters or not is abstracted away behind the Actor, so it boils down to how we map our messages/message types to methods.
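That abstraction can be sketched in a few lines (Driver attribute names hypothetical): the Actor reads a Parameter the same way regardless of whether the Driver exposes it as a plain attribute/property or as a getter method.

```python
class Driver:
    """Hypothetical instrument driver: one plain attribute, one getter method."""
    myprop = 5

    def getmyprop(self):
        return self.myprop * 2

class Actor:
    """The Actor hides the Driver's access style behind one mapping."""

    def __init__(self, driver):
        self.driver = driver

    def get_parameter(self, name: str):
        # Whether `name` is an attribute or a getter is the Actor's concern,
        # not the remote caller's.
        attribute = getattr(self.driver, name)
        return attribute() if callable(attribute) else attribute

actor = Actor(Driver())
```
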

The only remaining question is how we name our getters and setters:

* With a leading underscore?

* With an underscore or with a period separating "get" from the property name?

I'd lean towards the latter. We have to take care that in the list of possible methods we don't get a collision between our LECO messages and any Actor's Parameter/Action names, so prefixing Parameter stuff with GET_ and SET_, and Action stuff with CALL_, seems reasonable.

On one GET(parametername) method vs. generating many GET_XY() methods for Actors:
The latter bloats our set of messages, but we can more easily describe details about a given Parameter (type, valid values, description, ...), so this sounds attractive.
We can probably drop LIST_PARAMS/LIST_ACTIONS if we go that way, as that information should already be contained in an exchanged open-rpc spec.

@bilderbuchi
Member

  • All messages (including sign-in etc.) are remote procedure calls (RPC).

I realised I'm not sure about this. Do our messages always map cleanly to method calls on the receiving component?

For one example, let's say a Component sends a SIGN_OUT message to a Coordinator. This means that the Coordinator needs to have a SIGN_OUT method, but this message does not sign out the Coordinator (as you could assume); it signs out the sending component instead. Furthermore, the component needs to supply its ID as a parameter to that method call as per the RPC approach (so that the Coordinator knows which Component to sign out), instead of letting the Coordinator pick that information out of the message it just received.
Are we not needlessly restricting ourselves with this approach? Is there something I'm misunderstanding?

I can definitely see the benefit of using some schematised approach to describe our actors' capabilities, and to encode our messages. Maybe the RPC thing is not that applicable to all messages? Maybe only in some way for the interaction with an Actor, but even then I'm not sure of the benefit of RPC vs. having a known set of Actor-internal methods/functions to react to messages?

@BenediktBurger
Member Author

BenediktBurger commented Jun 3, 2023

I realised I'm not sure about this. Do our messages always map cleanly to method calls on the receiving component?

Thanks for finding that difficulty. Yes, doing sign-in/out via RPC would need some extra code outside pure RPC in the Coordinator, which needs that extra information not contained in the RPC call itself.

So there we need some special messages.

@BenediktBurger
Member Author

I'd lean towards the latter. We have to take care that in the list of possible methods we don't get a collision between our LECO messages and any Actor's Parameter/Action names, so prefixing Parameter stuff with GET_ and SET_, and Action stuff with CALL_, seems reasonable.

In my current actor implementation, each parameter or method call is directed towards the instrument, unless you specify explicitly that you want to talk to the actor instead of the instrument.

We could use "actor.xy" for the "xy" method of the actor instead of the instrument method.

That way we do not have to prefix a call with "call_".
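A minimal sketch of that routing idea (class and method names hypothetical): anything prefixed with "actor." is dispatched to the Actor itself, everything else goes to the instrument.

```python
class Actor:
    """Route "actor."-prefixed calls to the Actor, others to the instrument."""

    def __init__(self, instrument):
        self.instrument = instrument

    def shut_down(self):
        return "actor shutting down"

    def call(self, method: str, *args):
        if method.startswith("actor."):
            # Explicitly addressed to the actor: strip the prefix.
            target, name = self, method[len("actor."):]
        else:
            # Default: talk to the instrument.
            target, name = self.instrument, method
        return getattr(target, name)(*args)

class FakeInstrument:
    def shut_down(self):  # same name as the Actor method, different target
        return "instrument shutting down"

actor = Actor(FakeInstrument())
```
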

@BenediktBurger
Member Author

I'm not sure of the benefit of RPC vs. having a known set of Actor-internal methods/functions to react to messages?

In the end, calling these "Actor-internal methods" is already RPC: if I send this message, that method will be called. With a known RPC protocol, the translation is clearly defined. We could describe it ourselves, but then we would have more work and less compatibility (due to a proprietary format).
With RPC, we abstract the message away and the user specifies directly which method to call.

@bklebel
Collaborator

bklebel commented Jun 4, 2023

I went through the json-rpc specification, and it seems good for our purpose. Also, the OpenRPC implementation (at least the given examples) looks nice, simple, and straightforward. A small thing I observed: if we use OpenRPC from Matthew Burkard, we might be restricted by the way json-rpc methods are registered, i.e. through the function/method name. It might not be practical to use a period, since we cannot have that in a function name in Python after all.

I do not have much more to say right now about prefixes, separators, and so on; I will come back to it in the evening.

@BenediktBurger
Member Author

A small thing which I observed: if we use OpenRPC from Matthew Burkard, we might be restricted by the way json-rpc methods are registered, through the function/method name.

I looked at the code: you can give the "@method" decorator a name, even one with periods, so that's not a problem.
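The underlying idea, registering a function under an explicit RPC name that may contain periods, can be sketched independently of Matthew Burkard's package (this is not the actual openrpc API, just an illustration):

```python
registry = {}

def method(name=None):
    """Register a function under an explicit RPC name (may contain periods)."""
    def decorator(func):
        # Fall back to the function's own name if none is given.
        registry[name or func.__name__] = func
        return func
    return decorator

@method(name="actor.shut_down")
def shut_down():
    return "ok"
```
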

@BenediktBurger
Member Author

I looked through #29 , and all message types except for sign-in/sign-out (and their coordinator variants) could be implemented with RPC calls.

The remaining question is what the responses to these messages look like. For simplicity, I'd form the responses according to json-rpc.

@BenediktBurger
Member Author

having a known set of Actor-internal methods/functions to react to messages?

For all components, we would require a single method:

  • "live-check" responds with the state (for ping and "state-checking" purposes). (Or doing something different for the ping?)

For all components we could allow (they can have it, but do not have to have it):

  • a shutdown method
  • setting the logging level

Actors can offer some more methods of a standardized set:

  • "lock_resource" (with the unlock and force_unlock variants)
  • start/stop regular readout (and setting the readout interval)
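Sketched as classes, the standardized method set could look like this (all names hypothetical and up for discussion):

```python
class Component:
    """Sketch of the minimal method set discussed above."""

    log_level = "INFO"

    def live_check(self):
        """Required for all components: respond with the state (for pings)."""
        return "alive"

    # Optional for all components:
    def shut_down(self):
        return "shutting down"

    def set_log_level(self, level: str):
        self.log_level = level

class Actor(Component):
    """Actors may additionally offer locking and regular readout control."""

    locked_by = None

    def lock_resource(self, requester: str) -> bool:
        if self.locked_by is None:
            self.locked_by = requester
            return True
        return False

    def unlock_resource(self, requester: str) -> bool:
        if self.locked_by == requester:
            self.locked_by = None
            return True
        return False
```
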

@BenediktBurger
Member Author

A quote from the current protocol definition (LECO):

The Message layer is the common language to understand commands, thus creating a remote procedure call.

Isn't that an argument for RPC?

@BenediktBurger
Member Author

I thought about our desired "locking mechanism":
We want a Component to control another one remotely, where its sender name is the authorization for whether it may do the remote call or not (e.g. restricted motor access).
For that feature (and for sign_in/out) we need to make the full message available to any remotely called procedure.

For example: you call "sign_in()" (without parameters) of the coordinator, or some "move_motor(5)" of an actor. In both cases the recipient (Coordinator or Actor) accesses the internally stored original message to verify that the sender of the message may sign in / move that motor.

So for these features we need an instance variable with the message (or a global variable etc., depending on the language and implementation).
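A sketch of that mechanism in Python (names hypothetical): the Coordinator stores the incoming message in an instance variable before dispatching, so that "sign_in()" can read the sender name without it being an RPC parameter.

```python
import json

class Coordinator:
    """Stash the full message so RPC methods can read the sender."""

    def __init__(self):
        self.directory = set()
        self.current_message = None  # set before dispatching each call

    def handle(self, sender: bytes, content: bytes):
        # Store the original message, then dispatch the RPC call.
        self.current_message = (sender, content)
        request = json.loads(content)
        return getattr(self, request["method"])()

    def sign_in(self):
        # No parameters needed: the sender name comes from the stored message.
        sender, _ = self.current_message
        self.directory.add(sender)
        return True

coordinator = Coordinator()
coordinator.handle(b"componentA", b'{"jsonrpc": "2.0", "method": "sign_in", "id": 1}')
```
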

@BenediktBurger
Member Author

Update: I implemented the RPC approach (even for sign-in etc.) and it works as expected, so we can go totally with RPC.

@mcdo0486

mcdo0486 commented Jun 7, 2023

Just chiming in to say the RPC paradigm fits well with an experiment control and recording setup. json-rpc seems like a good first choice for proving out how the different pieces of the framework will communicate and work together.

I have reservations about a vanilla JSON protocol from a speed point of view, due to the additional parsing overhead, for example when recording data points. From an end user's perspective, if the protocol can't keep up with their measurements, they simply won't use it. This is a future problem, but "fast yet reliable" is one of the main goals of pyleco, so it is worth keeping in mind.

One of the benefits of Avro is its data packing for speed, besides cross-platform compatibility; however, it seems to be more opinionated about the core structure of the schema. I think this would be a fork in the road for Avro.

A possible next step, in the future, could be gRPC, which doesn't seem to have an opinionated structure for the schema and so it would be compatible with json-rpc.

In reality this is just the first choice, and as such it can be iterated on, but I wanted to bring up gRPC as an option for a fast implementation of a json-rpc schema.

@BenediktBurger
Member Author

One idea was that the header indicates which type of serialization the message contains. That allows more than one technique.
We started with a human-readable version, as that makes debugging easier. For the sign-in messages etc., speed (at that level) does not matter, so json can be the default.
Later we can allow some binary format, if both endpoints of the conversation understand it.

I'm not sure how much faster a binary protocol, which requires some serialization as well, is in comparison to plain json. The message size might be smaller, but what about the serialization/deserialization time?

@mcdo0486

mcdo0486 commented Jun 8, 2023

One idea was that the header indicates which type of serialization the message contains. That allows more than one technique. We started with a human-readable version, as that makes debugging easier. For the sign-in messages etc., speed (at that level) does not matter, so json can be the default. Later we can allow some binary format, if both endpoints of the conversation understand it.

I'm not sure how much faster a binary protocol, which requires some serialization as well, is in comparison to plain json. The message size might be smaller, but what about the serialization/deserialization time?

The only real way to know would be to run a test of pyleco with binary vs. plain json, but I'd be surprised if the binary protocol isn't 2-5x faster.
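Such a test is easy to sketch for a single data point, though the outcome depends heavily on the data shape and the binary format chosen; here plain json is compared against a fixed struct layout (field names made up):

```python
import json
import struct
import timeit

# Compare serializing one data point as JSON vs. a fixed binary layout.
point = {"timestamp": 1_686_000_000.5, "value": 3.14159}

def roundtrip_json():
    return json.loads(json.dumps(point))

def roundtrip_struct():
    # Network byte order, two doubles: timestamp and value.
    packed = struct.pack("!dd", point["timestamp"], point["value"])
    timestamp, value = struct.unpack("!dd", packed)
    return {"timestamp": timestamp, "value": value}

json_time = timeit.timeit(roundtrip_json, number=10_000)
struct_time = timeit.timeit(roundtrip_struct, number=10_000)
print(f"json: {json_time:.4f}s  struct: {struct_time:.4f}s")
```
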

@BenediktBurger BenediktBurger linked a pull request Jul 1, 2023 that will close this issue