x509-cert: Creating x509 certificates without allocations #689

Open
NicsTr opened this issue Jul 25, 2022 · 25 comments

Comments

@NicsTr

NicsTr commented Jul 25, 2022

Hello,

I'm interested in using x509-cert in an environment where no heap allocation is possible. I'm still new to Rust, but if I understand correctly, x509-cert works well in a no_std setting but currently requires an implementation of alloc (mostly to use Vec).

Do you think that having a "no_alloc" version of x509-cert is possible and desirable? The main drawback I see is that it would force the developer to set a maximum size for their X.509 certificate at compile time.

If you think it is a good idea to provide a version of x509-cert which does not require alloc, do you have a preferred approach for that (for example, replacing the use of Vec with heapless::Vec or tinyvec::ArrayVec)? If needed, I can work on it myself, but I'll be happy to follow any advice you can give me.

@tarcieri
Member

tarcieri commented Jul 25, 2022

The current implementation of the crate relies on alloc.

However, der is definitely designed for "heapless" use cases. We could potentially add *Ref equivalents of various types which could support heapless use cases, potentially with some slightly more complicated ergonomics.

Backing storage for SET OF is the trickiest problem. It can potentially be managed via references to arrays, although that works better for serialization than parsing.

For heapless X.509 parsing, a push parser may be a better approach so certificates can be validated on-the-fly as opposed to parsed into a data structure, however the der crate is presently pull-parser only.

It might be possible to make a heapless pull parser work using something like heapless::Pool as backing storage for SET OF.

Which use cases are you interested in? Parsing? Serialization?

@NicsTr
Author

NicsTr commented Jul 25, 2022

Thank you for your answer.

I am mostly interested in x509 serialization.

@tarcieri
Member

Serialization shouldn't be too hard to support.

It may require adding something like SetOfRef to the der crate which can be backed by a slice. That way you can provide the storage as arrays which are borrowed as slices.
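
As a rough illustration of that idea, here is a minimal, self-contained sketch of a slice-backed SET OF used for serialization only. It does not use the der crate: the `SetOfRef` type, its methods, and the choice to take pre-encoded TLVs are all assumptions made for this example, and only short-form lengths are handled.

```rust
/// Hypothetical slice-backed SET OF, for serialization only.
/// Each element is assumed to be a complete, already DER-encoded TLV.
pub struct SetOfRef<'a> {
    elements: &'a [&'a [u8]],
}

impl<'a> SetOfRef<'a> {
    /// Universal class, constructed, tag number 17 (SET / SET OF).
    const TAG: u8 = 0x31;

    /// Borrow caller-provided storage. Returns `None` unless the encoded
    /// elements are already in ascending bytewise order, which is the order
    /// DER requires for SET OF.
    pub fn new(elements: &'a [&'a [u8]]) -> Option<Self> {
        let sorted = elements.windows(2).all(|w| w[0] <= w[1]);
        sorted.then_some(Self { elements })
    }

    /// Total length of the encoded elements (the SET OF value).
    fn value_len(&self) -> usize {
        self.elements.iter().map(|e| e.len()).sum()
    }

    /// Encode into `out` and return the number of bytes written.
    /// Sketch limitation: only short-form lengths (< 128 bytes) are supported.
    pub fn encode(&self, out: &mut [u8]) -> Option<usize> {
        let len = self.value_len();
        if len > 127 || out.len() < len + 2 {
            return None;
        }
        out[0] = Self::TAG;
        out[1] = len as u8;
        let mut pos = 2;
        for e in self.elements {
            out[pos..pos + e.len()].copy_from_slice(e);
            pos += e.len();
        }
        Some(pos)
    }
}
```

The caller keeps the elements in an ordinary array, for example `[a, b]` where `a` and `b` are encoded INTEGERs, and passes `&storage` to `SetOfRef::new`, so no heap allocation is involved.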

@NicsTr
Author

NicsTr commented Jul 27, 2022

Great news. I will try implementing it, following what you advised.

Then, if I'm successful, I'll open a PR.

@NicsTr
Author

NicsTr commented Aug 18, 2022

Hi,

I tried to decouple parsing from serialization, in order to focus on serialization. To do so, I first implemented a SetOfRef struct in the der crate as advised. SetOfRef is a wrapper around a slice of an array of the underlying type, and it implements only what is needed for serialization.

Then, I tried defining *Ref (heapless) types for the Attribute* types in x509-cert with the same goal, and I quickly got stuck because the ASN.1 Sequence type is currently implemented as a trait that is defined as a subtrait of Decode and is also used to implement Encode and FixedTag. Thus, I'm having a hard time finding an elegant way to separate the parsing and serialization of Sequence. I'm not 100% comfortable with all the Rust concepts (e.g., specifics of the derive macros) and I'm not sure what the right way to proceed is from this point.

Do you have any more advice?

@tarcieri
Member

tarcieri commented Aug 18, 2022

You don't actually have to implement the Sequence trait, and in the next release of der we might actually get rid of it.

Instead just impl the EncodeValue and FixedTag traits for sequence-specific structs. In the next release we can look at supporting that via custom derive.
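
To make the shape of that suggestion concrete, here is a self-contained sketch. The two traits below are simplified local stand-ins that only borrow the names `EncodeValue` and `FixedTag`; they are not the der crate's actual definitions, whose signatures differ. `AttributeRef` and `encode_tlv` are likewise hypothetical names invented for this example.

```rust
/// Simplified stand-in, NOT the `der` crate trait: a type with one fixed tag.
pub trait FixedTag {
    const TAG: u8;
}

/// Simplified stand-in, NOT the `der` crate trait: encodes only the value
/// portion of a TLV, leaving the header to generic code.
pub trait EncodeValue {
    fn value_len(&self) -> usize;
    /// Write the value bytes to the front of `out`; returns bytes written.
    fn encode_value(&self, out: &mut [u8]) -> Option<usize>;
}

/// Hypothetical borrowed attribute: SEQUENCE { type OID, value ANY }.
/// Both fields are assumed to be pre-encoded TLVs.
pub struct AttributeRef<'a> {
    pub oid: &'a [u8],
    pub value: &'a [u8],
}

impl FixedTag for AttributeRef<'_> {
    const TAG: u8 = 0x30; // SEQUENCE
}

impl EncodeValue for AttributeRef<'_> {
    fn value_len(&self) -> usize {
        self.oid.len() + self.value.len()
    }

    fn encode_value(&self, out: &mut [u8]) -> Option<usize> {
        let n = self.value_len();
        let dst = out.get_mut(..n)?;
        let (a, b) = dst.split_at_mut(self.oid.len());
        a.copy_from_slice(self.oid);
        b.copy_from_slice(self.value);
        Some(n)
    }
}

/// Generic TLV encoder: header from `FixedTag`, body from `EncodeValue`.
/// Sketch limitation: short-form lengths only. No decoding trait is needed.
pub fn encode_tlv<T: FixedTag + EncodeValue>(v: &T, out: &mut [u8]) -> Option<usize> {
    let len = v.value_len();
    if len > 127 || out.len() < 2 + len {
        return None;
    }
    out[0] = T::TAG;
    out[1] = len as u8;
    let n = v.encode_value(&mut out[2..])?;
    Some(2 + n)
}
```

The point of the split is that a serialization-only struct never has to implement anything related to decoding.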

@DemiMarie

DemiMarie commented Mar 24, 2023

For SET OF, I recommend not sorting during serialization. You might well be able to get away with pre-encoded values or with handling sorting yourself before serializing.

Edit: Another option is to build the DER in reverse (from end to beginning) into a pre-allocated buffer effectively “by hand”, the way e.g. the TPM2 Reference Implementation does it.

@tarcieri
Member

@DemiMarie we've never sorted during serialization.

Sorting occurs at construction time, and the implementation has been changed to ignore out-of-order SET OF elements during deserialization, which is unfortunately necessary to support real-world use cases (though ideally we'd introduce some sort of lint which would fail in that case)

@DemiMarie

@tarcieri To elaborate: instead of building a data structure that is then serialized, it might use fewer resources to directly extend the buffer in back-to-front fashion in one pass.

@tarcieri
Member

The serialization logic in this crate relies on knowing the entire document structure in advance, which is needed to calculate the many, many nested length tags. It's effectively two pass: the first pass calculates the lengths, and the second pass calculates the document.
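
The two-pass idea can be sketched without the der crate. In the self-contained example below, the `Node` type and all function names are assumptions made for illustration, not the crate's API: the first pass computes every nested length from the borrowed tree, and the second pass writes tags, lengths, and values front to back.

```rust
/// A borrowed document tree: children live in caller-provided arrays.
pub enum Node<'a> {
    Primitive { tag: u8, value: &'a [u8] },
    Constructed { tag: u8, children: &'a [Node<'a>] },
}

/// Number of bytes a DER length field occupies for a value of `len` bytes.
fn length_field_len(len: usize) -> usize {
    if len < 0x80 {
        1
    } else {
        1 + (usize::BITS as usize - len.leading_zeros() as usize + 7) / 8
    }
}

/// Write a DER length field at `out[pos..]`; returns the next free index.
fn write_length(out: &mut [u8], pos: usize, len: usize) -> usize {
    if len < 0x80 {
        out[pos] = len as u8;
        return pos + 1;
    }
    let n = length_field_len(len) - 1; // number of big-endian length octets
    out[pos] = 0x80 | n as u8;
    let be = len.to_be_bytes();
    out[pos + 1..pos + 1 + n].copy_from_slice(&be[be.len() - n..]);
    pos + 1 + n
}

impl Node<'_> {
    fn tag(&self) -> u8 {
        match self {
            Node::Primitive { tag, .. } | Node::Constructed { tag, .. } => *tag,
        }
    }

    /// Pass 1: length of this node's value, computed recursively.
    fn value_len(&self) -> usize {
        match self {
            Node::Primitive { value, .. } => value.len(),
            Node::Constructed { children, .. } => {
                children.iter().map(|c| c.encoded_len()).sum()
            }
        }
    }

    /// Pass 1: length of the whole TLV (tag + length field + value).
    pub fn encoded_len(&self) -> usize {
        let v = self.value_len();
        1 + length_field_len(v) + v
    }

    /// Pass 2: write this TLV at `out[pos..]`; returns the next free index.
    fn write(&self, out: &mut [u8], mut pos: usize) -> usize {
        out[pos] = self.tag();
        pos = write_length(out, pos + 1, self.value_len());
        match self {
            Node::Primitive { value, .. } => {
                out[pos..pos + value.len()].copy_from_slice(value);
                pos + value.len()
            }
            Node::Constructed { children, .. } => {
                for child in children.iter() {
                    pos = child.write(out, pos);
                }
                pos
            }
        }
    }

    /// Check the buffer size using pass 1, then run pass 2.
    pub fn encode(&self, out: &mut [u8]) -> Option<usize> {
        if out.len() < self.encoded_len() {
            return None;
        }
        Some(self.write(out, 0))
    }
}
```

Note that the whole tree must stay borrowed until `encode` returns, and that this sketch recomputes lengths at every level in the second pass rather than caching them.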

Off the top of my head I can't see a way to build a sort of on-the-fly serializer which works in two passes like that.

@DemiMarie

@tarcieri One can avoid this by serializing in reverse order. In the case of an X.509 certificate, this would mean serializing the signature first and the length field last. The result is that length tags do not need to be calculated until after the things they are lengths of have already been serialized. The TPM 2.0 reference implementation does exactly this.
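
A minimal sketch of that back-to-front technique follows. It is not taken from the TPM 2.0 reference implementation or from any crate in this repo; `ReverseWriter` and its methods are hypothetical names. The writer fills a pre-allocated buffer from its end, so each header is written only after the content it describes.

```rust
/// Writes DER back-to-front into a caller-provided buffer.
pub struct ReverseWriter<'a> {
    buf: &'a mut [u8],
    /// Index of the first written byte; output occupies `buf[start..]`.
    start: usize,
}

impl<'a> ReverseWriter<'a> {
    pub fn new(buf: &'a mut [u8]) -> Self {
        let start = buf.len();
        Self { buf, start }
    }

    /// Number of bytes written so far.
    pub fn written(&self) -> usize {
        self.buf.len() - self.start
    }

    /// Copy `bytes` directly in front of the existing output.
    pub fn prepend(&mut self, bytes: &[u8]) -> Option<()> {
        let new_start = self.start.checked_sub(bytes.len())?;
        self.buf[new_start..self.start].copy_from_slice(bytes);
        self.start = new_start;
        Some(())
    }

    /// Prepend a tag and DER length for `content_len` bytes already written.
    pub fn prepend_header(&mut self, tag: u8, content_len: usize) -> Option<()> {
        if content_len < 0x80 {
            return self.prepend(&[tag, content_len as u8]);
        }
        // Long form: big-endian length octets without leading zeros.
        let be = content_len.to_be_bytes();
        let skip = be.iter().take_while(|b| **b == 0).count();
        self.prepend(&be[skip..])?;
        self.prepend(&[tag, 0x80 | (be.len() - skip) as u8])
    }

    /// Prepend a complete primitive TLV: value first, then its header.
    pub fn prepend_tlv(&mut self, tag: u8, value: &[u8]) -> Option<()> {
        self.prepend(value)?;
        self.prepend_header(tag, value.len())
    }

    /// Consume the writer and return the finished encoding.
    pub fn finish(self) -> &'a [u8] {
        let Self { buf, start } = self;
        &buf[start..]
    }
}
```

To build SEQUENCE { INTEGER 5, OCTET STRING "hi" }, the caller prepends the OCTET STRING, then the INTEGER, then calls `prepend_header(0x30, w.written())`. Each input is borrowed only for the duration of its own call.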

@tarcieri
Member

I guess that's a possibility, but it would be a very radically different approach from the current implementation and couldn't reuse any of the existing traits

@DemiMarie

ASN.1 objects also have a LOT of constant strings. For instance, it is much faster to serialize an AlgorithmIdentifier by memcpy() of a pre-encoded buffer, rather than by actually encoding the AlgorithmIdentifier.
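
As a small sketch of the pre-encoded approach: the constant below is the DER encoding of an AlgorithmIdentifier for ecdsa-with-SHA256 (OID 1.2.840.10045.4.3.2, parameters absent), and the helper simply copies it into an output buffer. The names `ECDSA_WITH_SHA256` and `write_preencoded` are made up for this example.

```rust
/// DER encoding of AlgorithmIdentifier { ecdsa-with-SHA256 }, no parameters.
pub const ECDSA_WITH_SHA256: [u8; 12] = [
    0x30, 0x0a, // SEQUENCE, 10 content bytes
    0x06, 0x08, // OBJECT IDENTIFIER, 8 content bytes
    0x2a, 0x86, 0x48, 0xce, 0x3d, 0x04, 0x03, 0x02, // 1.2.840.10045.4.3.2
];

/// Copy an already-encoded TLV into `out` at `pos`.
/// Returns the index just past the copied bytes, or `None` if it does not fit.
pub fn write_preencoded(out: &mut [u8], pos: usize, encoded: &[u8]) -> Option<usize> {
    let end = pos.checked_add(encoded.len())?;
    out.get_mut(pos..end)?.copy_from_slice(encoded);
    Some(end)
}
```

This only applies when the identifier really is a fixed byte string.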

@tarcieri
Member

That helps when the AlgorithmIdentifier is actually a constant, but many of them are not. In PKCS#5 for example they contain things like (PB)KDF salts and cipher initialization vectors.

Anyway, that's a separate micro-optimization.

@tarcieri
Member

tarcieri commented Mar 25, 2023

I'll just say this: the crates in this repo tend to largely follow a "DOM" approach which "pull" parses/serializes types which represent a constructed document.

A "streaming" crate with a push parser/serializer would probably be a better API for heapless use cases, but also a radically different approach to the point it may be best to implement it as a separate crate.

@DemiMarie

How does WebPKI do things?

@tarcieri
Member

webpki is also a pull parser but one which functions similarly to how x509-cert did prior to #806 and #841 in that it borrowed from its input.

Since those PRs landed, x509-cert has gone even farther in the other direction in that it now owns all of its data and no longer has an input lifetime.

We made those changes to make things like certificate builders easier.

@mx-shift

I've been building up X.509 tooling for secure boot on a no_std, non-alloc embedded platform. I've been using x509-cert in some of those tools (https://github.com/oxidecomputer/pki-playground/ for generating test certificate hierarchies) as a lead-up to using x509-cert in a microcontroller bootloader (https://github.com/oxidecomputer/bootleby/, specifically oxidecomputer/bootleby#8). I had seen the no_std support and missed that alloc was currently required. A few coworkers and I are very familiar with the work required to support no-alloc and are willing to do that work if that's a direction this crate wants to go.

@DemiMarie

@kc8apf do you need certificate generation or just parsing?

If the latter, then https://github.com/barebones-x509/barebones-x509 implements no-alloc certificate parsing, including signature checking. It does depend on ring but I would be open to a PR to replace it.

@mx-shift

mx-shift commented Apr 13, 2023 via email

@baloo
Member

baloo commented Apr 20, 2023

https://github.com/oxidecomputer/pki-playground/ for generating test certificate hierarchies

@kc8apf Unrelated to the initial question, I'm definitely eager for feedback on the x509 builder (currently lives in master, it's not released yet).

@DemiMarie

@kc8apf would you be willing to make a PR to barebones-x509?

@MathiasKoch

MathiasKoch commented Mar 18, 2024

I find myself in a position where I need at least part of this, and I am willing to put in the effort, but I might need some input as to which way you would prefer to see this go.

Currently, all I need is alloc-less serialization; more specifically, I need to be able to serialize a CSR.
As a starting point, I have cloned and added an alloc feature to everything else, but ran into Name being the only one I couldn't just feature gate.

I have read through the related issues and PRs I could find, and I see you are considering yoke. What's the status on that?

I also see in #765 the discussion on duplicating implementations for *Ref, but not any conclusions?

All I need for now is the serialization part, so I am looking for the shortest path to that, but I would much rather contribute a bit more than I need than re-implement it in some private repo.

Ping @tarcieri

@tarcieri
Member

At this point given the level to which this crate has embraced an alloc-backed "DOM" model for X.509 certificates, I'm not sure it makes sense to try to shoehorn in a no-alloc certificate builder, or how much code such a builder could share with the existing implementation, if any.

I think every part of the crate needs to be designed around those requirements, which presently this is not.

One option which was identified earlier is changing how encoding works so the serializer starts off writing the end of the certificate/document into the end of the buffer, i.e. recursing down through the document tree to the leaf nodes and iterating over those right-to-left, effectively walking the tree in complete reverse from how it is now. With the lengths of the children known, they can be used to compute the lengths of the parents, walking all the way back up to the root of the document.
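
A rough sketch of such a reverse tree walk is shown below, under the assumption of a simple borrowed tree. `Node`, `put`, `put_header`, and `encode_rev` are all hypothetical names, unrelated to the der crate's traits. Children are visited right to left, and each parent header is written once the size of its content is known, so every node is visited exactly once.

```rust
/// A borrowed document tree: children live in caller-provided arrays.
pub enum Node<'a> {
    Primitive { tag: u8, value: &'a [u8] },
    Constructed { tag: u8, children: &'a [Node<'a>] },
}

/// Copy `bytes` so that they end at index `end`; returns their start index.
fn put(buf: &mut [u8], end: usize, bytes: &[u8]) -> Option<usize> {
    let start = end.checked_sub(bytes.len())?;
    buf.get_mut(start..end)?.copy_from_slice(bytes);
    Some(start)
}

/// Write tag + DER length in front of `len` content bytes that end the
/// header region at `end`; returns the header's start index.
fn put_header(buf: &mut [u8], end: usize, tag: u8, len: usize) -> Option<usize> {
    if len < 0x80 {
        return put(buf, end, &[tag, len as u8]);
    }
    // Long form: big-endian length octets without leading zeros.
    let be = len.to_be_bytes();
    let skip = be.iter().take_while(|b| **b == 0).count();
    let end = put(buf, end, &be[skip..])?;
    put(buf, end, &[tag, 0x80 | (be.len() - skip) as u8])
}

/// Encode `node` so that its last byte lands at `end - 1`.
/// Returns the index of its first byte, or `None` if the buffer is too small.
pub fn encode_rev(node: &Node<'_>, buf: &mut [u8], end: usize) -> Option<usize> {
    match node {
        Node::Primitive { tag, value } => {
            let start = put(buf, end, value)?;
            put_header(buf, start, *tag, value.len())
        }
        Node::Constructed { tag, children } => {
            // Walk children right to left; `start` moves toward the front.
            let mut start = end;
            for child in children.iter().rev() {
                start = encode_rev(child, buf, start)?;
            }
            // The content length is now known without a separate pass.
            put_header(buf, start, *tag, end - start)
        }
    }
}
```

The finished encoding occupies `buf[start..end]`, where `start` is the returned index, so the caller must either accept output at the tail of the buffer or move it afterwards.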

Another would be to have a different kind of builder API which borrows data everywhere and only for the duration of serializing that part of the document. Once the data has been written to the certificate-in-progress, the input no longer needs to be borrowed.

This is a fairly radical departure from how the der crate is designed, where every object in the document knows its children and how to calculate their length in advance, which currently involves owning but would otherwise require borrowing all inputs for the duration of serializing the entire document rather than just a particular TLV record.

While we could put such a builder in this crate, it might make more sense to prototype out-of-tree, at least at first.

@MathiasKoch

Fair enough.
I will copy what I deem relevant (mostly from the source tree prior to making Name an owned type), and leave a note to watch this issue. Hopefully I can eventually replace it with an upstream solution.
