UDL2 #2081

arg0d · 2024-04-25T10:07:29Z

The current UDL syntax is awkward in some cases, and the UDL parser is straight up hostile towards the user. The goal of this post isn't to provide a final proposal, but rather to concentrate discussions about the following problems in a single issue, and possibly refine some kind of action plan.

While the current UDL syntax itself is rather straightforward, it feels quite foreign given that UDL goes hand in hand with writing Rust code. I propose to overhaul the UDL syntax completely, to bring it more in line with Rust syntax and make it feel more native:

- rename certain types/keywords to match Rust, e.g.
  - dictionary -> struct
  - string -> String
  - sequence<> -> Vec<>
  - record<> -> HashMap<>
  - T? -> Option<>
  - etc.
- invert variable name and type order, i.e. `String name` -> `name: String`
- invert function/method return type declaration, i.e. `String hello_world()` -> `hello_world() -> String`
- separate "flat" enums and enums with associated data into distinct types, i.e. enum for "flat" enums, and union for enums with associated data
- etc.
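To make the type renames above concrete, here is a minimal sketch in Rust of the mapping applied to type names only. The helper names `udl_to_rust` and `split_top_level_comma` are hypothetical and are not part of uniffi. It covers just the `string`, `sequence<>`, `record<>` and `T?` cases from the list, and passes everything else through unchanged.

```rust
/// Translate a UDL type name into the Rust-style spelling listed above.
/// Hypothetical helper for illustration only; not part of uniffi.
fn udl_to_rust(ty: &str) -> String {
    let ty = ty.trim();
    // `T?` -> `Option<T>`
    if let Some(inner) = ty.strip_suffix('?') {
        return format!("Option<{}>", udl_to_rust(inner));
    }
    // `sequence<T>` -> `Vec<T>`
    if let Some(inner) = ty.strip_prefix("sequence<").and_then(|s| s.strip_suffix('>')) {
        return format!("Vec<{}>", udl_to_rust(inner));
    }
    // `record<K, V>` -> `HashMap<K, V>`
    if let Some(inner) = ty.strip_prefix("record<").and_then(|s| s.strip_suffix('>')) {
        if let Some((key, value)) = split_top_level_comma(inner) {
            return format!("HashMap<{}, {}>", udl_to_rust(key), udl_to_rust(value));
        }
    }
    match ty {
        "string" => "String".to_string(),
        // Anything not in the list is passed through unchanged.
        other => other.to_string(),
    }
}

/// Split `K, V` at the first comma that is not nested inside `<...>`.
fn split_top_level_comma(s: &str) -> Option<(&str, &str)> {
    let mut depth = 0usize;
    for (i, c) in s.char_indices() {
        match c {
            '<' => depth += 1,
            '>' => depth = depth.saturating_sub(1),
            ',' if depth == 0 => return Some((&s[..i], &s[i + 1..])),
            _ => {}
        }
    }
    None
}
```

For example, `udl_to_rust("sequence<string>?")` yields `Option<Vec<String>>`. A real UDL2 frontend would do this at the grammar level rather than on strings; the sketch only shows that the renames are a mechanical, one-to-one mapping.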

The current implementation of the UDL parser does not generate useful error messages, making it hostile towards users, especially new ones. Basic syntax errors, such as a missing semicolon or a misplaced comma, produce unhelpful error messages, leaving the user scratching their head and wasting time and energy figuring out manually what is wrong with the syntax in their UDL. Solving this issue requires a different parser, one that would generate clear and helpful error messages. The use of a different parser could be implemented as a standalone feature, without revamping UDL syntax.

With the proc-macro related changes that enable creation of ComponentInterface while still supporting UDL, I feel like supporting an adjacent implementation for a UDL2 syntax/parser should be quite easy. My current working idea is to use lalrpop to implement a new syntax and parser. From what I've seen, parsers implemented using lalrpop look really lean, and I believe that this would make it trivial to support UDL2. I'm going to try to find some time to create a POC, and come back to this issue with some working code.

Does this sound like something you agree with?


mhammond · 2024-04-25T13:18:33Z

Let me preface this by saying I'm not a fan of UDL for all the reasons you specify. I constantly fail to get the order of arg names and types correct, fail to remember how to specify a void function, fail to remember exactly what attributes I should use when, grumble very loudly when it tells me there is a syntax error but tells me literally nothing about what the problem is, etc. I'm also speaking only for myself and haven't discussed this opinion with anyone else on the team, so my opinion might not be shared with anyone else.

However, my concern with this proposal is that there will then be 3 different ways to spell out an interface and the docs are already messy enough trying to support 2. While UDL is messy, it does have the benefit of not being something specially invented just for this project - eg, idl syntax highlighters exist for vscode, markdown rendering etc. A new, custom language closer to rust but with zero external tooling etc and zero possibility of users having come across it before doesn't seem like a great win to me TBH. Personally, I'd push people towards proc-macros first (because then it's all in the same source and actually is Rust), UDL second (because that's what our docs focus on and because it's where our examples and tests could be found), so I wonder what the uptake would be?

My other concern would be the burden on people contributing: the most recent changes to UDL involved adding support for async constructors, enum discriminants, non-exhaustive enums, etc. If we didn't insist that all new features added support to all 3 of proc-macros, UDL and UDL2, I fear it would just get left behind, and somehow the docs would need to accurately reflect exactly what UDL2 supports and what it does not. If we did insist on contributors adding support for all 3, then IMO that would be an unfair burden and likely to turn contributors away. Having a new language syntax which only supported a subset of features would be unlikely to be successful IMO.

However, given how uniffi_udl is now its own crate, I can't see a reason why, with some cleanups and reorganization, support for this couldn't be maintained externally. At the end of the day, you would just be emitting metadata which is then consumed by uniffi_bindgen. This would leave the maintenance and "new feature support" burden on the external contributors, but that sounds more reasonable than having the burden lie on contributors to the core. At the very least, this would be a good way to experiment with this and once a working system is in place we could re-discuss whether to fold support for it into the core (although we'd want to be very clear that choosing to not fold it in would be a very real possibility)

mgeisler · 2024-05-02T10:25:53Z

> Personally, I'd push people towards proc-macros first (because then it's all in the same source and actually is Rust), UDL second

As a new user of UniFFI, I've so far completely avoided using the UDL. I know Rust and if at all possible, I would prefer to stay in that comfortable space 🙂 With #2004, it now seems that the proc macros support a syntax not supported by the UDL.

The natural question for me is whether the UDL can be deprecated and the proc macros made the primary interface?

arg0d · 2024-05-02T11:40:21Z

While I agree that proc-macros are generally superior to UDL, from my POV there are a couple of exceptions:

- UDL has proven quite valuable when supporting 4 different consumer teams (Kotlin, Swift, C#, Go), where many FFI questions are quickly resolved by referring the consumers to the UDL file. UDL provides a concise, up-to-date FFI definition for both developers and consumers. Without UDL, the only way to inspect the FFI is to either read through the Rust source code, or to read through the generated bindings code. While reading Rust source code isn't a problem for developers, it is not an option for library consumers who aren't familiar with Rust. Consumers can always read the generated bindings, but the generated bindings are quite verbose, and current generators do not generate any FFI reference/overview documentation.
- We've had very promising results generating C++ scaffolding code to enable C++ libraries to use uniffi. Right now the only way to generate C++ scaffolding is to use UDL. We might investigate some type of proc-macro equivalent mechanism for C++ in the future, but it looks like it would be much more difficult to do than in Rust. The current UDL syntax and lack of informative parsing errors is a big pain point for our C++ teams who have attempted to integrate uniffi into their C++ projects.

arg0d · 2024-05-02T12:07:40Z

@mhammond So, to summarize:

- adding another FFI frontend would complicate the docs
- adding another FFI frontend would make it more difficult to develop new features and raise the bar for external contributors
- UDL (WebIDL) is well known and has existing tooling, e.g. syntax highlighting
- implement UDL2 as an external project to minimize the above problems for uniffi core

I have to agree with all of your points. But I wonder how UDL2, implemented as an external project, would be integrated by users. My main concern is ease of use: if UDL2 is part of uniffi core, users could simply name their file with a .udl2 extension to opt into UDL2. If UDL2 is implemented as an external project, then the user would have to run some type of pre-processing step to be able to use UDL2.

Now that I'm thinking about this further, I feel like the main problem with UDL is the lack of concise and informative error messages, not the syntax or the type names that differ from Rust. The syntax of UDL and its "weird" type names wouldn't be much of an issue if the parser generated specific error messages, e.g. `unknown type 'String', did you mean 'string'?`, or `expected ; at X:X`. The error messages could be tackled by simply integrating/implementing a different parser, and all of the above issues related to another UDL version would go away.
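As a rough illustration of the "did you mean" style of message, here is a minimal sketch in Rust. It assumes the parser already knows the offending identifier and its position. `KNOWN_TYPES`, `edit_distance` and `unknown_type_message` are hypothetical names, the type list is only an illustrative subset of UDL built-ins, and the distance threshold of 2 is an arbitrary choice.

```rust
/// Illustrative subset of built-in UDL type names (assumption, not exhaustive).
const KNOWN_TYPES: &[&str] = &["string", "boolean", "bytes", "u32", "i64", "f64"];

/// Levenshtein edit distance, computed row by row.
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the first i-1 chars of `a` and first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for i in 1..=a.len() {
        let mut cur = vec![i; b.len() + 1];
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            cur[j] = (prev[j] + 1).min(cur[j - 1] + 1).min(prev[j - 1] + cost);
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Build an error message for an unknown type, suggesting the closest
/// known type when it is at most 2 edits away (ignoring case).
fn unknown_type_message(name: &str, line: usize, column: usize) -> String {
    let lowered = name.to_lowercase();
    let best = KNOWN_TYPES
        .iter()
        .map(|&known| (edit_distance(&lowered, known), known))
        .min_by_key(|&(dist, _)| dist);
    match best {
        Some((dist, known)) if dist <= 2 => format!(
            "unknown type '{}' at {}:{}, did you mean '{}'?",
            name, line, column, known
        ),
        _ => format!("unknown type '{}' at {}:{}", name, line, column),
    }
}
```

Lowercasing before comparing means the Rust-style spelling `String` matches `string` exactly, which is the specific confusion described above. Names with no close match fall back to a plain message with the position.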

mhammond · 2024-05-02T14:03:17Z

The one scenario where proc-macros alone don't work is whenever "library mode" doesn't work - ie, wherever we can't get hold of the metadata emitted by proc-macros. It's not clear if that's transient (ie, we just need creative ways to solve it) or if some use-cases might end up with constraints which make it persistent (eg, some build/packaging environments make it very difficult to do for various reasons).

Re the parser: There might be a path towards having UDL2 be external but avoid it needing to be a pre-processor - however, I agree that being outside the package would hurt both adoption and discoverability.

The "weedle" IDL parser we have is (effectively) dead and I can't see any reason a new parser wouldn't be welcomed. We'd need to be a little careful about what that does to our dependency tree, but I'd be surprised if anything about that ended up blocking the effort.
