-
Notifications
You must be signed in to change notification settings - Fork 82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
question: are refinement types viable? #429
Comments
Speaking entirely for myself here: I do believe refinement types might eventually make sense to include for WIT. In Rust we've been toying with the idea of introducing a limited form of this in the form of pattern types. This is more limited because it doesn't allow for arbitrary pre/post-conditions, but basically just makes Rust's pattern language work in more places. Here is a basic example of what pattern types in Rust would look like using number ranges: // a function which can only operate
// on numbers between 1 and 10
fn refined(num: u8 is 1..10) { .. } Earlier this year I wrote a section on the challenges Rust will face when trying to extend existing APIs with pattern types (link). I suspect that WIT will have an easier time adding refinement/pattern types than a language like Rust does. That's because even if we refine APIs in existing WIT docs, that doesn't mean that bindings generators need to change their codegen to match. And vice versa: codegen may change even if the WIT docs remain stable. I believe there may be relatively few downsides to extending WITs type system with some version of refinement. Though there are probably a number of other features we should prioritize first. |
I've definitely seen some use cases come up that seem like they'd benefit from refinement types. One part of the design space that has stumped me in the past that you also touch concerns runtime validation. While we could consider the predicates to be optionally validated, I worry this will create scenarios of mismatched expectations where a consumer of a value assumes the predicate holds and this leading to, ultimately, security vulnerabilities for cases that should have been guaranteed traps. This makes me wonder if instead we should define a small expression language that can be validated by the C-M lifting/lowering machinery so that a value consumer could trust that the predicate held, but this of course raises a lot of design questions. |
Yeah, arbitrary pre/post conditions seem like they would be hard to encode. I believe that if we could constrain refinements to some declarative subset, we might get most of the benefits without the downsides. I wonder how well pattern types could map onto WIT? WIT doesn't have patterns, but it does have strings, enums, and numbers — maybe we could define a pattern language? Is that roughly what you were thinking too? |
(I am for defining constraint expressions/predicates on types, which can be matched as needed in wit-bindgen extensions. But against explicit pattern matching in functions in WIT language. wit-bindgen extensions can match expression and decide what way they express these in generated code). Refinement types will lead to more consistency in WIT language usage and good competition in runtimes (forks of source generator extensions). Part of refinements are simple arithmetic predicates. A mix of arithmetic rules, numeric limits, and categorical properties allows WIT to validate certain constraints at compile time. Forked Extensions may pattern-match expressions to determine the types they generate and decide which WIT definitions to reject. Matching refinements can help decide on fixed types, dictionaries/maps/lists of tuples, attributes, and traits to generate in code—all from simple math or categorical rules. Industry ImpactRefinements will make WIT more consistent across the industry. Each solution/runtime can fork the extensions they need, deciding how to handle source generation, runtime execution, and expression matching. Canonical ABI and official guidance (focused on safety) will serve as the default. Look at official Rust extension, it already defines 2 cases, one in which Strings to be validated to be UTF8 and other case when they are just bytes. So it seems will progress in other directions regarding types and validations. Examples
Relevant Links
|
Refinement types as an enhancement to type alias can be helpful. For example, define type non-nan-f32 = f32 where not nan // NonNan<f32>
Another example type vec = list<u8>
type hash-set<u8> = list<u8> where elements unique
type btree-set<u8> = list<u8> where elements unique and sorted
type index-set<u8> = list<u8> where elements unique and indexed The corresponding specific interfaces usually have some better mechanisms I think we can formulate a set of syntax through I suggest using A predicate function is a transformation of Its mapping type is The boolean version It is up to the user to decide whether to check. Tools can be configured with a simple option to require the use of |
Another benefit is to add an OpenAPI-like extension definition to wrpc, which can then generate a good user interface similar to Swagger UI. |
what
Refined types allow producing new types by limiting some values of the base type by predicates.
Wuffs is one industry-successful example of an application. Jolie is something like WIT, but not popular.
I do not know what the exact best syntax is,
or if it should be mixed with attributes.
Most typical refinements are ranges, and possibly we can define a dictionary via refinements too.
I will share some examples, and then some reflection on who should validate refinement.
examples
how to validate
Validating numeric ranges is easy.
Validating sorted is easy too; the first 2 elements define order.
Requires
<
or>
refinements too (need to do comparison for sortedness).Sorted can be desc or asc if needed.
Sorted unique is also easy to validate.
IMHO it should fail to allow defining unique but not sorted, at least feature-gate it behind a DDoS flag :).
who to validate
By default, nobody should validate; there is no spec.
If the reader trusts the writer of ABI, then the reader does not need to validate.
If the reader does not trust the writer, it should validate.
The wit-bindgen generator can generate code, or the reader can do manual validation.
Wit-bindgen can validate data provided by the writer, maybe in debug mode.
type system
It may be safe to allow validation only of fixed-size entities by default, other features gated.
Except length, it specifically for variable length.
Generators for target languages which support some kind of narrowing,
for example via inheritance.
Can generate code for each refinement type, and allow a function for a broader type to receive a more narrow type.
Implementer which reads refined type data, can decide to store refined u8 in u5 as part of some stucture if that fits, for example.
But that is optional to add type safety to targets.
So refinements act as:
- documentation (always)
- generation of validator for reader or writer (optional)
- for type systems of target languages (optional)
Refinements will be less ad hoc and more clear, rather than people start put refinements into #58 .
options
Seems Wuffs and https://github.com/dfinity/candid/blob/master/spec/Candid.md and Jolie do not come with dependent types.
So many of validation types are attempts on refinements.
WIT does not have in place backward compatiblity. Protobuf has it implicitly embedded and ad hoc.
Candid has has in place versioning, but seems it pays with extra bytes https://github.com/dfinity/candid/blob/master/spec/Candid.md.
May be refinements give other option for versioning without binary overhead.
links
https://en.wikipedia.org/wiki/Refinement_type
https://www.jolie-lang.org/index.html
https://github.com/google/wuffs/blob/main/doc/glossary.md#refinement-type
https://www.reddit.com/r/dependent_types/comments/ay7d86/what_can_refinement_types_do_that_dependent_types/
https://ucsd-progsys.github.io/liquidhaskell/
https://json-schema.org/understanding-json-schema/reference
https://zod.dev/
https://github.com/gcanti/io-ts
https://github.com/stefan-hoeck/idris2-refined/blob/main/docs/src/Intro.md
https://dafny.org/dafny/DafnyRef/DafnyRef#sec-module-refinement
https://flux-rs.github.io/flux/
https://plv.mpi-sws.org/refinedrust/
microsoft/TypeScript#43505
microsoft/TypeScript#4895
https://fstar-lang.org/tutorial/book/part1/part1_getting_off_the_ground.html
https://www.w3.org/TR/xmlschema11-1/
https://smithy.io/2.0/spec/constraint-traits.html
https://yoric.github.io/post/rust-refinement-types/
The text was updated successfully, but these errors were encountered: