
This article expresses some opinions I've also held about integer types. If we express integer types as what we need them to be capable of, instead of what they are, then the compiler can choose something like int_fast32_t on our behalf, without us having to deal with promotion or typecasting.

I've wondered too if it's possible to have more information about the possible values of a type. Imagine you declare an Int<1, 10>. You can only initialize it with 1-10 inclusive or face a compiler error. What happens if you then double it? Well, the range must change to <2, 20>, but you can also prove that the value will be even. If you increment it, could it become an Int<3, 21, Odd, ...>?

My guess is there are an infinite number of similar traits an integer can have, so for this to be possible the compiler would have to evaluate only the traits the code actually uses.

Does anything like this already exist?



Off the top of my head, this sounds like dependent typing: https://en.wikipedia.org/wiki/Dependent_type . Simply put, it allows you to define types that are generic on specific values, rather than other types. So with normal typing you have generics like List<T>, but with dependent typing you can do Int<1, 10> like you described. Similarly, you can define types like the type of odd integers, or the type of lists with at least 3 elements, etc. They are very cool but, as you can imagine, complicated to implement. In general dependent types can make type checking a program undecidable.

You might also be interested in the numeric subtypes you can declare in some Pascal-family languages, like Ada. See http://www.isa.uniovi.es/docencia/TiempoReal/Recursos/textbo... for some examples. In Ada you can define subtypes, like the set of integers from 1 to 10. You can also define subtypes on floats, characters, and enums, and you can define the overflow rules for your type to a certain degree. It isn't quite as powerful as dependent types; I don't think you can really make it as generic, but it's still an interesting feature. One very interesting property of Ada is the ability to make arrays which are indexed by a non-integer type. So you can make an enum for the days of the week, then declare an array of strings indexed by the days-of-the-week type. Then you can do things like "array[Saturday]" to access values. Effectively this lets you make an array of 7 values with strongly-typed indexing.


> Simply put, it allows you to define types that are generic on specific values, rather than other types

Dependent types are far more general than that. They allow a type to be parameterized on a "value" where the "value" isn't merely a program constant, but rather an arbitrary expression that's type correct at that point in the program. This is what allows them to express refinement types: the refinement type bundles a value with a description of a function that takes any given value to a correct-by-construction assertion that the value satisfies some interesting property.

So if you wanted to prove that a value x in your program is in the range [m, n], you "simply" have to show how you could compute (x - m) and (n - x) at that point in your program, where the - operator can only return non-negative numbers by construction. The flip side is that m and n can be arbitrary run-time values, and there's no run-time check at all; these are essentially 'phantom' values and any computation involved in these proofs is elided: checking that a proof is correct is a special case of type checking. This is also why such languages cannot be Turing-complete, since that would allow a non-terminating function to be a valid "proof" of any property.
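In a dependently typed language this bundling of a value with its range evidence is direct. A small Lean 4 sketch (the `Ranged` name is mine):

```lean
-- A natural number together with evidence it lies in [m, n].
def Ranged (m n : Nat) := { x : Nat // m ≤ x ∧ x ≤ n }

-- The proof obligation is discharged at compile time; nothing
-- about the bounds survives to run time.
def five : Ranged 1 10 := ⟨5, by decide⟩
```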


It does exist already as a compiler optimisation method, where the compiler infers possible values at different locations in the code. It's called Value Range Propagation or Value Range Analysis [1]. Loops complicate it as a sibling comment mentions, but they don't prevent it from being useful.

There are indeed an infinite number of similar traits. Compilers have to limit themselves to a reasonable subset, because if you take the idea of general numeric traits far enough, you end up with general purpose theorem proving.

[1] https://en.wikipedia.org/wiki/Value_range_analysis


I think the hard part will be loops. What type would you then give to your integers? If you want it to be precise, the type checking might become undecidable. Too loose, and you end up with Int without any additional info.


I'd envisage that working the way Rust fails to compile when you hold a mutable reference inside a loop but compiles if you clone it or make it immutable, with the additional escape hatch that you can just promote the value to the catch-all type. If it's a loop over a range of ints, it might be possible to use that range to determine the necessary type.

All that said, if you're using unbounded loops doing arithmetic on signed integers then you're already dealing with UB in C/C++ land.


Ada has integer types that are basically defined by just their range, and by whether they wrap. So for example you might define

    type My_Tiny_Int is range 1 .. 20;
    type Unsigned_Wrapping_Int is mod 2 ** 5;
I don't think it does any fancy inference based on that, just overflow checks.


It should exist, in theory, but the type becomes moot in the presence of any dynamic input.


Well... either the type becomes meaningless, or the type becomes something that must be enforced in the presence of dynamic input.

If your type says "an integer from 1 to 10", and you ask the user to enter a number from 1 to 10, and they enter 20, then something has to happen. If the compiler has a routine that parses user input into "integer from 1 to 10" (or any range), then it has to either clip, wrap, or fail. If it doesn't, then you just parsed the user input into an int. When you try to stick that in your "integer from 1 to 10", one of the same things must happen. (Or, as you say, the type becomes moot.)



