Hacker News | anon-3988's comments

Before we start writing Lean, perhaps we can start with something "dumber" like Rust or any typed language. If you want to write something correct, or you care about correctness, you should not be using dynamic languages. The most useful and most used type of test is type checking.

Type errors, especially once you have designed your types to be correct by construction, are extremely, extremely useful for LLMs. Once you have the foundation correct, they just have to wiggle through that narrow gap until they figure out something that fits.

But from what I have understood and read so far, I am not convinced by OP's "formal verification". A simple litmus test is to take any recent task from your day job and try to describe a formal specification of it. Is it even doable? Reasonable? Is it even there? For me the most useful kind of verification is the verification of the lower-level tools, i.e. data structures, languages, compilers, etc.

For example, the type signature of Vec's indexing operator (Index<usize>) in Rust returns a T. This cannot be true, because it cannot guarantee to return a T given ANY usize. To me, panicking is one of the laziest and worst ways to put in a specification. It means that every single line of Rust code is now able to enter this termination state.
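For contrast, the standard library also offers the honest version of that signature; a minimal sketch of the difference:

    fn main() {
        let v = vec![10, 20, 30];

        // Index<usize> claims to return a value for ANY index, and
        // upholds that claim by panicking when it cannot.
        let x = v[1]; // v[99] would panic at runtime

        // Vec::get admits failure in its signature, so the type
        // checker forces every caller to handle the None case.
        match v.get(99) {
            Some(y) => println!("{x} {y}"),
            None => println!("index 99 is out of bounds"),
        }
    }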


I once attended a talk by someone who is or was big in the node.js world. He opened with the premise, "a static type check is just a stand-in for a unit test."

I wanted to throw a shoe at him. A static type check doesn't stand in for "a" unit test; static typing stands in for an unbounded number of unit tests.

Put another way, the common misconception among users of languages like Javascript and Python that unit testing is just as good as type checking (plus more flexible) is a confusion between the "exists" and "for all" logical quantifiers.
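A toy Rust illustration of the quantifier difference (parse_port is invented for the example):

    // A unit test checks one point in the input space ("exists"):
    #[test]
    fn parses_a_valid_port() {
        assert_eq!(parse_port("8080"), Some(8080));
    }

    // The return type is enforced at every call site, for every possible
    // input string ("for all"): no caller can forget that parsing may fail.
    fn parse_port(s: &str) -> Option<u16> {
        s.parse().ok()
    }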


Plus, it is simply more enjoyable to design the types in your program than to write unit tests. The fun factor comes from operating on a higher level of abstraction and engages more of your brain’s puzzle-solving mode than just writing unit tests. Making yourself think about “for all x” rather than a concrete x forces your brain to consider deeply the properties of x being used.

> it is simply more enjoyable to design the types in your program than to write unit tests.

I have tried both and I have no idea what you're talking about.

> Making yourself think about “for all x” rather than a concrete x forces your brain to consider deeply the properties of x being used.

The entire point of dynamic typing is that you can think about interfaces rather than concrete types, which entails deep consideration of the properties of the object (semantics of the provided interface).


That's not the entire point of dynamic typing, because all the interface stuff comes from statically typed languages. Some* dynamic languages borrowed it, but most use "implicit" interfaces - where the interface is whatever kind of works, I guess.

> because all the interface stuff comes from statically typed languages.

No, it doesn't. It comes from theory that came after the languages.

> Some* dynamic languages borrowed it, but most use "implicit" interfaces

An implicit interface is an interface, and is exactly the sort of thing I'm talking about in GP. The point is that you think about the object in terms of its capabilities, rather than some proven-up-front categorization that it fits into. What it does, not what it is.


You can achieve this with structural subtyping, such as Go interfaces and Python protocols. Whether that is desirable is a different question.

> "a static type check is just a stand-in for a unit test."

This is not an original argument. Rich Hickey made a similar argument in his "Simple Made Easy" talk in 2011, though his focus was on the fact that every bug that exists in a software system has passed unnoticed through both a type checker and a test suite. And even before that, similar ideas of test suites being a suitable replacement for a type checker had percolated through the Python and Ruby communities, too.

I distinctly remember that the "tests make static type checks unnecessary" attitude was in fact so prevalent in the JavaScript community that TypeScript had a really hard time getting adoption in its first 3-4 years; only the introduction of VSCode in 2015 and the subsequent growth of its market share over Atom and SublimeText got more people exposed to TypeScript and the benefits of a type checker. Overall it took almost 10 years for TypeScript to become the "default" language for web projects.


> I wanted to throw a shoe at him.

You should have!


Agreed.

Besides, it's not like types don't matter in dynamically typed languages. The (competent) programmer still needs to keep types in their head while programming. "Can this function work with a float, or must I pass an int?" "This function expects an iterable, but what happens if I pass a string?" Etc.

I started my career with JavaScript and Python, but over the years I've come to the conclusion that a language that hides types from programmers and does implicit conversion magic in the background does not deliver a better DX. It might make the language more approachable initially, and the idea of faster prototyping might be appealing, but it very quickly leads to maintenance problems and bugs. Before type hinting tools for Python became popular, I worked on many projects where `TypeError` was the #1 exception in Sentry by a large margin.

Gradual and optional typing is better than nothing, but IME if the language doesn't require it, most programmers are lazy and will do the bare minimum rather than properly adding type declarations. Especially with things like TypeScript, which makes many declarations difficult to read, write, and understand.

I think that type inference is a solid middle ground. Types are still statically declared, but the compiler is smart enough to not bother the developer when the type is obvious.
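For what it's worth, a minimal Rust sketch of that middle ground:

    fn main() {
        // No annotation needed: the compiler infers Vec<i32> from the
        // pushes below, but the binding is still statically typed.
        let mut xs = Vec::new();
        xs.push(1);
        xs.push(2);

        // Inference flows through closures and iterators too.
        let doubled: Vec<i32> = xs.iter().map(|x| x * 2).collect();
        println!("{doubled:?}");

        // xs.push("three"); // still a compile error, as it should be
    }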


> Before type hinting tools for Python became popular, I worked on many projects where `TypeError` was the #1 exception in Sentry by a large margin.

My experience is radically different. `ValueError` is far more common in my un-annotated Python, and the most common cause of `TypeError` anyway is the wrong order or number of arguments after a refactoring.


Hmm, I could be misremembering whether it was `ValueError` or `TypeError`. This was a few years ago. I know that typing issues were always the most frequent in any Python project I have worked on.

How is your experience different?


It takes hundreds of unit tests to replace a type.

Start using properties and it is in the thousands.

Most code should be typed. Python is great for prototypes, but once the prototype gels, you need types.


I've always hated Python. Could never enjoy it at all. Pretty much the same poor DX as PHP, Javascript, Ruby, etc.

Finally set up neovim with pyright; use types on every single fucking thing, and now I love Python[1].

Can't wait to see TC39 become a reality (learned about it just this past week on HN, actually). Maybe I'll enjoy Javascript too.

--------------------

[1] Within reason; the packaging experience is still extremely poor!


Didn't get the bit about TC39 - it is just a group. It has nothing to do with types in JS.

Sorry, I should have been clearer about which proposal I was talking about:

https://github.com/tc39/proposal-type-annotations


I've never seen types used correctly in Python. A `tuple` is not even a type! It's a type constructor!

The distinction you are trying to make is nonsensical in Python's object model. Types are inherently callable, and calling them constructs (i.e. instantiates) the type (normally; this can be overridden, by design). There is also no type->kind->category hierarchy; `type` itself is an object, which is its own type.

When you're at a level of theory where terms like "type constructor" are natural, it's unreasonable to expect any of it to be applicable to Python. This is why the Haskell people speak of dynamically-typed languages in the Python mold as "untyped" regardless of their attitude towards implicit casts.

And I love it, and have been using it for decades, and write beautiful things where the annotations hardly ever seem worth the effort — perhaps for documentation, but not for a static checker. Then I look at other, newer Pythonistas trying to figure out how to write complex generic type expressions (and sacrificing backwards compatibility as they keep up with the churn of Python figuring out how to offer useful annotation syntax) and deal with covariance vs contravariance etc. and I just smile.


I’ve been doing Python and Typescript professionally, Python for almost two decades, Typescript for the last 5 years, and I can very confidently say that it doesn’t matter.

Besides, you seem to be confusing Python run-time with Python typecheck-time; a theoretically unfortunate, but again practically irrelevant, distinction. (Unfortunate since Python typecheck is basically a different language than Python execution; irrelevant, because the right subsets of both align well.)


Most of the types are "not even types". It's why you can pass a type as a value (not a string) to a function.

If the type was a type, you'd not be able to use it as a value.


A unit test is a functional assertion. A type is a semantic construct that can provide that, but it provides a lot more.

As a trivial example, if I create a type alias from “string” to “foobarId,” I now (assuming a compliant language) can prevent code that consumes foobarIds from accidentally consuming a string.
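In Rust this is the newtype pattern; note that a plain `type` alias would not be enough there, since aliases are freely interchangeable. A minimal sketch (FoobarId and lookup are made up for the example):

    // A tuple struct wrapping String: structurally identical, but a
    // distinct type as far as the compiler is concerned.
    struct FoobarId(String);

    fn lookup(id: FoobarId) -> String {
        format!("record for {}", id.0)
    }

    fn main() {
        let id = FoobarId("abc123".to_string());
        println!("{}", lookup(id));
        // println!("{}", lookup("abc123".to_string()));
        // ^ compile error: expected `FoobarId`, found `String`
    }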


Good that Python supports types then

Python has formalized magic comments.

You can run a third party linter on those comments, but you must hope that they're correct. There are usually some checks for that, but they're only reliable in trivial cases.

This is not static typing any more than "you could transpile JavaScript to C" means that JavaScript is a low level language with native assembly support. It's a huge step forward from "no system at all" and I'm thrilled it exists, but it's hardly the same thing.


It's actually remarkable how with the success of TypeScript so many other dynamic languages switched to gradual typing.

Erlang and Clojure were the early ones, TypeScript followed, and now Python, Ruby, and even Perl have ways to specify types and type check your programs.


> Good that Python supports types then

"Optional typing" is not the same as "Static typing".

Great, my program will crash, because I forgot to opt-in to typing :-/


C has void pointers.

> C has void pointers.

And? Void pointers are not the default type :-/

With Python I have to do extra work to get type errors.

With C I have to do extra work to hide the type errors.

I am battling to understand the point you are making.


He's probably conflating static and strong typing.

C is statically typed, but weakly typed - you need to throw away types to do a bunch of run-of-the-mill things. Python is dynamically typed, but strongly typed - it will just fail if types don't resolve.

C# and C++ are both statically typed and strongly typed, although C# more than C++ in practice.


Hard to argue with an argument which isn’t internally coherent.

Poorly, though, and with lots of edge cases and foot guns to make you miserable once your code calls out to numpy or gets some JSON.

Tell me more please: how does one use types in Python? Unfortunately I write Python professionally these days (it is the language that has all the libraries) and hate it with a passion.

mypy is the simplest/oldest option, look into pyright and ty, too.

install uv and then

    uvx mypy
should do the rest.

If you care about the type of a parameter you can just add an assertion in the method /s

Good luck using static typing to model many real-world unit tests for the programming languages people use most. I'll start with an easy example: those records should be sorted by date of birth. We can move on to more complicated scenarios.

> "records should be sorted by date of birth."

What's wrong with C#'s:

    System.Collections.Generic.SortedList<DoBDateTime, PersonRecord>

?
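And a comparable sketch in Rust, assuming a simplified (year, month, day) key and ignoring duplicate birth dates: a BTreeMap keyed by date of birth is sorted by construction, for every state of the program rather than just the states a test happens to check.

    use std::collections::BTreeMap;

    // Tuples compare lexicographically, so (year, month, day) orders
    // chronologically.
    type DateOfBirth = (u16, u8, u8);

    struct Person {
        name: String,
    }

    fn main() {
        let mut by_dob: BTreeMap<DateOfBirth, Person> = BTreeMap::new();
        by_dob.insert((1990, 5, 1), Person { name: "Ada".into() });
        by_dob.insert((1970, 1, 9), Person { name: "Bo".into() });

        // Iteration is always in key order; an unsorted state is
        // unrepresentable, so there is nothing to unit test.
        for (dob, p) in &by_dob {
            println!("{:?} {}", dob, p.name);
        }
    }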

The comment didn’t claim that types are a stand in for tests either! IMO, they are orthogonal.

The comment explicitly set out to refute the idea "...that unit testing is just as good as type checking" by describing the former as simply inferior.

No. They refuted the claim that "a static type check is just a stand-in for a unit test". That is a claim that you can just remove your type checks and replace them with unit tests at no loss. The comment stated that removing a type check just so you can replace it with a unit test is inferior. The prior state was already pre-supposed to have a type check/type checkable condition that you could replace.

That is the literal converse of the claim in the response to that comment arguing that the comment stated that all unit tests can be replaced with type checks. Those are not at all the same claim.

To make it even more clear the comment said: I saw a talk that said Type Check -> Unit Test. I said that is silly.

Response said: Unit Test -> Type Check is not reasonable. So clearly your claim that Type Check -> Unit Test is silly is wrong.


No one claims that types are a stand in for all unit tests.

They stand in for the banal unit tests.


> A static type check doesn't stand in for "a" unit test; static typing stands in for an unbounded number of unit tests.

You have conflated "a static type check" with "static typing". Unit tests stand in, in the same way, for an unbounded number of states of real-world input. They're simply being subjected to a trial verification system rather than a proof system. It turns out that writing proofs is not very many people's idea of a good time, even in the programming world. And the concept of "type" that's normally grokked is anemic anyway.

> Put another way...

Rhetoric like this is unconvincing and frankly insulting. You pass off your taste and opinion as fact, while failing to understand opposed arguments.


The author is in the comfortable position of working on a system that does have a formal specification and a formally verified reference implementation. The post is not about how they wish things would work, but how their existing system (Cedar) works.

Regarding your point on Rust, the vast majority of software has nowhere near the amount of static guarantees provided by Rust. If you need more, use static memory allocation, that's what people do for safety critical systems. By the way, it seems that Rust aborts on OOM errors, not panics: https://github.com/rust-lang/rust/issues/43596


I think it's possible to write correct systems with dynamic languages, just not the ones we commonly use like Python and JavaScript. I find Clojure, for example, to be a dynamic language that is pretty easy to manage, and I attribute that to its immutable nature and data-centric ethos. I'm sure there are other dynamic languages that would work as well.

Now, I wouldn't necessarily use Clojure on a huge multi-organization codebase (maybe it's fine, this is outside of my experience with it), but it can be the right tool for some jobs.


Common Lisp as well. I can’t explain why, but type errors are just not something I struggle with in Common Lisp! They are in JS and Python, for sure. Maybe someone knows why it feels different?

I think it’s cause there’s less imperative code and side effects to track data transformations through.

Any random JS/php app is probably a huge pile of loops and if statements. To track what happens to the data, you need to run the whole program in your head. “And now it adds that property to the object in the outer scope, and now that object gets sorted, now it hits the database… ok…”. Whereas in Clojure most functions are either a single atomic transformation of a set of data, or a batch of side effects. You still have to run it through your head, but you can do it more piece-by-piece instead of having to understand a 1,000-line method with class state being auto-loaded and mutated all over the place. Also you have a REPL to try stuff out as you go.

Don’t get me wrong, I LOVE static types. Statically typed Clojure would be the best fckin language ever. But there is definitely a wide gulf between a dynamic language like JS and one like Clojure!


> Any random JS/php app is probably a huge pile of loops and if statements. To track what happens to the data, you need to run the whole program in your head. “And now it adds that property to the object in the outer scope, and now that object gets sorted, now it hits the database… ok…”. Whereas in Clojure most functions are either a single atomic transformation of a set of data, or a batch of side effects. You still have to run it through your head, but you can do it more piece-by-piece instead of having to understand a 1,000-line method with class state being auto-loaded and mutated all over the place. Also you have a REPL to try stuff out as you go.

Nothing really forces you to write imperative code in a large fraction of cases, and typically the state-change operations can be quite localized within the code. And of course JavaScript and Python both also have REPLs.


But nothing forces you to write functional code either. I’ve seen a whooole lot of php and JS, and most of it has been pretty terrible lol. Of course you can write terrible code in any language, but I think the ease of starting with JS/php combined with the lack of built-in opinions makes it easy to build huge piles of spaghetti.

Though these days fresh typescript codebases are usually pretty decent. I love typescript and it’s really nice to work with a well-typed, modern project with proper schema validation and such. Def miss that in clojure.

Also I wouldn’t really compare JS or Python’s REPL to Clojure’s. Python’s is useful, but I pretty much live inside the Clojure REPL.


> Statically typed clojure

Well, if you also like Common Lisp, there's Coalton, which is Common Lisp with a Haskell-like type system: https://coalton-lang.github.io/


I haven't done much with CL so I can only speculate, but I think stricter FP principles in general work to minimize the downsides of dynamic typing. CL, to my understanding, isn't the most "pure" when it comes to FP, but does a good job at giving the programmer a lot of power to constrain and explore systems.

> Perhaps we can start with something "dumber" like Rust or any typed language. If you want to write something correct, or you care about correctness, you should not be using dynamic languages. The most useful and most used type of test is type checking.

Lean or TLA+ are to Rust/Java/Haskell's type systems what algebraic topology and non-linear PDEs are to "one potato, two potatoes". The level of "correctness" achievable with such simple type systems is so negligible in comparison to the things you can express and prove in rich formal mathematics languages that they barely leave an impression (they do make some grunt work easier, but if we're talking about a world where a machine can do the more complicated things, a little more grunt work doesn't matter).


> To me, panicking is one of the laziest and worst ways to put in a specification.

This is why the "existing programs don't have specs!" hand-wringing is entirely premature. Just about every code base today has error modes the authors think won't happen.

All you have to do is start proving they won't happen. And if you do this, you will begin a long journey that ends up with a formal spec for, at least, a good part of your program.

Proving the panics are dead code is a Socratic method, between you and the proof assistant / type checker, for figuring out what your program is and what you want it to be :).
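One small concrete instance of that conversation, using std's NonZeroU32 (the functions are invented for illustration):

    use std::num::NonZeroU32;

    // The panicking version: every caller inherits a hidden failure mode.
    fn mean_panicky(total: u32, count: u32) -> u32 {
        total / count // panics when count == 0
    }

    // Moving the invariant into the type makes the panic unreachable:
    // u32 divided by NonZeroU32 cannot divide by zero.
    fn mean(total: u32, count: NonZeroU32) -> u32 {
        total / count
    }

    fn main() {
        println!("{}", mean_panicky(100, 4));
        let count = NonZeroU32::new(4).expect("4 is nonzero");
        println!("{}", mean(100, count));
    }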


Yeah, Rust has been pretty good for formal verification so far. Hoare-style spec contracts are, I think, the way forward, especially since they flow fairly naturally from unit tests. I've been using Hax to pretty good effect so far. I'm generally skeptical that advances in Lean proof solving by dedicated models are that useful for program verification, compared to generalist models, though they could help lower costs a good bit.

I agree. And learning a typed language is significantly easier now that AI can explain everything. The types also help the AI write correct code. A very positive feedback loop.

A Rust-to-JS transpiler would be pretty sweet. Idk if someone has turned Rust into a frontend language yet, like TypeScript.

> No, it will just kill the browser. The fact he thinks otherwise tells me how out of touch he is.

Believe me when I say this, but 99.99999% of the human population does not give a shit about Firefox, Chrome, Edge, Brave, whatever.

Their survival is completely detached from how "good" they are. As long as it runs, opens a page, shows pictures, plays videos.

We all live in the tech bubble; to them it's an "app" that is "annoying me with ads". And that's if they even know it's an ad, not just part of the page. That's if they even know it's a page, not just something my son told me to click if I want to go to "Facebook".


The interesting part to me is that this bug does not necessarily happen in an unsafe block. The fix happens in an unsafe block, I think the API should change to avoid this. Perhaps by forcing users to pass a lambda to do stuff instead of having to manually lock and drop?
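Something like this closure-scoped shape, sketched in userspace Rust with made-up types (the real kernel code is considerably more involved):

    use std::sync::Mutex;

    struct ProcState {
        items: Vec<u32>,
    }

    struct Proc {
        inner: Mutex<ProcState>,
    }

    impl Proc {
        // The guard never escapes this method, so callers cannot forget
        // to unlock, and cannot touch the list outside the closure.
        fn with_items<R>(&self, f: impl FnOnce(&mut Vec<u32>) -> R) -> R {
            let mut state = self.inner.lock().unwrap();
            f(&mut state.items)
        }
    }

    fn main() {
        let p = Proc { inner: Mutex::new(ProcState { items: vec![1, 2, 3] }) };
        let removed = p.with_items(|items| items.pop());
        println!("{removed:?}");
    }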

The `unsafe` block was present because `List::remove` is marked `unsafe` [0]:

    /// Removes the provided item from this list and returns it.
    ///
    /// This returns `None` if the item is not in the list. (Note that by the safety requirements,
    /// this means that the item is not in any list.)
    ///
    /// # Safety
    ///
    /// `item` must not be in a different linked list (with the same id).
    pub unsafe fn remove(&mut self, item: &T) -> Option<ListArc<T, ID>> {
I think it'd be tricky at best to make this particular API safe since doing so requires reasoning across arbitrary other List instances. At the very least I don't think locks would help here, since temporary exclusive access to a list won't stop you from adding the same element to multiple lists.

[0]: https://github.com/torvalds/linux/blob/3e0ae02ba831da2b70790...


If the API cannot be made safe then it must be marked unsafe.

I mean, remove() is already marked unsafe?

Otherwise there's the question of where exactly the API boundaries are. In the most general case, your unsafe boundary is going to be the module boundary; as long as what you publicly expose is safe modulo bugs, you're good. In this case the fix was in a crate-internal function, so I suppose one could argue that the public API was/is fine.

That being said, I'm not super-familiar with the code in question so I can't definitively say that there's no way to make internal changes to reduce the risk of similar errors.


Yeah this is a bad fix. It should be impossible to cause incorrect things to happen from safe code, especially from safe code calling safe code.

The author of the patch does mention that the better thing to do in the long run is to replace the data structure with one that is possible to better encapsulate: https://lore.kernel.org/all/20251111-binder-fix-list-remove-...

> Classic Motte and Bailey. Rust is often said "if it compiles it runs".

That claim is overly broad, but it's a huge, huge part of it. There's no amount of computer science or verification that can prevent a human from writing the wrong software or specification (`let plus_a_b = a - b`, or "why did you give me an orange when I wanted an apple"). Unsafe Rust is markedly different from safe, default Rust. This is akin to claiming that C is buggy or broken because people write broken inline ASM. If C can't deal with broken inline ASM, then why bother with C?


Yeah. I spent many years getting paid to write C; these days I don't write C (even for myself), but I do write Rust.

I write bugs, because I'm human, and Rust's compiler sure does catch a lot more of my bugs than GCC used to when I was writing C all day.

Stronger typing is a big part of why this happens. For example, in C it's perfectly usual to use the "int" type for a file descriptor, a count of items in some container, and a timeout (in seconds? milliseconds? who knows). We could do better, but we usually don't.

In idiomatic Rust everybody uses three distinct types: OwnedFd, usize, and Duration. As a result, while arithmetic on ints must work in C, the Rust compiler knows that it's reasonable to add two Durations together, that it's nonsense to add a Duration to a size, and that all arithmetic is inappropriate for OwnedFd. Further, it's not reasonable to multiply two Durations together, while a Duration multiplied by an integer makes sense (and likewise the other way around), but 5 seconds multiplied by 80 milliseconds is nonsense.
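Concretely, here is (roughly) what the std types give you; the commented-out lines are compile errors, not runtime checks:

    use std::time::Duration;

    fn main() {
        let a = Duration::from_secs(5);
        let b = Duration::from_millis(80);

        let total = a + b;  // Duration + Duration: fine
        let scaled = a * 3; // Duration * integer: fine
        println!("{total:?} {scaled:?}");

        // let nonsense = a * b;    // compile error: Duration * Duration
        // let worse = a + 42usize; // compile error: Duration + usize
    }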


> I suspect debugging is not that straightforward to LLM'ize.

Debugging is not easy but there should be a lot of training corpus for "bug fixing" from all the commits that have ever existed.


> To give a contrived example, let's say you want to state that a particular computation terminates. To do it in a clear and concise manner, you want to express the property of termination (and prove that the computation satisfies it), but that property is not, itself, computable. There are some ways around it, but as a rule, a specification language is more convenient when it can describe things that cannot be executed.

Do you really think it is going to be easier for the average developer to write a formal specification that their program terminates

vs

Giving them a framework or a language that does not have a for loop?

Edit: If by formal verification you mean type checking, then I very much agree.


Maybe it's difficult for the average developer to write a formal specification, but the point of the article is that an AI can do it for them.

Are there any plans by some DE to replace D-Bus with Binder?

Why hasn't anyone randomly generated a bunch of code of various kinds, used an LLM to create summaries of it, and dumped it all on GitHub with a restrictive license? The patent office isn't here to enforce anything.

It'd be like copyright trolling the Library of Babel. The set of useful programs would be totally eclipsed by incoherent gibberish (even if there were a means to ensure that the randomly generated code were syntactically correct). In other words, the signal to noise ratio would be microscopic and running this scheme in finite time would effectively result in zero valuable code being successfully squatted.

I'm curious as to where this is going. What happens next?

The US could have easily, easily won the Vietnam War if they had just dropped 1 or 2 nukes. The modern military is going to have drones that swarm the sky 24/7. They can develop viruses that only they have the cure to. They can drop EMPs. They can grow their own food in their own labs while we all slowly die and wither outside.

These are powers that could actually, technically, plausibly be granted to a single individual, or several, in the future.

The future where humans are obsolete is scary. Just reread that sentence. Humans are obsolete.


Since no one has bothered to explain how wrong you are… I’ll give you the easy version…

Tanks and drones don’t stand on street corners and enforce non-assembly and curfews.

The tanks and drones argument and later Biden’s “we have F15s” claim are wildly devoid of reality. You do not understand what a “modern military” is. Each MRAP takes multiple people to keep it running, and it’s just a diesel truck.

You think tanks and drones don’t take teams of people to keep running?


Thinking that people won't fall in line is blind idealism. Autonomous weapons of war are already here as it is - formidable individually, worse than a WMD at scale. Day by day, we're getting closer to a militaristic reality where a commanding officer doesn't need a subordinate's turnkey or permission to enact scaled conflict.

Open a browser tab or start a conversation at a bar today, millions of people are in uproar because elected representatives and military officers issued a video that was JUST A REMINDER that military members have a moral and legal duty to reject manifestly illegal orders. Nevermind how they'll inevitably act when the chips are down, and it's now actually time to reject an order from the commander in chief - or someone that answers to him.

This place fetishizes CGP Grey more than anything - watch his dictatorship video about only needing to hold a few "key" (figuratively and literally) officials in place to get your bidding done most efficiently.


I'd say it takes about 1 million people with modern military capability to completely take over the world.


I claim that 99.9999% of software should be written in a GC language. Very, very, very few problems actually require memory management. It is simply not part of the business requirement. That said, how come the language closest to this criterion is Go, except it hasn't learned about clean water (algebraic data types)?


Meanwhile, earlier this week we had a big conversation about the skyrocketing costs of RAM. While it's technically true that GC doesn't mean a program has to hold allocations longer than the equivalent non-GC code, I've personally never seen a GC'ed program not use multiple times as much RAM.

And then you have non-GC languages like Rust where you're hypothetically managing memory yourself, in the form of keeping the borrow checker happy, but you never see free()/malloc() in a developer's own code. It might as well be GC from the programmer's POV.


> And then you have non-GC languages like Rust

Rust is a GC language: https://doc.rust-lang.org/std/rc/struct.Rc.html, https://doc.rust-lang.org/std/sync/struct.Arc.html


By that token so is C.

GC is a very fuzzy topic, but the overall trend is that a language is GC if it's opt-out of GC, not opt-in. And more strictly, it has to be tracing GC.
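For what it's worth, a small sketch of why Rc sits on the reference-counting side of that line rather than the tracing side:

    use std::rc::Rc;

    fn main() {
        // Rc is reference counting: the count is the entire mechanism.
        let a = Rc::new(String::from("shared"));
        let b = Rc::clone(&a);
        println!("count = {}", Rc::strong_count(&a)); // 2
        drop(b);
        println!("count = {}", Rc::strong_count(&a)); // 1
        // A reference cycle (e.g. built with RefCell) would simply leak;
        // no tracing collector ever runs to reclaim it.
    }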


By your definition, is Nim a GC language?


So by that definition yes, as of Nim 2.0 ORC is the default. You need to opt out of it.


I'm not sure this opt-in/out "philosophical razor" is as sharp as one would like it to be. I think "optionality" alone oversimplifies and a person trying to adopt that rule for taxonomy would just have a really hard time and that might be telling us something.

For example, in Nim, at the compiler CLI tool level, there is opt-in/opt-out via the `--mm=whatever` flag, but, at the syntax level, Nim has both `ref T` and `ptr T` on equal syntactic footing. But then in the stdlib, `ref` types (really things derived from `seq[T]`) are used much more (since it's so convenient). Meanwhile, runtimes are often deployment properties. If every Linux distro had their libc link against -lgc for Boehm, people might say "C is a GC'd language on Linux". Minimal CRTs vary across userspaces and OS kernel/userspace deployment. "What you can rely on/assume", I suspect the thrust behind "optionality", just varies with context.

Similar binding vagueness between properties (good, bad, ugly) of a language's '"main" compiler' and a 'language itself' and 'its "std"lib' and "common" runtimes/usage happen all the time (e.g. "object-oriented", often diluted by the vagueness of "oriented"). That doesn't even bring in "use by common dependencies" which is an almost independent axis/dimension and starts to relate to coordination problems of "What should even be in a 'std'-lib or any lib, anyway?".

I suspect this rule is trying to make the adjective "GC'd" do more work in an absolute sense than it realistically can given the diversity of PLangs (sometimes not so visible considering only workaday corporate PLangs). It's not always easy to define things!


> I think "optionality" alone oversimplifies and a person trying to adopt that rule for taxonomy would just have a really hard time and that might be telling us something.

I think optionality is what gives that definition weight.

Think of it this way. You come to a project like a game engine, but you find it's written in some language and discover that for your usage you need no/minimal GC. How hard is it to minimize or remove the GC? Assume that changing build flags will also cause problems elsewhere due to behavior change.

> Similar binding vagueness between properties (good, bad, ugly) of a language's '"main" compiler' and a 'language itself' and 'its "std"lib' and "common" runtimes/usage happen all the time (e.g. "object-oriented", often diluted by the vagueness of "oriented")

Vagueness is the intrinsic quality of human language. You can't escape it.

The logic is fuzzy, but going around saying stuff like "Rust is a GC language" because it has an optional, rarely used Arc/Rc, is just off the charts level of wrong.


Do you consider ORC a tracing GC, even though it only uses tracing for cyclic types?


You can add your own custom GC in C — you can add your own custom anything to any language; it's all just 1s and 0s at the end of the day — but it is not a feature provided by the language out of the box like in Rust. Not the same token at all. This is very different.


Well, that's mostly what Arc<T> and Rc<T> in Rust are. Optional add-ons.


...as part of the language. Hence it being a GC language.

Is this another case of "Rustaceans" randomly renaming things? There was that whole debacle where sum types bizarrely became enums, even though enums already had an established, different meaning, with all the sad talking past everyone else that followed. This is starting to look like that again.


> ...as part of the language.

Which part? It's not available in no-std without the alloc crate. You can write your own Arc.

Most crates don't have to use Arc/Rc.

> Is this another case of "Rustaceans" randomly renaming things?

No. This is a case of someone not having enough experience with Rust. Saying Rust is a GC language is like claiming Pascal is an object-oriented language because they share some surface similarities.


> Which part?

The part that is detailed on rust-lang.org.

> It's not available in no-std without the alloc crate.

no-std disables features. It does not remove them from existence. Rust's worldly presence continues to have GC even if you choose to disable it for your particular project.

> This is a case of someone not having enough experience with Rust.

Nah. Even rust-lang.org still confuses sum types and enums to this very day. How much more experienced with Rust can you get than someone versed enough in the language to write comprehensive, official documentation? This thought of yours doesn't work.

> Saying Rust is a GC language is like claiming Pascal is an object-oriented language because they share some surface similarities.

What surface similarity does Pascal have to OO? It only has static dispatch. You've clearly not thought that one through.

Turbo Pascal has dynamic dispatch. Perhaps you've confused different languages because they happen to share similar names? That is at least starting to gain some surface similarity to OO. But message passing, of course, runs even deeper than just dynamic dispatch.

Your idea is not well conceived. Turbo Pascal having something that starts to show some very surface-level similarity to OO, but still a long way from being the real deal, isn't the same as Rust actually having GC. It is not a case of Rust having something that sort of looks kind of like GC. It literally has GC.


> no-std disables features. It does not remove them from existence.

It's the other way around; the standard library adds features, because Rust features are designed to be additive.

Look into it. The `std` library is nothing more than the Rust-lang-provided `core`, `alloc`, and `os` crates.

> Nah.

You don't seem to know how features work, how std is made, or how often RC is encountered in the wild. It's hard to argue when you don't know the language you are discussing.

> Even rust-lang.org still confuses sum types and enums to this very day.

Rust lang is the starting point for new Rust programmers; why in the heck would they start philosophizing about a bikesheddy naming edge case?

That's like opening your car manual to see history and debates on what types of motors preceded your own, while you're trying to get the damn thing running again.

> What surface similarity does Pascal have to OO?

The dot operator, as in `struct.method`. The guy I was arguing with unironically told me that any language using the dot operator is OO, because the dot operator is a sign of accessing an object or a struct.

Much like you, he had very inflexible thoughts on what makes or does not make something OO; it reminds me so much of you saying C++ is a GC-language.

> Your idea is not well conceived.

My idea is to capture the colloquial meaning of "GC language". The original connotation is to capture languages like C#, Java, JS, etc. that come with a (more or less) non-removable tracing garbage collector. In practice, what this term means is:

- How hard is it to remove and/or not rely on GC? Defaults matter a lot.

- How heavy is the garbage collection? Is it just RC or ARC?

- How much of the ecosystem depends on GC?

And finally, how many people are likely to agree with it? I don't care if my name for a color is closest to the frequency of red if no one else agrees.


I can't say I've heard of a commonly used definition of "GC language" that includes C++ and excludes C. If anything, my impression is that both C and C++ are usually held up as exemplars of non-GC languages.


C++ didn't add GC until relatively recently, to be fair. When people from 30 years ago get an idea stuck in their head they don't usually ever change their understanding even as life continues to march forward. This isn't limited to software. If you look around you'll regularly find people repeating all kinds of things that were true in the past even though things have changed. And fair enough. There is only so much time in the day. You can't possibly keep up to date on everything.


The thing is that the usual comparisons I'm thinking of generally focused on how much the languages in question rely on GC for practical use. C++11 didn't really move the needle much, if at all, in that respect compared to the typical languages on the other side of said comparisons.

Perhaps I happen to have been around different discussions than you?


> focused on how much the languages in question rely on GC for practical use.

That's quite nebulous. It should be quantified. But, while we wait for that, if we assume by that metric C++ is not a GC language today, but tomorrow C++ developers all collectively decide that all heap allocations are to depend on std::shared_ptr, then it must become a GC language.

But the language hasn't changed in any way. How can an attribute of the language change without any changes?


> That's quite nebulous. It should be quantified.

Perhaps, but I'm reluctant to speak more definitively since I don't consider myself an authority/expert in the field.

> But the language hasn't changed in any way. How can an attribute of the language change without any changes?

The reason I put in "for practical use" is because, pedantically speaking, no language actually requires GC - you "just" need to provision enough hardware (see: HFT firms' (ab)use of Java by disabling the GC and resetting programs/machines at the end of the day). That's not relevant for basically everyone, though, since practically speaking you usually want to bound resource use, and some languages rely on a GC to do that.

I guess "general" or "normal" might have been a better word than "practical" in that case. I didn't intend to claim that how programmers use a language affects whether it should be considered a GC language or not.


If Rust is a GC language because of Rc/Arc, then C++ is a GC language because of std::shared_ptr, right?


Absolutely.


Interesting... To poke at that definition a bit more:

Would that also mean that C++ only became a GC language with the release of C++11? Or would C++98 also be considered a GC language due to auto_ptr?

Are no_std Rust and freestanding C++ GC languages?

Does the existence of Fil-C change anything about whether C is a GC language?


> Or would C++98 also be considered a GC language due to auto_ptr?

auto_ptr does not exhibit qualities of GC.

> Are no_std Rust and freestanding C++ GC languages?

These are not different languages. A developer opting out of certain features as enabled by the language does not change the language. If one was specific and said "no_std Rust", then you could fairly say that GC is not available, but that isn't applicable here. We are talking about "Rust" as written alone.

> Does the existence of Fil-C change anything about whether C is a GC language?

No. While Fil-C is heavily inspired by C, to the point that it is hard to distinguish between them, it is its own language. You can easily tell they are different languages as not all valid C is valid Fil-C.


> auto_ptr does not exhibit qualities of GC.

OK, so by the definition you're using C++ became a GC language with C++11?

> If one was specific and said "no_std Rust", then you could fairly say that GC is not available, but that isn't applicable here.

I'd imagine whether or not GC capabilities are available in the stdlib is pretty uncontroversial. Is that the criteria you're using for a GC language?

> No. While Fil-C is heavily inspired by C, to the point that it is hard to distinguish between them, it is its own language. You can easily tell they are different languages as not all valid C is valid Fil-C.

OK, fair; perhaps I should have been more abstract, since the precise implementation isn't particularly important to the point I'm trying to get to. I should have asked whether the existence of a fully-compatible garbage collecting implementation of C change anything about whether C is a GC language. Maybe Boehm with -DREDIRECT_MALLOC might have been a better example?


> I should have asked whether the existence of a fully-compatible garbage collecting implementation of C change anything about whether C is a GC language.

In a similar vein, tinygo allows compilation without GC[1]. That is despite the Go spec explicitly defining it as having GC. Is Go a GC language or not?

As you can see, if it were up to implementation, a GC/non-GC divide could not exist. But it does — we're talking about it. The answer then, of course, is that specification is what is significant. Go is a GC language even if there isn't a GC in implementation. C is not a GC language even if there is one in implementation. If someone creates a new Rust implementation that leaves out Rc and Arc, it would still be a GC language as the specification indicates the presence of it.

[1] It doesn't yet quite have full feature parity so you could argue it is more like the Fil-C situation, but let's imagine that it is fully compatible in the same way you are suggesting here.


You make some good points, and after thinking some more I think I agree that the existence of GC/non-GC implementations is not determinative of whether a language is typically called a GC language.

After putting some more thought into this, I want to say where I diverge from your line of thinking is that I think whether a language spec offers GC capabilities is not sufficient on its own to classify a language as a "GC language"; it's the language's dependence on said GC capabilities (especially for "normal" use) that matters.

For example, while you can compile Go without a GC, the language generally depends on the presence of one for resource management to the point that a GC-less Go is going to be relatively restricted in what it can run. Same for Java, JavaScript, Python, etc. - GC-less implementations are possible, but not really reasonable for most usage.

C/C++/Rust, on the other hand, are quite different; it's quite reasonable, if not downright common, to write programs that don't use GC capabilities at all in those languages. Furthermore, removing std::shared_ptr/Rc/Arc from the corresponding stdlibs wouldn't pose a significant issue, since writing/importing a replacement is something those languages are pretty much designed to be capable of.


Games are 0.00001% of software? Web browsers? Operating systems?


Large parts of web browsers (like Firefox's entire UI) are written in JavaScript already.

Operating systems _should_ use GC languages more. Sure, video needs to have absolute max performance... but there is no reason my keyboard or mouse driver should be in a non-GC language.


> Large parts of web browsers (like Firefox's entire UI) are written in JavaScript already.

Is UI even a large part of Firefox? I imagine that the rendering engine, JS engine, networking, etc, are many times larger than UI.


Most programs should be written in GC'd languages, but not this.

Except in a few cases, GCs introduce small stop-the-world pauses. Even at 15ms pauses, it'd still be very noticeable.


Might want to read up on Ponylang, a GC'd FOSS language without stop-the-world pauses, demonstrating that it's possible. It should also be pointed out that there are a number of proprietary solutions that claim GC with no pauses. If you're coming from more common C-family languages, though, Ponylang may take some getting used to, with its actor model and different syntax.


There are GC algorithms that don’t require stopping the world.


Good point. However, similar offshoot languages of Golang do have them: Vlang (sum types) and Odin (discriminated unions). And per your comment, only Vlang has an optional GC (on by default) and a flexible memory-management philosophy (the GC can be turned off without the stdlib depending on it).

