>Despite this, there has always been interest in the possibility that Java programs could be directly compiled to machine code and execute standalone without a JVM.
Cue the "you could always do that with some semi-supported neglected FOSS like GCJ or some niche commercial product all 3 people used" from certain HN circles -- you know who you are :)
The title of the article is a bit misleading. The goal is first to merge GraalVM's JIT compiler (as an alternative to C2). Then later to bring in the AOT compiler.
After this merge we will have a JVM implemented mostly in Java (aka "self hosted"). Currently about 1% of the JVM runtime is implemented in C and this will be replaced with instrumented Java code (including garbage collectors). This means that only one runtime will be maintained in the future and Java developers will do it (and not C/C++ developers, who are harder to find). It will be much easier to implement and test new memory management mechanisms than it is today.
If they also decide to merge support for other languages (including WASM and LLVM) we will be able to run everything that compiles to WASM and to LLVM on the same runtime - in the same JVM. So we will get another "Docker Without Containers"...
EDIT: It seems that I was wrong with the above 1% statement. I meant that about 1% of the Java standard library is written in C/C++. Regarding JVM: I guess that the whole HotSpot VM will be replaced with GraalVM in the future and when this happens, the majority of the JVM code will be in Java.
Where did you get the 1% figure from? HotSpot has millions of lines of C++, that seems far too low.
Graal doesn't affect memory management. It's a JIT compiler, not a GC. Although there's a relatively simple GC written in SystemJava for native images, G1, ZGC and Parallel will continue to be the standard GCs in the regular JVM for the foreseeable future and they are all written in C++.
To add to your question, assuming the tooling at github is ok, it has ~20% as c/c++ for https://github.com/openjdk/jdk. I would think most of that would be concentrated in the hotspot code, but I don't know. Would be delighted to know what the original post was referring to.
I think the entirety of Hotspot is written in C++ [0]? The standard library in the JDK is mostly written in Java, but the VM itself with the GC and JIT code is in C++, which makes sense.
If you do not count the standard library as part of the JVM (but part of the JDK), then the JVM is written mostly in C++.
Because of its inherent memory safety, Java is attractive for anything in the JVM which is not strictly necessary to be coded in C/C++ for some reason (raw performance, low-level, or OS-specific). With Projects Valhalla and Panama around the corner, there will be even fewer hard reasons going forward. Also, C/C++ compilers can be tricky dependencies at times.
It's surprising to me that effective knowledge of C and/or C++ is the stumbling block for contributing to the core of the JVM.
I work at a company with one large C++ codebase, and also many Go projects where C and C++ knowledge are helpful. We have trouble hiring good C++ developers, because our candidate pool starts as "people who want to write web applications".
But if you're already restricted to people who want to work on an extremely mature compiler, VM, and JIT - is lack of C knowledge really what's making it hard to hire people? These are people who are likely to end up reading kernel code and C-defined ABI documentation and etc. etc. even if the codebase is 100% Java...
Good pay should cover your expenses, vacations and such, a bit of retirement savings, and leave you with some disposable money as well; otherwise, it depends on how much you want the job. (It's the same old supply vs. demand kind of thing.)
In my mind, this feels more like "acceptable pay". Excluding absolute emergencies, I'd never take any job as a grown adult that didn't check all these boxes.
Even if you're maybe right about "the C people" (most of those I've met were indeed the kind of "there's nothing besides C"), this doesn't seem to be relevant here. The JVM is written in C++, not C.
Also there is a large chunk of people (like me) who don't care about Java (the language) but very much about the JVM (and to some degree also about the Java ecosystem). I'm a Scala developer. But I'm also interested lately in Rust (and to some degree C++, as both languages are related), 'cause, you know, max performance and full control, and such.
So at least when it comes to interests there is some pool of people for sure who would like to look into the low-level JVM internals.
People are human; we get petty and tribal at every level. And even putting all that aside, if you genuinely believe that one kind of programming or another is the way forward then of course you're going to be more interested in working on technologies that support that.
The ideal hire would be "language implementation" people. There's a tendency for these to be C/C++ people because these offer conveniences in implementations. But they would also care about other languages and how they work.
> Regarding JVM: I guess that the whole HotSpot VM will be replaced with GraalVM in the future and when this happens, majority of the JVM code will be in Java.
I believe this is still incorrect. GraalVM is HotSpot with the Graal JIT compiler and the Truffle languages.
The embedded and systems spaces are hard and require special talent and knowledge. You can't just take some front-end React developer and get them writing quality systems C code in a weekend... people spend entire careers in that space, and the number of people interested in that work shrinks by the year (look at all developer surveys).
Incidentally, that's one of the reasons why Rust is so successful in this space. Rust won't magically make you productive in a weekend, but the Rust compiler (with its error messages) is pedagogic enough that it is going to help you learn from your mistakes.
That's the extreme opposite of raw C (or raw JavaScript, etc.) in which it's very easy to believe that you have solved an issue, only to realize much later that you've been entirely misunderstanding some constructions of the language and (if your code somehow managed to land) you have made the entire product unstable or insecure without realizing it.
Of course, there are tradeoffs to each approach. But I believe that investing in Rust was a great initiative from Mozilla.
I have said this multiple times, but I am grateful for learning Rust as it helped shape the way I view memory ownership, and in the process made me a better developer.
It's certainly very interesting since Java was among the few high level languages that had unlimited access to C OS devs in theory, but it doesn't show this in practice.
I always felt they were hostile to understanding OSes and problems/solutions they provide and thereby limiting themselves to mostly junior C devs who didn't rock the boat by making real OS features for the JVM.
The percentage of developers that do C might be shrinking but the number of C developers surely has to be increasing. You can barely throw a rock in a major city without it landing on a developer nowadays.
It actually is. Managed languages don't suffer from many kinds of bugs that C/C++ and other unsafe languages do. Memory corruption and concurrency bugs can be devilishly hard to find in the latter.
>Despite this, there has always been interest in the possibility that Java programs could be directly compiled to machine code and execute standalone without a JVM.
Cue the "you could always do that with some niche commercial product all 3 people used" from certain HN circles -- you know who you are :)
I don't see too many immediate term improvements from what we have today, but I do see GraalVM to be just different enough from mainline OpenJDK releases to question exploring the features of that build (in JVM or native modes). I poked at the release from time to time between 19 and 22 but always ended up going back. If this can improve that story, I'd say it's a win for all. Of course I'm not sure what this changes in regards to Oracle's extended paid version of the GraalVM runtime.
GraalVM is a bunch of components, but the most important ones are 1) Graal, a library in Java that is a code generator and optimizer for compiler authors, 2) Native Image, a tool which turns Java programs into native executables via Graal, and 3) Truffle, a tool for implementing languages in Java that then get code generated, again by Graal.
"GraalVM" is the umbrella name of all these things packaged together, together with a couple languages implemented in Truffle. That's what you download when you go to the website.
Galahad is a proposal to merge the Graal code generator library in Java, into OpenJDK's codebase, so that it is developed in tandem and comes with first-class support. And none of the other things, for now (Native Image will come later.) But you can also do this today: you can instead configure a stable LTS release of Java to use Graal as the code generator for the HotSpot VM, as an alternative to C2. It's a little bit of work to do so, but you can use any compliant OpenJDK build and test it out. Galahad will just make this an "out of the box" option rather than having to get Graal from Maven or whatever.
So, you can just test things incrementally and see if Graal gives performance improvements over C2. You don't need to throw out your whole existing JVM, it's mostly just a new maven dependency. That's a much simpler and more targeted change, if you haven't given it a try (mostly by fiddling with some JVM -XX flags, as usual.)
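As a rough illustration of the "fiddle with some -XX flags" step described above, here is a minimal Java sketch that reports whether the running JVM was launched with a flag requesting a JVMCI compiler such as Graal. The class name `GraalJitCheck` and the method `requestsJvmciCompiler` are made up for this example. It only inspects explicitly passed launch flags, so a build that enables Graal by default would still report `false`.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper for illustration; not part of the JDK or of GraalVM.
final class GraalJitCheck {

    // Returns true if the given launch flags explicitly ask HotSpot to use a
    // JVMCI compiler (such as Graal) as its top-tier JIT. If the flag appears
    // more than once, the last occurrence decides.
    static boolean requestsJvmciCompiler(List<String> jvmArgs) {
        boolean requested = false;
        for (String arg : jvmArgs) {
            if (arg.equals("-XX:+UseJVMCICompiler")) {
                requested = true;
            } else if (arg.equals("-XX:-UseJVMCICompiler")) {
                requested = false;
            }
        }
        return requested;
    }

    public static void main(String[] args) {
        // Flags passed to this JVM at launch.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("VM: " + System.getProperty("java.vm.name")
                + " " + System.getProperty("java.vm.version"));
        System.out.println("JVMCI compiler explicitly requested: "
                + requestsJvmciCompiler(jvmArgs));
    }
}
```

The flags usually mentioned for this setup are `-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+UseJVMCICompiler`, with the Graal compiler jars supplied separately. Treat the exact set as an assumption to verify against your JDK version, since the details have changed between releases.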
GraalVM will mostly benefit those who are building and deploying Function/Serverless and Microservices most - the native compilation really shines in that space (for near-instant start times and footprint reduction).
For anyone else, the normal JVM is likely a better choice for now - the tech built in there is very mature and battle tested.
With that said - baking a bunch of GraalVM features into the normal JVM is nothing but good for everyone. Native AOT compilation does have its uses outside microservices/serverless, and some cross-pollination of VM ideas is a great idea.
There are a lot of other neat parts for graal that will likely be beneficial. One of the big benefits of graal is that it makes implementing guest languages a snap (via the truffle framework).
So you can get all the operational benefits of the JVM + a nice set of languages to play with along with cross-language interop (python calling java calling ruby calling perl, etc).
That particular claim/goal used to get me excited (run any language on GraalVM), but the reality seems to have worked out to be a very large asterisk next to that claim.
From what I can tell, the effort to get any particular language running on Graal is non-trivial, leading to that relatively small list of supported languages - most of which seem to be custom dialects in one fashion or another. Perhaps I'm wrong though...
GraalVM team member here.
Implementing any mainstream language is indeed a challenge, more so if you have to maintain bug-compatibility and cope with all the bits of bad design that went through the cracks in the de-facto implementation.
Truffle is not for beginners, but knowing the basic set of features e.g. partial evaluation, deoptimization... can get you very far already e.g. you can easily speedup any interpreter by 10X or more with minimal changes.
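To give a feel for what partial evaluation means here, the following is a toy AST interpreter in plain Java. It is not the Truffle API, and every class name in it is made up for this sketch. The `specialized` method is written by hand to show the kind of code a partial evaluator could in principle derive once the tree is treated as a constant: the node objects and virtual `execute` calls fold away, leaving only the arithmetic.

```java
// Toy illustration only. This is NOT the Truffle API; all names are hypothetical.
final class PartialEvalSketch {

    // The guest program "x * x + 3", represented as a tree of interpreter nodes.
    static final Node PROGRAM =
            new Add(new Mul(new Arg(0), new Arg(0)), new Const(3));

    // Hand-written stand-in for what partial evaluation of PROGRAM could
    // produce: the tree walk disappears and only the arithmetic remains.
    static long specialized(long x) {
        return x * x + 3;
    }

    public static void main(String[] args) {
        long x = 7;
        System.out.println(PROGRAM.execute(new long[] {x})); // interpreted: 52
        System.out.println(specialized(x));                  // specialized: 52
    }
}

// Base class of all interpreter nodes: evaluate against the call arguments.
abstract class Node {
    abstract long execute(long[] args);
}

// A literal constant.
final class Const extends Node {
    private final long value;
    Const(long value) { this.value = value; }
    @Override long execute(long[] args) { return value; }
}

// Reads the argument at a fixed index.
final class Arg extends Node {
    private final int index;
    Arg(int index) { this.index = index; }
    @Override long execute(long[] args) { return args[index]; }
}

// Adds the results of two child nodes.
final class Add extends Node {
    private final Node left;
    private final Node right;
    Add(Node left, Node right) { this.left = left; this.right = right; }
    @Override long execute(long[] args) { return left.execute(args) + right.execute(args); }
}

// Multiplies the results of two child nodes.
final class Mul extends Node {
    private final Node left;
    private final Node right;
    Mul(Node left, Node right) { this.left = left; this.right = right; }
    @Override long execute(long[] args) { return left.execute(args) * right.execute(args); }
}
```

In a real Truffle language the language author writes only the interpreter side, and the compiler does the specialization automatically at run time. The sketch only shows that both forms compute the same result.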
How long does it take to implement a programming language? Well, from hours to years... depending on the language.
To make my point: how long would it take to implement a JVM? A JVM is a complex beast, so I would guess anywhere from years to a decade. What if I told you that Espresso was written in just 6 months by an intern and a seasoned engineer? In just 6 months it was able to run Minecraft and even run itself.
I assure you there's no magic here, and certainly no blinding talent either; the only reason for this unheard-of productivity was Graal/Truffle.
So, whenever I talk about Espresso I always give all credit to Graal/Truffle, it is a sublime platform for implementing fast languages and runtimes, of which Espresso is just a byproduct.
Just a tiny side note: a basic toy JVM is actually not that hard (without JIT, trivial GC, limited standard lib) from personal experience. Of course a performant one with feature parity is indeed impressive (though I have yet to play with Espresso!)
It can run Javascript with peak performance similar to V8's with a tiny fraction of the development effort that went into the latter. (Of course, this is possible only because it leverages the many dev-hours spent on the JVM.) It is the fastest Ruby implementation by an integer factor, it can run C dependencies (often used by Python/Ruby) as well, and it can even optimize across language boundaries.
The smallish list of languages is just due to lack of effort, plus lack of immediate benefits — it is not hard to create an alternative language implementation. It is very hard to create one that is 100% drop-in replaceable with the real thing as every sufficiently complex program will depend on implementation quirks as well.
Regarding Javascript: there are at least two or three mature Java implementations out there, and many more non-Java ones. Its inconsistencies and most implementation gotchas are well-known by now. Also, it's a comparatively simple language that has not experienced as much organic growth as Python, Perl, or Ruby.
I am still impressed that it can beat V8 handily without extensive further optimization work in the JVM, but it's not that surprising.
Is it a small list? Especially if you count ILs like LLVM bitcode or WASM, it probably includes most of them.
Many of the languages people use in the real world are really large, really old, really hairy and rely heavily on native extensions that poke arbitrary interpreter internals. Truffle at least makes these implementable with higher performance, and the team size needed to get that done isn't infeasible. But implementing a Ruby or Python will never be a five minute job regardless of what tech you use.
It's small in that nearly half of the published list is academic languages and dialects, and the other half seems to be custom dialects of languages.
While LLVM or WASM support is nice, most modern languages do not compile to either one by default, which likely means it's not "turn key" to get your system running on Graal anyway.
On the other hand, Truffle is just a Java library (with some special handling when used by the Graal JIT compiler), so one will be able to simply place it on the classpath.
It'll help build cross-platform desktop applications. In theory, it'll streamline using GitHub Actions to produce platform-specific binaries, such as my FOSS Markdown editor KeenWrite[0], without having to have a copy of every target operating system.
To my knowledge, cross-compiling "native" Linux and Windows binaries using Java requires duct tape, chewing gum, and warp-packer.[1][2] If there's another way to cross-compile a Java application into a standalone binary for multiple target platforms, do tell!
GraalVM isn't a panacea. For example, GraalVM cannot compile Renjin[3], a pure Java R interpreter, which means switching from Renjin to FastR. Not trivial. Other sinkholes likely lurk that'll only reveal themselves when walking through the weeds.
The challenge with GraalVM runtime is that there are differences in licensing.
What was great about Java is that only for specialized use cases, one had to invest in a paid runtime. GraalVM moves essential features known from Java Runtime into an Enterprise Edition.
From GraalVM docs [1] on memory management: "The G1 GC (only available with GraalVM Enterprise Edition) is a multi-threaded GC that is optimized to reduce stop-the-world pauses and therefore improve latency, while achieving high throughput. Currently, G1 can only be used in native images that are built on Linux for AMD64."
I think tweaking the garbage collector when running on native is a specialized use case. Anecdotally, the projects using native image I'm aware of don't tweak the garbage collector settings that are available in the community version at all. If throughput is critical enough to warrant garbage collector fine tuning, there's a decent chance you'll run it on a VM anyway. Of course, there are exceptions, but I don't think this is a limitation for most projects.
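For readers who want to check which collector a given runtime is actually using before deciding whether GC tuning matters to them, a minimal sketch with the standard `java.lang.management` API is shown below. The class name `ActiveCollectors` is made up for this example. It is written with a regular JVM in mind and makes no assumption about how a native image populates these beans.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration: lists the garbage collectors
// that the running VM reports through its management interface.
final class ActiveCollectors {

    // Collects the name of each collector bean registered by the VM.
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(gc.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        // The printed names depend on the VM and on flags such as -XX:+UseParallelGC.
        System.out.println(names());
    }
}
```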
That being said, most of the JVM projects I'm very familiar with are written in Clojure, so they may well be outliers compared to Java or other JVM languages.