A huge part of the goal here is to reduce the need for the "unholy mix of C, C++, fortran, matlab, octave, R and python routines" in both academic research work and machine learning / data science code in industrial settings. The whole project kicked off with me ranting about how I was sick of cobbling things together in six or seven different languages.
So interop is a very, very high priority. We have pretty good C ABI interop now. You can just call a C function like this:
We still want better C/C++ interop though. I've already looked into using libclang so that if you have the headers you don't even have to declare what the interface to a function is. That was very preliminary work, but I got some stuff working. Making Julia versions of C struct types transparently would be another goal here.
Another major interop issue is support for arrays of inline structs (as compared to arrays of pointers to heap-allocated structs). C can of course do this, but in any language where objects are boxed, it becomes very tricky. We're working on it, however. Anyone who wants to discuss, hop on julia-dev@googlegroups.com :-)
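To make the two layouts concrete, here is a minimal C sketch. The `complex_pair` type and all function names are hypothetical, made up for illustration; nothing here is taken from Julia's implementation. It only shows why a C routine that expects elements stored back to back can consume the inline array directly but not the array of pointers.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical example type; not taken from Julia or any real library. */
typedef struct {
    double re;
    double im;
} complex_pair;

/* Inline layout: n structs stored back to back in one allocation. */
complex_pair *make_inline_array(size_t n) {
    return calloc(n, sizeof(complex_pair));
}

/* Boxed layout: an array of n pointers, each to its own heap-allocated
   struct. Returns NULL (after cleaning up) if any allocation fails. */
complex_pair **make_boxed_array(size_t n) {
    complex_pair **boxes = calloc(n, sizeof *boxes);
    if (boxes == NULL) return NULL;
    for (size_t i = 0; i < n; i++) {
        boxes[i] = calloc(1, sizeof(complex_pair));
        if (boxes[i] == NULL) {
            while (i > 0) free(boxes[--i]);
            free(boxes);
            return NULL;
        }
    }
    return boxes;
}

/* Frees every box, then the pointer array itself. */
void free_boxed_array(complex_pair **boxes, size_t n) {
    if (boxes == NULL) return;
    for (size_t i = 0; i < n; i++) free(boxes[i]);
    free(boxes);
}

/* A C routine that expects the inline layout. */
double sum_re(const complex_pair *xs, size_t n) {
    double total = 0.0;
    for (size_t i = 0; i < n; i++) total += xs[i].re;
    return total;
}

/* Distance in bytes between consecutive elements of the inline array. */
size_t inline_stride(const complex_pair *xs) {
    return (size_t)((const char *)&xs[1] - (const char *)&xs[0]);
}
```

The result of `make_inline_array` can be handed straight to `sum_re`. The result of `make_boxed_array` cannot: its elements would first have to be copied into a contiguous buffer, which is the serialization step a boxed language has to pay for.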
"in any language where objects are boxed, it becomes very tricky. We're working on it, however"
Naive question (and I'll hold my tongue on my naive guesses): I understand why it is necessary to box user-defined types on the JVM, but why build this restriction into a new language that doesn't run on a restricted platform? Especially when performance and C interop are high priorities?
Or perhaps I misunderstood, and your statement should be read as sparsevector suggested.
Boxed values are pretty much necessary for dynamic languages: the box is where the information about what kind of value something is gets stored. It is a pain for this kind of thing, though. In a fully statically compiled language like C, you can eliminate the need for a box entirely. If you want dynamic typing, that's the price we've gotta pay.
Not necessarily. You just store a pointer to the type info inline with the data (like C++'s vtables). You can have unboxed "value-types" (const structs, essentially) in dynamic languages. In fact, you could even differentiate between a "boxed" ref type and a value type at runtime, because refs don't need all 64 bits of the pointer. So a ref is a 64-bit pointer with the first bit set to, say, 0, and a value (struct) type always begins with a 64-bit pointer to its type information, only it's tagged with an MSB of 1. Since you can't extend concrete types, you can easily store value types inline in an array, and just have the type-info pointer (which must be the same for all elements, because there is no inheritance) at the beginning of the array. And if your structs are aligned OK, you could easily pass them to C by skipping the type-info pointer, both in the single-value case and in the array case.
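One possible reading of the scheme described above, as a rough C sketch. Every name here (`type_info`, `value_array`, `VALUE_TAG`, `point2`) is hypothetical, and the sketch assumes 64-bit pointers whose top bit is unused, which holds for user-space addresses on common x86-64 and AArch64 systems but is not guaranteed by C itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

_Static_assert(sizeof(void *) == 8, "sketch assumes 64-bit pointers");

/* Hypothetical runtime type descriptor. */
typedef struct {
    const char *name;
    size_t size; /* size in bytes of one value of this type */
} type_info;

/* Assumption: the most significant pointer bit is free to use as a tag. */
#define VALUE_TAG ((uint64_t)1 << 63)

/* Header word of a value type: the type-info pointer with the MSB set. */
uint64_t make_value_header(const type_info *t) {
    return (uint64_t)(uintptr_t)t | VALUE_TAG;
}

/* A plain reference has MSB 0; a value header has MSB 1. */
int is_value_header(uint64_t word) {
    return (word & VALUE_TAG) != 0;
}

/* Recover the type-info pointer by clearing the tag bit. */
const type_info *header_type(uint64_t word) {
    return (const type_info *)(uintptr_t)(word & ~VALUE_TAG);
}

/* Array of inline values: one shared header, then the raw elements. */
typedef struct {
    uint64_t header;
    size_t length;
    unsigned char data[]; /* elements stored back to back */
} value_array;

value_array *value_array_new(const type_info *t, size_t n) {
    value_array *a = calloc(1, sizeof *a + n * t->size);
    if (a == NULL) return NULL;
    a->header = make_value_header(t);
    a->length = n;
    return a;
}

/* What gets passed to C: a pointer just past the header fields. */
void *value_array_payload(value_array *a) {
    return a->data;
}

/* Hypothetical C struct and a C routine that knows nothing about headers. */
typedef struct { double x; double y; } point2;

double sum_x(const point2 *ps, size_t n) {
    double total = 0.0;
    for (size_t i = 0; i < n; i++) total += ps[i].x;
    return total;
}
```

Because all elements share one type, the header is stored once per array rather than once per element, and `value_array_payload` yields a pointer that `sum_x` can treat as an ordinary `point2` array.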
Ok, having read this comment again, here's a more measured response. You're assuming in this comment the value-type vs. object-type dichotomy that's used in, e.g., C#. That's one way to go, but I'm not sold that it's the best way. Deciding whether you want something to be storable inline in arrays or not when you define a type is kind of a strange thing. Maybe sometimes you do and sometimes you don't. So the bigger question is really if that's the best way to go about the matter.
It seems that in dynamically typed languages, you either need to have two kinds of objects (value types vs. object types), or two kinds of storage slots (e.g. arrays that hold values inline vs. arrays that hold references to heap-allocated values). The boxing aspect is really only part of that since you can't get the shared-reference behavior unless the storage is heap-allocated, regardless of whether there's a box header or not.
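The difference between the two kinds of storage slots is easiest to see in the semantics rather than the layout. A tiny C sketch, with a hypothetical `counter` type and hypothetical helper names: storing into an inline slot copies the value, while storing into a reference slot shares the one underlying object.

```c
#include <stddef.h>

/* Hypothetical example type. */
typedef struct {
    int value;
} counter;

/* Inline slot: the value is copied into the array's own storage. */
void store_inline(counter *slots, size_t i, counter c) {
    slots[i] = c;
}

/* Reference slot: only the address is stored, so the object is shared. */
void store_ref(counter **slots, size_t i, counter *c) {
    slots[i] = c;
}
```

If the same `counter` is stored into two inline slots and two reference slots and then mutated, the inline slots keep the old value while both reference slots observe the new one. That shared-reference behavior is what requires the object to live somewhere outside the array, independent of whether a box header is present.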
This is an interesting scheme and seems like it might work, but I'm not sure. Would you be willing to pop onto julia-dev@googlegroups.com and post this suggestion there so we can have a full-blown discussion of it? Hard to do here — and some of the other developers would need to chime in on this too.
As a compromise, I think it would be helpful to be able to define structs that have a known layout in memory but no dynamic identity. They would be treated like primitive types in Java, but with the crucial difference that users could define their own. That way users could write Julia code that stores and accesses data in the same format they need for interoperating with whatever native libraries they use, instead of serializing and deserializing between Julia objects and C struct arrays (or using int or byte arrays in their Julia code and giving up most of the advantages of a modern programming language.)
We've discussed our way down that path but the design ends up being unsatisfying because "objects" and "structs" end up being so similar yet different. It may be what we have to do, but I haven't given up yet on having structs and Julia composite types be compatible somehow. pron's scheme is interesting.
Ah, there are finalisers. The function finalizer lets you define a function to be called when there are no more references to an object. I guess maybe the idea is to use this from within the constructor.
No, currently it's an inline array of immutable 128-bit numeric values and we use bit-twiddling to pull the real and imaginary parts out. However, that's a temporary hack. (It's also why the mandel benchmark is relatively slow — all the bit-twiddling is not very efficient.)
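For readers wondering what "bit-twiddling" means in practice, here is a guess at the general idea in C. The comment above does not say how the 128 bits are actually laid out, so the `bits128` type, the choice of which half holds which part, and all function names are assumptions made purely for illustration.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(double) == sizeof(uint64_t), "sketch assumes 64-bit doubles");

/* Hypothetical opaque 128-bit value. Assumed layout: the bits of the
   real part in lo, the bits of the imaginary part in hi. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} bits128;

/* Reinterpret a double as raw bits and back, without changing the bits. */
static uint64_t double_to_bits(double d) {
    uint64_t b;
    memcpy(&b, &d, sizeof b);
    return b;
}

static double bits_to_double(uint64_t b) {
    double d;
    memcpy(&d, &b, sizeof d);
    return d;
}

bits128 pack_complex(double re, double im) {
    bits128 z = { double_to_bits(re), double_to_bits(im) };
    return z;
}

double complex_re(bits128 z) { return bits_to_double(z.lo); }
double complex_im(bits128 z) { return bits_to_double(z.hi); }

/* Every arithmetic operation has to unpack both operands and repack
   the result, which is the overhead the comment above refers to. */
bits128 complex_mul(bits128 a, bits128 b) {
    double ar = complex_re(a), ai = complex_im(a);
    double br = complex_re(b), bi = complex_im(b);
    return pack_complex(ar * br - ai * bi, ar * bi + ai * br);
}
```

A C compiler will usually optimize the `memcpy` calls away, so this sketch shows the shape of the approach rather than its cost in a dynamic runtime.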
The longer-term approach is still up in the air and that's what I was talking about above. My favorite approach at this point is to allow fields to be declared as const — which in Julia means write-once. Then if all fields are const the object is immutable and can automatically be stored in arrays inline.
Can you please explain, or give a link that explains, what the problem really is (that is, why you have "bit twiddling" at all), and how you imagine that const arrays of complex can be write-once and still efficient?
What's the problem with having arrays of doubles and complexes as "basic" types even in a dynamic language? I believe this could give you a C footprint and C performance for array operations.
It sounds like he's saying it's hard for Julia to interop with languages that don't support arrays of inline structs (e.g. Java). I could be misreading it though.
So interop is a very, very high priority. We have pretty good C ABI interop now. You can just call a C function like this:
and it works. It works in the repl too and you can dynamically load new libraries in the repl. See this manual page for more details: http://julialang.org/manual/calling-c-and-fortran-code/.