Having now learned kdb+/q, k, and J, I'm wondering what the next logical evolution of this kind of language is. Static typing, laziness, GPU execution? I believe (or maybe just hope) that all programming is going to move in this direction, just like how we've seen statically typed functional languages become mainstream in the last 20 years. What's next?
Static typing wouldn't be helpful for performance and would take away from the APL-like terseness. Static typing can only help speed up classical von Neumann-bottlenecked programs; to go beyond that, the program has to run massively data-parallel tasks. APL languages, SQL and PySpark show that you don't need static typing for this at all and can achieve better performance than classic statically typed languages, given how simple it is to write parallel code this way. GPU programming is also much easier in higher-level languages which force you to think only in array-like sequential operations. Computers are designed for sequential operations natively; any "random memory access" is just a hardware-optimized hack to make it bearable.
My bet is integration with relational processing engines. Blend of APL family and SQL.
The huge shortcomings of SQL are the inability to create functions or ad hoc tables, to select n-1 columns, or to do any metaprogramming, so people use text templating languages instead. Also, the relational model is bad for performance because it assumes sets without order. APL and kdb show that order is important and that you can achieve much better performance if you can assume order in your data. Most big datasets are time-ordered anyway because they represent some log of data updates over time.
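To make the point about order concrete, here is a minimal Python sketch (not kdb, and not any real engine's API). The function name `time_range` and the sample data are made up for illustration. It assumes the timestamp column is already sorted, so a time-range filter becomes two binary searches and a slice instead of a scan over every row.

```python
from bisect import bisect_left, bisect_right

def time_range(times, values, start, end):
    """Return the values whose timestamp t satisfies start <= t <= end.

    Assumes `times` is sorted ascending and lines up with `values`
    (a hypothetical two-column table stored as parallel lists).
    """
    lo = bisect_left(times, start)   # first index with times[i] >= start
    hi = bisect_right(times, end)    # first index with times[i] > end
    return values[lo:hi]             # contiguous slice, no full scan

# Illustrative data only.
times = [1, 3, 3, 7, 10, 12]
values = ["a", "b", "c", "d", "e", "f"]
window = time_range(times, values, 3, 10)
```

Without the ordering guarantee, the same query would have to compare every timestamp against the bounds.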
My wish is for somebody to create something like Spark or a programmable SQL engine, but:
- no Java (if you care about performance, why choose Java in the first place?)
- no obscure 100-line Java tracebacks
- setup as easy as installing a binary and running a REPL. Give me kdb-like simplicity pls, not days spent figuring out how to install and configure Spark
- tables should have a notion of order so that you can do merge joins without sorting (asof joins mostly; it's painful to watch companies try to implement this at scale with full outer joins or other hackery)
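For anyone who hasn't run into asof joins, here is a rough Python sketch of the merge-style version that ordered tables make possible. It is only an illustration under stated assumptions: both inputs are lists of (time, value) rows already sorted by time, and the names `asof_join`, `trades` and `quotes` are invented for the example, not taken from any engine.

```python
def asof_join(left, right):
    """For each (time, value) row in `left`, attach the value of the most
    recent `right` row whose time is <= the left row's time.

    Assumes both inputs are already sorted by time, so a single forward
    pass over each side is enough (no sort, no full outer join).
    """
    out = []
    j = -1  # index of the latest right row with time <= current left time
    for t, left_val in left:
        # Advance the right cursor while the next right row is not in the future.
        while j + 1 < len(right) and right[j + 1][0] <= t:
            j += 1
        right_val = right[j][1] if j >= 0 else None  # None: no earlier right row
        out.append((t, left_val, right_val))
    return out

# Illustrative data only.
trades = [(2, "buy"), (5, "sell"), (9, "buy")]
quotes = [(1, 100.0), (4, 101.5), (8, 99.0)]
joined = asof_join(trades, quotes)
```

The right-hand cursor never moves backwards, so the whole join is one linear pass over both tables, which is the property that gets lost when the engine treats tables as unordered sets.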
I tried implementing a mix of APL, Lisp, Forth and SQL based on these principles, but unfortunately got overwhelmed.