For anyone reading stuff like this, read Ellul’s Technological Society instead. Optionally followed by “The Meaning of the City.” I haven’t read the manifesto and don’t plan to; it apparently cited Ellul a fair amount and Ellul seems like a saner guide.
Actually, David Skrbina's "The Metaphysics of Technology" is the most readable, comprehensive survey of critical philosophical views of technology from the ancient Greeks to the modern era.
Ted K made virtually no reference to Ellul in his manifesto either. In any case, both were quite sane. Ted's manifesto is not a philosophical analysis of technology like Ellul's or Skrbina's work; it is a practical treatise against technology, and its primary thesis is that technological society must be destroyed.
Why not both? Ted Kaczynski was an important thinker whether or not you agree with his thesis and methods. ISAIF was published in newspapers and the sky didn't fall.
I tried to read Kaczynski's manifesto once and came away with the impression he was mentally ill. I think there's stuff in there that impresses people whose personality types dispose them to agree with him, and it sounds cogent to those people, but I found it hard to escape the impression that he was pretty incoherent and made logical leaps into a private reality.
I can't recall the details, but I saw a documentary that made me doubt that the MKUltra experience was very formative for him. IIRC he had a big change in his outlook and demeanor at around the typical age of onset for schizophrenia and similar conditions. According to what I've read, these conditions seemingly have multiple causative factors within an individual, genetic and environmental, where "environmental" factors can include things that started in the womb.
I think his ideas come across more clearly and coherently in his later work than in his manifesto, but that's often the case.
I don't regard him as a philosopher but rather as an agitator, not unlike Thomas Paine or (insert your preferred historical figure here). This is largely because his writing is not in a spirit of inquiry into where technology might go; rather, it is conclusory: industrialism is path-dependent and net negative, and the only open questions are how to undermine it effectively.
His works read like those of a book-smart guy who thought his intelligence in one field translated into intelligence at large. It didn't, and frankly his actions were so illogical and nonsensical that it's a puzzle why people think he is some sort of misunderstood genius.
He's no different than the many, many cranks writing ill-informed manifestos online.
That interconnect was 3 or 4 orders of magnitude faster than Ethernet at the time for things like barrier synchronization, and the hardware was fairly simple. [1]
Very cool. It all started because we had a bunch of gateway PCs that had parallel ports and we wanted a way to synchronize across them. It was fun trying out different parallel cards made with different 374 latches. We even made a few custom ISA cards to play around with other ideas. Hard to believe it has been 30 years since I was at Purdue!
Good links there. At a slightly higher level, I've sometimes wondered whether RTOSs and OS schedulers could make use of special-function hardware registers for keeping track of priority logic, etc.
It was almost definitely HP Dynamo. (Edit: if you combine ideas from HP Dynamo, SafeTSA JIT-optimized bytecode, and IBM's AS/400's TIMI/Technology Independent Machine Interface, you get a better version of the current Android Run Time for bytecode-distributed apps that compile ahead of time to native code and self-optimize at runtime based on low-overhead profiling.)
The really nice thing about Dynamo was that it was a relatively simple trace-based JIT compiler from native code to native code (plus a native code interpreter for non-hotspots). This meant that it would automatically inline hotspots across DLLs and through C++ virtual method dispatches (with appropriate guards to jump back to interpreter mode if the virtual method implementation didn't match or the PLT entry got modified). They didn't have to do any special-casing of the interpreter to handle virtual method calls or cross-DLL calls; it's just a natural consequence of a trace-based JIT from native code to native code.
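To make the guard idea concrete, here's a rough C-level sketch of how I read that technique (all names invented; this is not Dynamo's actual code): the trace is specialized to the indirect-call target seen while recording, so every run re-checks that target and bails out to the unoptimized path otherwise.

    #include <stdio.h>

    /* Invented names; a minimal sketch of a trace guard, not Dynamo's code. */
    typedef long (*impl_fn)(long);

    static long fast_impl(long x) { return x * 2; }  /* target recorded while tracing */
    static long slow_impl(long x) { return x + 1; }  /* some other implementation */

    /* The optimized trace, specialized for fast_impl. `current` is whatever the
       original indirect call site (vtable slot, PLT entry) resolves to right now. */
    static int run_trace(impl_fn current, long x, long *result) {
        if (current != fast_impl)
            return 0;            /* guard failed: fall back to the original code path */
        *result = x * 2;         /* inlined body of fast_impl */
        return 1;                /* trace ran to completion */
    }

    int main(void) {
        long r;
        if (!run_trace(slow_impl, 21, &r))  /* guard trips, so dispatch the slow way */
            r = slow_impl(21);
        printf("%ld\n", r);
        return 0;
    }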
The only downsides of something like Dynamo are (1) a bit of complexity and space usage, (2) some startup overhead due to starting in interpretive mode, and (3) if your program is abnormal in not having a roughly Zipf distribution of CPU usage, the overhead is going to be higher.
Ever since I read about Michael Franz et al.'s SafeTSA, an SSA-based bytecode for the JVM that more quickly generated higher-performing native code, I've had a long-term back-burner idea to write a C compiler that generates native code in a particular way (functions are all compiled to arrays of pointers to straight-line extended basic blocks) that makes tracing easier, and also stores a SafeTSA-like SSA bytecode along with the native code. That way, a Dynamo-like runtime wouldn't use an interpreter, and when it came time to generate an optimized trace, it could skip the first step of decompiling native code to an SSA form. (Also, the SSA would be a bit cleaner as input for an optimizer, as the compilation-decompilation round trip tends to make the SSA a bit harder to optimize, as shown by Franz's modification of Pizza/JikesRVM to run both SafeTSA and JVM bytecode.) Once you have your trace, you don't need on-stack replacement to get code in a tight loop to switch to the optimized trace; you just swap one pointer to native code in the function's array of basic blocks. (All basic blocks are straight-line code, so the only way to loop is to jump back to the start of the same basic block via the array of basic block pointers.)
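Here's a minimal toy sketch of the shape I have in mind (names invented; a compiler would emit something like this rather than hand-written C): each function is an array of pointers to straight-line blocks, each block returns the index of the next block, and installing an optimized trace for the hot loop is a single pointer swap.

    #include <stdio.h>
    #include <stddef.h>

    /* Invented toy example of the "function = array of basic block pointers" idea. */
    typedef struct { long long i, sum; } state_t;
    typedef size_t (*block_fn)(state_t *);

    #define DONE ((size_t)-1)

    static size_t entry_block(state_t *s) { s->i = 0; s->sum = 0; return 1; }

    static size_t loop_block(state_t *s) {       /* straight-line loop body */
        s->sum += s->i;
        s->i++;
        return s->i < 1000000 ? 1 : 2;           /* looping = indexing back into the table */
    }

    static size_t exit_block(state_t *s) { (void)s; return DONE; }

    static block_fn blocks[] = { entry_block, loop_block, exit_block };

    /* A hypothetical optimized trace for the hot loop (here just unrolled 4x). */
    static size_t optimized_loop_block(state_t *s) {
        while (s->i + 4 <= 1000000) {
            s->sum += 4 * s->i + 6;              /* i + (i+1) + (i+2) + (i+3) */
            s->i += 4;
        }
        while (s->i < 1000000) {                 /* any leftover iterations */
            s->sum += s->i;
            s->i++;
        }
        return 2;
    }

    int main(void) {
        state_t s;
        size_t next = 0;
        blocks[1] = optimized_loop_block;        /* installing the trace is one pointer swap */
        while (next != DONE)
            next = blocks[next](&s);             /* dispatch through the block table */
        printf("%lld\n", s.sum);
        return 0;
    }

Without the pointer swap the same dispatch loop just runs the baseline loop_block, which is why no on-stack replacement is needed to move a running loop onto the optimized trace.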
The background for HP Dynamo is that during the Unix wars, there were a bunch of RISC system vendors vying for both the high-end workstation and server markets. Sun had SPARC, SGI had MIPS, DEC had Alpha AXP (and earlier, some MIPS DECStations) and HP had PA-RISC. The HP Dynamo research project wanted to show that emulation via dynamic recompilation could be fast, so to get an apples-to-apples comparison for emulation overhead, they wrote a PA-RISC emulator for PA-RISC.
This project grew into an insanely powerful tool. It's called DynamoRIO and is still under active development and use today. It's one of the coolest technologies I've ever worked with.
It's used by the winafl fuzzer to provide basic block coverage for black box binaries.
Yes, I have poked around DynamoRIO a few times. It's now geared toward dynamically modifying binaries for various purposes from code coverage to fuzzing to performance and memory profiling.
There doesn't appear to currently be a turn-key solution similar to the original Dynamo. DynamoRIO could be used to put a small conditional tracing stub at the start of every basic block at application startup time, and then do some binary rewriting, similar to the original Dynamo, but it doesn't seem there are downloadable binaries that currently do this.
This dynamic optimization would be much easier and lower overhead (but less general) with cooperation from the compiler.
Could such a compiler include the runtime for this in the binary as an option? That might make it a lot more likely to be used by people, because it is all nice and stand-alone.
Who would benefit from this most? Is the benefit so diffuse it would almost have to be an open-source project without funding? Or could there be parties that see enough of an advantage to fund this?
I guess you could try and get a certain instruction set vendor (probably RISC-V, maybe ARM or x86 based) to have this as a boost for their chips. I guess the "functions are pointers to blocks" compilation could benefit from hardware acceleration.
You could presumably statically link in the runtime. Also, without the dynamically-optimizing runtime, it would run just fine, just a bit slower than normal native code due to the extra indirection. Lots of indirect calls also increase the chances of address mispredictions due to tag collisions in the BTB (Branch Target Buffer).
Function calls/jumps through arrays of pointers are how virtual method calls (and optimized virtual method tail calls) are executed, though in this case the table offsets would be held in a register instead of as immediate values embedded within the instruction. I'm not aware of any instruction set where they've decided it's worthwhile to add instructions specifically to speed up C++ virtual member function dispatch, so I doubt they'd find optimizing this worthwhile.
Also, if things go according to plan, your hot path is a long straight run of code, with only occasional jumps through the table.
The GP only asked about CPU instructions for faster indirect jumps, but I should add that there are at least 4 things that would help a system designed for pervasive dynamic re-optimization of native code:
1. Two special registers (a trace_position pointer and a trace_limit pointer) for compact tracing of native code. If the position is less than the limit, then for every backward branch, indirect jump, and indirect function call, the branch target is stored at the position pointer, and the position pointer is incremented. Both trace_position and trace_limit are initialized to zero at thread start, disabling tracing. When the profiling timer handler (presumably a SIGVTALRM handler on Linux) executes, it would use some heuristic to decide whether tracing should start. If so, it would store the resumption instruction pointer at the start of a thread-local trace buffer, set trace_position to point to the second entry in the trace buffer, and set trace_limit to one past the end of the trace buffer. There is no need for a separate interrupt when the trace buffer fills up; it just turns off tracing, and re-optimizing the trace can be delayed until the next time the profiling timer handler is invoked. (A rough sketch of these semantics appears after this list.)
2. A lighter-weight mechanism for profiling timers that can be both set up and handled without switching from user space to kernel space. Presumably it looks like a cycle counter register and a function pointer register that gets called when the counter hits zero. Either the size of the ABI's stack red zone would be hard-coded, or there would need to be another register for how much to decrement the stack pointer to jump over the red zone when entering the handler.
3. Hardware support for either unbiased reservoir sampling or a streaming N-most-frequent algorithm[0] to keep track of the instruction pointers of the instructions causing pipeline stalls. This helps static instruction scheduling for those spots where the CPU's re-order buffer isn't large enough to prevent stalls. (Lower-power processors/VLIWs typically don't execute out of order, so this would be especially useful there.) Reservoir sampling can be efficiently approximated using a linear feedback shift register PRNG logically ANDed against a mask based on the most significant set bit in a counter (see the second sketch after this list). I'm not aware of efficient hardware approximations of a streaming N-most-frequent algorithm. One of the big problems with Itanium is that it relies on very good static instruction scheduling by the compiler, but that involves being good at guessing which memory reads are going to be cache misses. On most RISC processors, the number of source operands is less than the number of bytes per instruction, so you could actually encode which argument wasn't available in cases where, for instance, you're adding two registers that were both recently destinations of load instructions.
4. A probabilistic function call instruction. For RISC processors, the target address would be IP-relative, with the offset stored as an immediate value in the instruction. The probability that the call is actually taken would be encoded in the space usually used to indicate which registers are involved. This allows lightweight profiling by calling into a sampling stub that looks back at the return address. Presumably some cost-estimation heuristic would be used to determine the probability embedded in the instruction, to make the sampling roughly weighted by cost.
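Here is roughly what I mean in item 1, written as pseudo-C for readability (names invented; in a real design this would be hardware behavior on every taken backward branch, indirect jump, or indirect call, not code):

    #include <stdint.h>

    /* Invented names; pseudo-C for the proposed hardware semantics. */
    static uintptr_t trace_position = 0;   /* special register; 0 at thread start disables tracing */
    static uintptr_t trace_limit    = 0;   /* special register; one past the end of the trace buffer */

    /* Conceptually executed by the CPU whenever a backward branch, indirect jump,
       or indirect call is taken. */
    static inline void maybe_record_branch_target(uintptr_t target) {
        if (trace_position < trace_limit) {
            *(uintptr_t *)trace_position = target;   /* append the branch target */
            trace_position += sizeof(uintptr_t);     /* bump the position "register" */
        }
        /* When the buffer fills, position == limit and tracing quietly stops; the
           profiling timer handler picks the buffer up on its next invocation. */
    }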
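And for the reservoir-sampling approximation in item 3, a minimal sketch of the idea (size-1 reservoir, invented names, illustrative LFSR taps): accept a new sample with probability roughly 1/2^floor(log2 n) instead of exactly 1/n, which only needs an LFSR step, an AND, and a compare.

    #include <stdint.h>

    /* Invented names; a size-1 "reservoir" approximated as described above. */
    static uint32_t lfsr = 0xACE1u;     /* any nonzero seed; taps below are illustrative */
    static uint64_t events = 0;         /* stall events seen so far */
    static uint64_t sampled_pc = 0;     /* the currently held sample */

    static uint32_t lfsr_step(void) {
        uint32_t lsb = lfsr & 1u;       /* one Galois LFSR step */
        lfsr >>= 1;
        if (lsb) lfsr ^= 0x80200003u;
        return lfsr;
    }

    /* Conceptually invoked once per pipeline-stall event with the stalling PC. */
    void record_stall(uint64_t pc) {
        events++;
        /* Mask of all bits strictly below the most significant set bit of the
           counter, so the acceptance probability is ~1/2^floor(log2 n) rather
           than the exact 1/n of true reservoir sampling. */
        uint64_t mask = events;
        mask |= mask >> 1;  mask |= mask >> 2;  mask |= mask >> 4;
        mask |= mask >> 8;  mask |= mask >> 16; mask |= mask >> 32;
        mask >>= 1;
        if ((lfsr_step() & mask) == 0)  /* a real design would size the LFSR to match */
            sampled_pc = pc;
    }

In hardware this would just be the event counter, the LFSR, and one sampled-PC register; the profiling handler reads the sampled PC to get a rough picture of where stalls are happening.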
I think there is room for nuance here. True deliberate practice is easiest to apply in those domains, but there are some lessons from Ericsson's work that can be applied using the NDM framework. I think the point that people often miss is that it is deliberate practice, not just practice. While we may not be able to practice like a potential chess grandmaster, we can deliberately challenge ourselves and look to learn from our mistakes. If you squint hard enough, some of what Klein writes about in "The Power of Intuition" seems like an attempt to deliberately practice something in a field without that long tradition of pedagogical development (or in fields that don't train for tacit knowledge). His paper with Peter Fadde, "Deliberate Performance," also talks about this. That said, you sometimes don't learn the right lessons, and if the situation changes too dramatically, all your tacit knowledge might work against you.
One tangential comment to go with this: one of Klein's occasional coauthors, Robert Hoffman, co-wrote a book on expert weather forecasters that I really liked. Weather is hard to predict, but one thing they found was that the best forecasters looked at the data before looking at what the computer models predicted. Once they had an idea of what they thought the weather might look like, they compared it with the model. This kept their skills sharp and ensured that they continued to learn.
In Peak, Ericsson seems to finally settle on one definition, which is what I used in the end. He calls what you just described ("not being able to deliberately practice like a grandmaster") a completely different thing: purposeful practice.
There's a whole chapter in Peak where he tries to talk about what to do if you are in a field with badly developed pedagogical methods. It's basically a badly written copy of The Power of Intuition (Klein). I was incredibly dissatisfied with it, because I was mostly interested in putting DP into practice, and his recommendations were far from practicable. I wish he had just referred to Hoffman or Klein, both of them practitioners in NDM, and therefore both more familiar with attempts to design training programs for fields where no pedagogical rigour exists.
I know you're inclined to give Ericsson a pass, and pass things off as deliberate practice even when his definition clearly excludes said thing. But my view is that we should call a spade a spade and use the exact definitions the man used. If he thought it was good enough for his popular audience, it should be good enough for me.
That makes sense; a couple of years ago there were a lot of misleading interpretations going around, and precise definitions help clear things up. I read Peak a while back, and I must’ve forgotten that distinction.
I found the Klein book you mentioned more useful than Ericsson’s as well, and that Fadde/Klein paper I mentioned was also pretty helpful. I need to reread both, and put them into practice more than I have. I read too much, and I don’t get the tacit knowledge that comes from experience...
Another good book is Surpassing Ourselves by Bereiter and Scardamalia. They studied how students developed writing skill. Their definition of expertise is a bit different than Ericsson’s, but I think it is more useful.
I think the optimal base is actually e, the base of the natural logarithm, or about 2.72. I saw a derivation of this before but can't find the reference (a sketch of the usual argument is below). Ternary actually comes out a bit better than binary from an information density standpoint, but all the other points you made about difficulties with base 3 and the benefits of base 2 still stand.
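The usual radix-economy argument, sketched (this assumes the cost of representing a number is proportional to the radix times the number of digits, which is only one possible cost model):

    % Cost of representing N in base b, assuming cost ~ (states per digit) x (digits):
    C(b) = b \cdot \log_b N = \frac{b}{\ln b}\,\ln N
    % Minimizing b / ln b over real b:
    \frac{d}{db}\!\left(\frac{b}{\ln b}\right) = \frac{\ln b - 1}{(\ln b)^2} = 0
      \;\Longrightarrow\; \ln b = 1 \;\Longrightarrow\; b = e \approx 2.718
    % Among integer bases, 3/\ln 3 \approx 2.73 edges out 2/\ln 2 \approx 2.89,
    % which is where the "ternary is slightly denser" claim comes from.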
I'm sure this isn't what Graeham had in mind with this post, but I'll comment anyway: Steven Spear identifies rapid experimentation as one of Toyota's strengths, not necessarily in product development, but in developing how they build the product. An example given in his book was checking whether adjusting the height of a source of parts would help. Rather than welding it in a new position, bolting it, or even using duct tape, the fastest way to check is to just hold it there. It's cheaper and faster. When you lower the costs (both money and time) of experimentation, learning happens more quickly. Or as this post puts it: "the more you can build quickly, the faster you can find what you don’t know".
Rapid prototyping/experimentation leads to rapid feedback, which in turn can lead to rapid learning.
Well put. Toyota's lean manufacturing was in fact what I had in mind when referring to the origins of the term. I wasn't familiar with that particular example, but was referring in general to Toyota's reputation for trying to reduce various forms of "waste" in their production.