Hacker News | new | past | comments | ask | show | jobs | submit | Morg's comments

Exactly, take the problem / solution, reduce it to 5 words tops and it'll make a standard short intro that people think is cool these days.

From what I can gather, it's a platform to make readable content more accessible and more dynamic, aiming to provide a nice reading/learning experience much like clicking around Wikipedia for half an hour, but on more serious topics and in a friendlier fashion that lets you easily go back to the article/topic you were reading one thought-step before.

That's still too long, but summarizing the idea made it way more appealing to me so it might be a direction.


Thanks. This helps :-)


Annoying UI, really. I hate the open/close thingies, I can't quite see the difference between one item and the next inside, the I is either too big or ugly, and the fat grey bar at the bottom is ugly as well; it probably needs some rounding or whatever.

I know pale UIs are all the rage, but you might want to consider a bit more color still. I'm not very good at UIs myself; it just felt really, really grey.


Well maybe you should check that again.

Your approach sounds like you should have used stored procs instead. Using prepared statements or variable binding to fight SQL injection is not the best idea, although it's widespread.

In most cases where you want a prepared statement, you'd be better off using a stored proc, as you'll skip the expensive optimization every single time.

MySQL is not even a real RDBMS (no ACID, no triggers, broken APIs, etc.); anyone using it should have switched to PostgreSQL yesterday, unless their data really doesn't matter.

SQL injection is 100% avoidable through user input control in the application, and the simplest way is to escape all escape characters. That may require reading a bit of documentation, but whatever.


> Using prepared statements or variable binding to fight SQL injections is not the best idea

What is the best idea then? If you bind variables to SQL statements, you are SQL injection safe 100% of the time. There is no crafty input sequence that can fool anything.
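As a minimal sketch of why binding is injection-safe (using Python's built-in sqlite3 module purely for illustration; the table and column names here are made up): even a classic injection string is treated as plain data, not as SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

# A classic injection attempt; with a bound parameter it is just a string.
evil = "'; DROP TABLE users; --"
conn.execute("INSERT INTO users (name) VALUES (?)", (evil,))

# The table still exists and the payload was stored verbatim.
row = conn.execute("SELECT name FROM users").fetchone()
print(row[0])  # '; DROP TABLE users; --
```

No crafty input can escape the placeholder, because the value never passes through the SQL parser as text.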

> In most cases where you want a prepared statement, you'd be better off using a stored proc, as you'll skip the expensive optimization every single time.

I am only qualified to speak about Oracle, which is a DB I know extremely well. A query is a query, whether it comes from Java, Perl, Ruby or inside a stored proc. If you prepare a statement once, then cache that handle and execute it many times, you optimize the query one time. Also, in Oracle, if you prepare-bind-execute one time only, the next time you do the same sequence of steps Oracle doesn't have to optimize the query again - it can spot that it is the same as a previous query and short-circuit the process.
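A rough sketch of the prepare-once, execute-many pattern (shown with Python's sqlite3 here purely for illustration - Oracle's shared statement cache works differently under the hood): one statement text, many executions with different bound values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")

# One parameterized statement reused for every row: the driver can
# parse/plan it once and only rebind the parameters on each execution.
stmt = "INSERT INTO t (id, val) VALUES (?, ?)"
rows = [(i, f"row-{i}") for i in range(1000)]
conn.executemany(stmt, rows)

count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(count)  # 1000
```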

MySQL, I think, doesn't cache SQL statements for later reuse like Oracle does, which is why binding isn't as important for performance (in Oracle, not binding queries is a pretty good way to bring the database to its knees) - but it's still essential for security.

> SQL injections are 100% avoided by user input control in the application, and the simplest way is to escape all escape characters

What this bug has just proven is that this escaping is not all that easy - crafty attackers can come up with all sorts of strings that seem to work around the escaping time and time again.


"Your approach sounds like you should have stored procs instead." Not necessarily. As one moves toward stored procs, it becomes more common to place business logic in said stored procs. This tends to go against MVC and also limits some scaling options (as your business logic is then executing in your DB). They also often tie you to a single DB, making moving DBs more painful.

"Using prepared statements or variable binding to fight SQL injections is not the best idea, although its widespread." I read it not as _the_ best idea, but as yet another layer to potentially catch something. Belt _and_ suspenders if you will.

Don't get me wrong. I'm not saying prepared statements are the greatest thing since sliced bread; however, in Oracle- or PostgreSQL-backed apps, they're a best practice worth investigating.


I am wary of reading that every RDBMS does an expensive query-plan calculation on every non-sproc DML... maybe you know of a few that behave this way, but there are plenty that don't.

Alternately, suppose one would like to use SQLite with a competent ORM--what's the harm in that?


[deleted]


Indeed it would affect everyone just as much - still, I will spread MySQL knowledge (not hate) whether or not there is a reason for it.

And I'll say it time and time again: if you think MySQL does not have major issues as a DBMS, you should not be making database-related decisions, as your knowledge is too limited to make a sensible one.

Just like windows.

The real Windows experts can both tell you how much it's made of fail and fix your issues, the others are charlatans.


Exactly. OS X had very bad security before the first big news stories about viruses, and it won't have good security before another thousand big news stories about viruses, trojans and backdoors.


As usual, when you trust a piece of code without reading the source, the fail will be strong.

This is like Java's GUIDs, which are random but not unique, etc. You can't guess it from the function name or description; you need to know the internal process to know how it's going to explode and when.


95% of even good developers wouldn't be able to tell when an SQL sanitization function is poorly coded or has a hidden gotcha. Having the source is not nearly as important as trusting the upstream to be smart and to promptly resolve security issues when they are discovered.


I trust no one, except maybe the pgsql guys. However, IMHO, on the topic of SQL injection, either escaping the escape characters is enough or you should change DBMSs/APIs right away.

But really, security without reading sources is a blind, more or less calculated risk - not security.


That's a really strange way of looking at things, in my opinion. There are things in life that you just have to trust implicitly. I'm not saying someone else's code falls into that category, but just because it is open source and you can supposedly discover any caveats or security risks on your own does not make that task truly reasonable. I'm not in a position where I can read through all of the source code for MySQL, Apache, Passenger, Rails, Ruby, etc. in order to make sure that someone hasn't made a mistake. To be honest, I'm not sure that I would recognize an error like this by just reading the code.

What do you do with proprietary/closed source software? What do you do with hardware that is just as capable of poorly implementing security? What about poor decisions that really only become apparent after a security hole is discovered?


First, I don't believe I need real security, that protects me from most of the worries you cited.

I know it's not safe, and I don't care.

It's like mail or Gmail or anything: I know someone has access to my data, and I don't care, because it's unavoidable / not an issue.

You have to trust, but actively try to prove wrong; that weeds out most of the crappy software, like MySQL, MSSQL (lolwut, a 32-trigger chain? let's cut it here silently) or others.

You have to base your decision on stuff that really works rather than the latest fad, so fck Ruby and all that crap; write in C, that's safe - proof being that even the Chinese and the military have their OSes written in C.

With proprietary/closed source, you remain paranoid, test it yourself for whatever you can think of, and never assume it cannot be the cause.

Hardware you cannot trust; you have to learn where the limitations are, remain paranoid as well, and question the status quo (is ECC really doing its job, or am I just entrusting my enterprise data to magic?).

Poor decisions that you realize later? Everyone makes mistakes; who cares?

IMO the main thing is: don't trust anyone to do it right, especially in IT. Sometimes you come to trust a specific group, like the Linux kernel or pgsql people, because they're proven right time and again - and IMO you have to leave it there; I don't want to write an OS at the moment.

Most poor security decisions are related to trivial things like:

- using Windows
- not updating your OS / kernel / tart
- using testing tech, like the latest release of Ruby, node.js, mongolianDB, etc.
- not researching tech before using it (i.e. google "mysql ACID"; you'll read a few of my posts from when I was pissed off to discover it was in fact just a toy DB with half-implemented features)
- not actively trying to hack/destroy your own creation
- not spending a few K on a honeypot session
- not actually knowing anything about hacking
- not reading about standard hacking tactics, like SQLi for nubs, XSS, MitM HTTPS, tomato launchers and many more

etc. I'm no security pro and I wouldn't pretend being one before winning several honeypots.


HIV (just to be clear, I have no clue whether HIV was assisted by some military programmes, but I can safely say such "mistakes" have been made in the past by the same army, like when they used to test nukes, for example)

Luckily, this time it's a simple PC virus we can easily disassemble and counter - I think cyber war is still miles better than the alternatives.


No, you can't "safely" say that. You've no evidence that the US military has ever been involved in creating a wide-scale biological pandemic. This is just conspiracy theory bunk.

I'm not sure how you can reasonably compare testing nuclear weapons to the supposed propagation of HIV, either. These two things have nothing in common.


Alright, you want details? During the testing of nuclear weapons, the US had no problem testing the secondary effects of nukes through radiation far beyond the blast zone, by putting boats with soldiers there to watch the thing.

It was widely known at that time that radiation was bad for you, mkay, and that nuclear fission bombs were nuclear fission bombs, i.e. accelerated-nuclear-decay bombs that drew all their explosive power from radiation, which kills, mkay.

In the past, biological, chemical and explosive weapons were tested on rocks, plants, prisoners, personnel, unsuspecting local populations, etc. by the Nazi regime, the US govt, the USSR and France - all of which is widely confirmed.

I wouldn't put it past THOSE people to do such a thing, would you ?

So really, if you want to say it's IMPOSSIBLE or UNLIKELY that they would've done that too, without knowing the consequences - I suppose you must be right.


Yes, the US has done bad things, and AIDS is a bad thing, but it doesn't follow that the US caused AIDS. After all, nature has had no trouble creating pandemics without any deliberate human help over the centuries.


(just to be clear, I have no clue whether HIV was assisted by some military programmes, but I can safely say such "mistakes" have been made in the past by the same army, like when they used to test nukes, for example)

That means I just used the HIV word to connect to the concept of bio weapon testing gone wrong - weapon testing gone wrong.

The reason why is that one of the most popular theories on HIV is that the US military had a part in its development.

I don't know and I don't care; those people have such bad karma that even AIDS wouldn't make much difference - just read the disclaimer next time ;)


> That means I just used the HIV word to connect to the concept of bio weapon testing gone wrong - weapon testing gone wrong.

This means you just randomly connected unrelated things.

The connection doesn't even make sense at a basic level. If the US knowingly put sailors in boats near nuclear blasts to test for radiation effects, it wasn't a mistake. It was an intentional act. It makes no sense to say that this implies that super-HIV could have been accidentally released in the wild by the US government.

> The reason why is that one of the most popular theories on HIV is that the US military had a part in its development.

Popular among conspiracy theorists, perhaps, not among the general population or among experts in the field.


> I wouldn't put it past THOSE people to do such a thing, would you?

This is amazingly bad logic.

  The US has done some bad things.
  This is a bad thing.
  Therefore the US did this.
  Q.E.W.T.F.
There is no link here, except in your own mind.


Yet everyone uses much slower RAM in servers and will likely continue to do so, all the while caches swell, etc.

Optimizing memory usage is almost irrelevant today, until it starts being a bandwidth problem, and that's still solvable but only through complex scaling strategies that also cost several engineer-years.


For people using mysql, that kind of query really isn't such bad SQL at all.


Shocked moviegoers will have been left wondering why a genius-level hacker would outer-join to the Victims and Keywords tables only to use literal-text filter predicates that defeat the outer joins

Any excuse for this? :)


The MySQL optimizer will actually notice and remove the outer join aspect automatically. So I have often done this out of pure laziness if I start with an outer join, but really end up needing an inner join.


Realistically, they started the query with keywords optional, then moved to them being required (which, it should be noted, a good query planner would effectively turn into an inner join). On the scale of SQL evils, it lies somewhere around "friendly benign".
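To illustrate why the literal filter defeats the outer join (a small sqlite3 sketch with made-up table names loosely echoing the movie query): a WHERE predicate on the outer table's column discards the NULL-extended rows, so the LEFT JOIN returns exactly what an INNER JOIN would - which is why a good planner can rewrite it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE victims (id INTEGER, name TEXT);
    CREATE TABLE keywords (victim_id INTEGER, word TEXT);
    INSERT INTO victims VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO keywords VALUES (1, 'target');
""")

# bob has no keywords, so the LEFT JOIN would give him a NULL word...
left = conn.execute("""
    SELECT v.name, k.word FROM victims v
    LEFT JOIN keywords k ON k.victim_id = v.id
    WHERE k.word = 'target'
""").fetchall()

# ...but the literal WHERE predicate throws that NULL row away,
# leaving exactly the result a plain INNER JOIN produces.
inner = conn.execute("""
    SELECT v.name, k.word FROM victims v
    JOIN keywords k ON k.victim_id = v.id
    WHERE k.word = 'target'
""").fetchall()

print(left == inner)  # True
```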


Someone should add basic numbers, like the nanosecond cost of a 63-cycle modulo and that type of stuff. That'll help bad devs realize why putting another useless cmp inside a loop is dumb, and why alternating rows in a table should NEVER be implemented with a modulo, for example.

Yes I know that's not latency per se but in the end it is too.


I think if you're worried about whether or not you use modulo to calculate alternating table rows (and you don't work for Facebook), then you're almost certainly optimizing prematurely.


Actually, it does not matter whether you are Facebook or not. What really matters is how tight the loop is and how much time is spent in it.

EDIT: I agree with Morg. If coding right also results in faster code there is no reason not to do that.


IT DOES NOT COST MORE TIME TO CODE CORRECTLY

Some approaches are NOT acceptable, it's not about optimizing prematurely, it's about coding obvious crap.

While you may be used to the usual "code crap, fix later" and "waste cycles, there are plenty of them" mindset, it doesn't mean you're right.

Everyone says it, but you're still running on C (Linux, Unix), you're still going nuts over scaling issues (lol, NoSQL for everyone) and you're still paying your Amazon cloud bill.


I know you're ranting to the world at large, but I am not "going nuts over scaling issues". All my websites are static HTML files. I regenerate them as needed using custom Python code and my "databases", which are text files in JSON.

I have several sites running on a single smallest Linode, and the CPU utilization virtually never cracks 1%.

Also, note that I am not advocating "coding crap". I'm talking about not berating coworkers over the nanosecond cost of an extra modulo inside a loop.


If said coworkers are actually trying to improve and can take the advice peacefully, I will deliver it peacefully.

The others I will be pleased not to work with.


Quite to the contrary, the optimization of using any particular method to colour rows is so tiny that it can easily be outweighed over its lifetime by the 50 or so extra keystrokes it takes to type. That's how trivial this is (which is why people are reacting to your extremely aggressive tone).


Indeed, I should drop the aggression. However, the subject is not optimization but coding correctly in the first place.

And the anti-optimization argument would be correct if:

- typing represented more than 1% of dev work
- code was never reused
- code was never massively used
- code had a short lifespan

So let me help you see clearly:

- I'm not a typist
- every bad code tutorial out there creates millions of code bits that contain the N-times-slower version, with an aggregate impact that actually matters
- any 10% optimization mistake in a codebase like iptables would cause more carbon than you can imagine
- Fortran is still in use because it's the fastest language there is, with the best math libraries

Those seem to be eternal so far, and C seems to remain the only other relevant language throughout the short history of coding.

Sure, there are much more problematic cases than the dumb even odd example, but I picked that one because many would recognize it.


It does cost considerable time and brain bandwidth to learn to "code correctly" if coding correctly means knowing how to avoid every excess few nanoseconds.

If your code is expressive, easy to reason about and fast enough, then less expressive, harder to reason about and even faster code isn't more correct.


Or, you know, just get a decent compiler: http://publications.csail.mit.edu/lcs/pubs/pdf/MIT-LCS-TM-60...


I suppose you are referring to the very particular case of the right shift, but as much as that's easily predictable, it's a corner case.

Who knows, maybe the trend will be 3 colors instead of two. Or maybe it'll be another instruction that's wrongly abused. Or another compiler that actually sucks, like most JS interpreters.

The idea really is to use the simplest logical approach to the problem rather than the wrong one.

In the very well-known case of the alternating-row table, it looks to me like we're alternating odd and even, so why not just code that to start with, before any optimization?


No, this form of strength reduction can often eliminate modulo operations in a loop even when the modulus is not a constant. The example given is:

  for(t = 0; t < T; t++)
    for(i = 0; i < NN; i++)
      A[i%N] = 0;
which is optimised to this, without a modulo in sight:

  _invt = (NN-1)/N;
  for(t = 0; t <= T-1; t++) {
    for(_Mdi = 0; _Mdi <= _invt; _Mdi++) {
      _peeli = 0;
      for(i = N*_Mdi; i <= min(N*_Mdi+N-1,NN-1); i++) {
        A[_peeli] = 0;
        _peeli = _peeli + 1;
      }
    }
  } 
I find the modulo easier to read in this case, but I guess that's a question of taste. It's certainly not 'wrong' to use a modulo, and probably worth the trade off in most cases if it makes your code clearer.


Yes, sometimes the compiler can compensate for bad decisions by the programmer, the JVM can collect your garbage, etc. - none of these will save you from stupid data models and idiotic objects.


How should they be implemented?

And per se should NEVER be spelled "per say".


Indeed, it should never be spelled wrong, as it means "in itself" in Latin; my bad, really.

Alt rows are a simple concept, the first row is odd, the next is even, etc.

A good step forward is an if/then/else, a switch or an unrolled loop - a huge step forward in terms of performance too, as a mod takes 63 cycles and a cmp takes almost nothing.

an example could be

  rowClass = 'even'
  loop
    if (rowClass == 'odd') {
      rowClass = 'even'
    } else {
      rowClass = 'odd'
    }
  endloop


Unrolling loops is premature optimization, and we all should know what that is the equivalent of.

Unroll your loops once they are tried, tested, and working correctly, and a profiler finds that you spend too much time in the specific parts that would be discarded when unrolling.

The liberties you took with your pseudocode above prove the point: as others have noted, you've chosen premature optimization over using a fitting data type. The first could be easily fixed before release. The latter is harder.


I think I must be misreading you. Are you suggesting doing a string comparison to avoid the performance hit of a mod?


I did write it like that yes.

And it would still be faster than a mod, too; even though one byte might be better for register usage, it won't affect cycles that much, IIRC.


Hey, guess what? You're wrong (at least in Python, which is a reasonable guess for a language that's generating HTML):

    >>> import timeit
    >>> timeit.Timer(stmt="z=101%2").timeit()
    0.033080740708665485
    >>> timeit.Timer(stmt="z='even'=='odd'").timeit()
    0.05949918215862482


It does seem to be true for javascript though:

  > prof = function(fn) { var start = Date.now(); fn(); return Date.now() - start; }
  > cmp = function() { for (var i=0; i < 1000000000; i++) { var z = 'odd' === 'even'; } }
  > mod = function() { for (var i=0; i < 1000000000; i++) { var z = 101 % 2; } }
  > prof(cmp)
  20329
  > prof(mod)
  40792
Whether you think those 20 nanoseconds per test are worth saving is, I guess, an open question. :) I can imagine it being useful for game programming, for example.


If you know there are only two states, why not have a bool that you flip each time?

is_even = !is_even should be a lot cheaper than string comparison or modulus, assuming a reasonable language.
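A quick sketch of that toggle approach in Python (the function and class names here are just for illustration): the state flip replaces any arithmetic on the row index.

```python
def row_classes(n):
    """Return n alternating CSS class names by flipping a boolean."""
    is_even = False
    classes = []
    for _ in range(n):
        classes.append("even" if is_even else "odd")
        is_even = not is_even  # flip the state instead of computing i % 2
    return classes

print(row_classes(4))  # ['odd', 'even', 'odd', 'even']
```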


Indeed you can do much better: a flipped boolean + cmp if you only need two states, and an integer + array if you need three or more. The main thing is, even the "stupid" string comparison is much cheaper than a modulus, and it's a very logical, basic approach that says "if the previous row was odd, this one must be even".

My idea is that it's extremely important to reach that conclusion, as it matches the problem perfectly and thus is much more efficient than our (often natural) standard approach of mod(x, 2).

When you've reached that step, you can further improve by using a boolean instead of a short string, but that steps clearly into optimization, as it's not "formulating the problem correctly" but "finding a better way to implement the same solution".

There is major cost in not formulating the problem correctly (even odd is an approach to alternating colors, not the problem itself), and mod for table rows is a prime example of that.


> Some approaches are NOT acceptable, it's not about optimizing prematurely, it's about coding obvious crap.

And yet we miss the simple, efficient answer?

    isEven = 1
    rowClasses[] = { 'odd', 'even' }
    loop
        isEven = 1 - isEven
        rowClass = rowClasses[isEven]
    endloop
It might only be an example but if you're going to complain about people's inefficient/wrong code, the least you can do is provide a good demonstration.


As I said above, that's a better implementation of the same solution which is, pick two states and alternate by using the knowledge that if the previous is odd the next is even.

In essence your solution and the strcmp one follow the same logic, except yours is limited to two states as it is - but indeed the best n-state solution uses an array too.

What you posted here is a somewhat optimized implementation of the right solution, which is slightly better, like the boolean one (indeed you're using one bit that you flip ...).

But the BIG difference between the mod family of solutions and ours is that mod is over ten times slower because it does not correctly use the problem data.

I'm not the best coder there is, but I know it is much simpler to base yourself on something you already know (the state of the previous row) rather than doing additional computing because an analytical approach says alternating two colors is like having a color for even rows and one for odds.

By the way, your code is evil; if you're going to implement two-state logic, you're expected to use a boolean, and it will run faster with an if(b){str1}else{str2} than with an array that costs additional processing because of its nature (I'm talking straight out of my ass, btw, but I still know it's inevitable that an array of two strings requires more bits and ops than two strings).

Also, the point of using an array for such an exercise would be to support n-state logic, yet your fake-boolean int approach makes it doubtful ;)


> faster with an if(b){str1}else{str2}

I'm not aware of any compiler that optimises such a structure to avoid branch misprediction. An array of two strings might require more bits and ops in an unoptimised interpreted language, or one in which bounds checking is always enabled, but I can assure you that an indexed load is 1 instruction and a load for each string is... well, more than that. You are right, you are definitely talking straight out of your ass!

The n-state solution is a state machine, btw - but it is right to use a simple solution if that is all your problem requires.


For 99% of code, worrying about the number of cycles in an operation is an utter waste of time. It just doesn't matter. Correctness and readability trump machine efficiency.

Do you work on real-time systems, embedded code, or something similar?


You are repeating a point of view you do not understand.

You don't know WHY people started saying that, WHEN they started and WHO started.

It was started a while ago by old-timers telling even older people that, for the simple stuff they were writing for DESKTOP computers, it didn't matter anymore.

Indeed, if you had a 486dx4 and all you wanted to do was word processing, it didn't matter much whether it was hand-optimized code or Microsoft Word; the machine was way too powerful for that kind of stuff already.

Today, battery life is a concern, virtualization is a reality, scalability is a CORE issue, there are low power states etc.

Today, making your application 100 times more efficient gives you 10x more battery life, 100x lower cloud hosting costs, 100x better scalability, etc.

Think that's unlikely? You've been stacking inefficient blocks for a lifetime, sometimes with inefficiencies multiplying; where do you think you are today?

Simple example: going from a pgsql > jdbc > jboss > j2ee > hibernate > java report factory to a dumb PHP script doing simple SQL, you already have factors above 20 in favor of the simple solution.

That's before you make a better data model or even try using a fast language or a more suiting data store depending on your needs.

Besides, your argument is nonsense; the simplest, most correct way IS the most efficient. That's the power of programming: there is absolutely NO compromise between reliability and efficiency in terms of code.

Readability is overrated; as long as you don't code crap, any COMPETENT coder will be able to read and understand it fast enough, even without comments.

Do you work on overweight UIs that drain phone batteries, or cloud-hosted applications, or anything that needs scaling?


In general, there are very few things you actually need a modulo for. It's highly inefficient.


A lot of compilers are smart enough these days to optimize modulo by n, where n is a power of 2, into a bitwise AND with n-1.
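The identity behind that rewrite is easy to check in Python (the helper name is made up for the example): for non-negative x and power-of-two n, x % n is just the low bits of x.

```python
def fast_mod_pow2(x, n):
    """x % n for non-negative x when n is a power of two: mask with n - 1."""
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    return x & (n - 1)

# The mask gives the same result as the modulo for every case checked.
for x in range(100):
    for n in (1, 2, 4, 8, 16):
        assert fast_mod_pow2(x, n) == x % n

print(fast_mod_pow2(101, 2))  # 1, same as 101 % 2
```

Note the caveat a compiler also has to handle: for signed negative values the simple mask no longer matches the language's remainder semantics, which is why the optimization is cheapest on unsigned types.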


Indeed the title made the content look shitty.

Workflow engine means an engine to drive workflows - at least for me.

This thing is meh, no reason to use it over existing C tools, I wonder why someone rererererereinvented the (not actually round) wheel.

