Frameworks Round 3 (techempower.com)
92 points by amarsahinovic on April 22, 2013 | 113 comments


A few things really stand out for me:

1) I am shocked at how slow and expensive EC2 is. An m1.large is $0.240/hr, or ~$175/month, and it's 7-10x slower than a $350/mo dedicated box. You would be spending $1500/mo to equal one dedicated box (not including bandwidth and S3 fees). A reserved instance is cheaper, of course.

2) The multiple queries test would seem to be the best one to really simulate real-world usage. JSON-serialization is mostly testing the language.

This test pretty much puts most of the interpreted languages running full-stack frameworks together at the bottom, although Django seems to come out at the very bottom of that pack for some reason.

Then come the "raw" tests and the JIT languages running full-stack frameworks. And at the top are the compiled languages. Not really surprising there.

3) I'm surprised php-raw did so well, and that Go did so poorly.


The main selling point for EC2 isn't that it's cheap; it's that it's easy and fast to scale up/down and that you get managed failover. For that ease you pay a hefty premium. A lot of companies with highly variable demand go with EC2 because if they get a sudden spike in demand they can quickly spin up 3 or 4 new instances to load balance across, and then when the demand dies off they can spin them all back down to save money. Doing something similar with dedicated hardware doesn't really work: you either end up without enough hardware to meet demand, or you've got a bunch of hardware sitting around idle most of the time.


The key is that you don't need to plan for anything.

Financially and performance-wise you're probably better off on dedicated hardware with some over-provisioning, but first you need to know how many machines you're going to need.


It's more accurate to compare dedicated boxes to reserved instances.


Reserved instances are simply a pricing construct. You're paying upfront for up to 24/7 usage, but you're running on the same pool as standard instances. Dedicated instances, on the other hand, do what you're talking about, but they're much pricier:

http://aws.amazon.com/dedicated-instances/


The comment you replied to didn't talk about what they do. I took it to mean it's more accurate from a price perspective.


A little too generalized; LuaJIT is winning some of the benchmarks.


Go seems to have issues with the database drivers, maybe because of the 1.1 update? It was quite fast in the previous iterations of the test, which used Go 1.0, AFAIK.


In the previous round, we did not have Go implementations of the database test, so this is new community-contributed database test code. However, see my post elsewhere in this thread [1] about the community's work to resolve the issues afflicting that test.

[1] https://news.ycombinator.com/item?id=5590132


Honest benchmarks. Open and gracious group of developers behind it. Disclaimer - I'm married to one of these guys. I'm not a programmer, but I can see how hard this group has worked to provide something useful to the community and I've enjoyed watching it grow.


> Honest benchmarks.

"Select random number from MySQL"? I'd like to see the stochastic control for that one.


We have considered alternatives [1], but randomly selecting from 10,000 rows was chosen because it's easy to implement in a thread-safe manner without introducing even more complicated requirements such as a concurrent queue or an atomic counter.

Note that after the warm-up, MySQL is essentially guaranteed to have the entire table in its own cache, so the exercise is testing the ORM (if applicable), database connection pool, and the database driver.

Honestly, assuming none of the implementations are using an expensive random number generator (e.g., one intended for cryptographic use), I don't believe the results would differ even if it were feasible to retrieve rows sequentially without the need to contend with concurrency.
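
For illustration, the query in each implementation is roughly this shape (a PHP/PDO sketch, assuming the World table of 10,000 rows; connection details are placeholders):

    <?php
    // Pick a random row id and fetch it with a prepared statement.
    // mt_rand is cheap and non-cryptographic, which is all that's needed.
    $pdo  = new PDO('mysql:host=localhost;dbname=hello_world', 'benchmark', 'secret');
    $id   = mt_rand(1, 10000);
    $stmt = $pdo->prepare('SELECT id, randomNumber FROM World WHERE id = ?');
    $stmt->execute(array($id));
    echo json_encode($stmt->fetch(PDO::FETCH_ASSOC));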

[1] https://github.com/TechEmpower/FrameworkBenchmarks/issues/11...


I'm sure they're great guys, but I found it interesting how many obvious suggested changes were not incorporated at all. What's honest about it?


Pekk, we'd love to address any problems you can point us to. Obviously we have limited time to work on this, so the best way to get a fix in is via a pull request. Are there any issues in particular you think are especially high-priority?


They have been extremely responsive to feedback so far and have always carefully explained why they aren't incorporating every suggestion from the community.


Open a pull request.


Just curious, but why are the major PHP frameworks so abysmally slow? It seems like php-raw does pretty well in all the tests, but most of the big frameworks (Cake, Symfony, etc.) are orders of magnitude slower.


Wow, I was floored at how slow Symfony is...

I have a feeling it's the ORM that's killing these frameworks in the DB portion.


Hi,

I'm the user who contributed Symfony 2 to the benchmark. My intention was to get as many PHP frameworks into the next round as possible (Symfony 2, CodeIgniter, Laravel, etc.), so I didn't pay much attention to possible optimisations of the frameworks. On top of that, I'm not a PHP guy, so it's possible that every PHP framework I contributed to the benchmark runs with the brakes on. ;o)

In Symfony's case, I posted on the mailing list to get users with Symfony experience to improve the performance, but that post sadly never made it through moderation.


Doctrine is used without caching, leading to unnecessary parsing of its metadata structures on every request. This is explicitly not recommended for production use and kills the request time here.
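
A minimal sketch of the fix, assuming a hand-rolled Doctrine 2 bootstrap with APC available:

    <?php
    // Cache parsed entity metadata and parsed DQL in APC so neither is
    // re-parsed on every request.
    use Doctrine\Common\Cache\ApcCache;
    use Doctrine\ORM\Configuration;

    $config = new Configuration();
    $cache  = new ApcCache();
    $config->setMetadataCacheImpl($cache); // annotation/metadata parsing results
    $config->setQueryCacheImpl($cache);    // DQL-to-SQL parsing results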


A PHP framework has to initialize all its systems from zero for each request. It may be because of that, it may not; it's just a guess.


You would think the frameworks would be designed around that, loading the absolute minimum of code to handle the request.


That still doesn't prevent you from needing to do that on each request.

Also, most frameworks are quite complex; how else are you going to be everything for everyone? Complexity comes at a cost, and in framework land it most notably takes the form of code with very tall inheritance structures and/or tons of dependencies.

You need Foo? Ok, Foo inherits from BaseFoo, which inherits from CoreBaseFoo, which inherits from the Widget class, which itself is a BaseWidget, etc. Let's implement a few interfaces too, and now load in various dependencies... pretty soon that simple, 5-line class that you implemented now requires 100+ classes.

Things like opcode caching can greatly reduce this per-request penalty, but they don't change the fact that it still happens.


Except that nothing says you need deep class hierarchies. In fact, I think heavy use of classes and class hierarchies is an anti-pattern in PHP because of its procedural per-request model.

Typically PHP frameworks follow a Java-style model of unserializing data into objects, loading the corresponding class files on demand, calling APIs on the objects, and then reserializing. It is _much_ faster if you treat the data as a stream, cutting it up and transforming it as it passes through your code, without ever building up an object representation, and without doing any more deserialization than a simple json_decode (which is really fast). This is in fact the original PHP model: transforming a stream of annotated HTML.
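
A quick sketch of the difference, with a made-up payload:

    <?php
    // Decode straight to plain arrays (the true flag skips object
    // hydration), transform in place, re-encode. No class files loaded.
    $payload = '[{"id":"1","label":"foo"},{"id":"2","label":"bar"}]';
    $out = array();
    foreach (json_decode($payload, true) as $row) {
        $out[] = array(
            'id'    => (int) $row['id'],
            'label' => strtoupper($row['label']),
        );
    }
    echo json_encode($out);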


Lazy Loading is a thing with modern PHP frameworks.


I'll give an example: Zend_Validate uses a proper OO hierarchy, so that for each type of validation (maxlength, digits, ...) there's a distinct validator class, which has to be instantiated before being applied to a piece of data. The whole thing is really clean and organized ... and abysmally slow, because on each request you're loading dozens of validator classes. It would have been much faster as a simple, ugly procedural API with all its code in one file. In my opinion, the per-request overhead of input validation matters more than a 'pure' OO design.
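
For contrast, a sketch of the ugly-but-fast version (function names are hypothetical):

    <?php
    // All validators as plain functions in one file: nothing is
    // instantiated per rule, per request.
    function validate_digits($value) {
        return ctype_digit($value);
    }
    function validate_max_length($value, $max) {
        return strlen($value) <= $max;
    }

    $zip = isset($_POST['zip']) ? $_POST['zip'] : '';
    $ok  = validate_digits($zip) && validate_max_length($zip, 10);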


> https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

Doctrine is slow, and this is not how you would use it in the real world. Throwing a cache on top of everything will speed it up considerably.


We have expressly avoided caching in these tests (so far) in order to exercise the ORM and database connectivity. In future tests, we plan to introduce caching [1]. If you have thoughts about specific test characteristics you'd like to see when caching is added, I'd like to hear those thoughts. Thanks for the feedback!

[1] https://github.com/TechEmpower/FrameworkBenchmarks/issues/13...


Doctrine is explicitly not suitable for use without a cache. If you don't have caching enabled, you should remove the ORM from the test and use the DBAL only instead.

Additionally, you are not bootstrapping the Symfony cache correctly; you need to run "php app/console cache:clear --warmup --env=prod --no-debug" before running the tests, otherwise you might get cache slams in your benchmarks.


Correct, running the debugging bar in Sf2 will kill performance.


At least the annotation/metadata/DQL parser cache should be enabled:

metadata_cache_driver: apc

Otherwise you are benchmarking the annotation and DQL parsers.

Also, for big PHP frameworks you have to make sure that APC's SHM size is large enough.


Running "apt-get install php5-apc; /etc/init.d/apache restart" will (at least) make it 3 times faster.

APC is a no-config, no-cost PHP accelerator :)


All the PHP tests were done with PHP 5.4.13 with APC, PHP-FPM, running behind nginx because previous feedback suggested this was optimal.

http://www.techempower.com/benchmarks/#section=environment


I'm having a hard time drawing conclusions from this data, except that Python/Ruby/JavaScript are slow, the JVM is fast, and the rest are in between... It's odd that Haskell (Yesod) and C (Onion) are both slower than the JVM frameworks. Is it just a maturity issue? I know Yesod is very new relative to some of these JVM frameworks.

Also, minor typo: the link to Erlang from the Environment tab is not the right URL. You probably meant to reference http://www.erlang.org/, not http://www.erlang.com/


On the topic of the JVM frameworks, you have to understand that the JVM is a very finely tuned machine at this point; it has something like 10+ years of performance tuning by a large portion of the entire industry behind it. Yesod just recently cracked the 1.0 release and has been going through massive changes in the last year or so. The fact that it performs as well as it does is a huge testament not only to Haskell as a language, but also to the work of Michael Snoyman and the others whose work he incorporated into Yesod.

That said, there are still a lot of rough patches in Yesod, and I'm sure there are lots of opportunities to improve performance in various small ways. The main Haskell compiler (GHC) is constantly improving, and many of the changes allow smarter automatic optimizations to be applied by both the compiler and the runtime. With continued refinement I feel it's definitely possible for Yesod to match, and even exceed, the performance of any of the JVM-based languages, but as I said, there are a lot of performance optimizations baked into the JVM, and it's probably going to take a while before Yesod (and WAI, which it's built on) have similar levels of optimization.


Because the benchmarks are quite narrow, sometimes you get something that has simply been optimized or isn't doing the same work across platforms.

For example, the last time I checked their php-raw code, it was executing a database query and appending the raw results to an array. Then it took that array and passed it to json_encode, which I'm pretty sure is implemented in C (PHP's built-in functions are in C, yes?). Anyway, comparing that to something like Rails' ActiveRecord, the work is quite different. With Rails, it's going to call the accessor methods on the objects to get the data, because we often override the data stored in the database with the representation we want at the object level. So they aren't doing the same thing. Rails at least used to have a way to get back an array of hashes rather than the AR objects, but I haven't been able to find it recently. That could be an interesting comparison.
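
To illustrate, that raw path is roughly this shape (a sketch from memory, not the benchmark's actual code; connection details are placeholders):

    <?php
    // Rows come back as plain associative arrays and go straight into
    // json_encode, a C-level built-in. No per-row objects, no accessors.
    $pdo  = new PDO('mysql:host=localhost;dbname=hello_world', 'benchmark', 'secret');
    $rows = $pdo->query('SELECT id, randomNumber FROM World LIMIT 20')
                ->fetchAll(PDO::FETCH_ASSOC);
    echo json_encode($rows);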

So, in that case, it just isn't the same functionality.

Similarly, it's hard to know exactly how they're setting things up. Django will perform significantly better if you set up something like pgBouncer for connection pooling. Rails includes connection pooling, IIRC. Without knowing something like that, it's hard to gauge whether it reflects real-world usage (I mean, I'd argue that defaults matter, but if you're looking to scale Django, this isn't an onerous addition).

Similarly, with process-based concurrency, it's important to get settings that maximize CPU usage while not overloading the RAM. Since they don't talk about memory or CPU usage, it's hard to figure out whether they've dealt with this. It might be in their repository, but I haven't been able to go through it yet.

Heck, the multiple-query test in Go seems really odd. I mean, coming in last place, serving fewer than half the requests of Rails? Similarly, Play 1 hits a nice 28% on EC2 and then can't serve a single request on the dedicated hardware.

Plus, the tests really stress JSON performance and don't use things like a framework's templating system. One doesn't need to argue that JSON performance is useful, but so is template performance and that just isn't being tested.

So, yeah, it would be hard to make conclusions off of this data except possibly that benchmarking is hard. None of this comment is meant as a dig against the people doing these benchmarks. They're improving them, they're making their code open, they're doing the kind of stuff that allows good, open questioning on their results. But I think it's still early to consider these to be really meaningful. I think a test that combined a bunch of different types of requests with database access, JSON, templates, etc. would be interesting and it seems like they might go there as they have more time.


Hi mdasen. Thank you for the feedback. We really appreciate these kinds of thoughtful contributions. To address some of your concerns:

The php-raw test, as with all tests that have the "raw" suffix, is not using an ORM. The servlet-raw test uses raw JDBC. The tests without the "raw" suffix are assumed to be using an ORM or something ORM-like. There is a separate PHP test (named just "php") that uses PHP ActiveRecord.

To be clear: without the "raw" suffix, we expect the test to be exercising the framework's preferred ORM or something analogous to an ORM. For example, several but not all of the Java tests are using Hibernate.

I am of the opinion that it's of great value to include the raw tests alongside the ORM tests for comparison. Later versions of the results view will allow filtering of the results (e.g., filtering out the "raw" tests or filtering out all Java frameworks) [1] [2].

Django is being used with MySQL and does not (presently) have a connection pool [3]. We'd gladly accept a pull request that adds a MySQL connection pool. Separately, we aim to eventually add Postgres tests.

As for frameworks that use process-based concurrency, we have attempted to configure each according to the capacity of the hardware. For example, for a given process-concurrency framework on EC2 large, with two virtual processors, we may use two workers; on i7 with eight HT cores, we may use eight workers. You can review the configuration details in the repository and submit pull requests if they are wrong.

We aim to add server-side statistics capturing in a later round [4]. However, for the time being, I can say that anecdotal observations show that some frameworks do not saturate all CPU cores, but we have not observed any running into out-of-memory situations.

Regarding the curious Play1 and Go database test results, see elsewhere in this thread [5]. In both cases, the communities have offered to help address the problems the tests are running into. The Play community has already fixed the Play1 database test and we expect it to perform in line with Play1 + Siena in Round 4.

I agree that testing more components is desirable. The next test will include some minor work with collections and server-side templates [6]. We have started implementing Test 4 on a few frameworks and have been very pleasantly surprised to see that the community has already started submitting implementations for their favorite frameworks as well.

Thanks again for your detailed thoughts and we look forward to continuously improving this project for as long as we have the time to do so. Please feel welcome to join in the conversation on any of the Github issues or create new ones.

[1] https://github.com/TechEmpower/FrameworkBenchmarks/issues/15...

[2] https://github.com/TechEmpower/FrameworkBenchmarks/issues/12...

[3] http://www.techempower.com/benchmarks/#section=motivation

[4] https://github.com/TechEmpower/FrameworkBenchmarks/issues/10...

[5] https://news.ycombinator.com/item?id=5590132

[6] https://github.com/TechEmpower/FrameworkBenchmarks/issues/13...


Thanks pilgrim689. I'll get that fixed up.


Thanks amarsahinovic for sharing this here. This is a brief blog entry with some observations about Round 3 of our web frameworks benchmark project. The community contributions continue unabated and I think the impressively long charts demonstrate just how much the community has given us. Thank you to everyone who has participated and we are looking forward to Round 4 already!

Please reply to let us know if you have any comments, questions, or criticisms. We'd love to hear your feedback.


Really glad to see the performance of Yesod has moved up closer to where I'd expect now that some of the mistakes in the original benchmark have been fixed. One thing I do notice is that Yesod is using MySQL as the DB backend for the DB tests, but it does have support for MongoDB as well. Looked at from that perspective it's actually one of the higher performing frameworks for all the DB tasks, but it's hard to say for sure because it's never tested with MongoDB (it usually lands somewhere above all the MySQL based tests, but below all the MongoDB tests). Would be interested in seeing how it performs with the MongoDB backend used instead. Maybe if I get some time in the next couple weeks I'll try to swap out the backends and make a pull request.


The performance of Yesod isn't where I expect it to be. I'm actually quite disappointed.

With client session disabled, I'd expect it to be close to the top along with some of the micro frameworks.


Would still like Ruby to be tested with app servers other than Passenger. Is this planned for Round 4?


Oops! Thanks for pointing that out. This is an oversight in the readme for the Ruby tests. They are on Unicorn for Round 3 [1]. We'll get the readme updated and the Environment Details clarified as well.

[1] https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

Edit: I've edited the Environment Details page accordingly.


This is the readme I looked at before I commented: https://github.com/TechEmpower/FrameworkBenchmarks/tree/mast... It still references Passenger.


Something seems off about "Results from dedicated hardware." Play and Go each went from thousands of requests per second to fewer than 10. Care to give any insight there?


I assume by "Play" you mean Play1, which is a newly community-contributed test. Since the completion of the Round 3 tests, the Play community resolved the Play1 database test's problems [1] and the Go community is investigating the Go database tests [2]. Assuming the Go fix is in soon, we anticipate that both should be resolved in Round 4.

[1] https://groups.google.com/forum/?fromgroups=#!topic/play-fra...

[2] https://groups.google.com/forum/?fromgroups=#!topic/golang-n...


Thank you for the response, and for the project in general.


Not being a web programmer, I have a naive question:

Are many websites running into speed or scalability issues?

I see programmers in other areas spend a lot of man-hours optimizing code that isn't a meaningful bottleneck. Is the same thing happening here?


If you don't have intelligent queries and a proper cache system, then at the very least you will quickly crash your database or saturate your bandwidth. It can hurt even small-traffic websites (I remember coding a webchat with MySQL and crashing my database with only 5 people chatting).


I am not acquainted with what language every framework uses. It would be useful for me in interpreting the data if you could annotate -- Java/JVM, PHP, Python, Perl, etc.

This is great work, though. Thank you for doing it -- and double thank you for taking submissions from each of the communities so they can try to submit their best shot. That really helps make sure the tests are as ideal as possible.


xb95, I agree. In the next round, I'll add more color-coded columns to indicate language and platform information.


Is the following a fair generalization?

Java should be the conservative choice for your web framework's language (rather than PHP). Scala, Clojure, Node.js, Erlang, Lua, and Haskell should be on your list of workable yet "cool" languages (rather than Ruby or Python).


I'd add the caveat that Java should be the conservative choice only if you are expecting it to matter that you can serve 200,000 simpleish (but nontrivial) requests per second. In the vast majority of cases, development speed and things other than raw framework performance will dominate as concerns. However, that is a non-zero niche.


I was considering other factors in that generalization, although they're implicit. E.g., a lot of the "cool" language frameworks I mentioned are comparable to the Java frameworks in speed, but Java has a considerably larger ecosystem and pool of hirable developers (although I'm not sure if that translates well into web development). And I assume PHP is as verbose as Java, from what I've heard from other people (I've never used PHP).

Based on the many factors, what language should be the conservative choice for the general case, ignoring specific cases?


PHP can be as verbose as you want it to be.

If you're using something like Symfony2, there'll be more abstractions and layers than with something like Silex, but they still don't come close to what Java looks like.


" And I assume PHP is as verbose as Java from what I've heard from other people (I've never used PHP)".

So you dont even know what you are talking about?


What do you think questions are for?


Some Java frameworks are not doing that well on all tests, so it is not a fair generalization. As for PHP vs. Java, that's not really the point of the discussion, since raw PHP is doing very well.


The Java frameworks are generally doing very well (top quarter to top half), and the PHP frameworks are generally doing very poorly (bottom half). PHP-raw is doing well, but that's not heartening given how mature its frameworks should be (if we were talking about golang, I think that argument would be sound).


That's the point of PHP: you don't need a framework to do web development with PHP. PHP is merely a C web-dev DSL. Using a framework on top of a DSL makes no sense.


Of course it makes sense, because you're not writing C code; you're writing PHP.

A framework is simply a pre-packaged organization of your code. In the case of the mini php frameworks, they provide little more than a router.

I have worked on many projects - those with and without frameworks. I will choose code on a framework any day of the week.


That's some amazing performance out of the JVM.


Pretty amazing. I would like to see a raw C benchmark to see how far the JVM is from the limits.


I want to see a Wt (witty) benchmark.

C++ is welcome here too.


Wt was suggested before, and we'd love to include it. Care to submit a pull request?

https://github.com/TechEmpower/FrameworkBenchmarks/issues/92


Onion is C.


In order to have an apples-to-apples comparison, wouldn't you at least need an opcode cache for the PHP frameworks?

PHP has to live and die upon each request, but Java pulls all of its classes together on the first request and from then on runs from RAM.

Many PHP deployments, and most serious ones, will use an opcode cache in addition to database and page caches. You could rerun these tests with an opcode cache only, leaving page and database caches for your "cache" round of tests. That would at least account for the dramatic architectural differences between Java and PHP.


All the PHP tests were done with the APC opcode cache, running within PHP-FPM.

http://www.techempower.com/benchmarks/#section=environment

If this can be improved, please let us know. A pull request would be ideal.


Oh. Didn't make the connection that APC was on for CodeIgniter, Laravel, and the other frameworks.


I'd love to see these benchmarks on a beefier EC2 instance type. I'm curious how it would compare to the dedicated hardware.


Agreed. A future enhancement I plan to make on the results viewer is the capacity to select from a menu of results files (the results are rendered from results.json files). Once that is in place, anyone who cares to could run the full suite on EC2 extra-large or, say, Rackspace Cloud, and submit those results files for us to include as options.


Do you have any plans to run ASP.NET MVC Web API on .NET 4.5 using the Task Parallel Library?


Why is ASP.NET never included in these type of benchmarks?


Consider how different a task that is. Each of these benchmarks and frameworks is run on Linux, which means the turn-up and benchmarking automation all target Linux. Including a new framework involves creating a new stack (app/web/db server) and an app that meets the criteria. Including Windows would require Windows versions of large parts of the toolchain.

I suspect that the most significant reason has to do with fragmentation in knowledge. Linux is a huge knowledge domain. Windows is also a huge knowledge domain. It's rare to find people (or even teams) that are well versed enough in both to deliver meaningful results on both.

My understanding is that there are ASP.NET community members stepping up to help bring ASP.NET benchmarks to the table for the next round.


There are significant restrictions on what benchmarks of the .NET framework are allowed to be publicly published. Not sure if this has anything to do with why this project hasn't included ASP.NET.


Windows is clumsy on the network unless you've built up a suite of tools and know-how to manage it. The extra cost doesn't help much either.

As someone is said to be working on .NET benchmarks, you may get your wish.


Oh gawd. Please, tell us how Windows is "clumsy" to get set up on a network.

Also, the "extra cost" doesn't even figure into a benchmark. Grab a 120-day trial ISO from Microsoft and install.


Simple, really. Its heritage is single-user desktops, so the security model is often "Don't look in here!" rather than having been done right the first time. It hides filenames; it needs antivirus.

Industry-standard network/admin/deployment tools don't work as well, if they're supported at all. Until a couple of years ago, the only viable way to manage it was from a GUI... they are still figuring out how to run it headless. It's not always compatible with open standards. Shall I continue?

Meanwhile Unix (and others) began life as time-sharing systems that became the original nodes of the internet, a scalable model that Windows has come back to forty years later. That's not to say that it doesn't have any strengths or isn't improving.

The extra cost includes per-minute charges on EC2.


That's great, except the only thing you've proven here is your own ignorance. Shall I go on?

You're spouting off about ancient history... and that somehow translates to Windows being (currently) "clumsy" on the network? Gee, I wonder why it only takes me 10 minutes flat to set up a headless ASP.NET server on EC2? And I can do that even with the GUI version of Windows, because there's this little thing called RDP; maybe you've heard of it.

Oh, and ASP.NET also runs on Linux. It didn't a few years ago, though, so maybe it won't work for you, since apparently you are living in the past.


(Forgot to mention drive letters, and UNC paths don't work on the console.)

RDP is no substitute for real deployment tools. That you've spent years working around the issues and only got headless working seven months ago is not impressive. Not when it's been mature elsewhere for decades, for free. Nor is your defensive tone.

The original point I made was that there are impediments to using Windows, and many still exist whether you believe them or not. If they didn't, Windows Server wouldn't be moving closer to the Unix model with every release.


They have someone working on ASP.NET tests for the next round.


These are really great. One additional test I would love to see is Mono ASP.NET MVC; I think it would be interesting to see how it compares to the JVM.


I just don't understand how Symfony2 could be so fast considering it's pretty huge (Twig itself is already huge).

I guess it's time I consider switching from CI.


I think you've got the charts backwards. Symfony2 is slower than CI in every chart.


oh...


It's a matter of caching: the APC cache, memcached, the "compiled" autoload cache... there are many levels of caching in Symfony, and boom, you have speed, because many of your hits are served straight from RAM.


Symfony caches everything, dumping resources (configs, templates, ...) into PHP files. So when you are using Twig on a production server, you are not actually reading Twig files but plain PHP files with echo statements.


Wow, I'm surprised that Rails gets better performance than the PHP frameworks, and on many occasions proves better than Django...


Consider that we are talking about web-oriented programming languages and frameworks.


Is the Laravel in the benchmark Laravel 3 or Laravel 4? And CodeIgniter is still the fastest PHP framework? Wow.


The Laravel test was community-contributed, but according to its readme [1], it is 3.2.14. Is Laravel 4 production-ready?

[1] https://github.com/TechEmpower/FrameworkBenchmarks/tree/mast...


I figured so too. Laravel 4 is still in beta as of now.


Where can I find this Gemini? I went to the Environment tab and it's the only one that doesn't have a link.


It's our internal framework.

From our first benchmarks post: http://www.techempower.com/blog/2013/03/28/framework-benchma... "Why include this Gemini framework I've never heard of?" We have included our in-house Java web framework, Gemini, in our tests. We've done so because it's of interest to us. You can consider it a stand-in for any relatively lightweight minimal-locking Java framework. While we're proud of how it performs among the well-established field, this exercise is not about Gemini. We routinely use other frameworks on client projects and we want this data to inform our recommendations for new projects.


Any plan to open source it? Would love to learn more about it.


We are considering it. Although, fair warning: it is not documented (save for our internal documentation) and we don't have the capacity to support its use by third parties. The reason we are considering open sourcing it anyway is that we feel it's the fair thing to do, since we've included it in this project.


Please do. I'd very much like to learn about it.


I'd be more interested in benchmarks that reflect production configurations.


From what I can tell, the guys are trying quite hard to get close.

If you spotted things that seem off, you should contribute to the repo or let them know.


Indeed, a key objective of the project is to simulate a production environment, within reason. We are now at a point where about half of the tests are community-contributed, and we take it on faith that the community has similarly aimed for production-readiness in their tests.

As n1c points out, if there is something wrong with any of the configuration, please submit a Github issue or pull request.


A few things that I noticed, just random notes, take 'em with a grain of salt. I'm not running any of the code myself, I'm just browsing around the Github repo. This is a birds-eye analysis. I was a Django developer for a long time and recently went to the dark side and love Rails at the moment.

It's important to remember that database connections are a killer in most web apps. This is why a lot of people use connection pooling (in the postgresql world, for example, you have pgbouncer and pgpool). When you can pool and persist connections, you'll have a faster web app (assuming it interfaces with a db, as many of their tests do).

Last time I checked, Django does NOT provide built-in connection pooling, whereas Rails does. This could be part of the reason that Rails outperforms Django in most of the tests. That says something... but it would be cool to see both frameworks strutting their stuff in an identical connection scenario.

Also note that the MySQL connection limit for the Node.js tests is set to 256 connections, whereas Rails is set to use a pool of 5. Since these tests are hitting the DB, I'd like to see more consistency in the way connections are set up. Not that 256 connections is better than 5, but if all those connections are open, it's no wonder that an async framework can hammer the DB so much faster than Ruby chugging along on a pool of 5.

In the Node examples, it's important to note the difference between the mysql and mysql-raw tests, and to remember this when comparing to things like Django and Rails, which use ORMs like ActiveRecord. In the raw mysql test, Node is really quick because there is no overhead for mapping a row to a model.

Also, a lot of viewers of these benchmarks might just look at the big bar for framework X and think, "fuck it, that's my next project". Sure, go for it. But also remember that the type of app you're building comes into play a lot. These tests do not have any state. That's why bare-metal frameworks like Gemini are doing so well. Node does great here because it's async. But certain apps (like your future project) aren't going to fit into the mold of an async hello-world test. I'm not knocking the framework tests here, they're awesome, but just realize this.

One area where Django outperforms Rails is JSON serialization. I'm assuming here that Python's native JSON engine is quicker than Ruby's. I learned this with Rails the hard way: with an out-of-the-box Gemfile, you're gonna have a bad time. In Ruby-land there are a few ways to boost your octane here. You start with MultiJSON, an adapter that puts the choice of JSON engine in the developer's hands, and then take your pick of the many JSON engines out there today. Popular ones are yajl-ruby (Ruby bindings for yajl, which is C) and oj. If you're doing JRuby, I hear that Jackson is super effing fast, so that's nice. I'd love to see the JSON tests with various JSON parsers (at some point these guys are going to get funding for being the benchmark company, with every framework in every possible configuration on display).

Django protip: read up on select_related and prefetch_related. Similarly, in Rails land you'll want to learn about eager loading with things like includes() and joins(). This lets you do one or a few queries up front, rather than a bunch of subqueries inside a big loop. One big query is usually better than a billion tiny ones.
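
To make that concrete in plain PDO terms (table and column names are hypothetical):

    <?php
    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');

    // N+1 anti-pattern: one extra query per post to fetch its author.
    $posts = $pdo->query('SELECT id, author_id, title FROM posts')->fetchAll();
    foreach ($posts as $post) {
        $stmt = $pdo->prepare('SELECT name FROM authors WHERE id = ?');
        $stmt->execute(array($post['author_id']));
        $author = $stmt->fetch();
    }

    // Better: one query up front with a join.
    $rows = $pdo->query(
        'SELECT p.title, a.name AS author
           FROM posts p JOIN authors a ON a.id = p.author_id'
    )->fetchAll();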

A reason why the PHP apps are getting murdered is that in a typical configuration your application is initialized on every single request. Opcode caching can help here, since it reduces that boot period. This is why in a production environment you can modify a PHP file and instantly see the results. Compare that with (depending on the application) a Ruby or Python app, where the app is booted once and, until that worker or process is restarted, runs the code it started out with. This is for speed. It's also why, for example, a big Rails project (or better yet, a Java project) has a significantly longer warm-up period: it's loading the entire app. This is a big reason the PHP frameworks are suffering here, since they have a lot of overhead to load on each request, whereas raw PHP doesn't have that overhead.

PHP protip: roll with opcode caching (APC is a favorite of mine; it's very simple to install and configure, and provides a handy web GUI for viewing details like cache hits so you can optimize it). Also, please throw Apache and mod_php away if you can and run with nginx + php-fpm... although if you're an HN reader, you're probably already doing this.
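
And on the connection point from earlier: the closest thing stock PHP has to a connection pool is a persistent PDO connection, which an FPM worker reuses across requests. A sketch (credentials are placeholders):

    <?php
    // PDO::ATTR_PERSISTENT makes the worker reuse an existing connection
    // instead of reconnecting on every request.
    $pdo = new PDO(
        'mysql:host=localhost;dbname=hello_world',
        'benchmark',
        'secret',
        array(PDO::ATTR_PERSISTENT => true)
    );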

The JVM is balls-quick because, again, unlike something like PHP, it gets booted up once and then does its thing. The JVM is infamous for its slow startup (which really varies based on a billion things), but there is a reason for that: the JIT. The slow startup is due to bytecode loading and compilation, I believe. It churns that bytecode into machine code, and during that process it can make optimizations like unrolling loops. I'm no Java pro, though. I don't like working with the language (the verbosity kills me, and XML is painful), but I have an immense deal of respect for its performance.

To be honest I am really going to take a solid look at the Play framework for my next side project. It's got a lot of the pragmatism of Rails but it runs on the JVM and comes in a Scala variant. These are things that I am stoked about dipping my toes into.

Finally, my hat is off to the folks running these. Great work!


Hi whalesalad. First, thank you for such a thoughtful and long reply. Second, you have an awesome handle.

Indeed, database connections are a huge pain point for web applications. In our tests, Django does not yet use a connection pool. We hope that is addressed soon either (a) by usage of a MySQL connection pool or (b) by implementation of a Postgres suite of tests, which are planned. We don't have an ETA on either of those, but option (a) could be completed by a pull request if anyone has time.

We've attempted to be consistent with database connection pooling. The Rails test runs in production mode and uses 256 connections in its pool [1].

Gemini is our in-house closed-source framework. We have been debating internally whether we are comfortable open sourcing it. It is not documented sufficiently for third-party use and we do not have the capacity to support its use by third-parties. On the other hand, we believe open sourcing it is the fair thing to do because we have included it here and other frameworks' fans should be able to examine it. We're still hemming and hawing over that. All that said, it is not a bare-metal framework [2].

My very strong opinion is that Node is not doing well because of its support for asynchronous application design, but rather because the V8 runtime that Node executes on is simply fast. I say that because the client-side concurrency is already sufficient to saturate the CPU cores using most frameworks (the exceptions being those that appear to have some lock or resource contention preventing full CPU utilization). Server-side concurrency fan-out strategies such as asynchronous evented loops may help superficially for our database tests, but pale in comparison to simply having a fast platform. I've written my opinions on that matter on my personal blog [3] (it's a later section of that blog entry).

None of the JSON tests are asynchronous. All respond immediately within the handler code.

I agree that it would be still more valuable to have additional dimensions for attributes such as the particular JSON serializer, the particular ORM, the particular database driver. With enough time, I'd love to have that breadth. The project has been steadily growing in scope, but handling that level of minor variations I fear remains quite a ways out. The priority right now is introducing new tests that include more representative functionality such as working with collections and server-side templates [4].

I'm not sure if your comment about PHP opcode caching is meant to suggest that we are not using it, but based on feedback prior to Round 2, we enabled APC for PHP, and that has remained enabled in this round [5].

I am not a part of the Play project, but I agree with your sentiment that Play deserves a good look by anyone starting a new web application. I've been especially impressed by their community. They have embraced this benchmark project and are helping us make sure Play performs as well as possible--with a fair and reasonable production-grade configuration. Their fans have given us some good advice and constructive criticism.

Finally, thanks so much for the kind words at the end of your comment. It means a lot to us that the community finds value in this project. We've been thrilled with the response. On the other hand, we've been absolutely murdered by the moderators here. This item had a massive HN Slapdown score, and we weren't even responsible for posting it. :)

[1] https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

[2] https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

[3] http://tiamat.tsotech.com/rps-vs-connections

[4] https://github.com/TechEmpower/FrameworkBenchmarks/issues/13...

[5] http://www.techempower.com/benchmarks/#section=environment


It would be useful to try the HybridJava framework as well.


I want to try Gemini. Can you give me some tips to start?


what's "django stripped"?


A stripped down version of Django - no unnecessary apps installed. This was recommended to test Django's full speed but really wouldn't be reflective of a full deployment.


It's disappointing that it's so slow. I suppose it doesn't matter a lot of the time, but would prefer it to be more competitive.


In the previous benchmark it was pointed out that they don't use connection pooling [1]; I don't know if it's fixed in this round.

https://news.ycombinator.com/item?id=5499490


They don't say either way, but it would require them to switch to Postgres, so I think they would have said if they had.


That's correct; we are still testing on MySQL, so the Django tests are still configured without a connection pool. We would love a pull request that incorporates a connection pool for Django+MySQL. We do have Postgres as a target for future rounds of this project, but do not yet have an ETA for that.


I wonder if this would be sufficient: https://pypi.python.org/pypi/django-mysqlpool/0.1-7


oh, I see https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

I guess it was about sessions and auth hitting the db?


It's a bit strange to include Node with MongoDB in the tests, because it changes two variables at once (the web solution and the database). You should stick to MySQL (or whatever DB you are using) for every test, or it doesn't seem that serious.



