
Pretty similar to how Neocities serves static sites (https://neocities.org).

There are a few differences. We don't use SQL in the routing chain; we use a regex to pick out the site name and then serve from a directory of the same name (this is NOT as bad as it sounds, most filesystems handle this quite well now, and it takes MUCH more than half a million sites to hit a bottleneck).
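For the curious, that routing approach can be sketched in nginx config roughly like this (a simplified illustration, not the actual Neocities config; the `/srv/sites` layout, the domain, and the regex are all assumptions):

```nginx
# Capture the site name from the Host header and map it straight to a
# directory. Hypothetical layout: /srv/sites/<sitename>/ holds each site.
server {
    listen 80;
    server_name ~^(?<sitename>[a-z0-9-]+)\.example\.org$;

    root /srv/sites/$sitename;
    index index.html;

    location / {
        try_files $uri $uri/ =404;
    }
}
```

No database in the hot path: the filesystem's directory lookup is the "index", which is what makes this scale further than it sounds like it should.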

DRBD is also a little hardcore for my tastes. Nothing wrong with it; I just don't know it well, and I don't like being dependent on things I don't know how to debug.

An alternative I wanted to show uses inotify, rsync and ssh combined into a simple replication daemon. It's obviously not as fast, but if you enable persistent SSH connections, it's not too bad. If it screws up, you can just run rsync. Rumor has it the Internet Archive uses an approach not too far away from this for Petabox. Check it out if you're looking for something a little more lightweight for real-time replication: https://code.google.com/p/lsyncd/
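A minimal lsyncd config for this kind of setup might look like the following (a sketch with assumed paths and hostname, not anyone's production config):

```lua
-- lsyncd watches /srv/sites with inotify and replays changes to the
-- standby host over rsync+ssh (moves become remote mv, not re-transfers).
settings {
    logfile    = "/var/log/lsyncd.log",
    statusFile = "/var/log/lsyncd-status.log",
}

sync {
    default.rsyncssh,
    source    = "/srv/sites",
    host      = "standby.example.org",
    targetdir = "/srv/sites",
    delay     = 1,  -- batch events for up to 1 second before syncing
}
```

With SSH connection multiplexing enabled on the box, the per-sync overhead stays small.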

We're still working on open sourcing (!) our serving infrastructure, so eventually you will be able to see all of the code we use for this (sans the secrets and keys, of course). I've just been having trouble coming up with a good solution for doing this. For now, enjoy the source of our web app: https://github.com/neocities/neocities



> Nothing wrong with it; I just don't know it well, and I don't like being dependent on things I don't know how to debug.

Yep, this is basically our approach as well.

We've been using DRBD for quite a long time now on our Git fileservers (which also run in active/standby pairs - in fact, they look a lot like our Pages fileservers), so we have quite a lot of in-house experience with it and it's a technology we're pretty comfortable with. Given this, using it for the new Pages infrastructure was a pretty straightforward decision.


Yeah, I've read the engineering posts about DRBD for the git fileservers, so I assumed that's why you made that decision. Makes total sense to me.


This exchange is wonderful, and absolutely what I'd expect from you two. Maybe I'm just in a bad mood, but it stands in such contrast to the way I often see technologies discussed online.

This kind of thing is the way engineering should be. Kudos.


Yay for acknowledgements! Yay for recursive enthusiasm!


I make use of Lua and Redis for handling a few million redirects and have been happy with it so far. I never considered MySQL due to performance concerns.

3ms for connection setup + auth + query seems reasonable. Are you using persistent DB connections? Any other mods? What sort of timeouts have you configured for DB connections?
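For reference, the Lua+Redis redirect pattern mentioned above can be sketched with OpenResty roughly like this (a simplified illustration; the key naming scheme and Redis address are assumptions, not the poster's actual setup):

```nginx
# Look up the redirect target for the requested host in Redis and issue
# a 301, falling back to 404 if no mapping exists.
location / {
    content_by_lua_block {
        local redis = require "resty.redis"
        local red = redis:new()
        red:set_timeout(100)  -- ms; fail fast if Redis is down

        local ok, err = red:connect("127.0.0.1", 6379)
        if not ok then
            return ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
        end

        local target = red:get("redirect:" .. ngx.var.host)
        -- Return the connection to a pool of persistent connections,
        -- which avoids the per-request setup + auth cost.
        red:set_keepalive(10000, 100)

        if not target or target == ngx.null then
            return ngx.exit(ngx.HTTP_NOT_FOUND)
        end
        return ngx.redirect(target, 301)
    }
}
```

The `set_keepalive` pooling is what addresses the persistent-connection question: subsequent requests reuse an authenticated connection instead of paying the ~3ms setup each time.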


Here's our current nginx config on the proxy server. I've got the DDoS pseudo-protection (there's another layer upstream) and caching turned off right now because we're working on something, but this is basically it:

https://kyledrake.neocities.org/misc/nginx.conf.txt

Critique away. As you can see, we've just barely avoided pulling out the Lua scripting.

The next step for me would probably be to write something in Node.js or Go. There are probably a lot of people cringing at that thought right now, but both are actually pretty good at this sort of work, and I'd really like to be able to do things like on-demand SSL registration and sending logs via a distributed message queue. Hacking nginx into doing this sort of thing has diminishing returns; we're kind of at the wall as-is.


> We're still working on open sourcing (!) our serving infrastructure, so eventually you will be able to see all of the code we use for this (sans the secrets and keys, of course).

I just want to applaud what you've been doing with Neocities. When the project started, I thought "Oh, nice," but not much more. But I love the fact that you've kept at it, and your approach to openness is great (pending infrastructure code notwithstanding). I especially like your status page:

https://neocities.org/stats

(Which I found from your excellent update blog post[1] -- but I think it could be even more discoverable. It's not linked from the donate/about pages?)

I hope your financial situation improves. And still: I wonder how (almost) half a cent of revenue/month compares to most ad-funded startup sites? Though you'll need... a "few" more to reach your goal. Actually, "just" 43x as many users with the same revenue/head would get you there :-)

ed: clarity (hopefully)

[1] https://neocities.org/blog/the-new-neocities



I didn't know about this, but really wanted this to exist, and it makes me happy! Keep it up


We used a master/master DRBD setup at a previous company; it was kind of a pain to work with. We had a fairly extensive document for resolving split-brain problems.

I imagine the problems with DRBD mostly disappear if you're using it properly though, master/slave setups probably work really well.


This factored in for me. Neocities is two people. We don't have the budget yet to hire an ops team, so we need to use parts that we can understand without a lot of mental investment. DRBD is definitely something you need to invest in. GitHub obviously doesn't have our budget constraints and can hire the people needed to really own problems like this.

I'm also pretty conservative about engineering choices generally, and the "superfilesystems" (DRBD, Gluster) feel a little monolithic (read: not very unix) to me. It's not that they're bad; it's that they're solving a lot of hard problems, and there's a lot that can go wrong when you do that, and if something happens, you're the one who has to fix it.

I'm not religious about "do one thing and do it well," but SSH handles the transfers, rsync does efficient copying, and inotify fires events on file changes. Put them together and you've got a very "unix" solution. It's more or less an event-driven script that sits on top of stable solutions to hard problems. If something goes wrong, you just run rsync.
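As a rough sketch, that pipeline might look something like this (assumed paths and a hypothetical `standby` SSH host alias; the real daemon is more careful than a loop like this, and lsyncd does the same job more robustly):

```shell
#!/usr/bin/env bash
# Sketch of an inotify+rsync+ssh replicator. Assumes inotify-tools and
# rsync are installed, and "standby" is an SSH host alias with key auth
# and ControlPersist (persistent connections) configured.

SRC="/srv/sites/"
DEST="standby:/srv/sites/"

# One-shot catch-up sync: safe to run any time the replicas drift.
full_sync() {
  rsync -az --delete "$SRC" "$DEST"
}

# Event-driven replication: on any file change, let rsync work out the
# delta. Re-syncing the whole tree per event is crude but correct;
# rsync's diffing keeps it cheap for a sketch like this.
watch_and_sync() {
  inotifywait -m -r -e close_write,create,delete,move "$SRC" |
  while read -r _event; do
    full_sync
  done
}

# Guarded so sourcing this file for inspection doesn't start the watcher.
if [ "${RUN_REPLICATOR:-0}" = "1" ]; then
  full_sync
  watch_and_sync
fi
```

The recovery story is the appeal: if the watcher dies or events are missed, `full_sync` alone brings the standby back into agreement.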

I can't say enough how awesome OpenSSH is. I want to use it for pretty much everything. It's a workhorse that really hauls.


lsyncd seems like a really cool project. However, in practice I used it to replicate a docroot across 3 servers, and it actually got out of sync pretty often.


I have a scheduled rsync job that periodically checks for any inconsistencies. It plays nicely with any updates that come in while it's doing its work, so that's been our fallback in case things get out of whack.



