
Personally, I think it shows that this is a "first look" at Docker. Much of it is better than the post indicates.

> The final image is 570MB big. I could not shrink it more unless I remove the whole Python and Perl stack. Since both are necessary for many system dependencies, starting with apt-get, this was not possible. I still need a way I can improve or upgrade my container.

?!? The article starts by pointing out they use immutable servers and blue/green deployment. In that context, you will not improve or upgrade the container: You build a new one. And if you want to cut build dependencies from the final container: Do the build in one container, install the build-artefacts to a volume, and use the contents of that volume to build a container without the build dependencies.
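The two-container pattern described above can be sketched like this (the image names, directories, and build command are hypothetical, and it assumes a running Docker daemon):

```shell
# 1. Run the build inside a container that has the full toolchain,
#    writing the artefacts out to a bind-mounted host directory.
docker build -t myapp-build build/
docker run --rm -v "$PWD/artefacts:/out" myapp-build \
    sh -c 'make && cp -a dist/. /out/'

# 2. Build the final image from a slim base, adding only the artefacts;
#    no compilers or -dev packages end up in the deployed image.
docker build -t myapp runtime/
```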

It'd be great to get "built in" support for this, but it's not hard to do.

> There’s no easy way logging with Docker.

The standard way of logging with Docker is to log to standard out, which gets captured and is accessible via "docker logs". If he hadn't dismissed systemd out of hand, he could also easily have had it fed into journald, with the option of relaying it to a remote or local syslog as per his preferences.
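A minimal sketch of that flow (assumes a Docker daemon; the image and container name "web"/"myapp" are hypothetical):

```shell
docker run -d --name web myapp   # app logs to stdout/stderr
docker logs web                  # dump everything captured so far
docker logs -f web               # follow the stream, tail -f style
```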

> Let’s put it this way: as a way of provisioning a container, Dockerfile is a joke.

We don't need more complex provisioning tools; we have plenty of provisioning tools. Ultimately, a Dockerfile needs to be able to specify what should be copied into the image. Everything else you can do with your standard/preferred build tools. There's no reason for Dockerfiles to try to become yet another fully featured provisioning tool.

> Forget your classic monitoring (unless you want to pull your hair with network bridges). Everything you’ll be able to monitor within the container are ports. That because you run the old school nrpe inside your host, so you won’t be able to check you actually have 8 workers running inside your container.

This is just flat out wrong. Anything running on the host can see the processes running in the container. With the right namespace manipulation (via nsenter etc.) it can also see the mounted volumes or network space of a container, and so you can still monitor whatever you like.
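For example (a sketch: "web" is a hypothetical container, and it assumes a Docker daemon plus root for nsenter):

```shell
docker top web                                # the container's processes, seen from the host
PID=$(docker inspect --format '{{.State.Pid}}' web)
sudo nsenter --target "$PID" --net ss -tln    # look inside its network namespace
```

So a host-side check can count worker processes or probe listening sockets without running anything inside the container.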

> Making your application Docker compliant requires you to rethink the way it works.

Making your application take advantage of Docker, rather than treating Docker containers as sort-of VMs with less isolation, requires you to rethink the way it works. It's not something you need to do in one go - you can "break apart" a larger app environment piece by piece.

> The the tag nightmare begins. If I update my application and add new deps, I’ll have to update container #2. Unfortunately, how will I know I have to do that?

Uh. How does he know he has to update the machine images he deploys his applications to today? Personally I use make - tracking build dependencies is what it is for.
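As a sketch of what make buys you here (the stamp file `app.image` stands in for the built image, and the actual docker build step is reduced to an echo so the sketch runs anywhere):

```shell
mkdir -p /tmp/make-demo && cd /tmp/make-demo
touch Dockerfile Gemfile.lock
printf 'app.image: Dockerfile Gemfile.lock\n\t@echo rebuilding image\n\t@touch app.image\n' > Makefile
make    # first run (or inputs changed): prints "rebuilding image"
make    # nothing changed: make reports the target is up to date
```

In a real Makefile the recipe would run `docker build -t app .` before touching the stamp file, so the image is rebuilt exactly when its inputs change.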



The main point he makes is valid however,

> Porting your application to Docker increases complexity. Really.

I think the main problem of Docker is that it's sold as an 'easy solution' by many bloggers who only deal with it superficially and then move on to the next big thing. There are a lot of gotchas with docker containers and the creation of clean docker images that are not immediately clear when you start out. A lot of your standard Linux know-how is not applicable.

edit: Also, there are obvious security issues that are not immediately clear to most beginners, most certainly not from the tutorials.

One of my favorites: If you provision your database container with environment variables to create a dba user, and then link your db container to your app container, voilà, your app container will now most certainly have the dba login and password inside its environment variables: https://github.com/docker/docker/issues/5169
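A sketch of how the leak shows up (image names and the password are of course placeholders; requires a Docker daemon):

```shell
docker run -d --name db -e POSTGRES_PASSWORD=s3cret postgres
docker run --rm --link db:db busybox env | grep DB_ENV_
# the link machinery injects DB_ENV_POSTGRES_PASSWORD=s3cret
# into the second container's environment
```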


The reason people call it easy is because it makes a lot of things that are traditionally hard very, very easy. One need only have written Chef scripts for any considerable amount of time to appreciate just how much easier it is to write a Dockerfile. And the things that CoreOS/fleet, Kubernetes and (hopefully) EC2 Container Service do aren't just difficult to do without something like Docker, they're basically impossible. And as much as I like our DevOps teams, the fact that Docker has basically made meetings with them a thing of the past is a truly wonderful thing.

That's why it's so frustrating to see developers making superficial forays into Docker and then declare it to be too complex. Yes, the simple and largely irrelevant stuff does get a bit more complex, and you have to do some learning (and re-learning) before you use it for a production workload. But that's a trade-off a lot of us are willing to accept to make the crazy-hard stuff significantly easier.

We're developers. Our tools should not be optimized for first use.


"That's why it's so frustrating to see developers making superficial forays into Docker and then declare it to be too complex. "

That goes both ways. It's equally frustrating to see developers making superficial forays into it and declaring it to be the magic bullet that makes everything simple.

The basic fact is that building and deploying complex software, managing dependencies, handling discovery - these are all complex things. There is no solution that makes it simple because it is inherently not a simple process.

Instead, we can only shuffle the complexity to places we're more comfortable in managing. For some use cases that's a dockerfile. For others, it's chef cookbooks [or other CM solution]. For yet others, it's both.


I think sometimes people also confuse "easiness" and "managed complexity". Docker makes it easy to work on a single container/image at a time, so you don't have to simultaneously grasp the rest of a complex system.

Complexity management is not a Docker-specific thing, but Docker makes it easier than before, and standardizes how it's done. I think that's the main added value.


You obviously know a lot more about Docker than I do, but I thought I'd add a couple of comments to your great list.

For size, it appears the OP started his static-compilation quest by basing his image on Ubuntu. Wheezy is the standard base image in the Docker world for a reason -- it's significantly smaller. More and more images are busybox-based, but I wouldn't want to try that with Rails + ImageMagick.

As for logging, I'd like to point out logspout: https://github.com/progrium/logspout


Absolutely agree re: Wheezy (Debian in general). Ubuntu is great when you want the kitchen sink; not so great if you want small.


His comments on logging are spot on. It's the only part of the post that I can agree with. You can get process monitoring as long as your tools are cgroup-aware. Network monitoring is not so easy; this has been pointed out on the Docker blog in the past.

Getting your logs out of Docker is a PITA, though. Your best bet is to use syslog and configure each application to send its logs to a syslog server. It's a consistent, widely supported way of shipping your logs around. Relying on stdout logs isn't always enough: many applications understand syslog out of the box but do not necessarily send important messages to standard out. Docker's management of stdout logs shouldn't be relied on at this stage.

Logspout does look interesting and I wish I knew about this a few months ago but see my above comment on stdout.


I wish they'd just provide a way to route the syslog call to the host.
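One workaround in the meantime (a sketch, not a full solution -- the socket is shared, so per-container filtering is up to the host's syslog daemon):

```shell
# bind-mount the host's syslog socket so syslog(3) calls made
# inside the container land in the host's syslog daemon
docker run -v /dev/log:/dev/log myapp
```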


>"We don't need more complex provisioning tools. We have plenty of provisioning tools."

Absolutely.

Thankfully, the Docker team seems to be in agreement with this, based on statements about avoiding making Dockerfiles "too clever" and their responses to various proposals.

As you point out, most of the "issues" here are really misconceptions.

I expect it's a tough balance for any new(er) project: maximizing exposure and adoption while avoiding the negative perceptions that come from it being applied in ways that aren't optimal.


Dockerfiles are deliberately dumb to let other tools take over as necessary, is my understanding.

My experience (over the last year) is that they're so limited as to be pretty useless. They don't even do what they're advertised to do, i.e. give you a reliable way to reproduce a build, and they're too inflexible for my idea of real-world work with Docker. Where they're good is in giving everyone a point of reference.

I had a discussion with the maintainers last year about this:

https://groups.google.com/forum/#!topic/docker-user/3pcVXU4h...

I have a problem with most CM tools in that they're for moving target systems, not immutable ones. Ansible is the closest, but our experience has been that development on it is slow relative to the tool we use (see below). It's saved us a ton of money.

I blog on this and similar topics here:

http://zwischenzugs.wordpress.com/

The "tool for building and maintaining complex Docker deployments" is here:

http://ianmiell.github.io/shutit/ https://github.com/ianmiell/shutit

I also talk about this here:

https://www.youtube.com/watch?v=zVUPmmUU3yY


>"They don't even do what they're advertised to do, ie give you a reliable way to reproduce a build, and they're inflexible for my idea of real-world work with Docker."

Not exactly; as the thread you link points out, you can reference an image ID in FROM rather than a name:tag, which can change silently.

It's the equivalent of using a package manager against a repo you don't own without pinning - expect problems.

This can be mitigated by FROM'ing via ID or avoided entirely by running your registry where tags are reliable.
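For instance, a Dockerfile can pin its base by ID (the ID below is a placeholder; `docker images --no-trunc` shows the real ones):

```dockerfile
# FROM by immutable image ID instead of a mutable name:tag
FROM 8dbd9e392a96
RUN apt-get update && apt-get install -y build-essential
```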

Admittedly, these things are not necessarily obvious, but I think it's a bit disingenuous to paint Dockerfiles as worthless or broken.

That said, ShutIt looks very cool and seems to address exactly some of my concerns / desires about working with Docker.

I just don't agree with framing it in opposition to and at the expense of what exists.

There's value in a container description that is fully self-contained, transferable and 'dumb' enough to be transparent.


Hi, yes you're right - you can reference an image ID. However, as soon as you go to the network you're lost - any apt-get/yum update or install could break your system in surprising ways.

Having done _lots_ of builds lately I can vouch for that (see my blog for some examples).

In the end the image ID _is_ useful, but the dockerfile itself has limitations.

I agree with your last point as well - my evangelism comes from solving problems at my company in this way (which I know are not uncommon problems) rather than any belief that it beats others objectively.


I would love to peek at some of your Dockerfiles if possible?

Tried building a piece of infrastructure with Docker quite some time ago and left when it just didn't click. I certainly made quite a few mistakes (at that time I tried to stuff all the provisioning into the Dockerfile, for example, which you agree is a bad idea?) and the official Dockerfiles were... mixed in clarity and not that useful as examples to me.

So, I would really love reading about how people provision and manage their Docker instances The Right Way.


There are some examples in this article I wrote: http://www.hokstad.com/docker/patterns

I'll be posting a more detailed walkthrough of my process for using Docker with Ruby soon, as well as some other experiences.

> (at that time I did try to stuff all the provisioning into the Dockerfile for example, which you agree is a bad idea?

It's not necessarily a big deal for everything. If you have a lot of stuff with few build dependencies beyond Debian's "build-essential" package, you might as well create a debian:wheezy image with "build-essential" installed and work from that to begin with.

The issue is that some types of apps, like any Ruby app that uses Bundler and has dependencies on stuff with native extensions, can drag in hundreds of MB of dependencies that are only needed during build, and the build is only necessary when adding dependencies. For those kinds of apps it just wastes time and bloats your images to do the builds in the same Dockerfiles that you deploy with. If you're deploying 4-5 images now on a server or two, that's no big deal. When you're deploying 500 containers, you start getting picky about anything that slows down the process...

For those situations, I'd typically put together a base image with all the shared dependencies, then "inherit" from it twice: once for a dev/build image that adds the build dependencies but has no entrypoint (so I can easily start it into bash or have it run a build script, or whatever I want), and once for the final image, which just adds the build artefacts.
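Sketched as three Dockerfiles (all image and path names hypothetical):

```dockerfile
# base/Dockerfile -- shared runtime dependencies only
FROM debian:wheezy
RUN apt-get update && apt-get install -y ruby

# build/Dockerfile -- adds the build toolchain; no ENTRYPOINT, so it's
# easy to start into bash or point at a build script
FROM myapp-base
RUN apt-get update && apt-get install -y build-essential ruby-dev

# runtime/Dockerfile -- just the built artefacts on top of the base
FROM myapp-base
ADD artefacts.tar /app
ENTRYPOINT ["/app/bin/server"]
```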

Ideally I'd want everything I deploy to just have the files that are needed. Pragmatically, the tools are not quite there yet. Not least, it's often a real pain to build apps statically (if it's even possible), and the known dependencies are too granular (e.g. dependency on a large package, with no documentation about precisely what in the package has to be present) and it will take a long time to build up a "catalog" of more minimal images and build processes.


Seconded! I was able to set up a toy blogging system with Docker, but it's not something I'd feel comfortable using in a production environment yet. I'd really love to know more about how robust production environments are built with Docker.


>And if you want to cut build dependencies from the final container: Do the build in one container, install the build-artefacts to a volume, and use the contents of that volume to build a container without the build dependencies.

This is similar to what Netflix, one of the early big proponents of immutable servers, does. They build a deb package, then they create an AMI with that package installed.

http://techblog.netflix.com/2013/03/ami-creation-with-aminat...



