
Something I've been trying to impress upon people, based on my own research into Python packaging: Pip is slow because it defaults to pre-compiling bytecode (and doesn't use multiprocessing for that, although that seems to be in the works, judging by the discussion I've seen); imports literally hundreds of modules even when it ultimately does nothing; builds complex wrappers to set up for connecting to the Internet (and SSL, of course) even if you tell it to install directly from a locally downloaded wheel; and caches things in a way that simulates network sessions instead of just keeping the files around... you get the idea.
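
If you want to see the import overhead for yourself, CPython's `-X importtime` flag makes it easy to count. A minimal sketch, assuming pip is installed in the interpreter you run it with (module counts and timings will vary by pip version and machine):

    import subprocess, sys

    # -X importtime makes the interpreter log one line per imported module
    # to stderr, even for a no-op command like printing pip's version.
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-m", "pip", "--version"],
        capture_output=True,
        text=True,
    )
    rows = [line for line in result.stderr.splitlines()
            if line.startswith("import time:")]
    # The first matching line is the column header, so subtract one.
    print(len(rows) - 1, "modules imported just to print the version")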

"It's written in Python and not e.g. in Rust" is simply not relevant in that context.

(For that matter, when uv is asked to pre-compile, while it does some intelligent setup for multiprocessing, it still ultimately invokes the same bytecode compiler - which is part of the Python implementation itself, written in C unless you're using an alternative implementation like PyPy.)
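
That compiler is just part of the standard library, so both tools end up driving the same machinery. A minimal sketch of invoking it directly (the paths here are placeholders):

    import compileall
    import py_compile

    # Compile one file to a .pyc under __pycache__/ (raises on syntax errors).
    py_compile.compile("some_module.py", doraise=True)

    # Compile a whole tree, roughly what pre-compiling a site-packages
    # directory amounts to; workers=0 means "use all available CPUs".
    compileall.compile_dir("path/to/site-packages", workers=0, quiet=1)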



There may be a bit of culture at play sometimes as well. If a language isn't meant to be fast, then perhaps devs using that language don't prioritize performance very much. For some, as long as there's an externality to point to ("hey, this is Python, what do you expect?"), that's excuse enough.

Of course, that's not always the case. C++ is a good counterexample, with a massive range of performance "orientedness". On the other hand, I suspect there are few Rust, Zig, or C programmers who don't care about performance.


To me, Python is more of a challenge than an excuse when it comes to performance.

On the flip side, I've seen quite a few C developers using their own hand-rolled linked lists where vector-like storage would be more appropriate, without giving it a second thought. Implementing good hash tables from scratch turns out not to be much fun, either. I'm sure there are off-the-shelf solutions for that sort of thing, but `#include` and static compilation in C don't exactly encourage module reuse the way that modern languages with package managers do (even considering all the unique issues with Python package management). For better or worse.

(For what it's worth, I worked in J2ME for phones in the North American market in the mid 00s, if you remember what those were like.)


Do you have a real-world example of a Python project that does selective imports?


Not sure what you mean, but I guess you're getting at something like "is it really pip's fault that it imports almost everything on almost every run?".

Well, first off, pip itself does defer quite a few imports - just not in a way that really matters. Notably, if you use the `--python` flag, the initial run only imports some of the modules before it manages to hand off to a subprocess (which has to import those modules again). But that new pip process will end up importing a bunch more eventually anyway.

The thing is that this isn't just about where you put `import` statements (at the top, following style guidelines, vs. in a function to defer them and take full advantage of `sys.modules` caching). The real problem is with library dependencies, and their architecture.
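
For illustration, the deferral pattern looks like this (a trivial sketch; `render_banner` is made up): the first call pays the import cost, and later calls find the module already cached in `sys.modules` - but none of that helps if the deferred module itself drags in a large dependency tree.

    def render_banner(text):
        # Deferred import: shutil is only loaded if this function is actually
        # called; repeat calls hit the sys.modules cache and cost nothing.
        import shutil
        width = shutil.get_terminal_size().columns
        print(text.center(width))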

If I use a library that provides the `foo` top-level package, and `import foo.bar.baz.quux`, at a minimum Python will need to load the `foo`, `foo.bar`, `foo.bar.baz` and `foo.bar.baz.quux` modules (generally, from the library's `foo/__init__.py`, `foo/bar/__init__.py`, `foo/bar/baz/__init__.py` and `foo/bar/baz/quux.py` respectively). But some libraries offer tons of parallel, stand-alone functionality, such that that's the end of it; others are interconnected, such that those modules will have a bunch of their own top-level `import`s, etc. There are even cases where library authors preemptively import unnecessary things in their `__init__.py` files just to simplify the import statements for the client code. That also happens for backward compatibility reasons (if a library reorganizes some functionality from `foo` into `foo.bar`, then `foo/__init__.py` might have `from . import bar` to avoid breaking existing code that does `import foo`... and then over time `bar` might grow a lot bigger).
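
You can watch this play out with any real package by diffing `sys.modules`. A minimal sketch, using the standard library's `email` package purely as a stand-in for the hypothetical `foo.bar.baz.quux`:

    import sys

    before = set(sys.modules)
    import email.mime.text  # one "leaf" import
    added = sorted(set(sys.modules) - before)
    print(len(added), "modules loaded by a single import:")
    for name in added:
        print("   ", name)
    # Expect email, email.mime, email.mime.text, plus everything their
    # __init__.py files and top-level imports pull in along the way.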

For pip, rich (https://pypi.org/project/rich/) is a major culprit, from what I've seen. Pip uses little of its functionality (AFAIK, just coloured text and progress bars), yet rich tends to import quite a bit preemptively (such as an emoji database). It's very much not designed with modularity or import speed in mind. (It also uses the really old-fashioned approach to ad-hoc testing of putting demo code in each module behind an `if __name__ == '__main__':` block - code that is only ever run if you do `python -m rich.whatever.submodule` from the command line, but still has to be processed by everyone who imports the module.)
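
If you want to check rich's share specifically, the same `sys.modules` trick works. A rough sketch, assuming rich is installed in your environment (pip actually bundles its own copy under `pip._vendor.rich`, which behaves much the same):

    import sys
    import time

    before = set(sys.modules)
    start = time.perf_counter()
    import rich.console   # console output + progress bars, roughly the
    import rich.progress  # slice of rich that pip-style usage touches
    elapsed = time.perf_counter() - start
    print(len(set(sys.modules) - before), "modules in",
          round(elapsed * 1000), "ms")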

And yes, these things are slow even with Python's system for precompiling and caching bytecode. It's uniformly slow, without an obvious bottleneck - the problem is the amount of code, not (as far as I can tell) any specific thing in that code.



