I am an ML Engineer at a Python shop supporting a team of 15-20 data analysts/scientists with a wide range of experience. Most of my gig is building tooling for them and dogfooding that tooling to make sure it works well. All of our data people know SQL pretty well, but we'd rather not let people run wild writing data transformations in two paradigms (pandas API vs. SQL queries) if we don't have to.
Am I chasing a rainbow by trying to provide a consistent DX here, given that this is effectively a solo project at this scale?
DuckDB seems like a promising contender here, but a whole new generation of tooling has emerged to address the limitations of pandas.
Does anyone here have any positive stories of interacting with this tooling without effectively also signing up for huge maintenance efforts or a new expensive enterprise license? Open to any and all options that don't require my end users to move away from Python.
Disclaimer: I work on Bodo and wanted to share it in case others find it useful. https://github.com/bodo-ai/Bodo