Interesting podcast, thanks for sharing. I wasn't aware of MusicLM; I just downloaded the paper and listened to the samples at https://google-research.github.io/seanet/musiclm/examples/. From my point of view (practicing musician/composer/producer for decades), MuseNet still produces more credible pieces than models based on raw audio, like MusicLM. The latter lack the true musical concepts you expect from a human composer, or they sound like more than one piece playing at the same time; they might impress non-musicians, though. So there is still much room for improvement.
We’re working on an app in this space over at wavtool.com with the goal of using AI to speed up the creation process and lower the barrier to entry. It’s effectively a DAW that can be controlled by a chatbot.
We’ve had pretty amazing results just relying on what GPT-4 knows about music. Without any special sauce it can write melodies and chord progressions and make sensible engineering decisions, and it only occasionally misunderstands and starts writing in all caps when a user says something is too quiet.
Quite fun to play around with if you find this kind of thing interesting!
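For anyone curious what the glue between a language model and a DAW might look like: the model returns text, so at some point you have to parse chord symbols into notes. A minimal sketch, assuming the model hands back a plain space-separated progression like "C G Am F" (the function names and the simple triad mapping here are my own illustration, not WavTool's actual code):

```python
# Map natural note names to MIDI note numbers (middle C = 60).
NOTE_TO_MIDI = {"C": 60, "D": 62, "E": 64, "F": 65, "G": 67, "A": 69, "B": 71}

def chord_to_midi(symbol: str) -> list[int]:
    """Root-position triad for a plain major or minor chord symbol, e.g. 'G' or 'Am'."""
    minor = symbol.endswith("m")
    root = NOTE_TO_MIDI[symbol.rstrip("m")]
    third = 3 if minor else 4  # minor third = 3 semitones, major third = 4
    return [root, root + third, root + 7]  # root, third, perfect fifth

def progression_to_midi(text: str) -> list[list[int]]:
    """Parse a space-separated progression string into lists of MIDI notes."""
    return [chord_to_midi(s) for s in text.split()]

print(progression_to_midi("C G Am F"))
# → [[60, 64, 67], [67, 71, 74], [69, 72, 76], [65, 69, 72]]
```

Real chord symbols (sevenths, slash chords, accidentals) obviously need a fuller parser, but the point is that the model output is just text until something like this turns it into events a DAW can schedule.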
I was recently wondering how well a language model trained on project file formats like Ableton Live's would work with input prompts. I suppose we have all the pieces and only have to connect them, as has been done in a growing number of areas.
Question: If I want to play with all the different AI models, which cloud service offers the best value proposition for intermittent use and lets me run whatever I want (e.g. the SD webui)?
This one's bound to meet in the middle.