Step 1: run an image captioner. You can use open-source models like GIT (github.com) or BLIP (arxiv.org).
Here’s a free app hosted on HuggingFace Spaces: huggingface.co
2/
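If you'd rather run Step 1 locally than through the hosted app, here is a minimal sketch using BLIP through the Hugging Face `transformers` library. The checkpoint name `Salesforce/blip-image-captioning-base` and the helper names `load_blip_captioner` and `caption_image` are my assumptions for illustration, not something the thread specifies. The heavy imports are kept inside the loader so the file can be imported without downloading anything.

```python
from typing import Any, Callable, Optional


def load_blip_captioner(
    model_id: str = "Salesforce/blip-image-captioning-base",  # assumed checkpoint
) -> Callable[[Any], str]:
    """Build an image -> caption function backed by BLIP.

    Imports happen here (not at module level) so that model weights are
    only fetched when a real captioner is actually requested.
    """
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_id)
    model = BlipForConditionalGeneration.from_pretrained(model_id)

    def caption(image: Any) -> str:
        # `image` is expected to be an RGB PIL image.
        inputs = processor(images=image, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=30)
        return processor.decode(output_ids[0], skip_special_tokens=True)

    return caption


def caption_image(image: Any, captioner: Optional[Callable[[Any], str]] = None) -> str:
    """Caption one image; falls back to BLIP when no captioner is passed in."""
    if captioner is None:
        captioner = load_blip_captioner()
    return captioner(image).strip()
```

To try it, open a picture with Pillow, e.g. `Image.open("photo.jpg").convert("RGB")` (hypothetical file name), and pass it to `caption_image`. Any other captioner, GIT included, can be swapped in through the `captioner` argument as long as it maps an image to a string.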
Step 3: use any text-to-music model to synthesize the audio. Here’s another open-source model (“AudioLDM”) hosted on HuggingFace for you to try out today, for free!
huggingface.co
Thanks @_akhaliq for sharing this.
4/
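For Step 3, a rough local equivalent of the hosted demo could look like the sketch below. It assumes the `diffusers` library's `AudioLDMPipeline` and the `cvssp/audioldm-s-full-v2` checkpoint; the thread only points at the hosted app, so the checkpoint, the step count, and the helper names `load_audioldm` and `text_to_music` are my own illustrative choices.

```python
from typing import Any, Callable, Optional

SAMPLE_RATE = 16000  # AudioLDM generates 16 kHz audio


def load_audioldm(
    model_id: str = "cvssp/audioldm-s-full-v2",  # assumed checkpoint
) -> Callable[[str, float], Any]:
    """Build a (prompt, seconds) -> waveform function backed by AudioLDM.

    Imports are local so nothing is downloaded until this is called.
    """
    import torch
    from diffusers import AudioLDMPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = AudioLDMPipeline.from_pretrained(model_id, torch_dtype=dtype).to(device)

    def synthesize(prompt: str, seconds: float) -> Any:
        # Fewer inference steps run faster but usually sound rougher.
        result = pipe(prompt, num_inference_steps=20, audio_length_in_s=seconds)
        return result.audios[0]  # 1-D NumPy array of samples

    return synthesize


def text_to_music(
    prompt: str,
    seconds: float = 5.0,
    synthesizer: Optional[Callable[[str, float], Any]] = None,
) -> Any:
    """Turn a text prompt into audio; uses AudioLDM unless a synthesizer is given."""
    prompt = prompt.strip()
    if not prompt:
        raise ValueError("prompt must not be empty")
    if synthesizer is None:
        synthesizer = load_audioldm()
    return synthesizer(prompt, seconds)
```

The returned array can be written to disk with `scipy.io.wavfile.write("out.wav", rate=SAMPLE_RATE, data=audio)` (hypothetical output path). Because the synthesizer is just a callable, any other text-to-music model can be dropped in without touching the rest of the pipeline.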
If you are interested in how the latest and greatest music generative models work, below is my deep dive thread.
TL;DR: an AI storm is coming for music & SFX industries.
5/
I open-source many AI recipes and ideas. Feel free to check out my past writings and follow @DrJimFan 🙌