• Published on

    Stability AI has introduced Stable Audio Open, an open-source text-to-audio model that generates up to 47 seconds of audio samples, sound effects, and production elements.

    The model enables users to create drum beats, instrument riffs, ambient sounds, and foley recordings using text prompts. It can also produce variations of existing audio samples and apply style transfer to them.

    Stable Audio Open is more specialised than Stability AI’s commercial product, which can produce full tracks up to three minutes long.

    The open-source model is trained on audio data from Freesound and the Free Music Archive, a choice Stability AI says respects creator rights.

    The model weights are available on Hugging Face for users to download and explore.

  • Published on

    Udio is an AI app that transforms text into music across genres from pop to metal.

    Users enter prompts describing desired styles, lyrics, and elements to generate professional-quality vocal and instrumental tracks.

    Udio is backed by musicians will.i.am and Common, and its team includes AI researchers and engineers formerly at Google DeepMind.

    The free beta allows 1,200 song generations per month. The output is still imperfect, but the Udio team says it is iterating quickly to improve quality, language support, and controllability. The team believes AI can expand musical boundaries for everyone.

  • Published on

    Stability AI has unveiled Stable Audio 2.0, its next-generation AI music model.

    The update enables generation of high-quality, structured music tracks up to 3 minutes long at 44.1 kHz from text prompts.

    It adds new audio-to-audio capabilities to transform uploaded samples through text guidance.

    Enhancements include expanded sound effect creation, style transfer for customisation and a diffusion transformer architecture for improved long-form coherence.

    The model is free to use on the Stable Audio website, with an API to follow.

    Stable Radio, a 24/7 YouTube stream featuring AI-generated tracks, has also launched.

  • Published on

    AI music generator Suno has released v3 of its platform, its first model that produces radio-quality music.

    Users can now create full two-minute songs in seconds in a variety of genres and styles, with better audio quality and improved prompt adherence, meaning fewer hallucinations and more graceful endings.

    The company is also developing a proprietary, inaudible watermarking technology to detect whether a song was created using Suno, to prevent users from creating music based on other artists’ references.

  • Published on

    Meta has launched Audiobox, a successor to its Voicebox tool for generating audio from natural language prompts.

    The new tool can generate audio clips including speech in various styles and environments, non-speech sound effects and soundscapes.

    It can also restyle voices, making them sound as if they are speaking in a particular environment such as a cathedral, or with a certain emotion.

    Users can input a description of the sound they wish to generate, or combine a voice input with a text style prompt to create the desired audio.

    The tool has been released to a limited number of researchers and institutions to encourage the development of responsible AI use.