
Meta FAIR Releases New AI Models and Tools for Audio Generation, Text-to-Vision, and Watermarking

Meta’s Fundamental AI Research (FAIR) team is releasing several new AI models and tools for researchers to use, covering audio generation, text-to-vision, and watermarking. By sharing this early research work, Meta hopes to inspire further iteration and help advance AI in a responsible manner.

One of the new models is JASCO, short for Joint Audio and Symbolic Conditioning for Temporally Controlled Text-to-Music Generation. JASCO accepts different audio inputs, such as chords or beats, to shape the final AI-generated sound. Working from a text prompt, users can adjust features like chords, drums, and melodies until they reach the sound they want. This could give artists more control over, and more room to customize, the AI-generated music they create.

In addition to JASCO, Meta is launching AudioSeal, a tool that adds watermarks to AI-generated speech so that such content can be identified later. Meta believes AudioSeal is the first audio watermarking technique designed specifically for localized detection, meaning it can pinpoint AI-generated segments within a longer audio clip rather than only judging the clip as a whole. According to Meta, this makes detection faster and more efficient. Unlike the Chameleon and multi-token prediction releases, which are limited to research use, AudioSeal will be released under a commercial license, opening it to a wider range of applications.
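To make the idea of localized detection concrete, here is a minimal sketch in Python. It does not use AudioSeal's API. It assumes a detector has already produced one watermark score per audio frame, and the function name `flagged_segments`, the threshold, and the example scores are all hypothetical. The sketch only shows how per-frame scores can be turned into flagged regions of a longer clip.

```python
def flagged_segments(scores, threshold=0.5):
    """Group consecutive frames scoring above `threshold` into segments.

    `scores` is a hypothetical list of per-frame watermark scores in [0, 1].
    Returns (start_frame, end_frame) pairs, with end_frame exclusive.
    """
    segments = []
    start = None  # index where the current flagged run began, if any
    for i, score in enumerate(scores):
        if score > threshold and start is None:
            start = i  # a flagged run begins
        elif score <= threshold and start is not None:
            segments.append((start, i))  # the run ended at frame i
            start = None
    if start is not None:
        # The clip ended while a run was still open.
        segments.append((start, len(scores)))
    return segments


# Made-up scores for a nine-frame clip with two watermarked regions.
example_scores = [0.1, 0.2, 0.9, 0.95, 0.8, 0.1, 0.05, 0.7, 0.9]
example_segments = flagged_segments(example_scores)
print(example_segments)
```

A whole-clip detector would return a single yes or no for the nine frames. The per-frame view instead reports which stretches of the clip are flagged.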

As part of its stated commitment to innovation and collaboration within the AI research community, Meta is also releasing two sizes of its multimodal model, Chameleon, under a research-only license. Chameleon 7B and 34B can be used for tasks that require both visual and textual understanding, such as image captioning. However, Meta has decided not to release the Chameleon image generation model at this time and is releasing only the text-related models.

Furthermore, Meta is giving researchers access to its multi-token prediction approach, which trains language models to predict several future words (tokens) at once instead of one at a time. The approach is intended to make language models more efficient and more accurate. Access will be limited to non-commercial, research-only use.
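The difference between predicting one word at a time and predicting several can be illustrated with a small sketch of how training targets might be built. This is an assumption-laden toy, not Meta's implementation: the helper `multi_token_targets` and the sample sentence are hypothetical, and a real model works on token IDs with a separate prediction head for each future position.

```python
def multi_token_targets(tokens, n_future):
    """Pair each context prefix with the next `n_future` tokens.

    With n_future=1 this is ordinary next-token prediction; larger values
    ask the model to predict several upcoming tokens from the same context.
    Positions with fewer than `n_future` tokens remaining are skipped.
    """
    examples = []
    for i in range(len(tokens) - n_future):
        context = tokens[: i + 1]
        targets = tokens[i + 1 : i + 1 + n_future]
        examples.append((context, targets))
    return examples


sentence = ["the", "cat", "sat", "on", "the", "mat"]
single = multi_token_targets(sentence, 1)  # one target per position
multi = multi_token_targets(sentence, 3)   # three targets per position

for context, targets in multi:
    print(context, "->", targets)
```

With `n_future=3`, the context `["the"]` is paired with the targets `["cat", "sat", "on"]`, so each training position carries three prediction signals instead of one.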

Taken together, these releases from Meta’s FAIR team give researchers and developers new resources to build on. By making them available, Meta is fostering collaboration and innovation, and its commitment to open science and responsible AI development helps lay the foundation for a more inclusive and ethical AI ecosystem.
