As generative AI continues to advance, large language models (LLMs) are seeing wider adoption. The broader term "foundation model" (FM) is now more accurate, however, since multimodal models that generate images and video are also in use. These models can deliver real impact by sifting through information and adapting it to diverse needs. Alongside these transformative opportunities, though, come increased costs that must be managed effectively.
Understanding How Foundation Models Work
To use foundation models well, it helps to understand how they work. These models convert words, images, numbers, and sounds into tokens and then predict the "best next token" to produce a response that aligns with the user's intent. Over the past year, core models from various providers have become better attuned to user expectations. The formatting of input prompts also plays a significant role in effectiveness, with YAML-formatted prompts often performing better than JSON. The generative AI community has developed a growing set of "prompt engineering" techniques to further improve model responses.
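As a rough illustration of how prompt formatting changes what the model sees, the sketch below renders the same structured request as JSON and as YAML (using the PyYAML package). The field names and task are made up for the example, not a recommended schema.

```python
import json

import yaml  # PyYAML; install with `pip install pyyaml`

# A hypothetical structured request to embed in a prompt.
request = {
    "task": "summarize",
    "audience": "executive",
    "constraints": ["max 5 bullet points", "plain language"],
    "document": "…full text would go here…",
}

json_prompt = json.dumps(request, indent=2)
yaml_prompt = yaml.safe_dump(request, sort_keys=False)

# Both carry the same information, but the YAML version avoids braces and
# quotation marks, one reason some teams report better results with it.
print("JSON prompt:\n", json_prompt)
print("YAML prompt:\n", yaml_prompt)
```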
Expanding Information Processing Capabilities with LLMs
One significant advance is that state-of-the-art models can now process up to 1 million tokens of input, roughly a full-length college textbook. This expanded capacity lets users control the context for their questions in ways that were not previously possible. Complex legal, medical, or scientific texts, for example, can be handled by LLMs that score around 85% on the relevant entrance exams. Tools like Anthropic's Claude make it possible to work with such documents without standing up extensive infrastructure.
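A minimal sketch of this pattern, assuming the Anthropic Python SDK with an API key in the environment; the file name and model alias are placeholders, not part of the original example.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Load a long document to pass directly as context (placeholder file name).
with open("contract.txt", "r", encoding="utf-8") as f:
    document = f.read()

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # assumed alias; substitute any current Claude model
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": (
                f"Here is a contract:\n\n{document}\n\n"
                "List the termination clauses and any unusual liability terms."
            ),
        }
    ],
)

print(response.content[0].text)
```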
Retrieval-Augmented Generation and Embedding Models
Embedding models such as titan-v2, gte, or cohere-embed make it possible to retrieve similar text by concept rather than by keyword. They convert diverse sources into "vectors" learned from correlations in large datasets, allowing relevant information to be retrieved efficiently. Specialized vector databases like turbopuffer, LanceDB, and Qdrant help these systems scale to large volumes of data without significant drops in performance. Scaling such solutions in production, however, remains a complex task that requires collaboration across multiple disciplines.
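To make the retrieval step concrete, here is a minimal sketch of concept-based lookup using the sentence-transformers library with a small open model as a stand-in for the embedding models named above; the corpus and query are invented for illustration.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Small open model standing in for titan-v2, gte, cohere-embed, etc.
model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "The tenant may terminate the lease with 60 days' written notice.",
    "Payment is due within 30 days of the invoice date.",
    "The landlord is responsible for structural repairs.",
]

# One dense vector per document; a vector database stores and indexes these at scale.
doc_vectors = model.encode(corpus, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents closest to the query by cosine similarity."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ q  # cosine similarity, since vectors are normalized
    return [corpus[i] for i in np.argsort(-scores)[:k]]

# The query shares no keywords with the first clause, but the concepts match.
print(retrieve("How do I end my rental agreement early?"))
```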
The Next Evolution: Gen 2.0 and Agent Systems
While current advances are making AI solutions more viable for organizations, the next evolution lies in creatively chaining multiple forms of gen AI functionality together. One approach is to build action chains by hand, as systems like BrainBox.ai ARIA do to provide step-by-step problem solving. These systems are limited, however, by how their logic is defined: it must either be hardcoded or confined to a few predetermined steps. The next phase, gen AI 2.0, involves agent-based systems in which a multimodal model and a reasoning engine break a problem into steps and select AI-enabled tools to execute each step. This approach enables more flexible and complex solutions.
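A highly simplified sketch of such an agent loop, with hypothetical tool functions and a plan() call standing in for whatever reasoning engine or planning prompt a real system would use.

```python
from typing import Callable

# Hypothetical tools the agent can call; real systems wrap APIs, databases, or other models.
def search_docs(query: str) -> str:
    return f"[top passages matching '{query}']"

def summarize(text: str) -> str:
    return f"[summary of {len(text)} characters of text]"

TOOLS: dict[str, Callable[[str], str]] = {
    "search_docs": search_docs,
    "summarize": summarize,
}

def plan(goal: str) -> list[tuple[str, str]]:
    """Stand-in for the reasoning engine: decompose the goal into (tool, input) steps.
    In a real agent this is itself an LLM call that returns a structured plan."""
    return [
        ("search_docs", goal),
        ("summarize", goal),
    ]

def run_agent(goal: str) -> str:
    result = ""
    for tool_name, tool_input in plan(goal):
        # Each step selects a tool and feeds it either the new input or the prior result.
        result = TOOLS[tool_name](tool_input if not result else result)
        print(f"step: {tool_name} -> {result}")
    return result

run_agent("What changed in our supplier contracts this quarter?")
```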
Optimizing LLM-Based Solutions
While the accuracy and performance of LLM-based solutions improve incrementally, they still require extensive tuning to be cost-effective. Without optimization, these systems can be expensive to run, with thousands of LLM calls each passing large numbers of tokens to the API. Ongoing development in hardware, frameworks, cloud services, model parameters, and hosting is essential to bring costs down and make these solutions more accessible.
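A back-of-the-envelope cost model makes the tuning incentive concrete; the per-token prices and call volumes below are illustrative assumptions, not any provider's actual rates.

```python
# Illustrative, assumed figures for a retrieval-heavy workload.
calls_per_day = 5_000            # LLM calls made by the application
input_tokens_per_call = 8_000    # context passed in (retrieved documents, history, prompt)
output_tokens_per_call = 500     # generated response
price_per_1k_input = 0.003       # USD per 1,000 input tokens (assumed)
price_per_1k_output = 0.015      # USD per 1,000 output tokens (assumed)

daily_cost = calls_per_day * (
    input_tokens_per_call / 1_000 * price_per_1k_input
    + output_tokens_per_call / 1_000 * price_per_1k_output
)
print(f"Estimated daily cost: ${daily_cost:,.2f}")        # $157.50/day at these assumptions
print(f"Estimated monthly cost: ${daily_cost * 30:,.2f}")

# Trimming retrieved context from 8,000 to 2,000 input tokens cuts the input
# portion of the bill by 75%, which is why context and prompt optimization matter.
```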
Conclusion: The Pursuit of High-Quality Outputs at Optimal Costs
As organizations continue to adopt LLMs, the focus will be on obtaining high-quality outputs quickly and cost-effectively. This pursuit requires continuous learning from real-world experiences and optimizing gen AI-backed solutions in production. Finding a partner with expertise in this field will be crucial for organizations looking to navigate the rapidly evolving landscape of generative AI.