Snowflake releases a flagship generative AI model of its own

The era of all-encompassing, widely applicable generative AI models isn’t over, and arguably these models still dominate the field. However, as cloud providers of all sizes enter the generative AI arena, there’s a notable shift toward models tailored specifically for the most financially robust clientele: enterprises.


A prime example of this trend is Snowflake, a prominent cloud computing company, which recently introduced Arctic LLM, touted as an “enterprise-grade” generative AI model. Offered under an Apache 2.0 license, Arctic LLM is optimized for enterprise tasks, including database code generation, and is accessible for both research and commercial purposes.


During a press briefing, Snowflake’s CEO, Sridhar Ramaswamy, emphasized the significance of Arctic LLM as a foundational element enabling Snowflake and its clients to develop enterprise-grade products and unlock the potential value of AI. He positioned this release as a pivotal initial step in Snowflake’s foray into the world of generative AI, with many more advancements anticipated in the future.


An enterprise model

My colleague Devin Coldewey recently delved into the persistent rise of generative AI models. I suggest giving his article a read for more depth, but here’s the essence: These models serve as a means for vendors to generate excitement around their research and act as gateways to their product ecosystems, offering features like model hosting and fine-tuning.


Arctic LLM, Snowflake’s latest addition to its Arctic family of generative AI models, follows this trend. Developed over three months, utilizing 1,000 GPUs and costing $2 million to train, Arctic LLM enters the market shortly after Databricks’ DBRX, another model positioned for enterprise use.


Snowflake highlights Arctic LLM’s superiority over DBRX in coding and SQL generation tasks, positioning it above Meta’s Llama 2 70B (though not the more recent Llama 3 70B) and Mistral’s Mixtral-8x7B. Additionally, Snowflake claims Arctic LLM excels on the MMLU benchmark, a measure of general language understanding, although some aspects of the benchmark may skew results toward rote memorization.


Baris Gultekin, Snowflake’s head of AI, emphasizes Arctic LLM’s enterprise-focused approach, tackling challenges like SQL co-pilots and high-quality chatbots rather than generic AI applications like poetry generation.


Arctic LLM, like DBRX and Google’s Gemini 1.5 Pro, adopts a mixture of experts (MoE) architecture, which divides tasks into subtasks delegated to specialized “expert” models. Despite its massive 480 billion parameters, Arctic LLM activates only 17 billion at a time, driving 128 separate expert models, a design Snowflake claims enables cost-effective training on public web datasets compared to similar models.
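To make the idea concrete, here is a minimal, self-contained sketch of top-k mixture-of-experts routing: a gating function scores all experts for an input, only the top-scoring few actually run, and their outputs are blended by re-normalized gate weights. This is an illustration of the general MoE technique only, not Snowflake’s actual implementation; all names, dimensions, and the toy linear "experts" are hypothetical.

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Route input x through the top_k highest-scoring experts and
    combine their outputs, weighted by softmax over the chosen gate
    scores. Compute cost scales with top_k, not total expert count,
    which is why a 480B-parameter MoE can activate only a fraction
    of its weights per token."""
    scores = x @ gate_w                  # one gate score per expert
    top = np.argsort(scores)[-top_k:]    # indices of the top-k experts
    weights = np.exp(scores[top])
    weights /= weights.sum()             # softmax over selected experts only
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
dim, n_experts = 8, 16
gate_w = rng.normal(size=(dim, n_experts))
# Toy "experts": independent linear maps standing in for FFN blocks.
expert_mats = [rng.normal(size=(dim, dim)) for _ in range(n_experts)]
experts = [lambda x, m=m: x @ m for m in expert_mats]

x = rng.normal(size=dim)
y = moe_forward(x, gate_w, experts, top_k=2)
print(y.shape)  # same dimensionality as the input: (8,)
```

In a real MoE transformer the gating and expert layers are learned jointly and routing happens per token inside each MoE layer, but the sparsity principle is the same: only the selected experts contribute compute for a given input.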


Running everywhere

Snowflake is going all-in with Arctic LLM, providing users with resources like coding templates and a list of training sources to facilitate the process of getting the model up and running and fine-tuning it for specific use cases. Recognizing the complexity and costliness of such endeavors (fine-tuning or running Arctic LLM requires around eight GPUs), Snowflake pledges to make Arctic LLM available across a variety of hosts, including Hugging Face, Microsoft Azure, Together AI’s model-hosting service, and Lamini, an enterprise generative AI platform.


However, Arctic LLM will initially be available on Cortex, Snowflake’s platform for building AI- and machine learning-powered apps and services. Pitched as the preferred way to run Arctic LLM with security, governance, and scalability, Cortex aims to offer an API within a year, allowing business users to directly interact with data.


Despite these efforts, I’m left wondering: Who is Arctic LLM really for besides Snowflake customers? In a landscape filled with “open” generative models that can be fine-tuned for various purposes, Arctic LLM doesn’t seem to stand out significantly. While its architecture may offer efficiency gains, it’s unclear whether they’ll be compelling enough to sway enterprises away from other well-known and supported generative models like GPT-4.


Moreover, Arctic LLM’s relatively small context window poses a limitation. With a context window between ~8,000 and ~24,000 words, Arctic LLM falls short of models with much larger contexts, such as Anthropic’s Claude 3 Opus and Google’s Gemini 1.5 Pro. This limitation, coupled with the inherent shortcomings of generative AI models such as hallucinations, raises questions about Arctic LLM’s practicality in real-world applications.


As my colleague Devin notes, until the next major technical breakthrough, incremental improvements are all we can expect in the generative AI domain. Despite this, vendors like Snowflake will continue to champion their offerings as significant achievements and market them accordingly.
