AWS Launches Q Chatbot as It Positions Itself as the 'Steady Hand' for Enterprise AI
Q, a chatbot named after the gadget-geek quartermaster in the James Bond films, headlined a long list of generative AI announcements at AWS re:Invent 2023 in Las Vegas on Nov. 28. The cloud services company positioned itself against competitors as a steady hand offering choice and business-class capabilities for generative AI.
Q joins a long list of generative AI interfaces announced by enterprise software firms, abstracting a stack of services and capabilities behind a conversational chatbot interface. Similar to Microsoft Copilot, Q will act as a user's concierge to Amazon Web Services (AWS). For example, it can help manage compute resources on an EC2 instance, build an application with CodeWhisperer, or visualize business intelligence data in QuickSight.
Use cases aren't limited to IT either, with Q also connecting to more than 40 enterprise software data stores, including Microsoft 365, Salesforce, ServiceNow, and Google. Amazon is also integrating Q with its Connect contact center service to assist frontline agents. An AWS press release gives examples such as creating a social media campaign and measuring its success.
On stage, AWS CEO Adam Selipsky differentiated Q from the competition with some veiled barbs at Microsoft and OpenAI, without ever explicitly naming the firms. But with the recent boardroom chaos involving the firing and rehiring of Sam Altman within a matter of days, his meaning was clear.
“In a lot of cases, these [other chatbot] applications don’t really work at work. They don’t know your data or your customers, and this limits how useful their recommendations can be,” he said on stage. “These chatbots were launched without security and privacy capabilities. So many CIOs banned the use of these apps … and it’s much more difficult to bolt on security after the fact.”
AWS will not use any data input to Q's chatbot to train its underlying models, Selipsky promised. When OpenAI first launched ChatGPT on the web, many organizations responded with policies banning or limiting its use by employees over concerns that intellectual property or sensitive data would be scooped up and incorporated into the underlying foundation model, exposing it to competitors and threatening the firm's competitive advantage.
Q is available today in preview in the US East (N. Virginia) and US West (Oregon) AWS Regions, so many users will have to wait to try the new assistant.
Info-Tech’s Tech Trends 2024 report identified that more than two-thirds of AI adopters (organizations already invested in AI or planning to by next year) are interested in using generative AI interfaces. A little less than half of AI skeptics say they are interested. Judging by the number of chatbot announcements from enterprise vendors, by this time next year the average knowledge worker may find themselves interacting with multiple chatbots over the course of a workday.
For AWS, Q just represents the front door to a deep stack of AI services designed to provide utility across the entire vertical – from AI foundation model builders to enterprises that want to harness generative AI capabilities for their applications. Selipsky shared a three-layer framework for how AWS views the stack: Q sits in Layer 3, applications that use AI; Layer 2 is for integrating AI into products and processes; and Layer 1 provides infrastructure for model training and inference.
Integrating AI with Amazon Bedrock
For customers who want to build their own applications and customize AI for specific use cases, Amazon Bedrock offers a selection of models available through a single API, with development in a SOC 2-compliant environment. Models available on Bedrock are provided by Amazon itself – its Titan family – as well as Anthropic, AI21 Labs, Meta, Stability AI, and Cohere.
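To illustrate the single-API approach, here is a minimal sketch of calling a Bedrock-hosted model with the boto3 "bedrock-runtime" client. The model ID, prompt, and request-body schema are assumptions for illustration; each provider on Bedrock defines its own body format.

```python
# Minimal sketch: invoking a foundation model through Amazon Bedrock's
# runtime API with boto3. The model ID and request-body schema below are
# assumptions; each model provider on Bedrock defines its own body format.
import json

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# Anthropic-style prompt body (assumed format for a Claude model on Bedrock).
body = json.dumps({
    "prompt": "\n\nHuman: Summarize our Q3 sales performance.\n\nAssistant:",
    "max_tokens_to_sample": 300,
})

response = bedrock.invoke_model(
    modelId="anthropic.claude-v2:1",  # assumed model identifier
    contentType="application/json",
    accept="application/json",
    body=body,
)

# The response body is a streaming payload; parse it as JSON.
print(json.loads(response["body"].read())["completion"])
```

Swapping to a different provider's model is, in principle, a matter of changing the model ID and the body schema rather than the integration itself, which is the point of the single API.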
“We don’t think there is going to be one model to rule them all,” says Matt Wood, vice president of technology for AWS. “Having a choice of models will allow customers to choose the right model for the right use case.”
During the keynote, Anthropic CEO Dario Amodei was invited on stage for a discussion with Selipsky. He spoke to the previous week's release of Claude 2.1, the firm's latest model, which he says boasts the industry's largest context window at 200,000 tokens (equivalent to about 150,000 words). This matters because it allows more customization on enterprise content, he said, such as providing context in the form of financial statements, a long book, or a collection of multiple documents. Providing this complementary data also helps reduce the rate of hallucinations – that is, the model making mistakes.
Wood emphasized the importance of pretraining and fine-tuning to make foundation models more practical for enterprise use: “LLMs (large language models) have a very broad set of information and general knowledge, but once you get down to any level of depth it’s like Swiss cheese down there.”
Organizations can address this data sparsity problem by complementing the models with their own data sets. This is accomplished with Knowledge Bases, a new feature in Bedrock that facilitates retrieval-augmented generation (RAG) over custom data sets for foundation models.
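A minimal sketch of what that looks like in practice, assuming a knowledge base has already been created and synced with an organization's documents; the knowledge base ID and model ARN below are placeholders, not real resources.

```python
# Minimal sketch: querying a Bedrock Knowledge Base so the model answers
# grounded in your own documents (RAG). The knowledge base ID and model ARN
# are placeholders; the knowledge base is assumed to exist already.
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = agent_runtime.retrieve_and_generate(
    input={"text": "What does our 2023 travel policy say about airfare class?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "EXAMPLEKBID",  # placeholder
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2:1",  # assumed ARN format
        },
    },
)

# The service retrieves relevant passages and generates a grounded answer.
print(response["output"]["text"])
```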
Agents for Amazon Bedrock was also announced for general availability. It simplifies building applications with generative AI by automating a workflow in which customers select a foundation model, provide basic instructions about how it should behave, and connect it to APIs and data sources.
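Once an agent has been configured that way, invoking it from application code might look like the hedged sketch below; the agent and alias IDs are placeholders, and the agent's model choice, instructions, and API connections are assumed to have been set up beforehand.

```python
# Minimal sketch: invoking an existing Agent for Amazon Bedrock.
# The agent ID and alias ID are placeholders; the agent itself (model choice,
# instructions, API/data connections) is assumed to be configured already.
import uuid

import boto3

agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = agent_runtime.invoke_agent(
    agentId="EXAMPLEAGENT",        # placeholder
    agentAliasId="EXAMPLEALIAS",   # placeholder
    sessionId=str(uuid.uuid4()),   # keeps multi-turn context together
    inputText="Open a ticket for the failed nightly ETL job.",
)

# The agent replies as an event stream; collect the text chunks.
answer = b"".join(
    event["chunk"]["bytes"]
    for event in response["completion"]
    if "chunk" in event
)
print(answer.decode("utf-8"))
```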
AWS executives stressed that customer data is private when using it to build generative AI applications. Data is stored in a container, and it remains there along with the customized model, with data encrypted while in transit and at rest.
Building foundation models with AWS silicon
AWS announced new releases of its proprietary silicon, including chips for AI and ML training on EC2. Its Trainium2 chip is designed to deliver up to four times faster training for generative AI and machine learning than the previous generation.
“We’re already launching our second generation of chip; meanwhile, a lot of other cloud providers are still just talking about their ML chips,” Selipsky said.
The chips can be clustered in groups of up to 100,000, delivering up to 65 exaflops of processing power to reduce training time, and AWS claims they will also be more energy efficient.
AWS Inferentia chips are also available for inference workloads.
Analyst perspective
AWS knew it had to offer a competitive conversational interface to Microsoft Copilot, and the announcement of Q should deliver on that capability – eventually. With the chatbot only available in preview in two US regions, we still don't know the timeline for when most AWS customers will be able to role-play as James Bond and start asking it about new technology capabilities. Q isn't a totally new concept in the AWS ecosystem, having previously been available as an assistant in QuickSight to help with business intelligence visualizations. While Q may lag Copilot to market, this may not matter much, as customers are unlikely to switch platforms to get access to one intelligent agent or another just a few months sooner. More likely is that AWS customers will turn on the agent when it's generally available, and Microsoft Azure customers will leverage Copilot on their stack.
Beyond the top layer of abstraction, AWS demonstrated its technical strengths for organizations with more advanced development capabilities. Being clear that customer data is not at risk of being used to train a public foundation model is key, and Bedrock also offers the best selection of foundation models to build with compared to the other hyperscalers. Whether it offers the best foundation model is another question.
For the rare organization seeking to train its own foundation model, AWS looks to be a best-in-class option. AI leaders like Anthropic have trained very large models on the platform. With a focus on delivering speed, cost efficiency, and energy efficiency, AWS is hitting all the right notes as it sings its AI training tune.