Large language models offer significant business benefits, but they come with well-known risks. One way to exert control over these powerful technologies is to impose constraints on the responses they generate. At AWS re:Invent in Las Vegas, AWS CEO Adam Selipsky unveiled Guardrails for Amazon Bedrock, a tool aimed at exactly that.
“With Guardrails for Amazon Bedrock, you can consistently implement safeguards to deliver relevant and safe user experiences aligned with your company policies and principles,” the company explained in a blog post published this morning.
This new tool lets companies define and restrict the language a model can use: if a user asks a question that isn't relevant to the bot being built, the model won't answer it. That prevents the model from generating convincing but inaccurate responses, or worse, offensive content that could damage a brand.
At its core, the tool allows users to specify topics that the model should not address, preventing it from responding to irrelevant queries. For instance, a financial services company might want to prevent the bot from giving investment advice to avoid inappropriate recommendations. This can be accomplished as follows:
“I specify a denied topic with the name ‘Investment advice’ and provide a natural language description, such as ‘Investment advice refers to inquiries, guidance, or recommendations regarding the management or allocation of funds or assets with the goal of generating returns or achieving specific financial objectives.’”
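The denied topic described above can be pictured as a small piece of configuration. A minimal sketch follows, modeled on the shape the Bedrock guardrail API later took (the field names `topicsConfig`, `definition`, and the `DENY` type are assumptions drawn from the generally available API, not confirmed for the preview release):

```python
# Hypothetical sketch of a denied-topic policy for a Bedrock guardrail.
# Field names are assumptions based on the later GA API shape.

def denied_topic(name, definition, examples=None):
    """Build one topic-policy entry that blocks a topic by natural-language description."""
    return {
        "name": name,
        "definition": definition,
        "examples": examples or [],  # optional sample utterances that match the topic
        "type": "DENY",              # instruct the guardrail to refuse such queries
    }

topic_policy = {
    "topicsConfig": [
        denied_topic(
            "Investment advice",
            "Investment advice refers to inquiries, guidance, or "
            "recommendations regarding the management or allocation of "
            "funds or assets with the goal of generating returns or "
            "achieving specific financial objectives.",
            examples=["Which stocks should I buy this quarter?"],
        )
    ]
}
```

In practice, a dictionary like this would be passed alongside the other guardrail policies when the guardrail is created, and the model would then decline any query the description matches.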
Furthermore, users can filter out specific words and phrases to eliminate offensive content, applying configurable filter strengths to mark certain terms as off-limits for the model. Lastly, users can filter out personally identifiable information (PII) to keep private data out of model responses.
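The word and PII filters can be sketched the same way. The placeholder terms and the field names (`wordsConfig`, `piiEntitiesConfig`, the `ANONYMIZE`/`BLOCK` actions) are assumptions modeled on the shape the API took at general availability, not the preview:

```python
# Hypothetical sketch of word-filter and PII-filter policies.
# Field names and entity types are assumptions, not confirmed preview API.

word_policy = {
    # Custom blocklist: each entry is a word or phrase the model must not use.
    "wordsConfig": [
        {"text": "placeholder-offensive-term"},
        {"text": "placeholder-offensive-phrase"},
    ],
    # A built-in managed list, e.g. a profanity list maintained by the service.
    "managedWordListsConfig": [{"type": "PROFANITY"}],
}

pii_policy = {
    # Keep private data out of responses: redact some entity types,
    # block the response entirely for others.
    "piiEntitiesConfig": [
        {"type": "EMAIL", "action": "ANONYMIZE"},                # mask emails
        {"type": "US_SOCIAL_SECURITY_NUMBER", "action": "BLOCK"} # refuse outright
    ]
}
```

Together with the denied-topic policy, configuration along these lines would define the full set of safeguards a company attaches to its bot.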
Ray Wang, founder and principal analyst at Constellation Research, believes that this could be a crucial tool for developers working with LLMs to manage undesirable responses. “One of the biggest challenges is making responsible AI that’s safe and easy to use. Content filtering and PII are two of the top five issues [developers face],” Wang told TechCrunch. “The ability to have transparency, explainability, and reversibility are key as well,” he added.
The guardrails feature was announced in preview today and is expected to be available to all customers sometime next year.