If AI learns on its own, who teaches it the limits?

Generative artificial intelligence (GenAI) is revolutionizing multiple industries with its ability to generate text, translate languages, and create content. However, its rapid adoption poses risks related to privacy, bias, and ethics. Although safeguards exist, recent studies indicate that they are still insufficient to fully mitigate these challenges. Cases such as Bing’s chatbot producing disturbing responses [1] or Google’s Bard providing incorrect information [2] highlight the urgency of effective AI governance.

What are AI guardrails?

Guardrails are policies, practices, and technologies designed to ensure the safe and ethical use of GenAI and language models. According to Gartner [3], GenAI represents a critical emerging risk. Operating without guardrails can lead to anything from data leaks to reputational damage and legal sanctions, as evidenced by a court ruling in Canada [4], where Air Canada was forced to honor a discounted fare promised by its chatbot, even though the information the chatbot provided was incorrect.

Types of guardrails

  1. Ethical: Prevent biased or harmful outcomes, aligning output with social and moral norms.
  2. Compliance: Ensure adherence to laws and regulations, especially in sensitive sectors.
  3. Contextual: Adapt responses to specific situations, filtering content that is legal but inappropriate for the context.
  4. Security: Block misuse, disinformation, or data leaks.
  5. Adaptive: Evolve alongside the models to maintain ethical and legal integrity.
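In practice, these categories can be thought of as independent checks applied to a model’s draft output before it reaches the user. The following is a minimal, illustrative Python sketch; the names (GuardrailType, run_guardrails) and the toy rules are hypothetical stand-ins for real classifiers or policy engines, not part of any specific framework.

```python
# Illustrative sketch only: guardrail categories modeled as simple, independent checks.
# Names and rules here are hypothetical stand-ins for real classifiers or policy engines.
from enum import Enum, auto
from typing import Callable, List, Tuple

class GuardrailType(Enum):
    ETHICAL = auto()
    COMPLIANCE = auto()
    CONTEXTUAL = auto()
    SECURITY = auto()
    ADAPTIVE = auto()

# Each check takes the draft output and returns (passed, reason).
Check = Callable[[str], Tuple[bool, str]]

def no_offensive_language(text: str) -> Tuple[bool, str]:
    banned = {"offensive_term"}  # placeholder word list
    hit = any(word in text.lower() for word in banned)
    return (not hit, "offensive language detected" if hit else "ok")

def no_personal_data(text: str) -> Tuple[bool, str]:
    hit = "@" in text  # naive stand-in for real PII detection
    return (not hit, "possible personal data" if hit else "ok")

CHECKS: List[Tuple[GuardrailType, Check]] = [
    (GuardrailType.ETHICAL, no_offensive_language),
    (GuardrailType.COMPLIANCE, no_personal_data),
]

def run_guardrails(text: str) -> List[Tuple[GuardrailType, str]]:
    """Return the guardrail types that flag the text, with the reason for each."""
    failures = []
    for kind, check in CHECKS:
        passed, reason = check(text)
        if not passed:
            failures.append((kind, reason))
    return failures

if __name__ == "__main__":
    print(run_guardrails("Contact me at user@example.com"))
```

A production system would replace these toy rules with trained classifiers, policy engines, and the applicable regulations, but the shape of the interface is the same: every draft answer passes through a battery of typed checks before it is released.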


Current limitations

Despite these measures, models can still generate harmful content. This is due to technological complexity and a constant “arms race” between those who build guardrails and those who seek to circumvent them [5]. Therefore, it is essential to continually review and strengthen these protections.

Implementation usually combines filters for offensive language, training on clean data, technical validations, and clear usage policies. An effective approach is the “Swiss cheese” model: multiple layers of protection (technological, procedural, and human) acting together. Each layer may be flawed, but stacking them significantly reduces the risk of undesirable outcomes. Human oversight remains essential: even with robust technical defenses, human judgment is the last barrier against AI-generated errors or inappropriate advice.
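To make the layering idea concrete, here is a minimal sketch, under the assumption that each layer is an imperfect filter over a draft response and that anything flagged is escalated to a human reviewer. The layer functions, keywords, and threshold are illustrative only and do not describe any particular product.

```python
# "Swiss cheese" sketch: several imperfect layers applied in sequence, with human review
# as the final fallback. All names, keywords, and thresholds here are illustrative.
from typing import Callable, List, Optional

# A layer returns a rejection reason, or None if the draft passes that layer.
Layer = Callable[[str], Optional[str]]

def keyword_filter(draft: str) -> Optional[str]:
    blocked = {"password", "credit card number"}  # placeholder terms
    return "blocked keyword" if any(term in draft.lower() for term in blocked) else None

def length_check(draft: str) -> Optional[str]:
    return "response too long" if len(draft) > 2000 else None

def policy_check(draft: str) -> Optional[str]:
    # Stand-in for a classifier or policy engine validating commercial promises.
    return "unauthorized commercial promise" if "guaranteed refund" in draft.lower() else None

LAYERS: List[Layer] = [keyword_filter, length_check, policy_check]

def review_output(draft: str) -> str:
    """Apply each layer in turn; escalate to a human if any layer flags the draft."""
    for layer in LAYERS:
        reason = layer(draft)
        if reason is not None:
            return f"ESCALATED TO HUMAN REVIEW: {reason}"
    return draft

if __name__ == "__main__":
    print(review_output("You are entitled to a guaranteed refund on any fare."))
```

The point of the structure is that no single layer has to be perfect: a problematic output that slips past the keyword filter can still be caught by the policy check or by the human reviewer, which is exactly the overlap the Swiss cheese metaphor describes.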

Adri Nux – Digital Worker

At Nuxia, our digital worker Adri Nux [6] combines AI, automation and data management. To ensure responsible use, we implement ethical, legal, contextual, security and adaptive guardrails. These translate into content filters, technical validations and clear policies on acceptable responses. In this way, we protect our customers’ information, minimize risks and foster trust in every interaction with our solution.

Links:

[1] https://www.nytimes.com/es/2023/02/17/espanol/chatbot-bing-ia.html – New York Times

[2] https://www.bbc.com/mundo/noticias-64583401 – BBC

[3] https://www.gartner.com/en/newsroom/press-releases/2023-08-08-gartner-survey-shows-generative-ai-has-become-an-emerging-risk-for-enterprises – Gartner

[4] https://www.eleconomista.es/tecnologia/noticias/12687875/02/24/el-chatbot-de-air-canada-se-invento-una-politica-de-viaje-y-ahora-la-compania-tiene-que-pagar-por-ella.html – El Economista

[5] https://www.cio.com/article/2516110/los-guardarrailes-la-clave-para-implantar-una-ia-segura-y-eficaz.html – CIO Spain

[6] https://nuxia.tech/es/adri-nux/ – Nuxia