Virtue AI
RESEARCH TERMS

We conduct pioneering AI research to empower and ensure safe and secure AI.

Red Teaming & Risk Assessments

Pioneering comprehensive AI risk assessment across multiple sectors and languages. Our advanced red teaming algorithms rigorously test AI models and systems, ensuring robust safety measures aligned with global regulations.

Guardrail & Threat Mitigation

Developing cutting-edge, customizable content moderation solutions for text, image, audio, and video. Our guardrails offer transparent, policy-compliant protection with unparalleled speed and efficiency.

Safe Models & Agents

Crafting AI models and agents with inherent safety features, from secure code generation to safe decision-making. We’re integrating safety and compliance directly into AI development processes, setting new standards for responsible AI.

Publications

Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression.

Abstract: Compressing high-capability Large Language Models (LLMs) has emerged as a favored strategy for resource-efficient inferences. While state-of-the-art (SoTA) compression methods boast impressive advancements in

Rob-FCP: Certifiably Byzantine-Robust Federated Conformal Prediction.

Abstract: Conformal prediction has shown impressive capacity in constructing statistically rigorous prediction sets for machine learning models with exchangeable data samples. The siloed datasets, coupled

RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content.

Abstract: Recent advancements in Large Language Models (LLMs) have showcased remarkable capabilities across various tasks in different domains. However, the emergence of biases and the

C-RAG: Certified Generation Risks for Retrieval-Augmented Language Models

Abstract: Despite the impressive capabilities of large language models (LLMs) across diverse applications, they still suffer from trustworthiness issues, such as hallucinations and misalignments. Retrieval-augmented

Fine-tuning aligned language models compromises safety, even when users do not intend to!

Optimizing large language models (LLMs) for downstream use cases often involves the customization of pre-trained LLMs through further fine-tuning. Meta’s open release of Llama models