Reinforcement Learning with Human Feedback (RLHF) for Large Language Models (LLMs)
Post date: October 24, 2024
Post author: Hakeem Abbas
Post categories: deeplearning, humanfeedback, rlhf, techinnovation

RAG Predictive Coding for AI Alignment Against Prompt Injections and Jailbreaks
Post date: September 5, 2024
Post author: Stephen
Post categories: ai-alignment, ai-chatbot, ai-chatbot-development, ai-safety, predictive-coding, prompt-injection, retrieval-augmented-generation, rlhf

Navigating Bias in AI: Challenges and Mitigations in RLHF
Post date: August 14, 2024
Post author: Chaithanya Ravulu
Post categories: advanced-bias-detection, ai, counterfactual-fairness-in-ai, deep-q-learning, mitigating-bias-in-ai, reinforcement-learning, rl-with-human-feedback, rlhf