We read every piece of feedback, and take your input very seriously.
To see all available qualifiers, see our documentation.
Prevent this user from interacting with your repositories and sending you notifications. Learn more about blocking users.
You must be logged in to block users.
Contact GitHub support about this user’s behavior. Learn more about reporting abuse.
official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives
Python 73 4
Code for NeurIPS 2022 paper Exploiting Reward Shifting in Value-Based Deep RL
Python 29 4
code for paper Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning
Python 45 7
Codebase for Paper Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs
Python 22 2
Active reward modeling with last layer Fisher Information (ICML'25)
Python 7
10 1
There was an error while loading. Please reload this page.