There’s now a Python library called TRLX for RLHF (reinforcement learning from human feedback), the same reinforcement learning strategy used to train ChatGPT! It works well with Huggin