Proposal: Human Evaluation of Methods for fine-tuning large language models

Proposal in one sentence: Evaluate reinforcement learning and supervised learning as methods for fine-tuning large language models using human contractors.

Description of the project and what problem it is solving: I’m working on a research project evaluating the differences between reinforcement learning and supervised learning as methods for fine-tuning large language models. For the final comparisons I want to hire contractors to perform human evaluation of the trained models. This proposal is partly to support my own work on this project and partly to directly fund the hiring of these contractors. I plan to open-source all the models, data and code to enable future work and analysis comparing these two popular methods.

Grant Deliverables:

  • arXiv or conference paper for the research project, including comparisons based on human evaluations
  • Open-sourced code, data and models from the research project.


Squad Lead: Robert Kirk

  • _robertkirk
  • darkaz#3185
  • 0x503A1860e355306CbA03d3ceD7E356a49625Bd74

Hello, this is a nice proposal. How do you plan to fund the contractors? On a label-by-label basis?
How do you plan to optimize your labeling costs?
Please take a look at an ongoing Algovera-funded project whose infrastructure might help with targeted labeling.

Hey, this sounds really exciting. I work on the evaluation of NLP and language models myself. It’s quite an interesting idea to compare the evaluation and performance of language models trained under two different paradigms.
Have you planned any experiments, and how do you plan to go about them?

I was planning to pay Surge or a related company to do the data annotations. The project you linked looks interesting and I’ll be sure to have a look, but given that I’m using the human annotations just for evaluation and not for training, there’s a slight mismatch in the functionality. Do you think it could be adapted to enable efficient evaluation of models?
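To make the evaluation-only use of annotations concrete, here is a minimal sketch of how pairwise preference labels could be summarised. It assumes each annotator judgement is recorded as a string naming the preferred model ("rl" or "sft"), which is a hypothetical format and not something fixed by the proposal, and it uses a Wilson score interval as one reasonable choice of uncertainty estimate.

```python
import math

def win_rate_with_wilson_interval(labels, model="rl", z=1.96):
    """Return (win_rate, lower, upper) for `model` over pairwise preference labels.

    `labels` is a list of strings, one per human comparison, naming the
    preferred model (hypothetical format). z=1.96 gives roughly a 95% interval.
    """
    total = len(labels)
    if total == 0:
        raise ValueError("need at least one label")
    wins = sum(1 for label in labels if label == model)
    p = wins / total
    # Wilson score interval for a binomial proportion.
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return p, centre - half_width, centre + half_width

# Example with made-up labels: 60 comparisons prefer "rl", 40 prefer "sft".
example_labels = ["rl"] * 60 + ["sft"] * 40
rate, low, high = win_rate_with_wilson_interval(example_labels)
print(f"win rate {rate:.2f}, interval [{low:.3f}, {high:.3f}]")
```

If the lower bound stays above 0.5, the labelled comparisons favour the chosen model at roughly the 95% level; the interval width also gives a rough guide to how many labels are worth paying for.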

What kind of reinforcement learning methods do you plan to use? I’m asking mainly because I like reinforcement learning a lot but find it hard to get it to generalize, given all the reward, state, and action modeling involved.

We’re mostly following this paper: [2009.01325] Learning to summarize from human feedback
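For anyone who hasn’t read it, here is a rough sketch of the two core quantities in that paper as I understand them: a reward model trained on pairwise human comparisons, and a reward for RL fine-tuning that penalises divergence from the supervised model. The function names and the example value of beta are illustrative only, and this is not necessarily the exact setup used in this project.

```python
import math

def pairwise_reward_loss(r_preferred, r_rejected):
    """Reward-model loss for one human comparison: -log(sigmoid(r_preferred - r_rejected)).

    Written in a numerically stable form using log1p.
    """
    diff = r_preferred - r_rejected
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))

def kl_penalized_reward(reward_model_score, logprob_policy, logprob_sft, beta):
    """Reward used for RL fine-tuning: reward-model score minus a KL-style penalty.

    logprob_policy and logprob_sft are the log-probabilities of the same sampled
    output under the RL policy and the supervised (SFT) model respectively.
    """
    return reward_model_score - beta * (logprob_policy - logprob_sft)

# Illustrative numbers only.
print(pairwise_reward_loss(1.5, 0.5))               # small loss: preferred output scores higher
print(kl_penalized_reward(1.0, -2.0, -3.0, 0.1))    # penalised because policy drifted from SFT
```

The penalty term is what keeps the RL policy close to the supervised model, which is part of why the comparison between the two fine-tuning methods is not entirely clean-cut.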

Generalisation in RL is definitely difficult, but prior work suggests that fine-tuning language models works pretty well where this is concerned, I think due to the strong generalisation power of the pretrained model. I actually wrote a survey on generalisation in RL if you’re interested (although it’s pretty long): [2111.09794] A Survey of Generalisation in Deep Reinforcement Learning

Hey, very cool idea to evaluate the models. This seems like a component of my project Lucidly where I’m trying to make my stable diffusion model churn up meaningful visual transitions.
Concerning the evaluation, I’m thinking of a metric that evaluates memory over different frames, and all this led me to believe a semi-supervised transformer does the equivalent thing, though I’m not sure how RL might fit. Curious to know what you think.