Human Evaluation of Methods for Fine-Tuning Large Language Models
Proposal in one sentence: Use human contractors to evaluate reinforcement learning and supervised learning as methods for fine-tuning large language models.
Description of the project and what problem it is solving: I’m working on a research project to evaluate the differences between reinforcement learning and supervised learning as methods for fine-tuning large language models. For the final comparisons, I want to hire contractors to perform human evaluation of the trained models. This proposal would partly support my own work on the project and partly fund the hiring of these contractors directly. I plan to open-source all the models, data, and code to enable future work and analysis comparing these two popular methods.
Grant Deliverables:
- arXiv preprint or conference paper on the research project, including comparisons based on human evaluations.
- Open-sourced code, data and models from the research project.
Squad
Squad Lead: Robert Kirk
- _robertkirk
- darkaz#3185
- 0x503A1860e355306CbA03d3ceD7E356a49625Bd74