Prompt Extend: AI tool to help with prompt engineering for Stable Diffusion by automatically generating suitable style cues for prompts

Name of Project: Prompt Extend

Proposal in one sentence: A text generation model that extends Stable Diffusion prompts with suitable style cues.

Description of the project and what problem it is solving:

To generate beautiful images, currently available diffusion models usually require complex prompts with additional style cues, and writing such prompts by hand can be time-consuming and difficult.

To address this issue, I built an AI tool that automatically generates suitable style cues and adds them to a Stable Diffusion prompt, improving the images the model generates and making it easier to use.
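
The idea can be illustrated with a minimal sketch. It is not the project's actual code: `extend_prompt` and `toy_generator` are hypothetical names, the cue list is made up for illustration, and the sketch assumes that cues are comma-separated and that the generator echoes its input before continuing, as causal language models typically do. In practice the `generate` callable would wrap the trained text generation model.

```python
from typing import Callable


def extend_prompt(prompt: str, generate: Callable[[str], str], max_cues: int = 5) -> str:
    """Append generated style cues to a base prompt (illustrative sketch)."""
    base = prompt.strip().rstrip(",")
    completion = generate(base + ", ")

    # Assumption: the generator echoes the input, so keep only the continuation.
    if completion.startswith(base):
        completion = completion[len(base):]

    cues = []
    seen = set()
    for cue in completion.split(","):
        cue = cue.strip()
        # Skip empty fragments and repeated cues (case-insensitive).
        if cue and cue.lower() not in seen:
            seen.add(cue.lower())
            cues.append(cue)

    return ", ".join([base] + cues[:max_cues])


def toy_generator(text: str) -> str:
    """Hypothetical stand-in for the trained model: always adds the same cues."""
    return text + "highly detailed, digital painting, artstation, highly detailed, sharp focus"
```

Calling `extend_prompt("a cat in a spacesuit", toy_generator)` returns `a cat in a spacesuit, highly detailed, digital painting, artstation, sharp focus`. Passing the generator in as a callable keeps the post-processing (trimming, de-duplication, capping the number of cues) independent of whichever model produces the text.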

Example:

You can try it out on a Hugging Face Space. The code is available in the project's GitHub repo, and I've also uploaded the model to the Hugging Face Hub. The project is fully open source.

Grant Deliverables:

  • Scale up the model architecture and try out different techniques for improving it.
  • Experiment with fine-tuning currently available pre-trained text models on the prompts dataset and compare their generated outputs with the current from-scratch approach (possibly trying a mix of both).
  • Add this as a custom pipeline to the diffusers library so that it can be used directly for image generation.

Round 7 deliverables completed:

Increased the training dataset size from 80k prompts to ~2 million prompts and made improvements to the tokenizer and the model, leading to much better and more context-aware style cue suggestions.

Squad

Partho Das. So far, it is a solo project.

  • Twitter handle: daspartho_

  • Discord handle: daspartho#3367

  • ETH mainnet wallet address for potential funds: 0xb70003E35ec3368c1B1BA82aa64C3687A730e107

Grant funding will help me develop the project further.
