At inference time, InstructPix2Pix generalizes to real images and user-written instructions, even though it is trained on our generated data. Because it performs edits in the forward pass, our model requires no per-example fine-tuning or inversion and edits images quickly, in a matter of seconds. We show effective editing results for a wide range of input images and written instructions.
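To make the "forward pass only" point concrete, here is a minimal sketch of a single edit. It assumes the Hugging Face `diffusers` library and the publicly hosted `timbrooks/instruct-pix2pix` checkpoint, neither of which is named in this description. `build_edit_kwargs` and `edit_image` are hypothetical helper names introduced for illustration, and the default values are only starting points.

```python
from typing import Any, Dict

# Assumption: the checkpoint published on the Hugging Face Hub.
MODEL_ID = "timbrooks/instruct-pix2pix"


def build_edit_kwargs(instruction: str, steps: int = 20,
                      image_guidance_scale: float = 1.5,
                      guidance_scale: float = 7.5) -> Dict[str, Any]:
    """Collect the per-call arguments for one edit (hypothetical helper)."""
    if not instruction.strip():
        raise ValueError("instruction must be non-empty")
    if steps < 1:
        raise ValueError("steps must be >= 1")
    return {
        "prompt": instruction,                         # the written edit instruction
        "num_inference_steps": steps,                  # fewer steps = faster edit
        "image_guidance_scale": image_guidance_scale,  # fidelity to the input image
        "guidance_scale": guidance_scale,              # fidelity to the instruction
    }


def edit_image(image_path: str, instruction: str, **overrides: Any):
    """Edit one image by sampling only: no fine-tuning and no inversion."""
    # Heavy imports stay local so the helper above works without a GPU stack.
    import torch
    from PIL import Image
    from diffusers import StableDiffusionInstructPix2PixPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
        MODEL_ID, torch_dtype=dtype
    ).to(device)

    image = Image.open(image_path).convert("RGB")
    kwargs = build_edit_kwargs(instruction, **overrides)
    # A single pipeline call produces the edited image.
    return pipe(image=image, **kwargs).images[0]
```

A call such as `edit_image("photo.jpg", "make it look like winter", steps=10).save("edited.jpg")` would then produce the edited image, where `photo.jpg` is a placeholder path. The two guidance scales trade off how closely the result follows the input image versus the instruction, so they are usually worth tuning per image.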
Tech Used:
GPT