• DABDA@lemmy.world
    1 year ago

    I agree with you in general, but for Stable Diffusion, “2.0/2.1” was not a direct incremental improvement on “1.5”; it was trained differently and behaves differently. XL is not a simple upgrade from 2.0 either. Since they say this Turbo model doesn’t produce images as detailed, it would be more confusing to have an “SDXL 2.0” that is worse but faster than base SDXL, and then, presumably, when a more direct improvement to SDXL arrives, have that be called “SDXL 3.0” (when really it’s version 2), and so on.

    It’s less like Windows 95->Windows 98 and more like DOS->Windows NT.

    That’s not to say it couldn’t all have been named better. Personally, instead of ‘XL’ I’d rather they started including the base resolution and something to indicate whether the model uses a refiner, etc.

    (Note: I use Stable Diffusion but am not involved with the AI/ML community and don’t fully understand the tech. I’m not trying to claim expert knowledge; this is just my interpretation.)

    • barsoap@lemm.ee
      1 year ago

      AFAIU SDXL is actually an, erm, genetic descendant of SD1.5, with its architecture expanded, weights transferred from 1.5, and then trained on bigger inputs (512x512 is awfully small in the end). SD2.0 is a completely new model, trained from scratch, and as far as I’m aware no one’s actually using it. Also, no one is using the SDXL refiner: if you go to civitai, it’s all models with detailer capabilities baked in. What you do see is workflows that generate an image, add some noise at the very end, and repeat the last couple of steps. Using the base SDXL refiner on the output of other SDXL models is sometimes downright comical, because it sometimes has no idea what it’s looking at and then produces exquisite surface-texture details of the wrong material. Say, a silk keyboard, because it doesn’t realise that it’s supposed to be ABS and, well, black silk exists.
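
      For anyone who hasn’t seen the “add some noise and repeat the last couple of steps” workflow, here is a very rough toy sketch of the idea in plain Python. Everything in it is made up for illustration: `denoise_step`, `generate` and `renoise_and_repeat` are hypothetical names, the “image” is just a short list of numbers, and the denoiser cheats by knowing the `target` it is heading toward, whereas a real model has to predict that from the noisy input and the prompt.

```python
import random


def denoise_step(x, target, rate=0.5):
    """Toy stand-in for one sampler step (hypothetical, not a real model):
    move every value part of the way toward `target`."""
    return [xi + rate * (ti - xi) for xi, ti in zip(x, target)]


def generate(target, steps, rng):
    """Start from pure noise and run the full denoising schedule."""
    x = [rng.gauss(0.0, 1.0) for _ in target]
    for _ in range(steps):
        x = denoise_step(x, target)
    return x


def renoise_and_repeat(x, target, noise_scale, last_steps, rng):
    """Add a little noise to a finished image, then rerun only the
    last few denoising steps instead of the whole schedule."""
    noisy = [xi + rng.gauss(0.0, noise_scale) for xi in x]
    for _ in range(last_steps):
        noisy = denoise_step(noisy, target)
    return noisy


rng = random.Random(0)                 # fixed seed so runs are repeatable
target = [0.2, 0.8, 0.5, 0.1]          # assumed "ideal" image for the toy
image = generate(target, steps=20, rng=rng)
refined = renoise_and_repeat(image, target, noise_scale=0.1, last_steps=3, rng=rng)
```

      The small `noise_scale` and short `last_steps` are the point: the second pass can only rework fine detail, not the overall composition, which is roughly why it serves as a substitute for a separate refiner model.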