• JohnDClay@sh.itjust.works
    English
    arrow-up
    16
    arrow-down
    3
    ·
    1 year ago

    I know people who keep their passwords in a Google Doc or in their email. I foresee a lot of accounts getting hacked once people work out the right prompts for the LLM.

    • Natanael@slrpnk.net
      English
      arrow-up
      3
      ·
      1 year ago

      They’ll probably isolate the models from each other, but yeah, if they want to train shared models from private data then that could happen.

        • Natanael@slrpnk.net
          English
          arrow-up
          1
          ·
          1 year ago

          Bard is the name of the service. They can create account-specific models trained on your user data which aren’t shared with other accounts (as an extension of the base model built on public data). I’ve already read about companies doing this to avoid cross-contamination. Pretty sure Google is aware of this.

          • JohnDClay@sh.itjust.works
            English
            arrow-up
            2
            arrow-down
            1
            ·
            1 year ago

            But I don’t know if Google cares enough about privacy to bother training individual models to avoid cross-contamination. Each model takes years’ worth of supercomputer time, so the fewer they need to train, the lower the cost.

            • Natanael@slrpnk.net
              English
              arrow-up
              1
              ·
              1 year ago

              Extending existing models (retraining) doesn’t need years; it can be done in far less time.

              • JohnDClay@sh.itjust.works
                English
                arrow-up
                1
                ·
                1 year ago

                Hmm, I thought one of the problems with LLMs was that everything is pretty much baked in during the training process. Maybe that was only with respect to removing information?

                • Natanael@slrpnk.net
                  English
                  arrow-up
                  1
                  ·
                  1 year ago

                  Yeah, it’s hard to remove data already trained into a model. But you can retrain an existing model to add capabilities, so if you copy one based on public data multiple times and then retrain each copy with a different set of private data, you can save a lot of work.
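
A minimal sketch of the copy-then-retrain idea described above, using a toy linear model rather than an LLM. Everything here is an assumption made for illustration: the names `fine_tune`, `base` and `private_data`, the accounts, and the numbers are hypothetical, and nothing in it reflects how Google or any other company actually trains its models. It only shows that each account gets its own copy of the base weights, that the short retraining pass touches only that copy, and that the base model stays unchanged.

```python
import copy


def predict(weights, x):
    """Toy linear model: dot product of weights and inputs."""
    return sum(w * xi for w, xi in zip(weights, x))


def fine_tune(base_weights, examples, lr=0.1, steps=200):
    """Copy the base weights, then run a few gradient steps on private examples."""
    weights = copy.deepcopy(base_weights)  # the shared base is never modified
    for _ in range(steps):
        for x, y in examples:
            err = predict(weights, x) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
    return weights


# Hypothetical base model "trained on public data".
base = [1.0, 0.0]

# Hypothetical private data, one small set per account.
private_data = {
    "alice": [([1.0, 1.0], 3.0)],
    "bob": [([1.0, 1.0], -1.0)],
}

# One retrained copy per account; no copy ever sees another account's data.
account_models = {
    user: fine_tune(base, examples) for user, examples in private_data.items()
}
```

Each entry in `account_models` fits only its own account's example, and `base` still holds its original weights, so the expensive base training is done once and only the short retraining pass is repeated per account.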

    • regbin_@lemmy.world
      English
      arrow-up
      1
      ·
      1 year ago

      That won’t work because they’re not going to train Bard on your email contents or documents.

        • regbin_@lemmy.world
          English
          arrow-up
          1
          ·
          1 year ago

          Most probably yes, it will add that information to the context. Once you delete the chat, that data is gone.
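
A rough sketch of the difference between putting data in the context and training on it, under the assumption stated in the comment above. The classes `FrozenModel` and `ChatSession`, the chat ID, and the flight details are all hypothetical and do not correspond to any real Bard or Google API. A real service may also keep logs or backups that this toy ignores.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FrozenModel:
    """Stand-in for a trained model whose weights do not change during chats."""
    weights: tuple


@dataclass
class ChatSession:
    """Holds per-chat context; private data lives here, not in the model."""
    model: FrozenModel
    context: list = field(default_factory=list)

    def attach(self, document: str) -> None:
        self.context.append(document)

    def build_prompt(self, question: str) -> str:
        # The model only sees the private data as part of this prompt text.
        return "\n".join([*self.context, question])


model = FrozenModel(weights=(0.1, 0.2))
chats = {"chat-1": ChatSession(model)}

chats["chat-1"].attach("Flight departs 09:40 from gate B12")
prompt = chats["chat-1"].build_prompt("When does my flight leave?")

# Deleting the chat drops the context; the model weights were never touched.
del chats["chat-1"]
```

In this toy, the email content only ever exists inside the `ChatSession`, so removing the session removes the data, while `model.weights` is identical before and after.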

          • JohnDClay@sh.itjust.works
            English
            arrow-up
            1
            ·
            1 year ago

            That’s much better than using it for general training. Does anything keep Google from using it for training in the future though?

            • regbin_@lemmy.world
              English
              arrow-up
              1
              ·
              1 year ago

              Their terms and privacy policy, I guess. Also the possibility of a data leak. I don’t think even Google would knowingly train their LLM on private data; that would be utter insanity.

        • atrielienz@lemmy.world
          English
          arrow-up
          1
          ·
          1 year ago

          Probably something similar to what they were doing back in the early 2010s with Google Now: skimming data from emails, texts, etc. in order to give you more pertinent information through Google Assistant’s predecessor. The Google Now page of that time could tell me when my flight left, what gate it boarded at, if it was delayed, and what airport entrance to use. It told me when my bills were due and how much they were. It tracked orders for me and told me when they were out for delivery or delivered. It would help me pick a restaurant for a special occasion and direct me where to call or book a reservation. I found it very useful, and then privacy concerns basically tanked it.

          We got a rebrand that did some of the same things with Google Assistant. And for a while that was really useful for a lot of the same things. But now that they’ve realised that data collection for this doesn’t net them ever-increasing profits, there’s a push for new, better things. The new, better apps and services don’t really do what people want or need. They are specifically and increasingly meant to funnel more data to Google and more ads to consumers. There were a lot of potentially really good, useful services that this style of scraping provided. But on the other hand, they didn’t ask. They just took that info. And then saved face by sunsetting the product that people were griping about.

          If you use Google services at all, Google has a profile on you. Even limiting the scope of that profile by opting out of Assistant and turning off a lot of their tracking doesn’t necessarily help you maintain privacy. And Google services include their app store and phones.