• jocanib@lemmy.worldOP
      1 year ago

      They’re circular. If the text is too predictable, it was written by an LLM*, but LLMs are designed to regurgitate the next word most commonly used by humans in any given context.

      *AI is a complete misnomer for the hi-tech magic 8ball
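To make the circularity concrete, here is a minimal sketch of a predictability-based detector. It assumes (this is an assumption, not a description of any real product) that a detector scores text by its perplexity under a model of human writing and flags low-perplexity text as machine-written. The unigram model, the tiny corpus, and the threshold of 10 are all hypothetical choices for illustration.

```python
import math
from collections import Counter


def train_unigram(corpus_tokens):
    """Build a word-probability function from human-written tokens (add-one smoothing)."""
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen words

    def prob(word):
        # Counter returns 0 for words it has never seen.
        return (counts[word] + 1) / (total + vocab)

    return prob


def perplexity(tokens, prob):
    """Lower perplexity means the text is more predictable under the model."""
    log_sum = sum(math.log(prob(t)) for t in tokens)
    return math.exp(-log_sum / len(tokens))


def looks_machine_written(tokens, prob, threshold):
    """Hypothetical detector rule: 'too predictable' means flagged."""
    return perplexity(tokens, prob) < threshold


# Toy human-written corpus (made up for this example).
human_corpus = "the cat sat on the mat the dog sat on the rug".split()
prob = train_unigram(human_corpus)

typical = "the cat sat on the mat".split()
unusual = "quantum zebra juggles neon waffles".split()

print(looks_machine_written(typical, prob, threshold=10))
print(looks_machine_written(unusual, prob, threshold=10))
```

In this toy setup the most ordinary human sentence is the one that gets flagged, because "predictable" is defined by what humans usually write.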

  • busturn@lemmy.world
    1 year ago

    I recently ran a years-old essay of mine through one of these AI plagiarism detectors, and it said the essay was 90% AI-written. So either it’s all BS or I’m a time-travelling AI.

  • Dohnakun@lemmy.fmhy.mlB
    1 year ago

    This article was written to keep people on the page as long as possible. It didn’t get to the point before I left. Does anyone have a tl;dr?

  • dan1101@lemmy.world
    1 year ago

    As expected, they can’t be trusted. And the more AI evolves, the less likely AI content will be detectable IMO.

    • jocanib@lemmy.worldOP
      1 year ago

      It will almost always be detectable if you just read what is written. Especially for academic work. It doesn’t know what a citation is, only what one looks like and where they appear. It can’t summarise a paper accurately. It’s easy to force laughably bad output by just asking the right sort of question.

      The simplest approach for setting homework is to give them the LLM output and get them to check it for errors and omissions. LLMs can’t critique their own work and students probably learn more from chasing down errors than filling a blank sheet of paper for the sake of it.

      • weew@lemmy.ca
        1 year ago

        Given how much AI has advanced in the past year alone, saying it will “always” be easy to spot is extremely short-sighted.

        • Terrasque@infosec.pub
          1 year ago

          Some things are inherent in the way current LLMs work. An LLM doesn’t reason and doesn’t understand; it just predicts the next word out of likely candidates based on the previous words. It can’t look ahead to know if it’s got an answer, and it can’t backtrack to change previous words if it later finds out it’s written itself into a corner. It won’t even know it’s written itself into a corner; it will just continue predicting in the pattern it’s seen, even if that makes little or no sense to a human.

          It just mimics the source data it’s been trained on, following the patterns it’s learned there. At no point does it have any sort of understanding of what it’s saying. In some ways it’s similar to the man who learned how enough French words were spelled to win the national Scrabble competition, without any clue what the words actually mean.

          And until we get a new approach to LLMs, we can only improve it by adding more training data and more layers allowing it to pick out more subtle patterns in larger amounts of data. But with the current approach, you can’t guarantee that what it writes will be correct, or even make sense.

          • nulldev@lemmy.vepta.org
            1 year ago

            > it just predicts the next word out of likely candidates based on the previous words

            An entity that can consistently predict the next word of any conversation, book, news article with extremely high accuracy is quite literally a god because it can effectively predict the future. So it is not surprising to me that GPT’s performance is not consistent.

            > It won’t even know it’s written itself into a corner

            In many cases it does. For example, if GPT gives you a wrong answer, you can often just send an empty message (single space) and GPT will say something like: “Looks like my previous answer was incorrect, let me try again: blah blah blah”.

            > And until we get a new approach to LLMs, we can only improve it by adding more training data and more layers allowing it to pick out more subtle patterns in larger amounts of data.

            This says nothing. You are effectively saying: “Until we can find a new approach, we can only expand on the existing approach” which is obvious.

            But new approaches come all the time! Advances in tokenization come all the time. Every week there is a new paper with a new model architecture. We are not stuck in some sort of hole.

            • Terrasque@infosec.pub
              1 year ago

              > An entity that can consistently predict the next word of any conversation, book, news article with extremely high accuracy is quite literally a god because it can effectively predict the future

              I think you’re reading something there other than what I said. Look, today’s LLMs ingest a ton of text - more accurately, tokens - and build up statistics of which tokens they see in which context. So if you see the sentence "A nice cup of ", statistically the next word is maybe 48% coffee, 28% tea, 17% water and so on. If earlier in the text it says something about heating a cup of oil, “oil” will have a much higher chance. The model then picks one of the top tokens at (weighted) random, and the text (an array of tokens) is fed into the LLM again and a new prediction is made. And so it continues until you stop the loop (usually on an end token or a keyword you’re looking for). Larger LLMs are better at spotting more subtle patterns - or, more accurately, they have more layers of statistics applied - but they still have the fundamental issue of going one token at a time and just going by what’s most likely to be the next token.
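The loop described above can be sketched roughly as follows. This is a toy, not how any particular LLM is implemented: `next_token_distribution` is a hypothetical stand-in for the model, the `<end>` token name is an assumption, the "oil" probabilities are invented, and only the coffee/tea/water percentages come from the example in the comment.

```python
import random

END = "<end>"  # hypothetical end-of-text token


def next_token_distribution(tokens):
    """Toy stand-in for an LLM: returns {token: probability} for the next token."""
    if tokens[-2:] == ["cup", "of"]:
        if "oil" in tokens[:-2]:
            # Earlier context mentioned oil, so "oil" becomes far more likely.
            return {"oil": 0.70, "coffee": 0.15, "tea": 0.10, "water": 0.05}
        return {"coffee": 0.48, "tea": 0.28, "water": 0.17, "oil": 0.07}
    # After anything else, the toy model simply ends the text.
    return {END: 1.0}


def sample_top_k(dist, k, rng):
    """Keep the k most likely tokens, then pick one at weighted random."""
    top = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)[:k]
    tokens, weights = zip(*top)
    return rng.choices(tokens, weights=weights, k=1)[0]


def generate(prompt, max_new_tokens=10, k=3, seed=0):
    """Predict one token, append it, feed the whole text back in, repeat."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        token = sample_top_k(next_token_distribution(tokens), k, rng)
        if token == END:
            break
        tokens.append(token)  # committed; earlier tokens are never revised
    return tokens


print(generate(["a", "nice", "cup", "of"]))
print(generate(["heat", "the", "oil", "in", "a", "cup", "of"]))
```

Note that nothing in `generate` ever looks ahead or goes back: each appended token becomes fixed context for the next prediction.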

              > In many cases it does. For example, if GPT gives you a wrong answer, you can often just send an empty message (single space) and GPT will say something like: “Looks like my previous answer was incorrect, let me try again: blah blah blah”.

              Have you tried that when it’s correct too? And in the case you mention, it has a clean break and then starts anew with token generation, allowing it to go down a different path. You can see it more clearly by experimenting with local LLMs that have fewer layers to maintain the illusion.

              > This says nothing. You are effectively saying: “Until we can find a new approach, we can only expand on the existing approach” which is obvious.

              > But new approaches come all the time! Advances in tokenization come all the time. Every week there is a new paper with a new model architecture. We are not stuck in some sort of hole.

              We’re trying to make a flying machine by improving pogo sticks. No matter how well you design the pogo stick and the spring, it will not be a flying machine.

  • Candelestine@lemmy.world
    1 year ago

    Clearly the Founding Fathers were not advanced enough to have crafted the US Constitution unaided. It’s only reasonable to imagine that ancient aliens could have landed, given them an AI to assist them, and then departed with nobody the wiser.

    I am certain we can find evidence of this if we dig hard enough.