
AI Writing Is Improving, But It Still Can't Match Human Creativity (science.org) 42

sciencehabit shares a report from Science Magazine: With a few keystrokes, anyone can ask an artificial intelligence (AI) program such as ChatGPT to write them a term paper, a rap song, or a play. But don't expect William Shakespeare's originality. A new study finds such output remains derivative -- at least for now. [...] [O]bjectively testing this creativity has been tricky. Scientists have generally taken two tacks. One is to use another computer program to search for signs of plagiarism -- though a lack of plagiarism does not necessarily equal creativity. The other approach is to have humans judge the AI output themselves, rating factors such as fluency and originality. But that's subjective and time intensive. So Ximing Lu, a computer scientist at the University of Washington, and colleagues created a program featuring both objectivity and a bit of nuance.

Called DJ Search, it collects pieces of text of a minimum length from whatever the AI outputs and searches for them in large online databases. DJ Search doesn't just look for identical matches; it also scans for strings whose words have similar meanings. To evaluate the meaning of a word or phrase, the program itself relies on a separate AI algorithm that produces a set of numbers called an "embedding," which roughly represents the contexts in which words are typically found. Synonymous words have numerically close embeddings. For example, phrases that swap "anticipation" and "excitement" are considered matches. After removing all matches, the program calculates the ratio of the remaining words to the original document length, which should give an estimate of how much of the AI's output is novel. The program conducts this process for various string lengths (the study uses a minimum of five words) and combines the ratios into one index of linguistic novelty. (The team calls it a "creativity index," but creativity requires both novelty and quality -- random gibberish is novel but not creative.)
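
To make that pipeline concrete, here is a minimal, illustrative sketch of a DJ-Search-style novelty score in Python. It is not the authors' implementation: the toy `embed` function (hashed character trigrams), the 0.95 cosine threshold, the small in-memory reference corpus, and the simple averaging over n-gram lengths are all stand-ins; the real tool matches against large web-scale corpora with a trained embedding model and may aggregate its per-length ratios differently.

```python
# Minimal sketch of a DJ-Search-style linguistic-novelty score.
# Assumptions: `embed` is a toy stand-in (hashed character trigrams) for a
# trained phrase-embedding model, `reference_corpus` stands in for the large
# web-scale databases the real tool searches, and the 0.95 cosine threshold
# and mean aggregation over n-gram lengths are illustrative choices.
import numpy as np

def embed(ngram: tuple[str, ...]) -> np.ndarray:
    """Toy embedding: bag of hashed character trigrams, L2-normalised."""
    vec = np.zeros(256)
    text = " ".join(ngram)
    for i in range(len(text) - 2):
        vec[hash(text[i:i + 3]) % 256] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def ngrams(words: list[str], n: int) -> list[tuple[str, ...]]:
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def novelty_ratio(candidate: str, reference_corpus: list[str],
                  n: int = 5, threshold: float = 0.95) -> float:
    """Fraction of candidate words not covered by any n-gram that matches
    (exactly or near-synonymously) an n-gram in the reference corpus."""
    cand_words = candidate.lower().split()
    ref_grams = [g for doc in reference_corpus
                 for g in ngrams(doc.lower().split(), n)]
    ref_vecs = np.array([embed(g) for g in ref_grams]) if ref_grams else None

    covered: set[int] = set()
    for i, gram in enumerate(ngrams(cand_words, n)):
        if ref_vecs is not None and (ref_vecs @ embed(gram)).max() >= threshold:
            covered.update(range(i, i + n))  # mark every word in the matched span
    return (len(cand_words) - len(covered)) / max(len(cand_words), 1)

def creativity_index(candidate: str, reference_corpus: list[str],
                     n_values: tuple[int, ...] = (5, 6, 7)) -> float:
    """Combine per-length novelty ratios; the study's exact aggregation may differ."""
    return float(np.mean([novelty_ratio(candidate, reference_corpus, n)
                          for n in n_values]))
```

Even in this toy form the idea is visible: any five-plus-word span that can be matched, exactly or via a close embedding, against the reference corpus contributes nothing to the novelty score, and only the uncovered remainder counts.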

The researchers compared the linguistic novelty of published novels, poetry, and speeches with works written by recent LLMs. Humans outscored AIs by about 80% in poetry, 100% in novels, and 150% in speeches, the researchers report in a preprint posted on OpenReview and currently under peer review. Although DJ Search was designed for comparing people and machines, it can also be used to compare two or more human-made works. For example, Suzanne Collins's 2008 novel The Hunger Games scored 35% higher in linguistic originality than Stephenie Meyer's 2005 hit Twilight. (You can try the tool online.)


Comments Filter:
  • Inherent flaw? (Score:5, Insightful)

    by Chris Mattern ( 191822 ) on Saturday December 21, 2024 @08:11AM (#65030573)

    "A new study finds such output remains derivative -- at least for now."

    For now? The whole principle they're building on is to replicate what it's seen. How can it be anything *other* than derivative?

    • by dvice ( 6309704 )

      You simply add a random number generator to it. Generate random stuff, then start polishing it and you have yourself an original story. That is not the hard part.

The hard part is identifying what humans enjoy. If you had a good scoring algorithm for that, you could just generate random stuff and pick the good stuff from the noise (see the sketch at the end of this thread).

      • Re:Inherent flaw? (Score:4, Insightful)

        by Ol Olsoc ( 1175323 ) on Saturday December 21, 2024 @10:10AM (#65030695)

        You simply add a random number generator to it. Generate random stuff, then start polishing it and you have yourself an original story. That is not the hard part.

The hard part is identifying what humans enjoy. If you had a good scoring algorithm for that, you could just generate random stuff and pick the good stuff from the noise.

Creative people are not normal people. Not throwing shade, but they might see and think things that are not what most people think or see. So they create, and sometimes it is pretty profound. What is more, there is a misunderstanding that creativity needs no bounds. Creativity is all about restrictions.

        • Re: (Score:2, Interesting)

          by gweihir ( 88907 )

What is more, there is a misunderstanding that creativity needs no bounds. Creativity is all about restrictions.

Exactly. It is about doing something _meaningful_ within restrictions that make sense. It is about an idea and the structures derived from that idea. AI can, say, replace a character in an existing story, or it can mix some stories together, but it cannot add to things. It can only make derivative things that are of lower quality than the input.

          Incidentally, the unavoidable problem of "model collapse" is a result of that.

What is more, there is a misunderstanding that creativity needs no bounds. Creativity is all about restrictions.

Exactly. It is about doing something _meaningful_ within restrictions that make sense. It is about an idea and the structures derived from that idea. AI can, say, replace a character in an existing story, or it can mix some stories together, but it cannot add to things. It can only make derivative things that are of lower quality than the input.

            Incidentally, the unavoidable problem of "model collapse" is a result of that.

            And the closest that AI comes to creativity is when it hallucinates. Of course that is still not creativity at all. At best it can be inadvertently funny.

            • by gweihir ( 88907 )

Hallucinations are sort-of randomizations with worse randomness. That may look profound in some cases, but it is still complete bullshit. And the thing is, the AI cannot tell in which cases it looks profound and in which cases it just looks like nonsense.

              It is a bit like the 1000 monkeys with 1000 typewriters and unlimited time. Sure, at some point they will have _also_ written all great works of literature, but they cannot tell where they are in all the nonsense and random crap.

Your criticisms of AI always rely upon definitions, terminology, and benchmarks that you alone define and consider to be worthy of merit. Fortunately, many of the rest of us try to think a little more critically about our statements.

            • by gweihir ( 88907 )

No, they do not. Like, at all. And you did not even notice that I typically do not criticize LLMs; I criticize the lies that get pushed about them.

              I think you have no clue what I write here about LLMs at all. You just see something you do not understand but somehow admire being criticized and then you try to fling crap.

      • by gweihir ( 88907 )

        Nope. You will have a _random_ story. That is fundamentally different. Randomness cannot replace insight or creativity, even if some artists throughout history have tried that path.

    • Re: (Score:3, Informative)

      by gweihir ( 88907 )

      Indeed. It will always be derivative and it will always be low quality with regard to content. Anything else would require insight and creativity and AI cannot do those. Period. What can get better is the language used, as that does not require insight or creativity.

      No idea why people continue to expect things from AI that it fundamentally cannot do.

      • Indeed. It will always be derivative and it will always be low quality with regard to content. Anything else would require insight and creativity and AI cannot do those. Period. What can get better is the language used, as that does not require insight or creativity.

        No idea why people continue to expect things from AI that it fundamentally cannot do.

The claim that AI "will always be derivative" and "low quality" because it lacks "insight and creativity" is a strawman argument that tries to reframe the discussion. By framing the discussion around abstract qualities like insight and creativity -- terms that are not well-defined and are often subjective -- it misrepresents the goals and capabilities of AI systems. Most advocates for AI development are not claiming these systems possess human-like consciousness or insight. Instead, they focus on achieving

        • by gweihir ( 88907 )

          We are talking about LLMs. And LLM results will always be derivative and lower quality than their training data because that is THEIR FUCKING MATHEMATICAL NATURE. No "strawman" in there, just a lot of people, like you, that refuse to see actual facts.

ok, but the paper defines creativity [openreview.net] and develops a metric for measuring it. If you had even read the summary, you would have known that.
Children start writing highly derivative stories also. We're not really clear on what people do to actually write genuinely creative stories, but even highly skilled writers seem to start with a lot of derivative things. In that sense, ChatGPT's attempts to write fiction resemble those of a 12-to-14-year-old child (although I've seen 12-year-olds who are better writers than it). What needs to be done differently still isn't clear. That said, I'm not sure that writers as writers really want this. Writin
      • What needs to be done differently still isn't clear.

Two things: AI needs better semantic models. Or even just one would be better than what we have now. And then AI needs heuristics trained by semantic generate-and-test routines. Maybe supervised at the outset*. But eventually internalized, as it is with experienced humans. To throw the garbage out before it even surfaces as a creation.

        *But this would fly in the face of AI investment. Having to pay actual human tutors rather than scrape "free" stuff off the Internet.

      • Children start writing highly derivative stories also.

        I'm not sure that's true. Sometimes children's stories are literal hallucinations (Lucy in the Sky with Diamonds?).

    • "A new study finds such output remains derivative -- at least for now."

      For now? The whole principle they're building on is to replicate what it's seen. How can it be anything *other* than derivative?

      This assertion relies on flawed reasoning and underestimates both the emergent nature of creativity and the trajectory of AI development. A more productive discussion would explore how AI enhances innovation rather than reducing it to a narrow definition of "derivative."

      The claim that AI "must always be derivative" because it "replicates what it's seen" makes several missteps. First, it begs the question by assuming that replication excludes novelty, ignoring that creativity often emerges from the recombina

    • Whenever a new AI service comes out, I put it through its paces and inevitably see the same pattern emerge:

      For the first few days, it blows me away with seemingly unique, on-target responses to my prompts. This is true whether it's a chatbot for creative writing, an image generator, or a music generator.

      Then, over time, I start to see that it's rehashing the same concept over and over. In an extended roleplay conversation, it repeats the same stock phrases no matter what the context.

      What we're seeing is ELI [playclassic.games]
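
As an aside on the generate-and-score idea raised in this thread (dvice's comment above), here is a minimal Python sketch. Every name in it is a hypothetical placeholder, and the scoring function, which is exactly the hard part the comment identifies, is a trivial stand-in.

```python
# Minimal sketch of the "generate randomly, then filter" idea from the thread
# above. All names here are hypothetical placeholders; the hard part the
# comment identifies -- a scoring function that captures what humans actually
# enjoy -- is deliberately left as a trivial stand-in.
import random

def generate_candidate(rng: random.Random, vocabulary: list[str],
                       length: int = 20) -> str:
    """Produce a random string of words: novel, but almost never any good."""
    return " ".join(rng.choice(vocabulary) for _ in range(length))

def score_enjoyment(text: str) -> float:
    """Placeholder for the missing ingredient: a model of human enjoyment.
    This stand-in merely rewards longer average word length."""
    words = text.split()
    return sum(len(w) for w in words) / len(words) if words else 0.0

def best_of_n(n: int, vocabulary: list[str], seed: int = 0) -> str:
    """Generate n random candidates and keep the highest-scoring one."""
    rng = random.Random(seed)
    return max((generate_candidate(rng, vocabulary) for _ in range(n)),
               key=score_enjoyment)

if __name__ == "__main__":
    vocab = "the a story night sea quiet machine dream copper lantern".split()
    print(best_of_n(1000, vocab))
```

Randomness supplies novelty for free; without a meaningful score_enjoyment, best_of_n just returns polished noise, which is exactly the commenter's point.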

  • "relieves" itself in the woods?

  • Duh!
  • Wrong question (Score:5, Interesting)

    by allo ( 1728082 ) on Saturday December 21, 2024 @09:27AM (#65030641)

    Why should AI be creative when the whole source of its creativity is a long int seed? Without your own creativity, all you get is variants of what the model likes to write. It may read well, but after a while it will always be the same.
Give the model input from your creativity and use the model's writing skills to make your vision of a text come true. Why do we need to outsource this to the model? (A toy illustration of what the seed and sampling temperature actually control is sketched at the end of this thread.)

    • Re:Wrong question (Score:4, Interesting)

      by VeryFluffyBunny ( 5037285 ) on Saturday December 21, 2024 @10:19AM (#65030705)
      It's precisely this "turn of phrase" that we enjoy from skilled writers that this analytical tool is about. It's not measuring the ideational content of the writing, i.e. thought-provoking or entertaining stories, just the way it's written. The writer's style, as it were.

      I suspect this tool's analyses & results were a foregone conclusion when the researchers thought up the idea. Of course, GPT LLMs are going to produce bland prose; they're essentially "averaging machines" & all distinctiveness has been statistically cancelled out.
      • by allo ( 1728082 )

I think the largest problem with bland prose is bad datasets. If you look at the bland prose, most of it isn't all that bad. Yes, all common tropes and so on, but not rare in other literature and not bad per se. But the models have way too high a repetition rate and too little diversity.

One thing is that the model starts anew with each text. Write one chapter without the previous one in the context, and you get repetitive phrasing, because the model doesn't know it (over)used this phrase in the last chap

I agree with much of what you've said but you've missed the point I made. The blandness, narrowness, & repetition come from the statistical probabilities, i.e. the same "most likely" next morphemes keep coming up again & again simply because they're statistically probable. Language, with its dual structure & multiple levels of analysis, has a remarkably consistent Zipfian distribution across all of them. So, the same, overly obvious, bland turns of phrase keep rising to the top over and over ag
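
To illustrate the "long int seed" and "most likely next token" points made in this thread, here is a toy Python sketch of temperature sampling. The next-token distribution below is invented for illustration; no actual language model is involved.

```python
# Toy illustration of how a random seed and the temperature parameter shape
# next-token choices. The distribution below is invented for illustration;
# a real LLM produces one like it (over tens of thousands of tokens) at
# every single step.
import random

def sample_next(probs: dict[str, float], temperature: float,
                rng: random.Random) -> str:
    """Rescale a next-token distribution by temperature and sample from it."""
    if temperature <= 0:                      # treat 0 as greedy decoding
        return max(probs, key=probs.get)
    weights = {tok: p ** (1.0 / temperature) for tok, p in probs.items()}
    r = rng.random() * sum(weights.values())
    for tok, w in weights.items():
        r -= w
        if r <= 0:
            return tok
    return tok  # numerical fallback

# A Zipf-like toy distribution: a few continuations carry most of the mass.
probs = {"the": 0.45, "a": 0.25, "his": 0.15, "an": 0.08, "her": 0.05, "yon": 0.02}

for temp in (0.2, 0.7, 1.5):
    rng = random.Random(42)                   # the "long int seed"
    picks = [sample_next(probs, temp, rng) for _ in range(20)]
    print(f"T={temp}: {picks}")
```

At low temperature the same most-probable continuations keep winning, which is one mechanical reason for the bland, repetitive phrasing discussed above; raising the temperature buys surface variety, not insight.
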
sure, honestly, ai just can't write like us. it's all robotic and stuff, never gets the feelings right. sometimes the stories are just boring and lack the depth you need. it's like trying to explain a joke to a brick wall. ai will never truly understand what makes writing special. it's just lines of code, not real creativity
They are measuring linguistic creativity, which concerns only the uniqueness of words in sentences and phrases, rather than attempting to measure the overall creativity of the work. Neither does the paper even once mention the temperature parameter. They select poor models like ChatGPT, which is well known for being highly overfit, and Llama 2, when there are far better models tuned for this kind of work readily available.

    Overall I think the paper is fundamentally flawed and guaranteed to cause confusion in its choice

    • Well, I slammed a few of my poems into that tool, and got a creativity index between 75% and 80%.
I've yet to figure out... 75%-80% of what, exactly?

    • by gweihir ( 88907 )

Indeed. Essentially, they have created a metric and benchmark to provide (fake) support for the conclusions they wanted to find. That is junk science and meaningless.

    • Yeah, I think there's a basic contradiction in trying to 'prove' that humans are more creative than AI by applying a metric that is itself an algorithm. Train the AI to optimize this metric of creativity and my guess is it will take the lead.

      Since it is accepted that people possess creativity and the question is whether AI merits admission to the club, the judges of creativity must be human and the criteria must be subjective. This can still constitute proof if the judging is blind (the judges aren't to

  • Wrong priority (Score:3, Insightful)

    by MpVpRb ( 1423381 ) on Saturday December 21, 2024 @10:51AM (#65030757)

    We don't need robot artists.
    We need AI systems that can solve previously intractable problems in physics, medicine and engineering.
The art problem was solved a long time ago. People are good at art and don't need robot help.

  • and you won't have any left after using AI to think for you... <see link>

    We're headed to a world of zombies and automatons and "intelligent web agents" <ha haahahhaaa>
You're being trained to NOT be creative, actually, to NOT think at all, and it appears you're embracing it.
Press the button for your food pellet.

See what the eggheads at U of T say... it's what I've been saying for a while now:
    https://techxplore.com/news/2024-10-explores-impact-llms-human-creativity.html
  • Look, the more other people's works we steal, the better it gets!
  • The model doesn't stray far from the answer it has. It might be varied slightly, but not by much. It might have a different flavor or tone, but typically it is the same answer.

    At work, some people were writing a very long and complex prompt to get some information about a topic. I went ahead and wrote a few sentences, asking the question more directly. And while the others might have gotten some verbose output, flavored and toned a certain way, I got essentially the same answer.

    I say there is not enough dat

  • One of the first responses made the point that, since they rely on a "corpus" to give them clues as to the most-likely response to a particular set of words in a query, these tools can't help but be derivative. Which is pretty obvious to me.

It's the more "meta" parts of the question of "originality" that I don't see any sign of (in the incessant writing on the subject - never had the slightest interest in actually trying them for myself. I'd rather spend hours per day reading novel papers on Arχiv).
