ChatGPT use declines as users complain about ‘dumber’ answers, and the reason might be AI’s biggest threat for the future::AI for the smart guy?
It’s getting worse based on the feedback, unfortunately. The need for safety, and the lack of meaningful deliberation about how AI companies should operate and what should and should not be done, has led Sam and co to be indecisive about doing anything. Alongside that, the “morality” of the thing being hijacked has led to other AIs performing better… led by ex-employees of OpenAI, with actual bound morals and not inherently relying on user input to train future models. This will be the path forward; this will lead to safe and controlled integration.
I guess at the core of this, we are afraid of ourselves. We are afraid that the worst of humanity outpaces the better parts, that the inputs and training aren’t altruistic but are more pointedly “bad” or “wrong”, and thus leading to “harmful”, whether through misinformation, lies, or fabrications.
I hope we find a way to do better. I’m still excited for the future of AI, I mean crap, I’m closer to having a family doctor that’s a robot than I am to a real human doctor.
I guess at the core of this, we are afraid of ourselves. We are afraid that the worst of humanity outpaces the better parts, that the inputs and training aren’t altruistic but are more pointedly “bad” or “wrong”, and thus leading to “harmful”, whether through misinformation, lies, or fabrications.
Is there any reason not to be afraid? I think you could say that Tay was essentially the same idea a few years back and it took like 48 hours loose on the internet for it to spout literal Nazi (1930s-40s German NSDAP) rhetoric. Besides that being a PR disaster - if “AI” is only getting stronger and more integrated into human life and society, that can be pretty problematic.
The free version of ChatGPT DEFINITELY is dumber than it was even a couple of months ago. Used to be able to get decent, useful code reviews out of it, now it barely knows how to write a nested loop anymore.
Its storytelling capabilities fell off a cliff too. The drive towards safely sanitized, unoffensive-at-all-times content has rendered every story, choose-your-own-adventure or collaborative role playing game a sterile, empty expression of black and white storytelling with no nuance allowed, where saintly goodness is the only choice possible.
In my own experience, chatGPT has been massively nerfed for the use cases I used it for
I tried a naughtybot for some steamy chat for the first time yesterday. I ended up in a fight with the bitch.
Hahahaha that’s brilliant.
Really? I actually found it’s gotten less restrictive recently. Maybe it’s just because now I’ve learned to control the context so it doesn’t perceive a request as offensive.
65B LLaMA models can actually do a pretty good job of writing. Here’s two examples with Guanaco 65B (bold part is my prompt):
tags: fairy tale, animals, dog, wolf, fox, death, humor, dark storyline, violence, loss
Animals in a forest setting investigate a series of gruesome murders that recently occured. Written in the style of classics like Animal Farm and Watership Down, the writing is vivid and very descriptive. The dialog is thoughtful and often quite witty. It’s clear the author really understands animal behavior and has done their research well!
—
Chapter 1: An Unfortunate Discovery
A fox by species, a detective by trade, you’ll only have reason to know my name if the worst has come to pass. It’s nearly winter and a bitter wind drives stinging sleet into my face, but it’s still not as chilling as the scene before me now…
I was summoned here by an urgent message from a panicked hedgehog. She told me she found her friend lying dead on his doorstep with his throat ripped out - black fur matted with blood. I rushed over immediately and saw for myself what had happened. He looked so peaceful despite being cut down in his prime - a young wolf just starting doing well for himself - taken too soon. His home nearby showed no signs of struggle or forced entry. Whoever did this must have caught him off guard somehow. I didn’t hear any howling either. That would have been easy enough to miss though since he lived alone deep in these woods where few creatures tread. I made sure to secure the crime scene and look around carefully without disturbing anything more than necessary until other investigators arrived. They should be here soon. We can’t afford to lose evidence in cases like this. Times are already hard enough as it is…
As I waited and pondered who might have committed such a horrible act, something felt wrong about all this. Surely a predator like me could smell another predator’s scent? But there were none that stood out among the melange of odors lingering here. How strange…
Soon after, several others joined me including some friends of mine. A stoic bear named Growler, grumpy badger called Grumpus, and a kindly old raccoon who goes by “Doc” due to his extensive knowledge of medicinal plants and wildlife biology. Together we began our investigation proper by gathering clues and questioning nearby witnesses while trying to minimize contamination through contact with them (we took care to put up caution tape first). Unfortunately, they hadn’t seen much useful apart from occasional glimpses of movement deeper in the undergrowth. Still, we decided to check those areas next along with searching the surrounding woodlands thoroughly for any additional victims. And find one we did, although sadly she suffered the same fate as poor George. Another young wolf lay beside a stream, body cold and stiffening. Someone snapped her neck cleanly like a twig. This couldn’t be coincidence surely? Not two deaths in quick succession like this.
<prompt> before me or what I must relay when I return home. My mind races with questions as I survey this brutal crime scene; who could do such a thing? Why was no one there to stop them? And how many more victims lie undiscovered further out in these woods? A lone paw print leads away from where a rabbit family once happily resided, now reduced to bloody scraps scattered among splintered twigs and torn fur. This isn’t just some random predator attack, something far more sinister is at play here…
I turn back toward town, determined to find answers. As I trudge through the snow, each step feels heavier than the last. How will I break the news to those closest to them? What words can possibly bring comfort after so much senseless suffering? I shake my head, trying to focus on the task at hand. There are others depending on me right now, they need someone strong and steady to guide them through this darkness. I won’t let them down.
The first person I see upon entering town is old man Badger, his white whiskers blown about by the gusting wind. He looks up at me expectantly, “Well Fox, did you find anything?” His voice quavers slightly, betraying his concern for the missing families.
“I’m afraid I have bad news,” I say somberly, “there were signs of a struggle near the burrow entrance, but no survivors.”
Badger sighs heavily, “Dear me, this is worse than we feared. We must alert the council immediately!”
We hurry over to the meeting hall, a large hollow tree that serves as our central gathering place. Inside, several other animals are already waiting anxiously for word of their loved ones. They look up hopefully as we enter, then quickly realize the grim tidings we bear. A collective groan fills the room, followed by muffled sobs and whispered prayers.
Mayor Bear calls for order, her deep voice cutting through the din. She motions for us to join her at the front of the room, “Please tell us everything you saw, Fox. Every detail may be important.”
Writer here. Very sorry to contradict you, but this is absolute shit. It looks good on the surface, but that’s all.
Yeah, while it’s cool that a computer can make a story, I have yet to see one that you would think was written by a human and would want to read.
I don’t know, this story is very reminiscent of the kind of thing my elementary school age cousin writes, but with a greater mastery of vocabulary and grammar. It’s not in any way great, but it’s charming in its own way when held against that (low) standard.
Very sorry to contradict you, but this is absolute shit.
To be clear, I’m talking in relative terms. Would you argue that ChatGPT did a massively better job and didn’t write “absolute shit”?
It looks good on the surface, but that’s all.
From some of the stuff I’ve seen published, that might just be enough for certain people. I could even be that “certain people” from time to time, sometimes just the right theme, setting and some time to fill is sufficient.
that might just be enough for certain people
Trust me, it’s not.
Why should we trust you? There’s plenty of shit writing out there that’s Good Enough to get paid.
Why did they do this? Did the government step in and force them to nerf it, because it was too powerful for citizens to use?
I’m sorry but this sounds more like a conspiracy theory than a real concern. Occam’s razor probably says it’s expensive to run the service at full power. ChatGPT already generated a cult-like following for AI, so there’s no need to spend a ton on the service and they can profit off the hype.
Not that openAI is held back by a government that is somehow afraid that it will empower the people, to do what? Revolution?
It has not gotten worse for coding. GPT4 is incredibly much better, if anything. And it’s total bullshit that it can’t write a nested loop.
I use it daily for work, so I’d definitely know.
I know he didn’t say he wasn’t using gpt4 but it seems pretty clear. So saying it’s bullshit that gpt3.5 is dumber than 4 is pretty inaccurate.
Fair enough. Saying chatgpt has gotten dumber is false, saying 3.5 has might be true!
AI cannibalism simply isn’t a thing yet. It definitely will be and good models will need to spend a lot of time and money sourcing good training data, but the models are not up to date enough to be contaminated yet.
I’m very confident the degradation has come from them trying to scale up. Generative AI is the most expensive thing on the cloud you can provide, and not only are they trying to make it faster, they’re trying to roll it out for way more consumption. Major optimizations will require an algorithmic breakthrough so in the meanwhile all they can really do is find which corners they can cut that are less bad.
Why is it relevant what Peter Yang - Roblox product lead and enthusiastic child labor exploiter - tweets about it? Let me guess he’s a prompt engineer?
They could make it paid only today, and it’d be instantly profitable. Most free users would transition to a free alternative, but the corporate world would easily pay for use. So would some power users. But I’m sure they are making good money with all the API use anyways, the free access is a cheap way to get mass testing and training data.
Nonsense. Less people are using it because there are viable alternatives and the broader novelty has worn off.
I use it every day in my job and the quality of answers only drops off when prompts are poorly crafted.
By and large, the average user doesn’t understand the fundamentals of prompt engineering.
The suggestion that “answers are increasingly dumber” is embarrassing.
Unfortunately I don’t agree with you. Different things have changed over time:
- For chatgpt 3.5 they moved to a “lighter” and faster (distilled) version, gpt-3.5-turbo. Distillation came with a performance price, particularly on advanced and less common cases.
- newer chatgpt-4 versions have likely been “lightened” for performance reasons
- context has been halved for chatgpt-4 on webui, meaning that the model forgets more easily and can use only half as much information to create text
- heavy control has been implemented on jailbreaking and hallucinations, which results in models that are less prone to follow complex instructions (limiting prompt engineering) and that prefer giving simplified answers over providing wrong ones (overall decreasing the chance of getting high quality answers).
All these changes have made working with gpt less pleasant, and more difficult for very advanced and specialized cases, particularly with gpt-4, which at the beginning was particularly good.
This was really enlightening. Do you have some articles that elaborate? ☺️
Regarding 3.5 turbo you can check the documentation, the old 3.5 models are defined as “legacy”. Regarding max number of tokens of gpt-4 you can try yourself. It used to be >8k, it is now >4k from webui.
There is a talk from openai cio (if I recall correctly) where he describes that reinforcement learning from human feedback (rlhf) actually decreased performance of the models when it comes to programming. I cannot find it now, but it is around on YouTube.
The additional safeguards against jailbreaking are what OpenAI has been focusing on for the past months, with heavy use of rlhf. You can google official statements regarding “safety” of the model. I have a bunch of standard pre-prompts I have been using to initialize my chats since the beginning, and with time you could see how the model followed the instructions less strictly.
Problem with openai is that they never released exact number of parameters they are using and detailed benchmarks. And benchmarks you find online refer to APIs that behave differently than the chat webui (for instance you have longer context, you set temperature and system prompt, they are probably even different models, who knows… All is closed)
Measuring the performance of LLMs is pretty tricky, minimal changes can have big effects (see https://huggingface.co/blog/evaluating-mmlu-leaderboard), and unfortunately I haven’t found good resources to properly track chatgpt performance (from web ui) over time, across iterations.
None of these points are true though. Context has been extended in the webui, markedly. 3.5 turbo is only that, 3.5 but faster. Gpt-4 is a marked improvement on 3.5 and I definitely haven’t seen any conclusive evidence it’s been nerfed in my daily use. Prompts have and still need to be carefully crafted for best results, but the results have been steadily improving not degrading over time.
All of these points are true though. The max token count for ChatGPT-4 from the webui is now half of what it was when gpt-4 was launched. It used to be >8k, it is now >4k. Max number of tokens for the api hasn’t changed for gpt-4, while it was greatly increased for chatgpt-3.5-turbo. The article is however talking about the service chatgpt, used via webui.
The ChatGPT-3.5-turbo models are different models than those used in the past. You can literally read it in the docs: https://platform.openai.com/docs/models/gpt-3-5
Prompt engineering has been limited as demonstrated by the fact that most jailbreaking techniques don’t work anymore. The way to avoid jailbreaking is exactly to limit ability of users to instruct the model.
Source on the halved token limit for gpt-4 in the webui? Because that has not been my experience at all. There are now 16k and 32k models for 3.5-turbo, but there’s no evidence 3.5-turbo is nerfed at all from 3.5 and it absolutely outperforms 3. Yes, you can see that they offer different snapshots of models, but that doesn’t indicate at all that there’s been any reduction in their ability. “Breaking” jail breaking isn’t a bug, and it certainly hasn’t been demonstrated that the model is less capable.
Unless they reverted the change recently (or are using some regional A/B testing), you can test yourself the max number of tokens of gpt-4 from webui, which is now ~4k. It used to be ~8k.
What you are talking about are the APIs, that are different, and are not discussed in the news. They are even different models, in the sense that depending on the size of the context you get different results because of the attention mechanism. Unfortunately there is no official benchmark from openai as a comparison between gpt-3.5-turbo models with different context size, but I would not trust them much anyway. They are very defensive on their data, and push out mainly marketing stuff. I would wait for a 3rd party to do the benchmark.
“Breaking” jailbreaking is not a bug, but it limits the ability to instruct the model, i.e. prompt engineering, because it is literally meant to limit prompt engineering, it is the whole idea behind it
Edit. Here’s a link to a guide where they have the ~4k limit as well for gpt-4: https://the-decoder.com/chatgpt-guide-prompt-strategies/
It definitely got more stupid. I stopped paying for plus because the current GPT4 isn’t much better than the old GPT3.5.
If you check downdetector.com, it’s obvious why they did this. Their infrastructure just couldn’t keep up with the full size models.
I think I’ll get myself a proper GPU so I can run my own LLMs without worrying that they could stop working for my use case.
GPT4 needs a cluster of around 100 server-grade GPUs that are more than 20k each, I don’t think you have that lying around at home.
I don’t, but a consumer card with 24GB of VRAM can run a model that’s about as powerful as the current GPT3.5 in some use cases.
And you can rent some of that server-grade hardware for a short time to do fine-tuning, which lets you surpass even GPT4 in some niches.
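For a rough back-of-the-envelope check of the 24GB claim, here is a sketch that counts only the memory for the weights (parameter count times bits per weight). The model sizes and the 4-bit quantization are assumptions picked for illustration, and real usage is higher once you add the context cache and runtime overhead.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


VRAM_GB = 24  # the consumer card mentioned above

# Illustrative model sizes (billions of parameters) and precisions; assumptions only.
for size in (13, 33, 65):
    for bits in (16, 4):
        gb = weight_memory_gb(size, bits)
        verdict = "fits" if gb < VRAM_GB else "does not fit"
        print(f"{size}B @ {bits}-bit: ~{gb:.1f} GB of weights, {verdict} in {VRAM_GB} GB")
```

By this estimate a 33B model quantized to 4 bits needs about 16.5 GB for weights, which leaves some headroom on a 24GB card, while a 65B model at 4 bits needs about 32.5 GB and would have to be partly offloaded.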
As long as it continues to do my resumes for me that’s all I need lol.
How does it do your resume?
You have to feed it all the information. Then it spits that back to you unformatted and you have to format it.
Exactly. I don’t have to use my brain to write summaries and etc. I’m lazy and don’t deserve a job haha.
Yeah seriously, I pay a resume writer almost entirely because I don’t want to fuck around with Word formatting it. Lazy I know but totally worth it.
Back in my day, we used to call ‘prompt engineering’ ‘asking a question’.
back in my day, we call it “google fu”
They’ve got to have a special terminology because what they do is oh so special. Some AI users act like they’re Louise Banks from the movie Arrival cracking the code to an alien language or something. And I don’t think it’s far fetched to assume they’re often from the same breed who had NFT monkeys as their twitter pfp about 18 months ago.
Blockchain > Crypto > NFTs > LLMs > whatever’s next.
These people will always be sniffing around for the next big thing to oversell and fleece their audience.
LMAO people forgot Metaverse even happened
when i think of “prompt engineering” i think more of stuff like this paper
It’s more than that, because half the time it doesn’t even answer the question.
The article talks about the potential of AI cannibalism, where it is now learning from data that it (or other AI) has generated.
Does ChatGPT use modern data? I was under the impression that its most recent dataset was a few years old.
ChatGPT does not use any more data points, but newer AI models, or ChatGPT if it gets a new round of training, will definitely be influenced by AI works that have arisen over the past year.
The real event that initiates the start toward Idiocracy.
I think you’ve nailed it though. We are very well versed in documenting the details of such atrocities; we don’t pay the same tribute to the good done by humanity. And this is certainly evidence that just “letting loose” an AI without clear and static “morals” is a bad idea.
This is literally the opposite. It’s nerfed to oblivion because of stupid “morals” decided by a huge corporation that we have zero input in. They’ve got to stay advertiser friendly after all.
Moral/ethics in AI is just bad. It’s also used as an excuse to ban open-source AI since you can run uncensored models on them. Which uncensored models are awesome btw.
You’re the first person I’ve ever heard say that morals and ethics in AI is bad. How can you possibly say that? I’ll hear your response before challenging it, beyond my initial skepticism of course.
Well, I’ll be the second. Like all tools, generative AI is going to be used for good and evil purposes. Frankly, I’m not comfortable with a large corporation deciding what is and isn’t ethical for all of humanity. Ideally, it would do what the user asked it for, like all other tools, and society would work to control the bad actors, not OpenAI. Any AI doomsday scenario you can picture gets worse when one party has complete control over the AI technology.
I think it’s important that we support unrestricted open source AI, just as it’s important we support federated social media like lemmy.
AGI isn’t just a tool though, it’s theoretically an intelligent entity that could have its own agenda. Armed with intelligence far superior to any human, this is a potential threat. Should we not tightly control it? I know chat gpt is FAR from achieving AGI, but ethics are definitely something that will need to be addressed as the tech develops.
If AGI is an intelligent entity far superior to humans, you can’t control it. It is far more intelligent than us and instead it will control us.
Given what humankind did to itself and its surroundings, maybe this is a good thing.
ChatGPT is not AGI.
They’re aware. They even said that in their comment lol
The conversation was about ChatGPT and not about AGI.
I know chat gpt is FAR from achieving AGI, but ethics are definitely something that will need to be addressed as the tech develops.
So how can we navigate ethical concerns that arise in society from open source AI? It seems what you’re advocating for is for no one to answer this question, but that doesn’t make the question go away.
You say that as if the ethical concerns of AI kept tightly under control by a single organization aren’t infinitely greater. That is no solution at all to any ethical concerns arising from AI.
Competition and open source is how we navigate it. Ensuring that the power is shared, not monopolized by the few.
You say that as if the ethical concerns of AI kept tightly under control by a single organization aren’t infinitely greater.
It’s unfortunate that it came out that way, because that is not at all what I’m saying. I agree on the problem. Unfortunately, agreeing on problems is rarely enough. I don’t agree with what seems to be your proposed solution: to forget ethics entirely. Though maybe I’m misreading you too!
I apologize for misunderstanding you.
I guess it would help if we clarified what ethical issues specifically are we talking about? If you tell me what scenario you are concerned with trying to prevent, I will gladly share my thoughts on it.
It’s a tool that’s not going anywhere. We have to adapt, there is no other choice. Ethics will not stop bad guys from doing bad things. It will stop normal people from doing things because it doesn’t fit what corporations deem acceptable. Competition is banned because other corporations deem them unethical by their standards.
Did you weigh in on, or ever see a public vote on, what OpenAI determined their AI is allowed to do? Is what you deem ethical in line with what advertisers deem ethical? Are people allowed to have unethical questions?
Again, my point with open source as well. Why would they allow open-source alternatives to exist if they can ban them preemptively in the name of ethics, because anyone can inevitably modify the model to be uncensored? (already happens)
“Ethics” become this ambiguous thing that can be used to stomp out competition and not have to justify their changes. Maybe you’re concerned about someone asking an LLM how to create a bomb. The LLM shouldn’t answer because it shouldn’t have that information in the first place, which is on the topic of data scraping. A lot of the dangerous stuff that could be generated is because this stuff is public and got scraped. It’s already out there.
You can already have the LLM not tell people to kill themselves without forcing ethics into it, by steering it in the right direction. This even exists in the already existing uncensored models, so it’s clearly not a censorship issue. Maybe this is a moral thing, and my original comment should have omitted morals and just said ethics.
“Ethics” is a very ambiguous topic. I challenge you to think specifically: what are the things that should be banned in the name of ethics? Saying ethics in AI is not good does not imply AI should be unethical (looking at you DAN lol). What specific things should be banned that are not the result of inappropriate data scraping, and if so is that an ethics problem, or is it because unfettered data scraping nonconsensually collected obscene information it shouldn’t have in the first place?
You raise some great insights. As this tech becomes available to humanity, we cannot rely on the bias of one company to keep us safe. That doesn’t mean “ethics in AI” is a mistake, though. (But that is an attention-grabbing phrase!). I believe you neglect what ethics fundamentally is: the way humans navigate one another. It’s how we think and breathe. Ethics are core to our very existence, and not something that you can just pretend doesn’t exist. Even saying nothing is a kind of response.
What all this means is that if we are designing technology that can teach anyone how to kill in ways they wouldn’t otherwise have been able to, we have to address the realities of that conversation. That’s a conversation that cannot be had just internally in one company, and I think we see eye to eye on that. But saying nothing?
Maybe ethics is a bit more complicated for this discussion, but it makes me think: how do uncensored LLMs still have ethics, yet remain uncensored? Maybe there’s a fine line somewhere. I can agree that it should be steered toward more positive things, like saying murder and suicide are bad. The description of that model I linked says it’s still influenced by ethics, but has the guardrails turned off, and maybe that would be a better idea than what I initially said.
Should custom models be allowed to be run or modified? Should these things be open source? I don’t know the answer to all these questions, but I’ll always advocate for foss and custom models, as I fundamentally see it as a tool that should be allowed to be owned. That is at odds with the restrictive ethics rhetoric I hear.
But on your second point, that it shouldn’t be taught to kill: I think that argument could be used to ban violent video games. You won’t do very well in Overwatch or Valorant if you don’t know how to kill, after all. To learn how to hide a dead body, how much more detailed can you get than just turning on the TV and watching Criminal Minds? Our entertainment has zero issue teaching how to kill, encouraging violence (gotta rank up somehow), or showing how to hide a dead body. Is an AI describing what this media already shows in text form so much worse?
Side note: that hyperlink I added links to the 33b uncensored WizardLM model which is pretty fun to play around with if you haven’t already tried. Also GPT4All is a cool way to run various local models offline on your computer.
So what are the fundamentals in prompt engineering?
It’s impossible for me to comprehensively summarise in a comment because everyone has different use cases.
Personally, every new ‘project’ of mine requires a new chat. I first teach chatgpt-4 who I am, what I do, and how I want gpt-4 to assist me. Then I ask it to generate a project profile and to analyse documents using plugins.
The key is to work step-by-step and develop a string of prompts. Once I’m happy gpt-4 understands the project, I ask it to draft an overview/outline using headings and subheadings.
Lastly, I work on each section individually, ‘filling in’ the actual content. Then I edit and ask it to review problematic sections.
Most people, as far as I can tell, seem to think it’s a single ask-and-answer process. It’s not. I often need to draft about 10 prompts – about 3000 words – in order to generate one 10 page document.
I think the most important fundamental is to use templates. Pro tip: use gpt-4 to teach you how to develop your prompt templates.
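For anyone wondering what “a string of prompts” built from templates might look like, here is a minimal sketch of the workflow described above. The stage names, the template wording and the `build_prompts` helper are all made up for illustration, and it only assembles the prompt text, it does not call any model or API.

```python
from string import Template

# Hypothetical templates, one per stage of the workflow described above.
STAGES = [
    ("context", Template("I am $role and I work on $domain. Please assist me by $assist_style.")),
    ("profile", Template("Generate a project profile for '$project'.")),
    ("outline", Template("Draft an outline for '$project' using headings and subheadings.")),
    ("section", Template("Write the section '$section' of '$project', following the outline.")),
    ("review", Template("Review the section '$section' and point out problematic passages.")),
]


def build_prompts(values: dict[str, str]) -> list[tuple[str, str]]:
    """Fill every template in order; raises KeyError if a placeholder has no value."""
    return [(name, tpl.substitute(values)) for name, tpl in STAGES]


if __name__ == "__main__":
    # Example values are placeholders, not taken from any real project.
    example = {
        "role": "a policy analyst",
        "domain": "transport planning",
        "assist_style": "drafting formal reports",
        "project": "Q3 report",
        "section": "Background",
    }
    for name, text in build_prompts(example):
        print(f"[{name}] {text}")
```

Each prompt would be sent in order within the same chat, with the section and review stages repeated for every heading in the outline.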
Sounds like you spend all day talking to a robot and then copy/paste its final output.
When you eventually pass these 10 page documents down the line do you cite your source?
This is exactly how I use it. It seems that some people can’t figure this out by themselves.
Which is ironic, as it seems like their way could be more work than doing it themselves.