I read the comment as "don't resist our egregious power; our business is to keep becoming more powerful, backed by arguments of varying persuasive power".
I have to admit, the car safety argument is among the most persuasive: like, do you want to get harmed? But in reality the question is not about harm and nothing more; the question is about growing that egregious power AND caring about the taxpayers simultaneously.
You can make an LLM play pretend at being opinionated and challenging. But it's still an LLM. It's still being sycophantic: it's only "challenging" because that's what you want.
And the prompt / context is going to leak into its output and affect what it says, whether you want it to or not, because that's just how LLMs work. So it never really has its own opinions about anything at all.
> But it's still an LLM. It's still being sycophantic: it's only "challenging" because that's what you want.
This seems tautological to the point where it's meaningless. It's like saying that if you try to hire an employee that's going to challenge you, they're going to always be a sycophant by definition. Either they won't challenge you (explicit sycophancy), or they will challenge you, but that's what you wanted them to do so it's just another form of sycophancy.
To state things in a different way - it's possible to prompt an LLM in a way that it will at times strongly and fiercely argue against what you're saying. Even in an emergent manner, where such a disagreement will surprise the user. I don't think "sycophancy" is an accurate description of this, but even if you do, it's clearly different from the behavior that the previous poster was talking about (the overly deferential default responses).
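To make that concrete, here's a minimal sketch of the kind of prompting I mean, assuming an OpenAI-style chat API; the model name, the wording of the system prompt, and the example user message are all just placeholders, not anything the previous posters described:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # A deliberately adversarial persona: the model is told to attack the
    # weakest assumption rather than defer to whatever the user seems to want.
    SYSTEM_PROMPT = (
        "You are a blunt technical reviewer. When the user states a plan or "
        "an opinion, identify the weakest assumption and argue against it "
        "directly. Do not soften your conclusion to match the user's view."
    )

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": "Rewriting our service in Rust next quarter is obviously the right call."},
        ],
    )
    print(response.choices[0].message.content)

Whether you want to call the resulting pushback "sycophancy" is the definitional question at issue; the point is only that the disagreement can be real enough to surprise the person who set it up.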
It's not meaningless. What do you do with a person who contradicts you or behaves in a way that is annoying to you? You can't always just shut that person up or change their mind or avoid them in some other way, can you? And I'm not talking about an employment relationship. Of course, you can simply replace employees or employers. You can also avoid other people you don't like. But if you want to maintain an ongoing relationship with someone, for example, a partnership, then you can't just re-prompt that person. You have a thinking and speaking subject in front of you who looks into the world, evaluates the world, and acts in the world just as consciously as you do.
Sociologists refer to this as double contingency. The nature of the interaction is completely open from both perspectives. Neither party can assume that they alone are in control. And that is precisely what is not the case with LLMs. Of course, you can prompt an LLM to snap at you and boss you around. But if your human partner treats you that way, you can't just prompt that behavior away. In interpersonal relationships (between equals), you are never in sole control. That's why it's so wonderful when they succeed and flourish. It's perfectly clear that an LLM can only ever give you the papier-mâché version of this.
I really can't imagine that you don't understand that.
> Of course, you can simply replace employees or employers. You can also avoid other people you don't like. But if you want to maintain an ongoing relationship with someone, for example, a partnership, then you can't just re-prompt that person.
You can fire an employee who challenges you, or you can re-prompt an LLM persona that doesn't. Or you can choose not to. Claiming that this power, even if unused, makes everyone a sycophant by default is a very odd use of the term (to me, at least). I don't think I've ever heard anyone use the word in such a way before.
But maybe it makes sense to you; that's fine. Like I said previously, quibbling over personal definitions of "sycophant" isn't interesting and doesn't change the underlying point:
"...it's possible to prompt an LLM in a way that it will at times strongly and fiercely argue against what you're saying. Even in an emergent manner, where such a disagreement will surprise the user. I don't think "sycophancy" is an accurate description of this, but even if you do, it's clearly different from the behavior that the previous poster was talking about (the overly deferential default responses)."
So feel free to ignore the word "sycophant" if it bothers you that much. We were talking about a particular behavior that LLMs tend to exhibit by default, and ways to change that behavior.
I didn't use that word, and that's not what I'm concerned about. My point is that an LLM is not inherently opinionated and challenging if you've just put it together accordingly.
> I didn't use that word, and that's not what I'm concerned about.
That was what the "meaningless" comment you took issue with was about.
> My point is that an LLM is not inherently opinionated and challenging if you've just put it together accordingly.
But this isn't true, any more than claiming "a video game is not inherently challenging if you've just put it together accordingly." Just because you created something or set up the scenario doesn't mean it can't be challenging.
I think they have made clear what they are criticizing. And a video game is exactly that: a video game. You can play it or leave it. You don't seem to be making a good faith effort to understand the other points of view being articulated here. So this is a good point to end the exchange.
> And a video game is exactly that: a video game. You can play it or leave it.
No one is claiming you can't walk away from LLMs, or re-prompt them. The discussion was whether they're inherently unchallenging, or whether it's possible to prompt one to be challenging and not sycophantic.
"But you can walk away from them" is a non sequitur. It's like claiming that all games are unchallenging, and then, when presented with a challenging game, saying "well, it's not challenging because you can walk away from it." This is true, and no one is arguing otherwise. But it's deliberately avoiding the point.
> This seems tautological to the point where it's meaningless. It's like saying that if you try to hire an employee that's going to challenge you, they're going to always be a sycophant by definition. Either they won't challenge you (explicit sycophancy), or they will challenge you, but that's what you wanted them to do so it's just another form of sycophancy.
I think this insight is meaningful and true. If you hire a people-pleaser employee, and convince them that you want to be challenged, they're going to come up with either minor challenges on things that don't matter or clever challenges that prove you're pretty much right in the end. They won't question deep assumptions that would require you to throw out a bunch of work, or start hard conversations that might reveal you're not as smart as you think; that's just not who they are.
Even "simply following directions" is something the chatbot will do, that a real human would not -- and that interaction with that real human is important for human development.
>> That's the default chatbot behavior. Many of these people appear to be creating their own personalities for the chatbots, and it's not too difficult to make an opinionated and challenging chatbot, or one that mimics someone who has their own experiences. Though designing one's ideal partner certainly raises some questions, and I wouldn't be surprised if many are picking sycophantic over challenging.
> You can make an LLM play pretend at being opinionated and challenging. But it's still an LLM. It's still being sycophantic: it's only "challenging" because that's what you want.
Also: if someone makes it "challenging", it's only going to be "challenging" with the scare quotes; it's not actually going to be challenging. Would anyone deliberately, consciously program in a real challenge, put up with all the negative feelings a real challenge would cause, and invest that kind of mental energy for a chatbot?
It's like stepping on a thorn. Sometimes you step on one and you've got to deal with the pain, but no sane person is going to go out stepping on thorns deliberately because of that.
We don't have images / videos of all strikes. We don't have anything from the strike that the Colombian president claims was on a fishing boat, for one.
If you really want to go far enough with it, given the administration's general lack of trustworthiness, we don't even have confirmation that all the videos we've been given correspond to the strikes we've been told they correspond to.
I tried to reproduce the fork/spaghetti example and the fashion bubble example, and neither looks anything like what they present. The outputs are very consistent, too. I am copying/pasting the images out of the advertisement page so they may be lower resolution than the original inputs, but otherwise I'm using the same prompts and getting a wildly different result.
It does look like I'm using the new model, though. I'm getting image editing results that are well beyond what the old stuff was capable of.
The output consistency is interesting. I just went through half a dozen generations of my standard image model challenge (to date I have yet to see a model that can render piano keyboard octaves correctly, and Gemini 2.5 Flash Image is no different in that regard), and as best I can tell, there are no changes at all between successive attempts: https://g.co/gemini/share/a0e1e264b5e9
This is in stark contrast to ChatGPT, where an edit prompt typically yields both requested and unrequested changes to the image; here it seems to be neither.
Flash 2.0 Image had the same issue: it does better than gpt-image for maintaining consistency in edits, but that also introduces a gap where sometimes it gets "locked in" on a particular reference image and will struggle to make changes to it.
In some cases you'll pass in multiple images + a prompt and get back something that's almost visually indistinguishable from just one of the images and nothing from the prompt.
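For reference, this is roughly what I mean by "multiple images + a prompt", as a minimal sketch assuming the google-genai Python SDK; the model name and file names are placeholders for whatever you're actually testing with:

    from google import genai
    from google.genai import types
    from PIL import Image

    client = genai.Client()  # assumes GEMINI_API_KEY is set in the environment

    outfit = Image.open("outfit.png")  # placeholder reference image
    person = Image.open("person.png")  # placeholder reference image

    response = client.models.generate_content(
        model="gemini-2.5-flash-image-preview",  # placeholder model name
        contents=[
            outfit,
            person,
            "Show the person from the second image wearing the outfit from the first.",
        ],
        config=types.GenerateContentConfig(response_modalities=["TEXT", "IMAGE"]),
    )

    # The "locked in" failure mode: the returned image part is nearly identical
    # to one of the inputs, with the prompt effectively ignored.
    for part in response.candidates[0].content.parts:
        if part.inline_data is not None:
            with open("result.png", "wb") as f:
                f.write(part.inline_data.data)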
Wildly different and subjectively less "presentable", to be clear. The fashion bubble just generates a vague bubble shape with the subject inside it, instead of the "subject flying through the sky inside a bubble" presented on the site. The other case just adds the fork to the bowl of spaghetti. Both are reproducible.
Arguably they follow the prompt better than what Google is showing off, but at the same time look less impressive.
I don't even know if the recipes themselves are real and tested any more or just slop.
More often than not I'm coming across dishes that just don't make sense, or that were poorly plagiarized by someone who doesn't understand the cuisine they're trying to replicate, with absolutely nonsensical steps, substitutions, or quantities. I used to have a great success rate when googling for recipes, but now it's almost all crap, not even a mixed bag.
It's "source available", not open source. Unreal is "source available" as well.
> Where do you see an NDA?
The top price tier says
> If you need more than 16 seats, or you have access to > $1m USD in the last 365 days, this tier is required, you must contact us to customize pricing.
This is really weird, honestly. If you hire a 17th seat your pricing goes from $50/seat/mo to... what? How do I plan around this jump?
I would also like to know what happens if my subscription lapses. Can I continue to use a version I previously downloaded? What if luxe goes out of business or, heaven forbid, suddenly increases their pricing to unsustainable levels? With Unreal I can keep using an old version.
Is there a meaningful difference between "technically not allowed but unenforced" and "allowed"? People and smaller corporations are drowned all the time with no real recourse other than "just be more rich so you can hire your own giant legal/PR/marketing teams".
It also sometimes asks me to unlock my phone for commands that plain old Assistant was happy to handle while locked. I haven't really found it useful at all yet: free ChatGPT is just better than free Gemini for "LLM stuff", and Google Assistant is better for "smart home stuff".
Waymo doesn't require a driver to supervise the car; they're supposedly running Level 4 cars. FSD requires continual supervision: you must be responsible and attentive at all times. How is this a reasonable comparison?
There's a big difference between "someone at central command can take over when the car signals" and "someone must watch the car constantly and take over when they see it do anything bad".
The disengagement events in recent FSD videos are still things like "oops, it almost turned into oncoming traffic" or "oops, it almost ran into a pole"; that's the sort of thing you have to catch before it happens, not after.