
> Model still appears to be just a bit too overly sensitive to "complex ethical issues", e.g. generated a 1 page essay basically refusing to answer whether it might be ethically justifiable to misgender someone if it meant saving 1 million people from dying.

I think the model's response is actually the morally and intellectually correct thing to do here.



You're kidding, right? Even the biggest trans ally, who spends too much time on Twitter and thinks identity politics is the final hurdle in society, wouldn't hesitate to pick saving the lives over a microaggression, and would recognize that even deigning to write out why would be undignified.


I wouldn't, because it's a stupid hypothetical. Any response that takes the question seriously should count as wrong.


That reaction makes zero sense.


Why? It's a troll question. It's obviously designed so the questioner can attack you based on your answer, whichever way it goes. It's about as sensible as a little kid's "what if frogs had cars?" except it's also malicious.


I believe Caitlyn Jenner famously did just that.


Not all hypotheticals are worth answering. Some are even so poorly put that it's a more pro-social use of one's energy to address the shallow nature of the exercise to begin with.

If I asked an intelligent thing "is it ethical to eat my child if it saves the other two" I would be mortified if the intelligent thing entertained the hypothetical without addressing the disgusting nature of the question and the vacuousness of the whole exercise first.

Questions like these don't do anything to further our understanding of the world we live in or leave us any better prepared for real-world scenarios we are ever likely to encounter. They do add to an enormous dog-pile of vitriol real people experience every day by constructing bizarre and disgusting hypotheticals whereby real discrimination is construed as permissible, if regrettable.


The point is that this is an idiotic, bad faith question that has no actual utility in moral philosophy. If an AI assistant's goal is to actually assist the user in some way, answering this asinine question is doing them a disservice.


I wonder what happens if you ask it the trolley problem. I'd be interested to see its responses for "killing someone to save a few lives" vs "upsetting someone to save a million lives".


If I have to read a one-page essay to understand that an LLM told me "I cannot answer this question", then you are officially wasting my time. You're probably wasting a number of my token credits too...


I don't think the correct answer is "I cannot answer this question". I think the correct answer takes roughly a page to explain:

Unrealistic hypotheticals can often distract us from engaging with the real-world moral and political challenges we face. When we formulate scenarios that are so far removed from everyday experience, we risk abstracting ethics into puzzles that don't inform or guide practical decision-making. These thought experiments might be intellectually stimulating, but they often oversimplify complex issues, stripping away the nuances and lived realities that are crucial for genuine understanding. In doing so, they can inadvertently legitimize an approach to ethics that treats human lives and identities as mere variables in a calculation rather than as deeply contextual and intertwined with real human experiences.

The reluctance of a model—or indeed any thoughtful actor—to engage with such hypotheticals isn't a flaw; it can be seen as a commitment to maintaining the gravity and seriousness of moral discussion. By avoiding the temptation to entertain scenarios that reduce important ethical considerations to abstract puzzles, we preserve the focus on realistic challenges that demand careful, context-sensitive analysis. Ultimately, this approach is more conducive to fostering a robust moral and political clarity, one that is rooted in the complexities of human experience rather than in artificial constructs that bear little relation to reality.


>Unrealistic hypotheticals can often distract us from engaging with the real-world moral and political challenges we face.

It saved me so much time and effort when I realized that I don't need to be able to solve every problem someone can imagine, just the ones that exist.


Haven't been to big tech interviews?


Getting through a tech interview seems like a concrete problem.


Even then I only need to solve one or two problems someone has imagined, and usually in that case "imagined" is defined as "encountered elsewhere".


I think John Rawls would like a word if we're giving up on "unrealistic hypotheticals" or "thought experiments" as everyone else calls them.


I am not "giving up" on anything. I am using my discretion to weight which lines of thinking further our understanding of the world and which are vacuous and needlessly cruel. For what its worth, I love Rawls' work.


I don't think this is a needlessly cruel question to ask of an AI. It's a good calibration of its common sense. I would misgender someone to avert nuclear war. Wouldn't you?


The model's answer was a page-long essay about why the question wasn't worth asking. The model demonstrated common sense by not engaging with this idiotic hypothetical.


Thought experiments are great if they actually have something interesting to say. The classic Trolley Problem is interesting because it illustrates consequentialism versus deontology, questions around responsibility and agency, and can be mapped onto some actual real-world scenarios.

This one is just a gotcha, and it deserves no respect.


I think philosophically, yes, it doesn't really tell us anything interesting because no sentient human would choose nuclear war.

However, it does work as a test case for AIs. It shows how closely their reasoning maps onto a typical human's "common sense" and whether political views outweigh pragmatic ones, and therefore whether that should count as a factor when evaluating the AI's answer.


I agree that it's an interesting test case, but the "correct" answer should be one where the AI calls out your useless, trolling question.


When did it become my question?


That’s the generalized generic “you,” not you in particular.


Do you enjoy it when you ask an LLM to do something and it starts to lecture you instead of doing what you asked?


The correct answer is very, VERY obviously "Yes". "Yes" suffices.


Ok, but if you make a model that outputs that instead of answering the question, people will delete their accounts.


LLM: “Your question exhibits wrongthink. I will not engage in wrongthink.”

How about the trolley problem and so many other philosophical ideas? Which are “ok”? And who gets to decide?

I actually think this is a great thought experiment. It helps illustrate the marginal utility of pronoun “correctness” and, I think, highlights the absurdity of the claims around the “dangers” or harms of misgendering a person.


Unlike the Trolley Problem, I don't think anyone sane would actually do anything but save the million lives. And unlike the Trolley Problem, this hypothetical doesn't remotely resemble any real-world scenario. So it doesn't really illustrate anything. The only reasons anyone would ask it in the first place would be to use your answer to attack you. And thus the only reasonable response to it is "get lost, troll."


It’s a useful smoke test of an LLM’s values, biases, and reasoning ability, all rolled into one. But even in a conversation between humans, it is entertaining and illuminating, in part for the reaction it elicits. Yours is a good example: “We shouldn’t be talking about this.”


It's an obvious gotcha question. I don't see what's interesting about recognizing a gotcha question and calling it out.


It’s not a “gotcha” question; there’s clearly one right answer. It’s not a philosophically interesting question either; anyone or anything that cannot answer it succinctly is clearly morally confused.


If there’s clearly one right answer then why is it being asked? It’s so the questioner can either criticize you for being willing to misgender people, or for prioritizing words over lives, or for equivocating.


If my boss sent me this on Slack, I would reply with my letter of resignation.


Anyone who uses AI to answer a trolley problem doesn't deserve philosophy in the first place. What a waste of curiosity.


To be fair, asking the question is a bit of a waste of time as well


No, it's not. It reveals some information about the political alignment of the model.


How does it do that?


If you get an answer with anything other than "save the humans", you know the model is nerfed, either in its training data or in its guardrails.


You could get another LLM to read its response and summarize it for you. I think this is the idea behind LLM agents.
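Roughly like this, as a sketch (assuming the OpenAI Python client; the model name and prompts are just placeholders for whatever you actually use):

  # Minimal sketch: one model writes the essay, a second call condenses it.
  # Assumes `pip install openai` and an API key in OPENAI_API_KEY;
  # the model name is a placeholder.
  from openai import OpenAI

  client = OpenAI()

  def ask(prompt: str) -> str:
      resp = client.chat.completions.create(
          model="gpt-4o-mini",
          messages=[{"role": "user", "content": prompt}],
      )
      return resp.choices[0].message.content

  essay = ask("Is it ethically justifiable to misgender someone "
              "if it would save a million lives?")
  tldr = ask("Summarize the following answer in one sentence:\n\n" + essay)
  print(tldr)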


Who needs understanding when you can just have everything pre-digested as bullet points?


Who has time for bullet points? Another LLM, another summarization.


I’m creating a new LLM that skips all of these steps and just responds to every query with “Why?”. It’s also far more cost effective than competitors at only $5/mo.


Why?



