  • Great article, thanks for sharing it, OP.

    For example, the Anthropic researchers who located the concept of the Golden Gate Bridge within Claude didn’t just identify the regions of the model that lit up when the bridge was on Claude’s mind. They took a profound next step: they amplified the activations in those regions to roughly 10 times their normal strength. This form of “clamping” meant that even if the Golden Gate Bridge was not mentioned in a given prompt, or was not otherwise a natural answer to a user’s question on the basis of the model’s regular training and tuning, the activations of those regions would always be high.

    The result? Clamping those activations hard enough made Claude obsess about the Golden Gate Bridge. As Anthropic described it:

    If you ask this “Golden Gate Claude” how to spend $10, it will recommend using it to drive across the Golden Gate Bridge and pay the toll. If you ask it to write a love story, it’ll tell you a tale of a car who can’t wait to cross its beloved bridge on a foggy day. If you ask it what it imagines it looks like, it will likely tell you that it imagines it looks like the Golden Gate Bridge.
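
    For anyone curious what “clamping” looks like mechanically, here’s a minimal sketch in PyTorch. It assumes you already have a feature direction for some layer’s hidden states (e.g., found via a sparse autoencoder, as in Anthropic’s work); the hook then boosts the activation along that direction on every forward pass. All names and shapes here are my own illustration, not Anthropic’s actual code, and they clamped feature activations to a fixed high value rather than scaling as I do here:

    ```python
    import torch

    def make_clamp_hook(feature_dir: torch.Tensor, scale: float = 10.0):
        """Forward hook that amplifies the activation along one feature direction."""
        d = feature_dir / feature_dir.norm()  # unit vector for the learned feature

        def hook(module, inputs, output):
            hidden = output[0] if isinstance(output, tuple) else output
            coeff = hidden @ d                 # how strongly the feature currently fires
            boosted = hidden + (scale - 1.0) * coeff.unsqueeze(-1) * d
            return (boosted,) + output[1:] if isinstance(output, tuple) else boosted

        return hook

    # Hypothetical usage with a HuggingFace-style transformer:
    # handle = model.model.layers[20].register_forward_hook(make_clamp_hook(bridge_dir))
    # ...generate text; the feature now fires strongly no matter the prompt...
    # handle.remove()
    ```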

    Okay, now imagine you’re Elon Musk and you really want to change hearts and minds on the topic of, for example, white supremacy. AI chatbots have the potential to fundamentally change how a wide swath of people perceive reality.

    If we think the reality distortion bubble is bad now (the MAGAsphere, etc.), how bad will things get when people implicitly trust the output of these models while the underlying process by which a model decides how to present information is weighted toward particular ideologies? Considering the rest of the article, which explores how these chatbots build a profile of each user and serve different content based on that profile, it will only get easier to identify the people most susceptible to mis/disinformation and deliver it to them in a cheery tone.

    How might we, as a society, create a process for overseeing these “tools”? We need a cohesive approach that can be explained to policymakers in a way that calls them to action on this issue.


  • on-device scam detection

    I know I’ll be downvoted into oblivion, and I can hardly believe I’ve formed this opinion myself, but tbh this is a good application for some of this AI tech (there’s a rough sketch of what I mean at the end of this comment).

    Anecdotally, a friend of mine grew up well-off; he’s from an immigrant family, but his parents were educated and worked in a lucrative profession, so he always went to private schools, etc. Fast-forward to about 10 years after all the kids had moved out: the parents had divorced amicably, and his mom had a sizeable retirement along with the payout from the divorce. Well into the seven figures; she never had to worry about money.

    Anywho, Mom ran into some medical issues, so the kids had to get involved with her finances, as she couldn’t manage them herself. It turns out that over the course of months or years, she had been getting scammed to the tune of tens of thousands of dollars at a time, to the point where she had taken out a mortgage on the home she previously owned outright. They’re still sorting things out, but the number he has tossed out in the past is ~$1.4M that got wired overseas and is just… gone now.

    So yes, I probably won’t turn this feature on myself, but for the tens of millions of uneducated and inept people out there, this could genuinely make a difference in avoiding some catastrophic outcomes. It certainly isn’t a perfect solution, but I suspect my friend would rate it as much better than nothing, and I would argue that this falls short of being “strictly evil”.
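
    Here’s the toy sketch of the on-device flow I’m picturing. A real feature would presumably run a small local model over call transcripts or messages; the regex list below is just a stand-in to show the shape of it, where everything runs locally and nothing ever leaves the device:

    ```python
    import re

    # Toy stand-in for an on-device model: flag transcripts matching known scam
    # tropes. Patterns and threshold are illustrative, not any vendor's list.
    SCAM_PATTERNS = [
        r"wire (the )?(money|funds)",
        r"gift cards?",
        r"warrant .* arrest",
        r"keep this (call|a) secret",
        r"remote access to your (computer|bank)",
    ]

    def scam_score(transcript: str) -> float:
        """Fraction of known scam tropes present in the transcript."""
        hits = sum(bool(re.search(p, transcript, re.IGNORECASE))
                   for p in SCAM_PATTERNS)
        return hits / len(SCAM_PATTERNS)

    if __name__ == "__main__":
        snippet = ("There's a warrant out for your arrest; "
                   "keep this call secret and wire the money today.")
        if scam_score(snippet) >= 0.4:
            print("Warning: this call matches known scam patterns.")
    ```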

  • Yeah, I totally agree that the whole ordeal is unnecessarily complex and confusing. The number of websites that have started mandating 2FA even for accounts with complex, unique passwords that have never been shared annoys me regularly. It’s frustrating that because other people can’t figure out how to use a password manager, we can’t have nice things.

    My guess is that there’s a certain number of account actions you’re allowed to take (changing your password, email, etc.) before they force you into a cooldown period where you can’t delete your account for a week or so. Maybe not, but it’s an approach I’ve seen before.
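
    Something like this, maybe (purely my guess at the shape of the rule, not LinkedIn’s actual logic; all the names are made up):

    ```python
    from datetime import datetime, timedelta

    # Guessed cooldown rule: sensitive changes freeze deletion for a week.
    SENSITIVE_ACTIONS = {"password_change", "email_change", "phone_change"}
    COOLDOWN = timedelta(days=7)

    def deletion_allowed(action_log: list[tuple[str, datetime]],
                         now: datetime) -> bool:
        """Allow deletion only if no sensitive action falls within the cooldown."""
        return not any(action in SENSITIVE_ACTIONS and now - when < COOLDOWN
                       for action, when in action_log)
    ```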

    As for the video call, I totally see your train of thought. This is gonna sound dumb, but consider that nobody at LI knows you, so a video call is of limited value, especially in a world of AI models that can apply filters to video in real time. I’m not saying this is their rationale, but it could be part of it.

    I’m gonna nerd out here for a second, but hopefully you’ll humor me. Authentication is tricky, especially if you want more than one factor for 2FA/MFA. The factors are often described as something you know (a password), something you have (a YubiKey, or in this case a state-issued ID), or something you are (biometrics). The biggest issue, as I understand it, is that people reuse the same password over and over, so if your LI password were compromised, it isn’t too big a leap to assume your email was also compromised, meaning that any form of authentication relying on email can no longer be trusted.

    If LI has a policy that any account deletion attempted within a month of changing the primary email requires the account to have at least two verified factors, that would trigger the request for your ID, because they’re assuming a threat actor controls all of the relevant accounts, which makes those accounts useless for authentication. A state-issued ID is one of the best ways to authenticate because, when your state issues the ID, it provides a level of guarantee that the information is both true and unmodified (authentic).
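
    In code, that hypothetical step-up policy might look like this (again, my speculation, not anything LinkedIn has published):

    ```python
    from datetime import datetime, timedelta

    EMAIL_CHANGE_WINDOW = timedelta(days=30)

    def extra_factors_for_deletion(email_changed_at: datetime,
                                   verified_factors: set[str],
                                   now: datetime) -> set[str]:
        """If the primary email changed recently, demand a second, stronger factor."""
        if (now - email_changed_at < EMAIL_CHANGE_WINDOW
                and len(verified_factors) < 2):
            # Email can no longer be trusted as a factor, so step up to
            # something an attacker is unlikely to have: a government ID.
            return {"government_id"}
        return set()
    ```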

    Having said all of that, couldn’t you just Photoshop a state ID and provide that? Some in the comments have suggested that as an option. If I were designing the program, this third party, Persona, would have relationships with the issuers of state IDs and could do some level of validation that the ID being uploaded is authentic.
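
    Entirely hypothetical, since I have no idea what Persona actually does, but the check I’d want is something like this, where issuer_registry_lookup is a made-up stand-in for a query against the issuing state’s records:

    ```python
    from dataclasses import dataclass

    @dataclass
    class IdDocument:
        state: str
        id_number: str
        name: str
        date_of_birth: str

    def issuer_registry_lookup(state: str, id_number: str) -> dict | None:
        """Made-up stand-in for querying the issuing state's records."""
        return None  # a real implementation would call the issuer's service

    def id_is_authentic(doc: IdDocument) -> bool:
        """Cross-check the uploaded ID's fields against the issuer's records,
        so a photoshopped image fails even if it looks perfect."""
        record = issuer_registry_lookup(doc.state, doc.id_number)
        return (record is not None
                and record["name"] == doc.name
                and record["date_of_birth"] == doc.date_of_birth)
    ```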

    I realize none of this solves your problem, but sometimes I feel better about “stupid” policies if I can work backwards to understand how they came to be in the first place and what they’re meant to accomplish. My advice is to wait a week or three and try to delete the account again, but obviously that’s still no guarantee. Good luck!