• 0 Posts
  • 22 Comments
Joined 8 months ago
Cake day: March 4th, 2024

  • “We envision other types of more complex guardrails should exist in the future, especially for agentic use cases, e.g., the modern Internet is loaded with safeguards that range from web browsers that detect unsafe websites to ML-based spam classifiers for phishing attempts,” the research paper says.

    The thing is, folks know how the safeguards for the ‘modern internet’ actually work, and they are generally straightforward code. LLMs are kind of the opposite: some mathematical model that spews out answers. Product managers who think it can be corralled into behaving in a specific, incorruptible way will, I suspect, be disappointed.





  • Because it’s actually really hard to achieve technically. When ads are served outside the stream you can easily serve different ads to different viewers based on their profiles. When the ads are baked into the stream you can either:

    A) Create a whole bunch of different copies of the video asset with different ads baked in and rotate them on a regular basis, which would be expensive to update and store, and would limit the range of adverts that could be served to a particular user.

    B) Dynamically create a stream on the user’s request, which, while possible, means standard CDN caching isn’t going to work, so there’s a distribution challenge.

    Or some other alternative they’ve come up with. I’d be really interested to know what their approach is here.
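
To put a rough shape on why option A gets expensive, here is a back-of-the-envelope sketch. Every number in it (pool size, slot count, file size) is a made-up assumption for illustration, and it assumes each ad slot can independently hold any ad from the pool; it is not a description of how any real platform does this.

```python
def baked_copies(ad_pool_size: int, ad_slots: int) -> int:
    """Copies of one video needed if every slot can hold any ad from the pool."""
    return ad_pool_size ** ad_slots


def storage_tb(copies: int, video_gb: int) -> float:
    """Total storage in TB for that many copies of a video of the given size."""
    return copies * video_gb / 1000


# Hypothetical figures: 20 ads in rotation, 3 ad breaks, a 2 GB video file.
copies = baked_copies(ad_pool_size=20, ad_slots=3)
total_tb = storage_tb(copies, video_gb=2)

print(f"{copies} copies, about {total_tb} TB for a single video")
```

Even with a small ad pool the copy count grows exponentially with the number of ad breaks, which is why a real deployment would have to cap the combinations and so limit which ads a given user can see.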



  • The A16 Bionic has a Neural Engine capable of 17 TOPS but only 6GB of RAM.

    The M1 had a Neural Engine capable of just 11 TOPS but all M1 chips have at least 8GB of RAM.

    So the model could run on an A16 Bionic if it had 8GB of RAM, as it has about 54% more TOPS than the M1, but it only has 6GB. Apple have clearly decided that a model small enough to fit just wouldn’t give good enough results.

    Maybe as research progresses they’ll find a way to make it work with a model with fewer parameters but I’m not going to hold my breath.
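
The comparison above can be checked with a few lines of arithmetic. The TOPS and RAM figures are the ones quoted in the comment; treating 8GB as the minimum the model needs is the commenter's inference, not a published requirement.

```python
# Figures as quoted in the comment above.
A16_TOPS, A16_RAM_GB = 17, 6
M1_TOPS, M1_RAM_GB = 11, 8

# Assumption: the smallest M1 configuration sets the RAM floor for the model.
ASSUMED_RAM_FLOOR_GB = M1_RAM_GB


def percent_more(a: float, b: float) -> float:
    """How much larger a is than b, as a percentage of b."""
    return (a / b - 1) * 100


tops_gain = percent_more(A16_TOPS, M1_TOPS)
meets_ram_floor = A16_RAM_GB >= ASSUMED_RAM_FLOOR_GB

print(f"A16 Neural Engine: {tops_gain:.1f}% more TOPS than the M1")
print(f"A16 meets the assumed {ASSUMED_RAM_FLOOR_GB}GB floor: {meets_ram_floor}")
```

So on compute the A16 clears the bar comfortably; it is only the memory check that fails.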





  • The thing with serverless is you’re paying for iowait. In a regular server, like an EC2 or Fargate instance, when one thread is waiting for a reply from a disk or network operation, the server can do something else. With serverless you only have one thread, so you’re paying for this wait time even though it’s not actually using any CPU.

    While you’re paying for that time you can bet that CPU thread is busy servicing some other customer and also charging them.

    I like serverless for its general reliability; it’s one less thing to worry about, and it is cheap when you start out thanks to generous free tiers. At scale, whether it is good value or not is a more complex answer.
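
A rough model of the iowait point, assuming billing is per invocation for its full wall-clock duration. The request count and timings are hypothetical, and real pricing also depends on memory size, free tiers, and per-request fees, which this sketch ignores.

```python
def billed_ms(requests: int, cpu_ms: int, io_wait_ms: int) -> int:
    """Billed duration when each invocation pays for CPU time plus I/O wait."""
    return requests * (cpu_ms + io_wait_ms)


def cpu_busy_ms(requests: int, cpu_ms: int) -> int:
    """Time the CPU actually spends working on those requests."""
    return requests * cpu_ms


# Hypothetical workload: 1000 requests, 5 ms of CPU and 95 ms waiting on I/O each.
requests, cpu_ms, io_wait_ms = 1000, 5, 95

billed = billed_ms(requests, cpu_ms, io_wait_ms)
busy = cpu_busy_ms(requests, cpu_ms)
wait_share = (billed - busy) / billed

print(f"billed: {billed} ms, CPU busy: {busy} ms")
print(f"share of the bill spent waiting: {wait_share:.0%}")
```

For an I/O-heavy workload like this one, most of the billed time is waiting, whereas a regular server could overlap those waits across requests on the same instance.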