Hacker News | benchess's comments

> - It's not possible to set screen time restrictions for Safari.

I thought this too. I discovered it actually is possible; Safari just doesn't appear in the list. Go to "Screen Time" -> "App Limits" -> "Add Limit". In the "Choose Apps" dialog you won't see Safari in the list, but if you type "Safari" in the search bar it'll appear.

But I agree with the overall sentiment on this thread. iOS Parental Controls aren't where they need to be.


Heya, I wanted to show off a project of mine for identifying the scaling limits of a large Kubernetes deployment.


The irony of posting this on GitHub, which remains shamefully without IPv6 support.


It’s mentioned


Ah I see it now. Sorry for the noise.


My kids like reading this over and over: Now & Ben: The Modern Inventions of Benjamin Franklin: https://a.co/d/jhm2SvM


Use a DNS proxy and https://github.com/froggeric/DNS-blocklists/blob/main/NoAppl...

You may also need to disable Private Relay


They’re for sale on Mouser for $20,625 each: https://www.mouser.com/ProductDetail/BittWare/RS-GQ-GC1-0109...

At that price, 576 chips would be $11.9M
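As a sanity check, a quick back-of-the-envelope in Python. This assumes the Mouser card price approximates per-chip cost and uses the 576-chip count reported in the NextPlatform article cited downthread:

```python
# Back-of-the-envelope: retail cost of the chips in the demo.
# Assumes the $20,625 Mouser card price approximates per-chip cost
# and the 576-chip deployment reported by NextPlatform.
CARD_PRICE_USD = 20_625
CHIPS = 576

total_usd = CARD_PRICE_USD * CHIPS
print(f"${total_usd / 1e6:.1f}M")  # $11.9M
```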


Yeah, I don't know what it costs us to build out our own hardware, but it's significantly less expensive than retail.


I presume that's because it's a custom ASIC not yet in mass production?

If they can get costs down and put more dies into each card then it'll be business/consumer friendly.

Let's see if they can scale production.

Also, where tf is the next Coral chip? Alphabet has been slacking hard.


I think Coral has been taken out behind the woodshed. Nothing new from them in years, sadly.


Yeah. And it's a real shame, because even before LLMs got big I was thinking that a couple of generations down the line, Coral would be great for some home automation/edge AI stuff.

Fortunately, LLMs and the hard work of clever people running them on commodity hardware are starting to make this possible anyway.

Because Google Home/Assistant just seems to keep getting dumber and dumber...


That seems to be the price per card rather than per chip. I would expect multiple chips on a single card.


From the description that doesn't seem to be the case, but I don't know this product well

> Accelerator Cards GroqCard low latency AI/ML Inference PCIe accelerator card with single GroqChip


Missed that! Thanks for pointing it out!


This isn't running on one chip. It's running on 128, or two racks' worth of their kit. https://news.ycombinator.com/item?id=38739106

This doesn't mean much without a comparison against GPU equivalents in dollars or watts.


This article from less than a month ago says that it is on 576 chips https://www.nextplatform.com/2023/11/27/groq-says-it-can-dep...


GPUs can't scale single-user performance beyond a certain limit. You can throw hundreds of GPUs at it, but the latency will never be as good.
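A toy model (all numbers are illustrative assumptions, not benchmarks) of why per-token latency floors out: under tensor parallelism, compute time shrinks as you add GPUs, but a fixed per-layer communication cost does not:

```python
# Toy latency model: decode one token for an 80-layer model whose
# matmuls are tensor-parallel across n GPUs. Compute scales ~1/n;
# a fixed per-layer all-reduce cost does not. All numbers are
# illustrative assumptions, not benchmarks.
def token_latency_ms(n_gpus, compute_ms=40.0, layers=80, allreduce_us=20.0):
    return compute_ms / n_gpus + layers * allreduce_us / 1000.0

for n in (1, 8, 64, 512):
    print(n, token_latency_ms(n))
# Latency approaches the ~1.6 ms communication floor and stops
# improving, no matter how many GPUs you add.
```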


Thanks, I need to correct my earlier guess: I believe this demo is running on 9 GroqRacks (576 chips) and I think we may also have an 8 rack version in progress. I can't remember off the top of my head whether this deployment has pipelining of inferences or whether that's work in progress. We've tried a variety of different configurations to improve performance (both latency and throughput), which is possible because of the high level of flexibility and configurability of our architecture and compiler toolchain.

You're right that it is important to compare cost per token also, not just raw speed. Unfortunately I don't have those figures to hand but I think our customer offerings are price competitive with OpenAI's offerings. The biggest takeaway though is that we just don't believe GPU architectures can ever scale to the performance that we can get, at any cost.


If I understand correctly, Groq chips have 220 MB of SRAM each, and the next tier down is DDR4? How many chips are needed to run Llama2-70B at those speeds?


Cool that you know the tech specs of the GroqChip! Yes, that's right, 220 MB of SRAM per chip. I think the demo where we first broke 200 tokens / sec was running on 1 GroqRack, so 64 chips. The live public demo that's currently running at 275 tokens / sec I think might be running on two GroqRacks, so 128 chips. I'm not certain of either of these figures so please don't quote me! But those are the right ball-park.
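For a rough sense of scale, a back-of-the-envelope (my own arithmetic, not an official Groq figure) for how many 220 MB chips it takes just to hold Llama2-70B's weights in SRAM:

```python
import math

# Back-of-the-envelope only: chips needed just to hold Llama2-70B's
# weights on-chip. Ignores KV cache, activations, and any
# sharding/replication overhead.
PARAMS = 70e9             # Llama2-70B parameter count
SRAM_PER_CHIP_BYTES = 220e6

def min_chips(bytes_per_param):
    return math.ceil(PARAMS * bytes_per_param / SRAM_PER_CHIP_BYTES)

print(min_chips(2))  # FP16: 637 chips for weights alone
print(min_chips(1))  # INT8/FP8: 319 chips
```

The 576-chip deployment discussed elsewhere in the thread falls between these two bounds, consistent with some mix of quantization and overhead.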


This article from less than a month ago says that it is on 576 chips https://www.nextplatform.com/2023/11/27/groq-says-it-can-dep...


Thanks, looks like you're right and this demo is running on 9 GroqRacks (576 chips). I think we may also have an 8 rack version in progress. We've tried a variety of different configurations to improve performance, which is possible because of the high level of flexibility and configurability of our architecture and compiler toolchain.


The key constraint here is that the author has no access to persistent disks and can only use object storage for persistence. Otherwise Thanos would be extreme overkill for this number of metrics.

Single-node VictoriaMetrics can easily handle 1M metrics/sec


> Thanos would be extreme overkill for this number of metrics

Data volume is just one consideration. Thanos makes Prometheus stateless and easy to shard, all in a non-invasive approach that is solid, boring, and just works. The architecture works well in small-scale systems too. I even use it in a single-node k8s cluster in my homelab, paying only about $1 a month for Backblaze B2, so I never worry about data retention or disk usage.

> Single-node VictoriaMetrics can easily handle 1M metrics/sec

Even if I had disk access, I would think twice before deploying and managing a database myself when I don't have to. Besides the maintenance burden and potential scaling issues down the road, block storage like EBS may cost you more than S3.
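To make that concrete, a rough comparison using approximate per-GB-month list prices (assumed figures from memory; check current pricing before relying on them):

```python
# Illustrative per-GB-month prices (assumptions, not quotes):
# EBS gp3 ~$0.08, S3 Standard ~$0.023, Backblaze B2 ~$0.006.
PRICE_PER_GB_MONTH = {
    "ebs_gp3": 0.08,
    "s3_standard": 0.023,
    "backblaze_b2": 0.006,
}

def monthly_cost_usd(gb, tier):
    return gb * PRICE_PER_GB_MONTH[tier]

# 150 GB of metrics history under each tier:
for tier in PRICE_PER_GB_MONTH:
    print(tier, monthly_cost_usd(150, tier))
```

At that scale EBS runs roughly an order of magnitude more than B2, which lines up with the "$1 a month" homelab figure above.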

Also, Prometheus's memory overhead for remote write is wild [1], so good luck with capacity planning and config tweaking.

1. https://prometheus.io/docs/practices/remote_write/

