
I have a close friend working on the core research teams there. Based on our chats, the secret seems to be (1) massive compute power, (2) ridiculous pay to attract top talent from established teams, and (3) extremely hard work without big-corp bureaucracy.


Anecdotal, but I've gotten three recruiting emails from them now for joining their iOS team. I got on a call and confirmed they were offering FAANG++ comp but with the expectation of in-office 50h+ (realistically more).

I don't have that dog in me anymore, but there are plenty of engineers who do and will happily work those hours for 500k USD.


500k isn't FAANG++, it's standard FAANG comp


I should have been clearer: this was 500k for an E4-level role. You're correct that senior/staff at Meta and G are definitely making more.


wow.


If you can share: was this 500k cash, or cash + RSUs?


I have a friend who joined there with 2 YoE, and got fired in 3 months. He was paid 700k cash + 700k RSU


So in the end, did he get anything? I don't know how these things work, but did he just walk away with ~50k in pre-tax income and 0 for RSUs, or did Musk pull a Twitter and not even pay him for those months?


IIRC it was cash, but I'm sure others can confirm.


It was mentioned during the launch that the current datacenter requires up to 0.25 gigawatts of power. The datacenter they're currently building will require 1.25 (5x) (for reference, a nuclear power plant might output about 1 gigawatt). It will be interesting to see whether the relationship between power/compute/parameters and performance is exponential, logarithmic, or something more linear.
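For anyone who wants to sanity-check the arithmetic, here is a quick sketch using only the figures quoted above. The ~1 GW reactor output is the rough reference number from this comment, and the annual-energy line assumes the site draws its full peak power all year, which is an upper bound I am adding for illustration, not something stated at the launch.

```python
# Figures quoted in the comment above (all in gigawatts).
CURRENT_GW = 0.25   # "up to" figure for the current datacenter
PLANNED_GW = 1.25   # figure for the datacenter under construction
REACTOR_GW = 1.0    # rough output of one nuclear power plant (approximation)

HOURS_PER_YEAR = 24 * 365  # assumption: non-leap year, continuous operation

# How much larger the planned site is than the current one.
scale_up = PLANNED_GW / CURRENT_GW

# How many ~1 GW plants it would take to feed the planned site at peak.
reactor_equivalents = PLANNED_GW / REACTOR_GW

# Upper bound on yearly energy if the site ran at peak draw all year (GWh).
max_annual_gwh = PLANNED_GW * HOURS_PER_YEAR

print(f"scale-up: {scale_up:.1f}x")
print(f"reactor equivalents at peak: {reactor_equivalents:.2f}")
print(f"upper bound on annual energy: {max_annual_gwh:,.0f} GWh")
```

Real draw would sit below the peak figure, so treat the last number as a ceiling.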


It's logarithmic, meaning you scale compute exponentially to get linearly better models. However, there is a big premium on having the best model because workloads have low switching costs, which creates all sorts of interesting threshold effects.
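A toy sketch of what "logarithmic" means here, in case it helps. The function and its constants (`base_score`, `gain_per_10x`) are made up for illustration and are not fitted to any real model or benchmark; the only point is that every 10x in compute buys the same fixed increment.

```python
import math

def benchmark_score(compute_multiple, base_score=50.0, gain_per_10x=5.0):
    """Hypothetical log-linear scaling curve.

    compute_multiple: compute relative to some baseline run (1 = baseline).
    base_score and gain_per_10x are illustrative constants, not real data.
    """
    return base_score + gain_per_10x * math.log10(compute_multiple)

# Compute grows exponentially: 1x, 10x, 100x, 1000x the baseline.
multiples = [10 ** k for k in range(4)]
scores = [benchmark_score(m) for m in multiples]

# Score gained at each 10x step; under this model every step is the same size.
gains = [later - earlier for earlier, later in zip(scores, scores[1:])]

for m, s in zip(multiples, scores):
    print(f"{m:>5}x compute -> score {s:.1f}")
```

Under these assumptions the 5x power jump mentioned upthread would be worth less than one such step, since log10(5) is about 0.7.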


It's logarithmic in benchmark scores, not in utility. Linear differences in benchmarks at the margin don't translate to linear differences in utility. A model that's 99% accurate is very different in utility space to a model that's 98% accurate.
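One way to make the 98% vs 99% point concrete is a rough sketch like the one below. It assumes a task made of many steps whose errors are independent, which is a simplification of mine and not something measured; `error_ratio` and `chain_success` are hypothetical helper names.

```python
def error_ratio(acc_before, acc_after):
    """Remaining error rate after the improvement, relative to before."""
    return (1 - acc_after) / (1 - acc_before)

def chain_success(step_accuracy, steps):
    """Probability that every step of a multi-step task succeeds,
    assuming each step fails independently (a simplifying assumption)."""
    return step_accuracy ** steps

# One point of accuracy near the top halves the error rate.
ratio = error_ratio(0.98, 0.99)

# On a 50-step task the gap compounds.
p98 = chain_success(0.98, 50)
p99 = chain_success(0.99, 50)

print(f"error rate ratio: {ratio:.2f}")
print(f"50-step success at 98% per step: {p98:.2f}")
print(f"50-step success at 99% per step: {p99:.2f}")
```

With these numbers the 50-step success rate goes from roughly 36% to roughly 61%, so a one-point benchmark difference can be a large difference in what you can actually build.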


Yes, it seems like capability is logarithmic w.r.t. compute, but utility (in different applications) is in turn exponential (or rather S-shaped) in capability.


Not really, since both give you wrong output that you need to design a system to account for (or deal with). The only percentage that would change the utility is 100% accuracy.


Linear in what metric?


Presumably the benchmarks? I'm also interested.


This is like a caveman dismissing technology because he wasn't impressed with the wheel. It's like, buddy, the wheel is just the start.


> It was mentioned during the launch that the current datacenter requires up to 0.25 gigawatts of power. The datacenter they're currently building will require 1.25 (5x) (for reference, a nuclear power plant might output about 1 gigawatt).

IIRC achieving full AGI requires precisely 1.21 jigawatts of power, since that's when the model begins to learn at a geometric rate. But I think I saw this figure mentioned in a really old TV documentary from the 1980s, it may or may not be fully accurate.


The funny part was that none of his workers recognized the film, which was a blockbuster. A veritable "I must be getting old" moment.


And fun fact: without govt subsidies, a nuclear power plant isn't economically feasible, which is why Elon isn't just building such a plant next to the data center.


Not a bad recipe for success.



