While Apple is incentivized to ship a smaller battery to cut costs, it is also incentivized to make its software as efficient as possible to make the best use of the battery it does ship.
If you use your own search tool, you would have to pay for input tokens again every time the model decides to search. This would be a big discount if they are only charging once for all output as output tokens, but that seems unclear from the blog post.
Thanks for the feedback, we just updated our docs to hopefully make this a little clearer. Search results count towards input tokens on every subsequent iteration.
Thanks for addressing it. Still sounds like a significant discount if only the search results, and not all messages, count as input tokens on subsequent iterations!
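To put rough numbers on that, here is a minimal sketch of the two billing readings discussed above. Everything in it is an assumption for illustration: the function names, the token counts, and especially the second billing rule, which is only one interpretation of "search results count towards input tokens on every subsequent iteration". It ignores output tokens and any intermediate model text.

```python
def own_tool_input_tokens(prompt_tokens, result_tokens_per_search, searches):
    """Own search tool: every model call re-bills the whole context,
    i.e. the prompt plus all search results gathered so far."""
    total = 0
    # One call per search decision, plus the final call that writes the answer.
    for results_so_far in range(searches + 1):
        total += prompt_tokens + results_so_far * result_tokens_per_search
    return total


def built_in_input_tokens(prompt_tokens, result_tokens_per_search, searches):
    """Assumed reading of the built-in tool: the prompt is billed once, and
    only the accumulated search results are re-billed on later iterations."""
    total = prompt_tokens
    for results_so_far in range(1, searches + 1):
        total += results_so_far * result_tokens_per_search
    return total


# Hypothetical numbers: a 2,000-token prompt, 1,000 tokens per search result, 3 searches.
own = own_tool_input_tokens(2000, 1000, 3)
built_in = built_in_input_tokens(2000, 1000, 3)
print(own, built_in, own - built_in)
```

Under these assumptions the gap is exactly the prompt size times the number of searches, so the discount grows with longer conversations and more search iterations.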
> The Nova family of models were trained on Amazon’s custom Trainium1 (TRN1) chips, NVidia A100 (P4d instances), and H100 (P5 instances) accelerators. Working with AWS SageMaker, we stood up NVidia GPU and TRN1 clusters and ran parallel trainings to ensure model performance parity.
Does this mean they trained multiple copies of the models?
Models like this are experimentally pretrained or tuned hundreds of times over many months to optimize the datamix, hyperparams, architecture, etc. When they say "ran parallel trainings" they are probably referring to parity tests that were performed along the way (possibly also for the final training runs). Different hardware means different lower-level libraries, which can introduce unanticipated differences. Good to know what they are so they can be ironed out.
Part of it could also be that they'd prefer to move all operations to the in-house trn chips, but don't have full confidence in the hardware yet.
Def ambiguous though. In general reporting of infra characteristics for LLM training is left pretty vague in most reports I've seen.
I wonder why they didn't use AWS-wide numbers rather than just EC2. I would have thought EC2 would lag in the transition while AWS services would make the switch quickly.
Because EC2 represents more realistic market adoption. It’s more important to know whether you can run the software of your choice on ARM than whether Amazon can develop a service on an ARM stack.
This sounds suspicious. All these transactions are traceable now and it should be much easier for the income tax department to find out. No one in their right mind would have so many people send money to their bank accounts like this.
Just asked my friend: apparently they used the college fees account. They were getting a lot of money transfers anyway, so it's not such a big deal at face value. And since no one really raised any alarms during DeMo, no one really investigated further.
> The apt comparison would be between Apple’s revenue and Indonesia’s GDP, because GDP is not a measure of wealth, it’s a measure of output: the total value of goods and services produced by the country in a year.
Sounds like you want profit and not revenue then? Just like a trade deficit is subtracted from GDP, shouldn’t you subtract the costs from revenues of a company to arrive at a similar metric?
It's more that it doesn't take depreciation and destruction into account - the broken windows fallacy.
Annual change in National Wealth (an obscure metric, because it's so difficult to quantify) is a closer match to profit: after performing all your activities, how far ahead do you get in terms of infrastructure, education, durable goods, etc.? GDP, by contrast, measures the amount of activity, in the same way that revenue or expenditure is a good way of estimating how much work is getting done at a company.
But MoviePass charges the same amount no matter where you live, so you'd expect it to be much more popular in expensive places. It would be challenging to come up with a worse business model than MoviePass.
I do know that. But people from India fly to Dubai to purchase iPhones because they are cheaper than that. Apple even offers support for these phones, but they warn people that Dubai iPhones do not have FaceTime.