New Capabilities for GPT-3: Edit and Insert (openai.com)
185 points by todsacerdoti on March 15, 2022 | 82 comments


I just gave it the opening paragraph of Pride & Prejudice in edit mode and the instruction to "rewrite it as a rap":

    When Jane saw mister Bingley,
    She knew that he was fine,
    But it took her a while to realize
    He wanted to be more than friends,

    Then she met Fitzwilliam Darcy,
    And she thought he had a lot of money,
    But it took her a while to realize
    He didn't want to talk about money,

    She was just a young girl,
    But she knew what was going on,
    She knew who she wanted to be with,
    But she also had her doubts,

    She knew he was great,
    But she also thought he was a snob,
    She was afraid that,
    He was going to break her heart.

    She loved him,
    But didn't know if she could trust him,
    So she decided to wait,
    Until his feelings became clear.

    Then all of a sudden,
    She realized,
    That she had been waiting all this time,
    And that she was in love.

    So she decided to go after him,
    And she got to him just in time,
    And they lived happily ever after.
It's a terrible rap, but a fairly impressive summary of the book (from someone who's admittedly never read it).

Odd that it took "rewrite (the opening paragraph) as a rap" to mean "summarise the whole book".


It might have memorized a summary of the book from the data it's been trained on. GPT-3 is impressive, but it's kind of hard to control exactly what it's going to do. It recognized that what you input was from Pride and Prejudice, remembered a summary of the book from somewhere, and turned that into a rap, which is both impressive and not really what you asked for.


Deep learning theory has shown that these huge neural networks can memorize basically any dataset they're given. The useful ones find ways to interpolate everything they've memorized to new contexts.


But they seem terrible at anticipating what people want or find entertaining, which is what a lot of human interaction and entertainment is about. GP wanted a rap but got a shitty poem.


That's because it's just a language model. It's been trained to find a probable completion to a piece of text: to predict a likely next word. It's not trained for human interaction. It's not an agent. It has no motives or goals. It might seem like it does, but that's more of a side-effect of language modelling.
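
For a concrete picture of what "predict a likely next word" means, here is a toy sketch in Python (the distribution is invented; a real model scores its entire vocabulary):

    import random

    # Hypothetical next-token distribution a language model might assign
    # after the prompt "She knew he was"; every number here is made up.
    next_token_probs = {"great": 0.4, "a": 0.3, "rich": 0.2, "wrong": 0.1}

    def sample_next(probs):
        # Pick one continuation in proportion to its probability.
        tokens, weights = zip(*probs.items())
        return random.choices(tokens, weights=weights, k=1)[0]

    print(sample_next(next_token_probs))  # e.g. "great"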


This is where GANs are going to come in.

It's beyond me that Facebook cut the value of its data in half by not including even a privately tracked 'dislike' button.

The value in what people both like and dislike is going to be massive in the next few years, as it can help train the evaluation of content in both general and personalized scales.

I'd wager that, with where the tech is today, Tinder could generate synthetic matches that would be far preferred over any natural matches, even at equal or lower physical-attractiveness scores in the generated photos.

Pandora might be relevant again for music soon as their like/dislike data could allow for the generation of new music in line with preferred tastes and not just selection.

Transistor models like GPT-3 are only part of the equation, and as we continue to improve not only generation of new data, but evaluation of data too -- the application of the technology is likely going to move faster than anything we've seen before in the next few years.

Codex generated 35% of the code written by people with access to it, and Microsoft and OpenAI have the data on what was and wasn't selected. That data alone could produce an AI linter that could have significant value in catching sophisticated natural programmer errors - but in combination with Codex too?

(Keep in mind it was only summer 2020 when the tweet showcasing normal GPT-3 writing HTML was blowing minds, and only last summer Copilot was announced.)

We really haven't seen anything yet.

Things are moving so fast that anyone building a product that relies on third-party AI at a foundational level right now should absolutely be planning, in their product design and implementation, for whatever they are building on top of to be obsolete within ~3 years.


Dislike buttons create a feeling of negativity that turned users off engaging with Facebook, which would have meant their overall data collection ability dropped. The loss of the button itself is not so big because they have other ways to measure engagement and can probably predict quite well which items someone would have disliked anyway.


* Transformer models like GPT-3

(Not 'Transistor' - what I get for typing it out on my phone and not proofreading)


Obligatory mention: GPT-3 cannot learn to rhyme, it can only brute-force memorize rhyme-pairs, because the BPE encoding erases phonetic information from the encoding of words that a GPT-3 (and similar models like Gopher or GPT-J or Jurassic-1) is trained on. https://www.gwern.net/GPT-3#bpes

This is why that rap looks so weird: so high-quality in general, yet so totally lacking a critical element we expect from rap. GPT-3 has a pretty good idea how to write rap, but it just doesn't know what words sound like. It is blind to that. It can't see that difference between rap and poems. To it, rap is just a sort of poetry, and some words are chosen for reasons it can't understand, no matter how many rap lyrics or poems it looks at.

The better the models get at writing, the more this difference throws us for a loop: "how can it be so good at ABC, and yet fail so totally at D?"
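
For a rough illustration of what the model actually sees, here's a quick check with the tiktoken package's "gpt2" encoding (the same style of BPE vocabulary the GPT-3 family was trained with): rhyming words get arbitrary token IDs that carry no trace of how the words sound.

    import tiktoken  # pip install tiktoken

    enc = tiktoken.get_encoding("gpt2")  # BPE vocabulary of the GPT-2/GPT-3 family

    for word in ["great", "straight", "dough", "through"]:
        ids = enc.encode(" " + word)  # leading space: the usual mid-sentence form
        print(word, "->", ids)

    # "great" and "straight" rhyme while "dough" and "through" do not rhyme
    # with them, but nothing in the token IDs reflects pronunciation; they are
    # just whichever character chunks happened to be merged during BPE training.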


Totally disagree. The encoding scheme could give the model an inductive bias which would make rhyming easier. But the lack of such an inductive bias in no way prevents it from understanding rhymes. This model has 10^11 parameters! The embeddings for the encoding are maybe 0.1% of that - probably less I’d guess without looking it up. Converting to a different encoding would use a very small fraction of the model’s computational power.
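
(For scale, using publicly reported GPT-3 figures of roughly a 50K-token BPE vocabulary, 12,288-dimensional embeddings, and 175B total parameters:)

    vocab, d_model, total = 50_257, 12_288, 175e9
    embedding_params = vocab * d_model        # about 6.2e8 parameters
    print(f"{embedding_params / total:.2%}")  # roughly 0.35% of the model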


I'm not talking about inductive bias. I am talking about the phonetic information being erased from the data before inductive biases or parameters ever have a chance to matter. What you are proposing is tantamount to expecting a model to be able to generate colorful Impressionist masterpieces while trained solely on monochrome bitmaps. Yeah, doesn't work like that.

Doesn't matter how many parameters it has; and as a matter of fact, the GPT models have not improved their rhyming noticeably even as they go from a hundred million parameters to over a hundred billion parameters.

(And I have extensive evidence consistent with my interpretation at the link, ranging from several years of failure by many people to prove me wrong by eliciting rhymes even a fraction as good as its non-rhyme poetry, to fixing GPT-3's performance on tasks by working around BPEs, to different models from different groups, trained with the same data encoding, showing the same utter failure to rhyme.)


I appreciate that you've analyzed GPT-3's inability to rhyme well. And maybe some of that comes from BPE. But saying that "GPT-3 _cannot_ rhyme" is a strong and unsupportable statement. What, really, would the difference be between memorizing a rhyming dictionary and being able to apply that vs "actually" learning to rhyme? Because GPT-3 can certainly do the former, so why can't it do the latter?

Now if you ran an experiment comparing MLM (or any LM) on rhyming tasks with different encodings, then you could certainly make a statement like "BPE is worse at rhyming than other encodings" and it would be scientifically supportable. And that very well might be true. But your extreme conclusion is not supportable.


What would the difference be? That's very obvious - just think about any kind of comic or light verse!

A rhyme dictionary would still not replicate human rhyming capabilities. Think about neologisms or misspellings. A model can memorize every single entry in the rhyming dictionary (and let's say this somehow cashes out as apparent rhyming proficiency, being able to recall, for every word, the valid rhymes the dictionary lists for it), but it would not be able to write something like "Jabberwocky", inventing a bunch of new words or phrases or names which rhyme. (How would it know to rhyme "wabe" and "outgrabe" when they appear in no dictionaries - because they were just invented?) A model which has "actually" learned to rhyme would be able to take new words (not necessarily invented by it, but possibly invented by humans after it was trained, or invented on the spot for a prompt, or part of a new fictional work like worldbuilding) and rhyme them appropriately. A model which has memorized a rhyming dictionary would not.


I'm genuinely confused why you take such an extreme position on this issue. You seem like you understand some things about how neural networks operate. So I'd assume you understand their ability to interpolate between examples to new situations they've never literally seen before - what is commonly referred to as "generalization" in ML, which is really the key concept in the entire field of ML. But for some reason you've decided this simply can't apply to rhyming for the world's most advanced language model. Your choice buddy.


Interesting! So what I am hearing from this is that some day we will overcome this and have programs that can rap. And then companies with customer support bots will have a little check box if you want the bot to rap all responses to you.


Technically, we know exactly what we have to do to GPT-3 to give it a chance to learn how to rhyme: get rid of BPEs and just use raw UTF-8 (or even just ASCII) as the encoding. Enlarge the model a bit, as necessary.

At least that's what I am getting from Gwern's write-up. I might have misunderstood Gwern, or Gwern might be wrong, of course.
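
A minimal sketch of what that character/byte-level alternative looks like (illustration only): every letter is visible to the model, so spelling regularities like shared endings are at least representable, at the cost of much longer sequences.

    text = "great straight dough through"

    # Byte-level "tokens": each character becomes its own input symbol,
    # so endings like "-aight" or "-ough" are present in the data.
    byte_tokens = list(text.encode("utf-8"))
    print(byte_tokens)

    # The price of dropping BPE: many more tokens per word.
    print(len(byte_tokens), "byte tokens for", len(text.split()), "words")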


Replacing BPEs not with characters but with a syllabary (is that a word? a vocabulary made of possible syllables) would be even more powerful, and you could also run preprocessing to convert text to a phonetic transcription, to enforce the concept that spelling is irrelevant but pronunciation matters.
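
As a rough sketch of that preprocessing step (using the third-party pronouncing package, which wraps the CMU Pronouncing Dictionary; purely an illustration, not anything these models actually do): replace each word with a phonetic transcription before tokenizing, so rhyme becomes visible in the training text.

    import pronouncing  # pip install pronouncing (CMU Pronouncing Dictionary)

    def to_phonemes(text):
        # Swap each word for its first dictionary pronunciation, if known.
        out = []
        for word in text.lower().split():
            phones = pronouncing.phones_for_word(word)
            out.append(phones[0] if phones else word)
        return " | ".join(out)

    print(to_phonemes("great straight through"))
    # e.g. "G R EY1 T | S T R EY1 T | TH R UW1"
    # The shared "EY1 T" ending that makes great/straight rhyme is now explicit.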


A syllabary or a phonetic encoding like IPA would patch the rhyme problem, but it would sabotage you in still other ways. People expect a language model to be able to solve anagrams or reverse strings, for example. And there are a lot of things in written language which are connected to the exact spelling but not necessarily the phonetics; there is no list of what those things are or what not learning them would do to a model, and you would have to discover (or more likely, not discover, in the way that people keep not discovering the BPE sabotage) the drawbacks the hard way. So you have a Bitter Lesson tradeoff here: yeah, you can build in that concept instead of learning from raw Unicode/bytes, and it will work initially better, but you are handicapping yourself in the long run. So, I always say go for raw Unicode for the foreseeable future.


That would make the system more complicated by baking in extra assumptions.

And eg it would probably perform worse at translating from English to French than a naive system. (Unless you preprocess your corpus extensively to figure out when you get English and when you get a snippet of French.) GPT-3 is surprisingly good at translating English to French.

Another problem is that English has not just one text-to-phonetic transcription: different accents pronounce words differently. For just one example:

> In most non-rhotic accents, if a word ending in written "r" is followed immediately by a word beginning with a vowel, the /r/ is pronounced, as in water ice. That phenomenon is referred to as "linking R". Many non-rhotic speakers also insert an epenthetic /r/ between vowels when the first vowel is one that can occur before syllable-final r (drawring for drawing). The so-called "intrusive R" has been stigmatized, but many speakers of Received Pronunciation (RP) now frequently "intrude" an epenthetic /r/ at word boundaries, especially if one or both vowels is schwa. For example, the idea of it becomes the idea-r-of it, Australia and New Zealand becomes Australia-r-and New Zealand, the formerly well-known India-r-Office and "Laura Norder" (Law and Order). The typical alternative used by RP speakers (and some rhotic speakers as well) is to insert a glottal stop wherever an intrusive R would otherwise have been placed.

https://en.wikipedia.org/wiki/Rhoticity_in_English

Btw, even without any explicit pronunciation data, I would expect the system to get a pretty good idea of how pronunciation works by essentially using the same technique human linguists use to reconstruct old pronunciations:

They observe what kinds of mistakes people make.

Eg when you see people mixing up "there" and "their" and "they're" in the corpus, that tells you that in modern English these three are pronounced almost the same.

From spelling mistakes in words like carburetor you can figure out that unstressed vowels in English are pretty much all pronounced the same: as a schwa. https://en.wikipedia.org/wiki/Schwa

You can also learn a lot from observing rhymes.


But you have to make those assumptions if you want to make a model for rhyming, because large quantities of raw prose do not contain any information whatsoever about which words rhyme, how they are pronounced or where the accents lie. And of course that would be worse for other tasks; that's kind of the whole point - discarding irrelevant information is the key part that distinguishes learning from memorizing. For most use cases you do want to discard the pronunciation and other nuances of representation to focus on the semantics, but for some (like this) you may want to discard some other parts in order to focus on how the language sounds. The 'no free lunch theorem' is conceptually relevant even if it doesn't literally apply here.

Your example of their/they're/there has some data because the whole words can be mistaken for each other, but even if you take a billion pages of prose, you won't get data to deduce that 'there' rhymes with 'pear' but not 'here', or that 'great' rhymes with 'straight' while 'cough', 'dough' and 'through' do not rhyme. A model can't learn something that's simply not represented in the training data.

So you either have to bring in external information (i.e. pronunciation models, or dictionary data with pronunciation guides, or audio recordings) or you have to have sufficient rhyming text inside your training data - i.e. train it on corpora of poetry instead of prose.

Also, I'm not sure if the variation of pronunciation is critical - any accent variations that affect every instance of some sound similarly would preserve rhyming. There are certain differences e.g. I recall a discussion about some parts of Shakespeare which rhyme perfectly in pronunciation of that period but do not in modern English, but I think that the variations of modern English accents should be mostly OK from that perspective.


> So you either have to bring in external information (i.e. pronunciation models, or dictionary data with pronunciation guides, or audio recordings) or you have to have sufficient rhyming text inside your training data - i.e. train it on corpora of poetry instead of prose.

Yes, the latter. Just mix some rhyming text into your corpus. If you take eg Wikipedia, there's already plenty of rhyming stuff in there. Nursery rhymes, song lyrics, etc. Similar for other corpora.


Would it be unreasonable to vaguely describe these huge neural nets as savants?


Yes, because savants still have motivations, while these neural nets are just mathematical functions.


I found the rap extremely entertaining


Not that it's easy to quantify, but do we know if the interpolation tends to happen at training time or inference time?


Essentially anything that happens at inference time also happens at training time.

Because training is largely running inference, and then correcting any mistakes. Over and over again.

(I say largely, because inference also does stuff like sample top-k completions etc.)
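
A minimal sketch of that loop, assuming a PyTorch-style setup (toy model, invented numbers): each training step runs the same forward pass you would run at inference time, then adjusts the weights toward the token that actually came next.

    import torch
    import torch.nn.functional as F

    # Toy "language model": one linear layer scoring a tiny vocabulary.
    vocab_size, dim = 100, 16
    model = torch.nn.Linear(dim, vocab_size)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    context = torch.randn(1, dim)  # stand-in for an encoded prompt
    target = torch.tensor([42])    # the token that actually followed it

    for _ in range(10):
        logits = model(context)                  # inference: score every next token
        loss = F.cross_entropy(logits, target)   # how wrong was the prediction?
        optimizer.zero_grad()
        loss.backward()                          # "correcting any mistakes"
        optimizer.step()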


Does anyone actually use Copilot for their work? I can't imagine it's anywhere near as reliable as OpenAI claims. I'd imagine a user would spend more time fixing mistakes or re-trying with different queries than they'd actually save.


Just wanted to have a less glowing counterpoint to the other claims. I've used Copilot a bit, and found that the automatic completion was frequently interrupting my train of thought, making it harder to concentrate ("intrusive thoughts as a service"). I preferred only triggering completions with a keystroke, so I could choose when to take a shortcut and ask to have code generated. I found it very helpful for generating boilerplate code, and debug logging I shouldn't have to think too hard about. Also, it sometimes gave me clever ideas better than what I would have thought of myself (like Rust code matching on a HashMap's Entry). Nonetheless I felt uneasy because I noticed myself getting too "lazy", not thinking about what code I wanted written before asking for help.

In the end, aside from boilerplate, I spend most of my time in Qt Creator (which doesn't have a Copilot plugin) rather than VS Code, so I mostly stopped using Copilot anyway.


I often have to turn it off. It's very useful when I know what needs to be typed but am too lazy. It's also sometimes amazingly useful at coding stuff from a comment, and like you say it can teach you new idioms. It would be a great way of learning a new language from examples.

When I'm problem solving or not sure how to code something though, its constant suggestions are just noise, and then I turn it off.


I use it constantly and hope I never have to code without it (or something better) again. It does a good job writing the kind of boring code I don't want to write, and it generally seems to include fewer errors than I write in my first drafts of code. More than once I ignored its suggestion and wrote my own version, only to later realize its version was more correct and efficient. It does a good (sometimes incredible) job of even handling pretty specialized subject matter, and of using the context of other code and comments you've written to suggest exactly what you need next.


> It does a good job writing the kind of boring code I don't want to write (...)

If only we had programming languages that didn't force us to write boring code in the first place ...


Maybe once you eliminate one level of "boring" code, that just means parts of the next-higher level of code become rote and boring. It reminds me a little of Richard Gabriel's reply to Guy Steele, when Steele said something like "Lisp doesn't need design patterns; it has macros." Gabriel said "That just moves the patterns up a level of abstraction."

(I probably remember that story all wrong. But I like it anyway!)


One of the problems with (classic) Lisp macros is that they aren't first class, i.e. you can't pass them around like you can with functions (or numbers etc).


I agree, I always use it now when I have to code in Python. I find the completions easy to ignore and easy to accept. For something like generating code with embedded SQL or SPARQL queries, I pause and test the queries independently.

Overall, a tremendous time saver.


I use it all the time, it's significantly improved my productivity.

It's a little like pair programming with an incredibly eager junior developer who has read a lot of the documentation of every popular API in the world. I need to review the code it produces, but it's very fast, and its suggestions are usually great.

It's annoying when I know exactly what I want to write, and most helpful when I'm unsure (either because I'm trying things out, or if I'm using a new API or a language I'm rusty at).


This seems like a Google problem. Google was really helpful when it had a tonne of organic content that it could systematically steal from, plagiarise, and rip off, to the point of devaluing the entire internet. The result with Google was the absolute centralization of that content and therefore a withering of the organic content. I wonder if we're going to see the same thing with these AI tools. It's fine to learn from 100,000 developers all writing code. But if you steal their IP, rip off their designs and plagiarise their work to build your tool, what you end up with is a tool that is basically just learning from itself. No one really writes basic code anymore, but as a result there's no source for the AI to learn from. In other words, it centralizes knowledge but doesn't advance knowledge, and in the process it devalues anyone else advancing knowledge.


I’ve been using it every day for a few months (for Typescript/React), and it still astounds me.

I can write a comment outlining what I want a function to do, and 90% of the time it will generate the code I need (or very close with a couple of small tweaks needed).

Coincidentally, my Stack Overflow visits have decreased by approximately 90%.


I was going to write exactly this and I'm glad you beat me to it (and that someone else is experiencing this benefit). It's obviously not a replacement for a human writing a big program (yet...), but it does a damn fine job of giving you a shell of a function based on a comment.

PS - it is also AMAZING with CSS and saves so much time on easy things that I just haven't memorized. "Style the LI so there are no bullet points" and boom...'list-style: none;'


Great to hear we’re having similar experiences.

It’s also quite impressive how Copilot will learn from any patterns you’ve typed on previous lines — so when I’m using a certain color name in a variable, it knows that I likely want to use it in subsequent code.

Or, more concretely, when using tokens for colors (e.g. blue.50 for light blue, blue.900 for dark blue) it can figure out that I probably want my background to be an x.50 colour and my text to be, say, x.700. So cool!


At first I thought copilot would be pretty useless until I actually tried it. It turns out that a lot of code is boilerplate and the same simple patterns, even with abstraction. Copilot is not particularly genius, but it fills in simple patterns (e.g. do the same for the Y-axis that you did for the X-axis), and autocompletes typical utility functions (e.g. add 2 2d positions, shuffle an array, setTimeout promise, etc. which I have to write functions for because they are not in the JavaScript standard library.) These may seem like odd scenarios but there are actually a lot of them.


Big supporter of Copilot. I am still amazed how good it is, and I feel it's getting better and better. So much boilerplate code gone. Also, it really gives you a boost in confidence when the A.I. writes the same code you're thinking. I feel so lucky to be seeing these amazing developments in A.I. and V.R. recently.


> boost in confidence when the A.I writes the same code you're thinking.

I would have expected the demonstration that your work is easily replaced by an AI to have the opposite effect. :)


Personally using Codex, not Copilot, but similar engine.

It's really good for boilerplate. Things like TDD tests where I'm just modifying a few parameters. You can get it to write functions like "parse this DateTime object into a format like Tuesday, 15 May 2020".
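
For the date example, what it spits out tends to look something like this (a Python sketch with an invented function name, not Codex's literal output):

    from datetime import datetime

    def format_long_date(dt: datetime) -> str:
        # e.g. datetime(2022, 3, 15) -> "Tuesday, 15 March 2022"
        formatted = dt.strftime("%A, %d %B %Y")
        return formatted.replace(" 0", " ")  # strip a leading zero from the day

    print(format_long_date(datetime(2022, 3, 15)))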

It's a useful lookup too. Often I just want to extract a variable from a List and would spend 15 mins looking up the docs or sifting through Stack Overflow. Codex is faster and more accurate.

With GPT-3 it's garbage in, garbage out. You have to invest a few days in learning the prompts that work.


What's your workflow for using Codex directly? Are you copying over to the playground or using a custom extension?


Yeah, just copying to a preset on the playground. Whenever I try to reach out to Stack Overflow or get stuck on documentation, I'd open up Codex.

It's possible to set up an API to it, but it's not yet at the point where it's worth doing. I think it's possible to get Codex to write the script, now that I think about it.


I use Copilot; it's much more useful than you'd expect. Really helpful for places where you would normally need to record a small macro; Copilot can infer the completions easily.


In my experience copilot is amazing.

It prompted the (joke) thought that perhaps it is making me less productive because of how often I end up sitting back and marveling at how amazing it is. I really can’t believe how good it is.


I use it with Rust, and my only gripe is that it only takes the current file as context.

It is dumb but very good at doing repeating things, or being a very smart auto-complete. For writing tests, it is actually quite decent.


I've tried Copilot, Tabnine, and a couple of the others out there.

I find them to be an annoying Clippy-like companion that interrupts my train of thought and introduces a whole separate class of programming bugs: "autocomplete errors" which are like copy & paste errors, but written by another programmer.

I like the auto-complete functionality built into JetBrains and Visual Studio for hinting at variable names, function parameters, classes, imports, and so forth. But the boilerplate code that "assistant AI programmers" provide is not worth the effort at this time and does not live up to the hype. I think they help less experienced developers out, but the kind of code I am writing won't usually be found in Copilot or Tabnine. I could see the appeal if all I was doing was churning out boilerplate CRUD app code all day long, which, honestly, I would outsource to Craigslist for thirty bucks an hour. Or just hand off to my assistant.


It really is surprisingly good. I use it.

It's quick to scan and ignore things that aren't right, and it's either completely right or close enough that it definitely feels like a timesaver.

The best parts are where it's doing something long-winded but fairly straightforward (e.g. assigning variables). But it has moments of shocking ability with more complex things.


Yeah. I thought the same thing at first, had it for ~2 months in early access and never turned it on.

Now I can't live without it.


I've used it briefly in someone else's IDE (who swears by it) and it blew me away. It pretty much removed ever having to google syntax or snippets from SO in a language I wasn't totally familiar with (python).


One thing that it’s really good at is writing boilerplate-y code. For example, web scrapers. It can even read the function’s name and deduce some proper variable names, or use variable names to deduce whether I want a list of elements or one element. Not 100% correct, of course, but good enough if you treat it like an advanced snippet manager.


For boilerplate code, Copilot gets the job done.

But you need to set the right expectations: it is not magic, and getting good results requires some tuning.


It’s too good. I found myself getting very lazy with coming up with solutions on my own; it felt like my problem-solving capabilities declined. IMO you should install it, and then disable it. Only use it sparingly, when time is very limited or to write tests/docs.


How polyglot is Copilot? I see plenty of positive results in the replies for (Java|Type)Script and Python, but how does it fare with a Lisp-like language or Prolog (for examples)?


I'm on track to surpass 1 million euros in salary working for 4 different companies thanks to Copilot so yes, this has been a huge boon for me.


I did a few nonscientific tests: https://twitter.com/minimaxir/status/1503822287903985664

Both the edit endpoint (used in those tweets) and the insert endpoint have mixed performance and tend to go off the rails often, especially compared to the new "Instruct" models, which do a much better job (although they cost money, while these new endpoints are in a free beta).

The coding endpoints are slightly better, but they cover a narrower domain.

In all cases I recommend looking at the docs for examples.
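
For anyone who wants to poke at the same endpoints, the calls look roughly like this with the (pre-1.0) openai Python client; treat it as a sketch, since model names and parameters may change while this is in beta:

    import openai  # pip install openai; reads OPENAI_API_KEY from the environment

    # Edit endpoint: existing text plus a natural-language instruction.
    edit = openai.Edit.create(
        model="text-davinci-edit-001",
        input="The quick brown fox jumps over the lazy dog.",
        instruction="Rewrite this sentence in the past tense.",
    )
    print(edit["choices"][0]["text"])

    # Insert: a completion call with a suffix, so the model fills the gap.
    # (Model name here is an assumption; use whichever completion model the
    # docs list as supporting the suffix parameter.)
    completion = openai.Completion.create(
        model="text-davinci-002",
        prompt="Once upon a time, ",
        suffix=" And they all lived happily ever after.",
        max_tokens=64,
    )
    print(completion["choices"][0]["text"])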


Really, really happy they're enabling the edits endpoint as a free-to-use beta for now. In my experience, GPT-3 works really well but is pretty prohibitively expensive unless you're optimizing for profitable tasks. Offering even limited-time free use (and unlimited tokens) is a nice nudge toward the "open" in OpenAI.

Also, the edit/insert endpoints specifically should hypothetically help a lot with plot divergence, which has been a huge problem trying to generate long-form works with standard completions, even with top-down outline expansion strategies, scene transition metadata, etc.

Excited to see what the millions of "AI word processors" that've sprung up over the past year actually do with it, besides the obvious.


If you are already using Copilot, do you get these new features automatically?


As usual, they give us a press release and two or three examples, but no real data or systematic analysis about its consistency, which is the real problem with any of these models.

I'm excited about the potential of things like this, I just don't like this wishy-washy way of presenting a technical tool.


"OpenAI" except everything is closed research. Why did they even pick this name?


Their initial intention was possibly genuine, but there are strong incentives for secrecy (trade secrets), in addition to the U.S. military most likely having an interest in suppressing information to avoid someone else creating AI first.


    the U.S. military most likely having an interest in suppressing information to avoid someone else creating AI first.
Someone else creating AI first? You make it sound like they are trying to create a general AI Skynet...

    there are strong incentives for secrecy (trade secrets)
They do not have the right to call themselves "open" if they are focusing on secrecy. They are trying to paint themselves in a positive open-source light when really they are no different than any other corporate overlord hiding in the shadows. Open implies developed in the open, free for use + modification, and/or key details NOT being hidden from the public. Simply sharing details of things created or research done is no different than what researchers at any other ABC corporation/organization do.


I used it today for some JavaScript DOM manipulation (in which I'm a bit rusty). It definitely saved me time. But it wasn't a panacea. If I didn't have a good eye and some background understanding, then I wouldn't have got working code.

But it definitely saved me time. It won't write your software for you. It's more like autocomplete, but much, much better.


I find the autocompletion for "improve the runtime complexity of the [Fibonacci] function" at the top excruciating. Surely Codex has seen verbatim the two-argument form many times?
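
(For reference, the "two-argument form" presumably meant here is the usual accumulator rewrite, something like this in Python, which turns the exponential naive recursion into a linear-time one:)

    def fib(n, a=0, b=1):
        # Carry the last two Fibonacci numbers along instead of recomputing them.
        return a if n == 0 else fib(n - 1, b, a + b)

    print([fib(i) for i in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]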


The OpenAI-powered Copilot is just absolutely amazing. I would say that it increases my productivity by 5x because it eliminates a lot of going to Stack Overflow to find the answer.


We evaluated the suffix capabilities (insert) at CopyAI and determined it's not ready for us to use in production. It's a great idea and I think they'll improve it with time. But it's not up to the quality bar of Davinci completions.


I work at OpenAI. Since Insert is still in beta, we really appreciate the feedback! We've found that it's easier to get Insert to work in production settings for code than for natural language. For natural language, which I think you are more interested in, we documented some best practices that can help users succeed here: https://beta.openai.com/docs/guides/completion/inserting-tex... I'd especially recommend trying our new and improved best_of feature.


Thanks a bunch! Absolutely not trying to downplay the massive undertaking that went into the suffix parameter. And it should have gone without saying that expectations should be tempered given it's still a beta feature.


I've been playing with the edit beta mode in the playground using the alpha text-davinci-edit-001 model but it does not appear to be free as the article states: all the calls made to the edit endpoints were charged to my account.

As larger requests (ex: spellcheck or translate 1000 words) can cost 20c+ each, and the model often needs to be run 2 or 3 times before giving an acceptable answer on more complex tasks (ex: translation), that makes for a very expensive tool if you want to edit a significant amount of text.
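
(Back-of-the-envelope on that figure, assuming Davinci's list price at the time of $0.06 per 1K tokens, roughly 1.3 tokens per English word, and that you pay for both the input and the regenerated output; all three are assumptions:)

    words = 1000
    tokens_in = words * 1.3    # rough tokens-per-word ratio for English prose
    tokens_out = tokens_in     # the edit returns the whole revised text
    cost = (tokens_in + tokens_out) / 1000 * 0.06
    print(f"${cost:.2f} per call")  # about $0.16; the 20c+ ballpark above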

I wonder if the free usage only applies to the Codex engine and not to GPT-3.


I've tried using the new Codex engine (code-davinci-002) and found that it costs money. OpenAI's playground says Codex is free, but that doesn't seem to be the case right now. I'm not sure if this is a bug or intentional.


I work at OpenAI. This was unfortunately a bug, and we've fixed it since.


I’m on the Copilot waitlist, super excited to give it a go. Cheeky question: does anyone know of a way to get it sooner?


Try emailing the GitHub CEO. I did, regarding the Codespaces waitlist, and he sent me three riddles to answer correctly before I could get access.

(I realise this sounds like I’m making it up, but I promise this is a real story. It was quite fun.)


I'm now curious to know whether Copilot could correctly answer those riddles


Brilliant thought. I’ll have to dig the emails out and test it.


So...what is your favourite color?


Is there a way I as a random civilian can play around with GPT-3 online?



Still not free tho, I tried to get an autocomplete for a paraphrase and I got this: "You've reached your usage limit."


maybe fix the racism in your current ai before adding more stuff to it?



