5. ● AI research + deployment company
● Create AGI
● Promote, develop “friendly” AGI
○ "benefit all of humanity"
● ChatGPT, DALL·E 2
7. DALL·E
● OpenAI's deep learning model to make images from text
● Trained on text-image pairs
● 12 billion parameter version of GPT-3 (neural network to generate text)
19. 3 properties in image generation API call
1. prompt: required. Text description of the desired generated image.
2. n: optional. # (1-10) of images to generate. 1=default.
3. size: optional. Size of generated images. Must be 256x256, 512x512, or 1024x1024.
Who here has used ChatGPT?
What are some prompts you've asked it?
Has anyone used DALL·E or heard of DALL·E? Same…
OpenAI is an AI research and deployment company. They differ from other AI companies in that they aim to promote and develop "friendly" AGI. What does that mean?
AGI is artificial general intelligence, or AI that matches human intellect.
Friendly AI would have a positive effect on humanity, or at least align with human interests or help foster the improvement of the human species. It is part of the ethics of artificial intelligence and is closely related to machine ethics.
These are pretty lofty goals!
You may know them as the creators of ChatGPT. Last year, their products ChatGPT and DALL·E 2 managed to take AI mainstream, and now the company is looking to make a few new friends, with Microsoft reportedly planning to pump $10 billion into the business.
Here, I provide the prompt "impressionist version of Pikachu playing tennis" in the browser: https://labs.openai.com/e/Adn2NE87pFTPm9oGHpmkZBuf
DALL·E is a deep learning model developed and trained by OpenAI to create digital images from text captions, called "prompts".
It was trained on a dataset of text-image pairs and is a 12-billion parameter version of GPT-3. GPT-3, or the third-generation Generative Pre-trained Transformer, is a neural network machine learning model trained using internet data to generate any type of text.
Prompt on left: stained-glass window of toucans in space
As an evangelist who makes a lot of demo apps, I have a thing for APIs: they let developers add features to their apps instead of building everything out completely by themselves! They're like Legos with code!
So what can you do with the OpenAI API?
You can build apps that use OpenAI's powerful models. You can access GPT-3, which performs a variety of natural language tasks; Codex, which translates natural language to code; and DALL·E, which creates and edits original images. This is one API call to the OpenAI API, using the davinci language model to make a list of astronomical observatories.
GitHub Copilot is an AI pair programmer that helps you write code faster with less work. Using Codex, GitHub Copilot applies the context in your editor and synthesizes whole lines and even entire functions of code.
Duolingo uses GPT-3 to correct French grammar. An internal Duolingo study shows that use of this feature leads to measurably better second language writing skills!
Get an API Key here in the console at platform.openai.com
People say it's pricey to use, but my bill was not that bad last month??? And I used it quite a bit for some tutorials and I shared the phone #s out to encourage usage, so I wonder if I'm missing something. Anyhoo…
Developers can now integrate DALL·E directly into their apps and products through the OpenAI API.
This curl request to the API (using an OpenAI API key, unique to you!) sets the size and n, the # of images to generate. It will generate this image given the prompt "a photo of a happy corgi puppy sitting and facing forward, studio light, longshot":
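If you'd rather see that request in JavaScript than curl, here is a rough sketch. It is not the command from the slide: buildImageRequest is a hypothetical helper name, and it assumes the https://api.openai.com/v1/images/generations endpoint with a Bearer token. It only builds the request so you can look at it before sending it with fetch.

```javascript
// Hypothetical helper: builds (but does not send) an image generation request.
// Assumes the OpenAI images endpoint and Bearer-token authentication.
function buildImageRequest(apiKey, prompt, n = 1, size = "1024x1024") {
  return {
    url: "https://api.openai.com/v1/images/generations",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      // The three properties of the call: prompt, n, and size.
      body: JSON.stringify({ prompt, n, size }),
    },
  };
}

const request = buildImageRequest(
  "YOUR_OPENAI_API_KEY", // placeholder, swap in your own key
  "a photo of a happy corgi puppy sitting and facing forward, studio light, longshot"
);

// To actually send it (Node 18+ or a browser):
// const res = await fetch(request.url, request.options);
// const json = await res.json(); // json.data[0].url is assumed to hold the image URL
```

Keeping the builder separate from the network call makes it easy to log or tweak the payload before spending any API credit.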
So we did that with a curl request to the OpenAI API, and previously we saw images generated from a text prompt in the browser.
Hmm. What's another medium we could do this from?
How about we do that via text message?
The Serverless Toolkit is CLI tooling that helps you develop Twilio Functions locally and deploy them to Twilio Serverless Functions, because why spin up and host your own server when Twilio will do it for you? The best way to work with the Serverless Toolkit is through the Twilio CLI. Running these commands on the command line installs both the Twilio CLI and the Serverless Toolkit:
Afterwards, we create a new project called dalle-img-generation-sms and install our lone requirement openai:
Open up your .env file in your Functions project root directory and add this line so our JS code can access the OpenAI API key while keeping it a secret. I have a friend Kevin who I interned with in 2016…
Anyway, now in our JavaScript Twilio Function code, we can access that API key with the context.openai_api_key variable.
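One thing that can save a confusing debugging session: check that the key is really there before calling OpenAI. This is just a sketch. getApiKey is a hypothetical helper, and the default name openai_api_key is taken from the variable mentioned above, so it has to match whatever name your .env line uses.

```javascript
// Hypothetical helper: read a secret from the Function context and fail fast
// with a readable message if the .env entry is missing or empty.
function getApiKey(context, name = "openai_api_key") {
  const key = context[name];
  if (typeof key !== "string" || key.length === 0) {
    throw new Error(`Missing ${name}: add it to the .env file in the project root`);
  }
  return key;
}
```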
In the /functions directory, we make a new file called dalle.js containing this code, which defines an async function that handles incoming text messages.
The function:
1. Creates a Configuration object (from the openai package), passing in an object containing the apiKey property.
2. Creates an OpenAIApi object, passing in the configuration.
3. Retrieves the body of the incoming message and stores it in the inbMsg variable.
4. Calls the openai.createImage function to have DALL·E generate an image based on the inbMsg prompt. The response gives us the image URL we need to send the image using Twilio.
5. Creates a Twilio Messaging Response object, which responds with some text and media pointing to the image generated by DALL·E 2. This TwiML is passed into the callback function, which writes it to the HTTP response.
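Here is roughly how that flow fits together. It is a sketch and not the exact code from the slide: makeHandler and createClient are hypothetical names, the dependencies are injected so the logic runs without the openai or twilio packages installed, the reply text is a placeholder, and the response shape (response.data.data[0].url) is an assumption based on the v3-style openai package.

```javascript
// Sketch only: the OpenAI client factory and the TwiML class are injected so
// the flow can be read (and run) without the openai or twilio packages.
function makeHandler({ createClient, MessagingResponse }) {
  return async function handler(context, event, callback) {
    try {
      // Build the client from the API key stored in .env.
      const openai = createClient(context.openai_api_key);
      // The text of the incoming SMS becomes the prompt.
      const inbMsg = event.Body;
      // Ask for one image; assumed response shape: { data: { data: [{ url }] } }.
      const response = await openai.createImage({ prompt: inbMsg, n: 1, size: "1024x1024" });
      const imageUrl = response.data.data[0].url;
      // Reply with some text plus the generated image as media.
      const twiml = new MessagingResponse();
      twiml.message("Here is your generated image.").media(imageUrl);
      callback(null, twiml);
    } catch (err) {
      // Hand failures to the runtime instead of leaving the request hanging.
      callback(err);
    }
  };
}

// Assumed wiring inside functions/dalle.js:
// const { Configuration, OpenAIApi } = require("openai");
// exports.handler = makeHandler({
//   createClient: (apiKey) => new OpenAIApi(new Configuration({ apiKey })),
//   MessagingResponse: Twilio.twiml.MessagingResponse,
// });
```

Injecting the client is a design choice for the sketch: it lets you exercise the handler with fakes locally before deploying.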
In order to specify your image generation, you need to pass in a configuration object containing three properties: prompt, n, and size. prompt is the only required property, representing a text description of the desired generated image. n is an optional property that defaults to one. It's the number of images to generate and must be between one and ten. Lastly, the optional size property is the size of the generated images and must be either 256x256, 512x512, or 1024x1024.
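Those three rules are easy to check before you make the call. A minimal sketch, assuming you want to validate locally first: validateImageOptions and ALLOWED_SIZES are hypothetical names, and the 1024x1024 default for size is my assumption, since only the default for n is stated above.

```javascript
// Hypothetical helper mirroring the three properties: prompt, n, and size.
const ALLOWED_SIZES = ["256x256", "512x512", "1024x1024"];

function validateImageOptions({ prompt, n = 1, size = "1024x1024" } = {}) {
  // prompt is the only required property.
  if (typeof prompt !== "string" || prompt.trim() === "") {
    throw new Error("prompt is required");
  }
  // n defaults to 1 and must be a whole number from 1 to 10.
  if (!Number.isInteger(n) || n < 1 || n > 10) {
    throw new Error("n must be an integer between 1 and 10");
  }
  // size must be one of the three supported square sizes.
  if (!ALLOWED_SIZES.includes(size)) {
    throw new Error(`size must be one of ${ALLOWED_SIZES.join(", ")}`);
  }
  return { prompt, n, size };
}
```

For example, validateImageOptions({ prompt: "toucans in space" }) fills in n and size, while passing n: 11 or size: "300x300" throws before any request is sent.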
To deploy your app to Twilio, run twilio serverless:deploy from the root directory. You should see the URL of your Function at the bottom of your terminal:
Using the Twilio CLI, you can update the phone number using the Phone Number SID of your Twilio phone number. You can see it in the Twilio Console under Properties, and it begins with "PN". (click) (click)
If you don't wish to configure your Twilio phone number using the Twilio CLI, you can grab the Function URL corresponding to your app (the one that ends with /dalle) and configure a Twilio phone number with it. You'd select the Twilio number you just purchased in your Twilio Phone Numbers console and scroll down to the Messaging section as shown here.
Paste the link in the text field for A MESSAGE COMES IN webhook making sure that it's set to HTTP POST. When you click Save, it should look like this!
The Service is the Serverless project name, Environment provides no other options, and Function Path is the file name. Now take out your phone and text a question or prompt to your Twilio number.