A First (And Very, Very Surface-Level) Look At The ChatGPT API

Buy tokens -> Use API -> ? -> Profit

Evan SooHoo
7 min readApr 4, 2023
Written on ShlinkedIn, a satirical, open source website parody of Linkedin.

I had a chance to use the ChatGPT API and wanted to document what I have learned so far. What I have is disappointingly basic, but in the process I found some useful tutorials I will link, hopefully laying the groundwork for more impressive projects.

What this post will include:

  • A brief explanation of how the ChatGPT API and Whisper API work, and how I used a little Python to make an API call
  • Resources compiled by other people, one of whom wrote a 30-line Python script that acted as a virtual voice assistant, giving him the ability to talk effectively talk to ChatGPT as if it were Siri

What this post will not be:

  • A comprehensive explanation of how the ChatGPT API works
  • A tutorial

How It Works, In Brief

This is the official OpenAI blog post that outlines the release of the ChatGPT and Whisper APIs. From the source:

The ChatGPT model family we are releasing today, gpt-3.5-turbo, is the same model used in the ChatGPT product. It is priced at $0.002 per 1k tokens, which is 10x cheaper than our existing GPT-3.5 models. It’s also our best model for many non-chat use cases—we’ve seen early testers migrate from text-davinci-003 to gpt-3.5-turbo with only a small amount of adjustment needed to their prompts.

Now I will jump to two YouTube personalities who released tutorials. The first, by Fireship, was a four-minute meme video of him creating a voice assistant called FlatGPT. Since he does not link any source code and treats the video a bit more like an overview than a tutorial, I will summarize: The ChatGPT API does not allow fine-tuning (I am saving the link, though; fine-tuning with an older API sounds like a great way to better understand the actual machine learning aspect), but you can make requests and it will respond to you as if in a chat. It is not unlike the Try ChatGPT button you have likely already tried, which gave users the ability to talk to a clever chatbot that could converse, summarize papers, write poetry and write code for free. The official ChatGPT API documentation is here.

Whisper, on the other hand, is actually completely open source and can be used right now for free. It translates speech to text.

If you have an OpenAI account, you may have noticed free tokens; I did not.

Part Time Larry’s Work

A YouTuber called “Part Time Larry” made a video of himself coding a voice assistant. The 30-minute video is here, but I found it more helpful to go directly to the source code (which I am about to include)

This is the source code.

Source: https://github.com/hackingthemarkets/chatgpt-api-whisper-api-voice-assistant/blob/main/therapist.py

GRadio is a free tool some people in machine learning use to quickly set up a UI. PartTimeLarry, for a number of reasons, is using a value called OPENAI_API_KEY — this is what virtually everyone is doing because it is more reasonable than simply exposing his API key. He uses the Whisper API to transform his audio into text (I am actually not clear on why they bundled Whisper with the rest of the API), then he writes code similar to the Python examples in the OpenAI documentation to directly call the ChatGPT API. If you run this yourself, it is accessible at localhost:7860/ .

…I didn’t. It didn’t quite work for me. I had the issue documented here, but unlike this user I did not resolve the problem by adding three lines of Python to redefine my file type.

My Tests

Here was my little seven-line test. I can at least use the API.

There. Set role as system, send a basic message, get a response. How has my day been? Wow, you know what? No one has asked me that today. I feel such a connection with you, ChatGPT API.

My experience with Whisper was WAY more interesting. I just had a four-line Python script

import whisper

model = whisper.load_model(“base”)
result = model.transcribe(“Recording.m4a”)
print(result[“text”])

First I tried the JVNA song “Crazy” because of its licensing, though at this point in time I see no reason I would need to release my own copy of it. It is essentially the musical equivalent of open source.

Is the song amazing? Yes. Is the song about heartbreak? Yes. Does Whisper adequately capture the feeling of realizing the one person who always cared about you is gone, yet you can still happily reminisce about the moments you shared and feel grateful that you two had those moments at all? No, not really. Interestingly enough, it failed on music…but perhaps that is not shocking considering what it was designed for and what it was trained on.

Although in the studios, otherwise I wonder how long time I have to leave the city inside I was so desperate But when there’s not a hub, I didn’t know chance could see the line But doodown, I felt for you A, were regret for, an ill- Control Person For one, who loved me For who, I can’t go back to you Walking through all the dust Emories and frame A past we can simply let pain A what I don’t hope adore I need no chance For one, who’ll be back again You make me feel so crazy Still in love with you You make me feel amazing When I next to you You make me feel so crazy My heart breaks for you Can’t help but know I’m never You in love with you You in love with me Can’t go back to you You in love with me You in love with me You in love with me You in love with me You in love with me But when there’s no hub, I love it For you, who saw me Before I knew who I’d be What I have around me Nothing to me If I can’t share it with you Walking through all the dust Emories and frame A past we can simply let pain A what I don’t hope adore I need no chance For one, who’ll be back again You make me feel so crazy Still in love with you You make me feel amazing When I next to you You make me feel so crazy My heart breaks for you Can’t help but know I’m never You in love with you You in love with me W Can’t go back to you You make me feel so crazy Can’t go back to you You
— Whisper failing to translate JVNA

I decided to record my own voice. The result was extremely impressive.

This is just me recording a test audio file for the Whisper API. I tried doing it with a song and the result was not very impressive. I don’t know if that proves the technology isn’t there yet, or if maybe that’s just not what it was designed for. And maybe it should have been trained on songs if that was its intended use case. This is just going to be me reading a blog post. ChatGPT has been getting blogged about to death, and I would suspect that even ChatGPT is getting tired of this. Had it not been for the fact that ChatGPT is not sent in as some bloggers seem to think. It can write impressive poetry without rhythm, by the way. It can write in the style of Jordan Peterson, probably about lobsters, in a way that even freaked out Jordan Peterson. It can flood medium, and it can flood YouTube via text to speech. It can write code, and I notice that it’s sometimes even more useful than Google for some rudimentary coding interview prompts. But it can’t ramble like I can, at least not yet. It can’t scroll sentences with my uniquely misused words, and it can’t decorate diction with my grammatical imperfections. It can fade emotion, but it can’t feel. And in spite of so much progress, it can’t reliably produce poetry and I am a contaminor. I don’t really want to read more of this blog post. I don’t think it’s translating that well. But I do still need to read something in order to get reliable audio and to see if this forms coherent sentence. It has to be my work or something open so that I’m not taking credit for someone else. Let’s see. If you’re familiar with John Wick, you know that the entire movie series is a veiled metaphor for the importance of business. What the heck? Okay, this has been two minutes. I think I have enough data. Thank you.
— Whisper translating my voice successfully. The only errors are the bolded parts…one of them was supposed to be iambic pentameter, by the way…

Closing Thoughts

Fine-tuning would be the next logical place to go. That, or some actual, demoable code.

I did not know what the put for my conclusion, so here is a video about a corgi who loves showers…recommended to me via an algorithm. One day, what if it’s all not real? An AI-generated corgi who never existed. A user producing sensible conversation out of seemingly nothing. A million bots clapping along.

Until then, the corgi is real and all is well. Somewhere out there is this corgi, a real corgi, happily reacting to the existence of showers.

--

--

Evan SooHoo
Evan SooHoo

Written by Evan SooHoo

I never use paywalls (anymore) because I would get stuck behind them.

No responses yet