AI/TLDR

How to Build Your First AI Chatbot

Go from zero to a deployed, multi-turn AI chatbot — API key setup, conversation memory, Python and TypeScript code, and one-click deployment.

Beginner · 12 min read · Updated 2026-06-12

In plain English

A chatbot is a program that swaps messages with a user and — when you wire it to a large language model — can answer questions, explain code, roleplay a character, or do almost anything you describe in a system prompt. Building one is the single best way to go from "I've heard of ChatGPT" to "I understand how this actually works."

Here's the honest surprise: a minimal AI chatbot is about 20 lines of code. The model itself is already trained — you're not building that. You're writing a thin wrapper that (1) collects the user's message, (2) sends the full conversation history to a model's API, and (3) prints the reply. That loop, repeated, is a chatbot. Everything fancy — streaming, a web UI, login, deployment — is layered on top of that same loop.

This article is a step-by-step build guide. By the end you'll have a working multi-turn chatbot in either Python or TypeScript, an understanding of why you send the whole history each turn, and a live URL you can share. If you're still deciding what to build, see AI project ideas for beginners first.

Why building one yourself matters

You can use ChatGPT without understanding anything about how it works. But the moment you build your own chatbot, several things click at once: you see that the model has no memory between API calls (your code is the memory), you feel what a system prompt actually does, and you understand why longer conversations cost more. These are not obvious from the outside.

More practically, a chatbot is the seed of almost every AI product. Customer support bots, coding assistants, tutors, document Q&A tools — they're all variations on the same core loop. Once you can write that loop, extending it to any of those use cases is a matter of swapping the system prompt and optionally adding retrieval. The chatbot is not a toy; it's the foundation.

What you'll understand after building it | Why it matters later
The model is stateless: your code holds the memory | Explains cost growth, token limits, history trimming
System prompts set personality and rules | The lever you pull to customize any LLM product
The API returns the same shape every time | Makes error handling and streaming predictable
Tokens, not characters, are the billing unit | Grounds your cost intuition for every project after this
Deployment is just hosting a script or a web server | Demystifies "going live" for all future projects

How the chat loop works

The LLM API is stateless: every request is independent, and the model has no built-in memory of previous messages. Multi-turn conversation works because your code assembles the full history into each request. The server reads the whole thread, generates the next reply, and forgets everything. Your app is the one keeping score.

The key insight is that you send every prior message with each new request. That's the only way the model knows what was said before. The downside is that each turn costs more than the last: the input grows roughly linearly with the number of turns, so the final turn of a 10-turn conversation sends on the order of 10x more tokens than the first. For a personal chatbot this is fine. For a high-volume product you'd trim old messages once the history grows past a threshold.
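To make that growth concrete, here is a small back-of-the-envelope sketch. The token sizes are invented for illustration (real messages vary a lot), and the helper name input_tokens_on_turn is hypothetical.

```python
# Assumed, illustrative sizes; real messages vary widely in length.
SYSTEM_TOKENS = 20        # hypothetical system prompt size
TOKENS_PER_MESSAGE = 50   # hypothetical size of each user or assistant message


def input_tokens_on_turn(turn: int) -> int:
    """Input tokens sent on a given turn (1-based) when the full history is resent."""
    # Before turn k the history holds k-1 user messages and k-1 assistant
    # replies; the new user message makes 2k - 1 messages in total.
    messages_sent = 2 * turn - 1
    return SYSTEM_TOKENS + messages_sent * TOKENS_PER_MESSAGE


per_turn = [input_tokens_on_turn(t) for t in range(1, 11)]
total = sum(per_turn)
print(per_turn[0], per_turn[-1], total)
```

Under these assumptions the first turn sends 70 input tokens and the tenth sends 970, and the ten requests together send 5,200. The exact ratio depends on how long the system prompt and the messages are, which is why the figure in the text is only a rough guide.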

Each message in the history has a role (system, user, or assistant) and a content string. The system message is sent first and sets the rules — think of it as a persistent instruction sheet only the model can see. user messages are the human's turns; assistant messages are the model's prior replies. The model uses all of these to decide what to say next.

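Suppose a request holds four messages: a system message, a user message, the assistant's earlier reply, and a new user message. The model reads all four top to bottom and writes the next assistant reply. You then append that reply to your history list and the loop continues.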
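As a concrete illustration, a four-message request body might look like the list below. The wording of every message is made up, and the reply string stands in for what the API would return.

```python
# Illustrative conversation; the wording of each message is made up.
messages = [
    {"role": "system", "content": "You are a concise, friendly assistant."},
    {"role": "user", "content": "What is a token?"},
    {"role": "assistant", "content": "A token is a small chunk of text the model reads."},
    {"role": "user", "content": "Roughly how long is one?"},
]

# Pretend this string came back from the API as the next assistant reply.
reply = "Often a short word or a piece of a longer word."

# Appending it means the next request will carry five messages.
messages.append({"role": "assistant", "content": reply})

roles = [m["role"] for m in messages]
print(roles)
```

The chatbot code later in this guide keeps the system message in a separate constant and prepends it on each request, which amounts to the same list.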

Step-by-step: get an API key and build the chatbot

Step 1 — Pick a model and get an API key

You need an account with a model provider. Two popular choices for beginners are OpenAI (the GPT-4o family) and Anthropic (Claude). Both offer pay-as-you-go pricing with no upfront commitment; whether a new account gets free trial credits varies, so check each provider's current terms. The code patterns are nearly identical: the SDK import, the model name, and a few parameter names differ.

  • OpenAI: Sign up at platform.openai.com, go to API Keys, create a new key. Model to use: gpt-4o-mini (cheapest capable model).
  • Anthropic: Sign up at console.anthropic.com, go to API Keys, create a new key. Model to use: claude-haiku-4-5 (fast and cheap).
  • Either way: copy the key into an environment variable called OPENAI_API_KEY or ANTHROPIC_API_KEY. Never paste it directly into source code.
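One way to catch a missing key early is to check the environment variable before creating a client. This is a minimal sketch; require_api_key is a hypothetical helper, not part of either SDK.

```python
import os


def require_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in the given environment variable, or fail loudly."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Export it in your shell or add it to a .env file."
        )
    return key
```

Calling require_api_key() at startup turns a confusing authentication error on the first request into a clear message. Pass "ANTHROPIC_API_KEY" instead if you chose Anthropic.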

Step 2 — Install dependencies

Python (using OpenAI SDK), bash:
python3 -m venv .venv && source .venv/bin/activate
pip install openai python-dotenv
export OPENAI_API_KEY="sk-..."   # or put it in .env

TypeScript / Node.js (using OpenAI SDK), bash:
npm init -y
npm install openai dotenv
# For TypeScript:
npm install -D typescript tsx @types/node
npx tsc --init

Step 3 — Write the multi-turn chat loop

Here's the complete chatbot in Python. It runs in your terminal — type a message, press Enter, get a reply. The history list is the memory: every past turn lives in it and travels with each request.

chatbot.py (Python, OpenAI), python:
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()  # reads .env file
client = OpenAI()  # uses OPENAI_API_KEY from environment

SYSTEM_PROMPT = "You are a concise, friendly assistant. Reply in 3 sentences or fewer."

# The history list IS the memory — every past turn lives here.
history = []

print("Chatbot ready. Type your message (Ctrl+C to quit).\n")
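while True:
    try:
        user_input = input("You: ").strip()
    except (KeyboardInterrupt, EOFError):
        # Ctrl+C or Ctrl+D: exit cleanly instead of printing a traceback.
        print("\nGoodbye.")
        break
    if not user_input:
        continue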

    history.append({"role": "user", "content": user_input})

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            *history,  # send the WHOLE history every turn
        ],
        max_tokens=300,
    )

    # content can be None in some responses; fall back to an empty string.
    reply = response.choices[0].message.content or ""
    print(f"Bot: {reply}\n")
    history.append({"role": "assistant", "content": reply})

And here's the same chatbot in TypeScript. Save it as chatbot.ts and run with npx tsx chatbot.ts.

chatbot.ts (TypeScript, OpenAI), typescript:
import OpenAI from "openai";
import * as readline from "readline";
import "dotenv/config";

const client = new OpenAI(); // uses OPENAI_API_KEY from environment

const SYSTEM_PROMPT = "You are a concise, friendly assistant. Reply in 3 sentences or fewer.";

type Message = { role: "user" | "assistant"; content: string };
const history: Message[] = [];

const rl = readline.createInterface({ input: process.stdin, output: process.stdout });

const ask = () =>
  rl.question("You: ", async (input) => {
    const userInput = input.trim();
    if (!userInput) { ask(); return; }

    history.push({ role: "user", content: userInput });

    const response = await client.chat.completions.create({
      model: "gpt-4o-mini",
      messages: [
        { role: "system", content: SYSTEM_PROMPT },
        ...history, // send the WHOLE history every turn
      ],
      max_tokens: 300,
    });

    const reply = response.choices[0].message.content ?? "";
    console.log(`Bot: ${reply}\n`);
    history.push({ role: "assistant", content: reply });
    ask();
  });

console.log("Chatbot ready. Type your message (Ctrl+C to quit).\n");
ask();

Step 4 — Add a web UI (optional)

A terminal chatbot is useful, but a browser-based one is shareable. The fastest path is Streamlit for Python or Next.js with the Vercel AI SDK for TypeScript. Streamlit wraps your exact chat logic in a chat box with st.chat_message and st.chat_input — no HTML or CSS required. For TypeScript, the Vercel AI SDK's useChat hook gives you a streaming chat UI in under 50 lines of React.

app.py (Streamlit web UI, about 25 lines), python:
import streamlit as st
from openai import OpenAI

client = OpenAI()

if "history" not in st.session_state:
    st.session_state.history = []

st.title("My AI Chatbot")

for msg in st.session_state.history:
    with st.chat_message(msg["role"]):
        st.write(msg["content"])

if prompt := st.chat_input("Your message..."):
    st.session_state.history.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.write(prompt)

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system", "content": "You are a helpful assistant."},
                  *st.session_state.history],
        max_tokens=300,
    )
    reply = response.choices[0].message.content
    st.session_state.history.append({"role": "assistant", "content": reply})
    with st.chat_message("assistant"):
        st.write(reply)

Run the Streamlit app, bash:
pip install streamlit
streamlit run app.py

Deploy your chatbot (free)

Once your chatbot runs locally, deploying it takes about 10 minutes. The two most beginner-friendly platforms are Render (great for Python) and Vercel (great for TypeScript/Next.js). Both have free tiers, both connect directly to a GitHub repo, and both let you set environment variables through their dashboard — so your API key never touches your codebase.

Deploying a Python chatbot to Render

  1. Push your project to a GitHub repo (include a requirements.txt that lists openai and streamlit, one per line).
  2. Go to render.com, click New → Web Service, connect your GitHub repo.
  3. Set Build Command to pip install -r requirements.txt and Start Command to streamlit run app.py --server.port $PORT.
  4. Add OPENAI_API_KEY as an environment variable in the Render dashboard.
  5. Click Deploy — Render builds and starts your app. You get a public URL like https://your-chatbot.onrender.com.

Deploying a TypeScript chatbot to Vercel

  1. Use the Vercel AI SDK template: npx create-next-app@latest --example ai-chatbot my-chatbot.
  2. Push to GitHub, then go to vercel.com and click New Project → Import Git Repository.
  3. In Environment Variables, add OPENAI_API_KEY.
  4. Click Deploy — Vercel builds your Next.js app and gives you a public URL in under 2 minutes.

Going deeper

A working chatbot is a foundation, not a finish line. Here are the most common next steps, in roughly the order people hit them.

Streaming responses

By default the API waits until the full reply is ready before sending it — for a long answer that can feel like a frozen screen. Streaming sends tokens as they're generated, so the reply appears word-by-word like the real ChatGPT. In Python, add stream=True to your create() call and iterate over the chunks. See LLM streaming explained for the full pattern.
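A minimal sketch of the consuming side, assuming the chunk shape the OpenAI Python SDK yields for Chat Completions (chunk.choices[0].delta.content). The helper name print_stream is hypothetical.

```python
from typing import Any, Iterable


def print_stream(stream: Iterable[Any]) -> str:
    """Print text pieces as they arrive and return the full reply."""
    parts = []
    for chunk in stream:
        # Some chunks carry no choices or no text (for example the final one).
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        if piece:
            print(piece, end="", flush=True)
            parts.append(piece)
    print()  # finish the line once the stream ends
    return "".join(parts)


# Usage with the tutorial's client and history (needs an API key and network):
# stream = client.chat.completions.create(
#     model="gpt-4o-mini",
#     messages=[{"role": "system", "content": SYSTEM_PROMPT}, *history],
#     max_tokens=300,
#     stream=True,
# )
# reply = print_stream(stream)
# history.append({"role": "assistant", "content": reply})
```

Returning the joined text matters because the history still needs the complete reply, even though the user saw it arrive piece by piece.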

Trimming the history (managing context)

Every token you send costs money and counts against the model's context window. In a long conversation the history can balloon. A simple fix: keep only the last N turns. A smarter fix: summarize old turns into a single paragraph and substitute that summary into the history. Both patterns are worth knowing before your chatbot gets long-running conversations.
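The simpler of the two fixes, keeping only the last N turns, can be sketched as follows. Here a "turn" is taken to start at each user message, and trim_history is a hypothetical helper name.

```python
def trim_history(history: list[dict], max_turns: int) -> list[dict]:
    """Keep the messages from the last max_turns user messages onward."""
    if max_turns <= 0:
        return []
    # Positions of every user message; each one marks the start of a turn.
    user_indexes = [i for i, m in enumerate(history) if m["role"] == "user"]
    if len(user_indexes) <= max_turns:
        return list(history)
    # Slice from the start of the oldest turn we want to keep.
    start = user_indexes[-max_turns]
    return history[start:]


history = [
    {"role": "user", "content": "q1"},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "q2"},
    {"role": "assistant", "content": "a2"},
    {"role": "user", "content": "q3"},
]
recent = trim_history(history, max_turns=2)
print([m["content"] for m in recent])
```

Cutting at a user message keeps the trimmed list starting with a user turn, so the model never sees an assistant reply without the question that prompted it. The system prompt is unaffected because the tutorial code adds it separately on every request.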

Giving the bot knowledge of your own documents

A chatbot that only knows what its training data contains will confidently make things up when asked about your company, notes, or PDFs. The fix is RAG (retrieval-augmented generation): you store your documents as vector embeddings, retrieve the relevant chunks at query time, and inject them into the system prompt. This is the single highest-value upgrade you can make after your first chatbot works.
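The retrieve-then-inject shape can be shown with a toy example. Note the substitution: this sketch ranks chunks by plain word overlap rather than vector embeddings, purely so it runs without extra dependencies. The document chunks and all helper names are invented for illustration.

```python
import re

# Hypothetical document chunks; in a real app these would come from your files.
CHUNKS = [
    "Our support desk is open Monday to Friday, 9am to 5pm.",
    "Refunds are available within 30 days of purchase.",
    "The office dog is named Biscuit.",
]


def words(text: str) -> set[str]:
    """Lowercased set of the words and numbers in a piece of text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Rank chunks by how many words they share with the question (toy scoring)."""
    question_words = words(question)
    ranked = sorted(chunks, key=lambda c: len(question_words & words(c)), reverse=True)
    return ranked[:top_k]


def build_system_prompt(question: str) -> str:
    """Inject the retrieved chunks into a system prompt for this request."""
    context = "\n".join(retrieve(question, CHUNKS))
    return (
        "Answer using only the context below. "
        "If the answer is not there, say you don't know.\n\n"
        f"Context:\n{context}"
    )


prompt = build_system_prompt("When are refunds available?")
print(prompt)
```

A real RAG setup would replace retrieve with an embedding search over a vector store, but the surrounding flow (retrieve, build the prompt, send the request) stays the same.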

Giving the bot tools

Once the basic chat loop is solid, you can give the model the ability to do things: search the web, run code, look up a database. This is called function calling or tool use. You describe available functions in your API request; the model decides when to call them; your code executes the call and feeds the result back. That loop is the foundation of every AI agent.
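Here is a sketch of the two pieces your code owns in that loop: the tool description sent with the request and the dispatcher that runs whatever the model asks for. The description follows the OpenAI Chat Completions tools format; get_weather and run_tool_call are hypothetical names, and the weather tool returns a canned string.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would call a weather service."""
    return f"It is sunny in {city}."


# Description passed as the request's tools parameter.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# Maps tool names the model may request to the Python functions that run them.
AVAILABLE = {"get_weather": get_weather}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Execute one tool call requested by the model and return its result as text."""
    if name not in AVAILABLE:
        return f"Unknown tool: {name}"
    try:
        args = json.loads(arguments_json)
        return AVAILABLE[name](**args)
    except (json.JSONDecodeError, TypeError) as exc:
        return f"Bad arguments for {name}: {exc}"


result = run_tool_call("get_weather", '{"city": "Lisbon"}')
print(result)
```

With the OpenAI SDK you would pass tools=TOOLS to create(), then for each entry in response.choices[0].message.tool_calls call run_tool_call(call.function.name, call.function.arguments), append the assistant message followed by a {"role": "tool", "tool_call_id": call.id, "content": result} message to the history, and send the request again. Returning errors as text rather than raising lets the model see what went wrong and try again.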

Making it production-ready

When real users start chatting, you need a few more pieces: authentication so only your users can access it, rate limiting so one user can't burn through your API budget, and error handling for when the API returns a 429 or times out. None of these are hard — they're just extra layers on top of the same chat loop you already built. The modern AI app stack article maps out all the pieces.
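For the error-handling piece, one common pattern is retrying with exponential backoff. The sketch below is generic: call_with_retries is a hypothetical helper, and the demo uses a simulated failure in place of a real API call.

```python
import time


def call_with_retries(make_request, retry_on=(Exception,), max_attempts=4,
                      base_delay=1.0, sleep=time.sleep):
    """Call make_request, retrying on the given exception types with backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return make_request()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the error
            # Wait 1s, 2s, 4s, ... before trying again (exponential backoff).
            sleep(base_delay * 2 ** (attempt - 1))


# Demo with a simulated flaky request that fails twice, then succeeds.
attempts = {"count": 0}


def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = call_with_retries(flaky_request, retry_on=(TimeoutError,), sleep=delays.append)
print(result, delays)
```

With the OpenAI Python SDK you would wrap the create() call in a lambda and pass retry_on=(openai.RateLimitError, openai.APITimeoutError). The SDK client also accepts a max_retries setting of its own, so check whether that already covers your case before adding a second layer.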

FAQ

How much does it cost to build and run an AI chatbot with the OpenAI API?

For a personal chatbot used lightly, typically a few cents a day at most. gpt-4o-mini costs $0.15 per million input tokens, with output tokens billed at a higher rate. A typical 10-turn conversation might use 2,000 tokens total; priced entirely at the input rate, that is 2,000 / 1,000,000 × $0.15 = $0.0003. Set a monthly spending cap in your provider's dashboard so there are no surprises.
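The arithmetic behind that estimate, as a sketch you can adapt. It uses only the input price quoted above and treats every token as an input token, so it understates the cost of the reply tokens; input_cost is a hypothetical helper name.

```python
INPUT_PRICE_PER_MILLION = 0.15  # USD, the gpt-4o-mini input price quoted above


def input_cost(tokens: int, price_per_million: float = INPUT_PRICE_PER_MILLION) -> float:
    """Cost in USD of sending the given number of input tokens."""
    return tokens / 1_000_000 * price_per_million


cost = input_cost(2_000)
print(f"${cost:.4f}")
```

Prices change, so check the provider's pricing page and substitute the current figure before relying on the result.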

Do I need to know machine learning to build an AI chatbot?

No. You're calling a pre-trained model's API — the math and training are already done. You need basic Python or JavaScript: variables, loops, functions, and how to install packages. That's it.

Why does my chatbot forget the conversation when I restart it?

Your history list lives in memory — when the script stops, the list is gone. To persist conversations across restarts, save history to a file or database (SQLite works fine for personal projects) and reload it on startup.
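The file-based option can be sketched with the standard library alone. The file name history.json and both helper names are assumptions for illustration.

```python
import json
from pathlib import Path

HISTORY_FILE = Path("history.json")  # hypothetical file name


def load_history(path: Path = HISTORY_FILE) -> list[dict]:
    """Return the saved conversation, or an empty list on first run."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def save_history(history: list[dict], path: Path = HISTORY_FILE) -> None:
    """Write the whole conversation to disk as JSON."""
    path.write_text(json.dumps(history, ensure_ascii=False, indent=2), encoding="utf-8")
```

In the terminal chatbot you would replace history = [] with history = load_history() and call save_history(history) after appending each assistant reply. Rewriting the whole file every turn is fine for a personal project; a database such as SQLite becomes worthwhile once conversations get long or you have several users.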

What is the difference between a system prompt and a user message?

The system prompt is a persistent, behind-the-scenes instruction that sets the model's persona and rules (e.g. "You are a pirate. Never break character."). It's sent with every request but never shown to the user. User messages are what the human actually types. See what is prompt engineering for the broader context.

Can I use Claude (Anthropic) instead of OpenAI for this tutorial?

Yes. The concepts are identical: swap from openai import OpenAI for import anthropic, create the client with anthropic.Anthropic(), change the model name to claude-haiku-4-5, and call client.messages.create(model=..., max_tokens=..., system=..., messages=history). The Anthropic SDK takes system as a top-level parameter rather than as a message role, requires max_tokens, and returns the reply text in response.content[0].text, but the loop logic is the same.
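A sketch of one turn of the loop against the Anthropic Messages API. The client is passed in as a parameter so the function stays easy to test; ask_claude is a hypothetical helper name, and the system prompt is the one used earlier in this guide.

```python
SYSTEM_PROMPT = "You are a concise, friendly assistant. Reply in 3 sentences or fewer."


def ask_claude(client, history: list[dict], user_input: str) -> str:
    """Send one user turn plus the full history and record the reply."""
    history.append({"role": "user", "content": user_input})
    response = client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=300,          # required by the Messages API
        system=SYSTEM_PROMPT,    # top-level parameter, not a message role
        messages=history,
    )
    # The reply arrives as a list of content blocks; join the text ones.
    reply = "".join(block.text for block in response.content if block.type == "text")
    history.append({"role": "assistant", "content": reply})
    return reply


# Usage (needs pip install anthropic, ANTHROPIC_API_KEY set, and network):
# import anthropic
# client = anthropic.Anthropic()
# history = []
# print(ask_claude(client, history, "Hello!"))
```

Everything else in the terminal loop (reading input, printing the reply, looping) carries over unchanged.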

What is the easiest way to add a web interface to my Python chatbot?

Streamlit. Run pip install streamlit, replace the input() loop with st.chat_input, and use st.chat_message to render the conversation. You get a browser-based chat UI with about 25 lines of code, and streamlit run app.py launches it immediately. No HTML or CSS needed.

Further reading