Assignment (5+6): Document an AI-driven application that calls an external API with an LLM.
This post describes the portfolio assistant: not a generic chat demo, but a real integration embedded in my Hugo site with backend, prompts, RAG, and structured error handling.
Application purpose#
Visitors can ask questions about me, my work, and how to get in touch. The app:
- Accepts multi-turn chat in the browser
- Sends conversation history to my backend
- Calls OpenAI (the external LLM API) from the backend
- Optionally augments prompts with retrieved context (RAG)
- Returns a text reply displayed in the chat UI
That is an AI-driven feature inside a larger product (the portfolio), which matches the course definition of an LLM as a software component accessed over HTTP.
Architecture#
┌─────────────────┐   POST /chat           ┌──────────────────┐
│  Hugo (static)  │   JSON: messages,      │  Node (Express)  │
│  chat widget    │   conversationId       │  server/         │
└────────┬────────┘ ──────────────────────►└────────┬─────────┘
         │                                          │
         │ localStorage: conversations              │ retrieveContext()
         │                                          │ system + messages
         │                                          ▼
         │                                 ┌──────────────────┐
         │                                 │  OpenAI API      │
         │                                 │  embeddings +    │
         │                                 │  chat completions│
         └◄──────── { "reply": "..." } ────└──────────────────┘
Rule from the course: the API key stays on the backend only (OPENAI_API_KEY in server/.env).
How I set up the OpenAI API key#
There is no separate “RAG API key”. I use one OpenAI API key on the server for both embeddings (retrieval) and chat completions (answers). The browser never sees that key.
1. Create the key at OpenAI#
- I log in at platform.openai.com.
- I open API keys (under my account / dashboard).
- I click Create new secret key, give it a name (e.g. portfolio-chat), and copy the value once; it is only shown at creation time.
- I keep billing/usage limits in mind on the same account; both embedding and chat calls count toward usage.
If the course or a future employer uses Azure OpenAI or another provider, the idea is the same: a secret on the server, and optionally OPENAI_BASE_URL in .env for a compatible endpoint.
2. Store the key only in server/.env#
I do not put the key in Hugo, JavaScript, or Git.
- In the repo I go to the server/ folder.
- I copy the template:
cp .env.example .env
- I edit .env and set:
OPENAI_API_KEY=sk-proj-...your-key-here...
OPENAI_MODEL=gpt-4o-mini
PORT=8788
Optional variables I can add later:
- OPENAI_EMBEDDING_MODEL=text-embedding-3-small (for RAG vectors)
- SYSTEM_PROMPT=... (override the assistant’s system message)
- CORS_ORIGIN=https://portfolio.kudskprogramming.dk (restrict browser origins in production)
The file server/.env is listed in .gitignore, so it is not pushed to GitHub. Only server/.env.example (without a real key) is committed as documentation for myself and for anyone cloning the repo.
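A minimal sketch of how the server could turn these variables into one config object at startup. The variable names come from the .env file above; the `loadConfig` helper, the shape of the returned object, and the fallback values for the embedding model and CORS origin are assumptions for illustration, not necessarily what the code in server/ does.

```javascript
// Hypothetical helper: read the .env-backed variables into a single object.
// Only the variable names, the gpt-4o-mini default and port 8788 come from
// the post; everything else here is an assumption.
function loadConfig(env = process.env) {
  const apiKey = (env.OPENAI_API_KEY || "").trim();
  return {
    apiKey,
    // Mirrors the "modelConfigured" flag reported by GET /health.
    modelConfigured: apiKey.length > 0,
    model: env.OPENAI_MODEL || "gpt-4o-mini",
    // Assumed default, taken from the optional variable listed above.
    embeddingModel: env.OPENAI_EMBEDDING_MODEL || "text-embedding-3-small",
    port: Number(env.PORT) || 8788,
    // Assumed default: allow any origin until CORS_ORIGIN is set.
    corsOrigin: env.CORS_ORIGIN || "*",
  };
}
```

Passing the environment in as a parameter keeps the helper easy to test: `loadConfig({ OPENAI_API_KEY: "sk-test" })` can be checked without touching the real .env file.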
3. Start the backend#
From server/:
npm install
npm start
The API listens on port 8788 by default. On startup I watch the terminal: if RAG is enabled, I see a log line that chunks from rag-data/ were indexed. If the key is missing, POST /chat returns 503 with a message to configure .env. That is intentional so I notice misconfiguration early.
I sanity-check with:
GET http://localhost:8788/health
A healthy response includes "modelConfigured": true when OPENAI_API_KEY is set.
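The same sanity check could be scripted, roughly as below. This is a sketch under assumptions: `checkHealth` and the injectable `fetchImpl` parameter are hypothetical names, and only the /health path and the "modelConfigured" field come from the description above. It relies on the global `fetch` available in Node 18 and later.

```javascript
// Sketch of a scripted health probe. The function name and return shape are
// illustrative; the real /health response may contain more fields.
async function checkHealth(baseUrl, fetchImpl = fetch) {
  const res = await fetchImpl(new URL("/health", baseUrl));
  if (!res.ok) {
    return { ok: false, reason: `HTTP ${res.status}` };
  }
  const body = await res.json();
  if (body.modelConfigured !== true) {
    return { ok: false, reason: "OPENAI_API_KEY is not set on the server" };
  }
  return { ok: true, reason: "API is up and the model is configured" };
}
```

With the backend running, `checkHealth("http://localhost:8788").then(console.log)` would print the result object.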
4. Point the Hugo site at the API#
In config/_default/params.toml I configure the public URL (no secret here):
[chat]
ragApiUrl = "http://localhost:8788/chat"
showContactLinkWithRag = false
For production I change this to my deployed API, for example:
ragApiUrl = "https://api.kudskprogramming.dk/chat"
Hugo writes that value into the chat widget as data-chat-api-url. When I build the site (hugo or deploy), the static HTML knows where to send chat requests, but still not how to authenticate to OpenAI; that happens only inside Node using .env.
5. Run the portfolio and test#
- I keep npm start running in server/.
- I run the site locally (hugo server or open the built public/ site).
- I open the homepage, click the chat button, and send a short question.
- The UI shows a typing indicator, then an assistant reply. In the network tab I see POST to /chat on my API, not to api.openai.com from the browser.
If I stop the API or use a wrong key, the widget shows an error message in the chat instead of failing silently — that matches how I want visitors to experience outages.
What I deliberately avoid#
| Mistake | What I do instead |
|---|---|
| Key in params.toml or front matter | Key only in server/.env |
| Key in jk-chat-widget.js | Frontend only knows ragApiUrl |
| Committing .env | .gitignore + example file without secrets |
| Sharing keys in blog posts or Moodle | Describe the steps, never paste the real key |
My API endpoint#
POST /chat
Request body (simplified):
{
  "conversationId": "uuid-from-browser",
  "messages": [
    { "role": "user", "content": "What do you work with?" },
    { "role": "assistant", "content": "..." }
  ]
}
Response on success:
{
  "reply": "Assistant message text"
}
Response on failure:
{
  "error": "Human-readable error message"
}
The frontend uses fetch, shows a typing indicator, disables input while waiting, and surfaces API errors as assistant messages so the user is not stuck on a blank screen.
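Before the backend spends an OpenAI call, the request body can be checked against the shape above. The following is a sketch, not the actual server code: `validateChatRequest`, the error texts, and the decision to require conversationId are assumptions. Rejecting any role other than user or assistant is a design choice in this sketch, so that a browser cannot inject its own system message.

```javascript
// Hypothetical validator for the POST /chat body. Field names follow the
// request example above; everything else is illustrative.
const ALLOWED_ROLES = new Set(["user", "assistant"]);

function validateChatRequest(body) {
  if (body === null || typeof body !== "object") {
    return { ok: false, error: "Request body must be a JSON object" };
  }
  const { conversationId, messages } = body;
  if (typeof conversationId !== "string" || conversationId.length === 0) {
    return { ok: false, error: "conversationId must be a non-empty string" };
  }
  if (!Array.isArray(messages) || messages.length === 0) {
    return { ok: false, error: "messages must be a non-empty array" };
  }
  for (const message of messages) {
    const valid =
      message !== null &&
      typeof message === "object" &&
      ALLOWED_ROLES.has(message.role) &&
      typeof message.content === "string";
    if (!valid) {
      return { ok: false, error: "Each message needs a role (user or assistant) and string content" };
    }
  }
  return { ok: true, value: { conversationId, messages } };
}
```

A failed check maps naturally onto the failure response shown above: the handler can return `{ "error": result.error }` with a 4xx status.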
Prompts#
System prompt (role and rules)#
Defined in code with an optional override via SYSTEM_PROMPT in .env. Default behaviour:
- Assistant for Jonathan Kudsk’s portfolio
- Prefer CONTEXT from RAG when answering factual questions
- Do not invent CV or project facts
- Suggest the contact form for serious inquiries
User / assistant messages#
The conversation history from the browser is forwarded as OpenAI messages (user and assistant turns). The latest user message also drives RAG retrieval (embedding + top chunks).
This follows the course split:
- System = role, limits, output behaviour
- User/assistant turns = dialogue and task data
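The split between system and user/assistant turns can be sketched as a small function that assembles the array sent to chat completions. `buildMessages`, the exact wording of `DEFAULT_SYSTEM_PROMPT`, and the way CONTEXT is appended to the system message are assumptions; the real prompt lives in the server code.

```javascript
// Sketch: build the OpenAI messages array from history plus retrieved context.
// The prompt text paraphrases the default behaviour listed above and is not
// the actual system prompt.
const DEFAULT_SYSTEM_PROMPT = [
  "You are the assistant for Jonathan Kudsk's portfolio.",
  "Prefer the CONTEXT section when answering factual questions.",
  "Do not invent CV or project facts.",
  "Suggest the contact form for serious inquiries.",
].join(" ");

function buildMessages(history, contextChunks = [], systemPrompt = DEFAULT_SYSTEM_PROMPT) {
  // Retrieved chunks, if any, are appended to the system message.
  const context = contextChunks.length > 0
    ? `\n\nCONTEXT:\n${contextChunks.join("\n---\n")}`
    : "";
  // Only user and assistant turns from the browser are forwarded.
  const turns = history
    .filter((m) => m.role === "user" || m.role === "assistant")
    .map((m) => ({ role: m.role, content: m.content }));
  return [{ role: "system", content: systemPrompt + context }, ...turns];
}
```

Keeping the context in the system message, rather than in a user turn, means the visitor's own text is never mixed with retrieved data; that placement is one reasonable option, not the only one.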
External LLM calls#
The backend uses two OpenAI endpoints:
| Call | Purpose |
|---|---|
| POST /v1/embeddings | RAG: chunk index + query embedding |
| POST /v1/chat/completions | Generate the reply (gpt-4o-mini by default) |
Model and base URL are configurable (OPENAI_MODEL, OPENAI_BASE_URL) for other providers that expose a compatible API.
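Once the chunks and the query have embeddings, the retrieval step comes down to ranking vectors. The sketch below uses cosine similarity, which is a common choice but not one the post confirms. `cosineSimilarity`, `topChunks`, the `{ text, embedding }` shape, and the default of three chunks are all assumptions.

```javascript
// Sketch of the ranking half of RAG: score each indexed chunk against the
// query embedding and keep the best k. Names and shapes are illustrative.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // guard against zero vectors
}

function topChunks(queryEmbedding, index, k = 3) {
  return index
    .map((chunk) => ({
      text: chunk.text,
      score: cosineSimilarity(queryEmbedding, chunk.embedding),
    }))
    .sort((x, y) => y.score - x.score) // highest similarity first
    .slice(0, k);
}
```

In this sketch the index is a plain in-memory array built at startup, which fits a small rag-data/ folder; a larger corpus would need a vector store instead.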
Configuration on the static site#
The Hugo side only needs the backend URL (see How I set up the OpenAI API key above). If ragApiUrl is left empty, the chat still opens but uses a short offline fallback text instead of calling the LLM — useful while I work on the site without spending API credits.
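The widget's send path, including the offline fallback, could look roughly like this. It is a sketch, not the code in jk-chat-widget.js: `sendChat`, `OFFLINE_FALLBACK`, the option names, and the message texts are hypothetical. The timeout of about 55 seconds, the request and response shapes, and the rule that errors are shown as assistant text come from the post.

```javascript
// Sketch of the frontend request. Always resolves to a string that the UI can
// render as an assistant message, whether it is a reply, a fallback, or an error.
const OFFLINE_FALLBACK = "The assistant is offline right now. Please use the contact form.";

async function sendChat({ apiUrl, conversationId, messages, timeoutMs = 55000, fetchImpl = fetch }) {
  if (!apiUrl) return OFFLINE_FALLBACK; // empty ragApiUrl: never call the backend

  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(apiUrl, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ conversationId, messages }),
      signal: controller.signal,
    });
    const data = await res.json();
    if (!res.ok || typeof data.reply !== "string") {
      return data.error || "The assistant could not answer. Please try again.";
    }
    return data.reply;
  } catch (err) {
    return err.name === "AbortError"
      ? "The request timed out. Please try again in a moment."
      : "Could not reach the assistant. Please try again later.";
  } finally {
    clearTimeout(timer); // avoid a stray abort after the request settles
  }
}
```

Because every path returns text, the UI code only needs to hide the typing indicator and append one message, regardless of what went wrong.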
Error handling (course checklist)#
Implemented at a basic but real level:
- Missing API key → 503 with clear message to configure .env
- OpenAI errors → parsed and returned as { "error": "..." }
- Network / timeout → frontend abort after ~55s, user sees retry guidance
- Invalid JSON from model provider → 502 with short detail
- Empty reply → treated as error
Not yet implemented (possible extensions): retries with backoff, job queue for long tasks, structured JSON output like the assignment assessor exercise.
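The backend side of that checklist can be sketched as one pure function that maps each failure case to a status and body. `toChatResponse` and its argument shape are hypothetical. The 503 for a missing key and the 502 for invalid JSON follow the list above; using 502 for upstream errors and empty replies is an assumption, since the post does not state those codes.

```javascript
// Sketch: decide the HTTP response for POST /chat from the raw upstream result.
// The error texts are illustrative.
function toChatResponse({ apiKey, upstreamStatus, upstreamBody }) {
  if (!apiKey) {
    return { status: 503, body: { error: "OPENAI_API_KEY is missing. Configure server/.env." } };
  }
  let parsed;
  try {
    parsed = JSON.parse(upstreamBody);
  } catch (err) {
    return { status: 502, body: { error: `Invalid JSON from model provider: ${err.message}` } };
  }
  if (upstreamStatus < 200 || upstreamStatus >= 300) {
    // OpenAI-style error bodies look like { "error": { "message": "..." } }.
    const message = parsed?.error?.message || `Model provider returned HTTP ${upstreamStatus}`;
    return { status: 502, body: { error: message } };
  }
  // Chat completions put the text in choices[0].message.content.
  const reply = parsed?.choices?.[0]?.message?.content;
  if (typeof reply !== "string" || reply.trim() === "") {
    return { status: 502, body: { error: "Model returned an empty reply" } };
  }
  return { status: 200, body: { reply: reply.trim() } };
}
```

Keeping this logic free of Express and network calls makes each branch testable with plain strings, which is where retries with backoff could later be added around the caller.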
Testing#
Manual tests I run:
- GET /health on the API: key configured, RAG enabled
- Open portfolio, send a question: reply uses portfolio tone
- Ask something not in rag-data: model should hesitate or say it is unsure
- Stop the API: frontend shows error path, not a crash
- New chat + switch chats: history in localStorage persists per conversation
Relation to the larger exam project#
The course also describes a separate exercise: assess a student report with a rubric and return structured JSON (overallAssessment, criteriaFeedback, etc.). That is a different endpoint and prompt design (POST /api/assess).
This portfolio app focuses on visitor chat + RAG. The integration patterns are the same: backend-owned key, system/user prompts, validate and handle failures.
Reflection#
Building this app reinforced that “AI-driven” means engineering around the model: boundaries, secrets, retrieval, UX while waiting, and honest limits when context is missing. The LLM is one service in the stack — the product is the full path from button click to grounded answer.
See also: RAG chatbot post for retrieval details and the same key setup from a RAG-focused angle.
More detail on env variables lives in server/README.md in the repository.