Assignment: Write about an automated workflow that updates a RAG bot on Dify so it always reflects the content on my portfolio.
What the assignment is asking for
The goal is not a one-off upload of documents. It is a pipeline:
Portfolio content changes
→ workflow detects the change or is triggered
→ knowledge base is refreshed
→ chatbot answers from up-to-date material
On Dify, that usually means:
- A knowledge dataset connected to your bot
- Documents ingested (often from files or URLs)
- A workflow or integration that re-syncs when the site changes — for example after deploy, on a schedule, or via webhook
Without automation, the bot drifts further out of date each time you edit projects or bio text.
How I think about “sync” for my own site
My portfolio is a Hugo static site. The assistant I ship uses a custom Node API (server/) with Markdown in server/rag-data/ instead of Dify for production chat. The workflow idea is the same as on Dify:
| Step | Dify-style | My portfolio API |
|---|---|---|
| Source of truth | Site pages / exports | rag-data/*.md (+ site content I copy in) |
| Trigger | Workflow / webhook / schedule | Server restart after deploy, or manual edit + restart |
| Process | Chunk + embed + store in dataset | rag.js chunks + OpenAI embeddings at startup |
| Consume | Dify app / widget | POST /chat from the Hugo chat widget |
So the automation problem is: when portfolio facts change, how do we refresh what the bot knows without manual copy-paste every time?
Workflow I use today (practical)
- Edit portfolio content: projects in content/projects/, bio on contact/home, etc.
- Update knowledge files: mirror important facts into server/rag-data/ (e.g. new project summary, skills, FAQ).
- Redeploy or restart the chat API: on start, rag.js rebuilds the embedding index and logs how many chunks were indexed. The server reads OPENAI_API_KEY from server/.env again on each restart (I never put the key in the workflow script or in Git).
- Verify with a few questions in the chat UI (“What projects do you have?”, “How can I contact you?”).
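The restart step above can be sketched roughly as follows. This is not the actual rag.js: buildIndex, splitIntoChunks, and the injected embed callback are hypothetical names, and splitting on blank lines is just one simple chunking choice I assume here.

```javascript
// Sketch only: hypothetical names, not the real contents of rag.js.
const fs = require("node:fs");
const path = require("node:path");

// Split Markdown into paragraph-sized chunks on blank lines.
function splitIntoChunks(markdown) {
  return markdown
    .split(/\r?\n\s*\r?\n/)
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

// Read every .md file in dataDir, chunk it, and embed each chunk.
// `embed` is injected so the sketch runs without an API key; a real server
// would pass a function that calls the embeddings provider using the key
// loaded from the environment (process.env.OPENAI_API_KEY).
async function buildIndex(dataDir, embed) {
  const files = fs
    .readdirSync(dataDir)
    .filter((name) => name.endsWith(".md"))
    .sort();
  const index = [];
  for (const file of files) {
    const text = fs.readFileSync(path.join(dataDir, file), "utf8");
    for (const chunk of splitIntoChunks(text)) {
      index.push({ source: file, text: chunk, vector: await embed(chunk) });
    }
  }
  console.log(`Indexed ${index.length} chunks from ${files.length} files`);
  return index;
}
```

Passing the embedding function in as a parameter keeps the key out of the indexing code and makes the rebuild easy to test with a fake embedder.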
API key setup (OpenAI account → .env → ragApiUrl in Hugo) is described in detail in my AI-driven application post.
That is a manual but repeatable workflow. It is honest for a student portfolio and keeps the RAG layer understandable.
Toward fuller automation (Dify or custom)
These are the next steps I would document in a Dify setup or extend on my server:
Option A: Dify + portfolio (course tool)
- Export or crawl portfolio URLs after each hugo deploy.
- Trigger a Dify knowledge sync (API or built-in workflow).
- Point the Dify chat app at the updated dataset.
- Embed or link the Dify widget if you do not use a custom backend.
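The sync step in Option A could be scripted roughly like this. It is a sketch, not a tested integration: the update-by-text path and the name/text body fields follow my reading of Dify's Knowledge API and may differ between Dify versions, so check them against your instance. buildDifyUpdateRequest and syncDocument are hypothetical helper names, and the base URL, dataset ID, and document ID are placeholders.

```javascript
// Sketch only. Endpoint path and body fields are assumptions based on
// Dify's Knowledge API docs; verify them for your Dify version.
function buildDifyUpdateRequest({ baseUrl, datasetId, documentId, apiKey, name, text }) {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${root}/datasets/${datasetId}/documents/${documentId}/update-by-text`,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ name, text }),
    },
  };
}

// Send the update. Uses the global fetch available in Node 18+.
async function syncDocument(params) {
  const { url, options } = buildDifyUpdateRequest(params);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`Dify sync failed: HTTP ${res.status}`);
  }
  return res.json();
}
```

A deploy script would call syncDocument once per exported page, reading the API key from an environment variable rather than hard-coding it.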
Option B: Custom API + CI
- On Git push / deploy, a script generates rag-data/*.md from Hugo content (templates or a small extractor).
- CI calls POST /reindex on the chat API (endpoint to add) or restarts the service.
- No separate Dify host: one pipeline from repo to embeddings.
git push → build site → generate rag-data → restart API / reindex
Either option satisfies the spirit of the assignment: the bot’s knowledge tracks the portfolio instead of drifting.
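For Option B, the "generate rag-data" step might look like the sketch below. toRagDocument and generateRagData are hypothetical helpers, not existing code in my repo. The sketch assumes YAML (---) or TOML (+++) front matter and reads the title with a simple regex rather than a full parser, so unusual front matter would need a proper library.

```javascript
// Sketch only: hypothetical helpers for turning Hugo content into rag-data.
const fs = require("node:fs");
const path = require("node:path");

// Strip front matter and put the page title on top as a Markdown heading.
function toRagDocument(source) {
  const match = source.match(/^(---|\+\+\+)\r?\n([\s\S]*?)\r?\n\1\r?\n?/);
  const frontMatter = match ? match[2] : "";
  const body = (match ? source.slice(match[0].length) : source).trim();
  // Accept `title: "X"` (YAML) or `title = "X"` (TOML).
  const titleMatch = frontMatter.match(/^title\s*[:=]\s*["']?(.*?)["']?\s*$/m);
  const title = titleMatch ? titleMatch[1] : "";
  return title ? `# ${title}\n\n${body}\n` : `${body}\n`;
}

// Convert every .md file directly inside contentDir and write it to outDir.
// Returns the list of file names that were written.
function generateRagData(contentDir, outDir) {
  fs.mkdirSync(outDir, { recursive: true });
  const written = [];
  for (const name of fs.readdirSync(contentDir).sort()) {
    if (!name.endsWith(".md")) continue;
    const source = fs.readFileSync(path.join(contentDir, name), "utf8");
    fs.writeFileSync(path.join(outDir, name), toRagDocument(source));
    written.push(name);
  }
  return written;
}
```

In CI this would run after the site build, for example as generateRagData("content/projects", "server/rag-data"), followed by the restart or reindex call.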
Risks and design choices
- Stale chunks: If you only update the website but not rag-data, RAG will confidently cite old text. Automation must touch the same store the retriever reads.
- Over-syncing: Re-embedding everything on every tiny edit costs time and API money; hash-based “only re-index if content changed” is a good production pattern (also covered in course material).
- Two bots: Running both Dify and a custom API is fine for learning, but visitors should use one clear entry point on the live site.
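The hash-based pattern from the over-syncing point can be sketched as below. hashContent and planReindex are hypothetical names, and I assume the hashes from the previous run are stored somewhere simple, such as a JSON file next to the index.

```javascript
// Sketch only: decide which documents need re-embedding by comparing
// SHA-256 hashes of their content against the hashes from the last run.
const crypto = require("node:crypto");

function hashContent(text) {
  return crypto.createHash("sha256").update(text, "utf8").digest("hex");
}

// docs: { [fileName]: text } for the current rag-data files.
// previousHashes: { [fileName]: hexHash } saved after the last index build.
function planReindex(docs, previousHashes = {}) {
  const nextHashes = {};
  const changed = [];
  for (const [name, text] of Object.entries(docs)) {
    nextHashes[name] = hashContent(text);
    if (previousHashes[name] !== nextHashes[name]) {
      changed.push(name); // new or modified since the last run
    }
  }
  // Files that existed last time but are gone now; their chunks should be dropped.
  const removed = Object.keys(previousHashes).filter((name) => !(name in docs));
  return { changed, removed, nextHashes };
}
```

Only the files listed in changed need new embeddings, removed tells the indexer which chunks to delete, and nextHashes is what gets saved for the next run.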
Reflection
The assignment frames Dify as the product for workflow automation. I implemented the same architectural idea on my own stack so the live portfolio chat and my exam documentation stay aligned. The important learning outcome is the pipeline mindset: treat knowledge as data that must be versioned, triggered, and refreshed — not as a one-time PDF upload.
If I add Dify later, I would use it for orchestration and keep this post updated with screenshots of the actual workflow nodes and triggers.