RAG workflow automation — keeping the bot aligned with my portfolio

Assignment: Write about an automated workflow that updates a RAG bot on Dify so it always reflects the content on my portfolio.

What the assignment is asking for

The goal is not a one-off upload of documents. It is a pipeline:

Portfolio content changes
  → workflow detects or is triggered
  → knowledge base is refreshed
  → chatbot answers from up-to-date material

On Dify, that usually means:

A knowledge dataset connected to your bot
Documents ingested (often from files or URLs)
A workflow or integration that re-syncs when the site changes — for example after deploy, on a schedule, or via webhook

Without automation, the bot's answers drift further out of date each time I edit a project or the bio text.

How I think about “sync” for my own site

My portfolio is a Hugo static site. The assistant I ship uses a custom Node API (server/) with Markdown in server/rag-data/ instead of Dify for production chat. The workflow idea is the same as on Dify:

| Step | Dify-style | My portfolio API |
| --- | --- | --- |
| Source of truth | Site pages / exports | `rag-data/*.md` (+ site content I copy in) |
| Trigger | Workflow / webhook / schedule | Server restart after deploy, or manual edit + restart |
| Process | Chunk + embed + store in dataset | `rag.js` chunks + OpenAI embeddings at startup |
| Consume | Dify app / widget | `POST /chat` from the Hugo chat widget |

So the automation problem is: when portfolio facts change, how do we refresh what the bot knows without manual copy-paste every time?
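To make the "Process" step concrete, here is a minimal sketch of paragraph-based chunking. It is not the actual contents of `rag.js`: the function name `chunkMarkdown` and the 800-character limit are assumptions chosen for illustration. Each resulting chunk would then be sent to the embeddings API, which is not shown here.

```javascript
// Hypothetical chunker (not the real rag.js): packs paragraphs into chunks
// of at most maxChars characters. A single paragraph longer than maxChars
// is kept whole rather than cut mid-sentence.
function chunkMarkdown(text, maxChars = 800) {
  const paragraphs = text
    .split(/\n\s*\n/) // blank lines separate paragraphs
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks = [];
  let current = "";
  for (const paragraph of paragraphs) {
    const candidate = current ? `${current}\n\n${paragraph}` : paragraph;
    if (candidate.length > maxChars && current) {
      // Adding this paragraph would overflow: close the current chunk.
      chunks.push(current);
      current = paragraph;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Splitting on paragraph boundaries keeps each chunk readable on its own, which matters because the retriever returns chunks verbatim as context.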

Workflow I use today (practical)

Edit portfolio content — projects in content/projects/, bio on contact/home, etc.
Update knowledge files — mirror important facts into server/rag-data/ (e.g. new project summary, skills, FAQ).
Redeploy or restart the chat API — on start, rag.js rebuilds the embedding index and logs how many chunks were indexed. The server reads OPENAI_API_KEY from server/.env again on each restart (I never put the key in the workflow script or in Git).
Verify with a few questions in the chat UI (“What projects do you have?”, “How can I contact you?”).
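The verification step can be made repeatable with a small smoke check. This is a sketch under assumptions: the names `checks`, `answerMentions`, and `runSmokeChecks` are hypothetical, and the expected keywords are placeholders to adapt to the real content.

```javascript
// Hypothetical smoke checks: each question lists keywords the answer should mention.
const checks = [
  { question: "What projects do you have?", keywords: ["project"] },
  { question: "How can I contact you?", keywords: ["contact", "email"] },
];

// True if the answer contains at least one expected keyword (case-insensitive).
function answerMentions(answer, keywords) {
  const lower = answer.toLowerCase();
  return keywords.some((keyword) => lower.includes(keyword.toLowerCase()));
}

// `ask` is injected (question -> Promise<string>) so this runner does not
// assume the request or response shape of the chat endpoint.
async function runSmokeChecks(ask, cases = checks) {
  const failures = [];
  for (const { question, keywords } of cases) {
    const answer = await ask(question);
    if (!answerMentions(answer, keywords)) failures.push(question);
  }
  return failures; // empty array means every check passed
}
```

In practice `ask` would wrap a `fetch` call to `POST /chat`; the field names in that request and response depend on the API and are deliberately left out of the sketch.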

API key setup (OpenAI account → .env → ragApiUrl in Hugo) is described in detail in my AI-driven application post.

This workflow is manual but repeatable. That is a reasonable scope for a student portfolio, and it keeps the RAG layer easy to understand.

Toward fuller automation (Dify or custom)

These are the next steps I would document in a Dify setup or extend on my server:

Option A — Dify + portfolio (course tool)

Export or crawl portfolio URLs after each hugo deploy.
Trigger a Dify knowledge sync (API or built-in workflow).
Point the Dify chat app at the updated dataset.
Embed or link the Dify widget if you do not use a custom backend.
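The "trigger a Dify knowledge sync" step could look roughly like the sketch below, which only builds the HTTP request that replaces one knowledge document's text. Everything specific here is an assumption to verify against the Knowledge API page of your own Dify instance: the `update-by-text` path (its spelling has differed between Dify releases), the `name`/`text` body fields, and the helper name `buildDifyUpdateRequest`.

```javascript
// Builds (but does not send) a request that replaces one Dify knowledge
// document's text.
// ASSUMPTION: the path and body fields match your Dify version's Knowledge
// API. Check the API reference of your instance before relying on this.
function buildDifyUpdateRequest({ baseUrl, datasetId, documentId, apiKey, name, text }) {
  const url =
    `${baseUrl.replace(/\/+$/, "")}/datasets/${encodeURIComponent(datasetId)}` +
    `/documents/${encodeURIComponent(documentId)}/update-by-text`;
  return {
    url,
    options: {
      method: "POST",
      headers: {
        // The dataset API key should come from an environment variable, never from Git.
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ name, text }),
    },
  };
}
```

A deploy script would then call `fetch(url, options)` once per exported page. Keeping request construction separate from sending makes the sync logic testable without a running Dify host.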

Option B — Custom API + CI

On Git push / deploy, a script generates rag-data/*.md from Hugo content (templates or a small extractor).
CI calls POST /reindex on the chat API (endpoint to add) or restarts the service.
No separate Dify host — one pipeline from repo to embeddings.

git push → build site → generate rag-data → restart API / reindex
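The "generate rag-data" stage of that pipeline could start from a small extractor like the one below. It is a sketch, not existing code: `hugoToRagMarkdown` is a hypothetical name, and it assumes YAML front matter delimited by `---` lines with a plain `title:` field. TOML or JSON front matter would need a different parser.

```javascript
// Hypothetical extractor: turns the text of one Hugo content file into a
// rag-data Markdown document by dropping front matter and keeping the title.
// ASSUMPTION: YAML front matter between --- lines, with a simple `title:` field.
function hugoToRagMarkdown(source) {
  const match = source.match(/^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/);
  if (!match) return source.trim(); // no front matter: keep the text as-is

  const [, frontMatter, body] = match;
  const titleLine = frontMatter
    .split(/\r?\n/)
    .find((line) => /^title:/.test(line));
  const title = titleLine
    ? titleLine.replace(/^title:\s*/, "").trim().replace(/^["']|["']$/g, "")
    : "";

  const text = body.trim();
  // Put the title back as a heading so the chunk still names its project.
  return title ? `# ${title}\n\n${text}` : text;
}
```

A CI step would read each file under `content/projects/`, pass its text through this function, and write the result into `server/rag-data/` before the restart or reindex call.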

Either option satisfies the spirit of the assignment: the bot’s knowledge tracks the portfolio instead of drifting.

Risks and design choices

Stale chunks: If you only update the website but not rag-data, RAG will confidently cite old text. Automation must touch the same store the retriever reads.
Over-syncing: Re-embedding everything on every small edit costs time and API fees; a hash-based check that re-indexes only when content has changed is a good production pattern (also covered in course material).
Two bots: Running both Dify and a custom API is fine for learning, but visitors should use one clear entry point on the live site.
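The hash-based pattern mentioned under over-syncing can be sketched with Node's built-in `crypto` module. The names `contentHash` and `planReindex` are hypothetical, and the manifest is assumed to be a plain object mapping file names to hashes that is saved between runs (for example as a JSON file).

```javascript
const { createHash } = require("node:crypto");

// SHA-256 of a document's text, used as a cheap "did it change?" fingerprint.
function contentHash(text) {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Compares current documents against the manifest saved by the previous run.
// `documents` maps file name -> text; `previousManifest` maps file name -> hash.
function planReindex(documents, previousManifest = {}) {
  const manifest = {};
  const changed = [];
  for (const [name, text] of Object.entries(documents)) {
    manifest[name] = contentHash(text);
    // New files have no previous hash, so they also count as changed.
    if (previousManifest[name] !== manifest[name]) changed.push(name);
  }
  // Files that disappeared must have their chunks dropped from the index.
  const removed = Object.keys(previousManifest).filter((name) => !(name in documents));
  return { manifest, changed, removed };
}
```

Only the files listed in `changed` need new embeddings, and `removed` covers the stale-chunk risk from the other direction: deleted content should stop being retrievable.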

Reflection

The assignment frames Dify as the product for workflow automation. I implemented the same architectural idea on my own stack so the live portfolio chat and my exam documentation stay aligned. The important learning outcome is the pipeline mindset: treat knowledge as data that must be versioned, triggered, and refreshed — not as a one-time PDF upload.

If I add Dify later, I would use it for orchestration and keep this post updated with screenshots of the actual workflow nodes and triggers.
