Building a RAG Pipeline with Qdrant and Gemini

Retrieval-augmented generation (RAG) grounds a language model in your own data, so its answers can draw on accurate, current information without retraining the model. Here is the pipeline I reach for.

The four stages

Chunk the source documents into passages.
Embed each chunk into a vector.
Store the vectors in Qdrant for fast similarity search.
Retrieve the top matches and pass them to Gemini as context.
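
To make the first stage concrete, here is a minimal sketch of word-based chunking with overlap. The function name `chunkWords` and the `size` and `overlap` defaults are assumptions for illustration, not values from this pipeline; real chunkers often split on sentences or tokens instead.

```typescript
// Split text into overlapping word windows.
// `size` and `overlap` are illustrative defaults, not tuned values.
function chunkWords(text: string, size = 200, overlap = 40): string[] {
    if (size <= 0 || overlap < 0 || overlap >= size) {
        throw new Error('expected size > 0 and 0 <= overlap < size');
    }
    const words = text.split(/\s+/).filter((w) => w.length > 0);
    const chunks: string[] = [];
    const step = size - overlap; // how far each window advances
    for (let start = 0; start < words.length; start += step) {
        chunks.push(words.slice(start, start + size).join(' '));
        if (start + size >= words.length) break; // this window reached the end
    }
    return chunks;
}
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some words twice.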
Build the Docker image: use the Dockerfile to build the image for the Express application.

```sh
docker build -t express-lb .
```

Request Workflow Diagram

graph TD;
    A[Client] -->|HTTP Request| B[Nginx Load Balancer :8000];
    B -->|Round Robin<br/>Primary| C[App Instance 1 :4500];
    B -->|Round Robin<br/>Primary| D[App Instance 2 :4501];
    B -->|Backup<br/>if 1 & 2 down| E[App Instance 3 :4502<br/>Backup];
    C --> F[Response];
    D --> F;
    E -->|Only if needed| F;
    F --> A;
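
The selection rule in the diagram can be sketched in a few lines: rotate across healthy primary instances, and use the backup only when no primary is available. This is a simplified model for illustration, not Nginx's actual implementation; the `Upstream` type, the `healthy` flag, and `makePicker` are hypothetical names.

```typescript
// Simplified model of round robin with a backup server.
// `Upstream`, `healthy`, and `makePicker` are hypothetical names for this sketch.
type Upstream = { name: string; port: number; backup: boolean; healthy: boolean };

function makePicker(upstreams: Upstream[]): () => Upstream | undefined {
    let next = 0; // rotation counter, shared across calls
    return () => {
        const primaries = upstreams.filter((u) => !u.backup && u.healthy);
        if (primaries.length > 0) {
            const chosen = primaries[next % primaries.length];
            next += 1;
            return chosen;
        }
        // No healthy primary: fall back to a healthy backup, if any.
        return upstreams.find((u) => u.backup && u.healthy);
    };
}
```

With the three instances from the diagram, successive calls alternate between ports 4500 and 4501, and port 4502 is returned only once both primaries are marked unhealthy.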

Searching with Qdrant

Given a query embedding, Qdrant returns its nearest neighbours, typically within milliseconds:

src/lib/foo.ts

const hits = await client.search('articles', {
    vector: queryEmbedding,
    limit: 5,
});
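
To show what "nearest neighbours" means here, the sketch below scores every stored vector against the query with cosine similarity and keeps the best `k`. It is a brute-force illustration only: Qdrant answers the same question with an approximate HNSW index, and the distance metric is configured per collection, so cosine is an assumption. The `Point` type and `topK` function are hypothetical names.

```typescript
// Brute-force nearest-neighbour search, for illustration only.
// Cosine similarity is assumed; `Point` and `topK` are hypothetical names.
type Point = { id: number; vector: number[] };

function cosine(a: number[], b: number[]): number {
    if (a.length !== b.length) throw new Error('dimension mismatch');
    let dot = 0;
    let normA = 0;
    let normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    if (normA === 0 || normB === 0) return 0; // zero vectors have no direction
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function topK(query: number[], points: Point[], k: number): { id: number; score: number }[] {
    return points
        .map((p) => ({ id: p.id, score: cosine(query, p.vector) }))
        .sort((x, y) => y.score - x.score) // highest similarity first
        .slice(0, k);
}
```

Scanning every vector costs time linear in the collection size, which is why a vector database relies on an index rather than this loop.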

Feed those passages into the prompt and let the model answer from them. The result is grounded, citable output, and coverage improves as your corpus grows.
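
One way to assemble that prompt is sketched below. It assumes each hit stores its passage under a `text` payload field, which is a choice made at indexing time and not something Qdrant requires; the `Hit` type is a simplified local stand-in for the search result shape, and `buildPrompt` is a hypothetical helper. The call to Gemini itself is left out.

```typescript
// Build a grounded prompt from search hits.
// Assumes passages live in `payload.text`; `Hit` and `buildPrompt` are hypothetical.
type Hit = { id: string | number; score: number; payload?: Record<string, unknown> | null };

function buildPrompt(question: string, hits: Hit[]): string {
    const passages = hits
        .map((hit) => hit.payload?.text)
        .filter((t): t is string => typeof t === 'string' && t.trim().length > 0)
        .map((t, i) => `[${i + 1}] ${t.trim()}`); // number passages so the model can cite them
    return [
        'Answer the question using only the numbered passages below.',
        'Cite passages by number, and say so if they do not contain the answer.',
        '',
        ...passages,
        '',
        `Question: ${question}`,
    ].join('\n');
}
```

Numbering the passages gives the model something concrete to cite, and hits without usable text are skipped rather than sent as empty context.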

VS Code Laravel

.eslintrc.js

	module.exports = {
	    root: true,
	    env: {
	        browser: true,
	        node: true,
	    },
	    parserOptions: {
	        parser: '@babel/eslint-parser',
	        requireConfigFile: false,
	    },
	    extends: [
	        '@nuxtjs',
	        'plugin:nuxt/recommended',
	        'prettier'
	    ],
	    plugins: ['prettier'],
	    rules: {
	        'prettier/prettier': ['error'],
	        'vue/html-indent': ['error', 4],
	        'vue/singleline-html-element-content-newline': 0,
	        'vue/component-name-in-template-casing': ['error', 'PascalCase'],
	        'vue/valid-v-slot': [
	            'error',
	            {
	                allowModifiers: true,
	            },
	        ],
	    },
	    globals: {
	        _: true,
	    },
	}

.prettierignore

	# Ignore artifacts:
	build
	coverage

.prettierrc

	{
	    "semi": false,
	    "singleQuote": true,
	    "tabWidth": 4,
	    "printWidth": 120
	}

package.json

	{
	    "devDependencies": {
	        "@babel/eslint-parser": "^7.15.0",
	        "@nuxtjs/eslint-config": "^6.0.1",
	        "@nuxtjs/eslint-module": "^3.0.2",
	        "eslint": "^7.32.0",
	        "eslint-config-prettier": "^8.3.0",
	        "eslint-plugin-nuxt": "^2.0.0",
	        "eslint-plugin-prettier": "^3.4.0",
	        "eslint-plugin-vue": "^7.15.1",
	        "prettier": "^2.3.2"
	    }
	}

settings.json

	{
	    "vetur.format.defaultFormatter.html": "none",
	    // Set the default
	    "editor.formatOnSave": false,
	    // Enable per-language
	    "[javascript]": {
	        "editor.formatOnSave": true
	    },
	    "[vue]": {
	        "editor.formatOnSave": true
	    }
	}
