Re-ranking

Re-ranking in Tiledesk: How It Works and Why It Matters

What is re-ranking (easy explanation)

Re-ranking is a second, smarter selection step applied after an initial search (e.g. vector search, keyword search, hybrid search).

  1. First step – Retrieval: The system retrieves a set of candidate results that are probably relevant to the user’s question (e.g. the top 20 chunks from a vector database).

  2. Second step – Re-ranking: A more precise model evaluates each candidate in the context of the actual user query and:

    • assigns a relevance score

    • sorts results from most to least relevant

    • optionally discards low-quality matches

In short:

Retrieval finds “possible answers” → Re-ranking finds the “best answers.”
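The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not Tiledesk's implementation: the function names (`retrieve`, `rerank`, `coverage_score`) are hypothetical, and both scoring functions are simple word-overlap stand-ins for what would normally be a vector search and a dedicated re-ranking model.

```python
from typing import Callable, List, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercase a text and split it into a set of words."""
    return set(text.lower().split())


def retrieve(query: str, corpus: List[str], top_n: int) -> List[str]:
    """Step 1: cheap first pass. Rank by shared word count, keep top_n.

    Stand-in for a vector, keyword, or hybrid search.
    """
    q = tokens(query)
    ranked = sorted(corpus, key=lambda doc: len(q & tokens(doc)), reverse=True)
    return ranked[:top_n]


def coverage_score(query: str, doc: str) -> float:
    """Toy stand-in for a re-ranking model: fraction of query words found in doc."""
    q = tokens(query)
    return len(q & tokens(doc)) / len(q) if q else 0.0


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    min_score: float = 0.0,
) -> List[Tuple[str, float]]:
    """Step 2: score every candidate against the query, drop weak ones, sort."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    kept = [(doc, score) for doc, score in scored if score >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Calling `rerank(query, retrieve(query, corpus, top_n=20), coverage_score, min_score=0.5)` runs both steps in order. In a real pipeline only `score_fn` would change, which is why it is passed in as a parameter here.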

Why re-ranking is needed

Vector similarity alone is powerful, but it has limitations:

  • It may retrieve semantically similar but contextually wrong chunks

  • It treats all candidates as equally good once retrieved

  • It cannot always understand intent, constraints, or priority

Re-ranking solves this by deeply comparing the user question with each candidate, instead of comparing embeddings only.

How re-ranking works

What the re-ranking model evaluates

In Tiledesk, re-ranking is used in RAG pipelines to improve answer quality across:

  • Customer support assistants

  • Internal knowledge bases

  • Enterprise document search

  • Multi-agent workflows

Tiledesk allows re-ranking to be applied:

  • Automatically in RAG flows

  • As a configurable step in agent pipelines

  • In on-premise, hybrid, or cloud deployments

Intuitive use case

Scenario

A company uses Tiledesk to power a support assistant with:

  • Product manuals

  • Internal procedures

  • Troubleshooting guides

User question:

“How can I reset my device if it’s stuck during firmware update?”


Without Re-ranking

Vector search retrieves chunks like:

  1. Firmware update overview

  2. Reset device after factory test

  3. Device troubleshooting – connection issues

  4. Firmware version history

  5. Reset device procedure (correct)

The LLM sees mixed context and may:

  • Answer partially

  • Mention irrelevant steps

  • Hallucinate missing details


With Re-ranking Enabled

The re-ranking model analyzes each chunk against the exact question and produces:

  1. Reset device procedure (firmware recovery mode) ⭐⭐⭐⭐⭐

  2. Troubleshooting – firmware stuck scenarios ⭐⭐⭐⭐

  3. Firmware update overview ⭐⭐

  4. Reset after factory test ⭐

  5. Version history ⭐

Only the top, most relevant content is passed to the LLM.
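As a rough sketch of that last step, the snippet below selects the best chunks from the scenario and assembles the text handed to the LLM. The star ratings are read as numeric scores from 1 to 5 purely for illustration, and `select_context` and `build_prompt` are hypothetical helper names, not part of any Tiledesk API.

```python
from typing import List, Tuple

# Re-ranked chunks from the scenario above, with stars read as scores (assumption).
RANKED: List[Tuple[str, float]] = [
    ("Reset device procedure (firmware recovery mode)", 5),
    ("Troubleshooting – firmware stuck scenarios", 4),
    ("Firmware update overview", 2),
    ("Reset after factory test", 1),
    ("Version history", 1),
]


def select_context(
    ranked: List[Tuple[str, float]], top_k: int, min_score: float
) -> List[str]:
    """Keep at most top_k chunks whose score is at least min_score, best first."""
    ordered = sorted(ranked, key=lambda pair: pair[1], reverse=True)
    return [chunk for chunk, score in ordered if score >= min_score][:top_k]


def build_prompt(question: str, chunks: List[str]) -> str:
    """Join the selected chunks into a context block followed by the question."""
    context = "\n\n".join(chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"
```

With `top_k=3` and `min_score=3`, only the two firmware-recovery chunks survive, so the low-scoring overview and version-history chunks never reach the model.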

Result for the User

  • More precise answer

  • Correct steps on the first try

  • Less confusion

  • Faster resolution

Benefits for users and organizations

1. Higher Answer Accuracy

Re-ranking significantly reduces:

  • Irrelevant context

  • Partial answers

  • Hallucinations

2. Better Use of Existing Knowledge

Even large or noisy knowledge bases become:

  • More reliable

  • Easier to maintain

  • More scalable

3. Improved User Trust

Users notice when:

  • Answers are consistent

  • Instructions are correct

  • The assistant “understands” intent

This leads to:

  • Higher adoption

  • Lower fallback to human support

4. Cost and Performance Optimization

By sending only the best chunks to the LLM:

  • Fewer tokens are used

  • Responses are faster

  • Costs are reduced
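To get a feel for the token savings, the sketch below compares the context size before and after keeping only the top chunks. The four-characters-per-token rule is a rough heuristic for English text and an assumption here; actual counts depend on the tokenizer of the LLM in use. The function names are hypothetical.

```python
from typing import List, Tuple


def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 4 characters per token (heuristic assumption)."""
    return max(1, len(text) // 4)


def context_savings(chunks: List[str], top_k: int) -> Tuple[int, int, float]:
    """Return (tokens_before, tokens_after, fraction_saved).

    Assumes chunks are already sorted from most to least relevant,
    so the first top_k entries are the ones that would be kept.
    """
    before = sum(estimate_tokens(chunk) for chunk in chunks)
    after = sum(estimate_tokens(chunk) for chunk in chunks[:top_k])
    saved = 1 - after / before if before else 0.0
    return before, after, saved
```

For five chunks of similar length, keeping the top two removes about 60% of the context tokens under this estimate.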

5. Enterprise-Grade Control

In Tiledesk, re-ranking supports:

  • On-premise and GDPR-compliant deployments

  • Integration with custom retrieval logic

How to enable Re-ranking?

Go to the Knowledge Bases section and press the + New Knowledge Base button, then choose the "Hybrid search" option.

Once the Knowledge Base is created, you can create an AI Agent and connect it directly to the Knowledge Base.

In the AI Agent flow, you can also enable or disable re-ranking for a specific Ask Knowledge Base Action.

When should you enable re-ranking?

Re-ranking is especially valuable when:

  • Knowledge bases are large (hundreds or thousands of documents)

  • Documents are similar to each other

  • Precision matters (legal, technical, industrial domains)

  • Users ask complex or multi-constraint questions

Re-ranking uses GPUs

In Tiledesk, re-ranking is designed for enterprise-grade, real-time precision, which makes it suitable primarily for on-prem GPU installations.

Re-ranking relies on cross-encoder models that must score many (query, chunk) pairs in parallel, a workload that is computationally intensive and latency-sensitive. Running this step on CPUs introduces unpredictable delays.
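The workload described above can be sketched as follows: every candidate chunk is paired with the query, and the pairs are scored in batches. This is an illustrative outline only. `score_batch` is a hypothetical callable standing in for the model; in practice it could wrap a cross-encoder, for example the `predict` method of the sentence-transformers `CrossEncoder` class, which accepts a list of text pairs.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Pair = Tuple[str, str]  # (query, chunk)


def make_pairs(query: str, chunks: Sequence[str]) -> List[Pair]:
    """Build one (query, chunk) pair per candidate chunk."""
    return [(query, chunk) for chunk in chunks]


def batched(items: Sequence[Pair], batch_size: int) -> Iterator[Sequence[Pair]]:
    """Yield consecutive slices of at most batch_size pairs."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def score_pairs(
    query: str,
    chunks: Sequence[str],
    score_batch: Callable[[Sequence[Pair]], List[float]],
    batch_size: int = 8,
) -> List[float]:
    """Score all pairs batch by batch; scores come back in chunk order."""
    scores: List[float] = []
    for batch in batched(make_pairs(query, chunks), batch_size):
        scores.extend(score_batch(batch))
    return scores
```

Because the model sees the query and the chunk together, nothing can be precomputed per chunk the way embeddings can: 20 candidates mean 20 model evaluations per user question, which is where batching on a GPU helps.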

An on-prem GPU allows Tiledesk to execute re-ranking locally, with stable low latency, full data sovereignty, and predictable performance under load, making it the only deployment model that consistently meets enterprise SLAs and compliance requirements.

In our SaaS deployment, we make extensive use of GPUs for hybrid search and re-ranking.
