How to Use Synchronized Sitemap in Knowledge Base

The Synchronized Sitemap feature allows you to automatically import and maintain an entire sitemap in your Tiledesk Knowledge Base. Once configured, all URLs from the sitemap are continuously synchronized and updated based on the refresh rate you set, ensuring your knowledge base always reflects the latest content from your website.

This feature is ideal for maintaining up-to-date documentation, help centers, blogs, or any website content that changes regularly, without manual intervention.

Key Features

Automatic Synchronization

  • Continuous updates: URLs are automatically refreshed based on your configured refresh rate

  • New URL detection: New pages added to your sitemap are automatically imported to the Knowledge Base

  • Automatic cleanup: URLs removed from your sitemap are automatically deleted from the Knowledge Base

Unified Configuration

  • Inherited settings: All URLs inherit the same configuration as the parent sitemap

  • Consistent processing: HTML tags and RAG tags settings apply uniformly to all URLs

  • Centralized management: Configure once at the sitemap level, apply to all URLs

How It Works

  1. Initial Import: Tiledesk fetches all URLs from the sitemap and imports them into the Knowledge Base

  2. Continuous Monitoring: Based on the refresh rate, Tiledesk periodically checks the sitemap for changes

  3. Automatic Updates:

     ◦ Existing URLs are re-crawled and updated with fresh content

     ◦ New URLs are automatically added to the Knowledge Base

     ◦ Removed URLs are automatically deleted from the Knowledge Base

  4. Inheritance: All URLs maintain the same HTML tags, RAG tags, and refresh rate as the parent sitemap
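
The add, refresh, and delete decisions described above can be pictured as a set comparison between the URLs already in the Knowledge Base and the URLs currently listed in the sitemap. The following is a minimal sketch of that idea, not Tiledesk's actual implementation: the function names parse_sitemap_urls and plan_sync and the example.com URLs are hypothetical, and only the XML namespace comes from the public sitemaps.org protocol.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap_urls(sitemap_xml: str) -> set[str]:
    """Return the set of <loc> URLs listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def plan_sync(known_urls: set[str], sitemap_urls: set[str]) -> dict[str, set[str]]:
    """Compare URLs already imported (hypothetical state) with the current sitemap."""
    return {
        "add": sitemap_urls - known_urls,      # new pages to import
        "refresh": sitemap_urls & known_urls,  # existing pages to re-crawl
        "delete": known_urls - sitemap_urls,   # pages no longer in the sitemap
    }


# Illustrative sitemap content; the URLs are placeholders.
sitemap_xml = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/docs/intro</loc></url>
  <url><loc>https://example.com/docs/pricing</loc></url>
</urlset>"""

# URLs assumed to be in the Knowledge Base from a previous sync.
known_urls = {"https://example.com/docs/intro", "https://example.com/docs/old"}

plan = plan_sync(known_urls, parse_sitemap_urls(sitemap_xml))
for action, urls in plan.items():
    print(action, sorted(urls))
```

In this sketch, a page that appears in both sets is scheduled for a re-crawl, which mirrors the "existing URLs are re-crawled" step; a real scheduler would run this comparison once per refresh interval.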

Viewing Synced URLs

Once imported, you can:

  • View all URLs from the sitemap in the Knowledge Base list

  • See the last sync date and status for each URL

  • Check individual URL content and metadata

HTML Tags Configuration

Specify which HTML elements to include or exclude during content extraction. Each page is fully rendered, with its JavaScript executed, in a headless Chromium instance before text is extracted, so JavaScript-rendered content is captured correctly.

Extract Tags (Mandatory)

Define the HTML tags from which content will be extracted. <body> is included by default and covers the entire page body.

You can replace or extend it with more specific tags to narrow down the extracted content:

  • article — main article content

  • main — primary page content

  • div.content — a specific div with class "content"

Tip: Using more specific tags instead of <body> improves AI response quality by reducing noise from unrelated page sections.

Unwanted Tags

Define HTML tags that should be excluded from extraction, even if they fall within an Extract Tag:

Examples:

Unwanted Classnames

Exclude elements by their CSS class name, regardless of the tag type. Useful for removing recurring UI components like banners, sidebars, or cookie notices:

Examples:
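
To show how the three settings interact, here is a rough sketch that keeps only text inside an extract tag and drops anything nested in an unwanted tag or an element with an unwanted class name. It is an illustration under stated assumptions, not the extraction code Tiledesk runs: the class TagFilterExtractor, the sample page, and the chosen tag and class names (aside, cookie-banner) are hypothetical, and the sketch matches plain tag names only, not selectors such as div.content.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagFilterExtractor(HTMLParser):
    """Collect text inside extract tags, skipping unwanted tags and class names."""

    def __init__(self, extract_tags, unwanted_tags, unwanted_classnames):
        super().__init__()
        self.extract_tags = set(extract_tags)
        self.unwanted_tags = set(unwanted_tags)
        self.unwanted_classnames = set(unwanted_classnames)
        self.stack = []   # one (tag, is_extract, is_unwanted) entry per open element
        self.chunks = []  # extracted text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = set((dict(attrs).get("class") or "").split())
        is_unwanted = (tag in self.unwanted_tags
                       or bool(classes & self.unwanted_classnames))
        self.stack.append((tag, tag in self.extract_tags, is_unwanted))

    def handle_endtag(self, tag):
        # Close the nearest matching open element (and anything left open inside it).
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        inside_extract = any(entry[1] for entry in self.stack)
        inside_unwanted = any(entry[2] for entry in self.stack)
        if text and inside_extract and not inside_unwanted:
            self.chunks.append(text)


# Hypothetical page used only for illustration.
page_html = """
<body>
  <nav>Home | Docs</nav>
  <main>
    <h1>Refund policy</h1>
    <aside>Related links</aside>
    <div class="cookie-banner promo">We use cookies</div>
    <p>Refunds are issued within 14 days.</p>
  </main>
  <footer>Copyright</footer>
</body>
"""

extractor = TagFilterExtractor(
    extract_tags=["main"],
    unwanted_tags=["aside"],
    unwanted_classnames=["cookie-banner"],
)
extractor.feed(page_html)
extracted_text = " ".join(extractor.chunks)
print(extracted_text)
```

Here the nav and footer text is dropped because it sits outside the extract tag, while the aside and the cookie banner are dropped even though they sit inside it, which is the behavior the Unwanted Tags and Unwanted Classnames settings describe.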

RAG Tags Configuration

Add metadata tags to improve AI retrieval and relevance.

What are RAG tags?

RAG tags are labels you can assign to one or more contents in your Knowledge Base to filter which content the AI uses when answering questions.

When a user asks a question, the AI searches only among contents that match the specified tag, ignoring all others.

Example: Suppose you have a Knowledge Base where:

  • Some contents are imported manually and tagged as approved

  • Other contents are generated automatically (e.g. via self-learning) and have no tag

When you ask a question using the tag approved, the AI retrieves answers only from contents tagged approved; untagged contents are ignored entirely.

This is especially useful when you want to:

  • Separate verified content from automatically generated or draft content

  • Serve different audiences with different subsets of your knowledge base

  • Ensure the AI Agent only responds using trusted or curated sources
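
The filtering behavior can be sketched as a pre-filter applied before any relevance ranking. The example below is a simplified illustration, assuming a hypothetical Content record and a naive keyword-overlap score in place of real semantic retrieval; none of these names come from the Tiledesk API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Content:
    """Hypothetical Knowledge Base entry with optional RAG tags."""
    title: str
    text: str
    tags: set[str] = field(default_factory=set)


def retrieve(contents: list[Content], question: str, tag: str) -> list[Content]:
    """Return tagged contents that share words with the question, best match first."""
    # Step 1: keep only contents carrying the requested tag.
    candidates = [c for c in contents if tag in c.tags]
    # Step 2: rank the remaining contents with a naive keyword-overlap score
    # (a stand-in for the semantic search a real RAG pipeline would use).
    words = set(question.lower().split())
    scored = [(len(words & set(c.text.lower().split())), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored if score > 0]


contents = [
    Content("Refund policy", "refunds are issued within 14 days", {"approved"}),
    Content("Auto-generated answer", "refunds might take 30 days"),  # no tag
]

results = retrieve(contents, "how long do refunds take", tag="approved")
print([c.title for c in results])
```

Because the untagged entry is removed in the first step, it can never be returned, no matter how well its text matches the question.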

Tag Examples

  • approved

  • product-documentation

  • pricing-information

  • technical-support

  • getting-started
