Elasticsearch

Overview

Elasticsearch is the keyword search engine integrated into our RAG (Retrieval-Augmented Generation) system, which is designed for Financial Services Industry (FSI) clients. It uses the industry-standard BM25 (Best Matching 25) scoring algorithm to rank keyword-based search results across your financial documents and content.

When you enable the COMBINED search type in the InternalSearch tool of the Unique AI assistant, our platform intelligently combines:

  • Qdrant for vector-based semantic search

    • excels at concept queries such as "What are our policies on managing customer credit risk?"

  • Elasticsearch for keyword-based search using BM25

    • excels at keyword queries such as "Basel III Tier 1 capital ratio"

This hybrid approach ensures you get the best of both worlds: semantic understanding through vectors and precise keyword matching through Elasticsearch's advanced text analysis.
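
The page does not describe how the two result lists are merged. As a minimal sketch of one common way to do it, the example below uses reciprocal rank fusion (RRF). Treat this as an assumption for illustration only: the function name, the constant k=60, and the document IDs are hypothetical and are not taken from the platform.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both engines rise to the top.
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically by ID.
    return sorted(fused, key=lambda d: (-fused[d], d))


# Hypothetical result lists: one from vector search, one from keyword search.
vector_hits = ["policy-credit-risk", "basel-summary", "risk-appetite"]
keyword_hits = ["basel-summary", "capital-ratio-report", "policy-credit-risk"]

combined = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(combined)
```

Documents that appear in both lists ("basel-summary" and "policy-credit-risk") come out ahead of documents found by only one engine, which is the behavior a hybrid search aims for.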

note

This service is currently in BETA. We may continue to refine the indexing and retrieval methods as we improve the system. If you encounter any issues while using the service, we’d appreciate your feedback.

Who it’s for

  • Admins who configure Spaces to optimize the experience for AI chat users relying on internal document searches, particularly when their queries contain specific keywords or technical terms

Can this feature be enabled on non-Azure or self-hosted tenants?


Benefits

Elasticsearch provides superior relevance scoring for keyword-based queries and improves overall search performance. Our platform previously used PostgreSQL's built-in full-text search with n-gram-based similarity matching (the pg_trgm extension). While functional, this approach had several limitations for FSI requirements.

Key benefits of Elasticsearch over PostgreSQL FTS:

Superior Relevance Scoring

  • BM25 Algorithm: Industry-standard relevance scoring vs. basic term frequency

  • Document Length Normalization: Better handling of varying document sizes common in financial documents

  • Term Frequency Saturation: Prevents over-weighting of frequently repeated terms
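
To make the three points above concrete, here is a minimal sketch of the textbook BM25 formula in plain Python. It is an illustration, not the platform's or Elasticsearch's actual implementation: the function name and sample documents are hypothetical, tokenization is a simple whitespace split, and k1=1.2 and b=0.75 are the commonly used parameter values.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = k1 * (1 - b + b * len(d) / avg_len)
            # Saturation: this ratio can never exceed (k1 + 1), however
            # often the term is repeated.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
        scores.append(score)
    return scores


docs = [
    "tier 1 capital ratio under basel iii".split(),
    "capital capital capital capital capital capital".split(),
    "customer onboarding checklist".split(),
]
scores = bm25_scores("basel capital".split(), docs)
print(scores)
```

The second document repeats "capital" six times yet still scores below the first, which matches both query terms once each. That is term frequency saturation at work; the third document matches nothing and scores zero.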

Enhanced Performance

  • Dedicated Search Engine: Purpose-built for search vs. general database operations

  • Advanced Indexing: Optimized inverted indices vs. simple GIN indices

  • Horizontal Scaling: Can scale independently from your database
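
For readers unfamiliar with the term, an inverted index maps each term to the documents that contain it, so a keyword lookup does not need to scan every document. The sketch below is a toy illustration with hypothetical function names and sample data; production search engines additionally store term positions and frequencies and compress the postings lists.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search_all_terms(index, query):
    """Return the IDs of documents that contain every term in the query."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    return set.intersection(*postings) if postings else set()


docs = {
    1: "Basel III capital requirements",
    2: "GDPR Article 17",
    3: "Basel III Tier 1 capital ratio",
}
index = build_inverted_index(docs)
print(search_all_terms(index, "basel capital"))
```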

Example queries

Regulatory Compliance

  • Regulation References: "Section 225 of Dodd-Frank", "Basel III capital requirements"

  • Compliance Codes: "CCAR stress testing", "GDPR Article 17", "SOX Section 404"

  • Policy Numbers: "Policy AML-2023-001", "Procedure RISK-001-2024"

Investment Research

  • Financial Instruments: "10-year Treasury bonds", "S&P 500 futures", "EUR/USD options"

  • Financial Metrics: "price-to-earnings ratio", "debt-to-equity", "return on equity"

  • Market Data: "Q3 2024 earnings", "dividend yield 3.5%", "beta coefficient"

Step-by-Step Guide

  • Open the Unique AI settings by clicking on the edit icon

  • Verify that the “InternalSearch” tool has “searchType” set to “COMBINED”

Limitations

note

This service is currently in BETA. We may continue to refine the indexing and retrieval methods as we improve the system. If you encounter any issues while using the service, we’d appreciate your feedback.

