Update interfaces.md with documentation for reranker functionality

Steve White 2025-02-28 08:06:55 -06:00
parent 0fe1d707f0
commit e748c345e2
1 changed files with 111 additions and 0 deletions


@@ -514,6 +514,117 @@ rate_limits = handler.get_rate_limit_info()
- **Description**: Gets information about the API's rate limits
- **Returns**: Dict[str, Any] - Dictionary with rate limit information
## Ranking Module
### JinaReranker Class
The `JinaReranker` class provides document reranking functionality using Jina AI's Reranker API.
#### Initialization
```python
reranker = JinaReranker(
    api_key=None,  # Optional, will use environment variable if not provided
    model="jina-reranker-v2-base-multilingual",  # Default model
    endpoint="https://api.jina.ai/v1/rerank"  # Default endpoint
)
```
- **Description**: Initializes the JinaReranker with the specified API key, model, and endpoint
- **Parameters**:
  - `api_key` (Optional[str]): Jina AI API key (defaults to environment variable)
  - `model` (str): The reranker model to use
  - `endpoint` (str): The API endpoint
- **Requirements**: The `JINA_API_KEY` environment variable must be set if `api_key` is not provided
- **Raises**: `ValueError` if no API key is available
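
The key lookup described above can be sketched as follows. This is a minimal illustration, not the class's actual code: `resolve_api_key` is a hypothetical helper name, and the precedence of an explicit argument over the environment variable is an assumption drawn from the parameter descriptions.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Hypothetical helper: pick the API key the way the docs describe.

    An explicit argument wins; otherwise JINA_API_KEY is read from the
    environment. A missing or empty key raises ValueError.
    """
    key = api_key or os.environ.get("JINA_API_KEY")
    if not key:
        raise ValueError(
            "No Jina API key: pass api_key or set the JINA_API_KEY environment variable"
        )
    return key
```

Failing at construction time, rather than on the first `rerank` call, would surface a missing key before any network request is attempted.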
#### rerank
```python
reranked_docs = reranker.rerank(query, documents, top_n=None)
```
- **Description**: Reranks a list of documents based on their relevance to the query
- **Parameters**:
  - `query` (str): The query string
  - `documents` (List[str]): List of document strings to rerank
  - `top_n` (Optional[int]): Number of top documents to return (defaults to all)
- **Returns**: List[Dict[str, Any]] - List of reranked documents with scores
- **Example Return Format**:
```json
[
  {
    "index": 0,
    "score": 0.95,
    "document": "Document content here"
  },
  {
    "index": 3,
    "score": 0.82,
    "document": "Another document content"
  }
]
```
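
As a hedged illustration of consuming this format, the sketch below keeps only entries at or above a score threshold and looks each one up in the original list. It assumes that `index` is the document's position in the input `documents` list, which the example suggests but does not state. `select_relevant` and `min_score` are hypothetical names, not part of the module.

```python
from typing import Any, Dict, List


def select_relevant(
    reranked: List[Dict[str, Any]],
    documents: List[str],
    min_score: float = 0.5,
) -> List[str]:
    """Return original documents whose rerank score is at least min_score.

    Assumes each entry's "index" is a position in `documents`.
    """
    kept = [entry for entry in reranked if entry["score"] >= min_score]
    # Sort defensively in case the input is not already ordered by score.
    kept.sort(key=lambda entry: entry["score"], reverse=True)
    return [documents[entry["index"]] for entry in kept]


# Sample data shaped like the example return format.
documents = ["doc A", "doc B", "doc C", "doc D"]
reranked = [
    {"index": 0, "score": 0.95, "document": "doc A"},
    {"index": 3, "score": 0.82, "document": "doc D"},
    {"index": 1, "score": 0.10, "document": "doc B"},
]
print(select_relevant(reranked, documents, min_score=0.5))
```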
#### get_jina_reranker
```python
reranker = get_jina_reranker()
```
- **Description**: Factory function to get a JinaReranker instance with configuration from the config file
- **Returns**: JinaReranker - Initialized reranker instance
- **Raises**: ValueError if API key is not available
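
One way such a factory could assemble its settings is sketched below. The config layout is not documented here, so the `jina_reranker` section name, the `build_settings` function, and the `RerankerSettings` stand-in are all hypothetical. Only the default model, the default endpoint, and the `ValueError` behavior come from the documentation above.

```python
import os
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class RerankerSettings:
    """Stand-in for the values a JinaReranker would be constructed with."""

    api_key: str
    model: str
    endpoint: str


def build_settings(config: Dict[str, Any]) -> RerankerSettings:
    """Merge config values with the documented defaults (hypothetical layout)."""
    section = config.get("jina_reranker", {})  # hypothetical section name
    api_key: Optional[str] = section.get("api_key") or os.environ.get("JINA_API_KEY")
    if not api_key:
        raise ValueError("Jina API key is not available")
    return RerankerSettings(
        api_key=api_key,
        model=section.get("model", "jina-reranker-v2-base-multilingual"),
        endpoint=section.get("endpoint", "https://api.jina.ai/v1/rerank"),
    )
```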
### Usage Examples
#### Basic Usage
```python
from ranking.jina_reranker import JinaReranker
reranker = JinaReranker()
query = "What is quantum computing?"
documents = [
    "Quantum computing is a computation system that uses quantum mechanics.",
    "Classical computers use bits while quantum computers use qubits.",
    "Artificial intelligence is transforming various industries."
]
reranked = reranker.rerank(query, documents)
for doc in reranked:
    print(f"Score: {doc['score']}, Document: {doc['document']}")
```
#### Integration with ResultCollector
```python
from execution.result_collector import ResultCollector
from ranking.jina_reranker import get_jina_reranker
# Initialize components
reranker = get_jina_reranker()
collector = ResultCollector(reranker=reranker)
# Process search results with reranking
# (search_results is assumed to hold results gathered earlier by the search executor)
reranked_results = collector.process_results(
    search_results,
    dedup=True,
    max_results=20,
    use_reranker=True
)
```
#### Testing
```python
# Simple test script
import json
from ranking.jina_reranker import get_jina_reranker
reranker = get_jina_reranker()
query = "What is quantum computing?"
documents = [
    "Quantum computing is a type of computation that harnesses quantum mechanics.",
    "Classical computers use bits, while quantum computers use qubits.",
    "Machine learning is a subset of artificial intelligence."
]
reranked = reranker.rerank(query, documents)
print(json.dumps(reranked, indent=2))
```
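
The script above calls the live API. For checks that run offline, the shape of the output can be validated against the documented return format. The sketch below is one possible approach: `check_rerank_output` is a hypothetical helper, and the index range check assumes `index` refers to a position in the input list.

```python
from typing import Any, Dict, List


def check_rerank_output(results: List[Dict[str, Any]], num_documents: int) -> None:
    """Raise AssertionError if results do not match the documented shape."""
    assert len(results) <= num_documents, "more results than input documents"
    for entry in results:
        assert set(entry) >= {"index", "score", "document"}, "missing keys"
        assert 0 <= entry["index"] < num_documents, "index out of range"
        assert isinstance(entry["score"], (int, float)), "score is not numeric"
        assert isinstance(entry["document"], str), "document is not a string"


# Hand-written sample mirroring the example return format; no API call is made.
sample = [
    {"index": 0, "score": 0.95, "document": "Document content here"},
    {"index": 3, "score": 0.82, "document": "Another document content"},
]
check_rerank_output(sample, num_documents=4)
```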
## Search Execution Testing
The search execution module has been tested to ensure it correctly executes search queries across multiple search engines and processes the results.