Update interfaces.md with documentation for reranker functionality

Steve White 2025-02-28 08:06:55 -06:00
parent 0fe1d707f0
commit e748c345e2
1 changed file with 111 additions and 0 deletions

@@ -514,6 +514,117 @@ rate_limits = handler.get_rate_limit_info()
- **Description**: Gets information about the API's rate limits
- **Returns**: Dict[str, Any] - Dictionary with rate limit information
## Ranking Module
### JinaReranker Class
The `JinaReranker` class provides document reranking functionality using Jina AI's Reranker API.
#### Initialization
```python
reranker = JinaReranker(
api_key=None, # Optional, will use environment variable if not provided
model="jina-reranker-v2-base-multilingual", # Default model
endpoint="https://api.jina.ai/v1/rerank" # Default endpoint
)
```
- **Description**: Initializes the JinaReranker with the specified API key, model, and endpoint
- **Parameters**:
  - `api_key` (Optional[str]): Jina AI API key (defaults to the `JINA_API_KEY` environment variable)
  - `model` (str): The reranker model to use
  - `endpoint` (str): The API endpoint
- **Requirements**: The `JINA_API_KEY` environment variable must be set if `api_key` is not provided
- **Raises**: `ValueError` if no API key is available
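The key lookup described above can be sketched as follows. This is a minimal illustration, not the class's actual constructor: `resolve_api_key` and its `env` parameter are hypothetical, and only the `JINA_API_KEY` variable name and the `ValueError` come from this documentation.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the explicit key if given, otherwise fall back to JINA_API_KEY.

    Hypothetical helper for illustration. `env` defaults to os.environ and
    exists only so the fallback can be exercised without touching the real
    environment.
    """
    env = os.environ if env is None else env
    key = api_key or env.get("JINA_API_KEY")
    if not key:
        # Mirrors the documented behaviour: no key anywhere is an error.
        raise ValueError("No Jina API key: pass api_key or set JINA_API_KEY")
    return key
```

An explicit argument takes precedence over the environment in this sketch, which matches the "defaults to environment variable" wording but is otherwise an assumption.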
#### rerank
```python
reranked_docs = reranker.rerank(query, documents, top_n=None)
```
- **Description**: Reranks a list of documents based on their relevance to the query
- **Parameters**:
  - `query` (str): The query string
  - `documents` (List[str]): List of document strings to rerank
  - `top_n` (Optional[int]): Number of top documents to return (defaults to all)
- **Returns**: List[Dict[str, Any]] - List of reranked documents with scores
- **Example Return Format**:
```json
[
{
"index": 0,
"score": 0.95,
"document": "Document content here"
},
{
"index": 3,
"score": 0.82,
"document": "Another document content"
}
]
```
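Because `rerank` takes plain strings, any metadata attached to the original documents (URLs, titles) has to be rejoined afterwards. The sketch below shows one way to do that, assuming `index` is the position of the document in the list passed to `rerank`, as the example return format suggests. `attach_metadata` and the `hits` records are hypothetical and not part of the module.

```python
from typing import Any, Dict, List


def attach_metadata(
    reranked: List[Dict[str, Any]],
    originals: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Merge each reranked entry with the original record it points to.

    Assumes entry["index"] is the position of the document in the list
    that was passed to rerank().
    """
    merged = []
    for entry in reranked:
        record = dict(originals[entry["index"]])  # copy, do not mutate the input
        record["score"] = entry["score"]
        merged.append(record)
    return merged


# Hypothetical search hits; only the snippets would be sent to rerank().
hits = [
    {"url": "https://example.com/a", "snippet": "Document content here"},
    {"url": "https://example.com/b", "snippet": "Unrelated content"},
    {"url": "https://example.com/c", "snippet": "More unrelated content"},
    {"url": "https://example.com/d", "snippet": "Another document content"},
]

# Shaped like the example return format above.
reranked = [
    {"index": 0, "score": 0.95, "document": "Document content here"},
    {"index": 3, "score": 0.82, "document": "Another document content"},
]

for record in attach_metadata(reranked, hits):
    print(record["score"], record["url"])
```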
#### get_jina_reranker
```python
reranker = get_jina_reranker()
```
- **Description**: Factory function to get a JinaReranker instance with configuration from the config file
- **Returns**: JinaReranker - Initialized reranker instance
- **Raises**: `ValueError` if no API key is available
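How the factory reads its configuration is not specified here. As a rough sketch only, a factory of this kind might translate one section of the config file into constructor arguments, falling back to the defaults listed under Initialization. The helper name `reranker_kwargs` and the keys `api_key`, `model`, and `endpoint` are assumptions, not the project's actual config schema.

```python
from typing import Any, Dict, Mapping

# Defaults copied from the Initialization section above.
DEFAULT_MODEL = "jina-reranker-v2-base-multilingual"
DEFAULT_ENDPOINT = "https://api.jina.ai/v1/rerank"


def reranker_kwargs(section: Mapping[str, Any]) -> Dict[str, Any]:
    """Build JinaReranker keyword arguments from one config section.

    Hypothetical helper: the key names are assumed for illustration.
    """
    return {
        # None lets the constructor fall back to the environment variable.
        "api_key": section.get("api_key"),
        "model": section.get("model", DEFAULT_MODEL),
        "endpoint": section.get("endpoint", DEFAULT_ENDPOINT),
    }
```

Under these assumptions the result could be passed on as `JinaReranker(**reranker_kwargs(section))`.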
### Usage Examples
#### Basic Usage
```python
from ranking.jina_reranker import JinaReranker
reranker = JinaReranker()
query = "What is quantum computing?"
documents = [
"Quantum computing is a computation system that uses quantum mechanics.",
"Classical computers use bits while quantum computers use qubits.",
"Artificial intelligence is transforming various industries."
]
reranked = reranker.rerank(query, documents)
for doc in reranked:
print(f"Score: {doc['score']}, Document: {doc['document']}")
```
#### Integration with ResultCollector
```python
from execution.result_collector import ResultCollector
from ranking.jina_reranker import get_jina_reranker
# Initialize components
reranker = get_jina_reranker()
collector = ResultCollector(reranker=reranker)
# Process search results with reranking.
# `search_results` is assumed to hold the raw output of an earlier search execution step.
reranked_results = collector.process_results(
search_results,
dedup=True,
max_results=20,
use_reranker=True
)
```
#### Testing
```python
# Simple test script
import json
from ranking.jina_reranker import get_jina_reranker
reranker = get_jina_reranker()
query = "What is quantum computing?"
documents = [
"Quantum computing is a type of computation that harnesses quantum mechanics.",
"Classical computers use bits, while quantum computers use qubits.",
"Machine learning is a subset of artificial intelligence."
]
reranked = reranker.rerank(query, documents)
print(json.dumps(reranked, indent=2))
```
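The script above calls the live API and therefore needs a valid key. For offline tests, a stand-in that exposes the same `rerank(query, documents, top_n=None)` signature and return shape can be substituted. The sketch below is hypothetical: `FakeReranker` is not part of the module, and its word-overlap score is a placeholder that bears no relation to how the Jina model ranks documents.

```python
import re
from typing import Any, Dict, List, Optional


class FakeReranker:
    """Offline stand-in with the documented rerank() signature.

    Scores each document by the fraction of query words it contains.
    This is a test double, not an approximation of the real model.
    """

    def rerank(
        self,
        query: str,
        documents: List[str],
        top_n: Optional[int] = None,
    ) -> List[Dict[str, Any]]:
        query_terms = set(re.findall(r"\w+", query.lower()))
        results = []
        for index, document in enumerate(documents):
            doc_terms = set(re.findall(r"\w+", document.lower()))
            overlap = len(query_terms & doc_terms)
            score = overlap / len(query_terms) if query_terms else 0.0
            results.append({"index": index, "score": score, "document": document})
        # Highest score first; ties keep their original order (stable sort).
        results.sort(key=lambda r: r["score"], reverse=True)
        return results if top_n is None else results[:top_n]


docs = [
    "Quantum computing is a type of computation that harnesses quantum mechanics.",
    "Classical computers use bits, while quantum computers use qubits.",
    "Machine learning is a subset of artificial intelligence.",
]
top = FakeReranker().rerank("What is quantum computing?", docs, top_n=2)
print([entry["index"] for entry in top])
```

If `ResultCollector` only calls `rerank` on the object it is given, an instance of this class could be passed as `reranker=`; that is an assumption about `ResultCollector` that this document does not confirm.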
## Search Execution Testing
The search execution module has been tested to ensure it correctly executes search queries across multiple search engines and processes the results.