Update interfaces.md with documentation for reranker functionality
parent 0fe1d707f0
commit e748c345e2
@@ -514,6 +514,117 @@ rate_limits = handler.get_rate_limit_info()
- **Description**: Gets information about the API's rate limits
- **Returns**: Dict[str, Any] - Dictionary with rate limit information

## Ranking Module

### JinaReranker Class

The `JinaReranker` class provides document reranking functionality using Jina AI's Reranker API.

#### Initialization
```python
reranker = JinaReranker(
    api_key=None,  # Optional, will use environment variable if not provided
    model="jina-reranker-v2-base-multilingual",  # Default model
    endpoint="https://api.jina.ai/v1/rerank"  # Default endpoint
)
```
- **Description**: Initializes the JinaReranker with the specified API key, model, and endpoint
- **Parameters**:
  - `api_key` (Optional[str]): Jina AI API key (defaults to environment variable)
  - `model` (str): The reranker model to use
  - `endpoint` (str): The API endpoint
- **Requirements**: `JINA_API_KEY` environment variable must be set if `api_key` is not provided
- **Raises**: `ValueError` if API key is not available
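
The key lookup described above can be sketched as follows. This is a minimal illustration, not the class's actual code: `resolve_api_key` and its `env` parameter are hypothetical names introduced here, and only the `JINA_API_KEY` variable and the `ValueError` behavior come from the documentation.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(api_key: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Return the explicit key if given, else fall back to JINA_API_KEY.

    Hypothetical helper: `env` exists only so the lookup can be tested
    without touching the real process environment.
    """
    env = os.environ if env is None else env
    key = api_key or env.get("JINA_API_KEY")
    if not key:
        # Mirrors the documented behavior: no key available -> ValueError
        raise ValueError(
            "No Jina API key: pass api_key or set the JINA_API_KEY environment variable"
        )
    return key
```

An explicit `api_key` takes precedence in this sketch, so a caller can override the environment for a single instance.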

#### rerank

```python
reranked_docs = reranker.rerank(query, documents, top_n=None)
```
- **Description**: Reranks a list of documents based on their relevance to the query
- **Parameters**:
  - `query` (str): The query string
  - `documents` (List[str]): List of document strings to rerank
  - `top_n` (Optional[int]): Number of top documents to return (defaults to all)
- **Returns**: List[Dict[str, Any]] - List of reranked documents with scores
- **Example Return Format**:

```json
[
  {
    "index": 0,
    "score": 0.95,
    "document": "Document content here"
  },
  {
    "index": 3,
    "score": 0.82,
    "document": "Another document content"
  }
]
```
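
To show how a raw API response might be turned into the return format above, here is a hedged sketch. The helper names `build_payload` and `normalize_results` are hypothetical, and the request and response field names (`top_n`, `results`, `relevance_score`) are assumptions about the Reranker API rather than something this document confirms. It also assumes that `index` refers to the document's position in the input list.

```python
from typing import Any, Dict, List, Optional


def build_payload(model: str, query: str, documents: List[str],
                  top_n: Optional[int] = None) -> Dict[str, Any]:
    """Assemble a JSON body for a rerank request (field names are assumed)."""
    return {
        "model": model,
        "query": query,
        "documents": documents,
        # top_n=None means "return every document", as documented
        "top_n": len(documents) if top_n is None else top_n,
    }


def normalize_results(response: Dict[str, Any],
                      documents: List[str]) -> List[Dict[str, Any]]:
    """Convert an assumed raw response into the documented return format."""
    normalized = []
    for item in response.get("results", []):
        index = item["index"]
        normalized.append({
            "index": index,
            "score": item["relevance_score"],
            "document": documents[index],
        })
    # Highest score first
    normalized.sort(key=lambda r: r["score"], reverse=True)
    return normalized


docs = ["Document content here", "b", "c", "Another document content"]
sample_response = {
    "results": [
        {"index": 3, "relevance_score": 0.82},
        {"index": 0, "relevance_score": 0.95},
    ]
}
reranked_sample = normalize_results(sample_response, docs)
```

Looking the text up by `index` rather than trusting a document field in the response keeps the output independent of whether the API echoes the documents back.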

#### get_jina_reranker

```python
reranker = get_jina_reranker()
```
- **Description**: Factory function to get a JinaReranker instance with configuration from the config file
- **Returns**: JinaReranker - Initialized reranker instance
- **Raises**: `ValueError` if API key is not available

### Usage Examples

#### Basic Usage

```python
from ranking.jina_reranker import JinaReranker

reranker = JinaReranker()
query = "What is quantum computing?"
documents = [
    "Quantum computing is a computation system that uses quantum mechanics.",
    "Classical computers use bits while quantum computers use qubits.",
    "Artificial intelligence is transforming various industries."
]

reranked = reranker.rerank(query, documents)
for doc in reranked:
    print(f"Score: {doc['score']}, Document: {doc['document']}")
```

#### Integration with ResultCollector

```python
from execution.result_collector import ResultCollector
from ranking.jina_reranker import get_jina_reranker

# Initialize components
reranker = get_jina_reranker()
collector = ResultCollector(reranker=reranker)

# Process search results with reranking
# (search_results is assumed to hold the raw results from an earlier search step)
reranked_results = collector.process_results(
    search_results,
    dedup=True,
    max_results=20,
    use_reranker=True
)
```
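
How `ResultCollector` applies the reranker internally is not shown here. As a rough illustration of the idea, the sketch below reorders search result records using the `index` and `score` fields returned by `rerank`. The function `apply_reranking`, the `relevance_score` key, and the shape of the result dictionaries are all hypothetical, and the sketch assumes the documents were passed to `rerank` in the same order as `search_results`.

```python
from typing import Any, Dict, List


def apply_reranking(search_results: List[Dict[str, Any]],
                    reranked: List[Dict[str, Any]],
                    max_results: int = 20) -> List[Dict[str, Any]]:
    """Reorder search results using the `index` and `score` fields from rerank()."""
    ordered = []
    for entry in reranked[:max_results]:
        # Copy so the original result dict is not mutated
        result = dict(search_results[entry["index"]])
        result["relevance_score"] = entry["score"]
        ordered.append(result)
    return ordered


# Hypothetical result records; real ones may carry different fields
sample_results = [
    {"title": "A", "snippet": "Artificial intelligence overview"},
    {"title": "B", "snippet": "Quantum computing basics"},
]
sample_reranked = [
    {"index": 1, "score": 0.9, "document": "Quantum computing basics"},
    {"index": 0, "score": 0.2, "document": "Artificial intelligence overview"},
]
ordered_results = apply_reranking(sample_results, sample_reranked, max_results=1)
```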

#### Testing

```python
# Simple test script
import json
from ranking.jina_reranker import get_jina_reranker

reranker = get_jina_reranker()
query = "What is quantum computing?"
documents = [
    "Quantum computing is a type of computation that harnesses quantum mechanics.",
    "Classical computers use bits, while quantum computers use qubits.",
    "Machine learning is a subset of artificial intelligence."
]

reranked = reranker.rerank(query, documents)
print(json.dumps(reranked, indent=2))
```
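
The script above prints the output for manual inspection. A small structural check like the following could be run on the result as well. It is only a sketch: `check_rerank_output` is a hypothetical helper, and it assumes that `index` is the position in the input list and that results arrive sorted by descending score, neither of which this document states explicitly.

```python
from typing import Any, Dict, List


def check_rerank_output(reranked: List[Dict[str, Any]], documents: List[str]) -> None:
    """Raise AssertionError if `reranked` does not match the documented format."""
    seen = set()
    for item in reranked:
        assert set(item) >= {"index", "score", "document"}, f"missing keys in {item}"
        assert 0 <= item["index"] < len(documents), f"index out of range: {item['index']}"
        assert item["index"] not in seen, f"duplicate index: {item['index']}"
        assert item["document"] == documents[item["index"]], "document does not match its index"
        seen.add(item["index"])
    scores = [item["score"] for item in reranked]
    # Assumption: the most relevant document comes first
    assert scores == sorted(scores, reverse=True), "scores are not in descending order"


# Offline example with hand-written data instead of a live API call
example_docs = ["doc a", "doc b", "doc c"]
example_output = [
    {"index": 2, "score": 0.91, "document": "doc c"},
    {"index": 0, "score": 0.40, "document": "doc a"},
]
check_rerank_output(example_output, example_docs)
```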

## Search Execution Testing

The search execution module has been tested to ensure it correctly executes search queries across multiple search engines and processes the results.