Update interfaces.md with documentation for reranker functionality
parent 0fe1d707f0
commit e748c345e2
@@ -514,6 +514,117 @@ rate_limits = handler.get_rate_limit_info()
- **Description**: Gets information about the API's rate limits
- **Returns**: Dict[str, Any] - Dictionary with rate limit information

## Ranking Module

### JinaReranker Class

The `JinaReranker` class provides document reranking functionality using Jina AI's Reranker API.

#### Initialization
```python
reranker = JinaReranker(
    api_key=None,  # Optional, will use environment variable if not provided
    model="jina-reranker-v2-base-multilingual",  # Default model
    endpoint="https://api.jina.ai/v1/rerank"  # Default endpoint
)
```
- **Description**: Initializes the `JinaReranker` with the specified API key, model, and endpoint
- **Parameters**:
  - `api_key` (Optional[str]): Jina AI API key (defaults to the `JINA_API_KEY` environment variable)
  - `model` (str): The reranker model to use
  - `endpoint` (str): The API endpoint
- **Requirements**: The `JINA_API_KEY` environment variable must be set if `api_key` is not provided
- **Raises**: `ValueError` if no API key is available

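
The key lookup described above can be sketched as follows. This is a minimal illustration, not the actual constructor: the helper name `resolve_api_key` is hypothetical, and only the `JINA_API_KEY` variable and the `ValueError` behavior come from the documentation.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Return the explicit key if given, otherwise fall back to JINA_API_KEY.

    Hypothetical helper: the real class may resolve the key differently.
    """
    key = api_key or os.environ.get("JINA_API_KEY")
    if not key:
        # Mirrors the documented behavior: no key available means ValueError.
        raise ValueError("No Jina API key: pass api_key or set JINA_API_KEY")
    return key
```

An explicitly passed key takes precedence here, so a test can inject a dummy key without touching the environment.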
#### rerank
```python
reranked_docs = reranker.rerank(query, documents, top_n=None)
```
- **Description**: Reranks a list of documents based on their relevance to the query
- **Parameters**:
  - `query` (str): The query string
  - `documents` (List[str]): List of document strings to rerank
  - `top_n` (Optional[int]): Number of top documents to return (defaults to all)
- **Returns**: List[Dict[str, Any]] - List of reranked documents with scores
- **Example Return Format**:
  ```json
  [
    {
      "index": 0,
      "score": 0.95,
      "document": "Document content here"
    },
    {
      "index": 3,
      "score": 0.82,
      "document": "Another document content"
    }
  ]
  ```

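
To make the return format concrete, here is a sketch of how a request body might be built and how an API response might be mapped onto the documented `index`/`score`/`document` keys. The function names are hypothetical, and the response fields `results`, `relevance_score`, and `document.text` are assumptions about the API's response shape rather than something this document specifies.

```python
from typing import Any, Dict, List, Optional


def build_rerank_payload(model: str, query: str, documents: List[str],
                         top_n: Optional[int] = None) -> Dict[str, Any]:
    """Build the JSON body; top_n=None is treated as 'return every document'."""
    return {
        "model": model,
        "query": query,
        "documents": documents,
        "top_n": top_n if top_n is not None else len(documents),
    }


def to_reranked_docs(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Convert an API-style response into the documented return format."""
    reranked = []
    for item in response.get("results", []):
        doc = item.get("document")
        # Assumption: the document may arrive as {"text": ...} or as a plain string.
        text = doc.get("text") if isinstance(doc, dict) else doc
        reranked.append({
            "index": item["index"],             # position in the input list
            "score": item["relevance_score"],   # assumed response field name
            "document": text,
        })
    return reranked
```

Keeping the mapping in a pure function like this makes it testable without a network call.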
#### get_jina_reranker
```python
reranker = get_jina_reranker()
```
- **Description**: Factory function that returns a `JinaReranker` instance configured from the config file
- **Returns**: JinaReranker - Initialized reranker instance
- **Raises**: `ValueError` if no API key is available

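
The factory's config handling is not shown in this document. One plausible shape, given purely as an illustration, is to overlay config values on the documented defaults. The `jina_reranker` section name and the `reranker_settings` helper are hypothetical; only the default model and endpoint values come from the Initialization section.

```python
from typing import Any, Dict

# Defaults taken from the documented JinaReranker constructor.
DEFAULTS: Dict[str, Any] = {
    "model": "jina-reranker-v2-base-multilingual",
    "endpoint": "https://api.jina.ai/v1/rerank",
}


def reranker_settings(config: Dict[str, Any]) -> Dict[str, Any]:
    """Merge a (hypothetical) 'jina_reranker' config section over the defaults."""
    section = config.get("jina_reranker", {})
    # Later keys win, so values from the config override the defaults.
    return {**DEFAULTS, **section}
```

The resulting dictionary could then be passed as keyword arguments to the constructor, for example `JinaReranker(**reranker_settings(config))`.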
### Usage Examples

#### Basic Usage
```python
from ranking.jina_reranker import JinaReranker

reranker = JinaReranker()
query = "What is quantum computing?"
documents = [
    "Quantum computing is a computation system that uses quantum mechanics.",
    "Classical computers use bits while quantum computers use qubits.",
    "Artificial intelligence is transforming various industries."
]

reranked = reranker.rerank(query, documents)
for doc in reranked:
    print(f"Score: {doc['score']}, Document: {doc['document']}")
```

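
When the documents come from richer records (for example search results with URLs), the `index` field can be used to join scores back onto the originals. The sketch below assumes `index` refers to the position in the `documents` list passed to `rerank`, which the example return format suggests but does not state; `attach_metadata` is a hypothetical helper.

```python
from typing import Any, Dict, List


def attach_metadata(reranked: List[Dict[str, Any]],
                    originals: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return the original records in reranked order, each with its score."""
    merged = []
    for item in reranked:
        source = originals[item["index"]]  # assumed: index into the input list
        merged.append({**source, "score": item["score"]})
    return merged
```

The originals are copied rather than mutated, so the same input list can be reranked again with a different query.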
#### Integration with ResultCollector
```python
from execution.result_collector import ResultCollector
from ranking.jina_reranker import get_jina_reranker

# Initialize components
reranker = get_jina_reranker()
collector = ResultCollector(reranker=reranker)

# Process search results with reranking
# (`search_results` is assumed to hold the raw results gathered earlier)
reranked_results = collector.process_results(
    search_results,
    dedup=True,
    max_results=20,
    use_reranker=True
)
```

#### Testing
```python
# Simple test script
import json
from ranking.jina_reranker import get_jina_reranker

reranker = get_jina_reranker()
query = "What is quantum computing?"
documents = [
    "Quantum computing is a type of computation that harnesses quantum mechanics.",
    "Classical computers use bits, while quantum computers use qubits.",
    "Machine learning is a subset of artificial intelligence."
]

reranked = reranker.rerank(query, documents)
print(json.dumps(reranked, indent=2))
```

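
The script above calls the live API. For checks that must run without network access or an API key, a stand-in with the same `rerank` signature can be used. The `FakeReranker` class and `check_format` helper below are hypothetical and score by simple word overlap; they only exercise the documented return format and assume results are ordered by descending score.

```python
from typing import Any, Dict, List, Optional


class FakeReranker:
    """Hypothetical offline stand-in that scores by shared query words."""

    def rerank(self, query: str, documents: List[str],
               top_n: Optional[int] = None) -> List[Dict[str, Any]]:
        query_words = set(query.lower().split())
        scored = [
            {
                "index": i,
                # Fraction of query words that also appear in the document.
                "score": len(query_words & set(doc.lower().split()))
                / max(len(query_words), 1),
                "document": doc,
            }
            for i, doc in enumerate(documents)
        ]
        scored.sort(key=lambda item: item["score"], reverse=True)
        return scored if top_n is None else scored[:top_n]


def check_format(results: List[Dict[str, Any]]) -> None:
    """Assert the documented keys are present and scores are descending."""
    for item in results:
        assert set(item) == {"index", "score", "document"}
    scores = [item["score"] for item in results]
    assert scores == sorted(scores, reverse=True)
```

Because the fake matches whole whitespace-separated tokens, punctuation attached to a word (such as a trailing question mark) prevents a match; that is acceptable for a format check but not for judging ranking quality.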
## Search Execution Testing

The search execution module has been tested to ensure it correctly executes search queries across multiple search engines and processes the results.