# Decision Log

## 2025-02-27: Initial Project Setup

### Decision: Use Jina AI APIs for Semantic Search
- **Context**: Need for semantic search capabilities that understand context beyond keywords
- **Options Considered**:
  1. Build a custom embedding solution
  2. Use open-source models locally
  3. Use Jina AI's APIs
- **Decision**: Use Jina AI's APIs for embedding generation and similarity computation
- **Rationale**:
  - High-quality embeddings with state-of-the-art models
  - No need to manage model deployment and infrastructure
  - Simple API integration with reasonable pricing
  - Support for long texts through segmentation

### Decision: Separate Markdown Segmentation from Similarity Computation
- **Context**: Need to handle potentially long markdown documents
- **Options Considered**:
  1. Integrate segmentation directly into the similarity module
  2. Create a separate module for segmentation
- **Decision**: Create a separate module (`markdown_segmenter.py`) for document segmentation
- **Rationale**:
  - Better separation of concerns
  - More modular design allows for independent use of components
  - Easier to maintain and extend each component separately

### Decision: Use Environment Variables for API Keys
- **Context**: Need to securely manage API credentials
- **Options Considered**:
  1. Configuration files
  2. Environment variables
  3. Secret management service
- **Decision**: Use environment variables (`JINA_API_KEY`)
- **Rationale**:
  - Simple to implement
  - Standard practice for managing secrets
  - Works well across different environments
  - Prevents accidental commits of credentials to version control

### Decision: Use Cosine Similarity with Normalized Vectors
- **Context**: Need a metric for comparing semantic similarity between text embeddings
- **Options Considered**:
  1. Euclidean distance
  2. Cosine similarity
  3. Dot product
- **Decision**: Use cosine similarity with normalized vectors
- **Rationale**:
  - Standard approach for semantic similarity
  - Normalized vectors simplify computation (the dot product equals cosine similarity)
  - Less sensitive to embedding magnitude, focusing on direction (meaning)
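The second rationale point is the crux of this decision, so here is a minimal sketch of the normalized-vector shortcut. NumPy is assumed, and the function names are illustrative rather than taken from the codebase:

```python
import numpy as np

def normalize(v: np.ndarray) -> np.ndarray:
    """Scale a vector to unit length; leave zero vectors unchanged."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity is the dot product of the unit vectors, in [-1, 1]."""
    return float(np.dot(normalize(a), normalize(b)))

# With pre-normalized embeddings, the comparison is just a dot product:
a = normalize(np.random.rand(768))
b = normalize(np.random.rand(768))
assert abs(cosine_similarity(a, b) - float(np.dot(a, b))) < 1e-9
```

If embeddings are normalized once at ingestion, every later comparison reduces to a single dot product.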
## 2025-02-27: Research System Architecture

### Decision: Implement a Multi-Stage Research Pipeline
- **Context**: Need to define the overall architecture for the intelligent research system
- **Options Considered**:
  1. Monolithic application with tightly coupled components
  2. Microservices architecture with independent services
  3. Pipeline architecture with distinct processing stages
- **Decision**: Implement an 8-stage pipeline architecture
- **Rationale**:
  - Clear separation of concerns, with each stage having a specific responsibility
  - Easier to develop and test individual components
  - Flexibility to swap or enhance specific stages without affecting others
  - Natural flow of data through the system matches the research process

### Decision: Use Multiple Search Sources
- **Context**: Need to gather comprehensive information from various sources
- **Options Considered**:
  1. Use a single search API for simplicity
  2. Implement custom web scraping for all sources
  3. Use multiple specialized search APIs
- **Decision**: Integrate multiple search sources (Google, Serper, Jina Search, Google Scholar, arXiv)
- **Rationale**:
  - Different sources provide different types of information (academic, general, etc.)
  - Increases the breadth and diversity of search results
  - Specialized APIs like arXiv provide domain-specific information
  - Redundancy ensures more comprehensive coverage

### Decision: Use Jina AI for Semantic Processing
- **Context**: Need for advanced semantic understanding in document processing
- **Options Considered**:
  1. Use simple keyword matching
  2. Implement custom embedding models
  3. Use Jina AI's suite of APIs
- **Decision**: Use Jina AI's APIs for embedding generation, similarity computation, and reranking
- **Rationale**:
  - High-quality embeddings with state-of-the-art models
  - Comprehensive API suite covering multiple needs (embeddings, segmentation, reranking)
  - Simple integration with reasonable pricing
  - Consistent approach across different semantic processing tasks

## 2025-02-27: Search Execution Architecture

### Decision: Search Execution Architecture
- **Context**: We needed a search execution module that could run queries across multiple search engines and process the results in a standardized way.
- **Decision**:
  1. Create a modular search execution architecture (see the interface sketch after this decision):
     - Implement a base handler interface (`BaseSearchHandler`) for all search API handlers
     - Create specific handlers for each search engine (Google, Serper, Scholar, arXiv)
     - Develop a central `SearchExecutor` class to manage execution across multiple engines
     - Implement a `ResultCollector` class for processing and organizing results
  2. Use parallel execution for search queries (see the executor sketch below):
     - Implement thread-based parallelism using `concurrent.futures`
     - Add support for both synchronous and asynchronous execution
     - Include timeout management and error handling
  3. Standardize search results:
     - Define a common result format across all search engines
     - Include metadata specific to each search engine in a standardized way
     - Implement deduplication and scoring for result ranking
- **Rationale**:
  - A modular architecture allows for easy addition of new search engines
  - Parallel execution significantly improves search performance
  - A standardized result format simplifies downstream processing
  - Separation of concerns between execution and result processing
- **Alternatives Considered**:
  1. Sequential execution of search queries:
     - Simpler implementation
     - Much slower performance
     - Would not scale well with additional search engines
  2. Separate modules for each search engine:
     - Would lead to code duplication
     - More difficult to maintain
     - Less consistent result format
  3. Using a third-party search aggregation service:
     - Would introduce additional dependencies
     - Less control over the search process
     - Potential cost implications
- **Impact**:
  - Efficient execution of search queries across multiple engines
  - Consistent result format for downstream processing
  - Flexible architecture that can be extended with new search engines
  - Clear separation of concerns between different components
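To make the handler abstraction concrete, here is a minimal sketch of what the `BaseSearchHandler` interface and one concrete handler might look like. Only the class name `BaseSearchHandler` comes from the log; the `search` signature and the result fields are assumptions for illustration:

```python
from abc import ABC, abstractmethod

class BaseSearchHandler(ABC):
    """Common interface implemented by every search engine handler."""

    @abstractmethod
    def search(self, query: str, num_results: int = 10) -> list[dict]:
        """Return results in the common format shared by all engines."""

class ArxivSearchHandler(BaseSearchHandler):
    def search(self, query: str, num_results: int = 10) -> list[dict]:
        # A real handler would call the arXiv API here; stubbed for illustration.
        return [{
            "title": "Example paper",
            "url": "https://arxiv.org/abs/...",  # placeholder identifier
            "snippet": "Abstract text...",
            "source": "arxiv",   # engine tag used downstream for filtering
            "metadata": {},      # engine-specific fields kept in one place
        }][:num_results]
```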
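Similarly, a sketch of the thread-based fan-out that `SearchExecutor` might implement with `concurrent.futures`, including the timeout and error handling called out above; the class internals are assumed, not copied from the module:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError, as_completed

class SearchExecutor:
    """Fans one query out to every registered handler in parallel."""

    def __init__(self, handlers: list[BaseSearchHandler], timeout: float = 10.0):
        self.handlers = handlers
        self.timeout = timeout

    def execute(self, query: str) -> list[dict]:
        results: list[dict] = []
        with ThreadPoolExecutor(max_workers=max(1, len(self.handlers))) as pool:
            futures = {pool.submit(h.search, query): h for h in self.handlers}
            try:
                for future in as_completed(futures, timeout=self.timeout):
                    try:
                        results.extend(future.result())
                    except Exception as exc:
                        # One failing engine must not sink the whole batch.
                        print(f"{type(futures[future]).__name__} failed: {exc}")
            except TimeoutError:
                pass  # Keep whatever finished in time; ignore slow engines.
        return results
```

A `ResultCollector` would then deduplicate the combined list (e.g., by URL) and score it for ranking, per point 3 of the decision.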
## 2025-02-27: Search Execution Module Refinements

### Decision: Remove Google Search Handler
- **Context**: Both Google and Serper handlers were implemented, but Serper is essentially a front end for Google search
- **Options Considered**:
  1. Keep both handlers for redundancy
  2. Remove the Google handler and use only Serper
- **Decision**: Remove the Google search handler
- **Rationale**:
  - Redundant functionality, as Serper provides the same results
  - Simplifies the codebase and reduces maintenance
  - Reduces API costs by avoiding duplicate searches
  - Serper provides a more reliable and consistent API for Google search

### Decision: Modify LLM Query Enhancement Prompt
- **Context**: The LLM was returning enhanced queries with explanations attached, which caused issues with search APIs
- **Options Considered**:
  1. Post-process the LLM output to extract just the query
  2. Modify the prompt to request only the enhanced query
- **Decision**: Modify the LLM prompt to request only the enhanced query, without explanations
- **Rationale**:
  - More reliable than post-processing, which could be error-prone
  - Cleaner implementation that addresses the root cause
  - Ensures a consistent output format for downstream processing
  - Reduces the risk of exceeding API character limits

### Decision: Implement Query Truncation
- **Context**: Enhanced queries could exceed the Serper API's 2048-character limit
- **Options Considered**:
  1. Limit the LLM's output length
  2. Truncate queries before sending them to the API
  3. Split long queries into multiple searches
- **Decision**: Implement query truncation in the search executor
- **Rationale**:
  - Simple and effective solution
  - Preserves as much of the enhanced query as possible
  - Ensures API requests don't fail due to length constraints
  - Can be easily adjusted if API limits change

## 2025-02-27: Testing Strategy for Query Processor

### Context
After integrating Groq and OpenRouter as additional LLM providers, we needed to verify that the query processor module functions correctly with these new providers.

### Decision
1. Create dedicated test scripts to validate the query processor functionality:
   - A basic test script for the core processing pipeline
   - A comprehensive test script for detailed component testing
2. Use monkey patching to ensure tests consistently use the Groq model (see the sketch at the end of this entry):
   - Create a global LLM interface with the Groq model
   - Override the `get_llm_interface` function to always return this interface
   - This approach allows testing without modifying the core code
3. Test all key functionality of the query processor:
   - Query enhancement
   - Query classification
   - Search query generation
   - End-to-end processing pipeline

### Rationale
- Dedicated test scripts provide a repeatable way to verify functionality
- Monkey patching allows testing with specific models without changing the core code
- Comprehensive testing ensures all components work correctly with the new providers
- Saving test results to a JSON file provides a reference for future development

### Alternatives Considered
1. Modifying the query processor to accept a model parameter:
   - Would require changing the core code
   - Could introduce bugs in the production code
2. Using environment variables to control model selection:
   - Less precise control over which model is used
   - Could interfere with other tests or production use

### Impact
- Verified that the query processor works correctly with Groq models
- Established a testing approach that can be used for other modules
- Created reusable test scripts for future development
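As a rough sketch of the monkey-patching approach from point 2 of the decision above: a test script can pin the provider before exercising the pipeline. The `query_processor` module, `LLMInterface` wrapper, `process_query` call, and model id are hypothetical stand-ins; only `get_llm_interface` is named in the log:

```python
import query_processor                    # hypothetical: module under test
from llm_interface import LLMInterface    # hypothetical: provider wrapper

# One shared interface pinned to a Groq-hosted model for the whole test run.
groq_interface = LLMInterface(provider="groq", model="llama-3.3-70b-versatile")  # placeholder model id

# Override the factory so every lookup inside the processor returns this
# interface, without touching the production code.
query_processor.get_llm_interface = lambda *args, **kwargs: groq_interface

result = query_processor.process_query("impact of quantum computing on cryptography")
print(result)
```

Because the override lives entirely in the test script, removing it restores the production provider-selection behavior unchanged.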