System Patterns: PyNamer

Architecture: Command-Line Interface (CLI) tool.

Core Components:

  • CLI Parser (argparse): Handles command-line arguments (images, config, dry-run, verbose).
  • Configuration Loader (PyYAML): Loads settings from config.yaml.
  • LLM Interaction (litellm): Abstracts communication with various LLM providers. Handles API key and endpoint configuration.
  • Image Processing (Pillow):
    • Opens and reads image files.
    • Resizes images exceeding resize_max_dimension while maintaining aspect ratio.
    • Saves the processed image to a specified format (resize_format) in memory.
  • Encoding (base64, io): Encodes the processed image data for transmission via API.
  • File System Interaction (os, pathlib): Checks file existence, extracts paths/extensions, renames files.
  • Filename Cleaning: Simple string manipulation to enforce snake_case and remove invalid characters.
  • Logging (logging): Provides informative output about the process.
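The resize rule and the filename cleaning step can be sketched with the standard library alone. This is only an illustration: the helper names scaled_size and clean_filename, the ASCII-only character set, and the "image" fallback are assumptions, not PyNamer's actual code. In the real tool the computed size would be handed to Pillow for the resize itself.

```python
import re


def scaled_size(width: int, height: int, max_dimension: int) -> tuple[int, int]:
    """Return a size whose longest side is at most max_dimension, keeping aspect ratio."""
    longest = max(width, height)
    if longest <= max_dimension:
        return width, height  # already small enough; leave untouched
    scale = max_dimension / longest
    # Never let a very thin image collapse to a zero-pixel side.
    return max(1, round(width * scale)), max(1, round(height * scale))


def clean_filename(suggestion: str, fallback: str = "image") -> str:
    """Reduce an LLM suggestion to a snake_case stem (hypothetical cleaning rules)."""
    stem = suggestion.strip().lower()
    stem = re.sub(r"\.[a-z0-9]{1,5}$", "", stem)  # drop an extension the model may have added
    stem = re.sub(r"[^a-z0-9]+", "_", stem)       # collapse every other character run into "_"
    stem = stem.strip("_")
    return stem or fallback                        # assumed fallback when nothing usable remains


if __name__ == "__main__":
    print(scaled_size(4000, 3000, 1024))
    print(clean_filename("Golden Retriever on Beach.JPG"))
```

Restricting the stem to a-z, 0-9 and underscores is the simplest way to stay valid on every common file system; a tool that wants to keep non-ASCII letters would need a looser pattern.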

Workflow Pattern:

  1. Parse CLI arguments.
  2. Initialize PyNamer class with the config path.
  3. Load configuration (_load_config).
  4. Set up LLM client (_setup_llm), including image resize settings.
  5. Iterate through input image paths provided via CLI.
  6. For each image:
     a. Check existence and supported format (_is_supported_format).
     b. Resize and encode the image (_resize_and_encode_image).
     c. Prepare API request payload (prompts + image data).
     d. Call LLM via litellm.completion.
     e. Extract and clean the suggested filename.
     f. Construct the new file path.
     g. If not dry-run, rename the file, handling potential name collisions (rename_image).
     h. Log/print the outcome.
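
The rename step with collision handling and dry-run support might look like the following minimal sketch. The name rename_image comes from the workflow above, but its signature here, the unique_target helper, and the _1, _2 numbering scheme are assumptions for illustration only.

```python
from pathlib import Path


def unique_target(directory: Path, stem: str, suffix: str) -> Path:
    """Return a path in directory that does not exist yet, adding _1, _2, ... if needed."""
    candidate = directory / f"{stem}{suffix}"
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate


def rename_image(source: Path, new_stem: str, dry_run: bool = False) -> Path:
    """Rename source to new_stem, keeping its extension.

    On a dry run the target path is computed and returned, but nothing is renamed.
    """
    if source.stem == new_stem:
        return source  # already has the suggested name; nothing to do
    target = unique_target(source.parent, new_stem, source.suffix)
    if not dry_run:
        source.rename(target)
    return target
```

Computing the target before checking dry_run means a dry run reports exactly the name a real run would use, as long as the directory does not change in between.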

Configuration Pattern:

  • Centralized YAML file (config.yaml) for user-configurable settings (LLM details, API keys, prompts, image processing parameters).
  • Environment variables serve as a fallback for API keys/endpoints that are not set in the config.
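
One reading of the environment-variable rule above is "config first, environment as fallback". The sketch below shows that lookup order. The function name resolve_setting, the config key api_key, and the variable name EXAMPLE_API_KEY are hypothetical; the source does not name the actual keys or variables.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    config: Mapping[str, object],
    key: str,
    env_var: str,
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return config[key] if it is set and non-empty, else the environment variable."""
    env = os.environ if environ is None else environ
    value = config.get(key)
    if value:  # a missing, None, or empty entry counts as "not set"
        return str(value)
    return env.get(env_var) or None


if __name__ == "__main__":
    # Hypothetical key and variable names, for illustration only.
    print(resolve_setting({"api_key": ""}, "api_key", "EXAMPLE_API_KEY"))
```

Passing environ explicitly keeps the lookup easy to test without touching the real process environment.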