Updated to shrink big images before sending them.
parent 11ea971542
commit d4ea970b3d
|
@ -0,0 +1,38 @@
|
||||||
|
# PyNamer Project Rules
|
||||||
|
|
||||||
|
**Implementation Patterns:**
|
||||||
|
1. **Image Processing:**
|
||||||
|
- Always maintain aspect ratio when resizing.
|
||||||
|
- Use LANCZOS resampling for quality downscaling.
|
||||||
|
- Handle transparency conversion when saving as JPEG.
|
||||||
|
- Keep original image files untouched until final rename operation.
|
||||||
|
|
||||||
|
2. **Filename Generation:**
|
||||||
|
- Enforce snake_case format.
|
||||||
|
- Remove special characters.
|
||||||
|
- Handle duplicate filenames by appending incrementing numbers.
|
||||||
|
|
||||||
|
3. **Error Handling:**
|
||||||
|
- Log detailed errors for debugging.
|
||||||
|
- Fail gracefully with clear user feedback.
|
||||||
|
- Preserve original files on errors.
|
||||||
|
|
||||||
|
4. **Configuration:**
|
||||||
|
- Sensible defaults for all configurable parameters.
|
||||||
|
- Environment variables can override sensitive settings (API keys).
|
||||||
|
- Config changes require restart (no hot-reloading).
|
||||||
|
|
||||||
|
**User Preferences:**
|
||||||
|
- Default to JPEG format for resized images (better compression).
|
||||||
|
- Default max dimension of 1024px (balances quality and efficiency).
|
||||||
|
- Dry-run mode enabled by flag for safety.
|
||||||
|
|
||||||
|
**Known Challenges:**
|
||||||
|
- Large images may still consume significant memory during processing.
|
||||||
|
- Some LLMs may have different optimal image sizes and formats.
|
||||||
|
- Transparency handling requires special consideration when converting formats.
|
||||||
|
|
||||||
|
**Workflow Patterns:**
|
||||||
|
- Always check file existence and supported formats first.
|
||||||
|
- Process images sequentially (no parallel processing yet).
|
||||||
|
- Log each major operation step for traceability.
|
|
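Note on the filename rules above: the implementation of `rename_image` is not part of this diff, so the following is only a minimal sketch of how the snake_case, special-character, and duplicate-name rules could be satisfied. The helper names `clean_filename` and `unique_path` are hypothetical, and restricting names to ASCII letters and digits is an assumption.

```python
import re
from pathlib import Path


def clean_filename(suggestion: str) -> str:
    """Lower-case the text and reduce it to snake_case (a-z, 0-9, underscores)."""
    name = suggestion.strip().lower()
    # Replace every run of characters outside a-z / 0-9 with a single underscore.
    name = re.sub(r"[^a-z0-9]+", "_", name)
    return name.strip("_")


def unique_path(directory: Path, stem: str, suffix: str) -> Path:
    """Return a path that does not exist yet, appending _1, _2, ... on collisions."""
    candidate = directory / f"{stem}{suffix}"
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate
```

Because `unique_path` only inspects the file system and never writes to it, the same call can be used in dry-run mode to preview the final name.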
@ -1,2 +1,170 @@
|
||||||
|
# Byte-compiled / optimized / DLL files
|
||||||
|
__pycache__/
|
||||||
|
*.py[cod]
|
||||||
|
*$py.class
|
||||||
|
|
||||||
|
# C extensions
|
||||||
|
*.so
|
||||||
|
|
||||||
|
# Distribution / packaging
|
||||||
|
.Python
|
||||||
|
build/
|
||||||
|
develop-eggs/
|
||||||
|
dist/
|
||||||
|
downloads/
|
||||||
|
eggs/
|
||||||
|
.eggs/
|
||||||
|
lib/
|
||||||
|
lib64/
|
||||||
|
parts/
|
||||||
|
sdist/
|
||||||
|
var/
|
||||||
|
wheels/
|
||||||
|
*.egg-info/
|
||||||
|
.installed.cfg
|
||||||
|
*.egg
|
||||||
|
MANIFEST
|
||||||
|
|
||||||
|
# PyInstaller
|
||||||
|
# Usually these files are written by a python script from a template
|
||||||
|
# before PyInstaller builds the exe, so as to inject date/other infos into it.
|
||||||
|
*.manifest
|
||||||
|
*.spec
|
||||||
|
|
||||||
|
# Installer logs
|
||||||
|
pip-log.txt
|
||||||
|
pip-delete-this-directory.txt
|
||||||
|
|
||||||
|
# Unit test / coverage reports
|
||||||
|
htmlcov/
|
||||||
|
.tox/
|
||||||
|
.nox/
|
||||||
|
.coverage
|
||||||
|
.coverage.*
|
||||||
|
.cache
|
||||||
|
nosetests.xml
|
||||||
|
coverage.xml
|
||||||
|
*.cover
|
||||||
|
*.py,cover
|
||||||
|
.hypothesis/
|
||||||
|
.pytest_cache/
|
||||||
|
|
||||||
|
# Translations
|
||||||
|
*.mo
|
||||||
|
*.pot
|
||||||
|
|
||||||
|
# Django stuff:
|
||||||
|
*.log
|
||||||
|
local_settings.py
|
||||||
|
db.sqlite3
|
||||||
|
|
||||||
|
# Flask stuff:
|
||||||
|
instance/
|
||||||
|
.webassets-cache
|
||||||
|
|
||||||
|
# Scrapy stuff:
|
||||||
|
.scrapy
|
||||||
|
|
||||||
|
# Sphinx documentation
|
||||||
|
docs/_build/
|
||||||
|
|
||||||
|
# PyBuilder
|
||||||
|
target/
|
||||||
|
|
||||||
|
# Jupyter Notebook
|
||||||
|
.ipynb_checkpoints
|
||||||
|
|
||||||
|
# IPython
|
||||||
|
profile_default/
|
||||||
|
ipython_config.py
|
||||||
|
|
||||||
|
# pyenv
|
||||||
|
.python-version
|
||||||
|
|
||||||
|
# pipenv
|
||||||
|
# According to pypa/pipenv#598, it's recommended to include Pipfile.lock in version control.
|
||||||
|
# However, in case of collaboration, if having platform-specific dependencies or dependencies
|
||||||
|
# having no cross-platform support, pipenv may install dependencies that don't work, or not
|
||||||
|
# install all needed dependencies.
|
||||||
|
#Pipfile.lock
|
||||||
|
|
||||||
|
# PEP 582; used by e.g. github.com/David-OConnor/pyflow
|
||||||
|
__pypackages__/
|
||||||
|
|
||||||
|
# Celery stuff
|
||||||
|
celerybeat-schedule
|
||||||
|
celerybeat.pid
|
||||||
|
|
||||||
|
# SageMath parsed files
|
||||||
|
*.sage.py
|
||||||
|
|
||||||
|
# Environments
|
||||||
|
.env
|
||||||
|
.venv
|
||||||
|
env/
|
||||||
|
venv/
|
||||||
|
ENV/
|
||||||
|
env.bak/
|
||||||
|
venv.bak/
|
||||||
|
|
||||||
|
# Spyder project settings
|
||||||
|
.spyderproject
|
||||||
|
.spyproject
|
||||||
|
|
||||||
|
# Rope project settings
|
||||||
|
.ropeproject
|
||||||
|
|
||||||
|
# mkdocs documentation
|
||||||
|
/site
|
||||||
|
|
||||||
|
# mypy
|
||||||
|
.mypy_cache/
|
||||||
|
.dmypy.json
|
||||||
|
dmypy.json
|
||||||
|
|
||||||
|
# Pyre type checker
|
||||||
|
.pyre/
|
||||||
|
|
||||||
|
# pytype static type analyzer
|
||||||
|
.pytype/
|
||||||
|
|
||||||
|
# operating system files
|
||||||
.DS_Store
|
.DS_Store
|
||||||
.venv/
|
.DS_Store?
|
||||||
|
._*
|
||||||
|
.Spotlight-V100
|
||||||
|
.Trashes
|
||||||
|
ehthumbs.db
|
||||||
|
Thumbs.db
|
||||||
|
|
||||||
|
# Image files
|
||||||
|
*.png
|
||||||
|
*.jpg
|
||||||
|
*.jpeg
|
||||||
|
*.gif
|
||||||
|
*.bmp
|
||||||
|
*.tiff
|
||||||
|
|
||||||
|
# Log files
|
||||||
|
*.log
|
||||||
|
|
||||||
|
# Editor directories and files
|
||||||
|
.idea/
|
||||||
|
.vscode/
|
||||||
|
*.swp
|
||||||
|
*.swo
|
||||||
|
*~
|
||||||
|
*.bak
|
||||||
|
*.tmp
|
||||||
|
*.orig
|
||||||
|
*.class
|
||||||
|
*.jar
|
||||||
|
*.war
|
||||||
|
*.ear
|
||||||
|
*.zip
|
||||||
|
*.tar.gz
|
||||||
|
*.rar
|
||||||
|
|
||||||
|
# Local development files
|
||||||
|
*.local
|
||||||
|
*.dev
|
||||||
|
|
10
config.yaml
|
@ -2,10 +2,10 @@
|
||||||
|
|
||||||
# LLM API Configuration
|
# LLM API Configuration
|
||||||
llm:
|
llm:
|
||||||
provider: "openai" # Provider name (openai, anthropic, etc.)
|
provider: "openrouter" # Supported: openai, anthropic, openrouter
|
||||||
model: "gpt-4o-mini" # Model name
|
model: "openrouter/google/gemma-3-27b-it" # Must be a vision-capable model
|
||||||
api_key: "" # Your API key (leave empty to use environment variable)
|
api_key: "" # Your API key (or set OPENAI_API_KEY environment variable)
|
||||||
endpoint: "" # Custom endpoint URL (if using a proxy or alternative service)
|
endpoint: "" # Custom endpoint URL if needed
|
||||||
max_tokens: 100 # Maximum tokens for response
|
max_tokens: 100 # Maximum tokens for response
|
||||||
temperature: 0.7 # Temperature for generation
|
temperature: 0.7 # Temperature for generation
|
||||||
|
|
||||||
|
@ -17,6 +17,8 @@ image:
|
||||||
- ".png"
|
- ".png"
|
||||||
- ".gif"
|
- ".gif"
|
||||||
- ".webp"
|
- ".webp"
|
||||||
|
resize_max_dimension: 1024 # Max width/height before resizing
|
||||||
|
resize_format: "JPEG" # Format for resized images
|
||||||
|
|
||||||
# Prompt Configuration
|
# Prompt Configuration
|
||||||
prompt:
|
prompt:
|
||||||
|
|
|
@ -0,0 +1,25 @@
|
||||||
|
# Active Context: PyNamer - Image Resizing Implementation
|
||||||
|
|
||||||
|
**Current Focus:** Implementing image resizing functionality to normalize image dimensions before sending them to the LLM.
|
||||||
|
|
||||||
|
**Decisions Made:**
|
||||||
|
- Use the `Pillow` library for image manipulation due to its robustness and ease of use in Python.
|
||||||
|
- Add `Pillow` to `requirements.txt`.
|
||||||
|
- Introduce configuration options in `config.yaml` under the `image` section:
|
||||||
|
- `resize_max_dimension`: Controls the maximum size (width or height) of the image sent to the LLM. Defaults to 1024.
|
||||||
|
- `resize_format`: Specifies the image format (e.g., 'JPEG', 'PNG') to use after resizing. Defaults to 'JPEG'.
|
||||||
|
- Modify the `_encode_image` method (renamed to `_resize_and_encode_image`) to perform resizing:
|
||||||
|
- Open the image using `PIL.Image.open()`.
|
||||||
|
- Check if the image's largest dimension exceeds `resize_max_dimension`.
|
||||||
|
- If it exceeds, calculate new dimensions maintaining aspect ratio and resize using `img.resize()` with `Image.Resampling.LANCZOS`.
|
||||||
|
- Save the (potentially resized) image to an in-memory buffer (`io.BytesIO`) using the configured `resize_format`.
|
||||||
|
- Handle potential transparency issues when saving formats like JPEG by converting the image mode to 'RGB' if necessary.
|
||||||
|
- Base64 encode the bytes from the buffer.
|
||||||
|
- Update the `generate_filename` method to call `_resize_and_encode_image` instead of `_encode_image`.
|
||||||
|
- Update the `generate_filename` method to dynamically set the `mime_type` in the LLM request based on `resize_format`.
|
||||||
|
- Load the new configuration options in `_setup_llm`.
|
||||||
|
|
||||||
|
**Next Steps:**
|
||||||
|
- Create `memory-bank/progress.md`.
|
||||||
|
- Create `.clinerules`.
|
||||||
|
- Final review and testing.
|
|
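Note on the resize step described above: the aspect-ratio calculation can be isolated into a small pure function, which makes it easy to test without opening an image. This is a sketch, not the code from this commit. The name `fit_within` is hypothetical, and it rounds and clamps each side to at least 1 pixel, whereas the commit truncates with `int()`.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_dimension: int) -> Tuple[int, int]:
    """Scale (width, height) so the longer side equals max_dimension; never upscale."""
    longest = max(width, height)
    if longest <= max_dimension:
        return width, height
    scale = max_dimension / longest
    # max(1, ...) keeps extremely elongated images from collapsing to a 0-pixel side.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The resulting tuple could then be passed to `img.resize()` together with `Image.Resampling.LANCZOS`, as the decision list describes.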
@ -0,0 +1,16 @@
|
||||||
|
# Product Context: PyNamer
|
||||||
|
|
||||||
|
**Problem:** Manually naming large numbers of image files is tedious and time-consuming. Generic filenames (e.g., `IMG_1234.JPG`) lack descriptive value, making it hard to find specific images later.
|
||||||
|
|
||||||
|
**Solution:** `pynamer` automates the process of generating descriptive filenames for images by leveraging the image understanding capabilities of multimodal LLMs.
|
||||||
|
|
||||||
|
**User Experience:**
|
||||||
|
- The user provides one or more image paths via the command line.
|
||||||
|
- The tool processes each image, interacts with an LLM (configured via `config.yaml`), and renames the file with a descriptive, clean filename.
|
||||||
|
- A dry-run option allows users to preview the changes without modifying files.
|
||||||
|
- **Efficiency Enhancement:** By resizing large images before sending them to the LLM, the tool aims to:
|
||||||
|
- Reduce the amount of data transferred.
|
||||||
|
- Potentially lower API costs (as some models charge based on input size/tokens).
|
||||||
|
- Speed up the processing time.
|
||||||
|
|
||||||
|
**Target User:** Individuals or teams dealing with many images who need a better way to organize and retrieve them based on content (e.g., photographers, researchers, content creators).
|
|
@ -0,0 +1,28 @@
|
||||||
|
# Progress: PyNamer - Image Resizing Implementation
|
||||||
|
|
||||||
|
**Completed:**
|
||||||
|
1. Added Pillow dependency to `requirements.txt`.
|
||||||
|
2. Updated `pynamer.py` with image resizing functionality:
|
||||||
|
- Renamed `_encode_image` to `_resize_and_encode_image`.
|
||||||
|
- Implemented image resizing logic using Pillow.
|
||||||
|
- Added proper error handling for image processing.
|
||||||
|
- Updated `generate_filename` to use the new method and set correct mime type.
|
||||||
|
3. Updated `config.yaml` with new image resizing configuration options:
|
||||||
|
- `resize_max_dimension`
|
||||||
|
- `resize_format`
|
||||||
|
4. Created comprehensive memory bank documentation:
|
||||||
|
- `projectbrief.md`
|
||||||
|
- `productContext.md`
|
||||||
|
- `systemPatterns.md`
|
||||||
|
- `techContext.md`
|
||||||
|
- `activeContext.md`
|
||||||
|
|
||||||
|
**Remaining:**
|
||||||
|
1. Create `.clinerules` file.
|
||||||
|
2. Final testing and verification.
|
||||||
|
|
||||||
|
**Issues/Notes:**
|
||||||
|
- The implementation maintains backward compatibility with existing configurations.
|
||||||
|
- The default resize format is set to JPEG for better compression, but this may need adjustment for images with transparency.
|
||||||
|
- The LANCZOS resampling filter provides good quality for downscaling.
|
||||||
|
- Error handling has been improved to provide better feedback when image processing fails.
|
|
@ -0,0 +1,18 @@
|
||||||
|
# Project Brief: PyNamer
|
||||||
|
|
||||||
|
**Goal:** Enhance the `pynamer` tool to improve efficiency and potentially reduce costs by normalizing image sizes before submitting them to a Large Language Model (LLM) for filename generation.
|
||||||
|
|
||||||
|
**Core Functionality:**
|
||||||
|
- Takes one or more image file paths as input.
|
||||||
|
- Reads configuration from `config.yaml`.
|
||||||
|
- Resizes images exceeding a configured maximum dimension while maintaining aspect ratio.
|
||||||
|
- Encodes the (potentially resized) image to base64.
|
||||||
|
- Sends the image data and configured prompts to an LLM (via `litellm`).
|
||||||
|
- Receives a descriptive filename suggestion from the LLM.
|
||||||
|
- Cleans the suggested filename (snake_case, alphanumeric).
|
||||||
|
- Renames the original image file with the new filename.
|
||||||
|
- Supports dry-run mode.
|
||||||
|
|
||||||
|
**Enhancement:**
|
||||||
|
- Added image resizing using the Pillow library before encoding and sending to the LLM.
|
||||||
|
- Introduced configuration options (`resize_max_dimension`, `resize_format`) in `config.yaml`.
|
|
@ -0,0 +1,36 @@
|
||||||
|
# System Patterns: PyNamer
|
||||||
|
|
||||||
|
**Architecture:** Command-Line Interface (CLI) tool.
|
||||||
|
|
||||||
|
**Core Components:**
|
||||||
|
- **CLI Parser (`argparse`):** Handles command-line arguments (`images`, `config`, `dry-run`, `verbose`).
|
||||||
|
- **Configuration Loader (`PyYAML`):** Loads settings from `config.yaml`.
|
||||||
|
- **LLM Interaction (`litellm`):** Abstracts communication with various LLM providers. Handles API key and endpoint configuration.
|
||||||
|
- **Image Processing (`Pillow`):**
|
||||||
|
- Opens and reads image files.
|
||||||
|
- Resizes images exceeding `resize_max_dimension` while maintaining aspect ratio.
|
||||||
|
- Saves the processed image to a specified format (`resize_format`) in memory.
|
||||||
|
- **Encoding (`base64`, `io`):** Encodes the processed image data for transmission via API.
|
||||||
|
- **File System Interaction (`os`, `pathlib`):** Checks file existence, extracts paths/extensions, renames files.
|
||||||
|
- **Filename Cleaning:** Simple string manipulation to enforce snake_case and remove invalid characters.
|
||||||
|
- **Logging (`logging`):** Provides informative output about the process.
|
||||||
|
|
||||||
|
**Workflow Pattern:**
|
||||||
|
1. Parse CLI arguments.
|
||||||
|
2. Initialize `PyNamer` class with the config path.
|
||||||
|
3. Load configuration (`_load_config`).
|
||||||
|
4. Set up LLM client (`_setup_llm`), including image resize settings.
|
||||||
|
5. Iterate through input image paths provided via CLI.
|
||||||
|
6. For each image:
|
||||||
|
a. Check existence and supported format (`_is_supported_format`).
|
||||||
|
b. Resize and encode the image (`_resize_and_encode_image`).
|
||||||
|
c. Prepare API request payload (prompts + image data).
|
||||||
|
d. Call LLM via `litellm.completion`.
|
||||||
|
e. Extract and clean the suggested filename.
|
||||||
|
f. Construct the new file path.
|
||||||
|
g. If not dry-run, rename the file, handling potential name collisions (`rename_image`).
|
||||||
|
h. Log/print the outcome.
|
||||||
|
|
||||||
|
**Configuration Pattern:**
|
||||||
|
- Centralized YAML file (`config.yaml`) for user-configurable settings (LLM details, API keys, prompts, image processing parameters).
|
||||||
|
- Environment variables supply API keys/endpoints when they are not set in the config.
|
|
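Note on the configuration pattern above: a minimal sketch of the "config value first, environment variable as fallback" lookup and of the image defaults. The helper names `resolve_api_key` and `image_settings` are hypothetical, and using `OPENAI_API_KEY` as the default variable name is an assumption taken from the `config.yaml` comment; other providers will likely need a different variable.

```python
import os
from typing import Any, Dict, Mapping, Optional


def resolve_api_key(llm_config: Mapping[str, Any],
                    env: Mapping[str, str] = os.environ,
                    env_var: str = "OPENAI_API_KEY") -> Optional[str]:
    """Prefer a non-empty api_key from the config, else fall back to the environment."""
    key = llm_config.get("api_key") or env.get(env_var)
    return key or None


def image_settings(config: Mapping[str, Any]) -> Dict[str, Any]:
    """Read the image section, applying the defaults documented in this commit."""
    image = config.get("image") or {}
    return {
        "resize_max_dimension": image.get("resize_max_dimension", 1024),
        "resize_format": str(image.get("resize_format", "JPEG")).upper(),
    }
```

Passing `env` explicitly keeps the lookup testable without touching the real process environment.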
@ -0,0 +1,40 @@
|
||||||
|
# Tech Context: PyNamer
|
||||||
|
|
||||||
|
**Language:** Python 3
|
||||||
|
|
||||||
|
**Core Libraries:**
|
||||||
|
- `litellm`: For interacting with various LLM APIs (OpenAI, Anthropic, etc.). Handles model routing, API key management, and standardized response format.
|
||||||
|
- `PyYAML`: For parsing the `config.yaml` configuration file.
|
||||||
|
- `Pillow`: For image manipulation (opening, resizing, saving to buffer).
|
||||||
|
- `argparse`: Standard library for parsing command-line arguments.
|
||||||
|
- `base64`: Standard library for encoding image data.
|
||||||
|
- `io`: Standard library for handling in-memory byte streams (used with Pillow).
|
||||||
|
- `os`, `pathlib`: Standard libraries for file system operations.
|
||||||
|
- `logging`: Standard library for application logging.
|
||||||
|
|
||||||
|
**Dependencies:**
|
||||||
|
- Listed in `requirements.txt`.
|
||||||
|
- Key dependencies: `litellm`, `pyyaml`, `Pillow`.
|
||||||
|
|
||||||
|
**Setup & Execution:**
|
||||||
|
1. **Installation:**
|
||||||
|
```bash
|
||||||
|
pip install -r requirements.txt
|
||||||
|
# or potentially: pip install . (if setup.py is configured correctly)
|
||||||
|
```
|
||||||
|
2. **Configuration:**
|
||||||
|
- Create or modify `config.yaml`.
|
||||||
|
- Set LLM `api_key` in the config or via environment variable (e.g., `OPENAI_API_KEY`).
|
||||||
|
- Adjust `model`, `max_tokens`, `temperature`, `resize_max_dimension`, `resize_format`, and `prompts` as needed.
|
||||||
|
3. **Execution:**
|
||||||
|
```bash
|
||||||
|
python pynamer.py <image_path_1> [image_path_2 ...] [-c config.yaml] [-d] [-v]
|
||||||
|
```
|
||||||
|
- `<image_path>`: Path to the image file(s). Handles paths with spaces.
|
||||||
|
- `-c`: Specify a different config file path.
|
||||||
|
- `-d`: Dry run (preview changes).
|
||||||
|
- `-v`: Verbose logging.
|
||||||
|
|
||||||
|
**Environment:**
|
||||||
|
- Assumes a standard Python environment where dependencies can be installed via pip.
|
||||||
|
- Relies on network access to reach the configured LLM API endpoint.
|
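Note on the execution section above: the command line it documents could be declared with `argparse` roughly as follows. This is a sketch only. The short flags and the positional `images` argument come from the text, while the long option names and the `config.yaml` default are assumptions, since the parser itself is not part of this diff.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for: pynamer <image_path_1> [image_path_2 ...] [-c FILE] [-d] [-v]."""
    parser = argparse.ArgumentParser(
        prog="pynamer",
        description="Generate descriptive filenames for images using LLMs",
    )
    # One or more image paths; the shell passes a quoted path with spaces as one argument.
    parser.add_argument("images", nargs="+", help="Path(s) to the image file(s)")
    parser.add_argument("-c", "--config", default="config.yaml",
                        help="Path to the config file (default is an assumption)")
    parser.add_argument("-d", "--dry-run", action="store_true",
                        help="Preview changes without renaming files")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Enable verbose logging")
    return parser
```

For example, `build_parser().parse_args(["my photo.jpg", "-d"])` yields a namespace with `dry_run` set and a single entry in `images`.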
55
pynamer.py
|
@ -2,14 +2,17 @@
|
||||||
|
|
||||||
import argparse
|
import argparse
|
||||||
import base64
|
import base64
|
||||||
|
import io
|
||||||
import os
|
import os
|
||||||
import sys
|
import sys
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
import yaml
|
import yaml
|
||||||
from typing import Dict, List, Optional, Union
|
from typing import Dict, List, Optional, Union
|
||||||
|
|
||||||
import litellm
|
import litellm
|
||||||
from litellm import completion
|
from litellm import completion
|
||||||
import logging
|
import logging
|
||||||
|
from PIL import Image # Added for image processing
|
||||||
|
|
||||||
# Configure logging
|
# Configure logging
|
||||||
logging.basicConfig(
|
logging.basicConfig(
|
||||||
|
@ -66,10 +69,16 @@ class PyNamer:
|
||||||
self.max_tokens = llm_config.get('max_tokens', 100)
|
self.max_tokens = llm_config.get('max_tokens', 100)
|
||||||
self.temperature = llm_config.get('temperature', 0.7)
|
self.temperature = llm_config.get('temperature', 0.7)
|
||||||
|
|
||||||
logger.info(f"LLM setup complete. Using model: {self.model}")
|
# Image processing settings
|
||||||
|
image_config = self.config.get('image', {})
|
||||||
|
self.resize_max_dimension = image_config.get('resize_max_dimension', 1024) # Default max dimension
|
||||||
|
self.resize_format = image_config.get('resize_format', 'JPEG') # Default format after resize
|
||||||
|
|
||||||
def _encode_image(self, image_path: str) -> str:
|
logger.info(f"LLM setup complete. Using model: {self.model}")
|
||||||
"""Encode image to base64 for API submission.
|
logger.info(f"Image resize settings: max_dimension={self.resize_max_dimension}, format={self.resize_format}")
|
||||||
|
|
||||||
|
def _resize_and_encode_image(self, image_path: str) -> str:
|
||||||
|
"""Resize image if necessary and encode to base64 for API submission.
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
image_path: Path to the image file
|
image_path: Path to the image file
|
||||||
|
@ -77,8 +86,35 @@ class PyNamer:
|
||||||
Returns:
|
Returns:
|
||||||
Base64 encoded image string
|
Base64 encoded image string
|
||||||
"""
|
"""
|
||||||
with open(image_path, "rb") as image_file:
|
try:
|
||||||
return base64.b64encode(image_file.read()).decode('utf-8')
|
with Image.open(image_path) as img:
|
||||||
|
# Calculate new size maintaining aspect ratio
|
||||||
|
width, height = img.size
|
||||||
|
if max(width, height) > self.resize_max_dimension:
|
||||||
|
if width > height:
|
||||||
|
new_width = self.resize_max_dimension
|
||||||
|
new_height = int(height * (self.resize_max_dimension / width))
|
||||||
|
else:
|
||||||
|
new_height = self.resize_max_dimension
|
||||||
|
new_width = int(width * (self.resize_max_dimension / height))
|
||||||
|
|
||||||
|
logger.debug(f"Resizing image from {width}x{height} to {new_width}x{new_height}")
|
||||||
|
img = img.resize((new_width, new_height), Image.Resampling.LANCZOS)
|
||||||
|
else:
|
||||||
|
logger.debug("Image size is within limits, no resize needed.")
|
||||||
|
|
||||||
|
# Save resized image to a bytes buffer
|
||||||
|
buffer = io.BytesIO()
|
||||||
|
# Handle potential transparency issues when saving as JPEG
|
||||||
|
if self.resize_format.upper() == 'JPEG' and img.mode in ('RGBA', 'P'):
|
||||||
|
img = img.convert('RGB')
|
||||||
|
img.save(buffer, format=self.resize_format)
|
||||||
|
img_bytes = buffer.getvalue()
|
||||||
|
|
||||||
|
return base64.b64encode(img_bytes).decode('utf-8')
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Error processing image {image_path}: {e}")
|
||||||
|
raise # Re-raise the exception to be caught by the caller
|
||||||
|
|
||||||
def _is_supported_format(self, file_path: str) -> bool:
|
def _is_supported_format(self, file_path: str) -> bool:
|
||||||
"""Check if the file format is supported.
|
"""Check if the file format is supported.
|
||||||
|
@ -111,8 +147,11 @@ class PyNamer:
|
||||||
return None
|
return None
|
||||||
|
|
||||||
try:
|
try:
|
||||||
# Encode image
|
# Resize and encode image
|
||||||
base64_image = self._encode_image(image_path)
|
base64_image = self._resize_and_encode_image(image_path)
|
||||||
|
|
||||||
|
# Determine the mime type based on the resize format
|
||||||
|
mime_type = f"image/{self.resize_format.lower()}"
|
||||||
|
|
||||||
# Prepare messages for LLM
|
# Prepare messages for LLM
|
||||||
system_message = self.config.get('prompt', {}).get('system_message', '')
|
system_message = self.config.get('prompt', {}).get('system_message', '')
|
||||||
|
@ -126,7 +165,7 @@ class PyNamer:
|
||||||
{"type": "text", "text": user_message},
|
{"type": "text", "text": user_message},
|
||||||
{
|
{
|
||||||
"type": "image_url",
|
"type": "image_url",
|
||||||
"image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}
|
"image_url": {"url": f"data:{mime_type};base64,{base64_image}"}
|
||||||
}
|
}
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
|
|
|
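Note on the `mime_type` line in the hunk above: lower-casing `resize_format` gives the right MIME subtype for the formats used here (JPEG, PNG, GIF, WEBP), but it would not for a format whose Pillow name differs from its MIME subtype (for example `JPEG2000`, whose MIME type is `image/jp2`). An explicit lookup is one possible alternative. The names `MIME_TYPES`, `mime_type_for`, and `data_url` are hypothetical, and restricting the table to these four formats is an assumption.

```python
# Hypothetical lookup table: Pillow format name -> MIME type.
MIME_TYPES = {
    "JPEG": "image/jpeg",
    "PNG": "image/png",
    "GIF": "image/gif",
    "WEBP": "image/webp",
}


def mime_type_for(resize_format: str) -> str:
    """Map a configured resize_format to its MIME type, rejecting unknown formats."""
    fmt = resize_format.strip().upper()
    try:
        return MIME_TYPES[fmt]
    except KeyError:
        raise ValueError(f"Unsupported resize_format: {resize_format!r}") from None


def data_url(mime_type: str, base64_image: str) -> str:
    """Build the data URL that goes into the image_url part of the message."""
    return f"data:{mime_type};base64,{base64_image}"
```

Failing early on an unknown format surfaces a configuration mistake before any API call is made.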
@ -1,2 +1,3 @@
|
||||||
litellm>=1.10.0
|
litellm>=1.10.0
|
||||||
pyyaml>=6.0
|
pyyaml>=6.0
|
||||||
|
Pillow>=9.0.0 # Added for image resizing
|
||||||
|
|
1
setup.py
|
@ -30,6 +30,7 @@ setup(
|
||||||
install_requires=[
|
install_requires=[
|
||||||
"litellm>=1.10.0",
|
"litellm>=1.10.0",
|
||||||
"pyyaml>=6.0",
|
"pyyaml>=6.0",
|
||||||
|
"Pillow>=9.0.0",
|
||||||
],
|
],
|
||||||
python_requires=">=3.7",
|
python_requires=">=3.7",
|
||||||
entry_points={
|
entry_points={
|
||||||
|
|
|
@ -1,138 +0,0 @@
|
||||||
Metadata-Version: 2.1
|
|
||||||
Name: pynamer
|
|
||||||
Version: 0.1.0
|
|
||||||
Summary: Generate descriptive filenames for images using LLMs
|
|
||||||
Home-page: https://github.com/yourusername/pynamer
|
|
||||||
Author: Your Name
|
|
||||||
Author-email: your.email@example.com
|
|
||||||
Classifier: Development Status :: 3 - Alpha
|
|
||||||
Classifier: Intended Audience :: Developers
|
|
||||||
Classifier: License :: OSI Approved :: MIT License
|
|
||||||
Classifier: Programming Language :: Python :: 3
|
|
||||||
Classifier: Programming Language :: Python :: 3.7
|
|
||||||
Classifier: Programming Language :: Python :: 3.8
|
|
||||||
Classifier: Programming Language :: Python :: 3.9
|
|
||||||
Classifier: Programming Language :: Python :: 3.10
|
|
||||||
Requires-Python: >=3.7
|
|
||||||
Description-Content-Type: text/markdown
|
|
||||||
License-File: LICENSE
|
|
||||||
Requires-Dist: litellm>=1.10.0
|
|
||||||
Requires-Dist: pyyaml>=6.0
|
|
||||||
|
|
||||||
# PyNamer
|
|
||||||
|
|
||||||
PyNamer is a command-line tool that uses AI vision models to generate descriptive filenames for images. It analyzes the content of images and renames them with meaningful, descriptive filenames in snake_case format.
|
|
||||||
|
|
||||||
## Features
|
|
||||||
|
|
||||||
- Uses LiteLLM to integrate with various vision-capable LLMs (default: GPT-4 Vision)
|
|
||||||
- Configurable via YAML config file
|
|
||||||
- Supports multiple image formats (jpg, jpeg, png, gif, webp)
|
|
||||||
- Dry-run mode to preview changes without renaming files
|
|
||||||
- Handles filename collisions automatically
|
|
||||||
|
|
||||||
## Installation
|
|
||||||
|
|
||||||
### Option 1: Install from PyPI (recommended)
|
|
||||||
|
|
||||||
```bash
|
|
||||||
pip install pynamer
|
|
||||||
```
|
|
||||||
|
|
||||||
### Option 2: Install from source
|
|
||||||
|
|
||||||
1. Clone this repository
|
|
||||||
2. Install the package in development mode:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
pip install -e .
|
|
||||||
```
|
|
||||||
|
|
||||||
### Set up your API key
|
|
||||||
|
|
||||||
You need to set up your API key for the vision model:
|
|
||||||
|
|
||||||
- Set the appropriate environment variable (e.g., `OPENAI_API_KEY`), or
|
|
||||||
- Create a custom config file with your API key
|
|
||||||
|
|
||||||
## Configuration
|
|
||||||
|
|
||||||
PyNamer comes with a default configuration, but you can create a custom config file to customize:
|
|
||||||
|
|
||||||
- LLM provider and model
|
|
||||||
- API key and endpoint
|
|
||||||
- Supported image formats
|
|
||||||
- Prompt templates for filename generation
|
|
||||||
|
|
||||||
Example custom config file (config.yaml):
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
llm:
|
|
||||||
provider: "openai"
|
|
||||||
model: "gpt-4-vision-preview"
|
|
||||||
api_key: "your-api-key-here"
|
|
||||||
max_tokens: 100
|
|
||||||
temperature: 0.7
|
|
||||||
```
|
|
||||||
|
|
||||||
## Usage
|
|
||||||
|
|
||||||
After installation, you can use PyNamer directly from the command line:
|
|
||||||
|
|
||||||
Basic usage:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
pynamer path/to/image.jpg
|
|
||||||
```
|
|
||||||
|
|
||||||
Process multiple images:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
pynamer image1.jpg image2.png image3.jpg
|
|
||||||
```
|
|
||||||
|
|
||||||
Use a different config file:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
pynamer -c custom_config.yaml image.jpg
|
|
||||||
```
|
|
||||||
|
|
||||||
Preview changes without renaming (dry run):
|
|
||||||
|
|
||||||
```bash
|
|
||||||
pynamer -d image.jpg
|
|
||||||
```
|
|
||||||
|
|
||||||
Enable verbose logging:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
pynamer -v image.jpg
|
|
||||||
```
|
|
||||||
|
|
||||||
## Example
|
|
||||||
|
|
||||||
Input: `IMG_20230615_123456.jpg` (a photo of a cat sleeping on a window sill)
|
|
||||||
|
|
||||||
Output: `orange_cat_sleeping_on_sunny_windowsill.jpg`
|
|
||||||
|
|
||||||
## Development
|
|
||||||
|
|
||||||
### Building the package
|
|
||||||
|
|
||||||
```bash
|
|
||||||
pip install build
|
|
||||||
python -m build
|
|
||||||
```
|
|
||||||
|
|
||||||
### Installing in development mode
|
|
||||||
|
|
||||||
```bash
|
|
||||||
pip install -e .
|
|
||||||
```
|
|
||||||
|
|
||||||
## Requirements
|
|
||||||
|
|
||||||
- Python 3.7+
|
|
||||||
- LiteLLM
|
|
||||||
- PyYAML
|
|
||||||
- Access to a vision-capable LLM API (OpenAI, Anthropic, etc.)
|
|
|
@ -1,15 +0,0 @@
|
||||||
LICENSE
|
|
||||||
MANIFEST.in
|
|
||||||
README.md
|
|
||||||
pyproject.toml
|
|
||||||
setup.py
|
|
||||||
src/pynamer/__init__.py
|
|
||||||
src/pynamer/cli.py
|
|
||||||
src/pynamer/config.yaml
|
|
||||||
src/pynamer/core.py
|
|
||||||
src/pynamer.egg-info/PKG-INFO
|
|
||||||
src/pynamer.egg-info/SOURCES.txt
|
|
||||||
src/pynamer.egg-info/dependency_links.txt
|
|
||||||
src/pynamer.egg-info/entry_points.txt
|
|
||||||
src/pynamer.egg-info/requires.txt
|
|
||||||
src/pynamer.egg-info/top_level.txt
|
|
|
@ -1,2 +0,0 @@
|
||||||
litellm>=1.10.0
|
|
||||||
pyyaml>=6.0
|
|
|
@ -1,3 +1,3 @@
|
||||||
"""PyNamer - Generate descriptive filenames for images using LLMs."""
|
"""PyNamer - Generate descriptive filenames for images using LLMs."""
|
||||||
|
|
||||||
__version__ = "0.1.0"
|
__version__ = "0.2.0"
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
"""Core functionality for PyNamer."""
|
"#""Core functionality for PyNamer."""
|
||||||
|
|
||||||
import argparse
|
import argparse
|
||||||
import base64
|
import base64
|
||||||
|
|