Active Context: PyNamer - Image Resizing Implementation
Current Focus: Implementing image resizing functionality to normalize image dimensions before sending them to the LLM.
Decisions Made:
- Use the `Pillow` library for image manipulation due to its robustness and ease of use in Python.
- Add `Pillow` to `requirements.txt`.
- Introduce configuration options in `config.yaml` under the `image` section:
  - `resize_max_dimension`: Controls the maximum size (width or height) of the image sent to the LLM. Defaults to 1024.
  - `resize_format`: Specifies the image format (e.g., 'JPEG', 'PNG') to use after resizing. Defaults to 'JPEG'.
- Modify the `_encode_image` method (renamed to `_resize_and_encode_image`) to perform resizing:
  - Open the image using `PIL.Image.open()`.
  - Check if the image's largest dimension exceeds `resize_max_dimension`.
  - If it exceeds, calculate new dimensions maintaining aspect ratio and resize using `img.resize()` with `Image.Resampling.LANCZOS`.
  - Save the (potentially resized) image to an in-memory buffer (`io.BytesIO`) using the configured `resize_format`.
  - Handle potential transparency issues when saving formats like JPEG by converting the image mode to 'RGB' if necessary.
  - Base64 encode the bytes from the buffer.
- Update the `generate_filename` method to call `_resize_and_encode_image` instead of `_encode_image`.
- Update the `generate_filename` method to dynamically set the `mime_type` in the LLM request based on `resize_format`.
- Load the new configuration options in `_setup_llm`.
Next Steps:
- Create `memory-bank/progress.md`.
- Create `.clinerules`.
- Final review and testing.