ocr-screenshot-gallery/README.md
2025-11-16 01:42:27 -05:00


Screenshot OCR Gallery

A Qt6-based image gallery application that allows you to search through OCR data from your screenshots with live preview and dynamic resizing.

Features

  • Fast visual navigation through your screenshot collection
  • Non-blocking UI with background threaded search operations
  • Smart typing detection with 500ms inactivity timer before searching
  • Visual feedback while typing and searching with animated status indicators
  • Settings dialog to customize database location and screenshots directory
  • Settings stored in ~/.config/ScreenshotOCRGallery/settings.ini for easy access
  • Customizable image preload count for performance tuning
  • Prominent "Load More Images" button for easy one-click pagination
  • Optimized lazy loading that only loads images when needed
  • Live search through OCR text using SQLite FTS5 full-text search
  • Search stays fast even with large databases containing thousands of screenshots
  • Dynamic grid layout that automatically reflows (1x, 2x, 3x, 4x, etc.) based on window width
  • No horizontal scrollbars - content always fits the window width
  • Dynamic filename overlay at the bottom of each image that scales with the thumbnail width
  • Opens images in your default image viewer on click
  • Menu bar with File options (Open, Open With, Settings, Quit)
  • Minimal 2px spacing between images for a compact view
  • Proper error handling for missing files and database issues
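
The grid reflow described above comes down to a small piece of arithmetic. The following is a minimal sketch of that calculation, not the application's actual code: the helper names and the 240px minimum thumbnail width used in the usage note are assumptions, while the 2px spacing comes from the feature list.

```cpp
#include <algorithm>

// Hypothetical helper: how many thumbnail columns fit into the viewport.
// minThumbWidth is an assumed minimum thumbnail width in pixels;
// spacing is the gap between thumbnails (2px per the feature list).
int columnsForWidth(int viewportWidth, int minThumbWidth, int spacing = 2) {
    if (viewportWidth <= 0 || minThumbWidth <= 0) return 1;
    // n columns need n * minThumbWidth + (n - 1) * spacing pixels.
    int n = (viewportWidth + spacing) / (minThumbWidth + spacing);
    return std::max(1, n);  // very narrow windows still get one column
}

// Hypothetical helper: thumbnail width so that all columns plus gaps
// never exceed the viewport, which avoids a horizontal scrollbar.
int thumbWidthFor(int viewportWidth, int columns, int spacing = 2) {
    columns = std::max(1, columns);
    int w = (viewportWidth - (columns - 1) * spacing) / columns;
    return std::max(1, w);
}
```

With an assumed minimum width of 240px, a 1000px wide window gives columnsForWidth(1000, 240) == 4 and thumbWidthFor(1000, 4) == 248. The filename overlay can then be sized from the same thumbnail width.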

Requirements

Build Dependencies

  • Qt6 (Core, Gui, Widgets, Sql modules)
  • C++17 compatible compiler
  • CMake (3.16+)
  • SQLite3 support

For Arch Linux users:

sudo pacman -S qt6-base qt6-tools cmake

For Debian/Ubuntu users:

sudo apt install qt6-base-dev libqt6sql6-sqlite cmake

Database Requirements

By default, the application expects a SQLite database file at /home/master/screenshot_ocr.db (the path can be changed in the Settings dialog). The database must contain a table with the following schema:

CREATE TABLE ocr_results (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    filename TEXT UNIQUE,
    full_path TEXT,
    ocr_text TEXT,
    file_size INTEGER,
    created_date TEXT,
    ocr_date TEXT
);

Building the Project

  1. Clone or download this repository
  2. Navigate to the project directory
  3. Run the build script:
cd screenshot-gallery
chmod +x build.sh
./build.sh

The build script will:

  • Check for required dependencies
  • Create a build directory
  • Run CMake to configure the project
  • Build the executable
  • Ask if you want to run the application

Manual Build

If you prefer to build manually:

mkdir -p build
cd build
cmake ..
make

Running the Application

After building, run the application:

./build/screenshot-gallery

Usage

  1. When the application starts, it will display the first batch of screenshots (20 by default)

  2. Click the large "Load More Images" button at the end of the gallery to load additional images with a single click

  3. Type in the search bar to filter images by OCR text content

  4. The app waits for you to stop typing (500ms pause) before performing the search

  5. Animated status indicators show when you're typing and when searching is in progress

  6. Search happens in the background - UI stays responsive even during complex searches

  7. Results update instantly as they become available, loading only what you can see

  8. When you clear the search bar, the first batch of images loads immediately

  9. Resize the application window to see the grid automatically reflow:

    • Wider windows show more columns (4x, 5x, etc.)
    • Narrower windows reduce to fewer columns (3x, 2x)
    • Very narrow windows show a single centered column (1x)
    • No horizontal scrolling - content always fits the available width
  10. Each image displays its filename at the bottom with a dark overlay that resizes with the thumbnail

  11. Click on any image to open it in your default image viewer

  12. Use the File menu for additional options:

    • Open: Select and open an image file
    • Open With: Choose a program to open an image file
    • Settings: Configure your database path and screenshots directory
    • Quit: Exit the application
  13. Configure the application through the Settings dialog:

    • Database File Path: Change where your OCR database is stored
    • Screenshots Directory: Set the default location for your screenshots
    • Image count to pre-load: Adjust how many images are loaded at once (default: 20)
    • The settings are automatically saved to ~/.config/ScreenshotOCRGallery/settings.ini
    • Changes are applied immediately without requiring a restart
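
Steps 6 and 7 above (background search with results shown as they arrive) can be sketched with the C++ standard library alone. This is an illustration under stated assumptions, not the application's implementation: the class BackgroundSearch, the Results type, and the generation counter used to discard outdated results are all hypothetical, and a Qt application would more likely use Qt's own threading classes.

```cpp
#include <atomic>
#include <cstdint>
#include <functional>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type: filenames whose OCR text matched.
using Results = std::vector<std::string>;

struct TaggedResults {
    std::uint64_t generation;  // which request produced these results
    Results items;
};

class BackgroundSearch {
public:
    explicit BackgroundSearch(std::function<Results(const std::string&)> searchFn)
        : searchFn_(std::move(searchFn)) {}

    // Starts the search on another thread and returns immediately,
    // so the calling (UI) thread is never blocked by the query itself.
    std::future<TaggedResults> start(const std::string& query) {
        const std::uint64_t gen = ++latest_;
        auto fn = searchFn_;  // copy so the task does not depend on `this`
        return std::async(std::launch::async, [fn, gen, query] {
            return TaggedResults{gen, fn(query)};
        });
    }

    // True only for results that belong to the most recent request;
    // results of superseded searches should be dropped, not displayed.
    bool isCurrent(const TaggedResults& r) const {
        return r.generation == latest_.load();
    }

private:
    std::function<Results(const std::string&)> searchFn_;
    std::atomic<std::uint64_t> latest_{0};
};
```

The caller keeps the returned future and, once it is ready, shows the results only if isCurrent() returns true. That way a slow search for an older query cannot overwrite the results of a newer one.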

Search Technology

This application combines multiple performance-enhancing technologies:

1. Customizable Configuration

  • Settings Dialog: Easily configure database location, screenshots directory, and preload count
  • File Path Selection: Browse for locations using native file pickers
  • Persistent Settings: Your configuration is saved in ~/.config/ScreenshotOCRGallery/settings.ini
  • Performance Tuning: Adjust how many images are preloaded (20 by default)
  • Dynamic Updates: Changes are applied immediately without requiring a restart

2. Intelligent Input Handling

  • Typing Inactivity Detection: Search only triggers after 500ms of no typing
  • Visual Feedback: Animated status indicators show typing and searching states
  • Immediate Response: UI instantly acknowledges your input
  • Efficient Processing: Prevents wasteful searches while you're still typing
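
The inactivity timer can be expressed as a small state machine. The sketch below is framework-independent and takes the current time as a parameter so it is easy to test; the class name Debouncer and its methods are assumptions for illustration. In a Qt application the same effect is commonly achieved by restarting a single-shot QTimer on every text change.

```cpp
#include <chrono>
#include <optional>

// Hypothetical debounce helper: fires once after `delay` of no typing.
class Debouncer {
public:
    using Clock = std::chrono::steady_clock;
    using TimePoint = Clock::time_point;

    explicit Debouncer(std::chrono::milliseconds delay = std::chrono::milliseconds(500))
        : delay_(delay) {}

    // Call on every keystroke; restarts the inactivity window.
    void noteKeystroke(TimePoint now) { lastKey_ = now; }

    // Call from a timer or event loop. Returns true exactly once, when at
    // least `delay` has passed since the last keystroke.
    bool shouldSearch(TimePoint now) {
        if (!lastKey_ || now - *lastKey_ < delay_) return false;
        lastKey_.reset();  // consume the pending search
        return true;
    }

    // True while a keystroke is waiting for its inactivity window to end.
    bool pending() const { return lastKey_.has_value(); }

private:
    std::chrono::milliseconds delay_;
    std::optional<TimePoint> lastKey_;
};
```

While pending() is true the UI can show the "typing" indicator; when shouldSearch() returns true it can switch to the "searching" indicator and start the query.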

3. Lazy Loading & Pagination

  • Initial Fast Load: Only loads a configurable batch of images (default: 20) for immediate display
  • Prominent Load More Button: Large, clearly visible button at the end of the image list for loading more images
  • Customizable Batch Size: Adjust the number of images loaded at once via settings
  • Optimized Memory Usage: Only keeps necessary images in memory
  • Built-in Progress Tracking: Button dynamically updates to show current progress (e.g., "Load More Images (20 of 157)")
  • Simple Interaction: One-click loading of additional images without needing to scroll
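
As a rough illustration of the batch loading and the progress label described above, the pagination state can be kept in a small struct. The name Pager and its members are hypothetical; only the default batch size of 20 and the label format are taken from this README.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical pagination state for the gallery.
// Invariant: loaded <= total.
struct Pager {
    std::size_t total = 0;       // number of images matching the current query
    std::size_t loaded = 0;      // number of images currently displayed
    std::size_t batchSize = 20;  // default preload count from the settings

    // Advances by one batch and returns how many images were added.
    std::size_t loadMore() {
        const std::size_t n = std::min(batchSize, total - loaded);
        loaded += n;
        return n;
    }

    // False once everything is loaded, so the button can be hidden.
    bool hasMore() const { return loaded < total; }

    // Text for the button, e.g. "Load More Images (20 of 157)".
    std::string buttonLabel() const {
        return "Load More Images (" + std::to_string(loaded) + " of " +
               std::to_string(total) + ")";
    }
};
```

With total set to 157, the first call to loadMore() loads 20 images and buttonLabel() returns "Load More Images (20 of 157)". The eighth call loads the remaining 17.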

4. Background Threading

  • Non-Blocking UI: All search operations happen in background threads
  • Responsive Interface: The application remains fully responsive while searching
  • Parallel Processing: Search operations don't block the main UI thread
  • Live Updates: Results appear as they become available

5. SQLite FTS5 Full-Text Search

  • What is FTS5? A full-text search engine built into SQLite that answers queries from a dedicated index instead of scanning every row
  • Performance Benefits: Up to 100× faster than standard LIKE queries, which must scan the text of every row
  • Smart Search: Supports prefix matches and phrase queries, plus word stemming when the FTS table is created with the porter tokenizer
  • Automatic Fallback: If FTS5 is not available, the app automatically falls back to standard search
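
To show how live search input might be turned into an FTS5 query, here is a minimal sketch. The function name toFts5Query is an assumption and the application's real query construction may differ. It relies on documented FTS5 syntax: double-quoted strings (embedded quotes are doubled), an implicit AND between whitespace-separated phrases, and a trailing * for prefix matching.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: converts free-form user input into an FTS5 MATCH
// expression. Every word is quoted so that characters such as '-' or ':'
// are not interpreted as FTS5 operators.
std::string toFts5Query(const std::string& input) {
    std::istringstream words(input);
    std::string word, query;
    while (words >> word) {
        if (!query.empty()) query += ' ';  // whitespace acts as implicit AND
        query += '"';
        for (char c : word) {
            if (c == '"') query += '"';  // escape a quote by doubling it
            query += c;
        }
        query += '"';
    }
    // The last word is usually still being typed, so match it as a prefix.
    if (!query.empty()) query += '*';
    return query;
}
```

For the input `error lo` this produces `"error" "lo"*`, which matches rows containing the word "error" and any word starting with "lo". The result is intended to be bound as a parameter to a statement such as `SELECT ... WHERE ocr_fts MATCH ?`, where ocr_fts is a hypothetical FTS5 table built over the ocr_text column.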

6. Search Result Caching

  • Paginated Cache: Results are cached by page for efficient retrieval
  • Smart Invalidation: Cache expires after 5 minutes to ensure fresh results
  • Thread-Safe: The cache is protected for concurrent access
  • Memory Efficient: Only stores what's needed, with automatic cleanup
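
The caching behaviour listed above can be sketched as a mutex-protected map keyed by query and page number. This is an illustration only: the class PageCache and its interface are assumptions, and it removes expired entries on lookup rather than with whatever cleanup strategy the application actually uses. The 5 minute lifetime is the value stated above.

```cpp
#include <chrono>
#include <map>
#include <mutex>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical thread-safe cache of result pages with a time-to-live.
class PageCache {
public:
    using Clock = std::chrono::steady_clock;
    using Page = std::vector<std::string>;  // e.g. filenames on one page

    explicit PageCache(std::chrono::seconds ttl = std::chrono::minutes(5))
        : ttl_(ttl) {}

    // Stores one page of results for a query, replacing any older copy.
    void put(const std::string& query, int page, Page rows,
             Clock::time_point now = Clock::now()) {
        const Key key{query, page};
        std::lock_guard<std::mutex> lock(mutex_);
        entries_.insert_or_assign(key, Entry{std::move(rows), now});
    }

    // Returns the cached page, or nothing if it is missing or expired.
    std::optional<Page> get(const std::string& query, int page,
                            Clock::time_point now = Clock::now()) {
        const Key key{query, page};
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = entries_.find(key);
        if (it == entries_.end()) return std::nullopt;
        if (now - it->second.stored >= ttl_) {
            entries_.erase(it);  // expired: drop it and report a miss
            return std::nullopt;
        }
        return it->second.rows;
    }

private:
    using Key = std::pair<std::string, int>;
    struct Entry {
        Page rows;
        Clock::time_point stored;
    };

    std::chrono::seconds ttl_;
    std::mutex mutex_;
    std::map<Key, Entry> entries_;
};
```

The optional time argument exists so that expiry can be tested without waiting; normal callers would omit it.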

Troubleshooting

Database Connection Issues

If the application cannot connect to the database:

  • Ensure the database file exists at the configured location (default: /home/master/screenshot_ocr.db; the path can be changed in the Settings dialog)
  • Check file permissions
  • Verify the database has the required schema

Missing Images

If the gallery shows placeholders instead of images:

  • Verify that the image files exist at the paths stored in the database
  • Check file permissions
  • Ensure the paths in the database are correct and accessible

Build Issues

If the build fails:

  • Make sure you have installed all required Qt6 packages
  • Check that your CMake version is at least 3.16
  • Ensure you have the necessary permissions in the build directory

License

This project is released under the MIT License.

Contributing

Contributions are welcome! Feel free to submit pull requests or open issues for bugs and feature requests.