Get OCR text positions with bounding boxes for a specific frame. If the frame has no OCR data, falls back to accessibility tree node bounds. Both OCR and accessibility bounds are normalized to the 0-1 range relative to the full monitor (full-screen capture), so they map directly onto the screenshot of that frame.
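
Because the bounds are normalized to the monitor, drawing them on the screenshot only requires scaling by the image dimensions. The following is a minimal sketch of that conversion. The `NormalizedBox` type and its field names (`left`, `top`, `width`, `height`) are assumptions made for illustration and may not match the actual response schema.

```python
from dataclasses import dataclass


@dataclass
class NormalizedBox:
    # Hypothetical field names; the real response schema may differ.
    left: float
    top: float
    width: float
    height: float


def to_pixels(box: NormalizedBox, image_width: int, image_height: int) -> tuple[int, int, int, int]:
    """Scale a 0-1 normalized box to pixel (x, y, w, h) for a screenshot of the given size."""
    x = round(box.left * image_width)
    y = round(box.top * image_height)
    w = round(box.width * image_width)
    h = round(box.height * image_height)
    return x, y, w, h


# Example: a box covering the middle half of a 1920x1080 screenshot's width.
box = NormalizedBox(left=0.25, top=0.5, width=0.5, height=0.125)
print(to_pixels(box, 1920, 1080))  # (480, 540, 960, 135)
```

The same conversion applies whether the bounds came from OCR or from the accessibility fallback, since both use the same monitor-relative coordinate space.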