api reference for screenpipe
below is the detailed api reference for screenpipe's core functionality.
search api
- endpoint: /search
- method: get
- description: searches captured data (ocr, audio transcriptions, etc.) stored in screenpipe's local database.
query parameters:
- q (string, optional): search term (a SINGLE word)
- content_type (enum): type of content to search:
  - ocr: optical character recognition text
  - audio: audio transcriptions
  - ui: user interface elements
- limit (int): max results per page (default: 20)
- offset (int): pagination offset
- start_time (timestamp, optional): filter by start timestamp
- end_time (timestamp, optional): filter by end timestamp
- app_name (string, optional): filter by application name
- window_name (string, optional): filter by window name
- include_frames (bool, optional): include base64 encoded frames
- min_length (int, optional): minimum content length
- max_length (int, optional): maximum content length
- speaker_ids (int[], optional): filter by specific speaker ids
sample requests:
# Basic search
curl "http://localhost:3030/search?q=meeting&content_type=ocr&limit=10"
# Audio search with speaker filter
curl "http://localhost:3030/search?content_type=audio&speaker_ids=1,2"
# UI elements search
curl "http://localhost:3030/search?content_type=ui&app_name=chrome"
sample response:
{
  "data": [
    {
      "type": "OCR",
      "content": {
        "frame_id": 123,
        "text": "meeting notes",
        "timestamp": "2024-03-10T12:00:00Z",
        "file_path": "/frames/frame123.png",
        "offset_index": 0,
        "app_name": "chrome",
        "window_name": "meeting",
        "tags": ["meeting"],
        "frame": "base64_encoded_frame_data"
      }
    }
  ],
  "pagination": {
    "limit": 5,
    "offset": 0,
    "total": 100
  }
}
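the query parameters and the pagination block above can be exercised from a small client. the sketch below only builds a /search url and computes the next page offset; it does not contact the server. `build_search_url` and `next_offset` are hypothetical helper names, and sending `speaker_ids` comma-separated and booleans as lowercase `true`/`false` are assumptions based on the sample requests, not confirmed server behavior.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:3030"  # host and port from the sample requests


def build_search_url(q=None, content_type=None, limit=20, offset=0, **filters):
    """Build a /search url from the documented query parameters (hypothetical helper)."""
    params = {"q": q, "content_type": content_type, "limit": limit, "offset": offset}
    params.update(filters)  # e.g. app_name, start_time, speaker_ids
    encoded = {}
    for key, value in params.items():
        if value is None:
            continue  # optional parameters are simply left out
        if isinstance(value, bool):
            value = "true" if value else "false"  # assumption: lowercase booleans
        elif isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)  # assumption: comma-separated ids
        encoded[key] = value
    # keep "," and ":" unescaped so the url looks like the curl samples
    return f"{BASE_URL}/search?{urlencode(encoded, safe=',:')}"


def next_offset(pagination):
    """Return the offset of the next page, or None once the last page is reached."""
    following = pagination["offset"] + pagination["limit"]
    return following if following < pagination["total"] else None


print(build_search_url(q="meeting", content_type="ocr", limit=10))
print(next_offset({"limit": 5, "offset": 0, "total": 100}))
```

feeding the returned offset back into `build_search_url` walks through the result set one page at a time.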
audio devices api
- endpoint: /audio/list
- method: get
- description: lists available audio input/output devices
sample response:
[
  {
    "name": "built-in microphone",
    "is_default": true
  }
]
monitors api
- endpoint: /vision/list
- method: post
- description: lists available monitors/displays
sample response:
[
  {
    "id": 1,
    "name": "built-in display",
    "width": 2560,
    "height": 1600,
    "is_default": true
  }
]
tags api
- endpoint: /tags/:content_type/:id
- methods: post (add), delete (remove)
- description: manage tags for content items
- content_type: vision or audio
add tags request:
{
  "tags": ["important", "meeting"]
}
sample response:
{
  "success": true
}
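as a rough illustration of how the path parameters and the request body fit together, the sketch below prepares (but does not send) a tag request with python's standard library. `tag_request` is a hypothetical helper; the base url is borrowed from the search samples, and reusing the same `tags` body for delete is an assumption, since only the add request is documented.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:3030"  # assumption: same host and port as the search samples


def tag_request(content_type, item_id, tags, remove=False):
    """Prepare a request that adds or removes tags on a content item."""
    if content_type not in ("vision", "audio"):
        raise ValueError("content_type must be 'vision' or 'audio'")
    return Request(
        f"{BASE_URL}/tags/{content_type}/{item_id}",
        data=json.dumps({"tags": list(tags)}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        # assumption: delete accepts the same body shape as post
        method="DELETE" if remove else "POST",
    )


req = tag_request("vision", 123, ["important", "meeting"])
print(req.get_method(), req.full_url)
# to actually send it: urllib.request.urlopen(req), with screenpipe running locally
```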
pipes api
list pipes
- endpoint: /pipes/list
- method: get
download pipe
- endpoint: /pipes/download
- method: post
request body:
{
  "url": "https://github.com/user/repo/pipe-example"
}
enable pipe
- endpoint: /pipes/enable
- method: post
request body:
{
  "pipe_id": "pipe-example"
}
disable pipe
- endpoint: /pipes/disable
- method: post
request body:
{
  "pipe_id": "pipe-example"
}
update pipe config
- endpoint: /pipes/update
- method: post
request body:
{
  "pipe_id": "pipe-example",
  "config": {
    "key": "value"
  }
}
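the pipe endpoints above are typically used together. the sketch below lists the request paths and json bodies for one plausible sequence (download, enable, then update config). the ordering, the helper name `pipe_setup_steps`, and the idea that the caller already knows the `pipe_id` are assumptions for illustration; the reference does not say how a pipe id is derived from the download url.

```python
import json


def pipe_setup_steps(source_url, pipe_id, config):
    """Return (path, json_body) pairs for download -> enable -> update (hypothetical flow)."""
    return [
        ("/pipes/download", json.dumps({"url": source_url})),
        ("/pipes/enable", json.dumps({"pipe_id": pipe_id})),
        ("/pipes/update", json.dumps({"pipe_id": pipe_id, "config": config})),
    ]


steps = pipe_setup_steps(
    "https://github.com/user/repo/pipe-example",  # url from the download sample
    "pipe-example",
    {"key": "value"},
)
for path, body in steps:
    # each pair would be sent as a post request with a json body
    print(path, body)
```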
speakers api
list unnamed speakers
- endpoint: /speakers/unnamed
- method: get
- description: get list of speakers without names assigned
query parameters:
- limit (int): max results
- offset (int): pagination offset
- speaker_ids (int[], optional): filter specific speaker ids
sample request:
curl "http://localhost:3030/speakers/unnamed?limit=10&offset=0"
search speakers
- endpoint: /speakers/search
- method: get
- description: search speakers by name
query parameters:
- name (string, optional): name prefix to search for
sample request:
curl "http://localhost:3030/speakers/search?name=john"
update speaker
- endpoint: /speakers/update
- method: post
- description: update speaker name or metadata
request body:
{
  "id": 123,
  "name": "john doe",
  "metadata": "{\"role\": \"engineer\"}"
}
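note that `metadata` in the body above is a json-encoded string, not a nested object. the sketch below shows one way to produce that double encoding; `update_speaker_body` is a hypothetical helper, and treating `name` and `metadata` as individually optional is an assumption drawn from the "name or metadata" wording.

```python
import json


def update_speaker_body(speaker_id, name=None, metadata=None):
    """Serialize a /speakers/update body, encoding metadata as a json string."""
    body = {"id": speaker_id}
    if name is not None:
        body["name"] = name
    if metadata is not None:
        # metadata travels as a string that itself contains json
        body["metadata"] = json.dumps(metadata)
    return json.dumps(body)


payload = update_speaker_body(123, name="john doe", metadata={"role": "engineer"})
print(payload)
```

a client reading the value back would need to call `json.loads` twice: once for the body and once for the `metadata` string.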
delete speaker
- endpoint: /speakers/delete
- method: post
- description: delete a speaker and associated audio chunks
request body:
{
  "id": 123
}
get similar speakers
- endpoint: /speakers/similar
- method: get
- description: find speakers with similar voice patterns
query parameters:
- speaker_id (int): reference speaker id
- limit (int): max results
sample request:
curl "http://localhost:3030/speakers/similar?speaker_id=123&limit=5"
merge speakers
- endpoint: /speakers/merge
- method: post
- description: merge two speakers into one
request body:
{
  "speaker_to_keep_id": 123,
  "speaker_to_merge_id": 456
}
mark as hallucination
- endpoint: /speakers/hallucination
- method: post
- description: mark a speaker as incorrectly identified
request body:
{
  "speaker_id": 123
}
health api
- endpoint: /health
- method: get
- description: system health status
sample response:
{
  "status": "healthy",
  "last_frame_timestamp": "2024-03-10T12:00:00Z",
  "last_audio_timestamp": "2024-03-10T12:00:00Z",
  "last_ui_timestamp": "2024-03-10T12:00:00Z",
  "frame_status": "ok",
  "audio_status": "ok",
  "ui_status": "ok",
  "message": "all systems functioning normally"
}
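a monitoring script might want to know which subsystem is failing rather than just the overall status. the sketch below inspects a parsed /health response; `degraded_subsystems` is a hypothetical helper, and treating any value other than "ok" as degraded is an assumption, since the reference only shows the "ok" case.

```python
def degraded_subsystems(health):
    """Return the subsystems (frame, audio, ui) whose status is not 'ok'."""
    return [
        key[: -len("_status")]
        for key in ("frame_status", "audio_status", "ui_status")
        # assumption: a missing or non-"ok" value means the subsystem needs attention
        if health.get(key) != "ok"
    ]


sample = {
    "status": "healthy",
    "frame_status": "ok",
    "audio_status": "ok",
    "ui_status": "ok",
}
print(degraded_subsystems(sample))
```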
stream frames api
- endpoint: /stream/frames
- method: get
- description: stream frames as server-sent events (sse)
query parameters:
- start_time (timestamp): start time for frame stream
- end_time (timestamp): end time for frame stream
sample request:
curl "http://localhost:3030/stream/frames?start_time=2024-03-10T12:00:00Z&end_time=2024-03-10T13:00:00Z"
sample event data:
{
  "timestamp": "2024-03-10T12:00:00Z",
  "devices": [
    {
      "device_id": "screen-1",
      "frame": "base64_encoded_frame_data"
    }
  ]
}
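to consume the stream, a client has to split the sse text into events and decode each frame. the sketch below parses an iterable of text lines (for example, decoded lines read from an open http response) and base64-decodes the frames. `iter_sse_json` and `decode_frames` are hypothetical helpers, and the sketch assumes every event carries a single json object in its `data:` field, as in the sample above.

```python
import base64
import json


def iter_sse_json(lines):
    """Yield one parsed json object per server-sent event."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # the sse format allows one optional space after the colon
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and buffer:
            # a blank line terminates the event
            yield json.loads("\n".join(buffer))
            buffer = []
        # other fields (event:, id:) and ":" comments are ignored in this sketch


def decode_frames(event):
    """Map device_id to raw image bytes for one frame event."""
    return {d["device_id"]: base64.b64decode(d["frame"]) for d in event["devices"]}


demo_lines = [
    'data: {"timestamp": "2024-03-10T12:00:00Z", '
    '"devices": [{"device_id": "screen-1", "frame": "aGk="}]}',
    "",
]
for event in iter_sse_json(demo_lines):
    print(event["timestamp"], list(decode_frames(event)))
```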
experimental api
merge frames
- endpoint: /experimental/frames/merge
- method: post
- description: merges multiple video frames into a single video
request body:
{
  "video_paths": ["path/to/video1.mp4", "path/to/video2.mp4"]
}
sample response:
{
  "video_path": "/path/to/merged/video.mp4"
}
validate media
- endpoint: /experimental/validate/media
- method: get
- description: validates media file format and integrity
query parameters:
- file_path (string): path to media file to validate
sample response:
{
  "status": "valid media file"
}
input control (experimental feature)
- endpoint: /experimental/input_control
- method: post
- description: control keyboard and mouse input programmatically
request body:
{
  "action": {
    "type": "KeyPress",
    "data": "enter"
  }
}
or
{
  "action": {
    "type": "MouseMove",
    "data": {
      "x": 100,
      "y": 200
    }
  }
}
or
{
  "action": {
    "type": "MouseClick",
    "data": "left"
  }
}
or
{
  "action": {
    "type": "WriteText",
    "data": "hello world"
  }
}
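the four body shapes above differ only in `type` and `data`, so small builder functions keep callers from hand-writing the nesting. the sketch below mirrors the documented shapes; the function names are hypothetical, and no values beyond those in the samples (for example other mouse buttons or key names) are confirmed by this reference.

```python
import json


def _action(action_type, data):
    """Wrap a type/data pair in the documented {"action": {...}} envelope."""
    return {"action": {"type": action_type, "data": data}}


def key_press(key):
    return _action("KeyPress", key)


def mouse_move(x, y):
    return _action("MouseMove", {"x": x, "y": y})


def mouse_click(button="left"):
    return _action("MouseClick", button)


def write_text(text):
    return _action("WriteText", text)


# bodies for a short sequence: move, click, type, confirm
sequence = [mouse_move(100, 200), mouse_click(), write_text("hello world"), key_press("enter")]
for body in sequence:
    print(json.dumps(body))
```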
database api
execute raw sql
- endpoint: /raw_sql
- method: post
- description: execute raw SQL queries against the database (use with caution)
request body:
{
  "query": "SELECT * FROM frames LIMIT 5"
}
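given the "use with caution" note, a client may want a guard before sending arbitrary sql. the sketch below serializes the request body and, by default, refuses anything that does not start with select. `raw_sql_body` is a hypothetical helper, and the prefix check is a deliberately naive client-side convenience, not a security boundary.

```python
import json


def raw_sql_body(query, read_only=True):
    """Serialize a /raw_sql request body, optionally refusing non-select statements."""
    statement = query.strip()
    # naive guard: only looks at the first keyword of the statement
    if read_only and not statement.lower().startswith("select"):
        raise ValueError("only select statements are allowed when read_only is set")
    return json.dumps({"query": statement})


print(raw_sql_body("SELECT * FROM frames LIMIT 5"))
```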
add content
- endpoint: /add
- method: post
- description: add new content (frames or transcriptions) to the database
request body:
{
  "device_name": "device1",
  "content": {
    "content_type": "frames",
    "data": {
      "frames": [
        {
          "file_path": "/path/to/frame.png",
          "timestamp": "2024-03-10T12:00:00Z",
          "app_name": "chrome",
          "window_name": "meeting",
          "ocr_results": [
            {
              "text": "detected text",
              "text_json": "{\"additional\": \"metadata\"}",
              "ocr_engine": "tesseract",
              "focused": true
            }
          ],
          "tags": ["meeting", "important"]
        }
      ]
    }
  }
}
or
{
  "device_name": "microphone1",
  "content": {
    "content_type": "transcription",
    "data": {
      "transcription": "transcribed text",
      "transcription_engine": "whisper"
    }
  }
}
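both bodies share the same `device_name` / `content` envelope and differ in `content_type` and `data`. the sketch below builds each variant as a python dict ready for `json.dumps`; the function names are hypothetical, the engine defaults are just the values from the samples, and which fields the server treats as optional is not stated in this reference.

```python
import json


def add_transcription_body(device_name, text, engine="whisper"):
    """Build the transcription variant of the /add body."""
    return {
        "device_name": device_name,
        "content": {
            "content_type": "transcription",
            "data": {"transcription": text, "transcription_engine": engine},
        },
    }


def add_frames_body(device_name, frames):
    """Build the frames variant of the /add body from prepared frame dicts."""
    return {
        "device_name": device_name,
        "content": {"content_type": "frames", "data": {"frames": list(frames)}},
    }


def ocr_result(text, extra=None, engine="tesseract", focused=True):
    """Build one ocr_results entry; text_json is a json-encoded string, as in the sample."""
    return {
        "text": text,
        "text_json": json.dumps(extra or {}),
        "ocr_engine": engine,
        "focused": focused,
    }


frame = {
    "file_path": "/path/to/frame.png",
    "timestamp": "2024-03-10T12:00:00Z",
    "app_name": "chrome",
    "window_name": "meeting",
    "ocr_results": [ocr_result("detected text", {"additional": "metadata"})],
    "tags": ["meeting", "important"],
}
print(json.dumps(add_frames_body("device1", [frame]), indent=2))
```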
realtime streaming api
transcription stream
- endpoint: /sse/transcriptions
- method: get
- description: stream real-time transcriptions using server-sent events (SSE)
sample event data:
{
  "transcription": "live transcribed text",
  "timestamp": "2024-03-10T12:00:00Z",
  "device": "microphone1"
}
vision stream
- endpoint: /sse/vision
- method: get
- description: stream real-time vision events using server-sent events (SSE)
query parameters:
- images (bool, optional): include base64 encoded images in events
sample event data:
{
  "type": "Ocr",
  "text": "detected text",
  "timestamp": "2024-03-10T12:00:00Z",
  "image": "base64_encoded_image_data",
  "app_name": "chrome",
  "window_name": "meeting"
}
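once the `data:` payload of a vision event has been extracted from the stream, it can be turned into a typed record. the sketch below parses the sample event into a dataclass with a timezone-aware timestamp. `VisionEvent` and `parse_vision_event` are hypothetical names, and treating `image`, `app_name`, and `window_name` as possibly absent is an assumption (for instance when `images` is not requested).

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class VisionEvent:
    type: str
    text: str
    timestamp: datetime
    app_name: Optional[str] = None
    window_name: Optional[str] = None
    image: Optional[str] = None  # base64 string; assumed absent unless images are requested


def parse_vision_event(data):
    """Parse the json payload of one /sse/vision event."""
    raw = json.loads(data)
    return VisionEvent(
        type=raw["type"],
        text=raw["text"],
        # swap the trailing Z for an explicit offset so fromisoformat
        # also accepts it on python versions before 3.11
        timestamp=datetime.fromisoformat(raw["timestamp"].replace("Z", "+00:00")),
        app_name=raw.get("app_name"),
        window_name=raw.get("window_name"),
        image=raw.get("image"),
    )


sample = '{"type": "Ocr", "text": "detected text", "timestamp": "2024-03-10T12:00:00Z", "app_name": "chrome", "window_name": "meeting"}'
event = parse_vision_event(sample)
print(event.type, event.timestamp.isoformat(), event.app_name)
```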