Page States

Page states capture the full state of a page when you execute your workflows. The HTML content uploaded here is used by the selectors to perform a search. Page states are also used for visualization on the server, so you can replay previous page states as a browser would see them.

When operating in a headless browser, we recommend affixing a synthetic sl-identifier to each element. This helps pinpoint the element returned from the respective endpoint, especially on dynamic pages where XPaths can change quickly. If an sl-identifier is not explicitly added, Selectify generates one automatically on the server.
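As a rough illustration of the tagging step, the sketch below re-emits an HTML string with an sl-identifier attribute added to every element that lacks one. It works offline with Python's standard library rather than inside a headless browser, and the `IdentifierTagger` class, the `add_identifiers` helper, and the `el-N` value format are all assumptions made for this example, not part of Selectify.

```python
from html import escape
from html.parser import HTMLParser
from itertools import count


class IdentifierTagger(HTMLParser):
    """Re-emit HTML, adding an sl-identifier to elements that lack one."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; untouched in text.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.counter = count(1)

    def _open_tag(self, tag, attrs, closing):
        attrs = list(attrs)
        if not any(name == "sl-identifier" for name, _ in attrs):
            # Assumption: any unique string works; "el-N" is a made-up format.
            attrs.append(("sl-identifier", f"el-{next(self.counter)}"))
        rendered = "".join(
            f" {name}" if value is None else f' {name}="{escape(value)}"'
            for name, value in attrs
        )
        self.out.append(f"<{tag}{rendered}{closing}")

    def handle_starttag(self, tag, attrs):
        self._open_tag(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._open_tag(tag, attrs, " />")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def add_identifiers(html):
    """Return the HTML with an sl-identifier on every element."""
    tagger = IdentifierTagger()
    tagger.feed(html)
    tagger.close()
    return "".join(tagger.out)


tagged = add_identifiers('<div><input name="q"><br/></div>')
```

Elements that already carry an sl-identifier are left unchanged, so the helper can be run repeatedly on the same markup. In a live headless session the same idea would more likely be applied through a script injected into the page.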

Page states encapsulate all associated files (for instance, stylesheets and images) and preserve their original state even if the external web page is later modified. This continuity keeps your workflows stable over time. For more information on how to manually upload these dependent files, read about the Page Resource model.


Create a page state

This endpoint allows you to persist a snapshot of the current HTML page of interest. Couple it with the included PageResources for a full reproduction of how your browser sees the page.

Header Values

  • Name
    authorization
    Type
    string
    Description
    Bearer token for the request, formatted as `Bearer YOUR_TOKEN`.

Request Body

Requests should be formatted as a PageStateCreate object.

  • Name
    url
    Type
    string
    Description
    The URL of the page that is being captured.
  • Name
    content
    Type
    string
    Description
    The raw content of the page that is being captured.
  • Name
    field_values
    Type
    array[FieldValue]
    Description
    The stored values of input and textarea elements, used for rendering on page replays.
  • Name
    session_id
    Type
    string
    Description
    The session that this page state belongs to.
  • Name
    allowed_resource_ids
    Type
    array[string]
    Description
    The IDs of the PageResource objects that have resolved locally alongside this page.
  • Name
    auto_resolve_resources
    Type
    boolean
    Description
    If Selectify detects unmet resources that are referenced in the HTML but not provided as part of allowed_resource_ids, it will attempt to fetch them from the source webpage. Defaults to `false`.

Response Body

Responses will be formatted as a PageState object.

  • Name
    id
    Type
    string
    Description
  • Name
    url
    Type
    string
    Description
    The URL of the page that is being captured.
  • Name
    field_values
    Type
    array[FieldValue]
    Description
    The stored values of input and textarea elements, used for rendering on page replays.
  • Name
    session_id
    Type
    string
    Description
    The session that this page state belongs to.

FieldValue

  • Name
    identifier
    Type
    string
    Description
    The `sl-identifier` of the element.
  • Name
    value
    Type
    Description
    Text value of the element
  • Name
    checked
    Type
    Description
    For select or radio elements, whether the current element is selected. `null` if not applicable to the field type.
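To make the FieldValue shape concrete, here is a minimal Python sketch that builds entries for a text input and a radio button. The documentation leaves the types of `value` and `checked` unspecified, so treating them as an optional string and an optional boolean is an assumption, as are the helper names and the `el-N` identifiers.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class FieldValue:
    identifier: str                 # sl-identifier of the element
    value: Optional[str] = None     # assumption: text values are strings
    checked: Optional[bool] = None  # assumption: boolean, None when not applicable


def text_field(identifier, value):
    """FieldValue for an input or textarea; checked does not apply."""
    return FieldValue(identifier=identifier, value=value)


def choice_field(identifier, checked, value=None):
    """FieldValue for a select or radio element."""
    return FieldValue(identifier=identifier, value=value, checked=checked)


# Plain dictionaries, ready to be placed in the field_values array.
field_values = [
    asdict(text_field("el-2", "hello")),
    asdict(choice_field("el-5", True)),
]
```

When serialized with `json.dumps`, the `None` entries become `null`, matching the placeholder values in the request example.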

Request

curl -X POST "https://api.selectify.ai/page_state/" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "url": "string",
    "content": "html content",
    "field_values": [
      {
        "identifier": "string",
        "value": null,
        "checked": null
      }
    ],
    "session_id": "00000000-0000-0000-0000-000000000000",
    "allowed_resource_ids": [
      "00000000-0000-0000-0000-000000000000"
    ],
    "auto_resolve_resources": false
  }'
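
The same request can be sketched in Python with only the standard library. This is an illustrative outline, not an official client: the function names are hypothetical, the endpoint and headers are copied from the curl command above, and sending every field on each call is an assumption, since the documentation does not say which fields are required.

```python
import json
import urllib.request

API_URL = "https://api.selectify.ai/page_state/"  # from the curl example


def build_page_state_request(token, url, content, session_id,
                             field_values=(), allowed_resource_ids=(),
                             auto_resolve_resources=False):
    """Build (but do not send) a POST request carrying a PageStateCreate body."""
    payload = {
        "url": url,
        "content": content,
        "field_values": list(field_values),
        "session_id": session_id,
        "allowed_resource_ids": list(allowed_resource_ids),
        "auto_resolve_resources": auto_resolve_resources,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def create_page_state(request):
    """Send the request and decode the JSON response (performs network I/O)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


request = build_page_state_request(
    token="YOUR_TOKEN",
    url="https://example.com/",
    content="<html></html>",
    session_id="00000000-0000-0000-0000-000000000000",
)
```

Separating request construction from sending keeps the payload easy to inspect or log before anything goes over the network.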

Response

{
  "id": "00000000-0000-0000-0000-000000000000",
  "url": "string",
  "field_values": [
    {
      "identifier": "string",
      "value": null,
      "checked": null
    }
  ],
  "session_id": "00000000-0000-0000-0000-000000000000"
}
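
A response like the one above can be read into a small typed structure. The sketch below is one possible approach; the `PageState` dataclass and `parse_page_state` helper are hypothetical names that mirror the documented response fields, and keeping `field_values` as plain dictionaries is a simplification.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class PageState:
    id: str
    url: str
    session_id: str
    field_values: List[Dict[str, Any]] = field(default_factory=list)


def parse_page_state(raw):
    """Decode a JSON response body into a PageState."""
    data = json.loads(raw)
    return PageState(
        id=data["id"],
        url=data["url"],
        session_id=data["session_id"],
        # Fall back to an empty list if the key is missing or null.
        field_values=data.get("field_values") or [],
    )


sample = """{
  "id": "00000000-0000-0000-0000-000000000000",
  "url": "string",
  "field_values": [
    {"identifier": "string", "value": null, "checked": null}
  ],
  "session_id": "00000000-0000-0000-0000-000000000000"
}"""
page_state = parse_page_state(sample)
```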