Cursors

Procs that have the potential to return large datasets may break the result down into smaller sets, or pages. Instead of returning the full dataset all at once, these procs return one page of values at a time—providing a cursor used to fetch the next page of values.

Procs such as keyv.scan return cursors when the size of the dataset exceeds the defined page size.

Client libraries handle cursors automatically by paging through the dataset as values are consumed.

Paging is controlled through the count and total arguments. The count argument sets the page size and defaults to 250 values. The total argument sets the total number of values fetched across all pages; if total is not passed, proc assumes that all available values are to be retrieved.
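To make the roles of count and total concrete, here is a minimal, self-contained Ruby sketch of cursor paging. It does not call proc. The helpers fetch_page and each_value are hypothetical, an in-memory array stands in for the dataset, and an integer offset stands in for the cursor; how proc actually represents cursors is not covered here.

```ruby
# Hypothetical stand-in for a paged proc (not part of the proc client).
DATASET = (1..1_000).to_a.freeze

# Returns one page of values plus a cursor for the next page.
# The cursor is nil once the dataset is exhausted.
def fetch_page(cursor: 0, count: 250)
  page = DATASET[cursor, count] || []
  next_cursor = cursor + page.size
  next_cursor = nil if page.empty? || next_cursor >= DATASET.size
  [page, next_cursor]
end

# Lazily yields values page by page, roughly as a client library might.
# count is the page size; total caps the values yielded across all pages
# (nil means "everything available").
def each_value(count: 250, total: nil)
  Enumerator.new do |yielder|
    cursor = 0
    remaining = total
    loop do
      page, cursor = fetch_page(cursor: cursor, count: count)
      page = page.first(remaining) if remaining
      page.each { |value| yielder << value }
      remaining -= page.size if remaining
      break if cursor.nil? || remaining == 0
    end
  end
end

puts each_value.to_a.size                         # every value, 250 per page
puts each_value(count: 100, total: 250).to_a.size # stops after 250 values
```

In this sketch the page size only changes how many round trips are needed, while total changes how many values are returned.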

Enumerating Cursors

Cursors are fully supported by enum.* procs. Used together, they provide patterns for iterating over and mutating large datasets, either within proc or externally from a client library. Let's explore this in the context of a common use case: fetching the underlying value for every key in a key-value bucket.

One approach is to use keyv.scan to fetch the keys, then retrieve each value using keyv.get.

Naively retrieving the value for each key
Setup

Ruby:
require "proc"

client = Proc.connect("PROCAUTH")

JavaScript:
const Proc = require("@proc.dev/client");
const client = Proc.connect("PROCAUTH");

curl:
authorization=PROCAUTH
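
Ruby:
# Scan the bucket for keys, then make one keyv.get request per key.
client.keyv.scan.call(bucket: "v7efcf3c").each do |key|
  client.keyv.get.call(bucket: "v7efcf3c", key: key)
end

JavaScript:
// Arguments mirror the Ruby example: the bucket for scan, the bucket and key for get.
client.keyv.scan.call(undefined, {bucket: "v7efcf3c"}).each((key) => {
  client.keyv.get.call(undefined, {bucket: "v7efcf3c", key: key});
});

curl:
This example cannot be expressed in curl.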

This approach works for small datasets, but does not scale to larger ones. Because keyv.scan only returns the keys, each additional key requires another request to proc to get the underlying value. We can use an enumerator instead to fetch the keys and map them into values before returning to the client.

Mapping keys into values with an enumerator

Ruby:
# Map each scanned key into its value within proc, before results return to the client.
client.keyv.scan.call(bucket: "m6cb1cf4") do
  client.enum.map do
    client.keyv.get(bucket: "m6cb1cf4")
  end
end

JavaScript:
// Arguments mirror the Ruby example: each call is passed the bucket.
client.keyv.scan.call(undefined, {bucket: "m6cb1cf4"}, () => {
  return client.enum.map(undefined, {}, () => {
    return client.keyv.get(undefined, {bucket: "m6cb1cf4"});
  });
});

curl:
curl "https://proc.run/keyv/scan?bucket=m6cb1cf4" --silent \
--header "authorization: bearer $authorization" \
--header "content-type: application/vnd.proc+json" \
--header "accept: text/plain" \
--data '[["$$", "proc", ["m6cb1cf4"]]]]]]]]]'
 

[...]

This approach constrains the number of requests between the client and proc to the number of pages returned by keyv.scan. Given 10,000 keys and the default page size of 250, keyv.scan returns 40 pages of results. The first approach makes a total of 40 + 10,000 = 10,040 requests, while the second approach makes only 40 requests. Much better!
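The request counts above can be reproduced for other bucket sizes and page sizes with a few lines of arithmetic. This is a sketch in plain Ruby; the method names are illustrative and not part of the proc client.

```ruby
# Number of pages keyv.scan would return for a given number of keys.
def page_count(keys, page_size = 250)
  keys.fdiv(page_size).ceil
end

# Naive approach: one request per page of keys, plus one keyv.get per key.
def naive_requests(keys, page_size = 250)
  page_count(keys, page_size) + keys
end

# Enumerator approach: one request per page, with values already mapped in.
def mapped_requests(keys, page_size = 250)
  page_count(keys, page_size)
end

puts naive_requests(10_000)   # 10040
puts mapped_requests(10_000)  # 40
```

The gap grows linearly with the number of keys, since the naive approach always adds one request per key on top of the page requests.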

Learn more about enumerators in the enumerator docs.

Consuming Cursors

Cursors, like other enumerables, are enumerated lazily, meaning iteration over every underlying value is not guaranteed unless the enumerator is fully consumed. Enumerators can be consumed in one of two ways:

  1. Iterating over the enumerator in the client.
  2. Resolving the enumerator within proc.
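The effect of lazy enumeration can be seen in a small local model. The sketch below assumes nothing about proc's internals: CountingSource is a hypothetical class that serves an in-memory array in pages and counts how many pages have been requested. It only models the first option, iteration in the client.

```ruby
# Hypothetical paged source that records how many pages were requested.
class CountingSource
  attr_reader :pages_fetched

  def initialize(values, page_size)
    @values = values
    @page_size = page_size
    @pages_fetched = 0
  end

  # Returns an enumerator; a page is only "fetched" when iteration reaches it.
  def each_value
    Enumerator.new do |yielder|
      @values.each_slice(@page_size) do |page|
        @pages_fetched += 1
        page.each { |value| yielder << value }
      end
    end
  end
end

source = CountingSource.new((1..1_000).to_a, 250)
enum = source.each_value

puts source.pages_fetched # 0: creating the enumerator fetches nothing

enum.first(10)
puts source.pages_fetched # 1: only the first page was needed

enum.to_a
puts source.pages_fetched # 5: full consumption restarted and walked all 4 pages
```

Stopping early leaves later pages untouched, which is why any side effects attached to an enumerator only cover the whole dataset once it is fully consumed.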

The mapping example from the section above demonstrates client-based iteration. The client-based approach makes sense when values are needed on the client, but some operations need no involvement from the client at all. Deleting every key in a bucket is one example: it is better for the deletion to take place entirely within proc. Do this by piping the enumerable result of keyv.scan to enum.each.

Deleting keys from a bucket

Ruby:
# Compose the scan with enum.each so that each key is deleted within proc.
delete = client.keyv.scan(bucket: "p6124289") >> client.enum.each {
  client.keyv.delete(bucket: "p6124289")
}

delete.call

JavaScript:
// `delete` is a reserved word in JavaScript, so the composition needs another name.
const deleteAll = client.keyv.scan(
  undefined, {bucket: "p6124289"}
).compose(
  client.enum.each(undefined, {}, () => {
    return client.keyv.delete(
      undefined, {bucket: "p6124289"}
    );
  })
);

deleteAll.call();

curl:
curl "https://proc.run/core/exec" --silent \
--header "authorization: bearer $authorization" \
--header "content-type: application/vnd.proc+json" \
--header "accept: text/plain" \
--data '[["$$", "proc", ["p6124289"]]]]]]]]]'
 

[...]

See enum.resolve for a way to resolve enumerators without returning enumerable values to the client.

