Moves away from weekly.

Vitor Pamplona 2023-11-08 13:29:00 -05:00
parent 447f9b3a87
commit f93015b70b

29.md

@@ -6,52 +6,61 @@ Time-Based Sync
`draft` `optional` `author:vitorpamplona`
This document describes a simple relay filter extension to allow event database syncing between relays and clients. It works for both client-relay and relay-relay scenarios when the majority of the events have been downloaded in a previous session.
### Motivation
This NIP describes a simple event database reconciliation procedure for clients and relays (peers). Both sides hash the same groups of event ids and compare the resulting hashes to determine which subsets should be downloaded from or uploaded to the other peer. The procedure is ideal for clients that keep a local database and must guarantee that the database is in sync with each relay without downloading or uploading all events again.
### Sync Protocol
The syncing peer (a client or another relay) sends a `HASH-REQ` message to the relay with a subscription ID, a window size, and filters for the content to be synced.
Request:
```js
[
    "HASH-REQ",
    <subscription ID string>,
    "<WINDOW-SIZE>",
    <nostr filter>, <nostr filter2>, <nostr filter3>
]
```
Upon receiving a `HASH-REQ`, the relay MUST:
1. Apply all nostr filters in the subscription to the database
2. Sort all resulting events by `.created_at`
3. Group the events by the first `WINDOW-SIZE` characters of the stringified `.created_at`
4. For each group, create an array of event ids: `["idHex1", "idHex2", "idHex3"]`, JSON-serialize it, and hash it using SHA-256
5. Return the group list, with each group's truncated `.created_at` identifier and hash, to the peer (a sketch of these steps follows below)
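A minimal sketch of these steps in JavaScript, assuming Node's built-in `crypto` module; the function name and event shape are illustrative, not part of this NIP:
```js
const { createHash } = require('crypto')

// Sketch only: `events` is assumed to be the already-filtered result set,
// each event carrying an `.id` (hex string) and a `.created_at` (unix seconds).
function computeWindowHashes(events, windowSize) {
  // Step 2: sort all resulting events by .created_at
  const sorted = [...events].sort((a, b) => a.created_at - b.created_at)

  // Step 3: group by the first WINDOW-SIZE chars of the stringified .created_at
  const groups = new Map()
  for (const event of sorted) {
    const key = String(event.created_at).slice(0, windowSize)
    if (!groups.has(key)) groups.set(key, [])
    groups.get(key).push(event.id)
  }

  // Steps 4-5: JSON-serialize each group's id array, hash it with SHA-256,
  // and pair the hash with the group's truncated .created_at identifier.
  return [...groups].map(([key, ids]) => [
    key,
    createHash('sha256').update(JSON.stringify(ids)).digest('hex'),
  ])
}
```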
After calculating the hashes, the relay responds with an EOSE-terminated sequence of `HASH-RES` messages.
Response:
```js
[
    "HASH-RES",
    "<subscription ID string>",
    "<TRUNCATED_CREATED_AT>",
    "<SHA256(JSON.stringify([event1.id, event2.id, event3.id, ...])) in hex>"
]
```
The peer then compares the received hashes with those stored locally and, if they differ, creates a new filter to either refine the window size or download all events within the mismatched window.
The choice of window size is use-case dependent. Clients can start with a large window and, after the first round of results, adjust the filters and reduce the window size to narrow down the set of events that must be re-downloaded, as sketched below.
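As a minimal sketch of that loop, assuming hypothetical helpers `localHashFor` (hashes the locally stored ids for a window the same way the relay does), `withTimeRange` (restricts the filters to the window's `since`/`until` bounds), `sendHashReq`, and `download`, none of which are defined by this NIP:
```js
// Sketch only: every helper called here is hypothetical.
function reconcile(remoteHashes, windowSize, filters) {
  for (const [window, remoteHash] of remoteHashes) {
    if (localHashFor(window) === remoteHash) continue // window already in sync

    if (windowSize < 10) {
      // Mismatch on a coarse window: request finer-grained hashes
      // restricted to this window's time range.
      sendHashReq(windowSize + 1, withTimeRange(filters, window))
    } else {
      // Smallest window (1 second): re-download its events.
      download(withTimeRange(filters, window))
    }
  }
}
```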
### Appendix A: Explanation of Window Size Calculations
The `WINDOW-SIZE` is a number between 0 and 10 that selects an order of magnitude of `.created_at`. It defines the size of each group as well as the start and end times of each group's `.created_at` range.
Clients must keep records of which events came from which relays to successfully filter only the events from that relay and compare hashes.
Truncating `.created_at` to its first `WINDOW-SIZE` characters effectively creates the groups below:
0: Returns only one hash for the entire subscription
1: Groups by periods of 1000000000 seconds (~31.7 years): Periods start with `*000000000` and end with `*999999999`
2: Groups by periods of 100000000 seconds (~3.17 years): Periods start with `**00000000` and end with `**99999999`
3: Groups by periods of 10000000 seconds (~16.53 weeks): Periods start with `***0000000` and end with `***9999999`
4: Groups by periods of 1000000 seconds (~11.57 days): Periods start with `****000000` and end with `****999999`
5: Groups by periods of 100000 seconds (~1.16 days): Periods start with `*****00000` and end with `*****99999`
6: Groups by periods of 10000 seconds (~2.77 hours): Periods start with `******0000` and end with `******9999`  
7: Groups by periods of 1000 seconds (~16.66 minutes): Periods start with `*******000` and end with `*******999`
8: Groups by periods of 100 seconds (~1.66 minutes): Periods start with `********00` and end with `********99`
9: Groups by periods of 10 seconds: Periods start with `*********0` and end with `*********9`
10: Groups by periods of 1 second: Periods start with `**********` and end with `**********`
Notice that each group starts at 0 (inclusive) and ends at its last possible digit, 9 (inclusive).
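For instance, a peer can recover the `since`/`until` bounds of any window from its truncated identifier by padding with `0`s and `9`s. A minimal sketch, assuming 10-digit `.created_at` timestamps:
```js
// Convert a truncated created_at identifier (e.g. "1699" for WINDOW-SIZE 4)
// into inclusive since/until bounds for a regular nostr filter.
function windowBounds(truncatedCreatedAt) {
  return {
    since: Number(truncatedCreatedAt.padEnd(10, '0')),
    until: Number(truncatedCreatedAt.padEnd(10, '9')),
  }
}

// windowBounds("1699") => { since: 1699000000, until: 1699999999 }
```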