NIP-45
======

Event Counts
------------

`draft` `optional`

Relays may support the verb `COUNT`, which provides a mechanism for obtaining event counts.

## Motivation

Some queries a client may want to execute against connected relays are prohibitively expensive. For example, to retrieve the follower count for a given pubkey, a client must query all kind-3 events referring to that pubkey only to count them. The result may be cached, either by the client or by a separate indexing server, but both options erode the decentralization of the network by creating a second-layer protocol on top of Nostr.

## Filters and return values

This NIP defines the verb `COUNT`, which accepts a subscription id and filters as specified in [NIP 01](01.md) for the verb `REQ`. Multiple filters are OR'd together and aggregated into a single count result.

```
["COUNT", <subscription_id>, <filters JSON>...]
```
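As an illustration, a client could serialize such a request as follows. This is a minimal Python sketch: `build_count_request`, the subscription id `sub1` and the placeholder pubkey are hypothetical names chosen for the example, not something this NIP defines.

```python
import json


def build_count_request(subscription_id, *filters):
    """Serialize ["COUNT", <subscription_id>, <filter>, ...] as a JSON string."""
    return json.dumps(["COUNT", subscription_id, *filters], separators=(",", ":"))


# Hypothetical values, for illustration only.
message = build_count_request("sub1", {"kinds": [3], "#p": ["<pubkey>"]})
print(message)
```

Passing several filters appends each one to the array, matching the `<filters JSON>...` form above.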

Counts are returned using a `COUNT` response in the form `{"count": <integer>}`. Relays may use probabilistic counts to reduce compute requirements.
If a relay uses probabilistic counts, it MAY indicate this in the response with the `approximate` key, i.e. `{"count": <integer>, "approximate": <true|false>}`.

```
["COUNT", <subscription_id>, {"count": <integer>}]
```

Whenever the relay decides to refuse to fulfill the `COUNT` request, it MUST return a `CLOSED` message.
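A client might handle both outcomes roughly like this. The sketch below is only one possible shape: `parse_count_reply` and `CountRefused` are hypothetical names, and returning `None` when `approximate` is absent is an assumption made here (the key is optional, so its absence says nothing about exactness).

```python
import json


class CountRefused(Exception):
    """Raised when the relay answers a COUNT request with CLOSED (hypothetical helper)."""


def parse_count_reply(raw, subscription_id):
    """Return (count, approximate) for a COUNT reply to the given subscription."""
    msg = json.loads(raw)
    if not isinstance(msg, list) or len(msg) < 3 or msg[1] != subscription_id:
        raise ValueError("not a reply for this subscription")
    verb, payload = msg[0], msg[2]
    if verb == "CLOSED":
        # The third element is the relay's human-readable reason.
        raise CountRefused(payload)
    if verb != "COUNT":
        raise ValueError("unexpected verb: %r" % (verb,))
    # `approximate` is optional; None means the relay did not say.
    return int(payload["count"]), payload.get("approximate")
```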

## HyperLogLog

Relays may return a hex-encoded HyperLogLog value together with the count.

```
["COUNT", <subscription_id>, {"count": <integer>, "hll": "<hex>"}]
```

This enables merging results from multiple relays to yield a reasonable estimate of reaction counts, comment counts and follower counts, while saving many millions of bytes of bandwidth for everybody.

### Algorithm

This section describes the steps a relay should take in order to return HLL values to clients.

1. Upon receiving a filter, if it has a single `#e`, `#p`, `#a` or `#q` item, read its 32nd ASCII character as a byte and take its value modulo 24 to obtain an `offset`. In the unlikely case that the filter doesn't meet these conditions, set `offset` to 16;
2. Initialize 256 registers to 0 for the HLL value;
3. For each event that is to be counted according to the filter, do this:
    1. Read the byte at position `offset` of the event `pubkey`; its value will be the register index `ri`;
    2. Count the number of leading zero bits starting at position `offset+1` of the event `pubkey`;
    3. Compare that number with the value stored at register `ri`; if the new number is bigger, store it.

That is all that has to be done on the relay side, and therefore the only part needed for interoperability.
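The relay-side steps can be sketched in Python as below. Several details are assumptions of this sketch rather than statements of the NIP: the "32nd character" is read as 1-based (index 31; a zero-based reading would use index 32), "a single item" is taken to mean exactly one value across the four tags, values too short to index fall back to the default offset, and positions in the `pubkey` are treated as byte positions in the decoded 32-byte key. All function names are hypothetical.

```python
DEFAULT_OFFSET = 16
CHAR_INDEX = 31  # assumption: "32nd character" counted from 1


def hll_offset(filter_):
    """Step 1: derive the pubkey byte offset from the filter."""
    values = [v for key in ("#e", "#p", "#a", "#q") for v in filter_.get(key, [])]
    if len(values) != 1 or len(values[0]) <= CHAR_INDEX:
        return DEFAULT_OFFSET
    return ord(values[0][CHAR_INDEX]) % 24


def leading_zero_bits(data):
    """Count zero bits before the first set bit in a byte string."""
    count = 0
    for byte in data:
        if byte == 0:
            count += 8
            continue
        count += 8 - byte.bit_length()
        break
    return count


def hll_registers(pubkeys_hex, offset):
    """Steps 2 and 3: fill 256 registers from the pubkeys of the counted events."""
    registers = [0] * 256
    for pubkey_hex in pubkeys_hex:
        raw = bytes.fromhex(pubkey_hex)
        ri = raw[offset]
        zeros = leading_zero_bits(raw[offset + 1:])
        if zeros > registers[ri]:
            registers[ri] = zeros
    return registers
```

Because `offset` is at most 23, at least 8 bytes of the key remain after the register byte, and the largest possible register value still fits in one byte.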

On the client side, the HLL values received from different relays can be merged by going through all the registers of each relay's HLL value and keeping the highest value for each register, regardless of the relay.
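The merge is a register-wise maximum. A minimal Python sketch, assuming each HLL value has already been decoded into a list of 256 integers (`merge_hll` is a hypothetical name):

```python
def merge_hll(hll_values):
    """Merge decoded HLL values by keeping the highest value per register."""
    merged = [0] * 256
    for registers in hll_values:
        if len(registers) != 256:
            raise ValueError("expected 256 registers")
        merged = [max(a, b) for a, b in zip(merged, registers)]
    return merged
```

Since taking the maximum is commutative and idempotent, relays can be merged in any order, and the same event seen on several relays does not inflate the registers.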

Finally, the absolute count can be estimated by running some methods I don't dare to describe here in English; it's better to check some implementation source code. There can also be different ways of performing the estimation, with different quirks applied on top of the raw registers.
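For orientation only, here is the textbook HyperLogLog raw estimator with the usual small-range (linear counting) correction, sketched in Python. This is not taken from any Nostr implementation, and it assumes the registers follow the textbook convention of storing the leading-zero count plus one. If registers hold the bare leading-zero count, the result would need adjusting, so treat the numbers as illustrative. `estimate_count` is a hypothetical name.

```python
import math


def estimate_count(registers):
    """Textbook HyperLogLog estimate from a list of register values."""
    m = len(registers)
    # Bias-correction constant; this closed form is the usual one for m >= 128.
    alpha = 0.7213 / (1 + 1.079 / m)
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    empty = registers.count(0)
    if raw <= 2.5 * m and empty > 0:
        # Small-range correction: fall back to linear counting.
        return m * math.log(m / empty)
    return raw
```

Real implementations may apply further corrections on top of this, which is one reason estimates from different clients can differ for the same registers.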

### Attack vectors

One could mine a pubkey with a certain number of zero bits in the exact place where the HLL algorithm described above looks for them, in order to artificially make its reaction or follow "count more" than others. For this to work, a different pubkey would have to be created for each different target (event id, followed profile, etc.). This approach is not very different from creating tons of new pubkeys and using them all to send likes or follow someone in order to inflate their number of followers. The solution is the same in both cases: clients should not fetch these reaction counts from open relays that accept everything. Instead, they should base their counts on relays that perform some form of filtering that makes it more likely that only real humans, and not bots or artificially generated pubkeys, are able to publish there.

### `hll` encoding

The `hll` value must be the concatenation of the 256 registers, each being a uint8 value (i.e. a byte). Therefore `hll` will be a 512-character hex string.
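In Python, the encoding and its inverse are short. The sketch below uses hypothetical helper names (`encode_hll`, `decode_hll`) and represents registers as a list of 256 integers:

```python
def encode_hll(registers):
    """Concatenate 256 uint8 registers into a 512-character hex string."""
    if len(registers) != 256:
        raise ValueError("expected 256 registers")
    # bytes() raises ValueError if any register is outside 0..255.
    return bytes(registers).hex()


def decode_hll(hll_hex):
    """Turn a 512-character hex string back into a list of 256 registers."""
    if len(hll_hex) != 512:
        raise ValueError("expected a 512-character hex string")
    return list(bytes.fromhex(hll_hex))
```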

## Examples

### Count posts and reactions

```
["COUNT", <subscription_id>, {"kinds": [1, 7], "authors": [<pubkey>]}]
["COUNT", <subscription_id>, {"count": 5}]
```


### Count posts approximately

```
["COUNT", <subscription_id>, {"kinds": [1]}]
["COUNT", <subscription_id>, {"count": 93412452, "approximate": true}]
```

### Followers count with HyperLogLog

```
["COUNT", <subscription_id>, {"kinds": [3], "#p": [<pubkey>]}]
["COUNT", <subscription_id>, {"count": 16578, "hll": "0607070505060806050508060707070706090d080b0605090607070b07090606060b0705070709050807080805080407060906080707080507070805060509040a0b06060704060405070706080607050907070b08060808080b080607090a06060805060604070908050607060805050d05060906090809080807050e0705070507060907060606070708080b0807070708080706060609080705060604060409070a0808050a0506050b0810060a0908070709080b0a07050806060508060607080606080707050806080c0a0707070a080808050608080f070506070706070a0908090c080708080806090508060606090906060d07050708080405070708"}]
```

### Relay refuses to count

```
["COUNT", <subscription_id>, {"kinds": [4], "authors": [<pubkey>], "#p": [<pubkey>]}]
["CLOSED", <subscription_id>, "auth-required: cannot count other people's DMs"]
```