NIP-97: Files hosted on relay
==============================

`final` `optional` `author:ondra-novak`

This NIP solves the problem of sharing and distributing binary content over NOSTR networks, such as media (images, short videos, music) or, in general, any files up to a certain size.

Key Takeaways
-------------

* Builds on NIP-94
* Defines how binary content (images, media, files) is managed by the NOSTR network
* Defines new commands at the protocol level ("FILE", "RETRIEVE")
* Uses binary messages in the websocket connection

Changes in File header (NIP-94)
--------------------------------

```json
{
  "id": <32-bytes lowercase hex-encoded sha256 of the serialized event data>,
  "pubkey": <32-bytes lowercase hex-encoded public key of the event creator>,
  "created_at": ,
  "kind": 1063,
  "tags": [
    ["m", ],
    ["x",],
    ["size", ],
    ["url",],
    ["aes-256-gcm",, ],
    ["dim", ],
    ["blurhash", ]
  ],
  "content": ,
  "sig": <64-bytes hex of the signature of the sha256 hash of the serialized event data, which is the same as the "id" field>,
  "nip97":true,
  "file_url":"
}
```

* **url is optional** - This item can still be filled with a link to an external server. This may help during the transition period, when the client uploads the file to third-party servers for relays that don't support this NIP.

* **field "nip97"** appears only when the event is sent from the relay to a client. The relay adds this field to indicate that the event also has binary content. The value is always `true`. It also indicates that the binary content can be fetched with the command `RETRIEVE` (see below). *This field SHOULD NOT be set by a client* `*`

* **field "file_url"** (optional) appears only when the event is sent from the relay to a client. Its presence indicates that the binary content can be downloaded from the URL given as the value of this field. If this field is missing, the binary content cannot be downloaded over HTTP, and the only way to retrieve it is the command `RETRIEVE`. This field cannot appear without the field `"nip97":true`. The client must not assume that the given URL is permanent or canonical. The relay can generate URLs dynamically (each client can receive a different URL for the same event). *This field SHOULD NOT be set by a client* `*`

`*` - *It is allowed to publish events with these fields by using the command* `FILE` *(see below). The relay should remove these fields while processing the event. This is not allowed with the command* `EVENT`*, because it is currently undefined how the relay handles these new fields there.*

**NOTE** - when copying events from one relay to another, the service must check for the presence of the field `"nip97":true`. If such a field exists on the event, the event must be posted using the `FILE` command together with its binary content (see below).

Upload
------

A new protocol-level command is introduced for upload: `FILE`

### Protocol flow

```
client: ["FILE", ]
relay: ["OK",",true,"continue"]
client:
relay: ["OK",, true, ""]
```

The client MUST notify the relay that it is about to send a binary file. This is done with the new `FILE` command. The command "allocates" the binary channel of the websocket connection for the file transfer. This will allow this binary channel to be used for other purposes in the future.

The command parameter is a modified `file header event` (see above), which must contain the tags `x`, `m` and `size`.

* **x** - MUST contain the SHA256 hash of the binary content, encoded as hex
* **m** - MUST contain the content type as a MIME type. If the content type is unknown, use `application/octet-stream`
* **size** - MUST contain the size of the binary content.

### Binary message

The standard binary message is used as defined in the websocket specification.
The message can be fragmented (continuation frames can be used).

### Responses

The relay MUST always respond with a NIP-20 response to both the command and the binary message.

The response to the command `FILE` is `OK` with status `true`, which signals that the relay is ready to accept a binary message as the binary content of the file. The text part of the response is only informational, for debugging purposes.

If the relay responds with status `false`, or the response is not an `OK` message, the client MUST NOT send any binary message. The text part of the response contains an explanation of the error (similar to NIP-20).

The relay must also respond once it receives the binary message, provided the message was correctly announced using the `FILE` command. The response is also `OK`, with status `true` for success or `false` for failure.

This proposal doesn't define a response for binary messages sent without the announcement.

### Special error messages

* `max_size: ` - this error message informs the client that the file is too big.
* `invalid: file mismatch` - the received binary content doesn't match the metadata announced in the associated event

### Rules for state control

1. The relay SHOULD NOT accept any binary message without prior announcement.
2. The relay SHOULD publish the event after a successful binary transfer.
3. The relay MUST NOT publish an event with an incomplete binary transfer.
4. If a `FILE` command is sent after another `FILE` without a binary message between them, the previous operation is cancelled and the associated event MUST NOT be published.
5. If a client sends a `FILE` command and then closes the connection without sending a binary message, the operation is cancelled and the event MUST NOT be published.
6. If the connection is closed while a binary message is being read (continuation frames), the operation is cancelled and the event MUST NOT be published.
7. If the received binary message is corrupted (hash or size doesn't match), the relay MUST NOT publish the event and MUST signal this to the client using an appropriate response.
8. The relay is free to refuse binary content for any reason. For example, the relay may test submitted images for explicit or copyrighted content and reject such content.

### Possible implementation - clients

1. calculates the SHA256 hash of the binary content
2. prepares the file header event, fills in the mandatory tags (hash, size, mime type)
3. signs the file header event
4. sends the command FILE with the event to the relay
5. reads the response received from the relay
6. if the response indicates rejection (status: false), stops the operation and handles the error
7. sends the binary content as a binary message
8. reads the response received from the relay
9. if the response indicates rejection (status: false), stops the operation and handles the error
10. indicates success to the user

### Possible implementation - relays

- assume that the relay has separate processors for text and binary messages
- assume that the relay processes messages asynchronously

1. the text channel processor receives the FILE command
2. checks the validity of the event
3. checks for mandatory fields
    * responds with failure state if the checks fail (on text channel) - done
4. passes the event to the binary channel processor and configures it to receive the binary content
5. responds with success state on text channel
6. the processor for binary messages receives a message
7. calculates the SHA256 hash, checks hash validity, checks size
    * responds with failure state if the checks fail (on text channel) - done
    * responds with success state otherwise (on text channel), stores the binary content and publishes the event

Download
--------

The `RETRIEVE` command is used to download the binary file from the relay.

**Protocol flow**

```
client: ["RETRIEVE",""]
relay: ["OK","", true, ""]
relay:
```

**NOTE** the binary content is referenced by the event's id (not by the file hash).

If the binary file does not exist, the response looks like this:

```
client: ["RETRIEVE",""]
relay: ["OK","", false, "missing: not found"]
```

In this case, the relay MUST NOT generate a binary message.

**NOTE**: Binary content may be unavailable even though an event for it exists. The client must be prepared for this. For example, the event might have been published incorrectly, the relay database may have been corrupted, or the content might have been deleted because it was copyrighted or explicit.

**Error messages** (examples)

* **missing: not found**
* **missing: removed for legal reasons**
* **missing: removed for explicit content**
* **blocked: unauthorized**
* **blocked: unavailable in current location**

Retrieve the content by URL (optional)
--------------------------------------

If the relay caches files, for example on CDN servers, it can announce the availability of the content using the "file_url" field inserted directly into the event.

```
{
  "id":"....",
  "kind": 1063,
  "pubkey":"....",
  ...
  "nip97":true,
  "file_url":"https://cdn.myrelay.example.com/file/abc123acfa7e1256a.jpg"
}
```

This is an optional extension; the client can still use "RETRIEVE" to download the file.

Changes in relay information document (NIP-11)
-----------------------------------------------

A new item is added to the "limitation" section.

```
"limitation": {
  "max_file_size": 262144,
}
```

* **max_file_size** - maximum size of binary content in bytes. This value can be greater than the value of **max_message_size**. This value is mandatory for relays supporting this NIP.
The relay MUST NOT assume that **max_message_size** defines the maximum size of a fragment, as many websocket clients are unable to control the size of fragments through their API. This includes the majority of browsers.
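
As an illustration of the upload rules from the "Upload" section, the following Python sketch shows how a client might fill the mandatory `x`, `m` and `size` tags, and how a relay might track the announce-then-transfer state for one connection. It is a minimal sketch under stated assumptions, not a reference implementation. The names `build_file_tags` and `UploadSession` are hypothetical, and the `size` tag is assumed to be a decimal string. Event signature validation and the websocket transport are left out. The value appended to `max_size:` and the error texts other than `invalid: file mismatch` are assumptions, and the rejection of an unannounced binary message is one possible choice, since this proposal does not define that response.

```python
import hashlib


def build_file_tags(content: bytes, mime_type: str = "application/octet-stream"):
    """Client side: the mandatory x, m and size tags of a file header event."""
    return [
        ["x", hashlib.sha256(content).hexdigest()],
        ["m", mime_type],
        ["size", str(len(content))],  # assumption: size is a decimal string
    ]


class UploadSession:
    """Hypothetical per-connection upload state kept by a relay."""

    def __init__(self, max_file_size: int):
        self.max_file_size = max_file_size
        self.pending = None   # event announced by FILE, awaiting its binary message
        self.published = []   # stand-in for storing the file and publishing the event

    @staticmethod
    def _tags(event):
        return {t[0]: t[1] for t in event["tags"] if len(t) >= 2}

    def on_file_command(self, event):
        """Handle ["FILE", event]; returns (status, message) for the OK response."""
        self.pending = None  # rule 4: a new FILE cancels the previous announcement
        tags = self._tags(event)
        for name in ("x", "m", "size"):
            if name not in tags:
                return False, "invalid: missing tag " + name  # text is an assumption
        if not tags["size"].isdigit():
            return False, "invalid: bad size"  # text is an assumption
        if int(tags["size"]) > self.max_file_size:
            return False, "max_size: " + str(self.max_file_size)  # value is assumed
        self.pending = event
        return True, "continue"

    def on_binary_message(self, content: bytes):
        """Handle a complete binary message; returns (status, message)."""
        if self.pending is None:
            # rule 1: no prior announcement; the response text is an assumption
            return False, "invalid: no FILE announced"
        event, self.pending = self.pending, None
        tags = self._tags(event)
        same_size = len(content) == int(tags["size"])
        same_hash = hashlib.sha256(content).hexdigest() == tags["x"]
        if not (same_size and same_hash):
            return False, "invalid: file mismatch"  # rule 7: event is not published
        self.published.append(event)  # rule 2: publish only after a good transfer
        return True, ""
```

A relay would create one `UploadSession` per websocket connection, call `on_file_command` from the text message handler and `on_binary_message` from the binary message handler, and send each returned pair back as an `OK` response.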