Return-Path: <will8clark@gmail.com>
Received: from whitealder.osuosl.org (smtp1.osuosl.org [140.211.166.138])
 by lists.linuxfoundation.org (Postfix) with ESMTP id A2CBBC07FF
 for <bitcoin-dev@lists.linuxfoundation.org>;
 Fri,  8 May 2020 12:31:11 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by whitealder.osuosl.org (Postfix) with ESMTP id 9230188658
 for <bitcoin-dev@lists.linuxfoundation.org>;
 Fri,  8 May 2020 12:31:11 +0000 (UTC)
X-Virus-Scanned: amavisd-new at osuosl.org
Received: from whitealder.osuosl.org ([127.0.0.1])
 by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id ZmRxGnFwqvSp
 for <bitcoin-dev@lists.linuxfoundation.org>;
 Fri,  8 May 2020 12:31:10 +0000 (UTC)
X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6
Received: from mail-wr1-f53.google.com (mail-wr1-f53.google.com
 [209.85.221.53])
 by whitealder.osuosl.org (Postfix) with ESMTPS id 08B458867F
 for <bitcoin-dev@lists.linuxfoundation.org>;
 Fri,  8 May 2020 12:31:10 +0000 (UTC)
Received: by mail-wr1-f53.google.com with SMTP id z8so1650726wrw.3
 for <bitcoin-dev@lists.linuxfoundation.org>;
 Fri, 08 May 2020 05:31:09 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:content-transfer-encoding:mime-version:subject:message-id:date
 :to; bh=vWiqgvCUADG2p2f58ZvXICahhYzFibnn3A+XI5mI/P8=;
 b=aBynhffFEI6JEZpKnBJhkGkQICShgi+jTWW6Ga3jMlGZ+PD3ixM+XwzSqsXHCwUtt+
 FbBABGggi4sbk1Syj4TEHQ/ue5OtK5vmqdkFuEAzTQi2Gitffn2sITqAYijFlRn/wgi5
 HpDS2To54FKPeUK/UpyJg0ldEj5K8DolfkyGHfblMnhPEirCzPvq6p0xAoAYGf6qy/t2
 jf7OQmLGqkDy/9Gok8SsRKulRgOjuz9Ro7mL+BE2MPeP0ZPP18RnvuPv9wFl3Am/8icK
 FEf/a3+UAG8BdVbmk1moCs5LQaj6/A5aP8iWMui0TVVkdZqgLyUO++nNML1pNdWLS4ig
 ZZ0w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:content-transfer-encoding:mime-version
 :subject:message-id:date:to;
 bh=vWiqgvCUADG2p2f58ZvXICahhYzFibnn3A+XI5mI/P8=;
 b=Ls60qJTgZhp56rWqQIIPsST6DyBbatqmSZxoFEo8kPytxtZlNR39+KbExcrBcuD0T+
 iTVOfZdx8wL2utL29d7HB7nL6JIJNedAZmp2sDj44TH+Dudu1iFpCySE20Bo7wuuj5/x
 PBumDQqCYxNm0ZEnhwtoVSZja2FgEAOjMKbRTB/McqxLvjq70qmB19vmmQpHsxy9WKO7
 085FIRSgKmE5rzszD+xzZ8uZ5eckzm1+j1gO4iP+cjB+l6e2wrgFlTZtafv7I5BnHBPZ
 5A3Qf6RAMBPAj7AmtqPu0zJ2oUDSzyTlgXWUm1maOddoPXGWjmOfMpkB9iwPjcFdas98
 2SBw==
X-Gm-Message-State: AGi0PuY1SR5zibOmJY4NE7UU5FZccDBr46WbyIgNzfTbigxqfQW0JAZ7
 31d2FAwYghFIG5RJ1rEqtsfnf9Pslqg=
X-Google-Smtp-Source: APiQypKI6EbK2wCANFa6QR/gYYX3pSRIUPx+1hyyECZjYqZDBZ0oHjOKDjf9R6Bn4kOB4B2DN19/IA==
X-Received: by 2002:a5d:4092:: with SMTP id o18mr2694833wrp.227.1588941067619; 
 Fri, 08 May 2020 05:31:07 -0700 (PDT)
Received: from macbook-pro.localdomain
 (cpc123762-trow7-2-0-cust7.18-1.cable.virginm.net. [77.98.116.8])
 by smtp.gmail.com with ESMTPSA id c19sm2904574wrb.89.2020.05.08.05.31.06
 for <bitcoin-dev@lists.linuxfoundation.org>
 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
 Fri, 08 May 2020 05:31:06 -0700 (PDT)
From: Will Clark <will8clark@gmail.com>
Content-Type: text/plain;
	charset=utf-8
Content-Transfer-Encoding: quoted-printable
Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.80.23.2.2\))
Message-Id: <40DB3DBE-A1C9-4D20-A3C7-F5660307D9D7@gmail.com>
Date: Fri, 8 May 2020 13:31:06 +0100
To: bitcoin-dev@lists.linuxfoundation.org
X-Mailer: Apple Mail (2.3608.80.23.2.2)
X-Mailman-Approved-At: Fri, 08 May 2020 13:34:51 +0000
Subject: [bitcoin-dev] Compressed block headers
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>, 
 <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>, 
 <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Fri, 08 May 2020 12:31:11 -0000

Hello list,

I would like to propose a compressed block header scheme for IBD and =
block announcements. This proposal is derivative of previous proposals =
found on this list (see links in spec below) with some modifications and =
clarifications.

The below specification (also found at =
https://github.com/willcl-ark/compressed-block-headers/blob/v1.0/compresse=
d-block-headers.adoc ) details the compression recommended along with =
the generated bandwidth savings in the best-case scenario.

I look forward to any feedback anyone has to offer on the specification =
itself, as well as any additions or objections to the motivation.

Cheers,
Will


=3D Compressed block headers
Will Clark <will8clark@gmail.com>
v1.0, May 2020:
:toc: preamble
:toclevels: 4


This work is a derivation of these mailing list posts:

1. =
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-August/014876=
.html[bitcoin-dev: "Compressed" headers stream - 2017] (with =
resurrection =
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-December/0153=
85.html[here])

2. =
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-March/015851.=
html[bitcoin-dev: Optimized Header Sync]

'''

=3D=3D Motivation

Block headers as exchanged by nodes over the p2p network are currently =
81 bytes each.

For low bandwidth nodes who are doing a headers-only sync, reducing the =
size of the headers can provide a significant bandwidth saving. Also, =
nodes can support more header-only peers for IBD and protection against =
eclipse attacks if header bandwidth is reduced.

=3D=3D=3D Background

Currently headers are sent over the p2p network as a vector of =
`block_headers`, which are composed of the following sized fields:

[cols=3D"<,>"]
|=3D=3D=3D
|Field               |Size

|Version             |4 bytes
|Previous block hash |32 bytes
|Merkle root hash    |32 bytes
|Time                |4 bytes
|nBits               |4 bytes
|nonce               |4 bytes
|txn_count           |1 byte
|*Total*             |81 bytes
|=3D=3D=3D

Some fields can be removed completely, others can be compressed under =
certain conditions.

=3D=3D Proposed specification

=3D=3D=3D block_header2 data type

The following table illustrates the proposed `block_header2` data type =
specification.

[cols=3D"<,>,>"]
|=3D=3D=3D
|Field               |Size     |Compressed

|Bitfield            |1 byte   | 1 byte
|Version             |4 bytes  |0 \| 4 bytes
|Previous block hash |32 bytes |0 \| 32 bytes
|Merkle root hash    |32 bytes |32 bytes
|Time                |4 bytes  |2 \| 4 bytes
|nBits               |4 bytes  |0 \| 4 bytes
|nonce               |4 bytes  |4 bytes
|*Total*             |81 bytes |range: 39 - 81 bytes
|=3D=3D=3D

This compression results in a maximum reduction from an 81 byte header =
to best-case 39 byte header. With 629,474 blocks in the current =
blockchain, a continuous header sync from genesis (requiring a single =
full 81 byte header followed by only compressed `block_header2`) has =
been tested to have its required bandwidth reduced from 50.98MB down to =
25.86MB, a saving of 49%.

=3D=3D=3D=3D Bitfield

To make parsing of header messages easier and further increase header =
compression, a single byte bitfield was suggested by gmaxwell =
footnote:[https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-Dec=
ember/015397.html]. We propose the following amended bitfield meanings =
(bits re-ordered to match `headers2` field order):

[cols=3D"<,<"]
|=3D=3D=3D
|Bit |Meaning + field size to read

|0 +
1 +
2    |version: same as the last *distinct* value 1st ... 7th (0 byte =
field) or a new 32bit distinct value (4 byte field).
|3   |prev_block_hash: is omitted (0 byte field) or included (32 byte =
field)
|4   |timestamp: as small offset (2 byte field) or full (4 byte field).
|5   |nbits: same as last header (0 byte field) or new (4 byte field).
|6   |possibly to signal "more headers follow" to make the encoding =
self-delimiting.
|7   |currently undefined
|=3D=3D=3D

This bitfield adds 1 byte for every block in the chain, for a current =
total increase of 629,474B.

=3D=3D=3D=3D Version

In most cases the Version field will be identical to one referenced in =
one of the previous 7 unique versions, as indicated by bits 0,1,2 of the =
Bitfield.

To block 629,474 there were 616,137 blocks whose version was in the =
previous 7 distinct versions, and only 13,338 blocks whose version was =
not, this includes any version bit manipulation done via overt ASIC =
boost.

[cols=3D">,>,>,>"]
|=3D=3D=3D
|Genesis to block |Current (B) |Compressed (B) |Saving (%)

|629,474          |2,517,896   |53,352         |98
|=3D=3D=3D

=3D=3D=3D=3D Previous block hash

The previous block hash will always be the
`SHA256(SHA256(<previous_header>))` so is redundant, presuming you have =
the previous header in the chain.

[cols=3D">,>,>,>"]
|=3D=3D=3D
|Genesis to block |Current (B) |Compressed (B) |Saving (%)

|629,474          |20,143,168  |0              |100
|=3D=3D=3D

=3D=3D=3D=3D Time

The timestamp (in seconds) is consensus bound, based both on the time in =
the previous
header: `MAX_FUTURE_BLOCK_TIME =3D 2 * 60 * 60 =3D 7200`, and being =
greater than the `MedianTimePast` of the previous 11 blocks. Therefore =
this can be safely represented as an offset from the previous headers' =
timestamp using a 2 byte `signed short int`.

[cols=3D">,>,>,>"]
|=3D=3D=3D
|Genesis to block |Current (B) |Compressed (B) |Saving (%)

|629,474          |2,517,896   |1,258,952      |50
|=3D=3D=3D

=3D=3D=3D=3D nBits

nBits currently changes once every 2016 blocks. It could be entirely =
calculated by the client from the timestamps of the previous 2015 blocks =
footnote:[2015 blocks are used in the adjustment calculation due to an =
off-by-one error: =
https://bitcointalk.org/index.php?topic=3D43692.msg521772#msg521772"].

To simplify 'light' client implementations which would otherwise require =
consensus-valid calculation of the adjustments, we propose to transmit =
this according to the <<Bitfield>> specification above.

To block 629,474 there have been 298 nBits adjustments (vs an expected =
311 -- there was none before block 32,256).

[cols=3D">,>,>,>"]
|=3D=3D=3D
|Genesis to block |Current (B) |Compressed (B) |Saving (%)

|629,474          |2,517,896   |1,196          |99.6
|=3D=3D=3D

=3D=3D=3D=3D txn_count

txn_count is included to make parsing of these messages compatible with =
parsing of `block` messages =
footnote:[https://bitcoin.stackexchange.com/questions/2104/why-is-the-bloc=
k-header-txn-count-field-always-zero]. Therefore this field and its =
associated byte can be removed for transmission of compact headers.

[cols=3D">,>,>,>"]
|=3D=3D=3D
|Genesis to block |Current (B) |Compressed (B) |Saving (%)

|629,474          |629,474     |0              |100
|=3D=3D=3D

=3D=3D=3D Service Bit

A new service bit would be required so that the nodes can advertise =
their ability to supply compact headers.

=3D=3D=3D P2P Messages

Three new messages would be used by nodes that enable compact block =
header support, two query messages: `getheaders2` and `sendheaders2` and =
one response: `headers2`.

=3D=3D=3D=3D `getheaders2` -- Requesting compact headers

The new p2p message required to request compact block headers would =
require the same fields as the current `getheaders` message:

[cols=3D">,<,<,<"]
|=3D=3D=3D
|Field Size |Description          |Data type |Comments

|4          |version              |uint32_t  |the protocol version
|1+         |hash count           |var_int   |number of block locator =
hash entries
|32+        |block locator hashes |char[32]  |block locator object; =
newest back to genesis block (dense to start, but then sparse)
|32         |hash_stop            |char[32]  |hash of the last desired =
block header; set to zero to get as many blocks as possible (2000)
|=3D=3D=3D

=3D=3D=3D=3D `sendheaders2` -- Request compact header announcements

Since =
https://github.com/bitcoin/bips/blob/master/bip-0130.mediawiki[BIP-130], =
nodes have been able to request to receive new headers directly in =
`headers` messages, rather than via an `inv` of the new block hash and =
subsequent `getheader` request and `headers` response (followed by a =
final `getdata` to get the tip block itself, if desired). This is =
requested by transmitting an empty `sendheaders` message after the =
version handshake is complete.]

Upon receipt of this message, the node is permitted, but not required, =
to preemptively announce new headers with the `headers2` message =
(instead of `inv`). Preemptive header announcement is supported by the =
protocol version =E2=89=A5 70012 | Bitcoin Core version =E2=89=A5 =
0.12.0.

For the motivational use-case it makes sense to also update this =
mechanism to support sending header updates using compact headers using =
a new message.

=3D=3D=3D=3D `headers2` -- Receiving compact headers

A `headers2` message is returned in response to `getheaders2` or at new =
header announcement following a `sendheaders2` request. It contains both =
`length` and `headers` fields. The `headers` field contains a variable =
length vector of `block_header2`:

|=3D=3D=3D
|Field Size |Description |Data type       |Comments

|1+         |length      |var_int         |Length of `headers`
|39-81x?    |headers     |block_header2[] |Compressed block headers in =
<<block_header2 data type>> format
|=3D=3D=3D

=3D=3D=3D Implementation

* The first header in the first `block_header2[]` vector to a =
newly-connected client MUST contain the full nBits`, `timestamp`, =
`version` and `prev_block_hash` fields, along with a correctly populated =
`bitfield` byte.
* Subsequent headers in a contiguous vector SHOULD follow the compressed =
<<block_header2 data type>> format.
* Subsequent compressed headers supplied to an already-connected client =
(requesting compressed headers), SHOULD follow the compressed =
<<block_header2 data type>> format.