Return-Path: <keagan.mcclelland@gmail.com>
Received: from smtp1.osuosl.org (smtp1.osuosl.org [140.211.166.138])
 by lists.linuxfoundation.org (Postfix) with ESMTP id 80F0AC0001
 for <bitcoin-dev@lists.linuxfoundation.org>;
 Fri, 26 Feb 2021 18:40:49 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
 by smtp1.osuosl.org (Postfix) with ESMTP id 5B28184090
 for <bitcoin-dev@lists.linuxfoundation.org>;
 Fri, 26 Feb 2021 18:40:49 +0000 (UTC)
X-Virus-Scanned: amavisd-new at osuosl.org
X-Spam-Flag: NO
X-Spam-Score: 0.601
X-Spam-Level: 
X-Spam-Status: No, score=0.601 tagged_above=-999 required=5
 tests=[BAYES_50=0.8, DKIM_SIGNED=0.1, DKIM_VALID=-0.1,
 DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001,
 HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001,
 RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001]
 autolearn=ham autolearn_force=no
Authentication-Results: smtp1.osuosl.org (amavisd-new);
 dkim=pass (2048-bit key) header.d=gmail.com
Received: from smtp1.osuosl.org ([127.0.0.1])
 by localhost (smtp1.osuosl.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id b_PWG9qSu3JJ
 for <bitcoin-dev@lists.linuxfoundation.org>;
 Fri, 26 Feb 2021 18:40:48 +0000 (UTC)
X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0
Received: from mail-wr1-f42.google.com (mail-wr1-f42.google.com
 [209.85.221.42])
 by smtp1.osuosl.org (Postfix) with ESMTPS id 6E24D84080
 for <bitcoin-dev@lists.linuxfoundation.org>;
 Fri, 26 Feb 2021 18:40:48 +0000 (UTC)
Received: by mail-wr1-f42.google.com with SMTP id e10so9289029wro.12
 for <bitcoin-dev@lists.linuxfoundation.org>;
 Fri, 26 Feb 2021 10:40:48 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=mime-version:from:date:message-id:subject:to;
 bh=EN5ikFcIA6VxrMha3gPH+0OckG3F3b9Qz49V0HsqmrQ=;
 b=QYVg6BFvX59G9NKi5oRdOh6Fm2Fn7FIgEGVv5NPU/rI94fIpqFk8JKksLd6/l9QD9/
 NcNj8C+Kc0yhBa9yhyzQWNzxr5mblnCSiPiKVDaZ1Z4B4peg6IHp6rmKEqoyBc0FbrVQ
 bISSrn9T71UEHSUegLKr3x5PL0NKwYQBKGWIGZro7JEANUfXnxWpzjn/UbwwVMbi3Dnd
 l5ZAbbalcKWJUQpVHbi1bjUubT4VA9daK9mLfMUV+KI2JCDIo111hfcNqN9mf2Z+/TMS
 k+sIIFD3tRuocE8cHtvug6SZPLvQ8J8RNtTdpEna5ZJaznIGDIvohbOOVjtE9WFnG39S
 2oTQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:from:date:message-id:subject:to;
 bh=EN5ikFcIA6VxrMha3gPH+0OckG3F3b9Qz49V0HsqmrQ=;
 b=ItvHQrG2xQggEtwvOxRhSsBNJWzsg49xFB3ydqEFzhL0NdMMFtErwYJfzSNQv1G0sT
 hIoeuugIStYl35XeHV2OK3wh4JWNfiqIjXX2TAGj5XNBvQm0xQSsGlEg3TufpsAKNOrg
 rEwvVmAuGenbnO1HAoIfLrIZgfv8d+sQnYay7313F+1rYyeifRBF+4dRH86ropM2pT7u
 oSyyaZWpNoUBY+ZHKrSaB/SvI7vdZOZZjzhpfgnzvLHMaI7TbGcI6+sSglxHfNM/cdTZ
 pLRwmoS+Q1+kI3e5POrVqGZ8riCHIYMe39LwyMlS+ETlWSMywDxk2lsESW5hA/XvYiyZ
 3yRA==
X-Gm-Message-State: AOAM530aYM8gbJkWyywO6yMLwi5iXfyfFVEAO6YPNBbYw/ASUi4k+2Gu
 jVgvShLVnM0+WAGCbLYdzVMpb5WjmFyMIp+UxMPn8HXHPbc=
X-Google-Smtp-Source: ABdhPJzDTAGOzM1UZMiyiauCgq6IY08WQbeiY+zSdflTYpCVVn3Y/NStUxyOib48hbUwuz6znCLZT1KPfXO3NKxPmh0=
X-Received: by 2002:adf:ed44:: with SMTP id u4mr4577063wro.35.1614364846272;
 Fri, 26 Feb 2021 10:40:46 -0800 (PST)
MIME-Version: 1.0
From: Keagan McClelland <keagan.mcclelland@gmail.com>
Date: Fri, 26 Feb 2021 11:40:35 -0700
Message-ID: <CALeFGL1WSSA69ARvJW3di-UC_gz7NV9q7=6zd7s=CHnmttdQFg@mail.gmail.com>
To: Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Content-Type: multipart/alternative; boundary="00000000000033320f05bc419b01"
X-Mailman-Approved-At: Fri, 26 Feb 2021 20:57:10 +0000
Subject: [bitcoin-dev] A design for Probabilistic Partial Pruning
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>, 
 <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>, 
 <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Fri, 26 Feb 2021 18:40:49 -0000

--00000000000033320f05bc419b01
Content-Type: text/plain; charset="UTF-8"

Hi all,

I've been thinking for quite some time about the problem of pruned nodes
and ongoing storage costs for full nodes. One of the things that strikes me
as odd is that we only really have two settings.

A. Prune everything except the most recent blocks, down to the prune target
B. Keep everything since genesis

From my observations and conversations with various folks in the community,
they would like to be able to run a "partially" pruned node to help bear
the load of bootstrapping other nodes and helping with data redundancy in
the network, but would prefer not to dedicate hundreds of gigabytes of
storage space to the cause.

This led me to the idea that a node could randomly prune a block from
history if the block passed some predicate. A rough sketch of this would
look as follows.

1. At node startup, the node would generate a random seed. The seed would
be unique to the node, but it need not be cryptographically secure.
2. The node configuration would also carry a "threshold", expressed as the
percentage of blocks the node wants to keep.
3. As IBD occurs, the node would decide to either prune or keep each block
based on the threshold, the block hash, and the node's unique seed. The
uniqueness of each node's seed should ensure that no block is
systematically overrepresented in the set of nodes choosing this storage
scheme.
4. Once the node's IBD is complete, it would advertise this as a peer
service, including its seed and threshold, so that other nodes could
deterministically deduce which of their peers have which blocks.
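
To make the four steps above concrete, here is a minimal Python sketch of
one possible keep/prune predicate. Everything specific in it is an
assumption for illustration and not part of the proposal: the use of
SHA-256 over seed || block hash, the 32-byte seed, the integer percentage
threshold, and the names new_node_seed, keeps_block and peers_with_block.

```python
import hashlib
import secrets

HASH_SPACE = 2 ** 256  # a SHA-256 digest read as an integer lies in [0, 2**256)


def new_node_seed() -> bytes:
    """Step 1: per-node random seed (the 32-byte length is an arbitrary choice)."""
    return secrets.token_bytes(32)


def keeps_block(seed: bytes, block_hash: bytes, threshold_percent: int) -> bool:
    """Steps 2-3: keep the block iff H(seed || block_hash) falls below the threshold.

    Hypothetical predicate: the digest is treated as a uniform draw from the
    hash space, so roughly threshold_percent of all blocks are kept.
    """
    if not 0 <= threshold_percent <= 100:
        raise ValueError("threshold_percent must be between 0 and 100")
    digest = hashlib.sha256(seed + block_hash).digest()
    score = int.from_bytes(digest, "big")
    # Integer comparison, equivalent to score / HASH_SPACE < threshold_percent / 100.
    return score * 100 < threshold_percent * HASH_SPACE


def peers_with_block(peers: dict, block_hash: bytes) -> list:
    """Step 4: from advertised (seed, threshold) pairs, deduce who holds a block.

    `peers` maps a peer name to its advertised (seed, threshold_percent) pair.
    """
    return [
        name
        for name, (seed, threshold_percent) in peers.items()
        if keeps_block(seed, block_hash, threshold_percent)
    ]
```

Because the predicate depends only on the advertised seed and threshold and
on the block hash, any peer can evaluate it locally without asking the node
which blocks it holds. Mixing the seed into the hash is what makes each
node keep a different subset.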

The goal is to increase data redundancy in a way that shares the load more
uniformly across nodes, relieving some of the IBD pressure on full archive
nodes. I am working on a draft BIP for this proposal, but figured I would
submit it as a high-level idea first in case anyone has feedback on the
initial design before I go into specification-level detail.
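
As a rough back-of-the-envelope illustration of the redundancy goal, the
sketch below computes how many copies of a block one would expect on the
network. It assumes, purely for illustration, that every participating
node uses the same keep fraction and that nodes' choices are independent
and uniform. The function names are hypothetical.

```python
def expected_copies(num_nodes: int, keep_fraction: float) -> float:
    """Expected number of nodes holding any given block."""
    return num_nodes * keep_fraction


def prob_block_unavailable(num_nodes: int, keep_fraction: float) -> float:
    """Probability that no participating node holds a given block."""
    return (1.0 - keep_fraction) ** num_nodes


if __name__ == "__main__":
    # Example figures only: 1000 partially pruned nodes, each keeping 10%.
    print(expected_copies(1000, 0.10))
    print(prob_block_unavailable(1000, 0.10))
```

Under these assumptions, 1000 nodes that each keep 10% of history hold
about 100 copies of every block on average, and the chance that a given
block is held by none of them is negligible.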

If you have thoughts on

A. The protocol design itself
B. The barriers to put this kind of functionality into Core

I would love to hear from you.

Cheers,
Keagan

--00000000000033320f05bc419b01--