summaryrefslogtreecommitdiff
path: root/ef/f9f1e630e8cd70aff880ca71ebcd46042b572a
blob: 01b84e4a450b3c09e534819f7766a58ca7401cf4 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
Return-Path: <gmaxwell@gmail.com>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
	[172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id 2F787E47
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Fri, 25 May 2018 18:42:44 +0000 (UTC)
X-Greylist: whitelisted by SQLgrey-1.7.6
Received: from mail-ua0-f178.google.com (mail-ua0-f178.google.com
	[209.85.217.178])
	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id CC2AD180
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Fri, 25 May 2018 18:42:42 +0000 (UTC)
Received: by mail-ua0-f178.google.com with SMTP id f30-v6so2313501uab.11
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Fri, 25 May 2018 11:42:42 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
	h=mime-version:sender:in-reply-to:references:from:date:message-id
	:subject:to; bh=amZkCbxLeJf2BtaiwbgMCvZxYkbyvMFvXDWVNEe4pZs=;
	b=J1pRrVozbUYTd8n145E1dfBC41Ta3CW82rrHG0e8X3vz/7EBxuD3zqqBDD0AMj7DL8
	9AA4yw6VOZS1t63uk4kt1dk5I7qGQIfDqc00DCJGt0RzIRbo0EGfFGd9V9sPgm6sGQLc
	yIlxO5coNurJZR5BK1dWl3+HydqrV+9aP3I3U1rUiQmVcuFJDzfyfM8KeF9qVARoYIQB
	n5tOVdltrTLYfG/H1TgsOU71qE4svlr2EN9pU4c1MVqXA7bBH7fob19aVIJn+j0hx+zp
	WUyejlmKZUGn9xPOzMC/djF8gJz0paep+wPSFikt4U3hwRCmMUjqltDvw+rLUszCDgUg
	j8MA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:mime-version:sender:in-reply-to:references:from
	:date:message-id:subject:to;
	bh=amZkCbxLeJf2BtaiwbgMCvZxYkbyvMFvXDWVNEe4pZs=;
	b=mm4E6K9V1aaaRImwgUaBeIwmdZRev0atHFoEsly+bH5aLWj64EunKvKozEQ3i9NZcV
	8LsgoPcIz8g5wMUDIk/5BU2BjAUFNqDlzrr/v5UHYYwtPNoXTtDoQfDDIieKrenj4CRD
	fPbiN/ZKG8GQId9f5b0SoPVgh4E7CQAgCa2mFiMuHDoMf4sdxXDHZRhARd7iRptkuJ6u
	UMyUbU7wU0ZNUIeddtmDs1a7bHDkFe28Is0I4fyTKP4PMuNyGFnMX4v1wI7vFXzIQoRr
	xnyG/ybQPZqXzUeAw3dnF8YlNaEcijKv8YGkTJfOFyMSWLA3v8P7cngnEakuNDQZKttJ
	/gfw==
X-Gm-Message-State: ALKqPwfzVMSNaXw8Ix6P0Oy+Z/p8rY8wA7mcy2FB9PslJ6GR7Ff2y4tR
	kuOUDerHpsEWS2EKSZgbrV6ug+6JKUo7Fa8z0Dc=
X-Google-Smtp-Source: ADUXVKIPxarCXfPCPEkIo651XFI9wi9esoyXJ5E4hdCUgAOFx5a0E3c7GmcwFocqqQQTFRoM7v/zLhj0KmdckJ5NW0k=
X-Received: by 2002:ab0:1092:: with SMTP id
	d18-v6mr2410356uab.87.1527273761839; 
	Fri, 25 May 2018 11:42:41 -0700 (PDT)
MIME-Version: 1.0
Sender: gmaxwell@gmail.com
Received: by 2002:a67:5184:0:0:0:0:0 with HTTP; Fri, 25 May 2018 11:42:41
	-0700 (PDT)
In-Reply-To: <CAPg+sBgywj6PgijmSNkYYkKKQuek2g9-cSy6GJBpV+=gom7LfQ@mail.gmail.com>
References: <CAPg+sBgywj6PgijmSNkYYkKKQuek2g9-cSy6GJBpV+=gom7LfQ@mail.gmail.com>
From: Gregory Maxwell <greg@xiph.org>
Date: Fri, 25 May 2018 18:42:41 +0000
X-Google-Sender-Auth: VQf-u-Iqg4gtmVNdunUeYVYGHVk
Message-ID: <CAAS2fgS5cnNZSp7DJdDEdt1ainezfg7aoAbga2Py7gqfe267kw@mail.gmail.com>
To: Pieter Wuille <pieter.wuille@gmail.com>, 
	Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Content-Type: text/plain; charset="UTF-8"
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID, FREEMAIL_FROM,
	RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	smtp1.linux-foundation.org
Subject: Re: [bitcoin-dev] Minimizing the redundancy in Golomb Coded Sets
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Fri, 25 May 2018 18:42:44 -0000

On Fri, May 25, 2018 at 5:54 PM, Pieter Wuille via bitcoin-dev
<bitcoin-dev@lists.linuxfoundation.org> wrote:
> Hi all,
>
> I spent some time working out the optimal parameter selection for the
> Golomb Coded Sets that are proposed in BIP158:
> https://gist.github.com/sipa/576d5f09c3b86c3b1b75598d799fc845
>
> TL;DR: if we really want an FP rate of exactly 1 in 2^20, the Rice
> parameter should be 19, not 20. If we don't, we should pick an FP rate
> of 1 in a 1.4971*2^B. So for example M=784931 B=19 or M=1569861 B=20.


I did a rough analysis using Pieter's approximations on what
parameters minimizes the total communications for a lite wallet
scanning the chain and fetching a witnessless block whenever they get
a filter hit. For a wallet with 1000 keys and blocks of 1MB if the
number of entries in the is at least 5096 then M=784931 results in a
lower total data rate rate (FP blocks + filters) than M=1569861.
M=392465 (the optimal value for the rice parameter 18) is
communications is better if at least 10192 entries are set, and
M=196233 (optimal FP for rice 17) is better if at least 20384 entries
are set.

The prior filter set proposal is setting roughly 13300 entries per
full block,  and I guestimate that the in+out scripts only ones are
setting about 7500 entries (if that actual number was in any of the
recent posts I missed it, I'm guessing based on jimpo's sizes graph).

The breakpoints are obviously different if the client is monitoring
for, say, 10,000 keys instead of 1000 but I think it generally makes
more sense to optimize for lower key counts since bigger users are
more likely to tolerate the additional bandwidth usage.

So I think that assuming that all-scripts inputs and outputs (but no
txids) are used and that my guess of 7500 bits set for that
configuration is roughly right, then M=1569861 and rice parameter 19
should be used.

The actual optimal FP rate for total data transferred won't be one
that gets the optimal rice coding efficiency, but since different
clients will be monitoring for different numbers of keys, it probably
makes sense to pick a parameter with optimal compression rather than
optimal-data-transfer-for-a-specific-key-count-- at least then we're
spending the least amount of filter bits per false positive rate,
whatever that rate is... if we can't be optimal at least we can be
efficient. :)