Return-Path: <lucasontivero@gmail.com> Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 99043E4F for <bitcoin-dev@lists.linuxfoundation.org>; Wed, 30 May 2018 03:10:07 +0000 (UTC) X-Greylist: whitelisted by SQLgrey-1.7.6 Received: from mail-lf0-f51.google.com (mail-lf0-f51.google.com [209.85.215.51]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 88FA3224 for <bitcoin-dev@lists.linuxfoundation.org>; Wed, 30 May 2018 03:10:06 +0000 (UTC) Received: by mail-lf0-f51.google.com with SMTP id n18-v6so1897167lfh.10 for <bitcoin-dev@lists.linuxfoundation.org>; Tue, 29 May 2018 20:10:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=p9+3bu7vtj6g2ZBgbMo0zo4+nd0AMj1nOKuTU7Z7wuI=; b=adfXPA8m2ZqSm5E0/6rGPxim5iao1hmZYWxAxgvWmCyCjChC2qRRojSHSMEGBadlbn itwQ8MPmrczczmCECVhzjqOz6FB+t3yuv6py36HJ9PRpJZ5ZqatgoscJY5MW3ij6C4vz v4fUkGcz2sTts3yHHfJvRz4dQj6xEpfK0KNiJ0XisRhsnctR5Z4U21yh4DC3PAzGyMKu KgAqIzQ2H035Z3rl8g36+kO0qzgtNWLUTNpxNDlLNbeRJg8kPmQwdQgJ8UopX67n9RVV UBVC6gI4riGwy05oq0qWnzVXHBvJNRW+XIRzmDw0QuFZczNwSXnkrJyqKMwpsTKNfHgV Z8Jw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=p9+3bu7vtj6g2ZBgbMo0zo4+nd0AMj1nOKuTU7Z7wuI=; b=a46WNKwhB+iP8LzObZjcXbTwmaBWBfBeI1ISWqctDOYWtTCUJDZI4mKlkuwGzod2Em bVqyG3y8TLqN/AQKUOiM2CfOuBvHu57aKvK4k2xSCIJsPTCWlafUw0G6h+fEUVaPYTdH P/jszD8qXPGfKl9AJ8nVh2ediuKWCe1+pNVycDtoKPjZEQ+JnkbVpVsM38ukunVEfc3e TaCgDoIdTEabIYT5rLXstVAXgV8V0IZMUHt0kzTtDICmPslBmgdde3qGeoG2BGbOI2Hm 7Lv/4uh8wVvshipi7riDQK6E0hlevWUPUZqgFuxpzH/2X+5aYl2oGywqQxWg+JaCh365 X2cA== X-Gm-Message-State: ALKqPwdpMQC8/xON3yC4zH8KdjvH2xBX3oNPiZ2lyBFjvhCimoCKY4pf TWz3Az+eufSN40vuvgv8z26yt33wyGnBacwDTsY= X-Google-Smtp-Source: ADUXVKKU4P6/h2mgWIj3cMoO/x0vHXAgoW9C8Ee3HB/5TJEox9vNxb8gFC9B0/MTVz3eUSSun5rvxnVagI8blMpSoKc= X-Received: by 2002:a19:97cb:: with SMTP id z194-v6mr498101lfd.17.1527649804876; Tue, 29 May 2018 20:10:04 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a19:3bd4:0:0:0:0:0 with HTTP; Tue, 29 May 2018 20:10:04 -0700 (PDT) In-Reply-To: <CADZtCShF-sGNfjSaOdmK=YMOdvimN2b_a1vXmJv_XHKxvn14Zw@mail.gmail.com> References: <CAPg+sBgywj6PgijmSNkYYkKKQuek2g9-cSy6GJBpV+=gom7LfQ@mail.gmail.com> <CAAS2fgS5cnNZSp7DJdDEdt1ainezfg7aoAbga2Py7gqfe267kw@mail.gmail.com> <CAAS2fgQSS7R+PtmpcXJqjeXWnLa8S8O_1nFgdCYiLqRYbQM3QQ@mail.gmail.com> <CADZtCShF-sGNfjSaOdmK=YMOdvimN2b_a1vXmJv_XHKxvn14Zw@mail.gmail.com> From: Lucas Ontivero <lucasontivero@gmail.com> Date: Wed, 30 May 2018 00:10:04 -0300 Message-ID: <CALHvQn0eSU4f_f-XgTefVxfGKTjHqMsQ+bQDOb9qxk_tfuPqqg@mail.gmail.com> To: Jim Posen <jim.posen@gmail.com>, Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org> Content-Type: multipart/alternative; boundary="000000000000f5fefd056d63af9e" X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, HTML_MESSAGE, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org X-Mailman-Approved-At: Wed, 30 May 2018 03:11:05 +0000 Subject: Re: [bitcoin-dev] Minimizing the redundancy in Golomb Coded Sets X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org> List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>, <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe> List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/> List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org> List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help> List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>, <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe> X-List-Received-Date: Wed, 30 May 2018 03:10:07 -0000 --000000000000f5fefd056d63af9e Content-Type: text/plain; charset="UTF-8" Hi Jim, Yes please, could you share CSV? We are developing a Wallet that uses Golomb-Rice filters it would help a lot for determine the best P value depending on the estimated number of elements the client needs to watch. 2018-05-29 19:38 GMT-03:00 Jim Posen via bitcoin-dev < bitcoin-dev@lists.linuxfoundation.org>: > This is a really cool finding, thanks Pieter! > > I did some more analysis on selecting a good P value to reduce total data > downloaded considering both filters themselves and blocks in the case of > false positive matches, using data from mainnet. The quantity it minimizes > is: > > filter_size(N, B) + block_size * false_positive_probability(C, N, B) > > N is the number of filter elements per block > B is the Golomb-Rice coding parameter > C is the number of filter elements watched by the client > > The main result is that: > > For C = 10, B = 13 is optimal > For C = 100, B = 16 is optimal > For C = 1,000, B = 20 is optimal > For C = 10,000, B = 23 is optimal > > So any value of B in the range 16 to 20 seems reasonable, with M = 1.4971 > * 2^B for optimal compression, as Pieter derived. The selection of the > parameter depends on the target number of elements that a client may watch. > > I attached some of the results, and would be happy to share the CSV and > raw notebook if people are interested. > > > On Fri, May 25, 2018 at 2:14 PM Gregory Maxwell via bitcoin-dev < > bitcoin-dev@lists.linuxfoundation.org> wrote: > >> On Fri, May 25, 2018 at 6:42 PM, Gregory Maxwell <greg@xiph.org> wrote: >> > configuration is roughly right, then M=1569861 and rice parameter 19 >> > should be used. >> >> That should have been M=784931 B=19 ... paste error. >> _______________________________________________ >> bitcoin-dev mailing list >> bitcoin-dev@lists.linuxfoundation.org >> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev >> > > _______________________________________________ > bitcoin-dev mailing list > bitcoin-dev@lists.linuxfoundation.org > https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev > > --000000000000f5fefd056d63af9e Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir=3D"ltr"><div>Hi Jim,</div><div><br></div><div>Yes please, could yo= u share CSV? We are developing a Wallet that uses Golomb-Rice filters it wo= uld help a lot for determine the best P value depending on the estimated nu= mber of elements the client needs to watch. <br></div></div><div class=3D"g= mail_extra"><br><div class=3D"gmail_quote">2018-05-29 19:38 GMT-03:00 Jim P= osen via bitcoin-dev <span dir=3D"ltr"><<a href=3D"mailto:bitcoin-dev@li= sts.linuxfoundation.org" target=3D"_blank">bitcoin-dev@lists.linuxfoundatio= n.org</a>></span>:<br><blockquote class=3D"gmail_quote" style=3D"margin:= 0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir=3D"ltr">Th= is is a really cool finding, thanks Pieter!<div><br></div><div>I did some m= ore analysis on selecting a good P value to reduce total data downloaded co= nsidering both filters themselves and blocks in the case of false positive = matches, using data from mainnet. The quantity it minimizes is:</div><div><= br></div><div>filter_size(N, B)=C2=A0+ block_size * false_positive_probabil= ity(C, N, B)</div><div><br></div><div>N is the number of filter elements pe= r block</div><div>B is the Golomb-Rice coding parameter</div><div>C is the = number of filter elements watched by the client</div><div><br></div><div>Th= e main result is that:</div><div><br></div><div>For C =3D 10, B =3D 13 is o= ptimal</div><div>For C =3D 100, B =3D 16 is optimal</div><div>For C =3D 1,0= 00, B =3D 20 is optimal</div><div>For C =3D 10,000, B =3D 23 is optimal</di= v><div><br></div><div>So any value of B in the range 16 to 20 seems reasona= ble, with M =3D=C2=A0<span style=3D"color:rgb(34,34,34);font-family:sans-se= rif;font-size:13px;font-style:normal;font-variant-ligatures:normal;font-var= iant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;tex= t-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;backgr= ound-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-c= olor:initial;float:none;display:inline">1.4971 * 2^B for optimal compressio= n, as Pieter derived. The selection of the parameter depends on the target = number of elements that a client may watch.</span></div><div><span style=3D= "color:rgb(34,34,34);font-family:sans-serif;font-size:13px;font-style:norma= l;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;le= tter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;wh= ite-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-de= coration-style:initial;text-decoration-color:initial;float:none;display:inl= ine"><br></span></div><div><span style=3D"color:rgb(34,34,34);font-family:s= ans-serif;font-size:13px;font-style:normal;font-variant-ligatures:normal;fo= nt-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:sta= rt;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;= background-color:rgb(255,255,255);text-decoration-style:initial;text-decora= tion-color:initial;float:none;display:inline">I attached some of the result= s, and would be happy to share the CSV and raw notebook if people are inter= ested.</span></div><div><br></div></div><div class=3D"HOEnZb"><div class=3D= "h5"><br><div class=3D"gmail_quote"><div dir=3D"ltr">On Fri, May 25, 2018 a= t 2:14 PM Gregory Maxwell via bitcoin-dev <<a href=3D"mailto:bitcoin-dev= @lists.linuxfoundation.org" target=3D"_blank">bitcoin-dev@lists.<wbr>linuxf= oundation.org</a>> wrote:<br></div><blockquote class=3D"gmail_quote" sty= le=3D"margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Fri= , May 25, 2018 at 6:42 PM, Gregory Maxwell <<a href=3D"mailto:greg@xiph.= org" target=3D"_blank">greg@xiph.org</a>> wrote:<br> > configuration is roughly right, then M=3D1569861 and rice parameter 19= <br> > should be used.<br> <br> That should have been M=3D784931 B=3D19=C2=A0 ... paste error.<br> ______________________________<wbr>_________________<br> bitcoin-dev mailing list<br> <a href=3D"mailto:bitcoin-dev@lists.linuxfoundation.org" target=3D"_blank">= bitcoin-dev@lists.<wbr>linuxfoundation.org</a><br> <a href=3D"https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev" = rel=3D"noreferrer" target=3D"_blank">https://lists.linuxfoundation.<wbr>org= /mailman/listinfo/bitcoin-<wbr>dev</a><br> </blockquote></div> </div></div><br>______________________________<wbr>_________________<br> bitcoin-dev mailing list<br> <a href=3D"mailto:bitcoin-dev@lists.linuxfoundation.org">bitcoin-dev@lists.= <wbr>linuxfoundation.org</a><br> <a href=3D"https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev" = rel=3D"noreferrer" target=3D"_blank">https://lists.linuxfoundation.<wbr>org= /mailman/listinfo/bitcoin-<wbr>dev</a><br> <br></blockquote></div><br></div> --000000000000f5fefd056d63af9e--