<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Dominik Winecki</title>
    <description></description>
    <link>https://dominik.win/blog/</link>
    <atom:link href="https://dominik.win/blog/feed.xml" rel="self" type="application/rss+xml"/>
    <pubDate>Wed, 15 Apr 2026 05:01:59 +0000</pubDate>
    <lastBuildDate>Wed, 15 Apr 2026 05:01:59 +0000</lastBuildDate>
    <generator>Jekyll v4.4.1</generator>
    
      <item>
        <title>a(13)-a(22) for A279860</title>
        <description>&lt;p&gt;Here is the definition of &lt;a href=&quot;https://oeis.org/A279860&quot;&gt;A279860&lt;/a&gt; from OEIS:&lt;/p&gt;
&lt;blockquote&gt;
  &lt;p&gt;a(n) is the first n-digit substring to repeat in the decimal expansion of e.&lt;/p&gt;
&lt;/blockquote&gt;
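
&lt;p&gt;To make the definition concrete, here is a minimal sketch that finds the first repeated n-digit window in a short prefix of e. The helper names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;e_digits&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;first_repeat&lt;/code&gt; are made up for illustration, and it assumes positions are 0-indexed, count the leading 2, and allow overlapping occurrences. It is nowhere near the scale of the real search.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
```python
def e_digits(count):
    """Return the first `count` decimal digits of e as a string, leading 2 included."""
    guard = 10  # extra digits that absorb truncation error from integer division
    scale = 10 ** (count + guard)
    total, term, k = 0, scale, 0  # term holds scale / k!, starting at k = 0
    while term:
        total += term
        k += 1
        term //= k
    return str(total)[:count]


def first_repeat(digits, n):
    """Return (substring, first_position, repeat_position) for the first
    n-digit window seen a second time, or None if no window repeats."""
    seen = {}  # window -> position of its first occurrence
    for i in range(len(digits) - n + 1):
        window = digits[i:i + n]
        if window in seen:
            return window, seen[window], i
        seen[window] = i
    return None


digits = e_digits(50)
for n in range(1, 5):
    print(n, first_repeat(digits, n))
```
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;A plain dictionary is fine for a few thousand digits; it does not scale to billions of digits without partitioning the work.&lt;/p&gt;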

&lt;p&gt;There is an ongoing competition [&lt;a href=&quot;https://sponaugle.com/wp/math_pi_repeat/&quot;&gt;1&lt;/a&gt;, &lt;a href=&quot;https://jonas.sh/world-record-finding-longest-reapting-sequence-in-decimal-expanson-of-pi-2/&quot;&gt;2&lt;/a&gt;] to find such repeated sequences in π (&lt;a href=&quot;https://oeis.org/A197123&quot;&gt;A197123&lt;/a&gt;), so I took a crack at the second-most-famous transcendental constant.&lt;/p&gt;

&lt;p&gt;I searched the first 250 billion digits and found &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a(1)&lt;/code&gt;-&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a(22)&lt;/code&gt;.
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a(23)&lt;/code&gt; was not found (I only had a 26.8% chance of finding it).
I used a HashSet but would probably switch to a Bloom filter for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a(24)&lt;/code&gt; or higher.
Rather than partitioning by prefix, I used the least significant bits of each sequence as its hash and partitioned by modulo.&lt;/p&gt;
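
&lt;p&gt;The 26.8% figure is in line with a birthday-problem estimate. The sketch below is one way to reproduce a number like it, not necessarily how it was originally computed. It assumes the digits of e behave like independent uniform random digits (which is not proven) and uses a Poisson approximation for the number of repeated pairs. The function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;repeat_probability&lt;/code&gt; is hypothetical.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
```python
import math


def repeat_probability(num_digits, n):
    """Approximate chance that some n-digit window repeats within num_digits digits."""
    windows = num_digits - n + 1  # number of n-digit windows in the prefix
    # Expected number of matching window pairs if every window were uniform over 10**n values.
    expected_pairs = windows * (windows - 1) / (2 * 10 ** n)
    # Poisson approximation: P(at least one pair) = 1 - exp(-expected_pairs).
    return 1 - math.exp(-expected_pairs)


print(round(100 * repeat_probability(250 * 10 ** 9, 23), 1))
```
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With 250 billion digits and n = 23, the expected number of matching pairs is about 0.31, which gives a probability of roughly 0.268.&lt;/p&gt;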

&lt;p&gt;Here are the sequences and their positions:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;n&lt;/th&gt;
      &lt;th&gt;a(n)&lt;/th&gt;
      &lt;th&gt;First Position&lt;/th&gt;
      &lt;th&gt;Repeat Position&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;0&lt;/td&gt;
      &lt;td&gt;4&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;18&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;6&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt;182&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;6&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;4&lt;/td&gt;
      &lt;td&gt;1828&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;6&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;5&lt;/td&gt;
      &lt;td&gt;98793&lt;/td&gt;
      &lt;td&gt;478&lt;/td&gt;
      &lt;td&gt;494&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;6&lt;/td&gt;
      &lt;td&gt;987931&lt;/td&gt;
      &lt;td&gt;478&lt;/td&gt;
      &lt;td&gt;494&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;7&lt;/td&gt;
      &lt;td&gt;4349076&lt;/td&gt;
      &lt;td&gt;170&lt;/td&gt;
      &lt;td&gt;2731&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;8&lt;/td&gt;
      &lt;td&gt;82549802&lt;/td&gt;
      &lt;td&gt;11449&lt;/td&gt;
      &lt;td&gt;23544&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;9&lt;/td&gt;
      &lt;td&gt;450388721&lt;/td&gt;
      &lt;td&gt;9832&lt;/td&gt;
      &lt;td&gt;29424&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;10&lt;/td&gt;
      &lt;td&gt;5291493974&lt;/td&gt;
      &lt;td&gt;35210&lt;/td&gt;
      &lt;td&gt;159929&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;11&lt;/td&gt;
      &lt;td&gt;72883660263&lt;/td&gt;
      &lt;td&gt;55704&lt;/td&gt;
      &lt;td&gt;172254&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;12&lt;/td&gt;
      &lt;td&gt;476539957100&lt;/td&gt;
      &lt;td&gt;685615&lt;/td&gt;
      &lt;td&gt;947487&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;13&lt;/td&gt;
      &lt;td&gt;2557556217677&lt;/td&gt;
      &lt;td&gt;696773&lt;/td&gt;
      &lt;td&gt;4212711&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;14&lt;/td&gt;
      &lt;td&gt;45820090256930&lt;/td&gt;
      &lt;td&gt;937385&lt;/td&gt;
      &lt;td&gt;11527854&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;15&lt;/td&gt;
      &lt;td&gt;441340408673681&lt;/td&gt;
      &lt;td&gt;16824283&lt;/td&gt;
      &lt;td&gt;64070443&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;16&lt;/td&gt;
      &lt;td&gt;4413404086736815&lt;/td&gt;
      &lt;td&gt;16824283&lt;/td&gt;
      &lt;td&gt;64070443&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;17&lt;/td&gt;
      &lt;td&gt;04490816865108378&lt;/td&gt;
      &lt;td&gt;305179147&lt;/td&gt;
      &lt;td&gt;305519388&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;18&lt;/td&gt;
      &lt;td&gt;044908168651083787&lt;/td&gt;
      &lt;td&gt;305179147&lt;/td&gt;
      &lt;td&gt;305519388&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;19&lt;/td&gt;
      &lt;td&gt;3369635849169468729&lt;/td&gt;
      &lt;td&gt;3493534312&lt;/td&gt;
      &lt;td&gt;4849639160&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;20&lt;/td&gt;
      &lt;td&gt;29845348176416445308&lt;/td&gt;
      &lt;td&gt;15006890876&lt;/td&gt;
      &lt;td&gt;15441233099&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;21&lt;/td&gt;
      &lt;td&gt;561387588857094780748&lt;/td&gt;
      &lt;td&gt;33340537999&lt;/td&gt;
      &lt;td&gt;55184810049&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;22&lt;/td&gt;
      &lt;td&gt;8238609087161769234440&lt;/td&gt;
      &lt;td&gt;161691630698&lt;/td&gt;
      &lt;td&gt;176191766517&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
</description>
        <pubDate>Tue, 08 Apr 2025 00:00:00 +0000</pubDate>
        <link>https://dominik.win/blog/a279860/</link>
        <guid isPermaLink="true">https://dominik.win/blog/a279860/</guid>
        
        
      </item>
    
      <item>
        <title>My workflow for plots in research papers</title>
        <description>&lt;p&gt;When writing papers, managing evaluation data and plots has been consistently troublesome.
While getting to the first draft of a plot is often quick, the challenge arises during the inevitable subsequent revisions.
Here’s the workflow I’ve been using to speed up my iteration time.&lt;/p&gt;

&lt;h3 id=&quot;issues--inconveniences&quot;&gt;Issues &amp;amp; Inconveniences&lt;/h3&gt;

&lt;p&gt;There are two categories of issues/inconveniences that my previous bespoke Excel-based workflow had:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Manual updates:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Loading data was manual (even from CSVs).&lt;/li&gt;
  &lt;li&gt;Every plot had to be created from scratch.&lt;/li&gt;
  &lt;li&gt;Plots had to be manually exported one by one on each change.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Reproducibility issues:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Spreadsheets had to be versioned and tracked separately from code*.&lt;/li&gt;
  &lt;li&gt;I often forgot which evaluation run was loaded in any given sheet.
    &lt;ul&gt;
      &lt;li&gt;Especially when returning to a project after a few weeks/months.&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;There was no record of label/name updates.&lt;/li&gt;
  &lt;li&gt;Excel subtly changes chart sizes when you save/load a sheet.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Most of these issues are different forms of excess user-managed state.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;*While spreadsheets can be version-controlled like any other file, no merge algorithm will resolve conflicts, forcing this task on the user.&lt;/p&gt;

&lt;h3 id=&quot;my-workflow&quot;&gt;My workflow&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;TLDR:&lt;/em&gt;&lt;/strong&gt; Code → &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;data.json&lt;/code&gt; → duckdb → ggplot2 → &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;plots/&amp;lt;name&amp;gt;.pdf&lt;/code&gt; → LaTeX &lt;br /&gt;
(where duckdb and ggplot2 are one-liners in an R Jupyter Notebook cell)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Evaluation Code:&lt;/strong&gt;
Each benchmark prints its metrics as individual lines of JSON, which can be redirected to a file.
For example:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;json&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;fibonacci_recursive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;fibonacci_recursive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;fibonacci_recursive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;fibonacci_iterative&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;start_time&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;fibonacci_recursive&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;rec_time&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start_time&lt;/span&gt;
&lt;span class=&quot;nf&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;json&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;dumps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;fib-recursive&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rec_time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}))&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;start_time&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;fibonacci_iterative&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;iter_time&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start_time&lt;/span&gt;
&lt;span class=&quot;nf&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;json&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;dumps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;fib-iterative&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iter_time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;python my_benchmark.py &lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt; data.json
&lt;span class=&quot;nv&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;cat &lt;/span&gt;data.json
&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;fn&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;fib-recursive&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;time&quot;&lt;/span&gt;: 3.695487976074219e-05&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;fn&quot;&lt;/span&gt;: &lt;span class=&quot;s2&quot;&gt;&quot;fib-iterative&quot;&lt;/span&gt;, &lt;span class=&quot;s2&quot;&gt;&quot;time&quot;&lt;/span&gt;: 3.337860107421875e-06&lt;span class=&quot;o&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If there are multiple benchmarks to run, I make a script that clears the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;data.json&lt;/code&gt; file and then runs each benchmark, appending the results to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;data.json&lt;/code&gt;.&lt;/p&gt;
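
&lt;p&gt;Such a driver script could look roughly like the sketch below. The name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_benchmarks&lt;/code&gt; and the inline demo commands are placeholders for illustration; real benchmarks would be separate scripts like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;my_benchmark.py&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
```python
import subprocess
import sys
from pathlib import Path


def run_benchmarks(commands, out_path="data.json"):
    """Clear out_path, then run each command and append its stdout to out_path."""
    out = Path(out_path)
    out.write_text("")  # drop results left over from any previous run
    for cmd in commands:
        # check=True stops the whole run if a benchmark exits with an error.
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        with out.open("a") as f:
            f.write(result.stdout)


if __name__ == "__main__":
    # Hypothetical stand-ins for real benchmark scripts.
    demo = [
        [sys.executable, "-c", "import json; print(json.dumps({'fn': 'demo-a', 'time': 0.1}))"],
        [sys.executable, "-c", "import json; print(json.dumps({'fn': 'demo-b', 'time': 0.2}))"],
    ]
    run_benchmarks(demo, "data.json")
```
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Clearing the file first means it only ever holds one evaluation run, so there is no question about which run a plot came from.&lt;/p&gt;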

&lt;p&gt;&lt;strong&gt;Loading, Preprocessing, and Plotting Data:&lt;/strong&gt;
I then load and preprocess the JSON on disk with DuckDB.
I do this in a Jupyter Notebook running an R kernel.
While I prefer Python, ggplot2 is better than anything Python has.&lt;/p&gt;

&lt;div class=&quot;language-r highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;duckdb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;con&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbConnect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;duckdb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbExecute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;con&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;INSTALL json; LOAD json;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;
SELECT
    CASE 
    WHEN json-&amp;gt;&amp;gt;&apos;fn&apos; = &apos;fib-recursive&apos; THEN &apos;Recursive&apos;
    WHEN json-&amp;gt;&amp;gt;&apos;fn&apos; = &apos;fib-iterative&apos; THEN &apos;Iterative&apos;
    ELSE json-&amp;gt;&amp;gt;&apos;fn&apos;
    END AS fn,
    CAST(json-&amp;gt;&amp;gt;&apos;time&apos; AS DOUBLE) * 1000000 AS time
FROM read_ndjson_objects(&apos;data.json&apos;)
&quot;&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbGetQuery&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;con&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now, the dataframe can be plotted directly and saved as a PDF at the same time:&lt;/p&gt;
&lt;div class=&quot;language-r highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;aes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;geom_bar&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stat&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;identity&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fill&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;steelblue&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;labs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Execution Time of Fibonacci Functions&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Function Type&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Time (microseconds)&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggsave&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;plots/fibonacci.pdf&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;device&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;&apos;pdf&apos;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;width&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;4.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;height&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;m&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src=&quot;/blog/res/fibonacci-plot.svg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Each SQL/plot pair gets its own cell.
And, for convenience, the notebook starts by clearing the plots directory:&lt;/p&gt;
&lt;div class=&quot;language-r highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ggplot2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;library&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;duckdb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;con&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;-&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbConnect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;duckdb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dbExecute&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;con&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;INSTALL json; LOAD json;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;

&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dir.exists&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;plots&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
    &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unlink&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;plots&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recursive&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;TRUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dir.create&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;plots&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This plots directory can then be copied directly into the LaTeX project/Overleaf.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;
This notebook can now be version-controlled, and the data is in a single, human-readable file that is easy to back up.
Additionally, figures can be updated in 30 seconds with a change in data, preprocessing, or plots.&lt;/p&gt;
</description>
        <pubDate>Fri, 06 Sep 2024 00:00:00 +0000</pubDate>
        <link>https://dominik.win/blog/research-plots-workflow/</link>
        <guid isPermaLink="true">https://dominik.win/blog/research-plots-workflow/</guid>
        
        
      </item>
    
      <item>
        <title>Trying to get a phone screen fixed</title>
        <description>&lt;p&gt;I used to use a OnePlus 6T.
One day, the screen stopped accepting touch input.
The Android ecosystem has been a bit slow lately, and I didn’t want to create e-waste, so I tried to get the screen replaced.&lt;/p&gt;

&lt;p&gt;I found a company with a local store, uBreakiFix, that could service my phone.
However, the new screen seemed dull and lifeless.
Worse, it had &lt;em&gt;backlight bleed&lt;/em&gt;.
That should be impossible.
My phone has an AMOLED panel, and OLED panels don’t have backlights.&lt;/p&gt;

&lt;p&gt;Let’s take a look at that screen:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/res/pixel_comparison.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The mystery screen sure does look similar to an LCD, doesn’t it?
It uses an RGB stripe layout, which is common on LCDs and more expensive on OLEDs.&lt;/p&gt;

&lt;p&gt;The larger problem is the use of RGB pixels, whereas almost all OLED phones (OnePlus 6T and LG V30 included) use a &lt;a href=&quot;https://en.wikipedia.org/wiki/PenTile_matrix_family&quot;&gt;PenTile RGBG&lt;/a&gt; layout.
Put simply, PenTile RGBG has half as many red and blue sub-pixels as green ones.&lt;/p&gt;

&lt;p&gt;The screen they used has visible backlight bleed, a pixel layout similar to an S-IPS LCD, and a subpixel geometry nowhere near that of the PenTile display it should be.
It’s a counterfeit.
I got a refund.
And then an iPhone.&lt;/p&gt;
</description>
        <pubDate>Thu, 09 Jun 2022 00:00:00 +0000</pubDate>
        <link>https://dominik.win/blog/phone-screen-fix-counterfeit/</link>
        <guid isPermaLink="true">https://dominik.win/blog/phone-screen-fix-counterfeit/</guid>
        
        
      </item>
    
      <item>
        <title>A CRDT for contact synchronization</title>
        <description>&lt;h3 id=&quot;background&quot;&gt;Background&lt;/h3&gt;

&lt;p&gt;Some things in technology are too important not to standardize.
Contacts and calendars, for example.
They can be represented as &lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc6350&quot;&gt;vCard&lt;/a&gt; and &lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc5545&quot;&gt;iCal&lt;/a&gt; respectively.
These work fine for simple import/export operations, but they do little to keep a user’s data in sync across devices.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc6352&quot;&gt;CardDAV&lt;/a&gt; and &lt;a href=&quot;https://datatracker.ietf.org/doc/html/rfc4791&quot;&gt;CalDAV&lt;/a&gt; are the synchronization protocols built on top of vCard and iCal.
Thanks to them, you can use whatever apps you want on your phone or desktop, and it all synchronizes through the cloud.&lt;/p&gt;

&lt;p&gt;As long as it goes through the central server.&lt;/p&gt;

&lt;p&gt;That central server thing has some problems:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;You have to trust someone to host your data&lt;/li&gt;
  &lt;li&gt;Devices can only communicate through the central server&lt;/li&gt;
  &lt;li&gt;Your protocols can be less robust since the server can fix issues for you&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Email is similar.
IMAP synchronization lets everyone use whatever client they want to while the central server keeps your data.&lt;/p&gt;

&lt;p&gt;Except with email, and instant messaging in general, there is the desire to replace it with a fully distributed system.
&lt;a href=&quot;https://matrix.org/&quot;&gt;Matrix&lt;/a&gt; is the main effort to do so.
The vision is every application or (potentially static) website-based client acting as a peer.
Servers may exist, too, but serve mainly as peers with higher-than-usual availability.
It’s a pure distributed/p2p/web3 solution.&lt;/p&gt;

&lt;p&gt;It’s not the first to do so.
Source code management did this a long time ago.
Flat files are your data format.
CVS and Subversion were the synchronization solutions.
Then git/mercurial made decentralization popular.&lt;/p&gt;

&lt;p&gt;Note that removing the central server isn’t the only benefit.
You get operational benefits in cloud hosting since you can run multiple less-reliable peers.
You can benefit from p2p and mesh networks, so nearby devices could sync even when neither can connect to the internet.
Users also get the benefit of a more generalized merging algorithm; it’s naturally more robust.&lt;/p&gt;

&lt;p&gt;Contacts and calendars &lt;em&gt;may&lt;/em&gt; eventually go this way too, even though the need is less pressing.&lt;/p&gt;

&lt;h3 id=&quot;a-distributed-vcard-list&quot;&gt;A distributed vCard list&lt;/h3&gt;

&lt;p&gt;Unlike source code management, where conflicts between versions are all but guaranteed, Matrix uses &lt;a href=&quot;https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type&quot;&gt;Conflict-free replicated data types&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Each person has their own copy of a data structure, and there is a method for merging any two data structures together whereby any order of merges results in the same output.
This allows for each peer in a system to have a copy and use gossip to propagate changes.
And the gossip that it’s sending is just its current state.
It doesn’t need a push mechanism.&lt;/p&gt;

&lt;p&gt;Consider two servers hosting a file over HTTP.
By periodically reading each other’s values (à la RSS) and merging their data, they can propagate their changes across arbitrary topologies.&lt;/p&gt;
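
&lt;p&gt;As a minimal sketch of that merge property (not Matrix’s actual data model), here is a grow-only set in Python. The peer names and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;merge&lt;/code&gt; helper are made up for illustration; the point is only that merging in any order, or merging the same state twice, gives the same result.&lt;/p&gt;

```python
# Hypothetical grow-only set (G-Set): about the simplest state-based CRDT.
# Each peer gossips its whole state, and merging is plain set union.

def merge(a, b):
    """Join two replica states; union is commutative, associative and idempotent."""
    return a | b

# Three peers that each saw different local changes.
phone = frozenset({"alice"})
laptop = frozenset({"bob"})
server = frozenset({"alice", "carol"})

# Any merge order converges to the same state.
one = merge(merge(phone, laptop), server)
two = merge(server, merge(laptop, phone))

# Receiving stale gossip again changes nothing.
three = merge(one, phone)

print(sorted(one))
```

&lt;p&gt;Because the merge is idempotent, a peer can poll another peer’s full state as often as it likes without any bookkeeping about what it has already seen.&lt;/p&gt;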

&lt;h4 id=&quot;a-single-vcard-entry&quot;&gt;A single vCard entry&lt;/h4&gt;

&lt;p&gt;There are a lot of things you can put in a vCard file.
Luckily, for this purpose, only cardinality matters.
vCard properties fall into four cardinality classes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Exactly 1&lt;/li&gt;
  &lt;li&gt;0 or 1&lt;/li&gt;
  &lt;li&gt;1 or more&lt;/li&gt;
  &lt;li&gt;0 or more&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A different CRDT can be chosen for each of these based on when you find losing data acceptable.
This is the less interesting part of the problem; as long as some logical CRDT is used, it doesn’t matter.&lt;/p&gt;

&lt;p&gt;I’m going to use a last-writer-wins (LWW) register for exactly 1, and the nullable version of this for the 0 or 1 case.
For the remaining two, an observed-remove set (OR-Set), with adds taking precedence over removes, will suffice.&lt;/p&gt;

&lt;p&gt;A very simple example:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ContactEntry&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Lww&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;tel&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ORSet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// insert the remaining thousand vCard keys here&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
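
&lt;p&gt;For illustration, the merge rule of a last-writer-wins register could be sketched in Python as below. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;timestamp&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;peer&lt;/code&gt; fields and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;merge_lww&lt;/code&gt; name are assumptions of this sketch, not part of the design above; a value of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;None&lt;/code&gt; stands in for the nullable 0 or 1 case.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lww:
    value: object   # None models the nullable "0 or 1" case
    timestamp: int  # assumed logical or wall-clock time of the write
    peer: str       # assumed peer id, used only to break timestamp ties


def merge_lww(a, b):
    # The later write wins. Ties fall back to the peer id so that every
    # replica picks the same winner regardless of merge order. This assumes
    # a single peer never writes two different values at the same timestamp.
    return max(a, b, key=lambda r: (r.timestamp, r.peer))


old = Lww("Dom W.", 10, "phone")
new = Lww("Dominik Winecki", 12, "laptop")
cleared = Lww(None, 15, "phone")

print(merge_lww(old, new).value)
print(merge_lww(new, cleared).value)
```

&lt;p&gt;The trade-off is the usual one for this register: the losing concurrent write is silently dropped, which is the kind of data loss you have to decide is acceptable per field.&lt;/p&gt;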

&lt;h4 id=&quot;a-list-of-vcards&quot;&gt;A list of vCards&lt;/h4&gt;

&lt;p&gt;Synchronizing contacts isn’t as simple as synchronizing a single contact.
Contacts can be added and deleted.
They can also be merged.
Mergers are just an update of one and the deletion of another, but since we want changes that haven’t propagated to the current peer to be reflected, a merger needs to be a first-class operation.&lt;/p&gt;

&lt;p&gt;Mergers are hierarchical.
They form a tree.
You can store a tree as a CRDT.
But that’s not necessary here.&lt;/p&gt;

&lt;p&gt;Rather, we can store one logical “contact” as a set of contact entries.
To differentiate between contact entries, each is assigned a random UUID.
Each contact starts with one entry.
When it’s merged with another, they form a new set with the union of their entries.
When merging two replicas, any entry that appears in two different sets causes both sets to be replaced by their union.&lt;/p&gt;

&lt;p&gt;This is a &lt;a href=&quot;https://en.wikipedia.org/wiki/Disjoint-set_data_structure&quot;&gt;Disjoint-set data structure&lt;/a&gt; (i.e., a set of disjoint sets).
It’s a CRDT, or can be implemented as one.
I can’t find anyone else who has shown this, so that might be new.&lt;/p&gt;
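
&lt;p&gt;A rough Python sketch of that merge rule, with short strings standing in for UUIDs and a hypothetical &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;merge_partitions&lt;/code&gt; helper: each replica is a collection of disjoint sets of entry ids, and any sets that share an entry are replaced by their union.&lt;/p&gt;

```python
def merge_partitions(a, b):
    """Merge two replicas, each a collection of disjoint sets of entry ids."""
    merged = []  # invariant: the groups in here are mutually disjoint
    for group in list(a) + list(b):
        group = set(group)
        rest = []
        for other in merged:
            if group.isdisjoint(other):
                rest.append(other)
            else:
                group |= other  # shared entry: replace both with their union
        merged = rest + [group]
    return {frozenset(g) for g in merged}


# The phone merged u1 with u2; the laptop concurrently merged u2 with u3.
phone = {frozenset({"u1", "u2"}), frozenset({"u3"}), frozenset({"u4"})}
laptop = {frozenset({"u1"}), frozenset({"u2", "u3"}), frozenset({"u4"})}

result = merge_partitions(phone, laptop)
print(sorted(sorted(g) for g in result))
```

&lt;p&gt;In this example both merges survive: u1, u2 and u3 end up as one contact, while u4 stays on its own. Merging in the other order, or merging the result with either input again, gives the same partition.&lt;/p&gt;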

&lt;p&gt;A list can be represented as such:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ContactList&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;entries&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Map&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Uuid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ContactEntry&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;contacts&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Set&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Uuid&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// uuids point to entries&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This takes advantage of the mergeable property of CRDTs: from the user perspective, a contact is the merger of each of its entries.
It does mergers for the logical representation, not just the physical representation (as we explicitly don’t want to merge different entries).
I can’t find this used elsewhere in the ecosystem either.&lt;/p&gt;

&lt;p&gt;We can put a tombstone boolean on each of these entry sets to implement deletion.
However, if a contact is deleted on one device while another device merges it with a second contact, the deletion will also remove the second contact once it propagates.
Rather, we can delete at the entry level:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ContactEntry&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Lww&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;tel&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ORSet&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;tombstone&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;TrueWinsBool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;A contact is only considered deleted once all of its entries have a tombstone bit set.
When a client wants to delete a contact, it sets this bit on each of the entries.
The contact will only ever be undeleted if another peer, without having observed the deletion, merged it with a non-deleted contact; this protects the newly merged entry from instant deletion.&lt;/p&gt;
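
&lt;p&gt;The deletion rule can be sketched in Python as follows. The helper names and the plain dictionaries are assumptions made for this example; a real implementation would keep the tombstone inside each entry as in the struct above.&lt;/p&gt;

```python
def merge_tombstone(a, b):
    # True-wins boolean: once any replica has deleted an entry, it stays deleted.
    return a or b


def contact_deleted(contact, tombstones):
    # A logical contact (a set of entry ids) counts as deleted only when
    # every one of its entries carries a tombstone.
    return all(tombstones[uuid] for uuid in contact)


# The phone deletes contact {u1}; the laptop concurrently merges u1 with live u2.
phone_tombstones = {"u1": True, "u2": False}
laptop_tombstones = {"u1": False, "u2": False}

merged_tombstones = {
    uuid: merge_tombstone(phone_tombstones[uuid], laptop_tombstones[uuid])
    for uuid in phone_tombstones
}
merged_contact = frozenset({"u1", "u2"})  # outcome of the contact-set merge

print(contact_deleted(frozenset({"u1"}), phone_tombstones))
print(contact_deleted(merged_contact, merged_tombstones))
```

&lt;p&gt;On the phone alone the contact is deleted, but after both changes propagate the merged contact is still visible because u2 was never tombstoned.&lt;/p&gt;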

&lt;h4 id=&quot;new-things-you-can-now-do&quot;&gt;New things you can now do&lt;/h4&gt;

&lt;p&gt;Collaborative editor CRDTs are the most developed.
They allow for distributed text editing of a single field.
One can imagine embedding one of these for each note associated with a contact, or similar use cases.&lt;/p&gt;

&lt;p&gt;Also, one useful feature many devices have is showing the most recent or most frequently contacted people first in lists.
This could be embedded in their contact to work across platforms.
The simple case of the most recently contacted could be implemented with a timestamp.
A total interaction count could be done with a counter.
To go further, probabilistic structures such as count-min sketches (for approximate frequencies) and HyperLogLog (for approximate distinct counts) can be used to keep track of interactions without embedding too much sensitive data within the contacts themselves.&lt;/p&gt;

&lt;p&gt;Many clients show the user when they last synchronized with the cloud.
The “no single source of truth” model may seem opposed to this; however, it provides a superset of functionality.
You can embed propagation observability of a CRDT into itself; you can pick a node to be the “main” one if you want.
Or, you can be more creative and use any arbitrary standard for consensus.
For example, you could host cloud peers on three different providers and only consider a version “up to date” when two of the three have observed it.&lt;/p&gt;

&lt;p&gt;While the improvements of switching to a distributed architecture are likely marginal, they would be a win for decentralization and user experience.
Similar solutions are desirable for any product that has become a near-commodity.
Currently, that would be contacts, calendars, and photo sync.
In the future, playlists, credential/password stores, browser settings/history/bookmarks, and even user identity could benefit from elegant applications of distributed systems concepts.&lt;/p&gt;
</description>
        <pubDate>Sun, 30 Jan 2022 00:00:00 +0000</pubDate>
        <link>https://dominik.win/blog/crdt-contact-synchronization/</link>
        <guid isPermaLink="true">https://dominik.win/blog/crdt-contact-synchronization/</guid>
        
        
      </item>
    
      <item>
        <title>Transcendental Numbers for Scoring</title>
        <description>&lt;p&gt;I’ve organized a hackathon or two.
Figuring out how to score them is difficult.
Even excluding social aspects, aggregating results is far from an exact science.
However, some methods can enforce additional constraints.
Especially if they don’t have to be simple or practical.&lt;/p&gt;

&lt;h3 id=&quot;reducing-the-probability-of-ties&quot;&gt;Reducing the probability of ties&lt;/h3&gt;

&lt;p&gt;Ties can be problematic in practice at competitions.
Should two teams receive identical scores on every criterion, any fair system must produce a tie.
However, the case of two teams receiving scores that add up to the same final score is an unfortunate possibility that can be minimized.&lt;/p&gt;

&lt;p&gt;The most common form of a combined score is as a linear sum where \(s\) are scores and \(w\) are weights.&lt;/p&gt;

\[\text{Score}=w_1s_1+w_2s_2+ \dots + w_n s_n\]

&lt;p&gt;This is likely to produce ties.
Any single variable in this system (assuming its corresponding multiplicand isn’t 0) can be changed to set the score to any arbitrary value.&lt;/p&gt;

&lt;p&gt;However, there exists an implicit constraint: the weights and scores are all rational numbers. Formally, \(s_i \in \mathbb{Q}\) and \(w_i \in \mathbb{Q}\) for every criterion \(i\). This also implies their product is rational.
While values of \(s\) come from humans, we have full control over the values of \(w\).
Also, \(w_i \neq 0\), since we do not need to include any criteria with no weight.&lt;/p&gt;

&lt;p&gt;By introducing another, irrational, set of multiplicands we can ensure that our output is irrational.&lt;/p&gt;

&lt;p&gt;Consider the case:&lt;/p&gt;

\[\text{Score}=w_1s_1+w_2s_2\sqrt{2}\]

&lt;p&gt;The first component \(w_1s_1\) is rational while the second component \(w_2s_2\sqrt{2}\) is irrational. In every case except \(s_2 = 0\), the Score is irrational.&lt;/p&gt;
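
&lt;p&gt;To spell out why this rules out accidental ties, here is a short derivation added as a sketch; the primed symbols \(s_1'\) and \(s_2'\) denote a second team’s scores and are notation introduced only for this example. Suppose the two teams tie:&lt;/p&gt;

\[w_1s_1+w_2s_2\sqrt{2}=w_1s_1'+w_2s_2'\sqrt{2}\quad\Longrightarrow\quad w_1(s_1-s_1')=w_2(s_2'-s_2)\sqrt{2}\]

&lt;p&gt;The left side is rational, while the right side is irrational unless \(s_2'=s_2\). So \(s_2=s_2'\), and then \(w_1 \neq 0\) forces \(s_1=s_1'\). For instance, with \(w_1=w_2=1\), the score vectors \((3, 5)\) and \((5, 3)\) tie at 8 under the plain sum, but here they give \(3+5\sqrt{2}\approx 10.07\) and \(5+3\sqrt{2}\approx 9.24\).&lt;/p&gt;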

&lt;p&gt;Adding more terms requires a string of irrational numbers where each number can’t be constructed through a (rational) linear sum of the other irrational numbers.
These irrational numbers must not be a rational multiple of any other number used, i.e., you can’t use \(\sqrt{2}\) and \(2\sqrt{2}\), or anything similar as it’s trivial to cancel out the terms.
However, other multipliers do work, such as powers of 2 with exponents \(k/n\) in the range \([0, 1)\):&lt;/p&gt;

\[\text{Score}=w_1s_1\cdot 2^{0/n}+w_2s_2\cdot 2^{1/n}+ \dots + w_n s_n\cdot 2^{(n-1)/n}\]

&lt;p&gt;Here, for any finite number of criteria \(n\), and given the constraints on values of \(w\) and \(s\), no two Scores are equal unless the two teams received the same score on every criterion.
This is the simple solution to the original problem when using linear sums.
However, one could use the even stricter set of transcendental numbers instead.
\(e^x\) is transcendental (which guarantees irrationality) for any non-zero algebraic value \(x\).&lt;/p&gt;

&lt;p&gt;Consider the case:&lt;/p&gt;

\[\text{Score}=w_1s_1e^1+w_2s_2e^2+ \dots + w_n s_ne^n\]

&lt;p&gt;This solves the same problem, but there is a more general form we can target.
This approach requires the score to be a linear sum against vector \(s\), but it’s possible to support an entire class of functions instead.
Any non-constant &lt;a href=&quot;https://en.wikipedia.org/wiki/Algebraic_function&quot;&gt;algebraic function&lt;/a&gt; taking a single transcendental number resolves to a transcendental number.
This can be extended to the multi-variable case by using any arbitrary algebraic function in the form \(f(s_1T_1,\space s_2T_2,\space\dots)\) by choosing some special values of vector \(T\).
Note the values of \(w\) are no longer necessary as they may be embedded within \(f\).&lt;/p&gt;

&lt;p&gt;The constraint on \(T\) isn’t just to find transcendental numbers.
The previous case of using \(e\) and \(e^2\) could trivially be “undone” by taking the square root of the second and dividing it by the first to produce a rational result.
Rather, we need \(T\) to be a set of transcendental numbers that are all algebraically independent over the rational numbers.
A few such sets are known, and there is one method of producing these sets at any arbitrary size: the &lt;a href=&quot;https://en.wikipedia.org/wiki/Lindemann%E2%80%93Weierstrass_theorem&quot;&gt;Lindemann–Weierstrass theorem&lt;/a&gt;.
This shifts the problem to finding a set of algebraic numbers which are linearly independent over the rationals.
Which, coincidentally, is the exact problem solved by the first solution to the simple case.
Using these we can generate a set of transcendental numbers that are algebraically independent over the rationals:&lt;/p&gt;

\[T = [e^{2^{0/n}}, e^{2^{1/n}}, e^{2^{2/n}}, \dots, e^{2^{(n-1)/n}}]\]

&lt;p&gt;Provided each \(s_i\) is a non-zero rational, the set \(\{s_1T_1,\space s_2T_2,\space\dots\}\) will also be algebraically independent over the rationals, and each element will be transcendental (an element with \(s_i = 0\) is simply zero).&lt;/p&gt;

&lt;p&gt;Therefore, for &lt;strong&gt;&lt;em&gt;any&lt;/em&gt;&lt;/strong&gt; non-constant algebraic function \(f\), the value \(f(s_1T_1,\space s_2T_2,\space\dots,\space s_nT_n)\) will be transcendental, or zero if all values of \(s\) are zero.
Of course, values of \(s\) still need to be rational.&lt;/p&gt;

&lt;p&gt;Rather than using an arbitrary tie-breaking method, this method inherits all the ordering properties from the scoring function \(f\).
Practically, this forms an injective (one-to-one) function mapping a vector of rationals \(s\) to the real numbers.
Since real numbers are ordered, there must be one vector of scores which is the highest within any finite set.&lt;/p&gt;
</description>
        <pubDate>Sat, 01 Jan 2022 00:00:00 +0000</pubDate>
        <link>https://dominik.win/blog/transcendental-numbers-for-scoring/</link>
        <guid isPermaLink="true">https://dominik.win/blog/transcendental-numbers-for-scoring/</guid>
        
        
      </item>
    
      <item>
        <title>Laptop battery life data</title>
        <description>&lt;p&gt;Back in 2016, I wrote a daemon that grabbed the system battery stats every second and dumped it into a CSV.&lt;/p&gt;

&lt;p&gt;I ran it on my laptop for 99 days:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/res/battery-chart.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Major takeaways:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Cheap replacement lithium-ion batteries have a really limited number of cycles.&lt;/li&gt;
  &lt;li&gt;Linux laptop power management still has a ways to go.&lt;/li&gt;
  &lt;li&gt;Pulling battery stats and writing a file once per second isn’t great for power management.&lt;/li&gt;
&lt;/ol&gt;
</description>
        <pubDate>Mon, 05 Oct 2020 00:00:00 +0000</pubDate>
        <link>https://dominik.win/blog/laptop-battery-data/</link>
        <guid isPermaLink="true">https://dominik.win/blog/laptop-battery-data/</guid>
        
        
      </item>
    
      <item>
        <title>Replacing malloc with rand</title>
        <description>&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;malloc&lt;/code&gt; function is fairly simple.
Either give me a pointer to some memory I can use or give me NULL if I’ve asked for too much.&lt;/p&gt;

&lt;p&gt;And on Linux, that’s exactly what happens:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;cp&quot;&gt;#include &amp;lt;stdio.h&amp;gt;&lt;/span&gt;
&lt;span class=&quot;cp&quot;&gt;#include &amp;lt;stdlib.h&amp;gt;&lt;/span&gt;

&lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;malloc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// ptr = 0x563bd6707260&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;malloc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000000000000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;// ptr = 0x0&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;printf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;%p&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;However, something interesting happens when you try this on a Mac:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/res/replacing-malloc-with-rand/osx_large_ptr.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;It works…&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;but why?
Surely this 8 GB Mac isn’t going to find a free terabyte lying around.&lt;/p&gt;

&lt;p&gt;Clearly it’s mistaken, and as soon as we write to it it’ll segfault:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;size_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And… it got to 64GB before getting killed:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/res/replacing-malloc-with-rand/ram_usage.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;It didn’t even put a dent in the system resources:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/res/replacing-malloc-with-rand/zero_resources.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Okay, so it looks like either macOS is discarding writes to memory, or it realized it’s all zeros so it’s compressing them.
Let’s try again with some higher entropy data:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;size_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;++&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It still gets to 64GB, but not without swapping to disk:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/res/replacing-malloc-with-rand/rand_resources.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/res/replacing-malloc-with-rand/swap.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;So it looks like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;malloc&lt;/code&gt; actually did return a terabyte of virtual address space, even if we won’t be able to use all of it.&lt;/p&gt;

&lt;p&gt;It’s pretty clear what’s happening here: macOS is just filling in the memory pages when we need them, not when we get assigned them.
Or as it’s known formally, &lt;a href=&quot;https://en.wikipedia.org/wiki/Demand_paging&quot;&gt;demand paging&lt;/a&gt;.
There are plenty of benefits to this type of memory management, especially for laptops.
This is what lets a relatively low-spec MacBook run Docker, Chrome, and a handful of Electron apps all at once while still being usable.&lt;/p&gt;
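
&lt;p&gt;The same effect can be poked at from Python with an anonymous memory mapping. This is only a sketch under assumptions: the 256 MiB size is arbitrary, and how much physical memory actually gets committed depends on the operating system, so the code checks only what it can observe directly (the reservation succeeds and untouched pages read as zero).&lt;/p&gt;

```python
import mmap

PAGE = mmap.PAGESIZE
SIZE = 256 * 1024 * 1024  # reserve 256 MiB of anonymous memory

# Anonymous mapping: the address space is reserved now, but on systems with
# demand paging, physical pages are typically supplied only when touched.
region = mmap.mmap(-1, SIZE)
length = len(region)

# Touch one byte in each of four widely spaced pages.
touched = [0, SIZE // 4, SIZE // 2, SIZE - PAGE]
for offset in touched:
    region[offset] = 0xFF

# A page that was never written still reads back as zero.
untouched_byte = region[PAGE]
first_byte = region[0]

print(length, first_byte, untouched_byte)
region.close()
```

&lt;p&gt;Only the four touched pages need real backing here; the rest of the mapping is just address space until something writes to it.&lt;/p&gt;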

&lt;p&gt;So, if a terabyte works, &lt;em&gt;how large can we go?&lt;/em&gt;
It’s a 64-bit system, so \(2^{64}\), right?&lt;/p&gt;

&lt;p&gt;Well, modern x86_64 CPUs &lt;a href=&quot;https://en.wikipedia.org/wiki/X86-64#Virtual_address_space_details&quot;&gt;only support 48-bit memory addresses&lt;/a&gt;, since 256TB is enough for most people.
And sure enough, allocating 100TB works perfectly fine, but trying to allocate a petabyte causes something to break:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/res/replacing-malloc-with-rand/petabyte_crash.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;But this also has one other benefit: it lets you try out how programs could run with huge amounts of RAM.&lt;/p&gt;

&lt;h3 id=&quot;rand-as-an-allocator&quot;&gt;rand() as an allocator&lt;/h3&gt;

&lt;p&gt;If you have enough memory in a system you can just pick some number at random and hope it works, right?&lt;/p&gt;

&lt;p&gt;To put this theory to the test I pulled a copy of &lt;a href=&quot;https://github.com/antirez/redis&quot;&gt;Redis&lt;/a&gt;, an in-memory database, and started replacing its allocator.&lt;/p&gt;

&lt;p&gt;Redis uses one of three allocators, set at compile-time, and wraps them in a custom wrapper as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;zmalloc&lt;/code&gt;, so changing one place would change it for the whole project.&lt;/p&gt;

&lt;p&gt;To start, let’s get 100TB, and then just pick a spot somewhere in that:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;zmalloc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;size_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// printf(&quot;zmalloc size=%lu\n&quot;, size);&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;static&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;malloc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100000000000000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;kt&quot;&gt;size_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rand64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;size_t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;size_t&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;rand64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;%=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;100000000000000&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rand64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;There’s also &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;calloc&lt;/code&gt;, but let’s assume there are zeros there anyway:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;zcalloc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;size_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zmalloc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;realloc&lt;/code&gt; can get a little weird, so let’s try to keep most of the original behavior:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;zrealloc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;size_t&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;zfree&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;NULL&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zmalloc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And finally,&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;zfree&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;So does it work?&lt;/em&gt; &lt;strong&gt;Yes!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/res/replacing-malloc-with-rand/redis_running.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;It even accepts user connections through &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;redis-cli&lt;/code&gt;!
Sure, once you try to run a remote command you get a segfault, but you can’t win them all.&lt;/p&gt;

&lt;p&gt;It takes over 13,000 &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;zmalloc&lt;/code&gt; calls to get a server running, and it works!&lt;/p&gt;

&lt;p&gt;Now let’s see if it passes tests:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/res/replacing-malloc-with-rand/redis_tests.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Hey, passing the first 20 tests, not too bad!
There were some data structure tests in those.
It didn’t even fail due to a segfault or anything; it just can’t check how large an allocated block was.&lt;/p&gt;

&lt;h3 id=&quot;did-that-really-need-100tb-of-space&quot;&gt;Did that &lt;em&gt;really&lt;/em&gt; need 100TB of space?&lt;/h3&gt;

&lt;p&gt;Couldn’t this have worked with just 8GB?&lt;/p&gt;

&lt;p&gt;If the goal is to avoid memory collisions, then what is the chance a duplicate pointer gets returned?
This exact situation gets covered by the &lt;a href=&quot;https://en.wikipedia.org/wiki/Birthday_problem&quot;&gt;birthday problem&lt;/a&gt;.
If we have \(k = 50000\) separate allocations and \(n=\text{100 trillion}\) then the chance of a collision can be found with this formula:&lt;/p&gt;

\[1-\left(\frac{n-1}{n}\right)^{\frac{(k^2-k)}{2}}\]

&lt;p&gt;Given 100TB of address space, the chance of a collision across 50k allocations is 0.00125%.
With 8GB it’s 14.47%.
And this only accounts for exact pointer collisions; every allocation spans more than one address, so overlapping blocks are more likely than these numbers suggest.&lt;/p&gt;
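
&lt;p&gt;As a rough check, here is a minimal Python sketch of that formula.
The function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;collision_probability&lt;/code&gt; is made up for this example, and it assumes 100TB and 8GB mean decimal byte counts (\(10^{14}\) and \(8 \times 10^9\) addresses), which is what reproduces the percentages quoted above.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import math


def collision_probability(k, n):
    # Chance that at least two of k uniformly random picks
    # from n possible addresses coincide, using the same
    # approximation as the formula above:
    #   1 - ((n - 1) / n) ** ((k*k - k) / 2)
    pairs = (k * k - k) / 2
    # log1p and expm1 keep precision when 1/n is tiny.
    return -math.expm1(pairs * math.log1p(-1.0 / n))


# Assumption: 100 TB and 8 GB are decimal byte counts.
p_100tb = collision_probability(50_000, 100e12)
p_8gb = collision_probability(50_000, 8e9)
print(round(100 * p_100tb, 5), round(100 * p_8gb, 2))
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Printed as percentages, the two values should come out at roughly 0.00125 and 14.47.&lt;/p&gt;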

&lt;p&gt;And all of this is before considering that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rand()&lt;/code&gt; isn’t exactly cryptography-grade randomness either.&lt;/p&gt;

&lt;p&gt;So, if you are picking your pointers at random, at the very least, taking advantage of demand paging gives you a little more peace of mind.&lt;/p&gt;
</description>
        <pubDate>Mon, 19 Aug 2019 00:00:00 +0000</pubDate>
        <link>https://dominik.win/blog/replacing-malloc-with-rand/</link>
        <guid isPermaLink="true">https://dominik.win/blog/replacing-malloc-with-rand/</guid>
        
        
      </item>
    
      <item>
        <title>NCUR19 Trip Report</title>
        <description>&lt;p&gt;&lt;img src=&quot;/blog/res/ncur19/header.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;I’m thrilled that this year I had the opportunity to present &lt;a href=&quot;/scriptwriter/&quot;&gt;my research&lt;/a&gt; at the &lt;a href=&quot;http://www.cur.org/what/events/students/ncur/2019/&quot;&gt;National Conference on Undergraduate Research&lt;/a&gt; at Kennesaw State University.&lt;/p&gt;

&lt;h1 id=&quot;day-1&quot;&gt;Day 1&lt;/h1&gt;

&lt;p&gt;NCUR19 kicked off with a plenary session featuring a &lt;a href=&quot;https://youtu.be/d2YC3Yi6PlM?t=1176&quot;&gt;keynote on whale sharks&lt;/a&gt;.
Thanks to the Georgia Aquarium, which has the only captive whale sharks in the US, Atlanta is the front-runner on their research.
Even though a handful are kept in captivity, there is still very little known about them, like how deep they go on their deep-ocean dives.
That last question may soon be answered since a custom tag has been deployed that won’t max out at 2000m.
I didn’t know much about whale sharks going into this, but I left with my calendar marked for the day that tracking tag is programmed to come off of one.&lt;/p&gt;

&lt;h1 id=&quot;research-highlights&quot;&gt;Research Highlights&lt;/h1&gt;

&lt;p&gt;Following the plenary session was the start of the poster presentations.
There are at least a hundred posters per session so I can only mention some of the (mostly computer science) highlights that I found memorable:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Autonomous Payload Return Vehicle&lt;/strong&gt;:
When weather balloons drop their payloads with traditional parachutes they have a tendency to land in the worst places.
This project uses a parafoil controlled with servos to steer the payload to a target.
Being able to pick up your data from an empty field instead of climbing a roof or trekking through water makes atmospheric research more accessible.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Graphene Filters&lt;/strong&gt;:
This research was into the effectiveness of graphene-based filters for organic pollutants.
What was surprising to me was how the graphene filters could be cheaper than activated carbon filters in the long term, since they are used up much more slowly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monte Carlo N Particle Semiconductor Detector Simulations&lt;/strong&gt;:
I don’t really have the background needed to understand the chemistry/physics behind this one, but this project is evaluating alternative materials for detecting radiation since the currently used ones are expensive and hard to get.
The software used came from Oak Ridge National Lab and required the project to justify its need to the federal government.
It was a really impressive simulation with a clear market need and good results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Datamining to find Cyberattack Trends&lt;/strong&gt;:
This was a study into a cybersecurity attack database from a federal law enforcement agency.
Things such as SQL injection and buffer overflows stayed generally constant with occasional spikes on major vulnerabilities.
However, there were large shifts to more focused attacks on edge devices since those provide the highest ROI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Machine Learning for finding Malicious URLs&lt;/strong&gt;:
Due to the short lifetime of malicious URLs, they are purchased in large quantities.
Because of this, most use extremely repetitive and predictable structures.
Machine learning can effectively classify these URLs.
This can be combined with other methods to further increase accuracy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Grip Strength Among Sonographers&lt;/strong&gt;:
Sonographers are at a massive risk for repetitive strain injury.
Increased grip strength can prevent this, but sonographers already have twice that of the average person.
By instead training related muscles that aren’t regularly stressed, further gains can be made to reduce the risk of injury.
On a side note, sonographers would make excellent rock climbers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pesticides on Spiders&lt;/strong&gt;:
This was a study into how pesticides impact non-target species, specifically wolf spiders.
Many chemical treatments are marketed as having no effects on other species.
This research showed that they cause clear damage even at doses far below what is commonly used.&lt;/p&gt;

&lt;h1 id=&quot;day-2&quot;&gt;Day 2&lt;/h1&gt;

&lt;p&gt;This was the day when I presented &lt;a href=&quot;/scriptwriter/&quot;&gt;my research&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/res/ncur19/poster.jpg&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;It went well!
A lot more people showed up than the last time I presented.
The nice thing was that no matter a person’s background I was able to find a relatable use-case for my system.&lt;/p&gt;

&lt;h1 id=&quot;day-3&quot;&gt;Day 3&lt;/h1&gt;

&lt;p&gt;Following the final two poster sessions, the conference ended with another plenary session, this time with a keynote on Lockheed Martin Skunk Works.
It covered everything from the U-2 and SR-71 all the way to aircraft of the coming decades (including possible commercial supersonic flight!).
The presentation, along with the others, had a focus on the importance of engineering in other fields.&lt;/p&gt;

&lt;h1 id=&quot;takeaways&quot;&gt;Takeaways&lt;/h1&gt;

&lt;p&gt;NCUR does a really good job of promoting a diverse set of research from all across the US.
And, despite being the only student going from OSU, Ohio was certainly not lacking in representation.
No matter where I went there was always someone from Miami, C-State, or Capital nearby.&lt;/p&gt;

&lt;p&gt;But regardless of where people came from, everyone had the same passion for their research and enthusiasm to learn about the work of others.&lt;/p&gt;

&lt;p&gt;In addition to a list of projects to follow up on, I left this conference with a better understanding of what goes into research and its greater impact on the world.&lt;/p&gt;
</description>
        <pubDate>Sat, 20 Apr 2019 00:00:00 +0000</pubDate>
        <link>https://dominik.win/blog/ncur19-trip-report/</link>
        <guid isPermaLink="true">https://dominik.win/blog/ncur19-trip-report/</guid>
        
        
      </item>
    
      <item>
        <title>Programming Swerve Drive</title>
        <description>&lt;p&gt;Swerve drives are fun.
While sometimes impractical, they check off more boxes than any other drivetrain.
Not only can you drive them in any direction, but you can also rotate independently, all while using conventional wheels aimed in the optimal direction for force.
Unlike mecanum wheels, which have found use in forklifts, swerve is almost exclusively used by FIRST robotics groups.&lt;/p&gt;

&lt;div style=&quot;text-align: center;&quot;&gt;
    &lt;a href=&quot;https://www.youtube.com/watch?v=BoRbbkOKHYE&quot;&gt;
        &lt;video style=&quot;position: relative; padding-bottom: 56.25; overflow: hidden; max-width: 100%; align-content: center;&quot; autoplay=&quot;&quot; loop=&quot;&quot; muted=&quot;&quot; playsinline=&quot;&quot;&gt;
            &lt;source src=&quot;/blog/res/swerve/example_video.mp4&quot; type=&quot;video/mp4&quot; /&gt;
        &lt;/video&gt;
    &lt;/a&gt;
&lt;/div&gt;

&lt;h2 id=&quot;mathematical-model&quot;&gt;Mathematical Model&lt;/h2&gt;

&lt;p&gt;A swerve drive takes two inputs for control: the desired translation and rotation.
This maps to the kinematic quantities of a velocity vector and an angular velocity, which I’ll call \(\vec{v}\) (m/s) and \(\omega\) (rad/s).
The outputs are actually motor values for 2x the number of modules (for pivot and drive motors), but for now, let’s abstract this away and pretend every module takes a vector.&lt;/p&gt;

&lt;p&gt;Here is where this definition will diverge from many other implementations.
We will solve this in the general case, so \(n\) modules at arbitrary locations.
Most implementations have 4 modules in a rectangle around a center, but the general case is both closer to the underlying physics and (at least to me) easier to implement well in modern languages.&lt;/p&gt;

&lt;p&gt;Now, since we have angular velocity we need a reference point.
If we assume an origin exists at \((0,0)\) we can define the location of each module relative to that.
Therefore module \(n\) is located at \(\vec{m}_n\) relative to the origin (in meters).&lt;/p&gt;

&lt;p&gt;With this we can define a simple frame like this with just four vectors around a center point:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/res/swerve/square_frame.svg&quot; alt=&quot;Square Frame&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This allows for what is essentially the fundamental formula of swerve drive:&lt;/p&gt;

\[\vec{\text{output}}_n = \vec{v} + \omega\cdot\text{perpendicular}(\vec{m}_n)\]

&lt;p&gt;This doesn’t specify whether you need the clockwise or counter-clockwise perpendicular function, but as long as it agrees with \(\omega\) it doesn’t matter.&lt;/p&gt;
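
&lt;p&gt;As a minimal sketch of that formula, here it is in plain Python with the counter-clockwise convention picked arbitrarily.
The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;perpendicular&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;module_outputs&lt;/code&gt; and the 0.5 m square frame are assumptions made for illustration, not taken from any particular robot code.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;def perpendicular(m):
    # Counter-clockwise perpendicular: rotate (x, y) by +90 degrees.
    # Pair this with an omega that is positive counter-clockwise.
    x, y = m
    return (-y, x)


def module_outputs(v, omega, modules):
    # output_n = v + omega * perpendicular(m_n), one vector per module.
    outputs = []
    for m in modules:
        px, py = perpendicular(m)
        outputs.append((v[0] + omega * px, v[1] + omega * py))
    return outputs


# Hypothetical square frame with modules at the corners (meters).
square = [(0.5, 0.5), (-0.5, 0.5), (-0.5, -0.5), (0.5, -0.5)]
spin_in_place = module_outputs((0.0, 0.0), 1.0, square)
drive_forward = module_outputs((0.0, 1.0), 0.0, square)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With no rotation every module simply gets \(\vec{v}\), and with no translation each module gets a vector tangent to the circle it sits on.&lt;/p&gt;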

&lt;p&gt;&lt;img src=&quot;/blog/res/swerve/basic_swerve_square.svg&quot; alt=&quot;Basic Swerve Square Frame&quot; /&gt;&lt;/p&gt;

&lt;p&gt;By using this general form of swerve drive we get support for much more powerful operations simply by changing a few variables.
For example, just by changing the module locations in \(\vec{m}\) you get support for arbitrary shapes:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/res/swerve/hex.svg&quot; alt=&quot;Hex Frame&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Also, since the perpendicular function maintains length we get rotational scaling for free:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/res/swerve/fanout.svg&quot; alt=&quot;Swerve Rotation Fanout&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The final benefit of this model is that the center of rotation is arbitrary.
Even if most of the time a center-based rotation is desired, the center can be moved anywhere on the 2D plane if needed.&lt;/p&gt;

&lt;h2 id=&quot;necessary-upgrades&quot;&gt;Necessary Upgrades&lt;/h2&gt;

&lt;p&gt;Below the high-level kinematics model, a basic implementation can just be a Cartesian-to-polar conversion.
However, there are a few parts that stray from the theoretical model that allow for better control using physical motors.&lt;/p&gt;

&lt;h3 id=&quot;speed-normalization&quot;&gt;Speed Normalization&lt;/h3&gt;

&lt;p&gt;To get the best performance out of the drive train you should be running it at a speed where it occasionally maxes out the motors.
For example, the module output vector lengths at some point could be &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[2.30584285 1.80334026 0.55053495 1.538819]&lt;/code&gt;.
Since a motor should never be set to over full power, there needs to be defined behavior on how to handle this problem.&lt;/p&gt;

&lt;h4 id=&quot;option-1-clamping&quot;&gt;Option 1: Clamping&lt;/h4&gt;

&lt;p&gt;The naive method is clamping and is most likely what will happen by default if speed normalization is ignored.
Here this results in values of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[1.0 1.0 0.55053495 1.0]&lt;/code&gt;.
This distorts the final direction, so this should never be used.&lt;/p&gt;

&lt;h4 id=&quot;option-2-pre-normalize&quot;&gt;Option 2: Pre-normalize&lt;/h4&gt;

&lt;p&gt;This method relies on finding the highest possible speed and scaling down each value accordingly if it is greater than 1.
The highest possible speed always occurs when the translation and rotation components point in the same direction, so we can just add their magnitudes.&lt;/p&gt;

\[\text{scalar}=\min\left(\frac{1}{\left|\vec{r}_\text{max}\right| + \left|\vec{v}\right|}, 1.0\right)\]

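&lt;p&gt;This relies on knowing \(\left|\vec{r}_\text{max}\right|\), the magnitude of the largest rotation component \(\omega\cdot\text{perpendicular}(\vec{m}_n)\), which always belongs to the module furthest from the center of rotation.&lt;/p&gt;

\[\left|\vec{r}_\text{max}\right| = |\omega|\,\left|\vec{m}_\text{furthest}\right|\]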

&lt;p&gt;This approach is the simplest and is guaranteed to never exceed 1, but it may be slightly too conservative.
For example, this results in values of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[0.99638291 0.77924539 0.23789289 0.66494252]&lt;/code&gt;, where the fastest module ends up very close to full power.
While only \(0.4\%\) is lost in this example, it may be more pronounced in others.&lt;/p&gt;
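
&lt;p&gt;A minimal Python sketch of this scalar might look like the following.
The name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pre_normalize_scalar&lt;/code&gt; is hypothetical, and it assumes the inputs are already in units where a module speed of 1.0 means full power.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import math


def pre_normalize_scalar(v, omega, modules):
    # Worst case: rotation and translation line up on the module
    # furthest from the center, so their magnitudes simply add.
    furthest = max(math.hypot(*m) for m in modules)
    worst_case = abs(omega) * furthest + math.hypot(*v)
    if worst_case == 0.0:
        # Nothing is being commanded, so no scaling is needed.
        return 1.0
    # Only ever scale down, never up.
    return min(1.0 / worst_case, 1.0)


# Hypothetical square frame with modules at the corners.
square = [(0.5, 0.5), (-0.5, 0.5), (-0.5, -0.5), (0.5, -0.5)]
fast = pre_normalize_scalar((0.0, 1.0), 2.0, square)
slow = pre_normalize_scalar((0.1, 0.0), 0.1, square)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Multiplying both \(\vec{v}\) and \(\omega\) by the returned scalar before computing the module vectors keeps every output at or below 1.&lt;/p&gt;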

&lt;h4 id=&quot;option-3-post-normalize&quot;&gt;Option 3: Post-normalize&lt;/h4&gt;

&lt;p&gt;This approach just takes all the speeds and divides them by the highest, if it’s larger than one.
In the example, this produces values of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[1. 0.78207422 0.23875649 0.6673564]&lt;/code&gt;.
This method produces the best values but can be a little weird to implement if you separate each module into its own object since they would all need to communicate their values.&lt;/p&gt;
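
&lt;p&gt;To make the difference between clamping and post-normalizing concrete, here is a small Python sketch run on the example lengths from above.
The function names are made up for this example, and the speeds are assumed to be non-negative vector lengths.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;def clamp(speeds):
    # Option 1: cap each speed on its own, which changes the ratios
    # between modules and so distorts the direction of travel.
    return [min(s, 1.0) for s in speeds]


def post_normalize(speeds):
    # Option 3: divide every speed by the largest one, but only when
    # that largest speed is over full power.
    peak = max(1.0, max(speeds))
    return [s / peak for s in speeds]


lengths = [2.30584285, 1.80334026, 0.55053495, 1.538819]
clamped = clamp(lengths)
scaled = post_normalize(lengths)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Because every module is divided by the same number, the ratios between the modules survive, which is exactly what clamping throws away.&lt;/p&gt;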

&lt;h3 id=&quot;direction-flipping&quot;&gt;Direction Flipping&lt;/h3&gt;

&lt;p&gt;Turning a module to perfectly match the target vector is often unnecessary since the opposite vector with a reversed speed accomplishes the same thing.
Taking this into account, a module should never have to rotate more than 90 degrees to reach its target.&lt;/p&gt;

&lt;p&gt;This can be implemented with a simple if statement, but there is a better option using vectors that also fixes the stray module problem.&lt;/p&gt;

&lt;h3 id=&quot;stray-module-problem&quot;&gt;Stray Module Problem&lt;/h3&gt;

&lt;p&gt;A common problem is one module taking a different path from the rest.
This results in the wheels fighting each other and stalling.&lt;/p&gt;

&lt;p&gt;While this isn’t too bad for a single target, when the target is constantly changing this can become a large proportion of the total runtime.&lt;/p&gt;

&lt;center&gt;
&lt;video style=&quot;position: relative; padding-bottom: 56.25; overflow: hidden; max-width: 100%; align-content: center;&quot; autoplay=&quot;&quot; loop=&quot;&quot; muted=&quot;&quot; playsinline=&quot;&quot;&gt;
    &lt;source src=&quot;/blog/res/swerve/stray_module.mp4&quot; type=&quot;video/mp4&quot; /&gt;
&lt;/video&gt;
&lt;/center&gt;

&lt;h4 id=&quot;option-1-mitigation&quot;&gt;Option 1: Mitigation&lt;/h4&gt;

&lt;p&gt;The simplest option is to slow down the modules when they are pointing in the wrong direction.
A quick and easy way to do this is to multiply them by the cosine of the angle difference, \(\theta\), which is easy to do with vectors:&lt;/p&gt;

\[\cos\theta=\frac{\vec{\text{target}}\cdot\vec{\text{current}}}{|\vec{\text{target}}|\ |\vec{\text{current}}|}\]

&lt;p&gt;If a more aggressive limit is needed, this scalar can be raised to a power, which keeps the result within \([-1, 1]\).
Here is an example of speed scaled by \(\cos(\theta)\) and \(\cos(\theta)^3\):&lt;/p&gt;

&lt;center&gt;
&lt;video style=&quot;position: relative; padding-bottom: 56.25; overflow: hidden; max-width: 100%; align-content: center;&quot; autoplay=&quot;&quot; loop=&quot;&quot; muted=&quot;&quot; playsinline=&quot;&quot;&gt;
    &lt;source src=&quot;/blog/res/swerve/stray_module_cos.mp4&quot; type=&quot;video/mp4&quot; /&gt;
&lt;/video&gt;
&lt;/center&gt;

&lt;p&gt;Another benefit to cosine scaling is that it also will take care of reversing the drive wheel when needed because the resulting cosine will be negative.
The only drawback is that you can’t directly use even exponents, since they discard the sign, but that’s pretty much irrelevant.&lt;/p&gt;
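
&lt;p&gt;A minimal Python sketch of cosine scaling, assuming each module exposes its target and current direction as 2D vectors.
The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cosine_scale&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;drive_speed&lt;/code&gt; are hypothetical, and the exponent is assumed to be an odd integer.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import math


def cosine_scale(target, current, exponent=1):
    # cos(theta) between the target and current module vectors,
    # computed from the dot product as in the formula above.
    dot = target[0] * current[0] + target[1] * current[1]
    norms = math.hypot(*target) * math.hypot(*current)
    if norms == 0.0:
        # No direction to compare against, so command no speed.
        return 0.0
    # Odd exponents keep the sign, so a module facing backwards
    # gets a negative scale and its drive wheel reverses.
    return (dot / norms) ** exponent


def drive_speed(target, current, exponent=1):
    # Drive motor command: target speed times the cosine scale.
    return math.hypot(*target) * cosine_scale(target, current, exponent)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;A module that is 90 degrees off gets no drive power at all, and one pointing the opposite way gets the full speed in reverse, so no separate flip check is needed.&lt;/p&gt;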

&lt;h4 id=&quot;option-2-explicit-avoidance&quot;&gt;Option 2: Explicit Avoidance&lt;/h4&gt;

&lt;p&gt;Another option is to explicitly avoid having modules take different paths.
Essentially, “if one module is going to fight the others, have it follow them instead”.
This can only be accomplished with some pretty ugly hacks, but it can be done.&lt;/p&gt;

&lt;p&gt;One way of doing this is to shift the module flip windows.
When each module is deciding which way to go it looks for the shortest path to being in line with its target.
This creates two 180 degree windows, and the one it currently falls on decides which direction it is going to use.
If you take the average of the module rotations (or really their derivatives) and shift this window proportionally in the opposite direction, each module effectively follows the rest when it’s close enough.
This introduces another constant for how large this shift is.
It would need to be tuned to be just larger than the usual error in rotation so it can capture most of the stray modules.&lt;/p&gt;

&lt;h2 id=&quot;going-further&quot;&gt;Going Further&lt;/h2&gt;

&lt;p&gt;Swerve drive gets programmed with a simple physical model assuming perfect inputs.
However, swerve control is not a problem whose ideal solution can be derived, or even expressed, with conventional mathematical models.&lt;/p&gt;

&lt;p&gt;A perfect control system would take into account these three separate factors:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Time series control&lt;/li&gt;
  &lt;li&gt;Noisy I/O&lt;/li&gt;
  &lt;li&gt;Full robot kinematics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each of these makes the final algorithms more complex.
The final two are soft-computing problems as well, so there may not necessarily be clear ways to improve on them.
So is this as good as it gets?
In practice, yes.&lt;/p&gt;

&lt;p&gt;But there is another option to take it further: neural networks.&lt;/p&gt;

&lt;p&gt;Swerve, at least in 2D, is really just a function that takes three numbers (\(\vec{v}_x\), \(\vec{v}_y\), and \(\omega\)) plus \(n\) encoder inputs, and produces \(2n\) motor outputs for the drive and pivot motor speeds.
And \(n\) is almost always 3 or 4.
So with four modules that is seven linear inputs and eight linear outputs, and almost all the operations are additions and multiplications.
This is just about the perfect use case for a small neural network.&lt;/p&gt;

&lt;p&gt;However, this doesn’t take into account time series data, since it’s still stateless.
To fix this we can give the network some memory, specifically as an &lt;a href=&quot;https://en.wikipedia.org/wiki/Long_short-term_memory&quot;&gt;LSTM&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Training this certainly would not be easy, and testing it would be even harder, but this is the best option to move forward.&lt;/p&gt;

&lt;h3 id=&quot;source-code&quot;&gt;Source code&lt;/h3&gt;

&lt;p&gt;All of the graphics here were created from this &lt;a href=&quot;/blog/res/swerve/SwerveDrive.ipynb&quot;&gt;Jupyter notebook&lt;/a&gt; (&lt;a href=&quot;/blog/res/swerve/SwerveDrive.html&quot;&gt;HTML&lt;/a&gt;).&lt;/p&gt;
</description>
        <pubDate>Mon, 15 Apr 2019 00:00:00 +0000</pubDate>
        <link>https://dominik.win/blog/programming-swerve-drive/</link>
        <guid isPermaLink="true">https://dominik.win/blog/programming-swerve-drive/</guid>
        
        
      </item>
    
      <item>
        <title>TCP Terminals with MATLAB</title>
        <description>&lt;p&gt;The OSU Freshman Engineering program concludes with a Software Design Project where students compete to build the best MATLAB games they can.
Since there was no way I’d win on UI, I took a stab at building a multiplayer version of Battleship.&lt;/p&gt;

&lt;h2 id=&quot;background&quot;&gt;Background&lt;/h2&gt;

&lt;p&gt;While MATLAB certainly isn’t known for its easy game creation, this hasn’t stopped people from creating some really impressive projects.
TCP/IP support exists, but provides the bare minimum for writing servers.
You only get one socket per port, and every function call blocks, so sometimes you have to get creative.&lt;/p&gt;

&lt;p&gt;Since this was created for a class, the goal was to create a minimum viable product with the least effort, and it definitely shows in the code quality.&lt;/p&gt;

&lt;h2 id=&quot;networking&quot;&gt;Networking&lt;/h2&gt;

&lt;p&gt;To avoid having to synchronize state across multiple programs I decided to only have one single server and have all clients connect through TCP.
This is a common technique in cybersecurity exploits, where you open up a remote root shell, so it is doable.
Since we can only open one socket per port, we’ll have to use two:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;t1_socket = tcpip(&apos;0.0.0.0&apos;, 30000, &apos;NetworkRole&apos;, &apos;server&apos;);
t2_socket = tcpip(&apos;0.0.0.0&apos;, 30002, &apos;NetworkRole&apos;, &apos;server&apos;);

fopen(t1_socket);
fprintf(t1_socket, &apos;Welcome to Battleship Player 1!\nWaiting for Player 2...\n&apos;);
fopen(t2_socket);
fprintf(t2_socket, &apos;Hello Player 2! Welcome to Battleship!\n&apos;);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Great! Now we can send data to the clients that connect with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;nc &amp;lt;ip&amp;gt; 3000{0,2}&lt;/code&gt;!
But then we’ll have to find a way to clear the screen so it at least looks presentable.
This should be as easy as faking a terminal clear command:&lt;/p&gt;
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ clear | xxd
00000000: 1b5b 334a 1b5b 481b 5b32 4a              .[3J.[H.[2J
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

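&lt;p&gt;Now we can clear a client’s screen by sending a shortened form of that sequence, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ESC[H&lt;/code&gt; (cursor home) followed by &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ESC[J&lt;/code&gt; (erase to the end of the screen):&lt;/p&gt;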
&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function clearSock(socket)
    data = char([hex2dec(&apos;1b&apos;), hex2dec(&apos;5b&apos;), hex2dec(&apos;48&apos;), hex2dec(&apos;1b&apos;), hex2dec(&apos;5b&apos;), hex2dec(&apos;4a&apos;)]);
    fprintf(socket, &apos;%s&apos;, data);
end
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now we need to get user input.
This gets a bit more complex since reading from multiple sockets at once isn’t a supported feature.
Luckily, we do have a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BytesAvailable&lt;/code&gt; property, so we can avoid blocking on a read.
Here we have a function that takes an array of sockets and a bitmask to show which ones still need data:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function [p_id,data] = getSomeInput(socks, sockmask)
    p_id = 0;
    data = &apos;&apos;;
    while 1
        for pid = 1:length(socks)
            p_id = pid;
            if ~sockmask(p_id)
                continue
            end
            if socks(p_id).BytesAvailable &amp;gt; 0
                data = fread(socks(p_id), socks(p_id).BytesAvailable);
                data = data(1:length(data)-1); % Remove newline
                fprintf(&apos;Got data from %d: [%s]\n&apos;, p_id, data);
                return
            end
        end
    end
end
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;game-logic&quot;&gt;Game Logic&lt;/h2&gt;

&lt;p&gt;Now that all the boilerplate networking functions are out of the way we can get down into the game code.
There isn’t too much special about our implementation, so I’ll skip most of this.
Despite not being built for the task, the language keeps most of it short and simple whenever the logic can be expressed as an element-wise operation.
For example, here is how I tested for collisions when placing ships:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;[err,sBoard] = drawShip(coord, vert, SHIP_SIZES(ship_places(p_id)));

if err
    fprintf(socks(p_id), &apos;Collides with wall, try again: &apos;);
    continue
end

if sum(sum(board &amp;amp; sBoard))
    fprintf(socks(p_id), &apos;Collides with another ship, try again: &apos;);
    continue
end
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To create a proper user experience they should be able to see an ASCII art version of the board.
Something like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;------------
|      HHH |A
|  H       |B
|  H       |C
|  H  H    |D
|  H  H    |E
|  H  H    |F
|     H    |G
|          |H
|  HHH     |I
|        HH|J
------------
 0123456789
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This is where element-wise and matrix operations save the day.
Surprisingly, this doesn’t take very many lines at all.
All of this, and more, can be drawn with this monstrosity:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function printBoard(socket, board)
    width = size(board,2);
    line = char(zeros(1,width + 2) + &apos;-&apos;);
    asciiBoard = char((board == 1) * &apos;H&apos; + (board == 0) * &apos; &apos; + (board == 2) * &apos;X&apos; + &apos;.&apos; * (board == 3));
    fprintf(socket,&apos;%s\n&apos;, line);
    for i=1:size(board,1) % iterate over rows, so non-square boards also work
        fprintf(socket,&apos;|%s|&apos;, convertCharsToStrings(asciiBoard(i,:)),&apos;sync&apos;);
        fprintf(socket,&apos;%s\n&apos;, convertCharsToStrings(char(&apos;A&apos; + i - 1)), &apos;sync&apos;);
    end
    fprintf(socket,&apos;%s\n&apos;, convertCharsToStrings(line));
    fprintf(socket,&apos; %s\n&apos;,convertCharsToStrings(char((1:width) + &apos;0&apos; - 1)));
end
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This type of problem is one of the few where MATLAB comes second only to Python in lines of code.
ASCII matrices are by no means good solutions to problems, and neither is embedding branching through element-wise logic, but they do have their uses.&lt;/p&gt;

&lt;h2 id=&quot;takeaways&quot;&gt;Takeaways&lt;/h2&gt;

&lt;p&gt;This code is bad and wrong, with a UI only a sysadmin could love.
But it does work.
If I had to do it again I’d use Python, or maybe Go, but it really wasn’t as bad as I was expecting it to be.
In the end it was even playable, albeit pretty slow.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/blog/res/tcp_battleship.m&quot;&gt;Here&lt;/a&gt; is a link to the full code.&lt;/p&gt;
</description>
        <pubDate>Wed, 20 Mar 2019 00:00:00 +0000</pubDate>
        <link>https://dominik.win/blog/tcp-terminals-with-matlab/</link>
        <guid isPermaLink="true">https://dominik.win/blog/tcp-terminals-with-matlab/</guid>
        
        
      </item>
    
  </channel>
</rss>
