<!doctype html><html lang=en><head><meta charset=utf-8><meta name=viewport content="width=device-width,initial-scale=1"><link rel="shortcut icon" href=/img/icon.png type=image/png><meta name=generator content="Hugo 0.79.0"><meta property="og:title" content="King James Bible: An Adventure in Compression"><meta property="og:description" content="Figuring out how much space the Bible takes on a calculator or a Game Boy is fun"><meta property="og:type" content="article"><meta property="og:url" content="http://toasters.rocks/king-james-bible/"><meta property="og:image" content="http://toasters.rocks/images/2020/01/screenshot20200110191340.png"><meta property="article:published_time" content="2020-01-11T00:38:16+00:00"><meta property="article:modified_time" content="2020-01-11T00:59:58+00:00"><meta name=twitter:card content="summary_large_image"><meta name=twitter:image content="http://toasters.rocks/images/2020/01/screenshot20200110191340.png"><meta name=twitter:title content="King James Bible: An Adventure in Compression"><meta name=twitter:description content="Figuring out how much space the Bible takes on a calculator or a Game Boy is fun"><title>King James Bible: An Adventure in Compression - toasters rocks</title><link rel=stylesheet href=http://toasters.rocks/css/toastersrocks.min.css></head><body><header><img src=/img/icon.png><h1>toasters rocks</h1></header><main><aside><nav><a href=/><i class="fas fa-home"></i>Home</a><br><a href=http://juju2143.ca/><i class="fas fa-user"></i>About</a><br><a href=/fr/><i class="fas fa-globe"></i>Français</a><br><a href=https://yukiis.moe/><i class="far fa-comment"></i>Comics</a><br><a href=https://codewalr.us/><i class="far fa-folder-open"></i>Forums</a><br></nav><br><nav><a title=Twitter href=https://twitter.com/juju2143><i style=color:#4da7de class="fab fa-twitter"></i><span style=color:#4da7de>Twitter</span></a><br><a title=Discord href=https://discord.gg/cuZcfcF><i style=color:#7289da class="fab fa-discord"></i><span 
style=color:#7289da>Discord</span></a><br><a title=GitHub href=https://github.com/juju2143><i style=color:#221e1b class="fab fa-github"></i><span style=color:#221e1b>GitHub</span></a><br><a title=Patreon href=https://patreon.com/juju2143><i style=color:#f96854 class="fab fa-patreon"></i><span style=color:#f96854>Patreon</span></a><br><a title=YouTube href=https://youtube.com/user/julosoft><i style=color:#e02a20 class="fab fa-youtube"></i><span style=color:#e02a20>YouTube</span></a><br><a title="YouTube 2" href=https://youtube.com/c/juju2143><i style=color:#e02a20 class="fab fa-youtube"></i><span style=color:#e02a20>YouTube 2</span></a><br><a title=Twitch href=https://twitch.tv/juju2143><i style=color:#6441a5 class="fab fa-twitch"></i><span style=color:#6441a5>Twitch</span></a><br><a title=Instagram href=https://instagram.com/j.p.savard><i style=color:#d6249f class="fab fa-instagram"></i><span style=color:#d6249f>Instagram</span></a><br><a title=DeviantArt href=https://deviantart.com/juju2143><i style=color:#c5d200 class="fab fa-deviantart"></i><span style=color:#c5d200>DeviantArt</span></a><br><a title=SoundCloud href=https://soundcloud.com/juju2143><i style=color:#fe3801 class="fab fa-soundcloud"></i><span style=color:#fe3801>SoundCloud</span></a><br></nav></aside><article style=background-image:url(/images/2020/01/screenshot20200110191340.png)><div class=metadata style="height:calc((var(--height) - 2em) * 0.7478152309612984 - 3.5em)"><h2 name=top>King James Bible: An Adventure in Compression</h2><p>Figuring out how much space the Bible takes on a calculator or a Game Boy is fun</p><i class="far fa-calendar-alt"></i><time datetime=2020-01-11>January 11, 2020</time><br><i class="fas fa-tags"></i>#<a class="btn btn-sm btn-outline-dark tag-btn" href=http://toasters.rocks/tags/tech>Tech</a><br><i class="fas fa-hourglass"></i>~7 minutes</div><p>Well, time for another adventure, and as with every adventure, it begins with a very silly thought that isn&rsquo;t even mine this 
time:</p><p><img src=/images/2020/01/screenshot20200110194154.png alt='Discord screenshot of DJ Omnim
&ldquo;I wonder if one can fit the entire bible on a TI-Nspire CX with mViewer GX PDF converter&rdquo;, says our friend DJ</p><p>And there I go, searching for the answer:</p><p><blockquote class=twitter-tweet><p lang=en dir=ltr>me: trying to find out how big the Bible is in terms of computer storage because someone asked on Discord<br><br>me, literally 30 seconds later: <a href=https://t.co/qQiEqTKnCk>https://t.co/qQiEqTKnCk</a></p>&mdash; 輝き雪 Yuki, CEO of snow (@juju2143) <a href="https://twitter.com/juju2143/status/1215378475277787137?ref_src=twsrc%5Etfw">January 9, 2020</a></blockquote><script async src=https://platform.twitter.com/widgets.js></script>That&rsquo;s the Wikipedia effect right there: you look for something and before you know it you know everything there is to know about religion and now you&rsquo;re on some completely unrelated page about quantum theory.</p><p>So I downloaded the whole King James Version from <a href=http://www.gutenberg.org/>Project Gutenberg</a> and removed the header and footer they put there, for easier text processing; it&rsquo;s about 4.4 MB. Then I converted it to PDF, and since the format supports plain text directly the result isn&rsquo;t any bigger (I got a 3 MB file). Then I tried converting that to work on a TI-Nspire with the <a href=https://tiplanet.org/forum/editgx.php>mViewer GX PDF converter</a> and&mldr; I think I broke TI-Planet. Well, from what it was able to generate (76 pages out of 1664, pretty much the book of Genesis?) every 10 pages take about 1.3 MB, so by extrapolation the whole thing should be around 216 MB. We&rsquo;re dealing with images now, and not just plain text, so yeah. Could be lower if you set the resolution to something almost unreadable, but at this point you&rsquo;re better off using a plain text reader on your calc.</p><p>So in conclusion, maybe. Maybe you can manage to do it. 
But it&rsquo;s gonna take most of your calc space, which, with nothing installed, is about 100 MB.</p><p>But wait a minute, we have another contender&mldr;</p><p><blockquote class=twitter-tweet><p lang=en dir=ltr>Didn't they manage to cram the whole Bible on a GameBoy cartridge?</p>&mdash; Minty Root (@Minty_Root) <a href="https://twitter.com/Minty_Root/status/1215378787652833282?ref_src=twsrc%5Etfw">January 9, 2020</a></blockquote><script async src=https://platform.twitter.com/widgets.js></script>what are you talkin' about Minty</p><p>Oh God, we&rsquo;re gonna have some fun with that. Sure enough, there was an unlicensed King James Bible for the Game Boy published by Wisdom Tree in 1994. If you want to see it in action, there was an <a href="https://www.youtube.com/watch?v=Kz0TOQ1BF-M">Angry Video Game Nerd episode about it</a>. What&rsquo;s amazing about it is that the ROM is only one megabyte, including the entire text of the Bible, a search engine and two word search games.</p><p>(Note: if you&rsquo;re emulating it, use <a href=http://bgb.bircd.org/>BGB</a>. Any other emulator will introduce bugs due to its weird mapping, which no emulator except BGB seems to understand. Of course, I will not provide the ROM for the usual copyright reasons.)</p><p><img src=/images/2020/01/screenshot20200109163510.png alt="Screenshot of the hangman game running in an emulator that is not BGB featuring characters you can&rsquo;t normally input">
Here&rsquo;s what I mean. The reader will crash and the games will make you guess garbage you can&rsquo;t input.</p><p>So for fun, with the KJB text I have in hand, I tested some of the most common compression utilities, all set to their maximum/best/slowest settings:</p><table><thead><tr><th style=text-align:left>Compression</th><th style=text-align:right>Size</th><th style=text-align:right>Ratio</th></tr></thead><tbody><tr><td style=text-align:left>zpaq -m5</td><td style=text-align:right>739407</td><td style=text-align:right>16.682%</td></tr><tr><td style=text-align:left>bzip2 -9</td><td style=text-align:right>993406</td><td style=text-align:right>22.412%</td></tr><tr><td style=text-align:left>lzma -9</td><td style=text-align:right>1048408</td><td style=text-align:right>23.653%</td></tr><tr><td style=text-align:left>xz -9</td><td style=text-align:right>1048616</td><td style=text-align:right>23.658%</td></tr><tr><td style=text-align:left>7z -mx9</td><td style=text-align:right>1048710</td><td style=text-align:right>23.660%</td></tr><tr><td style=text-align:left>zstd --ultra -22</td><td style=text-align:right>1068137</td><td style=text-align:right>24.099%</td></tr><tr><td style=text-align:left>rar -m5</td><td style=text-align:right>1142360</td><td style=text-align:right>25.773%</td></tr><tr><td style=text-align:left>gzip -9</td><td style=text-align:right>1385457</td><td style=text-align:right>31.258%</td></tr><tr><td style=text-align:left>zip -9</td><td style=text-align:right>1385595</td><td style=text-align:right>31.261%</td></tr><tr><td style=text-align:left>lz4 -9</td><td style=text-align:right>1596418</td><td style=text-align:right>36.017%</td></tr><tr><td style=text-align:left>lzop -9</td><td style=text-align:right>1611939</td><td style=text-align:right>36.367%</td></tr><tr><td style=text-align:left>Uncompressed</td><td style=text-align:right>4432375</td><td style=text-align:right>100%</td></tr></tbody></table><p>Note that some of these are different 
containers for the same algorithm, hence similar filesizes, and some of them are better suited for other uses, e.g. lz4 and lzop are better to decompress the Linux kernel at boot time because they&rsquo;re fast and use less memory, and zstd is starting to replace xz because it&rsquo;s 1300% faster despite producing slightly bigger files.</p><p>So, with our goal of a ROM size of 1048576 bytes with enough space left to fit some code for the decompressor that is fast enough to be playable on a Game Boy, a good-looking UI, a search engine and some games, only zpaq and bzip2 would fit the bill, and even then it&rsquo;s tight. (Special mention to lzma which fits a megabyte almost exactly.) Most of those algorithms were devised after 1994, bzip2 in particular was devised between 1996 and 2000, but even though it has the best compression ratio after zpaq it&rsquo;s way slower than gzip.</p><p>Anyway, I&rsquo;m not an expert, but yeah, there are more efficient compressors out there, but we don&rsquo;t usually use them because they&rsquo;re experimental and/or very, very slow, the PAQ ones in particular. So I&rsquo;d imagine they used a slow compressor with a fast decompressor, tuned for English text.</p><p>So, now that we have our compression benchmark on file size, it&rsquo;s appropriate to make a decompression benchmark based on time, because that&rsquo;s what we need, right? So here are some tests under a normal load on my good ol' iMac 27" mid-2011 running Linux (don&rsquo;t laugh, it&rsquo;s old af but it&rsquo;s still my daily driver and it still works for me), with the above files decompressed to <code>/dev/null</code>, run several times until the results were somewhat consistent. 
I didn&rsquo;t bother to time the software during the compression phase because it&rsquo;s irrelevant to our use case (and I didn&rsquo;t think of it when I tested), but all of them were quite fast except zpaq.</p><table><thead><tr><th style=text-align:left>Decompression</th><th style=text-align:right>Time (s)</th></tr></thead><tbody><tr><td style=text-align:left>lz4</td><td style=tex
var d=document,s=d.createElement('script');s.async=true;s.src='//'+"juju2143"+'.disqus.com/embed.js';s.setAttribute('data-timestamp',+new Date());(d.head||d.body).appendChild(s);})();</script><noscript>Please enable JavaScript to view the <a href=https://disqus.com/?ref_noscript>comments powered by Disqus.</a></noscript><a href=https://disqus.com class=dsq-brlink>comments powered by <span class=logo-disqus>Disqus</span></a></article></main><footer>Copyright © 2020 J.P. Savard - Theme by <a href=https://github.com/juju2143/hugo-theme-toastersrocks>J. P. Savard</a> - Powered by Hugo 0.79.0</footer></body></html>