<!doctype html><html lang=en><head><meta charset=utf-8><meta name=viewport content="width=device-width,initial-scale=1"><link rel="shortcut icon" href=/img/icon.png type=image/png><meta name=generator content="Hugo 0.105.0"><meta property="og:title" content="King James Bible: An Adventure in Compression"><meta property="og:description" content="Figuring out how much space the Bible takes on a calculator or a Game Boy is fun"><meta property="og:type" content="article"><meta property="og:url" content="http://toasters.rocks/king-james-bible/"><meta property="og:image" content="http://toasters.rocks/images/2020/01/screenshot20200110191340.png"><meta property="article:section" content><meta property="article:published_time" content="2020-01-11T00:38:16+00:00"><meta property="article:modified_time" content="2020-01-11T00:59:58+00:00"><meta name=twitter:card content="summary_large_image"><meta name=twitter:image content="http://toasters.rocks/images/2020/01/screenshot20200110191340.png"><meta name=twitter:title content="King James Bible: An Adventure in Compression"><meta name=twitter:description content="Figuring out how much space the Bible takes on a calculator or a Game Boy is fun"><meta name=theme-color content="#660066"><title>King James Bible: An Adventure in Compression - toasters rocks</title><link rel=stylesheet href=http://toasters.rocks/css/toastersrocks.min.css></head><body><header><img src=/img/icon.png><h1>toasters rocks</h1></header><main><aside><nav><a href=/><i class="fas fa-home"></i>
<span style=color:#fe3801>SoundCloud</span></a><br></nav></aside><article style=background-image:url(/images/2020/01/screenshot20200110191340.png)><div class=metadata style="height:calc((var(--height) - 2em) * .7478152309612984 - 3.5em)"><h2 name=top>King James Bible: An Adventure in Compression</h2><p>Figuring out how much space the Bible takes on a calculator or a Game Boy is fun</p><i class="far fa-calendar-alt"></i>
#<a class="btn btn-sm btn-outline-dark tag-btn" href=http://toasters.rocks/tags/tech>Tech</a><br><i class="fas fa-hourglass"></i> ~7 minutes</div><p>Well, time for another adventure, and as with every adventure, it begins with a very silly thought, one that isn’t even mine this time:</p><p><img src=/images/2020/01/screenshot20200110194154.png alt="Discord screenshot of DJ Omnimaga who says &quot;I wonder if one can fit the entire bible on a TI-Nspire CX with mViewer GX PDF converter&quot;">
“I wonder if one can fit the entire bible on a TI-Nspire CX with mViewer GX PDF converter”, says our friend DJ</p><p>And there I go, searching for the answer:</p><p><blockquote class=twitter-tweet><p lang=en dir=ltr>me: trying to find out how big the Bible is in terms of computer storage because someone asked on Discord<br><br>me, literally 30 seconds later: <a href=https://t.co/qQiEqTKnCk>https://t.co/qQiEqTKnCk</a></p>— Yuki 雪🏳️⚧️ (@juju2143) <a href="https://twitter.com/juju2143/status/1215378475277787137?ref_src=twsrc%5Etfw">January 9, 2020</a></blockquote><script async src=https://platform.twitter.com/widgets.js></script>
That’s the Wikipedia effect right there: you look for something, and before you know it you know everything there is to know about religion and now you’re on some completely unrelated page about quantum theory.</p><p>So I downloaded the whole King James Version from <a href=http://www.gutenberg.org/>Project Gutenberg</a> and removed the header and footer they put there, for better text processing. That leaves about 4.4 MB. I converted it to PDF, and since the format supports plain text directly, it’s not any bigger (I got a 3 MB file). Then I tried converting it to work on a TI-Nspire with the <a href=https://tiplanet.org/forum/editgx.php>mViewer GX PDF converter</a> and… I think I broke TI-Planet. Well, from what it was able to generate (76 pages out of 1664, pretty much the book of Genesis?), every 10 pages take about 1.3 MB, so by extension the whole thing should be around 216 MB (1664 / 10 × 1.3 MB). We’re dealing with images now, and not just plain text, so yeah. It could be lower if you set the resolution to something almost unreadable, but at that point you’re better off using a plain text reader on your calc.</p><p>So in conclusion, maybe. Maybe you can manage to do it. But it’s gonna take most of your calc space, which, with nothing installed, is about 100 MB.</p><p>But wait a minute, we have another contender…</p><p><blockquote class=twitter-tweet><p lang=en dir=ltr>Didn't they manage to cram the whole Bible on a GameBoy cartridge?</p>— Minty Root (@Minty_Root) <a href="https://twitter.com/Minty_Root/status/1215378787652833282?ref_src=twsrc%5Etfw">January 9, 2020</a></blockquote><script async src=https://platform.twitter.com/widgets.js></script>
what are you talkin’ about Minty</p><p>Oh God, we’re gonna have some fun with that. Sure enough, there was an unlicensed King James Bible for the Game Boy published by Wisdom Tree in 1994. If you want to see it in action, there was an <a href="https://www.youtube.com/watch?v=Kz0TOQ1BF-M">Angry Video Game Nerd episode about it</a>, but what’s amazing is that the ROM is only one megabyte, including the entire text of the Bible, a search engine and two word search games.</p><p>(Note: if you’re emulating it, use <a href=http://bgb.bircd.org/>BGB</a>. Any other emulator will introduce bugs due to the cartridge’s weird mapping, which no emulator except BGB seems to understand. Of course, I will not provide the ROM for the usual copyright reasons.)</p><p><img src=/images/2020/01/screenshot20200109163510.png alt="Screenshot of the hangman game running in an emulator that is not BGB featuring characters you can&rsquo;t normally input">
Here’s what I mean. The reader will crash and the games will make you guess garbage you can’t input.</p><p>So for fun, with the KJB text I have in hand, I tested some of the most common compression utilities, all set to their maximum/best/slowest settings:</p><table><thead><tr><th style=text-align:left>Compression</th><th style=text-align:right>Size (bytes)</th><th style=text-align:right>Ratio</th></tr></thead><tbody><tr><td style=text-align:left>zpaq -m5</td><td style=text-align:right>739407</td><td style=text-align:right>16.682%</td></tr><tr><td style=text-align:left>bzip2 -9</td><td style=text-align:right>993406</td><td style=text-align:right>22.412%</td></tr><tr><td style=text-align:left>lzma -9</td><td style=text-align:right>1048408</td><td style=text-align:right>23.653%</td></tr><tr><td style=text-align:left>xz -9</td><td style=text-align:right>1048616</td><td style=text-align:right>23.658%</td></tr><tr><td style=text-align:left>7z -mx9</td><td style=text-align:right>1048710</td><td style=text-align:right>23.660%</td></tr><tr><td style=text-align:left>zstd --ultra -22</td><td style=text-align:right>1068137</td><td style=text-align:right>24.099%</td></tr><tr><td style=text-align:left>rar -m5</td><td style=text-align:right>1142360</td><td style=text-align:right>25.773%</td></tr><tr><td style=text-align:left>gzip -9</td><td style=text-align:right>1385457</td><td style=text-align:right>31.258%</td></tr><tr><td style=text-align:left>zip -9</td><td style=text-align:right>1385595</td><td style=text-align:right>31.261%</td></tr><tr><td style=text-align:left>lz4 -9</td><td style=text-align:right>1596418</td><td style=text-align:right>36.017%</td></tr><tr><td style=text-align:left>lzop -9</td><td style=text-align:right>1611939</td><td style=text-align:right>36.367%</td></tr><tr><td style=text-align:left>Uncompressed</td><td style=text-align:right>4432375</td><td style=text-align:right>100%</td></tr></tbody></table><p>Note that some of these are different containers for the same algorithm, hence the similar file sizes, 
and some of them are better suited for other uses, e.g. lz4 and lzop are better for decompressing the Linux kernel at boot time because they’re fast and use less memory, and zstd is starting to replace xz because it decompresses roughly 1300% faster despite producing slightly bigger files.</p><p>So, with our goal of a ROM size of 1048576 bytes with enough space left to fit some code for a decompressor that is fast enough to be playable on a Game Boy, a good-looking UI, a search engine and some games, only zpaq and bzip2 would fit the bill, and even then, barely. (Special mention to lzma, which fits in a megabyte almost exactly.) Most of those algorithms were devised after 1994; bzip2 in particular was developed between 1996 and 2000, and even though it has the best compression ratio here after zpaq, it’s way slower than gzip.</p><p>Anyway, I’m not an expert, but yeah, there are more efficient compressors out there; we don’t usually use them because they’re experimental and/or very, very slow, the PAQ ones in particular. So I’d imagine the cartridge uses a slow compressor paired with a fast decompressor tuned for English text.</p><p>So, now that we have our compression benchmark on file size, it’s appropriate to make a decompression benchmark based on time, because that’s what we need, right? So here are some tests under a normal load on my good ol’ iMac 27" mid-2011 running Linux (don’t laugh, it’s old af but it’s still my daily driver and it still works for me), using the above files decompressed to <code>/dev/null</code> and run several times until the results were somewhat consistent. I didn’t bother to time the software during the compression phase because it’s irrelevant to our use case (and I hadn’t thought of it when I tested), but all of them were quite fast except zpaq.</p><table><thead><tr><th style=text-align:left>Decompression</th><th style=text-align:right>Time (s)</th></tr></thead><tbody><tr><td style=text-align:left>lz4</td><tdsty