From bbeea695a6d0c0a58be82813a691ac413dee8f3f Mon Sep 17 00:00:00 2001
From: Frederick Yin
Date: Thu, 23 Nov 2023 20:17:40 -0500
Subject: New post: random/2bugs1day

---
 docs/random/2bugs1day.md         | 117 +++++++++++++++++++++++++++++++++++++++
 docs/random/img/2bugs1day/re.png | Bin 0 -> 26941 bytes
 docs/random/index.md             |   1 +
 3 files changed, 118 insertions(+)
 create mode 100644 docs/random/2bugs1day.md
 create mode 100644 docs/random/img/2bugs1day/re.png

diff --git a/docs/random/2bugs1day.md b/docs/random/2bugs1day.md
new file mode 100644
index 0000000..360472c
--- /dev/null
+++ b/docs/random/2bugs1day.md
@@ -0,0 +1,117 @@
+# 2 Bugs 1 Day
+
+2023-11-23
+
+If I had a nickel every time I encountered a cursed bug today, I'd have
+two. It's not a lot but it's weird how it happened twice.
+
+## Bug 1
+
+My roommate asked me about a bug in his C code. He passed a value to
+a function. The value is not supposed to be zero, but in the output, it
+apparently is. Why?
+
+We could not find any lines of code that could have changed this value. It
+was initialized with a command line argument (i.e. `atoi(argv[2])`), then
+remained read-only.
+
+After ruling out off-by-one errors and floating point rounding,
+I hypothesized it had something to do with the data structure. The value is
+within a struct as follows:
+
+```c
+struct foo_t {
+    bar_t bar[256];
+    int value;
+} foo;
+```
+
+`foo` is a global instance of `struct foo_t`. To initialize it, we set
+`value` to a command line argument and initialize `bar` with a for loop,
+which looks like:
+
+```c
+for (int i = 0; i < arg1; i++) {
+    for (int j = 0; j < arg2; j++) {
+        foo.bar[i * arg2 * arg3 + j] = 0;
+    }
+}
+```
+
+I realized that `arg3` was unnecessary. By multiplying by it, we're spacing
+out the indices too much. If `arg1` gets big enough, we'll get a buffer
+overflow.
+
+It just so happens that all arguments are powers of 2, and `bar` is an
+array of 256. Which means… You'll overwrite `value`!
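The index arithmetic behind that overflow can be checked without touching any memory. What follows is a minimal sketch, not my roommate's actual code: the helper names and the sample arguments (16, 16, 2) are hypothetical, chosen only because they are powers of 2. The claim that index 256 lands on `value` additionally assumes `bar_t` has the same size and alignment as `int`, which the code above does not tell us.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BAR_LEN 256 /* matches `bar_t bar[256]` in struct foo_t */

/* Largest index the nested loop computes, reached at i = arg1 - 1 and
 * j = arg2 - 1. Assumes every argument is at least 1. */
static size_t max_written_index(size_t arg1, size_t arg2, size_t arg3)
{
    assert(arg1 > 0 && arg2 > 0 && arg3 > 0);
    return (arg1 - 1) * arg2 * arg3 + (arg2 - 1);
}

/* True when every write stays inside bar[0 .. BAR_LEN - 1]. */
static bool loop_stays_in_bounds(size_t arg1, size_t arg2, size_t arg3)
{
    return max_written_index(arg1, arg2, arg3) < BAR_LEN;
}

/* True when some (i, j) pair in the loop computes exactly `target`.
 * Only the index is computed; nothing is written anywhere. */
static bool loop_writes_index(size_t arg1, size_t arg2, size_t arg3,
                              size_t target)
{
    for (size_t i = 0; i < arg1; i++) {
        for (size_t j = 0; j < arg2; j++) {
            if (i * arg2 * arg3 + j == target) {
                return true;
            }
        }
    }
    return false;
}
```

With the hypothetical arguments, `max_written_index(16, 16, 1)` is 255, so the loop without the extra factor just fits. `max_written_index(16, 16, 2)` is 495, and `loop_writes_index(16, 16, 2, 256)` is true because `i = 8, j = 0` gives exactly 256. Under the layout assumption above, that one-past-the-end element is where `value` sits, and the loop stores 0 there.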
+
+We removed `arg3` and sure enough, it worked.
+
+Conclusion: Lack of bounds checking in C.
+
+## Bug 2
+
+Right after we hunted down this bug, my roommate said there's another bug
+in VSCode that has bothered him for a while. It seems like whenever he
+types the word "store" in a Markdown code block, the syntax highlighting
+(regardless of language) breaks. He looked through the VSCode repo and
+didn't find anything particular.
+
+We found that the keyword doesn't have to be "store", it just needs to be
+"re" + whitespace.
+
+![Some C code in a Markdown code block, containing a string literal "re ".
+Highlighter refuses to match curly braces surrounding it, and completely
+stops working below.](img/2bugs1day/re.png)
+
+What does `re` stand for? The Python package?
+
+This is so cursed that I decided to make the ultimate sacrifice: to
+install VSCode on my own machine.
+
+However, I failed to reproduce it. My roommate correctly suggested it's
+a problem with a plugin. He disabled all the Markdown-related plugins on
+his machine. The bug was still there.
+
+He then disabled all plugins (except those bundled). The bug was gone.
+
+VSCode has a "plugin bisect" tool that does a binary search to find the
+problematic plugin in O(log(# plugins)) time. I did that and the problem
+is…
+
+OCaml Platform.
+
+Do you realize how shocked we were? Like, we would rather believe it was
+ghosts or something. OCaml is literally the single most innocent plugin.
+
+We looked at OCaml Platform's code and found something interesting.
+
+```json
+{
+    "repository": {
+        "reason-code-block": {
+            "begin": "(re|reason|reasonml)(\\s+[^`~]*)?$",
+            "end": "(^|\\G)(?=\\s*[`~]{3,}\\s*$)",
+            ...
+        },
+        ...
+}
+```
+
+I do not know how this works, but boy, I sure know the regex checks out. We
+tried `reason`, and yes, it broke as well.
+
+This code was added in a 2020 commit, and has remained unchanged since. It's
+such a minor bug that I believe 90% of its users didn't even notice.
+I encouraged my roommate to file a bug report, or even a pull request.
+
+My hypothesis is OCaml programmers simply don't program in anything else.
+
+Conclusion: OCaml plugin ships a regex with false positives.
+
+## Takeaway
+
+1. There is a reason for every bug.
+2. There is not an obvious reason for every bug.
+3. C will just silently overflow your buffer without you knowing. Can't
+   wait till Linux is 100% Rust.
diff --git a/docs/random/img/2bugs1day/re.png b/docs/random/img/2bugs1day/re.png
new file mode 100644
index 0000000..209356b
Binary files /dev/null and b/docs/random/img/2bugs1day/re.png differ
diff --git a/docs/random/index.md b/docs/random/index.md
index ff44b18..7190d22 100644
--- a/docs/random/index.md
+++ b/docs/random/index.md
@@ -20,3 +20,4 @@ Nevertheless, occasionally I leave a permanent trace along the way.
 - [xkcdbot](xkcdbot.md)
 - [Playlist to put on on my deathbed](deathbed_playlist.md)
 - [Of Potato Chips And Food Globalization](potato_chips.md)
+- [2 Bugs 1 Day](2bugs1day.md)
-- 
cgit v1.2.3