# 2 Bugs 1 Day
2023-11-23
If I had a nickel for every time I encountered a cursed bug today, I'd have
two. Which isn't a lot, but it's weird that it happened twice.
## Bug 1
My roommate asked me about a bug in his C code. He passed a value to
a function. The value is not supposed to be zero, but in the output, it
apparently is. Why?
We could not find any line of code that could have changed this value. It
was initialized from a command line argument (i.e. `atoi(argv[2])`) and was
only ever read after that.
After ruling out off-by-one errors and floating point rounding,
I hypothesized that it had something to do with the data structure. The
value lives inside a struct as follows:
```c
struct foo_t {
    bar_t bar[256];
    int value;
} foo;
```
`foo` is a global instance of `struct foo_t`. To initialize it, we set
`value` to a command line argument and initialize `bar` with a for loop,
which looks like:
```c
for (int i = 0; i < arg1; i++) {
    for (int j = 0; j < arg2; j++) {
        foo.bar[i * arg2 * arg3 + j] = 0;
    }
}
```
I realized that `arg3` was unnecessary. Multiplying by it spaces the
indices out too far, so once `arg1` gets big enough, the writes run past
the end of `bar`: a buffer overflow.
It just so happens that all arguments are powers of 2, and `bar` is an
array of 256. Which means… one of those writes lands on index 256, which
in a typical struct layout is right where `value` sits. You'll overwrite
`value` with zero!
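To make the arithmetic concrete, here is a minimal sketch. The argument values I plug in below (16, 16, 2) and `bar_t` being an `int` are assumptions for illustration, not the real values from my roommate's program, and `max_index`, `loop_hits`, and `offset_past_bar` are hypothetical helpers. Nothing here actually writes out of bounds; it only computes where the writes would go.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int bar_t; /* assumption: the real bar_t is not shown in this post */

struct foo_t {
    bar_t bar[256];
    int value;
};

/* Largest index the original loop computes (all arguments must be >= 1). */
size_t max_index(size_t arg1, size_t arg2, size_t arg3) {
    assert(arg1 >= 1 && arg2 >= 1 && arg3 >= 1);
    return (arg1 - 1) * arg2 * arg3 + (arg2 - 1);
}

/* True if the original loop ever computes exactly `target` as an index. */
bool loop_hits(size_t arg1, size_t arg2, size_t arg3, size_t target) {
    for (size_t i = 0; i < arg1; i++) {
        for (size_t j = 0; j < arg2; j++) {
            if (i * arg2 * arg3 + j == target) {
                return true;
            }
        }
    }
    return false;
}

/* Byte offset where a (nonexistent) bar[256] would start.
 * Pure arithmetic: no out-of-bounds access happens here. */
size_t offset_past_bar(void) {
    return offsetof(struct foo_t, bar) + 256 * sizeof(bar_t);
}
```

With those made-up arguments, `max_index(16, 16, 2)` is 495 and `loop_hits(16, 16, 2, 256)` is true, while without the extra factor `max_index(16, 16, 1)` stays at 255. On a typical ABI, `offset_past_bar()` equals `offsetof(struct foo_t, value)`, which is why the stray write clobbers `value`.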
We removed `arg3` and sure enough, it worked.
Conclusion: Lack of bounds checking in C.
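Since C won't check bounds for you, one option is to do it by hand. Below is a rough sketch of the fixed loop (with `arg3` removed) that refuses to write past the array. `BAR_LEN`, `bar_set`, and `init_bar` are names I made up, and `bar_t` being an `int` is an assumption; this is not the code my roommate ended up with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BAR_LEN 256

typedef int bar_t; /* assumption: the real bar_t is not shown in this post */

struct foo_t {
    bar_t bar[BAR_LEN];
    int value;
};

/* Stores v at bar[idx] only if idx is in range; reports whether it did. */
bool bar_set(struct foo_t *foo, size_t idx, bar_t v) {
    assert(foo != NULL);
    if (idx >= BAR_LEN) {
        return false;
    }
    foo->bar[idx] = v;
    return true;
}

/* Fixed loop without arg3. Stops and returns false at the first
 * out-of-range index instead of writing past the array. */
bool init_bar(struct foo_t *foo, size_t arg1, size_t arg2) {
    for (size_t i = 0; i < arg1; i++) {
        for (size_t j = 0; j < arg2; j++) {
            if (!bar_set(foo, i * arg2 + j, 0)) {
                return false;
            }
        }
    }
    return true;
}
```

The caller can then treat a `false` return as "the arguments don't fit in `bar`" and bail out, instead of discovering it later through a mysteriously zeroed `value`.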
## Bug 2
Right after we hunted down this bug, my roommate said there's another bug
in VSCode that has bothered him for a while. It seems like whenever he
types the word "store" in a Markdown code block, the syntax highlighting
(regardless of language) breaks. He looked through the VSCode repo and
didn't find anything in particular.
We found that the keyword doesn't have to be "store"; it just needs to be
"re" + whitespace.
![Some C code in a Markdown code block, containing a string literal "re ".
Highlighter refuses to match curly braces surrounding it, and completely
stops working below.](img/2bugs1day/re.png)
What does `re` stand for? The Python package?
This is so cursed that I decided to make the ultimate sacrifice — to
install VSCode on my own machine.
However, I failed to reproduce it. My roommate correctly suggested it was
a problem with a plugin. He disabled all the Markdown-related plugins on
his machine. The bug was still there.
He then disabled all plugins (except those bundled). The bug was gone.
VSCode has a "plugin bisect" tool that does a binary search to find the
problematic plugin in O(log(# plugins)) time. I did that and the problem
is…
OCaml Platform.
Do you realize how shocked we were? Like, we would rather believe it was
ghosts or something. OCaml is literally the single most innocent plugin.
We looked at OCaml Platform's code and found something interesting.
```json
{
  "repository": {
    "reason-code-block": {
      "begin": "(re|reason|reasonml)(\\s+[^`~]*)?$",
      "end": "(^|\\G)(?=\\s*[`~]{3,}\\s*$)",
      ...
    },
    ...
  }
```
I do not know how this works, but boy, I sure know the regex checks out. We
tried `reason`, and yes, it broke as well.
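To convince myself the `begin` pattern really fires on ordinary code, here is a small sketch using POSIX `regcomp`/`regexec`. Big caveat: VSCode's grammar engine is not POSIX, so I swapped `\s` for `[[:space:]]`, and this only approximates what the highlighter does. `triggers_reason_block` is a hypothetical helper name.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* POSIX ERE approximation of the grammar's "begin" pattern.
 * Assumption: [[:space:]] stands in for \s, which POSIX ERE lacks. */
static const char *BEGIN_PATTERN =
    "(re|reason|reasonml)([[:space:]]+[^`~]*)?$";

/* True if the pattern matches anywhere in `line`. */
bool triggers_reason_block(const char *line) {
    assert(line != NULL);
    regex_t re;
    if (regcomp(&re, BEGIN_PATTERN, REG_EXTENDED | REG_NOSUB) != 0) {
        return false; /* treat a compile failure as "no match" */
    }
    bool matched = regexec(&re, line, 0, NULL, 0) == 0;
    regfree(&re);
    return matched;
}
```

Under this approximation, `triggers_reason_block("store the value")` is true because the line contains "re" followed by whitespace and no backtick or tilde before the end of the line, while `triggers_reason_block("return x;")` is false. Nothing in the pattern requires "re" to be a language tag on a code fence.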
This code was added in a 2020 commit and has remained unchanged since. It's
such a minor bug that I believe 90% of its users didn't even notice.
I encouraged my roommate to file a bug report — or even a pull request.
My hypothesis is OCaml programmers simply don't program in anything else.
Conclusion: OCaml plugin ships a regex with false positives.
## Takeaway
1. There is a reason for every bug.
2. There is not an obvious reason for every bug.
3. C will just silently overflow your buffer without you knowing. Can't
wait till Linux is 100% Rust.