Can an uninitialized bool crash a program?
I know that "undefined behaviour" in C++ can pretty much allow the compiler to do anything it wants. However, I had a crash that surprised me, as I would have assumed the code looked safe enough. In this case, the real problem happened only on a specific platform using a specific compiler, and only if optimization was enabled.
I tried several things to reproduce the problem and simplify it as much as possible. Here's an extract of a function called Serialize that takes a bool parameter and copies the string "true" or "false" to an existing destination buffer. If this function came up in a code review, there would be no way to tell that it could, in fact, crash if the bool parameter were an uninitialized value.
#include <cstring>  // strlen, memcpy

// Zero-filled global buffer of 16 characters
char destBuffer[16];

void Serialize(bool boolValue) {
    // Determine which string to print based on boolValue
    const char* whichString = boolValue ? "true" : "false";
    // Compute the length of the string we selected
    const size_t len = strlen(whichString);
    // Copy string into destination buffer, which is zero-filled (thus already null-terminated)
    memcpy(destBuffer, whichString, len);
}
If this code is compiled with clang 5.0.0 with optimizations enabled, it can crash.
The ternary expression boolValue ? "true" : "false" looked safe enough to me. I was assuming, "Whatever garbage value is in boolValue doesn't matter, since it will evaluate to true or false anyhow."
I have set up a Compiler Explorer example that shows the problem in the disassembly:
#include <iostream>
#include <cstring>

// Simple struct, with an empty constructor that doesn't initialize anything
struct FStruct {
    bool uninitializedBool;

    __attribute__ ((noinline)) // Note: the constructor must be declared noinline to trigger the problem
    FStruct() {};
};

int main()
{
    // Locally construct an instance of our struct here on the stack. The bool member uninitializedBool is uninitialized.
    FStruct structInstance;

    // Output "true" or "false" to stdout
    Serialize(structInstance.uninitializedBool);
    return 0;
}
The problem arises because of the optimizer: it was clever enough to deduce that the strings "true" and "false" only differ in length by 1. So instead of actually calculating the length, it uses the value of the bool itself, which should technically be either 0 or 1, like this:
const size_t len = strlen(whichString); // original code
const size_t len = 5 - boolValue; // clang clever optimization
While this is "clever", so to speak, my question is: Does the C++ standard allow a compiler to assume a bool can only have an internal numerical representation of '0' or '1' and use it in such a way? Or is this a case of implementation-defined behaviour, in which case the implementation assumed that all its bools will only ever contain 0 or 1, and any other value is undefined behaviour territory?
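To make the arithmetic concrete, here is a minimal sketch of the length computation described above. It is only a model, not clang's actual output: the helper name OptimizedLength is hypothetical, and it takes the raw byte as an unsigned char rather than a bool so that every input is well-defined.

```cpp
#include <cstddef>

// Hypothetical helper: models "len = 5 - boolValue" on a raw byte instead of
// a real bool, so that values other than 0 and 1 can be explored safely.
std::size_t OptimizedLength(unsigned char rawByte) {
    // Only correct if rawByte is 0 or 1; any other value gives a wrong length.
    // The subtraction is done in std::size_t, so it wraps instead of going negative.
    return static_cast<std::size_t>(5) - rawByte;
}
```

With a raw byte of 0 or 1 this yields 5 or 4, matching strlen("false") and strlen("true"). With a stray value such as 255, the unsigned subtraction wraps around to a length far larger than the 16-byte destBuffer, which is consistent with memcpy running out of bounds and crashing.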
c++ undefined-behavior
5
I'm pretty sure that if you call this function with an uninitialized value, you get UB at the point of the call, long before the function is even entered.
– melpomene
1 hour ago
3
@SidS: the buffer shown here is for demonstration purpose and was in the global space, so it is actually zero-filled (null-terminated) automatically.
– Remz
1 hour ago
2
How did you pass a bool into this function without initializing it? I feel like that should have raised all sorts of compiler warnings at the very least. Any non-bool passed in would be converted to normal format (0 or 1), so the only way to do this would be explicitly leaving the bool uninitialized in the caller (should warn) or doing terrible things to force an invalid value into the bool in the caller, e.g. memset(&mybool, 0xFF, sizeof mybool), which would force an invalid bit pattern into mybool, while the compiler would still believe it doesn't need to normalize it.
– ShadowRanger
1 hour ago
4
It's a great question. It's a solid illustration of how undefined behavior isn't just a theoretical concern. When people say anything can happen as a result of UB, that "anything" can really be quite surprising. One might assume that undefined behavior still manifests in predictable ways, but these days with modern optimizers that's not at all true. OP took the time to create a MCVE, investigated the problem thoroughly, inspected the disassembly, and asked a clear, straightforward question about it. Couldn't ask for more.
– John Kugelman
59 mins ago
2
@SidS You are right. Sadly the noinline part is required for the undefined behaviour to occur in this simplified example (and in this specific case, clang 5.0.0). I would have preferred not requiring it, but the bool "fixed itself" without this. It is not related to the question/problem however, since the function could have been in a different library.
– Remz
56 mins ago
edited 1 hour ago by Sid S; asked 1 hour ago by Remz (new contributor)
3 Answers
The compiler is allowed to assume that a boolean value passed as an argument is a valid boolean value (i.e. one which has been initialised or converted to true or false). The true value doesn't have to be the same as the integer 1 -- indeed, there can be various representations of true and false -- but the parameter must be some valid representation of one of those two values, where "valid representation" is implementation-defined.
So if you fail to initialise a bool, or if you succeed in overwriting it through some pointer of a different type, then the compiler's assumptions will be wrong and Undefined Behaviour will ensue. You have been warned:
50) Using a bool value in ways described by this International Standard as “undefined”, such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false. (Footnote to para 6 of §6.9.1, Fundamental Types)
The "true value doesn't have to be the same as the integer 1" is kind of misleading. Sure, the actual bit pattern could be something else, but when implicitly converted/promoted (the only way you'd see a value other than true/false), true is always 1, and false is always 0. Of course, such a compiler would also be unable to use the trick this compiler was trying to use (using the fact that a bool's actual bit pattern could only be 0 or 1), so it's kind of irrelevant to the OP's problem.
– ShadowRanger
54 mins ago
@ShadowRanger You can always inspect the object representation directly.
– T.C.
50 mins ago
@shadowranger: my point is that the implementation is in charge. If it limits valid representations of true to the bit pattern 1, that's its prerogative. If it chooses some other set of representations, then it indeed could not use the optimisation noted here. If it does choose that particular representation, then it can. It only needs to be internally consistent. You can examine the representation of a bool by copying it into a byte array; that is not UB (but it is implementation-defined)
– rici
34 mins ago
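The byte-array inspection mentioned in the comment above could be sketched roughly as follows. The names BoolBytes and ObjectRepresentation are hypothetical, and the bytes you get back are implementation-defined, so treat this as an illustration rather than a portable guarantee of any particular value.

```cpp
#include <array>
#include <cstring>

// Hypothetical alias: one unsigned char per byte of a bool object.
using BoolBytes = std::array<unsigned char, sizeof(bool)>;

// Hypothetical helper: copies the object representation of a bool into a byte
// array. The parameter is a reference so that the bool's value is never read
// as a bool; only its bytes are copied.
BoolBytes ObjectRepresentation(const bool& value) {
    BoolBytes bytes{};
    std::memcpy(bytes.data(), &value, sizeof value);
    return bytes;
}
```

On common platforms one would expect a single byte holding 0 for false and 1 for true, but the only thing this sketch relies on is that the two representations differ. Printing these bytes in the caller is one way to check whether a suspect bool holds something other than the patterns the compiler expects.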
A bool is only allowed to hold the values 0 or 1, and the generated code can assume that it will only hold one of these two values. The code generated for the ternary in the assignment could use the value as the index into an array of pointers to the two strings, i.e. it might be converted to something like:
static const char *strings[] = {"false", "true"};
const char *whichString = strings[boolValue];
If boolValue is uninitialized, it could actually hold any integer value, which would then cause an access outside the bounds of the strings array.
@SidS Thanks. Theoretically, the internal representations could be the opposite of how they cast to/from integers, but that would be perverse.
– Barmar
53 mins ago
You are right, and your example will also crash. However it is "visible" to a code review that you are using an uninitialized variable as an index to an array. Also, it would crash even in debug (for example some debugger/compiler will initialize with specific patterns to make it easier to see when it crashes). In my example, the surprising part is that the usage of the bool is invisible: The optimizer decided to use it in a calculation not present in the source code.
– Remz
38 mins ago
@Remz I'm just using the array to show what the generated code could be equivalent to, not suggesting that anyone would actually write that.
– Barmar
34 mins ago
@Remz Recast the bool to int with *(int *)&boolValue and print it for debugging purposes, see if it is anything other than 0 or 1 when it crashes. If that's the case, it pretty much confirms the theory that the compiler is optimizing the inline-if as an array which explains why it is crashing.
– Havenard
6 mins ago
The function itself is correct, but in your test program, the statement that calls the function causes undefined behaviour by using the value of an uninitialized variable.
The bug is in the calling function, and it could be detected by code review or static analysis of the calling function. Using your compiler explorer link, the gcc 8.2 compiler does detect the bug. (Maybe you could file a bug report against clang that it doesn't find the problem).
Undefined behaviour means anything can happen, which includes the program crashing a few lines after the event that triggered the undefined behaviour.
NB. The answer to "Can undefined behaviour cause _____ ?" is always "Yes". That's literally the definition of undefined behaviour.
answered 50 mins ago by M.M
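Since the bug is in the caller, one possible fix is to make sure the member always holds a valid value after construction. The sketch below uses a C++11 default member initializer; FStructFixed, initializedBool and ReadFlag are hypothetical names for illustration and are not part of the question's code.

```cpp
// Variant of the question's struct with a default member initializer, so the
// member holds a valid value even though the constructor body is empty.
struct FStructFixed {
    bool initializedBool = false;
    FStructFixed() {}
};

// Hypothetical caller: reading the member is now well-defined.
bool ReadFlag() {
    FStructFixed structInstance;
    return structInstance.initializedBool;
}
```

With the member initialized, passing it to a function like Serialize no longer reads an indeterminate value, so the optimizer's assumption that the bool is valid actually holds.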
add a comment |
add a comment |
The compiler is allowed to assume that a boolean value passed as an argument is a valid boolean value (i.e. one which has been initialised or converted to true
or false
). The true
value doesn't have to be the same as the integer 1 -- indeed, there can be various representations of true
and false
-- but the parameter must be some valid representation of one of those two values, where "valid representation" is implementation-defined.
So if you fail to initialise a bool
, or if you succeed in overwriting it through some pointer of a different type, then the compiler's assumptions will be wrong and Undefined Behaviour will ensue. You had been warned:
50) Using a bool value in ways described by this International Standard as “undefined”, such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false. (Footnote to para 6 of §6.9.1, Fundamental Types)
edited 30 mins ago
answered 1 hour ago
rici
The "true value doesn't have to be the same as the integer 1" is kind of misleading. Sure, the actual bit pattern could be something else, but when implicitly converted/promoted (the only way you'd see a value other than true/false), true is always 1, and false is always 0. Of course, such a compiler would also be unable to use the trick this compiler was trying to use (using the fact that bool's actual bit pattern could only be 0 or 1), so it's kind of irrelevant to the OP's problem.
– ShadowRanger
54 mins ago
@ShadowRanger You can always inspect the object representation directly.
– T.C.
50 mins ago
@shadowranger: my point is that the implementation is in charge. If it limits valid representations of true to the bit pattern 1, that's its prerogative. If it chooses some other set of representations, then it indeed could not use the optimisation noted here. If it does choose that particular representation, then it can. It only needs to be internally consistent. You can examine the representation of a bool by copying it into a byte array; that is not UB (but it is implementation-defined)
– rici
34 mins ago
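The byte-array inspection mentioned in the comment above could look roughly like this. It is a sketch only: BoolBytes, ObjectBytes and DemoDistinctRepresentations are hypothetical names, and the actual byte values you see are implementation-defined.

```cpp
#include <array>
#include <cassert>
#include <cstring>

// One unsigned char per byte of a bool object (sizeof(bool) is
// implementation-defined, commonly 1).
using BoolBytes = std::array<unsigned char, sizeof(bool)>;

// Copies the object representation of an initialized bool into a byte array.
// Using memcpy into unsigned char avoids reinterpreting the bool as another type.
inline BoolBytes ObjectBytes(const bool& value) {
    BoolBytes bytes{};
    std::memcpy(bytes.data(), &value, sizeof(bool));
    return bytes;
}

// true and false are distinct values, so their representations must differ.
inline void DemoDistinctRepresentations() {
    assert(ObjectBytes(true) != ObjectBytes(false));
}
```

Printing the bytes (cast to unsigned int) shows which representation your implementation uses for each value.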
A bool is only allowed to hold the values 0 or 1, and the generated code can assume that it will only hold one of these two values. The code generated for the ternary in the assignment could use the value as the index into an array of pointers to the two strings, i.e. it might be converted to something like:
// Note: strings must be declared as an array of pointers
static const char *strings[] = {"false", "true"};
const char *whichString = strings[boolValue];
If boolValue is uninitialized, it could actually hold any integer value, which would then cause an access outside the bounds of the strings array.
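Here is a hedged sketch of that table lookup, written so it can be run safely. LookupLikeOptimizer is a hypothetical name, and the explicit bounds check is my addition: it marks the spot where real generated code, trusting the bool to be 0 or 1, would simply read past the end of the table. This models the idea only; it is not what any particular compiler emits.

```cpp
#include <cassert>
#include <cstddef>

// Takes the raw byte stored in the bool's memory and performs the same
// table lookup the optimized ternary might perform.
inline const char* LookupLikeOptimizer(unsigned char rawByte) {
    static const char* const strings[] = {"false", "true"};
    constexpr std::size_t count = sizeof(strings) / sizeof(strings[0]);

    // Generated code has no such check: any rawByte >= 2 would index
    // outside the table and read an arbitrary pointer.
    if (rawByte >= count) {
        return nullptr;
    }

    const char* const selected = strings[rawByte];
    assert(selected != nullptr); // both entries are string literals
    return selected;
}
```

With a valid bool (raw byte 0 or 1) the lookup selects "false" or "true"; with a garbage byte such as 0xFF the sketch returns nullptr where the unchecked version would go out of bounds.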
edited 55 mins ago
answered 1 hour ago
Barmar
@SidS Thanks. Theoretically, the internal representations could be the opposite of how they cast to/from integers, but that would be perverse.
– Barmar
53 mins ago
You are right, and your example will also crash. However it is "visible" to a code review that you are using an uninitialized variable as an index to an array. Also, it would crash even in debug (for example some debugger/compiler will initialize with specific patterns to make it easier to see when it crashes). In my example, the surprising part is that the usage of the bool is invisible: The optimizer decided to use it in a calculation not present in the source code.
– Remz
38 mins ago
@Remz I'm just using the array to show what the generated code could be equivalent to, not suggesting that anyone would actually write that.
– Barmar
34 mins ago
@Remz Recast the bool to int with *(int *)&boolValue and print it for debugging purposes, see if it is anything other than 0 or 1 when it crashes. If that's the case, it pretty much confirms the theory that the compiler is optimizing the inline-if as an array which explains why it is crashing.
– Havenard
6 mins ago
5
I'm pretty sure that if you call this function with an uninitialized value, you get UB at the point of the call, long before the function is even entered.
– melpomene
1 hour ago
3
@SidS: the buffer shown here is for demonstration purpose and was in the global space, so it is actually zero-filled (null-terminated) automatically.
– Remz
1 hour ago
2
How did you pass a bool into this function without initializing it? I feel like that should have raised all sorts of compiler warnings at the very least. Any non-bool passed in would be converted to normal format (0 or 1), so the only way to do this would be explicitly leaving the bool uninitialized in the caller (should warn) or doing terrible things to force an invalid value into the bool in the caller, e.g. memset(&mybool, 0xFF, sizeof mybool), which would force an invalid bit pattern into mybool, while the compiler would still believe it doesn't need to normalize it.
– ShadowRanger
1 hour ago
4
It's a great question. It's a solid illustration of how undefined behavior isn't just a theoretical concern. When people say anything can happen as a result of UB, that "anything" can really be quite surprising. One might assume that undefined behavior still manifests in predictable ways, but these days with modern optimizers that's not at all true. OP took the time to create a MCVE, investigated the problem thoroughly, inspected the disassembly, and asked a clear, straightforward question about it. Couldn't ask for more.
– John Kugelman
59 mins ago
2
@SidS You are right. Sadly the noinline part is required for the undefined behaviour to occur in this simplified example (and in this specific case, clang 5.0.0). I would have preferred not to require it, but the bool "fixed itself" without it. It is not related to the question/problem however, since the function could have been in a different library.
– Remz
56 mins ago