Skip to content

Unreachable else best practice #18

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
lukpueh opened this issue Jan 10, 2018 · 12 comments
Closed

Unreachable else best practice #18

lukpueh opened this issue Jan 10, 2018 · 12 comments

Comments

@lukpueh
Copy link
Member

lukpueh commented Jan 10, 2018

I was wondering how others feel about mutually exclusive conditionals, in a case like this:

Let's say we raise an exception if two conditions are not mutually exclusive and subsequently do different things depending on the conditions, which of the following alternatives do you find most readable? Note that the pragmas are necessary to prevent unittest code coverage decrease (if in place).

(1)

if not xor(condition1, condition2): 
  raise

if condition1:
  do1()
elif condition2:
  do2()
else: # pragma: no cover
  raise

(2)

if not xor(condition1, condition2): 
  raise

if condition1:
  do1()
elif condition2: # pragma: no branch
  do2()

(3)

if not xor(condition1, condition2):
  raise

if condition1:
  do1()
else: # condition2
 do2()

(4)

if not xor(condition1, condition2):
  raise

if condition1:
  do1()
if condition2:
  do2()

The code style guidelines currently recommend to use (1) and to not use (3), however the example in the guidelines does not consider an initial check for mutual exclusiveness.

@awwad
Copy link
Contributor

awwad commented Jan 10, 2018

The guidelines do seem to recommend against (3), by saying that normal logic shouldn't be in else clauses (only exceptions).

I think I've tended to use (3) though, with a comment included at the else, and tend to test things like mutually exclusive parameters early as well.

Each option has issues, I think.... I'd be eager to know what the community thinks is the best practice as well.

@vladimir-v-diaz
Copy link
Contributor

I wonder if there is much benefit to the initial check for mutual exclusiveness. If there is not a lot code between the initial check and the if-elif-else block, the example is much easier to read as follows:

if value1:
  do1()

elif value2:
  do2()

else:
  raise

Our guidelines doc does favor readability and security over performance.

I tend to use (1), but the initial check often tries to validate more of the variable's content. In the future I might try the following:

def foo(value1, value2=None, value3=None):
   validate(value1)
   
   if value2:
     validate(value2)
     bar(value2, value1)

   elif value3:
      validate(value3)
      baz(value3, value2)

   else:
      raise

Let me know what you think.

@lukpueh
Copy link
Member Author

lukpueh commented Jan 10, 2018

@vladimir-v-diaz, note that your first example does not check for mutual exclusiveness. I was curious about cases, where I have two mutually exclusive conditions, which I want to handle in a readable way (if/elif).

@vladimir-v-diaz
Copy link
Contributor

That's the point I was trying to make, that you should consider removing the initial check for mutual exclusiveness. The mutual check should be contained in the if-elif.

@lukpueh
Copy link
Member Author

lukpueh commented Jan 10, 2018

if value1: do1()
elif value2: do2()
else: raise

Does not check for mutually exclusiveness.

Anyway, I was talking about cases, where removing the initial check is not an option.

@vladimir-v-diaz
Copy link
Contributor

Instead of...

if not xor(condition1, condition2): 
  raise
...

if condition1:
  do1()
elif condition2:
  do2()
else: # pragma: no cover
  raise

You can try...

if condition1 and condition2: 
  raise InputError("condition1 and condition2 are mutually exclusive.")

elif condition1:
  do1()

elif condition2:
  do2()

else:
  raise InputError("Unknown condition.")

I think linking to the specific in-toto example that you wish to refactor might help. It might just be a function that is trying to accept too many independent arguments.

@lukpueh lukpueh changed the title Mutually exclusive conditionals best practice Unreachable else best practice Nov 18, 2020
@lukpueh
Copy link
Member Author

lukpueh commented Nov 18, 2020

I would like to revive this discussion, because I'm really curious what others think is the best solution.

Maybe I tried to keep my question too abstract, and got to caught-up with mutual exclusiveness, which is not the core of the problem. The question is really about very specific code constructs that I've seen a lot in secure systems lab code, that is:

How to avoid/handle a situation where an early input validation makes the default case (else) in a subsequent input-dispatching switch (if/elif/else) unreachable? E.g. like so:

def sign(data, key)
  if not is_supported_keytype(key["type"]):
    raise UnsupportedKeyType("must be 'rsa' or 'ed25519'")

  if key["type"] == "rsa":
    rsa_sign(data, key)

  elif key["type"] == "ed25519":
    ed25519_sign(data, key)

  else: # FIXME: Duplicate code that will never be executed
    raise UnsupportedKeyType("must be 'rsa' or 'ed25519'")

I listed some alternatives how to handle this above in the PR description. Another alternative would be to just not perform the initial input validation, if there is a subsequent switch that does the same work.

Any opinions? (cc @mnm678, @adityasaky, @SantiagoTorres, @joshuagl, @sechkova, @jku, @MVrachev)

@SantiagoTorres
Copy link

So it seems the crux of the earlier discussion was whether there are cases in which the if and the else need both be executed (i.e., they are not mutually exclusive). I'm finding it hard to reason of a case in the code in which we've done (1) where we should've done (4). To me, this seems to go beyond code guidelines and is instead a bug (code does x when it should do y).

That aside, I'm personally more comfortable with (1) and perhaps even more with (3), for the following reasons:

  • it is easier to read
  • it is friendlier to code extension. For example, say we wanted to add nistp384, it'd be obvious that you can add another clause and be done with it, without having to track the xor clause above.

Again, I see that this is an illustrative example, but I don't think the snippet is doing input validation, but rather looks like a dispatch table, no? What I'm interested in knowing is rather how to draw the line between 1 and 4 (as it seems to me that 4 implies the code is not mutually exclusive, whereas the other doesn't). So, to put what these snippets are doing in words:

  1. Checks for options being available using xor (and also implies mutually exclusive), then adds the dispatch table (if so, we could ignore the xor as you recommend)
  2. The same, but we remove the obvious raise, as we know that the xor clause above will do the check
  3. same as 1 but we move the condition check to a comment, so humans can read it
  4. Same as 2, but we remove the "el" as we know that they are exclusive and thus one will not hit the other.

Now, in the perspective of having a third option added (say, nistp384), how do these examples fare?

  1. checks for options being available with xor --- here, the xor will not be doing what's obvious due to reasons I'll add below --- adds a new elif to the dispatch table.
  2. The same, and won't blow up if it somehow (for the sake of the example), the key is both ed25519 and RSA, but not nistp384. The xor will in fact let things through, yet the if/elif clauses won't let us hit two cases together.
  3. The same, perhaps now the commented clause is "if nistp384" or you "upgrade" ed25519 for historical reasons
  4. Not the same, as the code will execute both clauses, ed25519 and RSA.

Now my issue is that these issues are more related to what we expect the code to do (is this a dispatch table or not), rather than what looks better. This is why I'm somewhat leaning towards (1) and (3): you have atomic units of code that are doing one thing and one thing only:

  1. are the inputs reasonable? (if we have the xor)
  2. Is this input meant to be handled inside of this if? (then we assume no elifs)
  3. Is this an explicit, unexpected case (hell, we can think of that else as an raise NonImplemented("nistp384 will be here in the future").

I hope this adds to the conversation...

@jku
Copy link

jku commented Nov 19, 2020

I'm only commenting on the more explicit case in #18 (comment) -- I don't have an opinion on the more generic original question.

I think the second check (the else branch) can be useful and is not duplicate code. The exception to that might be when sign() and is_supported_keytype() are so tightly bound that the definition of the latter is specifically "does sign() function support this type?".

But the else branch is explicitly not another check for whether a key is supported, it is a check for the programs internal consistency. As such it should not raise UnsupportedKeyType -- we do not want this case to be handled as just an unsupported key in calling code because we know that the program is broken if we end up in the else branch: is_supported_keytype() promised that the key is supported, yet sign() did not handle the case. Maybe NotImplementedError/RuntimeError (or even assert() if we are 101% sure any new key types get tested in the test suite)?

The fact that the else-branch gets no test coverage is an annoyance but that only reflects the nature of Python and the data structures -- the branch might not get tested on this commit but who knows: on the next commit the result may be different...

@adityasaky
Copy link
Member

I'm not a fan of option 3 for something like sign. I try to be explicit about the different branches, so I'd opt for an elif for each key type (apart from the one in if of course), rather than use the else to catch the last one. More than something that can be caught during execution, I think it's a sanity check of sorts for me as the dev where I may have added a new key type in one place but not another, and it'd be a bit of a head scratcher if something seemed to work but not quite because it applied the logic of another key type. Perhaps this isn't likely with keys in particular, but I think it could apply in the general sense for these things? I like option 1 the most here even though it's a bit redundant in action. I also agree that ending up in the else branch is a sign to the devs that there's a failure in logic rather than just a lack of support of a particular key type, so maybe we don't want to use the same exception.

@MVrachev
Copy link

MVrachev commented Nov 20, 2020

In short: for boolean checks (or checks for which we are sure there are and will be only 2 choices) I would prefer 3.
For every other case, I would go with 2.

@lukpueh
Copy link
Member Author

lukpueh commented Dec 1, 2020

Thanks a ton for taking the time to comment, @SantiagoTorres, @jku, @adityasaky and @MVrachev!

I'm finding it hard to reason of a case in the code in which we've done (1) where we should've done (4).

Agreed. (4) was a bit of an anti-example.

For me the key take-away here is that the else branch in the dispatch table is not another input check, but checks the programs internal consistency, which does have merit, but should be handled differently.
(E.g. in in-toto-golang we return an error from input validation but panic in the supposedly unreachable else branch, see in-toto/in-toto-golang#56 (comment) pp.)

The reason why I brought this up was because sslab code often is overly defensive. That together with many internal layers of abstractions, where none trusts the other, makes it very hard to maintain the code.

Long story short, I don't think we need to add this to a style guide, but maybe make a note in an input validation guide (see theupdateframework/python-tuf#1130). I'll close here and reference there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants