Dark Mode

Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Improve handling nosec for multi-line strings#915

Merged
sigmavirus24 merged 1 commit intoPyCQA:mainfrom
kfrydel:issue_880
Feb 27, 2023
Merged

Improve handling nosec for multi-line strings#915
sigmavirus24 merged 1 commit intoPyCQA:mainfrom
kfrydel:issue_880

Conversation

Copy link
Contributor

kfrydel commented Jun 28, 2022

This commit improves handling nosecs
in multi-line strings, like:

1. nosec_not_working = f"""
2. SELECT * FROM {table}
3. """ # nosec

Before this change, bandit was checking if there is
a nosec in line 1. Now, in the case of ast Constants
it searches for nosec in the last line of the expression.

This commit also moves detecting nosec without test number
to test phase from "pre-visit" phase.
It may increase the time of performing checks but avoids
counting the same nosec mark multiple times.
"pre-visit" phase is run separately for each part of multi-line
string split by FormattedValue items. Thus for the above example,
it would be run twice, the first time for "\n SELECT * FROM "
and the second time for "\n" making the nosec being counted twice.

Resolves: #880

dybi and GeyseR reacted with thumbs up emoji jameshalsall, daniel-kun, and msdcanderson reacted with eyes emoji
kfrydel requested review from ericwb, lukehinds and sigmavirus24 as code owners June 28, 2022 07:30
ericwb reviewed Jul 7, 2022
Copy link
Member

ericwb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think using the last line of linerange will fail for Python 3.7. Currently Bandit incorrectly reports the last line number for Py37. See issue #820

import logging
import os.path
import sys
from _ast import Constant
Copy link
Member

ericwb Jul 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use ast.Constant?

Copy link
Contributor Author

kfrydel Jul 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, it should be ast. Fixed.

kfrydel force-pushed the issue_880 branch 2 times, most recently from 06b7c6a to 80b337d Compare July 8, 2022 13:34
Copy link
Contributor Author

kfrydel commented Jul 8, 2022

So I think using the last line of linerange will fail for Python 3.7. Currently Bandit incorrectly reports the last line number for Py37. See issue #820

Thank you for taking a look at my changes.
I'm not sure if I understand. Do you mean that my changes are not going to work in Python 3.7?
get_nosec function returns context['lineno'] for strings in Python 3.7. Why? Because in the case of Python 3.7 strings are parsed to ast.Str. In the case of Python 3.10 they are parsed to ast.Constant. It looks like ast.Constant is used from 3.8 according to https://docs.python.org/3/library/ast.html:

Changed in version 3.8: Class ast.Constant is now used for all constants.

I did the following test:
file: string_examples.py

a = "string 1"
b = f"f-string {a}"
c = """multi-line
string
1"""
d = f"""multi-line
f-string
{a}"""

file: ast_example.py

import ast

with open('string_examples.py', 'r') as f:
ast_tree = ast.parse(f.read())

for node in ast_tree.body:
print(ast.dump(node))

And the output is:

  1. python 3.7:
Assign(targets=[Name(id='a', ctx=Store())], value=Str(s='string 1'))
Assign(targets=[Name(id='b', ctx=Store())], value=JoinedStr(values=[Str(s='f-string '), FormattedValue(value=Name(id='a', ctx=Load()), conversion=-1, format_spec=None)]))
Assign(targets=[Name(id='c', ctx=Store())], value=Str(s='multi-line\nstring\n1'))
Assign(targets=[Name(id='d', ctx=Store())], value=JoinedStr(values=[Str(s='multi-line\nf-string\n'), FormattedValue(value=Name(id='a', ctx=Load()), conversion=-1, format_spec=None)]))
  1. python 3.8:
Assign(targets=[Name(id='a', ctx=Store())], value=Constant(value='string 1', kind=None), type_comment=None)
Assign(targets=[Name(id='b', ctx=Store())], value=JoinedStr(values=[Constant(value='f-string ', kind=None), FormattedValue(value=Name(id='a', ctx=Load()), conversion=-1, format_spec=None)]), type_comment=None)
Assign(targets=[Name(id='c', ctx=Store())], value=Constant(value='multi-line\nstring\n1', kind=None), type_comment=None)
Assign(targets=[Name(id='d', ctx=Store())], value=JoinedStr(values=[Constant(value='multi-line\nf-string \n', kind=None), FormattedValue(value=Name(id='a', ctx=Load()), conversion=-1, format_spec=None)]), type_comment=None)
  1. python 3.10:
Assign(targets=[Name(id='a', ctx=Store())], value=Constant(value='string 1'))
Assign(targets=[Name(id='b', ctx=Store())], value=JoinedStr(values=[Constant(value='f-string '), FormattedValue(value=Name(id='a', ctx=Load()), conversion=-1)]))
Assign(targets=[Name(id='c', ctx=Store())], value=Constant(value='multi-line\nstring\n1'))
Assign(targets=[Name(id='d', ctx=Store())], value=JoinedStr(values=[Constant(value='multi-line\nf-string \n'), FormattedValue(value=Name(id='a', ctx=Load()), conversion=-1)]))

Copy link

frenzymadness commented Aug 10, 2022

Thank you for working on this. This also fixes #658 reported in December 2020.

kfrydel requested a review from ericwb August 18, 2022 12:52
kfrydel force-pushed the issue_880 branch from 80b337d to dccf635 Compare August 23, 2022 06:58
Copy link
Contributor Author

kfrydel commented Aug 23, 2022

Hi @ericwb , would you find a while to take a look at this pull request? If you think it should be fixed in another way, do not hesitate to say so. I'm willing to spend some more time on this if needed.

dybi reacted with thumbs up emoji

kfrydel force-pushed the issue_880 branch from dccf635 to 97789a3 Compare August 29, 2022 06:43
Copy link

sshishov commented Jan 31, 2023

Hello guys, can we expect this PR to be merged some time soon?

sigmavirus24 reacted with thumbs down emoji

kfrydel force-pushed the issue_880 branch from 97789a3 to 24e1359 Compare February 13, 2023 09:16
kfrydel force-pushed the issue_880 branch from 24e1359 to f73726b Compare February 24, 2023 12:32
Copy link
Member

sigmavirus24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we include a test for implicit concatenation over multiple lines and the behavior for those cases? I'm not sure how python will report the lineno on 3.7 or the range on 3.8+. Also, uncertain where on those we should expect to see a nosec

kfrydel mentioned this pull request Feb 24, 2023
This commit improves handling nosecs
in multi-line strings, like:

1. nosec_not_working = f"""
2. SELECT * FROM {table}
3. """ # nosec

Before this change, bandit was checking if there is
a nosec in line 1. Now, it searches for nosec in all
lines of the expression.

In python 3.7, linerange for a multiline expression is sqeezed to
first line. Thus, if nosec is set in the second or further line
then it is not taken into account by bandit.

This commit also moves detecting nosec without test number
to test phase from "pre-visit" phase.
It may increase the time of performing checks but avoids
counting the same nosec mark multiple times.
"pre-visit" phase is run separately for each part of multi-line
string split by FormattedValue items. Thus for the above example,
it would be run twice, the first time for "\n SELECT * FROM "
and the second time for "\n" making the nosec being counted twice.

Resolves: PyCQA#880
kfrydel force-pushed the issue_880 branch from f73726b to 1a27530 Compare February 27, 2023 11:59
Copy link
Contributor Author

kfrydel commented Feb 27, 2023

I added tests for implicit concatenation cases and it turned out that my solution didn't work. I changed the approach and now I search for nosec in every line reported in linerange property. I also abandoned using lineno property and always use linerange so the code became a bit simpler.
nosec can be now placed at any line of "implicit concatenation" expression.
Except python 3.7. Indeed, as @ericwb said, we do not have the last line of the analyzed expression thus in the case of python 3.7 nosec has to be at the first line of the "implicit concatenation" expression. And to be honest, I have no idea on how to fix this.

In the case of a multiline string, as shown below, nosec is detected when it is placed at the last line:

f"""
SELECT *
FROM foo
WHERE id = {identifier}
""" # nosec

independently from the python version.

kfrydel requested review from sigmavirus24 and removed request for ericwb February 27, 2023 12:10
sigmavirus24 approved these changes Feb 27, 2023
sigmavirus24 merged commit 72fa5a7 into PyCQA:main Feb 27, 2023
Copy link
Contributor Author

kfrydel commented Feb 27, 2023

@sigmavirus24 Thank you!

Copy link

JRemitz commented Mar 10, 2023 *
edited
Loading

Not sure but this change appears to have broken our Python 3.7 runs. We had tests getting skipped and now none a portion of them are. Still investigating root cause.

Copy link
Contributor Author

kfrydel commented Mar 10, 2023

@JRemitz Please let me know if you have examples that should be skipped and are not.

Copy link

JRemitz commented Mar 10, 2023

Definitely and I think I'm eating my words... we had a reduction in "skipped" tests, but the new ones were not previously skipped, they were two of the new rules that were added. I am able to correlate those, I just need to dig up why the skipped tests count dropped. I'll let you know if that has any relationship to this change. Apologies for the noise!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Reviewers

ericwb ericwb left review comments

sigmavirus24 sigmavirus24 approved these changes

lukehinds Awaiting requested review from lukehinds lukehinds is a code owner

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

#nosec doesn't work with multi-line strings and Python 3.10

6 participants