Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CommunityBridge Project by Yash Varshney #148

Merged
merged 2 commits into from
Oct 23, 2020
Merged

Conversation

Yash-Varshney
Copy link
Contributor

@Yash-Varshney Yash-Varshney commented Oct 7, 2020

Hello Everyone, I am Yash Varshney and this P.R. is regarding the community bridge mentorship programme. This pr contains all the four task done by me, previously I have made the p.r. for two tasks but this one contains all the task together. This project updates the python tool with v2.2 (although not completely) and introduces a CLI tool for parser and convertor. I have and will be raising some issues regarding future work needed to do. The next task will be to update the python tool with v3.x of python.
It was a great learning experience for me for the last 3 months and I will further be working with spdx-community.
This P.R. is open for review.

  • Passing all the tests.

  • Review needed.

Signed-off-by: Yash Varshney b18038@students.iitmandi.ac.in

Sorry, something went wrong.

…sh Varshney as a part of CommunityBridgeMentorship Programme. The main feature introduce is the CLI-tool for python parser and convertor. Apart from this, Relationship Class is added in python tool, all the other classes have been updated wrt to v2.2 of spdx specs (some issues have also been raised), attribution text has been added to file,package and snippet class as well. BLACK is used for formatting the whole tool.

Signed-off-by: Yash Varshney <b18038@students.iitmandi.ac.in>
@Yash-Varshney
Copy link
Contributor Author

ping : @rtgdk @goneall

@Yash-Varshney
Copy link
Contributor Author

Also raising this issue. A lot of points in this issue have been achieved but some are remaining, I will be raising issues for them ASAP.

@goneall goneall requested a review from pombredanne October 7, 2020 22:20
@goneall
Copy link
Member

goneall commented Oct 7, 2020

@Yash-Varshney Thanks for the contributions and the PR!

@zvr, @rtgdk & @pombredanne - if you could review and let me know if your OK with me merging.

Copy link

@RishabhBhatnagar RishabhBhatnagar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, good work. Thank You for contributing to the SPDX community.
Although, I had a hard time scanning the significant changes because of petty format changes like changing single-quote to double-quote, formatting spaces, adding lines, etcetera.

If possible, segregate changes into two commits. One for significant changes and the other for formatting.

if __name__ == '__main__':
import sys

def parse_JSON(file):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since python is a dynamically-typed language, it is harder to determine what should be the type of input argument.
In my perspective, the file variable might be a file object.

But upon seeing the usage of the variable file, seems like it is a path string.
Either the variable should be renamed for unambiguity or docstring should specify the data-type of the file variable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your feedback Rishabh, I feel that file variable does more justice as it literally means the address of the path where the file is.

if __name__ == "__main__":
import sys

file = sys.argv[1]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will raise an IndexError if the filename is not provided in the cli input.
It should rather inform the user about the usage when the filename is not provided while running the script.

Do these changes for other example files as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RishabhBhatnagar this parse_json (and all other files) are used for the cli_tool only. It actually doesn't make sense to use them without input files.
Moreover, due to the introduction of cli_tool, these files won't be used .
Thanks

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. No user will make an error consciously. If he/she ends up not providing input, the program must terminate gracefully with an error message describing the usage.
    Check out any of the examples in tools-golang/examples directory. Missing input describes the usage and terminates.
    One more example: mv file_name when run in any unix-terminal will terminate stating "missing destination file".

  2. Due to the introduction of cli_tool, which files won't be used?

Note: this is not a deal-breaker.

elif outfile.endswith(".spdx"):
outfile_format = "tag"
elif outfile.endswith(".rdf.xml"):
outfile_format = "rdf"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unreachable statement. This case is consumed by the case when the outfile ends in ".xml"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This statement is basically to show the support for rdf.xml file to a reader. Ofc, it is unreachable but it actually tells the format supported :)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

subsuming case is when the outfile ends in ".xml".
If the outfile ends in ".xml", outfile_format is "xml"
If the outfile ends in ".rdf.xml", the expected outfile_format according to the code is "rdf" but the subsuming case will set the outfile_format to "xml".

This must be corrected.

for relation in doc.relationships:
print("\tRelationship: {0}".format(relation.relationship))
try:
print("\tRelationship: {0}".format(relation.comment))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AttributeError: Relationship object won't have any member called "comment"
Relationship.relationship_comment is the attribute you should use.

Moreover, rather than EAFP, you can use LBYL as the Relationship class exposes a function(has_comment) to check if a comment is set.

"INPUT FILETYPE NOT SUPPORTED. (only RDF and TAG format supported)"
)

if outfile.endswith(".rdf"):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Optional] compress the if elif else ladder by the number of outfile_formats we can have.

For example:

    if outfile.endswith((".rdf", ".rdf.xml")):
        outfile_format = "rdf"
    elif outfile.endswith((".tag", ".spdx")):
        outfile_format = "tag"
    ....


def rdf_to_json(infile, outfile):
infile = str(infile)
outfile = str(outfile)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant typecasts of infile and outfile to string. Input variables are already strings (Correct me if I am wrong).
This operation is performed in all the subsequent functions in this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

input variables can be anything (mostly strings ofc) but as a sanity check I deliberately converted it to string because if somehow there is a file say 12345.spdx and the user wants to check it, it will break the code.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if the file name is "12345.spdx", the code won't break.

print(os.strerror(e.errno))


def rdf_to_tag(infile, outfile):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant functions with name *_to_* .

You're calling them via dynamic access to global variables. You can directly call the intended function rather than redirecting through a custom defined function.

For example:
instead of calling rdf_to_tag, you can directly call RDF_TO_TAG function.

This will reduce the script size significantly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, thanks. It could be improved further :)


@property
def relatedspdxelement(self):
return self.relationship.split(" ")[2]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than properties splitting the member variables every time and returning nth part,

It is preferable if you set these properties in the constructor itself, as the members (relationship and relationship_comment) are static for an object.
This will also block creation of an invalid Relationship which don't have enough parts (refA, relationship_type and, refB).

To use : run `parser` using terminal or run `parser --file <file name>`
"""
if file.endswith(".rdf"):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Optional] This if-elif-else ladder can be compressed too.

@rtgdk
Copy link
Collaborator

rtgdk commented Oct 19, 2020

@Yash-Varshney @goneall I tried it out. And it works fine. I couldn't review the whole code since it takes too much time to review the real changes.

@Yash-Varshney Can you please respond to the comments made by @RishabhBhatnagar regarding the code structure? Overall, things look okay to merge if we have approvals from main reviewers.

@zvr
Copy link
Member

zvr commented Oct 20, 2020

I'm also OK with merging this.
As I wrote in an email, I am not concerned with it being a single large PR: we had explicitly asked the author to work on his own branch for all of the three months and submit a final PR at the end. This is not an incremental small fix; it's a push of a large body of work implementing new functionality.

Of course, there are things that can/should be changed, but I propose we merge this and further work be done on the merged single base afterwards.

@Yash-Varshney
Copy link
Contributor Author

Thanks, @rtgdk @zvr @goneall @RishabhBhatnagar for reviewing. Ik that due to formatting there are some redundant additions but it was needed to format the whole code once so that it won't be a problem in future.
There are some small changes suggested by @RishabhBhatnagar, but I also believe that the code is working fine and it should be merged first and then few additions can be made.
Looking forward, I am thinking of adding a test for checking the use of BLACK formatter.
Thank you :)

@goneall
Copy link
Member

goneall commented Oct 21, 2020

@Yash-Varshney The PR is showing that the branch is out of date with the base branch - can you update the branch and verify that it is still working.

@RishabhBhatnagar If your OK merging in its current state, can you create a separate issue for the outstanding issues and approve the PR?

@RishabhBhatnagar
Copy link

@RishabhBhatnagar If your OK merging in its current state, can you create a separate issue for the outstanding issues and approve the PR?

@goneall, If you say so, I'll approve the PR. But I am not in the favor of merging it as I couldn't run the convertor for any of the described ways in the readme file.

Anyways, my review won't affect the merging of this PR as I am not a maintainer for this repository.

Copy link
Member

@goneall goneall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although there are some good suggestions from @RishabhBhatnagar still open, I have reviewed the code and tried out some of the functions which worked OK for me.

I'll go ahead and merge it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants