-
Notifications
You must be signed in to change notification settings - Fork 137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CommunityBridge Project by Yash Varshney #148
Conversation
…sh Varshney as a part of CommunityBridgeMentorship Programme. The main feature introduce is the CLI-tool for python parser and convertor. Apart from this, Relationship Class is added in python tool, all the other classes have been updated wrt to v2.2 of spdx specs (some issues have also been raised), attribution text has been added to file,package and snippet class as well. BLACK is used for formatting the whole tool. Signed-off-by: Yash Varshney <b18038@students.iitmandi.ac.in>
Also raising this issue. A lot of points in this issue have been achieved but some are remaining, I will be raising issues for them ASAP. |
@Yash-Varshney Thanks for the contributions and the PR! @zvr, @rtgdk & @pombredanne - if you could review and let me know if your OK with me merging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, good work. Thank You for contributing to the SPDX community.
Although, I had a hard time scanning the significant changes because of petty format changes like changing single-quote to double-quote, formatting spaces, adding lines, etcetera.
If possible, segregate changes into two commits. One for significant changes and the other for formatting.
if __name__ == '__main__': | ||
import sys | ||
|
||
def parse_JSON(file): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since python is a dynamically-typed language, it is harder to determine what should be the type of input argument.
In my perspective, the file
variable might be a file object.
But upon seeing the usage of the variable file
, seems like it is a path string.
Either the variable should be renamed for unambiguity or docstring should specify the data-type of the file
variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your feedback Rishabh, I feel that file
variable does more justice as it literally means the address of the path where the file is.
if __name__ == "__main__": | ||
import sys | ||
|
||
file = sys.argv[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will raise an IndexError if the filename is not provided in the cli input.
It should rather inform the user about the usage when the filename is not provided while running the script.
Do these changes for other example files as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RishabhBhatnagar this parse_json (and all other files) are used for the cli_tool only. It actually doesn't make sense to use them without input files.
Moreover, due to the introduction of cli_tool, these files won't be used .
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
No user will make an error consciously. If he/she ends up not providing input, the program must terminate gracefully with an error message describing the usage.
Check out any of the examples in tools-golang/examples directory. Missing input describes the usage and terminates.
One more example:mv file_name
when run in any unix-terminal will terminate stating "missing destination file". -
Due to the introduction of cli_tool, which files won't be used?
Note: this is not a deal-breaker.
elif outfile.endswith(".spdx"): | ||
outfile_format = "tag" | ||
elif outfile.endswith(".rdf.xml"): | ||
outfile_format = "rdf" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unreachable statement. This case is consumed by the case when the outfile ends in ".xml"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This statement is basically to show the support for rdf.xml
file to a reader. Ofc, it is unreachable but it actually tells the format supported :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
subsuming case is when the outfile ends in ".xml".
If the outfile ends in ".xml", outfile_format is "xml"
If the outfile ends in ".rdf.xml", the expected outfile_format according to the code is "rdf" but the subsuming case will set the outfile_format to "xml".
This must be corrected.
for relation in doc.relationships: | ||
print("\tRelationship: {0}".format(relation.relationship)) | ||
try: | ||
print("\tRelationship: {0}".format(relation.comment)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AttributeError: Relationship object won't have any member called "comment"
Relationship.relationship_comment is the attribute you should use.
Moreover, rather than EAFP, you can use LBYL as the Relationship class exposes a function(has_comment) to check if a comment is set.
"INPUT FILETYPE NOT SUPPORTED. (only RDF and TAG format supported)" | ||
) | ||
|
||
if outfile.endswith(".rdf"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Optional] compress the if elif else ladder by the number of outfile_formats we can have.
For example:
if outfile.endswith((".rdf", ".rdf.xml")):
outfile_format = "rdf"
elif outfile.endswith((".tag", ".spdx")):
outfile_format = "tag"
....
|
||
def rdf_to_json(infile, outfile): | ||
infile = str(infile) | ||
outfile = str(outfile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redundant typecasts of infile and outfile to string. Input variables are already strings (Correct me if I am wrong).
This operation is performed in all the subsequent functions in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
input variables can be anything (mostly strings ofc) but as a sanity check I deliberately converted it to string because if somehow there is a file say 12345.spdx
and the user wants to check it, it will break the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if the file name is "12345.spdx", the code won't break.
print(os.strerror(e.errno)) | ||
|
||
|
||
def rdf_to_tag(infile, outfile): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
redundant functions with name *_to_* .
You're calling them via dynamic access to global variables. You can directly call the intended function rather than redirecting through a custom defined function.
For example:
instead of calling rdf_to_tag, you can directly call RDF_TO_TAG function.
This will reduce the script size significantly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, thanks. It could be improved further :)
|
||
@property | ||
def relatedspdxelement(self): | ||
return self.relationship.split(" ")[2] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than properties splitting the member variables every time and returning nth part,
It is preferable if you set these properties in the constructor itself, as the members (relationship and relationship_comment) are static for an object.
This will also block creation of an invalid Relationship which don't have enough parts (refA, relationship_type and, refB).
To use : run `parser` using terminal or run `parser --file <file name>` | ||
""" | ||
if file.endswith(".rdf"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Optional] This if-elif-else ladder can be compressed too.
@Yash-Varshney @goneall I tried it out. And it works fine. I couldn't review the whole code since it takes too much time to review the real changes. @Yash-Varshney Can you please respond to the comments made by @RishabhBhatnagar regarding the code structure? Overall, things look okay to merge if we have approvals from main reviewers. |
I'm also OK with merging this. Of course, there are things that can/should be changed, but I propose we merge this and further work be done on the merged single base afterwards. |
Thanks, @rtgdk @zvr @goneall @RishabhBhatnagar for reviewing. Ik that due to formatting there are some redundant additions but it was needed to format the whole code once so that it won't be a problem in future. |
@Yash-Varshney The PR is showing that the branch is out of date with the base branch - can you update the branch and verify that it is still working. @RishabhBhatnagar If your OK merging in its current state, can you create a separate issue for the outstanding issues and approve the PR? |
@goneall, If you say so, I'll approve the PR. But I am not in the favor of merging it as I couldn't run the convertor for any of the described ways in the readme file. Anyways, my review won't affect the merging of this PR as I am not a maintainer for this repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although there are some good suggestions from @RishabhBhatnagar still open, I have reviewed the code and tried out some of the functions which worked OK for me.
I'll go ahead and merge it.
Hello Everyone, I am Yash Varshney and this P.R. is regarding the community bridge mentorship programme. This pr contains all the four task done by me, previously I have made the p.r. for two tasks but this one contains all the task together. This project updates the python tool with v2.2 (although not completely) and introduces a CLI tool for parser and convertor. I have and will be raising some issues regarding future work needed to do. The next task will be to update the python tool with v3.x of python.
It was a great learning experience for me for the last 3 months and I will further be working with spdx-community.
This P.R. is open for review.
Passing all the tests.
Review needed.
Signed-off-by: Yash Varshney b18038@students.iitmandi.ac.in