
Potential fix for the negation causing any file that's not the negated file to match #10

Merged


@kbendick kbendick commented Nov 4, 2020

I found this solution via this discussion: actions/labeler#101

It seems that when `any` is used, any file that matches ALL of the given glob patterns will be labeled. So the workaround is to put the negated patterns into their own `any` list.

To determine which files might match the negated patterns, I used `find` and checked its output against the existing label globs, like so:

```
$ find . -path '*UI.scala'
./resource-managers/mesos/src/main/scala/org/apache/spark/deploy/mesos/ui/MesosClusterUI.scala
./core/src/main/scala/org/apache/spark/ui/WebUI.scala
./core/src/main/scala/org/apache/spark/ui/SparkUI.scala
./core/src/main/scala/org/apache/spark/deploy/master/ui/MasterWebUI.scala
./core/src/main/scala/org/apache/spark/deploy/worker/ui/WorkerWebUI.scala
./core/src/main/scala/org/apache/spark/internal/config/UI.scala
```

This tells me that, when using the `!**/*UI.scala` glob, we only need to be concerned (at the moment) with matches that would also match the other existing globs for the label, in this case CORE. Here, that means only `/core/**/*`, as it is the only glob in the CORE label that currently contains any `*UI.scala` files.

Admittedly, having to write the config this way is a bit cumbersome. We can either be thorough and use a separate `any` block (a list for each glob plus all of the negations), which future-proofs us a bit more, or we can be conservative and keep the config simpler. The risk of the simpler approach is introducing invalid matches later: if a `*UI.scala` file gets placed under some other directory matched by the label that currently has no `*UI.scala` files, it would match even though that is exactly what we are trying to avoid.

I would vote to keep the config more legible and then open an issue with the maintainers of this GitHub action to see if something can be done.
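For illustration, a hypothetical labeler config following this workaround might look like the sketch below. The label and path names are examples only, not the exact config in this PR:

```yaml
# .github/labeler.yml (sketch)
CORE:
  # Putting the negation in the same `any` list as the positive glob means
  # the label applies only when a changed file matches the positive glob
  # AND is not excluded by the negation.
  - any: ['core/**/*', '!**/*UI.scala']
  # Other globs for the label that contain no *UI.scala files can stay
  # in their own plain entries.
  - 'common/**/*'
```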

### What changes were proposed in this pull request?

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

@kbendick kbendick merged commit 9348623 into master Nov 4, 2020
kbendick pushed a commit that referenced this pull request May 11, 2021
…mand

### What changes were proposed in this pull request?

This PR proposes to sort table properties in DESCRIBE TABLE command. This is consistent with DSv2 command as well:
https://github.com/apache/spark/blob/e3058ba17cb4512537953eb4ded884e24ee93ba2/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DescribeTableExec.scala#L63

This PR fixes the test case in Scala 2.13 build as well where the table properties have different order in the map.
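The idea can be sketched as follows — a minimal, hypothetical rendering helper (not Spark's actual code) that sorts the properties map by key before formatting, so the output no longer depends on the map's iteration order:

```scala
// Hypothetical sketch: render table properties deterministically by
// sorting entries on their key, mirroring what DSv2's DescribeTableExec does.
def renderProperties(props: Map[String, String]): String =
  props.toSeq
    .sortBy { case (key, _) => key }          // deterministic order by key
    .map { case (key, value) => s"$key=$value" }
    .mkString("[", ", ", "]")
```

With this, two maps holding the same entries always render identically, regardless of which Scala version (and thus which `Map` implementation order) built them.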

### Why are the changes needed?

To keep the output deterministic and pretty, and to fix the tests in the Scala 2.13 build.
See https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-3.2-scala-2.13/49/testReport/junit/org.apache.spark.sql/SQLQueryTestSuite/describe_sql/

```
describe.sql
Expected "...spark_catalog, view.[query.out.col.2=c, view.referredTempFunctionsNames=[], view.catalogAndNamespace.part.1=default]]", but got "...spark_catalog, view.[catalogAndNamespace.part.1=default, view.query.out.col.2=c, view.referredTempFunctionsNames=[]]]" Result did not match for query apache#29
DESC FORMATTED v
```

### Does this PR introduce _any_ user-facing change?

Yes, it will change the text output of `DESCRIBE [EXTENDED|FORMATTED] table_name`.
The table properties are now sorted by their keys.

### How was this patch tested?

Related unit tests were fixed accordingly.

Closes apache#30799 from HyukjinKwon/SPARK-33803.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
kbendick pushed a commit that referenced this pull request May 29, 2022
…aceable

### What changes were proposed in this pull request?

This PR uses a manual recursion to replace `RuntimeReplaceable` expressions instead of `transformAllExpressionsWithPruning`. The problem with `transformAllExpressionsWithPruning` is that it automatically makes the replacement expression inherit the function alias name from the parent node, which is quite misleading. For example, for `select date_part('month', c) from t`, the optimized plan in EXPLAIN before this PR is
```
Project [date_part(cast(c#18 as date)) AS date_part(month, c)#19]
+- Relation default.t[c#18] parquet
```
Now it's
```
Project [month(cast(c#9 as date)) AS date_part(month, c)#10]
+- Relation default.t[c#9] parquet
```

### Why are the changes needed?

Fix the misleading EXPLAIN result.

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

A new test was added.

Closes apache#35821 from cloud-fan/follow2.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>