Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixup info plugin's exclusion logic #6874

Merged
merged 1 commit into from Mar 12, 2024

Conversation

kamilkrzyskow
Copy link
Collaborator

@kamilkrzyskow kamilkrzyskow commented Mar 6, 2024

Continuation of #6826 or rather a fixup of my mistake.

After the change from gitignore-like fnmatch to regex it will be better to store it directly in Python.
I decided to return the list from a function to assure a new copy every time. Not that it matters in the setup, since the plugin runs only once, but operating on a global list didn't feel right to me in this context.

I hope the styling is OK, black doesn't like it, and it liked it even less if I omitted the # to add empty space between entries.

I also noticed that when resolving patterns I stripped "/" for directories to only add them later 🤦, so I strip them now for every path. It's not really a bug fix, as in my testing no "path source" returned a path with a slash at the end, so the strip is just for sanity 😄

Changed slash removal from resolved patterns
Reverted build assets gathering dc808ca
Reverted info.gitignore file 79129d5

#
r".*\.DS_Store", # macOS
#
r"^/\.[^/]+$", # .dotfiles in the root directory
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is risky. I can think of one case where this might not work: the awesome pages plugin with the docs_dir set to the root. I think we should remove this line or better specify what to exclude.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK MkDocs doesn't allow to set docs_dir: .. Do you know if there are any plans to change it?
Also awesome-pages looks only within the docs_dir, so root of the project should be fine.
EDIT: I see it is in the plans for 1.6.0 ehh: https://github.com/mkdocs/mkdocs/milestone/18

ERROR   -  Config value 'docs_dir': The 'docs_dir' should not be the parent directory of the config file.
           Use a child directory instead so that the 'docs_dir' is a sibling of the config file.

I also don't think there are any other cases where a .dotfile is used with a plugin. Any files related to theme's plugins are in the docs_dir too, like .meta.yml or .authors.yml. The social layouts are not a .dotfile. So the user would have to deliberately break the "recommended way to to things".

I'll see how feasible the explicit patterns for the files would be 🤔
This repository has 9 .dotfiles, but some of them are related to TypeScript, so perhaps not everything is required, or there are even more "standard" files people use 😵 💫

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it's time to add inclusion_patterns as well 😆

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are at least actively considering it: mkdocs/mkdocs#3519

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't think there are any other cases where a .dotfile is used with a plugin.

We should only exclude files if we are a 100% sure we don't need them for reproductions. Users might put configuration in dotfiles and read them via hooks. Moreover, the path to the authors_file, albeit relative to the docs_dir, can be changed to be located outside of the docs_dir. Same for all other plugin settings that allow to change file locations. Thus, please remove the dotfile rule.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another example is somebody storing Snippets in a dotfile. Dotfiles are only a convention, and Snippets will not care whether your file is called something.md or .awesome-stuffs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept the exclusion only logic, replaced the match function with search to match subdirectories without having to resort to .* at the beginning. I still have to properly test it though 😅, a multilayered project with nested .git repositories. This also kind of forced me to use exact names, because [^/]*lint[^/]* would match a documentation file /To lint or not to lint.md, and adding exclusions for .md didn't seem right...

@squidfunk
Copy link
Owner

squidfunk commented Mar 6, 2024

I hope the styling is OK, black doesn't like it, and it liked it even less if I omitted the # to add empty space between entries.

Please don't use automatic code formatters for now. We'll revisit this topic later this year. However the rouge # also look very weird. This is my whole point about code formatters – they force you to use one style, even though a slightly different formatting conveys meaning and makes reading the code easier.

@alexvoss
Copy link
Sponsor Collaborator

alexvoss commented Mar 9, 2024

I tried to produce a reproduction the other day and found that the info plugin added my whole venv directory, which cause it to then declare that the file was too big.

It turns out that the info plugin does not just pack up a venv that is in the project root directory but also a venv that is elsewhere. (I have one installed globally that I usually use so I don't have to upgrade Insiders in a million individual venvs.) Looks like it was a design decision to pack up some elements of the venv but it does more than it should?

Here is reproduction:
9.5.13+insiders.4.53.1-infovenv.zip

I removed the packed up venv manually, so you need to run a build yourself to see the resulting large .zip file.

@kamilkrzyskow
Copy link
Collaborator Author

kamilkrzyskow commented Mar 9, 2024

I did claim that the user could have 2 venv directories in the project and proposed having another safeguard in the shape of generic names exclusion patterns like venv or .venv etc. However, squidfunk said that's not something to optimise for.
#6826 (comment)

Your case is different, as you don't have 2 vens inside the same directory but one global, one local, and you used the global one.

We currently only exclude the active venv if it's inside the reproduction directory. @alexvoss would a generic name exclusion exclude your local one in this instance?

Similar to the exclusion of the site_dir based on the sitemap.xml we could search for /lib/site-packages/ as I believe this path should be agnostic across operating systems. Actually the folder structure isn't the same across Windows and Linux, but the pyvenv.cfg exists in the venv root directory, so once I check if macOS is the same, I'll follow that route.

@squidfunk
Copy link
Owner

I'm a little confused, as what @alexvoss is saying is that the "venv is located elsewhere". Those directories should not be packed up. What I said was that we should not optimize for the case that two venvs exist inside one project, i.e., a root directory, that is being packed up.

@squidfunk
Copy link
Owner

We should very hard not to go down the path where we have a list where we add name of virtual environment directories. This is completely up to the user and thus, arbitrary, so we would need to keep adding and adding thing to the ignore file + those might catch directories that should be packed up that are not venvs. I'm just saying that it's risky.

@alexvoss
Copy link
Sponsor Collaborator

alexvoss commented Mar 9, 2024

Sorry for the sporadic interaction today. Sorry that I manage to produce the weirdest cases...

For background: because I have a ton of projects over time, I created a single venv in my home directory that has Insiders in it and the plugins that are commonly used. That makes it easier to run the most up-to-date version most of the time. Because I test things with both the public edition and with Insiders, I actually have two such environments, ~/venvs/mkd_insiders and ~/venvs/mkd_public. (You can tell that I have used Anaconda/micromamba in the past.)

  • I run mkdocs build with ~/venvs/mkd_insiders activated. This causes ~/venvs/mkd_insiders to be packed up, including site-packages.

  • When I activate the venv in the project itself, it will be packed up but with the site-packages excluded. That works whatever that environment is called.

  • A second environment in the project root will also be packed up including site-packages.

IMHO, excluding site-packages is the way to go. Being paranoid and having zipped up one too many of these, I would also exclude node_modules.

@kamilkrzyskow
Copy link
Collaborator Author

kamilkrzyskow commented Mar 9, 2024

I run mkdocs build with ~/venvs/mkd_insiders activated. This causes ~/venvs/mkd_insiders to be packed up, including site-packages

So this should not happen, the path should be excluded because it's activated...
Where do you run mkdocs build from? Thanks to you, I found a missing part of the platform.json the CWD path and perhaps the excluded paths😅

When I activate the venv in the project itself, it will be packed up but with the site-packages excluded. That works whatever that environment is called.

Perhaps another finding in my lacking testing, I only tested the behaviour of site.getsitepackages() on Windows and it returns both the site-packages and the root directory of the venv:
['C:\\MyFiles\\PythonVENV', 'C:\\MyFiles\\PythonVENV\\Lib\\site-packages']
but from this description it would seem on macOS it only returns the site-packages, so they're excluded, but the bin directory isn't?

A second environment in the project root will also be packed up including site-packages.

Yes, this was expected, only the activated venv was supposed to be excluded.


EDIT:
So from the platform.json logs, I believe macOS returns the /bin and /site-packages directories and Windows the / and /site-packages directories, booo

  "sys.path": [
    "C:\\MyFiles\\_git\\base-material-info-tests\\13-infoenv",
    "C:\\MyFiles\\Python312\\python312.zip",
    "C:\\MyFiles\\Python312\\DLLs",
    "C:\\MyFiles\\Python312\\Lib",
    "C:\\MyFiles\\Python312",
    "C:\\MyFiles\\PythonVENV",
    "C:\\MyFiles\\PythonVENV\\Lib\\site-packages"
  ]
  "sys.path": [
    "/Users/avoss/venvs/mkd_insiders/bin",
    "/Users/avoss/src/mkdocs-material/reproduce/infovenv",
    "/Users/avoss/src/mkdocs-material/reproduce/infovenv/projects/childrensrights",
    "/Users/avoss/src/mkdocs-material/reproduce/infovenv/projects/bookproposal",
    "/opt/homebrew/Cellar/python@3.12/3.12.2_1/Frameworks/Python.framework/Versions/3.12/lib/python312.zip",
    "/opt/homebrew/Cellar/python@3.12/3.12.2_1/Frameworks/Python.framework/Versions/3.12/lib/python3.12",
    "/opt/homebrew/Cellar/python@3.12/3.12.2_1/Frameworks/Python.framework/Versions/3.12/lib/python3.12/lib-dynload",
    "/Users/avoss/venvs/mkd_insiders/lib/python3.12/site-packages"
  ]

@kamilkrzyskow
Copy link
Collaborator Author

@alexvoss I can also see the paths in the sys.path:

    "/Users/avoss/src/mkdocs-material/reproduce/infovenv",
    "/Users/avoss/src/mkdocs-material/reproduce/infovenv/projects/childrensrights",
    "/Users/avoss/src/mkdocs-material/reproduce/infovenv/projects/bookproposal",

but they are not in the zip provided in the previous comment, are the directories empty or did the script omit them, or did you delete them?

Please run this command from the same directory you packaged up the zip file before:

python -c "import site, sys, os; from pprint import pprint as pp; print(os.getcwd()); pp(site.getsitepackages()); pp(sys.path)"

@alexvoss
Copy link
Sponsor Collaborator

alexvoss commented Mar 9, 2024

paths in the sys.path

Right, these exist simply because I reused a terminal session that had a PYTHONPATH set. If you look hard enough you will probably find my credit card number somewhere ;o) They have nothing to do with that reproduction project. Not sure when they were expanded? The path is .:projects/childrensrights:projects/bookproposal

@kamilkrzyskow
Copy link
Collaborator Author

.:projects/childrensrights:projects/bookproposal are relative paths, and sys.path contains absolute paths

@alexvoss
Copy link
Sponsor Collaborator

alexvoss commented Mar 9, 2024

What I meant to say is that they may look to you to be related to the reproduction when, in fact, they are not. Interpret sys.path with care. What I will take away is that perhaps I need to write wrappers for my builds so I do not need to pollute my environment. In particular, having dot in the PYTHONPATH seems a bit problematic...

@kamilkrzyskow
Copy link
Collaborator Author

kamilkrzyskow commented Mar 9, 2024

Here is the current form of the platform.json from packing up this repository*:
* I added temporary exclusions for material and src before running mkdocs build they're not in the code permanently:

# Temp
r"^/material/",
r"^/src/",
{
  "system": "Windows-10-10.0.19045-SP0",
  "architecture": [
    "64bit",
    "WindowsPE"
  ],
  "python": "3.12.0",
  "cwd": "C:\\MyFiles\\_git\\base-material",
  "command": "__main__.py build -v",
  "env:$PYTHONPATH": "",
  "sys.path": [
    "C:\\MyFiles\\_git\\base-material",
    "C:\\MyFiles\\Python312\\python312.zip",
    "C:\\MyFiles\\Python312\\DLLs",
    "C:\\MyFiles\\Python312\\Lib",
    "C:\\MyFiles\\Python312",
    "C:\\MyFiles\\PythonVENV",
    "C:\\MyFiles\\PythonVENV\\Lib\\site-packages",
    "C:\\MyFiles\\_git\\base-material"
  ],
  "excluded_entries": [
    "/\\.git/ - /.git/",
    "/\\.idea/ - /.idea/",
    "/\\.vscode/ - /.vscode/",
    "pyvenv.cfg - /custom-v-e-n-v/",
    "^/material/ - /material/",
    "/node_modules/ - /node_modules/",
    "/site/ - /site/",
    "^/src/ - /src/",
    "/__pycache__/ - /__pycache__/",
    "/\\.browserslistrc$ - /.browserslistrc",
    "/\\.dockerignore$ - /.dockerignore",
    "/\\.editorconfig$ - /.editorconfig",
    "/\\.eslintignore$ - /.eslintignore",
    "/\\.eslintrc$ - /.eslintrc",
    "/\\.gitattributes$ - /.gitattributes",
    "/\\.gitignore$ - /.gitignore",
    "/\\.stylelintignore$ - /.stylelintignore",
    "/\\.stylelintrc$ - /.stylelintrc",
    "/[^/]+\\.zip$ - /9.5.13-fixed.zip",
    "/[^/]+\\.zip$ - /performance_debug.zip"
  ]
}

The pyvenv.cfg excluded the inactive custom-v-e-n-v

@kamilkrzyskow
Copy link
Collaborator Author

I don't really know much about Node, so I guess the node_modules pattern isn't perfect, but good enough?
This could exclude /docs/node_modules/, but the risk is rather low 😆
Should I finally introduce the inclusion pattern and include everything in the docs_dir? This doesn't seem ideal, as MkDocs 1.6.0 will allow to set docs_dir as root...
Other option would be to add more exceptions in the _is_excluded function as patterns don't cut it? 🤔

@squidfunk
Copy link
Owner

I don't really know much about Node, so I guess the node_modules pattern isn't perfect, but good enough?
This could exclude /docs/node_modules/, but the risk is rather low 😆
Should I finally introduce the inclusion pattern and include everything in the docs_dir? This doesn't seem ideal, as MkDocs 1.6.0 will allow to set docs_dir as root...
Other option would be to add more exceptions in the _is_excluded function as patterns don't cut it? 🤔

node_modules is the standard. You would need to explicitly set another folder when running npm ... to change that, so it is perfectly fine for us to include node_modules. However, please note that again, we're demanding building of minimal reproductions, so people should actually create a new project to reproduce an error. It is very, very unlikely that there will be node_modules or anything else inside there. This is the case we are optimizing for. We are not optimizing for the case where somebody is running the info plugin on a live repository. It is not the intended use case, and it is absolutely okay when the archive blows up to 100 MB, so GitHb forbids the upload of the repository. The info plugin will also warn when the reproduction is larger than 1 MB.

Looking at the exclusion patterns, I believe there are far too many files there. .stylelintignore, .browserlistrc ... all those files do not occur in the case we are describing above. If they are there and are packed up, it is not a problem, as they increase the payload by a few bytes. It is very important that we keep the info plugin's scope as narrow as possible.

@squidfunk
Copy link
Owner

I'm sorry that iteration on this feels so sluggish, but I'm not sure if we have the same vision for the plugin. Maybe we should first write up what the plugin aims to achieve. As said, in my view, it is only meant for packing up mint reproductions. On all other case, it is okay to fail spectacularly if we can somehow signal the user why it does.

Adding additional complexity to the plugin makes it more error prone, breaking our bug reporting process. We should keep this in mind at all times when adding more lines and logic to the plugin. The plugin must be as dumb as possible. That is at least my experience from troubleshooting issues for this project. The previous state of the info plugin, albeit not perfect, made interaction with bug reporters so much more sensical and simple, that I cannot stress enough that things should be as simple and dumb as possible to work.

@alexvoss
Copy link
Sponsor Collaborator

In the following just my 2p. Feel free to ignore. I realize I am coming to this rather late and may not be sufficiently aware of the discussion that has come before. I ignored it because the case seemed to be in the best hands, then probably only got to reads bits of it.

I can see where you are coming from @squidfunk. The site-packages is created when following the instructions, while nodes_modules and other things would not - or only under very specific circumstances.

I do wonder how many people start with an empty project when producing a reproduction. My workflow would be to take my project, copy it and strip it of things that are not needed for a reproduction. The reason being that removing things bit by bit allows me to narrow down there the problem lies and is easier to do than building things up from scratch. On a good day I would take the result and build it up from an empty project, having narrowed down what needs to be included. On a bad day I would feel convinced that I have produced something "minimal" and would try to zip it up.

It seems to me that there is evidence in the forum that people follow the same process. How else would we be getting so many reproductions that do not meet the "minimal" requirement? Could we make it easier for people to have good days and do the second step of creating the reproduction from the ground up? Alternatively, could we somehow detect if a reproduction has been built from the ground up instead of by stripping down an existing project?

@squidfunk
Copy link
Owner

How else would we be getting so many reproductions that do not meet the "minimal" requirement?

If we get a reproduction, most reproductions meet the minimal requirements. There are only a handful of cases, mostly from users that did not provide one in the first place, that are not minimal. If a reproduction is not minimal, our policy states that we are reserving the right to close the issue, until one is provided, because it can't be ruled out that the problem hides in customizations. This is a mandatory requirement, since we are providing support for free.

Could we make it easier for people to have good days and do the second step of creating the reproduction from the ground up? Alternatively, could we somehow detect if a reproduction has been built from the ground up instead of by stripping down an existing project?

Please, let's not overcomplicate things. We have much more impactful areas to work on. I already feel that the latest efforts in improving the info plugin might have complicated things from a maintenance and usage perspective, but that is yet to be learned. Again, please, please, please, let's keep it as simple as possible.

@kamilkrzyskow
Copy link
Collaborator Author

kamilkrzyskow commented Mar 10, 2024

Looking at the exclusion patterns, I believe there are far too many files there. .stylelintignore, .browserlistrc ... all those files do not occur in the case we are describing above. If they are there and are packed up, it is not a problem, as they increase the payload by a few bytes. It is very important that we keep the info plugin's scope as narrow as possible.

The idea about excluding the dotfiles is to make sure that the zipped file isn't cluttered with them, I didn't add them to save on storage.

I'm sorry that iteration on this feels so sluggish, but I'm not sure if we have the same vision for the plugin. Maybe we should first write up what the plugin aims to achieve. As said, in my view, it is only meant for packing up mint reproductions. On all other case, it is okay to fail spectacularly if we can somehow signal the user why it does.

I didn't have a specific vision of the plugin in mind, maybe that's my issue. Only the general idea of helping the user to pack up files. Therefore, I added the validation to make sure the files are included etc., and I considered adding a configurable excluded_path option, but ultimately didn't add it since info plugin is typically run only once.

I agree that the info plugin's aim shouldn't be to automate the "minimization" process of creating a reproduction, but at the same time I feel it doesn't have to be constrained by that logic to not "help" the users if they decide to pack up their Git repository with their development files in-place.

I also tried adding a "currently processed file" indicator, so that people can see why the zipping process takes so much time, so they can see for example that /node_modules/ is being processed. However, that wasn't ideal and only slowed down the zip creation due to the many IO operations to the terminal.
EDIT: 🤯 It's true that writing things down helps with processing information, the previous implementation was so slow, because it wrote to Terminal for every processed file, but if I limit the idea to "currently processed directory" it should be much better performance wise.

mintty_jAP3kC2wLf.mp4

If we get a reproduction, most reproductions meet the minimal requirements. There are only a handful of cases, mostly from users that did not provide one in the first place, that are not minimal. If a reproduction is not minimal, our policy states that we are reserving the right to close the issue, until one is provided, because it can't be ruled out that the problem hides in customizations. This is a mandatory requirement, since we are providing support for free.

So after reading this, I consider removing the dotfiles would be for the best, as their existence in the zip file will prove the reproduction wasn't built from scratch. 👍

@kamilkrzyskow
Copy link
Collaborator Author

kamilkrzyskow commented Mar 10, 2024

@alexvoss If you have the time please try the info plugin from this PR to see if everything will be packed up and excluded as it should on your machine.

I did run a test in my macOS repository and a venv should include the pyvenv.cfg file. However, it's best to test on actual hardware 😅


⬇️ The latest force push only changed the commit message

@squidfunk
Copy link
Owner

So after reading this, I consider removing the dotfiles would be for the best, as their existence in the zip file will prove the reproduction wasn't built from scratch. 👍

@alexvoss there's your indicator if a project was built from scratch 😉 Okay, so we exclude the virtual environment resolved via site packages (if we can make that work on all OSs), and then just pack up everything (or am I forgetting something again?), keeping the warnings about the latest version, hooks and customizations.

My learning: I for my part need to better explain what the direction and vision of a plugin/extension etc. is from the vantage point of a long-term maintainer next time we work on a PR together. Not explaining this from the start made iteration more tedious, but I think we also learned a lot in the process how we can better work together ☺️

@kamilkrzyskow
Copy link
Collaborator Author

kamilkrzyskow commented Mar 11, 2024

The validation with sys.exit includes version, custom_dir, hooks, and invalid project structure where all files wouldn't be packed up.

The current patterns exclude only files that could be auto-generated during the "minimal reproduction" creation process.

r"/__pycache__/", # Python cache directory
r"/\.DS_Store$", # macOS
r"/[^/]+\.zip$", # Generated files and folders
r"/[^/]*\.cache($|/)", # .cache files and folders
r"/\.vscode/", # Common autogenerated IDE directories
r"/\.vs/",
r"/\.idea/",

This change kind of invalidated the switch from fnmatch to regex 😅 as those patterns aren't that complex.

Additionally dynamic patterns (aka exact POSIX paths) are generated for venv, site_dir, projects plugin's site_dir.
It turned out site.getsitepackages() isn't perfect on macOS, as it only includes the site-packages directory, so the bin directory wasn't excluded: 2nd bullet point (#6874 (comment))

Additionally we support 2 "edge-cases":

  1. Multi-project structure like Setting up multi-language installations #2346 or https://github.com/tiangolo/fastapi where there are different site_dir for each mkdocs.yml file. This is supported via excluding directories containing sitemap.xml.gz, as this file is always in a site_dir
  2. Multiple venv directories, even inactive ones. This is supported with excluding directories containing pyvenv.cfg. If a directory in docs_dir contains pyvenv.cfg it will also be excluded, but that is unlikely to happen in a minimal reproduction. If someone makes a mistake and run python -m venv . then the zip will be empty. This complies with the "fails spectacularly idea" ^^

The edge-cases support makes the dynamic patterns redundant, so dynamic patterns could be removed to de-bloat the code.

To aid the user if they mess up there is the current path indicator, so it's easily understood that the infinite black hole of
/node_modules/ is being packed up.

To aid the debug process, CWD, $PYTHONPATH, and the excluded paths are also stored


@alexvoss
The first bullet point from #6874 (comment)

I run mkdocs build with ~/venvs/mkd_insiders activated. This causes ~/venvs/mkd_insiders to be packed up, including site-packages.

simply can't be true, as the platform.json revealed that your project directory was:

  • /Users/avoss/src/mkdocs-material/reproduce/infovenv
    and the global venv was:
  • /Users/avoss/venvs/mkd_insiders/
    so a completely unrelated path.

The code only scans the os.getcwd() iteratively down the tree of files, also the current default setup doesn't follow symbolic links, so it couldn't include /Users/avoss/venvs/ from /Users/avoss/src/
So only the inactive venv was copied over

@alexvoss
Copy link
Sponsor Collaborator

only the inactive venv was copied over

You are right. I must have got confused about the content of the local venv, thinking it was still empty. Apologies for complicating things.

@kamilkrzyskow kamilkrzyskow marked this pull request as draft March 11, 2024 16:15
@kamilkrzyskow
Copy link
Collaborator Author

OK, after further consideration:

Multiple venv directories, even inactive ones. This is supported with excluding directories containing pyvenv.cfg. If a directory in docs_dir contains pyvenv.cfg it will also be excluded, but that is unlikely to happen in a minimal reproduction. If someone makes a mistake and run python -m venv . then the zip will be empty. This complies with the "fails spectacularly idea" ^^

Handling this edge-case, does "fix" the user's mistake of running python -m venv venv and then forgetting to run . venv/Scripts/activate, ultimately running another venv or global site-packages, and not the created one.

And following the "should fail spectacularly" idea, and "should not automate the minimization" idea, I should not exclude inactive venvs,

Instead, I need to fix the other issue to make sure not just the site-packages but also bin etc. are excluded 😮‍💨

It turned out site.getsitepackages() isn't perfect on macOS, as it only includes the site-packages directory, so the bin directory wasn't excluded: 2nd bullet point (#6874 (comment))

@kamilkrzyskow
Copy link
Collaborator Author

kamilkrzyskow commented Mar 11, 2024

Done, and I found a bug that stemmed from changing from match to search function 😓 But fixed it now ✌️

This tests all edge-cases I'm aware of, and all current patterns:
9.5.13-fixup-truly-last-pray.zip

site.getsitepackages() got changed to site.PREFIXES as the results are constant across operating systems.
https://github.com/python/cpython/blob/d130cb49834573fff6a0a63da00e04d0a0cca818/Lib/site.py#L366-L369

Here is a test run on the repository that checks the values:
https://github.com/kamilkrzyskow/test-macos-actions/actions/runs/8236833853

EDIT: Found one last styling issue ⬇️ fixed in the force push.

Unless there are any styling issues, I consider this ready to merge ✌️

@kamilkrzyskow kamilkrzyskow marked this pull request as ready for review March 11, 2024 18:03
Added more information to platform file
Added processed directory indicator

Changed slash removal in resolved patterns
Changed regex match function to search to support subprojects
Constrained dynamic patterns to the root

Reverted build assets gathering dc808ca
Reverted info.gitignore file 79129d5
Copy link
Owner

@squidfunk squidfunk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not yet run it, but skimmed over the changes and it LGTM. Considering ready for merge and giving it a spin! ✌️

#
r".*\.DS_Store", # macOS
#
r"^/\.[^/]+$", # .dotfiles in the root directory
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are at least actively considering it: mkdocs/mkdocs#3519

#
r".*\.DS_Store", # macOS
#
r"^/\.[^/]+$", # .dotfiles in the root directory
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't think there are any other cases where a .dotfile is used with a plugin.

We should only exclude files if we are a 100% sure we don't need them for reproductions. Users might put configuration in dotfiles and read them via hooks. Moreover, the path to the authors_file, albeit relative to the docs_dir, can be changed to be located outside of the docs_dir. Same for all other plugin settings that allow to change file locations. Thus, please remove the dotfile rule.

#
r".*\.DS_Store", # macOS
#
r"^/\.[^/]+$", # .dotfiles in the root directory
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another example is somebody storing Snippets in a dotfile. Dotfiles are only a convention, and Snippets will not care whether your file is called something.md or .awesome-stuffs.

@squidfunk squidfunk merged commit 2d39824 into squidfunk:master Mar 12, 2024
4 checks passed
@kamilkrzyskow
Copy link
Collaborator Author

kamilkrzyskow commented Mar 12, 2024

😮 The last 3 outdated comments that you replied to me a week ago, never appeared for me on this thread... Just now after the merge they bubbled up. Don't know if I blatantly missed them, or if by doing a force push they became outdated and were constrained to the previous commit that was overriden 🤔 Will need to look out for such stuff in the future ✌️

This is what I saw before you merged, no in-between replies xD
image

@squidfunk
Copy link
Owner

No worries! Since the .dotfiles catch-all was removed, I thought you read them although they were not resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants