Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set document title for Maven site #764

Closed
wants to merge 1 commit into from

Conversation

kriegaex
Copy link

@kriegaex kriegaex commented Feb 4, 2024

The document title is e.g. displayed in breadcrumbs.

Thank you for opening a pull request and contributing to asciidoctor-maven-plugin!

What kind of change does this PR introduce? (check at least one)

  • Bugfix
  • Feature
  • Documentation
  • Refactor
  • Build improvement
  • Other (please describe)

What is the goal of this pull request?

Fix problem discussed with @abelsromero in Zulip.

Are there any alternative ways to implement this?

Parsing asciidoc instead of generated HTML. This was just a quick fix to demonstrate how to do it in general. Instead of parsing the content of the first h1 tag in the generated HTML, we could scan the asciidoc from the reader for the first # My doc title section or, if not found, look for :title: My doc title. That would probably be more efficient and also circumvent problems with possibly encoded HTML entities like & etc. Reading the asciidoc line by line and stopping at the first match might also be faster, but I did not do any gold plating here. Feel free to merge and then commit improvements on top.

P.S.: The Doxia sink FAQ says: Avoid sink.rawText!

Are there any implications of this pull request? Anything a user must know?

It should be documented how the user can set the document title in his asciidoc documents.

Is it related to an existing issue?

  • Yes
  • No

Finally, please add a corresponding entry to CHANGELOG.adoc

Please do that yourself, thanks.

The document title is e.g. displayed in breadcrumbs.
@kriegaex
Copy link
Author

kriegaex commented Feb 4, 2024

P.S.: I tried to help you, maybe you can polish this PR and then try to help me by publishing a maintenance release quickly. That would be wonderful.

@abelsromero
Copy link
Member

P.S.: The Doxia sink FAQ says: Avoid sink.rawText!

We are aware and that's why v3.0.0 will introduce a full parse, https://github.com/asciidoctor/asciidoctor-maven-plugin/blob/main/docs/modules/site-integration/pages/parser-module-setup-and-configuration.adoc.

If you are concerned about it, or want to avoid extra work writing a custom CSS, I'd consider waiting for its release, we are one story away https://github.com/asciidoctor/asciidoctor-maven-plugin/issues?q=is%3Aopen+is%3Aissue+milestone%3A3.0.0.

@kriegaex
Copy link
Author

kriegaex commented Feb 4, 2024

Well, I am using what for me looks like the latest stable release, so I looked into this branch (you also pointed me to it) and fixed this one. I have never used the plugin until yesterday and had no idea that the next big release is due soon. So thanks for the hint.

Anyway, I would appreciate a fix for 2.2.x, because chances are that 3.0 might still have glitches in the beginning. I just want to get my work in the other project done with this plugin.

@kriegaex
Copy link
Author

kriegaex commented Feb 4, 2024

P.S.: I did not check in detail, but at first glance it does not look like your new parser is setting the title either. But I could be wrong, I just browsed the code on the website, I did not check out the branch.

@abelsromero
Copy link
Member

abelsromero commented Feb 4, 2024

Well, I am using what for me looks like the latest stable release, so I looked into this branch (you also pointed me to it) and fixed this one.

Correct, v3.0.0 is on main branch. There's always one last extra thing delaying it, and on xmas I decided "enough is enough" and that nothing except big stuff should delay it. That's why I may have sounded dismissive in some messages, sorry.

P.S.: I did not check in detail, but at first glance it does not look like your new parser is setting the title either. But I could be wrong, I just browsed the code on the website, I did not check out the branch.

You are right, it's not. How I have been doing it, is if a bug is to be fixed in both branches, mark the 2.2.x milestone, first do 2.2.x, then port (I wish GH allowed to assign multiple milestones, I am thinking of using tags maybe). In this case, we should try to fix v2.2.x asap and release.

On the fix itself, I had a look and took my notes there #763 (comment), I also want to add the rest of the metadata as described in the Sink.head javadocs, and to ensure the support we need to use load from the Asciidoctor API to ensure all syntax + options/attributes are correctly processed.

As if it was not obvious, this is not a full-time job, so best case scenario Tuesday night I can work on the fix and release on Friday.

@kriegaex
Copy link
Author

kriegaex commented Feb 4, 2024

I did not think it was a full-time job. Mine is to be a freelance agile coach. My last money with coding I earned back in the year 2000. Programming and test automation are just hobbies for me. I do not donate any money to charities, so I donate some of what little spare time I have to OSS development.

As for parsing the header separately or, as you said in the other issue, duplicating streams or so, your reader slurps the whole document already, because the whole document needs to be converted anyway. Why not just read it line by line, parsing out whatever header info you need, but adding each line to the buffer you then convert to HTML? I am talking about 2.2, not 3.0. For now, a quick fix for the title would suffice for me, I am not so soncerned about all the other header information in the context of the Maven site. That could come in a later PR and release, if you ask me.

@abelsromero
Copy link
Member

Why not just read it line by line, parsing out whatever header info you need, but adding each line to the buffer you then convert to HTML?

I am not discarding that option in fact, but I want to explore how to refactor so we can have some tests (not IT tests which are slow and a pain to code) to ensure we cover cases like injecting attributes from the pom. If we can get some component that can isolate future changes, plus extracting all the other metadata ends up having some issues we can use that for now.

@mojavelinux
Copy link
Member

I can tell you that you cannot parse AsciiDoc line-by-line. It's way more complicated than that. However, the :parse_header_only option does short circuit the parsing as soon as the document header has been parsed and the first block of content discovered.

The reason I had suggested splitting the stream is that the first thing the processor does is read and normalize all the lines to prepare for parsing. When it's only reading the document header, that's unnecessary work. However, it's important to emphasize just how little overhead that has. In Java, splitting the stream probably won't save you any time and could even cost more time. Asciidoctor is already as efficient as it possibly can be, so adding extra processing could be counter productive. For extremely large documents, I'm guessing it might save you a few nanoseconds at most.

In other words, don't be afraid to just use the :parse_header_only option and see how it goes.

@kriegaex
Copy link
Author

kriegaex commented Feb 5, 2024

I did not talk about parsing the whole document into an AST or similar line by line. I am not that naive. I meant to quickly find the title, as described in the PR as an alternative to my quick-shot solution. Without knowing the asciidoc format, it certainly seems possible to parse that out without too much hassle.

I think, the title deserves a special role, because other than the (user-) readable content it is the single most important meta datum. The title is the first thing users might see of any document, even before opening it, because it can be used for navigation, site maps etc.

@abelsromero
Copy link
Member

Closed, being replaces by #769

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants