Wrong Orientation of letter #741
Comments
@muhmuhhum thanks for opening the issue. I understand you can't share the PDF document, but would you mind sharing the code you're using to draw the bounding box? Also, is all the text always drawn as horizontal text? For example, do you also have the issue with the "R 12,5" text? |
@BobLd Thanks for the quick response. It is wrong for some of the other letters in the document too, but nearly all letters have the correct TextOrientation. I already found that Location.EndLine and Location.StartLine are at the same point for the letters with the wrong TextOrientation. Here is a bigger cutout with more words marked. For the drawing I have to change some values, because Skia uses the top left as origin and I have to calculate the new position at 300 dpi:
And GetRotatedRect:
|
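A minimal sketch of the kind of conversion described above (PDF points with a bottom-left origin to pixels with a top-left origin at 300 dpi). This is not the code from the comment; the class and method names are hypothetical, and it assumes an unrotated page whose origin is at (0, 0) and the default 72 points per inch:

```csharp
using System;

// Hypothetical helper, not part of PdfPig or SkiaSharp.
public static class PdfToPixel
{
    // Assumption: default PDF user space, 1 point = 1/72 inch.
    private const double PointsPerInch = 72.0;

    // Converts a point given in PDF coordinates (origin bottom left, Y up)
    // to pixel coordinates (origin top left, Y down) at the given resolution.
    public static (float X, float Y) ToPixel(
        double xPt, double yPt, double pageHeightPt, double dpi = 300.0)
    {
        double scale = dpi / PointsPerInch;

        // X only needs scaling.
        float x = (float)(xPt * scale);

        // Y is flipped against the page height before scaling.
        float y = (float)((pageHeightPt - yPt) * scale);

        return (x, y);
    }
}
```

For example, on a page that is 842 points high, a point one inch from the left edge and one inch below the top edge (72, 770) should land at roughly (300, 300) pixels. Applying the same conversion to all four corners of a rectangle keeps its rotation intact.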
Regarding the rendering with Skia, you indeed need to invert the Y axis. I think one thing that causes your drawn bounding boxes to always be horizontal is that you use Could you instead use the
using (var rect = new SKPath())
using (var paint = new SKPaint() { Color = SKColors.Black, Style = SKPaintStyle.Stroke })
{
    // Trace the four corners in order so that a rotated box stays rotated.
    rect.MoveTo((float)transformedPdfBounds.BottomLeft.X, (float)transformedPdfBounds.BottomLeft.Y);
    rect.LineTo((float)transformedPdfBounds.TopLeft.X, (float)transformedPdfBounds.TopLeft.Y);
    rect.LineTo((float)transformedPdfBounds.TopRight.X, (float)transformedPdfBounds.TopRight.Y);
    rect.LineTo((float)transformedPdfBounds.BottomRight.X, (float)transformedPdfBounds.BottomRight.Y);
    rect.Close();
    _canvas.DrawPath(rect, paint);
}
where |
Oh, it is intended that the bounding boxes are always horizontal. Sorry, I missed this question in your original answer. That is what GetRotatedRect is for: to get the horizontal box around the word. Sorry if that caused some confusion. |
OK, after some research I think I found the problem. The PDF has fonts with widths of 0, which leads to some weird behavior |
@muhmuhhum sounds good, thanks a lot for that. The code that computes the text orientation is here. If you want, you can try to fix it; I'll try to have a look on my side. In the meantime, you can try using the following options:
var options = new NearestNeighbourWordExtractor.NearestNeighbourWordExtractorOptions()
{
    // Do not split letters into groups by TextOrientation before building words.
    GroupByOrientation = false
};
var nnWordExtractor = new NearestNeighbourWordExtractor(options);
Let me know if that helps |
@BobLd Sorry for the late answer. My current workaround is this: when I extract the words, I check for letters where letter.StartBaseLine is the same point as letter.EndBaseLine, replace those points with the BottomLeft and BottomRight of the glyph box, and set the TextOrientation based on the Rotation of the GlyphRectangle. This may ignore the possible extra width of the letters, but I haven't found a better solution. Now I just ask myself how programs like Adobe Acrobat can draw this PDF, because as far as I understand a character with a width of 0 should be drawn as such, but it is displayed normally, just like every other character. |
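The workaround above could be sketched roughly as follows. This is a self-contained illustration, not the actual code from the thread and not PdfPig's API: `Pt`, `Orientation`, and `ZeroWidthFix` are hypothetical stand-ins for the library's point, orientation, and letter handling. It also assumes the rotation is given counterclockwise in degrees and that -90 corresponds to `Rotate90`; that mapping is an assumption and should be checked against the library (requires C# 10 or later for `record struct`):

```csharp
using System;

// Hypothetical stand-in for a 2D point in PDF space.
public readonly record struct Pt(double X, double Y);

// Hypothetical stand-in for the library's text orientation values.
public enum Orientation { Horizontal, Rotate90, Rotate180, Rotate270, Other }

public static class ZeroWidthFix
{
    private const double Eps = 1e-5;

    // True when the baseline has collapsed to a single point (zero-width glyph).
    public static bool IsDegenerate(Pt start, Pt end) =>
        Math.Abs(start.X - end.X) < Eps && Math.Abs(start.Y - end.Y) < Eps;

    // Keeps the original baseline unless it is degenerate; in that case
    // falls back to the bottom edge of the glyph box.
    public static (Pt Start, Pt End) EffectiveBaseline(
        Pt start, Pt end, Pt boxBottomLeft, Pt boxBottomRight) =>
        IsDegenerate(start, end) ? (boxBottomLeft, boxBottomRight) : (start, end);

    // Maps a rotation angle (assumed counterclockwise, in degrees) to an
    // orientation. The angle-to-name mapping is an assumption.
    public static Orientation FromRotation(double degrees)
    {
        // Normalise to the range (-180, 180].
        double a = degrees % 360;
        if (a > 180) a -= 360;
        if (a <= -180) a += 360;

        if (Math.Abs(a) < Eps) return Orientation.Horizontal;
        if (Math.Abs(a + 90) < Eps) return Orientation.Rotate90;
        if (Math.Abs(a - 180) < Eps) return Orientation.Rotate180;
        if (Math.Abs(a - 90) < Eps) return Orientation.Rotate270;
        return Orientation.Other;
    }
}
```

As noted in the comment, using the bottom edge of the glyph box ignores any extra advance width the letter might have, so word spacing derived from these baselines may be slightly off.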
I have the following problem: I have a 2D technical drawing where text is written in every direction.
The GetWords() method with NearestNeighbourWordExtractor works fine for me except for this example.
In the image you can see a part of the PDF.
Light blue is the word box,
dark blue the letter box, and green the Location.
My problem now is that the letter has the TextOrientation Horizontal, which leads to a wrongly drawn text box for it, and possibly also means that the 7 and the 9 can't find each other with nearest neighbour.
I have tried to create a PDF that has the same problem, but I couldn't get it to work.
Because there is an NDA I can't share the file, but maybe you could point me in the right direction to find the problem and maybe find a solution for it.