Java – use flyingsaucer to convert HTML pages containing Arabic characters to PDF

I want to convert HTML pages containing Arabic characters into PDF files using flyingsaucer, but the generated PDF does not contain combined characters and prints back

HTML:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <Meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    </head>

    <body style="font-size:15px;font-family: Arial Unicode MS;">

        <center  style="font-size: 18px; font-family: Arial Unicode MS;">
            <b>
                <i style="font-family: Arial Unicode MS;">
                    &#x062C;&#x0645;&#x064A;&#x0639; &#x0627;&#x0644;&#x062D;&#x0642;&#x0648;&#x0642;<br />
                </i>
            </b>
        </center>
    </body>
</html>

Java excerpt:

String inputFile = "c:\\html.html";
        String url = new File(inputFile).toURI().toURL().toString();
        String outputFile = "c:\\html.pdf";
        OutputStream os = new FileOutputStream(outputFile);

        ITextRenderer renderer = new ITextRenderer();
        renderer.getFontResolver().addFont("c://ARIALUNI.TTF",BaseFont.IDENTITY_H,BaseFont.EMBEDDED);

        renderer.setDocument(url);
        renderer.layout();
        renderer.createPDF(os);
        os.close();

Actual PDF results:

Expected PDF results:

What can I do to get the right results?

Solution

When I use Arabic fonts, I have a similar alignment problem Arabic is a RTL language You need a specific jar to generate PDF in RTL Currently, when you try to generate PDF, the mode is normal LTR because you are getting the current output

The content of this article comes from the network collection of netizens. It is used as a learning reference. The copyright belongs to the original author.
THE END
分享
二维码
< <上一篇
下一篇>>