When parsing XML in Java, remove invalid characters from string

I have been searching Google for a long time without success. I have run into a character problem in an XML feed. I save the value of each tag in a String, but when a character such as &#39; occurs, it just stops, and I only get the first 4-5 words in the tag.

So can someone help me with a way to delete it? Or maybe the text in the tag in the XML feed is too long for a string?

thank you!

Example code:

public void characters(char[] ch,int start,int length)
        throws SAXException {

    if (currentElement) {
        currentValue = new String(ch,start,length);
        currentElement = false;
    }

}

public void endElement(String uri,String localName,String qName)
        throws SAXException {

    currentElement = false;

    /** set value */ 
    if (localName.equalsIgnoreCase("title"))
        sitesList.setTitle(currentValue);
    else if (localName.equalsIgnoreCase("id"))
        sitesList.setId(currentValue);
    else if(localName.equalsIgnoreCase("description"))
        sitesList.setDescription(currentValue);
}

The text in the description tag is very long, but I only get the first five words, up to the point where the &#39; characters begin.

Solution

You are using a SAX parser to parse the XML string.

The characters() method can be called multiple times while reading a single XML element. This happens when the parser finds something like <desc>bla bla &#39; bla bla la.</desc>.
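To see this behavior directly, here is a minimal sketch that records every chunk passed to characters(). The names ChunkDemo and chunks are made up for illustration, and it assumes the JDK's built-in SAX parser. How many chunks you get depends on the parser implementation, so only the joined result is reliable.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ChunkDemo {

    /** Parses the XML and returns each chunk handed to characters(), in order. */
    static List<String> chunks(String xml) {
        List<String> out = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Only the slice [start, start + length) of ch is valid here
                out.add(new String(ch, start, length));
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> parts = chunks("<desc>bla bla &#39; bla bla la.</desc>");
        // The number of calls is parser-dependent; the joined text is not
        System.out.println(parts.size() + " call(s): " + parts);
        System.out.println("Joined: " + String.join("", parts));
    }
}
```

If the handler keeps only the first chunk, as the code in the question does, everything from the character reference onward can be lost.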

The solution is to use a StringBuilder, append the characters you read in the characters() method, and then reset the StringBuilder in the endElement() method:

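import org.xml.sax.helpers.DefaultHandler;

private class Handler extends DefaultHandler {

    // Accumulates the text of the current element across characters() calls
    private final StringBuilder temp_val;

    public Handler() {
        this.temp_val = new StringBuilder();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element, so append instead of assigning
        temp_val.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        System.out.println("Output: " + temp_val.toString());
        // ... Do your stuff
        temp_val.setLength(0); // Reset the StringBuilder
    }

}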

The above code works for me with this XML file:

<?xml version="1.0" encoding="iso-8859-1" ?>
<test>This is some &#13; example-text.</test>

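The output is the complete text of the element, prefixed with "Output: ", with the &#13; reference resolved to a carriage return character.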
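Applied to the feed from the question, the same idea might look like the sketch below. This is only an illustration under some assumptions: FeedParser, FeedHandler and the values map are hypothetical names standing in for the asker's sitesList, and the sample XML is made up. It matches on qName rather than localName because the JDK's default SAXParserFactory is not namespace-aware, in which case localName may be empty. It also clears the buffer in startElement so that whitespace between elements does not leak into the next value.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class FeedParser {

    /** Collects the full text of title, id and description elements. */
    static class FeedHandler extends DefaultHandler {
        final Map<String, String> values = new LinkedHashMap<>();
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0); // discard whitespace collected between elements
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may run several times per element
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("title") || qName.equals("id") || qName.equals("description")) {
                values.put(qName, text.toString());
            }
            text.setLength(0); // reset for the next element
        }
    }

    /** Parses the XML string and returns the collected element values. */
    static Map<String, String> parse(String xml) {
        FeedHandler handler = new FeedHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.values;
    }

    public static void main(String[] args) {
        String xml = "<site><id>7</id><title>A &amp; B</title>"
                + "<description>It&#39;s a long text &lt;here&gt;.</description></site>";
        System.out.println(parse(xml));
    }
}
```

Note that this simple version assumes the elements of interest contain only text. If they had child elements, the reset in startElement would drop the text that precedes the child.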
