When parsing XML in Java, remove invalid characters from string
I have been searching Google for a while without success. I have a problem with a character in an XML feed. I save the value of each tag in a String, but when that character occurs, the value just stops there, and I only get the first 4-5 words in the tag.
Can someone help me with a way to remove it? Or is the text inside the tag in the XML feed simply too long for a String?
Thank you!
Example code:
public void characters(char[] ch, int start, int length) throws SAXException {
    if (currentElement) {
        currentValue = new String(ch, start, length);
        currentElement = false;
    }
}

public void endElement(String uri, String localName, String qName) throws SAXException {
    currentElement = false;
    /** set value */
    if (localName.equalsIgnoreCase("title"))
        sitesList.setTitle(currentValue);
    else if (localName.equalsIgnoreCase("id"))
        sitesList.setId(currentValue);
    else if (localName.equalsIgnoreCase("description"))
        sitesList.setDescription(currentValue);
}

The text in the description tag is very long, but I only get the first five words, up to the point where the &#39; characters start to appear.
Solution
You are using a SAX parser to parse the XML string.
Even while reading a single XML element, the parser may call the characters() method multiple times, for example when it finds something like <desc>bla bla &#39; bla bla la.</desc>.
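To see this behaviour for yourself, a small test along the following lines may help. It is only a sketch: the class name CharactersChunkDemo and the helper collectChunks are made up for illustration, and how the text is split depends on the parser implementation, so the number of chunks you see may differ.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of the original question or answer.
class CharactersChunkDemo {

    /** Parses the given XML and returns every chunk passed to characters(). */
    static List<String> collectChunks(String xml) {
        List<String> chunks = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Record each call separately instead of overwriting a field.
                chunks.add(new String(ch, start, length));
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> chunks = collectChunks("<desc>bla bla &#39; bla bla la.</desc>");
        System.out.println("characters() was called " + chunks.size() + " time(s)");
        for (String chunk : chunks) {
            System.out.println("chunk: [" + chunk + "]");
        }
        // Only the concatenation of all chunks is the element's full text.
        System.out.println("joined: [" + String.join("", chunks) + "]");
    }
}
```

If the parser reports more than one chunk, a handler that keeps only the first call's text ends up with a truncated value, which matches the symptom in the question.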
The solution is to use a StringBuilder, append the characters read in the characters() method to it, and then reset the StringBuilder in the endElement() method:
import org.xml.sax.helpers.DefaultHandler;

private class Handler extends DefaultHandler {

    private StringBuilder temp_val;

    public Handler() {
        this.temp_val = new StringBuilder();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element, so append each chunk
        temp_val.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        System.out.println("Output: " + temp_val.toString());
        // ... Do your stuff
        temp_val.setLength(0); // Reset the StringBuilder
    }
}
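Applied to the handler from the question, the same idea might look roughly like this. It is a sketch under assumptions: FeedHandlerSketch and its parse helper are hypothetical names, a Map stands in for the question's sitesList object, and the buffer is additionally cleared in startElement so that whitespace between elements does not leak into a value. The factory is made namespace-aware because, with the standard JAXP factory, localName may otherwise be empty.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical names throughout; the Map stands in for the question's sitesList.
class FeedHandlerSketch extends DefaultHandler {

    private final StringBuilder currentValue = new StringBuilder();
    private final Map<String, String> values = new LinkedHashMap<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        currentValue.setLength(0); // discard text collected before this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        currentValue.append(ch, start, length); // may run several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String text = currentValue.toString();
        if (localName.equalsIgnoreCase("title")) {
            values.put("title", text);
        } else if (localName.equalsIgnoreCase("id")) {
            values.put("id", text);
        } else if (localName.equalsIgnoreCase("description")) {
            values.put("description", text);
        }
        currentValue.setLength(0); // reset for the next element
    }

    /** Parses the XML string and returns the collected tag values. */
    static Map<String, String> parse(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // so that localName is populated
            FeedHandlerSketch handler = new FeedHandlerSketch();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.values;
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<item><id>42</id><title>Example</title>"
                + "<description>It&#39;s a long description.</description></item>";
        System.out.println(parse(xml));
    }
}
```

The value is only stored in endElement, once the parser has delivered every chunk of the element's text.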
The Handler class above works for me with this XML file:
<?xml version="1.0" encoding="iso-8859-1" ?>
<test>This is some example-text.</test>
The output is: