In Java, is there a more elegant way to remove duplicate strings from an ArrayList of strings?
So, in short, I have a Java homework assignment that requires manipulating a long ArrayList of strings in various ways (displaying word combinations, adding to and deleting from the ArrayList, nothing special). I noticed that some of the provided ArrayLists contain duplicate entries (and duplicates aren't needed for this assignment), so I got the OK from my teacher to clean up the data by removing the duplicate entries. Here is what I came up with:
private static ArrayList<String> KillDups(ArrayList<String> ListOfStrings) {
    for (int i = 0; i < ListOfStrings.size(); i++) {
        for (int j = i + 1; j < ListOfStrings.size(); j++) {
            // don't start on the same word or you'll eliminate it.
            if (ListOfStrings.get(i).toString().equalsIgnoreCase(ListOfStrings.get(j).toString())) {
                ListOfStrings.remove(j); // if they are the same, DITCH ONE.
                j = j - 1; // removing the word basically changes the index, so swing down one.
            }
        }
    }
    return ListOfStrings;
}
This works fine for my assignment, but I doubt it would be very useful in the real world. Is there a way to ignore whitespace and special characters during the comparison? Is there a cleaner way to handle this in general (maybe without the nested for loops)? Are there any other questions I should be asking that I don't know to ask?
Solution
Yes. It can be done in one (elegant) line:
List<String> noDups = new ArrayList<String>(new LinkedHashSet<String>(list));
A Set guarantees that there are no duplicates. Choosing the LinkedHashSet implementation of Set preserves the order of the list.
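As a quick check of the one-liner, here is a minimal self-contained sketch. The class name DedupDemo, the method name dedup, and the sample words are made up for illustration. Note that LinkedHashSet compares elements with equals(), so unlike the equalsIgnoreCase loop in the question, "Apple" and "apple" are treated as different strings here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical demo class; not part of the original answer.
class DedupDemo {
    // Copy into a LinkedHashSet (drops repeats, keeps first-seen order),
    // then copy the survivors back into a fresh ArrayList.
    static List<String> dedup(List<String> list) {
        return new ArrayList<String>(new LinkedHashSet<String>(list));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "Apple", "pear", "apple", "fig");
        System.out.println(dedup(words)); // prints [pear, Apple, apple, fig]
    }
}
```

The first occurrence of each string wins, and later repeats are dropped.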
In addition, some notes on style:
- Name methods and parameters with names starting with a lowercase letter.
- When specifying method signatures, always refer to the abstraction (e.g. List) rather than the concrete type (e.g. ArrayList).
So your whole method becomes:
private static List<String> killDups(List<String> list) {
    return new ArrayList<String>(new LinkedHashSet<String>(list));
}
For additional brownie points, make the method generic, so it applies to any type of list:
private static <T> List<T> killDups(List<T> list) {
    return new ArrayList<T>(new LinkedHashSet<T>(list));
}
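To illustrate the generic version on something other than strings, here is a small sketch using Integer values. The class name GenericDedupDemo and the sample numbers are assumptions for the example. It also shows one difference from the loop in the question: this version returns a new list and leaves the input untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical demo class wrapping the generic killDups from the answer.
class GenericDedupDemo {
    static <T> List<T> killDups(List<T> list) {
        return new ArrayList<T>(new LinkedHashSet<T>(list));
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<Integer>(Arrays.asList(3, 1, 3, 2, 1));
        List<Integer> unique = killDups(numbers);
        System.out.println(unique);  // prints [3, 1, 2]
        System.out.println(numbers); // prints [3, 1, 3, 2, 1]; the input is not modified
    }
}
```

This works for any element type whose equals() and hashCode() are consistent with each other, which is true of String, Integer, and the other standard value types.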
If you want to ignore certain characters, I'd create a class for that and make a list of those. HashSets rely on the hashCode() and equals() methods to remove duplicates:
public class MungedString {
    // simplified code
    String s;

    public boolean equals(Object o) {
        // implement how you want to compare them here
    }

    public int hashCode() {
        // keep this consistent with equals()
    }
}
then
List<MungedString> list; // populated elsewhere
List<MungedString> noDupList = killDups(list);
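One possible way to fill in the MungedString skeleton is sketched below. The normalization rule (strip everything except letters and digits, then lowercase) is an assumption chosen to match the "ignore spaces and special characters" part of the question, and the fields original and key, the toString() override, and the MungedDemo class are all hypothetical additions. The important part is that equals() and hashCode() are both derived from the same normalized key.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;

// Hypothetical fleshed-out version of the MungedString skeleton.
final class MungedString {
    private final String original; // what the caller passed in
    private final String key;      // normalized form used for comparison

    MungedString(String s) {
        this.original = s;
        // Assumed rule: drop everything except letters and digits, then lowercase.
        this.key = s.replaceAll("[^\\p{L}\\p{N}]", "").toLowerCase(Locale.ROOT);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof MungedString)) return false;
        return key.equals(((MungedString) o).key);
    }

    @Override
    public int hashCode() {
        return key.hashCode(); // derived from the same key as equals()
    }

    @Override
    public String toString() {
        return original;
    }
}

// Hypothetical driver showing the generic killDups applied to MungedString.
class MungedDemo {
    static <T> List<T> killDups(List<T> list) {
        return new ArrayList<T>(new LinkedHashSet<T>(list));
    }

    public static void main(String[] args) {
        List<MungedString> list = new ArrayList<MungedString>();
        for (String s : new String[] {"Hello World!", "hello-world", "Goodbye"}) {
            list.add(new MungedString(s));
        }
        System.out.println(killDups(list)); // prints [Hello World!, Goodbye]
    }
}
```

Keeping the original text alongside the key means the de-duplicated list still shows the strings as they were entered, with the first spelling of each duplicate group surviving.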