Java – why HashSet RemoveAll requires secondary operations?

I have this code to generate a HashSet and call removeall() I made a Class A, which is just an int wrapper. It records the number equal to the number of calls - the program outputs this number

import java.util.*;

class A {
    int x;
    static int equalsCalls;

    A(int x) {
        this.x = x;
    }

    @Override
    public int hashCode() {
        return x;
    }

    @Override
    public boolean equals(Object o) {
        equalsCalls++;
        if (!(o instanceof A)) return false;
        return x == ((A)o).x;
    }

    public static void main(String[] args) {
        int setSize = Integer.parseInt(args[0]);
        int listSize = Integer.parseInt(args[1]);
        Set<A> s = new HashSet<A>();
        for (int i = 0; i < setSize; i ++)
            s.add(new A(i));
        List<A> l = new LinkedList<A>();
        for (int i = 0; i < listSize; i++)
            l.add(new A(i));
        s.removeAll(l);
        System.out.println(A.equalsCalls);
    }
}

It turns out that the removeAll operation is not linear. I think:

$java A 10 10
55
$java A 100 100
5050
$java A 1000 1000
500500

In fact, it seems to be secondary Why?

Replace line s.removeall (L); And (a B: l) s. remove (b); Fix it to act the way I want it to:

$java A 10 10
10
$java A 100 100
100
$java A 1000 1000
1000

Why?

This is a graph showing a conic:

The X and Y axes are two parameters of a java program The Z axis is the number of a.equals calls

The figure is generated by this asymptote program:

import graph3;
import grid3;
import palette;

currentprojection=orthographic(0.8,1,1);

size(400,300,IgnoreAspect);

defaultrender.merge=true;

real[][] a = new real[][]{
  new real[]{0,0},new real[]{0,1},3,3},2,6,6},10,10},4,15,15},5,21,21},28,28},7,36,36},8,45,45},9,55,55},66,66},11,78,78},12,91,91},13,105,105},14,120,120},136,136},16,153,153},17,171,171},18,190,190},19,210},};


surface s=surface(a,(-1/2,-1/2),(1/2,1/2),Spline);
draw(s,mean(palette(s.map(zpart),Rainbow())),black);
//grid3(XYZgrid);

Array A is generated by Haskell program:

import System.Process
import Data.List

f :: Integer -> Integer -> IO Integer
f x y = fmap read $readProcess "/usr/bin/java" ["A",show x,show y] ""

g :: [[Integer]] -> [String]
g xss = map (\xs -> "new real[]{" ++ intercalate "," (map show xs) ++ "},") xss

main = do
  let n = 20
  xs <- mapM (\x -> mapM (\y -> f x y) [0..n]) [0..n]
  putStrLn $unlines $g xs

Solution

The time it takes for removeAll to work depends on what collection you pass through When you look at the implementation of removeAll:

public boolean removeAll(Collection<?> c) {
    boolean modified = false;

    if (size() > c.size()) {
        for (Iterator<?> i = c.iterator(); i.hasNext(); )
            modified |= remove(i.next());
    } else {
        for (Iterator<?> i = iterator(); i.hasNext(); ) {
            if (c.contains(i.next())) {
                i.remove();
                modified = true;
            }
        }
    }
    return modified;
}

You can see that when the HashSet and the collection have the same size, it iterates over the HashSet and calls c.contains.for each element Since you use LinkedList as a parameter, this is the O (n) operation of each element Since this operation needs to be performed for each of the N deleted elements, the result is an O (N2) operation

If you replace the LinkedList with a collection that provides more efficient content, the performance of removeAll will improve accordingly If you make the HashSet larger than the collection, the performance should also be significantly improved because it traverses the collection

The content of this article comes from the network collection of netizens. It is used as a learning reference. The copyright belongs to the original author.
THE END
分享
二维码
< <上一篇
下一篇>>