Extract a map from a stream of streams in Java 8

I have a stream of streams (the format is not set by me and cannot be changed), for example:

Stream<String> doc1 = Stream.of("how","are","you","doing","doing","doing");
Stream<String> doc2 = Stream.of("what","what","you","upto");
Stream<String> doc3 = Stream.of("how","are","what","how");
Stream<Stream<String>> docs = Stream.of(doc1,doc2,doc3);

I am trying to convert it into a Map<String, Multiset<Integer>> (or its corresponding stream, since I want to process it further), where the key is the word itself and the Multiset<Integer> holds the number of times that word appears in each document (zero counts should be excluded). Multiset is a Google Guava class, not one from java.util.

For example:

how   -> {1,2}  // once in doc1, twice in doc3, none in doc2 (so doc2's count is not included)
are   -> {1,1}  // once in doc1 and once in doc3
you   -> {1,1}  // once in doc1 and once in doc2
doing -> {3}    // thrice in doc1, none in the others
what  -> {2,1}  // twice in doc2, once in doc3
upto  -> {1}    // once in doc2

What is a good way to do this in Java 8?

I tried using flatMap, but the inner streams greatly limit my options.

Solution

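import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.function.Function;
import java.util.stream.Collectors;

// Count the words inside each document, flatten the per-document
// (word, count) entries, then group the counts by word.
Map<String,List<Long>> map = docs.flatMap(
            inner -> inner.collect(
                    Collectors.groupingBy(Function.identity(),Collectors.counting()))
                    .entrySet()
                    .stream())
            .collect(Collectors.groupingBy(
                    Entry::getKey,Collectors.mapping(Entry::getValue,Collectors.toList())));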

System.out.println(map);

// {upto=[1], how=[1, 2], doing=[3], what=[2, 1], are=[1, 1], you=[1, 1]}
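
For a runnable check, here is a minimal sketch of the same approach wrapped in a class. The class name WordCountsPerDoc and the helper countsPerDoc are made up for illustration, and the sample documents are assumed so that they produce the counts listed in the question. Passing LinkedHashMap::new to both groupingBy calls is an addition of this sketch, so that keys print in first-encounter order rather than in HashMap order.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class WordCountsPerDoc {

    // Hypothetical helper: for each word, collect its count in every
    // document that contains it (documents without the word add nothing).
    static Map<String, List<Long>> countsPerDoc(Stream<Stream<String>> docs) {
        return docs
                // Step 1: turn each document into its (word, count) entries.
                .flatMap(doc -> doc
                        .collect(Collectors.groupingBy(
                                Function.identity(),
                                LinkedHashMap::new,
                                Collectors.counting()))
                        .entrySet()
                        .stream())
                // Step 2: group the per-document counts by word.
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey,
                        LinkedHashMap::new,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        // Assumed sample data, chosen to match the counts in the question.
        Stream<String> doc1 = Stream.of("how", "are", "you", "doing", "doing", "doing");
        Stream<String> doc2 = Stream.of("what", "what", "you", "upto");
        Stream<String> doc3 = Stream.of("how", "are", "what", "how");

        System.out.println(countsPerDoc(Stream.of(doc1, doc2, doc3)));
        // prints {how=[1, 2], are=[1, 1], you=[1, 1], doing=[3], what=[2, 1], upto=[1]}
    }
}
```

Each inner stream is consumed exactly once, inside the flatMap lambda, which is what makes this work even though streams cannot be reused. To get closer to the Multiset<Integer> asked for, the downstream Collectors.toList() could presumably be swapped for Collectors.toCollection(HashMultiset::create) from Guava after mapping each Long count to an Integer; that variant is not tested here.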