Extract a map from a Java stream of streams
I have a stream of streams (I did not choose this format and cannot change it), for example:
Stream<String> doc1 = Stream.of("how","are","you","doing","doing");
Stream<String> doc2 = Stream.of("what","what","upto");
Stream<String> doc3 = Stream.of("how","how");
Stream<Stream<String>> docs = Stream.of(doc1,doc2,doc3);
I am trying to convert it into a Map<String, Multiset<Integer>> (or the corresponding stream, since I want to process it further), where the String key is the word itself and the Multiset<Integer> holds the number of times that word appears in each document (counts of 0 should be excluded). Multiset is a Google Guava class (not one from java.util).
For example:
how -> {1,2} // because it appears once in doc1, twice in doc3 and not at all in doc2 (so doc2's count should not be included)
are -> {1,1} // once in doc1 and once in doc3
you -> {1,1} // once in doc1 and once in doc2
doing -> {3} // thrice in doc3, none in the others
what -> {2,1} // and so on
upto -> {1}
What is a good way to do this in Java 8?
I tried to use flatMap, but the inner streams greatly limit my options.
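One likely reason the inner streams feel limiting is that a Java stream can be traversed only once, so each document has to be reduced to its word counts in a single pass. The sketch below illustrates that constraint; the class and method names are hypothetical and chosen only for this illustration.

```java
import java.util.stream.Stream;

class SingleUseStreamDemo {

    // Hypothetical helper: returns true if a second traversal of the same stream is rejected.
    static boolean secondTraversalFails(Stream<String> doc) {
        doc.count(); // first terminal operation consumes the stream
        try {
            doc.count(); // second terminal operation on the same stream
            return false;
        } catch (IllegalStateException e) {
            // The stream has already been operated upon, so it cannot be reused.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTraversalFails(Stream.of("how", "how")));
    }
}
```

Because of this, any approach has to collect each inner stream exactly once before combining the results across documents.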
Solution
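import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.function.Function;
import java.util.stream.Collectors;

// Count the words within each inner stream, then regroup the (word, count) entries by word.
Map<String, List<Long>> map = docs.flatMap(
        inner -> inner.collect(
                Collectors.groupingBy(Function.identity(), Collectors.counting()))
            .entrySet()
            .stream())
    .collect(Collectors.groupingBy(
        Entry::getKey, Collectors.mapping(Entry::getValue, Collectors.toList())));
System.out.println(map);
// {upto=[1], how=[1, 2], doing=[3], what=[2, 1], are=[1, 1], you=[1, 1]}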
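For readers who want to run the approach end to end, here is a minimal self-contained sketch that wraps the same pipeline in a method and feeds it the three sample streams from the question. The class name WordCountsPerDocument and the method name countPerDocument are hypothetical, and the result type is Map<String, List<Long>> as in the answer, not the Guava Multiset the question asks for.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class WordCountsPerDocument {

    // Hypothetical helper: maps each word to its non-zero counts,
    // one entry per document that contains the word.
    static Map<String, List<Long>> countPerDocument(Stream<Stream<String>> docs) {
        return docs
                // Count the words inside one document, then expose the (word, count) pairs.
                .flatMap(doc -> doc
                        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                        .entrySet()
                        .stream())
                // Regroup the pairs by word across all documents.
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        Stream<String> doc1 = Stream.of("how", "are", "you", "doing", "doing");
        Stream<String> doc2 = Stream.of("what", "what", "upto");
        Stream<String> doc3 = Stream.of("how", "how");
        System.out.println(countPerDocument(Stream.of(doc1, doc2, doc3)));
    }
}
```

A word that is absent from a document never produces an entry for that document, so zero counts are excluded without any extra filtering. With the sample streams as written in the question, how maps to [1, 2]. If Guava is on the classpath, replacing Collectors.toList() with Collectors.toCollection(HashMultiset::create) is one way to collect into a Multiset instead; the elements would still be Long unless each count is converted to Integer first.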
