Parallelism in rxjava – filters
I have some very simple code and read a bunch of strings & application filters I want the filter to run on multiple threads
Iterable<String> outputs = Observable .from(Files.readLines(new File("E:\\SAMA\\Test\\ImageNetBullets.txt"),Charset.forName("utf-8"))) .take(20).subscribeOn(Schedulers.from(threadPoolExecutor)).filter(str -> isURLOK(str)) .toBlocking().toIterable();
From the log, it seems that the filter method runs on only one thread:
In Thread pool-1-thread-1 In Thread pool-1-thread-1 http://farm2.static.flickr.com/1258/1479683334_3ff920d217.jpg In Thread pool-1-thread-1 In Thread pool-1-thread-1
How can I speed up?
Solution
Rxjava is sequential in nature For example, using map (func1), func1 itself will execute non concurrently with the value passed through the parent sequence:
Observable.range(1,10).map(v -> v * v).subscribe(System.out::println);
Here, lambda V – > V * V will call the values 1 to 10 in a sequential manner
Rxjava can be asynchronous in such a way that stages in the pipeline (range – > map - > subscribe) can occur simultaneously / in parallel with each other For example:
Observable.range(1,10) .subscribeOn(Schedulers.computation()) .map(v -> v * v) // (1) .observeOn(Schedulers.io()) .map(v -> -v) // (2) .toBlocking() .subscribe(System.out::println); // (3)
Here, (1) can be run in parallel with (2) and (3), that is, while (2) calculates AV = 3 * 3, (1) may have calculated v = 5 and (3) prints - 1 at the same time
If you want to process the elements of the sequence at the same time, you must "separate" the sequence into sub observables, and then connect the results with flatmap:
Observable.range(1,10) .flatMap(v -> Observable.just(v) .subscribeOn(Schedulers.computation()) .map(v -> v * v) ) .toBlocking() .subscribe(System.out::println);
Here, each value V starts a new observable running on the background thread and is calculated through map () V = 1 can run on thread 1, v = 2 can run on thread 2, and V = 3 can run on thread 1, but strictly after v = 1