Multithreading – parBuffer evaluation does not give the expected speedup

I have a Haskell function that I want to evaluate with exact intermediate results:

f 0 x = 0
f n x = let tmp = f (n-1) x in
        tmp + (x-tmp^2)/2

Because of the (^2), the complexity of this function grows exponentially in n. Since I wanted to draw a graph, and the calculations for two different values of x are completely independent, I expected to get near-optimal speedup from parallel evaluation. My code:

import Data.Ratio
import Control.Parallel.Strategies

f 0 x = 0
f n x = let tmp = f (n-1) x in
        tmp + (x-tmp^2)/2

main = do
        it <- readLn
        let fn = fromRational . f it                      -- compute exactly in Rational, convert to Double at the end
            values = map fn [0,1%2..10] :: [Double]
            computed = values `using` parBuffer 16 rseq   -- spark evaluation up to 16 elements ahead
        mapM_ print computed
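
As I understand it, parBuffer 16 rseq sparks the evaluation of up to 16 list elements ahead of the consumer, and rseq forces each element only to weak head normal form; for a flat Double that is already full evaluation, so the strategy itself should be fine. A minimal standalone sketch of the same pattern, with a made-up expensive function slow standing in for f:

import Control.Parallel.Strategies (parBuffer, rseq, using)

-- deliberately expensive, pure work (hypothetical stand-in for f)
slow :: Integer -> Integer
slow n = sum [1 .. n * 1000000]

main :: IO ()
main = do
        -- spark up to 16 elements ahead of the consumer; rseq is
        -- enough here because each element is a flat Integer
        let results = map slow [1 .. 40] `using` parBuffer 16 rseq
        mapM_ print results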

But to my surprise, this did not really scale (on my dual-core i3 with Hyper-Threading):

$ ghc -threaded -O f.hs
[1 of 1] Compiling Main             ( f.hs, f.o )
Linking f ...
$ time echo 20 | (./f +RTS -N1 > /dev/null)

real    0m4.760s
user    0m4.736s
sys     0m0.016s
$ time echo 20 | (./f +RTS -N2 > /dev/null)

real    0m4.041s
user    0m5.416s
sys     0m2.548s
$ time echo 20 | (./f +RTS -N3 > /dev/null)

real    0m4.884s
user    0m10.936s
sys     0m3.464s
$ time echo 20 | (./f +RTS -N4 > /dev/null)

real    0m5.536s
user    0m17.028s
sys     0m3.888s

What did I do wrong here? It seems that a lot of time is spent locking (sys?) instead of doing useful work.

Solution

I think that because the total running time is relatively small, you are seeing a lot of overhead from the garbage collector repeatedly resizing the heap. You can try making the initial allocation area larger by passing +RTS -A100m.
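
For example (a sketch of the invocation only, no measured numbers; depending on your GHC version you may need to link with -rtsopts so the binary accepts these extra RTS flags, and -s makes the runtime print garbage-collection statistics to stderr so you can see how much time goes to GC):

$ ghc -threaded -rtsopts -O f.hs
$ time echo 20 | (./f +RTS -N2 -A100m -s > /dev/null)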
