Performance – unpacked boxed value in a vector of quads
I try to debug performance problems as part of more complex code It seems that the append function I used to create dynamic, growable (int, int, int) vectors causes an int in the tuple to be boxed and unboxed before being written to the vector I wrote a simpler code to reproduce this problem - it seems to happen only when I add the vector growth function to the append function - the following example code (it doesn't do much useful work except to reproduce the problem), followed by the core fragment, showing boxed and unboxed values:
{-# LANGUAGE BangPatterns #-} module Test where import Data.Vector.Un@R_624_2419@ed.Mutable as MU import Data.Vector.Un@R_624_2419@ed as U hiding (mapM_) import Control.Monad.ST as ST import Control.Monad.Primitive (PrimState) import Control.Monad (when) import GHC.Float.RealFracMethods (int2Float) import Data.STRef (newSTRef,writeSTRef,readSTRef) import Data.Word type MVI1 s = MVector (PrimState (ST s)) Int type MVI4 s = MVector (PrimState (ST s)) (Int,Int) data Snakev s = S {-# UNPACK #-}!Int !(MVI4 s) newVI1 :: Int -> Int -> ST s (MVI1 s) newVI1 n x = do a <- new n mapM_ (\i -> MU.unsafeWrite a i x) [0..n-1] return a -- Growable array - we always append an element. It grows by factor of 1.5 if more capacity is needed append :: Snakev s -> (Int,Int) -> ST s (Snakev s) append (S i v) x = do if i < MU.length v then MU.unsafeWrite v i x >> return (S (i+1) v) else MU.unsafeGrow v (floor $! 1.5 * (int2Float $MU.length v)) >>= (\y -> MU.unsafeWrite y i x >> return (S (i+1) y)) gridWalk :: Vector Word8 -> Vector Word8 -> MVI1 s -> MVI1 s -> Snakev s -> Int -> (Vector Word8 -> Vector Word8 -> Int -> Int -> Int) -> ST s (Snakev s) gridWalk a b fp snodes snakesv !k cmp = do let offset = 1+U.length a xp = offset-k snodep <- MU.unsafeRead snodes xp -- get the index of prevIoUs snake node in snakev array append snakesv (snodep,xp,xp) {-#INLINABLE gridWalk #-}
GHC generates an append version for gridwalk This function is $wa in the core – note the boxed int parameter:
$wa :: forall s. Int# -> MVI4 s -> Int# -> Int# -> Int# -> Int ======= @R_624_2419@ed value - one of (Int,Int) is @R_624_2419@ed -> State# s -> (# State# s,Snakev s #) $wa = \ (@ s) (ww :: Int#) (ww1 :: MVI4 s) (ww2 :: Int#) (ww3 :: Int#) (ww4 :: Int#) (ww5 :: Int) === @R_624_2419@ed value (w :: State# s) -> .... .... of ipv12 { __DEFAULT -> case (writeIntArray# ipv7 ww ww4 (ipv12 `cast` ...)) `cast` ... of ipv13 { __DEFAULT -> (# case ww5 of _ { I# x# -> (writeIntArray# ipv10 ww x# (ipv13 `cast` ...)) `cast` ... },S (+# ww 1) ((MV_4 (+# y rb) ==== x below un@R_624_2419@ed from arg ww5 ====== ((MVector 0 x ipv1) `cast` ...) ((MVector 0 x1 ipv4) `cast` ...) ((MVector 0 x2 ipv7) `cast` ...) ((MVector 0 x3 ipv10) `cast` ...)) `cast` ...) #)
When gridwalk calls append, enter a value:
=== function called by gridWalk ====== a :: forall s. Vector Word8 -> Vector Word8 -> MVI1 s -> MVI1 s -> Snakev s -> Int -> (Vector Word8 -> Vector Word8 -> Int -> Int -> Int) -> State# s -> (# State# s,Snakev s #) a = \ (@ s) (a1 :: Vector Word8) _ _ (snodes :: MVI1 s) (snakesv :: Snakev s) (k :: Int) _ (eta :: State# s) -> case k of _ { I# ipv -> case snodes `cast` ... of _ { MVector rb _ rb2 -> case a1 `cast` ... of _ { Vector _ rb4 _ -> let { y :: Int# y = -# (+# 1 rb4) ipv } in case readIntArray# rb2 (+# rb y) (eta `cast` ...) of _ { (# ipv1,ipv2 #) -> case snakesv of _ { S ww ww1 -> ====== y @R_624_2419@ed below before append called ====== $wa ww ww1 ipv2 y y (I# y) (ipv1 `cast` ...) } } } } }
Therefore, before inserting into the vector of (int, int), the effect seems to be the boxing and additional unboxing of values in gridwalk Appending inline to the tag does not change the behavior - those boxed values only move in the body of the gridwalk function
I would appreciate how to unpack this value I want to preserve the append function (that is, handle vector growth when the capacity is exceeded) and reconstruct it at the same time
GHC version is 7.6 1. The vector version is 0.10
Solution
This is just a comment I think I will get rid of the tuple parameter (adjusting the use of append in gridWalk), but the result is (only) the last Int parameter must explode so that nothing is packed, which looks strange.
append :: Snakev s -> Int -> Int -> Int -> Int -> ST s (Snakev s) append (S i v) a b c !d = do if i < len then do MU.unsafeWrite v i (a,b,c,d) return $S (i+1) v else do y <- MU.unsafeGrow v additional MU.unsafeWrite y i (a,d) return $S (i+1) y where len = MU.length v additional = floor (1.5 * int2Float len) -- this seems kind of bizarre -- by the way; can't you stay inside Int? -- 3 * (len `div` 2) or something
Edit, if you move the application of S (I 1) outside the do block, you can also get all the unpacked things, but I'm not sure if it brings us closer to the quarry...:
append :: Snakev s -> Int -> Int -> Int -> Int -> ST s (Snakev s) append (S i v) a b c d = do if i < len then liftM (S (i+1)) $do MU.unsafeWrite v i (a,d) return v else liftM ( S (i+1)) $do y <- MU.unsafeGrow v zzz MU.unsafeWrite y i (a,d) return y where len = MU.length v zzz = floor (1.5 * int2Float len)
But if liftm is replaced by fmap, we will return to the separate unboxing state If liftm (s (1 I) or fmap (s (I 1) moves all the way to the front, the situation will proceed smoothly:
append (S i v) a b c d = S (i+1) <$> do ...