<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:blogger='http://schemas.google.com/blogger/2008' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-301893373063372699</id><updated>2013-05-03T20:14:42.651+10:00</updated><title type='text'>Disciple Development</title><subtitle type='html'>More info at the &lt;a href="http://disciple.ouroborus.net"&gt;The Disciplined Disciple Compiler (DDC) Development Wiki&lt;/a&gt;.&lt;br&gt;
Easy tickets to get started with: &lt;a href="http://disciple.ouroborus.net/report/9"&gt;on the trac&lt;/a&gt;.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://disciple-devel.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default'/><link rel='alternate' type='text/html' href='http://disciple-devel.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Ben Lippmeier</name><uri>http://www.blogger.com/profile/08287674468193351664</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://1.bp.blogspot.com/_LHkvBlK1v2I/TLkJ_Jet4SI/AAAAAAAAABA/YbVJgyKQxIc/S220/100_4990.JPG'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>15</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-301893373063372699.post-661333655626726737</id><published>2013-05-03T20:04:00.000+10:00</published><updated>2013-05-03T20:14:42.656+10:00</updated><title type='text'>First working program using the Repa plugin</title><content type='html'> &lt;h3&gt;Haskell Source&lt;/h3&gt;&lt;pre&gt;&lt;br /&gt;repa_process :: R.Stream k Int -&gt; Int&lt;br /&gt;repa_process s&lt;br /&gt; = R.fold (+) 0 s + R.fold (*) 1 s&lt;br /&gt;&lt;/pre&gt;&lt;br&gt;  &lt;h3&gt;Actual well formed GHC Core code&lt;/h3&gt;.. or at least the parts that get printed with -dsuppress-all.   &lt;pre&gt;&lt;br /&gt;repa_process&lt;br /&gt;repa_process =&lt;br /&gt;  \ @ k_aq6 arg_s2Rf -&gt;&lt;br /&gt;    let { (# x1_s2R6, x1_acc_s2R5 #) ~ _ &lt;- newByteArray# 8 realWorld# } in&lt;br /&gt;    let { __DEFAULT ~ x2_s2R8            &lt;- writeIntArray# x1_acc_s2R5 0 0 x1_s2R6 } in&lt;br /&gt;    let { (# x3_s2Rd, x3_acc_s2Rc #) ~ _ &lt;- newByteArray# 8 x2_s2R8 } in&lt;br /&gt;    let { __DEFAULT ~ x4_s2RS            &lt;- writeIntArray# x3_acc_s2Rc 0 1 x3_s2Rd } in&lt;br /&gt;    let { Stream ds1_s2Rs ds2_s2Rj ~ _   &lt;- arg_s2Rf } in&lt;br /&gt;    let { Vector rb_s2Rz _ rb2_s2Ry ~ _  &lt;- ds2_s2Rj `cast` ... } in&lt;br /&gt;    letrec {&lt;br /&gt;      go_s2RP&lt;br /&gt;      go_s2RP =&lt;br /&gt;        \ ix_s2Rr world_s2Ru -&gt;&lt;br /&gt;          case &gt;=# ix_s2Rr ds1_s2Rs of _ {&lt;br /&gt;            False -&gt;&lt;br /&gt;              let { (# x7_s2RF, x0_s2RC #) ~ _ &lt;- readIntArray# x1_acc_s2R5 0 world_s2Ru } in&lt;br /&gt;              let { __DEFAULT ~ sat_s2SV       &lt;- +# rb_s2Rz ix_s2Rr } in&lt;br /&gt;              let { __DEFAULT ~ wild3_s2RD     &lt;- indexIntArray# rb2_s2Ry sat_s2SV } in&lt;br /&gt;              let { __DEFAULT ~ sat_s2SU       &lt;- +# x0_s2RC wild3_s2RD } in&lt;br /&gt;              let { __DEFAULT ~ x8_s2RH        &lt;- writeIntArray# x1_acc_s2R5 0 sat_s2SU x7_s2RF } in&lt;br /&gt;              let { (# x9_s2RN, x1_s2RL #) ~ _ &lt;- readIntArray# x3_acc_s2Rc 0 x8_s2RH } in&lt;br /&gt;              let { __DEFAULT ~ sat_s2ST       &lt;- *# x1_s2RL wild3_s2RD } in&lt;br /&gt;              let { __DEFAULT ~ x10_s2RR       &lt;- writeIntArray# x3_acc_s2Rc 0 sat_s2ST x9_s2RN } in&lt;br /&gt;              let { __DEFAULT ~ sat_s2SS &lt;- +# ix_s2Rr 1 } in&lt;br /&gt;              go_s2RP sat_s2SS x10_s2RR;&lt;br /&gt;            True -&gt; world_s2Ru&lt;br /&gt;          }; } in&lt;br /&gt;    let { __DEFAULT ~ x11_s2RU &lt;- go_s2RP 0 x4_s2RS } in&lt;br /&gt;    let { (# x12_s2RY, x1_s2S2 #) ~ _  &lt;- readIntArray# x1_acc_s2R5 0 x11_s2RU } in&lt;br /&gt;    let { (# _, x2_s2S3 #) ~ _         &lt;- readIntArray# x3_acc_s2Rc 0 x12_s2RY } in&lt;br /&gt;    let { __DEFAULT ~ sat_s2SW         &lt;- +# x1_s2S2 x2_s2S3 } in I# sat_s2SW&lt;br /&gt;&lt;/pre&gt; Im using ByteArrays to fake imperative variables. The next job is to convert the arrays x1_acc_s2R5 and x3_acc_s2Rc into proper accumulation variables, and attach them to the go_s2RP loop. &lt;br&gt;&lt;br&gt; &lt;h3&gt;x86 assembly code&lt;/h3&gt;This is just for the inner loop.  &lt;pre&gt;&lt;br /&gt;LBB11_2:&lt;br /&gt;        movq    (%rcx), %rdi&lt;br /&gt;        addq    %rdi, 16(%rdx)&lt;br /&gt;        imulq   16(%rsi), %rdi&lt;br /&gt;        movq    %rdi, 16(%rsi)&lt;br /&gt;        addq    $8, %rcx&lt;br /&gt;        incq    %r14&lt;br /&gt;        cmpq    %r14, %rax&lt;br /&gt;        jg      LBB11_2&lt;br /&gt;&lt;/pre&gt; The extra memory operations are there because the LLVM optimiser doesn't know that the two ByteArrays I'm using to fake imperative variables don't alias. This should turn into optimal code once it uses pure accumulators.  </content><link rel='replies' type='application/atom+xml' href='http://disciple-devel.blogspot.com/feeds/661333655626726737/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://disciple-devel.blogspot.com/2013/05/first-working-program-using-repa-plugin.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/661333655626726737'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/661333655626726737'/><link rel='alternate' type='text/html' href='http://disciple-devel.blogspot.com/2013/05/first-working-program-using-repa-plugin.html' title='First working program using the Repa plugin'/><author><name>Ben Lippmeier</name><uri>http://www.blogger.com/profile/08287674468193351664</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://1.bp.blogspot.com/_LHkvBlK1v2I/TLkJ_Jet4SI/AAAAAAAAABA/YbVJgyKQxIc/S220/100_4990.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-301893373063372699.post-3526409899006429971</id><published>2013-05-01T17:08:00.000+10:00</published><updated>2013-05-02T00:38:02.064+10:00</updated><title type='text'>Array fusion using the lowering transform</title><content type='html'>&lt;div&gt;I'm getting back into blogging. This is current work in progress.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Repa 4 will include a GHC plugin that performs array fusion using a version of Richard Waters's series expressions system, extended to support the segmented operators we need for Data Parallel Haskell.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The plugin converts GHC core code to Disciple Core, performs the fusion transform, and then converts back to GHC core. We're using Disciple Core for the main transform because it has a simple (and working) external core format, and the core language AST along with most of the core-to-core transforms are parametric in the type used for names. This second feature is&amp;nbsp;important&amp;nbsp;because we use a version of Disciple Core where the main array combinators (map, fold, filter etc) are primitive operators rather than regular function bindings.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The fusion transform is almost (almost) working for some simple examples. Here is the compilation sequence:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;b&gt;Haskell Source&lt;/b&gt;&lt;br /&gt;&lt;div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;process :: Stream k Int -&amp;gt; Int&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;process s&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp;= fold (+) 0 s + fold (*) 1 s&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;In this example we're performing two separate reductions over the same stream. None of the existing short-cut fusion approaches can handle this properly. Stream fusion in Data.Vector, Repa-style delayed array fusion, and build/foldr fusion will all read the stream elements from memory twice (if they are already manifest) or compute them twice (if they are delayed). We want to compute both reductions in a single pass.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;b&gt;Raw GHC core converted to DDC core&lt;/b&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;repa_process_r2 : [k_aq0 : Data].Stream_r0 k_aq0 Int_3J -&amp;gt; Int_3J&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; = \(k_c : *_34d).\(s_aqe : Stream_r0 k_c Int_3J).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; $fNumInt_$c+_rjF&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;(fold_r34 [k_c] [Int_3J] [Int_3J] $fNumInt_$c+_rjF&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;(I#_6d 0i#) s_aqe)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;(fold_r34 [k_c] [Int_3J] [Int_3J] $fNumInt_$c*_rjE&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;(I#_6d 1i#) s_aqe)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Detect array combinators from GHC core, and convert to DDC primops&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;repa_process_r2 : [k_aq0 : Rate].Stream# k_aq0 Int# -&amp;gt; Int#&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp;= /\(k_c : Rate).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; \(s_aqe : Stream# k_c Int#).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;add# [Int#]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (fold# [k_c] [Int#] [Int#] (add# [Int#]) 0i# s_aqe)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (fold# [k_c] [Int#] [Int#] (mul# [Int#]) 1i# s_aqe)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Normalize and shift array combinators to top-level&lt;/b&gt;&lt;/div&gt;&lt;div&gt;All array combinators are used in their own binding.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;repa_process_r2 : [k_aq0 : Rate].Stream# k_aq0 Int# -&amp;gt; Int#&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp;= /\(k_c : Rate).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; \(s_aqe : Stream# k_c Int#).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;let x0 = add# [Int#] in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;let x1 = fold# [k_c] [Int#] [Int#] x0 0i# s_aqe in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;let x2 = mul# [Int#] in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;let x3 = fold# [k_c] [Int#] [Int#] x2 1i# s_aqe in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;add# [Int#] x1 x3&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Inline and eta-expand worker functions&lt;/b&gt;&lt;/div&gt;&lt;div&gt;This puts the program in the correct form for the next phase.&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;repa_process_r2 : [k_aq0 : Rate].Stream# k_aq0 Int# -&amp;gt; Int#&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp;= /\(k_c : Rate).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; \(s_aqe : Stream# k_c Int#).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;let x1&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; = fold# [k_c] [Int#] [Int#]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (\(x0 x1 : Int#). add# [Int#] x0 x1) 0i# s_aqe in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;let x3&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; = fold# [k_c] [Int#] [Int#]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (\(x2 x3 : Int#). mul# [Int#] x2 x3) 1i# s_aqe in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; add# [Int#] x1 x3&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Do the lowering transform&lt;/b&gt;&lt;/div&gt;&lt;div&gt;This is the main pass that performs array fusion. Note that we've introduced a single loop# that computes both of the fold# results.&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;br /&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;repa_process_r2 : [k_c : Rate].Stream# k_c Int# -&amp;gt; Int#&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp;= /\(k_c : Rate).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; \(s_aqe : Stream# k_c Int#).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;let x1_acc : Ref# Int# = new# [Int#] 0i# in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;let x3_acc : Ref# Int# = new# [Int#] 1i# in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;let _ : Unit&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; = loop# (lengthOfRate# [k_c])&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (\(x0 : Nat#).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let x1 : Int# = next# [Int#] [k_c] s_aqe x0 in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let x0 : Int# = read# [Int#] x1_acc in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let _ &amp;nbsp;: Void#&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;= write# [Int#] x1_acc (add# [Int#] x0 x1) in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let x2 : Int# = read# [Int#] x3_acc in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let _ &amp;nbsp;: Void#&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;= write# [Int#] x3_acc (mul# [Int#] x2 x1) in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;()) in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;let x1 : Int# = read# [Int#] x1_acc in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;let x3 : Int# = read# [Int#] x3_acc in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;add# [Int#] x1 x3&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Assign imperative variable storage to arrays&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;We need to convert the code back to GHC core, but we don't want to use IORefs because they can't hold unboxed values (of types like Int#). Instead, we use some new arrays to hold these values instead.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;repa_process_r2 : [k_c : Rate].Stream# k_c Int# -&amp;gt; Int#&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp;= /\(k_c : Rate).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; \(s_aqe : Stream# k_c Int#).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;let x1_acc : Array# Int# = newArray# [Int#] 8# in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;let _ &amp;nbsp; &amp;nbsp; &amp;nbsp;: Void# &amp;nbsp; &amp;nbsp; &amp;nbsp; = writeArray# [Int#] x1_acc 0# 0i# in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;let x3_acc : Array# Int# = newArray# [Int#] 8# in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;let _ &amp;nbsp; &amp;nbsp; &amp;nbsp;: Void# &amp;nbsp; &amp;nbsp; &amp;nbsp; = writeArray# [Int#] x3_acc 0# 1i# in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp;let _ : Unit&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;= loop# (lengthOfRate# [k_c])&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;(\(x0 : Nat#).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let x1 : Int# = next# [Int#] [k_c] s_aqe x0 in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let x0 : Int# = readArray# [Int#] x1_acc 0# in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let _ : Void#&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; = writeArray# [Int#] x1_acc 0#&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (add# [Int#] x0 x1) in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let x2 : Int# = readArray# [Int#] x3_acc 0# in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let _ : Void#&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;= writeArray# [Int#] x3_acc 0#&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;(mul# [Int#] x2 x1) in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;()) in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; let x1 : Int# = readArray# [Int#] x1_acc 0# in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; let x3 : Int# = readArray# [Int#] x3_acc 0# in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; add# [Int#] x1 x3&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Thread state token through effectful primops&lt;/b&gt;&lt;/div&gt;&lt;div&gt;The lowered code is naturally imperative, and GHC uses state threading to represent this.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;repa_process_r2 : [k_c : Rate].Stream# k_c Int# -&amp;gt; World# -&amp;gt; Tuple2# World# Int#&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; = /\(k_c : Rate).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;\(s_aqe : Stream# k_c Int#).\(x0 : World#).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; caselet T2# (x1 : World#) (x1_acc : Array# Int#)&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; = newArray# [Int#] 8# x0 in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; caselet T2# (x2 : World#) (_ : Void#)&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; = writeArray# [Int#] x1_acc 0# 0i# x1 in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; caselet T2# (x3 : World#) (x3_acc : Array# Int#)&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; = newArray# [Int#] 8# x2 in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; caselet T2# (x4 : World#) (_ : Void#)&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; = writeArray# [Int#] x3_acc 0# 1i# x3 in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; caselet T2# (x11 : World#) (_ : Unit)&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; = loop# (lengthOfRate# [k_c])&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (\(x0 : Nat#).\(x5 : World#).&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;caselet T2# (x6 : World#) (x1 : Int#)&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;= next# [Int#] [k_c] s_aqe x0 x5 in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;caselet T2# (x7 : World#) (x0 : Int#)&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;= readArray# [Int#] x1_acc 0# x6 in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;caselet T2# (x8 : World#) (_ : Void#)&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;= writeArray# [Int#] x1_acc 0# (add# [Int#] x0 x1) x7 in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;caselet T2# (x9 : World#) (x2 : Int#)&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;= readArray# [Int#] x3_acc 0# x8 in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;caselet T2# (x10 : World#) (_ : Void#)&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;= writeArray# [Int#] x3_acc 0# (mul# [Int#] x2 x1) x9 in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;T2# x10 ()) x4 in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; caselet T2# (x12 : World#) (x1 : Int#)&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; = readArray# [Int#] x1_acc 0# x11 in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; caselet T2# (x13 : World#) (x3 : Int#)&amp;nbsp;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; = readArray# [Int#] x3_acc 0# x12 in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; T2# x13 (add# [Int#] x1 x3)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here, "caselet" is just sugar for a case expression with a single alternative.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Covert back to GHC core&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;repa_process_sTX&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; :: forall k_c.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;Data.Array.Repa.Series.Stream k_c GHC.Types.Int&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;-&amp;gt; GHC.Prim.State# GHC.Prim.RealWorld&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;-&amp;gt; (# GHC.Prim.State# GHC.Prim.RealWorld, GHC.Prim.Int# #)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;[LclId]&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;lowered_sTX =&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; \ (@ k_c)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; (rate_sTY :: GHC.Prim.Int#)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; (x_sTZ :: Data.Array.Repa.Series.Stream k_c GHC.Types.Int)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; (x_sU0 :: GHC.Prim.State# GHC.Prim.RealWorld) -&amp;gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; let { (# x_sU1, x_sU2 #) ~ scrut_sUv&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;lt;- newByteArray#_sU3 @ GHC.Prim.RealWorld 8 x_sU0&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; } in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; let { __DEFAULT ~ x_sU8&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;lt;- writeIntArray#_sU4 @ GHC.Prim.RealWorld x_sU2 0 0 x_sU1&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; } in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; let { (# x_sU5, x_sU6 #) ~ scrut_sUw&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;lt;- newByteArray#_sU7 @ GHC.Prim.RealWorld 8 x_sU8&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; } in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; let { __DEFAULT ~ x_sUp&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;lt;- writeIntArray#_sU9 @ GHC.Prim.RealWorld x_sU6 0 1 x_sU5&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; } in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; let { (# x_sUa, x_sUb #) ~ x_sUa&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;lt;- Main.primLoop&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;(Main.primLengthOfRate rate_sTY)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;(\ (x_sUc :: GHC.Prim.Int#)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (x_sUd :: GHC.Prim.State# GHC.Prim.RealWorld) -&amp;gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let { (# x_sUe, x_sU1 #) ~ x_sUe&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;lt;- Main.primNext_Int @ k_c x_sTZ x_sUc x_sUd&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; } in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let { (# x_sUf, x_sUc #) ~ x_sUf&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;lt;- readIntArray#_sUg x_sU2 0 x_sUe&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; } in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let { __DEFAULT ~ x_sUl&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;lt;- writeIntArray#_sUh&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;@ GHC.Prim.RealWorld x_sU2 0 (+#_sUi x_sUc x_sU1) x_sUf&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; } in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let { (# x_sUj, x_sU8 #) ~ x_sUj&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;lt;- readIntArray#_sUk x_sU6 0 x_sUl&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; } in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let { __DEFAULT ~ x_sUo&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;lt;- writeIntArray#_sUm&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;@ GHC.Prim.RealWorld x_sU6 0 (*#_sUn x_sU8 x_sU1) x_sUj&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; } in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (# x_sUo, GHC.Tuple.() #))&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;x_sUp&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; } in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; let { (# x_sUq, x_sU1 #) ~ x_sUq&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;lt;- readIntArray#_sUr x_sU2 0 x_sUa&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; } in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; let { (# x_sUs, x_sU5 #) ~ x_sUs&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; &amp;lt;- readIntArray#_sUt x_sU6 0 x_sUq&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; } in&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: Courier New, Courier, monospace;"&gt;&amp;nbsp; &amp;nbsp; (# x_sUs, +#_sUu x_sU1 x_sU5 #)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This doesn't work yet because I've forgotten to pass the type arguments to the unboxed tuple constructor &lt;span style="font-family: Courier New, Courier, monospace;"&gt;(#,#)&lt;/span&gt;&lt;span style="font-family: inherit;"&gt;, and maybe other problems as well. I'll post again when I have an actual program running.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span style="font-family: inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://disciple-devel.blogspot.com/feeds/3526409899006429971/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://disciple-devel.blogspot.com/2013/05/array-fusion-using-lowering-transform.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/3526409899006429971'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/3526409899006429971'/><link rel='alternate' type='text/html' href='http://disciple-devel.blogspot.com/2013/05/array-fusion-using-lowering-transform.html' title='Array fusion using the lowering transform'/><author><name>Ben Lippmeier</name><uri>http://www.blogger.com/profile/08287674468193351664</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://1.bp.blogspot.com/_LHkvBlK1v2I/TLkJ_Jet4SI/AAAAAAAAABA/YbVJgyKQxIc/S220/100_4990.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-301893373063372699.post-3333727029780551750</id><published>2013-01-01T16:03:00.000+11:00</published><updated>2013-01-01T16:03:39.604+11:00</updated><title type='text'>Code Generators, Rewrite Rules, Aliasing and the Coq</title><content type='html'>&lt;div&gt;DDC 0.3.1 was pushed onto &lt;a href="http://hackage.haskell.org/package/ddc-tools"&gt;Hackage&lt;/a&gt; just before Christmas.&lt;br /&gt;&lt;br /&gt;The main features in this new release are:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Back end code generation via C and LLVM that accepts the new core language. There is enough implemented to compile first order programs, but we'll need to implement a lambda lifter for the new core language before we can compile higher order programs directly.&lt;/li&gt;&lt;li&gt;Lots more program transformations. You can apply program transformations to core programs on the command line, and check that the code is being optimised the way it should be. There is a &lt;a href="http://disciple.ouroborus.net/wiki/Tutorial/Core/Optimisation"&gt;tutorial&lt;/a&gt; describing how to do this.&amp;nbsp;&lt;/li&gt;&lt;li&gt;A rewrite rule framework that understands the region and effect system. Rewrite rules work on effectful expressions, and can be given predicates so they only apply when particular effects are non-interfering. This lets us perform build-fold fusion when the builder and folder have (non-interferring) side effects. (more info in Amos Robinson's&amp;nbsp;&lt;a href="http://code.ouroborus.net/ddc/doc/theses/2012-AmosRobinson-RewriteRules.pdf"&gt;honours thesis&lt;/a&gt;)&lt;/li&gt;&lt;li&gt;Propagation of no-aliasing meta-data to the LLVM code generator. We have extended the core language with witnesses of distinctness, that prove particular regions of the store are distinct (do not alias). When LLVM has this meta-data it can perform more optimisations on the generated code that it would otherwise be able to. (more info in Tran Ma's&amp;nbsp;&lt;a href="http://code.ouroborus.net/ddc/doc/theses/2012-TranMa-AliasingControl.pdf"&gt;honours thesis&lt;/a&gt;)&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;The main missing feature is:&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;No type inference, so the core programs contain lots of type annotations. We'll need to get the type inferencer back online before the language will be useful to end-users.&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;Our immediate plans are to get a paper together about the new core language and how it supports the rewrite rules and LLVM meta-data. There are interesting stories to tell about all of these things. After that, we'll work on the type inferencer.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For the paper, we'll need a proof that the core language does what it's supposed to. To get this, I've started adding the region and effect system to the Coq formalisation of SystemF2+MutableStore I did about a year ago.&amp;nbsp;I hadn't touched Coq since then, so eased my way back into it by cleaning up the proofs I had already done, then packaged them up into the&amp;nbsp;&lt;a href="http://iron.ouroborus.net/"&gt;Iron Lambda&lt;/a&gt;&amp;nbsp;collection.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://iron.ouroborus.net/"&gt;Iron Lambda&lt;/a&gt;&amp;nbsp;is a collection of Coq formalisations for functional core languages of increasing complexity, with the SystemF2+Regions+Effects language I'm working on now being the latest one. I made a wiki page for Iron Lambda because I think it will be useful in its own right, to help intermediate Coq users bridge the gap between the end of the &lt;a href="http://www.cis.upenn.edu/~bcpierce/sf/"&gt;Software Foundations &lt;/a&gt;course and what appears in current research papers. The biggest language in Software Foundations is Simply Typed Lambda Calculus (STLC)+Records+SubTyping, but it doesn't include polymorphism or indirect mutual recursion. To deal with SystemF style polymorphism we need a better approach to names than unique identifiers (ie, something like deBruijn indices). To deal with indirectly mutually recursive language definitions we need to know how to define custom own induction schemes. Iron Lambda shows how to do these things.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;For the impatient:&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;DDC 0.3.1&lt;/b&gt;&amp;nbsp;&lt;/div&gt;&lt;pre&gt; $ cabal update&lt;br /&gt; $ cabal install ddc-tools&lt;/pre&gt;&lt;br /&gt;&lt;b&gt;Iron Lambda&lt;/b&gt;&lt;br /&gt;&lt;pre&gt; $ darcs get http://code.ouroborus.net/iron/iron-head/&lt;/pre&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://disciple-devel.blogspot.com/feeds/3333727029780551750/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://disciple-devel.blogspot.com/2013/01/code-generators-rewrite-rules-aliasing.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/3333727029780551750'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/3333727029780551750'/><link rel='alternate' type='text/html' href='http://disciple-devel.blogspot.com/2013/01/code-generators-rewrite-rules-aliasing.html' title='Code Generators, Rewrite Rules, Aliasing and the Coq'/><author><name>Ben Lippmeier</name><uri>http://www.blogger.com/profile/08287674468193351664</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://1.bp.blogspot.com/_LHkvBlK1v2I/TLkJ_Jet4SI/AAAAAAAAABA/YbVJgyKQxIc/S220/100_4990.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-301893373063372699.post-1305137130684749361</id><published>2012-02-02T17:50:00.004+11:00</published><updated>2012-02-02T17:59:54.876+11:00</updated><title type='text'>Vectorisation without Replication in Data Parallel Haskell</title><content type='html'>Here is a &lt;a href="http://en.wikipedia.org/wiki/Barnes-Hut"&gt;Barnes-Hut&lt;/a&gt; gravitation simulation written using &lt;a href="http://hackage.haskell.org/package/vector-0.9.1"&gt;Data.Vector&lt;/a&gt; and &lt;a href="http://hackage.haskell.org/package/gloss-1.6.0.1"&gt;Gloss&lt;/a&gt;. &lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;iframe width="640" height="360" src="http://www.youtube.com/embed/tDgOBM29ny4?rel=0" frameborder="0" allowfullscreen&gt;&lt;/iframe&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;If done naively, such an n-body simulation has a runtime complexity of O(n^2) because we need to consider the interaction of every body with every other body. Barnes-Hut performs an approximation that reduces this to O(n . log n). At each step in the simulation we insert the bodies (grey) into a quad-tree (green), and compute the centroid for each branch (blue). Now, if some other body is sufficiently far away from a centroid, then we use the centroid to approximate the force due to the bodies in the corresponding branch, instead of inspecting each one individually.&lt;br /&gt;&lt;br /&gt;Now you've seen the video, the following graph sums up my work on Data Parallel Haskell (DPH) for the past six months:&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://code.ouroborus.net/dph/blog/2012/02/nbody-graph.png" imageanchor="1" style=""&gt;&lt;img border="0" height="451" width="600" src="http://code.ouroborus.net/dph/blog/2012/02/nbody-graph.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;This is a plot of runtime vs the number of bodies in the Barnes-Hut simulation. The simulation in the video uses just 1000 bodies, but my understanding is that the physicists need millions or billions to do useful physics work. Back to the graph, the red line is the program from the video, which uses Data.Vector to store its arrays and runs sequentially. The brown line is using the DPH library that shipped with GHC 7.2. The blue line is using the DPH library that will ship with GHC 7.4. Note how the asymptotic complexity of the program with New DPH is the same as with Data.Vector. &lt;br /&gt;&lt;br /&gt;In Old DPH we were using a naive approach to vectorisation that resulted in the quad-tree being replicated (copied) once for every body in the simulation (bad!). In New DPH we compute the force on each body in parallel while sharing the same quad-tree. There is a story about this, and you can read all about it when we finish the paper.&lt;br /&gt;&lt;br /&gt;Of course, asymptotic complexity isn't everything. The DPH program is still running about 10x slower than the Data.Vector program for large input sizes, and I'm sure that production C programs would run at least 5x faster than that. There's much low hanging fruit here. DPH misses many standard optimisation opportunities, which results in numerical values being unboxed and reboxed in our inner loops. There are also algorithmic improvements to the library that are just waiting for me to implement them. If I can make the blue line cross the red line in the next six months, then I'm buying everyone a beer.</content><link rel='replies' type='application/atom+xml' href='http://disciple-devel.blogspot.com/feeds/1305137130684749361/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://disciple-devel.blogspot.com/2012/02/vectorisation-without-replication-in.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/1305137130684749361'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/1305137130684749361'/><link rel='alternate' type='text/html' href='http://disciple-devel.blogspot.com/2012/02/vectorisation-without-replication-in.html' title='Vectorisation without Replication in Data Parallel Haskell'/><author><name>Ben Lippmeier</name><uri>http://www.blogger.com/profile/08287674468193351664</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://1.bp.blogspot.com/_LHkvBlK1v2I/TLkJ_Jet4SI/AAAAAAAAABA/YbVJgyKQxIc/S220/100_4990.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://img.youtube.com/vi/tDgOBM29ny4/default.jpg' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-301893373063372699.post-4620264616681706587</id><published>2012-01-23T17:20:00.000+11:00</published><updated>2012-01-23T17:20:20.150+11:00</updated><title type='text'>The new Disciple Core language</title><content type='html'>It's been a while since updates. The last one was in August, and after that I got distracted with ICFP. When I got back I spent more time doing Coq proofs, finishing with Progress and Preservation for a version of SystemF2 with mutable algebraic data. The proofs are &lt;a href="http://code.ouroborus.net/ddc/ddc-head/proof/DDC/Language/SystemF2Store/"&gt;here&lt;/a&gt;. All data is allocated into the store (heap), and there are primitive functions to update it. In this language you can, say, destructively update a list so that the tail points to different physical data. This language doesn't have effect typing, but after this proof I felt like I had crossed the line from "Coq-shy" to "Coq-sure".&lt;br /&gt;&lt;br /&gt;In early November I read about ciu-equivalence, and how to prove contextual equivalence of program transformations. Thinking ahead, it felt like time to cleanup the sponginess in the existing DDC core language, because I'd need to do this before trying to formalise it. Although the plain calculus has a proper semantics and a hand-done soundness proof, the actual core language as used in the compiler also has some half-baked hacks. The reviewers of a previous paper had made suggestions about how to cleanup other aspects of the core calculus, and having Coqified all those languages, I now understand why it was good advice.&lt;br /&gt;&lt;br /&gt;Cutting to the chase, I've redesigned the DDC core language and there is an interpreter that works well enough to evaluate simple recursive functions. Here is an example: &lt;br /&gt;&lt;pre&gt;  letrec {&lt;br /&gt;   ack    [r1 r2 r3: %] &lt;br /&gt;          (m : Int r1) {!0 | Use r1 + Use r2 + Use r3}&lt;br /&gt;          (n : Int r2) { Read r1 + Read r2 + Alloc r3&lt;br /&gt;                       | Use r1  + Use r2  + Use r3}&lt;br /&gt;          : Int r3&lt;br /&gt;    = letregion r4 in&lt;br /&gt;      let zero = 0 [r4] () in&lt;br /&gt;      let one  = 1 [r4] () in&lt;br /&gt;      case eqInt [:r1 r4 r4:] m zero of {&lt;br /&gt;          1 -&gt; addInt [:r2 r4 r3:] n one;&lt;br /&gt;          _ -&gt; case eqInt [:r2 r4 r4:] n zero of {&lt;br /&gt;                  1 -&gt; ack [:r4 r4 r3:] &lt;br /&gt;                           (subInt [:r1 r4 r4:] m one) &lt;br /&gt;                           (1 [r4] ());&lt;br /&gt;&lt;br /&gt;                  _ -&gt; ack [:r4 r4 r3:] &lt;br /&gt;                           (subInt [:r1 r4 r4:] m one)&lt;br /&gt;                           (ack [:r1 r4 r4:] m (subInt [:r2 r4 r4:] n one));&lt;br /&gt;          }&lt;br /&gt;    }&lt;br /&gt;  } in ack [:R0# R0# R0#:] (2 [R0#] ()) (3 [R0#] ());;&lt;br /&gt;&lt;/pre&gt;Here is another example that does destructive update of an integer:&lt;br /&gt;&lt;pre&gt;  letregion r1 with {w1 : Mutable r1} in&lt;br /&gt;  let x : Int r1 = 0 [r1] () in&lt;br /&gt;  let _ : Unit   = updateInt [:r1 r1:] &amp;lt w1 &amp;gt x (2 [r1] ()) in&lt;br /&gt;  addInt [:r1 r1 R2#:] x (3 [r1] ());;&lt;br /&gt;&lt;/pre&gt;and one that suspends the allocation of an integer with lazy evaluation:&lt;br /&gt;&lt;pre&gt;  letregion r1 with { w1 : Const r1; w2 : Lazy r1; w3 : Global r1 } in&lt;br /&gt;  let x : Int r1 lazy &amp;lt w2 &amp;gt &lt;br /&gt;     = purify &amp;lt alloc [r1] w1 &amp;gt  in&lt;br /&gt;       forget &amp;lt use [r1] w3 w1 &amp;gt in&lt;br /&gt;       2 [r1] () &lt;br /&gt;  in addInt [:r1 R0# R0#:] x (3 [R0#] ());;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Note that this is an intermediate representation for a compiler, and has vastly more type information than a real user would want to write. The compiler will perform type inference on the source language, and automatically translate the user program to this lower level language.&lt;br /&gt;&lt;br /&gt;The new DDC core language is described on the &lt;a href="http://disciple.ouroborus.net/wiki/Language/Core"&gt;wiki&lt;/a&gt; and so far I've been reasonably good at keeping the wiki up to date with what's implemented. &lt;br /&gt;&lt;br /&gt;The main changes are:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Witnesses now exist in their own universe, at level 0 next to values. Both values and witnesses are classified by types, and types classified by kinds. This removes the dependent-kinding of the old language. The more I thought about it, the less fun formalising a dependently kinded language seemed to be.&lt;br /&gt;&lt;br /&gt;&lt;li&gt;I've removed the more-than constraints on effect and closure variables. The type inference algorithm I am using returns constraints on effect and closure variables, so I had added similar constraints to the core language. This was a bad idea because managing the additional subsumption judgements during core type checking is a headache. The new core language has no such constraints, and I think I know to process the output of the inferencer so I won't need them.&lt;br /&gt;&lt;br /&gt;&lt;li&gt;Introducing lazy evaluation is now done with the  &lt;tt&gt;let .. lazy&lt;/tt&gt; binding form instead of a primitive suspend operator. This goes hand-in-hand with the &lt;tt&gt;purify&lt;/tt&gt; and &lt;tt&gt;forget&lt;/tt&gt; keywords that mask the effect and closure of their body respectively. These two keywords are akin to the type casts used by SystemFc and the GHC core language to support GADTs. I think it makes more sense to mask out the effect of an expression before suspending it, than to pass a witness as to why it's pure. The former style will still be used in the source program because that's how the type inferencer works, but the latter should be easier to work with in the core language.&lt;br /&gt;&lt;/ul&gt;Anyway, if you &lt;tt&gt;darcs get&lt;/tt&gt; the DDC repo you can &lt;tt&gt;make bin/ddci-core&lt;/tt&gt; to build the interpreter. Run the examples like &lt;tt&gt;bin/ddci-core test/ddci-core/30-Eval/30-Letrec/Test.dcx&lt;/tt&gt;. It's not completely finished, but all the examples under the &lt;tt&gt;test/ddci-core&lt;/tt&gt; directory run fine.&lt;br /&gt;&lt;br /&gt;The interpreter should be done in another couple of weeks. After that it'll be time to split off the LLVM backend from the existing compiler so that we can compile core programs directly.</content><link rel='replies' type='application/atom+xml' href='http://disciple-devel.blogspot.com/feeds/4620264616681706587/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://disciple-devel.blogspot.com/2012/01/new-disciple-core-language.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/4620264616681706587'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/4620264616681706587'/><link rel='alternate' type='text/html' href='http://disciple-devel.blogspot.com/2012/01/new-disciple-core-language.html' title='The new Disciple Core language'/><author><name>Ben Lippmeier</name><uri>http://www.blogger.com/profile/08287674468193351664</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://1.bp.blogspot.com/_LHkvBlK1v2I/TLkJ_Jet4SI/AAAAAAAAABA/YbVJgyKQxIc/S220/100_4990.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-301893373063372699.post-1212596525606215985</id><published>2011-08-06T19:39:00.001+10:00</published><updated>2011-08-08T18:10:22.515+10:00</updated><title type='text'>How I learned to stop worrying and love de Bruijn indices</title><content type='html'>When I started learning how to formalise languages in Coq, one of my initial stumbling blocks was choosing an approach to name binding. There are at least three fancy sounding contenders: Locally Nameless, Nominal Reasoning and Parametric Higher Order Abstract Syntax. Of course, you could also use good old de Bruijn indices, but proponents of fancy approaches warn about the large  volume of lemmas regarding lifting and substitution you'll need to prove before you can start the "real work". &lt;br /&gt;&lt;br /&gt;Back in April I scratched around for about two weeks reading up on all the different approaches, trying to identify the "best" one (whatever that means). Then I read the following comment by Xavier Leroy on the &lt;a href="https://alliance.seas.upenn.edu/~plclub/cgi-bin/poplmark/index.php?title=Submission_by_Xavier_Leroy"&gt;POPLmark website&lt;/a&gt;. He had used the Locally Nameless approach, but had stated: "If the purpose is just to solve the challenge and be done with it, I would rather go for pure de Bruijn". That was enough to convince me to stop worrying and just get into it. I've spent about 250 hours now doing my own proofs, and I can report that using de Bruijn isn't really that bad. Yes, you need to prove some lemmas about lifting and substitution, but my entire response to naysayers is summed up henceforth: "The only computational operation that lambda calculus has is substitution, so don't pretend you can get away without proving something about it." Corollary: HOAS doesn't count.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;The Lemmas&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;You need to know what the lemmas are though, and I advise cribbing them from someone else's proof. I learned them from Arthur Charguéraud's proof, but I don't know where they are from originally. You're more than welcome to &lt;a href="http://code.ouroborus.net/ddc/ddc-head/proof/DDC/Language/"&gt;copy them from mine&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Once you know what the lemmas are, and how they work in STLC, extending them to new languages is straightforward. For reference, the following are the definitions of the lifting and substitution operators for STLC. Note that if you just want to prove Progress and Preservation for STLC, you don't actually need the lemmas. They're needed in bigger languages such as SystemF2 to show that type substitution is commutative. Nevertheless, for illustrative purposes we'll just stick with STLC expressions.&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;Inductive exp : Type :=&lt;br /&gt; | XVar  : nat -&gt; exp&lt;br /&gt; | XLam  : ty  -&gt; exp -&gt; exp&lt;br /&gt; | XApp  : exp -&gt; exp -&gt; exp.&lt;br /&gt;&lt;br /&gt;Fixpoint &lt;br /&gt; liftX  (d:  nat) (* current binding depth in expression *)&lt;br /&gt;        (xx: exp) (* expression containing references to lift *)&lt;br /&gt;        : exp&lt;br /&gt; := match xx with &lt;br /&gt;    |  XVar ix    &lt;br /&gt;    =&gt; if le_gt_dec d ix&lt;br /&gt;        (* Index is referencing the env, so lift it across the new elem *)&lt;br /&gt;        then XVar (S ix)&lt;br /&gt;&lt;br /&gt;        (* Index is locally bound in the expression itself, and &lt;br /&gt;           not the environment, so we don't need to change it. *)&lt;br /&gt;        else xx&lt;br /&gt;&lt;br /&gt;    (* Increase the current depth as we move across a lambda. *)&lt;br /&gt;    |  XLam t1 x1&lt;br /&gt;    =&gt; XLam t1 (liftX (S d) x1)&lt;br /&gt;&lt;br /&gt;    |  XApp x1 x2&lt;br /&gt;    =&gt; XApp   (liftX d x1) (liftX d x2)&lt;br /&gt;    end.&lt;br /&gt;&lt;br /&gt;Fixpoint&lt;br /&gt; substX (d:  nat) (* current binding depth in expression *)&lt;br /&gt;        (u:  exp) (* new expression to substitute *)&lt;br /&gt;        (xx: exp) (* expression to substitute into *)&lt;br /&gt;        : exp &lt;br /&gt; := match xx with&lt;br /&gt;    | XVar ix &lt;br /&gt;    =&gt; match nat_compare ix d with&lt;br /&gt;       (* Index matches the one we are substituting for. *)&lt;br /&gt;       | Eq  =&gt; u&lt;br /&gt;       &lt;br /&gt;       (* Index was free in the original expression.&lt;br /&gt;          As we've removed the outermost binder, also decrease this&lt;br /&gt;          index by one. *)&lt;br /&gt;       | Gt  =&gt; XVar (ix - 1)&lt;br /&gt;&lt;br /&gt;       (* Index was bound in the original expression. *)&lt;br /&gt;       | Lt  =&gt; XVar ix&lt;br /&gt;       end&lt;br /&gt;&lt;br /&gt;    (* Increase the depth as we move across a lambda. *)&lt;br /&gt;    |  XLam t1 x2&lt;br /&gt;    =&gt; XLam t1 (substX (S d) (liftX 0 u) x2)&lt;br /&gt;&lt;br /&gt;    |  XApp x1 x2 &lt;br /&gt;    =&gt; XApp (substX d u x1) (substX d u x2)&lt;br /&gt; end. &lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;b&gt;Drawing the deBruijn environment&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Here is a nice lemma to get started. &lt;br /&gt;&lt;pre&gt;Lemma lift_lift&lt;br /&gt; :  forall n m t&lt;br /&gt; ,  lift n              (lift (n + m) t) &lt;br /&gt; =  lift (1 + (n + m)) (lift n t).&lt;br /&gt;&lt;/pre&gt;This is one of the lemmas you need for proving commutativity of substitution. Although it looks complicated at first glance, it will make significantly more sense when you know how to draw it. &lt;br /&gt;&lt;br /&gt;First, consider the standard typing judgement form:&lt;br /&gt;&lt;pre&gt;Env |- Exp :: Type&lt;br /&gt;&lt;/pre&gt;Assume there are de Bruijn indices in the expression. Some of these indices point to lambdas in the expression itself, corresponding to locally bound variables, while some point to lambdas in the environment, corresponding to free variables. Here is an example:&lt;br /&gt;&lt;br /&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-ovN0N2suJ9U/Tjzvg78Eu7I/AAAAAAAAADU/HZSZVsu_qsc/s800/debruijnexample.png" /&gt;&lt;br /&gt;&lt;br /&gt;Suppose I apply (lift 2) to the above expression. Although it would be helpful for educational purposes to do this directly using the definition of 'lift' above (try it!), I'll tell you the shortcut instead. Firstly, if we assume that the definition of 'lift' is correct, none of the bound indices will be changed during the application. Ignoring bound indices, the only ones remaining are free indices. These are the indices that point into the environment. &lt;br /&gt;&lt;br /&gt;Now, although applying 'lift' to an expression may increment these free indices, instead of thinking about indices being incremented, it's easier to think about environment elements being shifted. I'll explain what I mean by that in a moment, but for now let's just do it...&lt;br /&gt;&lt;br /&gt;Here is the same expression as before, with the environment positions numbered in blue:&lt;br /&gt;&lt;br /&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-QHLoh_KX6Wk/Tjz0gHKQpfI/AAAAAAAAADc/naH-0Vypd6s/s800/debruijnenvpos.png" /&gt;&lt;br /&gt;Shifting the elements after position two yields the following:&lt;br /&gt;&lt;br /&gt;&lt;img border="0" src="http://4.bp.blogspot.com/-ua-UyDDJTVU/Tjz5vjy_oyI/AAAAAAAAADs/ilnIANDn1uI/s800/debruijnenvlifted.png" /&gt;&lt;br /&gt;&lt;br /&gt;In the above diagram I've used '_' to represent a hole in the environment. This is a place where I could insert a new element, and be guaranteed that none of the indices in the expression point to this new element. This in turn means that if I apply (lift 2) to the expression, then insert a new element into the hole, then the resulting judgement is guaranteed to give me the same type as before (t5 in this case).&lt;br /&gt;&lt;br /&gt;Of course, applying 'lift' to the expression doesn't actually change the environment. In this example I've moved 't1' and 't2' to the left for illustrative purposes, but this is an operation separate from applying 'lift' to the expression itself. However, applying 'lift' does changes the indices, and these indices are &lt;i&gt;pointers&lt;/i&gt; into the environment. If we like, we could instead think about moving the pointers (the arrow heads) instead of the actual environment elements. This is an important change in perspective. Instead of thinking about 'lift' as incrementing indices in the expression, we want to think about 'lift' as shifting environment pointers to the left.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Let's look at our lemma again:&lt;br /&gt;&lt;pre&gt;Lemma lift_lift&lt;br /&gt; :  forall n m t&lt;br /&gt; ,  lift m              (lift (n + m) t) &lt;br /&gt; =  lift (1 + (n + m)) (lift n t).&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;As 't' is quantified, we need to prove this for all possible terms 't'. Using our change in perspective, this also means that we want to prove the lemma for all possible sets of environment pointers. Given the environment pointers for 't', the expression '(lift (n + m) t)' transforms them into a new set (by adding a hole at position n + m). Likewise the expression 'lift (1 + (n + m)) (lift n t)' transforms the environment pointers of 't' by adding a hole a position 'n', and then adding another hole at position '1 + (n + m)'. Here is a picture of the full situation. I've just drawn the contiguous set of environment pointers as rectangles, with the holes are marked in grey.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-uAGyoL6az9g/TjzWE3X5vsI/AAAAAAAAADE/mYBg55TTJh4/s800/lemmaliftlift.png" imageanchor="1" style="margin-left:1em; margin-right:1em"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/-uAGyoL6az9g/TjzWE3X5vsI/AAAAAAAAADE/mYBg55TTJh4/s800/lemmaliftlift.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;In the first line we have set of environment pointers for 't'. Applying 'lift n' adds a hole a position 'n'. Alternatively, applying 'lift (n + m)' adds a hole at position '(n + m)'. Given 'lift n t', if we add another hole at (1 + n + m) we get the last set shown. Likewise, if we take the set 'lift (n + m) t' and add a hole at position 'n' then we also get the last line.&lt;br /&gt;&lt;br /&gt;That's it. That complicated looking lemma above can be drawn with a few rectangles, and provided the definition of 'lift' does what it's supposed to, the lemma is obviously true. For a language like SystemF2 you need about 5 lemmas concerning lifting and substitution. They're all easy to draw, and the proofs are straightforward once you understand the general approach. In the next post I'll show how to draw substitution, which works in a similar way.</content><link rel='replies' type='application/atom+xml' href='http://disciple-devel.blogspot.com/feeds/1212596525606215985/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://disciple-devel.blogspot.com/2011/08/how-i-learned-to-stop-worrying-and-love.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/1212596525606215985'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/1212596525606215985'/><link rel='alternate' type='text/html' href='http://disciple-devel.blogspot.com/2011/08/how-i-learned-to-stop-worrying-and-love.html' title='How I learned to stop worrying and love de Bruijn indices'/><author><name>Ben Lippmeier</name><uri>http://www.blogger.com/profile/08287674468193351664</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://1.bp.blogspot.com/_LHkvBlK1v2I/TLkJ_Jet4SI/AAAAAAAAABA/YbVJgyKQxIc/S220/100_4990.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-ovN0N2suJ9U/Tjzvg78Eu7I/AAAAAAAAADU/HZSZVsu_qsc/s72-c/debruijnexample.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-301893373063372699.post-4069330490138844450</id><published>2011-05-25T16:42:00.003+10:00</published><updated>2011-05-25T17:04:02.258+10:00</updated><title type='text'>Proofs and mutual recursion</title><content type='html'>Quick update. I'm still working on mechanising the proofs for the DDC core language in Coq. So far I've made it through Progress, Preservation and the correspondence between Big and Small step semantics for &lt;a href="http://code.ouroborus.net/ddc/ddc-head/proof/Simple/"&gt;Simply Typed Lambda Calculus&lt;/a&gt; (STLC), &lt;a href="http://code.ouroborus.net/ddc/ddc-head/proof/SystemF/"&gt;System-F&lt;/a&gt;, &lt;a href="http://code.ouroborus.net/ddc/ddc-head/proof/SystemF2/"&gt;System-F2&lt;/a&gt;, and &lt;a href="http://code.ouroborus.net/ddc/ddc-head/proof/Simple_PCF/"&gt;PCF&lt;/a&gt;. Extending the STLC proof with polymorphism and fixpoints etc was surprisingly straightforward. It only took about 4 hours each to create the other versions of the proof.&lt;br /&gt;&lt;br /&gt;The next qualitative hurdle appears to be handling the mutual recursion that arises when we extend the language in other ways. To add case expressions, I need to represent case alternatives. Case alternatives contain expressions, and expressions contain (lists of) alternatives.&lt;br /&gt;&lt;br /&gt;Unfortunately, with mutually recursive definitions, it appears that Coq can't automatically generate all the induction schemes I need. I could be wrong on this though, as I'm still a Coq newbie. I've figured out that the Coq command "Combined Schemes", will make &lt;i&gt;some&lt;/i&gt; of the schemes I might want, but it doesn't cover all of the use-cases. The problem (I think) is that although expressions are defined in terms of lists of alternatives, the lists themselves are polymorphic and aren't directly defined in terms of alternatives. For this reason, I can't convince "Combined Scheme" to make me a strong enough induction scheme, so I must write down and prove this myself.&lt;br /&gt;&lt;br /&gt;At least, that's what I'm inferring from reading the &lt;a href="http://www.ii.uni.wroc.pl/~dabi/Publications/Haskell10/stg-in-coq/stg-in-coq/"&gt;proof of soundness for the STG machine&lt;/a&gt; described in a &lt;a href="http://www.comlab.ox.ac.uk/files/3858/pirog-biernacki-hs10.pdf"&gt;paper&lt;/a&gt; by Maciej Pirog and Dariusz Biernacki. I'm still working through it.</content><link rel='replies' type='application/atom+xml' href='http://disciple-devel.blogspot.com/feeds/4069330490138844450/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://disciple-devel.blogspot.com/2011/05/proofs-and-mutual-recursion.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/4069330490138844450'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/4069330490138844450'/><link rel='alternate' type='text/html' href='http://disciple-devel.blogspot.com/2011/05/proofs-and-mutual-recursion.html' title='Proofs and mutual recursion'/><author><name>Ben Lippmeier</name><uri>http://www.blogger.com/profile/08287674468193351664</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://1.bp.blogspot.com/_LHkvBlK1v2I/TLkJ_Jet4SI/AAAAAAAAABA/YbVJgyKQxIc/S220/100_4990.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-301893373063372699.post-6354990687442850953</id><published>2011-04-19T21:40:00.000+10:00</published><updated>2011-04-19T21:40:04.070+10:00</updated><title type='text'>Falling down the naming well</title><content type='html'>Over the last month or so I've started to formalise the proofs of the DDC core language in Coq. The existing proofs were just getting too big to manage by hand, and typing them all up in Latex was seeming like a less and less fun thing to do.&lt;br /&gt;&lt;br /&gt;Being a Coq-newbie I started with Benjamin Pierce and co's &lt;a href="http://www.cis.upenn.edu/~bcpierce/sf/"&gt;Software Foundations&lt;/a&gt; course notes. I had done a couple of pages of Coq tutorial before, but never tried to use it for anything real. I've found the course notes to provide a fairly smooth ride, though I got antsy during the IMP formalisation and skipped straight to the lambda calculus one. After about two weeks of evenings I was able to redo the simply typed lambda calculus (STLC) one by myself, but hit a wall trying to scale up to System-F. To cut a longer story short, I've come to appreciate (along with many others) what a massive hack capture avoiding substitution is.&lt;br /&gt;&lt;br /&gt;I'm fast moving towards the viewpoint that the deBruijn and/or locally nameless representation is actually the "real" lambda calculus, whereas named binders are just a layer of syntactic sugar that confuses everything (and gives you cavities). I've ended up just settling on vanilla deBruijn indices for further work, mainly because the examples I've seen so far using the locally nameless representation rely on a heavy amount of Coq Ltac magic that I don't understand yet. I'm also more than willing to trade time for cleverness. By that I mean I'm willing to commit more time for a less clever solution, provided I know that the approach I take will get the job done and I won't have to start all over again using a different formalisation of binders.&lt;br /&gt;&lt;br /&gt;Underneath all of this is an itch in my mind saying that I was already writing a perfectly good paper about the DDC core language, and I don't want to start another one right now about "How to formalise DDC Core in Coq". Anyway, I've made it through Progress and Preservation for STLC using the deBruijn representation, including all the list lemmas, and am feeling fairly good about it so far. The proofs are in the main DDC tree at &lt;a href="http://code.ouroborus.net/ddc/ddc-head/proof/"&gt;http://code.ouroborus.net/ddc/ddc-head/proof/&lt;/a&gt;.</content><link rel='replies' type='application/atom+xml' href='http://disciple-devel.blogspot.com/feeds/6354990687442850953/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://disciple-devel.blogspot.com/2011/04/falling-down-naming-well.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/6354990687442850953'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/6354990687442850953'/><link rel='alternate' type='text/html' href='http://disciple-devel.blogspot.com/2011/04/falling-down-naming-well.html' title='Falling down the naming well'/><author><name>Ben Lippmeier</name><uri>http://www.blogger.com/profile/08287674468193351664</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://1.bp.blogspot.com/_LHkvBlK1v2I/TLkJ_Jet4SI/AAAAAAAAABA/YbVJgyKQxIc/S220/100_4990.JPG'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-301893373063372699.post-2550061964781586025</id><published>2011-03-29T19:44:00.005+11:00</published><updated>2011-03-29T21:05:56.619+11:00</updated><title type='text'>Real Time Edge Detection in Haskell</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/-k6mYbRMrKTM/TZGhiwGTWvI/AAAAAAAAAB4/aX9H50xgY-k/s1600/beholder.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 241px;" src="http://2.bp.blogspot.com/-k6mYbRMrKTM/TZGhiwGTWvI/AAAAAAAAAB4/aX9H50xgY-k/s400/beholder.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5589426230942456562" /&gt;&lt;/a&gt;&lt;br /&gt;Last week we submitted a &lt;a href="http://www.cse.unsw.edu.au/~benl/papers/stencil/stencil-icfp2011-sub.pdf"&gt;paper&lt;/a&gt; to ICFP about how to implement efficient parallel stencil convolutions in Haskell. This goes along with a new release of &lt;a href="http://hackage.haskell.org/package/repa"&gt;Repa&lt;/a&gt; that runs on GHC 7.0.3. We've extended the old array representation to support the partitioning of arrays into several regions, and rewritten the back-end to recover sharing between the computation of adjacent array elements. Coupled with a careful management of aliasing issues, this lets us implement image processing algorithms that have comparable performance to &lt;a href="http://opencv.willowgarage.com/wiki/"&gt;OpenCV&lt;/a&gt;. &lt;br /&gt;&lt;br /&gt;Absolute performance depends on what data format we're using. Using Word8s is about 4 times slower than OpenCV because it uses SIMD intrinsics that we don't have access to from Haskell (yet!). On the other hand, using Float32s and 2 Haskell threads can run faster than single threaded OpenCV, as the parallelism makes up for the lack of SIMD instructions.&lt;br /&gt;&lt;br /&gt;Here is a &lt;a href="http://code.ouroborus.net/beholder/video/Edge480.mov"&gt;video&lt;/a&gt; of me demonstrating a Canny edge detector applied to a stream captured from my iSight camera. The demo uses Objective C for the GUI, and Haskell for the processing. The &lt;a href="http://code.ouroborus.net/beholder/beholder-head/"&gt;source&lt;/a&gt; builds with XCode 3.2.1 and GHC 7.0.3, though you'll need to update and run the CONFIGURE.sh script in the root dir to point it to your GHC install. There are also prebuilt versions with the GHC "+RTS -N" option baked in for &lt;a href="http://code.ouroborus.net/beholder/distro/beholder-N2.tgz"&gt;two&lt;/a&gt;, &lt;a href="http://code.ouroborus.net/beholder/distro/beholder-N4.tgz"&gt;four&lt;/a&gt; and &lt;a href="http://code.ouroborus.net/beholder/distro/beholder-N6.tgz"&gt;six&lt;/a&gt; threads. They all do the same thing apart from the number of threads. You can change the RTS options in the  main.m module if you want to rebuild it. On my 4-Core i7 desktop it does about 20 frames per second with no problem -- the framerate from the iSight is set dynamically by OSX depending on system load. I've tested it on 3 separate machines, but let me know if it doesn't work for you. Non-Mac users can run the edge detector over .bmp files, using the repa-canny program from &lt;a href="http://hackage.haskell.org/package/repa-examples"&gt;repa-examples&lt;/a&gt;.</content><link rel='replies' type='application/atom+xml' href='http://disciple-devel.blogspot.com/feeds/2550061964781586025/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://disciple-devel.blogspot.com/2011/03/real-time-edge-detection-in-haskell.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/2550061964781586025'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/2550061964781586025'/><link rel='alternate' type='text/html' href='http://disciple-devel.blogspot.com/2011/03/real-time-edge-detection-in-haskell.html' title='Real Time Edge Detection in Haskell'/><author><name>Ben Lippmeier</name><uri>http://www.blogger.com/profile/08287674468193351664</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://1.bp.blogspot.com/_LHkvBlK1v2I/TLkJ_Jet4SI/AAAAAAAAABA/YbVJgyKQxIc/S220/100_4990.JPG'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-k6mYbRMrKTM/TZGhiwGTWvI/AAAAAAAAAB4/aX9H50xgY-k/s72-c/beholder.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-301893373063372699.post-3142037159814850638</id><published>2011-01-03T22:54:00.006+11:00</published><updated>2011-01-04T00:03:29.475+11:00</updated><title type='text'>Papers, Proofs, PrimOps, Allocation and Strengthening</title><content type='html'>&lt;span style="font-weight:bold;"&gt;Papers&lt;/span&gt;&lt;br /&gt;General status update: I got the reviews back for the &lt;a href="http://www.cse.unsw.edu.au/~benl/papers/witness/witnessing-jfp-sub.pdf"&gt;paper&lt;/a&gt; I submitted to JFP on the DDC core language. The reviews were very helpful, and a huge thank you to the reviewers if you're reading this. I will endeavour to make all my own reviews as helpful.&lt;br /&gt;&lt;br /&gt;The bottom line was the paper looked promising, but needs more motivation for the language design, and more meta-theory. The motivation is easy enough to 'port across from my thesis, so this part shouldn't take too long. For background, the JFP paper is an extension of the APLAS 2009 paper I wrote just after I finished my thesis. The original version of that paper had more motivational material, but I ended up cutting it to fit the page limit. For the JFP paper I tried to stick to the details of the core language, but I suppose not discussing the source language and wider context makes it hard to see the point of it all. There were also suggestions and questions as to the details of the language and typing rules, which I'll either fold in or explain why.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Proofs&lt;/span&gt;&lt;br /&gt;For the meta-theory, the paper needs proofs of observational equivalence for the example optimisations, showing how the witnesses are used during optimisation. I had a proof of soundness for the type system in the appendix, and worked examples of the optimisations, but not formal proofs that the optimisations are ok. Said optimisations are just standard ones like let floating and full-laziness, but I agree that if the point of having witnesses of purity and constancy in the core language is to perform optimisations in the presence of side-effects, then we want some formal link between the witnesses and those optimisations.&lt;br /&gt;&lt;br /&gt;Anyway, I started on the proofs over the christmas break, and have the structure of one showing that let-floating is sound (changing the order of let-bindings with non-interferring effects), though I still need to fill in some details. I'm estimating about two or three weeks work (full time equivalent) to finish the proofs and update the paper.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;PrimOps&lt;/span&gt;&lt;br /&gt;I went to a hackathon hosted by &lt;a href="http://www.mega-nerd.com/erikd/Blog"&gt;Erik&lt;/a&gt; last week where we worked on LLVM support for DDC. Erik's written all the code for the backend, though to push it all the way through I needed to change how primops are handled in the rest of the compiler (which is explained in Erik's blog post). I just finished that this afternoon, so we're gunning for full LLVM support in the near future. The primop handling is pretty clean now, so if we wanted to, say, add support for vector operations like MMX and SSE then this should be reasonably straightforward.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Allocation Effects&lt;/span&gt; &lt;br /&gt;The paper reviews have also convinced me that we want explicit allocation effects in the source and core languages. One of my thesis reviewers also suggested this, though so far I've left them out because they weren't absolutely needed. Allocation effects have a slightly different flavour to world effects (like writing to the Console) and to read/writing mutable data. When two expressions have world or read/write effects then this means they can interfere in such a way that you cannot change the &lt;span style="font-style:italic;"&gt;order&lt;/span&gt; of these two expressions. In contrast, you can always change the order of expressions that just allocate new objects.&lt;br /&gt;&lt;br /&gt;However, if an expression allocates new &lt;span style="font-style:italic;"&gt;mutable&lt;/span&gt; objects, then you cannot hoist it out of a lambda (apply the full-laziness transform). Hoisting an expression means it is potentially executed &lt;span style="font-style:italic;"&gt;less often&lt;/span&gt;, so we're not just changing the order of two actions in time. If you want to do hoisting without information about allocation, then it's sufficient to check that the binding to be hoisted performs no world or read/write effects, and does not return mutable data. However, using allocation effects will make this reasoning clearer, and can enable more optimisations later, so I'll add them in. I started on a &lt;a href="http://code.ouroborus.net/ddc/ddc-alloc/"&gt;branch&lt;/a&gt; that has them, but now I've convinced myself they're easy enough to add I'm going to leave the hacking to later and focus on getting the paper finished first.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Strengthening&lt;/span&gt; &lt;br /&gt;In the type inferencer, it turns out that strengthening more-than constraints to equality constraints during inference breaks programs that use interesting higher order functions (like with parser combinators). This isn't a problem with standard 2nd order functions like map, fold, filter etc, but only appears with 3rd order functions and higher, or with data containing functions, which is why I hadn't noticed it before. By "break" I mean the inferencer returns a bad type, so we get a panic when we try to type check the corresponding core program. We still want to do the strengthening because it removes the need for a pile of redundant effect applications in the core programs, but it should be done while converting the desugared language to core, instead of during inference, which is fairly obvious in hindsight. I've started fixing this as well, but need to adjust the type utils (like type simplification and effect masking) so they'll handle the non-strengthened forms. Another job to be left until after the paper is finished...</content><link rel='replies' type='application/atom+xml' href='http://disciple-devel.blogspot.com/feeds/3142037159814850638/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://disciple-devel.blogspot.com/2011/01/papers-proofs-primops-allocation-and_03.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/3142037159814850638'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/3142037159814850638'/><link rel='alternate' type='text/html' href='http://disciple-devel.blogspot.com/2011/01/papers-proofs-primops-allocation-and_03.html' title='Papers, Proofs, PrimOps, Allocation and Strengthening'/><author><name>Ben Lippmeier</name><uri>http://www.blogger.com/profile/08287674468193351664</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://1.bp.blogspot.com/_LHkvBlK1v2I/TLkJ_Jet4SI/AAAAAAAAABA/YbVJgyKQxIc/S220/100_4990.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-301893373063372699.post-6915360865043118372</id><published>2010-11-27T13:56:00.009+11:00</published><updated>2010-11-27T16:28:42.093+11:00</updated><title type='text'>Shootout: Fannkuch redux</title><content type='html'>A few times after finishing some GHC hacking, I've been talking to Simon Marlow and mentioned that whatever it was I was doing now "works".&lt;br /&gt;Usually, his reply is then "for what value of works?". Especially with compilers, there is quite a gap between "sorta works", "seems to work" and "industrial strength". &lt;br /&gt;&lt;br /&gt;Last night I was taking a stroll through the "sorta" end of town while hacking on a Disciple version of the fannkuch-redux program from the languages benchmarks game. Disciple is a dialect of Haskell, so I started with one of the nicer existing Haskell versions.&lt;br /&gt;&lt;br /&gt;Here is the version by Miha Vučkovič. I've reformatted it a bit and used the Disciple string pasting operator % instead of the (hated) Haskell ++, but it's otherwise the same program. (I've also added a type sig because DDC doesn't do dictionary passing yet, and used iterateL, but more on that later.)&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;flop (2:x1:t)                   = x1:2:t&lt;br /&gt;flop (3:x1:x2:t)                = x2:x1:3:t&lt;br /&gt;flop (4:x1:x2:x3:t)             = x3:x2:x1:4:t&lt;br /&gt;flop (5:x1:x2:x3:x4:t)          = x4:x3:x2:x1:5:t&lt;br /&gt;flop (6:x1:x2:x3:x4:x5:t)       = x5:x4:x3:x2:x1:6:t&lt;br /&gt;flop (7:x1:x2:x3:x4:x5:x6:t)    = x6:x5:x4:x3:x2:x1:7:t&lt;br /&gt;&lt;br /&gt;flop lst@(h:_) = r&lt;br /&gt; where flop' 0 (t, r)          = (t, r)&lt;br /&gt;       flop' n ((h:t), r)      = flop' (n-1) (t, h:r)&lt;br /&gt;       (t, r)                  = flop' h (lst, t)&lt;br /&gt; &lt;br /&gt;flopS (1:_)  = 0&lt;br /&gt;flopS lst    = 1 + flopS (flop lst)&lt;br /&gt;&lt;br /&gt;rotate n (h:t)         = rotate' (n-1) t &lt;br /&gt; where rotate' 0 l     = h:l&lt;br /&gt;       rotate' n (f:t) = f:(rotate' (n-1) t)&lt;br /&gt;&lt;br /&gt;checksum :: Int -&gt; Int -&gt; Int&lt;br /&gt;checksum i f&lt;br /&gt;   | mod i 2 == 0 = f&lt;br /&gt;   | True         = -f&lt;br /&gt;&lt;br /&gt;pfold :: (Int, Int) -&gt; [(Int, Int)] -&gt; (Int, Int)&lt;br /&gt;pfold r [] = r&lt;br /&gt;pfold (ac, af) ((c, f):t)  = pfold (sc, sf) t &lt;br /&gt; where  sc = ac+c&lt;br /&gt;        sf = max af f&lt;br /&gt;&lt;br /&gt;permut n          = foldr perm [[1..n]] [2..n] &lt;br /&gt; where perm x lst = concat [take x $ iterateL (rotate x) l | l &lt;- lst]&lt;br /&gt;&lt;br /&gt;main () &lt;br /&gt; = do n = 8&lt;br /&gt;      (chksm, mflops) &lt;br /&gt;            = pfold (0,0) &lt;br /&gt;            $ map (\(i, p) -&gt; let flops = flopS p &lt;br /&gt;                              in  (checksum i flops, flops))&lt;br /&gt;            $ zip [0..] (permut n)&lt;br /&gt;&lt;br /&gt; putStr $ show chksm % "\nPfannkuchen(" % show n % ") = " % show mflops % "\n"&lt;br /&gt;&lt;/pre&gt;It doesn't matter what it does, or how it does it. Check the shootout page if you're interested, we're just trying to get it running for now. &lt;br /&gt;&lt;br /&gt;&lt;h3&gt;Faking value recursion&lt;/h3&gt;&lt;br /&gt;My first compile attempt reported:&lt;pre&gt;&lt;br /&gt;desire:ddc-head-devel benl$ bin/ddc -O --make test/60-Shootout/Fannkuch-Redex/MainShoot.hs &lt;br /&gt;[1 of 1] Compiling MainShoot&lt;br /&gt;ddc: ERROR&lt;br /&gt;./test/60-Shootout/Fannkuch-Redex/MainShoot.hs:16:29&lt;br /&gt;        Conflicting region constraints.&lt;br /&gt;        &lt;br /&gt;              constraint: Direct %115&lt;br /&gt;         from the use of: flop&lt;br /&gt;                      at: ./test/60-Shootout/Fannkuch-Redex/MainShoot.hs:16:29&lt;br /&gt;        &lt;br /&gt;            conflicts with,&lt;br /&gt;              constraint: Lazy %115&lt;br /&gt;         from the use of: permut&lt;br /&gt;                      at: ./test/60-Shootout/Fannkuch-Redex/MainShoot.hs:42:29&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Confusing, yes, but at least it's a civilised message. What it's trying to say it thought the region named %115 was Direct, meaning it cannot contain thunks, but later code wants to add thunks to it (make it Lazy). Disciple tracks what objects can or cannot be thunks, because if the backend doesn't have to check for them then the code can be a lot faster.&lt;br /&gt;&lt;br /&gt;The confusing thing is why it thinks this region is supposed to be Direct. Top level data (like global integers) default to Direct because when we create them we know they're fully evaluated, and not thunks. However, flop is a function so should be polymorphic in this aspect, it should not require its arguments to be Direct.&lt;br /&gt;&lt;br /&gt;Ok, looking in the interface file for the inferred type of flop says:&lt;pre&gt;&lt;br /&gt;flop    :: Data.List.List %rTC0 (Int32 %rTC4) -(!e0 $c0)&gt; Data.List.List %rTC0 (Int32 %rTC4)&lt;br /&gt;        :- $c0   =  ${t : %rTC0} + ${t : %rTC4}&lt;br /&gt;        ,  !e0   =  !Read %rTC0 + !Read %rTC4&lt;br /&gt;        ,  Const  %rTC0&lt;br /&gt;        ,  Const  %rTC4&lt;br /&gt;        ,  Direct %rTC0&lt;br /&gt;        ,  Direct %rTC4&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;What I'm looking at is the line: &lt;pre&gt;&lt;br /&gt;           $c0   =  ${t : %rTC0} + ${t : %rTC4}&lt;br /&gt;&lt;/pre&gt;This is a closure constraint, and reveals that the inferencer thinks this function is referencing data in regions %rTC0 and %rTC4 from its environment. This might be true if it was referring to some top level piece of data, but it's not. The closure constraint also reports the variable being used to refer to this data, namely "t".&lt;br /&gt;&lt;br /&gt;A closer inspection of flop reveals the underlying problem:&lt;pre&gt;&lt;br /&gt;flop lst@(h:_) = r&lt;br /&gt; where flop' 0 (t, r)          = (t, r)&lt;br /&gt;       flop' n ((h:t), r)      = flop' (n-1) (t, h:r)&lt;br /&gt;       (t, r)                  = flop' h (lst, t)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Here, Miha has been tricky and used value recursion in the last line. Notice that "t" appears on both the left and the right of the binding, but the binding isn't defining a recursive function -- it's a recursive value.&lt;br /&gt;&lt;br /&gt;Disciple doesn't support value recursion yet, as it's tricky to implement in a default-strict language, but I'd like to look into it in the future. Anyway, in Disciple today only function bindings can be recursive, so what &lt;span style="font-style:italic;"&gt;should&lt;/span&gt; have happened is that the renamer should have said that "t" is not in scope in the right of the binding. Instead, that screwup has gotten through the renamer, and now the inferencer thinks that "t" has been defined outside of "flop". It has then defaulted its region to Direct and Const, as there are no opposing Lazy or Mutable constraints.&lt;br /&gt;&lt;br /&gt;The work around is to rewrite this as an actual function binding, then use laziness to break the recursive loop:&lt;pre&gt;&lt;br /&gt;flop lst@(h:_) = snd (thing ())&lt;br /&gt; where  flop' 0 (t, r)          = (t, r)&lt;br /&gt;        flop' n ((h:t), r)      = flop' (n-1) (t, h:r)&lt;br /&gt;        thing ()                = flop' h @ (lst, fst @ (thing @ ()))&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;A @ on the right of a binding means "lazy function application". For example, the expression (f @ x) builds a thunk holding pointers to the code for "f" and its argument "x". The programmer decides where to suspend expressions, and they are forced automatically by consumers as needed. I should probably use a different operator because it looks weird with respect to the @ on the left of bindings, but no matter.&lt;br /&gt;&lt;br /&gt;Changing this makes the program run fine, at least with an input size of 7 (the "n" in the main function).&lt;pre&gt;&lt;br /&gt;desire:ddc-head-devel benl$ bin/ddc -O --make test/60-Shootout/Fannkuch-Redex/MainShoot.hs &lt;br /&gt;[1 of 1] Compiling MainShoot&lt;br /&gt;&lt;br /&gt;desire:ddc-head-devel benl$ ./a.out &lt;br /&gt;228&lt;br /&gt;Pfannkuchen(7) = 16&lt;/pre&gt;&lt;br /&gt;Sadly, setting n &gt;= 8 gives&lt;pre&gt;&lt;br /&gt;desire:ddc-head-devel benl$ ./a.out &lt;br /&gt;*** DDC RTS PANIC! Slot stack overflow.&lt;br /&gt;Abort trap&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;h3&gt;Explicit Laziness&lt;/h3&gt;&lt;br /&gt;The "slot stack" is the stack that holds pointers to heap objects. This is sometimes called a "shadow stack" and the pointers are used as roots for the garbage collector. The reason it's overflown is that Disciple is a default strict language, so our friends map, fold, filter etc evaluate the entire list when called. We can see where the stack has overflown by running the program in GDB, the GNU debugger.&lt;pre&gt;&lt;br /&gt;desire:ddc-head-devel benl$ gdb a.out &lt;br /&gt;GNU gdb 6.3.50-20050815 (Apple version gdb-1346) (Fri Sep 18 20:40:51 UTC 2009)&lt;br /&gt;...&lt;br /&gt;(gdb) run&lt;br /&gt;Starting program: /Users/benl/devel/ddc/ddc-head-devel/a.out &lt;br /&gt;Reading symbols for shared libraries ++. done&lt;br /&gt;*** DDC RTS PANIC! Slot stack overflow.&lt;br /&gt;&lt;br /&gt;Program received signal SIGABRT, Aborted.&lt;br /&gt;0x93a74732 in __kill ()&lt;br /&gt;(gdb) &lt;br /&gt;(gdb) bt&lt;br /&gt;#0  0x93a74732 in __kill ()&lt;br /&gt;#1  0x93a74724 in kill$UNIX2003 ()&lt;br /&gt;#2  0x93b0798d in raise ()&lt;br /&gt;#3  0x93b1da44 in abort ()&lt;br /&gt;#4  0x00012041 in _panicOutOfSlots ()&lt;br /&gt;#5  0x00003a0c in Data_List__symCl ()&lt;br /&gt;#6  0x00003c04 in Data_List_iterateL ()&lt;br /&gt;#7  0x00014d17 in _forceStep ()&lt;br /&gt;#8  0x00014db8 in _force ()&lt;br /&gt;#9  0x000048a0 in Data_List_take ()&lt;br /&gt;#10 0x000049c5 in Data_List_take ()&lt;br /&gt;#11 0x0000363c in Data_Function__symDl ()&lt;br /&gt;#12 0x0000679f in _permut_vCL2 ()&lt;br /&gt;#13 0x0000434a in Data_List_concatMap ()&lt;br /&gt;#14 0x00004362 in Data_List_concatMap ()&lt;br /&gt;#15 0x00004362 in Data_List_concatMap ()&lt;br /&gt;#16 0x00004362 in Data_List_concatMap ()&lt;br /&gt;#17 0x00004362 in Data_List_concatMap ()&lt;br /&gt;#18 0x00004362 in Data_List_concatMap ()&lt;br /&gt;#19 0x00004362 in Data_List_concatMap ()&lt;br /&gt;#20 0x00004362 in Data_List_concatMap ()&lt;br /&gt;#21 0x00004362 in Data_List_concatMap ()&lt;br /&gt;#22 0x00004362 in Data_List_concatMap ()&lt;br /&gt;#23 0x00004362 in Data_List_concatMap ()&lt;br /&gt;... 5000 more lines&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;Usefully, DDC compiles via C and also uses the regular C stack for return addresses and unboxed data. The intermediate C files look like they almost could have been written by a human, and there is no evil mangler butchering the assembly code once it's been compiled. This means we get real stack traces, and I don't have the same sense of impending doom as when a GHC compiled program segfaults on me.&lt;br /&gt;&lt;br /&gt;From the stack trace, it looks a bit like it's run out of space in concatMap. There is no concatMap in the source code, but of course every Haskell hacker knows that list comprehensions are desugared to concatMap, so it's run out of space in the list comprehension.&lt;br /&gt;&lt;br /&gt;Disciple is default strict, so if you want producer/consumer style list processing to run in constant space using laziness, then you have to say so. Here are new versions of permut and main after rewriting the list comprehension to use a map, and converting to the lazy list functions concatL mapL and zipL:&lt;pre&gt;&lt;br /&gt;permut n   = foldr perm [[1..n]] [2..n] &lt;br /&gt; where perm x lst = concatL $ mapL (\l -&gt; take x $ iterateL (rotate x) l) lst&lt;br /&gt;&lt;br /&gt;main () &lt;br /&gt; = do   n = 10&lt;br /&gt;        (chksm, mflops) &lt;br /&gt;                = pfold (0,0) &lt;br /&gt;                $ mapL (\(i, p) -&gt; let flops = flopS p &lt;br /&gt;                                   in  (checksum i flops, flops))&lt;br /&gt;                $ zipL [0..] (permut n)&lt;br /&gt;&lt;br /&gt;        putStr $ show chksm % "\nPfannkuchen(" % show n % ") = " % show mflops % "\n"&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The functions concatL, mapL and zipL have similar definitions to the strict versions, but they're more lazy. This is like the difference between foldl and foldl' in the Haskell libraries. iterateL builds an infinite list, so there is no strict version called plain "iterate".&lt;br /&gt;&lt;br /&gt;This makes n=8 work, but n=9 runs out of heap.&lt;pre&gt;&lt;br /&gt;desire:ddc-head-devel benl$ bin/ddc -O --make test/60-Shootout/Fannkuch-Redex/MainShoot.hs &lt;br /&gt;[1 of 1] Compiling MainShoot&lt;br /&gt;desire:ddc-head-devel benl$ ./a.out &lt;br /&gt;*** DDC RTS PANIC! Out of heap space.&lt;br /&gt;        current (full) heap size: 9999999 bytes&lt;br /&gt;      could not allocate another: 16 bytes&lt;br /&gt;Abort trap&lt;br /&gt;&lt;/pre&gt;In a real runtime system, when the runtime heap is full it would ask for more from the OS. Unfortunately, the Disciple runtime is less featureful, uses a fixed size heap, and when it's full it's dead. It does have a garbage collector, but when all data in the heap is live there's nothing more the GC can do. However, you can set the size of the heap on the command line. Giving it 100Megs gets us n=9.&lt;pre&gt;&lt;br /&gt;desire:ddc-head-devel benl$ ./a.out +RTS -H 100000000&lt;br /&gt;8629&lt;br /&gt;Pfannkuchen(9) = 30&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;h3&gt;The Treasure Finder&lt;/h3&gt;&lt;br /&gt;Garbage collection is a misnomer. Garbage collectors don't collect unused data, they copy out live data to a new space. It'd be more accurate to call them "Treasure Finders". Anyway, for interests sake we can examine the rate treasure is being found in our program. The DDC runtime supports some simple statistics like so:&lt;pre&gt;&lt;br /&gt;desire:ddc-head-devel benl$ ./a.out +RTS -H 100000000 -profile-gc&lt;br /&gt;8629&lt;br /&gt;Pfannkuchen(9) = 30&lt;br /&gt;desire:ddc-head-devel benl$ cat ddc-rts.prof &lt;br /&gt;&lt;br /&gt;-- Garbage Collection&lt;br /&gt;  collection    count =                      17&lt;br /&gt;&lt;br /&gt;  alloc         total =             760,836,560     (bytes)&lt;br /&gt;&lt;br /&gt;  copy          total =           1,034,618,572     (bytes)&lt;br /&gt;                count =              68,351,051     (objects)&lt;br /&gt;      avg object size =                      15.137 (bytes)&lt;br /&gt;&lt;br /&gt;  process time   user =                       6.120 (s)&lt;br /&gt;               system =                       0.070 (s)&lt;br /&gt;&lt;br /&gt;  mutator       total =                       1.760 (s)&lt;br /&gt;                 user =                       1.700 (s)&lt;br /&gt;               system =                       0.060 (s)&lt;br /&gt;&lt;br /&gt;  collector     total =                       4.430 (s)&lt;br /&gt;                 user =                       4.420 (s)&lt;br /&gt;               system =                       0.010 (s)&lt;br /&gt;&lt;br /&gt;     time efficiency  =                      28.433 (%)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Heh, 28% time efficiency isn't the best. The Disciple GC is a simple two space collector. It's not a generational GC and there is no nursery. The way the code is compiled also makes it hold onto far too much intermediate data. Fixing that would be equivalent do doing "register allocation" on the GC slot stack, so it doesn't treat an object as a root unless it really needs it.&lt;br /&gt;&lt;br /&gt;Of course, the great thing about having 8GB of RAM in your desktop machine is that you can achieve so much with a naive language implementation. Let's compare against GHC to see how we're going:&lt;pre&gt;&lt;br /&gt;desire:ddc-head-devel benl$ /usr/bin/time ./a.out +RTS -H 1000000000&lt;br /&gt;73196&lt;br /&gt;Pfannkuchen(10) = 38&lt;br /&gt;       83.04 real        82.12 user         0.90 sys&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;desire:ddc-head-devel benl$ ghc -O2 --make test/60-Shootout/Fannkuch-Redex/MainGHC.hs -o Main&lt;br /&gt;desire:ddc-head-devel benl$ /usr/bin/time ./Main 10&lt;br /&gt;73196&lt;br /&gt;Pfannkuchen(10) = 38&lt;br /&gt;        2.10 real         2.08 user         0.01 sys&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;That's when I went to bed.&lt;br /&gt;&lt;br&gt;&lt;br /&gt;&lt;br&gt;</content><link rel='replies' type='application/atom+xml' href='http://disciple-devel.blogspot.com/feeds/6915360865043118372/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://disciple-devel.blogspot.com/2010/11/shootout-fannkuch-redux.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/6915360865043118372'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/6915360865043118372'/><link rel='alternate' type='text/html' href='http://disciple-devel.blogspot.com/2010/11/shootout-fannkuch-redux.html' title='Shootout: Fannkuch redux'/><author><name>Ben Lippmeier</name><uri>http://www.blogger.com/profile/08287674468193351664</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://1.bp.blogspot.com/_LHkvBlK1v2I/TLkJ_Jet4SI/AAAAAAAAABA/YbVJgyKQxIc/S220/100_4990.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-301893373063372699.post-3310805120073891485</id><published>2010-10-20T17:53:00.002+11:00</published><updated>2010-10-20T18:19:54.244+11:00</updated><title type='text'>Effect inference</title><content type='html'>A few people have mentioned they are worried about the type signatures from my previous post.&lt;br /&gt;&lt;br /&gt;For a sig like:&lt;pre&gt;&lt;br /&gt;fillPrimes :: Array Int -&gt; Int -&gt; Int -&gt; Int -&gt; ()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The fact that there is no 'IO' constructor in that type doesn't mean that side effects aren't being tracked. The type inferencer knows what's going on, and so does the optimiser, you just don't have to write effect information in sigs if you don't want to.&lt;br /&gt;&lt;br /&gt;If we add all the information about effects and mutability we actually have:&lt;pre&gt;&lt;br /&gt;fillPrimes&lt;br /&gt;    :: forall %r0 %r1 %r2 %r3 %r4&lt;br /&gt;    .  Array %r0 (Int %r1) -&gt; Int %r2 -&gt; Int %r3 -&gt; Int %r4 -(!e0)&gt; ()&lt;br /&gt;    :- !e0 =  !Read %r0 + !Read %r1 + !Write %r1 + !Read %r2 + !Read %r3 + !Read %r4 &lt;br /&gt;    ,  Mutable %r1&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;A term like Array %r0 (Int %r1) means that the fillPrimes accepts an array of boxed integers. The pointers (handles) to those integers are in some region named %r0, and the integers themselves are in region %r1. The effect term !Write %r1 reveals that fillPrimes updates those integers, which requires them to be mutable, hence the Mutable %r1 constraint.&lt;br /&gt;&lt;br /&gt;Writing sigs this way is a bit verbose, so we also support the following "short hand" syntax. This sig gets desugared to the previous one using some simple defaulting rules:&lt;pre&gt;&lt;br /&gt;fillPrimes :: Array (Int{modify}) -&gt; Int{read} -&gt; Int{read} -&gt; Int{read} -&gt; ()&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Full information about effects and mutability is exported in the interface files. I'm currently working on a system whereby you can say "These are the effects I think this function has, but tell me if you think it has more." I'll post more about this when it's done (it half works now).</content><link rel='replies' type='application/atom+xml' href='http://disciple-devel.blogspot.com/feeds/3310805120073891485/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://disciple-devel.blogspot.com/2010/10/effect-inference.html#comment-form' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/3310805120073891485'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/3310805120073891485'/><link rel='alternate' type='text/html' href='http://disciple-devel.blogspot.com/2010/10/effect-inference.html' title='Effect inference'/><author><name>Ben Lippmeier</name><uri>http://www.blogger.com/profile/08287674468193351664</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://1.bp.blogspot.com/_LHkvBlK1v2I/TLkJ_Jet4SI/AAAAAAAAABA/YbVJgyKQxIc/S220/100_4990.JPG'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-301893373063372699.post-5323508322367862068</id><published>2010-10-16T20:24:00.009+11:00</published><updated>2010-10-16T21:39:16.840+11:00</updated><title type='text'>Code: Three primes programs</title><content type='html'>This post illustrates some of the programming styles possible in Disciple.&lt;br /&gt;&lt;br /&gt;We'll start with ye'olde primes program straight from the nofib benchmark suite. This is the exact code, except that for the sake of example I've hard wired it to show the 1500th prime instead of taking a command line argument.&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;suCC :: Int -&gt; Int&lt;br /&gt;suCC x = x + 1&lt;br /&gt;&lt;br /&gt;isdivs :: Int  -&gt; Int -&gt; Bool&lt;br /&gt;isdivs n x = mod x n /= 0&lt;br /&gt;&lt;br /&gt;the_filter :: [Int] -&gt; [Int]&lt;br /&gt;the_filter (n:ns) = filter (isdivs n) ns&lt;br /&gt;&lt;br /&gt;primes :: [Int]&lt;br /&gt;primes = map head (iterate the_filter (iterate suCC 2))&lt;br /&gt;&lt;br /&gt;main = print $ primes !! 1500&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;This program builds an infinite list of primes, and then uses list indexing to take the 1500'th one. A couple of nice functional programming paradigms are used, namely higher order functions and laziness. Disciple is a default-strict dialect of Haskell, so we can use almost the same code, except we have to be explicit when we want laziness:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;suCC :: Int -&gt; Int&lt;br /&gt;suCC x = x + 1&lt;br /&gt;&lt;br /&gt;isdivs :: Int -&gt; Int -&gt; Bool&lt;br /&gt;isdivs n x = mod x n /= 0&lt;br /&gt;&lt;br /&gt;the_filter :: [Int] -&gt; [Int]&lt;br /&gt;the_filter (n:ns) = filterL (isdivs n) ns&lt;br /&gt;&lt;br /&gt;primes :: [Int]&lt;br /&gt;primes = mapL head (iterateL the_filter (iterateL suCC 2))&lt;br /&gt;&lt;br /&gt;main () = println $ show $ primes !! 1500&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;That's it. I've changed map, filter and iterate to mapL, filterL and iterateL, and wibbled the main function slightly, but it's otherwise the same code. The difference between something like map and mapL is like the difference between foldl and foldl' in Haskell, they're implemented almost the same way, but have slightly different behaviours. Check the base libs for details.&lt;br /&gt;&lt;br /&gt;Ok, so Disciple can express the dominant paradigms of Haskell, now it's time to subvert them. In the "Beyond Haskell" session at HIW 2010 we talked about how sometimes when optimising a program in a high level language, you reach a point where you just have to drop down to a "lower level" language for performance reasons. This is like when you're writing a Ruby program and it's running so slow there's simply nothing else to do than to "shell out" to a C routine. This also happens in Haskell, but perhaps not as much.&lt;br /&gt;&lt;br /&gt;Anyway, one of the goals of Disciple is to avoid this problem if at all possible. If the lazy, purely functional version of primes just isn't fast enough for you, then let's write the same program using destructive update of a mutable array:&lt;pre&gt;&lt;br /&gt;-- | Check if an int is a multiple of any in a given array.&lt;br /&gt;checkPrime :: Array Int -&gt; Int -&gt; Int -&gt; Int -&gt; Bool&lt;br /&gt;checkPrime array high x n&lt;br /&gt;        | n &gt;= high             = True&lt;br /&gt;        | mod x array.(n) == 0  = False&lt;br /&gt;        | otherwise             = checkPrime array high x (n + 1)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;-- | Fill an array with primes.&lt;br /&gt;fillPrimes :: Array Int -&gt; Int -&gt; Int -&gt; Int -&gt; ()&lt;br /&gt;fillPrimes primes max high i&lt;br /&gt;        | high &gt; max = ()&lt;br /&gt;&lt;br /&gt;        | checkPrime primes high i 0  &lt;br /&gt;        = do    primes.(high) := i&lt;br /&gt;                fillPrimes primes max (high + 1) (i + 1)&lt;br /&gt;&lt;br /&gt;        | otherwise&lt;br /&gt;        =       fillPrimes primes max high       (i + 1)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;main ()&lt;br /&gt; = do  -- We want the 1500'th prime.&lt;br /&gt;        max    = 1500&lt;br /&gt;&lt;br /&gt;        -- Start with an array containing the first prime as its first element.&lt;br /&gt;        primes    = generate&amp;{Array Int} (max + 1) (\_ -&gt; 0)&lt;br /&gt;        primes.(0) := 2&lt;br /&gt; &lt;br /&gt;        -- Fill the array with more primes.&lt;br /&gt;        fillPrimes primes max 1 2&lt;br /&gt;&lt;br /&gt;        -- Take the last prime found as the answer.&lt;br /&gt;        println    $ show primes.(max)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The syntax primes.(max) means "return the max'th element of the primes array". The syntax primes.(high) := i means  "destructively update the high'th element of the primes array to the value i". Updating an array requires the array to be mutable. It's possible to express this mutability constraint in the type signature, but it's not required, so I usually just leave it to the inferencer. The compiler knows that you're using side effects, and will optimise around them, but you usually don't have to say anything about it in source level type sigs. I'll talk about the cases when you do in another post. Note that we don't have to pollute the type sigs with the IO constructor just to use destructive update. After all, checkPrime and fillPrimes aren't doing any IO...&lt;br /&gt;&lt;br /&gt;The above code is good, but it still feels like a Haskell program. I'm particularly looking at the tail calls to checkPrime and fillPrimes. This is how you express primitive looping in many functional languages, but it can become tedious when there are lots of state variables to pass back to each iteration. Here is another version of primes written in a more imperative style.&lt;pre&gt;&lt;br /&gt;-- | Check if an int is a multiple of any in a given array.&lt;br /&gt;checkPrime :: Array Int -&gt; Int -&gt; Int -&gt; Bool&lt;br /&gt;checkPrime array high x&lt;br /&gt; = do   n       = 0&lt;br /&gt;        isPrime = 1&lt;br /&gt;&lt;br /&gt;        while (n &lt; high)&lt;br /&gt;         do     when (mod x array.(n) == 0)&lt;br /&gt;                 do     isPrime := 0&lt;br /&gt;                        break&lt;br /&gt;&lt;br /&gt;                n := n + 1&lt;br /&gt;&lt;br /&gt;        isPrime /= 0&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;main ()&lt;br /&gt; = do -- We want the 1500'th prime.&lt;br /&gt;        max    = 1500&lt;br /&gt;&lt;br /&gt;        -- Start with an array containing the first prime as its first element.&lt;br /&gt;        primes     = generate&amp;{Array Int} (max + 1) (\_ -&gt; 0)&lt;br /&gt;        primes.(0) := 2&lt;br /&gt;  &lt;br /&gt;        -- Fill the array with primes.&lt;br /&gt;        high    = 1&lt;br /&gt;        i       = 2&lt;br /&gt;        while (high &lt;= max)&lt;br /&gt;         do     when (checkPrime primes high i)&lt;br /&gt;                 do     primes.(high) := i&lt;br /&gt;                        high          := high + 1&lt;br /&gt;&lt;br /&gt;                i := i + 1&lt;br /&gt;&lt;br /&gt;        -- Take the last prime found as the answer.&lt;br /&gt;        println $ show primes.(max)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Now we've got while loops, and break, even. Of course, the while syntax desugars to a simple higher order function that takes the loop body, and break just throws an exception to end the iteration. Such syntax should help the Python programmers feel at home, but Disciple is still lambdas all the way down.&lt;br /&gt;&lt;br /&gt;Finally, note that we started with a lazy, purely functional program but arrived at an unapologetically imperative program all in the same language. In my dreams people talk about "functional vs imperative" programs instead of "functional vs imperative" &lt;span style="font-style:italic;"&gt;languages&lt;/span&gt;. &lt;br /&gt;&lt;br /&gt;Write the lazy functional program because it's shorter, cleaner, and easier to understand.. but when the clock is tickin' and the space is leakin' then wouldn't it be nicer "shell out" to the same language?&lt;br /&gt;&lt;br /&gt;PS: Code examples are part of the DDC test suite &lt;a href="http://code.haskell.org/ddc/ddc-head/test/50-Defib/Primes/"&gt;here&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;PPS: Of couse the absolute performance of these programs compiled with DDC today is atrocious. This is a development blog after all, and we're still developing. They do work though. See &lt;a href="http://trac.haskell.org/ddc/"&gt;the wiki&lt;/a&gt; to find out how you can help.</content><link rel='replies' type='application/atom+xml' href='http://disciple-devel.blogspot.com/feeds/5323508322367862068/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://disciple-devel.blogspot.com/2010/10/code-three-primes-programs.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/5323508322367862068'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/5323508322367862068'/><link rel='alternate' type='text/html' href='http://disciple-devel.blogspot.com/2010/10/code-three-primes-programs.html' title='Code: Three primes programs'/><author><name>Ben Lippmeier</name><uri>http://www.blogger.com/profile/08287674468193351664</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://1.bp.blogspot.com/_LHkvBlK1v2I/TLkJ_Jet4SI/AAAAAAAAABA/YbVJgyKQxIc/S220/100_4990.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-301893373063372699.post-1466457928077864965</id><published>2010-10-16T14:22:00.006+11:00</published><updated>2010-10-16T17:41:32.894+11:00</updated><title type='text'>Hacks: Type error beautification</title><content type='html'>In this series of blog posts I'll name the ones that describe what I'm currently working on after "Hacks: whatever".&lt;br /&gt;&lt;br /&gt;I'm currently working on type error beautification. The problem is that without it you get error messages that look like this: &lt;pre&gt;&lt;br /&gt;./test/15-Typing/Signatures/LargerEffect/Test.ds:2:8&lt;br /&gt;    Inferred type:&lt;br /&gt;      add&lt;br /&gt;        :: forall %rDE111 %rDE112 %rDE113&lt;br /&gt;        .  Int32 %rDE111 -&gt; Int32 %rDE112 -(!eDE484 $cDE487)&gt; Int32 %rDE113&lt;br /&gt;        :- !eDE484 =  !{!Read %rDE111; !Read %rDE112}&lt;br /&gt;        ,  $cDE487 =  Test.x : %rDE111&lt;br /&gt;&lt;br /&gt;    Is bigger than signature:&lt;br /&gt;      add&lt;br /&gt;        :: forall %rDE111 %rDE112 %rDE113&lt;br /&gt;        .  Int32 %rDE111 -&gt; Int32 %rDE112 -(!eDE484 $cDE487)&gt; Int32 %rDE113&lt;br /&gt;        :- !eDE484 =  !Read %rDE111&lt;br /&gt;        ,  $cDE487 =  xDE485 : %rDE111&lt;br /&gt;&lt;br /&gt;    Because  !{!Read %rDE111; !Read %rDE112}  does not match  !Read %rDE111 &lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;That's all fine, apart from the variable names like %rDE111 and $cDE485. Those are "dummy" variables that have been invented by the type elaborator, which fills in missing effect terms in type sigs provided by the user. For example, the source program that gave the above error message is actually:&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;add  &lt;: Int{read} -&gt; Int -&gt; Int&lt;br /&gt;add x y = x + y&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;The type signature says that the function is permitted to read its first argument, where {read} is a short-hand way of giving the real region and effect term. The elaborator takes the short-hand sigs provided by the user, then fills in the missing bits using some simple defaulting rules, before passing them to the type checker proper. Of course, this function actually reads both of its arguments, hence the error message.&lt;br /&gt;&lt;br /&gt;The problem is that when the elaborator needs a new variable it just takes the next available name in the namespace, so the exact names you get depend on what other signatures the elaborator has already processed. This means that if you add a new signature to an imported module, the dummy variables in all the successive signatures change. That's fine for type checking, but it means that error messages presented to the user aren't consistent. It's become a problem because our regression tests compare error messages to what they were before, so now if you modify any of the base libraries, you have to accept a bunch of wibbles in the test suite.&lt;br /&gt;&lt;br /&gt;The solution is to rewrite variables in each set of error messages to use consistent names like t0, t1, %r1, %r2 etc before presenting them to the user. Complicating factors are that if a variable was a "real" variable that came from the source program we don't want to rewrite it, and that generated/elaborated names should not collide with real names from the source program. &lt;br /&gt;&lt;br /&gt;This is one of those PITAs you discover when you sit down and try to write a real compiler. No sexy semantics, no fancy extensions to the type system. I need to format error messages properly so my regression tests don't break after every commit... but we press onward for the glory.</content><link rel='replies' type='application/atom+xml' href='http://disciple-devel.blogspot.com/feeds/1466457928077864965/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://disciple-devel.blogspot.com/2010/10/hacks-type-error-beautification.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/1466457928077864965'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/1466457928077864965'/><link rel='alternate' type='text/html' href='http://disciple-devel.blogspot.com/2010/10/hacks-type-error-beautification.html' title='Hacks: Type error beautification'/><author><name>Ben Lippmeier</name><uri>http://www.blogger.com/profile/08287674468193351664</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://1.bp.blogspot.com/_LHkvBlK1v2I/TLkJ_Jet4SI/AAAAAAAAABA/YbVJgyKQxIc/S220/100_4990.JPG'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-301893373063372699.post-2262800600424962439</id><published>2010-10-16T13:27:00.006+11:00</published><updated>2010-10-16T21:51:14.599+11:00</updated><title type='text'>For Human Consumption</title><content type='html'>Over the last year I've spent a lot of time cleaning up the internals of DDC, and I've reached a point where I'm no longer embarrassed about the implementation. It's still full of bugs and holes, but enough of the structure and design is there that others should be able to find their way around the code base without too much trouble. I could also use some help, so this blog is an attempt to publicise the project and hopefully convince some of you wonderful people that Disciple/DDC is indeed the way of the future, and that spending some time hacking on it would be a fun thing to do...&lt;br /&gt;&lt;br /&gt;I've been working on DDC since I started my PhD back in 2004, though there were several false starts and major rewrites along the way. Back then it was called "Trauma",  because all the good names like "Clean", "Hope" and "Eden" were already taken, and I wanted a name that reflected what language design / compiler development / doing a PhD was really like. &lt;br /&gt;&lt;br /&gt;DDC as a "plan" started to congeal in 2006 when I had done enough reading and research to form a vision of where I wanted to go. The first patch in the repo is from Sunday September 9th 2007. Before then my "version control" consisted of making a tarball every couple of weeks. Prior to starting my PhD I worked as a C++ programmer for the Australian government, and my experience of ClearCase there was enough to put me off version control systems entirely until I met darcs while working on GHC at MSR in 2007.&lt;br /&gt;&lt;br /&gt;Anyway, from "Trauma" we now have "Disciplined Disciple". A few people have asked me about the awkwardly religious sounding name, so I'll explain it here for the record. I chose this name to remind myself to spend more time reading papers than dreaming up crazy language features. Writing a compiler is a massive amount of work, and the only way I see myself ending up with a usable system is by sticking with the game plan and not getting side tracked. Disciple is supposed to be Haskell - Laziness + Effect typing + Mutability Polymorphism. The way I see it there are a huge number of ideas and libraries from the Haskell community that would really shine if we just shifted the base language a bit, and if you've seen the &lt;a href="http://vimeo.com/15576718"&gt;"Beyond Haskell"&lt;/a&gt; talk from HIW 2010 in Baltimore you'll know what I mean by that. I want to see the next version of Quake written ground-up in a functional language, with no cop-outs about how it's good enough just to influence the design of C#2++. I had a gut-full of pointer based, monomorphic, imperative programming in my previous life and I'm not planning on going back.&lt;br /&gt;&lt;br /&gt;I won't live long enough to invent a whole new eco system or approach to writing programs myself, but if I stick to the game plan and follow the leaders wherever possible I hope that will be enough to push it though. I am the Disciplined Disciple.</content><link rel='replies' type='application/atom+xml' href='http://disciple-devel.blogspot.com/feeds/2262800600424962439/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://disciple-devel.blogspot.com/2010/10/for-human-consumption.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/2262800600424962439'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/301893373063372699/posts/default/2262800600424962439'/><link rel='alternate' type='text/html' href='http://disciple-devel.blogspot.com/2010/10/for-human-consumption.html' title='For Human Consumption'/><author><name>Ben Lippmeier</name><uri>http://www.blogger.com/profile/08287674468193351664</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://1.bp.blogspot.com/_LHkvBlK1v2I/TLkJ_Jet4SI/AAAAAAAAABA/YbVJgyKQxIc/S220/100_4990.JPG'/></author><thr:total>1</thr:total></entry></feed>