I wrote a program in assembly a few weeks ago to do this exact thing.
It can spawn worker threads to speed up the summation. I have a quad-core CPU, so I've set the worker count to 4. If you have a different number of cores, you can modify line 8 to change the number of workers it will spawn.
https://gist.github.com/3369946
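
For anyone who doesn't read assembly, here is a rough sketch of the general idea in Go. It is not a translation of the gist. I'm assuming the job is to add up the integers from 1 to n, and the names `parallelSum` and `numWorkers` are made up for illustration. Each worker sums one contiguous chunk of the range into its own slot, and the partial sums are combined at the end.

```go
package main

import (
	"fmt"
	"sync"
)

// numWorkers is a hypothetical setting: 4 to match a quad-core CPU.
// Change it to match the number of cores on your machine.
const numWorkers = 4

// parallelSum adds the integers in [1, n] by splitting the range into
// contiguous chunks and summing each chunk in its own goroutine.
// Assumption: the total fits in a uint64.
func parallelSum(n uint64, workers int) uint64 {
	if workers < 1 {
		workers = 1
	}
	partial := make([]uint64, workers) // one result slot per worker, so no locking is needed
	chunk := n / uint64(workers)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		lo := uint64(w)*chunk + 1
		hi := lo + chunk - 1
		if w == workers-1 {
			hi = n // the last worker also takes the remainder
		}
		wg.Add(1)
		go func(idx int, lo, hi uint64) {
			defer wg.Done()
			var s uint64
			for i := lo; i <= hi; i++ {
				s += i
			}
			partial[idx] = s
		}(w, lo, hi)
	}
	wg.Wait() // block until every worker has written its partial sum

	var total uint64
	for _, s := range partial {
		total += s
	}
	return total
}

func main() {
	fmt.Println(parallelSum(1000000, numWorkers)) // prints 500000500000
}
```

Going past the number of physical cores usually doesn't help for a purely CPU-bound loop like this, which is why the worker count is tied to the core count.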