湖南快乐十分每天期数 www.91zhb.com I have taken Problem #12

from Project Euler

as a programming exercise and to compare my (surely not optimal) implementations in C, Python, Erlang and Haskell. In order to get some higher execution times, I search for the first triangle number with more than 1000 divisors instead of 500 as stated in the original problem.

The result is the following:

#### C:

<a href="/cdn-cgi/l/email-protection" data-cfemail="274b485542495d486742495d48">[email protected]</a>:~/erlang$ gcc -lm -o euler12.bin euler12.c <a href="/cdn-cgi/l/email-protection" data-cfemail="e5898a97808b9f8aa5808b9f8a">[email protected]</a>:~/erlang$ time ./euler12.bin 842161320 real 0m11.074s user 0m11.070s sys 0m0.000s

#### Python:

<a href="/cdn-cgi/l/email-protection" data-cfemail="1975766b7c776376597c776376">[email protected]</a>:~/erlang$ time ./euler12.py 842161320 real 1m16.632s user 1m16.370s sys 0m0.250s

#### Python with PyPy:

<a href="/cdn-cgi/l/email-protection" data-cfemail="711d1e03141f0b1e31141f0b1e">[email protected]</a>:~/Downloads/pypy-c-jit-43780-b590cf6de419-linux64/bin$ time ./pypy /home/lorenzo/erlang/euler12.py 842161320 real 0m13.082s user 0m13.050s sys 0m0.020s

#### Erlang:

<a href="/cdn-cgi/l/email-protection" data-cfemail="701c1f02151e0a1f30151e0a1f">[email protected]</a>:~/erlang$ erlc euler12.erl <a href="/cdn-cgi/l/email-protection" data-cfemail="076b687562697d684762697d68">[email protected]</a>:~/erlang$ time erl -s euler12 solve Erlang R13B03 (erts-5.7.4) [source] [64-bit] [smp:4:4] [rq:4] [async-threads:0] [hipe] [kernel-poll:false] Eshell V5.7.4 (abort with ^G) 1> 842161320 real 0m48.259s user 0m48.070s sys 0m0.020s

#### Haskell:

<a href="/cdn-cgi/l/email-protection" data-cfemail="95f9fae7f0fbeffad5f0fbeffa">[email protected]</a>:~/erlang$ ghc euler12.hs -o euler12.hsx [1 of 1] Compiling Main ( euler12.hs, euler12.o ) Linking euler12.hsx ... <a href="/cdn-cgi/l/email-protection" data-cfemail="563a392433382c391633382c39">[email protected]</a>:~/erlang$ time ./euler12.hsx 842161320 real 2m37.326s user 2m37.240s sys 0m0.080s

#### Summary:

- C: 100%
- Python: 692% (118% with PyPy)
- Erlang: 436% (135% thanks to RichardC)
- Haskell: 1421%

I suppose that C has a big advantage as it uses long for the calculations and not arbitrary length integers as the other three. Also it doesn't need to load a runtime first (Do the others?).

Question 1:Do Erlang, Python and Haskell lose speed due to using arbitrary length integers or don't they as long as the values are less than `MAXINT`

?

Question 2:Why is Haskell so slow? Is there a compiler flag that turns off the brakes or is it my implementation? (The latter is quite probable as Haskell is a book with seven seals to me.)

Question 3:Can you offer me some hints how to optimize these implementations without changing the way I determine the factors? Optimization in any way: nicer, faster, more "native" to the language.

#### EDIT:

Question 4:Do my functional implementations permit LCO (last call optimization, a.k.a tail recursion elimination) and hence avoid adding unnecessary frames onto the call stack?

I really tried to implement the same algorithm as similar as possible in the four languages, although I have to admit that my Haskell and Erlang knowledge is very limited.

Source codes used:

#include <stdio.h> #include <math.h> int factorCount (long n) { double square = sqrt (n); int isquare = (int) square; int count = isquare == square ? -1 : 0; long candidate; for (candidate = 1; candidate <= isquare; candidate ++) if (0 == n % candidate) count += 2; return count; } int main () { long triangle = 1; int index = 1; while (factorCount (triangle) < 1001) { index ++; triangle += index; } printf ("%ldn", triangle); }

#! /usr/bin/env python3.2 import math def factorCount (n): square = math.sqrt (n) isquare = int (square) count = -1 if isquare == square else 0 for candidate in range (1, isquare + 1): if not n % candidate: count += 2 return count triangle = 1 index = 1 while factorCount (triangle) < 1001: index += 1 triangle += index print (triangle)

-module (euler12). -compile (export_all). factorCount (Number) -> factorCount (Number, math:sqrt (Number), 1, 0). factorCount (_, Sqrt, Candidate, Count) when Candidate > Sqrt -> Count; factorCount (_, Sqrt, Candidate, Count) when Candidate == Sqrt -> Count + 1; factorCount (Number, Sqrt, Candidate, Count) -> case Number rem Candidate of 0 -> factorCount (Number, Sqrt, Candidate + 1, Count + 2); _ -> factorCount (Number, Sqrt, Candidate + 1, Count) end. nextTriangle (Index, Triangle) -> Count = factorCount (Triangle), if Count > 1000 -> Triangle; true -> nextTriangle (Index + 1, Triangle + Index + 1) end. solve () -> io:format ("~p~n", [nextTriangle (1, 1) ] ), halt (0).

factorCount number = factorCount' number isquare 1 0 - (fromEnum $ square == fromIntegral isquare) where square = sqrt $ fromIntegral number isquare = floor square factorCount' number sqrt candidate count | fromIntegral candidate > sqrt = count | number `mod` candidate == 0 = factorCount' number sqrt (candidate + 1) (count + 2) | otherwise = factorCount' number sqrt (candidate + 1) count nextTriangle index triangle | factorCount triangle > 1000 = triangle | otherwise = nextTriangle (index + 1) (triangle + index + 1) main = print $ nextTriangle 1 1

Using `GHC 7.0.3`

, `gcc 4.4.6`

, `Linux 2.6.29`

on an x86_64 Core2 Duo (2.5GHz) machine, compiling using `ghc -O2 -fllvm -fforce-recomp`

for Haskell and `gcc -O3 -lm`

for C.

- Your C routine runs in 8.4 seconds (faster than your run probably because of
`-O3`

) - The Haskell solution runs in 36 seconds (due to the
`-O2`

flag) - Your
`factorCount'`

code isn't explicitly typed and defaulting to`Integer`

(thanks to Daniel for correcting my misdiagnosis here!). Giving an explicit type signature (which is standard practice anyway) using`Int`

and the time changes to**11.1 seconds** - in
`factorCount'`

you have needlessly called`fromIntegral`

. A fix results in no change though (the compiler is smart, lucky for you). - You used
`mod`

where`rem`

is faster and sufficient. This changes the time to**8.5 seconds**

. -
`factorCount'`

is constantly applying two extra arguments that never change (`number`

,`sqrt`

). A worker/wrapper transformation gives us:

$ time ./so 842161320 real 0m7.954s user 0m7.944s sys 0m0.004s

That's right, **7.95 seconds**

. Consistently **half a second faster than the C solution**

. Without the `-fllvm`

flag I'm still getting `8.182 seconds`

, so the NCG backend is doing well in this case too.

Conclusion: Haskell is awesome.

#### Resulting Code

factorCount number = factorCount' number isquare 1 0 - (fromEnum $ square == fromIntegral isquare) where square = sqrt $ fromIntegral number isquare = floor square factorCount' :: Int -> Int -> Int -> Int -> Int factorCount' number sqrt candidate0 count0 = go candidate0 count0 where go candidate count | candidate > sqrt = count | number `rem` candidate == 0 = go (candidate + 1) (count + 2) | otherwise = go (candidate + 1) count nextTriangle index triangle | factorCount triangle > 1000 = triangle | otherwise = nextTriangle (index + 1) (triangle + index + 1) main = print $ nextTriangle 1 1

EDIT: So now that we've explored that, lets address the questions

Question 1: Do erlang, python and haskell lose speed due to using arbitrary length integers or don't they as long as the values are less than MAXINT?

In Haskell, using `Integer`

is slower than `Int`

but how much slower depends on the computations performed. Luckily (for 64 bit machines) `Int`

is sufficient. For portability sake you should probably rewrite my code to use `Int64`

or `Word64`

(C isn't the only language with a `long`

).

Question 2: Why is haskell so slow? Is there a compiler flag that turns off the brakes or is it my implementation? (The latter is quite probable as haskell is a book with seven seals to me.) Question 3: Can you offer me some hints how to optimize these implementations without changing the way I determine the factors? Optimization in any way: nicer, faster, more "native" to the language.

That was what I answered above. The answer was

- 0) Use optimization via
`-O2`

- 1) Use fast (notably: unbox-able) types when possible
- 2)
`rem`

not`mod`

(a frequently forgotten optimization) and - 3) worker/wrapper transformation (perhaps the most common optimization).

Question 4: Do my functional implementations permit LCO and hence avoid adding unnecessary frames onto the call stack?

Yes, that wasn't the issue. Good work and glad you considered this.