NEWTON

NEWTON


Popular tags

    What is the inefficiency in this cairo code using alloc_locals

    Asked

    7 months ago

    14

    views


    0

    The following code:

    func pow4(n) -> (m : felt):
        alloc_locals
        local x
    
        jmp body if n != 0
        [ap] = 0; ap++
        ret
    
        body:
        x = n * n
        [ap] = x * x; ap++
        ret
    end
    
    func main():
        pow4(n=5)
        ret
    end
    
    

    is declared inefficient in the doc because of non-continuous memory.

    I ran it and could not see any hole in the memory table:

    Addr  Value
    -----------
    â‹®
    ###
    â‹®
    1:0   2:0
    1:1   3:0
    1:2   5
    1:3   1:2
    1:4   0:21
    1:5   25
    1:6   625
    
    

    so I don't understand where the problem is. I do see a hole with n=0 though:

    Addr  Value
    -----------
    â‹®
    ###
    â‹®
    1:0   2:0
    1:1   3:0
    1:2   0
    1:3   1:2
    1:4   0:21
    â‹®
    1:6   0
    
    

    that I can fix using:

        jmp body if n != 0
        x = 0
        ret
    
    

    but it's not what's suggested:

    • Move the instruction alloc_locals.
    • or Use tempvar instead of local.

    This question was originally posted on Stack Overflow

      starknetcairo-lang

    2 answers

    0

    So following @dan-carmon answer and for the sake of completeness, here is a summary of the possible implementations and their memory table "1" when n = 0 & n > 0

    # n = 0      n = 5
    1:0   2:0    1:0   2:0
    1:1   3:0    1:1   3:0
    1:2   0      1:2   5
    1:3   1:2    1:3   1:2
    1:4   0:14   1:4   0:14
    1:5   0      1:5   25
                 1:6   625
    
    

    Note however that the implementation using tempvar uses 2 slots less than the others in the program table "0".

    Implementations tested

    func pow4_reuse_slot(n) -> (m : felt):
        alloc_locals
        local x
    
        jmp body if n != 0
        x = 0
        ret
    
        body:
        x = n * n
        [ap] = x * x; ap++
        ret
    end
    
    func pow4_alloc_in_body(n) -> (m : felt):
        jmp body if n != 0
        [ap] = 0; ap++
        ret
    
        body:
        alloc_locals
        local x
        x = n * n
        [ap] = x * x; ap++
        ret
    end
    
    func pow4_use_tempvar(n) -> (m : felt):
        jmp body if n != 0
        [ap] = 0; ap++
        ret
    
        body:
        tempvar x = n * n
        [ap] = x * x
        ret
    end
    
    

    This answer was originally posted on Stack Overflow

    answered

    7 months ago

    0

    You are correct that the inefficiency is due to the memory hole, and correct that it appears only for n=0. Holes do not cause an inefficiency just by existing, but rather, their existence usually means that an equivalent code could have executed using fewer memory cells in total (in this case, 6 instead of 7, in the memory section "1:"). To make it more efficient, one should aim to remove the hole (so that the parts before and after it become continuous), whereas your suggested solution just fills it (which the prover does anyway). So your solution would still use 7 memory cells in the memory segment; and will in fact also use 2 additional cells in the program section (for the x=0 command), so it is actually less efficient than leaving the hole empty: compile and check!

    The inefficiency in the original code arises from the local variable x being assigned a memory cell even in the case n=0, despite not being used. To make it more efficient we would simply not want to declare and allocate x in this case. This can be done by moving the alloc_locals command inside "body", so that it isn't executed (and the locals aren't allocated) in the n=0 case, as in the first suggestion: this saves one memory cell in the n=0 case and doesn't affect the n!=0 case. Note that alloc_locals does not have to appear right at the beginning of the code.

    The second suggestion is to make x a tempvar instead of a local (and remove alloc_locals entirely). This again means no local variable is allocated in the n=0 case, thus again only 6 memory cells will be used instead of 7. And in fact, because the code is so simple, using the tempvar is also more efficient that declaring it as a local, at least when used as tempvar x = n * n, as it skips merges the ap += 1 command (which is what alloc_locals does in this case) with the x = n * n command, rather than run them separately.

    Beyond discussing the theory, you should compile each of these options and compare the lengths of the code and memory segments obtained, and see what is really the most efficient and by how much: this is always true when optimizing, you should always check the actual performance.

    This answer was originally posted on Stack Overflow

    answered

    7 months ago

    Your answer

    NEWTON

    NEWTON