Speed differences between calling directly and assigning to variable

Question

Situation: Consider the following two snippets of Python codes:-

Code 1:

for root, dirs, files in os.walk(top):
    for f in files:
        path = os.path.join(root, f)
        print(path)

Code 2:

for root, dirs, files in os.walk(top):
    for f in files:
        print(os.path,join(root,f))

Question: Will there any differences in terms of performance or speed if I do not declare the file path as a variable (assuming that I will only use it once -- if using more than once declaring the variable makes much more sense)

Likely not enough to matter, but if you care: https://docs.python.org/3/library/timeit.html — Stephen Rauch, May 22 '17 at 04:35
You should **always** profile your code before optimizing it. Don't waste effort on trivial optimizations. — Arya McCarthy, May 22 '17 at 04:39
I tried to understand the definition of profiling on wikipedia but am still confused. Mind enlightening me? @aryamccarthy — Timothy Wong, May 22 '17 at 04:41
Profiling is looking at how long code takes to run, plus where that time is spent. Take a look at [this](http://pynash.org/2013/03/06/timing-and-profiling/). — Arya McCarthy, May 22 '17 at 04:43

score 1 · Accepted Answer · answered May 22 '17 at 08:40

In addition to using timeit for simple benchmarking you can pytest-benchmark, which makes it super-simple to create a comparison, simply:

import os

def f1(top):
    for root, dirs, files in os.walk(top):
        for f in files:
            path = os.path.join(root, f)
            print(path)

def f2(top):
    for root, dirs, files in os.walk(top):
        for f in files:
            print(os.path.join(root, f))

def test_f1(benchmark):
    benchmark(f1, '~/tmp')

def test_f2(benchmark):
    benchmark(f2, '~/tmp')

Note: ~/tmp contains 350 files/folders, YMMV. Running

python -m pytest test.py --benchmark-min-time=0.001 --benchmark-histogram=hist

Gives you nice data and a histogram:

----------------------------------------------------------------------- benchmark: 2 tests ----------------------------------------------------------------------
Name (time in us)        Min               Max              Mean            StdDev            Median               IQR            Outliers(*)  Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
test_f1               4.4811 (1.0)      8.6253 (1.0)      4.7941 (1.00)     0.3531 (1.0)      4.7141 (1.01)     0.2762 (1.31)            15;7     216        1000
test_f2               4.4967 (1.00)     9.3009 (1.08)     4.7773 (1.0)      0.5242 (1.48)     4.6838 (1.0)      0.2113 (1.0)             6;13     215        1000
-----------------------------------------------------------------------------------------------------------------------------------------------------------------

As you can see, the difference is not significant considering the high variance.

Now if you are still curious, you can use dis to show the bytecode that is being executed by CPython. This is a functionality of the CPython interpreter, which is the most common way to run python code:

In [1]: import os, dis

In [2]: def f1(top):
   ...:     for root, dirs, files in os.walk(top):
   ...:         for f in files:
   ...:             path = os.path.join(root, f)
   ...:             print(path)
   ...:             

In [3]: def f2(top):
   ...:     for root, dirs, files, in os.walk(top):
   ...:         for f in files:
   ...:             print(os.path.join(root, f))
   ...:             

In [4]: dis.dis(f1)
  2           0 SETUP_LOOP              60 (to 62)
              2 LOAD_GLOBAL              0 (os)
              4 LOAD_ATTR                1 (walk)
              6 LOAD_FAST                0 (top)
              8 CALL_FUNCTION            1
             10 GET_ITER
        >>   12 FOR_ITER                46 (to 60)
             14 UNPACK_SEQUENCE          3
             16 STORE_FAST               1 (root)
             18 STORE_FAST               2 (dirs)
             20 STORE_FAST               3 (files)

  3          22 SETUP_LOOP              34 (to 58)
             24 LOAD_FAST                3 (files)
             26 GET_ITER
        >>   28 FOR_ITER                26 (to 56)
             30 STORE_FAST               4 (f)

  4          32 LOAD_GLOBAL              0 (os)
             34 LOAD_ATTR                2 (path)
             36 LOAD_ATTR                3 (join)
             38 LOAD_FAST                1 (root)
             40 LOAD_FAST                4 (f)
             42 CALL_FUNCTION            2
             44 STORE_FAST               5 (path)

  5          46 LOAD_GLOBAL              4 (print)
             48 LOAD_FAST                5 (path)
             50 CALL_FUNCTION            1
             52 POP_TOP
             54 JUMP_ABSOLUTE           28
        >>   56 POP_BLOCK
        >>   58 JUMP_ABSOLUTE           12
        >>   60 POP_BLOCK
        >>   62 LOAD_CONST               0 (None)
             64 RETURN_VALUE

In [5]: dis.dis(f2)
  2           0 SETUP_LOOP              56 (to 58)
              2 LOAD_GLOBAL              0 (os)
              4 LOAD_ATTR                1 (walk)
              6 LOAD_FAST                0 (top)
              8 CALL_FUNCTION            1
             10 GET_ITER
        >>   12 FOR_ITER                42 (to 56)
             14 UNPACK_SEQUENCE          3
             16 STORE_FAST               1 (root)
             18 STORE_FAST               2 (dirs)
             20 STORE_FAST               3 (files)

  3          22 SETUP_LOOP              30 (to 54)
             24 LOAD_FAST                3 (files)
             26 GET_ITER
        >>   28 FOR_ITER                22 (to 52)
             30 STORE_FAST               4 (f)

  4          32 LOAD_GLOBAL              2 (print)
             34 LOAD_GLOBAL              0 (os)
             36 LOAD_ATTR                3 (path)
             38 LOAD_ATTR                4 (join)
             40 LOAD_FAST                1 (root)
             42 LOAD_FAST                4 (f)
             44 CALL_FUNCTION            2
             46 CALL_FUNCTION            1
             48 POP_TOP
             50 JUMP_ABSOLUTE           28
        >>   52 POP_BLOCK
        >>   54 JUMP_ABSOLUTE           12
        >>   56 POP_BLOCK
        >>   58 LOAD_CONST               0 (None)
             60 RETURN_VALUE

So the first code does indeed produce more bytecode instructions.

Anyhow, you should consider profiling - make sure you look at parts of the code that are really relevant and avoid optimizing blindly.

Speed differences between calling directly and assigning to variable

1 Answers1