In pandas, if I use series.apply() to apply a function with an inner function definition, for example:

import pandas as pd

def square_times_two(x):
    # The inner def statement is executed each time square_times_two is called
    def square(y):
        return y ** 2
    return square(x) * 2

data = {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
df = pd.DataFrame.from_dict(data)

df["col_3"] = df.col_1.apply(square_times_two)

is the inner function redefined for each row? Would there be a performance impact to having many inner functions in a function applied to a large series?

Ethan
Alex

1 Answer

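The body of the inner function is compiled only once, but its def statement runs on every call of the outer function, so a new function object is created for each row. That overhead is small and should be negligible here, particularly since the inner function does not use any variables from the outer one.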
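If you want to check this yourself, here is a minimal sketch. The helper name make_square is hypothetical and only exists so the inner function can be returned and inspected; it is not part of the question's code.

```python
def make_square():
    # Hypothetical helper: returns the inner function so we can inspect it.
    def square(y):
        return y ** 2
    return square

first = make_square()
second = make_square()

# Each call of the outer function builds a new function object...
print(first is second)                    # False
# ...but both objects share one code object, which was compiled once.
print(first.__code__ is second.__code__)  # True
```

So what is repeated per call is the creation of a small function object, not the compilation of the inner function's body.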

For that same reason, though, there seems to be no need to define the inner function there at all. Since it does not depend on anything in the outer function, you could just move it to the same level as the outer one:

import pandas as pd

def square(y):
    return y ** 2

def square_times_two(x):
    # Calls the module-level helper, so no function object is created per row
    return square(x) * 2

data = {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
df = pd.DataFrame.from_dict(data)

df["col_3"] = df.col_1.apply(square_times_two)
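To see how large the difference is on your own machine, something like the following rough timing sketch could help. It uses plain Python lists instead of a pandas Series to stay self-contained. The names flat, nested and inner_square, the input size, and the repeat count are all arbitrary assumptions for illustration.

```python
import timeit

def square(y):
    return y ** 2

def flat(x):
    # Uses the module-level helper; nothing is created per call.
    return square(x) * 2

def nested(x):
    # Re-executes the inner def statement on every call.
    def inner_square(y):
        return y ** 2
    return inner_square(x) * 2

values = list(range(1000))  # stand-in for a column of data

# Time 200 passes over the values with each variant.
t_flat = timeit.timeit(lambda: [flat(v) for v in values], number=200)
t_nested = timeit.timeit(lambda: [nested(v) for v in values], number=200)

print(f"flat:   {t_flat:.4f} s")
print(f"nested: {t_nested:.4f} s")
```

The absolute numbers depend on the Python version and hardware, so treat them as a rough guide only. Both variants return identical results.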

buddemat