Say I have a pandas DataFrame with the following structure:

      col1      col2
0     Jack      Jill
1  Michael     Micah
2  Derrick    Daliah
3   Martin    Martha
4  Patrick  Patricia
5   Dennis    Denise    

I have a list of characters:

characters = ['a', 'b', 'c']

I want to add a new column to the DataFrame that cycles through this list, so that df has the following structure (output):

      col1      col2    label
0     Jack      Jill    a
1  Michael     Micah    b
2  Derrick    Daliah    c
3   Martin    Martha    a
4  Patrick  Patricia    b
5   Dennis    Denise    c

I thought I could do this by iterating through the column and the list together, but zip stops as soon as the shortest input is exhausted:

for x, y in zip(df['col1'], characters):
    print(y)

output:

a
b
c

and a nested for loop:

for x in df['col1']:
    for y in characters:
        print(y)

prints every character for each name in col1 (so I get a, b, c for Jack, a, b, c for Michael, etc.).
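One way to keep the `zip` approach from stopping early is to wrap the shorter list in `itertools.cycle`, which restarts it indefinitely, so `zip` ends only when the names run out. A minimal sketch, assuming a plain list called `names` as a stand-in for `df['col1']` (that name is not in the original):

```python
from itertools import cycle

# Hypothetical stand-in for df['col1']
names = ['Jack', 'Michael', 'Derrick', 'Martin', 'Patrick', 'Dennis']
characters = ['a', 'b', 'c']

# cycle() repeats characters endlessly; zip() still stops at the end of names
pairs = list(zip(names, cycle(characters)))
labels = [label for _, label in pairs]

for name, label in pairs:
    print(name, label)
```

Never call `list()` on `cycle(...)` by itself, since it never terminates; it needs to be bounded by `zip` or something similar.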

If I could get the iteration to start over once the characters list is exhausted, as shown in my example output, I could append the values to a list and then just assign it:

df['label'] = characters_list_for_df
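For what it's worth, the list described here could also be built with modulo indexing: row `i` gets `characters[i % len(characters)]`, so the index wraps back to 0 after the last character. A rough sketch, where `n_rows` is a hypothetical stand-in for `len(df)`:

```python
characters = ['a', 'b', 'c']
n_rows = 6  # hypothetical stand-in for len(df)

# i % len(characters) walks 0, 1, 2, 0, 1, 2, ... so the list wraps around
characters_list_for_df = [characters[i % len(characters)] for i in range(n_rows)]

print(characters_list_for_df)
```

The result has exactly one entry per row, so it can be assigned to the new column directly.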

Any help would be great!

jpp
d_kennetz

1 Answer

You can use this recipe to repeat your string up to a given length:

def repeat_to_length(s, wanted):
    return (s * (wanted // len(s) + 1))[:wanted]

df['label'] = list(repeat_to_length('abc', len(df.index)))

print(df)

      col1      col2 label
0     Jack      Jill     a
1  Michael     Micah     b
2  Derrick    Daliah     c
3   Martin    Martha     a
4  Patrick  Patricia     b
5   Dennis    Denise     c
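As a side note, the recipe relies only on `*`, `len`, and slicing, so it should work on the `characters` list as well as on a string, and it also handles lengths that are not a multiple of the list length. A small sketch (the variable names `labels_even` and `labels_uneven` are just for illustration):

```python
def repeat_to_length(s, wanted):
    # Repeat s one more time than strictly needed, then trim to the wanted length
    return (s * (wanted // len(s) + 1))[:wanted]

characters = ['a', 'b', 'c']

labels_even = repeat_to_length(characters, 6)    # length divisible by 3
labels_uneven = repeat_to_length(characters, 7)  # one extra row

print(labels_even)
print(labels_uneven)
```

Passing the list directly returns a list, so the extra `list(...)` call is only needed when the input is a string such as `'abc'`.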
jpp
  • This is wonderful, I have never seen this before. It works great! I will accept when possible. It is also easy to understand. – d_kennetz Sep 05 '18 at 18:55
  • I think `list(islice(cycle(["a","b","c"]), len(df)))` is a lot less fiddly, but if you're doing divisions, at least use `//` and lose the `int`. – DSM Sep 05 '18 at 18:57
  • @DSM, Fair point on floor division. `islice` + `cycle` works I guess. – jpp Sep 05 '18 at 22:25
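The `islice` + `cycle` suggestion from the comments can be sketched end to end as follows. This assumes the DataFrame is rebuilt from the sample data in the question; `cycle` repeats the list forever and `islice` cuts it off after `len(df)` items:

```python
from itertools import cycle, islice

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'col1': ['Jack', 'Michael', 'Derrick', 'Martin', 'Patrick', 'Dennis'],
    'col2': ['Jill', 'Micah', 'Daliah', 'Martha', 'Patricia', 'Denise'],
})
characters = ['a', 'b', 'c']

# Take exactly len(df) items from the endlessly repeating list
df['label'] = list(islice(cycle(characters), len(df)))

print(df)
```

This avoids the length arithmetic entirely and works for any number of rows, including row counts that are not a multiple of `len(characters)`.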