I have 2 complex dataframes of serial numbers (with a couple dozen columns of other information). The serial numbers can appear multiple times each in the first data frame. The second datafram is a subset of the first with all duplicates removed. In the second DF, I generated a unique code for each serial number but now need to assign that unique code to each instance of the serial number in the first dataframe.
I know how to do this using a for loop, but I'm trying to do it more efficiently since the datasets are massive. I have tried various types merge() but can't get the desired output maybe because the dataframe lengths are significantly different. Any ideas on how to do this with existing libraries? Thank you in advance.
This is an oversimplified view of the dataframes and the intended output.
Reference Dataframe
| SERIAL1 | SERIALCODE |
|---|---|
| 666661 | C12345 |
| 666662 | C12346 |
| 666663 | C12347 |
| 666664 | C12348 |
| 666665 | C12349 |
Active Dataframe
| SERIAL1 | SERIALCODE |
|---|---|
| 666663 | NaN |
| 666664 | NaN |
| 666665 | NaN |
| 666661 | NaN |
| 666663 | NaN |
| 666665 | NaN |
| 666662 | NaN |
| 666663 | NaN |
Desired Output
| SERIAL1 | SERIALCODE |
|---|---|
| 666661 | C12345 |
| 666662 | C12346 |
| 666663 | C12347 |
| 666661 | C12345 |
| 666664 | C12348 |
| 666665 | C12349 |
| 666661 | C12345 |
| 666663 | C12346 |
| 666665 | C12349 |
| 666662 | C12346 |
| 666663 | C12347 |