I have a DataFrame as below with all three columns having strings:
| Column A | Column B | Column C |
|---|---|---|
| Apple | red, yellow, blue | Texas, California |
| Banana | yellow, orange | Indiana, New Zealand |
| Watermelon | grey |
I would like to split each row into multiple rows based on a delimiter value. The final output would look like below.
| Column A | Column B | Column C |
|---|---|---|
| Apple | red | Texas |
| Apple | yellow | California |
| Apple | blue | |
| Banana | yellow | Indiana |
| Banana | orange | New Zealand |
| Watermelon | grey |
Note:
- Delimiters would only be present in Column B and Column C.
- The splits need not be equal. For instance, In row 1, the max splits across Column B and Column C are 3. But since column C has only 2 splits, the third row would result in column C as null/blank.
I tried following this stack overflow post, but couldn't apply it in my case here since each column would have different values based on splits across multiple columns, and since I am dealing with string splits.