77

I am trying to convert a list of lists which looks like the following into a Pandas Dataframe

[['New York Yankees ', '"Acevedo Juan"  ', 900000, ' Pitcher\n'], 
['New York Yankees ', '"Anderson Jason"', 300000, ' Pitcher\n'], 
['New York Yankees ', '"Clemens Roger" ', 10100000, ' Pitcher\n'], 
['New York Yankees ', '"Contreras Jose"', 5500000, ' Pitcher\n']]

I am basically trying to convert each item in the array into a pandas data frame which has four columns. What would be the best approach to this as pd.Dataframe does not quite give me what I am looking for.

Emre
  • 10,541
  • 1
  • 31
  • 39
Aravind Veluchamy
  • 871
  • 1
  • 6
  • 3

4 Answers4

89
import pandas as pd

data = [['New York Yankees', 'Acevedo Juan', 900000, 'Pitcher'], 
        ['New York Yankees', 'Anderson Jason', 300000, 'Pitcher'], 
        ['New York Yankees', 'Clemens Roger', 10100000, 'Pitcher'], 
        ['New York Yankees', 'Contreras Jose', 5500000, 'Pitcher']]

df = pd.DataFrame.from_records(data)
Emre
  • 10,541
  • 1
  • 31
  • 39
22

Once you have the data:

import pandas as pd

data = [['New York Yankees ', '"Acevedo Juan"  ', 900000, ' Pitcher\n'], 
        ['New York Yankees ', '"Anderson Jason"', 300000, ' Pitcher\n'], 
        ['New York Yankees ', '"Clemens Roger" ', 10100000, ' Pitcher\n'], 
        ['New York Yankees ', '"Contreras Jose"', 5500000, ' Pitcher\n']]

You can create dataframe from the transposing the data:

data_transposed = zip(data)
df = pd.DataFrame(data_transposed, columns=["Team", "Player", "Salary", "Role"])

Another way:

df = pd.DataFrame(data)
df = df.transpose()
df.columns = ["Team", "Player", "Salary", "Role"]
Paloma Manzano
  • 221
  • 2
  • 2
7

You can just directly define it as a data frame as follows:

import pandas as pd

data = [['New York Yankees', 'Acevedo Juan', 900000, 'Pitcher'], 
        ['New York Yankees', 'Anderson Jason', 300000, 'Pitcher'], 
        ['New York Yankees', 'Clemens Roger', 10100000, 'Pitcher'], 
        ['New York Yankees', 'Contreras Jose', 5500000, 'Pitcher']]

data = pd.DataFrame(data)
LUSAQX
  • 783
  • 3
  • 10
  • 24
1

This one by far was the simplest:

import pandas as pd

data = [['New York Yankees', 'Acevedo Juan', 900000, 'Pitcher'], 
        ['New York Yankees', 'Anderson Jason', 300000, 'Pitcher'], 
        ['New York Yankees', 'Clemens Roger', 10100000, 'Pitcher'], 
        ['New York Yankees', 'Contreras Jose', 5500000, 'Pitcher']]

data = pd.DataFrame(data)

now, if the keys are the first list in the list of lists (data[0]), you can assign them to column headers in the dataframe like so:

import pandas as pd

data = [['key1', 'key2', key3, 'key4'], 
    ['New York Yankees', 'Anderson Jason', 300000, 'Pitcher'], 
    ['New York Yankees', 'Clemens Roger', 10100000, 'Pitcher'], 
    ['New York Yankees', 'Contreras Jose', 5500000, 'Pitcher']]

data = pd.DataFrame(data[1:], columns=data[0])
GManAsg
  • 11
  • 1