Latest Questions tagged Pandas

10
Votes
Published 24 Apr, 2022
Python - Pandas DataFrame: normalize one JSON column and merge with other columns

I have a pandas DataFrame containing one column with multiple JSON data items as a list of dicts. I want to normalize the JSON column and duplicate the non-JSON columns: # creating dataframe df_actions...
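
One possible approach to the question above, as a minimal sketch: explode the list column so each dict gets its own row, then expand the dicts with `pd.json_normalize`. The column names `user` and `actions` and the sample values are assumptions for illustration; the original `df_actions` frame is not shown in the excerpt.

```python
import pandas as pd

# Hypothetical data: "actions" holds a list of dicts per row.
df = pd.DataFrame({
    "user": ["u1", "u2"],
    "actions": [
        [{"type": "click", "count": 1}, {"type": "view", "count": 2}],
        [{"type": "click", "count": 3}],
    ],
})

# One row per dict; the non-JSON columns are repeated for each dict.
exploded = df.explode("actions").reset_index(drop=True)

# Expand each dict into its own columns.
normalized = pd.json_normalize(exploded["actions"].tolist())

# Put the repeated non-JSON columns next to the normalized ones.
result = pd.concat([exploded.drop(columns="actions"), normalized], axis=1)
print(result)
```

Resetting the index after `explode` keeps the two frames aligned row by row when they are concatenated.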

21
Votes
Published 08 May, 2022
Python - Plyr or dplyr in Python

This is more of a conceptual question; I do not have a specific problem. I am learning Python for data analysis, but I am very familiar with R. One of the great things about R is plyr (and of course...

24
Votes
Published 08 May, 2022
Python - Pandas Correlation Groupby

Assuming I have a dataframe similar to the below, how would I get the correlation between 2 specific columns and then group by the 'ID' column? I believe the Pandas 'corr' method finds the correlatio...
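
A minimal sketch of a per-group correlation, assuming two hypothetical numeric columns `x` and `y` alongside the `ID` column (the original frame is not shown in the excerpt):

```python
import pandas as pd

# Hypothetical data: two numeric columns measured per ID.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 2],
    "x": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    "y": [2.0, 4.0, 6.0, 3.0, 2.0, 1.0],
})

# For each ID, compute the Pearson correlation between x and y.
corr_by_id = df.groupby("ID")[["x", "y"]].apply(lambda g: g["x"].corr(g["y"]))
print(corr_by_id)
```

Selecting the two columns before `apply` keeps the grouping column out of the function's input.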

71
Votes
Published 27 Aug, 2022
Python - Find rows of dataframe with the same column value in Pandas

Consider a dataframe with 2 columns for simplicity. The first column is id, and it is the key. The second column, named code, is not a key, but the case of two entries having the same value is very rare....
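
As a sketch of one way to find such rows, `DataFrame.duplicated` with `keep=False` flags every row whose `code` value occurs more than once. The sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical data: "id" is the key, "code" is rarely repeated.
df = pd.DataFrame({"id": [1, 2, 3, 4], "code": ["a", "b", "a", "c"]})

# keep=False marks all members of a duplicate set, not just the later ones.
same_code = df[df.duplicated("code", keep=False)]
print(same_code)
```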

70
Votes
Published 20 Aug, 2022
Python - How to compare two data frames and get the unmatched rows using Python?

I have two data frames, df1 and df2. Now, df1 contains 6 records and df2 contains 4 records. I want to get the unmatched records out of it. I tried, but I get an error: ValueError: Can only compare...
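
One way to get unmatched rows without comparing the frames element-wise is an outer merge with `indicator=True`. This is a sketch only: the single column `key` and its values are assumptions, since the original frames are not shown.

```python
import pandas as pd

# Hypothetical data: 6 records in df1 and 4 in df2, as in the question.
df1 = pd.DataFrame({"key": [1, 2, 3, 4, 5, 6]})
df2 = pd.DataFrame({"key": [1, 2, 3, 7]})

# The outer merge keeps every row; "_merge" records where each row came from.
merged = df1.merge(df2, how="outer", indicator=True)

# Rows present in only one of the two frames.
unmatched = merged[merged["_merge"] != "both"]
print(unmatched)
```

The `_merge` column takes the values `left_only`, `right_only`, or `both`, so it can also be used to keep rows from only one side.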

84
Votes
Published 31 Aug, 2022
Python - Write .csv file from pandas dataframe with consecutive spaces as delimiter

I want to write a text file whose fields are separated by four spaces instead of one tab: df.to_csv(file,sep= '\s\s\s\s') instead of df.to_csv(file,sep= '\t'). I tried a regex: df.to_csv(file,sep= r'\s...
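
`to_csv` expects a single-character `sep`, so a regex is not accepted there. A possible workaround, sketched below with made-up data, is to render the frame with a tab separator and then replace each tab with four spaces. It assumes no field itself contains a tab.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

# With no path given, to_csv returns the CSV text as a string.
text = df.to_csv(sep="\t", index=False).replace("\t", "    ")
print(text)
```

The resulting string can then be written out with an ordinary file write.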

42
Votes
Published 29 Aug, 2022
Python - pandas 0.24.1 Key Error: "None of [Index(['A' 'B'], dtype='object')] are in the [columns]"

Formerly I had Anaconda with pandas 0.18. Using the code below, I ran a calculation with the function "calc_func" and assigned the result to the columns of the DataFrame, say "A" and "B". df[["A", "B"]]...

59
Votes
Published 23 Aug, 2022
Python - pandas groupby column and check if group meets multiple conditions

I have a DataFrame that looks like the following: X Y Date are_equal 0 50.0 10.0 2018-08-19 False 1 NaN 10.0 2018-08-19 False 2 NaN 50.0 2018-08-1...

21
Votes
Published 31 Aug, 2022
Python - Create pandas dataframe from numpy array

To create a pandas dataframe from a NumPy array I can use: columns = ['1','2'] data = np.array([[1,2] , [1,5] , [2,3]]) df_1 = pd.DataFrame(data,columns=columns) df_1 If I instead use: columns = ['1',...

43
Votes
Published 27 Aug, 2022
Python - Python pandas dataframe group by based on a condition

My question is simple: I have a dataframe, and I group the results by a column and get the size like this: df.groupby('column').size() Now the problem is that I only want the ones where size...
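
Because `size()` returns a Series, it can be filtered with a boolean mask. The excerpt cuts off before stating the condition, so the threshold `min_size` below is purely hypothetical, as is the sample data.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"column": ["a", "a", "a", "b", "c", "c"]})

sizes = df.groupby("column").size()

# Hypothetical condition: keep groups with at least min_size rows.
min_size = 2
large = sizes[sizes >= min_size]
print(large)
```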

29
Votes
Published 25 Aug, 2022
Python - Save pandas dataframe with numpy arrays column

Let us consider the following pandas dataframe: df = pd.DataFrame([[1,np.array([6,7])],[4,np.array([8,9])]], columns = {'A','B'}) where the B column is composed of two NumPy arrays. If we save t...

55
Votes
Published 06 May, 2022
Python - How to fill dataframe NaN values with empty list [] in pandas?

This is my dataframe: date ids 0 2011-04-23 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,... 1 2011-04-24 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13...
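
`fillna` does not accept a list as the fill value, so one possible sketch is to replace the non-list entries element by element. The `date` and `ids` column names come from the excerpt; the sample values are shortened and made up.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing entry in the list-valued column.
df = pd.DataFrame({
    "date": ["2011-04-23", "2011-04-24", "2011-04-25"],
    "ids": [[0, 1, 2], np.nan, [3]],
})

# Keep existing lists; give every other entry its own new empty list.
df["ids"] = df["ids"].apply(lambda v: v if isinstance(v, list) else [])
print(df)
```

Creating the empty list inside the lambda avoids sharing one list object between rows.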

16
Votes
Published 26 Aug, 2022
Python - pandas: drop duplicates in groupby 'date'

In the dataframe below, I would like to eliminate the duplicate cid values so the output from df.groupby('date').cid.size() matches the output from df.groupby('date').cid.nunique(). I have looked at...
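
A minimal sketch, assuming only the `date` and `cid` columns named in the excerpt (the sample values are made up): dropping rows that repeat the same `(date, cid)` pair makes the two group counts agree.

```python
import pandas as pd

# Hypothetical data: cid 1 repeats on d1, cid 3 repeats on d2.
df = pd.DataFrame({
    "date": ["d1", "d1", "d1", "d2", "d2"],
    "cid": [1, 1, 2, 3, 3],
})

# Keep the first row of each (date, cid) pair.
deduped = df.drop_duplicates(subset=["date", "cid"])

sizes = deduped.groupby("date").cid.size()
uniques = deduped.groupby("date").cid.nunique()
print(sizes.equals(uniques))
```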

33
Votes
Published 25 Aug, 2022
Python - Convert a dictionary to DataFrame with specified column names

I have a dictionary of the form dict['TimeStamp'] = [value1,value2,value3]. The dict has many timestamps, and each timestamp has 3 values. For example, I want to make a pandas dataframe of all values of dict...
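
One way to build such a frame, sketched with made-up timestamps and hypothetical column names `v1`, `v2`, `v3`, is `DataFrame.from_dict` with `orient="index"`, which turns each key into a row label:

```python
import pandas as pd

# Hypothetical data: each timestamp maps to three values.
data = {
    "2022-01-01 00:00": [1, 2, 3],
    "2022-01-01 00:05": [4, 5, 6],
}

# Keys become the index; the three values fill the named columns.
df = pd.DataFrame.from_dict(data, orient="index", columns=["v1", "v2", "v3"])
print(df)
```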

69
Votes
Published 29 Aug, 2022
Python - Fast way to split column into multiple rows in Pandas

I have the following data frame: import pandas as pd df = pd.DataFrame({ 'gene':["foo", "bar // lal", "qux", "woz"]...
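
A possible sketch using only the `gene` column visible in the excerpt (the rest of the original frame is cut off): split on the `" // "` separator, then use `explode` to give each piece its own row.

```python
import pandas as pd

# Only the column shown in the excerpt; other columns are omitted here.
df = pd.DataFrame({"gene": ["foo", "bar // lal", "qux", "woz"]})

# Split each string into a list, then expand the lists into rows.
long_df = (
    df.assign(gene=df["gene"].str.split(" // "))
      .explode("gene")
      .reset_index(drop=True)
)
print(long_df)
```

Any other columns in the frame would be repeated for each piece produced by the split.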