
Pandas DataFrame: replace column values based on group

2019-04-01

I have a DataFrame with the following structure:

   master_mac    slave_mac        uuid           rawData               
0  ac233fc01403  ac233f26492b     e2c56db5       NaN                                                         
1  ac233fc01403  ac233f26492b     e2c56db5       NaN                                                         
2  ac233fc01403  ac233f26492b     e2c56db5       NaN                                                          
3  ac233fc01403  ac233f26492b     e2c56db5       ac0228  
4  ac233fc01403  e464eecba5eb     NaN            590080             
5  ac233fc01403  ac233f26492b     e2c56db5       ac0228  
6  ac233fc01403  ac233f26492b     e2c56db5       NaN                                                          
7  ac233fc01403  ac233f26492b     e2c56db5       636800       
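  • If the "uuid" column is not null for a group (that is, a combination of "master_mac" and "slave_mac"), the corresponding rows should contain NaN in the "rawData" column.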

The result should be:

   master_mac    slave_mac        uuid           rawData
0  ac233fc01403  ac233f26492b     e2c56db5       NaN                                                         
1  ac233fc01403  ac233f26492b     e2c56db5       NaN                                                         
2  ac233fc01403  ac233f26492b     e2c56db5       NaN                                                          
3  ac233fc01403  ac233f26492b     e2c56db5       NaN  
4  ac233fc01403  e464eecba5eb     NaN            590080             
5  ac233fc01403  ac233f26492b     e2c56db5       NaN  
6  ac233fc01403  ac233f26492b     e2c56db5       NaN                                                          
7  ac233fc01403  ac233f26492b     e2c56db5       NaN

Can anyone help me?

3 answers

Use:

m = df['uuid'].notna()

If the check needs to apply per group, use GroupBy.transform with GroupBy.any to test whether each group contains at least one non-NaN uuid:

m = df['uuid'].notna().groupby([df['master_mac'],df['slave_mac']]).transform('any')

df['rawData'] = df['rawData'].mask(m)
print(df)
     master_mac     slave_mac      uuid rawData
0  ac233fc01403  ac233f26492b  e2c56db5     NaN
1  ac233fc01403  ac233f26492b  e2c56db5     NaN
2  ac233fc01403  ac233f26492b  e2c56db5     NaN
3  ac233fc01403  ac233f26492b  e2c56db5     NaN
4  ac233fc01403  e464eecba5eb       NaN  590080
5  ac233fc01403  ac233f26492b  e2c56db5     NaN
6  ac233fc01403  ac233f26492b  e2c56db5     NaN
7  ac233fc01403  ac233f26492b  e2c56db5     NaN

Or:

import numpy as np

df.loc[m, 'rawData'] = np.nan
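
For readers who want to run the group-wise approach end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, on the assumption that all columns hold strings; the actual dtypes of the asker's data are not shown in the thread.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (string columns are an assumption).
df = pd.DataFrame({
    "master_mac": ["ac233fc01403"] * 8,
    "slave_mac": ["ac233f26492b"] * 4 + ["e464eecba5eb"] + ["ac233f26492b"] * 3,
    "uuid": ["e2c56db5"] * 4 + [np.nan] + ["e2c56db5"] * 3,
    "rawData": [np.nan, np.nan, np.nan, "ac0228", "590080", "ac0228", np.nan, "636800"],
})

# True for every row whose (master_mac, slave_mac) group has at least one non-null uuid.
m = df["uuid"].notna().groupby([df["master_mac"], df["slave_mac"]]).transform("any")

# Replace rawData with NaN wherever the group-level mask is True.
df["rawData"] = df["rawData"].mask(m)
print(df)
```

Only row 4, whose group has no uuid at all, keeps its rawData value.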
jezrael
2019-04-01

If you only need to change the rawData value in each row based on that row's uuid value, you can simply do:

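# Assign through a single .loc call; the chained form df['rawData'].loc[...] = ... may not modify df
df.loc[df['uuid'].notna(), 'rawData'] = np.nan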
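
On the question's sample data the row-wise and group-wise masks give the same result, because every row in a group has the same uuid. The sketch below uses small hypothetical data (the values "m1", "a", "u1", "x1" and so on are invented for illustration) in which one group has a uuid on only some of its rows, to show where the two approaches diverge.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group ("m1", "a") has a uuid on only one of its two rows.
df = pd.DataFrame({
    "master_mac": ["m1", "m1", "m1"],
    "slave_mac": ["a", "a", "b"],
    "uuid": ["u1", np.nan, np.nan],
    "rawData": ["x1", "x2", "x3"],
})

# Row-wise: True only where that row's own uuid is non-null.
row_mask = df["uuid"].notna()

# Group-wise: True for all rows of a group if any row in it has a uuid.
group_mask = row_mask.groupby([df["master_mac"], df["slave_mac"]]).transform("any")

row_wise = df["rawData"].mask(row_mask)
group_wise = df["rawData"].mask(group_mask)

print(pd.DataFrame({"row_wise": row_wise, "group_wise": group_wise}))
```

The row-wise mask blanks only the first row, while the group-wise mask blanks both rows of group "a". Which one is wanted depends on whether the rule is meant per row or per group.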
sentence
2019-04-01
duckdb

df1.sql.select("master_mac,slave_mac,uuid,case when uuid is null then rawData end rawData")

┌──────────────┬──────────────┬──────────┬─────────┐
│  master_mac  │  slave_mac   │   uuid   │ rawData │
│   varchar    │   varchar    │ varchar  │ varchar │
├──────────────┼──────────────┼──────────┼─────────┤
│ ac233fc01403 │ ac233f26492b │ e2c56db5 │ NULL    │
│ ac233fc01403 │ ac233f26492b │ e2c56db5 │ NULL    │
│ ac233fc01403 │ ac233f26492b │ e2c56db5 │ NULL    │
│ ac233fc01403 │ ac233f26492b │ e2c56db5 │ NULL    │
│ ac233fc01403 │ e464eecba5eb │ NULL     │ 590080  │
│ ac233fc01403 │ ac233f26492b │ e2c56db5 │ NULL    │
│ ac233fc01403 │ ac233f26492b │ e2c56db5 │ NULL    │
│ ac233fc01403 │ ac233f26492b │ e2c56db5 │ NULL    │
└──────────────┴──────────────┴──────────┴─────────┘
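
The CASE expression in the query above keeps rawData only where uuid is null. As a rough pandas-only equivalent that needs no extra dependency, the same row-wise rule can be sketched with Series.where. The three-row df1 here is a shortened, hand-built stand-in for the question's data, and `result` is just an illustrative name.

```python
import numpy as np
import pandas as pd

# Shortened stand-in for the question's data (an assumption for illustration).
df1 = pd.DataFrame({
    "master_mac": ["ac233fc01403"] * 3,
    "slave_mac": ["ac233f26492b", "e464eecba5eb", "ac233f26492b"],
    "uuid": ["e2c56db5", np.nan, "e2c56db5"],
    "rawData": ["ac0228", "590080", "636800"],
})

# CASE WHEN uuid IS NULL THEN rawData END:
# Series.where keeps values where the condition holds and yields NaN elsewhere.
result = df1.assign(rawData=df1["rawData"].where(df1["uuid"].isna()))
print(result)
```

Like the SQL version, this returns a new frame and leaves df1 unchanged.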
G.G
2024-04-13