Filtering a DataFrame by a secondary condition
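# Clean up outlier values in the make field
new = data[['make', 'model', 'instance_id']].copy()  # copy so the new column is not written to a slice of data
new['make_model'] = new['make'] + ':::' + new['model']
new.head(3)
# new.make_model.value_counts()
# Count how often each make_model value occurs and keep the values that occur at most 200 times
new.make_model.value_counts()[new.make_model.value_counts() <= 200]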
"""
OPPO:::OPPO+A59st 200
OPPO:::3007 200
Xiaomi:::Redmi%20Note%203 200
Meizu:::MEIZU-M6 199
samsung:::SM-N9006 199
...
OPPO,OPPO A53,A53:::OPPO A53 1
boway U15:::boway U15 1
BaiMao:::BM I8 1
vivo:::vivoy75a 1
SUPERJO:::SUPERJO 1
Name: make_model, Length: 15597, dtype: int64
"""
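Find the index of the entries that meet the count condition (this index is not 0..n; it is the index of the value_counts() result above, i.e. the make_model values themselves)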
(new.make_model.value_counts()[new.make_model.value_counts() <= 200]).index
"""
Index(['OPPO:::OPPO+A59st', 'OPPO:::3007', 'Xiaomi:::Redmi%20Note%203',
'Meizu:::MEIZU-M6', 'samsung:::SM-N9006', 'Coolpad:::MTS-T0',
'OPPO R11st:::OPPO R11st', 'Blephone:::lephone T7A', 'GIONEE:::GN9011',
'Meizu:::PRO 7-S',
...
'HUAWEI:::HUAWEI%25252BG7-UL20', 'VOLTE:::L3', 'GIONEE:::GN868',
'alps:::SOP-i9', 'GT-I9300I:::GT-I9300I',
'OPPO,OPPO A53,A53:::OPPO A53', 'boway U15:::boway U15',
'BaiMao:::BM I8', 'vivo:::vivoy75a', 'SUPERJO:::SUPERJO'],
dtype='object', length=15597)
"""
new.make_model
"""
0 HUAWEI:::HUAWEI-CAZ-AL10
1 Xiaomi:::Redmi Note 4
2 OPPO:::OPPO+R11s
3 NaN
4 Apple:::iPhone 7
...
1041669 OPPO:::OPPO-R9s
1041670 Xiaomi:::MI-5X
1041671 vivo:::vivo Y37
1041672 vivo:::vivo%20Y75A
1041673 OPPO:::A31
Name: make_model, Length: 1041674, dtype: object
"""
dataframe.loc[row_indexer, column_name]
# In the make_model column,
# locate the rows that satisfy new.make_model.isin((new.make_model.value_counts()[new.make_model.value_counts() <= 200]).index)
#
new.loc[new.make_model.isin((new.make_model.value_counts()[new.make_model.value_counts() <= 200]).index), 'make_model'] = 'other'  # collapse low-frequency values into 'other'
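The same pattern can be tried on a small frame. This is only a sketch: the `toy` frame, its values and the `min_count` threshold are made up for illustration (the article uses 200 on roughly a million rows). It also stores the `value_counts()` result once instead of computing it twice.

```python
import pandas as pd

# Hypothetical toy data standing in for the real `new` frame.
toy = pd.DataFrame({
    "make_model": ["OPPO:::A31", "OPPO:::A31", "OPPO:::A31",
                   "vivo:::Y37", "vivo:::Y37", "BaiMao:::BM I8", None]
})

min_count = 2  # hypothetical threshold; the article uses 200

# Count once and reuse the result.
counts = toy["make_model"].value_counts()

# The index of the filtered counts holds the rare make_model values.
rare_values = counts[counts <= min_count].index

# Boolean row mask plus column label: .loc writes into the frame itself.
toy.loc[toy["make_model"].isin(rare_values), "make_model"] = "other"
```

Missing values are not counted by `value_counts()`, so the row holding `None` is left untouched rather than being turned into 'other'.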
Now a second case to get a feel for it:
import time

# Convert each Unix timestamp to the day of the month (local time)
data['day'] = data['time'].apply(lambda x : int(time.strftime("%d", time.localtime(x))))
data['period'] = data['day']
data[['period']].head(3)
data['period'].unique()
# array([29, 30, 31, 27, 1, 2, 28, 3])
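For large frames, the day of the month can also be extracted without a per-row `apply`. The sketch below assumes the `time` column holds Unix timestamps in seconds and that the intended local time zone is Asia/Shanghai. Both are assumptions: the article relies on whatever time zone the machine is set to. The sample timestamps are made up.

```python
import pandas as pd

# Hypothetical timestamps in seconds since the epoch.
toy = pd.DataFrame({"time": [1540000000, 1540076400]})

# Parse as UTC, then convert to the assumed local time zone.
stamps = pd.to_datetime(toy["time"], unit="s", utc=True)
toy["day"] = stamps.dt.tz_convert("Asia/Shanghai").dt.day
```

The second timestamp falls on the 20th in UTC but on the 21st in Asia/Shanghai, which is why the time zone has to be chosen explicitly here.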
Filter directly with a condition on the column:
[data['period']<27]
"""
[0 False
1 False
2 False
3 False
4 False
...
1041669 True
1041670 True
1041671 True
1041672 True
1041673 True
Name: period, Length: 1041674, dtype: bool]
"""
data['period']<27
"""
0 False
1 False
2 False
3 False
4 False
...
1041669 True
1041670 True
1041671 True
1041672 True
1041673 True
Name: period, Length: 1041674, dtype: bool
"""
Select the rows of the period column whose value is < 27 (the selection works):
data['period'][data['period']<27]
"""
950 1
951 1
952 1
953 1
954 1
..
1041669 3
1041670 3
1041671 3
1041672 3
1041673 3
Name: period, Length: 348536, dtype: int64
"""
# Use .loc rather than chained indexing so the assignment reaches data itself
data.loc[data['period'] < 27, 'period'] += 31
Written this way, the result can be displayed with head:
data[['period']][data['period']<27].head(3)
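The conditional update of period can be checked on a few made-up values. This is a minimal sketch, assuming the goal is to add 31 to every day smaller than 27; the `toy` frame is hypothetical.

```python
import pandas as pd

# Hypothetical day-of-month values spanning a month boundary.
toy = pd.DataFrame({"period": [29, 30, 31, 1, 2, 3]})

# Build the boolean mask once, then update only the matching rows.
mask = toy["period"] < 27
toy.loc[mask, "period"] += 31

# Rows selected by the same mask, shown as a one-column frame.
shifted = toy.loc[mask, ["period"]].head(3)
```

Passing the mask and the column label to a single `.loc` call avoids the chained form `toy['period'][mask] = ...`, which may assign to a temporary copy.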
There is also a way to filter the whole frame using a single column:
t2['receive_number'] = t2.date_received.apply(lambda s:len(s.split(':')))
t2 = t2[t2.receive_number>1]
t2.head(3)
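A small runnable version of this filter is sketched below. The sample `date_received` strings are made up, and the ':'-separated format is assumed from the `split(':')` call above. Counting separators with `.str.count` is a swapped-in alternative to the `apply`/`split` approach and gives the same numbers when every value is a string.

```python
import pandas as pd

# Hypothetical sample; each value is one or more dates joined by ':'.
t2 = pd.DataFrame({
    "date_received": ["20160101",
                      "20160101:20160105",
                      "20160201:20160202:20160203"]
})

# Number of dates = number of ':' separators + 1.
t2["receive_number"] = t2["date_received"].str.count(":") + 1

# Indexing the frame with a boolean Series keeps only the matching rows.
t2 = t2[t2["receive_number"] > 1]
```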
Original link: https://blog.csdn.net/weixin_31866177/article/details/128454219