Counting the occurrences of each element in a Python list -- 688IT编程网

Counting the occurrences of each element in a Python list

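Calling the list count() method

object.count(argument)

Example of the count() method

Given the list ['a','iplaypython','c','b','a'], to count how many times the string 'a' occurs in it, you can write:

>>> ['a','iplaypython','c','b','a'].count('a')
2

The return value is the number of times the argument occurs in the list. In practice it is better to assign the list to a variable first and then call count() on that variable. When the object is a nested list, count() can also count an inner list that is passed as the argument: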

>>> x = [1,2,'a',[1,2],[1,2]]
>>> x.count([1,2])
2
>>> x.count(1)
1
>>> x.count('a')
1
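Because count() compares elements with ==, it also works for elements that cannot be used as dictionary keys, such as the inner lists above. As a minimal sketch (the helper name count_each is hypothetical and not part of the original article), every distinct element can be tallied with repeated count() calls:

```python
def count_each(items):
    """Return (element, count) pairs using list.count(); works for unhashable items."""
    seen = []
    result = []
    for item in items:
        # "in" on a list also compares with ==, so nested lists are fine here
        if item not in seen:
            seen.append(item)
            result.append((item, items.count(item)))
    return result

x = [1, 2, 'a', [1, 2], [1, 2]]
pairs = count_each(x)
print(pairs)  # [(1, 1), (2, 1), ('a', 1), ([1, 2], 2)]
```

Each count() call scans the whole list, so this is convenient for short lists but gets slower as the list grows.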

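1. Counting occurrences of letters and digits

s = 'abc123abc456aa'  # named s so the built-in str is not shadowed
d = {}
for x in s:
    if x not in d:
        d[x] = 1  # first occurrence of this character
    else:
        d[x] = d[x] + 1
print(d)

{'a': 4, 'b': 2, 'c': 2, '1': 1, '2': 1, '3': 1, '4': 1, '5': 1, '6': 1}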

#!/usr/bin/python3

s = "ABCdefabcdefabc"  # named s so the built-in str is not shadowed
s = s.lower()  # ignore case
char_dict = {}
for char1 in s:
    if char1 in char_dict:
        count = char_dict[char1]
    else:
        count = 0
    count = count + 1
    char_dict[char1] = count
print(char_dict)
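The if/else in the loops above can be folded into a single line with dict.get(), which returns a default when the key is missing. A minimal sketch, where the function name count_chars is an illustrative assumption:

```python
def count_chars(text):
    """Count each character of text, ignoring case."""
    counts = {}
    for ch in text.lower():
        # get() returns 0 when ch has not been seen yet
        counts[ch] = counts.get(ch, 0) + 1
    return counts

result = count_chars("ABCdefabcdefabc")
print(result)  # {'a': 3, 'b': 3, 'c': 3, 'd': 2, 'e': 2, 'f': 2}
```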

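a = "aAsmr3idd4bgs7Dlsf9eAF"

1. Extract the digits from string a and output them as a new string.

2. Count how many times each letter occurs in string a (ignoring case, so a and A are the same letter) and output the result as a dictionary, e.g. {'a':3,'b':1}

3. Remove the letters that occur more than once in string a, keeping only the first occurrence, case-insensitively. For example, 'aAsmr3idd4bgs7Dlsf9eAF' becomes 'asmr3id4bg7lf9e' after the removal.

a = "aAsmr3idd4bgs7Dlsf9eAF"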

def fun1_2(x):  # tasks 1 and 2
    x = x.lower()  # normalize case
    num = []
    dic = {}
    for i in x:
        if i.isdigit():  # task 1: collect the digits for the new string
            num.append(i)
        else:  # task 2: count each letter (case-insensitive) into a dictionary, e.g. {'a':3,'b':1}
            if i in dic:
                continue
            else:
                dic[i] = x.count(i)
    new = ''.join(num)
    print("the new numbers string is: " + new)
    print("the dictionary is: %s" % dic)

fun1_2(a)
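In fun1_2, x.count(i) rescans the whole string for every new letter. As an alternative sketch (the function name digits_and_letter_counts is hypothetical), collections.Counter tallies the letters in one pass; like the original, it treats every non-digit character as a letter:

```python
from collections import Counter

def digits_and_letter_counts(text):
    """Return (digit string, Counter of the non-digit characters), ignoring case."""
    lowered = text.lower()
    digits = ''.join(ch for ch in lowered if ch.isdigit())
    letters = Counter(ch for ch in lowered if not ch.isdigit())
    return digits, letters

a = "aAsmr3idd4bgs7Dlsf9eAF"
digits, letters = digits_and_letter_counts(a)
print(digits)        # 3479
print(letters['a'])  # 3
```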

def fun3(x):  # task 3
    x = x.lower()
    new3 = []
    for i in x:
        if i in new3:  # already kept once, skip the repeat
            continue
        else:
            new3.append(i)
    print(''.join(new3))

fun3(a)
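The same first-occurrence de-duplication can be sketched with dict.fromkeys(), relying on the fact that dict keys are unique and keep insertion order in Python 3.7+. The function name unique_chars is an assumption made for illustration:

```python
def unique_chars(text):
    """Drop repeated characters, keeping the first occurrence (case-insensitive)."""
    # dict keys are unique and remember the order in which they were inserted
    return ''.join(dict.fromkeys(text.lower()))

a = "aAsmr3idd4bgs7Dlsf9eAF"
deduped = unique_chars(a)
print(deduped)  # asmr3id4bg7lf9e
```

This avoids the linear "i in new3" list scan on every character, which only matters for long inputs.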

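Three approaches:

① Use a plain dict

② Use defaultdict

③ Use Counter

Note: int() called with no arguments returns 0, which is why defaultdict(int) starts every new key at 0.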
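The note about int() can be checked directly. This small sketch (the key name 'missing' is arbitrary) shows that defaultdict calls its factory, here int, the first time a key is looked up:

```python
from collections import defaultdict

print(int())  # 0

frequency = defaultdict(int)
print(frequency['missing'])  # 0: int() supplied the default, and the key is now stored
print(dict(frequency))       # {'missing': 0}

frequency['missing'] += 1
print(frequency['missing'])  # 1
```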

①dict

text = "I'm a hand some boy!"

frequency = {}

for word in text.split():
    if word not in frequency:
        frequency[word] = 1
    else:
        frequency[word] += 1

②defaultdict

import collections

frequency = collections.defaultdict(int)

text = "I'm a hand some boy!"

for word in text.split():
    frequency[word] += 1

③Counter

import collections

text = "I'm a hand some boy!"
frequency = collections.Counter(text.split())
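To see that the three approaches are interchangeable, the sketch below wraps each one in a function and compares the results. The function names and the sample sentence are assumptions chosen so that some words repeat:

```python
import collections

def with_dict(words):
    frequency = {}
    for word in words:
        if word not in frequency:
            frequency[word] = 1
        else:
            frequency[word] += 1
    return frequency

def with_defaultdict(words):
    frequency = collections.defaultdict(int)
    for word in words:
        frequency[word] += 1
    return frequency

def with_counter(words):
    return collections.Counter(words)

words = "the cat and the hat and the bat".split()
results = [dict(f(words)) for f in (with_dict, with_defaultdict, with_counter)]
print(results[0])  # {'the': 3, 'cat': 1, 'and': 2, 'hat': 1, 'bat': 1}
print(results[0] == results[1] == results[2])  # True
```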

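Suppose we have the following list:

[6, 7, 5, 9, 4, 1, 8, 6, 2, 9]

We want to count how many times each element occurs, which can be viewed as a word-frequency problem.

The result we want looks like {6: 2, ...}, that is, {element: number of occurrences, ...}

First, use the elements as dictionary keys and build a dictionary in which every value starts at 0: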

>>> from random import randint
>>> l = [randint(1,10) for x in range(10)]
>>> l
[6, 7, 5, 9, 4, 1, 8, 6, 2, 9]
>>> d = dict.fromkeys(l, 0)
>>> d
{6: 0, 7: 0, 5: 0, 9: 0, 4: 0, 1: 0, 8: 0, 2: 0}
# The task now is to fill in the count for each key in d
>>> for x in l:
...     d[x] += 1
...
>>> d
{6: 2, 7: 1, 5: 1, 9: 2, 4: 1, 1: 1, 8: 1, 2: 1}
# The occurrences of all elements have now been counted

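Another approach is to use the Counter class from the collections module:

>>> from collections import Counter
# Counter accepts a list directly and turns it into the finished tally
>>> d = Counter(l)
>>> d
Counter({6: 2, 9: 2, 7: 1, 5: 1, 4: 1, 1: 1, 8: 1, 2: 1})
# Counter is a subclass of dict, so a value can also be looked up by its key
>>> d[6]
2
# Conveniently, Counter has a built-in most_common(n) method that returns the n most frequent elements
>>> d.most_common(2)
[(6, 2), (9, 2)]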
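Two further Counter behaviours are worth a short sketch, using the same list as above: looking up an element that never occurred returns 0 instead of raising KeyError, and update() adds new observations to an existing tally.

```python
from collections import Counter

l = [6, 7, 5, 9, 4, 1, 8, 6, 2, 9]
d = Counter(l)

print(d[3])              # 0: missing elements count as zero, no KeyError
print(3 in d)            # False: the lookup above did not add the key
print(d.most_common(2))  # [(6, 2), (9, 2)]

d.update([9, 9])         # add more observations to the same Counter
print(d.most_common(1))  # [(9, 4)]
```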

Word frequency counting with Python

import string
import time

# Path of the text file to analyze
path='C:\\Users\\ZHANGSHUAILING\\Desktop\\'

with open(path,'r') as text:
    # Split on whitespace, strip surrounding punctuation, and lowercase each word
    words=[raw_word.strip(string.punctuation).lower() for raw_word in text.read().split()]

words_index=set(words)
# Map each distinct word to the number of times it occurs
counts_dict={index:words.count(index) for index in words_index}

# Print the words from most to least frequent
for word in sorted(counts_dict,key=lambda x:counts_dict[x],reverse=True):
    time.sleep(2)
    print ('{}--{} times'.format(word,counts_dict[word]))

{'the': 2154, 'and': 1394, 'to': 1080, 'of': 871, 'a': 861, 'his': 639, 'The': 637, 'in': 515, 'he': 461, 'with': 310, 'that': 308, 'you': 295, 'for': 280, 'A': 269, 'was': 258, 'him': 246, 'I': 234, 'had': 220, 'as': 217, 'not': 215, 'by': 196, 'on': 189, 'it': 178, 'be': 164, 'at': 153, 'from': 149, 'they': 149, 'but': 149, 'is': 144, 'her': 144, 'their': 143, 'who': 131, 'all': 121, 'one': 119, 'which': 119,}  # partial results
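In the script above, words.count(index) rescans the full word list once per distinct word. A possible alternative, sketched here on an in-memory string instead of a file (the function name word_frequencies and the sample sentence are assumptions), lets Counter do the tally in a single pass:

```python
import string
from collections import Counter

def word_frequencies(text):
    """Count words case-insensitively, ignoring surrounding punctuation."""
    words = (raw.strip(string.punctuation).lower() for raw in text.split())
    # Skip tokens that consisted only of punctuation and are now empty
    return Counter(word for word in words if word)

sample = "The cat and the hat. The END -- the end!"
counts = word_frequencies(sample)
for word, n in counts.most_common(2):
    print('{}--{} times'.format(word, n))
# the--4 times
# end--2 times
```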

import re,collections

def get_words(file):
    with open (file) as f:
        words_box=[]
        for line in f:
            # Meant to keep Chinese text out; note that '*' also matches the empty string, so this test always passes
            if re.match(r'[a-zA-Z0-9]*',line):
                words_box.extend(line.strip().split())
    return collections.Counter(words_box)

print(get_words('')+get_words('伊索寓言.txt'))

import re,collections

def get_words(file):
    with open (file) as f:
        words_box=[]
        for line in f:
            # Keep only lines that start with an ASCII letter or digit
            if re.match(r'[a-zA-Z0-9]',line):
                words_box.extend(line.strip().split())
    return collections.Counter(words_box)

a=get_words('')+get_words('伊索寓言.txt')
print(a.most_common(10))
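The line filter and the Counter addition used above can be tried without any files. In this sketch the function count_words takes an iterable of lines instead of a file name, and the sample lines are made up for illustration:

```python
import re
from collections import Counter

def count_words(lines):
    """Tally whitespace-separated words on lines that start with an ASCII letter or digit."""
    words_box = []
    for line in lines:
        if re.match(r'[a-zA-Z0-9]', line):
            words_box.extend(line.strip().split())
    return Counter(words_box)

first = count_words(["hello world", "你好 世界", "hello again"])
second = count_words(["world peace"])
total = first + second  # counts for words present in both are summed
print(total['hello'], total['world'], total['peace'])  # 2 2 1
print('你好' in total)  # False: that line was filtered out
```

Adding two Counter objects keeps only positive counts, which is always the case for word tallies like these.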

Summary of counting methods in Python

Method 1: iteration

def get_counts(sequence):
    counts = {}
    for x in sequence:
        if x in counts:
            counts[x] += 1
        else:
            counts[x] = 1
    return counts

This is the most conventional method: simply count the elements one by one.

Method 2: defaultdict

This uses the collections library.

from collections import defaultdict

def get_counts2(sequence):
    counts = defaultdict(int)  # every value is initialized to 0
    for x in sequence:
        counts[x] += 1
    return counts

The result is a dictionary that maps each element to its count.

Method 3: value_counts()

This method comes from pandas, so pandas must be imported before using it. It counts the elements and sorts the result by count in descending order.

tz_counts = frame['tz'].value_counts()
tz_counts[:10]

America/New_York       1251
                        521
America/Chicago         400
America/Los_Angeles     382
America/Denver          191
Europe/London            74
Asia/Tokyo               37
Pacific/Honolulu         36
Europe/Madrid            35
America/Sao_Paulo        33
Name: tz, dtype: int64

Here is how the official documentation describes it:

Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)
Returns object containing counts of unique values.

Note that the returned data is a Series.
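For readers who do not have pandas at hand, a rough standard-library analogue of the descending tally can be sketched with Counter.most_common(). The function value_counts_like below is hypothetical: it is not pandas API, it returns a list of (value, count) tuples instead of a Series, and the sample time-zone list is invented:

```python
from collections import Counter

def value_counts_like(values, normalize=False):
    """Return (value, count) pairs sorted by count, largest first."""
    counts = Counter(values).most_common()
    if normalize:
        # Convert counts to relative frequencies
        total = sum(n for _, n in counts)
        return [(value, n / total) for value, n in counts]
    return counts

tz = ['America/New_York', '', 'America/Chicago',
      'America/New_York', '', 'America/New_York']
print(value_counts_like(tz))
# [('America/New_York', 3), ('', 2), ('America/Chicago', 1)]
```

As in the pandas output above, an empty string is counted like any other value and shows up as a blank label.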

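Overall, Method 1 is the most basic and can be very slow when the data set is large, and Method 3 places requirements on the format of the data, so Method 2 is recommended.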
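Which approach is fastest depends on the data, the machine, and the Python version, so it is worth measuring. The sketch below times Method 1, Method 2, and collections.Counter with timeit on randomly generated integers; the data size, seed, and repeat count are arbitrary assumptions, and no particular timings are implied:

```python
import random
import timeit
from collections import Counter, defaultdict

def get_counts(sequence):
    counts = {}
    for x in sequence:
        if x in counts:
            counts[x] += 1
        else:
            counts[x] = 1
    return counts

def get_counts2(sequence):
    counts = defaultdict(int)
    for x in sequence:
        counts[x] += 1
    return counts

random.seed(0)  # fixed seed so every run counts the same data
data = [random.randint(1, 100) for _ in range(100_000)]

for name, func in [('dict', get_counts), ('defaultdict', get_counts2), ('Counter', Counter)]:
    seconds = timeit.timeit(lambda: func(data), number=5)
    print('{:<12} {:.4f} s'.format(name, seconds))
```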
