(Complete edition) Python: 100 pandas exercises (with answers) -- 688IT编程网

1. Import pandas under the name pd.

In [1]:

import pandas as pd

import numpy as np

2. Print the version of pandas that has been imported.

In [2]:

pd.__version__

3. Print out the version information of all the libraries that pandas requires.

In [3]:

pd.show_versions()

4. Create a DataFrame df from the dictionary data below, using labels as its index.

In [2]:

data = {'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],
        'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
        'visits': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
        'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']}

labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

df = pd.DataFrame(data, index=labels)

5. Display a summary of the basic information about this DataFrame and its data.

In [5]:

df.info()

# ...or...

df.describe()

6. Return the first 3 rows of the DataFrame df.

In [6]:

df.iloc[:3]

# or equivalently

df.head(3)

7. Select just the 'animal' and 'age' columns from the DataFrame df.

In [7]:

df.loc[:, ['animal', 'age']]

# or

df[['animal', 'age']]

8. Select the data in rows [3, 4, 8] and in columns ['animal', 'age'].

In [3]:

df.loc[df.index[[3, 4, 8]], ['animal', 'age']]
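The answer above mixes positional row numbers with column labels. As a minimal sketch, assuming a small made-up frame (the name small and its values are illustrative, not part of the exercise data), the same selection can also be written purely by position with iloc:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature frame, only for illustration
small = pd.DataFrame(
    {'animal': ['cat', 'dog', 'snake', 'dog', 'cat'],
     'age': [2.5, np.nan, 0.5, 5.0, 3.0],
     'visits': [1, 3, 2, 3, 2]},
    index=list('abcde'),
)

# Label-based: turn row positions into index labels first
by_label = small.loc[small.index[[1, 3, 4]], ['animal', 'age']]

# Position-based: turn column labels into integer positions first
col_pos = small.columns.get_indexer(['animal', 'age'])
by_position = small.iloc[[1, 3, 4], col_pos]

same = by_label.equals(by_position)
print(same)
```

Both forms avoid chained indexing such as small[['animal', 'age']].iloc[[1, 3, 4]], which also works for reading but is best avoided when assigning.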

9. Select only the rows where the number of visits is greater than 3.

In [4]:

df[df['visits'] > 3]

10. Select the rows where the age is missing, i.e. is NaN.

In [5]:

df[df['age'].isnull()]

11. Select the rows where the animal is a cat and the age is less than 3.

In [6]:

df[(df['animal'] == 'cat') & (df['age'] < 3)]

12. Select the rows where the age is between 2 and 4 (inclusive).

In [7]:

df[df['age'].between(2, 4)]
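Because the 'age' column contains NaN, it may help to see how missing values behave in the filters from exercises 10 to 12. The sketch below uses a small illustrative Series (the name ages and its values are assumptions, not the exercise data): comparisons against NaN evaluate to False, so rows with a missing age drop out of both filters.

```python
import numpy as np
import pandas as pd

# Hypothetical ages with one missing value
ages = pd.Series([2.5, np.nan, 4.0, 5.0], index=list('abcd'))

less_than_3 = ages < 3         # NaN < 3 is False
in_range = ages.between(2, 4)  # both endpoints included by default
missing = ages.isnull()        # the only mask that is True for NaN

print(pd.DataFrame({'<3': less_than_3, '2..4': in_range, 'NaN': missing}))
```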

13. Change the age in row 'f' to 1.5.

In [ ]:

df.loc['f', 'age'] = 1.5

14. Calculate the sum of all visits (the total number of visits).

In [ ]:

df['visits'].sum()

15. Calculate the mean age for each different animal in df.

In [8]:

df.groupby('animal')['age'].mean()
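One detail worth checking is how the group mean treats missing ages. A minimal sketch, assuming a tiny made-up frame (small and its values are illustrative): mean() skips NaN by default, so a group's mean is taken over its non-missing ages only.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one cat has a missing age
small = pd.DataFrame({'animal': ['cat', 'cat', 'dog'],
                      'age': [2.0, np.nan, 4.0]})

means = small.groupby('animal')['age'].mean()
print(means)
```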

16. Append a new row 'k' to df with your choice of values for each column. Then delete that row to return the original DataFrame.

In [ ]:

# values follow the column order: animal, age, visits, priority
df.loc['k'] = ['dog', 5.5, 2, 'no']

# and then deleting the new row to return the original DataFrame
df = df.drop('k')
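Assigning a list through df.loc depends on getting the column order right. As an alternative sketch (not the exercise's own answer), the new row can be built as a one-row DataFrame keyed by column name and attached with pd.concat; the names small and new_row and all values here are illustrative assumptions.

```python
import pandas as pd

# Hypothetical miniature frame
small = pd.DataFrame(
    {'animal': ['cat', 'dog', 'snake'],
     'age': [2.5, 5.0, 0.5],
     'visits': [1, 3, 2],
     'priority': ['yes', 'no', 'no']},
    index=list('abc'),
)

# One-row frame: values are matched by column name, not by position
new_row = pd.DataFrame(
    {'priority': ['no'], 'visits': [2], 'age': [5.5], 'animal': ['dog']},
    index=['k'],
)

extended = pd.concat([small, new_row])  # columns are aligned by label
restored = extended.drop('k')           # drop returns a new frame

print(extended)
print(restored.equals(small))
```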

17. Count the number of each type of animal in df.

In [9]:

df['animal'].value_counts()

18. Sort df first by the values in the 'age' column in descending order, then by the values in the 'visits' column in ascending order.

In [10]:

df.sort_values(by=['age', 'visits'], ascending=[False, True])

19. The 'priority' column contains the values 'yes' and 'no'. Replace this column with a column of boolean values: 'yes' should be True and 'no' should be False.

In [ ]:

df['priority'] = df['priority'].map({'yes': True, 'no': False})
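A caveat with map and a dictionary: any value that is not a key in the dictionary becomes NaN rather than raising an error. The sketch below illustrates this with made-up Series (the value 'maybe' is a hypothetical stray entry, not present in the exercise data).

```python
import pandas as pd

mapping = {'yes': True, 'no': False}

# Every value has a key in the mapping
clean = pd.Series(['yes', 'no']).map(mapping)

# 'maybe' has no key in the mapping, so it maps to NaN
dirty = pd.Series(['yes', 'no', 'maybe']).map(mapping)

print(clean.tolist())
print(dirty.isna().tolist())
```

Checking the result with isna().any() after mapping is a cheap way to catch unexpected categories.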

20. In the 'animal' column, change the 'snake' entries to 'python'.

In [14]:

df['animal'] = df['animal'].replace('snake', 'python')

print(df)

21. For each animal type and each number of visits, find the mean age. In other words, each row is an animal, each column is a number of visits and the values are the mean ages (hint: use a pivot table).

In [15]:

df.pivot_table(index='animal', columns='visits', values='age', aggfunc='mean')
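To make the shape of the result concrete, here is a small worked sketch on made-up data (small and its values are illustrative assumptions). Combinations of animal and visit count that never occur show up as NaN in the table.

```python
import pandas as pd

# Hypothetical data: two cats with 1 visit, dogs with 1 and 2 visits
small = pd.DataFrame({'animal': ['cat', 'cat', 'dog', 'dog'],
                      'visits': [1, 1, 2, 1],
                      'age': [2.0, 4.0, 5.0, 7.0]})

table = small.pivot_table(index='animal', columns='visits',
                          values='age', aggfunc='mean')
print(table)
```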

22. You have a DataFrame df with a column 'A' of integers. For example:

df = pd.DataFrame({'A': [1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 7]})

How do you filter out rows which contain the same integer as the row immediately above?

In [16]:

df = pd.DataFrame({'A': [1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 7]})

df.loc[df['A'].shift() != df['A']]

# Alternatively, we could use drop_duplicates() here. Note
# that this removes *all* duplicates though, so it won't
# give the same result if a value recurs later in a non-adjacent row.
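The difference between the shift comparison and drop_duplicates() only shows when a value comes back after a gap. A minimal sketch, assuming the illustrative input [1, 1, 2, 2, 1, 1] (the name demo is hypothetical):

```python
import pandas as pd

demo = pd.DataFrame({'A': [1, 1, 2, 2, 1, 1]})

# Keep a row only if it differs from the row directly above it
no_adjacent_repeats = demo.loc[demo['A'].shift() != demo['A']]

# Keep only the first occurrence of each value anywhere in the column
no_repeats_at_all = demo.drop_duplicates(subset='A')

print(no_adjacent_repeats['A'].tolist())
print(no_repeats_at_all['A'].tolist())
```

The first row always survives the shift comparison, because shift() puts NaN in the first position and NaN is never equal to a number.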

23. Given a DataFrame of numeric values, say

df = pd.DataFrame(np.random.random(size=(5, 3))) # a 5x3 frame of float values

how do you subtract the row mean from each element in the row?

In [ ]:

df.sub(df.mean(axis=1), axis=0)
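Random input makes the result hard to verify by eye, so here is a sketch of the row-centering step on a small fixed frame (the name fixed and its values are illustrative). The row means come back as a Series indexed by row label, and axis=0 tells sub to align that Series with the rows rather than the columns.

```python
import pandas as pd

# Hypothetical 2x3 frame with easy-to-check row means (2.0 and 6.0)
fixed = pd.DataFrame([[1.0, 2.0, 3.0],
                      [4.0, 6.0, 8.0]])

row_means = fixed.mean(axis=1)          # one mean per row
centered = fixed.sub(row_means, axis=0) # subtract each row's own mean

print(centered)
```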

24. Suppose you have DataFrame with 10 columns of real numbers, for example:

df = pd.DataFrame(np.random.random(size=(5, 10)), columns=list('abcdefghij'))

Which column of numbers has the smallest sum? (Find that column's label.)

In [17]:

df.sum().idxmin()

25. How do you count how many unique rows a DataFrame has (i.e. ignore all rows that are duplicates)?

In [ ]:

len(df) - df.duplicated(keep=False).sum()

# or perhaps

len(df.drop_duplicates(keep=False))
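Both answers above count the rows that occur exactly once, which is different from counting distinct rows. A small sketch on made-up data (demo and its values are illustrative assumptions) shows the two readings side by side:

```python
import pandas as pd

# Hypothetical frame: (1, 'a') appears twice, (2, 'b') once, (3, 'c') three times
demo = pd.DataFrame({'x': [1, 1, 2, 3, 3, 3],
                     'y': ['a', 'a', 'b', 'c', 'c', 'c']})

# keep=False marks every member of a duplicated group as a duplicate
never_repeated = len(demo) - demo.duplicated(keep=False).sum()
never_repeated_alt = len(demo.drop_duplicates(keep=False))

# The default keep='first' retains one representative per group
distinct = len(demo.drop_duplicates())

print(never_repeated, never_repeated_alt, distinct)
```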

26. You have a DataFrame that consists of 10 columns of floating-point numbers. Suppose that exactly 5
