-
Notifications
You must be signed in to change notification settings - Fork 448
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #111 from jiewong42/pub0
Pub0
- Loading branch information
Showing
2 changed files
with
155 additions
and
3 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,145 @@ | ||
# 常用包的学习 | ||
## Numpy | ||
|
||
## NumPy | ||
|
||
**简介:** | ||
NumPy 是 Python 的基础科学计算包,提供了高性能的 N 维数组对象(`ndarray`)和大量用于数组操作的函数。它是很多其他数据分析包(例如 pandas 和 scikit-learn)的基础。 | ||
|
||
**学习重点:** | ||
|
||
- **数组创建:** | ||
- 从列表创建:`np.array([1, 2, 3])` | ||
- 特殊数组:`np.zeros((3, 4))`、`np.ones((2, 5))`、`np.eye(4)`、`np.arange(0, 10, 2)`、`np.linspace(0, 1, 5)` | ||
|
||
- **数组操作:** | ||
- 形状操作:`array.shape`、`np.reshape()`、`np.transpose()` | ||
- 切片和索引:类似 Python 列表,但支持多维索引 | ||
- 广播机制:在不同形状的数组间进行运算 | ||
|
||
- **数学运算:** | ||
- 基本运算:加减乘除(直接对数组操作) | ||
- 通用函数(ufuncs):如 `np.sin()`、`np.log()`、`np.exp()` 等 | ||
- 矩阵运算:`np.dot()`、`np.matmul()`、`@` 操作符 | ||
|
||
**示例代码:** | ||
|
||
```python | ||
import numpy as np | ||
|
||
# 创建数组 | ||
a = np.array([1, 2, 3]) | ||
b = np.arange(0, 10, 2) # [0, 2, 4, 6, 8] | ||
|
||
# 数组的基本运算 | ||
c = a * 2 | ||
d = np.sin(a) | ||
|
||
# 数组的形状操作 | ||
matrix = np.arange(12).reshape(3, 4) | ||
transposed = matrix.T | ||
|
||
print("数组 a:", a) | ||
print("数组 b:", b) | ||
print("数组 c (a*2):", c) | ||
print("数组 d (sin(a)):", d) | ||
print("原始矩阵:\n", matrix) | ||
print("转置矩阵:\n", transposed) | ||
``` | ||
|
||
--- | ||
|
||
## pandas | ||
## matplotlib | ||
|
||
**简介:** | ||
pandas 是基于 NumPy 构建的数据分析库,主要提供了两种核心数据结构:`Series`(一维数据)和 `DataFrame`(二维数据表)。它极大地方便了数据读取、清洗、变换和分析。 | ||
|
||
**学习重点:** | ||
|
||
- **数据结构:** | ||
- **Series:** 类似于带标签的一维数组 | ||
- **DataFrame:** 表格数据结构,既有行索引也有列标签 | ||
|
||
- **数据读写:** | ||
- 从 CSV、Excel、SQL 等文件或数据库中读取数据:`pd.read_csv()`, `pd.read_excel()` | ||
- 保存数据:`DataFrame.to_csv()`, `DataFrame.to_excel()` | ||
|
||
- **数据操作:** | ||
- 选择与过滤:通过标签(`.loc`)或位置(`.iloc`)索引 | ||
- 数据清洗:处理缺失值(`dropna()`, `fillna()`)、重复值 | ||
- 数据变换:排序(`sort_values()`)、重设索引(`reset_index()`) | ||
- 数据聚合:`groupby()`、`pivot_table()`、`merge()`、`join()` | ||
|
||
**示例代码:** | ||
|
||
```python | ||
import pandas as pd | ||
|
||
# 创建 DataFrame | ||
data = { | ||
'姓名': ['Alice', 'Bob', 'Charlie', 'David'], | ||
'年龄': [25, 30, 35, 40], | ||
'城市': ['北京', '上海', '广州', '深圳'] | ||
} | ||
df = pd.DataFrame(data) | ||
|
||
# 查看数据 | ||
print("DataFrame 内容:") | ||
print(df) | ||
|
||
# 筛选年龄大于30的数据 | ||
df_filtered = df[df['年龄'] > 30] | ||
print("\n年龄大于30的人:") | ||
print(df_filtered) | ||
|
||
# 分组聚合:按城市统计平均年龄 | ||
df_grouped = df.groupby('城市')['年龄'].mean().reset_index() | ||
print("\n按城市计算平均年龄:") | ||
print(df_grouped) | ||
``` | ||
|
||
--- | ||
|
||
## matplotlib | ||
|
||
**简介:** | ||
matplotlib 是 Python 最常用的数据可视化库,特别是其 `pyplot` 接口,能够方便地创建各种图表(折线图、散点图、直方图、条形图等)。 | ||
|
||
**学习重点:** | ||
|
||
- **基本绘图:** | ||
- 使用 `plt.plot()` 绘制折线图 | ||
- `plt.scatter()` 绘制散点图 | ||
- `plt.bar()` 绘制条形图 | ||
- `plt.hist()` 绘制直方图 | ||
|
||
- **图表定制:** | ||
- 添加标题:`plt.title()` | ||
- 添加坐标轴标签:`plt.xlabel()`、`plt.ylabel()` | ||
- 添加图例:`plt.legend()` | ||
- 自定义颜色、线型、标记等 | ||
|
||
- **显示与保存图表:** | ||
- 显示图表:`plt.show()` | ||
- 保存图表:`plt.savefig('filename.png')` | ||
|
||
**示例代码:** | ||
|
||
```python | ||
import matplotlib.pyplot as plt | ||
import numpy as np | ||
|
||
# 生成数据 | ||
x = np.linspace(0, 10, 100) | ||
y = np.sin(x) | ||
|
||
# 绘制折线图 | ||
plt.figure(figsize=(8, 4)) | ||
plt.plot(x, y, label='sin', color='blue', linestyle='-', marker='o', markersize=4) | ||
plt.title('sin') | ||
plt.xlabel('X ') | ||
plt.ylabel('Y ') | ||
plt.legend() | ||
plt.grid(True) | ||
plt.show() | ||
|
||
``` |