[add] Upload training benchmark by z00560161

This commit is contained in:
liang_chaoming@huawei.com
2020-10-19 20:22:23 +08:00
parent 22b83024f5
commit 82522e2f61
1225 changed files with 345421 additions and 0 deletions
@@ -0,0 +1,141 @@
# YOLOv3_TensorFlow Training Guide
### 1. Introduction
This YOLOv3 implementation is based on third-party open-source TensorFlow code. It uses darknet-53 as the backbone network and supports both single-scale and multi-scale training. The dataset consists of a training set and a validation set; supported options include COCO2014 and COCO2017. This document uses the COCO2014 dataset as an example to describe the YOLOv3 training procedure.
### 2. Runtime environment
Python version: 3.7.5
Main third-party Python libraries:
- tensorflow >= 1.15.0 (satisfied with NPU)
- opencv-python
1. Install directly with `pip install opencv-python`.
2. If `pip install opencv-python` fails, build and install the library offline:
(1) Unpack the opencv package
(2) Enter the unpacked directory: `cd opencv`
(3) `mkdir -p build`
(4) `cd build`
(5) `cmake -D BUILD_opencv_python3=yes -D BUILD_opencv_python2=no -D PYTHON3_EXECUTABLE=/usr/local/python3.7.5/bin/python3.7m -D PYTHON3_INCLUDE_DIR=/usr/local/python3.7.5/include/python3.7m -D PYTHON3_LIBRARY=/usr/local/python3.7.5/lib/libpython3.7m.so -D PYTHON3_NUMPY_INCLUDE_DIRS=/usr/local/python3.7.5/lib/python3.7/site-packages/numpy/core/include -D PYTHON3_PACKAGES_PATH=/usr/local/python3.7.5/lib/python3.7/site-packages -D PYTHON_DEFAULT_EXECUTABLE=/usr/local/python3.7.5/bin/python3.7m ..`
(6) `make -j4`
(7) `make install`
Note: adjust the arguments after `cmake -D` to match the current environment.
- tqdm: install with `pip install tqdm`
- pycocotools: install with `pip install pycocotools`
Note: the third-party library pycocotools is required for evaluation.
### 3. Dataset preprocessing
#### 3.1 Set coco_dataset_path
In yolov3/tensorflow/code, change the value of coco_dataset_path in coco_minival_anns.py and coco_trainval_anns.py to the dataset path in the current environment, for example /opt/dataset/coco2014.
#### 3.2 Run the scripts
```
python3.7 coco_minival_anns.py
python3.7 coco_trainval_anns.py
```
This generates the training and validation annotation files coco2014_trainval.txt and coco2014_minival.txt. Place both files under yolov3/tensorflow/code/data.
The generated txt files look like this:
```
0 xxx/xxx/a.jpg 1920 1080 0 453 369 473 391 1 588 245 608 268
1 xxx/xxx/b.jpg 1920 1080 1 466 403 485 422 2 793 300 809 320
...
```
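The first number of each box in these files is a contiguous label index, while COCO `category_id` values run from 1 to 90 with gaps. The annotation scripts in this commit close those gaps with a chain of range checks. A minimal sketch of the same mapping as a lookup table follows; the names `COCO_ID_RANGES` and `coco_category_to_label` are hypothetical and do not appear in the repository.

```python
# Valid COCO category id ranges (inclusive), copied from the range checks
# in coco_trainval_anns.py / coco_minival_anns.py.
COCO_ID_RANGES = [(1, 11), (13, 25), (27, 28), (31, 44), (46, 65),
                  (67, 67), (70, 70), (72, 82), (84, 90)]

# Flatten the ranges into a sorted list of the 80 ids that are actually used.
COCO_IDS = [i for lo, hi in COCO_ID_RANGES for i in range(lo, hi + 1)]

# Position in the sorted list is the contiguous 0-based label index.
CAT_TO_LABEL = {cat_id: label for label, cat_id in enumerate(COCO_IDS)}


def coco_category_to_label(cat_id):
    """Return the contiguous 0-based label index for a COCO category id."""
    return CAT_TO_LABEL[cat_id]
```

Unlike the if/elif chain, the lookup raises `KeyError` for an id that falls in a gap (for example 12), which makes malformed annotations easier to spot.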
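### 4. Prepare the pretrained model
#### 4.1 Download the pretrained model
Download the pretrained darknet model from https://pjreddie.com/media/files/yolov3.weights.
#### 4.2 Model conversion
Use convert_weight.py under train/atlas_benchmark-master/object_detection/yolov3/tensorflow/code to convert the pretrained model into a TensorFlow ckpt file:
In convert_weight.py, set weight_path to the path of the downloaded pretrained model file and save_path to the path where the converted TensorFlow ckpt file should be written, for example: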
```
weight_path = '../yolov3-tf2/data/darknet53.conv.74'
save_path = './data/darknet_weights/darknet53.ckpt'
```
Then run:
```
python3.7 convert_weight.py
```
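Note: if the ckpt file given by save_path is not under train/atlas_benchmark-master/object_detection/yolov3/tensorflow/code/data/darknet_weights/, move it there manually.
### 5. Model training
#### 5.1 Training parameter configuration
Edit the configuration in train/yaml/YoLoV3.yaml. The configuration items have the following meanings: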
```
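mode: single-scale or multi-scale YOLOv3 mode; value is single or multi
data_url: dataset path
runmode: run mode, training or evaluation; value is train or evaluate
ckpt_path: path of the ckpt file used for evaluation; only used when runmode is evaluate
total_epoches: number of epochs to run
save_epoch: save a ckpt file every this many epochs
device_group_1p: device_id used for 1p runs
device_group_2p: device_ids used for 2p runs
device_group_4p: device_ids used for 4p runs
mpirun_ip: required only for cluster runs; format is ip1:device_count1,ip2:device_count2
docker_image: docker image name:version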
```
Example configuration in YoLoV3.yaml:
```
mode: single
data_url: /opt/npu/dataset
runmode: train
ckpt_path: /home/benchmark-master720/train/atlas_benchmark-master/object_detection/yolov3/tensorflow/result/TrainingJob-20200724115042
total_epoches: 1
save_epoch: 3
device_group_1p: 0
device_group_2p: 0 1
device_group_4p: 0 1 2 3
mpirun_ip: 90.90.176.152:8,90.90.176.154:8
docker_image: mpirun3:latest
```
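The `mpirun_ip` value packs several hosts into one string of `ip:device_count` pairs. How `benchmark.sh` consumes this value is not shown in this commit, so the following is only a minimal sketch of reading the documented format; `parse_mpirun_ip` is a hypothetical helper name.

```python
def parse_mpirun_ip(value):
    """Split 'ip1:n1,ip2:n2' into a list of (ip, device_count) tuples."""
    hosts = []
    for item in value.split(','):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma
        # Split on the last colon so the address part stays intact.
        ip, _, count = item.rpartition(':')
        hosts.append((ip, int(count)))
    return hosts


# Value taken from the YoLoV3.yaml example above.
hosts = parse_mpirun_ip('90.90.176.152:8,90.90.176.154:8')
total_devices = sum(count for _, count in hosts)
```

With the example value, `total_devices` is the number of devices across both hosts, which is a quick sanity check before launching a cluster job.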
#### 5.2 Launch the training script
Run the commands below from the train folder of the benchmark package.
```
bash benchmark.sh -e YoLoV3 -hw 1p # host, 1p
bash benchmark.sh -e YoLoV3 -hw 8p # host, 8p
bash benchmark.sh -e YoLoV3 -hw 1p -docker # docker, 1p
bash benchmark.sh -e YoLoV3 -hw 8p -docker # docker, 8p
bash benchmark.sh -e YoLoV3 -ct # host, cluster
bash benchmark.sh -e YoLoV3 -ct -docker # docker, cluster
```
#### 5.3 Training logs
Logs are written to the YoLoV3 folder under the result directory in the train path of the benchmark package:
```
./result/tf_yolov3/TrainingJob-2020xxxxxxxxxx/train_${device_id}.log
./result/TrainingJob-2020xxxxxxxxxx/train_${device_id}.log
./result/tensorflow/yolov3t/TrainingJob-2020xxxxxxxxxx/device_id/hw_yolov3.log
```
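### 6. Model evaluation
In train/yaml/YoLoV3.yaml, set ckpt_path to the output path produced by the training run and set runmode to evaluate, as in the example in 5.1.
Then run the same script as for training; see train.log for the results.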
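The evaluation script in this commit accepts either a checkpoint prefix or a directory; for a directory it reads the first line of the `checkpoint` index file that TensorFlow writes next to the saved model. A minimal sketch of that lookup follows. It tests with `os.path.isdir` instead of the substring checks used in the script, and `resolve_checkpoint` is a hypothetical name; `tf.train.latest_checkpoint` is the TensorFlow API that performs the same job.

```python
import os
import tempfile


def resolve_checkpoint(path):
    """Return a checkpoint prefix; for a directory, read its `checkpoint` index."""
    if not os.path.isdir(path):
        return path  # already a checkpoint prefix
    with open(os.path.join(path, 'checkpoint'), 'r') as f:
        first_line = f.readline()
    # The first line has the form: model_checkpoint_path: "model.ckpt-45800"
    name = first_line.replace('"', '').split(':')[1].strip()
    return os.path.join(path, name)


# Demonstration on a throwaway directory with a hand-written index file.
with tempfile.TemporaryDirectory() as job_dir:
    with open(os.path.join(job_dir, 'checkpoint'), 'w') as f:
        f.write('model_checkpoint_path: "model.ckpt-45800"\n')
    resolved_name = os.path.basename(resolve_checkpoint(job_dir))
```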
### 7. Reference training results
| Model | Npu_nums | mAP | FPS |
| :-------------------- | :------: | :------: | :------: |
| single_scale | 8 | 30.0 | 740 |
| multi_scale | 8 | 31.0 | 340 |
| single_scale | 1 | ---- | 96 |
| multi_scale | 1 | ---- | 44 |
-------
@@ -0,0 +1,13 @@
# dirs
.idea/
__pycache__/
tmp*/
# files
*.pyc
*.log
*.out
data/darknet_weights/*.ckpt*
@@ -0,0 +1,140 @@
# YOLOv3_TensorFlow
### 1. Introduction
This is an NPU implementation of [YOLOv3](https://pjreddie.com/media/files/papers/YOLOv3.pdf) using TensorFlow, modified from [YOLOv3_TensorFlow](https://github.com/wizyoung/YOLOv3_TensorFlow).
### 2. Requirements
Python version: 3.7.5
Main Python Packages:
- tensorflow >= 1.15.0 (satisfied with NPU)
- opencv-python
- tqdm
### 3. Weights conversion
The pretrained darknet53 weights file can be downloaded [here](https://pjreddie.com/media/files/darknet53.conv.74).
Place this weights file under directory `./data/darknet_weights/` and then run:
```shell
python3 convert_weight.py
```
The converted TensorFlow checkpoint file is then saved to the `./data/darknet_weights/` directory.
The converted weight file may already be included in this repo.
### 4. Training
#### 4.1 Data preparation
(0) dataset:
For comparison with the official implementation, we use [get_coco_dataset.sh](https://github.com/pjreddie/darknet/blob/master/scripts/get_coco_dataset.sh) to prepare the dataset.
(1) annotation file:
- ATTENTION: to fit the default path setting without editing the scripts, you can symlink your dataset:
- ln -s ${real_dataset_path} /opt/npu/dataset/coco
Use the scripts below to generate the `coco2014_trainval.txt` and `coco2014_minival.txt` files under the `./data/` directory.
```shell
python3 coco_trainval_anns.py
python3 coco_minival_anns.py
```
One line for one image, in the format like `image_index image_absolute_path img_width img_height box_1 box_2 ... box_n`.
Box_x format:
- `label_index x_min y_min x_max y_max`. (The origin of coordinates is at the left top corner, left top => (xmin, ymin), right bottom => (xmax, ymax).)
- `image_index` is the line index which starts from zero. `label_index` is in range [0, class_num - 1].
For example:
```
0 xxx/xxx/a.jpg 1920 1080 0 453 369 473 391 1 588 245 608 268
1 xxx/xxx/b.jpg 1920 1080 1 466 403 485 422 2 793 300 809 320
...
```
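To make the line format concrete, here is a minimal sketch of a parser for one annotation line. `parse_annotation_line` is a hypothetical helper, not a function from this repository, and it assumes the image path contains no spaces.

```python
def parse_annotation_line(line):
    """Parse 'index path width height (label x_min y_min x_max y_max)*'."""
    fields = line.strip().split(' ')
    image_index = int(fields[0])
    image_path = fields[1]
    width, height = int(fields[2]), int(fields[3])
    box_fields = fields[4:]
    if len(box_fields) % 5 != 0:
        raise ValueError('each box needs exactly 5 values')
    boxes = []
    for i in range(0, len(box_fields), 5):
        # One box: label index followed by the two corner points.
        label, x_min, y_min, x_max, y_max = (int(v) for v in box_fields[i:i + 5])
        boxes.append((label, x_min, y_min, x_max, y_max))
    return image_index, image_path, width, height, boxes


record = parse_annotation_line(
    '0 xxx/xxx/a.jpg 1920 1080 0 453 369 473 391 1 588 245 608 268')
```

For the first example line, `record[4]` holds two boxes, one with label 0 and one with label 1.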
(2) class_names file:
Generate the `data.names` file under `./data/` directory. Each line represents a class name.
For example:
```
bird
person
bike
...
```
The COCO dataset class names file is placed at `./data/coco.names`.
(3) prior anchor file:
Using the kmeans algorithm to get the prior anchors:
```
python get_kmeans.py
```
Then you will get 9 anchors and the average IoU. Save the anchors to a txt file.
The COCO dataset anchors offered by YOLO's author are placed at `./data/yolo_anchors.txt`; you can use those too.
The YOLO anchors computed by the kmeans script are on the resized image scale. The default resize method is the letterbox resize, i.e., it keeps the original aspect ratio in the resized image.
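Because letterbox resizing scales the image by a single ratio and pads the remainder, box coordinates have to be shifted as well as scaled. The evaluation script in this commit undoes this with `(box - dw) / resize_ratio`. The sketch below shows the geometry under the assumption that the padding is split evenly and rounded down; `utils/data_aug.py` is not shown here, so the helper names and rounding are illustrative only.

```python
def letterbox_params(ori_w, ori_h, new_w, new_h):
    """Return (ratio, dw, dh): the scale factor and the left/top padding."""
    ratio = min(new_w / ori_w, new_h / ori_h)
    resize_w, resize_h = int(ori_w * ratio), int(ori_h * ratio)
    dw = (new_w - resize_w) // 2  # horizontal padding on the left
    dh = (new_h - resize_h) // 2  # vertical padding on the top
    return ratio, dw, dh


def to_letterbox(box, ratio, dw, dh):
    """Map (x_min, y_min, x_max, y_max) from the original to the resized image."""
    x_min, y_min, x_max, y_max = box
    return (x_min * ratio + dw, y_min * ratio + dh,
            x_max * ratio + dw, y_max * ratio + dh)


def from_letterbox(box, ratio, dw, dh):
    """Inverse mapping, as applied to predictions at evaluation time."""
    x_min, y_min, x_max, y_max = box
    return ((x_min - dw) / ratio, (y_min - dh) / ratio,
            (x_max - dw) / ratio, (y_max - dh) / ratio)


# Hypothetical 832x624 image resized into a 416x416 network input.
ratio, dw, dh = letterbox_params(832, 624, 416, 416)
resized_box = to_letterbox((100, 200, 300, 400), ratio, dw, dh)
restored_box = from_letterbox(resized_box, ratio, dw, dh)
```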
#### 4.2 Training
1. single scale
Use `npu_train_*p_single.sh`. The hyper-parameters and their annotations can be found in `args_single.py`:
```shell
bash npu_train_1p_single.sh
# or
bash npu_train_8p_single.sh
```
2. multi scale
Use `npu_train_*p_multi.sh`. The hyper-parameters and their annotations can be found in `args_multi.py`:
```shell
bash npu_train_1p_multi.sh
# or
bash npu_train_8p_multi.sh
```
Check `args_single.py` and `args_multi.py` for more details. You should set the parameters yourself for your own specific task.
3. training details
1. nohup.out -- training task main_log
2. ./training/t1/D0/train_0.log -- training host log
3. training/t1/D0/training/train.log -- training perf log
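Both hyper-parameter files in this commit scale the initial learning rate linearly with the global batch size and convert the epoch-based `pw_boundaries` into step counts. A minimal sketch of that arithmetic follows, assuming `RANK_SIZE` is 8 for an 8p run; the image count is a made-up round number, not the size of the real annotation file.

```python
def scaled_learning_rate(base_lr, batch_size, num_devices, base_batch_size=64):
    """Linear scaling rule: lr grows with the global batch size."""
    return base_lr * (batch_size * num_devices) / base_batch_size


def piecewise_boundaries(epoch_boundaries, num_images, batch_size,
                         num_devices, global_step=0):
    """Convert epoch boundaries into global-step boundaries."""
    steps_per_epoch = int(float(num_images) / batch_size / num_devices)
    return [float(e) * steps_per_epoch + global_step for e in epoch_boundaries]


# Values from the single-scale config: base lr 5e-3, batch size 32 per device.
lr_single_8p = scaled_learning_rate(5e-3, batch_size=32, num_devices=8)
# Values from the multi-scale config: base lr 75e-4, batch size 16 per device.
lr_multi_8p = scaled_learning_rate(75e-4, batch_size=16, num_devices=8)
# Hypothetical dataset of 102400 images, single-scale batch size, 8 devices.
boundaries = piecewise_boundaries([80, 90], num_images=102400,
                                  batch_size=32, num_devices=8)
```

Under these assumptions the single-scale run starts at a learning rate of 0.02 and the multi-scale run at 0.015, each dropping by a factor of 10 at the two boundaries.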
### 5. Evaluation
Use `eval.sh` to evaluate on the validation or test dataset:
```shell
bash eval.sh
```
Check `eval.py` for more details. You can set the parameters yourself.
The mAP metrics are computed with the official cocoapi.
Use `tail -f eval_*.out` to watch the evaluation results.
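The cocoapi expects detections as a list of entries with `image_id`, `category_id`, `bbox` and `score`, where `bbox` is `[x, y, width, height]` rather than two corner points. The evaluation script defines the same four keys in `get_default_dict`, but the part that fills them is not shown here, so the following is only a sketch of the corner-to-size conversion; `to_coco_result` is a hypothetical helper, and `category_id` must already be the original COCO id (the script keeps a `real_id_to_cat_id` table for that purpose).

```python
def to_coco_result(image_id, category_id, box, score):
    """Build one COCO-style detection entry from a corner-format box."""
    x_min, y_min, x_max, y_max = box
    return {
        'image_id': image_id,
        'category_id': category_id,
        # COCO stores the top-left corner plus width and height.
        'bbox': [x_min, y_min, x_max - x_min, y_max - y_min],
        'score': score,
    }


# Hypothetical detection: image id 42, COCO category 1, confidence 0.9.
entry = to_coco_result(42, 1, (453, 369, 473, 391), 0.9)
```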
### 6. Training result
| Model | Npu_nums | mAP | FPS |
| :-------------------- | :------: | :------: | :------: |
| single_scale | 8 | 30.0 | 740 |
| multi_scale | 8 | 31.0 | 340 |
| single_scale | 1 | ---- | 96 |
| multi_scale | 1 | ---- | 44 |
-------
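One way to read the FPS column is as scaling efficiency: the 8-device throughput divided by eight times the 1-device throughput. The sketch below only redoes that arithmetic on the numbers in the table; the helper name is hypothetical.

```python
# FPS values copied from the table above, keyed by (model, device count).
fps = {
    ('single_scale', 1): 96,
    ('single_scale', 8): 740,
    ('multi_scale', 1): 44,
    ('multi_scale', 8): 340,
}


def scaling_efficiency(model, devices=8):
    """Measured throughput relative to perfect linear scaling."""
    return fps[(model, devices)] / (devices * fps[(model, 1)])


single_eff = scaling_efficiency('single_scale')
multi_eff = scaling_efficiency('multi_scale')
```

Both ratios come out slightly above 0.96, i.e. the 8-device runs reach roughly 96% of ideal linear scaling in this table.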
### Credits:
I referred to many fantastic repos during the implementation:
[YunYang1994/tensorflow-yolov3](https://github.com/YunYang1994/tensorflow-yolov3)
[qqwweee/keras-yolo3](https://github.com/qqwweee/keras-yolo3)
[eriklindernoren/PyTorch-YOLOv3](https://github.com/eriklindernoren/PyTorch-YOLOv3)
[pjreddie/darknet](https://github.com/pjreddie/darknet)
[dmlc/gluon-cv](https://github.com/dmlc/gluon-cv/tree/master/scripts/detection/yolo)
@@ -0,0 +1,110 @@
# coding: utf-8
# This file contains the parameter used in train.py
from __future__ import division, print_function
from utils.misc_utils import parse_anchors, read_class_names
import math
import os
save_dir = './training/' # The directory of the weights to save.
log_dir = './training/logs/' # The directory to store the tensorboard log files.
progress_log_path = './training/train.log' # The path to record the training progress.
# save_dir = os.path.join(work_path, save_dir)
# log_dir = os.path.join(work_path, log_dir)
# progress_log_path = os.path.join(work_path, progress_log_path)
if not os.path.exists(save_dir):
    os.makedirs(save_dir)
if not os.path.exists(log_dir):
    os.makedirs(log_dir)
work_path = os.path.realpath(__file__+"/..")
### Some paths
train_file = os.path.realpath(os.path.join(work_path, './data/coco2014_trainval.txt')) # The path of the training txt file.
val_file = os.path.realpath(os.path.join(work_path, './data/coco2014_minival.txt')) # The path of the validation txt file.
restore_path = os.path.realpath(os.path.join(work_path, './data/darknet_weights/darknet53.ckpt')) # The path of the weights to restore.
anchor_path = os.path.realpath(os.path.join(work_path, './data/yolo_anchors.txt')) # The path of the anchor txt file.
class_name_path = os.path.realpath(os.path.join(work_path, './data/coco.names')) # The path of the class names.
### Distribution setting
num_gpus=int(os.environ['RANK_SIZE'])
iterations_per_loop=10
### Training related numbers
batch_size = 16
img_size = [608, 608] # Images will be resized to `img_size` and fed to the network, size format: [width, height]
letterbox_resize = True # Whether to use the letterbox resize, i.e., keep the original aspect ratio in the resized image.
total_epoches = 200
train_evaluation_step = 1000 # Evaluate on the training batch after some steps.
val_evaluation_epoch = 2 # Evaluate on the whole validation dataset after some epochs. Set to None to evaluate every epoch.
save_epoch = 10 # Save the model after some epochs.
batch_norm_decay = 0.99 # decay in bn ops
weight_decay = 5e-4 # l2 weight decay
global_step = 0 # used when resuming training
### tf.data parameters
num_threads = 8 # Number of threads for image processing used in tf.data pipeline.
prefetech_buffer = batch_size * 4 # Prefetech_buffer used in tf.data pipeline.
### Learning rate and optimizer
optimizer_name = 'momentum' # Chosen from [sgd, momentum, adam, rmsprop]
save_optimizer = True # Whether to save the optimizer parameters into the checkpoint file.
learning_rate_base = 75e-4
learning_rate_base_batch_size = 64
learning_rate_init = learning_rate_base * ((batch_size * num_gpus) / learning_rate_base_batch_size)
lr_type = 'piecewise' # Chosen from [fixed, exponential, cosine_decay, cosine_decay_restart, piecewise]
lr_decay_epoch = 5 # Epochs after which learning rate decays. Int or float. Used when chosen `exponential` and `cosine_decay_restart` lr_type.
lr_decay_factor = 0.96 # The learning rate decay factor. Used when chosen `exponential` lr_type.
lr_lower_bound = 1e-6 # The minimum learning rate.
# only used in piecewise lr type
pw_boundaries = [80, 90] # epoch based boundaries
pw_values = [learning_rate_init, learning_rate_init*0.1, learning_rate_init*0.01]
### Load and finetune
# Choose the parts you want to restore the weights. List form.
# restore_include: None, restore_exclude: None => restore the whole model
# restore_include: None, restore_exclude: scope => restore the whole model except `scope`
# restore_include: scope1, restore_exclude: scope2 => if scope1 contains scope2, restore scope1 and not restore scope2 (scope1 - scope2)
# choice 1: only restore the darknet body
# restore_include = ['yolov3/darknet53_body']
restore_exclude = None
# choice 2: restore all layers except the last 3 conv2d layers in 3 scale
restore_include = None
# restore_exclude = ['yolov3/yolov3_head/Conv_14', 'yolov3/yolov3_head/Conv_6', 'yolov3/yolov3_head/Conv_22']
# restore_exclude = None
# Choose the parts you want to finetune. List form.
# Set to None to train the whole model.
# update_part = ['yolov3/yolov3_head']
update_part = None
### other training strategies
multi_scale_train = True # Whether to apply multi-scale training strategy. Image size varies from [320, 320] to [640, 640] by default.
use_label_smooth = False # Whether to use class label smoothing strategy.
use_focal_loss = False # Whether to apply focal loss on the conf loss.
use_mix_up = False # Whether to use mix up data augmentation strategy.
use_warm_up = True # whether to use warm up strategy to prevent from gradient exploding.
warm_up_epoch = min(total_epoches*0.1, 3) # Warm up training epoches. Set to a larger value if gradient explodes.
### some constants in validation
# nms
nms_threshold = 0.5 # iou threshold in nms operation
score_threshold = 0.001 # threshold of the probability of the classes in nms operation, i.e. score = pred_confs * pred_probs. set lower for higher recall.
nms_topk = 100 # keep at most nms_topk outputs after nms
# mAP eval
eval_threshold = 0.5 # the iou threshold applied in mAP evaluation
use_voc_07_metric = False # whether to use voc 2007 evaluation metric, i.e. the 11-point metric
### parse some params
anchors = parse_anchors(anchor_path)
classes = read_class_names(class_name_path)
class_num = len(classes)
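# Count annotation lines; the `with` blocks close the files after reading.
with open(train_file, 'r') as f:
    train_img_cnt = len(f.readlines())
with open(val_file, 'r') as f:
    val_img_cnt = len(f.readlines())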
train_batch_num = int(float(train_img_cnt) / batch_size / num_gpus)
lr_decay_freq = int(train_batch_num * lr_decay_epoch)
pw_boundaries = [float(i) * train_batch_num + global_step for i in pw_boundaries]
@@ -0,0 +1,105 @@
# coding: utf-8
# This file contains the parameter used in train.py
from __future__ import division, print_function
from utils.misc_utils import parse_anchors, read_class_names
import math
import os
save_dir = './training/' # The directory of the weights to save.
log_dir = './training/logs/' # The directory to store the tensorboard log files.
progress_log_path = './training/train.log' # The path to record the training progress.
if not os.path.exists(save_dir):
    os.makedirs(save_dir)
if not os.path.exists(log_dir):
    os.makedirs(log_dir)
work_path = os.path.realpath(__file__+"/..")
### Some paths
train_file = os.path.realpath(os.path.join(work_path, './data/coco2014_trainval.txt')) # The path of the training txt file.
val_file = os.path.realpath(os.path.join(work_path, './data/coco2014_minival.txt')) # The path of the validation txt file.
restore_path = os.path.realpath(os.path.join(work_path, './data/darknet_weights/darknet53.ckpt')) # The path of the weights to restore.
anchor_path = os.path.realpath(os.path.join(work_path, './data/yolo_anchors.txt')) # The path of the anchor txt file.
class_name_path = os.path.realpath(os.path.join(work_path, './data/coco.names')) # The path of the class names.
### Distribution setting
num_gpus=int(os.environ['RANK_SIZE'])
iterations_per_loop=10
### Training related numbers
batch_size = 32
img_size = [416, 416] # Images will be resized to `img_size` and fed to the network, size format: [width, height]
letterbox_resize = True # Whether to use the letterbox resize, i.e., keep the original aspect ratio in the resized image.
total_epoches = 200
train_evaluation_step = 1000 # Evaluate on the training batch after some steps.
val_evaluation_epoch = 2 # Evaluate on the whole validation dataset after some epochs. Set to None to evaluate every epoch.
save_epoch = 10 # Save the model after some epochs.
batch_norm_decay = 0.99 # decay in bn ops
weight_decay = 5e-4 # l2 weight decay
global_step = 0 # used when resuming training
### tf.data parameters
num_threads = 8 # Number of threads for image processing used in tf.data pipeline.
prefetech_buffer = batch_size * 4 # Prefetech_buffer used in tf.data pipeline.
### Learning rate and optimizer
optimizer_name = 'momentum' # Chosen from [sgd, momentum, adam, rmsprop]
save_optimizer = True # Whether to save the optimizer parameters into the checkpoint file.
learning_rate_base = 5e-3
learning_rate_base_batch_size = 64
learning_rate_init = learning_rate_base * ((batch_size * num_gpus) / learning_rate_base_batch_size)
lr_type = 'piecewise' # Chosen from [fixed, exponential, cosine_decay, cosine_decay_restart, piecewise]
lr_decay_epoch = 5 # Epochs after which learning rate decays. Int or float. Used when chosen `exponential` and `cosine_decay_restart` lr_type.
lr_decay_factor = 0.96 # The learning rate decay factor. Used when chosen `exponential` lr_type.
lr_lower_bound = 1e-6 # The minimum learning rate.
# only used in piecewise lr type
pw_boundaries = [80, 90] # epoch based boundaries
pw_values = [learning_rate_init, learning_rate_init*0.1, learning_rate_init*0.01]
### Load and finetune
# Choose the parts you want to restore the weights. List form.
# restore_include: None, restore_exclude: None => restore the whole model
# restore_include: None, restore_exclude: scope => restore the whole model except `scope`
# restore_include: scope1, restore_exclude: scope2 => if scope1 contains scope2, restore scope1 and not restore scope2 (scope1 - scope2)
# choice 1: only restore the darknet body
# restore_include = ['yolov3/darknet53_body']
restore_exclude = None
# choice 2: restore all layers except the last 3 conv2d layers in 3 scale
restore_include = None
# restore_exclude = ['yolov3/yolov3_head/Conv_14', 'yolov3/yolov3_head/Conv_6', 'yolov3/yolov3_head/Conv_22']
# Choose the parts you want to finetune. List form.
# Set to None to train the whole model.
# update_part = ['yolov3/yolov3_head']
update_part = None
### other training strategies
multi_scale_train = False # Whether to apply multi-scale training strategy. Image size varies from [320, 320] to [640, 640] by default.
use_label_smooth = False # Whether to use class label smoothing strategy.
use_focal_loss = False # Whether to apply focal loss on the conf loss.
use_mix_up = False # Whether to use mix up data augmentation strategy.
use_warm_up = True # whether to use warm up strategy to prevent from gradient exploding.
warm_up_epoch = min(total_epoches*0.1, 3) # Warm up training epoches. Set to a larger value if gradient explodes.
### some constants in validation
# nms
nms_threshold = 0.5 # iou threshold in nms operation
score_threshold = 0.001 # threshold of the probability of the classes in nms operation, i.e. score = pred_confs * pred_probs. set lower for higher recall.
nms_topk = 100 # keep at most nms_topk outputs after nms
# mAP eval
eval_threshold = 0.5 # the iou threshold applied in mAP evaluation
use_voc_07_metric = False # whether to use voc 2007 evaluation metric, i.e. the 11-point metric
### parse some params
anchors = parse_anchors(anchor_path)
classes = read_class_names(class_name_path)
class_num = len(classes)
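# Count annotation lines; the `with` blocks close the files after reading.
with open(train_file, 'r') as f:
    train_img_cnt = len(f.readlines())
with open(val_file, 'r') as f:
    val_img_cnt = len(f.readlines())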
train_batch_num = int(float(train_img_cnt) / batch_size / num_gpus)
lr_decay_freq = int(train_batch_num * lr_decay_epoch)
pw_boundaries = [float(i) * train_batch_num + global_step for i in pw_boundaries]
@@ -0,0 +1,113 @@
import json,cv2
from collections import defaultdict
ban_path = './data/5k.txt'
with open(ban_path, 'r') as f:
    ban_list = f.read().split('\n')[:-1]
ban_list = [i.split('/')[-1] for i in ban_list]
name_box_id = defaultdict(list)
id_name = dict()
coco_dataset_path = '/opt/npu/dataset/coco/coco2014'
f = open(
coco_dataset_path + "/annotations/instances_train2014.json",
encoding='utf-8')
data = json.load(f)
annotations = data['annotations']
for ant in annotations:
    id = ant['image_id']
    name = coco_dataset_path + '/train2014/COCO_train2014_%012d.jpg' % id
    cat = ant['category_id']
    # Map the sparse COCO category ids (1..90) to contiguous labels (0..79).
    if cat >= 1 and cat <= 11:
        cat = cat - 1
    elif cat >= 13 and cat <= 25:
        cat = cat - 2
    elif cat >= 27 and cat <= 28:
        cat = cat - 3
    elif cat >= 31 and cat <= 44:
        cat = cat - 5
    elif cat >= 46 and cat <= 65:
        cat = cat - 6
    elif cat == 67:
        cat = cat - 7
    elif cat == 70:
        cat = cat - 9
    elif cat >= 72 and cat <= 82:
        cat = cat - 10
    elif cat >= 84 and cat <= 90:
        cat = cat - 11
    name_box_id[name].append([ant['bbox'], cat])
f = open(
coco_dataset_path + "/annotations/instances_val2014.json",
encoding='utf-8')
data = json.load(f)
annotations = data['annotations']
for ant in annotations:
    id = ant['image_id']
    name = coco_dataset_path + '/val2014/COCO_val2014_%012d.jpg' % id
    cat = ant['category_id']
    # Map the sparse COCO category ids (1..90) to contiguous labels (0..79).
    if cat >= 1 and cat <= 11:
        cat = cat - 1
    elif cat >= 13 and cat <= 25:
        cat = cat - 2
    elif cat >= 27 and cat <= 28:
        cat = cat - 3
    elif cat >= 31 and cat <= 44:
        cat = cat - 5
    elif cat >= 46 and cat <= 65:
        cat = cat - 6
    elif cat == 67:
        cat = cat - 7
    elif cat == 70:
        cat = cat - 9
    elif cat >= 72 and cat <= 82:
        cat = cat - 10
    elif cat >= 84 and cat <= 90:
        cat = cat - 11
    name_box_id[name].append([ant['bbox'], cat])
f = open('data/coco2014_minival.txt', 'w')
ii = 0
for idx, key in enumerate(name_box_id.keys()):
    # Keep only the images listed in 5k.txt (the minival split).
    if key.split('/')[-1] not in ban_list:
        continue
    print('5k', key.split('/')[-1])
    f.write('%d ' % ii)
    ii += 1
    f.write(key)
    img = cv2.imread(key)
    h, w, c = img.shape
    f.write(' %d %d' % (w, h))
    box_infos = name_box_id[key]
    for info in box_infos:
        # COCO bbox is [x, y, width, height]; convert to corner coordinates.
        x_min = int(info[0][0])
        y_min = int(info[0][1])
        x_max = x_min + int(info[0][2])
        y_max = y_min + int(info[0][3])
        box_info = " %d %d %d %d %d" % (
            int(info[1]), x_min, y_min, x_max, y_max
        )
        f.write(box_info)
    f.write('\n')
f.close()
@@ -0,0 +1,113 @@
import json,cv2
from collections import defaultdict
ban_path = './data/5k.txt'
with open(ban_path, 'r') as f:
    ban_list = f.read().split('\n')[:-1]
ban_list = [i.split('/')[-1] for i in ban_list]
name_box_id = defaultdict(list)
id_name = dict()
coco_dataset_path = '/opt/npu/dataset/coco/coco2014'
f = open(
coco_dataset_path + "/annotations/instances_train2014.json",
encoding='utf-8')
data = json.load(f)
annotations = data['annotations']
for ant in annotations:
    id = ant['image_id']
    name = coco_dataset_path + '/train2014/COCO_train2014_%012d.jpg' % id
    cat = ant['category_id']
    # Map the sparse COCO category ids (1..90) to contiguous labels (0..79).
    if cat >= 1 and cat <= 11:
        cat = cat - 1
    elif cat >= 13 and cat <= 25:
        cat = cat - 2
    elif cat >= 27 and cat <= 28:
        cat = cat - 3
    elif cat >= 31 and cat <= 44:
        cat = cat - 5
    elif cat >= 46 and cat <= 65:
        cat = cat - 6
    elif cat == 67:
        cat = cat - 7
    elif cat == 70:
        cat = cat - 9
    elif cat >= 72 and cat <= 82:
        cat = cat - 10
    elif cat >= 84 and cat <= 90:
        cat = cat - 11
    name_box_id[name].append([ant['bbox'], cat])
f = open(
coco_dataset_path + "/annotations/instances_val2014.json",
encoding='utf-8')
data = json.load(f)
annotations = data['annotations']
for ant in annotations:
    id = ant['image_id']
    name = coco_dataset_path + '/val2014/COCO_val2014_%012d.jpg' % id
    cat = ant['category_id']
    # Map the sparse COCO category ids (1..90) to contiguous labels (0..79).
    if cat >= 1 and cat <= 11:
        cat = cat - 1
    elif cat >= 13 and cat <= 25:
        cat = cat - 2
    elif cat >= 27 and cat <= 28:
        cat = cat - 3
    elif cat >= 31 and cat <= 44:
        cat = cat - 5
    elif cat >= 46 and cat <= 65:
        cat = cat - 6
    elif cat == 67:
        cat = cat - 7
    elif cat == 70:
        cat = cat - 9
    elif cat >= 72 and cat <= 82:
        cat = cat - 10
    elif cat >= 84 and cat <= 90:
        cat = cat - 11
    name_box_id[name].append([ant['bbox'], cat])
f = open('data/coco2014_trainval.txt', 'w')
ii = 0
for idx, key in enumerate(name_box_id.keys()):
    # Skip the images listed in 5k.txt; they belong to the minival split.
    if key.split('/')[-1] in ban_list:
        continue
    print('trainval', key.split('/')[-1])
    f.write('%d ' % ii)
    ii += 1
    f.write(key)
    img = cv2.imread(key)
    h, w, c = img.shape
    f.write(' %d %d' % (w, h))
    box_infos = name_box_id[key]
    for info in box_infos:
        # COCO bbox is [x, y, width, height]; convert to corner coordinates.
        x_min = int(info[0][0])
        y_min = int(info[0][1])
        x_max = x_min + int(info[0][2])
        y_max = y_min + int(info[0][3])
        box_info = " %d %d %d %d %d" % (
            int(info[1]), x_min, y_min, x_max, y_max
        )
        f.write(box_info)
    f.write('\n')
f.close()
@@ -0,0 +1,38 @@
# coding: utf-8
# for more details about the yolo darknet weights file, refer to
# https://itnext.io/implementing-yolo-v3-in-tensorflow-tf-slim-c3c55ff59dbe
from __future__ import division, print_function
import os
import sys
import tensorflow as tf
import numpy as np
from model import yolov3
from utils.misc_utils import parse_anchors, load_weights
num_class = 80
img_size = 416
weight_path = '../yolov3-tf2/data/darknet53.conv.74'
save_path = './data/darknet_weights/darknet53.ckpt'
anchors = parse_anchors('./data/yolo_anchors.txt')
model = yolov3(80, anchors)
with tf.Session() as sess:
    inputs = tf.placeholder(tf.float32, [1, img_size, img_size, 3])
    with tf.variable_scope('yolov3'):
        feature_map = model.forward(inputs)
    saver = tf.train.Saver(var_list=tf.global_variables(scope='yolov3'))
    load_ops = load_weights(tf.global_variables(scope='yolov3'), weight_path)
    sess.run(tf.global_variables_initializer())
    sess.run(load_ops)
    saver.save(sess, save_path=save_path)
    print('TensorFlow model checkpoint has been saved to {}'.format(save_path))
File diff suppressed because it is too large
@@ -0,0 +1,80 @@
person
bicycle
car
motorbike
aeroplane
bus
train
truck
boat
traffic light
fire hydrant
stop sign
parking meter
bench
bird
cat
dog
horse
sheep
cow
elephant
bear
zebra
giraffe
backpack
umbrella
handbag
tie
suitcase
frisbee
skis
snowboard
sports ball
kite
baseball bat
baseball glove
skateboard
surfboard
tennis racket
bottle
wine glass
cup
fork
knife
spoon
bowl
banana
apple
sandwich
orange
broccoli
carrot
hot dog
pizza
donut
cake
chair
sofa
pottedplant
bed
diningtable
toilet
tvmonitor
laptop
mouse
remote
keyboard
cell phone
microwave
oven
toaster
sink
refrigerator
book
clock
vase
scissors
teddy bear
hair drier
toothbrush
@@ -0,0 +1 @@
10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
Binary file not shown. Size: 23 KiB
Binary file not shown. Size: 84 KiB
@@ -0,0 +1,220 @@
# coding: utf-8
from __future__ import division, print_function
import tensorflow as tf
import numpy as np
import argparse
import cv2
from utils.misc_utils import parse_anchors, read_class_names
from utils.nms_utils import gpu_nms, cpu_nms
from utils.plot_utils import get_color_table, plot_one_box
from utils.data_aug import letterbox_resize
from model import yolov3
from tqdm import trange
import json
import os,time
# npu modified
from npu_bridge.estimator import npu_ops
from npu_bridge.estimator.npu.npu_optimizer import NPUDistributedOptimizer
from tensorflow.core.protobuf.rewriter_config_pb2 import RewriterConfig
from npu_bridge.estimator.npu import util
'''
coco weight from official checked
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.309
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.555
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.311
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.136
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.337
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.460
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.273
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.430
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.465
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.270
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.511
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.629
'''
parser = argparse.ArgumentParser(description="YOLO-V3 evaluation procedure.")
parser.add_argument("--annotation_txt", type=str, default='../code/data/coco2014_minival.txt',
                    help="The path of the input image. Or annotation label txt.")
parser.add_argument("--anchor_path", type=str, default="../code/data/yolo_anchors.txt",
                    help="The path of the anchor txt file.")
parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416],
                    help="Resize the input image with `new_size`, size format: [width, height]")
parser.add_argument("--max_test", type=int, default=-1,
                    help="max step for test")
parser.add_argument("--score_thresh", type=float, default=1e-3,
                    help="score_threshold for test")
parser.add_argument("--nms_thresh", type=float, default=0.5,
                    help="iou_threshold for test")
parser.add_argument("--max_boxes", type=int, default=100,
                    help="max_boxes for test")
parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=True,
                    help="Whether to use the letterbox resize.")
parser.add_argument("--class_name_path", type=str, default="../code/data/coco.names",
                    help="The path of the class names.")
parser.add_argument("--restore_path", type=str, default="../code/data/darknet_weights/yolo3.ckpt",
# parser.add_argument("--restore_path", type=str, default="./training_s2/checkpoint_dir/model.ckpt-45800",
                    help="The path of the weights to restore.")
# type=bool would treat any non-empty string (including "False") as True,
# so parse the flag text explicitly, as --letterbox_resize does.
parser.add_argument("--save_img", type=lambda x: (str(x).lower() == 'true'), default=False,
                    help="whether to save detected-result image")
parser.add_argument("--save_json", type=lambda x: (str(x).lower() == 'true'), default=False,
                    help="whether to save detected-result cocolike json")
parser.add_argument("--save_json_path", type=str, default="../result/result.json",
                    help="The path of the result.json.")
args = parser.parse_args()
args.anchors = parse_anchors(args.anchor_path)
args.classes = read_class_names(args.class_name_path)
args.num_class = len(args.classes)
color_table = get_color_table(args.num_class)
cat_id_to_real_id = \
{1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9, 10: 10, 11: 11, 13: 12, 14: 13, 15: 14, 16: 15, 17: 16,
18: 17, 19: 18, 20: 19, 21: 20, 22: 21, 23: 22, 24: 23, 25: 24, 27: 25, 28: 26, 31: 27, 32: 28, 33: 29, 34: 30,
35: 31, 36: 32, 37: 33, 38: 34, 39: 35, 40: 36, 41: 37, 42: 38, 43: 39, 44: 40, 46: 41, 47: 42, 48: 43, 49: 44,
50: 45, 51: 46, 52: 47, 53: 48, 54: 49, 55: 50, 56: 51, 57: 52, 58: 53, 59: 54, 60: 55, 61: 56, 62: 57, 63: 58,
64: 59, 65: 60, 67: 61, 70: 62, 72: 63, 73: 64, 74: 65, 75: 66, 76: 67, 77: 68, 78: 69, 79: 70, 80: 71, 81: 72,
82: 73, 84: 74, 85: 75, 86: 76, 87: 77, 88: 78, 89: 79, 90: 80}
real_id_to_cat_id = {cat_id_to_real_id[i]: i for i in cat_id_to_real_id}
def get_default_dict():
    return {"image_id": -1, "category_id": -1, "bbox": [], "score": 0}
eval_path = args.annotation_txt
with open(eval_path, 'r') as f:
    eval_file_list = f.read().split('\n')[:-1]
print(len(eval_file_list))
eval_file_dict = {}
for i in eval_file_list:
    tmp_list = i.split(' ')
    idx = int(tmp_list[0])
    path = tmp_list[1]
    w = float(tmp_list[2])
    h = float(tmp_list[3])
    bbox_len = len(tmp_list[4:]) // 5
    bbox = []
    for bbox_idx in range(bbox_len):
        label, x1, y1, x2, y2 = tmp_list[4:][bbox_idx * 5:bbox_idx * 5 + 5]
        bbox.append([label, x1, y1, x2, y2])
    eval_file_dict[idx] = {
        'path': path,
        'w': w,
        'h': h,
        'bbox': bbox
    }
config = tf.ConfigProto()
custom_op = config.graph_options.rewrite_options.custom_optimizers.add()
custom_op.name = "NpuOptimizer"
custom_op.parameter_map["use_off_line"].b = True # training on Ascend chips
config.graph_options.rewrite_options.remapping = RewriterConfig.OFF
json_out = []
with tf.Session(config=config) as sess:
    # with tf.Session() as sess:
    input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data')
    yolo_model = yolov3(args.num_class, args.anchors)
    with tf.variable_scope('yolov3'):
        pred_feature_maps = yolo_model.forward(input_data, False)
    pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps)
    pred_scores = pred_confs * pred_probs
    # boxes, scores, labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=100, score_thresh=args.score_thresh, nms_thresh=0.5)
    saver = tf.train.Saver()
    if args.restore_path.find('.ckpt') < 0 and args.restore_path.find('model-') < 0:
        # restore_path is a directory: read the checkpoint name from its index file.
        with open(os.path.join(args.restore_path, 'checkpoint'), 'r') as f:
            tmp_checkpoint = f.readline()
        tmp_checkpoint = tmp_checkpoint.replace('"', '').split(':')[1].strip()
        args.restore_path = os.path.join(args.restore_path, tmp_checkpoint)
        print('tmp_checkpoint: ', tmp_checkpoint)
    # input()
    saver.restore(sess, args.restore_path)
    if args.max_test > 0:
        test_len = min(args.max_test, len(eval_file_dict.keys()))
    else:
        test_len = len(eval_file_dict.keys())
    for test_idx in trange(test_len):
        img_path = eval_file_dict[test_idx]['path']
        img_ori = cv2.imread(img_path)
        if args.letterbox_resize:
            img, resize_ratio, dw, dh = letterbox_resize(img_ori, args.new_size[0], args.new_size[1])
        else:
            height_ori, width_ori = img_ori.shape[:2]
            img = cv2.resize(img_ori, tuple(args.new_size))
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img = np.asarray(img, np.float32)
        img = img[np.newaxis, :] / 255.
        # boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img})
        # print('bbox: ',boxes_)
        t = time.time()
        boxes_, scores_ = sess.run([pred_boxes, pred_scores], feed_dict={input_data: img})
        # print("FPS: ", 1/(time.time() - t))
        boxes_, scores_, labels_ = cpu_nms(boxes_, scores_, args.num_class, args.max_boxes, args.score_thresh, args.nms_thresh)
        # print('bbox: ', boxes_)
        # try:
        #     boxes_[:, [0, 2]] = (boxes_[:, [0, 2]] - dw) / resize_ratio
        # except:
        #     print("boxes_: ", boxes_)
        #     continue
        # print("boxes_: ", boxes_)
        # rescale the coordinates to the original image
        if args.letterbox_resize:
            boxes_[:, [0, 2]] = (boxes_[:, [0, 2]] - dw) / resize_ratio
            boxes_[:, [1, 3]] = (boxes_[:, [1, 3]] - dh) / resize_ratio
        else:
            boxes_[:, [0, 2]] *= (width_ori / float(args.new_size[0]))
            boxes_[:, [1, 3]] *= (height_ori / float(args.new_size[1]))
        if args.save_img:
            # print("box coords:")
            # print(boxes_)
            # print('*' * 30)
            # print("scores:")
            # print(scores_)
            # print('*' * 30)
            # print("labels:")
# print(labels_)
for i in range(len(boxes_)):
x0, y0, x1, y1 = boxes_[i]
plot_one_box(img_ori, [x0, y0, x1, y1],
label=args.classes[labels_[i]] + ', {:.2f}%'.format(scores_[i] * 100),
color=color_table[labels_[i]])
cv2.imwrite('tmp/%d_detection_result.jpg' % test_idx, img_ori)
print('%d done' % test_idx)
if args.save_json:
for i in range(len(boxes_)):
x0, y0, x1, y1 = boxes_[i]
bw = x1 - x0
bh = y1 - y0
s = scores_[i]
c = labels_[i]
t_dict = get_default_dict()
t_dict['image_id'] = int(img_path.split('/')[-1].split('.')[0].split('_')[-1])
t_dict['category_id'] = real_id_to_cat_id[int(c) + 1]
t_dict['bbox'] = [int(i) for i in [x0, y0, bw, bh]]
t_dict['score'] = float(s)
json_out.append(t_dict)
if args.save_json:
with open(args.save_json_path, 'w')as f:
json.dump(json_out, f)
print('output json saved to: ', args.save_json_path)
eval_coco = os.path.realpath(__file__ + "/../eval_coco.py")
os.system('python3.7 %s %s' % (eval_coco, args.save_json_path))
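The JSON branch above turns each corner-format box into a COCO result entry. The conversion can be sketched in isolation as follows. This is a minimal sketch, not the script itself: `to_coco_result` is a hypothetical helper name, and `demo_mapping` is a three-entry excerpt of the id mapping used only for illustration.

```python
def to_coco_result(img_path, box, score, label, real_id_to_cat_id):
    """Convert one detection to a COCO-style result dict.

    box is [x0, y0, x1, y1] in pixels; label is the 0-based class index.
    """
    x0, y0, x1, y1 = box
    # The image id is the trailing number of the file name,
    # e.g. COCO_val2014_000000000042.jpg -> 42.
    image_id = int(img_path.split('/')[-1].split('.')[0].split('_')[-1])
    return {
        "image_id": image_id,
        # Class indices are 0-based, the contiguous ids start at 1.
        "category_id": real_id_to_cat_id[int(label) + 1],
        # COCO expects [x, y, width, height], not two corners.
        "bbox": [int(v) for v in (x0, y0, x1 - x0, y1 - y0)],
        "score": float(score),
    }


# Hypothetical excerpt: contiguous id -> COCO category id.
demo_mapping = {17: 18, 25: 27, 80: 90}
result = to_coco_result("val2014/COCO_val2014_000000000042.jpg",
                        [10.6, 20.2, 110.9, 70.5], 0.87, 24, demo_mapping)
print(result)
```

Note that `int()` truncates the coordinates, so sub-pixel precision is lost in the saved results.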
@@ -0,0 +1,61 @@
#export CUDA_VISIBLE_DEVICES=''
#export CUDA_VISIBLE_DEVICES=7
# set the main path (directory that contains this script)
MAIN_PATH=$(dirname "$(readlink -f "$0")")
## set env
#export PYTHONPATH=/usr/local/Ascend/ops/op_impl/built-in/ai_core/tbe/:$MAIN_PATH/../../../
#export LD_LIBRARY_PATH=/usr/local/lib/:/usr/lib/:/usr/local/Ascend/fwkacllib/lib64/:/usr/local/Ascend/driver/lib64/common/:/usr/local/Ascend/driver/lib64/driver/:/usr/local/Ascend/add-ons/:/usr/lib/x86_64-linux-gnu
#PATH=$PATH:$HOME/bin
#export PATH=$PATH:/usr/local/Ascend/fwkacllib/ccec_compiler/bin:$PATH
#export ASCEND_OPP_PATH=/usr/local/Ascend/opp
# set env
export ASCEND_HOME=/usr/local/Ascend
export LD_LIBRARY_PATH=/usr/local/lib/:/usr/lib/:/usr/local/Ascend/ascend-toolkit/latest/fwkacllib/lib64:/usr/local/Ascend/driver/lib64/common/:/usr/local/Ascend/driver/lib64/driver/:/usr/local/Ascend/add-ons/
export PYTHONPATH=$PYTHONPATH:/usr/local/Ascend/ascend-toolkit/latest/opp/op_impl/built-in/ai_core/tbe:/usr/local/Ascend/ascend-toolkit/latest/fwkacllib/python/site-packages/te:/usr/local/Ascend/ascend-toolkit/latest/fwkacllib/python/site-packages/topi:/usr/local/Ascend/ascend-toolkit/latest/fwkacllib/python/site-packages/hccl:/usr/local/Ascend/ascend-toolkit/latest/tfplugin/python/site-packages:$MAIN_PATH
export PATH=$PATH:/usr/local/Ascend/ascend-toolkit/latest/fwkacllib/ccec_compiler/bin
export ASCEND_OPP_PATH=/usr/local/Ascend/ascend-toolkit/latest/opp/
export DDK_VERSION_FLAG=1.60.T49.0.B201
export NEW_GE_FE_ID=1
export GE_AICPU_FLAG=1
export SOC_VERSION=Ascend910
export JOB_ID=10087
export FUSION_TENSOR_SIZE=1000000000
#export SLOG_PRINT_TO_STDOUT=1
#export DUMP_GE_GRAPH=2
#export DUMP_GRAPH_LEVEL=3
for((RANK_ID=0;RANK_ID<8;RANK_ID++));
do
export RANK_ID=$RANK_ID
export RANK_SIZE=1
export DEVICE_ID=$RANK_ID
export DEVICE_INDEX=$RANK_ID
su HwHiAiUser -c "adc --host 0.0.0.0:22118 --log \"SetLogLevel(0)[debug]\" --device "$RANK_ID
RESTORE_PATH=./training/t1/D$RANK_ID/training/
nohup python3.7 eval.py \
--save_json True \
--score_thresh 0.0001 \
--nms_thresh 0.55 \
--max_boxes 100 \
--restore_path $RESTORE_PATH \
--max_test 10000 \
--save_json_path eval_res_D$RANK_ID.json > eval_$RANK_ID.out &
done
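The loop above starts one independent evaluation process per device. The per-device environment and arguments it uses can be sketched in Python as follows. `build_eval_jobs` is a hypothetical helper; it only builds the job descriptions and launches nothing.

```python
def build_eval_jobs(num_devices=8, max_test=10000):
    """Return (env, argv, log_file) triples, one per device; nothing is launched."""
    jobs = []
    for rank_id in range(num_devices):
        env = {
            "RANK_ID": str(rank_id),
            "RANK_SIZE": "1",  # every evaluation process is a single-device job
            "DEVICE_ID": str(rank_id),
            "DEVICE_INDEX": str(rank_id),
        }
        argv = [
            "python3.7", "eval.py",
            "--save_json", "True",
            "--score_thresh", "0.0001",
            "--nms_thresh", "0.55",
            "--max_boxes", "100",
            "--restore_path", "./training/t1/D%d/training/" % rank_id,
            "--max_test", str(max_test),
            "--save_json_path", "eval_res_D%d.json" % rank_id,
        ]
        jobs.append((env, argv, "eval_%d.out" % rank_id))
    return jobs


jobs = build_eval_jobs()
for env, argv, log_file in jobs[:2]:
    print(env["DEVICE_ID"], " ".join(argv), ">", log_file)
```

Each job reads the checkpoint directory of its own device and writes its own result file, so the eight runs do not share any output.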
@@ -0,0 +1,57 @@
#-*- coding:utf-8 -*-
# import matplotlib.pyplot as plt
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
import numpy as np
import pylab,json
import sys
# pylab.rcParams['figure.figsize'] = (10.0, 8.0)
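def get_img_id(file_name):
    """Return the unique image ids found in a COCO-format detection result file."""
    with open(file_name, 'r') as f:
        annos = json.load(f)
    # dict.fromkeys removes duplicates while keeping first-seen order
    return list(dict.fromkeys(anno['image_id'] for anno in annos))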
'''
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.317
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.562
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.321
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.162
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.343
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.448
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.278
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.438
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.464
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.275
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.497
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.625
'''
if __name__ == '__main__':
    annType = ['segm', 'bbox', 'keypoints']  # set iouType to 'segm', 'bbox' or 'keypoints'
    annType = annType[1]  # specify type here
    cocoGt_file = '/opt/npu/dataset/coco/coco2014/annotations/instances_val2014.json'
    cocoGt = COCO(cocoGt_file)  # load the ground-truth annotations as a COCO object
    # print(list(cocoGt.anns.items())[:10])
    # print(cocoGt.anns[318219])
    # input()
    # cocoDt_file = 'result.json'
    cocoDt_file = sys.argv[1]
    imgIds = get_img_id(cocoDt_file)
    # print(len(imgIds))
    cocoDt = cocoGt.loadRes(cocoDt_file)  # load the detection results as a COCO object
    imgIds = sorted(imgIds)  # sort the image ids that appear in the results
    # print(imgIds)
    # input()
    # imgIds = imgIds[0:5000]  # optionally restrict evaluation to a subset of images
    cocoEval = COCOeval(cocoGt, cocoDt, annType)
    cocoEval.params.imgIds = imgIds  # evaluate only on these images
    cocoEval.evaluate()  # run the per-image evaluation
    cocoEval.accumulate()  # accumulate the per-image results
    cocoEval.summarize()  # print the summary metrics
@@ -0,0 +1,155 @@
# coding: utf-8
# This script is modified from https://github.com/lars76/kmeans-anchor-boxes
from __future__ import division, print_function
import numpy as np
def iou(box, clusters):
    """
    Calculates the Intersection over Union (IoU) between a box and k clusters.
    param:
        box: tuple or array, shifted to the origin (i.e. width and height)
        clusters: numpy array of shape (k, 2) where k is the number of clusters
    return:
        numpy array of shape (k,) where k is the number of clusters
    """
    x = np.minimum(clusters[:, 0], box[0])
    y = np.minimum(clusters[:, 1], box[1])
    if np.count_nonzero(x == 0) > 0 or np.count_nonzero(y == 0) > 0:
        raise ValueError("Box has no area")
    intersection = x * y
    box_area = box[0] * box[1]
    cluster_area = clusters[:, 0] * clusters[:, 1]
    iou_ = np.true_divide(intersection, box_area + cluster_area - intersection + 1e-10)
    # iou_ = intersection / (box_area + cluster_area - intersection + 1e-10)
    return iou_
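Because both boxes are anchored at the origin, the intersection is simply the product of the smaller width and the smaller height. A worked example for a single box and a single cluster, written without numpy as a minimal sketch (`wh_iou` is a hypothetical name, and the `1e-10` stabilizer is omitted):

```python
def wh_iou(box, cluster):
    """IoU of two boxes given as (width, height), both anchored at the origin."""
    inter = min(box[0], cluster[0]) * min(box[1], cluster[1])
    union = box[0] * box[1] + cluster[0] * cluster[1] - inter
    return inter / union


# A 2x2 box fits entirely inside a 4x4 cluster: 4 / (4 + 16 - 4)
small_in_large = wh_iou((2, 2), (4, 4))
# Identical shapes overlap completely
identical = wh_iou((3, 5), (3, 5))
print(small_in_large, identical)
```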
def avg_iou(boxes, clusters):
    """
    Calculates the average Intersection over Union (IoU) between a numpy array of boxes and k clusters.
    param:
        boxes: numpy array of shape (r, 2), where r is the number of rows
        clusters: numpy array of shape (k, 2) where k is the number of clusters
    return:
        average IoU as a single float
    """
    return np.mean([np.max(iou(boxes[i], clusters)) for i in range(boxes.shape[0])])
def translate_boxes(boxes):
    """
    Translates all the boxes to the origin.
    param:
        boxes: numpy array of shape (r, 4)
    return:
        numpy array of shape (r, 2)
    """
    new_boxes = boxes.copy()
    for row in range(new_boxes.shape[0]):
        new_boxes[row][2] = np.abs(new_boxes[row][2] - new_boxes[row][0])
        new_boxes[row][3] = np.abs(new_boxes[row][3] - new_boxes[row][1])
    return np.delete(new_boxes, [0, 1], axis=1)
def kmeans(boxes, k, dist=np.median):
    """
    Calculates k-means clustering with the Intersection over Union (IoU) metric.
    param:
        boxes: numpy array of shape (r, 2), where r is the number of rows
        k: number of clusters
        dist: distance function
    return:
        numpy array of shape (k, 2)
    """
    rows = boxes.shape[0]
    distances = np.empty((rows, k))
    last_clusters = np.zeros((rows,))
    np.random.seed()
    # the Forgy method will fail if the whole array contains the same rows
    clusters = boxes[np.random.choice(rows, k, replace=False)]
    while True:
        for row in range(rows):
            distances[row] = 1 - iou(boxes[row], clusters)
        nearest_clusters = np.argmin(distances, axis=1)
        if (last_clusters == nearest_clusters).all():
            break
        for cluster in range(k):
            clusters[cluster] = dist(boxes[nearest_clusters == cluster], axis=0)
        last_clusters = nearest_clusters
    return clusters
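The clustering loop alternates between assigning every box to the cluster with the smallest `1 - IoU` distance and moving each cluster to the median width and height of its members. The pure-Python sketch below shows the same idea on a toy input. It is not the function above: the initial clusters are passed in explicitly so the run is deterministic (the original picks them at random), empty clusters keep their previous centre, and `kmeans_wh` and `wh_iou` are hypothetical names.

```python
from statistics import median


def wh_iou(box, cluster):
    """IoU of two (width, height) boxes anchored at the origin."""
    inter = min(box[0], cluster[0]) * min(box[1], cluster[1])
    return inter / (box[0] * box[1] + cluster[0] * cluster[1] - inter)


def kmeans_wh(boxes, init_clusters, max_iter=100):
    """k-means over (width, height) pairs with 1 - IoU as the distance."""
    clusters = [tuple(c) for c in init_clusters]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: nearest cluster under the 1 - IoU distance.
        new_assignment = [
            min(range(len(clusters)), key=lambda j: 1.0 - wh_iou(b, clusters[j]))
            for b in boxes
        ]
        if new_assignment == assignment:
            break  # converged: no box changed cluster
        assignment = new_assignment
        # Update step: per-dimension median of the member boxes.
        for j in range(len(clusters)):
            members = [b for b, a in zip(boxes, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centre
                clusters[j] = (median(m[0] for m in members),
                               median(m[1] for m in members))
    return clusters


toy_boxes = [(10, 12), (12, 10), (11, 11), (100, 90), (90, 100), (95, 95)]
toy_anchors = kmeans_wh(toy_boxes, init_clusters=[(10, 12), (100, 90)])
print(toy_anchors)
```

Using the median instead of the mean makes the cluster centres less sensitive to a few unusually large or small boxes.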
def parse_anno(annotation_path, target_size=None):
    result = []
    with open(annotation_path, 'r') as anno:
        for line in anno:
            s = line.strip().split(' ')
            img_w = int(s[2])
            img_h = int(s[3])
            s = s[4:]
            box_cnt = len(s) // 5
            for i in range(box_cnt):
                x_min, y_min, x_max, y_max = float(s[i*5+1]), float(s[i*5+2]), float(s[i*5+3]), float(s[i*5+4])
                width = x_max - x_min
                height = y_max - y_min
                assert width > 0
                assert height > 0
                # use letterbox resize, i.e. keep the original aspect ratio
                # get k-means anchors on the resized target image size
                if target_size is not None:
                    resize_ratio = min(target_size[0] / img_w, target_size[1] / img_h)
                    width *= resize_ratio
                    height *= resize_ratio
                    result.append([width, height])
                # get k-means anchors on the original image size
                else:
                    result.append([width, height])
    result = np.asarray(result)
    return result
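With letterbox resizing, one scale factor is applied to both dimensions, chosen so that the whole image fits inside the target size. A small worked example of that ratio, as a sketch with a hypothetical helper name and made-up image and box sizes:

```python
def letterbox_scale(box_wh, img_wh, target_wh):
    """Scale a (width, height) box by the letterbox ratio of its image."""
    # The smaller of the two ratios keeps the whole image inside the target.
    ratio = min(target_wh[0] / img_wh[0], target_wh[1] / img_wh[1])
    return box_wh[0] * ratio, box_wh[1] * ratio, ratio


# A 192x108 box in a 1920x1080 image, resized for a 416x416 network input.
w, h, ratio = letterbox_scale((192, 108), (1920, 1080), (416, 416))
print(round(ratio, 4), round(w, 2), round(h, 2))
```

Here the width is the limiting side, so the ratio is 416 / 1920 and the box keeps its 16:9 aspect ratio after scaling.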
def get_kmeans(anno, cluster_num=9):
    anchors = kmeans(anno, cluster_num)
    ave_iou = avg_iou(anno, anchors)
    anchors = anchors.astype('int').tolist()
    anchors = sorted(anchors, key=lambda x: x[0] * x[1])
    return anchors, ave_iou
if __name__ == '__main__':
    # target size format: [width, height]
    # if target_size is specified, the anchors are on the resized image scale
    # if target_size is set to None, the anchors are on the original image scale
    target_size = [416, 416]
    annotation_path = "train.txt"
    anno_result = parse_anno(annotation_path, target_size=target_size)
    anchors, ave_iou = get_kmeans(anno_result, 9)
    anchor_string = ''
    for anchor in anchors:
        anchor_string += '{},{}, '.format(anchor[0], anchor[1])
    anchor_string = anchor_string[:-2]
    print('anchors are:')
    print(anchor_string)
    print('the average iou is:')
    print(ave_iou)
@@ -0,0 +1,32 @@
{
"board_id": "0x002f",
"chip_info": "910",
"deploy_mode": "lab",
"group_count": "1",
"group_list": [
{
"device_num": "1",
"server_num": "1",
"group_name": "",
"instance_count": "1",
"instance_list": [
{
"devices": [
{
"device_id": "0",
"device_ip": "192.168.100.101"
}
],
"rank_id": "0",
"server_id": "0.0.0.0"
}
]
}
],
"para_plane_nic_location": "device",
"para_plane_nic_name": [
"eth0"
],
"para_plane_nic_num": "1",
"status": "completed"
}
@@ -0,0 +1,43 @@
{
"board_id": "0x002f",
"chip_info": "910",
"deploy_mode": "lab",
"group_count": "1",
"group_list": [
{
"device_num": "2",
"server_num": "1",
"group_name": "",
"instance_count": "2",
"instance_list": [
{
"devices": [
{
"device_id": "0",
"device_ip": "192.168.100.101"
}
],
"rank_id": "0",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "1",
"device_ip": "192.168.101.101"
}
],
"rank_id": "1",
"server_id": "0.0.0.0"
}
]
}
],
"para_plane_nic_location": "device",
"para_plane_nic_name": [
"eth0",
"eth1"
],
"para_plane_nic_num": "2",
"status": "completed"
}
@@ -0,0 +1,65 @@
{
"board_id": "0x002f",
"chip_info": "910",
"deploy_mode": "lab",
"group_count": "1",
"group_list": [
{
"device_num": "4",
"server_num": "1",
"group_name": "",
"instance_count": "4",
"instance_list": [
{
"devices": [
{
"device_id": "0",
"device_ip": "192.168.100.101"
}
],
"rank_id": "0",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "1",
"device_ip": "192.168.101.101"
}
],
"rank_id": "1",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "2",
"device_ip": "192.168.102.101"
}
],
"rank_id": "2",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "3",
"device_ip": "192.168.103.101"
}
],
"rank_id": "3",
"server_id": "0.0.0.0"
}
]
}
],
"para_plane_nic_location": "device",
"para_plane_nic_name": [
"eth0",
"eth1",
"eth2",
"eth3"
],
"para_plane_nic_num": "4",
"status": "completed"
}
@@ -0,0 +1,109 @@
{
"board_id": "0x002f",
"chip_info": "910",
"deploy_mode": "lab",
"group_count": "1",
"group_list": [
{
"device_num": "8",
"server_num": "1",
"group_name": "",
"instance_count": "8",
"instance_list": [
{
"devices": [
{
"device_id": "0",
"device_ip": "192.168.100.101"
}
],
"rank_id": "0",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "1",
"device_ip": "192.168.101.101"
}
],
"rank_id": "1",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "2",
"device_ip": "192.168.102.101"
}
],
"rank_id": "2",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "3",
"device_ip": "192.168.103.101"
}
],
"rank_id": "3",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "4",
"device_ip": "192.168.100.100"
}
],
"rank_id": "4",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "5",
"device_ip": "192.168.101.100"
}
],
"rank_id": "5",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "6",
"device_ip": "192.168.102.100"
}
],
"rank_id": "6",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "7",
"device_ip": "192.168.103.100"
}
],
"rank_id": "7",
"server_id": "0.0.0.0"
}
]
}
],
"para_plane_nic_location": "device",
"para_plane_nic_name": [
"eth0",
"eth1",
"eth2",
"eth3",
"eth4",
"eth5",
"eth6",
"eth7"
],
"para_plane_nic_num": "8",
"status": "completed"
}
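The four rank tables above differ only in the number of devices. A sketch of a generator that reproduces them is shown below. `make_rank_table` is a hypothetical helper, and the IP scheme is inferred from the files shown here (devices 0 to 3 end in `.101`, devices 4 to 7 end in `.100`); a real deployment should use the addresses actually configured on the server.

```python
import json


def make_rank_table(device_num):
    """Build a single-server rank table dict for 1 to 8 devices."""
    if not 1 <= device_num <= 8:
        raise ValueError("device_num must be between 1 and 8")
    instances = []
    for dev in range(device_num):
        # Third octet cycles through 100..103; the last octet depends on the half.
        ip = "192.168.%d.%d" % (100 + dev % 4, 101 if dev < 4 else 100)
        instances.append({
            "devices": [{"device_id": str(dev), "device_ip": ip}],
            "rank_id": str(dev),
            "server_id": "0.0.0.0",
        })
    return {
        "board_id": "0x002f",
        "chip_info": "910",
        "deploy_mode": "lab",
        "group_count": "1",
        "group_list": [{
            "device_num": str(device_num),
            "server_num": "1",
            "group_name": "",
            "instance_count": str(device_num),
            "instance_list": instances,
        }],
        "para_plane_nic_location": "device",
        "para_plane_nic_name": ["eth%d" % i for i in range(device_num)],
        "para_plane_nic_num": str(device_num),
        "status": "completed",
    }


table = make_rank_table(2)
print(json.dumps(table, indent=4))
```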
@@ -0,0 +1,88 @@
# coding: utf-8
# This file contains the parameters used in train.py
from __future__ import division, print_function
from utils.misc_utils import parse_anchors, read_class_names
import math
### Some paths
train_file = './data/my_data/train.txt' # The path of the training txt file.
val_file = './data/my_data/val.txt' # The path of the validation txt file.
restore_path = './data/darknet_weights/yolov3.ckpt' # The path of the weights to restore.
save_dir = './checkpoint/' # The directory of the weights to save.
log_dir = './data/logs/' # The directory to store the tensorboard log files.
progress_log_path = './data/progress.log' # The path to record the training progress.
anchor_path = './data/yolo_anchors.txt' # The path of the anchor txt file.
class_name_path = './data/voc.names' # The path of the class names.
### Training related numbers
batch_size = 6
img_size = [416, 416] # Images will be resized to `img_size` and fed to the network, size format: [width, height]
letterbox_resize = False # Whether to use the letterbox resize, i.e., keep the original aspect ratio in the resized image.
total_epoches = 100
train_evaluation_step = 100 # Evaluate on the training batch after some steps.
val_evaluation_epoch = 1 # Evaluate on the whole validation dataset after this many epochs. Set to None to evaluate every epoch.
save_epoch = 10 # Save the model after some epochs.
batch_norm_decay = 0.99 # decay in bn ops
weight_decay = 5e-4 # l2 weight decay
global_step = 0 # used when resuming training
### tf.data parameters
num_threads = 10 # Number of threads for image processing used in tf.data pipeline.
prefetech_buffer = 5 # Prefetech_buffer used in tf.data pipeline.
### Learning rate and optimizer
optimizer_name = 'momentum' # Chosen from [sgd, momentum, adam, rmsprop]
save_optimizer = False # Whether to save the optimizer parameters into the checkpoint file.
learning_rate_init = 1e-4
lr_type = 'piecewise' # Chosen from [fixed, exponential, cosine_decay, cosine_decay_restart, piecewise]
lr_decay_epoch = 5 # Epochs after which learning rate decays. Int or float. Used when lr_type is `exponential` or `cosine_decay_restart`.
lr_decay_factor = 0.96 # The learning rate decay factor. Used when lr_type is `exponential`.
lr_lower_bound = 1e-6 # The minimum learning rate.
# piecewise params
pw_boundaries = [25, 40] # epoch based boundaries
pw_values = [learning_rate_init, 3e-5, 1e-4]
### Load and finetune
# Choose the parts you want to restore the weights. List form.
# restore_include: None, restore_exclude: None => restore the whole model
# restore_include: None, restore_exclude: scope => restore the whole model except `scope`
# restore_include: scope1, restore_exclude: scope2 => if scope1 contains scope2, restore scope1 and not restore scope2 (scope1 - scope2)
# choice 1: only restore the darknet body
# restore_include = ['yolov3/darknet53_body']
# restore_exclude = None
# choice 2: restore all layers except the last 3 conv2d layers in 3 scale
restore_include = None
restore_exclude = ['yolov3/yolov3_head/Conv_14', 'yolov3/yolov3_head/Conv_6', 'yolov3/yolov3_head/Conv_22']
# Choose the parts you want to finetune. List form.
# Set to None to train the whole model.
update_part = None
### other training strategies
multi_scale_train = True # Whether to apply multi-scale training strategy. Image size varies from [320, 320] to [640, 640] by default.
use_label_smooth = True # Whether to use class label smoothing strategy.
use_focal_loss = True # Whether to apply focal loss on the conf loss.
use_mix_up = True # Whether to use mix up data augmentation strategy.
use_warm_up = True # Whether to use the warm-up strategy to prevent gradient explosion.
warm_up_epoch = 3 # Number of warm-up epochs. Set to a larger value if the gradients explode.
### some constants in validation
# nms
nms_threshold = 0.45 # iou threshold in nms operation
score_threshold = 0.01 # threshold of the probability of the classes in nms operation, i.e. score = pred_confs * pred_probs. set lower for higher recall.
nms_topk = 150 # keep at most nms_topk outputs after nms
# mAP eval
eval_threshold = 0.5 # the iou threshold applied in mAP evaluation
use_voc_07_metric = False # whether to use voc 2007 evaluation metric, i.e. the 11-point metric
### parse some params
anchors = parse_anchors(anchor_path)
classes = read_class_names(class_name_path)
class_num = len(classes)
# Count the samples with context managers so the files are closed again.
with open(train_file, 'r') as f:
    train_img_cnt = len(f.readlines())
with open(val_file, 'r') as f:
    val_img_cnt = len(f.readlines())
train_batch_num = int(math.ceil(float(train_img_cnt) / batch_size))
lr_decay_freq = int(train_batch_num * lr_decay_epoch)
pw_boundaries = [float(i) * train_batch_num + global_step for i in pw_boundaries]
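The last line converts the epoch-based boundaries into global step counts. The sketch below works through that conversion for a made-up dataset size and then looks up the learning rate for a given step. It assumes the schedule follows the usual piecewise-constant rule (the first value applies up to and including the first boundary); `piecewise_lr` and the dataset size are hypothetical, and train.py is not shown here.

```python
import math
from bisect import bisect_left


def piecewise_lr(step, boundaries, values):
    """values[i] applies while step <= boundaries[i]; the last value applies afterwards."""
    return values[bisect_left(boundaries, step)]


train_img_cnt, batch_size = 1000, 6  # hypothetical dataset size
global_step = 0
train_batch_num = int(math.ceil(float(train_img_cnt) / batch_size))  # steps per epoch
# Epoch boundaries [25, 40] expressed in steps.
boundaries = [float(e) * train_batch_num + global_step for e in [25, 40]]
values = [1e-4, 3e-5, 1e-4]  # same values as pw_values in this config

print(train_batch_num, boundaries)
for step in (4175, 4176, 7000):
    print(step, piecewise_lr(step, boundaries, values))
```

When training resumes from a checkpoint, `global_step` shifts every boundary by the number of steps already taken.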
@@ -0,0 +1,140 @@
# coding: utf-8
from __future__ import division, print_function
import tensorflow as tf
import numpy as np
import argparse
from tqdm import trange
from utils.data_utils import get_batch_data
from utils.misc_utils import parse_anchors, read_class_names, AverageMeter
from utils.eval_utils import evaluate_on_cpu, evaluate_on_gpu, get_preds_gpu, voc_eval, parse_gt_rec
from utils.nms_utils import gpu_nms
from model import yolov3
#################
# ArgumentParser
#################
parser = argparse.ArgumentParser(description="YOLO-V3 eval procedure.")
# some paths
parser.add_argument("--eval_file", type=str, default="./data/my_data/val.txt",
help="The path of the validation or test txt file.")
parser.add_argument("--restore_path", type=str, default="./data/checkpoint_whole_finetune_no_letterbox/best_model_Epoch_32_step_91046_mAP_0.8754_loss_2.2147_lr_3e-05",
help="The path of the weights to restore.")
parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt",
help="The path of the anchor txt file.")
parser.add_argument("--class_name_path", type=str, default="./data/voc.names",
help="The path of the class names.")
# some numbers
parser.add_argument("--img_size", nargs='*', type=int, default=[416, 416],
help="Resize the input image to `img_size`, size format: [width, height]")
parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=False,
help="Whether to use the letterbox resize.")
parser.add_argument("--num_threads", type=int, default=10,
help="Number of threads for image processing used in tf.data pipeline.")
parser.add_argument("--prefetech_buffer", type=int, default=5,
help="Prefetech_buffer used in tf.data pipeline.")
parser.add_argument("--nms_threshold", type=float, default=0.45,
help="IOU threshold in nms operation.")
parser.add_argument("--score_threshold", type=float, default=0.01,
help="Threshold of the probability of the classes in nms operation.")
parser.add_argument("--nms_topk", type=int, default=150,
help="Keep at most nms_topk outputs after nms.")
parser.add_argument("--use_voc_07_metric", type=lambda x: (str(x).lower() == 'true'), default=False,
help="Whether to use the voc 2007 mAP metrics.")
args = parser.parse_args()
# args params
args.anchors = parse_anchors(args.anchor_path)
args.classes = read_class_names(args.class_name_path)
args.class_num = len(args.classes)
with open(args.eval_file, 'r') as f:
    args.img_cnt = len(f.readlines())
# setting placeholders
is_training = tf.placeholder(dtype=tf.bool, name="phase_train")
handle_flag = tf.placeholder(tf.string, [], name='iterator_handle_flag')
pred_boxes_flag = tf.placeholder(tf.float32, [1, None, None])
pred_scores_flag = tf.placeholder(tf.float32, [1, None, None])
gpu_nms_op = gpu_nms(pred_boxes_flag, pred_scores_flag, args.class_num, args.nms_topk, args.score_threshold, args.nms_threshold)
##################
# tf.data pipeline
##################
val_dataset = tf.data.TextLineDataset(args.eval_file)
val_dataset = val_dataset.batch(1)
val_dataset = val_dataset.map(
    lambda x: tf.py_func(get_batch_data, [x, args.class_num, args.img_size, args.anchors, 'val', False, False, args.letterbox_resize], [tf.int64, tf.float32, tf.float32, tf.float32, tf.float32]),
    num_parallel_calls=args.num_threads
)
# Dataset transformations return a new dataset, so the result must be reassigned.
val_dataset = val_dataset.prefetch(args.prefetech_buffer)
iterator = val_dataset.make_one_shot_iterator()
image_ids, image, y_true_13, y_true_26, y_true_52 = iterator.get_next()
image_ids.set_shape([None])
y_true = [y_true_13, y_true_26, y_true_52]
image.set_shape([None, args.img_size[1], args.img_size[0], 3])
for y in y_true:
y.set_shape([None, None, None, None, None])
##################
# Model definition
##################
yolo_model = yolov3(args.class_num, args.anchors)
with tf.variable_scope('yolov3'):
pred_feature_maps = yolo_model.forward(image, is_training=is_training)
loss = yolo_model.compute_loss(pred_feature_maps, y_true)
y_pred = yolo_model.predict(pred_feature_maps)
saver_to_restore = tf.train.Saver()
with tf.Session() as sess:
sess.run([tf.global_variables_initializer()])
saver_to_restore.restore(sess, args.restore_path)
print('\n----------- start to eval -----------\n')
val_loss_total, val_loss_xy, val_loss_wh, val_loss_conf, val_loss_class = \
AverageMeter(), AverageMeter(), AverageMeter(), AverageMeter(), AverageMeter()
val_preds = []
for j in trange(args.img_cnt):
__image_ids, __y_pred, __loss = sess.run([image_ids, y_pred, loss], feed_dict={is_training: False})
pred_content = get_preds_gpu(sess, gpu_nms_op, pred_boxes_flag, pred_scores_flag, __image_ids, __y_pred)
val_preds.extend(pred_content)
val_loss_total.update(__loss[0])
val_loss_xy.update(__loss[1])
val_loss_wh.update(__loss[2])
val_loss_conf.update(__loss[3])
val_loss_class.update(__loss[4])
rec_total, prec_total, ap_total = AverageMeter(), AverageMeter(), AverageMeter()
gt_dict = parse_gt_rec(args.eval_file, args.img_size, args.letterbox_resize)
print('mAP eval:')
for ii in range(args.class_num):
npos, nd, rec, prec, ap = voc_eval(gt_dict, val_preds, ii, iou_thres=0.5, use_07_metric=args.use_voc_07_metric)
rec_total.update(rec, npos)
prec_total.update(prec, nd)
ap_total.update(ap, 1)
print('Class {}: Recall: {:.4f}, Precision: {:.4f}, AP: {:.4f}'.format(ii, rec, prec, ap))
mAP = ap_total.average
print('final mAP: {:.4f}'.format(mAP))
print("recall: {:.3f}, precision: {:.3f}".format(rec_total.average, prec_total.average))
print("total_loss: {:.3f}, loss_xy: {:.3f}, loss_wh: {:.3f}, loss_conf: {:.3f}, loss_class: {:.3f}".format(
val_loss_total.average, val_loss_xy.average, val_loss_wh.average, val_loss_conf.average, val_loss_class.average
))
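The final numbers are running averages: mAP weights every class equally (`update(ap, 1)`), while recall and precision are weighted by the number of ground-truth boxes and detections per class. `AverageMeter` comes from `utils.misc_utils` and is not shown here, so the class below is only a stand-in that assumes an `update(value, n)` interface and an `average` attribute.

```python
class AverageMeter(object):
    """Stand-in for utils.misc_utils.AverageMeter (assumed interface)."""

    def __init__(self):
        self.sum = 0.0
        self.count = 0
        self.average = 0.0

    def update(self, value, n=1):
        # n is the weight of this value, e.g. the number of boxes in a class.
        self.sum += value * n
        self.count += n
        if self.count:  # stay at 0.0 while nothing has been counted
            self.average = self.sum / float(self.count)


ap_total = AverageMeter()
rec_total = AverageMeter()
# Two hypothetical classes: (recall, ground-truth count, AP)
for rec, npos, ap in [(0.5, 10, 0.5), (1.0, 30, 1.0)]:
    ap_total.update(ap, 1)       # every class counts once
    rec_total.update(rec, npos)  # classes with more boxes count more

print(ap_total.average, rec_total.average)
```

In this toy case the mAP is the plain mean of the two AP values, whereas the overall recall is pulled toward the class with more ground-truth boxes.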
@@ -0,0 +1,20 @@
aeroplane
bicycle
bird
boat
bottle
bus
car
cat
chair
cow
diningtable
dog
horse
motorbike
person
pottedplant
sheep
sofa
train
tvmonitor
@@ -0,0 +1,96 @@
# coding: utf-8
import xml.etree.ElementTree as ET
import os
# Map each class name to its 0-based line index in voc_names.txt.
names_dict = {}
with open('./voc_names.txt', 'r') as f:
    for cnt, line in enumerate(f):
        names_dict[line.strip()] = cnt
voc_07 = '/data/VOCdevkit/VOC2007'
voc_12 = '/data/VOCdevkit/VOC2012'
anno_path = [os.path.join(voc_07, 'Annotations'), os.path.join(voc_12, 'Annotations')]
img_path = [os.path.join(voc_07, 'JPEGImages'), os.path.join(voc_12, 'JPEGImages')]
trainval_path = [os.path.join(voc_07, 'ImageSets/Main/trainval.txt'),
os.path.join(voc_12, 'ImageSets/Main/trainval.txt')]
test_path = [os.path.join(voc_07, 'ImageSets/Main/test.txt')]
def parse_xml(path):
    tree = ET.parse(path)
    img_name = path.split('/')[-1][:-4]
    height = tree.findtext("./size/height")
    width = tree.findtext("./size/width")
    objects = [img_name, width, height]
    for obj in tree.findall('object'):
        difficult = obj.find('difficult').text
        if difficult == '1':
            continue
        name = obj.find('name').text
        bbox = obj.find('bndbox')
        xmin = bbox.find('xmin').text
        ymin = bbox.find('ymin').text
        xmax = bbox.find('xmax').text
        ymax = bbox.find('ymax').text
        name = str(names_dict[name])
        objects.extend([name, xmin, ymin, xmax, ymax])
    # objects always starts with [img_name, width, height];
    # only return it when at least one box was added.
    if len(objects) > 3:
        return objects
    else:
        return None
test_cnt = 0
def gen_test_txt(txt_path):
    global test_cnt
    with open(txt_path, 'w') as f:
        for i, path in enumerate(test_path):
            with open(path, 'r') as name_file:
                img_names = name_file.readlines()
            for img_name in img_names:
                img_name = img_name.strip()
                xml_path = anno_path[i] + '/' + img_name + '.xml'
                objects = parse_xml(xml_path)
                if objects:
                    objects[0] = img_path[i] + '/' + img_name + '.jpg'
                    if os.path.exists(objects[0]):
                        objects.insert(0, str(test_cnt))
                        test_cnt += 1
                        objects = ' '.join(objects) + '\n'
                        f.write(objects)
train_cnt = 0
def gen_train_txt(txt_path):
    global train_cnt
    with open(txt_path, 'w') as f:
        for i, path in enumerate(trainval_path):
            with open(path, 'r') as name_file:
                img_names = name_file.readlines()
            for img_name in img_names:
                img_name = img_name.strip()
                xml_path = anno_path[i] + '/' + img_name + '.xml'
                objects = parse_xml(xml_path)
                if objects:
                    objects[0] = img_path[i] + '/' + img_name + '.jpg'
                    if os.path.exists(objects[0]):
                        objects.insert(0, str(train_cnt))
                        train_cnt += 1
                        objects = ' '.join(objects) + '\n'
                        f.write(objects)
gen_train_txt('train.txt')
gen_test_txt('val.txt')
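Each output line has the form `index image_path width height [class_id xmin ymin xmax ymax] ...`, and objects marked as difficult are skipped. The sketch below runs that conversion on an inline XML string, so it needs no dataset on disk. `voc_xml_to_line`, the sample XML, and the image path are made up for illustration; the class ids match the positions of `dog` and `person` in the class list above.

```python
import xml.etree.ElementTree as ET

VOC_XML = """<annotation>
  <size><width>500</width><height>375</height></size>
  <object><name>dog</name><difficult>0</difficult>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox></object>
  <object><name>person</name><difficult>1</difficult>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox></object>
</annotation>"""


def voc_xml_to_line(index, img_path, xml_text, names_dict):
    """Turn one VOC annotation into a single training/validation txt line."""
    root = ET.fromstring(xml_text)
    fields = [str(index), img_path,
              root.findtext("./size/width"), root.findtext("./size/height")]
    for obj in root.findall("object"):
        if obj.findtext("difficult") == "1":
            continue  # difficult objects are left out, as in parse_xml
        box = obj.find("bndbox")
        fields.append(str(names_dict[obj.findtext("name")]))
        fields.extend(box.findtext(tag) for tag in ("xmin", "ymin", "xmax", "ymax"))
    return " ".join(fields)


line = voc_xml_to_line(0, "JPEGImages/000001.jpg", VOC_XML, {"dog": 11, "person": 14})
print(line)
```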
@@ -0,0 +1,32 @@
# coding: utf-8
# This script is used to remove the optimizer parameters in the saved checkpoint files.
# These parameters are useless in the forward process.
# Removing them will shrink the checkpoint size a lot.
import sys
sys.path.append('..')
import os
import tensorflow as tf
from model import yolov3
# params
ckpt_path = ''
class_num = 20
save_dir = 'shrinked_ckpt'
if not os.path.exists(save_dir):
    os.makedirs(save_dir)
image = tf.placeholder(tf.float32, [1, 416, 416, 3])
yolo_model = yolov3(class_num, None)
with tf.variable_scope('yolov3'):
    pred_feature_maps = yolo_model.forward(image)
saver_to_restore = tf.train.Saver()
saver_to_save = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver_to_restore.restore(sess, ckpt_path)
    saver_to_save.save(sess, save_dir + '/shrinked')
@@ -0,0 +1,457 @@
# coding=utf-8
# for better understanding about yolov3 architecture, refer to this website (in Chinese):
# https://blog.csdn.net/leviopku/article/details/82660381
from __future__ import division, print_function
import tensorflow as tf
slim = tf.contrib.slim
from utils.layer_utils import conv2d, darknet53_body, yolo_block, upsample_layer
class yolov3(object):
def __init__(self, class_num, anchors, use_label_smooth=False, use_focal_loss=False, batch_norm_decay=0.999,
weight_decay=5e-4, use_static_shape=True,
img_size=(416, 416), batch_size=None):
# self.anchors = [[10, 13], [16, 30], [33, 23],
# [30, 61], [62, 45], [59, 119],
# [116, 90], [156, 198], [373, 326]]
self.class_num = class_num
self.anchors = anchors
self.batch_norm_decay = batch_norm_decay
self.use_label_smooth = use_label_smooth
self.use_focal_loss = use_focal_loss
self.weight_decay = weight_decay
# inference speed optimization
# if `use_static_shape` is True, use tensor.get_shape(), otherwise use tf.shape(tensor)
# static_shape is slightly faster
self.use_static_shape = use_static_shape
self.batch_size = batch_size
# self.img_size = (416, 416)
self.img_size = img_size
self.featrue_map_shape_base = [32, 16, 8]
self.featrue_map_shape = [(self.img_size[0] // i, self.img_size[1] // i) for i in self.featrue_map_shape_base]
def forward(self, inputs, is_training=False, reuse=False):
# the input img_size, form: [height, width]
# it will be used later
# self.img_size = tf.shape(inputs)[1:3]
# self.featrue_map_shape = [(self.img_size[0]//i, self.img_size[1]//i) for i in self.featrue_map_shape_base]
# set batch norm params
batch_norm_params = {
'decay': self.batch_norm_decay,
'epsilon': 1e-05,
'scale': True,
'is_training': is_training,
'fused': None, # Use fused batch norm if possible.
}
with slim.arg_scope([slim.conv2d, slim.batch_norm], reuse=reuse):
with slim.arg_scope([slim.conv2d],
normalizer_fn=slim.batch_norm,
normalizer_params=batch_norm_params,
biases_initializer=None,
activation_fn=lambda x: tf.nn.leaky_relu(x, alpha=0.1),
weights_regularizer=slim.l2_regularizer(self.weight_decay)):
with tf.variable_scope('darknet53_body'):
route_1, route_2, route_3 = darknet53_body(inputs)
with tf.variable_scope('yolov3_head'):
inter1, net = yolo_block(route_3, 512)
feature_map_1 = slim.conv2d(net, 3 * (5 + self.class_num), 1,
stride=1, normalizer_fn=None,
activation_fn=None, biases_initializer=tf.zeros_initializer())
feature_map_1 = tf.identity(feature_map_1, name='feature_map_1')
inter1 = conv2d(inter1, 256, 1)
inter1 = upsample_layer(inter1,
route_2.get_shape().as_list() if self.use_static_shape else tf.shape(
route_2))
concat1 = tf.concat([inter1, route_2], axis=3)
inter2, net = yolo_block(concat1, 256)
feature_map_2 = slim.conv2d(net, 3 * (5 + self.class_num), 1,
stride=1, normalizer_fn=None,
activation_fn=None, biases_initializer=tf.zeros_initializer())
feature_map_2 = tf.identity(feature_map_2, name='feature_map_2')
inter2 = conv2d(inter2, 128, 1)
inter2 = upsample_layer(inter2,
route_1.get_shape().as_list() if self.use_static_shape else tf.shape(
route_1))
concat2 = tf.concat([inter2, route_1], axis=3)
_, feature_map_3 = yolo_block(concat2, 128)
feature_map_3 = slim.conv2d(feature_map_3, 3 * (5 + self.class_num), 1,
stride=1, normalizer_fn=None,
activation_fn=None, biases_initializer=tf.zeros_initializer())
feature_map_3 = tf.identity(feature_map_3, name='feature_map_3')
return feature_map_1, feature_map_2, feature_map_3
def reorg_layer(self, feature_map, anchors):
'''
feature_map: a feature_map from [feature_map_1, feature_map_2, feature_map_3] returned
from `forward` function
anchors: shape: [3, 2]
'''
# NOTE: size in [h, w] format! don't get messed up!
grid_size = feature_map.get_shape().as_list()[1:3] if self.use_static_shape else tf.shape(feature_map)[
1:3] # [13, 13]
# the downscale ratio in height and width
# ratio = tf.cast(self.img_size / grid_size, tf.float32)
ratio = tf.cast([self.img_size[0] / grid_size[0], self.img_size[1] / grid_size[1]], tf.float32)
# rescale the anchors to the feature_map
# NOTE: the anchor is in [w, h] format!
rescaled_anchors = [(anchor[0] / ratio[1], anchor[1] / ratio[0]) for anchor in anchors]
feature_map = tf.reshape(feature_map, [-1, grid_size[0], grid_size[1], 3, 5 + self.class_num])
# split the feature_map along the last dimension
# shape info: take 416x416 input image and the 13*13 feature_map for example:
# box_centers: [N, 13, 13, 3, 2] last_dimension: [center_x, center_y]
# box_sizes: [N, 13, 13, 3, 2] last_dimension: [width, height]
# conf_logits: [N, 13, 13, 3, 1]
# prob_logits: [N, 13, 13, 3, class_num]
# box_centers, box_sizes, conf_logits, prob_logits = tf.split(feature_map, [2, 2, 1, self.class_num], axis=-1)
box_centers = feature_map[..., :2]
box_sizes = feature_map[..., 2:4]
conf_logits = feature_map[..., 4:5]
prob_logits = feature_map[..., 5:]
# conf_logits = tf.expand_dims(conf_logits, -1)
box_centers = tf.nn.sigmoid(box_centers)
# use some broadcast tricks to get the mesh coordinates
grid_x = tf.range(grid_size[1], dtype=tf.int32)
grid_y = tf.range(grid_size[0], dtype=tf.int32)
grid_x, grid_y = tf.meshgrid(grid_x, grid_y)
x_offset = tf.reshape(grid_x, (-1, 1))
y_offset = tf.reshape(grid_y, (-1, 1))
x_y_offset = tf.concat([x_offset, y_offset], axis=-1)
# shape: [13, 13, 1, 2]
x_y_offset = tf.cast(tf.reshape(x_y_offset, [grid_size[0], grid_size[1], 1, 2]), tf.float32)
# get the absolute box coordinates on the feature_map
box_centers = box_centers + x_y_offset
# rescale to the original image scale
box_centers = box_centers * ratio[::-1]
# NOTE: tf.exp is applied without clipping here; the tf.clip_by_value variant that guards against overflow is kept (disabled) below
box_sizes = tf.exp(box_sizes) * rescaled_anchors
# box_sizes = tf.clip_by_value(tf.exp(box_sizes), 1e-9, 100) * rescaled_anchors
# rescale to the original image scale
box_sizes = box_sizes * ratio[::-1]
# shape: [N, 13, 13, 3, 4]
# last dimension: (center_x, center_y, w, h)
boxes = tf.concat([box_centers, box_sizes], axis=-1)
# shape:
# x_y_offset: [13, 13, 1, 2]
# boxes: [N, 13, 13, 3, 4], rescaled to the original image scale
# conf_logits: [N, 13, 13, 3, 1]
# prob_logits: [N, 13, 13, 3, class_num]
return x_y_offset, boxes, conf_logits, prob_logits
def predict(self, feature_maps):
'''
Receive the returned feature_maps from `forward` function,
then produce the output predictions at the test stage.
'''
feature_map_1, feature_map_2, feature_map_3 = feature_maps
feature_map_anchors = [(feature_map_1, self.anchors[6:9]),
(feature_map_2, self.anchors[3:6]),
(feature_map_3, self.anchors[0:3])]
reorg_results = [self.reorg_layer(feature_map, anchors) for (feature_map, anchors) in feature_map_anchors]
def _reshape(result):
x_y_offset, boxes, conf_logits, prob_logits = result
grid_size = x_y_offset.get_shape().as_list()[:2] if self.use_static_shape else tf.shape(x_y_offset)[:2]
boxes = tf.reshape(boxes, [-1, grid_size[0] * grid_size[1] * 3, 4])
conf_logits = tf.reshape(conf_logits, [-1, grid_size[0] * grid_size[1] * 3, 1])
prob_logits = tf.reshape(prob_logits, [-1, grid_size[0] * grid_size[1] * 3, self.class_num])
# shape: (take 416*416 input image and feature_map_1 for example)
# boxes: [N, 13*13*3, 4]
# conf_logits: [N, 13*13*3, 1]
# prob_logits: [N, 13*13*3, class_num]
return boxes, conf_logits, prob_logits
boxes_list, confs_list, probs_list = [], [], []
for result in reorg_results:
boxes, conf_logits, prob_logits = _reshape(result)
confs = tf.sigmoid(conf_logits)
probs = tf.sigmoid(prob_logits)
boxes_list.append(boxes)
confs_list.append(confs)
probs_list.append(probs)
# collect results on three scales
# take 416*416 input image for example:
# shape: [N, (13*13+26*26+52*52)*3, 4]
boxes = tf.concat(boxes_list, axis=1)
# shape: [N, (13*13+26*26+52*52)*3, 1]
confs = tf.concat(confs_list, axis=1)
# shape: [N, (13*13+26*26+52*52)*3, class_num]
probs = tf.concat(probs_list, axis=1)
# center_x, center_y, width, height = tf.split(boxes, [1, 1, 1, 1], axis=-1)
# center_x = tf.expand_dims(boxes[..., 0], 2)
# center_y = tf.expand_dims(boxes[..., 1], 2)
# width = tf.expand_dims(boxes[..., 2], 2)
# height = tf.expand_dims(boxes[..., 3], 2)
center_x = boxes[..., 0:1]
center_y = boxes[..., 1:2]
width = boxes[..., 2:3]
height = boxes[..., 3:]
x_min = center_x - width / 2
y_min = center_y - height / 2
x_max = center_x + width / 2
y_max = center_y + height / 2
boxes = tf.concat([x_min, y_min, x_max, y_max], axis=-1)
return boxes, confs, probs
def loss_layer(self, feature_map_i, y_true, anchors, feature_map_shape_i, gt_box_i):
'''
calc loss function from a certain scale
input:
feature_map_i: feature maps of a certain scale. shape: [N, 13, 13, 3*(5 + num_class)] etc.
y_true: y_true from a certain scale. shape: [N, 13, 13, 3, 5 + num_class + 1] etc.
anchors: the 3 anchors assigned to this scale. shape: [3, 2]
'''
# size in [h, w] format! don't get messed up!
# grid_size = tf.shape(feature_map_i)[1:3]
grid_size = tf.shape(feature_map_i)[1:3]
# the downscale ratio in height and width
ratio = tf.cast(self.img_size / grid_size, tf.float32)
# N: batch_size
N = tf.cast(tf.shape(feature_map_i)[0], tf.float32)
x_y_offset, pred_boxes, pred_conf_logits, pred_prob_logits = self.reorg_layer(feature_map_i, anchors)
###########
# get mask
###########
# shape: take 416x416 input image and 13*13 feature_map for example:
# [N, 13, 13, 3, 1]
object_mask = y_true[..., 4:5]
# the calculation of the ignore mask is adapted from
# https://github.com/pjreddie/darknet/blob/master/src/yolo_layer.c#L179
# ignore_mask = tf.TensorArray(tf.float32, size=0, dynamic_size=True)
# def loop_cond(idx, ignore_mask):
# return tf.less(idx, tf.cast(N, tf.int32))
# def loop_body(idx, ignore_mask=None):
# # shape: [13, 13, 3, 4] & [13, 13, 3] ==> [V, 4]
# # V: num of true gt box of each image in a batch
# valid_true_boxes = tf.boolean_mask(y_true[idx, ..., 0:4], tf.cast(object_mask[idx, ..., 0], 'bool'))
# # shape: [13, 13, 3, 4] & [V, 4] ==> [13, 13, 3, V]
# iou = self.box_iou(pred_boxes[idx], valid_true_boxes)
# # shape: [13, 13, 3]
# best_iou = tf.reduce_max(iou, axis=-1)
# # shape: [13, 13, 3]
# ignore_mask_tmp = tf.cast(best_iou < 0.5, tf.float32)
# # finally will be shape: [N, 13, 13, 3]
# # ignore_mask = ignore_mask.write(idx, ignore_mask_tmp)
# if ignore_mask is None:
# ignore_mask = tf.expand_dims(ignore_mask_tmp, 0)
# else:
# ignore_mask = tf.concat([ignore_mask, tf.expand_dims(ignore_mask_tmp, 0)], 0)
# print(idx, ignore_mask)
# return idx + 1, ignore_mask
# ignore_mask = None
# _, ignore_mask = tf.while_loop(cond=loop_cond, body=loop_body, loop_vars=[0, ignore_mask])
# ignore_mask = ignore_mask.stack()
iou = self.box_iou(pred_boxes, gt_box_i) # [N, 13, 13, 3, 16]
best_iou = tf.reduce_max(iou, axis=-1) # [N, 13, 13, 3]
ignore_mask = tf.cast(best_iou < 0.5, tf.float32) # [N, 13, 13, 3]
# shape: [N, 13, 13, 3, 1]
ignore_mask = tf.expand_dims(ignore_mask, -1)
ignore_mask = tf.stop_gradient(ignore_mask)
# shape: [N, 13, 13, 3, 2]
pred_box_xy = pred_boxes[..., 0:2]
pred_box_wh = pred_boxes[..., 2:4]
# get xy coordinates in one cell from the feature_map
# numerical range: 0 ~ 1
# shape: [N, 13, 13, 3, 2]
print(y_true[..., 0:2], ratio[::-1], x_y_offset)
true_xy = y_true[..., 0:2] / ratio[::-1] - x_y_offset
pred_xy = pred_box_xy / ratio[::-1] - x_y_offset
# get_tw_th
# numerical range: 0 ~ 1
# shape: [N, 13, 13, 3, 2]
true_tw_th = y_true[..., 2:4] / anchors
pred_tw_th = pred_box_wh / anchors
# for numerical stability
true_tw_th = tf.where(condition=tf.equal(true_tw_th, 0),
x=tf.ones_like(true_tw_th), y=true_tw_th)
pred_tw_th = tf.where(condition=tf.equal(pred_tw_th, 0),
x=tf.ones_like(pred_tw_th), y=pred_tw_th)
true_tw_th = tf.log(tf.clip_by_value(true_tw_th, 1e-9, 1e9))
pred_tw_th = tf.log(tf.clip_by_value(pred_tw_th, 1e-9, 1e9))
# box size punishment:
# box with smaller area has bigger weight. This is taken from the yolo darknet C source code.
# shape: [N, 13, 13, 3, 1]
box_loss_scale = 2. - (y_true[..., 2:3] / tf.cast(self.img_size[1], tf.float32)) * (
y_true[..., 3:4] / tf.cast(self.img_size[0], tf.float32))
############
# loss_part
############
# mix_up weight
# mix_w = y_true[..., self.class_num+5]
# [N, 13, 13, 3, 1]
# mix_w = y_true[..., -1:]
mix_w = y_true[..., 5 + self.class_num:]  # equals 85: for the 80 COCO classes
# mix_w = tf.expand_dims(mix_w, -1)
# shape: [N, 13, 13, 3, 1]
xy_loss = tf.reduce_sum(tf.square(true_xy - pred_xy) * object_mask * box_loss_scale * mix_w) / N
wh_loss = tf.reduce_sum(tf.square(true_tw_th - pred_tw_th) * object_mask * box_loss_scale * mix_w) / N
# shape: [N, 13, 13, 3, 1]
conf_pos_mask = object_mask
conf_neg_mask = (1 - object_mask) * ignore_mask
conf_loss_pos = conf_pos_mask * tf.nn.sigmoid_cross_entropy_with_logits(labels=object_mask,
logits=pred_conf_logits)
conf_loss_neg = conf_neg_mask * tf.nn.sigmoid_cross_entropy_with_logits(labels=object_mask,
logits=pred_conf_logits)
# TODO: may need to balance the pos-neg by multiplying some weights
conf_loss = conf_loss_pos + conf_loss_neg
if self.use_focal_loss:
alpha = 1.0
gamma = 2.0
# TODO: alpha should be a mask array if needed
focal_mask = alpha * tf.pow(tf.abs(object_mask - tf.sigmoid(pred_conf_logits)), gamma)
conf_loss *= focal_mask
conf_loss = tf.reduce_sum(conf_loss * mix_w) / N
# shape: [N, 13, 13, 3, 1]
# whether to use label smooth
if self.use_label_smooth:
delta = 0.01
label_target = (1 - delta) * y_true[..., 5:(5 + self.class_num)] + delta * 1. / self.class_num
else:
label_target = y_true[..., 5:(5 + self.class_num)]
class_loss = object_mask * tf.nn.sigmoid_cross_entropy_with_logits(labels=label_target,
logits=pred_prob_logits) * mix_w
class_loss = tf.reduce_sum(class_loss) / N
return xy_loss, wh_loss, conf_loss, class_loss
def box_iou(self, pred_boxes, valid_true_boxes):
'''
param:
pred_boxes: [13, 13, 3, 4], (center_x, center_y, w, h)
valid_true: [1, 16, 4]
'''
# valid_true_boxes = tf.expand_dims(valid_true_boxes, -2)
# [13, 13, 3, 2]
pred_box_xy = pred_boxes[..., 0:2]
pred_box_wh = pred_boxes[..., 2:4]
# shape: [13, 13, 3, 1, 2]
pred_box_xy = tf.expand_dims(pred_box_xy, -2)
pred_box_wh = tf.expand_dims(pred_box_wh, -2)
print('##################pred_box_wh', pred_box_wh)
# [V, 2]
# N,H,W,A,C = valid_true_boxes.shape
# valid_true_boxes = tf.gather(valid_true_boxes, tf.where(object_mask))
# print(valid_true_boxes, object_mask)
# print(valid_true_boxes)
# input()
# valid_true_boxes = tf.reshape(valid_true_boxes, (self.batch_size, 1, 1, 3, -1, 4))
# x = tf.reshape(valid_true_boxes[..., 0], (self.batch_size, 3, -1))
# y = tf.reshape(valid_true_boxes[..., 1], (self.batch_size, 3, -1))
# w = tf.reshape(valid_true_boxes[..., 2], (self.batch_size, 3, -1))
# h = tf.reshape(valid_true_boxes[..., 3], (self.batch_size, 3, -1))
# valid_true_boxes = tf.stack([x,y,w,h], axis=-1)
valid_true_boxes = tf.expand_dims(valid_true_boxes, 1) # [1, 1, 16, 4]
valid_true_boxes = tf.expand_dims(valid_true_boxes, 1) # [1, 1, 1, 16, 4]
print('##################valid_true_boxes', valid_true_boxes)
# valid_true_boxes = tf.tile(valid_true_boxes, [1,H,W,1,1])
# print(valid_true_boxes)
# input()
true_box_xy = valid_true_boxes[..., :2] # [1, 1, 1, 16, 2]
true_box_wh = valid_true_boxes[..., 2:] # [1, 1, 1, 16, 2]
print('##################true_box_wh', true_box_wh)
# [13, 13, 3, 1, 2] & [1, 1, 1, 16, 2] ==> [13, 13, 3, 16, 2]
intersect_mins = tf.maximum(pred_box_xy - pred_box_wh / 2.,
true_box_xy - true_box_wh / 2.)
intersect_maxs = tf.minimum(pred_box_xy + pred_box_wh / 2.,
true_box_xy + true_box_wh / 2.)
intersect_wh = tf.maximum(intersect_maxs - intersect_mins, 0.)
print('##################intersect_mins', intersect_mins)
print('##################intersect_wh', intersect_wh)
# shape: [13, 13, 3, 16]
intersect_area = intersect_wh[..., 0] * intersect_wh[..., 1]
# shape: [13, 13, 3, 1]
pred_box_area = pred_box_wh[..., 0] * pred_box_wh[..., 1]
# shape: [1, 1, 1, 16]
true_box_area = true_box_wh[..., 0] * true_box_wh[..., 1]
# shape: [1, V]
# true_box_area = tf.expand_dims(true_box_area, -2)
print('##################intersect_area', intersect_area)
print('##################pred_box_area', pred_box_area)
print('##################true_box_area', true_box_area)
# [13, 13, 3, 16]
iou = intersect_area / (pred_box_area + true_box_area - intersect_area + 1e-10)
print('##################iou', iou)
# iou = tf.clip_by_value(iou, 0, 1)
# print(pred_box_xy, pred_box_wh)
# print(intersect_area , pred_box_area , true_box_area , intersect_area)
# print(iou)
# input()
return iou
def compute_loss(self, y_pred, y_true, gt_box):
'''
param:
y_pred: returned feature_map list by `forward` function: [feature_map_1, feature_map_2, feature_map_3]
y_true: input y_true by the tf.data pipeline
'''
loss_xy, loss_wh, loss_conf, loss_class = 0., 0., 0., 0.
anchor_group = [self.anchors[6:9], self.anchors[3:6], self.anchors[0:3]]
# calc loss in 3 scales
for i in range(len(y_pred)):
print('##################level', i)
result = self.loss_layer(y_pred[i], y_true[i], anchor_group[i], self.featrue_map_shape[i], gt_box[i])
loss_xy += result[0]
loss_wh += result[1]
loss_conf += result[2]
loss_class += result[3]
total_loss = loss_xy + loss_wh + loss_conf + loss_class
return [total_loss, loss_xy, loss_wh, loss_conf, loss_class]
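# ---------------------------------------------------------------------------
# Illustrative sketch only (not used by the model above). It spells out, in
# plain Python, the arithmetic that `box_iou` performs with broadcasting and
# the `best_iou < 0.5` test that `loss_layer` uses to build `ignore_mask`.
# Boxes are (center_x, center_y, w, h), as in `box_iou`. The helper names
# `sketch_center_iou` and `sketch_ignore_flag` are hypothetical and exist
# nowhere else in this repository.
# ---------------------------------------------------------------------------
def sketch_center_iou(box_a, box_b, eps=1e-10):
    """IoU of two boxes given as (center_x, center_y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # intersection rectangle: max of the mins, min of the maxes, clamped at 0
    inter_w = max(min(ax + aw / 2., bx + bw / 2.) - max(ax - aw / 2., bx - bw / 2.), 0.)
    inter_h = max(min(ay + ah / 2., by + bh / 2.) - max(ay - ah / 2., by - bh / 2.), 0.)
    inter_area = inter_w * inter_h
    # same denominator as box_iou: area_a + area_b - intersection + epsilon
    return inter_area / (aw * ah + bw * bh - inter_area + eps)


def sketch_ignore_flag(pred_box, gt_boxes, thresh=0.5):
    """1.0 if the prediction overlaps every ground-truth box by less than `thresh`, else 0.0."""
    best_iou = max([sketch_center_iou(pred_box, gt) for gt in gt_boxes] or [0.])
    return 1.0 if best_iou < thresh else 0.0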
@@ -0,0 +1,58 @@
#!/bin/bash
scriptDir=$(cd "$(dirname "$0")"; pwd)
currentDir=$(cd "$(dirname "$scriptDir")"; pwd)
# set env
source ${currentDir}/config/npu_set_env.sh
# setting main path
CODE_PATH=${currentDir}/code
# set env
export ASCEND_HOME=/usr/local/Ascend
export LD_LIBRARY_PATH=/usr/local/lib/:/usr/lib/:/usr/local/Ascend/ascend-toolkit/latest/fwkacllib/lib64:/usr/local/Ascend/driver/lib64/common/:/usr/local/Ascend/driver/lib64/driver/:/usr/local/Ascend/add-ons/
export PYTHONPATH=$PYTHONPATH:/usr/local/Ascend/ascend-toolkit/latest/opp/op_impl/built-in/ai_core/tbe:/usr/local/Ascend/ascend-toolkit/latest/fwkacllib/python/site-packages/te:/usr/local/Ascend/ascend-toolkit/latest/fwkacllib/python/site-packages/topi:/usr/local/Ascend/ascend-toolkit/latest/fwkacllib/python/site-packages/hccl:/usr/local/Ascend/ascend-toolkit/latest/tfplugin/python/site-packages:$currentDir
export PATH=$PATH:/usr/local/Ascend/ascend-toolkit/latest/fwkacllib/ccec_compiler/bin
export ASCEND_OPP_PATH=/usr/local/Ascend/ascend-toolkit/latest/opp/
export DDK_VERSION_FLAG=1.60.T49.0.B201
export NEW_GE_FE_ID=1
export GE_AICPU_FLAG=1
export SOC_VERSION=Ascend910
#export DUMP_GE_GRAPH=2
#export DUMP_GRAPH_LEVEL=3
#export PRINT_MODEL=1
export SLOG_PRINT_TO_STDOUT=0
# dump op data
#export DISABLE_REUSE_MEMORY=1
#export DUMP_OP=1
ulimit -c unlimited
# local variable
RANK_SIZE=$1
RANK_TABLE_FILE=./hccl_config/${RANK_SIZE}p.json
RANK_ID_START=0
SAVE_PATH=training/t1
# training stage
MODE=$2
for((RANK_ID=$RANK_ID_START;RANK_ID<$((RANK_SIZE+RANK_ID_START));RANK_ID++));
do
echo
su HwHiAiUser -c "adc --host 0.0.0.0:22118 --log \"SetLogLevel(0)[error]\" --device "$RANK_ID
TMP_PATH=$SAVE_PATH/D$RANK_ID
mkdir -p $TMP_PATH
cp run_yolov3.sh $TMP_PATH/
cp $RANK_TABLE_FILE $TMP_PATH/rank_table.json
cd $TMP_PATH
nohup bash run_yolov3.sh $RANK_ID $RANK_SIZE $CODE_PATH $MODE > train_$RANK_ID.log &
cd -
done
@@ -0,0 +1 @@
nohup bash npu_train.sh 1 multi &
@@ -0,0 +1 @@
nohup bash npu_train.sh 1 single &
@@ -0,0 +1 @@
nohup bash npu_train.sh 8 multi &
@@ -0,0 +1 @@
nohup bash npu_train.sh 8 single &
@@ -0,0 +1,50 @@
#clean slog
rm -rf /var/log/npu/slog/host-0/*.log
rm -rf /var/log/npu/slog/device-*/*.log
# setting main path
MAIN_PATH=$(dirname $(readlink -f $0))
# set env
export PYTHONPATH=/usr/local/Ascend/ops/op_impl/built-in/ai_core/tbe/:$MAIN_PATH/../../../
export LD_LIBRARY_PATH=/usr/local/lib/:/usr/lib/:/usr/local/Ascend/fwkacllib/lib64/:/usr/local/Ascend/driver/lib64/common/:/usr/local/Ascend/driver/lib64/driver/:/usr/local/Ascend/add-ons/:/usr/lib/x86_64-linux-gnu
PATH=$PATH:$HOME/bin
export PATH=$PATH:/usr/local/Ascend/fwkacllib/ccec_compiler/bin
export ASCEND_OPP_PATH=/usr/local/Ascend/opp
export DDK_VERSION_FLAG=1.60.T49.0.B201
export NEW_GE_FE_ID=1
export GE_AICPU_FLAG=1
export SOC_VERSION=Ascend910
export DUMP_GE_GRAPH=1
export DUMP_GRAPH_LEVEL=1
export PRINT_MODEL=1
#export SLOG_PRINT_TO_STDOUT=1
ulimit -c unlimited
# local variable
RANK_SIZE=$1
RANK_TABLE_FILE=./configs/${RANK_SIZE}p.json
RANK_ID_START=1
SAVE_PATH=training/t1
for((RANK_ID=$RANK_ID_START;RANK_ID<$((RANK_SIZE+RANK_ID_START));RANK_ID++));
do
echo
su HwHiAiUser -c "adc --host 0.0.0.0:22118 --log \"SetLogLevel(0)[debug]\" --device "$RANK_ID
TMP_PATH=$SAVE_PATH/D$RANK_ID
mkdir -p $TMP_PATH
cp run_yolov3.sh $TMP_PATH/
cp $RANK_TABLE_FILE $TMP_PATH/rank_table.json
cd $TMP_PATH
nohup bash run_yolov3.sh $RANK_ID $RANK_SIZE $MAIN_PATH > train_$RANK_ID.log &
cd -
done
@@ -0,0 +1,29 @@
#!/bin/bash
rm -rf Onnxgraph
rm -rf Partition
rm -rf OptimizeSubGraph
rm -rf Aicpu_Optimized
rm *txt
# RANK_ID must be set before it is used in the cleanup below
export RANK_ID=$1
rm -rf result_$RANK_ID
export RANK_SIZE=$2
export DEVICE_ID=$RANK_ID
export DEVICE_INDEX=$RANK_ID
export RANK_TABLE_FILE=rank_table.json
export JOB_ID=123678
export FUSION_TENSOR_SIZE=1000000000
KERNEL_NUM=20
PID_START=$((KERNEL_NUM * RANK_ID))
PID_END=$((PID_START + KERNEL_NUM - 1))
#sleep 5
taskset -c $PID_START-$PID_END python3 $3/train.py \
--mode $4
mkdir -p graph
mv *.txt graph
mv *.pbtxt graph
@@ -0,0 +1,57 @@
#export CUDA_VISIBLE_DEVICES=''
#export CUDA_VISIBLE_DEVICES=7
# setting main path
MAIN_PATH=$(dirname $(readlink -f $0))
# set env
export PYTHONPATH=/usr/local/Ascend/ops/op_impl/built-in/ai_core/tbe/:$MAIN_PATH/../../../
export LD_LIBRARY_PATH=/usr/local/lib/:/usr/lib/:/usr/local/Ascend/fwkacllib/lib64/:/usr/local/Ascend/driver/lib64/common/:/usr/local/Ascend/driver/lib64/driver/:/usr/local/Ascend/add-ons/:/usr/lib/x86_64-linux-gnu
PATH=$PATH:$HOME/bin
export PATH=$PATH:/usr/local/Ascend/fwkacllib/ccec_compiler/bin
export ASCEND_OPP_PATH=/usr/local/Ascend/opp
export DDK_VERSION_FLAG=1.60.T49.0.B201
export NEW_GE_FE_ID=1
export GE_AICPU_FLAG=1
export SOC_VERSION=Ascend910
export RANK_ID=7
export RANK_SIZE=1
export DEVICE_ID=$RANK_ID
export DEVICE_INDEX=$RANK_ID
export JOB_ID=10087
export FUSION_TENSOR_SIZE=1000000000
#export SLOG_PRINT_TO_STDOUT=1
#export DUMP_GE_GRAPH=2
#export DUMP_GRAPH_LEVEL=3
su HwHiAiUser -c "adc --host 0.0.0.0:22118 --log \"SetLogLevel(0)[debug]\" --device "$RANK_ID
#RESTORE_PATH=/opt/npu/wujianping/epoch200/
RESTORE_PATH=/opt/npu/w00558981/yolov3_ok_bak_zip/training/t1/D0/training/
#RESTORE_PATH=/opt/npu/w00558981/training_done_yolov3/training/t1/D0/training/model-epoch_200_step_182000_loss_20.7852_lr_0
while :
do
#python3.7 eval.py \
#--save_img True \
#--score_thresh 0.2 \
#--restore_path $RESTORE_PATH \
#--max_test 10 \
python3.7 eval.py \
--save_json True \
--score_thresh 0.001 \
--restore_path $RESTORE_PATH \
--max_test 10000
break
sleep 1200
done
@@ -0,0 +1,86 @@
# coding: utf-8
from __future__ import division, print_function
import tensorflow as tf
import numpy as np
import argparse
import cv2
from utils.misc_utils import parse_anchors, read_class_names
from utils.nms_utils import gpu_nms
from utils.plot_utils import get_color_table, plot_one_box
from utils.data_aug import letterbox_resize
from model import yolov3
parser = argparse.ArgumentParser(description="YOLO-V3 single image test procedure.")
parser.add_argument("input_image", type=str,
help="The path of the input image.")
parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt",
help="The path of the anchor txt file.")
parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416],
help="Resize the input image with `new_size`, size format: [width, height]")
parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=True,
help="Whether to use the letterbox resize.")
parser.add_argument("--class_name_path", type=str, default="./data/coco.names",
help="The path of the class names.")
parser.add_argument("--restore_path", type=str, default="./data/darknet_weights/yolov3.ckpt",
help="The path of the weights to restore.")
args = parser.parse_args()
args.anchors = parse_anchors(args.anchor_path)
args.classes = read_class_names(args.class_name_path)
args.num_class = len(args.classes)
color_table = get_color_table(args.num_class)
img_ori = cv2.imread(args.input_image)
if args.letterbox_resize:
img, resize_ratio, dw, dh = letterbox_resize(img_ori, args.new_size[0], args.new_size[1])
else:
height_ori, width_ori = img_ori.shape[:2]
img = cv2.resize(img_ori, tuple(args.new_size))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = np.asarray(img, np.float32)
img = img[np.newaxis, :] / 255.
with tf.Session() as sess:
input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data')
yolo_model = yolov3(args.num_class, args.anchors)
with tf.variable_scope('yolov3'):
pred_feature_maps = yolo_model.forward(input_data, False)
pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps)
pred_scores = pred_confs * pred_probs
boxes, scores, labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=200, score_thresh=0.3, nms_thresh=0.45)
saver = tf.train.Saver()
saver.restore(sess, args.restore_path)
boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img})
# rescale the coordinates to the original image
if args.letterbox_resize:
boxes_[:, [0, 2]] = (boxes_[:, [0, 2]] - dw) / resize_ratio
boxes_[:, [1, 3]] = (boxes_[:, [1, 3]] - dh) / resize_ratio
else:
boxes_[:, [0, 2]] *= (width_ori/float(args.new_size[0]))
boxes_[:, [1, 3]] *= (height_ori/float(args.new_size[1]))
print("box coords:")
print(boxes_)
print('*' * 30)
print("scores:")
print(scores_)
print('*' * 30)
print("labels:")
print(labels_)
for i in range(len(boxes_)):
x0, y0, x1, y1 = boxes_[i]
plot_one_box(img_ori, [x0, y0, x1, y1], label=args.classes[labels_[i]] + ', {:.2f}%'.format(scores_[i] * 100), color=color_table[labels_[i]])
cv2.imshow('Detection result', img_ori)
cv2.imwrite('detection_result.jpg', img_ori)
cv2.waitKey(0)
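# ---------------------------------------------------------------------------
# Illustrative sketch only. It restates, for a single box, the rescaling that
# the letterbox branch above applies to `boxes_`: subtract the padding, then
# divide by the resize ratio. It assumes, as the script implies, that `dw` and
# `dh` are the horizontal and vertical paddings in pixels and `resize_ratio`
# is the scale factor returned by `letterbox_resize`; the implementation of
# `letterbox_resize` is not shown here, so treat that reading as an assumption.
# The name `sketch_undo_letterbox` is hypothetical.
# ---------------------------------------------------------------------------
def sketch_undo_letterbox(box, resize_ratio, dw, dh):
    """Map (x0, y0, x1, y1) from the letterboxed image back to the original image."""
    x0, y0, x1, y1 = box
    return ((x0 - dw) / float(resize_ratio), (y0 - dh) / float(resize_ratio),
            (x1 - dw) / float(resize_ratio), (y1 - dh) / float(resize_ratio))


# Example under the stated assumptions: an 832x416 image letterboxed into
# 416x416 is scaled by 0.5 to 416x208 and padded by 104 rows top and bottom.
sketch_box_in_original = sketch_undo_letterbox((100., 154., 200., 204.), 0.5, 0, 104)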
@@ -0,0 +1,287 @@
# coding: utf-8
from __future__ import division, print_function
import tensorflow as tf
import numpy as np
import logging
from tqdm import trange
import random
import time
import datetime
from utils.data_utils import get_batch_data, color_jitter
from utils.misc_utils import shuffle_and_overwrite, make_summary, config_learning_rate, config_optimizer, AverageMeter
from utils.eval_utils import evaluate_on_cpu, evaluate_on_gpu, get_preds_gpu, voc_eval, parse_gt_rec
from model import yolov3
import os
import sys
# npu modified
from npu_bridge.estimator import npu_ops
from npu_bridge.estimator.npu.npu_optimizer import NPUDistributedOptimizer
from npu_bridge.estimator.npu.npu_loss_scale_optimizer import NPULossScaleOptimizer
from npu_bridge.estimator.npu.npu_loss_scale_manager import FixedLossScaleManager
from npu_bridge.estimator.npu.npu_loss_scale_manager import ExponentialUpdateLossScaleManager
from tensorflow.core.protobuf.rewriter_config_pb2 import RewriterConfig
from npu_bridge.estimator.npu import util
sys.path.append(os.path.join(os.path.abspath(os.path.dirname(__file__)),'../../../../../'))
sys.path.append(os.path.join(os.path.abspath(os.path.dirname(__file__)),'../../../../utils/atlasboost'))
from benchmark_log import hwlog
from benchmark_log.basic_utils import get_environment_info
from benchmark_log.basic_utils import get_model_parameter
import argparse
hwlog.ROOT_DIR = os.path.split(os.path.abspath(__file__))[0]
cpu_info, npu_info, framework_info, os_info, benchmark_version = get_environment_info("tensorflow")
config_info = get_model_parameter("tensorflow_config")
initinal_data={"base_lr": 0.128, "dataset": "coco1024", "optimizer": "Adam", "loss_scale": 512, "batchsize": 32}
hwlog.remark_print(key=hwlog.CPU_INFO, value=cpu_info)
hwlog.remark_print(key=hwlog.NPU_INFO, value=npu_info)
hwlog.remark_print(key=hwlog.OS_INFO, value=os_info)
hwlog.remark_print(key=hwlog.FRAMEWORK_INFO, value=framework_info)
hwlog.remark_print(key=hwlog.BENCHMARK_VERSION, value=benchmark_version)
hwlog.remark_print(key=hwlog.CONFIG_INFO, value=config_info)
hwlog.remark_print(key=hwlog.BASE_LR, value=initinal_data.get("base_lr"))
hwlog.remark_print(key=hwlog.DATASET, value=initinal_data.get("dataset"))
hwlog.remark_print(key=hwlog.OPT_NAME, value=initinal_data.get("optimizer"))
hwlog.remark_print(key=hwlog.LOSS_SCALE, value=initinal_data.get("loss_scale"))
hwlog.remark_print(key=hwlog.INPUT_BATCH_SIZE, value=initinal_data.get("batchsize"))
parser = argparse.ArgumentParser(description="YOLO-V3 training setting.")
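parser.add_argument("--mode", type=str, default='single', choices=['single', 'multi'],
help="Training mode: 'single' (single-scale) or 'multi' (multi-scale).")
# NOTE: type=bool would treat any non-empty string (including 'False') as True
parser.add_argument("--resume", type=lambda x: (str(x).lower() == 'true'), default=False,
help="Whether to resume training from the latest checkpoint in args.save_dir.")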
args_input = parser.parse_args()
if args_input.mode == 'single':
import args_single as args
elif args_input.mode == 'multi':
import args_multi as args
print('setting train mode %s.' %args_input.mode)
# setting loggers
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s %(levelname)s %(message)s',
datefmt='%a, %d %b %Y %H:%M:%S', filename=args.progress_log_path, filemode='w')
##################
# tf.data pipeline
##################
train_dataset = tf.data.TextLineDataset(args.train_file)
print('##########################args_input.rank_id', os.environ['RANK_ID'])
logging.info('shuffle seed_%s args.', os.environ['RANK_ID'])
train_dataset = train_dataset.shuffle(args.train_img_cnt, seed=int(os.environ['RANK_ID']),
reshuffle_each_iteration=True)
print('##########################args.train_img_cnt', args.train_img_cnt)
train_dataset = train_dataset.repeat()
train_dataset = train_dataset.batch(args.batch_size, drop_remainder=True) # npu modified
train_dataset = train_dataset.map(
lambda x: tf.py_func(get_batch_data,
inp=[x, args.class_num, args.img_size, args.anchors, 'train', args.multi_scale_train,
args.use_mix_up, args.letterbox_resize],
Tout=[tf.float32,
tf.float32, tf.float32, tf.float32,
tf.float32, tf.float32, tf.float32]),
num_parallel_calls=20
)
def valid_shape(*x):
image, y_true_13, y_true_26, y_true_52, gt_box_13, gt_box_26, gt_box_52 = x
y_true = [y_true_13, y_true_26, y_true_52]
gt_box = [gt_box_13, gt_box_26, gt_box_52]
# npu modified
if args_input.mode == 'single':
image.set_shape([args.batch_size, args.img_size[0], args.img_size[1], 3])
y_true[0].set_shape([args.batch_size, 13, 13, 3, 86])
y_true[1].set_shape([args.batch_size, 26, 26, 3, 86])
y_true[2].set_shape([args.batch_size, 52, 52, 3, 86])
elif args_input.mode == 'multi':
image.set_shape([args.batch_size, args.img_size[0], args.img_size[1], 3])
y_true[0].set_shape([args.batch_size, 19*1, 19*1, 3, 86])
y_true[1].set_shape([args.batch_size, 19*2, 19*2, 3, 86])
y_true[2].set_shape([args.batch_size, 19*4, 19*4, 3, 86])
gt_box[0].set_shape([args.batch_size, 1, 32, 4])
gt_box[1].set_shape([args.batch_size, 1, 64, 4])
gt_box[2].set_shape([args.batch_size, 1, 128, 4])
image = color_jitter(
image, brightness=0.125, contrast=0.5, saturation=0.5, hue=0.05)
return image, y_true_13, y_true_26, y_true_52, gt_box_13, gt_box_26, gt_box_52
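# ---------------------------------------------------------------------------
# Illustrative sketch only. The grid sizes hard-coded in `valid_shape`
# (13/26/52 in 'single' mode, 19/38/76 in 'multi' mode) are consistent with
# the three YOLOv3 output strides 32, 16 and 8 applied to a 416x416 and a
# 608x608 input respectively. The 608 figure is inferred from 19 * 32 and is
# an assumption; the actual value lives in args_multi, which is not shown.
# The name `sketch_grid_sizes` is hypothetical.
# ---------------------------------------------------------------------------
def sketch_grid_sizes(img_size, strides=(32, 16, 8)):
    """Return the feature-map size for each stride, coarsest scale first."""
    return [(img_size[0] // s, img_size[1] // s) for s in strides]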
train_dataset = train_dataset.map(valid_shape, num_parallel_calls=20)
train_dataset = train_dataset.prefetch(args.prefetech_buffer)
iterator = tf.data.Iterator.from_structure(train_dataset.output_types, train_dataset.output_shapes)
train_init_op = iterator.make_initializer(train_dataset)
# get an element from the chosen dataset iterator
image, y_true_13, y_true_26, y_true_52, gt_box_13, gt_box_26, gt_box_52 = iterator.get_next()
y_true = [y_true_13, y_true_26, y_true_52]
gt_box = [gt_box_13, gt_box_26, gt_box_52]
##################
# Model definition
##################
yolo_model = yolov3(args.class_num, args.anchors, args.use_label_smooth, args.use_focal_loss, args.batch_norm_decay,
args.weight_decay, use_static_shape=False,
batch_size=args.batch_size, img_size=args.img_size)
with tf.variable_scope('yolov3'):
pred_feature_maps = yolo_model.forward(image, is_training=True)
loss = yolo_model.compute_loss(pred_feature_maps, y_true, gt_box)
l2_loss = tf.losses.get_regularization_loss()
# setting restore parts and vars to update
saver_to_restore = tf.train.Saver(
var_list=tf.contrib.framework.get_variables_to_restore(include=args.restore_include, exclude=args.restore_exclude))
update_vars = tf.contrib.framework.get_variables_to_restore(include=args.update_part)
tf.summary.scalar('train_batch_statistics/total_loss', loss[0])
tf.summary.scalar('train_batch_statistics/loss_xy', loss[1])
tf.summary.scalar('train_batch_statistics/loss_wh', loss[2])
tf.summary.scalar('train_batch_statistics/loss_conf', loss[3])
tf.summary.scalar('train_batch_statistics/loss_class', loss[4])
tf.summary.scalar('train_batch_statistics/loss_l2', l2_loss)
tf.summary.scalar('train_batch_statistics/loss_ratio', l2_loss / loss[0])
def learning_rate_fn(global_step):
"""Builds the learning rate: linear warm-up for args.warm_up_epoch epochs, then cosine decay to zero."""
initial_learning_rate = args.learning_rate_init
batches_per_epoch = args.train_batch_num // args.iterations_per_loop * args.iterations_per_loop
total_steps = int(args.total_epoches * batches_per_epoch)
warmup_steps = int(batches_per_epoch * args.warm_up_epoch)
tf.compat.v1.logging.info('total_steps: %d', int(total_steps))
tf.compat.v1.logging.info('warmup_steps: %d', int(warmup_steps))
lr = tf.maximum(
tf.compat.v1.train.cosine_decay(
learning_rate=initial_learning_rate,
global_step=global_step - warmup_steps,
decay_steps=total_steps - warmup_steps,
),
0,
)
warmup_lr = (
initial_learning_rate * tf.cast(global_step, tf.float32) / tf.cast(
warmup_steps, tf.float32))
return tf.cond(pred=global_step < warmup_steps,
true_fn=lambda: warmup_lr,
false_fn=lambda: lr)
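# ---------------------------------------------------------------------------
# Illustrative sketch only (not used by the training graph). It mirrors
# `learning_rate_fn` in plain Python so the schedule can be inspected without
# a session: linear warm-up, then cosine decay. The decay term follows the
# documented formula of tf.compat.v1.train.cosine_decay with its default
# alpha=0.0. The name `sketch_learning_rate` is hypothetical.
# ---------------------------------------------------------------------------
import math


def sketch_learning_rate(step, base_lr, warmup_steps, total_steps):
    """Learning rate at `step` for a warm-up plus cosine-decay schedule."""
    if step < warmup_steps:
        # linear ramp from 0 to base_lr
        return base_lr * step / float(warmup_steps)
    decay_steps = total_steps - warmup_steps
    # cosine_decay clamps the step to decay_steps, so the rate stays at 0 afterwards
    progress = min(step - warmup_steps, decay_steps) / float(decay_steps)
    return max(base_lr * 0.5 * (1.0 + math.cos(math.pi * progress)), 0.0)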
global_step = tf.train.get_or_create_global_step()
learning_rate = learning_rate_fn(global_step)
tf.summary.scalar('learning_rate', learning_rate)
if not args.save_optimizer:
saver_to_save = tf.train.Saver()
saver_best = tf.train.Saver()
optimizer = config_optimizer(args.optimizer_name, learning_rate)
optimizer = NPUDistributedOptimizer(optimizer)
loss_scale_manager = FixedLossScaleManager(loss_scale=128)
if args.num_gpus > 1:
optimizer = NPULossScaleOptimizer(optimizer, loss_scale_manager, is_distributed=True)
else:
optimizer = NPULossScaleOptimizer(optimizer, loss_scale_manager, is_distributed=False)
# set dependencies for BN ops
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
# apply gradient clip to avoid gradient exploding
gvs = optimizer.compute_gradients(loss[0] + l2_loss, var_list=update_vars)
clip_grad_var = [gv if gv[0] is None else [
tf.clip_by_norm(gv[0], 100.), gv[1]] for gv in gvs]
train_op = optimizer.apply_gradients(clip_grad_var, global_step=tf.train.get_global_step())
if args.save_optimizer:
print(
'Saving optimizer parameters to checkpoint! Remember to restore the global_step in the fine-tuning afterwards.')
saver_to_save = tf.train.Saver()
saver_best = tf.train.Saver()
# npu modified
config = tf.ConfigProto()
custom_op = config.graph_options.rewrite_options.custom_optimizers.add()
custom_op.name = "NpuOptimizer"
custom_op.parameter_map["use_off_line"].b = True # training on Ascend chips
custom_op.parameter_map["enable_data_pre_proc"].b = True
custom_op.parameter_map["iterations_per_loop"].i = args.iterations_per_loop
config.graph_options.rewrite_options.remapping = RewriterConfig.OFF
with tf.Session(config=config) as sess:
# initialize all variables before YOLOv3 fine-tuning (from darknet53.ckpt)
sess.run([tf.global_variables_initializer(), tf.local_variables_initializer()])
# resume training from the latest checkpoint when --resume is enabled
if args_input.resume:
saver_to_restore = tf.train.Saver()
saver_to_restore.restore(sess, tf.train.latest_checkpoint(args.save_dir))
else:
saver_to_restore.restore(sess, args.restore_path)
merged = tf.summary.merge_all()
writer = tf.summary.FileWriter(args.log_dir, sess.graph)
print('\n----------- start to train -----------\n')
#hwlog.logger.info("time_ts:%s, hardware:%s current os:%s" %(date_time,'Ascend910','Ubuntu 18.04'))
#hwlog.logger.info("time_ts:%s, framework is tensorflow 1.15.0 " %(date_time))
#remark_logger.info("ABK time_ts: %s, yolov3 %s model train begain, total train_epoches:%d, file: %s, lineno: %s" %(date_time,args_input.mode,args.total_epoches,file_name,sys._getframe().f_lineno))
hwlog.remark_print(key=hwlog.TOTAL_TRAIN_EPOCH, value=f"{args.total_epoches}")
best_mAP = -np.Inf
train_op = util.set_iteration_per_loop(sess, train_op, args.iterations_per_loop)
sess.run(train_init_op)
for epoch in range(args.total_epoches):
loss_total, loss_xy, loss_wh, loss_conf, loss_class = AverageMeter(), AverageMeter(), AverageMeter(), AverageMeter(), AverageMeter()
for i in trange(args.train_batch_num // args.iterations_per_loop):
t = time.time()
_, summary, __y_true, __loss, __global_step, __lr = sess.run(
[train_op, merged, y_true, loss, global_step, learning_rate]
)
fps = 1 / (time.time() - t) * args.iterations_per_loop * args.num_gpus * args.batch_size
writer.add_summary(summary, global_step=__global_step)
loss_total.update(__loss[0], len(__y_true[0]))
loss_xy.update(__loss[1], len(__y_true[0]))
loss_wh.update(__loss[2], len(__y_true[0]))
loss_conf.update(__loss[3], len(__y_true[0]))
loss_class.update(__loss[4], len(__y_true[0]))
info = "Epoch: {}, global_step: {} fps: {:.2f} lr: {:.5f} | loss: total: {:.2f}, xy: {:.2f}, wh: {:.2f}, conf: {:.2f}, class: {:.2f} | ".format(
epoch, int(__global_step), fps, __lr, loss_total.average, loss_xy.average, loss_wh.average,
loss_conf.average,
loss_class.average)
print(info)
logging.info(info)
#remark_logger.info("ABK time_ts:%s, global_steps %d, learning rate %2f, file: %s, lineno: %s" %(date_time,int(__global_step),__lr,file_name,sys._getframe().f_lineno))
#remark_logger.info("ABK time_ts:%s, fps %2f, loss_total %2f, file: %s, lineno: %s" %(date_time,fps,loss_total.average,file_name,sys._getframe().f_lineno))
hwlog.remark_print(key=hwlog.FPS, value=f"{fps}")
hwlog.remark_print(key=hwlog.GLOBAL_STEP, value=f"{int(__global_step)}")
# NOTE: this is just demo. You can set the conditions when to save the weights.
temp_epoch = epoch + 1
if temp_epoch % args.save_epoch == 0 and epoch > 0:
saver_to_save.save(sess, args.save_dir + 'model-epoch_{}_step_{}_loss_{:.4f}_lr_{:.5g}'.format( \
temp_epoch,
int(__global_step),
loss_total.average,
__lr))
if __lr <= 0:
break
saver_to_save.save(sess, args.save_dir + 'model-final_step_{}_loss_{:.4f}_lr_{:.5g}'.format( \
int(__global_step),
loss_total.average,
__lr))
@@ -0,0 +1,109 @@
{
"board_id": "0x002f",
"chip_info": "910",
"deploy_mode": "lab",
"group_count": "1",
"group_list": [
{
"device_num": "8",
"server_num": "1",
"group_name": "",
"instance_count": "8",
"instance_list": [
{
"devices": [
{
"device_id": "0",
"device_ip": "192.168.100.101"
}
],
"rank_id": "0",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "1",
"device_ip": "192.168.101.101"
}
],
"rank_id": "1",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "2",
"device_ip": "192.168.102.101"
}
],
"rank_id": "2",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "3",
"device_ip": "192.168.103.101"
}
],
"rank_id": "3",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "4",
"device_ip": "192.168.100.100"
}
],
"rank_id": "4",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "5",
"device_ip": "192.168.101.100"
}
],
"rank_id": "5",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "6",
"device_ip": "192.168.102.100"
}
],
"rank_id": "6",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "7",
"device_ip": "192.168.103.100"
}
],
"rank_id": "7",
"server_id": "0.0.0.0"
}
]
}
],
"para_plane_nic_location": "device",
"para_plane_nic_name": [
"eth0",
"eth1",
"eth2",
"eth3",
"eth4",
"eth5",
"eth6",
"eth7"
],
"para_plane_nic_num": "8",
"status": "completed"
}
@@ -0,0 +1,29 @@
#!/bin/bash
rm -rf Onnxgraph
rm -rf Partition
rm -rf OptimizeSubGraph
rm -rf Aicpu_Optimized
rm *txt
export RANK_ID=$1
rm -rf result_$RANK_ID
export RANK_SIZE=$2
export DEVICE_ID=$RANK_ID
export DEVICE_INDEX=$RANK_ID
export RANK_TABLE_FILE=rank_table.json
export JOB_ID=123678
export FUSION_TENSOR_SIZE=1000000000
KERNEL_NUM=20
PID_START=$((KERNEL_NUM * RANK_ID))
PID_END=$((PID_START + KERNEL_NUM - 1))
#sleep 5
taskset -c $PID_START-$PID_END python3 $3/train.py \
--mode $4
mkdir graph
mv *.txt graph
mv *.pbtxt graph
@@ -0,0 +1,109 @@
{
"board_id": "0x002f",
"chip_info": "910",
"deploy_mode": "lab",
"group_count": "1",
"group_list": [
{
"device_num": "8",
"server_num": "1",
"group_name": "",
"instance_count": "8",
"instance_list": [
{
"devices": [
{
"device_id": "0",
"device_ip": "192.168.100.101"
}
],
"rank_id": "0",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "1",
"device_ip": "192.168.101.101"
}
],
"rank_id": "1",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "2",
"device_ip": "192.168.102.101"
}
],
"rank_id": "2",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "3",
"device_ip": "192.168.103.101"
}
],
"rank_id": "3",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "4",
"device_ip": "192.168.100.100"
}
],
"rank_id": "4",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "5",
"device_ip": "192.168.101.100"
}
],
"rank_id": "5",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "6",
"device_ip": "192.168.102.100"
}
],
"rank_id": "6",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "7",
"device_ip": "192.168.103.100"
}
],
"rank_id": "7",
"server_id": "0.0.0.0"
}
]
}
],
"para_plane_nic_location": "device",
"para_plane_nic_name": [
"eth0",
"eth1",
"eth2",
"eth3",
"eth4",
"eth5",
"eth6",
"eth7"
],
"para_plane_nic_num": "8",
"status": "completed"
}
@@ -0,0 +1,29 @@
#!/bin/bash
rm -rf Onnxgraph
rm -rf Partition
rm -rf OptimizeSubGraph
rm -rf Aicpu_Optimized
rm *txt
export RANK_ID=$1
rm -rf result_$RANK_ID
export RANK_SIZE=$2
export DEVICE_ID=$RANK_ID
export DEVICE_INDEX=$RANK_ID
export RANK_TABLE_FILE=rank_table.json
export JOB_ID=123678
export FUSION_TENSOR_SIZE=1000000000
KERNEL_NUM=20
PID_START=$((KERNEL_NUM * RANK_ID))
PID_END=$((PID_START + KERNEL_NUM - 1))
#sleep 5
taskset -c $PID_START-$PID_END python3 $3/train.py \
--mode $4
mkdir graph
mv *.txt graph
mv *.pbtxt graph
@@ -0,0 +1,109 @@
{
"board_id": "0x002f",
"chip_info": "910",
"deploy_mode": "lab",
"group_count": "1",
"group_list": [
{
"device_num": "8",
"server_num": "1",
"group_name": "",
"instance_count": "8",
"instance_list": [
{
"devices": [
{
"device_id": "0",
"device_ip": "192.168.100.101"
}
],
"rank_id": "0",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "1",
"device_ip": "192.168.101.101"
}
],
"rank_id": "1",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "2",
"device_ip": "192.168.102.101"
}
],
"rank_id": "2",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "3",
"device_ip": "192.168.103.101"
}
],
"rank_id": "3",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "4",
"device_ip": "192.168.100.100"
}
],
"rank_id": "4",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "5",
"device_ip": "192.168.101.100"
}
],
"rank_id": "5",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "6",
"device_ip": "192.168.102.100"
}
],
"rank_id": "6",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "7",
"device_ip": "192.168.103.100"
}
],
"rank_id": "7",
"server_id": "0.0.0.0"
}
]
}
],
"para_plane_nic_location": "device",
"para_plane_nic_name": [
"eth0",
"eth1",
"eth2",
"eth3",
"eth4",
"eth5",
"eth6",
"eth7"
],
"para_plane_nic_num": "8",
"status": "completed"
}
@@ -0,0 +1,29 @@
#!/bin/bash
rm -rf Onnxgraph
rm -rf Partition
rm -rf OptimizeSubGraph
rm -rf Aicpu_Optimized
rm *txt
export RANK_ID=$1
rm -rf result_$RANK_ID
export RANK_SIZE=$2
export DEVICE_ID=$RANK_ID
export DEVICE_INDEX=$RANK_ID
export RANK_TABLE_FILE=rank_table.json
export JOB_ID=123678
export FUSION_TENSOR_SIZE=1000000000
KERNEL_NUM=20
PID_START=$((KERNEL_NUM * RANK_ID))
PID_END=$((PID_START + KERNEL_NUM - 1))
#sleep 5
taskset -c $PID_START-$PID_END python3 $3/train.py \
--mode $4
mkdir graph
mv *.txt graph
mv *.pbtxt graph
@@ -0,0 +1,109 @@
{
"board_id": "0x002f",
"chip_info": "910",
"deploy_mode": "lab",
"group_count": "1",
"group_list": [
{
"device_num": "8",
"server_num": "1",
"group_name": "",
"instance_count": "8",
"instance_list": [
{
"devices": [
{
"device_id": "0",
"device_ip": "192.168.100.101"
}
],
"rank_id": "0",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "1",
"device_ip": "192.168.101.101"
}
],
"rank_id": "1",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "2",
"device_ip": "192.168.102.101"
}
],
"rank_id": "2",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "3",
"device_ip": "192.168.103.101"
}
],
"rank_id": "3",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "4",
"device_ip": "192.168.100.100"
}
],
"rank_id": "4",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "5",
"device_ip": "192.168.101.100"
}
],
"rank_id": "5",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "6",
"device_ip": "192.168.102.100"
}
],
"rank_id": "6",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "7",
"device_ip": "192.168.103.100"
}
],
"rank_id": "7",
"server_id": "0.0.0.0"
}
]
}
],
"para_plane_nic_location": "device",
"para_plane_nic_name": [
"eth0",
"eth1",
"eth2",
"eth3",
"eth4",
"eth5",
"eth6",
"eth7"
],
"para_plane_nic_num": "8",
"status": "completed"
}
@@ -0,0 +1,29 @@
#!/bin/bash
rm -rf Onnxgraph
rm -rf Partition
rm -rf OptimizeSubGraph
rm -rf Aicpu_Optimized
rm *txt
export RANK_ID=$1
rm -rf result_$RANK_ID
export RANK_SIZE=$2
export DEVICE_ID=$RANK_ID
export DEVICE_INDEX=$RANK_ID
export RANK_TABLE_FILE=rank_table.json
export JOB_ID=123678
export FUSION_TENSOR_SIZE=1000000000
KERNEL_NUM=20
PID_START=$((KERNEL_NUM * RANK_ID))
PID_END=$((PID_START + KERNEL_NUM - 1))
#sleep 5
taskset -c $PID_START-$PID_END python3 $3/train.py \
--mode $4
mkdir graph
mv *.txt graph
mv *.pbtxt graph
@@ -0,0 +1,109 @@
{
"board_id": "0x002f",
"chip_info": "910",
"deploy_mode": "lab",
"group_count": "1",
"group_list": [
{
"device_num": "8",
"server_num": "1",
"group_name": "",
"instance_count": "8",
"instance_list": [
{
"devices": [
{
"device_id": "0",
"device_ip": "192.168.100.101"
}
],
"rank_id": "0",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "1",
"device_ip": "192.168.101.101"
}
],
"rank_id": "1",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "2",
"device_ip": "192.168.102.101"
}
],
"rank_id": "2",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "3",
"device_ip": "192.168.103.101"
}
],
"rank_id": "3",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "4",
"device_ip": "192.168.100.100"
}
],
"rank_id": "4",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "5",
"device_ip": "192.168.101.100"
}
],
"rank_id": "5",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "6",
"device_ip": "192.168.102.100"
}
],
"rank_id": "6",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "7",
"device_ip": "192.168.103.100"
}
],
"rank_id": "7",
"server_id": "0.0.0.0"
}
]
}
],
"para_plane_nic_location": "device",
"para_plane_nic_name": [
"eth0",
"eth1",
"eth2",
"eth3",
"eth4",
"eth5",
"eth6",
"eth7"
],
"para_plane_nic_num": "8",
"status": "completed"
}
@@ -0,0 +1,29 @@
#!/bin/bash
rm -rf Onnxgraph
rm -rf Partition
rm -rf OptimizeSubGraph
rm -rf Aicpu_Optimized
rm *txt
export RANK_ID=$1
rm -rf result_$RANK_ID
export RANK_SIZE=$2
export DEVICE_ID=$RANK_ID
export DEVICE_INDEX=$RANK_ID
export RANK_TABLE_FILE=rank_table.json
export JOB_ID=123678
export FUSION_TENSOR_SIZE=1000000000
KERNEL_NUM=20
PID_START=$((KERNEL_NUM * RANK_ID))
PID_END=$((PID_START + KERNEL_NUM - 1))
#sleep 5
taskset -c $PID_START-$PID_END python3 $3/train.py \
--mode $4
mkdir graph
mv *.txt graph
mv *.pbtxt graph
@@ -0,0 +1,109 @@
{
"board_id": "0x002f",
"chip_info": "910",
"deploy_mode": "lab",
"group_count": "1",
"group_list": [
{
"device_num": "8",
"server_num": "1",
"group_name": "",
"instance_count": "8",
"instance_list": [
{
"devices": [
{
"device_id": "0",
"device_ip": "192.168.100.101"
}
],
"rank_id": "0",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "1",
"device_ip": "192.168.101.101"
}
],
"rank_id": "1",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "2",
"device_ip": "192.168.102.101"
}
],
"rank_id": "2",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "3",
"device_ip": "192.168.103.101"
}
],
"rank_id": "3",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "4",
"device_ip": "192.168.100.100"
}
],
"rank_id": "4",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "5",
"device_ip": "192.168.101.100"
}
],
"rank_id": "5",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "6",
"device_ip": "192.168.102.100"
}
],
"rank_id": "6",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "7",
"device_ip": "192.168.103.100"
}
],
"rank_id": "7",
"server_id": "0.0.0.0"
}
]
}
],
"para_plane_nic_location": "device",
"para_plane_nic_name": [
"eth0",
"eth1",
"eth2",
"eth3",
"eth4",
"eth5",
"eth6",
"eth7"
],
"para_plane_nic_num": "8",
"status": "completed"
}
@@ -0,0 +1,29 @@
#!/bin/bash
rm -rf Onnxgraph
rm -rf Partition
rm -rf OptimizeSubGraph
rm -rf Aicpu_Optimized
rm *txt
export RANK_ID=$1
rm -rf result_$RANK_ID
export RANK_SIZE=$2
export DEVICE_ID=$RANK_ID
export DEVICE_INDEX=$RANK_ID
export RANK_TABLE_FILE=rank_table.json
export JOB_ID=123678
export FUSION_TENSOR_SIZE=1000000000
KERNEL_NUM=20
PID_START=$((KERNEL_NUM * RANK_ID))
PID_END=$((PID_START + KERNEL_NUM - 1))
#sleep 5
taskset -c $PID_START-$PID_END python3 $3/train.py \
--mode $4
mkdir graph
mv *.txt graph
mv *.pbtxt graph
@@ -0,0 +1,109 @@
{
"board_id": "0x002f",
"chip_info": "910",
"deploy_mode": "lab",
"group_count": "1",
"group_list": [
{
"device_num": "8",
"server_num": "1",
"group_name": "",
"instance_count": "8",
"instance_list": [
{
"devices": [
{
"device_id": "0",
"device_ip": "192.168.100.101"
}
],
"rank_id": "0",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "1",
"device_ip": "192.168.101.101"
}
],
"rank_id": "1",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "2",
"device_ip": "192.168.102.101"
}
],
"rank_id": "2",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "3",
"device_ip": "192.168.103.101"
}
],
"rank_id": "3",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "4",
"device_ip": "192.168.100.100"
}
],
"rank_id": "4",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "5",
"device_ip": "192.168.101.100"
}
],
"rank_id": "5",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "6",
"device_ip": "192.168.102.100"
}
],
"rank_id": "6",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "7",
"device_ip": "192.168.103.100"
}
],
"rank_id": "7",
"server_id": "0.0.0.0"
}
]
}
],
"para_plane_nic_location": "device",
"para_plane_nic_name": [
"eth0",
"eth1",
"eth2",
"eth3",
"eth4",
"eth5",
"eth6",
"eth7"
],
"para_plane_nic_num": "8",
"status": "completed"
}
@@ -0,0 +1,29 @@
#!/bin/bash
rm -rf Onnxgraph
rm -rf Partition
rm -rf OptimizeSubGraph
rm -rf Aicpu_Optimized
rm *txt
export RANK_ID=$1
rm -rf result_$RANK_ID
export RANK_SIZE=$2
export DEVICE_ID=$RANK_ID
export DEVICE_INDEX=$RANK_ID
export RANK_TABLE_FILE=rank_table.json
export JOB_ID=123678
export FUSION_TENSOR_SIZE=1000000000
KERNEL_NUM=20
PID_START=$((KERNEL_NUM * RANK_ID))
PID_END=$((PID_START + KERNEL_NUM - 1))
#sleep 5
taskset -c $PID_START-$PID_END python3 $3/train.py \
--mode $4
mkdir graph
mv *.txt graph
mv *.pbtxt graph
@@ -0,0 +1,109 @@
{
"board_id": "0x002f",
"chip_info": "910",
"deploy_mode": "lab",
"group_count": "1",
"group_list": [
{
"device_num": "8",
"server_num": "1",
"group_name": "",
"instance_count": "8",
"instance_list": [
{
"devices": [
{
"device_id": "0",
"device_ip": "192.168.100.101"
}
],
"rank_id": "0",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "1",
"device_ip": "192.168.101.101"
}
],
"rank_id": "1",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "2",
"device_ip": "192.168.102.101"
}
],
"rank_id": "2",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "3",
"device_ip": "192.168.103.101"
}
],
"rank_id": "3",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "4",
"device_ip": "192.168.100.100"
}
],
"rank_id": "4",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "5",
"device_ip": "192.168.101.100"
}
],
"rank_id": "5",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "6",
"device_ip": "192.168.102.100"
}
],
"rank_id": "6",
"server_id": "0.0.0.0"
},
{
"devices": [
{
"device_id": "7",
"device_ip": "192.168.103.100"
}
],
"rank_id": "7",
"server_id": "0.0.0.0"
}
]
}
],
"para_plane_nic_location": "device",
"para_plane_nic_name": [
"eth0",
"eth1",
"eth2",
"eth3",
"eth4",
"eth5",
"eth6",
"eth7"
],
"para_plane_nic_num": "8",
"status": "completed"
}
@@ -0,0 +1,29 @@
#!/bin/bash
rm -rf Onnxgraph
rm -rf Partition
rm -rf OptimizeSubGraph
rm -rf Aicpu_Optimized
rm *txt
export RANK_ID=$1
rm -rf result_$RANK_ID
export RANK_SIZE=$2
export DEVICE_ID=$RANK_ID
export DEVICE_INDEX=$RANK_ID
export RANK_TABLE_FILE=rank_table.json
export JOB_ID=123678
export FUSION_TENSOR_SIZE=1000000000
KERNEL_NUM=20
PID_START=$((KERNEL_NUM * RANK_ID))
PID_END=$((PID_START + KERNEL_NUM - 1))
#sleep 5
taskset -c $PID_START-$PID_END python3 $3/train.py \
--mode $4
mkdir graph
mv *.txt graph
mv *.pbtxt graph
@@ -0,0 +1,450 @@
# coding: utf-8
# part of this is taken from Gluon's repo:
# https://github.com/dmlc/gluon-cv/blob/master/gluoncv/data/transforms/presets/yolo.py
from __future__ import division, print_function
import random
import numpy as np
import cv2
# from matplotlib.colors import rgb_to_hsv, hsv_to_rgb
def mix_up(img1, img2, bbox1, bbox2):
    '''
    return:
        mix_img: HWC format mix up image
        mix_bbox: [N, 5] shape mix up bbox, i.e. `x_min, y_min, x_max, y_max, mixup_weight`.
    '''
    height = max(img1.shape[0], img2.shape[0])
    width = max(img1.shape[1], img2.shape[1])
    mix_img = np.zeros(shape=(height, width, 3), dtype='float32')
    # rand_num = np.random.random()
    rand_num = np.random.beta(1.5, 1.5)
    rand_num = max(0, min(1, rand_num))
    mix_img[:img1.shape[0], :img1.shape[1], :] = img1.astype('float32') * rand_num
    mix_img[:img2.shape[0], :img2.shape[1], :] += img2.astype('float32') * (1. - rand_num)
    mix_img = mix_img.astype('uint8')
    # the last element of the 2nd dimension is the mix up weight
    bbox1 = np.concatenate((bbox1, np.full(shape=(bbox1.shape[0], 1), fill_value=rand_num)), axis=-1)
    bbox2 = np.concatenate((bbox2, np.full(shape=(bbox2.shape[0], 1), fill_value=1. - rand_num)), axis=-1)
    mix_bbox = np.concatenate((bbox1, bbox2), axis=0)
    return mix_img, mix_bbox
def bbox_crop(bbox, crop_box=None, allow_outside_center=True):
    """Crop bounding boxes according to slice area.
    This method is mainly used with image cropping to ensure bounding boxes fit
    within the cropped image.
    Parameters
    ----------
    bbox : numpy.ndarray
        Numpy.ndarray with shape (N, 4+) where N is the number of bounding boxes.
        The second axis represents attributes of the bounding box.
        Specifically, these are :math:`(x_{min}, y_{min}, x_{max}, y_{max})`,
        we allow additional attributes other than coordinates, which stay intact
        during bounding box transformations.
    crop_box : tuple
        Tuple of length 4. :math:`(x_{min}, y_{min}, width, height)`
    allow_outside_center : bool
        If `False`, remove bounding boxes which have centers outside cropping area.
    Returns
    -------
    numpy.ndarray
        Cropped bounding boxes with shape (M, 4+) where M <= N.
    """
    bbox = bbox.copy()
    if crop_box is None:
        return bbox
    if not len(crop_box) == 4:
        raise ValueError(
            "Invalid crop_box parameter, requires length 4, given {}".format(str(crop_box)))
    if sum([int(c is None) for c in crop_box]) == 4:
        return bbox
    l, t, w, h = crop_box
    left = l if l else 0
    top = t if t else 0
    right = left + (w if w else np.inf)
    bottom = top + (h if h else np.inf)
    crop_bbox = np.array((left, top, right, bottom))
    if allow_outside_center:
        mask = np.ones(bbox.shape[0], dtype=bool)
    else:
        centers = (bbox[:, :2] + bbox[:, 2:4]) / 2
        mask = np.logical_and(crop_bbox[:2] <= centers, centers < crop_bbox[2:]).all(axis=1)
    # transform borders
    bbox[:, :2] = np.maximum(bbox[:, :2], crop_bbox[:2])
    bbox[:, 2:4] = np.minimum(bbox[:, 2:4], crop_bbox[2:4])
    bbox[:, :2] -= crop_bbox[:2]
    bbox[:, 2:4] -= crop_bbox[:2]
    mask = np.logical_and(mask, (bbox[:, :2] < bbox[:, 2:4]).all(axis=1))
    bbox = bbox[mask]
    return bbox
def bbox_iou(bbox_a, bbox_b, offset=0):
    """Calculate Intersection-Over-Union(IOU) of two bounding boxes.
    Parameters
    ----------
    bbox_a : numpy.ndarray
        An ndarray with shape :math:`(N, 4)`.
    bbox_b : numpy.ndarray
        An ndarray with shape :math:`(M, 4)`.
    offset : float or int, default is 0
        The ``offset`` is used to control whether the width (or height) is computed as
        (right - left + ``offset``).
        Note that the offset must be 0 for normalized bboxes, whose ranges are in ``[0, 1]``.
    Returns
    -------
    numpy.ndarray
        An ndarray with shape :math:`(N, M)` indicating the IOU between each pair of
        bounding boxes in `bbox_a` and `bbox_b`.
    """
    if bbox_a.shape[1] < 4 or bbox_b.shape[1] < 4:
        raise IndexError("Bounding boxes axis 1 must have at least length 4")
    tl = np.maximum(bbox_a[:, None, :2], bbox_b[:, :2])
    br = np.minimum(bbox_a[:, None, 2:4], bbox_b[:, 2:4])
    area_i = np.prod(br - tl + offset, axis=2) * (tl < br).all(axis=2)
    area_a = np.prod(bbox_a[:, 2:4] - bbox_a[:, :2] + offset, axis=1)
    area_b = np.prod(bbox_b[:, 2:4] - bbox_b[:, :2] + offset, axis=1)
    return area_i / (area_a[:, None] + area_b - area_i)
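# Illustrative sketch, not part of the original file: a worked example of the
# pairwise IoU that bbox_iou computes with offset=0. The helper name
# `_example_pairwise_iou` and the sample boxes are assumptions made up for
# this illustration; the arithmetic mirrors the expressions used above.
def _example_pairwise_iou():
    import numpy as np
    boxes_a = np.array([[0., 0., 2., 2.]])                    # N = 1 box, area 4
    boxes_b = np.array([[1., 1., 3., 3.], [2., 2., 4., 4.]])  # M = 2 boxes, area 4 each
    tl = np.maximum(boxes_a[:, None, :2], boxes_b[:, :2])     # top-left corner of each overlap, shape (N, M, 2)
    br = np.minimum(boxes_a[:, None, 2:4], boxes_b[:, 2:4])   # bottom-right corner of each overlap
    inter = np.prod(br - tl, axis=2) * (tl < br).all(axis=2)  # zeroed where the boxes do not overlap
    area_a = np.prod(boxes_a[:, 2:4] - boxes_a[:, :2], axis=1)
    area_b = np.prod(boxes_b[:, 2:4] - boxes_b[:, :2], axis=1)
    # pair 1 overlaps on a 1x1 square: 1 / (4 + 4 - 1); pair 2 only touches at a corner: 0
    return inter / (area_a[:, None] + area_b - inter)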
def random_crop_with_constraints(bbox, size, min_scale=0.25, max_scale=1,
                                 max_aspect_ratio=2, constraints=None,
                                 max_trial=10):
    """Crop an image randomly with bounding box constraints.
    This data augmentation is used in training of
    Single Shot Multibox Detector [#]_. More details can be found in
    data augmentation section of the original paper.
    .. [#] Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy,
       Scott Reed, Cheng-Yang Fu, Alexander C. Berg.
       SSD: Single Shot MultiBox Detector. ECCV 2016.
    Parameters
    ----------
    bbox : numpy.ndarray
        Numpy.ndarray with shape (N, 4+) where N is the number of bounding boxes.
        The second axis represents attributes of the bounding box.
        Specifically, these are :math:`(x_{min}, y_{min}, x_{max}, y_{max})`,
        we allow additional attributes other than coordinates, which stay intact
        during bounding box transformations.
    size : tuple
        Tuple of length 2 of image shape as (width, height).
    min_scale : float
        The minimum ratio between a cropped region and the original image.
        The default value is :obj:`0.25`.
    max_scale : float
        The maximum ratio between a cropped region and the original image.
        The default value is :obj:`1`.
    max_aspect_ratio : float
        The maximum aspect ratio of cropped region.
        The default value is :obj:`2`.
    constraints : iterable of tuples
        An iterable of constraints.
        Each constraint should be :obj:`(min_iou, max_iou)` format.
        It means no constraint if :obj:`min_iou` or :obj:`max_iou` is set to :obj:`None`.
        If this argument defaults to :obj:`None`, :obj:`((0.3, None),
        (0.5, None), (0.7, None), (0.9, None), (None, 1))` will be used.
    max_trial : int, default 10
        Maximum number of trials for each constraint before exit no matter what.
    Returns
    -------
    numpy.ndarray
        Cropped bounding boxes with shape :obj:`(M, 4+)` where M <= N.
    tuple
        Tuple of length 4 as (x_offset, y_offset, new_width, new_height).
    """
    # default params in paper
    if constraints is None:
        constraints = (
            # (0.1, None),
            (0.3, None),
            (0.5, None),
            (0.7, None),
            (0.9, None),
            (None, 1),
        )
    w, h = size
    candidates = [(0, 0, w, h)]
    for min_iou, max_iou in constraints:
        min_iou = -np.inf if min_iou is None else min_iou
        max_iou = np.inf if max_iou is None else max_iou
        for _ in range(max_trial):
            scale = random.uniform(min_scale, max_scale)
            aspect_ratio = random.uniform(
                max(1 / max_aspect_ratio, scale * scale),
                min(max_aspect_ratio, 1 / (scale * scale)))
            crop_h = int(h * scale / np.sqrt(aspect_ratio))
            crop_w = int(w * scale * np.sqrt(aspect_ratio))
            crop_t = random.randrange(h - crop_h)
            crop_l = random.randrange(w - crop_w)
            crop_bb = np.array((crop_l, crop_t, crop_l + crop_w, crop_t + crop_h))
            if len(bbox) == 0:
                top, bottom = crop_t, crop_t + crop_h
                left, right = crop_l, crop_l + crop_w
                return bbox, (left, top, right-left, bottom-top)
            iou = bbox_iou(bbox, crop_bb[np.newaxis])
            if min_iou <= iou.min() and iou.max() <= max_iou:
                top, bottom = crop_t, crop_t + crop_h
                left, right = crop_l, crop_l + crop_w
                candidates.append((left, top, right-left, bottom-top))
                break
    # random select one
    while candidates:
        crop = candidates.pop(np.random.randint(0, len(candidates)))
        new_bbox = bbox_crop(bbox, crop, allow_outside_center=False)
        if new_bbox.size < 1:
            continue
        new_crop = (crop[0], crop[1], crop[2], crop[3])
        return new_bbox, new_crop
    return bbox, (0, 0, w, h)
def _rand(a=0., b=1.):
    return np.random.rand() * (b - a) + a
def random_color_distort(image_data, _hue=0.1, _sat=1.5, _val=1.5):
    # the module-level matplotlib import is commented out, so import lazily here;
    # matplotlib is only needed when this function is actually called
    from matplotlib.colors import rgb_to_hsv, hsv_to_rgb
    _hue = _rand(-_hue, _hue)
    _sat = _rand(1, _sat) if _rand() < .5 else 1 / _rand(1, _sat)
    _val = _rand(1, _val) if _rand() < .5 else 1 / _rand(1, _val)
    x = rgb_to_hsv(image_data)
    x[..., 0] += _hue
    x[..., 0][x[..., 0] > 1] -= 1
    x[..., 0][x[..., 0] < 0] += 1
    x[..., 1] *= _sat
    x[..., 2] *= _val
    x[x > 1] = 1
    x[x < 0] = 0
    image_data = hsv_to_rgb(x)
    image_data = image_data.astype(np.float32)
    return image_data
def random_color_distort_1(img, bgain=16, hgain=0.0138, sgain=0.678, vgain=0.36):
    # brightness_delta = int(np.random.uniform(-bgain, bgain))
    # img = np.clip(img + brightness_delta , 0, 255)
    # img = img.astype(np.uint8)
    r = np.random.uniform(-1, 1, 3) * [hgain, sgain, vgain] + 1  # random gains
    hue, sat, val = cv2.split(cv2.cvtColor(img, cv2.COLOR_BGR2HSV))
    dtype = img.dtype  # uint8
    x = np.arange(0, 256, dtype=np.int16)
    lut_hue = ((x * r[0]) % 180).astype(dtype)
    lut_sat = np.clip(x * r[1], 0, 255).astype(dtype)
    lut_val = np.clip(x * r[2], 0, 255).astype(dtype)
    img_hsv = cv2.merge((cv2.LUT(hue, lut_hue), cv2.LUT(sat, lut_sat), cv2.LUT(val, lut_val))).astype(dtype)
    img = cv2.cvtColor(img_hsv, cv2.COLOR_HSV2BGR)  # convert back to BGR
    return img
def random_color_distort_raw(img, brightness_delta=16, hue_vari=0.01, sat_vari=0.15, val_vari=0.15, p=0.2):
    '''
    randomly distort image color. Adjust brightness, hue, saturation, value.
    param:
        img: a BGR uint8 format OpenCV image. HWC format.
    '''
    def random_hue(img_hsv, hue_vari, p=p):
        if np.random.uniform(0, 1) > p:
            hue_delta = np.random.randint(-hue_vari, hue_vari)
            img_hsv[:, :, 0] = (img_hsv[:, :, 0] + hue_delta) % 180
        return img_hsv
    def random_saturation(img_hsv, sat_vari, p=p):
        if np.random.uniform(0, 1) > p:
            sat_mult = 1 + np.random.uniform(-sat_vari, sat_vari)
            img_hsv[:, :, 1] *= sat_mult
        return img_hsv
    def random_value(img_hsv, val_vari, p=p):
        if np.random.uniform(0, 1) > p:
            val_mult = 1 + np.random.uniform(-val_vari, val_vari)
            img_hsv[:, :, 2] *= val_mult
        return img_hsv
    def random_brightness(img, brightness_delta, p=p):
        if np.random.uniform(0, 1) > p:
            img = img.astype(np.float32)
            brightness_delta = int(np.random.uniform(-brightness_delta, brightness_delta))
            img = img + brightness_delta
        return np.clip(img, 0, 255)
    # brightness
    img = random_brightness(img, brightness_delta)
    img = img.astype(np.uint8)
    # color jitter
    img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV).astype(np.float32)
    if np.random.randint(0, 2):
        img_hsv = random_value(img_hsv, val_vari)
        img_hsv = random_saturation(img_hsv, sat_vari)
        img_hsv = random_hue(img_hsv, hue_vari)
    else:
        img_hsv = random_saturation(img_hsv, sat_vari)
        img_hsv = random_hue(img_hsv, hue_vari)
        img_hsv = random_value(img_hsv, val_vari)
    img_hsv = np.clip(img_hsv, 0, 255)
    img = cv2.cvtColor(img_hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)
    return img
def letterbox_resize(img, new_width, new_height, interp=0):
    '''
    Letterbox resize. keep the original aspect ratio in the resized image.
    '''
    ori_height, ori_width = img.shape[:2]
    resize_ratio = min(new_width / ori_width, new_height / ori_height)
    resize_w = int(resize_ratio * ori_width)
    resize_h = int(resize_ratio * ori_height)
    img = cv2.resize(img, (resize_w, resize_h), interpolation=interp)
    image_padded = np.full((new_height, new_width, 3), 128, np.uint8)
    dw = int((new_width - resize_w) / 2)
    dh = int((new_height - resize_h) / 2)
    image_padded[dh: resize_h + dh, dw: resize_w + dw, :] = img
    return image_padded, resize_ratio, dw, dh
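# Illustrative sketch, not part of the original file: letterbox_resize returns
# (resize_ratio, dw, dh), which is enough to map box coordinates measured on
# the padded image back onto the original image. The helper name
# `_example_letterbox_to_original` is an assumption made up for this example;
# it simply inverts the "scale, then shift by the padding" mapping.
def _example_letterbox_to_original(boxes, resize_ratio, dw, dh):
    import numpy as np
    boxes = np.asarray(boxes, dtype=np.float32).copy()
    # xmin, xmax: remove the horizontal padding, then undo the uniform scaling
    boxes[:, [0, 2]] = (boxes[:, [0, 2]] - dw) / resize_ratio
    # ymin, ymax: remove the vertical padding, then undo the uniform scaling
    boxes[:, [1, 3]] = (boxes[:, [1, 3]] - dh) / resize_ratio
    return boxes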
def resize_with_bbox(img, bbox, new_width, new_height, interp=0, letterbox=False):
    '''
    Resize the image and correct the bbox accordingly.
    '''
    if letterbox:
        image_padded, resize_ratio, dw, dh = letterbox_resize(img, new_width, new_height, interp)
        # xmin, xmax
        bbox[:, [0, 2]] = bbox[:, [0, 2]] * resize_ratio + dw
        # ymin, ymax
        bbox[:, [1, 3]] = bbox[:, [1, 3]] * resize_ratio + dh
        return image_padded, bbox
    else:
        ori_height, ori_width = img.shape[:2]
        img = cv2.resize(img, (new_width, new_height), interpolation=interp)
        # xmin, xmax
        bbox[:, [0, 2]] = bbox[:, [0, 2]] / ori_width * new_width
        # ymin, ymax
        bbox[:, [1, 3]] = bbox[:, [1, 3]] / ori_height * new_height
        return img, bbox
def random_flip(img, bbox, px=0, py=0):
    '''
    Randomly flip the image and correct the bbox.
    param:
        px:
            the probability of horizontal flip
        py:
            the probability of vertical flip
    '''
    height, width = img.shape[:2]
    if np.random.uniform(0, 1) < px:
        img = cv2.flip(img, 1)
        xmax = width - bbox[:, 0]
        xmin = width - bbox[:, 2]
        bbox[:, 0] = xmin
        bbox[:, 2] = xmax
    if np.random.uniform(0, 1) < py:
        img = cv2.flip(img, 0)
        ymax = height - bbox[:, 1]
        ymin = height - bbox[:, 3]
        bbox[:, 1] = ymin
        bbox[:, 3] = ymax
    return img, bbox
def random_resize(img, bbox, min_ratio=0.25, max_ratio=2, jitter=0.3):
    '''
    Randomly rescale the image and correct the bbox accordingly.
    param:
        min_ratio, max_ratio :
            Range the base scale factor is drawn from. The factor is capped at
            608 / max(h, w) before jitter is applied.
        jitter :
            The width and height ratios are each multiplied by an independent
            factor drawn from [1 - jitter, 1 + jitter], so the aspect ratio may change.
    '''
    h, w, c = img.shape
    max_ratio_limited = 608 / max(h, w)
    scale = random.uniform(min_ratio, max_ratio)
    scale = min(max_ratio_limited, scale)
    w_ratio = random.uniform(1 - jitter, 1 + jitter) * scale
    h_ratio = random.uniform(1 - jitter, 1 + jitter) * scale
    dst = cv2.resize(img, None, fx=w_ratio, fy=h_ratio)
    # correct bbox
    bbox[:, 0] *= w_ratio
    bbox[:, 2] *= w_ratio
    bbox[:, 1] *= h_ratio
    bbox[:, 3] *= h_ratio
    return dst, bbox
def random_expand(img, bbox, max_ratio=2, fill=0, keep_ratio=True):
    '''
    Randomly expand the original image with borders. This is identical to placing
    the original image on a larger canvas.
    param:
        max_ratio :
            Maximum ratio of the output image in both directions (vertical and horizontal)
        fill :
            The value(s) for padded borders.
        keep_ratio : bool
            If `True`, will keep output image the same aspect ratio as input.
    '''
    h, w, c = img.shape
    ratio_x = random.uniform(1, max_ratio)
    if keep_ratio:
        ratio_y = ratio_x
    else:
        ratio_y = random.uniform(1, max_ratio)
    oh, ow = int(h * ratio_y), int(w * ratio_x)
    off_y = random.randint(0, oh - h)
    off_x = random.randint(0, ow - w)
    dst = np.full(shape=(oh, ow, c), fill_value=fill, dtype=img.dtype)
    dst[off_y:off_y + h, off_x:off_x + w, :] = img
    # correct bbox
    bbox[:, :2] += (off_x, off_y)
    bbox[:, 2:4] += (off_x, off_y)
    return dst, bbox
@@ -0,0 +1,294 @@
# coding: utf-8
from __future__ import division, print_function
import numpy as np
import cv2
import sys
from utils.data_aug import *
import random
import tensorflow as tf
PY_VERSION = sys.version_info[0]
iter_cnt = 0
IterControl = 50
def color_jitter(image, brightness=0, contrast=0, saturation=0, hue=0):
    """Distorts the color of the image.
    Args:
        image: The input image tensor.
        brightness: A float, specifying the brightness for color jitter.
        contrast: A float, specifying the contrast for color jitter.
        saturation: A float, specifying the saturation for color jitter.
        hue: A float, specifying the hue for color jitter.
    Returns:
        The distorted image tensor.
    """
    with tf.name_scope('distort_color'):
        if brightness > 0:
            image = tf.image.random_brightness(image, max_delta=brightness)
        if contrast > 0:
            image = tf.image.random_contrast(
                image, lower=1-contrast, upper=1+contrast)
        if saturation > 0:
            image = tf.image.random_saturation(
                image, lower=1-saturation, upper=1+saturation)
        if hue > 0:
            image = tf.image.random_hue(image, max_delta=hue)
        return image
def parse_line(line):
'''
Given a line from the training/test txt file, return the parsed info.
line format: line_index, img_path, img_width, img_height, [box_info_1 (5 numbers)], ...
return:
pic_path: string.
boxes: shape [N, 4], N is the ground truth count, elements in the second
dimension are [x_min, y_min, x_max, y_max]
labels: shape [N]. class index.
img_width: int.
img_height: int.
Note: line_index is part of the file format but is not returned.
'''
if isinstance(line, bytes):
line = line.decode()
s = line.strip().split(' ')
assert len(
s) > 8, 'Annotation error! Please check your annotation file. Make sure there is at least one target object in each image.'
# line_idx = int(s[0])
pic_path = s[1]
img_width = int(s[2])
img_height = int(s[3])
s = s[4:]
assert len(
s) % 5 == 0, 'Annotation error! Please check your annotation file. Maybe partially missing some coordinates?'
box_cnt = len(s) // 5
boxes = []
labels = []
for i in range(box_cnt):
label, x_min, y_min, x_max, y_max = int(s[i * 5]), float(s[i * 5 + 1]), float(s[i * 5 + 2]), float(
s[i * 5 + 3]), float(s[i * 5 + 4])
boxes.append([x_min, y_min, x_max, y_max])
labels.append(label)
boxes = np.asarray(boxes, np.float32)
labels = np.asarray(labels, np.int32)
return pic_path, boxes, labels, img_width, img_height
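# A minimal, hedged sketch of the annotation format that parse_line reads.
# The helper name parse_annotation_line is hypothetical (it is not part of this
# repo), and the sample line is the one shown in the README. It assumes that
# fields are separated by single spaces and that the image path has no spaces.

```python
import numpy as np


def parse_annotation_line(line):
    """Hypothetical stand-alone parser for:
    line_index img_path img_width img_height [label x_min y_min x_max y_max] ...
    """
    fields = line.strip().split(' ')
    pic_path = fields[1]
    img_width, img_height = int(fields[2]), int(fields[3])

    # Everything after the first four fields is box info, 5 numbers per box.
    box_fields = fields[4:]
    if len(box_fields) == 0 or len(box_fields) % 5 != 0:
        raise ValueError('each box needs exactly 5 numbers')

    rows = np.asarray([float(v) for v in box_fields], dtype=np.float32).reshape(-1, 5)
    labels = rows[:, 0].astype(np.int32)   # class index per box
    boxes = rows[:, 1:5]                   # [x_min, y_min, x_max, y_max] per box
    return pic_path, boxes, labels, img_width, img_height


sample = '0 xxx/xxx/a.jpg 1920 1080 0 453 369 473 391 1 588 245 608 268'
pic_path, boxes, labels, img_width, img_height = parse_annotation_line(sample)
print(pic_path, img_width, img_height)
print(labels.tolist())
print(boxes.tolist())
```

# The sketch returns the same five values, in the same order, as parse_line above.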
def process_box(boxes, labels, img_size, class_num, anchors):
'''
Generate the y_true label, i.e. the ground truth feature_maps in 3 different scales.
params:
boxes: [N, 5] shape, float32 dtype. `x_min, y_min, x_max, y_max, mixup_weight`.
labels: [N] shape, int32 dtype.
class_num: int32 num.
anchors: [9, 2] shape, float32 dtype.
'''
anchors_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
# boxes = np.random.shuffle()
# convert boxes form:
# shape: [N, 2]
# (x_center, y_center)
box_centers = (boxes[:, 0:2] + boxes[:, 2:4]) / 2
# (width, height)
box_sizes = boxes[:, 2:4] - boxes[:, 0:2]
# [13, 13, 3, 5+num_class+1] `5` means coords and labels. `1` means mix up weight.
y_true_13 = np.zeros((img_size[1] // 32, img_size[0] // 32, 3, 6 + class_num), np.float32)
y_true_26 = np.zeros((img_size[1] // 16, img_size[0] // 16, 3, 6 + class_num), np.float32)
y_true_52 = np.zeros((img_size[1] // 8, img_size[0] // 8, 3, 6 + class_num), np.float32)
gt_box_13 = np.zeros((1, 32, 4), np.float32)
gt_box_26 = np.zeros((1, 64, 4), np.float32)
gt_box_52 = np.zeros((1, 128, 4), np.float32)
gt_box_list = [gt_box_13, gt_box_26, gt_box_52]
# mix up weight default to 1.
y_true_13[..., -1] = 1.
y_true_26[..., -1] = 1.
y_true_52[..., -1] = 1.
y_true = [y_true_13, y_true_26, y_true_52]
# [N, 1, 2]
box_sizes = np.expand_dims(box_sizes, 1)
# broadcast tricks
# [N, 1, 2] & [9, 2] ==> [N, 9, 2]
mins = np.maximum(- box_sizes / 2, - anchors / 2)
maxs = np.minimum(box_sizes / 2, anchors / 2)
# [N, 9, 2]
whs = maxs - mins
# [N, 9]
iou = (whs[:, :, 0] * whs[:, :, 1]) / (
box_sizes[:, :, 0] * box_sizes[:, :, 1] + anchors[:, 0] * anchors[:, 1] - whs[:, :, 0] * whs[:, :,
1] + 1e-10)
# [N]
best_match_idx = np.argmax(iou, axis=1)
ratio_dict = {1.: 8., 2.: 16., 3.: 32.}
index_dict = {0: 0, 1: 0, 2: 0}
for i, idx in enumerate(best_match_idx):
# idx: 0,1,2 ==> 2; 3,4,5 ==> 1; 6,7,8 ==> 0
feature_map_group = 2 - idx // 3
# scale ratio: 0,1,2 ==> 8; 3,4,5 ==> 16; 6,7,8 ==> 32
ratio = ratio_dict[np.ceil((idx + 1) / 3.)]
x = int(np.floor(box_centers[i, 0] / ratio))
y = int(np.floor(box_centers[i, 1] / ratio))
k = anchors_mask[feature_map_group].index(idx)
c = labels[i]
# print(feature_map_group, '|', y,x,k,c)
y_true[feature_map_group][y, x, k, :2] = box_centers[i]
y_true[feature_map_group][y, x, k, 2:4] = box_sizes[i]
y_true[feature_map_group][y, x, k, 4] = 1.
y_true[feature_map_group][y, x, k, 5 + c] = 1.
y_true[feature_map_group][y, x, k, -1] = boxes[i, -1]
if index_dict[feature_map_group] < gt_box_list[feature_map_group].shape[1]:
gt_box_list[feature_map_group][0, index_dict[feature_map_group], :2] = box_centers[i]
gt_box_list[feature_map_group][0, index_dict[feature_map_group], 2:4] = box_sizes[i]
index_dict[feature_map_group] += 1
return y_true_13, y_true_26, y_true_52, gt_box_13, gt_box_26, gt_box_52
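# A hedged sketch of the anchor matching step used in process_box: every ground
# truth box is compared with all 9 anchors as if both were centered at the same
# point, and the anchor with the highest IoU decides which feature map (stride
# 8, 16 or 32) is responsible for the box. The function name match_anchors is
# hypothetical, and the anchor values are an assumption (the commonly used
# YOLOv3 COCO anchors); the contents of data/yolo_anchors.txt are not shown here.

```python
import numpy as np

# Assumed anchors as (width, height), ordered from smallest to largest.
ANCHORS = np.array([[10, 13], [16, 30], [33, 23],
                    [30, 61], [62, 45], [59, 119],
                    [116, 90], [156, 198], [373, 326]], dtype=np.float32)


def match_anchors(box_sizes, anchors=ANCHORS):
    """Return (best anchor index, feature map group, stride) for each box size."""
    box_sizes = np.expand_dims(np.asarray(box_sizes, np.float32), 1)  # [N, 1, 2]
    # For two boxes sharing a center, the intersection is min(w) x min(h).
    inter_wh = np.minimum(box_sizes, anchors)                         # [N, 9, 2]
    inter = inter_wh[..., 0] * inter_wh[..., 1]                       # [N, 9]
    box_area = box_sizes[..., 0] * box_sizes[..., 1]                  # [N, 1]
    anchor_area = anchors[:, 0] * anchors[:, 1]                       # [9]
    iou = inter / (box_area + anchor_area - inter + 1e-10)            # [N, 9]

    best = np.argmax(iou, axis=1)      # [N]
    group = 2 - best // 3              # anchors 0-2 -> 2, 3-5 -> 1, 6-8 -> 0
    stride = 8 * 2 ** (best // 3)      # anchors 0-2 -> 8, 3-5 -> 16, 6-8 -> 32
    return best, group, stride


best, group, stride = match_anchors([[12, 15], [100, 100]])
print(best.tolist(), group.tolist(), stride.tolist())
```

# Under these assumed anchors, a small 12x15 box lands on the stride-8 map and a
# 100x100 box on the stride-32 map.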
def parse_data(line, class_num, img_size, anchors, mode, letterbox_resize, multi_scale):
'''
param:
line: a line from the training/test txt file
class_num: total number of classes.
img_size: the size of image to be resized to. [width, height] format.
anchors: anchors.
mode: 'train' or 'val'. When set to 'train', data_augmentation will be applied.
letterbox_resize: whether to use the letterbox resize, i.e., keep the original aspect ratio in the resized image.
multi_scale: whether multi-scale training is enabled. When it is active in 'train' mode, the image is placed on a 608x608 canvas.
'''
if not isinstance(line, list):
print('###################### line')
pic_path, boxes, labels, _, _ = parse_line(line)
img = cv2.imread(pic_path)
# expand the 2nd dimension, mix up weight default to 1.
boxes = np.concatenate((boxes, np.full(shape=(boxes.shape[0], 1), fill_value=1., dtype=np.float32)), axis=-1)
else:
print('###################### mixup')
# the mix up case
pic_path1, boxes1, labels1, _, _ = parse_line(line[0])
img1 = cv2.imread(pic_path1)
pic_path2, boxes2, labels2, _, _ = parse_line(line[1])
img2 = cv2.imread(pic_path2)
img, boxes = mix_up(img1, img2, boxes1, boxes2)
labels = np.concatenate((labels1, labels2))
if mode == 'train':
img, boxes = random_resize(img, boxes, min_ratio=0.25, max_ratio=2, jitter=0.3)
# random expansion with prob 0.5
if np.random.uniform(0, 1) > 0.5:
img, boxes = random_expand(img, boxes, max_ratio=3, fill=128, keep_ratio=False)
# random cropping
h, w, _ = img.shape
boxes, crop = random_crop_with_constraints(boxes, (w, h))
x0, y0, w, h = crop
img = img[y0: y0 + h, x0: x0 + w]
# resize with random interpolation
h, w, _ = img.shape
interp = np.random.randint(0, 5)
img, boxes = resize_with_bbox(img, boxes, img_size[0], img_size[1], interp=interp, letterbox=letterbox_resize)
# random horizontal flip
h, w, _ = img.shape
img, boxes = random_flip(img, boxes, px=0.5)
else:
img, boxes = resize_with_bbox(img, boxes, img_size[0], img_size[1], interp=1, letterbox=letterbox_resize)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32)
# the input of yolo_v3 should be in range 0~1
img = img / 255.
if mode == 'train' and iter_cnt >= IterControl and multi_scale:
cav = np.zeros((608, 608, 3), dtype=np.float32) + 0.5
true_h, true_w, c = img.shape
cav[:true_h, :true_w, :] = img
img = cav.astype(np.float32)
img_size = [608, 608]
y_true_13, y_true_26, y_true_52, gt_box_13, gt_box_26, gt_box_52 = process_box(boxes, labels, img_size, class_num,
anchors)
return img, y_true_13, y_true_26, y_true_52, gt_box_13, gt_box_26, gt_box_52
def get_batch_data(batch_line, class_num, img_size, anchors, mode, multi_scale=False, mix_up=False,
letterbox_resize=True, interval=10):
'''
generate a batch of imgs and labels
param:
batch_line: a batch of lines from train/val.txt files
class_num: num of total classes.
img_size: the image size to be resized to. format: [width, height].
anchors: anchors. shape: [9, 2].
mode: 'train' or 'val'. if set to 'train', data augmentation will be applied.
multi_scale: whether to use multi_scale training. Once iter_cnt reaches IterControl, img_size is sampled from [320, 320] to [608, 608] in steps of 32. Note that it only takes effect when mode is set to 'train'.
mix_up: whether to use the mix up strategy. It only takes effect when mode is set to 'train'.
letterbox_resize: whether to use the letterbox resize, i.e., keep the original aspect ratio in the resized image.
interval: change the image scale every `interval` batches. Note that this is non-deterministic because of multi-threading.
'''
if isinstance(mode, bytes):
mode = mode.decode()
global iter_cnt
# multi_scale training
if multi_scale and mode == 'train' and iter_cnt >= IterControl:
random.seed(iter_cnt // interval)
random_img_size = [[x * 32, x * 32] for x in range(10, 20)]
img_size = random.sample(random_img_size, 1)[0]
print('multi_scale iter: %d, img_size: %d,%d' % (iter_cnt, img_size[0], img_size[1]))
else:
print('single_scale iter: %d, img_size: %d,%d' % (iter_cnt, img_size[0], img_size[1]))
iter_cnt += 1
img_idx_batch, img_batch, y_true_13_batch, y_true_26_batch, y_true_52_batch = [], [], [], [], []
gt_box_13_batch, gt_box_26_batch, gt_box_52_batch = [], [], []
# mix up strategy
if mix_up and mode == 'train':
mix_lines = []
batch_line = batch_line.tolist()
for idx, line in enumerate(batch_line):
if np.random.uniform(0, 1) < 0.5:
mix_lines.append([line, random.sample(batch_line[:idx] + batch_line[idx + 1:], 1)[0]])
else:
mix_lines.append(line)
batch_line = mix_lines
for line in batch_line:
img, y_true_13, y_true_26, y_true_52, gt_box_13, gt_box_26, gt_box_52 = parse_data(line, class_num,
img_size, anchors,
mode,
letterbox_resize,
multi_scale)
img_batch.append(img)
y_true_13_batch.append(y_true_13)
y_true_26_batch.append(y_true_26)
y_true_52_batch.append(y_true_52)
gt_box_13_batch.append(gt_box_13)
gt_box_26_batch.append(gt_box_26)
gt_box_52_batch.append(gt_box_52)
img_batch, y_true_13_batch, y_true_26_batch, y_true_52_batch = np.asarray(img_batch, np.float32), np.asarray(
y_true_13_batch, np.float32), np.asarray(y_true_26_batch, np.float32), np.asarray(y_true_52_batch, np.float32)
gt_box_13_batch, gt_box_26_batch, gt_box_52_batch = \
np.asarray(gt_box_13_batch), np.asarray(gt_box_26_batch), np.asarray(gt_box_52_batch)
return img_batch, y_true_13_batch, y_true_26_batch, y_true_52_batch, \
gt_box_13_batch, gt_box_26_batch, gt_box_52_batch
@@ -0,0 +1,423 @@
# coding: utf-8
from __future__ import division, print_function
import numpy as np
import cv2
from collections import Counter
from utils.nms_utils import cpu_nms, gpu_nms
from utils.data_utils import parse_line
def calc_iou(pred_boxes, true_boxes):
'''
Efficiently calculate the IoU matrix using NumPy broadcasting.
shape_info: pred_boxes: [N, 4]
true_boxes: [V, 4]
return: IoU matrix: shape: [N, V]
'''
# [N, 1, 4]
pred_boxes = np.expand_dims(pred_boxes, -2)
# [1, V, 4]
true_boxes = np.expand_dims(true_boxes, 0)
# [N, 1, 2] & [1, V, 2] ==> [N, V, 2]
intersect_mins = np.maximum(pred_boxes[..., :2], true_boxes[..., :2])
intersect_maxs = np.minimum(pred_boxes[..., 2:], true_boxes[..., 2:])
intersect_wh = np.maximum(intersect_maxs - intersect_mins, 0.)
# shape: [N, V]
intersect_area = intersect_wh[..., 0] * intersect_wh[..., 1]
# shape: [N, 1, 2]
pred_box_wh = pred_boxes[..., 2:] - pred_boxes[..., :2]
# shape: [N, 1]
pred_box_area = pred_box_wh[..., 0] * pred_box_wh[..., 1]
# [1, V, 2]
true_boxes_wh = true_boxes[..., 2:] - true_boxes[..., :2]
# [1, V]
true_boxes_area = true_boxes_wh[..., 0] * true_boxes_wh[..., 1]
# shape: [N, V]
iou = intersect_area / (pred_box_area + true_boxes_area - intersect_area + 1e-10)
return iou
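# A small, hedged usage sketch of the broadcasting idea behind calc_iou. The
# function iou_matrix below is a stand-alone re-statement written for this
# example (hypothetical name), with boxes given as [x_min, y_min, x_max, y_max].

```python
import numpy as np


def iou_matrix(pred_boxes, true_boxes):
    """IoU between every predicted box [N, 4] and every true box [V, 4]."""
    pred = np.asarray(pred_boxes, np.float32)[:, None, :]   # [N, 1, 4]
    true = np.asarray(true_boxes, np.float32)[None, :, :]   # [1, V, 4]

    # Broadcasting [N, 1, 2] against [1, V, 2] gives [N, V, 2].
    mins = np.maximum(pred[..., :2], true[..., :2])
    maxs = np.minimum(pred[..., 2:], true[..., 2:])
    wh = np.maximum(maxs - mins, 0.)                        # no overlap -> 0
    inter = wh[..., 0] * wh[..., 1]                         # [N, V]

    pred_area = np.prod(pred[..., 2:] - pred[..., :2], axis=-1)  # [N, 1]
    true_area = np.prod(true[..., 2:] - true[..., :2], axis=-1)  # [1, V]
    return inter / (pred_area + true_area - inter + 1e-10)


iou = iou_matrix([[0, 0, 10, 10], [0, 0, 5, 5]],
                 [[0, 0, 10, 10], [20, 20, 30, 30]])
print(iou)
```

# Row k of the result holds the IoU of prediction k with every ground truth box,
# which is why the evaluation code takes argmax over the last axis.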
def evaluate_on_cpu(y_pred, y_true, num_classes, calc_now=True, max_boxes=50, score_thresh=0.5, iou_thresh=0.5):
'''
Given y_pred and y_true of a batch of data, get the recall and precision of the current batch.
'''
num_images = y_true[0].shape[0]
true_labels_dict = {i: 0 for i in range(num_classes)} # {class: count}
pred_labels_dict = {i: 0 for i in range(num_classes)}
true_positive_dict = {i: 0 for i in range(num_classes)}
for i in range(num_images):
true_labels_list, true_boxes_list = [], []
for j in range(3): # three feature maps
# shape: [13, 13, 3, 80]
true_probs_temp = y_true[j][i][..., 5:-1]
# shape: [13, 13, 3, 4] (x_center, y_center, w, h)
true_boxes_temp = y_true[j][i][..., 0:4]
# [13, 13, 3]
object_mask = true_probs_temp.sum(axis=-1) > 0
# [V, 80] V: Ground truth number of the current image
true_probs_temp = true_probs_temp[object_mask]
# [V, 4]
true_boxes_temp = true_boxes_temp[object_mask]
# [V], labels
true_labels_list += np.argmax(true_probs_temp, axis=-1).tolist()
# [V, 4] (x_center, y_center, w, h)
true_boxes_list += true_boxes_temp.tolist()
if len(true_labels_list) != 0:
for cls, count in Counter(true_labels_list).items():
true_labels_dict[cls] += count
# [V, 4] (xmin, ymin, xmax, ymax)
true_boxes = np.array(true_boxes_list)
box_centers, box_sizes = true_boxes[:, 0:2], true_boxes[:, 2:4]
true_boxes[:, 0:2] = box_centers - box_sizes / 2.
true_boxes[:, 2:4] = true_boxes[:, 0:2] + box_sizes
# [1, xxx, 4]
pred_boxes = y_pred[0][i:i + 1]
pred_confs = y_pred[1][i:i + 1]
pred_probs = y_pred[2][i:i + 1]
# pred_boxes: [N, 4]
# pred_confs: [N]
# pred_labels: [N]
# N: Detected box number of the current image
pred_boxes, pred_confs, pred_labels = cpu_nms(pred_boxes, pred_confs * pred_probs, num_classes,
max_boxes=max_boxes, score_thresh=score_thresh, iou_thresh=iou_thresh)
# len: N
pred_labels_list = [] if pred_labels is None else pred_labels.tolist()
if pred_labels_list == []:
continue
# calc iou
# [N, V]
iou_matrix = calc_iou(pred_boxes, true_boxes)
# [N]
max_iou_idx = np.argmax(iou_matrix, axis=-1)
correct_idx = []
correct_conf = []
for k in range(max_iou_idx.shape[0]):
pred_labels_dict[pred_labels_list[k]] += 1
match_idx = max_iou_idx[k] # V level
if iou_matrix[k, match_idx] > iou_thresh and true_labels_list[match_idx] == pred_labels_list[k]:
if match_idx not in correct_idx:
correct_idx.append(match_idx)
correct_conf.append(pred_confs[k])
else:
same_idx = correct_idx.index(match_idx)
if pred_confs[k] > correct_conf[same_idx]:
correct_idx.pop(same_idx)
correct_conf.pop(same_idx)
correct_idx.append(match_idx)
correct_conf.append(pred_confs[k])
for t in correct_idx:
true_positive_dict[true_labels_list[t]] += 1
if calc_now:
# avoid divided by 0
recall = sum(true_positive_dict.values()) / (sum(true_labels_dict.values()) + 1e-6)
precision = sum(true_positive_dict.values()) / (sum(pred_labels_dict.values()) + 1e-6)
return recall, precision
else:
return true_positive_dict, true_labels_dict, pred_labels_dict
def evaluate_on_gpu(sess, gpu_nms_op, pred_boxes_flag, pred_scores_flag, y_pred, y_true, num_classes, iou_thresh=0.5, calc_now=True):
'''
Given y_pred and y_true of a batch of data, get the recall and precision of the current batch.
This function performs the NMS step on the GPU.
'''
num_images = y_true[0].shape[0]
true_labels_dict = {i: 0 for i in range(num_classes)} # {class: count}
pred_labels_dict = {i: 0 for i in range(num_classes)}
true_positive_dict = {i: 0 for i in range(num_classes)}
for i in range(num_images):
true_labels_list, true_boxes_list = [], []
for j in range(3): # three feature maps
# shape: [13, 13, 3, 80]
true_probs_temp = y_true[j][i][..., 5:-1]
# shape: [13, 13, 3, 4] (x_center, y_center, w, h)
true_boxes_temp = y_true[j][i][..., 0:4]
# [13, 13, 3]
object_mask = true_probs_temp.sum(axis=-1) > 0
# [V, 80] V: Ground truth number of the current image
true_probs_temp = true_probs_temp[object_mask]
# [V, 4]
true_boxes_temp = true_boxes_temp[object_mask]
# [V], labels, each from 0 to 79
true_labels_list += np.argmax(true_probs_temp, axis=-1).tolist()
# [V, 4] (x_center, y_center, w, h)
true_boxes_list += true_boxes_temp.tolist()
if len(true_labels_list) != 0:
for cls, count in Counter(true_labels_list).items():
true_labels_dict[cls] += count
# [V, 4] (xmin, ymin, xmax, ymax)
true_boxes = np.array(true_boxes_list)
box_centers, box_sizes = true_boxes[:, 0:2], true_boxes[:, 2:4]
true_boxes[:, 0:2] = box_centers - box_sizes / 2.
true_boxes[:, 2:4] = true_boxes[:, 0:2] + box_sizes
# [1, xxx, 4]
pred_boxes = y_pred[0][i:i + 1]
pred_confs = y_pred[1][i:i + 1]
pred_probs = y_pred[2][i:i + 1]
# pred_boxes: [N, 4]
# pred_confs: [N]
# pred_labels: [N]
# N: Detected box number of the current image
pred_boxes, pred_confs, pred_labels = sess.run(gpu_nms_op,
feed_dict={pred_boxes_flag: pred_boxes,
pred_scores_flag: pred_confs * pred_probs})
# len: N
pred_labels_list = [] if pred_labels is None else pred_labels.tolist()
if pred_labels_list == []:
continue
# calc iou
# [N, V]
iou_matrix = calc_iou(pred_boxes, true_boxes)
# [N]
max_iou_idx = np.argmax(iou_matrix, axis=-1)
correct_idx = []
correct_conf = []
for k in range(max_iou_idx.shape[0]):
pred_labels_dict[pred_labels_list[k]] += 1
match_idx = max_iou_idx[k] # V level
if iou_matrix[k, match_idx] > iou_thresh and true_labels_list[match_idx] == pred_labels_list[k]:
if match_idx not in correct_idx:
correct_idx.append(match_idx)
correct_conf.append(pred_confs[k])
else:
same_idx = correct_idx.index(match_idx)
if pred_confs[k] > correct_conf[same_idx]:
correct_idx.pop(same_idx)
correct_conf.pop(same_idx)
correct_idx.append(match_idx)
correct_conf.append(pred_confs[k])
for t in correct_idx:
true_positive_dict[true_labels_list[t]] += 1
if calc_now:
# avoid divided by 0
recall = sum(true_positive_dict.values()) / (sum(true_labels_dict.values()) + 1e-6)
precision = sum(true_positive_dict.values()) / (sum(pred_labels_dict.values()) + 1e-6)
return recall, precision
else:
return true_positive_dict, true_labels_dict, pred_labels_dict
def get_preds_gpu(sess, gpu_nms_op, pred_boxes_flag, pred_scores_flag, image_ids, y_pred):
'''
Given the y_pred of an input image, get the predicted bbox and label info.
return:
pred_content: 2d list.
'''
image_id = image_ids[0]
# keep the first dimension 1
pred_boxes = y_pred[0][0:1]
pred_confs = y_pred[1][0:1]
pred_probs = y_pred[2][0:1]
boxes, scores, labels = sess.run(gpu_nms_op,
feed_dict={pred_boxes_flag: pred_boxes,
pred_scores_flag: pred_confs * pred_probs})
pred_content = []
for i in range(len(labels)):
x_min, y_min, x_max, y_max = boxes[i]
score = scores[i]
label = labels[i]
pred_content.append([image_id, x_min, y_min, x_max, y_max, score, label])
return pred_content
gt_dict = {} # key: img_id, value: gt object list
def parse_gt_rec(gt_filename, target_img_size, letterbox_resize=True):
'''
parse and re-organize the gt info.
return:
gt_dict: dict. Each key is an img_id, the value is the gt bboxes in the corresponding img.
'''
global gt_dict
if not gt_dict:
new_width, new_height = target_img_size
with open(gt_filename, 'r') as f:
for line in f:
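pic_path, boxes, labels, ori_width, ori_height = parse_line(line)
# parse_line does not return the line index, so read img_id from the first field
img_id = int(line.strip().split(' ')[0])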
objects = []
for i in range(len(labels)):
x_min, y_min, x_max, y_max = boxes[i]
label = labels[i]
if letterbox_resize:
resize_ratio = min(new_width / ori_width, new_height / ori_height)
resize_w = int(resize_ratio * ori_width)
resize_h = int(resize_ratio * ori_height)
dw = int((new_width - resize_w) / 2)
dh = int((new_height - resize_h) / 2)
objects.append([x_min * resize_ratio + dw,
y_min * resize_ratio + dh,
x_max * resize_ratio + dw,
y_max * resize_ratio + dh,
label])
else:
objects.append([x_min * new_width / ori_width,
y_min * new_height / ori_height,
x_max * new_width / ori_width,
y_max * new_height / ori_height,
label])
gt_dict[img_id] = objects
return gt_dict
# The following two functions are modified from FAIR's Detectron repo to calculate mAP:
# https://github.com/facebookresearch/Detectron/blob/master/detectron/datasets/voc_eval.py
def voc_ap(rec, prec, use_07_metric=False):
"""Compute VOC AP given precision and recall. If use_07_metric is true, uses
the VOC 07 11-point method (default:False).
"""
if use_07_metric:
# 11 point metric
ap = 0.
for t in np.arange(0., 1.1, 0.1):
if np.sum(rec >= t) == 0:
p = 0
else:
p = np.max(prec[rec >= t])
ap = ap + p / 11.
else:
# correct AP calculation
# first append sentinel values at the end
mrec = np.concatenate(([0.], rec, [1.]))
mpre = np.concatenate(([0.], prec, [0.]))
# compute the precision envelope
for i in range(mpre.size - 1, 0, -1):
mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])
# to calculate area under PR curve, look for points
# where X axis (recall) changes value
i = np.where(mrec[1:] != mrec[:-1])[0]
# and sum (\Delta recall) * prec
ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
return ap
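# A hedged worked example of the non-11-point branch of voc_ap. The function
# area_under_pr is a stand-alone re-statement written for this example
# (hypothetical name); the two-point precision/recall curve is made up purely
# for illustration.

```python
import numpy as np


def area_under_pr(rec, prec):
    """Area under the precision envelope of a precision/recall curve."""
    # Sentinel values at both ends of the curve.
    mrec = np.concatenate(([0.], rec, [1.]))
    mpre = np.concatenate(([0.], prec, [0.]))

    # Precision envelope: make precision non-increasing from left to right.
    for i in range(mpre.size - 1, 0, -1):
        mpre[i - 1] = max(mpre[i - 1], mpre[i])

    # Sum (delta recall) * precision at the points where recall changes.
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))


# Two detections: the first is correct (recall 0.5, precision 1.0),
# the second is reached at recall 1.0 with precision 0.5.
ap = area_under_pr(np.array([0.5, 1.0]), np.array([1.0, 0.5]))
print(ap)  # 0.75 = 0.5 * 1.0 + 0.5 * 0.5
```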
def voc_eval(gt_dict, val_preds, classidx, iou_thres=0.5, use_07_metric=False):
'''
Top level function that does the PASCAL VOC evaluation.
'''
# 1.obtain gt: extract all gt objects for this class
class_recs = {}
npos = 0
for img_id in gt_dict:
R = [obj for obj in gt_dict[img_id] if obj[-1] == classidx]
bbox = np.array([x[:4] for x in R])
det = [False] * len(R)
npos += len(R)
class_recs[img_id] = {'bbox': bbox, 'det': det}
# 2. obtain pred results
pred = [x for x in val_preds if x[-1] == classidx]
img_ids = [x[0] for x in pred]
confidence = np.array([x[-2] for x in pred])
BB = np.array([[x[1], x[2], x[3], x[4]] for x in pred])
# 3. sort by confidence
sorted_ind = np.argsort(-confidence)
try:
BB = BB[sorted_ind, :]
except IndexError:  # BB is empty when there are no predictions for this class
print('no box, ignore')
return 1e-6, 1e-6, 0, 0, 0
img_ids = [img_ids[x] for x in sorted_ind]
# 4. mark TPs and FPs
nd = len(img_ids)
tp = np.zeros(nd)
fp = np.zeros(nd)
for d in range(nd):
# all the gt info in some image
R = class_recs[img_ids[d]]
bb = BB[d, :]
ovmax = -np.inf
BBGT = R['bbox']
if BBGT.size > 0:
# calc iou
# intersection
ixmin = np.maximum(BBGT[:, 0], bb[0])
iymin = np.maximum(BBGT[:, 1], bb[1])
ixmax = np.minimum(BBGT[:, 2], bb[2])
iymax = np.minimum(BBGT[:, 3], bb[3])
iw = np.maximum(ixmax - ixmin + 1., 0.)
ih = np.maximum(iymax - iymin + 1., 0.)
inters = iw * ih
# union
uni = ((bb[2] - bb[0] + 1.) * (bb[3] - bb[1] + 1.) + (BBGT[:, 2] - BBGT[:, 0] + 1.) * (
BBGT[:, 3] - BBGT[:, 1] + 1.) - inters)
overlaps = inters / uni
ovmax = np.max(overlaps)
jmax = np.argmax(overlaps)
if ovmax > iou_thres:
# gt not matched yet
if not R['det'][jmax]:
tp[d] = 1.
R['det'][jmax] = 1
else:
fp[d] = 1.
else:
fp[d] = 1.
# compute precision recall
fp = np.cumsum(fp)
tp = np.cumsum(tp)
rec = tp / float(npos)
# avoid divide by zero in case the first detection matches a difficult
# ground truth
prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)
ap = voc_ap(rec, prec, use_07_metric)
# return rec, prec, ap
return npos, nd, tp[-1] / float(npos), tp[-1] / float(nd), ap
@@ -0,0 +1,89 @@
# coding: utf-8
from __future__ import division, print_function
import numpy as np
import tensorflow as tf
slim = tf.contrib.slim
def conv2d(inputs, filters, kernel_size, strides=1):
def _fixed_padding(inputs, kernel_size):
pad_total = kernel_size - 1
pad_beg = pad_total // 2
pad_end = pad_total - pad_beg
padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg, pad_end],
[pad_beg, pad_end], [0, 0]], mode='CONSTANT')
return padded_inputs
if strides > 1:
inputs = _fixed_padding(inputs, kernel_size)
inputs = slim.conv2d(inputs, filters, kernel_size, stride=strides,
padding=('SAME' if strides == 1 else 'VALID'))
return inputs
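# A hedged sketch of the padding arithmetic in conv2d above: for strides > 1 the
# input is padded explicitly by kernel_size - 1 and the convolution then uses
# 'VALID' padding, so each stride-2 layer halves the spatial size. The helper
# names are hypothetical and the 416 input size is only an example.

```python
def fixed_padding(kernel_size):
    """Return (pad_beg, pad_end) as computed by _fixed_padding."""
    pad_total = kernel_size - 1
    pad_beg = pad_total // 2
    pad_end = pad_total - pad_beg
    return pad_beg, pad_end


def strided_output_size(in_size, kernel_size, stride):
    """Output size of explicit padding followed by a 'VALID' convolution."""
    pad_beg, pad_end = fixed_padding(kernel_size)
    return (in_size + pad_beg + pad_end - kernel_size) // stride + 1


# darknet53_body applies five 3x3 stride-2 convolutions.
sizes = [416]
for _ in range(5):
    sizes.append(strided_output_size(sizes[-1], kernel_size=3, stride=2))
print(sizes)  # [416, 208, 104, 52, 26, 13]
```

# The last three sizes (52, 26, 13) correspond to route_1, route_2 and route_3.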
def darknet53_body(inputs):
def res_block(inputs, filters):
shortcut = inputs
net = conv2d(inputs, filters * 1, 1)
net = conv2d(net, filters * 2, 3)
net = net + shortcut
return net
# first two conv2d layers
net = conv2d(inputs, 32, 3, strides=1)
net = conv2d(net, 64, 3, strides=2)
# res_block * 1
net = res_block(net, 32)
net = conv2d(net, 128, 3, strides=2)
# res_block * 2
for i in range(2):
net = res_block(net, 64)
net = conv2d(net, 256, 3, strides=2)
# res_block * 8
for i in range(8):
net = res_block(net, 128)
route_1 = net
net = conv2d(net, 512, 3, strides=2)
# res_block * 8
for i in range(8):
net = res_block(net, 256)
route_2 = net
net = conv2d(net, 1024, 3, strides=2)
# res_block * 4
for i in range(4):
net = res_block(net, 512)
route_3 = net
return route_1, route_2, route_3
def yolo_block(inputs, filters):
net = conv2d(inputs, filters * 1, 1)
net = conv2d(net, filters * 2, 3)
net = conv2d(net, filters * 1, 1)
net = conv2d(net, filters * 2, 3)
net = conv2d(net, filters * 1, 1)
route = net
net = conv2d(net, filters * 2, 3)
return route, net
def upsample_layer(inputs, out_shape):
new_height, new_width = out_shape[1], out_shape[2]
# NOTE: here height is the first
# TODO: Do we need to set `align_corners` as True?
inputs = tf.image.resize_nearest_neighbor(inputs, (new_height, new_width), name='upsampled')
return inputs
@@ -0,0 +1,165 @@
# coding: utf-8
import numpy as np
import tensorflow as tf
import random
from tensorflow.core.framework import summary_pb2
def make_summary(name, val):
return summary_pb2.Summary(value=[summary_pb2.Summary.Value(tag=name, simple_value=val)])
class AverageMeter(object):
def __init__(self):
self.reset()
def reset(self):
self.val = 0
self.average = 0
self.sum = 0
self.count = 0
def update(self, val, n=1):
self.val = val
self.sum += val * n
self.count += n
self.average = self.sum / float(self.count)
def parse_anchors(anchor_path):
'''
parse anchors.
returned data: shape [N, 2], dtype float32
'''
anchors = np.reshape(np.asarray(open(anchor_path, 'r').read().split(','), np.float32), [-1, 2])
return anchors
def read_class_names(class_name_path):
names = {}
with open(class_name_path, 'r') as data:
for ID, name in enumerate(data):
names[ID] = name.strip('\n')
return names
def shuffle_and_overwrite(file_name):
content = open(file_name, 'r').readlines()
random.shuffle(content)
with open(file_name, 'w') as f:
for line in content:
f.write(line)
def update_dict(ori_dict, new_dict):
if not ori_dict:
return new_dict
for key in ori_dict:
ori_dict[key] += new_dict[key]
return ori_dict
def list_add(ori_list, new_list):
for i in range(len(ori_list)):
ori_list[i] += new_list[i]
return ori_list
def load_weights(var_list, weights_file):
"""
Loads and converts pre-trained weights.
param:
var_list: list of network variables.
weights_file: name of the binary file.
"""
with open(weights_file, "rb") as fp:
np.fromfile(fp, dtype=np.int32, count=5)
weights = np.fromfile(fp, dtype=np.float32)
ptr = 0
i = 0
assign_ops = []
try:
while i < len(var_list) - 1:
var1 = var_list[i]
var2 = var_list[i + 1]
# do something only if we process conv layer
if 'Conv' in var1.name.split('/')[-2]:
# check type of next layer
if 'BatchNorm' in var2.name.split('/')[-2]:
# load batch norm params
gamma, beta, mean, var = var_list[i + 1:i + 5]
batch_norm_vars = [beta, gamma, mean, var]
for var in batch_norm_vars:
shape = var.shape.as_list()
num_params = np.prod(shape)
var_weights = weights[ptr:ptr + num_params].reshape(shape)
ptr += num_params
assign_ops.append(tf.assign(var, var_weights, validate_shape=True))
# we move the pointer by 4, because we loaded 4 variables
i += 4
elif 'Conv' in var2.name.split('/')[-2]:
# load biases
bias = var2
bias_shape = bias.shape.as_list()
bias_params = np.prod(bias_shape)
bias_weights = weights[ptr:ptr +
bias_params].reshape(bias_shape)
ptr += bias_params
assign_ops.append(tf.assign(bias, bias_weights, validate_shape=True))
# we loaded 1 variable
i += 1
# we can load weights of conv layer
shape = var1.shape.as_list()
num_params = np.prod(shape)
var_weights = weights[ptr:ptr + num_params].reshape(
(shape[3], shape[2], shape[0], shape[1]))
# darknet stores conv weights as [out, in, h, w]; transpose to TensorFlow's [h, w, in, out]
var_weights = np.transpose(var_weights, (2, 3, 1, 0))
ptr += num_params
assign_ops.append(
tf.assign(var1, var_weights, validate_shape=True))
i += 1
except:
pass
return assign_ops
def config_learning_rate(args, global_step):
if args.lr_type == 'exponential':
lr_tmp = tf.train.exponential_decay(args.learning_rate_init, global_step, args.lr_decay_freq,
args.lr_decay_factor, staircase=True, name='exponential_learning_rate')
return tf.maximum(lr_tmp, args.lr_lower_bound)
elif args.lr_type == 'cosine_decay':
train_steps = (args.total_epoches - float(args.use_warm_up) * args.warm_up_epoch) * args.train_batch_num
return args.lr_lower_bound + 0.5 * (args.learning_rate_init - args.lr_lower_bound) * \
(1 + tf.cos(global_step / train_steps * np.pi))
elif args.lr_type == 'cosine_decay_restart':
return tf.train.cosine_decay_restarts(args.learning_rate_init, global_step,
args.lr_decay_freq, t_mul=2.0, m_mul=1.0,
name='cosine_decay_learning_rate_restart')
elif args.lr_type == 'fixed':
return tf.convert_to_tensor(args.learning_rate_init, name='fixed_learning_rate')
elif args.lr_type == 'piecewise':
return tf.train.piecewise_constant(global_step, boundaries=args.pw_boundaries, values=args.pw_values,
name='piecewise_learning_rate')
else:
raise ValueError('Unsupported learning rate type!')
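# A hedged, framework-free sketch of the 'cosine_decay' branch above, written
# with plain floats instead of tensors. The function name and the numeric
# values (1e-3, 1e-6, 1000 steps) are assumptions chosen for illustration only.

```python
import math


def cosine_decay_lr(step, train_steps, lr_init, lr_lower_bound):
    """lr_lower_bound + 0.5 * (lr_init - lr_lower_bound) * (1 + cos(pi * step / train_steps))"""
    return lr_lower_bound + 0.5 * (lr_init - lr_lower_bound) * \
        (1 + math.cos(step / train_steps * math.pi))


lr_start = cosine_decay_lr(0, 1000, 1e-3, 1e-6)     # equals lr_init
lr_mid = cosine_decay_lr(500, 1000, 1e-3, 1e-6)     # halfway between the two bounds
lr_end = cosine_decay_lr(1000, 1000, 1e-3, 1e-6)    # equals lr_lower_bound
print(lr_start, lr_mid, lr_end)
```

# The schedule starts at the initial learning rate and reaches the lower bound
# exactly at train_steps.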
def config_optimizer(optimizer_name, learning_rate, decay=0.9, momentum=0.9):
if optimizer_name == 'momentum':
return tf.train.MomentumOptimizer(learning_rate, momentum=momentum, use_nesterov=False)
elif optimizer_name == 'nesterov':
return tf.train.MomentumOptimizer(learning_rate, momentum=momentum, use_nesterov=True)
elif optimizer_name == 'rmsprop':
return tf.train.RMSPropOptimizer(learning_rate, decay=decay, momentum=momentum)
elif optimizer_name == 'adam':
return tf.train.AdamOptimizer(learning_rate)
elif optimizer_name == 'sgd':
return tf.train.GradientDescentOptimizer(learning_rate)
else:
raise ValueError('Unsupported optimizer type!')
@@ -0,0 +1,123 @@
# coding: utf-8
from __future__ import division, print_function
import numpy as np
import tensorflow as tf
def gpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, nms_thresh=0.5):
"""
Perform NMS on GPU using TensorFlow.
params:
boxes: tensor of shape [1, 10647, 4] # 10647=(13*13+26*26+52*52)*3, for input 416*416 image
scores: tensor of shape [1, 10647, num_classes], score=conf*prob
num_classes: total number of classes
max_boxes: integer, maximum number of predicted boxes you'd like, default is 50
score_thresh: for each class, boxes whose score for that class is below score_thresh
are discarded before NMS
nms_thresh: real value, "intersection over union" threshold used for NMS filtering
"""
boxes_list, label_list, score_list = [], [], []
max_boxes = tf.constant(max_boxes, dtype='int32')
# since we do nms for single image, then reshape it
boxes = tf.reshape(boxes, [-1, 4]) # '-1' means we don't know the exact number of boxes
score = tf.reshape(scores, [-1, num_classes])
# Step 1: Create a filtering mask based on "box_class_scores" by using "threshold".
mask = tf.greater_equal(score, tf.constant(score_thresh))
# Step 2: Do non_max_suppression for each class
for i in range(num_classes):
# Step 3: Apply the mask to scores, boxes and pick them out
filter_boxes = tf.boolean_mask(boxes, mask[:,i])
filter_score = tf.boolean_mask(score[:,i], mask[:,i])
nms_indices = tf.image.non_max_suppression(boxes=filter_boxes,
scores=filter_score,
max_output_size=max_boxes,
iou_threshold=nms_thresh, name='nms_indices')
label_list.append(tf.ones_like(tf.gather(filter_score, nms_indices), 'int32')*i)
boxes_list.append(tf.gather(filter_boxes, nms_indices))
score_list.append(tf.gather(filter_score, nms_indices))
boxes = tf.concat(boxes_list, axis=0)
score = tf.concat(score_list, axis=0)
label = tf.concat(label_list, axis=0)
return boxes, score, label
def py_nms(boxes, scores, max_boxes=50, iou_thresh=0.5):
"""
Pure Python NMS baseline.
Arguments: boxes: shape of [-1, 4], where '-1' means that the exact number
of boxes is not known in advance
scores: shape of [-1,]
max_boxes: representing the maximum of boxes to be selected by non_max_suppression
iou_thresh: representing iou_threshold for deciding to keep boxes
"""
assert boxes.shape[1] == 4 and len(scores.shape) == 1
x1 = boxes[:, 0]
y1 = boxes[:, 1]
x2 = boxes[:, 2]
y2 = boxes[:, 3]
# use the same "+ 1" pixel convention as the intersection computed below
areas = (x2 - x1 + 1) * (y2 - y1 + 1)
order = scores.argsort()[::-1]
keep = []
while order.size > 0:
i = order[0]
keep.append(i)
xx1 = np.maximum(x1[i], x1[order[1:]])
yy1 = np.maximum(y1[i], y1[order[1:]])
xx2 = np.minimum(x2[i], x2[order[1:]])
yy2 = np.minimum(y2[i], y2[order[1:]])
w = np.maximum(0.0, xx2 - xx1 + 1)
h = np.maximum(0.0, yy2 - yy1 + 1)
inter = w * h
ovr = inter / (areas[i] + areas[order[1:]] - inter)
inds = np.where(ovr <= iou_thresh)[0]
order = order[inds + 1]
return keep[:max_boxes]
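# A hedged, self-contained sketch of greedy NMS in the style of py_nms. The
# function name nms and the three sample boxes are made up for illustration.
# It assumes the "+ 1" pixel convention for both the box areas and the
# intersection.

```python
import numpy as np


def nms(boxes, scores, iou_thresh=0.5, max_boxes=50):
    """Greedy NMS: keep the best box, drop boxes overlapping it too much, repeat."""
    boxes = np.asarray(boxes, np.float32)
    scores = np.asarray(scores, np.float32)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)

    order = scores.argsort()[::-1]   # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of the current best box with every remaining box.
        w = np.maximum(0.0, np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]) + 1)
        h = np.maximum(0.0, np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]) + 1)
        inter = w * h
        iou = inter / (areas[i] + areas[rest] - inter)
        # Only boxes that do not overlap the kept box too much survive.
        order = rest[iou <= iou_thresh]
    return keep[:max_boxes]


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.7]
keep = nms(boxes, scores)
print(keep)  # [0, 2]: the second box overlaps the first with IoU of about 0.70
```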
def cpu_nms(boxes, scores, num_classes, max_boxes=50, score_thresh=0.5, iou_thresh=0.5):
"""
Perform NMS on CPU.
Arguments:
boxes: shape [1, 10647, 4]
scores: shape [1, 10647, num_classes]
"""
boxes = boxes.reshape(-1, 4)
scores = scores.reshape(-1, num_classes)
# Picked bounding boxes
picked_boxes, picked_score, picked_label = [], [], []
for i in range(num_classes):
indices = np.where(scores[:,i] >= score_thresh)
filter_boxes = boxes[indices]
filter_scores = scores[:,i][indices]
if len(filter_boxes) == 0:
continue
# do non_max_suppression on the cpu
indices = py_nms(filter_boxes, filter_scores,
max_boxes=max_boxes, iou_thresh=iou_thresh)
picked_boxes.append(filter_boxes[indices])
picked_score.append(filter_scores[indices])
picked_label.append(np.ones(len(indices), dtype='int32')*i)
if len(picked_boxes) == 0:
return None, None, None
boxes = np.concatenate(picked_boxes, axis=0)
score = np.concatenate(picked_score, axis=0)
label = np.concatenate(picked_label, axis=0)
return boxes, score, label
@@ -0,0 +1,35 @@
# coding: utf-8
from __future__ import division, print_function
import cv2
import random
def get_color_table(class_num, seed=2):
    random.seed(seed)
    color_table = {}
    for i in range(class_num):
        color_table[i] = [random.randint(0, 255) for _ in range(3)]
    return color_table


def plot_one_box(img, coord, label=None, color=None, line_thickness=None):
    '''
    coord: [x_min, y_min, x_max, y_max] format coordinates.
    img: img to plot on.
    label: str. The label name.
    color: list of 3 ints in [0, 255], e.g. an entry of the table returned by
        get_color_table. A random color is used if omitted.
    line_thickness: int. rectangle line thickness.
    '''
    tl = line_thickness or int(round(0.002 * max(img.shape[0:2])))  # line thickness
    color = color or [random.randint(0, 255) for _ in range(3)]
    c1, c2 = (int(coord[0]), int(coord[1])), (int(coord[2]), int(coord[3]))
    cv2.rectangle(img, c1, c2, color, thickness=tl)
    if label:
        tf = max(tl - 1, 1)  # font thickness
        t_size = cv2.getTextSize(label, 0, fontScale=float(tl) / 3, thickness=tf)[0]
        c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
        cv2.rectangle(img, c1, c2, color, -1)  # filled
        cv2.putText(img, label, (c1[0], c1[1] - 2), 0, float(tl) / 3, [0, 0, 0], thickness=tf, lineType=cv2.LINE_AA)
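The line thickness, font thickness and font scale in `plot_one_box` are all derived from the image size. The sketch below reproduces only that arithmetic so the values can be inspected without OpenCV; the helper name `box_style` and the example image sizes are hypothetical.

```python
def box_style(img_height, img_width, line_thickness=None):
    """Return (line thickness, font thickness, font scale) for a given image size."""
    tl = line_thickness or int(round(0.002 * max(img_height, img_width)))
    tf = max(tl - 1, 1)          # font thickness is never below 1
    font_scale = float(tl) / 3   # font scale grows with the line thickness
    return tl, tf, font_scale


print(box_style(1080, 1920))  # a full-HD frame
print(box_style(200, 200))    # a small image: the derived thickness rounds to 0
```

For small images the derived thickness and font scale round down to zero, so passing an explicit `line_thickness` may be preferable in that case.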
@@ -0,0 +1,102 @@
# coding: utf-8
from __future__ import division, print_function
import tensorflow as tf
import numpy as np
import argparse
import cv2
import time
from utils.misc_utils import parse_anchors, read_class_names
from utils.nms_utils import gpu_nms
from utils.plot_utils import get_color_table, plot_one_box
from utils.data_aug import letterbox_resize
from model import yolov3
parser = argparse.ArgumentParser(description="YOLO-V3 video test procedure.")
parser.add_argument("input_video", type=str,
help="The path of the input video.")
parser.add_argument("--anchor_path", type=str, default="./data/yolo_anchors.txt",
help="The path of the anchor txt file.")
parser.add_argument("--new_size", nargs='*', type=int, default=[416, 416],
help="Resize the input image with `new_size`, size format: [width, height]")
parser.add_argument("--letterbox_resize", type=lambda x: (str(x).lower() == 'true'), default=True,
help="Whether to use the letterbox resize.")
parser.add_argument("--class_name_path", type=str, default="./data/coco.names",
help="The path of the class names.")
parser.add_argument("--restore_path", type=str, default="./data/darknet_weights/yolov3.ckpt",
help="The path of the weights to restore.")
parser.add_argument("--save_video", type=lambda x: (str(x).lower() == 'true'), default=False,
help="Whether to save the video detection results.")
args = parser.parse_args()
args.anchors = parse_anchors(args.anchor_path)
args.classes = read_class_names(args.class_name_path)
args.num_class = len(args.classes)
color_table = get_color_table(args.num_class)
vid = cv2.VideoCapture(args.input_video)
video_frame_cnt = int(vid.get(cv2.CAP_PROP_FRAME_COUNT))
video_width = int(vid.get(cv2.CAP_PROP_FRAME_WIDTH))
video_height = int(vid.get(cv2.CAP_PROP_FRAME_HEIGHT))
video_fps = int(vid.get(cv2.CAP_PROP_FPS))
if args.save_video:
    fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
    videoWriter = cv2.VideoWriter('video_result.mp4', fourcc, video_fps, (video_width, video_height))
with tf.Session() as sess:
    input_data = tf.placeholder(tf.float32, [1, args.new_size[1], args.new_size[0], 3], name='input_data')
    yolo_model = yolov3(args.num_class, args.anchors)
    with tf.variable_scope('yolov3'):
        pred_feature_maps = yolo_model.forward(input_data, False)
    pred_boxes, pred_confs, pred_probs = yolo_model.predict(pred_feature_maps)
    pred_scores = pred_confs * pred_probs
    boxes, scores, labels = gpu_nms(pred_boxes, pred_scores, args.num_class, max_boxes=200, score_thresh=0.3, nms_thresh=0.45)
    saver = tf.train.Saver()
    saver.restore(sess, args.restore_path)
    for i in range(video_frame_cnt):
        ret, img_ori = vid.read()
        if not ret:
            # Stop if a frame cannot be read, e.g. when the reported frame count is too high.
            break
        if args.letterbox_resize:
            img, resize_ratio, dw, dh = letterbox_resize(img_ori, args.new_size[0], args.new_size[1])
        else:
            height_ori, width_ori = img_ori.shape[:2]
            img = cv2.resize(img_ori, tuple(args.new_size))
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img = np.asarray(img, np.float32)
        img = img[np.newaxis, :] / 255.
        start_time = time.time()
        boxes_, scores_, labels_ = sess.run([boxes, scores, labels], feed_dict={input_data: img})
        end_time = time.time()
        # rescale the coordinates to the original image
        if args.letterbox_resize:
            boxes_[:, [0, 2]] = (boxes_[:, [0, 2]] - dw) / resize_ratio
            boxes_[:, [1, 3]] = (boxes_[:, [1, 3]] - dh) / resize_ratio
        else:
            boxes_[:, [0, 2]] *= (width_ori/float(args.new_size[0]))
            boxes_[:, [1, 3]] *= (height_ori/float(args.new_size[1]))
        # Use a separate index so the frame counter `i` is not shadowed.
        for j in range(len(boxes_)):
            x0, y0, x1, y1 = boxes_[j]
            plot_one_box(img_ori, [x0, y0, x1, y1], label=args.classes[labels_[j]] + ', {:.2f}%'.format(scores_[j] * 100), color=color_table[labels_[j]])
        cv2.putText(img_ori, '{:.2f}ms'.format((end_time - start_time) * 1000), (40, 40), 0,
                    fontScale=1, color=(0, 255, 0), thickness=2)
        cv2.imshow('image', img_ori)
        if args.save_video:
            videoWriter.write(img_ori)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    vid.release()
    if args.save_video:
        videoWriter.release()
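The letterbox branch above maps predicted boxes back to the original frame with `(coord - offset) / resize_ratio`. The sketch below works through that mapping in plain Python. `letterbox_resize` itself is not shown in this section, so the way `resize_ratio`, `dw` and `dh` are computed here (uniform scaling plus centered padding) is an assumption, and the helper names and frame size are hypothetical.

```python
def letterbox_params(ori_w, ori_h, new_w, new_h):
    """Assumed letterbox geometry: scale uniformly, then pad evenly on both sides."""
    ratio = min(new_w / ori_w, new_h / ori_h)
    resized_w, resized_h = int(ratio * ori_w), int(ratio * ori_h)
    dw = (new_w - resized_w) // 2  # horizontal padding on each side
    dh = (new_h - resized_h) // 2  # vertical padding on each side
    return ratio, dw, dh


def to_original(box, ratio, dw, dh):
    """Undo the padding offset, then the scaling, as the script does for boxes_."""
    x0, y0, x1, y1 = box
    return [(x0 - dw) / ratio, (y0 - dh) / ratio,
            (x1 - dw) / ratio, (y1 - dh) / ratio]


# Hypothetical 1664x936 frame letterboxed into a 416x416 network input.
ratio, dw, dh = letterbox_params(1664, 936, 416, 416)
restored = to_original([104.0, 149.5, 208.0, 266.5], ratio, dw, dh)
print(ratio, dw, dh, restored)
```

Because the scaling is uniform, a single ratio is used for both axes; only the padding offsets differ between x and y.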
@@ -0,0 +1,9 @@
{
"server_count": "1",
"server_list": [{
"device": [{devices}],
"server_id": "127.0.0.1"
}],
"status": "completed",
"version": "1.0"
}
@@ -0,0 +1,29 @@
#!/bin/bash
# main env
if [ -d /usr/local/Ascend/nnae/latest ];then
export LD_LIBRARY_PATH=/usr/local/:/usr/local/lib/:/usr/lib/:/usr/local/Ascend/nnae/latest/fwkacllib/lib64:/usr/local/Ascend/driver/lib64/common/:/usr/local/Ascend/driver/lib64/driver/:/usr/local/Ascend/add-ons/:/usr/local/Ascend/driver/tools/hccn_tool/:/usr/local/mpirun4.0/lib
export PYTHONPATH=$PYTHONPATH:/usr/local/Ascend/tfplugin/latest/tfplugin/python/site-packages:/usr/local/Ascend/nnae/latest/opp/op_impl/built-in/ai_core/tbe:/usr/local/Ascend/nnae/latest/fwkacllib/python/site-packages/
export PATH=$PATH:/usr/local/Ascend/nnae/latest/fwkacllib/ccec_compiler/bin:/usr/local/mpirun4.0/bin
export ASCEND_OPP_PATH=/usr/local/Ascend/nnae/latest/opp
else
export LD_LIBRARY_PATH=/usr/local/lib/:/usr/lib/:/usr/local/Ascend/ascend-toolkit/latest/fwkacllib/lib64:/usr/local/Ascend/driver/lib64/common/:/usr/local/Ascend/driver/lib64/driver/:/usr/local/Ascend/add-ons/:/usr/local/mpirun4.0/lib
export PYTHONPATH=$PYTHONPATH:/usr/local/Ascend/tfplugin/latest/tfplugin/python/site-packages:/usr/local/Ascend/ascend-toolkit/latest/opp/op_impl/built-in/ai_core/tbe:/usr/local/Ascend/ascend-toolkit/latest/fwkacllib/python/site-packages/:/usr/local/Ascend/ascend-toolkit/latest/tfplugin/python/site-packages:$projectDir
export PATH=$PATH:/usr/local/Ascend/ascend-toolkit/latest/fwkacllib/ccec_compiler/bin:/usr/local/mpirun4.0/bin
export ASCEND_OPP_PATH=/usr/local/Ascend/ascend-toolkit/latest/opp/
fi
export NEW_GE_FE_ID=1
export GE_AICPU_FLAG=1
export SOC_VERSION=Ascend910
#export DUMP_GE_GRAPH=2
#export DUMP_GRAPH_LEVEL=3
#export PRINT_MODEL=1
export SLOG_PRINT_TO_STDOUT=0
export HCCL_CONNECT_TIMEOUT=600
# system env
ulimit -c unlimited
@@ -0,0 +1,53 @@
# setting main path
MAIN_PATH=$(dirname $(readlink -f $0))
echo $MAIN_PATH
DEVICE_NUM=$1
ckpt_path=$2
#echo $1
#echo $2
# set env
export DDK_VERSION_FLAG=1.60.T49.0.B201
export NEW_GE_FE_ID=1
export GE_AICPU_FLAG=1
export SOC_VERSION=Ascend910
export JOB_ID=10087
export FUSION_TENSOR_SIZE=1000000000
export RANK_ID=yolo
#echo "device_num is $DEVICE_NUM"
for((i=0;i<${DEVICE_NUM};i++));
do
    export RANK_SIZE=$DEVICE_NUM
    export DEVICE_ID=$i
    export DEVICE_INDEX=$i
    #su HwHiAiUser -c "adc --host 0.0.0.0:22118 --log \"SetLogLevel(0)[debug]\" --device "$RANK_ID
    cd "${MAIN_PATH}/../result"
    if [ -z "${ckpt_path}" ];then
        # No checkpoint path given: fall back to the most recently modified matching result directory.
        lastresult=$(ls -t | grep -E "Train*" | head -n 1)
        RESTORE_PATH=${lastresult}/${i}/training/
    else
        lastresult=${ckpt_path}
        RESTORE_PATH=${ckpt_path}/${i}/training/
    fi
    echo "$RESTORE_PATH"
    python3.7 "${MAIN_PATH}/../code/eval.py" \
        --save_json True \
        --score_thresh 0.0001 \
        --nms_thresh 0.55 \
        --max_boxes 100 \
        --restore_path "$RESTORE_PATH" \
        --max_test 10000 \
        --save_json_path eval_res_D$DEVICE_NUM.json > "${lastresult}/eval_$i.out" 2>&1
done
@@ -0,0 +1,77 @@
#!/bin/bash
rank_size=$1
yamlPath=$2
toolsPath=$3
if [ -f /.dockerenv ];then
CLUSTER=$4
MPIRUN_ALL_IP="$5"
export CLUSTER=${CLUSTER}
fi
currentDir=$(cd "$(dirname "$0")/.."; pwd)
# Read the configuration from the yaml file
eval $(${toolsPath}/get_params_for_yaml.sh ${yamlPath} "tensorflow_config")
source ${currentDir}/config/npu_set_env.sh
if [ x"$runmode" != x"evaluate" ];then
    currtime=`date +%Y%m%d%H%M%S`
    mkdir -p ${currentDir%train*}/train/result/tf_yolov3/training_job_${currtime}/
    train_job_dir=${currentDir%train*}/train/result/tf_yolov3/training_job_${currtime}/
    echo "[`date +%Y%m%d-%H:%M:%S`] [INFO] ${train_job_dir} &"
fi
# Device list; if no devices are specified, select devices 0..rank_size-1 in order
eval device_group=\$device_group_${rank_size}p
if [ x"${device_group}" == x"" ] || [ ${rank_size} -ge 8 ];then
    device_group="$(seq 0 "$(expr $rank_size - 1)")"
fi
# Get the first device id in device_group; the hw log used for performance is taken from the directory named after it
device_group_str=`echo ${device_group} | sed 's/ //g'`
first_device_id=`echo ${device_group_str: 0:1}`
argsFilePath=${currentDir}/code/args_${mode}.py
#echo "argsFilePath is "${argsFilePath}
sed -i "0,/batch_size.*$/s//batch_size\ = ${batch_size}/g" ${argsFilePath}
sed -i "s/save_epoch.*$/save_epoch\ = ${save_epoch}/g" ${argsFilePath}
sed -i "s/total_epoches =.*$/total_epoches\ = ${total_epoches}/g" ${argsFilePath}
sed -i 's/\r//g' ${argsFilePath}
if [ x"${CLUSTER}" == x"True" ];then
    # ln hw log
    ln -snf ${train_job_dir}/0/hw_yolov3.log ${train_job_dir}
    this_ip=$(hostname -I |awk '{print $1}')
    for ip in $MPIRUN_ALL_IP;do
        if [ x"$ip" != x"$this_ip" ];then
            scp $yamlPath root@$ip:$yamlPath
            scp $argsFilePath root@$ip:$argsFilePath
        fi
    done
    export PATH=$PATH:/usr/local/mpirun4.0/bin
    # Each continued line ends with " \" so that options are not glued together.
    mpirun -H ${mpirun_ip} \
        --bind-to none -map-by slot \
        --allow-run-as-root \
        --mca btl_tcp_if_exclude lo,docker0,endvnic,virbr0,vethf40501b,docker_gwbridge,br-f42ac38052b4 \
        --prefix /usr/local/mpirun4.0/ \
        ${currentDir}/scripts/train.sh 0 $rank_size $yamlPath $currtime ${toolsPath} ${CLUSTER}
elif [ x"$runmode" == x"train" ];then
    ln -snf ${train_job_dir}/${first_device_id}/hw_yolov3.log ${train_job_dir}
    rank_id=0
    for device_id in $device_group;do
        #echo "[`date +%Y%m%d-%H:%M:%S`] [INFO] start: train ${device_id} & " >> ${currentDir}/result/main.log
        ${currentDir}/scripts/train.sh $device_id $rank_size $yamlPath $currtime ${toolsPath} $rank_id&
        let rank_id++
    done
else
    echo "[`date +%Y%m%d-%H:%M:%S`] [INFO] ${ckpt_path} &"
    # train_job_dir is only set above when runmode is not "evaluate"; skip the link otherwise.
    if [ -n "${train_job_dir}" ];then
        ln -snf ${train_job_dir}/${first_device_id}/hw_yolov3.log ${train_job_dir}
    fi
    bash ${currentDir}/scripts/eval.sh ${rank_size} ${ckpt_path}
fi
wait
#echo "[`date +%Y%m%d-%H:%M:%S`] [INFO] all train exit " >> ${currentDir}/result/main.log
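The device selection in the script above falls back to devices `0..rank_size-1` when no group is configured or when all 8 devices are used, and takes the first character of the space-stripped group as `first_device_id`. A minimal Python sketch of the same rule, with a hypothetical helper name `pick_devices` and hypothetical example groups:

```python
def pick_devices(rank_size, configured_group=""):
    """Mirror the shell logic: configured_group is a space-separated id list, possibly empty."""
    if not configured_group or rank_size >= 8:
        # No explicit group, or a full 8-device run: use devices 0..rank_size-1.
        devices = [str(d) for d in range(rank_size)]
    else:
        devices = configured_group.split()
    # The script strips spaces and takes the first character as the first device id.
    first_device_id = "".join(devices)[0]
    return devices, first_device_id


print(pick_devices(2, "4 5"))
print(pick_devices(4))
print(pick_devices(8, "4 5"))
```

Taking a single character is sufficient here only because device ids are single digits.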
@@ -0,0 +1,115 @@
#!/bin/bash
scriptDir=$(cd "$(dirname "$0")"; pwd)
mainDir=$(cd "$(dirname "$scriptDir")"; pwd)
device_id=$1
rank_size=$2
yamlPath=$3
currentDir=$(cd "$(dirname "$0")/.."; pwd)
currtime=$4
toolsPath=$5
export YAML_PATH=$3
mkdir -p ${currentDir%train*}/train/result/tf_yolov3/training_job_${currtime}/
export train_job_dir=${currentDir%train*}/train/result/tf_yolov3/training_job_${currtime}/
# Read the configuration from the yaml file
eval $(${toolsPath}/get_params_for_yaml.sh ${yamlPath} "tensorflow_config")
source ${currentDir}/config/npu_set_env.sh
# Declare variables
export REMARK_LOG_FILE=hw_yolov3.log # Benchmark log file name; must be "hw_" followed by the lowercase model name
# Add the path of the benchmark logging module
benchmark_log_path=${currentDir%atlas_benchmark-master*}/atlas_benchmark-master/utils
export PYTHONPATH=$PYTHONPATH:${benchmark_log_path}
# user env
export HCCL_CONNECT_TIMEOUT=600
export RANK_TABLE_FILE=${currentDir}/config/${rank_size}p.json
export RANK_SIZE=${rank_size}
export SLOG_PRINT_TO_STDOUT=0
export DEVICE_ID=${device_id}
export DEVICE_INDEX=${DEVICE_INDEX}
export DEVICE_INDEX=$RANK_ID
export JOB_ID=123678
export FUSION_TENSOR_SIZE=1000000000
# Quote the variables so that an unset value does not break the test expression.
if [ x"${profiling_mode}" == x"True" ];
then
    export PROFILING_MODE=true
else
    export PROFILING_MODE=false
fi
if [ x"${aicpu_profiling_mode}" == x"True" ];
then
    export AICPU_PROFILING_MODE=true
else
    export AICPU_PROFILING_MODE=false
fi
export PROFILING_OPTIONS=${profiling_options}
export FP_POINT=${fp_point}
export BP_POINT=${bp_point}
cd ${train_job_dir}
curd_dir=${currentDir%atlas_benchmark-master*}/atlas_benchmark-master/utils/atlasboost
export PYTHONPATH=$PYTHONPATH:${curd_dir}
if [ x"$6" != x"True" ];then
rank_id=$6
export RANK_ID=$6
else
device_id_mo=$(python3.7 -c "import src.tensorflow.mpi_ops as atlasboost;atlasboost.init(); \
device_id = atlasboost.local_rank();cluster_device_id = str(device_id); \
atlasboost.set_device_id(device_id);print(atlasboost.rank())")
device_id_mo=`echo $device_id_mo`
rank_id=${device_id_mo##* }
export RANK_ID=${rank_id}
device=${device_id_mo##*deviceid = }
device_id=${device%% phyid=*}
export DEVICE_ID=${device_id}
hccljson=${train_job_dir}/*.json
cp ${hccljson} ${currentDir}/config/${rank_size}p.json
fi
#mkdir exec path
mkdir -p ${train_job_dir}/${device_id}
cd ${train_job_dir}/${device_id}
num_cpus=$(getconf _NPROCESSORS_ONLN)
num_cpus_per_device=$((num_cpus/8))
PID_START=$((num_cpus_per_device*device_id))
PID_END=$((num_cpus_per_device*device_id+num_cpus_per_device-1))
startTime=`date +%Y%m%d-%H:%M:%S`
startTime_s=`date +%s`
#KERNEL_NUM=20
#PID_START=$((KERNEL_NUM * DEVICE_ID))
#PID_END=$((PID_START + KERNEL_NUM - 1))
#sleep 5
taskset -c $PID_START-$PID_END python3.7 $mainDir/code/train.py --mode $mode > ${train_job_dir}/train_${device_id}.log 2>&1
if [ $? -eq 0 ] ;then
    echo ":::ABK 1.0.0 yolov3 train success"
    echo ":::ABK 1.0.0 yolov3 train success" >> ${train_job_dir}/train_${device_id}.log
    echo ":::ABK 1.0.0 yolov3 train success" >> ${train_job_dir}/${device_id}/hw_yolov3.log
else
    echo ":::ABK 1.0.0 yolov3 train failed"
    echo ":::ABK 1.0.0 yolov3 train failed" >> ${train_job_dir}/train_${device_id}.log
    echo ":::ABK 1.0.0 yolov3 train failed" >> ${train_job_dir}/${device_id}/hw_yolov3.log
fi
endTime=$(date +%Y%m%d-%H:%M:%S)
endTime_s=$(date +%s)
# Elapsed wall-clock time, split into hours, minutes and seconds.
sumTime=$(( endTime_s - startTime_s ))
hour=$(( sumTime/3600 ))
min=$(( (sumTime-hour*3600)/60 ))
sec=$(( sumTime-hour*3600-min*60 ))
echo ${hour}:${min}:${sec}
echo ":::ABK 1.0.0 yolov3 train total time ${hour}:${min}:${sec}" >> ${train_job_dir}/${device_id}/hw_yolov3.log
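Two small calculations in the script above are easy to check by hand: the CPU core range passed to `taskset` for each device, and the split of the elapsed seconds into hours, minutes and seconds. The sketch below mirrors that arithmetic in Python; the helper names and the 192-core host are hypothetical, and the divisor of 8 devices per host is taken from the script.

```python
def cpu_range(num_cpus, device_id, devices_per_host=8):
    """Core range (inclusive) that the script pins one device's training process to."""
    per_device = num_cpus // devices_per_host
    start = per_device * device_id
    return start, start + per_device - 1


def split_elapsed(total_seconds):
    """Split a duration in seconds into (hours, minutes, seconds)."""
    hours = total_seconds // 3600
    minutes = (total_seconds - hours * 3600) // 60
    seconds = total_seconds - hours * 3600 - minutes * 60
    return hours, minutes, seconds


# Hypothetical 192-core host: device 3 is pinned to cores 72-95.
print(cpu_range(192, 3))
print(split_elapsed(3725))
```

Each device receives a disjoint, contiguous block of cores, so the training processes do not compete for the same CPUs.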