6 Commits

Author SHA1 Message Date
Lingxi Xie 2d0082e16b Create LICENSE.md 2023-02-26 22:33:39 +08:00
Lingxi Xie 9583d28ec8 Update README.md 2023-02-26 22:30:23 +08:00
Lingxi Xie 240731f9f8 Update README.md 2023-02-26 22:29:20 +08:00
Lingxi Xie ed94885bc1 update files 2023-02-26 21:06:50 +08:00
Lingxi Xie 85a25ba942 add files via upload 2023-02-26 20:59:59 +08:00
Lingxi Xie 9b26464a4f Add files via upload 2023-02-26 20:57:02 +08:00
7 changed files with 229 additions and 9 deletions
+21
@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2023 Lingxi Xie
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
+89 -9
@@ -2,23 +2,103 @@
This is the official repository for the Pangu-Weather paper.
Resources including pseudocode and pre-trained models will be updated. Stay tuned!
[Pangu-Weather: A 3D High-Resolution Model for Fast and Accurate Global Weather Forecast](https://arxiv.org/abs/2211.02556), arXiv preprint: 2211.02556, 2022.
#### Policy of using the contents
by Kaifeng Bi, Lingxi Xie, Hengheng Zhang, Xin Chen, Xiaotao Gu and Qi Tian
All models are trained using the ERA5 dataset provided by ECMWF. Please follow their policy and note that commercial use of these models are forbidden.
Resources including pseudocode, pre-trained models, and inference code are released.
More policy to be updated.
## Installation
#### Pseudocode and how to use
The downloaded files should be organized in the following hierarchy:
To be updated.
```plain
├── root
│ ├── input_data
│ │ ├── input_surface.npy
│ │ ├── input_upper.npy
│ ├── output_data
│ ├── model_jit_cpu_1.onnx
│ ├── model_jit_cpu_3.onnx
│ ├── model_jit_cpu_6.onnx
│ ├── model_jit_cpu_24.onnx
│ ├── inference_cpu.py
│ ├── inference_gpu.py
│ ├── inference_iterative.py
```
#### Pre-trained models
If you use a CPU environment, please run:
```
pip install -r requirement_cpu.txt
```
To be updated.
If you use a GPU environment, please first confirm that the CUDA version is 11.6 and the cuDNN version is 8.2.4 for Linux or 8.5.0.96 for Windows (please see [this page](https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html) for details). Then, please run:
```
pip install -r requirement_gpu.txt
```
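Before creating a session, it can help to check which execution providers the installed onnxruntime build exposes. The sketch below is a hypothetical helper (`pick_providers` is not part of this repository) that prefers CUDA when it is available and falls back to CPU. In a real environment the list would come from `onnxruntime.get_available_providers()`; here it is hard-coded so the example runs without onnxruntime installed.

```python
def pick_providers(available, prefer_gpu=True):
    """Return an ordered provider list for onnxruntime.InferenceSession.

    `available` is a list of provider names, e.g. the result of
    onnxruntime.get_available_providers(). This helper is illustrative
    and not part of the repository.
    """
    chosen = []
    if prefer_gpu and 'CUDAExecutionProvider' in available:
        chosen.append('CUDAExecutionProvider')
    # Keep the CPU provider last so that it only serves as a fallback.
    if 'CPUExecutionProvider' in available:
        chosen.append('CPUExecutionProvider')
    if not chosen:
        raise RuntimeError('No supported execution provider in %r' % (available,))
    return chosen


# Hard-coded lists for illustration; replace them with
# onnxruntime.get_available_providers() in a real environment.
print(pick_providers(['CUDAExecutionProvider', 'CPUExecutionProvider']))
print(pick_providers(['CPUExecutionProvider']))
```

The returned list can be passed as the `providers` argument of `onnxruntime.InferenceSession`.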
#### References
## Global weather forecasting (inference) using the trained models
#### Downloading trained models
Please download the four pre-trained models (~1.1GB each) from Google Drive:
The 1-hour model: [model_jit_cpu_1.onnx](https://drive.google.com/file/d/1fg5jkiN_5dHzKb-5H9Aw4MOmfILmeY-S/view?usp=share_link)
The 3-hour model: [model_jit_cpu_3.onnx](https://drive.google.com/file/d/1EdoLlAXqE9iZLt9Ej9i-JW9LTJ9Jtewt/view?usp=share_link)
The 6-hour model: [model_jit_cpu_6.onnx](https://drive.google.com/file/d/1a4XTktkZa5GCtjQxDJb_fNaqTAUiEJu4/view?usp=share_link)
The 24-hour model: [model_jit_cpu_24.onnx](https://drive.google.com/file/d/1lweQlxcn9fG0zKNW8ne1Khr9ehRTI6HP/view?usp=share_link)
These models are stored in the ONNX format and can therefore be used from different languages such as Python, C++, C#, Java, etc.
#### Input data preparation using Python
Please prepare the input data using [numpy](https://numpy.org/). There are two files that shall be put under the `input_data` folder, namely, `input_surface.npy` and `input_upper.npy`.
`input_surface.npy` stores the input surface variables. It is a numpy array shaped (4,721,1440) where the first dimension represents the 4 surface variables (MSLP, U10, V10, T2M *in the exact order*).
`input_upper.npy` stores the upper-air variables. It is a numpy array shaped (5,13,721,1440) where the first dimension represents the 5 upper-air variables (Z, Q, T, U and V *in the exact order*), and the second dimension represents the 13 pressure levels (1000hPa, 925hPa, 850hPa, 700hPa, 600hPa, 500hPa, 400hPa, 300hPa, 250hPa, 200hPa, 150hPa, 100hPa and 50hPa *in the exact order*).
In both cases, the dimensions of 721 and 1440 represent the size along the latitude and longitude, where the numerical range is [90,-90] degrees and [0,359.75] degrees, respectively, and the spacing is 0.25 degrees. For each 721x1440 slice, the data format is exactly the same as the `.nc` file downloaded from the ERA5 official website.
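As a sketch of the grid layout described above, the following hypothetical helper (`grid_index` is not part of this repository) maps a latitude/longitude pair to the row and column of the 721x1440 grid. It assumes that row 0 corresponds to 90 degrees and column 0 to 0 degrees, as stated above, and rounds to the nearest grid point.

```python
def grid_index(lat_deg, lon_deg, spacing=0.25):
    """Map a latitude/longitude to (row, col) on the global grid.

    Rows run from 90 to -90 degrees, columns from 0 to 359.75 degrees
    (for the default spacing of 0.25 degrees).
    """
    if not -90.0 <= lat_deg <= 90.0:
        raise ValueError('latitude must be within [-90, 90]')
    n_cols = int(round(360.0 / spacing))
    row = int(round((90.0 - lat_deg) / spacing))
    # Wrap the longitude into [0, 360) so that e.g. -0.25 maps to 359.75.
    col = int(round((lon_deg % 360.0) / spacing)) % n_cols
    return row, col


print(grid_index(90.0, 0.0))       # north pole, prime meridian
print(grid_index(-90.0, 359.75))   # last row, last column
```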
Note that the numpy arrays should be in single precision (`.astype(np.float32)`), not in double precision.
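The shape and precision requirements above can be checked before running a model. The following is a minimal sketch, not code from this repository: `check_inputs` is a hypothetical helper, and the `n_lat`/`n_lon` parameters exist only so that it can be tried on small arrays.

```python
import numpy as np


def check_inputs(surface, upper, n_lat=721, n_lon=1440):
    """Validate shapes and return single-precision versions of both arrays."""
    surface = np.asarray(surface)
    upper = np.asarray(upper)
    if surface.shape != (4, n_lat, n_lon):
        raise ValueError('unexpected surface shape: %r' % (surface.shape,))
    if upper.shape != (5, 13, n_lat, n_lon):
        raise ValueError('unexpected upper-air shape: %r' % (upper.shape,))
    # copy=False avoids a copy when the data is already float32.
    return (surface.astype(np.float32, copy=False),
            upper.astype(np.float32, copy=False))


# Number of values in one full-resolution input sample (surface + upper-air).
n_values = 4 * 721 * 1440 + 5 * 13 * 721 * 1440
print('one input sample in float32: %.1f MiB' % (n_values * 4 / 2**20))
```

Storing the arrays in double precision would double this footprint.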
We support ERA5 initial fields and ECMWF initial fields (e.g., the initial fields of the HRES forecast), where the latter often leads to a slight accuracy drop (mainly for T2M because the two fields are quite different in temperature). A `.nc` file of ERA5 can be transformed into a `.npy` file using the netCDF4 package, and a `.grib` file of the ECMWF initial fields can be transformed into a `.npy` file using the pygrib package. Note that Z represents geopotential, not geopotential height, so if the raw data contains the geopotential height, it should be multiplied by 9.80665.
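The geopotential conversion mentioned above can be sketched as follows. The function name is hypothetical, and the input is assumed to be geopotential height in meters; the same expression works on a plain float or on a numpy array.

```python
STANDARD_GRAVITY = 9.80665  # factor quoted above, in m/s^2


def geopotential_from_height(height_m):
    """Convert geopotential height (m) to geopotential (m^2/s^2).

    Works element-wise on numpy arrays as well as on plain floats.
    """
    return height_m * STANDARD_GRAVITY


# A geopotential height of 5500 m, a typical order of magnitude near 500hPa.
print(geopotential_from_height(5500.0))
```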
We temporarily do not support other kinds of initial fields due to the possibly dramatic differences in the fields when Z<0.
We provide an example of converted input files, `input_surface.npy` and `input_upper.npy`, which correspond to the ERA5 initial fields at 12:00 UTC, 2018/09/27. Please download them from Google Drive:
[`input_surface.npy`](https://drive.google.com/file/d/1pj8QEVNpC1FyJfUabDpV4oU3NpSe0BkD/view?usp=share_link)
[`input_upper.npy`](https://drive.google.com/file/d/1--7xEBJt79E3oixizr8oFmK_haDE77SS/view?usp=share_link)
#### Inference
After the above steps are finished, please check `inference_cpu.py` for an example of making a 24-hour weather forecast on CPU with the 24-hour model, and `inference_gpu.py` for the GPU version.
For example, running the following command produces the 24-hour forecast in the `output_data` folder:
```
python inference_cpu.py # python inference_gpu.py for gpu environment
```
Also, `inference_iterative.py` shows an example of generating a forecast every 6 hours over one week.
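The schedule used by `inference_iterative.py` can be listed without loading any model. The sketch below assumes the pattern in that script (28 steps of 6 hours, with the 24-hour model applied on every fourth step and the 6-hour model otherwise); `forecast_schedule` is a hypothetical helper, not part of the repository.

```python
def forecast_schedule(n_steps=28, step_hours=6, long_hours=24):
    """List (lead_time_hours, model_hours) for each output step."""
    ratio = long_hours // step_hours
    schedule = []
    for i in range(n_steps):
        lead = (i + 1) * step_hours
        # Every `ratio`-th step is produced by the long-range model.
        model = long_hours if (i + 1) % ratio == 0 else step_hours
        schedule.append((lead, model))
    return schedule


for lead, model in forecast_schedule()[:5]:
    print('lead %3d h -> %d-hour model' % (lead, model))
```

Using the 24-hour model for the daily steps reduces the number of times a model is applied to its own output before reaching a given lead time.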
## Pseudocode and how to use
`pseudocode.py` contains the pseudocode that elaborates our main algorithm. It is written in Python and can be implemented using any deep learning library, e.g., PyTorch or TensorFlow.
Note that training each model requires downloading about 60TB of ERA5 data and about 3000 GPU-days of computation on V100 GPUs.
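To put the GPU-day figure into perspective, the arithmetic below converts it into wall-clock time for a few cluster sizes. The GPU counts are illustrative assumptions, and the estimate assumes perfect linear scaling, which real training runs rarely achieve.

```python
def wall_clock_days(gpu_days, n_gpus):
    """Idealized training time in days, assuming perfect linear scaling."""
    if n_gpus <= 0:
        raise ValueError('n_gpus must be positive')
    return gpu_days / n_gpus


# Hypothetical cluster sizes, for illustration only.
for n_gpus in (8, 64, 192):
    print('%4d GPUs: %.1f days' % (n_gpus, wall_clock_days(3000, n_gpus)))
```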
## License
Pangu-Weather is released under the MIT license.
Also, please note that all models were trained using the ERA5 dataset provided by ECMWF. Please do follow their policy and note that commercial use of these models is forbidden.
## References
If you use these resources in your research, please cite our paper:
+36
@@ -0,0 +1,36 @@
import os
import numpy as np
import onnx
import onnxruntime as ort
# The directory of your input and output data
input_data_dir = 'input_data'
output_data_dir = 'output_data'
# Load the 24-hour model; change the file name if your download is named differently (e.g. model_jit_cpu_24.onnx)
model_24 = onnx.load('pangu_weather_24.onnx')
# Set the behavior of onnxruntime
options = ort.SessionOptions()
options.enable_cpu_mem_arena = False
options.enable_mem_pattern = False
options.enable_mem_reuse = False
# Increase the number for faster inference and more memory consumption
options.intra_op_num_threads = 1
# Set the behavior of cuda provider
cuda_provider_options = {'arena_extend_strategy':'kSameAsRequested',}
# Initialize onnxruntime session for Pangu-Weather Models
ort_session_24 = ort.InferenceSession('pangu_weather_24.onnx', sess_options=options, providers=['CPUExecutionProvider'])
# Load the upper-air numpy arrays
input = np.load(os.path.join(input_data_dir, 'input_upper.npy')).astype(np.float32)
# Load the surface numpy arrays
input_surface = np.load(os.path.join(input_data_dir, 'input_surface.npy')).astype(np.float32)
# Run the inference session
output, output_surface = ort_session_24.run(None, {'input':input, 'input_surface':input_surface})
# Save the results
np.save(os.path.join(output_data_dir, 'output_upper'), output)
np.save(os.path.join(output_data_dir, 'output_surface'), output_surface)
+35
@@ -0,0 +1,35 @@
import os
import numpy as np
import onnx
import onnxruntime as ort
# The directory of your input and output data
input_data_dir = 'input_data'
output_data_dir = 'output_data'
# Load the 24-hour model; change the file name if your download is named differently (e.g. model_jit_cpu_24.onnx)
model_24 = onnx.load('pangu_weather_24.onnx')
# Set the behavior of onnxruntime
options = ort.SessionOptions()
options.enable_cpu_mem_arena = False
options.enable_mem_pattern = False
options.enable_mem_reuse = False
# Increase the number for faster inference and more memory consumption
options.intra_op_num_threads = 1
# Set the behavior of cuda provider
cuda_provider_options = {'arena_extend_strategy':'kSameAsRequested',}
# Initialize onnxruntime session for Pangu-Weather Models
ort_session_24 = ort.InferenceSession('pangu_weather_24.onnx', sess_options=options, providers=[('CUDAExecutionProvider', cuda_provider_options)])
# Load the upper-air numpy arrays
input = np.load(os.path.join(input_data_dir, 'input_upper.npy')).astype(np.float32)
# Load the surface numpy arrays
input_surface = np.load(os.path.join(input_data_dir, 'input_surface.npy')).astype(np.float32)
# Run the inference session
output, output_surface = ort_session_24.run(None, {'input':input, 'input_surface':input_surface})
# Save the results
np.save(os.path.join(output_data_dir, 'output_upper'), output)
np.save(os.path.join(output_data_dir, 'output_surface'), output_surface)
+42
@@ -0,0 +1,42 @@
import os
import numpy as np
import onnx
import onnxruntime as ort
# The directory of your input and output data
input_data_dir = 'input_data'
output_data_dir = 'output_data'
# Load the 24-hour and 6-hour models; change the file names if your downloads are named differently (e.g. model_jit_cpu_24.onnx)
model_24 = onnx.load('pangu_weather_24.onnx')
model_6 = onnx.load('pangu_weather_6.onnx')
# Set the behavior of onnxruntime
options = ort.SessionOptions()
options.enable_cpu_mem_arena = False
options.enable_mem_pattern = False
options.enable_mem_reuse = False
# Increase the number for faster inference and more memory consumption
options.intra_op_num_threads = 1
# Set the behavior of cuda provider
cuda_provider_options = {'arena_extend_strategy':'kSameAsRequested',}
# Initialize onnxruntime session for Pangu-Weather Models
ort_session_24 = ort.InferenceSession('pangu_weather_24.onnx', sess_options=options, providers=[('CUDAExecutionProvider', cuda_provider_options)])
ort_session_6 = ort.InferenceSession('pangu_weather_6.onnx', sess_options=options, providers=[('CUDAExecutionProvider', cuda_provider_options)])
# Load the upper-air numpy arrays
input = np.load(os.path.join(input_data_dir, 'input_upper.npy')).astype(np.float32)
# Load the surface numpy arrays
input_surface = np.load(os.path.join(input_data_dir, 'input_surface.npy')).astype(np.float32)
# Run the inference session
input_24, input_surface_24 = input, input_surface
for i in range(28):
    if (i+1) % 4 == 0:
        # Every fourth step, advance the 24-hour state with the 24-hour model
        output, output_surface = ort_session_24.run(None, {'input':input_24, 'input_surface':input_surface_24})
        input_24, input_surface_24 = output, output_surface
    else:
        # Otherwise, advance by 6 hours with the 6-hour model
        output, output_surface = ort_session_6.run(None, {'input':input, 'input_surface':input_surface})
    # The latest output is the starting point of the next 6-hour step
    input, input_surface = output, output_surface
    # You can save the results here
+3
@@ -0,0 +1,3 @@
numpy
onnx==1.13.1
onnxruntime==1.14.0
+3
@@ -0,0 +1,3 @@
numpy
onnx==1.12.0
onnxruntime-gpu==1.14.0