This is basic implementation of our KDD'20 Applied Data Science Track (Oral) paper:
Huiting Hong, Yucheng Lin, Xiaoqing Yang, Zang Li, Kung Fu, Zheng Wang, Xiaohu Qie, Jieping Ye. 2020. HetETA: Heterogeneous Information Network Embedding for Estimating Time of Arrival.
The source code is based on STGCN
The script has been tested running under Python 2.7.5, with the following packages installed (along with their dependencies):
argparse==1.1numpy==1.16.5scipy==1.2.2networkx==2.2tensorflow-gpu==1.13.1yaml==5.1.2
Here we provide the implementation of HetETA and a toy dataset.
The folder is organised as follows:
dataset/contains:make_sample.pyrandomly generates thetoy_sampledataset to help readers to figure out the input format;toy_sample/contains:adj_gap_top5.matis the vehicle-trajectories based network;adj.matis the multi-relational road network;link_info.npzis the static attributes of each road segment;dynamic_fes.npzis the dynamic feature (speed) of each road segment over time periods;eta_label.npzcontains the time it takes for a vehicle to travel through a path starting form period t.
codes/contains:data/:model/is used to save the trained model;config_*.yamlconfigures the path and paramenter settings.
model/contains the implementation of the HetETA network;utils/contains some tools for loading dataset;train.pyis used to execute a full training run on the dataset.
cd codes
python -u train.py --config data/config_HetETA_toy.yaml --model_dir data/model/HetETA_toy --dataset_dir ../dataset/toy_sample >> multi-HetETA_toy.logPlease note that the toy_sample dataset is not a real dataset, which is only used to provide examples of data formats, not to train models.
Didi Chuxing, Beijing, China.

