TRANSPORTATION NETWORK SPEED FORECASTING METHOD USING DEEP CAPSULE NETWORKS WITH NESTED LSTM MODELS

First Claim
1. A road network status prediction method based on a capsule network and a nested long short-term memory neural network, comprising the following specific steps:
 Step 1. selecting a target road network, dividing same into n road sections, and dividing time at equal interval;
Step 2. for a certain time interval t, calculating the average velocity of all vehicles passing through each road section within the time interval t;
if no vehicle passes through a certain road section a within the time interval t, replacing the average velocity with the average velocity within the previous time interval;
wherein the average velocity of the road section a within the time interval t is calculated as follows;
Abstract
This application discloses a transportation network speed forecasting method using deep capsule networks with nested LSTM models. The method includes the following steps: (1) the transport network is divided into road links, the average speed of each road link is calculated, the average speeds are mapped into a grid system, and traffic images representing the traffic state at each time interval are generated; (2) a CapsNet captures the spatial relationships between road links, and the learned patterns are represented as vectors; (3) the CapsNet vectors are fed into an NLSTM model to learn the temporal relationships between road links; (4) the model is trained on a training dataset and predicts future traffic states on a testing dataset. This application uses a new and advanced CapsNet neural structure, which can deal with complex traffic networks more efficiently than CNN models.
6 Claims
1 Specification
This application is related to transport information prediction. This is a transportation network speed forecasting method using deep capsule networks with nested LSTM models.
Transport prediction is an important transportation research topic. It predicts future traffic congestion using historical traffic data. It has become one of the most powerful tools for relieving congestion, both by providing commuters with better routing schemes and by giving traffic planners key management insights. With the prevalent installation of intelligent transportation systems (ITS) and global positioning systems (GPS) on buses, the cost of collecting data has fallen considerably compared with traditional data collection methods such as surveys and loop detectors. The resulting volume of data makes large-scale transport prediction feasible, and with it macro-level traffic control based on analysis of the congestion data.
Road traffic is inherently dynamic, complex and unstable because of the complexity of transport networks, such as the coexistence of main streams, road intersections, expressways, etc. Moreover, the quality of the data captured by ITS systems varies greatly, even though the data volume is huge. The collected data is usually highly unstructured, heterogeneous in quality, and dynamic in time and space. These characteristics make it very challenging for conventional machine learning methods to extract valuable information from the data. To address these problems, recent years have shown a trend toward employing deep learning models to analyze traffic data. Deep learning models show greater learning and generalization abilities than conventional machine learning methods by adopting deep and well-tuned model structures. They can make much more accurate predictions at the network level by mining the time-space evolution patterns of traffic from the collected big data.
However, deep learning models for traffic prediction have some limitations to date: (1) Deep learning models that construct a time series for each road segment and make predictions by mining its time evolution patterns with recurrent neural networks have low prediction accuracy, because they only consider value correlations across time for separate road segments. Traffic correlations across space are not considered in these models; (2) Convolutional deep learning models that represent traffic as images and learn time-space traffic relations through multiple convolution and pooling layers have extremely unstable prediction accuracy, which depends on the order in which road segments are placed along one dimension of the time-space image; (3) Other deep learning models introduce coordinate systems into traffic networks, treat traffic evolution across time as frames of a video, and apply convolutional and recurrent networks to mine the time-space patterns of traffic. These models ignore the graph structure of traffic networks and treat overlapping road segments (such as a bridge and the roads under it) as one, so they cannot efficiently capture traffic flows on complex traffic networks with overlapping road structures. Moreover, the square size of the coordinate system also has a great influence on the prediction accuracy of these models.
A transportation network forecasting method using deep capsule networks (CapsNet) with nested LSTM models (NLSTM) is proposed in this application to address the limitations of current practice and to efficiently mine the time-space patterns of traffic in complex traffic networks. Specifically, the model uses CapsNet to extract the spatial features of traffic networks and utilizes NLSTM to capture the hierarchical temporal dependencies in traffic sequence data. The CapsNet and NLSTM are sequentially connected to form the final model.
The model makes its predictions through the following steps.
First, a speed profile is set up for each road segment in three steps. The first step divides the traffic network into n road links. The second step discretizes the investigated time into intervals. The time interval should be neither too long nor too short, so that the traffic evolution pattern over short time periods can be captured. A natural choice of time interval is around 2 to 4 minutes. The third step calculates the average travel speed of each link at each time interval. The average travel speed V_{at} for link a∈{1, 2, ..., n} at time t is given by
V_{at}=(1/k)Σ_{i=1}^{k}V_{it}
where k is the number of cars that travel through the road link during this time interval, and V_{it} represents the average travel speed of car i.
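The averaging step, together with the fallback stated in the first claim (a link with no vehicles in an interval reuses its value from the previous interval), can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `link_average_speeds`, the input layout, and the `initial_speed` used before any observation exists are all assumptions made for the example.

```python
from typing import Dict, List


def link_average_speeds(
    speeds_by_interval: List[Dict[int, List[float]]],
    n_links: int,
    initial_speed: float = 0.0,  # hypothetical value used before any observation
) -> List[List[float]]:
    """Return V[t][a], the mean speed of link a in interval t.

    speeds_by_interval[t] maps a link index a to the speeds V_it of the
    k vehicles observed on that link during interval t.
    """
    profile: List[List[float]] = []
    previous = [initial_speed] * n_links
    for observed in speeds_by_interval:
        current = []
        for a in range(n_links):
            vehicle_speeds = observed.get(a, [])
            if vehicle_speeds:
                # V_at = (1/k) * sum of V_it over the k vehicles
                current.append(sum(vehicle_speeds) / len(vehicle_speeds))
            else:
                # No vehicle on link a: reuse the previous interval's value
                current.append(previous[a])
        profile.append(current)
        previous = current
    return profile


# Two links, two intervals; link 1 has no vehicles in the second interval.
profile = link_average_speeds(
    [{0: [30.0, 50.0], 1: [20.0]}, {0: [40.0]}], n_links=2
)
```

In the example, link 1 keeps its earlier average of 20.0 in the second interval because no vehicle was observed on it.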
Then, the mapping relationship between the average speeds and the road links is established on GIS maps.
Finally, the geographical area of the road network is meshed into squares, or coordinates. A value representing the average speed is assigned to each square as follows. For squares with no road links, the value is zero. For squares with at least one link, the value is the average speed of these links. By representing these values as the pixels of an image, images representing the traffic state of the network in all time intervals can be obtained. These images are the inputs of the proposed model. The model outputs are vectors containing the average speeds of all road links at the next time interval. Let (X, Y) denote the model inputs and outputs.
CapsNet first extracts a variety of local features of traffic speed through a primary layer. The local features are then integrated into high-level features (i.e., represented by vectors) by the final layers. The integrated features contain information not only about local time-space patterns between road links, but also about the high-level correlations between these local features. Thus, the integrated features represent the traffic patterns of the whole network while encapsulating local patterns into high-level representations.
The inputs of the NLSTM are the output vectors of CapsNet. NLSTM transforms the traditional two-layer LSTM structure into two LSTM structures connected by a gate unit. NLSTM treats the input vectors as time series in training.
The output vectors of the CapsNet model, which represent the traffic patterns of the transport network, are fed into the NLSTM model as time series to learn temporal patterns across these abstract features. NLSTM makes predictions of future traffic states (i.e., traffic speeds) through a fully connected layer. In summary, the model predicts future traffic states by learning the historical traffic patterns represented as images (in step 1).
This application has the following advantages.
This application solves the problem that the spatial structure of road links in complex traffic networks cannot be handled efficiently by traditional statistical models and machine learning models. This application represents traffic states over time as images, and utilizes a CapsNet model and an NLSTM model to learn spatial and temporal traffic patterns, respectively. The proposed model has much higher prediction accuracy than traditional methods.
This application uses a more advanced deep learning structure called CapsNet. The CapsNet model is more powerful than CNN models in handling overlapping road structures and low data resolution. CapsNet uses vector neurons instead of scalar neurons, so that more comprehensive time-space features of traffic can be preserved, such as link location, length, direction and traffic speed.
This application restructures the sequential layers of LSTM into internal and external structures and connects them with a gate unit, so that information can be passed between the internal and external memory units without the second screening process of a sequential structure. This property makes the model more stable and efficient when dealing with long-term historical information.
Compared with traditional methods, this application makes predictions not only by mining the time-space patterns of traffic, but also by targeting and analyzing complex road structures, such as overlaps between roads and bridges. This application fills a gap, since few practical methods have been proposed to handle traffic prediction for complex road structures. The tests show that the model is accurate and robust.
This application is a transportation network speed forecasting method using deep capsule networks with nested LSTM models. The implementation steps are as follows.
The selected network (
The road network is segmented by grids with a size of 0.0001°×0.0001° (latitude and longitude). The value of each grid is determined from the link speeds using the following criteria: if no link passes through the grid area, the value is zero; if only one link passes through the grid area, the value is the speed of this link; if multiple links pass through the same grid area, the value is the average speed of all these links.
On the basis of the above process, each grid is taken as a pixel with one channel, whose value is the projected velocity value. Sequences of images are generated as data samples, and the time interval in these sequences is 2 minutes. These images not only represent the traffic state but also contain the spatial structure of the road network and the relative topology among different links.
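The three grid criteria can be illustrated with a short NumPy sketch. It assumes the set of grid cells crossed by each link has already been derived from the GIS map; the function name `rasterize_speeds` and the `link_cells` layout are hypothetical and chosen only for this example.

```python
import numpy as np


def rasterize_speeds(link_cells, link_speeds, grid_shape):
    """Build one single-channel traffic image for one time interval.

    link_cells[a] lists the (row, col) grid cells crossed by link a
    (assumed to be precomputed from the GIS map).
    link_speeds[a] is the average speed of link a in this interval.
    """
    total = np.zeros(grid_shape, dtype=float)
    count = np.zeros(grid_shape, dtype=int)
    for a, cells in enumerate(link_cells):
        for r, c in set(cells):  # count each link at most once per cell
            total[r, c] += link_speeds[a]
            count[r, c] += 1
    image = np.zeros(grid_shape, dtype=float)
    # Cells with no link stay 0; other cells get the mean speed of their links.
    np.divide(total, count, out=image, where=count > 0)
    return image


# Two links sharing cell (0, 1): that cell holds the mean of 30 and 50.
image = rasterize_speeds(
    link_cells=[[(0, 0), (0, 1)], [(0, 1), (1, 1)]],
    link_speeds=[30.0, 50.0],
    grid_shape=(2, 2),
)
```

Cell (1, 0) is crossed by no link and stays at zero, cells (0, 0) and (1, 1) take the speed of their single link, and the shared cell (0, 1) takes the mean of both links.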
The model input is a two-dimensional vector containing the traffic states in the last 15 time intervals (i.e., 30 minutes). The model output is a vector containing the traffic states of all road links in the following 3 time intervals (i.e., 6 minutes). One training sample of the model is represented as s=[(x_{1}, x_{2}, ..., x_{15}), (y_{1}, y_{2}, y_{3})], where {x_{i}}_{i=1}^{15} represents the traffic states observed in the last 15 time intervals and (y_{1}, y_{2}, y_{3}) represents the traffic states in the 3 future time intervals. The implementation uses data from Jun. 1, 2015 to Jun. 30, 2015 as the training set, and data from Aug. 1, 2015 to Aug. 14, 2015 as the test set. Traffic data between 6:00 AM and 10:00 PM is used, so there are 481 samples every day.
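A sample of the form s=[(x_{1}, ..., x_{15}), (y_{1}, y_{2}, y_{3})] can be cut from a day of data with a sliding window. The sketch below is one plausible way to do this and assumes a stride of one interval between consecutive samples, which the text does not state; `make_samples` and the array layouts are hypothetical names for illustration.

```python
import numpy as np


def make_samples(images, link_speeds, n_in=15, n_out=3):
    """Cut (input, target) pairs from one continuous sequence.

    images:      array of shape (T, H, W), one traffic image per interval.
    link_speeds: array of shape (T, n_links), average speed per link.
    Returns X of shape (N, n_in, H, W) and Y of shape (N, n_out, n_links).
    A stride of one interval between samples is an assumption.
    """
    X, Y = [], []
    for start in range(len(images) - n_in - n_out + 1):
        X.append(images[start:start + n_in])
        Y.append(link_speeds[start + n_in:start + n_in + n_out])
    return np.stack(X), np.stack(Y)


# Toy sequence: 20 intervals, 4x4 images, 5 links.
rng = np.random.default_rng(0)
images = rng.uniform(0.0, 60.0, size=(20, 4, 4))
link_speeds = rng.uniform(0.0, 60.0, size=(20, 5))
X, Y = make_samples(images, link_speeds)
```

With 20 intervals, 15 inputs and 3 targets, the window fits in three positions, so the toy example yields three samples.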
CapsNet is a new type of NN structure. It replaces scalar neurons in the CNN with vector neurons, so that much more comprehensive traffic information can be kept, such as rotation angle, direction, and size of local features. In addition, CapsNet can retain all the extracted local features by replacing the pooling operation with a dynamic routing operation between capsule layers. Thus, CapsNet has greater learning ability than CNN because it keeps spatial relationships among road links.
CapsNet is composed of primary capsule layers (PrimaryCaps) and fully connected layers (TrafficCaps). The implementation of CapsNet is shown in
where v_{j }is the output vector, and s_{j }is the input vector. The squashing operation ensures that the short vectors shrink to approximately zero length and long vectors shrink to a length slightly below 1. Thus, the length of the output vector of a capsule can represent the probability of the existence of the extracted local features.
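The squashing equation itself is not reproduced in this text. Assuming it is the standard CapsNet squashing function, v_{j}=(||s_{j}||^{2}/(1+||s_{j}||^{2}))·(s_{j}/||s_{j}||), which matches the behavior described above, a minimal NumPy sketch looks like this. The small `eps` term is an addition for numerical safety and is not part of the description.

```python
import numpy as np


def squash(s, axis=-1, eps=1e-9):
    """Standard CapsNet squashing (assumed form): keeps the direction of s
    and maps its length into the interval [0, 1)."""
    sq_norm = np.sum(s * s, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * s / np.sqrt(sq_norm + eps)  # eps avoids division by zero


# A short vector shrinks toward zero; a long one ends up just below length 1.
short_len = float(np.linalg.norm(squash(np.array([0.01, 0.0]))))
long_len = float(np.linalg.norm(squash(np.array([100.0, 0.0]))))
```

Because the output length always lies between 0 and 1, it can be read as the probability that the feature represented by the capsule is present.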
In the convolution layers, the value of each neuron is the activated weighted sum of the neurons in the preceding layer. The network is trained using backpropagation. The structure of the CapsNet is discussed as follows.
First, to obtain the spatial relationship between the local features of the network-level traffic state extracted by the primary layer and the advanced features, an affine transformation is performed by multiplying the local features by a weight matrix W_{ij}.
û_{ji}=W_{ij}u_{i}, (2)
where u_{i }is the local features extracted by a primary capsule i, and û_{ji }is the input vector associated with an advanced capsule j.
Then, the input s_{j} to an advanced capsule j is the weighted sum over all input vectors û_{ji} from the primary capsule layer.
s_{j}=Σ_{i}c_{ij}û_{ji} (3)
where the weights c_{ij} are the coupling coefficients, which are determined by an iterative dynamic routing algorithm. The essence of the dynamic routing algorithm is to find the subset of primary capsules that is highly correlated with each advanced capsule, that is, to determine the local features that have a high probability of being associated with the high-level feature. This process represents the capability of the model to explore the spatial relationships among distant links. The dynamic routing algorithm is described as follows.
1) For each primary capsule i in the primary capsule layer, the coupling coefficients c_{ij} with all the advanced capsules j are normalized to sum to 1 by using a SoftMax function:
c_{ij}=exp(b_{ij})/Σ_{k}exp(b_{ik})
where the routing logit b_{ij} is the log prior probability that capsule i should be coupled to capsule j, and the output c_{ij} represents the normalized probability that primary capsule i is associated with advanced capsule j. In the first iteration, the initial value of each routing logit b_{ij} is set to zero, so that a primary capsule is accepted by every advanced capsule with equal probability.
2) After the weights c_{ij} are calculated for all the primary capsules, the input s_{j} to each advanced capsule j is computed as the weighted sum in Equation (3).
3) The input vector to advanced capsule layer is activated by a squashing function. The output is v_{j}.
4) Updating b_{ij }on the basis of the following rule:
b_{ij}=b_{ij}+û_{ji}·v_{j}.
Routing logit b_{ij }is updated by using the dot product of the input to capsule j and its output. In the field of mathematics, the dot product becomes large for similar vectors. Therefore, the corresponding routing logit increases when the input and output are similar; thus, the primary capsule is coupled to the advanced capsule with a similar output. This process represents the association of local features with the highlevel feature.
5) Repeating Steps 1 to 4 to obtain the optimal routing weights. The dynamic routing algorithm is easy to optimize, and experiments show that the CapsNet model can be optimized with three routing iterations on the training dataset.
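Steps 1 to 5, together with Equations (2) and (3), can be sketched in NumPy as below. This is an illustrative sketch under stated assumptions rather than the applicant's code: the squashing function is taken to be the standard CapsNet form, the array shapes and the names `dynamic_routing`, `softmax` and `squash` are chosen for the example, and the weights are random instead of learned.

```python
import numpy as np


def squash(s, eps=1e-9):
    """Assumed standard CapsNet squashing along the last axis."""
    sq_norm = np.sum(s * s, axis=-1, keepdims=True)
    return sq_norm / (1.0 + sq_norm) * s / np.sqrt(sq_norm + eps)


def softmax(b, axis):
    e = np.exp(b - b.max(axis=axis, keepdims=True))  # shifted for stability
    return e / e.sum(axis=axis, keepdims=True)


def dynamic_routing(u_hat, n_iter=3):
    """u_hat[i, j] is the prediction vector û_{ji} from primary capsule i
    to advanced capsule j; shape (n_primary, n_advanced, dim)."""
    n_primary, n_advanced, _ = u_hat.shape
    b = np.zeros((n_primary, n_advanced))            # routing logits start at 0
    for _ in range(n_iter):
        c = softmax(b, axis=1)                       # step 1: sums to 1 over j
        s = np.einsum("ij,ijd->jd", c, u_hat)        # step 2: Equation (3)
        v = squash(s)                                # step 3: squashing
        b = b + np.einsum("ijd,jd->ij", u_hat, v)    # step 4: b_ij += û_ji · v_j
    return v, c


# Equation (2): û_ji = W_ij u_i, here with random (untrained) weights.
rng = np.random.default_rng(0)
u = rng.normal(size=(6, 8))            # 6 primary capsules, 8-dimensional
W = rng.normal(size=(6, 2, 4, 8))      # maps to 2 advanced capsules, 4-dimensional
u_hat = np.einsum("ijkd,id->ijk", W, u)
v, c = dynamic_routing(u_hat, n_iter=3)
```

The returned `c` holds the coupling coefficients used in the last iteration, so each row still sums to 1 across the advanced capsules.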
\tilde{I}_{t}=\tilde{\sigma}_{i}(\tilde{x}_{t}\tilde{W}_{xi}+\tilde{h}_{t-1}\tilde{W}_{hi}+\tilde{b}_{i})
\tilde{f}_{t}=\tilde{\sigma}_{f}(\tilde{x}_{t}\tilde{W}_{xf}+\tilde{h}_{t-1}\tilde{W}_{hf}+\tilde{b}_{f})
\tilde{c}_{t}=\tilde{f}_{t}\odot\tilde{c}_{t-1}+\tilde{I}_{t}\odot\tilde{\sigma}_{c}(\tilde{x}_{t}\tilde{W}_{xc}+\tilde{h}_{t-1}\tilde{W}_{hc}+\tilde{b}_{c})
\tilde{o}_{t}=\tilde{\sigma}_{o}(\tilde{x}_{t}\tilde{W}_{xo}+\tilde{h}_{t-1}\tilde{W}_{ho}+\tilde{b}_{o})
\tilde{h}_{t}=\tilde{o}_{t}\odot\tilde{\sigma}_{h}(\tilde{c}_{t})
where \tilde{x}_{t} and \tilde{h}_{t-1} are the inputs of the internal LSTM unit. They are calculated as
\tilde{x}_{t}=I_{t}\odot\sigma_{c}(x_{t}W_{xc}+h_{t-1}W_{hc}+b_{c})
\tilde{h}_{t-1}=f_{t}\odot c_{t-1}
where \tilde{I}_{t}, \tilde{f}_{t}, and \tilde{o}_{t} are the states of the three gates; \tilde{c}_{t} is the cell input state; \tilde{W}_{xi}, \tilde{W}_{xf}, \tilde{W}_{xo}, and \tilde{W}_{xc} are the weight matrices that connect \tilde{x}_{t} to the three gates and the cell input; \tilde{W}_{hi}, \tilde{W}_{hf}, \tilde{W}_{ho}, and \tilde{W}_{hc} are the weight matrices that connect \tilde{h}_{t-1} to the three gates and the cell input; \tilde{b}_{i}, \tilde{b}_{f}, \tilde{b}_{o}, and \tilde{b}_{c} are the biases of the three gates and the cell input; \sigma represents the sigmoid function; and \odot represents the element-wise product of two vectors.
For the external LSTM unit, only the cell state update rule is changed: the cell state becomes the output of the internal LSTM, i.e., c_{t}=\tilde{h}_{t}.
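One time step of the nested unit can be sketched in NumPy from the equations above. Several points are assumptions, since the text does not spell them out: the external gates I_{t}, f_{t} and o_{t} are taken to be ordinary LSTM gates, the external hidden state is taken to be h_{t}=o_{t}⊙σ_{h}(c_{t}), and tanh is used for the cell-input and output activations σ_{c} and σ_{h}, as is common practice, although the text only names the sigmoid. All function and parameter names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(n_in, n_hidden, rng):
    """Random weights W_x*, W_h* and zero biases b_* for gates i, f, o and cell input c."""
    p = {}
    for g in "ifoc":
        p["Wx" + g] = rng.normal(scale=0.1, size=(n_in, n_hidden))
        p["Wh" + g] = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
        p["b" + g] = np.zeros(n_hidden)
    return p


def affine(p, g, x, h, act):
    """act(x W_xg + h W_hg + b_g) for gate or cell input g."""
    return act(x @ p["Wx" + g] + h @ p["Wh" + g] + p["b" + g])


def nlstm_step(x, h_prev, c_prev, c_inner_prev, outer, inner):
    # External gates (assumed to be standard LSTM gates).
    i = affine(outer, "i", x, h_prev, sigmoid)
    f = affine(outer, "f", x, h_prev, sigmoid)
    o = affine(outer, "o", x, h_prev, sigmoid)
    # Inputs of the internal unit, following the two equations in the text.
    x_in = i * affine(outer, "c", x, h_prev, np.tanh)
    h_in = f * c_prev
    # Internal LSTM unit.
    i_in = affine(inner, "i", x_in, h_in, sigmoid)
    f_in = affine(inner, "f", x_in, h_in, sigmoid)
    o_in = affine(inner, "o", x_in, h_in, sigmoid)
    c_inner = f_in * c_inner_prev + i_in * affine(inner, "c", x_in, h_in, np.tanh)
    h_inner = o_in * np.tanh(c_inner)
    # External cell state is the internal output: c_t = h~_t.
    c = h_inner
    h = o * np.tanh(c)  # assumed external output rule
    return h, c, c_inner


rng = np.random.default_rng(0)
n_in, n_hidden = 5, 4
outer = init_params(n_in, n_hidden, rng)
inner = init_params(n_hidden, n_hidden, rng)  # internal inputs are n_hidden wide
state = np.zeros(n_hidden)
h, c, c_inner = nlstm_step(rng.normal(size=n_in), state, state, state, outer, inner)
```

The internal unit takes the place of the additive cell update of a plain LSTM, which is why its parameters are sized by the hidden width rather than the input width.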
The final model connects the CapsNet model and the NLSTM model sequentially, and adds a fully connected layer at the end. The structure of the final model is as follows.
The deep learning model is implemented with the Keras framework and is trained on a server with 8 NVIDIA GeForce Titan X GPUs (12 GB RAM).
By feeding the testing dataset into the trained model, the traffic states of the next six minutes can be predicted from the preceding 30 minutes of data. The MSE and MAPE are calculated as follows.
where ŷ_{i} is the predicted value and y_{i} is the true value. The prediction accuracy is demonstrated as follows.
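The two error measures are not written out in this text. Assuming the usual definitions, MSE as the mean of (y_{i}-ŷ_{i})^{2} and MAPE as the mean of |y_{i}-ŷ_{i}|/y_{i}, they can be computed as in the sketch below. Whether the original reports MAPE as a fraction or as a percentage is not stated; the sketch returns a fraction and requires nonzero true values.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error (assumed standard definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (assumed standard
    definition); every true value must be nonzero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)))


# Toy speeds: errors of 4 and 5 on true values of 40 and 50.
example_mse = mse([40.0, 50.0], [44.0, 45.0])
example_mape = mape([40.0, 50.0], [44.0, 45.0])
```

Since grid cells without links are zero by construction, the measures are best computed on link speeds rather than on raw pixel values, so that the MAPE denominator is never zero.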
The results show that the proposed model generates the lowest MSEs and MAPEs under all circumstances, suggesting that it can mine traffic patterns efficiently and is accurate and stable in traffic state prediction.