The US road network exceeds 4.1 million miles, the largest in the world [1]. Maintaining this huge network alone costs more than $200 billion annually [2]. In 2000, the National Cooperative Highway Research Program announced the development of a new approach for the design and analysis of pavement structures: Pavement-ME (Mechanistic-Empirical) [3].
The goal of Pavement-ME is to identify the physical causes of stresses in pavement structures and calibrate them with observed pavement performance. These two elements define this approach to pavement design: the focus on physical causes is the “mechanistic” part, and using observed performance to determine relationships is the “empirical” part.
Using the mechanistic models and data, the Pavement-ME process analyzes a pavement design with respect to performance indicators that reflect the projected impact of stresses and strains on the pavement over time. For Hot Mix Asphalt pavements, these performance indicators include rutting, the International Roughness Index (IRI), alligator cracking, and longitudinal cracking.
Pavement-ME requires hundreds of inputs to predict pavement response. The model we propose here takes 11 inputs, shortlisted from that large pool using knowledge gathered from a comprehensive literature review and preliminary in-house statistical analysis. The Latin Hypercube Sampling (LHS) method [4] was adopted to select input combinations that cover the entire input domain. The pavement response for these combinations was obtained from Pavement-ME.
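For illustration, here is a minimal sketch of how such an LHS design could be generated with SciPy's qmc module (available in SciPy 1.7+). The bounds below are placeholders for illustration only, not the study's actual input ranges (only the 2-to-8-inch overlay thickness range is stated later in this article):

from scipy.stats import qmc

# 220-point Latin Hypercube design over an 11-dimensional input space.
# Bounds are illustrative placeholders, not the study's actual ranges.
l_bounds = [2, 5, 0, 3, 0, 0, 2, 10_000, 10_000, 50, 0]
u_bounds = [8, 15, 1, 10, 1, 1, 10, 100_000, 100_000, 500, 1]

sampler = qmc.LatinHypercube(d=11, seed=42)
unit_sample = sampler.random(n=220)                   # 220 points in [0, 1)^11
design = qmc.scale(unit_sample, l_bounds, u_bounds)   # mapped onto the input ranges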
NOTE: The data set is available in the corresponding GitHub repo.
This effort is part of a project sponsored by the Michigan Department of Transportation, titled "Preparation for Implementation of the Mechanistic-Empirical Pavement Design Guide in Michigan Part 3: Local Calibration and Validation of the Pavement-ME Performance Models," executed at Michigan State University [5].
Pavement-ME is a relatively new tool for state Departments of Transportation, road agencies, and pavement engineers, and there is a lot to learn about it. One demanding aspect of Pavement-ME is that it requires hundreds of inputs before an analysis can even start. Moreover, each Pavement-ME run can take from several minutes to half an hour. Together with the other complications that come with a new tool, this makes it very time-consuming for engineers to get quick estimates of pavement performance by changing input variables. Therefore, we propose a model that mimics pavement response outside Pavement-ME.
Our model will predict the four performance indicators: rutting, IRI, alligator cracking, and longitudinal cracking.
Here is the information on our data set:
# | Feature | Description
---|---|---
0 | o_thick | Overlay (new pavement) thickness
1 | o_effb | Amount of effective binder in the overlay
2 | o_pg | Overlay binder grade
3 | o_av | Overlay air voids percentage
4 | o_agg | Overlay aggregate gradation
5 | ex_rating | Existing (old) pavement condition rating
6 | ex_thick | Existing pavement thickness
7 | ex_basem | Existing pavement base modulus
8 | ex_subbasem | Existing pavement subbase modulus
9 | sub_m | Modulus of subgrade reaction
10 | climate | Michigan climate
11 | lc | Pavement response: longitudinal cracking
12 | ac | Pavement response: alligator cracking
13 | rut | Pavement response: rutting
14 | iri | Pavement response: International Roughness Index
The first 11 columns are the inputs (independent variables) and the last 4 columns are the outputs (the pavement responses we want to predict). Inputs whose names start with "o" refer to the overlay, i.e. the new pavement laid over the existing one; inputs related to the existing (old) pavement start with "ex".
We use the Pandas, NumPy, Matplotlib, and Seaborn libraries to explore our data.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
df = pd.read_csv('MEPDG_Dataset.csv')
A quick look at our dataset reveals 220 observations with no null cells. Conventionally, pavement engineers working with Pavement-ME changed one input at a time while holding all other inputs constant. Here we propose changing all inputs at the same time, which means dealing with an 11-dimensional input space. To produce representative Response Surface Models (RSMs), we therefore use Latin Hypercube Sampling (LHS) to sample the entire domain. In LHS, we specify the number of input combinations before sampling; in this case we specify 220 combinations, yielding 220 observations.
Note: This is a preliminary effort to evaluate the effectiveness of machine learning techniques in pavement predictions, so we start with a rather small sample.
df.info()
Four of the inputs are qualitative, namely o_pg (overlay binder grade), o_agg (overlay aggregate gradation), ex_rating (existing pavement condition rating), and climate.
df.head()
The rest of the inputs are quantitative. The min and max of each input are very close to whole numbers because, before sampling the domain, we set a range for each input. For example, the range for o_thick (overlay thickness) is set between 2 and 8 inches, which reflects actual practice. The ranges are predetermined to avoid unrealistic input combinations (e.g., a pavement with a negative or 100-inch thickness).
Looking at the pavement responses lc (longitudinal cracking) and ac (alligator cracking), the min is almost 0 (no cracking) while the max is ~10,000 and 60, respectively. However, the 50th percentile for both responses is very close to the min (175 and 1, respectively). This means that for most input combinations, Pavement-ME predicts very little cracking. This skew is less prominent in the other pavement responses (rut: rutting and iri: smoothness).
Finally, the inputs span different orders of magnitude: some, such as thickness, vary from 2 to 8, while others vary from 10,000 to 100,000. It is therefore essential to scale our inputs.
df.describe().transpose()
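To quantify this skew directly, one can compute the sample skewness of each response with pandas' built-in skew method (positive values indicate a long right tail):

df[['lc', 'ac', 'rut', 'iri']].skew()  # positive = right-skewed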
OVERALL GOAL: Get an understanding of which variables are important, view summary statistics, and visualize the data.
fig, axs = plt.subplots(2, 2,figsize=(12,6))
axs[0, 0].hist(x=df['lc'],bins=10)
axs[0, 0].set_title('Longitudinal Cracking')
axs[0, 1].hist(df['ac'],bins=10)
axs[0, 1].set_title('Alligator Cracking')
axs[1, 0].hist(df['rut'],bins=10)
axs[1, 0].set_title('Rutting')
axs[1, 1].hist(df['iri'],bins=10)
axs[1, 1].set_title('Smoothness')
plt.tight_layout()
The responses are heavily right-skewed; especially for alligator and longitudinal cracking, the majority of the values are small numbers close to zero. We caught this behavior in the summary statistics as well. We will have to perform some kind of transformation to deal with the skewness.
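One common remedy, shown here purely as an illustration (later in this article we use the Yeo-Johnson transform instead), is a log transform such as np.log1p, which handles zeros and compresses the long right tail:

plt.figure()
plt.hist(np.log1p(df['lc']), bins=10)  # log(1 + x), well-defined at x = 0
plt.title('Longitudinal Cracking (log1p scale)')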
Let's explore the relationships between the continuous features by computing the correlation between all continuous numeric variables.
df.corr()
We use a heatmap to visualize the correlations between the variables.

Strong correlation among the pavement responses is evident: where one type of pavement damage exists, other types are likely to exist as well.

Moreover, the correlation of overlay thickness and existing thickness with the pavement responses is relatively strong and negative: the thicker the overlay, the lower the predicted damage. This is intuitive, as we expect a thicker pavement to show less damage over time. The same argument holds for the existing pavement thickness.
plt.figure(figsize=(12,7))
sns.heatmap(df.corr(),annot=True,cmap='viridis')
plt.tight_layout()
Scatter plots are another way of showing the dispersion in the pavement responses. Having established a correlation between pavement response and overlay thickness, we plot the responses against it. Evidently, for longitudinal and alligator cracking the majority of the responses are close to zero; any overlay 3 inches or thicker appears to resist alligator cracking.
fig, axs = plt.subplots(2, 2,figsize=(12,6))
axs[0, 0].scatter(x=df['o_thick'],y=df['lc'])
axs[0, 0].set_title('Longitudinal Cracking')
axs[0, 1].scatter(x=df['o_thick'],y=df['ac'])
axs[0, 1].set_title('Alligator Cracking')
axs[1, 0].scatter(x=df['o_thick'],y=df['rut'])
axs[1, 0].set_title('Rutting')
axs[1, 1].scatter(x=df['o_thick'],y=df['iri'])
axs[1, 1].set_title('Smoothness')
plt.tight_layout()
Pavement-ME requires more than 100 inputs to predict pavement response over time. Gathering that many inputs is extremely time-consuming, and some of them are not significant and do not contribute to the final outcome. Moreover, Pavement-ME runs can take anywhere between 5 minutes and one hour per prediction. A model that provides fast predictions from fewer inputs, while grounded in Pavement-ME, could therefore be valuable to pavement engineers: they could use it for quick predictions and estimates before committing to a full Pavement-ME analysis.
Four of our inputs are categorical, so we need to convert them to dummy variables before any modeling attempt. Since each categorical variable has two classes, we use 0 to denote one class and 1 to denote the other.
dummies = pd.get_dummies(df[['o_pg','o_agg','ex_rating','climate']], drop_first=True)
dummies
Without drop_first=True, get_dummies would output two complementary columns per variable; we drop the extra one. We also no longer need the original categorical columns, so we remove them from the dataset.
df = df.drop(['o_pg','o_agg','ex_rating','climate'],axis=1)
df
df = pd.concat([df,dummies],axis=1)
df.columns
This is what our dataset looks like now:
df
We need to apply a minor name change here:
df = df.rename(columns={'o_pg_PG 76-28': 'o_pg', 'o_agg_Fine': 'o_agg', 'ex_rating_Very Poor': 'ex_rating', 'climate_Pellston': 'climate'})
df
There are 4 different pavement responses: rutting, alligator cracking, longitudinal cracking, and IRI. We therefore create four different models, one to predict each response, and start by modeling rutting.
We modify our dataset to create X and y for modeling: X is the portion of the dataset containing the inputs, and the y vectors hold the individual pavement responses.
X = df.drop(['lc','ac','rut','iri'],axis=1)
y_rut = df['rut']
y_lc = df['lc']
y_ac = df['ac']
y_iri = df['iri']
From scikit-learn we import the train_test_split function to split our dataset into training and testing sets. We use 70% of our data for training and 30% for testing.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y_rut,test_size=0.3)
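Note that train_test_split shuffles the data randomly, so the exact metric values reported below will vary from run to run. For a reproducible split, one can pass the optional random_state argument, for example:

X_train, X_test, y_train, y_test = train_test_split(X, y_rut, test_size=0.3, random_state=42)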
The inputs span different orders of magnitude: some, such as thickness, vary from 2 to 8, while others vary from 10,000 to 100,000, so scaling is essential. We use MinMaxScaler, which transforms features by scaling each feature to a given range.

This estimator scales and translates each feature individually such that it falls within the given range on the training set, e.g. between zero and one.
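Concretely, for each feature the transform is X_scaled = (X - X_min) / (X_max - X_min), where X_min and X_max come from the training data only; this is why we call fit_transform on X_train but plain transform on X_test below.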
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
X_train= scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
Due to the highly non-linear nature of pavement response models, we pick neural networks as our modeling technique.
We use TensorFlow 2.0 and the accompanying Keras API. The Keras Sequential model, a linear stack of layers, seems like a good start for our modeling purpose. We also import Dense (a regular densely connected NN layer) and Activation (which applies an activation function to an output). Finally, we use Adam as the optimizer.
Note: Adam optimization is a stochastic gradient descent method that is based on adaptive estimation of first-order and second-order moments.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.optimizers import Adam
Our sequential model consists of 5 layers (the first 4 being hidden layers and the last one the output layer), with the Rectified Linear Unit (ReLU) as the activation function in the hidden layers.

In this model we minimize the loss measured as mean squared error.
model = Sequential()
model.add(Dense(12,activation='relu'))
model.add(Dense(12,activation='relu'))
model.add(Dense(12,activation='relu'))
model.add(Dense(12,activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam',loss='mse')
We are set to train our model. We train for 400 epochs and measure the loss over the course of the training session.
model.fit(x=X_train,y=y_train.values,
validation_data=(X_test,y_test.values),epochs=400, verbose = 0)
losses = pd.DataFrame(model.history.history)
The figure below shows the loss vs. epochs. There is a sharp decline in loss, from 0.2 to 0.01, within the first 50 epochs for both the training and validation sets. This is good news for us.
losses.plot()
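As an aside, rather than hand-picking an epoch count, training could be stopped automatically once the validation loss stops improving. A minimal sketch using Keras' EarlyStopping callback (not part of the original workflow; the patience value is arbitrary):

from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss', patience=25, restore_best_weights=True)
model.fit(x=X_train, y=y_train.values,
          validation_data=(X_test, y_test.values),
          epochs=400, verbose=0, callbacks=[early_stop])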
In order to evaluate our model, we predict on the test data set and measure three metrics: mean absolute error, root mean squared error, and explained variance score.
from sklearn.metrics import mean_squared_error,mean_absolute_error,explained_variance_score
predictions = model.predict(X_test)
mean_absolute_error(y_test,predictions)
The mean absolute error of the rutting model is 0.07. The mean rutting in our dataset is 0.5, so the mean absolute error is about 14% of the mean.
Next, we compute the root mean squared error and the explained variance score. Comparing the R2 score with the explained variance score is effectively a check on the mean error: the two are equal exactly when the mean error is zero, which turns out to be the case for our model.

The mean error reflects the tendency of our estimator, i.e. whether the estimation is biased or unbiased.
np.sqrt(mean_squared_error(y_test,predictions))
explained_variance_score(y_test,predictions)
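To check this claim directly, one can compute the R2 score alongside the explained variance (r2_score ships with sklearn.metrics):

from sklearn.metrics import r2_score

r2_score(y_test, predictions)  # matches explained_variance_score when the mean error is ~0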
Moreover, we check how close our predictions are to the Pavement-ME predictions. The red line is the equality line; ideally all data points would fall on it, but prediction error produces some spread.
plt.scatter(y_test,predictions)
plt.plot(y_test,y_test,'r')
errors = y_test.values - predictions.flatten()  # flatten the (n, 1) predictions so the difference stays element-wise
Finally, we plot the distribution of the prediction errors. Ideally we would see a sharp peak at zero; in practice there is a band, with most of the errors hovering around zero.
sns.distplot(errors)
We follow the same procedure for IRI: split the data set into training and testing sets, scale X, and fit the sequential model with the Adam optimizer.
X_train, X_test, y_train, y_test = train_test_split(X,y_iri,test_size=0.3)
X_train= scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
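One caveat: calling fit again on the same model object continues training from the rutting weights rather than starting fresh. To train each response from scratch, the network can be rebuilt first; a minimal sketch of that adjustment (an editorial suggestion, not the original notebook's code):

def build_model():
    # same architecture as above: four hidden ReLU layers plus a linear output
    m = Sequential()
    for _ in range(4):
        m.add(Dense(12, activation='relu'))
    m.add(Dense(1))
    m.compile(optimizer='adam', loss='mse')
    return m

model = build_model()  # fresh weights before fitting the IRI model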
150 epochs seem enough to achieve the minimum possible loss.
model.fit(x=X_train,y=y_train.values,
validation_data=(X_test,y_test.values),epochs=150, verbose = 0)
losses = pd.DataFrame(model.history.history)
losses.plot()
The loss drops by a factor of 10, from an initial value of about 12,000 to 1,200. The training and validation losses plateau at around the same value.
predictions = model.predict(X_test)
mean_absolute_error(y_test,predictions)
The average observed IRI is 112. The mean absolute error of the model is ~10, which is less than 9% of the mean observed IRI.
np.sqrt(mean_squared_error(y_test,predictions))
explained_variance_score(y_test,predictions)
The explained variance comes out at -0.13. A negative explained variance is not that uncommon; the usual remedies are to simplify the model or to get more data. Since a look at the predictions versus the observations indicates a decent fit, the next step of this research should be gathering more data for IRI.
plt.scatter(y_test,predictions)
plt.plot(y_test,y_test,'r')
errors = y_test.values - predictions.flatten()
sns.distplot(errors)
Furthermore, the model does a good job on the prediction errors, with an average error close to zero and a relatively narrow band.
X_train, X_test, y_train, y_test = train_test_split(X,y_lc,test_size=0.3)
X_train= scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
We stick to our established procedure for creating a model for longitudinal cracking.
model.fit(x=X_train,y=y_train.values,
validation_data=(X_test,y_test.values),epochs=400, verbose = 0)
losses = pd.DataFrame(model.history.history)
losses.plot()
predictions = model.predict(X_test)
mean_absolute_error(y_test,predictions)
The model's mean absolute error is ~965, which is high compared to the mean observed value of 2,350: an error rate of about 40%.
np.sqrt(mean_squared_error(y_test,predictions))
The large RMSE also confirms the substantial prediction error.
explained_variance_score(y_test,predictions)
However, the explained variance indicates that the model does a relatively good job of capturing the trends in the data, as the figure below shows as well.
plt.scatter(y_test,predictions)
plt.plot(y_test,y_test,'r')
errors = y_test.values - predictions.flatten()
sns.distplot(errors)
The problem with longitudinal cracking is the abundance of zero or near-zero observations. One way to handle skewed data is the Box-Cox transform; however, Box-Cox works strictly with positive values, so we go with the alternative, the Yeo-Johnson transform.
from sklearn.preprocessing import PowerTransformer
pscaler = PowerTransformer(method='yeo-johnson',standardize=True, copy=True)
X_train = pscaler.fit_transform(X_train)
X_test = pscaler.transform(X_test)
model.fit(x=X_train,y=y_train.values,
validation_data=(X_test,y_test.values),epochs=200, verbose = 0)
losses = pd.DataFrame(model.history.history)
losses.plot()
predictions = model.predict(X_test)
mean_absolute_error(y_test,predictions)
As the computed errors and the explained variance show, the problem with longitudinal cracking persists. One remaining option for us is to increase the sample size by gathering more data; another, sketched at the end of this subsection, is to transform the target itself.
np.sqrt(mean_squared_error(y_test,predictions))
explained_variance_score(y_test,predictions)
plt.scatter(y_test,predictions)
plt.plot(y_test,y_test,'r')
errors = y_test.values - predictions.flatten()
sns.distplot(errors)
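As flagged above, one more option before gathering new data (our suggestion, not part of the original analysis) is to transform the skewed target itself rather than the inputs: fit on the transformed values, then invert the transform on the predictions.

y_train_log = np.log1p(y_train.values)          # compress lc's long right tail
model.fit(x=X_train, y=y_train_log,
          validation_data=(X_test, np.log1p(y_test.values)),
          epochs=200, verbose=0)
predictions = np.expm1(model.predict(X_test))   # back to the original scale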
We follow the same procedure for alligator cracking as for longitudinal cracking: first we scale the data using MinMaxScaler, then we apply the Yeo-Johnson transform.
X_train, X_test, y_train, y_test = train_test_split(X,y_ac,test_size=0.3)
X_train= scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
model.fit(x=X_train,y=y_train.values,
validation_data=(X_test,y_test.values),epochs=200, verbose = 0)
losses = pd.DataFrame(model.history.history)
losses.plot()
Whether we use MinMaxScaler or the Yeo-Johnson transform, the predictions are poor, and in this case the explained variance is very poor as well. Looking at the alligator cracking data, we find that zero-valued observations are even more prevalent here, so the model is working with only a handful of non-zero values. It is therefore imperative to collect a larger pool of data to make modeling possible; a quick check of this zero-prevalence follows the evaluation below.
predictions = model.predict(X_test)
mean_absolute_error(y_test,predictions)
np.sqrt(mean_squared_error(y_test,predictions))
explained_variance_score(y_test,predictions)
plt.scatter(y_test,predictions)
plt.plot(y_test,y_test,'r')
errors = y_test.values - predictions.flatten()
sns.distplot(errors)
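As noted above, near-zero observations dominate the alligator cracking response; a quick check (the threshold of 1 is an illustrative choice):

(df['ac'] < 1.0).mean()  # fraction of near-zero alligator cracking observations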
X_train = pscaler.fit_transform(X_train)
X_test = pscaler.transform(X_test)
model.fit(x=X_train,y=y_train.values,
validation_data=(X_test,y_test.values),epochs=200, verbose = 0)
losses = pd.DataFrame(model.history.history)
losses.plot()
predictions = model.predict(X_test)
mean_absolute_error(y_test,predictions)
np.sqrt(mean_squared_error(y_test,predictions))
explained_variance_score(y_test,predictions)
plt.scatter(y_test,predictions)
plt.plot(y_test,y_test,'r')
[1] Office of Highway Policy Information, Policy and Governmental Affairs.
[2] BidNet.com Business Insight.
[3] The Mechanistic-Empirical Pavement Design Guide, related documentation, and the latest version of its software, available online for evaluation through 30 September 2011.
[4] "Latin Hypercube Sampling: Simple Definition," Statistics How To.
[5] "Preparation for Implementation of the Mechanistic-Empirical Pavement Design Guide in Michigan Part 3: Local Calibration and Validation of the Pavement-ME Performance Models," Michigan State University.