Last Updated on November 20, 2021

Convolutional neural networks have their roots in image processing. They were first published as LeNet to recognize the MNIST handwritten digits. However, convolutional neural networks are not limited to handling images.

In this tutorial, we are going to look at an example of using a CNN for time series prediction with an application from financial markets. By way of this example, we are going to explore some techniques in using Keras for model training as well.

After completing this tutorial, you will know

- What a typical multidimensional financial data series looks like
- How a CNN can be applied to time series in a classification problem
- How to use generators to feed data to train a Keras model
- How to provide a custom metric for evaluating a Keras model

Let’s get started.

## Tutorial overview

This tutorial is divided into 7 parts; they are:

- Background of the idea
- Preprocessing of data
- Data generator
- The model
- Training, validation, and test
- Extensions
- Does it work?

## Background of the idea

In this tutorial we are following the paper titled “CNNpred: CNN-based stock market prediction using a diverse set of variables” by Ehsan Hoseinzade and Saman Haratizadeh. The data file and sample code from the author are available on GitHub.

The goal of the paper is simple: to predict the next day’s direction of the stock market (i.e., up or down compared to today), hence it is a binary classification problem. However, it is interesting to see how this problem is formulated and solved.

We have seen examples of using CNN for sequence prediction. If we consider the Dow Jones Industrial Average (DJIA) as an example, we may build a CNN with 1D convolution for prediction. This makes sense because a 1D convolution on a time series is roughly computing its moving average or, in digital signal processing terms, applying a filter to the time series. It should provide some clues about the trend.
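To make the link between 1D convolution and a moving average concrete, here is a small NumPy sketch. The series values and the equal-weight kernel are made up for illustration; a trained CNN would learn its own kernel weights instead of using fixed ones.

```python
import numpy as np

# Toy series standing in for a daily closing index (illustrative values only)
series = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# A length-3 kernel with equal weights: convolving with it averages
# every run of three consecutive points
kernel = np.ones(3) / 3
smoothed = np.convolve(series, kernel, mode="valid")

# The same moving average computed explicitly, for comparison
manual = np.array([series[i:i + 3].mean() for i in range(len(series) - 2)])

print(smoothed)
print(np.allclose(smoothed, manual))
```

With unequal weights the same operation becomes a general filter, which is what a convolutional layer learns during training.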

However, when we look at financial time series, it is common sense that some derived indicators are useful for predictions too. For example, price and volume together can provide a better clue. Other technical indicators, such as moving averages of different window sizes, are useful too. If we align all these together, we will have a table of data in which each time instance has multiple **features**, and the goal is still to predict the direction of **one** time series.

In the CNNpred paper, 82 such features are prepared for the DJIA time series.

Unlike LSTM, where there is an explicit concept of time steps, we present data as a matrix in CNN models. As shown in the table below, the features across multiple time steps are presented as a 2D array.

## Preprocessing of data

In the following, we try to implement the idea of CNNpred from scratch using TensorFlow’s Keras API. While there is a reference implementation from the author on GitHub, we reimplement it differently to illustrate some Keras techniques.

Firstly, the data are five CSV files, each for a different market index, under the `Dataset` directory of the author’s GitHub repository.

The input data has a date column and a name column to identify the ticker symbol for the market index. We can leave the date column as the time index and remove the name column. The rest are all numerical.

As we are going to predict the market direction, we first try to create the classification label. The market direction is defined as the closing index of tomorrow compared to today. If we have read the data into a pandas DataFrame, we can use `X["Close"].pct_change()` to find the percentage change, where a positive change means the market goes up. So we can shift this one time step back as our label:

```python
...
X["Target"] = (X["Close"].pct_change().shift(-1) > 0).astype(int)
```

The line of code above computes the percentage change of the closing index and aligns it with the previous day. It then converts the data into either 1 or 0 depending on whether the percentage change is positive.
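As a quick check of what the label looks like, the following sketch applies the same expression to a tiny made-up closing series. The prices and dates are invented for illustration only.

```python
import pandas as pd

# Four hypothetical closing values on consecutive days
dates = pd.date_range("2021-01-04", periods=4)
close = pd.Series([100.0, 102.0, 101.0, 105.0], index=dates, name="Close")

pct = close.pct_change()                    # change relative to the previous day
target = (pct.shift(-1) > 0).astype(int)    # tomorrow's direction, stored on today's row

print(pd.DataFrame({"Close": close, "pct_change": pct, "Target": target}))
```

Note that the last row has no “tomorrow”: the shifted value is NaN, the comparison with 0 is false, and the label silently becomes 0. With thousands of rows this single sample hardly matters, but it is worth knowing where it comes from.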

For the five data files in the directory, we read each of them as a separate pandas DataFrame and keep them in a Python dictionary:

```python
...
data = {}
for filename in os.listdir(DATADIR):
    if not filename.lower().endswith(".csv"):
        continue # read only the CSV files
    filepath = os.path.join(DATADIR, filename)
    X = pd.read_csv(filepath, index_col="Date", parse_dates=True)
    # basic preprocessing: get the name, the classification
    # Save the target variable as a column in dataframe for easier dropna()
    name = X["Name"].iloc[0]
    del X["Name"]
    cols = X.columns
    X["Target"] = (X["Close"].pct_change().shift(-1) > 0).astype(int)
    X.dropna(inplace=True)
    # Fit the standard scaler using the training dataset,
    # i.e., the first portion of the rows before the cutoff
    index = X.index[X.index < TRAIN_TEST_CUTOFF]
    index = index[:int(len(index) * TRAIN_VALID_RATIO)]
    scaler = StandardScaler().fit(X.loc[index, cols])
    # Save scale transformed dataframe
    X[cols] = scaler.transform(X[cols])
    data[name] = X
```

The result of the above code is a DataFrame for each index, in which the classification label is the column “Target” while all other columns are input features. We also normalize the data with a standard scaler.

In time series problems, it is often reasonable not to split the data into training and test sets randomly, but to set up a cutoff point such that the data before the cutoff is the training set while that afterwards is the test set. The scaling above is based on the training set but applied to the entire dataset.
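The idea of fitting the scaling on the training rows only can be sketched without scikit-learn. The toy values below are invented; the mean and the population standard deviation (`ddof=0`) are what `StandardScaler` estimates from the rows it is fitted on.

```python
import pandas as pd

# Six hypothetical daily values straddling the cutoff date
dates = pd.date_range("2016-04-18", periods=6, freq="D")
df = pd.DataFrame({"Close": [10.0, 12.0, 14.0, 20.0, 22.0, 30.0]}, index=dates)
cutoff = "2016-04-21"

# Statistics come from the rows before the cutoff only
train = df[df.index < cutoff]
mean = train.mean()
std = train.std(ddof=0)

# ... but the transformation is applied to every row
scaled = (df - mean) / std
print(scaled)
```

The training rows end up with zero mean, while the later rows do not. This is intended: the test data must not leak into the statistics used for scaling.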

## Data generator

We are not going to use all time steps at once; instead, we use a fixed length of N time steps to predict the market direction at step N+1. In this design, the window of N time steps can start from anywhere. We could simply create a large number of DataFrames with a large amount of overlap with one another. To save memory, we are going to build a data generator for training and validation, as follows:

```python
...
TRAIN_TEST_CUTOFF = '2016-04-21'
TRAIN_VALID_RATIO = 0.75

def datagen(data, seq_len, batch_size, targetcol, kind):
    "As a generator to produce samples for Keras model"
    batch = []
    while True:
        # Pick one dataframe from the pool
        key = random.choice(list(data.keys()))
        df = data[key]
        input_cols = [c for c in df.columns if c != targetcol]
        index = df.index[df.index < TRAIN_TEST_CUTOFF]
        split = int(len(index) * TRAIN_VALID_RATIO)
        if kind == 'train':
            index = index[:split]   # range for the training set
        elif kind == 'valid':
            index = index[split:]   # range for the validation set
        # Pick one position, then clip a sequence length
        while True:
            t = random.choice(index)      # pick one time step
            n = (df.index == t).argmax()  # find its position in the dataframe
            if n-seq_len+1 < 0:
                continue # can't get enough data for one sequence length
            frame = df.iloc[n-seq_len+1:n+1]
            batch.append([frame[input_cols].values, df.loc[t, targetcol]])
            break
        # if we get enough for a batch, dispatch
        if len(batch) == batch_size:
            X, y = zip(*batch)
            X, y = np.expand_dims(np.array(X), 3), np.array(y)
            yield X, y
            batch = []
```

A generator is a special function in Python that does not `return` a value but `yield`s values in iterations, such that a sequence of data is produced from it. For a generator to be used in Keras training, it is expected to `yield` a batch of input data and target. This generator is supposed to run indefinitely. Hence the generator function above is created with an infinite loop that starts with `while True`.

In each iteration, it randomly picks one DataFrame from the Python dictionary. Then, within the range of time steps of the training set (i.e., the beginning portion), we pick a random point and take N time steps using the pandas `iloc[start:end]` syntax to create an input under the variable `frame`. This DataFrame will be a 2D array. The target label is that of the last time step. The input data and the label are then appended to the list `batch`. Once we have accumulated enough samples for one batch, we dispatch it from the generator.

The last few lines of the code snippet above dispatch a batch for training or validation. We collect the list of input data (each a 2D array) as well as the list of target labels into variables `X` and `y`, then convert them into NumPy arrays so they can work with our Keras model. We need to add one more dimension to the NumPy array `X` using `np.expand_dims()` because of the design of the network model, as explained below.
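The mechanics of an endless batch generator, including the extra dimension added by `np.expand_dims()`, can be tried on synthetic data. The function name `window_batches` and the toy arrays below are hypothetical and only mirror the structure described above; they are not part of the tutorial’s code.

```python
import random
import numpy as np

def window_batches(features, labels, seq_len, batch_size, seed=0):
    "Yield (X, y) batches of randomly positioned windows, forever"
    rng = random.Random(seed)
    while True:
        X, y = [], []
        for _ in range(batch_size):
            # exclusive end position of the window; always leaves room for seq_len rows
            end = rng.randrange(seq_len, len(features) + 1)
            X.append(features[end - seq_len:end])
            y.append(labels[end - 1])   # label of the last time step in the window
        # add a trailing channel axis: (batch, seq_len, n_features, 1)
        yield np.expand_dims(np.array(X), 3), np.array(y)

# 20 time steps with 2 features each, and an alternating 0/1 label
features = np.arange(40, dtype=float).reshape(20, 2)
labels = np.arange(20) % 2

gen = window_batches(features, labels, seq_len=5, batch_size=4)
X, y = next(gen)
print(X.shape, y.shape)
```

Because the loop never ends, the caller decides how many batches to draw, which is why Keras needs to be told the number of steps per epoch later on.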

## The Model

The 2D CNN model presented in the original paper accepts an input tensor of shape $N \times m \times 1$, for N the number of time steps and m the number of features in each time step. The paper assumes $N=60$ and $m=82$.

The model comprises three convolutional layers, described as follows:

```python
...
def cnnpred_2d(seq_len=60, n_features=82, n_filters=(8,8,8), droprate=0.1):
    "2D-CNNpred model according to the paper"
    model = Sequential([
        Input(shape=(seq_len, n_features, 1)),
        Conv2D(n_filters[0], kernel_size=(1, n_features), activation="relu"),
        Conv2D(n_filters[1], kernel_size=(3,1), activation="relu"),
        MaxPool2D(pool_size=(2,1)),
        Conv2D(n_filters[2], kernel_size=(3,1), activation="relu"),
        MaxPool2D(pool_size=(2,1)),
        Flatten(),
        Dropout(droprate),
        Dense(1, activation="sigmoid")
    ])
    return model
```

and the model summary is as follows:

```
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 60, 1, 8)          664
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 58, 1, 8)          200
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 29, 1, 8)          0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 27, 1, 8)          200
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 1, 8)          0
_________________________________________________________________
flatten (Flatten)            (None, 104)               0
_________________________________________________________________
dropout (Dropout)            (None, 104)               0
_________________________________________________________________
dense (Dense)                (None, 1)                 105
=================================================================
Total params: 1,169
Trainable params: 1,169
Non-trainable params: 0
```

The first convolutional layer has 8 filters and is applied across all features in each time step. It is followed by a second convolutional layer that considers three consecutive days at once, for it is a common belief that three days can make a trend in the stock market. Its output goes through a max pooling layer, another convolutional layer, and a second max pooling layer before it is flattened into a one-dimensional array and passed to a fully-connected layer with sigmoid activation for binary classification.
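The output shapes and parameter counts of this architecture can be verified by hand. The sketch below assumes the default Keras settings used by the model (no padding, convolution stride 1, pooling stride equal to the pool size); the helper names are made up for this illustration.

```python
def conv_out(size, kernel):
    "Output length of an unpadded convolution with stride 1"
    return size - kernel + 1

def pool_out(size, pool):
    "Output length of max pooling with stride equal to the pool size"
    return size // pool

seq_len, n_features, n_filters = 60, 82, 8

# Track the (time, feature) size through the layers
h = conv_out(seq_len, 1)            # first conv, kernel (1, 82): 60
w = conv_out(n_features, 82)        # the feature axis collapses to 1
h = conv_out(h, 3)                  # second conv, kernel (3, 1): 58
h = pool_out(h, 2)                  # max pooling (2, 1): 29
h = conv_out(h, 3)                  # third conv, kernel (3, 1): 27
h = pool_out(h, 2)                  # max pooling (2, 1): 13
flat = h * w * n_filters            # flattened length

# Parameters = kernel height * kernel width * input channels * filters + biases
p_conv1 = 1 * 82 * 1 * n_filters + n_filters
p_conv2 = 3 * 1 * n_filters * n_filters + n_filters
p_conv3 = 3 * 1 * n_filters * n_filters + n_filters
p_dense = flat * 1 + 1
total = p_conv1 + p_conv2 + p_conv3 + p_dense

print(flat, p_conv1, p_conv2, p_conv3, p_dense, total)
```

The numbers agree with the summary: a flattened vector of 104 values and 664 + 200 + 200 + 105 = 1,169 parameters.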

## Training, validation, and test

That’s it for the model. The paper used MAE as the loss metric and also monitored accuracy and F1 score to determine the quality of the model. We should point out that the F1 score depends on the precision and recall ratios, which both consider the positive classification. The paper, however, considers the average of the F1 from positive and negative classification. Explicitly, it is the F1-macro metric:

$$
F_1 = \frac{1}{2}\left(
\frac{2\cdot \frac{TP}{TP+FP} \cdot \frac{TP}{TP+FN}}{\frac{TP}{TP+FP} + \frac{TP}{TP+FN}}
+
\frac{2\cdot \frac{TN}{TN+FN} \cdot \frac{TN}{TN+FP}}{\frac{TN}{TN+FN} + \frac{TN}{TN+FP}}
\right)
$$

The fraction $\frac{TP}{TP+FP}$ is the precision, with TP and FP the number of true positives and false positives. Similarly, $\frac{TP}{TP+FN}$ is the recall. The first term in the big parentheses above is the normal F1 metric that considers positive classifications. The second term is the reverse, which considers the negative classifications.
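A small worked example may help to see how the two terms of the formula are computed. The labels and predictions below are invented, and the helper `f1_from_counts` is a hypothetical name; the sketch has no guard against division by zero, which a practical implementation would need.

```python
import numpy as np

def f1_from_counts(tp, fp, fn):
    "F1 score from true positive, false positive, and false negative counts"
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Six hypothetical samples
y_true = np.array([1, 1, 1, 1, 0, 0])
y_pred = np.array([1, 1, 1, 0, 1, 0])

tp = int(np.sum((y_true == 1) & (y_pred == 1)))
tn = int(np.sum((y_true == 0) & (y_pred == 0)))
fp = int(np.sum((y_true == 0) & (y_pred == 1)))
fn = int(np.sum((y_true == 1) & (y_pred == 0)))

f1_pos = f1_from_counts(tp, fp, fn)   # first term: positive class
f1_neg = f1_from_counts(tn, fn, fp)   # second term: roles of the classes swapped
f1_macro = (f1_pos + f1_neg) / 2

print(tp, tn, fp, fn)
print(f1_pos, f1_neg, f1_macro)
```

Here the positive class scores 0.75 and the negative class 0.5, so the macro average is 0.625, noticeably lower than the positive-class F1 alone.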

While this metric is available in scikit-learn as `sklearn.metrics.f1_score()`, there is no equivalent in Keras. Hence we create our own by borrowing code from a StackExchange question:

```python
from tensorflow.keras import backend as K

def recall_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall

def precision_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    return precision

def f1_m(y_true, y_pred):
    precision = precision_m(y_true, y_pred)
    recall = recall_m(y_true, y_pred)
    return 2*((precision*recall)/(precision+recall+K.epsilon()))

def f1macro(y_true, y_pred):
    f_pos = f1_m(y_true, y_pred)
    # negative version of the data and prediction
    f_neg = f1_m(1-y_true, 1-K.clip(y_pred,0,1))
    return (f_pos + f_neg)/2
```

The training process can take hours to complete. Hence we want to save the model in the middle of the training so that we may interrupt and resume it. We can make use of the checkpoint feature in Keras:

```python
checkpoint_path = "./cp2d-{epoch}-{val_f1macro:.2f}.h5"
callbacks = [
    ModelCheckpoint(checkpoint_path,
                    monitor='val_f1macro', mode="max", verbose=0,
                    save_best_only=True, save_weights_only=False, save_freq="epoch")
]
```

We set up a filename template `checkpoint_path` and ask Keras to fill in the epoch number as well as the validation F1 score into the filename. We save the model by monitoring the validation F1 metric, and this metric is supposed to increase when the model gets better. Hence we pass `mode="max"` to it.
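To see what filename such a template produces, we can fill it in ourselves. This sketch assumes the placeholders are substituted the way Python’s `str.format` does; the epoch number and metric value are hypothetical.

```python
checkpoint_path = "./cp2d-{epoch}-{val_f1macro:.2f}.h5"

# Hypothetical values of the kind reported at the end of an epoch
example_logs = {"val_f1macro": 0.4833}
filename = checkpoint_path.format(epoch=2, **example_logs)

print(filename)   # ./cp2d-2-0.48.h5
```

The `:.2f` part rounds the metric to two decimal places, so checkpoints from different epochs can be compared at a glance from their filenames.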

It should now be trivial to train our model, as follows:

```python
seq_len = 60
batch_size = 128
n_epochs = 20
n_features = 82

model = cnnpred_2d(seq_len, n_features)
model.compile(optimizer="adam", loss="mae", metrics=["acc", f1macro])
model.fit(datagen(data, seq_len, batch_size, "Target", "train"),
          validation_data=datagen(data, seq_len, batch_size, "Target", "valid"),
          epochs=n_epochs, steps_per_epoch=400, validation_steps=10, verbose=1,
          callbacks=callbacks)
```

Two points to note in the above snippets. We supplied `"acc"` as the accuracy as well as the function `f1macro` defined above as the `metrics` parameter to the `compile()` function. Hence these two metrics will be monitored during training. Because the function is named `f1macro`, we refer to this metric in the checkpoint’s `monitor` parameter as `val_f1macro`.

Separately, in the `fit()` function, we provided the input data through the `datagen()` generator as defined above. Calling this function produces a generator, from which batches are fetched one after another during the training loop. Similarly, validation data are also provided by the generator.

Because the nature of a generator is to dispatch data indefinitely, we need to tell the training process how to define an epoch. Recall that in Keras terms, a batch is one iteration of gradient descent update. An epoch is supposed to be one cycle through all data in the dataset. The end of an epoch is the time to run validation. It is also the opportunity to run the checkpoint we defined above. As Keras has no way to infer the size of the dataset from a generator, we need to tell it how many batches it should process in one epoch using the `steps_per_epoch` parameter. Similarly, the `validation_steps` parameter tells how many batches are used in each validation step. The validation does not affect the training, but it reports the metrics we are interested in. Below is the output we see in the middle of training, in which the metrics for the training set are updated on each batch but those for the validation set are provided only at the end of an epoch:

```
Epoch 1/20
400/400 [==============================] - 43s 106ms/step - loss: 0.4062 - acc: 0.6184 - f1macro: 0.5237 - val_loss: 0.4958 - val_acc: 0.4969 - val_f1macro: 0.4297
Epoch 2/20
400/400 [==============================] - 44s 111ms/step - loss: 0.2760 - acc: 0.7489 - f1macro: 0.7304 - val_loss: 0.5007 - val_acc: 0.4984 - val_f1macro: 0.4833
Epoch 3/20
 60/400 [===>..........................] - ETA: 39s - loss: 0.2399 - acc: 0.7783 - f1macro: 0.7643
```
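The bookkeeping implied by `steps_per_epoch` and `validation_steps` is simple arithmetic, sketched below with the values used in this tutorial.

```python
batch_size = 128
steps_per_epoch = 400
validation_steps = 10
n_epochs = 20

# Samples drawn from the training generator in one "epoch"
train_samples_per_epoch = batch_size * steps_per_epoch
# Samples drawn from the validation generator at the end of each epoch
valid_samples_per_epoch = batch_size * validation_steps
# Total number of gradient descent updates over the whole training
total_updates = steps_per_epoch * n_epochs

print(train_samples_per_epoch, valid_samples_per_epoch, total_updates)
```

Since the generator picks windows at random, an epoch here is a fixed amount of work rather than an exact pass over every possible window. The validation metrics are also computed on a random sample, which explains some of the fluctuation from epoch to epoch.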

After the model has finished training, we can test it with unseen data, i.e., the test set. Instead of generating the test set randomly, we create it from the dataset in a deterministic way:

```python
def testgen(data, seq_len, targetcol):
    "Return array of all test samples"
    batch = []
    for key, df in data.items():
        input_cols = [c for c in df.columns if c != targetcol]
        # find the start of test sample
        t = df.index[df.index >= TRAIN_TEST_CUTOFF][0]
        n = (df.index == t).argmax()
        for i in range(n+1, len(df)+1):
            frame = df.iloc[i-seq_len:i]
            batch.append([frame[input_cols].values, frame[targetcol].iloc[-1]])
    X, y = zip(*batch)
    return np.expand_dims(np.array(X),3), np.array(y)

# Prepare test data
test_data, test_target = testgen(data, seq_len, "Target")

# Test the model
test_out = model.predict(test_data)
test_pred = (test_out > 0.5).astype(int)
print("accuracy:", accuracy_score(test_pred, test_target))
print("MAE:", mean_absolute_error(test_pred, test_target))
print("F1:", f1_score(test_pred, test_target))
```

The structure of the function `testgen()` resembles that of `datagen()` we defined above, except that in `datagen()` the output data’s first dimension is the number of samples in a batch, while in `testgen()` it is the entire set of test samples.

Using the model for prediction will produce a floating point number between 0 and 1, as we are using the sigmoid activation function. We convert this into 0 or 1 by using a threshold of 0.5. Then we use the functions from scikit-learn to compute the accuracy, mean absolute error, and F1 score (where accuracy is just one minus the MAE).
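The remark that accuracy is one minus the MAE holds because both the thresholded predictions and the labels are 0 or 1, so each absolute error is exactly 1 for a wrong prediction and 0 otherwise. A NumPy sketch with made-up sigmoid outputs:

```python
import numpy as np

# Hypothetical sigmoid outputs and true labels for five samples
test_out = np.array([0.9, 0.2, 0.6, 0.4, 0.7])
test_target = np.array([1, 0, 0, 0, 1])

test_pred = (test_out > 0.5).astype(int)   # threshold at 0.5

accuracy = np.mean(test_pred == test_target)
mae = np.mean(np.abs(test_pred - test_target))

print(test_pred, accuracy, mae)
```

One prediction out of five is wrong, so the accuracy is 0.8 and the MAE is 0.2. This identity applies only after thresholding; the MAE used as the training loss is computed on the raw sigmoid outputs.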

Tying all these together, the complete code is as follows:

```python
import os
import random

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D, Input
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.callbacks import ModelCheckpoint
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, f1_score, mean_absolute_error

DATADIR = "./Dataset"
TRAIN_TEST_CUTOFF = '2016-04-21'
TRAIN_VALID_RATIO = 0.75

# https://datascience.stackexchange.com/questions/45165/how-to-get-accuracy-f1-precision-and-recall-for-a-keras-model
# to implement F1 score for validation in a batch
def recall_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall

def precision_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    return precision

def f1_m(y_true, y_pred):
    precision = precision_m(y_true, y_pred)
    recall = recall_m(y_true, y_pred)
    return 2*((precision*recall)/(precision+recall+K.epsilon()))

def f1macro(y_true, y_pred):
    f_pos = f1_m(y_true, y_pred)
    # negative version of the data and prediction
    f_neg = f1_m(1-y_true, 1-K.clip(y_pred,0,1))
    return (f_pos + f_neg)/2

def cnnpred_2d(seq_len=60, n_features=82, n_filters=(8,8,8), droprate=0.1):
    "2D-CNNpred model according to the paper"
    model = Sequential([
        Input(shape=(seq_len, n_features, 1)),
        Conv2D(n_filters[0], kernel_size=(1, n_features), activation="relu"),
        Conv2D(n_filters[1], kernel_size=(3,1), activation="relu"),
        MaxPool2D(pool_size=(2,1)),
        Conv2D(n_filters[2], kernel_size=(3,1), activation="relu"),
        MaxPool2D(pool_size=(2,1)),
        Flatten(),
        Dropout(droprate),
        Dense(1, activation="sigmoid")
    ])
    return model

def datagen(data, seq_len, batch_size, targetcol, kind):
    "As a generator to produce samples for Keras model"
    batch = []
    while True:
        # Pick one dataframe from the pool
        key = random.choice(list(data.keys()))
        df = data[key]
        input_cols = [c for c in df.columns if c != targetcol]
        index = df.index[df.index < TRAIN_TEST_CUTOFF]
        split = int(len(index) * TRAIN_VALID_RATIO)
        assert split > seq_len, "Training data too small for sequence length {}".format(seq_len)
        if kind == 'train':
            index = index[:split]   # range for the training set
        elif kind == 'valid':
            index = index[split:]   # range for the validation set
        else:
            raise NotImplementedError
        # Pick one position, then clip a sequence length
        while True:
            t = random.choice(index)     # pick one time step
            n = (df.index == t).argmax() # find its position in the dataframe
            if n-seq_len+1 < 0:
                continue # this sample is not enough for one sequence length
            frame = df.iloc[n-seq_len+1:n+1]
            batch.append([frame[input_cols].values, df.loc[t, targetcol]])
            break
        # if we get enough for a batch, dispatch
        if len(batch) == batch_size:
            X, y = zip(*batch)
            X, y = np.expand_dims(np.array(X), 3), np.array(y)
            yield X, y
            batch = []

def testgen(data, seq_len, targetcol):
    "Return array of all test samples"
    batch = []
    for key, df in data.items():
        input_cols = [c for c in df.columns if c != targetcol]
        # find the start of test sample
        t = df.index[df.index >= TRAIN_TEST_CUTOFF][0]
        n = (df.index == t).argmax()
        # extract sample using a sliding window
        for i in range(n+1, len(df)+1):
            frame = df.iloc[i-seq_len:i]
            batch.append([frame[input_cols].values, frame[targetcol].iloc[-1]])
    X, y = zip(*batch)
    return np.expand_dims(np.array(X),3), np.array(y)

# Read data into pandas DataFrames
data = {}
for filename in os.listdir(DATADIR):
    if not filename.lower().endswith(".csv"):
        continue # read only the CSV files
    filepath = os.path.join(DATADIR, filename)
    X = pd.read_csv(filepath, index_col="Date", parse_dates=True)
    # basic preprocessing: get the name, the classification
    # Save the target variable as a column in dataframe for easier dropna()
    name = X["Name"].iloc[0]
    del X["Name"]
    cols = X.columns
    X["Target"] = (X["Close"].pct_change().shift(-1) > 0).astype(int)
    X.dropna(inplace=True)
    # Fit the standard scaler using the training dataset
    index = X.index[X.index < TRAIN_TEST_CUTOFF]
    index = index[:int(len(index) * TRAIN_VALID_RATIO)]
    scaler = StandardScaler().fit(X.loc[index, cols])
    # Save scale transformed dataframe
    X[cols] = scaler.transform(X[cols])
    data[name] = X

seq_len = 60
batch_size = 128
n_epochs = 20
n_features = 82

# Produce CNNpred as a binary classification problem
model = cnnpred_2d(seq_len, n_features)
model.compile(optimizer="adam", loss="mae", metrics=["acc", f1macro])
model.summary()  # print model structure to console

# Set up callbacks and fit the model
# We use custom validation score f1macro() and hence monitor for "val_f1macro"
checkpoint_path = "./cp2d-{epoch}-{val_f1macro:.2f}.h5"
callbacks = [
    ModelCheckpoint(checkpoint_path,
                    monitor='val_f1macro', mode="max",
                    verbose=0, save_best_only=True, save_weights_only=False, save_freq="epoch")
]
model.fit(datagen(data, seq_len, batch_size, "Target", "train"),
          validation_data=datagen(data, seq_len, batch_size, "Target", "valid"),
          epochs=n_epochs, steps_per_epoch=400, validation_steps=10, verbose=1, callbacks=callbacks)

# Prepare test data
test_data, test_target = testgen(data, seq_len, "Target")

# Test the model
test_out = model.predict(test_data)
test_pred = (test_out > 0.5).astype(int)
print("accuracy:", accuracy_score(test_pred, test_target))
print("MAE:", mean_absolute_error(test_pred, test_target))
print("F1:", f1_score(test_pred, test_target))
```

## Extensions

The original paper called the above model “2D-CNNpred”, and there is a version called “3D-CNNpred”. The idea is to not only consider the many features of one stock market index but also cross-compare with many market indices to help the prediction on **one index**. Referring to the table of features and time steps above, the data for one market index is presented as a 2D array. If we stack up multiple such arrays from different indices, we construct a 3D array. While the target label is the same, allowing the model to look at other markets may provide some additional information to help the prediction.
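The stacking of 2D arrays into a 3D array can be pictured with NumPy. The random numbers below merely stand in for the scaled features of five indices over one window of 60 days.

```python
import numpy as np

seq_len, n_features, n_indices = 60, 82, 5
rng = np.random.default_rng(0)

# One (time steps x features) window per market index; values are placeholders
windows = [rng.normal(size=(seq_len, n_features)) for _ in range(n_indices)]

# Stack along a new leading axis: (indices, time steps, features)
cube = np.stack(windows)
print(cube.shape)
```

Each `cube[i]` is the same 2D array that the 2D model would see for index `i`, so the 3D input is simply the five views of the same 60 days put side by side.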

Because the shape of the data changed, the convolutional network is also defined slightly differently, and the data generators need some modification accordingly as well. Below is the complete code of the 3D version, in which the changes from the previous 2D version should be self-explanatory:

```python
import os
import random

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D, Input
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.callbacks import ModelCheckpoint
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, f1_score, mean_absolute_error

DATADIR = "./Dataset"
TRAIN_TEST_CUTOFF = '2016-04-21'
TRAIN_VALID_RATIO = 0.75

# https://datascience.stackexchange.com/questions/45165/how-to-get-accuracy-f1-precision-and-recall-for-a-keras-model
# to implement F1 score for validation in a batch
def recall_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    recall = true_positives / (possible_positives + K.epsilon())
    return recall

def precision_m(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    return precision

def f1_m(y_true, y_pred):
    precision = precision_m(y_true, y_pred)
    recall = recall_m(y_true, y_pred)
    return 2*((precision*recall)/(precision+recall+K.epsilon()))

def f1macro(y_true, y_pred):
    f_pos = f1_m(y_true, y_pred)
    # negative version of the data and prediction
    f_neg = f1_m(1-y_true, 1-K.clip(y_pred,0,1))
    return (f_pos + f_neg)/2

def cnnpred_3d(seq_len=60, n_stocks=5, n_features=82, n_filters=(8,8,8), droprate=0.1):
    "3D-CNNpred model according to the paper"
    model = Sequential([
        Input(shape=(n_stocks, seq_len, n_features)),
        Conv2D(n_filters[0], kernel_size=(1,1), activation="relu", data_format="channels_last"),
        Conv2D(n_filters[1], kernel_size=(n_stocks,3), activation="relu"),
        MaxPool2D(pool_size=(1,2)),
        Conv2D(n_filters[2], kernel_size=(1,3), activation="relu"),
        MaxPool2D(pool_size=(1,2)),
        Flatten(),
        Dropout(droprate),
        Dense(1, activation="sigmoid")
    ])
    return model

def datagen(data, seq_len, batch_size, target_index, targetcol, kind):
    "As a generator to produce samples for Keras model"
    # Learn about the data's features and time axis
    input_cols = [c for c in data.columns if c[0] != targetcol]
    tickers = sorted(set(c for _,c in input_cols))
    n_features = len(input_cols) // len(tickers)
    index = data.index[data.index < TRAIN_TEST_CUTOFF]
    split = int(len(index) * TRAIN_VALID_RATIO)
    assert split > seq_len, "Training data too small for sequence length {}".format(seq_len)
    if kind == "train":
        index = index[:split]   # range for the training set
    elif kind == 'valid':
        index = index[split:]   # range for the validation set
    else:
        raise NotImplementedError
    # Infinite loop to generate a batch
    batch = []
    while True:
        # Pick one position, then clip a sequence length
        while True:
            t = random.choice(index)
            n = (data.index == t).argmax()
            if n-seq_len+1 < 0:
                continue # this sample is not enough for one sequence length
            frame = data.iloc[n-seq_len+1:n+1][input_cols]
            # convert frame with two level of indices into 3D array
            shape = (len(tickers), len(frame), n_features)
            X = np.full(shape, np.nan)
            for i,ticker in enumerate(tickers):
                X[i] = frame.xs(ticker, axis=1, level=1).values
            batch.append([X, data[targetcol][target_index][t]])
            break
        # if we get enough for a batch, dispatch
        if len(batch) == batch_size:
            X, y = zip(*batch)
            yield np.array(X), np.array(y)
            batch = []

def testgen(data, seq_len, target_index, targetcol):
    "Return array of all test samples"
    input_cols = [c for c in data.columns if c[0] != targetcol]
    tickers = sorted(set(c for _,c in input_cols))
    n_features = len(input_cols) // len(tickers)
    t = data.index[data.index >= TRAIN_TEST_CUTOFF][0]
    n = (data.index == t).argmax()
    batch = []
    for i in range(n+1, len(data)+1):
        # Clip a window of seq_len ends at row position i-1
        frame = data.iloc[i-seq_len:i]
        target = frame[targetcol][target_index].iloc[-1]
        frame = frame[input_cols]
        # convert frame with two level of indices into 3D array
        shape = (len(tickers), len(frame), n_features)
        X = np.full(shape, np.nan)
        # use a separate loop variable so the outer index i is not shadowed
        for j,ticker in enumerate(tickers):
            X[j] = frame.xs(ticker, axis=1, level=1).values
        batch.append([X, target])
    X, y = zip(*batch)
    return np.array(X), np.array(y)

# Read data into pandas DataFrames
data = {}
for filename in os.listdir(DATADIR):
    if not filename.lower().endswith(".csv"):
        continue # read only the CSV files
    filepath = os.path.join(DATADIR, filename)
    X = pd.read_csv(filepath, index_col="Date", parse_dates=True)
    # basic preprocessing: get the name, the classification
    # Save the target variable as a column in dataframe for easier dropna()
    name = X["Name"].iloc[0]
    del X["Name"]
    cols = X.columns
    X["Target"] = (X["Close"].pct_change().shift(-1) > 0).astype(int)
    X.dropna(inplace=True)
    # Fit the standard scaler using the training dataset
    index = X.index[X.index < TRAIN_TEST_CUTOFF]
    index = index[:int(len(index) * TRAIN_VALID_RATIO)]
    scaler = StandardScaler().fit(X.loc[index, cols])
    # Save scale transformed dataframe
    X[cols] = scaler.transform(X[cols])
    data[name] = X

# Transform data into 3D dataframe (multilevel columns)
for key, df in data.items():
    df.columns = pd.MultiIndex.from_product([df.columns, [key]])
data = pd.concat(data.values(), axis=1)

seq_len = 60
batch_size = 128
n_epochs = 20
n_features = 82
n_stocks = 5

# Produce CNNpred as a binary classification problem
model = cnnpred_3d(seq_len, n_stocks, n_features)
model.compile(optimizer="adam", loss="mae", metrics=["acc", f1macro])
model.summary()  # print model structure to console

# Set up callbacks and fit the model
# We use custom validation score f1macro() and hence monitor for "val_f1macro"
checkpoint_path = "./cp3d-{epoch}-{val_f1macro:.2f}.h5"
callbacks = [
    ModelCheckpoint(checkpoint_path,
                    monitor='val_f1macro', mode="max",
                    verbose=0, save_best_only=True, save_weights_only=False, save_freq="epoch")
]

model.fit(datagen(data, seq_len, batch_size, "DJI", "Target", "train"),
          validation_data=datagen(data, seq_len, batch_size, "DJI", "Target", "valid"),
          epochs=n_epochs, steps_per_epoch=400, validation_steps=10, verbose=1, callbacks=callbacks)
```
# Put together take a look at information test_data, test_target = testgen(information, seq_len, “DJI”, “Goal”)
# Take a look at the mannequin test_out = mannequin.predict(test_data) test_pred = (test_out > 0.5).astype(int) print(“accuracy:”, accuracy_score(test_pred, test_target)) print(“MAE:”, mean_absolute_error(test_pred, test_target)) print(“F1:”, f1_score(test_pred, test_target)) |
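
The conversion from a two-level column index into a 3D array, which both `datagen()` and `testgen()` perform, may be easier to follow in isolation. The sketch below uses a toy frame; the two feature names, the two tickers, and the integer values are made up for illustration and are much smaller than the real dataset.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the real features and market indices (assumptions)
features = ["Close", "Volume"]
tickers = ["DJI", "NASDAQ"]

# Columns are (feature, ticker) pairs, as built with MultiIndex.from_product
columns = pd.MultiIndex.from_product([features, tickers])
frame = pd.DataFrame(np.arange(12).reshape(3, 4), columns=columns)

# One 2D slice (time x feature) per ticker, stacked into a 3D array
X = np.full((len(tickers), len(frame), len(features)), np.nan)
for i, ticker in enumerate(tickers):
    # xs() selects every column whose second level equals the ticker
    X[i] = frame.xs(ticker, axis=1, level=1).values

print(X.shape)   # (tickers, time steps, features)
print(X[0])      # all features of the first ticker over time
```

Each `X[i]` is the same kind of time-by-feature panel that a single-market model would see, so stacking them is what makes the input three-dimensional.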

While the model above is for next-step prediction, nothing stops you from predicting k steps ahead if you replace the target label with a different calculation. This is left as an exercise for you.
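
As a starting point for that exercise, here is one way the label might be generalized. This is a sketch under assumptions: it relies on the same `Close` column used to build the `Target` column above, and the helper name `make_target` and the toy prices are made up for illustration.

```python
import pandas as pd

def make_target(close, k=1):
    """Label 1.0 if the close k rows ahead is above the current close, else 0.0.

    Hypothetical helper. The return is the same quantity as
    close.pct_change(k).shift(-k); the last k rows have no future
    price and are left as NaN so that dropna() can remove them.
    """
    future_return = close.shift(-k) / close - 1
    label = (future_return > 0).astype(float)
    return label.where(future_return.notna())

# Toy closing prices, made up for illustration
close = pd.Series([100.0, 101.0, 99.0, 102.0, 103.0, 98.0])
target_1 = make_target(close, k=1)   # next-day direction
target_2 = make_target(close, k=2)   # direction two days ahead
print(target_1.tolist())
print(target_2.tolist())
```

With `k=1` the labels agree with the next-day label in the code above, except that the final row is NaN instead of 0 because its future price is unknown.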

## Does it work?

As with all prediction projects in the financial market, it is unrealistic to expect high accuracy. The training parameters in the code above produce slightly more than 50% accuracy on the test set. While the number of epochs and the batch size are deliberately set small to save time, there should not be much room for improvement.
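
To judge whether an accuracy slightly above 50% is meaningful, it may help to compare it against a naive baseline that always predicts the more frequent class. The following is a minimal sketch in plain Python; the labels are made up, and `majority_baseline_accuracy` is a hypothetical helper rather than part of the tutorial code.

```python
from collections import Counter

def majority_baseline_accuracy(y_true):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(y_true)
    return max(counts.values()) / len(y_true)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up labels: 1 = market up the next day, 0 = market down
y_true = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1, 1, 0, 0, 0]

baseline = majority_baseline_accuracy(y_true)
model_acc = accuracy(y_true, y_pred)
print("baseline:", baseline, "model:", model_acc)
```

If up days outnumber down days in the test period, the baseline is already above 50%, so the model's accuracy is better read relative to that figure than to a coin flip.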

In the original paper, 3D-CNNpred is reported to perform better than 2D-CNNpred, yet it attains an F1 score of less than 0.6. This is already better than the three baseline models mentioned in the paper. It may be of some use, but it is not magic that can help you make money quickly.
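
For reference, the sketch below computes a macro-averaged F1 score by hand, that is, the plain average of the F1 scores of the "up" class and the "down" class. It assumes that macro averaging is what the `f1macro` metric name in the code above refers to; it is plain Python for illustration, not the Keras metric itself, and the labels are made up.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 score treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: define F1 as zero
    # Equivalent to 2 * precision * recall / (precision + recall)
    return 2 * tp / (2 * tp + fp + fn)

def f1_macro(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)

# Made-up labels: 1 = market up the next day, 0 = market down
y_true = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1, 1, 0, 0, 0]

print("F1 (up):  ", f1_for_class(y_true, y_pred, 1))
print("F1 (down):", f1_for_class(y_true, y_pred, 0))
print("F1 macro: ", f1_macro(y_true, y_pred))
```

Averaging the two classes equally means a model cannot score well simply by predicting "up" every day.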

From a machine learning perspective, here we classify a panel of data according to whether the market direction is up or down the next day. Hence, while the data is not an image, it resembles one since both are presented in the form of a 2D array. The technique of convolutional layers can therefore be applied, but we may use a different filter size to match the intuition we usually have for financial time series.
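
To make the idea of a non-square filter concrete, the NumPy sketch below first applies a filter that spans all features of a single day, then a filter that spans three consecutive days. The panel shape follows `seq_len` and `n_features` in the code above; the filter sizes `(1, n_features)` and `(3, 1)`, the random values, and the helper `conv2d_valid` are assumptions for illustration and are not a Keras API.

```python
import numpy as np

def conv2d_valid(x, kernel):
    """Naive 2D cross-correlation with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
seq_len, n_features = 60, 82
panel = rng.normal(size=(seq_len, n_features))   # time steps x features

# A (1, n_features) filter mixes all features of one day into one number
feature_filter = rng.normal(size=(1, n_features))
daily = conv2d_valid(panel, feature_filter)

# A (3, 1) filter then looks at three consecutive days (a moving average here)
time_filter = np.full((3, 1), 1 / 3)
smoothed = conv2d_valid(daily, time_filter)

print(daily.shape, smoothed.shape)
```

Unlike a square image filter, neither of these mixes neighboring feature columns by position, which fits the fact that the order of the features in a financial panel is arbitrary.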

## Further reading

The original paper is available at:

If you are new to finance applications and want to build the connection between machine learning techniques and finance, you may find this book useful:

On a similar topic, we have a previous post on using CNN for time series, but using 1D convolutional layers:

You may also find the following documentation helpful to explain some of the syntax we used above:

## Summary

In this tutorial, you discovered how a CNN model can be built for prediction in financial time series.

Specifically, you learned:

- How to create 2D convolutional layers to process the time series
- How to present the time series data in a multidimensional array so that the convolutional layers can be applied
- What a data generator for Keras model training is and how to use it
- How to monitor the performance of model training with a custom metric
- What to expect when predicting the financial market