Deep learning-based approaches have gained popularity in recent years and now represent the state of the art for many machine learning applications. However, most deep learning applications manipulate static data sets that do not vary over time. This is not the case for several significant recent applications, particularly online learning on data streams, which requires real-time adaptation to the latest data context. This paper presents an architecture, derived from the theory of transfer learning, that combines Extreme Learning Machine (ELM) and Recurrent Neural Network (RNN) methods. Unlike most data stream applications, which receive high volumes of data at high velocity, the proposed method does not store and reuse previous observations. To compensate for the lack of past data, the proposed model uses a Deep Echo State Network (DeepESN) for the primary classification task. More precisely, instead of propagating the data itself, we propagate generative models from which the historical training data that is not stored can be regenerated. Before the DeepESN block, a layer of pre-trained ELMs provides generic feature engineering. The proposed method was tested on well-known benchmark data sets, where it outperformed state-of-the-art models by accuracy improvements ranging from 0.02 to 3.3 percent. The impact of data set regeneration on the learning process was also studied using different RNN approaches.
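The pipeline described above (a fixed random ELM feature layer feeding an echo state reservoir whose only trained component is a linear readout) can be sketched roughly as follows. This is a minimal single-reservoir illustration, not the paper's DeepESN implementation: the class names, hyperparameters, and toy task are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_features(X, n_hidden=64):
    """ELM-style feature layer: a random projection with fixed
    (untrained) weights followed by a nonlinearity."""
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    return np.tanh(X @ W + b)

class EchoStateNetwork:
    """Minimal leaky echo state network: a fixed random recurrent
    reservoir plus a ridge-regression readout (the only trained part)."""
    def __init__(self, n_in, n_res=200, spectral_radius=0.9,
                 leak=0.3, ridge=1e-4):
        self.W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
        W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
        # Rescale so the reservoir satisfies the echo state property.
        W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
        self.W, self.leak, self.ridge = W, leak, ridge

    def _states(self, X):
        h = np.zeros(self.W.shape[0])
        H = np.empty((len(X), self.W.shape[0]))
        for t, x in enumerate(X):
            h = (1 - self.leak) * h + self.leak * np.tanh(
                self.W_in @ x + self.W @ h)
            H[t] = h
        return H

    def fit(self, X, Y):
        H = self._states(X)
        # Closed-form ridge regression for the readout weights.
        A = H.T @ H + self.ridge * np.eye(H.shape[1])
        self.W_out = np.linalg.solve(A, H.T @ Y)
        return self

    def predict(self, X):
        return self._states(X) @ self.W_out

# Toy usage: predict the sign of a lagged sine wave from the raw wave.
T = 400
t = np.arange(T)
X = np.sin(0.1 * t).reshape(-1, 1)
Y = (np.sin(0.1 * (t - 3)) > 0).astype(float).reshape(-1, 1)

F = elm_features(X)                       # fixed ELM feature layer
esn = EchoStateNetwork(F.shape[1]).fit(F[:300], Y[:300])
acc = np.mean((esn.predict(F[300:]) > 0.5) == Y[300:])
```

A DeepESN stacks several such reservoirs, feeding each layer's state sequence into the next; the sketch above keeps a single reservoir for brevity.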