Data drift occurs naturally in data streams due to seasonality, changes in data usage,
and changes in the data generation process. Concepts modelled from these streams
drift as well. Differentiating concept drift from anomalies is important for
telling normal behaviour apart from abnormal behaviour. Existing techniques are slow
to respond and achieve poor accuracy on this differentiation task.
We take two approaches to address this problem. First, we extend an existing
sliding window algorithm to include multiple windows to model recently seen data
stream patterns, and define new parameters to compare the data streams. Second,
we study a set of optimisers and tune the parameters of a Bi-LSTM model to maximise
accuracy.
Thesis / Master of Applied Science (MASc)
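
The abstract does not specify the extended sliding window algorithm or its new parameters, so the following is only an illustrative sketch of the general idea of telling a short-lived anomaly from a persistent drift with more than one window. The function name `label_stream`, the parameters `short`, `long` and `threshold`, and the z-score rule are all assumptions made for this example and are not taken from the thesis.

```python
from collections import deque
from statistics import mean, pstdev


def label_stream(values, short=5, long=30, threshold=3.0):
    """Label each point 'normal', 'anomaly' or 'drift' (hypothetical heuristic).

    A long reference window models recent normal behaviour. A short window
    remembers whether the latest points deviated from that reference.
    An isolated deviation is reported as an anomaly; `short` consecutive
    deviations are reported as drift, and the reference is rebuilt from them.
    """
    reference = deque(maxlen=long)   # values currently considered normal
    recent = deque(maxlen=short)     # (value, is_deviant) for the latest points
    labels = []
    for x in values:
        if len(reference) < long:
            # Warm-up (also re-entered after a drift): absorb points as normal.
            reference.append(x)
            labels.append("normal")
            continue
        mu = mean(reference)
        sigma = pstdev(reference) or 1e-9  # guard against a constant window
        deviant = abs(x - mu) / sigma > threshold
        recent.append((x, deviant))
        if not deviant:
            reference.append(x)
            labels.append("normal")
        elif len(recent) == short and all(flag for _, flag in recent):
            # Persistent deviation: treat as drift and adopt the new level.
            reference.clear()
            reference.extend(value for value, _ in recent)
            recent.clear()
            labels.append("drift")
        else:
            # Deviation not (yet) confirmed as persistent.
            labels.append("anomaly")
    return labels


# Usage: a lone spike followed by a sustained shift in level.
stream = [i % 2 for i in range(30)] + [10] + [0, 1, 0, 1] + [10] * 5
print(label_stream(stream)[30:])
```

Because the labels are produced online, the first few points of a drift are reported as anomalies until the deviation has persisted for `short` points; a method tuned for responsiveness would need to shorten that delay without mislabelling isolated spikes.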
Identifier | oai:union.ndltd.org:mcmaster.ca/oai:macsphere.mcmaster.ca:11375/27330 |
Date | January 2021 |
Creators | Do, Ethan Quoc-Nam |
Contributors | Chiang, Fei, Computing and Software |
Source Sets | McMaster University |
Language | English |
Detected Language | English |
Type | Thesis |