Feature selection, or variable selection, is an important step in many machine learning tasks. In the traditional approach, users specify the number of features to be selected, and an algorithm then selects them using scores such as the Joint Mutual Information (JMI). If users do not know the exact number of features to select in advance, they must evaluate the full learning chain for different feature counts to determine which count leads to the lowest training error. To overcome this drawback, we extend the JMI score and mitigate this flaw by introducing a stopping criterion to the selection algorithm that can be specified depending on the learning task. This enables developers to carry out feature selection before the actual learning is done. We call our new score Historical Joint Mutual Information (HJMI). Additionally, we compare our new algorithm, which uses the novel HJMI score, against traditional algorithms that use the JMI score. We demonstrate that the HJMI-based algorithm automatically selects a reasonable number of features: our approach delivers results as good as those of traditional approaches and sometimes even outperforms them, since it is not limited to a fixed step size for feature evaluation.
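To make the described setting concrete, the following is a minimal sketch of greedy forward selection with the classic JMI score, extended with a threshold-based stopping rule in the spirit the abstract describes. The HJMI score itself is defined in the paper; the `min_gain` threshold and the normalisation below are illustrative assumptions, not the authors' method. The sketch assumes discrete features and labels.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def joint_mi(x_k, x_j, y):
    """Estimate I((X_k, X_j); Y): mutual information between the joint
    variable (X_k, X_j) and the target Y, from discrete samples."""
    joint = [f"{a}_{b}" for a, b in zip(x_k, x_j)]
    return mutual_info_score(joint, y)

def jmi_forward_selection(X, y, min_gain=1e-3):
    """Greedily add the feature with the highest JMI score.

    Instead of a user-specified feature count, stop once the best
    candidate's average score drops below `min_gain` (a hypothetical
    stopping criterion used here for illustration only).
    """
    n_features = X.shape[1]
    # Seed with the single feature most informative about y.
    mi = [mutual_info_score(X[:, i], y) for i in range(n_features)]
    selected = [int(np.argmax(mi))]
    remaining = set(range(n_features)) - set(selected)
    while remaining:
        # JMI score of candidate k: sum of I((X_k, X_j); Y) over selected j.
        scores = {k: sum(joint_mi(X[:, k], X[:, j], y) for j in selected)
                  for k in remaining}
        best = max(scores, key=scores.get)
        # Normalise by |selected| so scores are comparable across iterations.
        if scores[best] / len(selected) < min_gain:
            break  # stopping criterion: no candidate adds enough information
        selected.append(best)
        remaining.remove(best)
    return selected
```

The point of the stopping rule is that selection terminates on its own, so the full learning chain does not have to be re-run for every candidate feature count.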
Identifier | oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:33715 |
Date | 05 April 2019 |
Creators | Gocht, Andreas |
Contributors | Schöne, Robert, Lehmann, Christoph |
Source Sets | Hochschulschriftenserver (HSSS) der SLUB Dresden |
Language | English |
Detected Language | English |
Type | info:eu-repo/semantics/publishedVersion, doc-type:conferenceObject, info:eu-repo/semantics/conferenceObject, doc-type:Text |
Rights | info:eu-repo/semantics/openAccess |
Relation | 10.1109/BigData.2018.8622548, 978-1-5386-5034-9, 978-1-5386-5036-3, 978-1-5386-5035-6 |