This work outlines a possible technique for identifying webpages that contain product specifications. Using support vector machines a product web page classifier was constructed and tested with various settings. The final result for this classifier ended up being 0.958 in precision and 0.796 in recall for product pages. The scores imply that the method could be considered a valid technique in real world web classification tasks if additional features and more data were made available.
Identifer | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-189555 |
Date | January 2016 |
Creators | Huss, Jakob |
Publisher | KTH, Skolan för datavetenskap och kommunikation (CSC) |
Source Sets | DiVA Archive at Upsalla University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Page generated in 0.0126 seconds