
Designing Random Sample Synopses with Outliers

Random sampling is one of the most widely used means to build synopses of large datasets because random samples can be used for a wide range of analytical tasks. Unfortunately, the quality of the estimates derived from a sample is negatively affected by the presence of 'outliers' in the data. In this paper, we show how to circumvent this shortcoming by constructing outlier-aware sample synopses. Our approach extends the well-known outlier indexing scheme to multiple aggregation columns.

Identifier: oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:80383
Date: 12 August 2022
Creators: Lehner, Wolfgang; Rosch, Philip; Gemulla, Rainer
Publisher: IEEE
Source Sets: Hochschulschriftenserver (HSSS) der SLUB Dresden
Language: English
Detected Language: English
Type: info:eu-repo/semantics/acceptedVersion, doc-type:conferenceObject, info:eu-repo/semantics/conferenceObject, doc-type:Text
Rights: info:eu-repo/semantics/openAccess
Relation: 978-1-4244-1836-7, 10.1109/ICDE.2008.4497569
