51

Algoritmy pro detekci anomálií v datech z klinických studií a zdravotnických registrů / Algorithms for anomaly detection in data from clinical trials and health registries

Bondarenko, Maxim January 2018
This master's thesis deals with the problem of anomaly detection in data from clinical trials and medical registries. The purpose of this work is to survey the literature on data quality in clinical trials and to design an algorithm, based on machine learning methods, for detecting anomalous records in real clinical data from current or completed clinical trials or medical registries. The practical part describes the implemented detection algorithm, which consists of several parts: importing data from the information system, preprocessing and transforming the imported records with variables of different data types into numerical vectors, applying well-known statistical methods for outlier detection, and evaluating the quality and accuracy of the algorithm. The algorithm outputs a vector of parameters containing anomalies, which is intended to ease the work of the data manager. The algorithm is designed to extend the function palette of the information system (CLADE-IS) with automatic monitoring of data quality through detection of anomalous records.
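As a rough sketch of the pipeline this abstract describes (the function name, threshold and toy records are illustrative assumptions, not taken from the thesis), mixed-type records can be encoded as numerical vectors and screened with a classic z-score outlier test:

```python
import numpy as np
import pandas as pd

def detect_anomalous_records(df: pd.DataFrame, z_threshold: float = 3.0) -> pd.Series:
    """Flag records whose encoded feature vector contains extreme values.

    Mixed-type records are turned into numeric vectors (one-hot encoding for
    categoricals), then screened with a z-score outlier test per column.
    """
    # Transform variables of different data types into numerical vectors.
    numeric = df.select_dtypes(include=[np.number])
    categorical = df.select_dtypes(exclude=[np.number])
    encoded = pd.concat([numeric, pd.get_dummies(categorical, dtype=float)], axis=1)

    # Standardize each column; guard against zero-variance columns.
    std = encoded.std(ddof=0).replace(0, np.nan)
    z = (encoded - encoded.mean()) / std

    # A record is anomalous if any of its features is an extreme outlier.
    return (z.abs() > z_threshold).any(axis=1)

# Toy example: the 250 kg body weight should be flagged.
records = pd.DataFrame({
    "age": [34, 41, 38, 36, 39],
    "weight_kg": [72, 80, 250, 68, 75],
    "site": ["A", "A", "B", "B", "A"],
})
print(detect_anomalous_records(records, z_threshold=1.8))
```

With only five records the z-scores are bounded, hence the lowered threshold in the example call; on a real registry the default of 3.0 would be the conventional choice.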
52

Borcení časové osy v oblasti biosignálů / Dynamic Time Warping in Biosignal Processing

Kubát, Milan January 2014
This work is dedicated to dynamic time warping in biosignal processing, especially its application to ECG signals. At the beginning, theoretical notes about cardiography are summarized. The DTW analysis then follows, along with an assessment of the conditions and requirements for its successful application. Next, several variants and application possibilities are described. The practical part covers the design of this method, interpretation of its outputs, optimization of its settings, and realization of methods related to DTW.
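For reference, a minimal textbook DTW distance in Python; this is the generic dynamic-programming formulation, not necessarily the exact variant designed in the thesis:

```python
import numpy as np

def dtw_distance(x: np.ndarray, y: np.ndarray) -> float:
    """Classic dynamic-programming DTW between two 1-D sequences."""
    n, m = len(x), len(y)
    # cost[i, j] = minimal accumulated distance aligning x[:i] with y[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # Step pattern: match, insertion, deletion.
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    return cost[n, m]

# Two ECG-like beats, one slightly time-warped relative to the other.
t = np.linspace(0, 1, 100)
beat_a = np.sin(2 * np.pi * t) ** 3
beat_b = np.sin(2 * np.pi * t ** 1.2) ** 3
print(dtw_distance(beat_a, beat_b))
```

Because the warping path absorbs the timing difference, this distance stays small for the warped pair, whereas a plain point-by-point Euclidean distance would not.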
53

An apt perspective of analysis

Kishore, Nanad, Chandra, Ramesh 02 May 2012
The discourse presented here is aimed at examining the justification of applying current analysis to real-world problems.
54

Získávání znalostí z multimediálních databází / Knowledge Discovery in Multimedia Databases

Jirmásek, Tomáš Unknown Date
This master's thesis deals with knowledge discovery in databases; in particular, the basic classification and prediction methods used for data mining are described. The next chapter introduces multimedia databases and knowledge discovery in multimedia databases, focusing on the extraction of low-level features from video data and images. The following parts of the work describe the data set and the results of experiments in RapidMiner, LibSVM and an application developed by the author. The last chapter summarises the results of the methods used for extracting high-level features from the low-level description of the data.
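One common example of the low-level image features discussed above is a color histogram; this sketch is a generic illustration, and the descriptor, bin count and toy input are assumptions rather than the thesis's actual feature set:

```python
import numpy as np

def color_histogram(image: np.ndarray, bins: int = 8) -> np.ndarray:
    """Low-level feature vector: a normalized per-channel color histogram.

    `image` is an H x W x 3 uint8 RGB array; the result has 3 * bins values.
    """
    features = []
    for channel in range(3):
        hist, _ = np.histogram(image[..., channel], bins=bins, range=(0, 256))
        features.append(hist / hist.sum())  # normalize to be size-invariant
    return np.concatenate(features)

# A random stand-in image; in practice this would be a video frame.
frame = np.random.randint(0, 256, size=(120, 160, 3), dtype=np.uint8)
print(color_histogram(frame).shape)  # (24,)
```

Vectors like this are what a classifier such as LibSVM would consume when mapping low-level descriptions to high-level concepts.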
55

Real-Time Continuous Euclidean Distance Fields for Large Indoor Environments

Warberg, Erik January 2023
Real-time spatial awareness is essential in areas such as robotics and autonomous navigation. However, as environments expand and become increasingly complex, maintaining both a low computational load and high mapping accuracy remains a significant challenge. This thesis addresses these challenges by proposing a novel method for real-time construction of continuous Euclidean distance fields (EDF) using Gaussian process (GP) regression, hereafter referred to as GP-EDF, tailored specifically for large indoor environments. The proposed approach leverages the inherent structural information of indoor spaces by partitioning them into rooms and constructing a local GP-EDF model for each, reducing the computational cost tied to large matrix operations in GPs. By also exploiting the geometric regularities commonly found in indoor spaces, it detects walls and represents them as line segments. This information is integrated into the models' priors both to improve accuracy and to further reduce the computational expense. Comparison with two baselines demonstrated the proposed approach's effectiveness. It maintained low computation times despite increasing amounts of sensor data, indicating a substantial improvement in scalability. Results also confirmed that the EDF quality remains high and is not affected by partitioning the GP-EDF into local models. The method also reduced the influence of sensor noise on the EDF's accuracy when the line segments were incorporated into the model. Additionally, the proposed room segmentation method proved to be efficient and generated accurately partitioned rooms with a high degree of independence between them. In conclusion, the proposed approach offers a scalable, accurate and efficient solution for real-time construction of EDFs, demonstrating significant potential for aiding autonomous navigation within large indoor spaces.
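The following is a deliberately crude sketch of the core idea of regressing a distance field with a GP; the zero-valued targets on obstacle points, constant prior mean, kernel choice and hyperparameters are all illustrative assumptions, and the thesis's room partitioning and wall-segment priors are omitted:

```python
import numpy as np

def rbf_kernel(a: np.ndarray, b: np.ndarray, length_scale: float = 0.5) -> np.ndarray:
    """Squared-exponential kernel between two sets of 2-D points."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_edf_mean(obstacles: np.ndarray, queries: np.ndarray,
                noise: float = 1e-4) -> np.ndarray:
    """GP posterior mean of a distance-like field from obstacle points.

    Training targets are zeros (the distance vanishes on obstacle surfaces);
    the prior mean is a constant positive distance far from obstacles.
    """
    prior_mean = 2.0                      # assumed free-space distance
    y = np.zeros(len(obstacles)) - prior_mean
    K = rbf_kernel(obstacles, obstacles) + noise * np.eye(len(obstacles))
    k_star = rbf_kernel(queries, obstacles)
    return prior_mean + k_star @ np.linalg.solve(K, y)

# A straight wall of sensor hits, queried at increasing offsets from it.
wall = np.stack([np.linspace(0, 2, 20), np.zeros(20)], axis=1)
points = np.array([[1.0, 0.0], [1.0, 0.5], [1.0, 1.5]])
print(gp_edf_mean(wall, points))   # ~0 at the wall, growing with distance
```

The cubic cost of the `solve` on the full training set is exactly what motivates the thesis's partitioning into small per-room local models.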
56

Explicit Segmentation Of Speech For Indian Languages

Ranjani, H G 03 1900
Speech segmentation is the process of identifying the boundaries between words, syllables or phones in the recorded waveforms of spoken natural languages. The lowest level of speech segmentation is the breakup and classification of the sound signal into a string of phones. The difficulty of this problem is compounded by the phenomenon of co-articulation of speech sounds. The classical solution to this problem is to manually label and segment spectrograms. In the first step of this two-step process, a trained person listens to a speech signal, recognizes the word and phone sequence, and roughly determines the position of each phonetic boundary. The second step involves examining several features of the speech signal to place a boundary mark at the point where these features best satisfy a certain set of conditions specific to that kind of phonetic boundary. Manual segmentation of speech into phones is a highly time-consuming and painstaking process. Required for a variety of applications, such as acoustic analysis or building speech synthesis databases for high-quality speech output systems, the time required to carry out this process for even relatively small speech databases can rapidly accumulate to prohibitive levels. This calls for automating the segmentation process. The state-of-the-art segmentation techniques use Hidden Markov Models (HMM) for phone states. They give an average accuracy of over 95% within 20 ms of manually obtained boundaries. However, HMM-based methods require large training data for good performance. Another major disadvantage of such speech recognition based segmentation techniques is that they cannot handle very long utterances, which are necessary for prosody modeling in speech synthesis applications. Development of Text to Speech (TTS) systems in Indian languages has been difficult to date owing to the non-availability of sizeable segmented speech databases of good quality. Further, no prosody models exist for most of the Indian languages. Therefore, long utterances (at the paragraph level and monologues) have been recorded, as part of this work, for creating the databases. This thesis aims at automating the segmentation of very long speech sentences recorded for the application of corpus-based TTS synthesis for multiple Indian languages. In this explicit segmentation problem, we need to force-align boundaries in any utterance from its known phonetic transcription. The major disadvantage of forcing boundary alignments on the entire speech waveform of a long utterance is the accumulation of boundary errors. To overcome this, we force boundaries between two known phones (here, two successive stop consonants are chosen) at a time. The approach used is silence detection as a marker for stop consonants. This method gives around 89% accuracy (for the Hindi database) and is language independent and training free. These stop consonants act as anchor points for the next stage. Two methods for explicit segmentation have been proposed. Both methods rely on the accuracy of the above stop consonant detection stage. Another common stage is the recently proposed implicit method which uses a Bach scale filter bank to obtain the feature vectors. The Euclidean Distance of the Mean of the Logarithm (EDML) of these feature vectors shows peaks at the points where the spectrum changes.
The method performs with an accuracy of 87% within 20 ms of manually obtained boundaries and achieves deletion and insertion rates of 3.2% and 21.4% respectively, for 100 sentences of the Hindi database. The first method is a three-stage approach. The first stage is stop consonant detection, followed by a stage that uses Quatieri's sinusoidal model to classify sounds as voiced/unvoiced between two successive stop consonants. The final stage uses the EDML function of the Bach scale feature vectors to further obtain boundaries within the voiced and unvoiced regions. It gives a Frame Error Rate (FER) of 26.1% for the Hindi database. The second proposed method uses duration statistics of the phones of the language. It again uses the EDML function of the Bach scale filter bank to obtain the peaks at the phone transitions and uses the duration statistics to assign to each peak a probability of being a boundary. With this method, the FER improves to 22.8% for the Hindi database. Both methods are promising in that they give low frame error rates; results show that the second method outperforms the first because it incorporates knowledge of durations. For the proposed approaches to be useful, manual intervention is required at the output of each stage. However, this intervention is less tedious and reduces the time taken to segment each sentence by around 60% compared with the time taken for manual segmentation. The approaches have been successfully tested on three different languages, with 100 sentences each: Kannada, Tamil and English (the TIMIT database was used for validating the algorithms). In conclusion, a practical solution to the segmentation problem is proposed. The algorithm being training free, language independent (ES-SABSF method) and speaker independent makes it useful in developing TTS systems for multiple languages while reducing the segmentation overhead. This method is currently being used in the lab for segmenting long Kannada utterances, spoken by reading a set of 1115 phonetically rich sentences.
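A simplified sketch of an EDML-style change function: a generic STFT log-spectrum stands in for the Bach scale filter bank, and the window length and peak-picking rule are illustrative assumptions:

```python
import numpy as np
from scipy.signal import find_peaks, stft

def edml_boundaries(signal: np.ndarray, fs: int, win: int = 5):
    """Boundary candidates from the Euclidean Distance of the Mean of the
    Logarithm (EDML) of short-time spectral feature vectors.

    A generic STFT log-spectrum stands in for the Bach scale filter bank.
    """
    _, _, Z = stft(signal, fs=fs, nperseg=256)
    logspec = np.log(np.abs(Z) + 1e-10)          # feature vector per frame
    n_frames = logspec.shape[1]
    edml = np.zeros(n_frames)
    for t in range(win, n_frames - win):
        left = logspec[:, t - win:t].mean(axis=1)
        right = logspec[:, t:t + win].mean(axis=1)
        edml[t] = np.linalg.norm(left - right)   # peaks where spectrum changes
    peaks, _ = find_peaks(edml, prominence=edml.std())
    return peaks, edml

# Synthetic two-phone signal: a spectral change at 0.5 s.
fs = 8000
t = np.arange(fs) / fs
sig = np.where(t < 0.5, np.sin(2 * np.pi * 300 * t), np.sin(2 * np.pi * 1200 * t))
peaks, _ = edml_boundaries(sig, fs)
print(peaks)  # frame indices near the 0.5 s transition
```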
57

Near-capacity sphere decoder based detection schemes for MIMO wireless communication systems

Kapfunde, Goodwell January 2013
The search for the closest lattice point arises in many communication problems, and is known to be NP-hard. The Maximum Likelihood (ML) detector is the optimal detector which yields an optimal solution to this problem, but at the expense of high computational complexity. Existing near-optimal methods used to solve the problem are based on the Sphere Decoder (SD), which searches for lattice points confined in a hyper-sphere around the received point. The SD has emerged as a powerful means of finding the solution to the ML detection problem for MIMO systems. However, the bottleneck lies in the determination of the initial radius. This thesis is concerned with the detection of transmitted wireless signals in Multiple-Input Multiple-Output (MIMO) digital communication systems as efficiently and effectively as possible. The main objective of this thesis is to design efficient ML detection algorithms for MIMO systems based on depth-first search (DFS) algorithms whilst taking into account the complexity and bit error rate performance requirements of advanced digital communication systems. The increased capacity and improved link reliability of MIMO systems without sacrificing bandwidth efficiency and transmit power serve as the key motivation behind the study of MIMO detection schemes. The fundamental principles behind MIMO systems are explored in Chapter 2. A generic framework for linear and non-linear tree-search based detection schemes is then presented in Chapter 3. This paves the way for different methods of improving the achievable performance-complexity trade-off for all SD-based detection algorithms. The suboptimal detection schemes, in particular Minimum Mean Squared Error-Successive Interference Cancellation (MMSE-SIC), will also serve as pre-processing as well as comparison techniques, whilst channel-capacity-approaching Low Density Parity Check (LDPC) codes will be employed to evaluate the performance of the proposed SD. Numerical and simulation results show that non-linear detection schemes yield better performance than linear detection schemes, however at the expense of a slight increase in complexity. The first contribution of this thesis, in Chapter 4, is the design of a near-ML-achieving SD algorithm for MIMO digital communication systems that reduces the number of search operations within the sphere-constrained search space at reduced detection complexity. In this design, the distance between the ML estimate and the received signal is used to control the lower and upper bound radii of the proposed SD to avoid NP-complete search behaviour. The detection method is based on the DFS algorithm and Successive Interference Cancellation (SIC). The SIC ensures that the effects of dominant signals are effectively removed. Simulation results presented in this thesis show that by employing pre-processing detection schemes, the complexity of the proposed SD can be significantly reduced, though at a marginal performance penalty. The second contribution, in Chapter 5, is the determination of the initial sphere radius. The new initial radius proposed in this thesis is based on the variable parameter α, which is commonly chosen based on experience to ensure that at least one lattice point exists inside the sphere with high probability. Using the variable parameter α, a new noise covariance matrix which incorporates the number of transmit antennas, the energy of the transmitted symbols and the channel matrix is defined.
The new covariance matrix is then incorporated into the EMMSE model to generate an improved EMMSE estimate. The EMMSE radius is finally found by computing the distance between the sphere centre and the improved EMMSE estimate. This distance can be fine-tuned by varying the parameter α. The appeal of the proposed method is that it reduces the complexity of the pre-processing step of the EMMSE to that of the Zero-Forcing (ZF) detector without significant performance degradation of the SD, particularly at low Signal-to-Noise Ratios (SNR). More specifically, it will be shown through simulation results that using the EMMSE pre-processing step substantially improves performance whenever the complexity of the tree search is fixed or upper bounded. The final contribution, in Chapter 6, is the design of the LRAD-MMSE-SIC based SD detection scheme, which introduces a trade-off between performance and increased computational complexity. The Lenstra-Lenstra-Lovász (LLL) algorithm is utilised to orthogonalise the channel matrix H into a new near-orthogonal channel matrix H̄. The increased computational complexity introduced by the LLL algorithm is significantly decreased by employing sorted QR decomposition of the transformed channel H̄ into a unitary matrix and an upper triangular matrix which retains the properties of the channel matrix. The SIC algorithm ensures that the interference due to dominant signals is minimised, while the LDPC code effectively stops the propagation of errors within the entire system. Through simulations, it is demonstrated that the proposed detector still approaches ML performance while requiring much lower complexity than the conventional SD.
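For orientation, a minimal depth-first sphere decoder for a small real-valued system; the 4-PAM symbol alphabet and shrinking-radius strategy are generic textbook choices, not the radius-selection scheme proposed in the thesis:

```python
import numpy as np

def sphere_decode(H: np.ndarray, y: np.ndarray, symbols=(-3, -1, 1, 3),
                  radius: float = np.inf) -> np.ndarray:
    """Depth-first sphere decoder for a real-valued system y = H s + n.

    Searches the lattice H s over a finite symbol alphabet, shrinking the
    search radius every time a better candidate is found.
    """
    Q, R = np.linalg.qr(H)
    z = Q.T @ y
    n = H.shape[1]
    best = {"s": None, "d2": radius**2}

    def dfs(level: int, partial: list, d2: float) -> None:
        if d2 >= best["d2"]:
            return                      # prune: already outside the sphere
        if level < 0:
            best["s"], best["d2"] = list(partial), d2
            return
        for sym in symbols:
            # Contribution of already-fixed symbols at this tree level.
            upper = sum(R[level, j] * partial[j - level - 1]
                        for j in range(level + 1, n))
            inc = (z[level] - R[level, level] * sym - upper) ** 2
            dfs(level - 1, [sym] + partial, d2 + inc)

    dfs(n - 1, [], 0.0)
    return np.array(best["s"])

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 4))
s = rng.choice([-3, -1, 1, 3], size=4)
y = H @ s + 0.1 * rng.standard_normal(4)
print(sphere_decode(H, y), s)   # decoded vs. transmitted symbols
```

The QR decomposition makes the residual at each tree level depend only on symbols fixed below it, which is what allows the distance to accumulate level by level and the pruning to work.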
58

Individualization of fixed-dose combination regimens : Methodology and application to pediatric tuberculosis / Individualisering av design och dosering av kombinationstabletter : Metodologi och applicering inom pediatrisk tuberkulos

Yngman, Gunnar January 2015
Introduction: No Fixed-Dose Combination (FDC) formulations currently exist for pediatric tuberculosis (TB) treatment. Earlier work implemented, in the software NONMEM, a rational method for optimizing the design and individualization of pediatric anti-TB FDC formulations based on patient body weight, but issues with parameter estimation, dosage strata heterogeneity and representative pharmacokinetics remained. Aim: To further develop the rational model-based methodology aiding the selection of appropriate FDC formulation designs and dosage regimens in pediatric TB treatment. Materials and Methods: Optimization of the method with respect to the estimation of body weight breakpoints was sought, as was improved homogeneity of dosage groups with respect to treatment efficiency. Recently published pediatric pharmacokinetic parameters were implemented and the model was translated to MATLAB, where the performance was also evaluated by stochastic estimation and graphical visualization. Results: A logistic function was found better suited as an approximation of breakpoints. None of the estimation methods implemented in NONMEM was more suitable than the originally used FO method. Homogenization of dosage group treatment efficiency could not be achieved. The MATLAB translation was successful but required stochastic estimation and revealed high densities of local minima. Representative pharmacokinetics were successfully implemented. Conclusions: NONMEM was found suboptimal for the task due to problems with discontinuities and heterogeneity, but a stepwise method with representative pharmacokinetics was successfully implemented. MATLAB showed more promise in the search for a method that also addresses the heterogeneity issue.
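A toy illustration of the logistic-breakpoint idea reported in the results (the breakpoint, steepness and dose levels are invented for the example; this is not the thesis's NONMEM or MATLAB implementation):

```python
import numpy as np

def smooth_dose(weight_kg: np.ndarray, breakpoint_kg: float = 25.0,
                steepness: float = 2.0, low_dose: float = 1.0,
                high_dose: float = 2.0) -> np.ndarray:
    """Dose as a logistic blend between two dosage strata.

    A hard step at the breakpoint is non-differentiable, which hampers
    gradient-based estimation; the logistic makes the transition smooth.
    """
    frac = 1.0 / (1.0 + np.exp(-steepness * (weight_kg - breakpoint_kg)))
    return low_dose + (high_dose - low_dose) * frac

weights = np.array([10.0, 20.0, 24.0, 26.0, 40.0])
print(smooth_dose(weights))  # ramps from low_dose to high_dose near 25 kg
```

Increasing `steepness` recovers the hard step in the limit, so the smooth approximation can be tightened once the breakpoint estimate has converged.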
59

Studying the effectiveness of dynamic analysis for fingerprinting Android malware behavior / En studie av effektivitet hos dynamisk analys för kartläggning av beteenden hos Android malware

Regard, Viktor January 2019
Android is the second most targeted operating system among malware authors, and to counter the development of Android malware more knowledge about its behavior is needed. There are mainly two approaches to analyzing Android malware, namely static and dynamic analysis. In 2017, a study and a well-labeled dataset, named AMD (Android Malware Dataset) and consisting of over 24,000 malware samples, was released. It is divided into 135 varieties based on similar malicious behavior, retrieved through static analysis of the file classes.dex in the APK of each malware sample, whereas the labeled features were determined by manual inspection of three samples in each variety. However, static analysis is known to be weak against obfuscation techniques, such as repackaging or dynamic loading, which can be exploited to evade the analysis. In this study the second approach is utilized and all malware in the dataset are analyzed at run-time in order to monitor their dynamic behavior. However, analyzing malware at run-time has known weaknesses as well, as it can be evaded through, for instance, anti-emulator techniques. Therefore, the study aimed to explore the available sandbox environments for dynamic analysis, to study the effectiveness of fingerprinting Android malware using one of the tools, and to investigate whether the static features from AMD and the dynamic analysis correlate, for instance by attempting to classify the samples based on similar dynamic features and calculating the Pearson correlation coefficient (r) for all combinations of features from AMD and the dynamic analysis. The comparison of tools for dynamic analysis showed a need for development, as the most popular tools were released long ago and share a lack of continuous maintenance. As a result, the sandbox environment chosen for this study was Droidbox, because of its ease of use and installation and its adaptability to large-scale analysis. Based on the dynamic features extracted with Droidbox, it could be shown that Android malware samples are most similar to the varieties to which they belong. The best metric for classifying samples to varieties, out of the four investigated, turned out to be cosine similarity, which achieved an accuracy of 83.6% for the entire dataset. The high accuracy indicates a correlation between the dynamic features and the static features on which the varieties are based. Furthermore, the Pearson correlation coefficient confirmed that the manually extracted features used to describe the varieties and the dynamic features are correlated to some extent, which could be partially confirmed by a manual inspection at the end of the study.
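A minimal sketch of the classification and correlation computations described above; the variety names, feature dimensions and counts are hypothetical:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def classify_to_variety(sample: np.ndarray,
                        centroids: dict[str, np.ndarray]) -> str:
    """Assign a sample to the variety whose centroid it is most similar to."""
    return max(centroids, key=lambda v: cosine_similarity(sample, centroids[v]))

# Hypothetical dynamic-feature counts (e.g., file ops, net ops, sent SMS).
centroids = {
    "FakeInstaller": np.array([1.0, 2.0, 9.0]),
    "DroidKungFu":   np.array([8.0, 5.0, 0.5]),
}
sample = np.array([7.0, 6.0, 1.0])
print(classify_to_variety(sample, centroids))          # "DroidKungFu"

# Pearson correlation between one static and one dynamic feature column.
static_feat = np.array([0, 1, 1, 0, 1], dtype=float)
dynamic_feat = np.array([2, 9, 8, 1, 7], dtype=float)
print(np.corrcoef(static_feat, dynamic_feat)[0, 1])    # r close to 1
```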
60

Unsupervised Detection of Interictal Epileptiform Discharges in Routine Scalp EEG : Machine Learning Assisted Epilepsy Diagnosis

Shao, Shuai January 2023
Epilepsy affects more than 50 million people, making it one of the most prevalent neurological disorders, and has a high impact on the quality of life of those suffering from it. However, 70% of epilepsy patients can live seizure-free with proper diagnosis and treatment. Patients are evaluated using scalp EEG recordings, which are cheap and non-invasive. The diagnostic yield is, however, low, and qualified personnel need to process large amounts of data in order to accurately assess patients. MindReader is an unsupervised classifier which detects spectral anomalies and generates a hypothesis of the underlying patient state over time. The aim is to highlight abnormal, potentially epileptiform states, which could expedite the analysis of patients and let qualified personnel attest the results. It was used to evaluate 95 scalp EEG recordings from healthy adults and adult patients with epilepsy. Interictal epileptiform discharges (IEDs) occurring in the samples had been retroactively annotated, along with the patient state and maneuvers performed by personnel, to enable characterization of the classifier's detection performance. The performance was slightly worse than previous benchmarks on pediatric scalp EEG recordings, with a 7% and 33% drop in specificity and sensitivity, respectively. Electrode positioning and the partial spatial extent of events had a notable impact on performance; however, no correlation between annotated disturbances and reduced performance could be found. Additional explorative analysis was performed on serialized intermediate data to evaluate the analysis design. Hyperparameters and electrode montage options were exposed to optimize the average Matthews correlation coefficient (MCC) per electrode per patient on a subset of the patients with epilepsy. An increased window length and a reduced amount of training, along with a common average montage, proved most successful. The Euclidean distance of cumulative spectra (ECS), a metric suitable for spectral analysis, and analogous L2 and L1 loss functions were implemented, of which the ECS further improved the average performance for all samples. Four additional analyses, featuring new time-frequency transforms and multichannel convolutional autoencoders, were evaluated; an analysis using the continuous wavelet transform (CWT) and a convolutional autoencoder (CNN) performed best, with an average MCC score of 0.19 and 56.9% sensitivity at approximately 13.9 false positives per minute.
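Since the model selection above is driven by the Matthews correlation coefficient, here is its standard definition computed from confusion-matrix counts (the example counts are hypothetical, not from the thesis):

```python
import numpy as np

def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN));
    it stays informative even when IEDs are rare relative to background.
    """
    num = tp * tn - fp * fn
    den = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return num / den if den > 0 else 0.0

# Hypothetical per-electrode detection counts over one recording.
print(matthews_corrcoef(tp=40, tn=900, fp=55, fn=30))
```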
